Skip to content

fix: keep Put/Call positions distinct in 13F-HR holdings (#824)#828

Open
0ywfe wants to merge 1 commit into
dgunning:mainfrom
0ywfe:fix/issue-824-13f-putcall-aggregation
Open

fix: keep Put/Call positions distinct in 13F-HR holdings (#824)#828
0ywfe wants to merge 1 commit into
dgunning:mainfrom
0ywfe:fix/issue-824-13f-putcall-aggregation

Conversation

@0ywfe
Copy link
Copy Markdown
Contributor

@0ywfe 0ywfe commented May 22, 2026

Summary

Fixes #824. ThirteenF.holdings was merging Put/Call positions into equity rows for the same CUSIP, then losing the PutCall value entirely because the categorical conversion used uppercase labels that didn't match the SEC XML format.

Root cause

Three issues in edgar/thirteenf/models.py aggregation logic (around line 454):

  1. df.groupby('Cusip', as_index=False) collapsed Put/Call into the same-CUSIP equity row
  2. agg_dict['PutCall'] = 'first' preserved only one value when multiple Put/Call/equity rows existed
  3. pd.Categorical(..., categories=['', 'PUT', 'CALL']) used uppercase while SEC XML emits title-case 'Put'/'Call' — categorical conversion silently dropped values that didn't match

Fix

  • Group by ['Cusip', 'PutCall'] when PutCall is present, so option positions stay separate from equity for the same security
  • Skip PutCall in agg_dict (it's now a grouping key)
  • Use title-case ['', 'Put', 'Call'] categories to match SEC XML

Verification

SG Capital Management LLC 13F-HR/A (accession 0001172661-24-002551) is the existing test fixture used by tests/test_thirteenf.py::test_thirteenf_put_call. Its raw infotable contains 3 Put positions.

Before fix: those 3 Puts were merged into same-CUSIP equity rows; the PutCall == 'Put' query on .holdings returned 0 rows.
After fix: .holdings.query(\"PutCall == 'Put'\") returns 3 rows with distinct CUSIPs.

Test plan

  • New regression test tests/issues/regression/test_issue_824_13f_putcall_aggregation.py (3 cases, @pytest.mark.network):
    • Asserts 3 distinct Put rows survive aggregation
    • Asserts PutCall values use title case (matches existing test_thirteenf_put_call ground truth)
    • Baseline check that infotable still shows 3 Puts (catches if SEC data shape changes)
  • Existing test_thirteenf_put_call still passes
  • All 16 fast tests in test_thirteenf.py + test_thirteenf_funds.py + test_thirteenf_rendering.py pass

ThirteenF.holdings grouped by Cusip alone, so Put/Call rows merged into
the same security's equity row and the PutCall column was lost on the
merged result. Categories used uppercase 'PUT'/'CALL' while SEC XML emits
title-case 'Put'/'Call', so the categorical conversion silently dropped
those values too.

Group by ['Cusip', 'PutCall'] when the column exists, drop PutCall from
agg_dict (it's now a grouping key), and fix the category labels to match
SEC XML.

Regression test asserts SG Capital Management 13F-HR/A
(accession 0001172661-24-002551) keeps its 3 Put positions distinct in
the aggregated view, matching the existing infotable-level test.
@0ywfe 0ywfe force-pushed the fix/issue-824-13f-putcall-aggregation branch from fbe446c to 8670056 Compare May 23, 2026 11:45
@0ywfe
Copy link
Copy Markdown
Contributor Author

0ywfe commented May 23, 2026

test-network failure was real, not a flake — two existing tests in tests/thirteenf/test_holdings_aggregation.py (test_holdings_aggregation_math_correct and test_holdings_preserves_all_securities) encoded the old "1 row per CUSIP" assumption that this fix corrects.

State Street's 13F has both equity and option positions on the same CUSIPs (e.g. 02079K305 / GOOGL), so the new aggregation produces multiple rows per CUSIP — that's the correct behaviour per #824, but it breaks tests written before the bug was identified.

Amended into the existing commit (force-pushed):

  • test_holdings_aggregation_math_correct: now groups by (CUSIP, PutCall) to find multi-entry groups, filters by both keys when comparing infotable vs holdings rows
  • test_holdings_preserves_all_securities: kept the unique-CUSIP check, replaced the len(holdings) == unique_cusips line with len(holdings) == groups_by_(CUSIP, PutCall) (allows equity + Put + Call on same security)

All 11 tests now pass locally (1.4s): the 3 new regression tests + 7 existing aggregation tests + the existing test_thirteenf_put_call. CI should be clean on re-run.

Copy link
Copy Markdown
Owner

@dgunning dgunning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice clean fix and the regression tests are well-scoped (real fixture, specific value assertions, infotable baseline guard). CI's green, the three root causes are correctly diagnosed, and the test updates to test_holdings_aggregation_math_correct / test_holdings_preserves_all_securities properly track the new (CUSIP, PutCall) key.

Two requests before merging:

1. Add the column-order preservation from dima's proposed fix. The issue body included this and it's missing here:

if 'PutCall' in holdings.columns:
    cols = [c for c in holdings.columns if c != 'PutCall']
    insert_after = 'Ticker' if 'Ticker' in cols else (id_cols[-1] if id_cols[-1] in cols else cols[-1])
    idx = cols.index(insert_after) + 1
    cols.insert(idx, 'PutCall')
    holdings = holdings[cols]

Without it, pandas' groupby places PutCall right after Cusip (as a group key), shifting it from its prior position. That's an undocumented behavior change on top of the correctness fix — affects positional column access, table rendering order, and any notebook code that hardcodes column indices.

2. Credit dima in the commit message. The technical diff in this PR is theirs — they wrote it in the issue body. A Co-Authored-By: dima <email> trailer (or even just a "Fix proposed by @dima in #824" line) would be the right call.

Behavior-change note for the v5.32.0 release notes (no action needed from you, just flagging for the maintainer): .holdings can now contain multiple rows per CUSIP (equity + Put + Call). Anyone doing .holdings.set_index('Cusip') will hit a duplicate-index error. This is the correct semantics — option positions are distinct securities — but it's an API contract change worth surfacing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PutCall values are not handled properly in 13F-HR holdings()

2 participants