sec-parser

Parse SEC EDGAR HTML documents into a tree of elements that correspond to the visual (semantic) structure of the document.

Results 8 sec-parser issues

### Problem The regular expression employed in the top section manager for 10q needs modification, specifically to eliminate accented characters from both the regular expression itself and the input being...

for-internal-team
status:deferred
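One way to approach the accented-character problem described above is to normalize the input before matching, so the pattern itself can stay plain ASCII. The sketch below is only an illustration: `strip_accents`, `match_part`, and `PART_PATTERN` are hypothetical names, and the pattern is a stand-in, not the regular expression sec-parser actually uses.

```python
import re
import unicodedata


def strip_accents(text: str) -> str:
    """Return text with combining accent marks removed (e.g. 'é' becomes 'e')."""
    # NFKD splits an accented character into its base character plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical ASCII-only pattern for "Part I" / "Part II" headings (assumption,
# not taken from the sec-parser code base).
PART_PATTERN = re.compile(r"^\s*part\s+(i{1,2})\b", re.IGNORECASE)


def match_part(text: str):
    """Return 'i' or 'ii' when the accent-stripped text starts with a part heading."""
    match = PART_PATTERN.match(strip_accents(text))
    return match.group(1).lower() if match else None
```

With this shape, the same normalization is applied to every input, so accented variants never need to appear in the pattern.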

### Problem For MSFT 0000950170-23-014423, the top section title "PART I. FINANCIAL INFORMATION " is identified as two semantic elements: { "cls_name": "TopSectionTitle", "level": 0, "section_type": "part1", "text_content": "PART I....

for-internal-team
status:deferred

### Discussed in https://github.com/orgs/alphanome-ai/discussions/56 Originally posted by **Elijas** November 24, 2023 # Example document https://www.sec.gov/Archives/edgar/data/1675149/000119312518236766/d828236d10q.htm ```html Options to purchase 1 million shares of common stock at a weighted average exercise...

contributions-welcome
feature:elements

Fixed snapshot verify tests via https://github.com/orgs/alphanome-ai/discussions/60. Updated test data PR: https://github.com/alphanome-ai/sec-parser-test-data/pull/4. New pull request for [issue #65](https://github.com/alphanome-ai/sec-parser/issues/65.).

This PR contains the changes from commit https://github.com/alphanome-ai/sec-parser/commit/391f8cf3400ee488399c0c1bf328987240b53035 of PR #73.

---

The PR got out of sync for that specific branch: https://github.com/alphanome-ai/sec-parser/pull/73. To overcome this, @deenaawny-github-account's commit https://github.com/alphanome-ai/sec-parser/pull/73/commits/391f8cf3400ee488399c0c1bf328987240b53035 was recreated.

# Context [MSFT accuracy-test](https://github.com/alphanome-ai/sec-parser-test-data/blob/main/10-Q/MSFT/0000950170-23-014423/actual-structure-and-text_summary.json) ([permalink](https://github.com/alphanome-ai/sec-parser-test-data/blob/3a3b2616be9569f62d7d1028662fe19ded400601/10-Q/MSFT/0000950170-23-014423/actual-structure-and-text_summary.json) at the time of posting) # Problem The header "PART I" is identified as `top section title element`, when it should be identified as `page...

contributions-welcome
status:in-progress
feature:elements

Annotate filings from these links:

- https://www.sec.gov/Archives/edgar/data/1675149/000119312518236766/d828236d10q.htm
- https://www.sec.gov/Archives/edgar/data/19617/000001961723000432/jpm-20230630.htm
- https://github.com/orgs/alphanome-ai/discussions/58
- https://github.com/orgs/alphanome-ai/discussions/48

for-internal-team

# Context [MSFT accuracy-test](https://github.com/alphanome-ai/sec-parser-test-data/blob/main/10-Q/MSFT/0000950170-23-014423/actual-structure-and-text_summary.json) ([permalink](https://github.com/alphanome-ai/sec-parser-test-data/blob/3a3b2616be9569f62d7d1028662fe19ded400601/10-Q/MSFT/0000950170-23-014423/actual-structure-and-text_summary.json) at the time of posting) # Problem Titles come out as two separate title elements ``` { "text_content": "PART I. FINANCI" }, { "text_content": "AL...

contributions-welcome
feature:elements
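The split-title symptom reported above could, in principle, be handled by a post-processing pass that joins neighbouring title fragments. The sketch below works on plain dictionaries shaped like the JSON in these issues (`cls_name`, `text_content`); `merge_adjacent_titles` is a hypothetical helper and the sample fragments are made up for illustration. It is not how sec-parser resolves the issue.

```python
def merge_adjacent_titles(elements, title_cls="TopSectionTitle"):
    """Join consecutive elements of class `title_cls` into a single element.

    `elements` is a list of dicts with at least 'cls_name' and 'text_content'.
    Other keys are carried over from the first fragment of each merged run.
    """
    merged = []
    for element in elements:
        previous = merged[-1] if merged else None
        if (
            previous is not None
            and previous["cls_name"] == title_cls
            and element["cls_name"] == title_cls
        ):
            # Concatenate the text of the fragment onto the previous title.
            merged[-1] = {
                **previous,
                "text_content": previous["text_content"] + element["text_content"],
            }
        else:
            merged.append(dict(element))
    return merged


# Made-up fragments, for illustration only.
sample = [
    {"cls_name": "TopSectionTitle", "text_content": "PART II. OTH"},
    {"cls_name": "TopSectionTitle", "text_content": "ER INFORMATION"},
    {"cls_name": "TextElement", "text_content": "Body text"},
]
result = merge_adjacent_titles(sample)
```

A pass like this would also merge two genuinely distinct titles that happen to be adjacent, so a real fix would likely need an extra condition (for example, comparing section types).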