AttributeError: 'NoneType' object has no attribute 'find_all'

Open philsv opened this issue 1 year ago • 6 comments

Python: 3.10 OS: ubuntu 22.04 (LTS) running in bitbucket pipelines image: atlassian/default-image:4

I ran this on Windows and WSL2 with no issues, but I wanted to automate the script via Bitbucket Pipelines. When I run the following there:

from finsec import Filing
cik = "0001536411"
filing = Filing(cik)
filing_df = filing.latest_13f_filing

I get the following traceback (possibly related to BeautifulSoup?):

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/atlassian/pipelines/agent/build/src/modules/sec.py", line 108, in <module>
    df = get_13f_filing(duquesne_family_office_llc, session)
  File "/opt/atlassian/pipelines/agent/build/src/modules/sec.py", line 43, in get_13f_filing
    filing_df = filing.latest_13f_filing
  File "/usr/local/lib/python3.10/dist-packages/finsec/filing.py", line 16, in latest_13f_filing
    return self.get_latest_13f_filing()
  File "/usr/local/lib/python3.10/dist-packages/finsec/base.py", line 181, in get_latest_13f_filing
    self._get_last_100_13f_filings_url()
  File "/usr/local/lib/python3.10/dist-packages/finsec/base.py", line 41, in _get_last_100_13f_filings_url
    for headers in results_table.find_all('th'):
AttributeError: 'NoneType' object has no attribute 'find_all'

Could you replicate this bug on a Linux machine? What could be the issue here?

philsv avatar May 07 '24 19:05 philsv

Hi @philsv, I haven't had a chance to replicate and test this yet. Could you provide some more details? How are you installing the finsec package in the CI/CD pipeline? I suspect the cause could be in your .yml file. Can you share the part of the pipeline script that installs finsec?

git-shogg avatar May 07 '24 22:05 git-shogg

Sure @git-shogg, I tested it with the following pipeline configuration and got the error message above.

bitbucket-pipelines.yml

image: atlassian/default-image:4

pipelines:
  custom:
    sec_test:
      - step:
          name: "[Ubuntu 22.04] Update sec_test 13f filings"
          caches:
            - pip
          script:
              - apt-get update
              - apt-get install -y --no-install-recommends software-properties-common python3.10-distutils python3-pip
              - export DEBIAN_FRONTEND=noninteractive
              - pip3 install --upgrade pip
              - pip3 install --no-cache-dir --use-feature=fast-deps finsec
              - python3.10 -m src.modules.sec_test

Here are the dependencies that were installed by the pipeline:

pip3 install --no-cache-dir --use-feature=fast-deps finsec

WARNING: pip is using lazily downloaded wheels using HTTP range requests to obtain dependency information. This experimental feature is enabled through --use-feature=fast-deps and it is not ready for production.

Collecting finsec

  Downloading finsec-0.0.9-py3-none-any.whl.metadata (10 kB)

Collecting beautifulsoup4>=4.11.1 (from finsec)

  Downloading beautifulsoup4-4.12.3-py3-none-any.whl.metadata (3.8 kB)

Collecting pandas>=1.3.5 (from finsec)

  Downloading pandas-2.2.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (19 kB)

Collecting requests>=2.27.1 (from finsec)

  Downloading requests-2.31.0-py3-none-any.whl.metadata (4.6 kB)

Collecting lxml>=4.8.0 (from finsec)

  Downloading lxml-5.2.1-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (3.4 kB)

Collecting openpyxl>=3.0.9 (from finsec)

  Downloading openpyxl-3.1.2-py2.py3-none-any.whl.metadata (2.5 kB)

Collecting soupsieve>1.2 (from beautifulsoup4>=4.11.1->finsec)

  Downloading soupsieve-2.5-py3-none-any.whl.metadata (4.7 kB)

Collecting et-xmlfile (from openpyxl>=3.0.9->finsec)

  Downloading et_xmlfile-1.1.0-py3-none-any.whl.metadata (1.8 kB)

Collecting numpy>=1.22.4 (from pandas>=1.3.5->finsec)

  Downloading numpy-1.26.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (61 kB)

     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.0/61.0 kB 88.4 MB/s eta 0:00:00

Collecting python-dateutil>=2.8.2 (from pandas>=1.3.5->finsec)

  Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl.metadata (8.4 kB)

Collecting pytz>=2020.1 (from pandas>=1.3.5->finsec)

  Downloading pytz-2024.1-py2.py3-none-any.whl.metadata (22 kB)

Collecting tzdata>=2022.7 (from pandas>=1.3.5->finsec)

  Downloading tzdata-2024.1-py2.py3-none-any.whl.metadata (1.4 kB)

Collecting charset-normalizer<4,>=2 (from requests>=2.27.1->finsec)

  Downloading charset_normalizer-3.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (33 kB)

Collecting idna<4,>=2.5 (from requests>=2.27.1->finsec)

  Downloading idna-3.7-py3-none-any.whl.metadata (9.9 kB)

Collecting urllib3<3,>=1.21.1 (from requests>=2.27.1->finsec)

  Downloading urllib3-2.2.1-py3-none-any.whl.metadata (6.4 kB)

Collecting certifi>=2017.4.17 (from requests>=2.27.1->finsec)

  Downloading certifi-2024.2.2-py3-none-any.whl.metadata (2.2 kB)

Requirement already satisfied: six>=1.5 in /usr/lib/python3/dist-packages (from python-dateutil>=2.8.2->pandas>=1.3.5->finsec) (1.16.0)

Downloading finsec-0.0.9-py3-none-any.whl (13 kB)

Downloading beautifulsoup4-4.12.3-py3-none-any.whl (147 kB)

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 147.9/147.9 kB 171.8 MB/s eta 0:00:00

Downloading lxml-5.2.1-cp310-cp310-manylinux_2_28_x86_64.whl (5.0 MB)

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.0/5.0 MB 134.3 MB/s eta 0:00:00

Downloading openpyxl-3.1.2-py2.py3-none-any.whl (249 kB)

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 250.0/250.0 kB 342.3 MB/s eta 0:00:00

Downloading pandas-2.2.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.0 MB)

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.0/13.0 MB 265.0 MB/s eta 0:00:00

Downloading requests-2.31.0-py3-none-any.whl (62 kB)

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.6/62.6 kB 291.3 MB/s eta 0:00:00

Downloading certifi-2024.2.2-py3-none-any.whl (163 kB)

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 163.8/163.8 kB 356.4 MB/s eta 0:00:00

Downloading charset_normalizer-3.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (142 kB)

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 142.1/142.1 kB 334.7 MB/s eta 0:00:00

Downloading idna-3.7-py3-none-any.whl (66 kB)

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 66.8/66.8 kB 295.6 MB/s eta 0:00:00

Downloading numpy-1.26.4-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (18.2 MB)

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.2/18.2 MB 216.0 MB/s eta 0:00:00

Downloading python_dateutil-2.9.0.post0-py2.py3-none-any.whl (229 kB)

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 229.9/229.9 kB 382.7 MB/s eta 0:00:00

Downloading pytz-2024.1-py2.py3-none-any.whl (505 kB)

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 505.5/505.5 kB 413.8 MB/s eta 0:00:00

Downloading soupsieve-2.5-py3-none-any.whl (36 kB)

Downloading tzdata-2024.1-py2.py3-none-any.whl (345 kB)

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 345.4/345.4 kB 423.5 MB/s eta 0:00:00

Downloading urllib3-2.2.1-py3-none-any.whl (121 kB)

   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 121.1/121.1 kB 385.3 MB/s eta 0:00:00

Downloading et_xmlfile-1.1.0-py3-none-any.whl (4.7 kB)

Installing collected packages: pytz, urllib3, tzdata, soupsieve, python-dateutil, numpy, lxml, idna, et-xmlfile, charset-normalizer, certifi, requests, pandas, openpyxl, beautifulsoup4, finsec

WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

Successfully installed beautifulsoup4-4.12.3 certifi-2024.2.2 charset-normalizer-3.3.2 et-xmlfile-1.1.0 finsec-0.0.9 idna-3.7 lxml-5.2.1 numpy-1.26.4 openpyxl-3.1.2 pandas-2.2.2 python-dateutil-2.9.0.post0 pytz-2024.1 requests-2.31.0 soupsieve-2.5 tzdata-2024.1 urllib3-2.2.1

sec_test.py

from finsec import Filing

cik = "0001536411"
filing = Filing(cik)
filing_df = filing.latest_13f_filing
print(filing_df.head(5))
print("Done!")

I also tested it with different images, e.g. image: python:3.11.3-slim, and got the same error.

philsv avatar May 08 '24 13:05 philsv

Hi @philsv, thanks for sending this through. I think I may have identified the issue. Firstly, the script below works fine on my end (with Python 3.10).


from finsec import Filing

cik = "0001536411"
filing = Filing(cik)
filing_df = filing.latest_13f_filing
print(filing_df.head(5))
print("Done!")

I suspect it has something to do with a change that has been made to the yml file. I can see in the initial error message that the pipeline is running src.modules.sec (that is, sec.py, not the sec_test.py file you sent through), and this sec.py file executes a line I am unfamiliar with:

df = get_13f_filing(duquesne_family_office_llc, session)

This line appears to be executing a function that sits directly in the base.py file of the finsec package (generally I'd recommend using the off-the-shelf functions included in the package, as you have with sec_test.py above). The function get_13f_filing only accepts self and qtr_year as arguments, so I suspect that passing duquesne_family_office_llc and session as arguments is most likely the cause of the issue.

def get_13f_filing(self, qtr_year: str):
    """Returns the requested 13F-HR filing."""

Can I ask that you re-test, ensuring the yml file is pointing to sec_test.py and that sec_test.py matches exactly what you've sent me above? Let me know how you go.

git-shogg avatar May 09 '24 23:05 git-shogg

@git-shogg

get_13f_filing(duquesne_family_office_llc, session) is a custom function of mine that uploads the results to my database. I have tried renaming it, and I can say that this is not the issue.

With sec_test.py I still get the same error:

+ python3.10 -m src.modules.sec_test
Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/opt/atlassian/pipelines/agent/build/src/modules/sec_test.py", line 5, in <module>
    filing_df = filing.latest_13f_filing
  File "/usr/local/lib/python3.10/dist-packages/finsec/filing.py", line 16, in latest_13f_filing
    return self.get_latest_13f_filing()
  File "/usr/local/lib/python3.10/dist-packages/finsec/base.py", line 181, in get_latest_13f_filing
    self._get_last_100_13f_filings_url()
  File "/usr/local/lib/python3.10/dist-packages/finsec/base.py", line 41, in _get_last_100_13f_filings_url
    for headers in results_table.find_all('th'):
AttributeError: 'NoneType' object has no attribute 'find_all'

I have no problems using Windows, WSL2 or Linux on my own system. I have also tested it within Docker without any problem.

Only when using Bitbucket Pipelines do I encounter this issue, and unfortunately I get the same error on different images. I can't point to any reason why this happens.

Can you show me the bitbucket-pipelines.yml with which your code worked?

Edit: To me it definitely seems to have something to do with how Bitbucket CI/CD is set up or how the images I have used are set up.

I've tested sec.py and sec_test.py with the bitbucket-pipelines.yml step I used previously with my self-hosted Windows runner

sec_self_hosted:
  - step:
      name: "Update sec 13f filings"
      runs-on:
        - self.hosted
        - windows
      caches:
        - pip
      script:
          - python -m src.modules.sec

and the pipeline went through without any errors.

philsv avatar May 10 '24 22:05 philsv

Hi @philsv, thank you for your patience. I spun up an Ubuntu 22 server and tested this line by line to get to the bottom of the issue. It relates to the user agent that is passed to the SEC website. The SEC requires that users identify themselves by passing their own user agent with their requests (generally this is not required when running local instances); it can simply be your name and email. Without it, the SEC presumably does not return the filings page, which would explain why results_table is None in the traceback.
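
To illustrate why a rejected request shows up as an AttributeError rather than an HTTP error, here is a minimal sketch using only the standard library. It is not finsec's actual code (finsec uses BeautifulSoup), and the names TableFinder, find_first_table and require_table are hypothetical. The idea is that a lookup for the results table yields None when the response body does not contain one, so checking for None first gives a clearer error.

```python
from html.parser import HTMLParser
from typing import Optional


class TableFinder(HTMLParser):
    """Hypothetical helper: records whether the page contains a <table>."""

    def __init__(self) -> None:
        super().__init__()
        self.found_table = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag == "table":
            self.found_table = True


def find_first_table(html: str) -> Optional[str]:
    """Return "table" if a <table> exists, else None (like a failed lookup)."""
    finder = TableFinder()
    finder.feed(html)
    return "table" if finder.found_table else None


def require_table(html: str) -> str:
    """Fail with a descriptive message instead of an AttributeError later on."""
    table = find_first_table(html)
    if table is None:
        raise RuntimeError(
            "No results table in the response; the request may have been "
            "rejected (for example, because no user agent was declared)."
        )
    return table
```

Calling require_table on a page without a table raises the RuntimeError immediately, which points at the response itself instead of at a later find_all call.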

I have now released a new version of the finsec package (v0.0.10) that lets you pass in your own user agent when running on any server. Your code would be updated as follows:


from finsec import Filing

cik = "0001536411"
# declared_user identifies you to the SEC; use your own name and email.
filing = Filing(cik, declared_user="Joe Blog [email protected]")
filing_df = filing.latest_13f_filing
print(filing_df.head(5))
print("Done!")
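
For anyone making their own requests outside finsec, the same idea can be sketched with the standard library. This assumes the declared user ends up in the User-Agent request header; how finsec forwards declared_user internally is not shown in this thread. The helper name build_sec_request, the URL and the example identity are all illustrative.

```python
from urllib.request import Request


def build_sec_request(url: str, declared_user: str) -> Request:
    """Hypothetical helper: attach a self-identifying User-Agent to a request."""
    if not declared_user.strip():
        raise ValueError("declared_user must identify you, e.g. a name and email")
    # Assumption: the declared user is sent as the User-Agent header.
    return Request(url, headers={"User-Agent": declared_user})


# Illustrative values only; replace them with your own details.
request = build_sec_request("https://www.sec.gov/", "Jane Doe jane@example.com")
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to confirm the header is set before anything goes over the network.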

git-shogg avatar May 15 '24 12:05 git-shogg

@git-shogg Amazing, it works now. Thank you very much for the update! :)

philsv avatar May 15 '24 13:05 philsv