
This repository uses SEC EDGAR Schedule 13D and Schedule 13G filings to find all positions above 5% in US stocks between 1994 and 2018.

Block_Codes

This GitHub page describes the construction of the data in the paper "Is Blockholder Diversity Detrimental?" by Miriam Schwartz-Ziv and Ekaterina Volkova (2020).

The most recent version of the paper is available on SSRN: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3621939

Step 1. Download Files.

  • download_forms.R downloads SC 13D and SC 13G filings and their amendments and stores them in an SQL database.
  • The script downloads the list of all forms for each year from the SEC website. The only things you need to specify are the range of years in the loop and the working directory.
  • The code is slow and can take several hours to complete. To make sure I get all possible files, I download each file twice: once from the master file listing for the filer and once from the listing for the subject.
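
As a rough illustration of the index step, the sketch below builds the quarterly EDGAR master index URLs for a range of years and keeps only the Schedule 13D/13G rows from the index lines. It is not taken from download_forms.R: the function names `master_index_urls` and `filter_13dg` and the sample rows are hypothetical, and the actual download and SQL storage are left out.

```r
# Build the quarterly master index URLs for a range of years.
master_index_urls <- function(years) {
  grid <- expand.grid(qtr = 1:4, year = years)
  sprintf("https://www.sec.gov/Archives/edgar/full-index/%d/QTR%d/master.idx",
          grid$year, grid$qtr)
}

# Keep Schedule 13D/13G filings and amendments from the lines of a master index.
# Data rows have five pipe-separated fields:
# CIK|Company Name|Form Type|Date Filed|Filename
filter_13dg <- function(index_lines) {
  cols <- c("cik", "company", "form_type", "date_filed", "filename")
  rows <- strsplit(index_lines, "|", fixed = TRUE)
  rows <- rows[lengths(rows) == 5]  # drops the banner and separator lines
  forms <- as.data.frame(
    matrix(as.character(unlist(rows)), ncol = 5, byrow = TRUE,
           dimnames = list(NULL, cols)),
    stringsAsFactors = FALSE
  )
  wanted <- c("SC 13D", "SC 13D/A", "SC 13G", "SC 13G/A")
  forms <- forms[forms$form_type %in% wanted, ]
  rownames(forms) <- NULL
  forms
}

# Usage with made-up index rows (not real filings):
sample_lines <- c(
  "CIK|Company Name|Form Type|Date Filed|Filename",
  "--------------------------------------------",
  "1000|ACME CORP|SC 13G/A|2001-02-14|edgar/data/1000/0000000000-01-000001.txt",
  "1001|OTHER INC|10-K|2001-03-01|edgar/data/1001/0000000000-01-000002.txt"
)
urls <- master_index_urls(1994:1995)
blocks <- filter_13dg(sample_lines)
```

Each filing appears in the index once under the filer's CIK and once under the subject's CIK, so deduplicating on `filename` after filtering is one way to reconcile the two listings.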

Step 2. Extract and Convert Main Filings.

  • extract_body_form.R extracts the main filing from the complete submission files and converts .htm to plain text where needed.
  • The output goes into another SQL database.
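
A minimal sketch of this step, assuming the main filing is the first `<DOCUMENT>` block of the complete submission file and that its body sits between `<TEXT>` and `</TEXT>`. The helper names `extract_main_document` and `html_to_text` are hypothetical, and the tag stripping is a crude regex approach rather than whatever extract_body_form.R actually does.

```r
# Return the body of the first <DOCUMENT> block in a complete submission file.
extract_main_document <- function(submission) {
  doc <- regmatches(
    submission,
    regexpr("(?s)<DOCUMENT>.*?</DOCUMENT>", submission, perl = TRUE)
  )
  if (length(doc) == 0) return(NA_character_)
  body <- regmatches(
    doc,
    regexpr("(?s)(?<=<TEXT>).*?(?=</TEXT>)", doc, perl = TRUE)
  )
  if (length(body) == 0) NA_character_ else body
}

# Crude HTML-to-text conversion: drop scripts and styles, turn block-level
# closing tags into line breaks, strip the remaining tags, decode two entities.
html_to_text <- function(x) {
  x <- gsub("(?is)<(script|style)[^>]*>.*?</\\1>", " ", x, perl = TRUE)
  x <- gsub("(?i)<br\\s*/?>|</p>|</div>|</tr>", "\n", x, perl = TRUE)
  x <- gsub("<[^>]+>", " ", x)
  x <- gsub("&nbsp;|&#160;", " ", x, ignore.case = TRUE)
  x <- gsub("&amp;", "&", x, fixed = TRUE)
  x <- gsub("[ \t]+", " ", x)
  trimws(x)
}

# Usage with a made-up submission file:
submission <- paste(c(
  "<DOCUMENT>", "<TYPE>SC 13G", "<TEXT>",
  "<html><body><p>CUSIP No. 037833100</p></body></html>",
  "</TEXT>", "</DOCUMENT>",
  "<DOCUMENT>", "<TYPE>EX-99", "<TEXT>", "exhibit text", "</TEXT>", "</DOCUMENT>"
), collapse = "\n")
main_text <- html_to_text(extract_main_document(submission))
```

For heavily formatted filings a proper HTML parser would be more reliable than regular expressions; the regex version is only meant to show the shape of the step.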

Step 3. Parse SEC Header.

  • pasing_SEC_header.R extracts filer and subject information from the form.
  • This script can also be used to extract data from other forms.
  • I have a blog entry about this function (https://orhahog.wordpress.com/2016/11/26/parsing-sec-header/).
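
The idea can be sketched as follows, assuming the usual EDGAR header layout in which `SUBJECT COMPANY:` and `FILED BY:` each introduce a section containing `COMPANY CONFORMED NAME:` and `CENTRAL INDEX KEY:` lines. The function names are hypothetical, and the sketch reads only the first section of each kind, so filings with several filers would need a loop.

```r
# Text from a section label up to the next top-level section or the header end.
header_section <- function(header, label) {
  pattern <- paste0("(?s)", label,
                    ":(.*?)(?=SUBJECT COMPANY:|FILED BY:|</SEC-HEADER>|$)")
  m <- regmatches(header, regexpr(pattern, header, perl = TRUE))
  if (length(m) == 0) NA_character_ else m
}

# Value that follows "FIELD:" on the same line within a section.
header_field <- function(section, field) {
  if (is.na(section)) return(NA_character_)
  pattern <- paste0(field, ":[ \t]*([^\r\n]+)")
  m <- regmatches(section, regexec(pattern, section))[[1]]
  if (length(m) < 2) NA_character_ else trimws(m[2])
}

# Name and CIK for the subject company and the (first) filer.
parse_sec_header <- function(header) {
  roles <- c(subject = "SUBJECT COMPANY", filer = "FILED BY")
  lapply(roles, function(label) {
    section <- header_section(header, label)
    c(name = header_field(section, "COMPANY CONFORMED NAME"),
      cik  = header_field(section, "CENTRAL INDEX KEY"))
  })
}

# Usage with a made-up header:
header <- paste(c(
  "<SEC-HEADER>",
  "CONFORMED SUBMISSION TYPE:\tSC 13G",
  "SUBJECT COMPANY:", "",
  "\tCOMPANY DATA:",
  "\t\tCOMPANY CONFORMED NAME:\t\t\tACME CORP",
  "\t\tCENTRAL INDEX KEY:\t\t\t0000001000",
  "FILED BY:", "",
  "\tCOMPANY DATA:",
  "\t\tCOMPANY CONFORMED NAME:\t\t\tBIG FUND LP",
  "\t\tCENTRAL INDEX KEY:\t\t\t0000002000",
  "</SEC-HEADER>"
), collapse = "\n")
parsed <- parse_sec_header(header)
```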

Step 4. Extract CUSIP from the filings.

  • extract_CUSIP.R returns six- and eight-digit CUSIPs from SEC filings.
  • The output of this step is a CIK-CUSIP map, which can be downloaded in .csv format from my website (www.evolkova.info).
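
One possible way to approach this step is sketched below. It assumes the nine-character CUSIP follows a "CUSIP" label in the text (many filings instead print the number above a "(CUSIP Number)" caption, which this sketch does not handle) and it adds a standard CUSIP check-digit test to discard false matches. The function names are hypothetical, and the check-digit filter is an addition for illustration, not necessarily part of extract_CUSIP.R.

```r
# Check digit for the first eight characters of a CUSIP.
cusip_check_digit <- function(base8) {
  chars <- strsplit(toupper(base8), "")[[1]]
  # Digits map to 0-9, letters to 10-35, and * @ # to 36-38
  values <- match(chars, c(0:9, LETTERS, "*", "@", "#")) - 1
  # Double every second character, then add up the decimal digits
  doubled <- values * rep(c(1, 2), 4)
  (10 - sum(doubled %/% 10 + doubled %% 10) %% 10) %% 10
}

# Six- and eight-character CUSIP from the first "CUSIP ..." label in the text.
extract_cusip <- function(text) {
  missing <- c(cusip6 = NA_character_, cusip8 = NA_character_)
  txt <- toupper(text)
  pattern <- paste0(
    "CUSIP[^0-9A-Z]{0,5}(NO|NUMBER)?[^0-9A-Z]{0,5}",
    "([0-9A-Z]{6}[ -]?[0-9A-Z]{2}[ -]?[0-9])"
  )
  m <- regmatches(txt, regexec(pattern, txt))[[1]]
  if (length(m) == 0) return(missing)
  cusip9 <- gsub("[ -]", "", m[3])
  expected <- cusip_check_digit(substr(cusip9, 1, 8))
  if (expected != as.integer(substr(cusip9, 9, 9))) return(missing)
  c(cusip6 = substr(cusip9, 1, 6), cusip8 = substr(cusip9, 1, 8))
}

# Usage:
found <- extract_cusip("Common Stock\nCUSIP No. 037833 10 0\n")
```

The six-character prefix identifies the issuer and the eight-character prefix identifies the issue, which is why both are kept for matching to other datasets.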

Step 5. Extract size of the block position.

  • parsing_prc_position.R extracts the aggregate block size from the filing.
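
As a hedged sketch of how such a parser might look, the code below searches for the cover-page row "PERCENT OF CLASS REPRESENTED BY AMOUNT IN ROW" and reads the first percentage that follows it. The function name `extract_block_percent` and the 200-character search window are assumptions for illustration; the real parsing_prc_position.R may use different rules, for example for filings that omit the "%" sign.

```r
# Percentages reported after each "PERCENT OF CLASS ..." cover-page row.
extract_block_percent <- function(text) {
  # Normalize case and whitespace so labels wrapped across lines still match
  txt <- gsub("[[:space:]]+", " ", toupper(text))
  number <- "[0-9]{1,3}(?:\\.[0-9]+)?"
  # Assumption: the value appears within 200 characters of the label
  pattern <- paste0(
    "PERCENT OF CLASS REPRESENTED BY AMOUNT IN ROW.{0,200}?", number, " *%"
  )
  hits <- regmatches(txt, gregexpr(pattern, txt, perl = TRUE))[[1]]
  # Keep only the number that immediately precedes the percent sign
  as.numeric(sub(paste0("^.*?(", number, ") *%$"), "\\1", hits, perl = TRUE))
}

# Usage with a made-up cover page fragment:
cover <- paste(
  "11. PERCENT OF CLASS REPRESENTED BY AMOUNT",
  "    IN ROW (9)",
  "",
  "    7.25%",
  "12. TYPE OF REPORTING PERSON",
  "    IA",
  sep = "\n"
)
pct <- extract_block_percent(cover)
```

A filing with several reporting persons has several cover pages, so the function returns one value per matched row; how those values are combined into a single block size is left open here.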

Step 6. Extract identity of blockholders.

  • parsing_block_identity.R extracts the identity of blockholders from the information reported in question 12.
  • The list of all identities is available here (https://www.law.cornell.edu/cfr/text/17/240.13d-102).
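
To illustrate, the sketch below looks for the "TYPE OF REPORTING PERSON" row and maps the two-letter code that follows it to a label. The code table is transcribed from the Schedule 13G cover page instructions and should be checked against the linked rule; the function name and the 100-character search window are assumptions, and only the first code per row is returned even though filings may list several.

```r
# Two-letter codes for the type of reporting person (verify against the rule).
reporting_person_types <- c(
  BD = "Broker dealer",
  BK = "Bank",
  IC = "Insurance company",
  IV = "Investment company",
  IA = "Investment adviser",
  EP = "Employee benefit plan or endowment fund",
  HC = "Parent holding company or control person",
  SA = "Savings association",
  CP = "Church plan",
  CO = "Corporation",
  PN = "Partnership",
  IN = "Individual",
  FI = "Non-U.S. institution",
  OO = "Other"
)

# First identity code reported after each "TYPE OF REPORTING PERSON" row.
extract_reporting_types <- function(text) {
  txt <- gsub("[[:space:]]+", " ", toupper(text))
  hits <- regmatches(
    txt, gregexpr("TYPE OF REPORTING PERSON.{0,100}", txt, perl = TRUE)
  )[[1]]
  # Drop the label and the "(SEE INSTRUCTIONS)" note before looking for codes
  windows <- sub("^TYPE OF REPORTING PERSON", "", hits)
  windows <- gsub("\\(SEE INSTRUCTIONS\\)", " ", windows)
  code_pattern <- paste0(
    "\\b(", paste(names(reporting_person_types), collapse = "|"), ")\\b"
  )
  codes <- regmatches(windows, regexpr(code_pattern, windows, perl = TRUE))
  data.frame(
    code = codes,
    identity = unname(reporting_person_types[codes]),
    stringsAsFactors = FALSE
  )
}

# Usage with a made-up cover page fragment:
identities <- extract_reporting_types(
  "12. TYPE OF REPORTING PERSON (See Instructions)\n\n    IA\n"
)
```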