factorsSPGMI and stocksCRSP incorrect sector assignments
A subset of securities has incorrect Sector and GICS data. This appears to be the result of human error when the data sets were created. My best guess is that S&P used the ticker (or TickerLast) to match CRSP data to their data sets, and a subset of securities was mismatched because of ticker recycling or similar issues.
This can be seen in the file stocksTickers310GICSgovindSPGMI.xlsx, in the Sandbox, which has incorrect mappings for several securities (see row 280, ticker STJ, where "St Jude Medical Inc" has been matched to "St James's Place plc", for example).
After the original data set was created, however, someone appears to have manually cleaned up a few securities. I identified 11 securities with issues in the original stocksTickers310GICSgovindSPGMI.xlsx, but a few of them are now correct in stocksCRSP and factorsSPGMI, so they were presumably fixed later.
Note that both the Sector name and the GICS number appear to be incorrect in some cases.
Fixing this will give cleaner data in the FactorAnalytics package, make it easier to merge the data sets with vendor sources later, and allow additional issues with factorsSPGMI and stocksCRSP to be resolved. The following table lists the securities whose data remain uncorrected.
| TickerLast | Assigned Sector | Correct Sector | Assigned GICS | Correct GICS |
|---|---|---|---|---|
| AVP | Industrials | Consumer Staples | 20101010 | 30302010 |
| CSH | Information Technology | Financials | 45103010 | 40202010 |
| CTS | Information Technology | Information Technology | 45102010 | 45203020 |
| PIR | Financials | Consumer Discretionary | 40301040 | 25504060 |
| RTN | Consumer Discretionary | Industrials | 25301040 | 20101010 |
| STJ | Financials | Healthcare | 40203010 | 35101010 |
| TSS | Industrials | Information Technology | 20102010 | 45102020 |
This fix interacts with issue #73: once the sectors are corrected, the final sample will contain 1 real estate, ~1 financials, and ~3 utilities stocks, so ~5 securities will need to be replaced from the additional 10 securities. The solution to #73 will therefore also need to account for the financials name(s).
Edited for completeness.
Great catch @spinnj, yes, we should stamp out these errors!