ParserError: Error tokenizing data. C error: Expected 1 fields in line 23, saw 46
When I try:
from get_all_tickers import get_tickers as gt
tickers = gt.get_tickers()
I get an error:
tickers = gt.get_tickers(NASDAQ=False)
ParserError Traceback (most recent call last)
c:\Users\Mislav\Documents\GitHub\stocksee\stocksee\ in
----> 36 tickers = gt.get_tickers(NASDAQ=False)
C:\ProgramData\Anaconda3\lib\site-packages\get_all_tickers\ in get_tickers(NYSE, NASDAQ, AMEX)
71 tickers_list = []
72 if NYSE:
---> 73 tickers_list.extend(__exchange2list('nyse'))
74 if NASDAQ:
75 tickers_list.extend(__exchange2list('nasdaq'))
C:\ProgramData\Anaconda3\lib\site-packages\get_all_tickers\ in __exchange2list(exchange)
137 def __exchange2list(exchange):
--> 138 df = __exchange2df(exchange)
139 # removes weird tickers
140 df_filtered = df[~df['Symbol'].str.contains("\.|\^")]
C:\ProgramData\Anaconda3\lib\site-packages\get_all_tickers\ in __exchange2df(exchange)
132 response = requests.get('', headers=headers, params=params(exchange))
133 data = io.StringIO(response.text)
--> 134 df = pd.read_csv(data, sep=",")
135 return df
ParserError: Error tokenizing data. C error: Expected 1 fields in line 23, saw 46
674 )
--> 676 return _read(filepath_or_buffer, kwds)
678 parser_f.__name__ = name
~\AppData\Roaming\Python\Python38\site-packages\pandas\io\ in _read(filepath_or_buffer, kwds)
453 try:
--> 454 data =
455 finally:
456 parser.close()
~\AppData\Roaming\Python\Python38\site-packages\pandas\io\ in read(self, nrows)
1131 def read(self, nrows=None):
1132 nrows = _validate_integer("nrows", nrows)
-> 1133 ret =
1135 # May alter columns / col_dict
~\AppData\Roaming\Python\Python38\site-packages\pandas\io\ in read(self, nrows)
2035 def read(self, nrows=None):
2036 try:
-> 2037 data =
2038 except StopIteration:
2039 if self._first_chunk:
pandas\_libs\parsers.pyx in
pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._read_low_memory()
pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._read_rows()
pandas\_libs\parsers.pyx in pandas._libs.parsers.TextReader._tokenize_rows()
pandas\_libs\parsers.pyx in pandas._libs.parsers.raise_parser_error()
ParserError: Error tokenizing data. C error: Expected 1 fields in line 23, saw 46
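For anyone trying to read the message: pandas takes the expected column count from the first line it parses, so "Expected 1 fields in line 23, saw 46" means the first line held a single field and line 23 held 46. That usually suggests the response body was not the CSV the package expected. The sketch below uses only the standard library to find the first line with more fields than the first line; `first_field_mismatch` and the sample text are made up for illustration and are not part of pandas or get_all_tickers.

```python
import csv
import io


def first_field_mismatch(text):
    """Return (line_number, expected, seen) for the first row that has more
    fields than the first row, or None if every row fits."""
    reader = csv.reader(io.StringIO(text))
    expected = None
    for row in reader:
        if expected is None:
            # The first row sets the expected number of fields.
            expected = len(row)
        elif len(row) > expected:
            return reader.line_num, expected, len(row)
    return None


# Hypothetical response body: an HTML page instead of CSV.
sample = "<!DOCTYPE html>\n<html>\n<p>x, y, z</p>\n"
print(first_field_mismatch(sample))  # (3, 1, 3)
```

Running it on `response.text` before calling `pd.read_csv` shows quickly whether the server returned something other than CSV.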
I am getting the same error. Let's hope it gets fixed. @shilewenuw
I am seeing this same issue. I really hope it gets fixed soon!
I believe the Nasdaq API was updated, so the old URL is no longer available.
The following is a quick implementation of the new API.
import requests
import pandas as pd

headers = {
    'authority': '',
    'accept': 'application/json, text/plain, */*',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36',
    'origin': '',
    'sec-fetch-site': 'same-site',
    'sec-fetch-mode': 'cors',
    'sec-fetch-dest': 'empty',
    'referer': '',
    'accept-language': 'en-US,en;q=0.9',
}

params = (
    ('tableonly', 'true'),
    ('limit', '25'),
    ('offset', '0'),
    ('download', 'true'),
)

r = requests.get('', headers=headers, params=params)
data = r.json()['data']
df = pd.DataFrame(data['rows'], columns=data['headers'])
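To see what the last two lines do without hitting the network, here is a sketch that runs them on a hand-written payload. The payload shape (a `data` object holding a `headers` mapping and a list of row dicts) is an assumption inferred from the snippet, and all sample values are invented.

```python
import pandas as pd

# Hypothetical payload standing in for r.json(); values are invented.
payload = {
    "data": {
        "headers": {"symbol": "Symbol", "name": "Name", "marketCap": "Market Cap"},
        "rows": [
            {"symbol": "AAA", "name": "Alpha Corp", "marketCap": "1000000"},
            {"symbol": "BBB", "name": "Beta Inc", "marketCap": "2500000"},
        ],
    }
}

data = payload["data"]
# list(...) makes it explicit that only the header keys become column names.
df = pd.DataFrame(data["rows"], columns=list(data["headers"]))
print(df.columns.tolist())  # ['symbol', 'name', 'marketCap']
```

Under this assumption the column names come out lower case, so code written against the old CSV columns would need adjusting.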
Hi, does anyone have a workaround for this, please?
I made a quick-and-dirty fix. Filtering by mktcap_min, mktcap_max and sectors works for me. I didn't test regions. GitHub doesn't allow me to upload a .py file, so you need to remove the '.txt' ending of this one and replace the corresponding file in the package. Thanks to @Possums for the basics!
I have the same issue.
@ErlerPhilipp Great! The standard way to suggest code changes is to create a pull request, so that might be why you can't upload a .py file (it's slightly more work, though).
@krikru I know but I decided against it, for now, because it's mostly untested changes and really dirty ;) If I have time, I'll create a PR.
Hi, any updates? I am still getting this error
@ErlerPhilipp getting the following error with the new
File "/anaconda3/lib/python3.7/site-packages/get_all_tickers/", line 92, in get_tickers
    tickers_list.extend(__exchange2list('nyse'))
File "/anaconda3/lib/python3.7/site-packages/get_all_tickers/", line 162, in __exchange2list
    df_filtered = df[~df['Symbol'].str.contains(".|^")]
File "/anaconda3/lib/python3.7/site-packages/pandas/core/", line 2688, in __getitem__
    return self._getitem_column(key)
File "/anaconda3/lib/python3.7/site-packages/pandas/core/", line 2695, in _getitem_column
    return self._get_item_cache(key)
File "/anaconda3/lib/python3.7/site-packages/pandas/core/", line 2489, in _get_item_cache
    values = self._data.get(item)
File "/anaconda3/lib/python3.7/site-packages/pandas/core/", line 4115, in get
    loc = self.items.get_loc(item)
File "/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/", line 3080, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas/_libs/index.pyx", line 140, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/index.pyx", line 162, in pandas._libs.index.IndexEngine.get_loc
File "pandas/_libs/hashtable_class_helper.pxi", line 1492, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas/_libs/hashtable_class_helper.pxi", line 1500, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Symbol'
@patroucheva, You can fix the issue by changing 'Symbol' to 'symbol' on lines 162 and 163
Fix and full output tickers.xlsx
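If you would rather not depend on the exact capitalisation, one option (a sketch, not what the package does) is to lower-case the column names once and then apply the same symbol filter. The DataFrame contents here are invented.

```python
import pandas as pd

# Invented sample; in the package the frame comes from __exchange2df().
df = pd.DataFrame({"Symbol": ["AAA", "BBB.A", "CCC^D", "DDD"]})

# Normalise column names so both 'Symbol' and 'symbol' end up as 'symbol'.
df.columns = df.columns.str.lower()

# Same idea as the package's filter: drop symbols containing '.' or '^'.
mask = df["symbol"].str.contains(r"\.|\^")
tickers = df.loc[~mask, "symbol"].tolist()
print(tickers)  # ['AAA', 'DDD']
```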
After making the change, what were the next steps for getting the package to update? I've tried making the swap and reinstalling with the new file, but when trying to use the package in script it still renders the same error.
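One thing worth checking (a guess, based on the first traceback in this thread showing packages under two different site-packages directories) is whether the file you edited is the one Python actually imports. A small standard-library sketch; `locate_module` is a made-up helper name.

```python
import importlib.util


def locate_module(name):
    """Return the path of the file Python would import for `name`, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


# Shown with a standard-library package; pass "get_all_tickers" to check the
# installed copy on your own machine.
print(locate_module("json"))
```

If the printed path is not the file you replaced, edit that one instead, then restart the interpreter or notebook kernel so the module is imported afresh.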
There is a small typo on the following line:
r = requests.get('', headers=headers, params=params)'
Simply remove the extra quotation mark at the end. This works @Possums
I updated the code by DimaDDM so that it can fetch individual exchanges rather than returning all symbols. get_tickers_by_region is also updated but untested.
@DimaDDM I just copied and pasted that file into
Could you open a PR for it?
Has anyone managed to get get_tickers_by_region working? I can only seem to get it to return US tickers instead of the region specified.
Hey @here, it doesn't look like the author cares much about this. I'd suggest that somebody who does, and who doesn't want to just use the script provided above, create their own repo or fork this one with the appropriate modifications.
There's no point in creating PRs if the author is MIA.
Hey, the "Symbol" column name should be all lower case: "symbol".
Copy and paste this code over the corresponding file in your installed library (search your PC to find where it is). Cheers.
import pandas as pd
from enum import Enum
import io
import requests

_EXCHANGE_LIST = ['nyse', 'nasdaq', 'amex']

_SECTORS_LIST = set(['Consumer Non-Durables', 'Capital Goods', 'Health Care',
                     'Energy', 'Technology', 'Basic Industries', 'Finance',
                     'Consumer Services', 'Public Utilities', 'Miscellaneous',
                     'Consumer Durables', 'Transportation'])

headers = {
    'authority': '',
    'accept': 'application/json, text/plain, */*',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36',
    'origin': '',
    'sec-fetch-site': 'same-site',
    'sec-fetch-mode': 'cors',
    'sec-fetch-dest': 'empty',
    'referer': '',
    'accept-language': 'en-US,en;q=0.9',
}


def params(exchange):
    return (
        ('letter', '0'),
        ('exchange', exchange),
        ('render', 'download'),
    )


# Note: this tuple rebinds the name `params` and shadows the function above;
# __exchange2df() below passes this tuple to requests.get().
params = (
    ('tableonly', 'true'),
    ('limit', '25'),
    ('offset', '0'),
    ('download', 'true'),
)


def params_region(region):
    return (
        ('letter', '0'),
        ('region', region),
        ('render', 'download'),
    )


class Region(Enum):
    # The member definitions are missing from this post; the usage examples at
    # the bottom refer to Region.AFRICA, Region.ASIA and Region.EUROPE.
    pass


class SectorConstants:
    NON_DURABLE_GOODS = 'Consumer Non-Durables'
    CAPITAL_GOODS = 'Capital Goods'
    HEALTH_CARE = 'Health Care'
    ENERGY = 'Energy'
    TECH = 'Technology'
    BASICS = 'Basic Industries'
    FINANCE = 'Finance'
    SERVICES = 'Consumer Services'
    UTILITIES = 'Public Utilities'
    DURABLE_GOODS = 'Consumer Durables'
    TRANSPORT = 'Transportation'


def get_tickers(NYSE=True, NASDAQ=True, AMEX=True):
    tickers_list = []
    if NYSE:
        tickers_list.extend(__exchange2list('nyse'))
    if NASDAQ:
        tickers_list.extend(__exchange2list('nasdaq'))
    if AMEX:
        tickers_list.extend(__exchange2list('amex'))
    return tickers_list


def get_tickers_filtered(mktcap_min=None, mktcap_max=None, sectors=None):
    tickers_list = []
    for exchange in _EXCHANGE_LIST:
        tickers_list.extend(
            __exchange2list_filtered(exchange, mktcap_min=mktcap_min,
                                     mktcap_max=mktcap_max, sectors=sectors))
    return tickers_list


def get_biggest_n_tickers(top_n, sectors=None):
    df = pd.DataFrame()
    for exchange in _EXCHANGE_LIST:
        temp = __exchange2df(exchange)
        df = pd.concat([df, temp])

    df = df.dropna(subset=['marketCap'])
    # the new API returns lower-case column names ('symbol', 'sector')
    df = df[~df['symbol'].str.contains(r"\.|\^")]

    if sectors is not None:
        if isinstance(sectors, str):
            sectors = [sectors]
        if not _SECTORS_LIST.issuperset(set(sectors)):
            raise ValueError('Some sectors included are invalid')
        sector_filter = df['sector'].apply(lambda x: x in sectors)
        df = df[sector_filter]

    # converts a market cap string to a float in millions of dollars
    def cust_filter(mkt_cap):
        if 'M' in mkt_cap:
            return float(mkt_cap[1:-1])
        elif 'B' in mkt_cap:
            return float(mkt_cap[1:-1]) * 1000
        elif mkt_cap == '':
            # same empty-string guard as in __exchange2list_filtered
            return 0.0
        return float(mkt_cap[1:]) / 1e6

    df['marketCap'] = df['marketCap'].apply(cust_filter)
    df = df.sort_values('marketCap', ascending=False)
    if top_n > len(df):
        raise ValueError('Not enough companies, please specify a smaller top_n')
    return df.iloc[:top_n]['symbol'].tolist()


def get_tickers_by_region(region):
    if region in Region:
        response = requests.get('', headers=headers,
                                params=params_region(region))
        data = io.StringIO(response.text)
        df = pd.read_csv(data, sep=",")
        # untested: __exchange2list() ignores this DataFrame and fetches again
        # through __exchange2df()
        return __exchange2list(df)
    else:
        raise ValueError('Please enter a valid region (use a Region.REGION as the argument, e.g. Region.AFRICA)')


def __exchange2df(exchange):
    # the `exchange` argument is not used; the module-level `params` tuple is sent
    r = requests.get('', headers=headers, params=params)
    data = r.json()['data']
    df = pd.DataFrame(data['rows'], columns=data['headers'])
    return df


def __exchange2list(exchange):
    df = __exchange2df(exchange)
    # removes weird tickers (symbols containing '.' or '^')
    df_filtered = df[~df['symbol'].str.contains(r"\.|\^")]
    return df_filtered['symbol'].tolist()


def __exchange2list_filtered(exchange, mktcap_min=None, mktcap_max=None, sectors=None):
    df = __exchange2df(exchange)
    df = df.dropna(subset=['marketCap'])
    df = df[~df['symbol'].str.contains(r"\.|\^")]

    if sectors is not None:
        if isinstance(sectors, str):
            sectors = [sectors]
        if not _SECTORS_LIST.issuperset(set(sectors)):
            raise ValueError('Some sectors included are invalid')
        sector_filter = df['sector'].apply(lambda x: x in sectors)
        df = df[sector_filter]

    # converts a market cap string to a float in millions of dollars
    def cust_filter(mkt_cap):
        if 'M' in mkt_cap:
            return float(mkt_cap[1:-1])
        elif 'B' in mkt_cap:
            return float(mkt_cap[1:-1]) * 1000
        elif mkt_cap == '':
            return 0.0
        return float(mkt_cap[1:]) / 1e6

    df['marketCap'] = df['marketCap'].apply(cust_filter)
    if mktcap_min is not None:
        df = df[df['marketCap'] > mktcap_min]
    if mktcap_max is not None:
        df = df[df['marketCap'] < mktcap_max]
    return df['symbol'].tolist()


def save_tickers(NYSE=True, NASDAQ=True, AMEX=True, filename='tickers.csv'):
    tickers2save = get_tickers(NYSE, NASDAQ, AMEX)
    df = pd.DataFrame(tickers2save)
    df.to_csv(filename, header=False, index=False)


def save_tickers_by_region(region, filename='tickers_by_region.csv'):
    tickers2save = get_tickers_by_region(region)
    df = pd.DataFrame(tickers2save)
    df.to_csv(filename, header=False, index=False)


if __name__ == '__main__':
    tickers = get_tickers()

    tickers = get_tickers(AMEX=False)

    # default filename is tickers.csv, to specify, add argument filename='yourfilename.csv'
    save_tickers()

    # save tickers from NYSE and AMEX only
    save_tickers(NASDAQ=False)

    # get tickers from Asia
    tickers_asia = get_tickers_by_region(Region.ASIA)

    # save tickers from Europe
    save_tickers_by_region(Region.EUROPE, filename='EU_tickers.csv')

    # get tickers filtered by market cap (in millions)
    filtered_tickers = get_tickers_filtered(mktcap_min=500, mktcap_max=2000)

    # not setting max will get stocks with $2000 million market cap and up.
    filtered_tickers = get_tickers_filtered(mktcap_min=2000)

    # get tickers filtered by sector
    filtered_by_sector = get_tickers_filtered(mktcap_min=200e3, sectors=SectorConstants.FINANCE)

    # get tickers of 5 largest companies by market cap (specify sectors=SECTOR)
    top_5 = get_biggest_n_tickers(5)
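The market-cap conversion in `cust_filter` is the part most likely to trip on unexpected input, so here is a standalone sketch of the same idea that can be tried without the network. It assumes values look like '$1.5B', '$250M', a plain amount such as '1200000', or an empty string; `mktcap_to_millions` is a hypothetical name, and the formats are inferred from the code in this thread rather than from any API documentation.

```python
def mktcap_to_millions(mkt_cap):
    """Convert a market-cap string to a float in millions of dollars."""
    text = mkt_cap.strip().lstrip("$")
    if text == "":
        # Missing value: treat as zero, as the filtered variant above does.
        return 0.0
    if text.endswith("B"):
        return float(text[:-1]) * 1000
    if text.endswith("M"):
        return float(text[:-1])
    # No suffix: assume a plain dollar amount.
    return float(text) / 1e6


print(mktcap_to_millions("$1.5B"))    # 1500.0
print(mktcap_to_millions("$250M"))    # 250.0
print(mktcap_to_millions("1200000"))  # 1.2
```

Unlike the slice-based version, this only drops a leading character when it is actually a '$', so plain numeric strings are not truncated.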
Same issue
Exception has occurred: ParserError
Error tokenizing data. C error: Expected 1 fields in line 6, saw 47
File "C:\dev\Python\stocks\", line 9, in <module>
list_of_tickers = gt.get_tickers_filtered(mktcap_max=1)
@justege: looks like lines 97 and 120 should be lower case 'symbol' vs 'Symbol', as found by @JaisinhBhosale9712 in the comment just before yours.