binance-public-data
Data consistency!!!
Indeed, it was a great idea to add a header to the aggTrades files for the um data starting from 2022-08-11.
But what is even greater in a database is to have data consistency, meaning that all elements have the same structure... BUT lol what can we do!
THEN EVEN MORE AWESOME: when you change something, announce it!
Sorry for the inconvenience. We will add the header to the existing zip files. It may take a while given the number of files that need to be updated.
Don't worry!
But really, in the future, if you modify the data structure it would be great if you could announce the modification in advance, just to let people implement safeguards. OK, my backtesting and data analysis pipelines definitely should have been more robust... and I learned how to detect with pandas whether there is a header or not :-).
Then just an idea like any other: what about using a better format than CSV, I don't know, like HDF5, or even raw binary? That would save a considerable amount of storage on your side, and avoid the need for data conversion on the client side.
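For anyone hitting the same problem, here is a minimal sketch of one way to do that detection, not necessarily the approach used above. It assumes an already extracted CSV file, and it assumes that a data row always starts with a numeric trade ID while a header row starts with text. The helper names and the column names in `AGG_TRADE_COLUMNS` are my own guesses for illustration, so check them against a file that has a header.

```python
# Column names ASSUMED for the um aggTrades files; verify against a real header.
AGG_TRADE_COLUMNS = [
    "agg_trade_id", "price", "quantity",
    "first_trade_id", "last_trade_id",
    "transact_time", "is_buyer_maker",
]


def has_header(first_line: str) -> bool:
    """Return True if the first field of the line cannot be parsed as a number."""
    first_field = first_line.split(",", 1)[0].strip()
    try:
        float(first_field)
    except ValueError:
        return True
    return False


def read_agg_trades(path):
    """Load an extracted aggTrades CSV, with or without a header row."""
    import pandas as pd  # imported here so has_header() works without pandas

    with open(path, "r", newline="") as f:
        first_line = f.readline()

    # header=0 together with names= replaces the header found in the file,
    # header=None treats the first row as data; both give the same columns.
    header = 0 if has_header(first_line) else None
    return pd.read_csv(path, header=header, names=AGG_TRADE_COLUMNS)
```

Calling `read_agg_trades("some-aggTrades-file.csv")` (a placeholder path) then returns the same column layout for old and new files.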
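To make the raw binary idea concrete, here is a rough sketch using only the Python standard library. The record layout is purely hypothetical (nothing Binance publishes), and the field order is assumed to match the aggTrades columns. Note that storing prices as 8-byte floats gives up the exact decimal text of the CSV, which is one trade-off of this kind of format.

```python
import struct

# HYPOTHETICAL fixed-width layout for one aggTrades row (little-endian, no padding):
# agg_trade_id, price, quantity, first_trade_id, last_trade_id,
# transact_time, is_buyer_maker
ROW = struct.Struct("<qddqqq?")


def pack_row(csv_line: str) -> bytes:
    """Convert one CSV data row (no header) into a fixed-width binary record."""
    f = csv_line.strip().split(",")
    return ROW.pack(
        int(f[0]), float(f[1]), float(f[2]),
        int(f[3]), int(f[4]), int(f[5]),
        f[6].strip().lower() == "true",
    )


def unpack_row(blob: bytes) -> tuple:
    """Convert one binary record back into a tuple of Python values."""
    return ROW.unpack(blob)
```

Every record has the same size (`ROW.size` bytes), so a reader can seek straight to row `n` without parsing anything before it.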
Thank you for your understanding. CSV is the most convenient way to store this data. It is easy to open and read manually on all platforms, and it is handled well in most popular languages. Python, for example, has the well-known pandas library, which reads CSV files without any problem. We may look into other storage formats for handling big data, but for now CSV will remain our main storage format.
When will the header be added to the other old CSV files?
For newly generated data, the header is already added. For old data, please check the FAQ about the header (https://www.binance.com/en/support/faq/how-to-download-historical-market-data-on-binance-5810ae42176b4770b880ce1f14932262)