bikeshare
Use HDF5
If you store the data in HDF5, the whole dataset becomes much smaller. I have 5 months of minute-wise docks-available data for Citi Bike stored in ~15 MB. The original JSON files are 25 GB. If you put the core data structures into HDF5, you can probably throw them up on S3.
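To give a rough idea of what that could look like, here is a minimal sketch using h5py and NumPy. The layout (one row per minute, one column per station), the dataset names `timestamps` and `docks_available`, the helper names, and the synthetic data are all assumptions for illustration, not the schema I actually use. Actual compression ratios will depend on the real data.

```python
import os
import tempfile

import h5py
import numpy as np


def write_docks(path, timestamps, docks):
    """Store a (minutes x stations) array of docks-available counts.

    Dataset names are hypothetical; pick whatever suits the project.
    """
    with h5py.File(path, "w") as f:
        f.create_dataset("timestamps", data=timestamps, compression="gzip")
        f.create_dataset(
            "docks_available",
            data=docks,
            compression="gzip",
            compression_opts=9,  # highest gzip level: slower writes, smaller file
        )


def read_docks(path):
    """Load both arrays back into memory."""
    with h5py.File(path, "r") as f:
        return f["timestamps"][:], f["docks_available"][:]


# Synthetic stand-in data: one day at one-minute resolution for 300 stations.
rng = np.random.default_rng(0)
minutes, stations = 24 * 60, 300
timestamps = 1_370_000_000 + 60 * np.arange(minutes, dtype=np.int64)
steps = rng.integers(-1, 2, size=(minutes, stations))  # each step is -1, 0, or +1
docks = np.clip(20 + np.cumsum(steps, axis=0), 0, 40).astype(np.uint8)

path = os.path.join(tempfile.mkdtemp(), "docks.h5")
write_docks(path, timestamps, docks)
ts_back, docks_back = read_docks(path)
print(os.path.getsize(path), "bytes on disk for", docks.nbytes, "raw bytes")
```

Storing the counts as small integers in one dense array, rather than as repeated JSON records, is where most of the savings would come from; gzip compression then helps further.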
If you remove the PostgreSQL dependency, it will be much easier for other people to start hacking on it.