pandas/CSV storage breaks on Py3.5

Open goodboy opened this issue 9 years ago • 0 comments

The py35-pandas test run contains a bunch of failures currently because of:

  • shmarry doesn't support unicode, as per the docs on multiprocessing.RawArray, so if we want to keep the shared numpy array stuff it would seem we have to use bytes.
  • pandas "encoding" (type coercing?) problems stem from pydata/pandas#9712: when a CSV data store is written, it keeps the b bytes prefix. This breaks round tripping (which is done implicitly when reading the entire contents of a DataStorer in mem + on disk), since pd.read_csv then parses the b as part of the data point.
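To make the second failure concrete, here is a minimal sketch of the round-trip problem. It uses the stdlib csv module as a stand-in rather than pandas or DataStorer, on the assumption that the effect is the same in kind: a bytes value gets stringified on write, so the b prefix comes back as part of the data. The `round_trip` helper and the sample row are hypothetical.

```python
import csv
import io


def round_trip(rows):
    """Write rows to an in-memory CSV, then read them back."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    buf.seek(0)
    return list(csv.reader(buf))


# A row holding a bytes value, as it would come out of a bytes-typed array.
rows = [[b"call-1", 0.25]]
result = round_trip(rows)

# The bytes repr, prefix and quotes included, is now part of the string.
print(result)  # [["b'call-1'", '0.25']]
```

The value read back is the 9 character string b'call-1' and not the original 6 byte value, so anything comparing in-memory data against on-disk data sees a mismatch.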

I have a feeling there might be some way to always hack around this, either using str.encode directly or somehow with to_csv, although it seems the latter is being battled in the issue above to little avail.
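One possible shape for such a hack, sketched with the stdlib csv module instead of pandas: decode every bytes value to str before it reaches the writer, so no b prefix is ever written. Note this goes in the bytes.decode direction on the write side rather than using str.encode; encoding would only be needed if the values have to go back into a bytes-typed array after reading. The `decode_bytes` and `write_rows` helpers and the utf-8 default are assumptions, not anything in switchy.

```python
import csv
import io


def decode_bytes(value, encoding="utf-8"):
    """Return str for bytes input; pass every other value through unchanged."""
    return value.decode(encoding) if isinstance(value, bytes) else value


def write_rows(rows):
    """Write rows to an in-memory CSV, decoding bytes values first."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    for row in rows:
        writer.writerow([decode_bytes(v) for v in row])
    return buf.getvalue()


text = write_rows([[b"call-1", 0.25]])
read_back = list(csv.reader(io.StringIO(text)))
print(read_back)  # [['call-1', '0.25']]

# Re-encode only where a bytes value is required again.
restored = read_back[0][0].encode("utf-8")
```

Whether the same pre-processing step fits cleanly in front of to_csv depends on how the frame is built, so treat this as a direction to try and not a confirmed fix.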

I made the decision to bring in #44 since I don't expect many people to jump onto py3.5 immediately anyway.

goodboy, Aug 29 '16 20:08