pygtfs
Add module for exporting the data back out to csv
I've added a module to export the data back out to csv. It's not quite complete but I think I'll need some help with the final piece. I've got it looping through each GTFS table and creating the appropriate .txt file, but I can't figure out how to get the table headers and rows written to the csv file. It seems like it should be easy to do and I've been reading through the SQLAlchemy docs, but I'm just not getting it. Do you mind taking a look?
You can ignore the pull request for now. I can re-send once I have it all working properly.
P.S. I'm pretty new to all of this so if there's a better way to have you review what I'm doing, please suggest.
Also, please disregard the version change. I was playing around with things on my local copy and forgot to revert that line.
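For the header-and-rows part, here is a minimal sketch of the general pattern. It is not the pygtfs code: it uses the standard-library sqlite3 and csv modules instead of SQLAlchemy so that it runs on its own, and the function name export_table and the sample agency table are made up for illustration. With SQLAlchemy the column names would come from the table's columns rather than from cursor.description, but the shape is the same: write the column names once, then write each row.

```python
import csv
import io
import sqlite3


def export_table(conn, table_name, out):
    """Write one table to `out` as CSV: a header row, then one line per row."""
    # table_name is interpolated into the SQL, so it must come from a trusted list.
    cursor = conn.execute(f'SELECT * FROM "{table_name}"')
    writer = csv.writer(out, lineterminator="\n")
    # cursor.description holds one tuple per column; its first item is the name.
    writer.writerow([col[0] for col in cursor.description])
    for row in cursor:
        # NULL comes back as None; write it as an empty field.
        writer.writerow(["" if value is None else value for value in row])


# Hypothetical sample data, only to show the output shape.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE agency (agency_id TEXT, agency_name TEXT, agency_phone TEXT)")
conn.execute("INSERT INTO agency VALUES ('A1', 'Demo Transit', NULL)")

buffer = io.StringIO()
export_table(conn, "agency", buffer)
agency_csv = buffer.getvalue()
print(agency_csv)
```

In a real exporter, `out` would be the opened .txt file for that table (opened with newline="") instead of an in-memory buffer.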
That's a good start, but there is indeed no point in merging until it is more complete. The issues we'll need to solve are:
- A csv might be missing columns, which are then stored as NULL in the db. So currently the exporter cannot tell the difference between missing columns and missing fields. Are we okay with that?
- Some fields are "validated" or in fact converted when input to the database. We need to figure out a way to revert this conversion. This is vital, and needs to be done anyway. Perhaps we need to change something more basic in the architecture. For example, a binary is stored in the csv as 0 or 1.
Let's try to work on those issues, and keep the conversation here.
The only columns that could be overlooked are optional columns according to the spec, right? In other words, if our model defines a column as nullable, it might be left out. I think that should be OK as long as our definitions of nullable columns correspond to the optional columns in the spec.
The point was that we cannot tell the difference between a missing column and a column full of nulls. I guess we could just output empty columns.
Yeah, optional in specs <=> nullable in our model.
I'm OK with outputting empty columns.
As for the field validation, what are the challenges? I don't imagine the mechanics of converting between, for example, binary and 0/1 is too difficult. Is the real issue determining the best way to operationalize the conversion? For instance, do we store that information within the relevant class in gtfs_entities, or do we create a new type_conversion class to handle it?
Again, sorry for my ignorance - I'm still learning how to think correctly about these kinds of problems.
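To make the second option concrete, here is a rough sketch of a standalone conversion helper keyed on the Python type of the stored value. Everything in it (bool_to_csv, date_to_csv, EXPORTERS, to_csv_field) is a hypothetical name, not something that exists in pygtfs, and it assumes the import step turned 0/1 into booleans and YYYYMMDD strings into date objects.

```python
import datetime


def bool_to_csv(value):
    # GTFS writes boolean-like fields as 0 or 1.
    return "1" if value else "0"


def date_to_csv(value):
    # GTFS dates use the YYYYMMDD format.
    return value.strftime("%Y%m%d")


# Hypothetical registry: Python type of the stored value -> function
# that turns it back into the text form used in the csv.
EXPORTERS = {
    bool: bool_to_csv,
    datetime.date: date_to_csv,
}


def to_csv_field(value):
    """Reverse the import-time conversion for a single field."""
    if value is None:
        # NULL in the db becomes an empty field in the csv.
        return ""
    # Anything without a registered exporter falls back to str().
    return EXPORTERS.get(type(value), str)(value)


print(to_csv_field(True), to_csv_field(datetime.date(2014, 3, 9)))
```

The alternative is to keep each reverse conversion next to the column definition in gtfs_entities, which keeps import and export rules for a field in one place at the cost of touching every entity class.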
Just reading this and would like to understand the use case, but since this is 10 years old, I guess there is none anymore?