pdk
Import should create db and frames
With the schema updates, running an import will fail unless dbs and frames are created first. This should be done transparently.
What are you proposing, @alanbernstein? Do you mean you want the import to create databases and frames?
We'll need to be careful with that (since it's something we essentially just moved away from). For example, the import would need to specify `ColumnLabel` and `RowLabel` in addition to `DB` and `Frame`.
Another issue is that the propagation of the `CreateDB` and `CreateFrame` events around the cluster is currently not implemented, and even the `memberlist` implementation I have in progress is only eventually consistent, so there's a chance the import could start sending bits to a node that is not yet aware of the database.
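To make the race concrete, here is a toy model of it. This is not Pilosa code: `Node`, `Cluster`, `create_db`, and `deliver_gossip` are hypothetical names invented for illustration, and the "gossip" is just a list of undelivered messages standing in for an eventually consistent broadcast.

```python
class Node:
    """Hypothetical cluster node that tracks which databases it knows about."""

    def __init__(self, name):
        self.name = name
        self.known_dbs = set()
        self.bits = []

    def import_bit(self, db, row, col):
        # A node rejects bits for a database it has not heard of yet.
        if db not in self.known_dbs:
            raise LookupError(f"{self.name}: unknown database {db!r}")
        self.bits.append((db, row, col))


class Cluster:
    """Hypothetical cluster whose schema changes propagate lazily."""

    def __init__(self, names):
        self.nodes = {n: Node(n) for n in names}
        self.pending = []  # (node name, db) messages not yet delivered

    def create_db(self, db, via):
        # The node that receives the request learns about the db at once;
        # every other node only learns once the gossip is delivered.
        self.nodes[via].known_dbs.add(db)
        for name in self.nodes:
            if name != via:
                self.pending.append((name, db))

    def deliver_gossip(self):
        # Stand-in for the eventual convergence of the cluster.
        for name, db in self.pending:
            self.nodes[name].known_dbs.add(db)
        self.pending.clear()
```

In this model, calling `create_db("taxi", via="node-a")` and then immediately calling `import_bit("taxi", ...)` on `node-b` raises `LookupError`; the same import succeeds once `deliver_gossip()` has run. That window is the one an import that creates its own schema would have to deal with.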
This comes from wanting to simplify a new user's first experience with Pilosa; I didn't think about it much beyond that. The taxi use case notebook has a short intro section that points to the Pilosa and PDK repos, to show how to get started (get Pilosa running, import data). A new user should be able to do all of this setup in a handful of steps, ideally by copying and pasting from the READMEs.
I'm suggesting that a user should be able to run `pdk taxi -d dbname -f filename`, and this command would handle everything necessary to get the import started. If an additional setup command had to be used to create the db and frame, that wouldn't be a problem. I thought anything more involved than that would appear unnecessarily complicated.
I've added `EnsureDatabaseExists` and `EnsureFrameExists` methods to the Java, Go and Node clients, which create the database or frame if it doesn't already exist. That helps with having simple and repeatable instructions for users. So, instead of using the `CreateDatabase` and `CreateFrame` instructions (which would work the first time they are used, but not in subsequent runs), it would be better to use `EnsureDatabaseExists` and `EnsureFrameExists` in our tutorials and use cases.