qri
supplying configuration data to setup creates an erroneous user initID
repro on a fresh machine:
$ export QRI_CONFIG_SETUP_DATA=[config data as json string from another machine]
$ curl -fsSL https://qri.io/install.sh | sh
$ qri setup
$ qri pull me/dataset_I_have_already_pushed
If you look at the logbook data, you'll see two user-level branches with the same username, in this case qri_weather_bot:
$ qri logbook --summary
user rulxks4qbloc5hzs3brzuoyqyg6njijnqzcoxuioanl7wkt3whwa 1 qri_weather_bot
user r4lujn76cyxp7iczvggu7uekshoc55ruex7pfykuofyhq4pcn66a 1 qri_weather_bot
dataset 4mpnipa7qul3dgv3l65jllof56r73dnvjtayrakz5lkmltgb2tna 1 brooklyn
branch wo6tryl2ripczqv7mf4gbc65by5uzsiw4kn2kxd3lshou6dsu62a 3 main
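A quick way to spot this state is to group the `user` rows of the summary by username and flag any name with more than one ID. The following is a minimal sketch, not part of qri: the helper `duplicateUserLogs` is hypothetical, and it assumes each row has the four whitespace-separated columns shown above (kind, ID, a number, name).

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// duplicateUserLogs scans summary text and returns, for every username that
// appears on more than one "user" row, the IDs attached to that name.
// Assumed row shape (taken from the output above): <kind> <id> <number> <name>.
func duplicateUserLogs(summary string) map[string][]string {
	byName := map[string][]string{}
	sc := bufio.NewScanner(strings.NewReader(summary))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 4 || fields[0] != "user" {
			continue // skip dataset rows, branch rows, and anything malformed
		}
		name, id := fields[3], fields[1]
		byName[name] = append(byName[name], id)
	}
	// Keep only usernames that map to two or more user logs.
	for name, ids := range byName {
		if len(ids) < 2 {
			delete(byName, name)
		}
	}
	return byName
}

func main() {
	summary := `user rulxks4qbloc5hzs3brzuoyqyg6njijnqzcoxuioanl7wkt3whwa 1 qri_weather_bot
user r4lujn76cyxp7iczvggu7uekshoc55ruex7pfykuofyhq4pcn66a 1 qri_weather_bot
dataset 4mpnipa7qul3dgv3l65jllof56r73dnvjtayrakz5lkmltgb2tna 1 brooklyn
branch wo6tryl2ripczqv7mf4gbc65by5uzsiw4kn2kxd3lshou6dsu62a 3 main`
	for name, ids := range duplicateUserLogs(summary) {
		fmt.Printf("username %q has %d user logs: %v\n", name, len(ids), ids)
	}
}
```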
Steps to Fix
- add a new field to `config.Profile` called `InitID` that will hold the init id of the user
- add a function `migrations.SetProfileInitIDIfEmpty` to the config migrations package that takes a `*logbook.Book` and sets-then-writes the config, call it at the end of `lib.NewInstance`
- adjust logbook to no longer auto-init on construction:
  - make `logbook.Initialize` an exported function
  - adjust the logbook constructor to accept a `userInitID` instead of a username
  - remove
  - make
- call `logbook.Initialize` within `lib.Setup` if `cfg.Profile.InitID` is empty
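The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not qri's actual code: `Profile`, `Config`, and `Book` are stand-in types, the signatures of `NewBook`, `Initialize`, `Setup`, and `SetProfileInitIDIfEmpty` are guesses (the `cfg` and `write` parameters in particular are invented for the sketch), and the random ID only stands in for however the real init ID is derived.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
)

// Profile stands in for config.Profile. InitID is the proposed new field.
type Profile struct {
	Peername string
	InitID   string
}

// Config stands in for the qri config; only the profile matters here.
type Config struct {
	Profile *Profile
}

// Book stands in for logbook.Book. It does not initialize itself on
// construction; it only records which user log it is bound to.
type Book struct {
	userInitID string
}

// NewBook accepts a user init ID instead of a username (assumed signature).
func NewBook(userInitID string) *Book { return &Book{userInitID: userInitID} }

// UserInitID reports the ID of the user log this book is bound to.
func (b *Book) UserInitID() string { return b.userInitID }

// Initialize creates a new user log and returns its ID. The random ID is a
// placeholder; the real derivation is not described in this issue.
func (b *Book) Initialize(username string) (string, error) {
	if username == "" {
		return "", errors.New("username is required")
	}
	buf := make([]byte, 16)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	b.userInitID = hex.EncodeToString(buf)
	return b.userInitID, nil
}

// Setup initializes the logbook only when the config carries no InitID, so
// config data copied from another machine reuses the existing user log.
func Setup(cfg *Config) (*Book, error) {
	if cfg == nil || cfg.Profile == nil {
		return nil, errors.New("config has no profile")
	}
	book := NewBook(cfg.Profile.InitID)
	if cfg.Profile.InitID == "" {
		id, err := book.Initialize(cfg.Profile.Peername)
		if err != nil {
			return nil, err
		}
		cfg.Profile.InitID = id
	}
	return book, nil
}

// SetProfileInitIDIfEmpty backfills InitID on configs written before the
// field existed, then persists the config through the write callback.
func SetProfileInitIDIfEmpty(cfg *Config, book *Book, write func(*Config) error) error {
	if cfg == nil || cfg.Profile == nil {
		return errors.New("config has no profile")
	}
	if cfg.Profile.InitID != "" || book.UserInitID() == "" {
		return nil // already set, or nothing to copy from the logbook
	}
	cfg.Profile.InitID = book.UserInitID()
	return write(cfg)
}

func main() {
	first := &Config{Profile: &Profile{Peername: "qri_weather_bot"}}
	if _, err := Setup(first); err != nil {
		panic(err)
	}
	// Second machine: config data copied from the first, InitID included.
	second := &Config{Profile: &Profile{Peername: "qri_weather_bot", InitID: first.Profile.InitID}}
	book, err := Setup(second)
	if err != nil {
		panic(err)
	}
	fmt.Println("same user log:", book.UserInitID() == first.Profile.InitID)
}
```

With the ID stored in the profile, a second machine that is set up from copied config data binds to the existing user log instead of creating a second one under the same username.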
I think you're right that this is the same problem as what's happening in https://github.com/qri-io/qri/issues/1483. A user with two different logbooks, attached to the same handle, pushed those to cloud, and now cloud is feeding modified UserLog data to clients that pull. You were saying this before but I didn't really understand, and my belief that it was due to cloud's compression of logbook data was completely wrong. Apologies.
Would very much prefer that we don't use "initID" here. So far, we've only used that term to talk about datasets, and there's value in keeping it unambiguously referring to datasets. Something else like "userCreateID" would be preferable.
Is it possible to also get into this state using qri registry prove? In either case, I think that when cloud establishes that a username has an existing identity, it needs to convey this by pushing down the UserLog's first op to the new machine so that the IDs match up.