Feature requests for new dataset
We would like to move away from the old dataset structure in CPH for cQED measurements. I believe almost everything is in place, except for the plotting functionality that is currently implemented in `show_num`, and a metadata viewer.
The functions in question are:
- [x] Subtract average along each row or column
- [x] Transpose data (swap x and y axis on plot)
- [ ] Add multiple datasets to the same plot
- [x] Function to extract metadata from a dataset (print/view a readable format of an instrument snapshot).
I only had a brief look at plottr, so it is possible that some of these are already implemented, but I believe at least the first point is missing.
Also, is it clear from which repository we should be cloning plottr to get the latest updates?
@jenshnielsen @WilliamHPNielsen (@wpfff adding you to keep you in the loop).
- "Transpose data (swap axes)" is supported in plottr (you always need to select which parameters to plot against which)
- "Subtract average by row/column" has been added to plottr
- "Extracting metadata" from the new DataSet is available; try the following:

```python
import json  # don't forget to import this :)

snapshot_json = dataset.get_metadata('snapshot')
snapshot_dict = json.loads(snapshot_json)
# explore 'snapshot_dict'
```
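As a follow-up, the parsed dictionary can be pretty-printed for a human-readable view of the instrument snapshot. The `snapshot_dict` below is a made-up stand-in for the real parsed snapshot, just to make the sketch self-contained:

```python
import json

# Made-up stand-in for json.loads(dataset.get_metadata('snapshot'))
snapshot_dict = {
    "station": {
        "instruments": {
            "alazar": {"parameters": {"sample_rate": {"value": 1_000_000}}}
        }
    }
}

# indent=2 gives a readable, nested view of the snapshot
readable = json.dumps(snapshot_dict, indent=2)
print(readable)
```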
- "Adding multiple datasets to the same plot" - what is meant here is a "union" of datasets; this is being implemented via a `merge` function in #1214, so that `plot_by_id` can be called on a merged dataset.
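Conceptually, such a union amounts to concatenating the points of the individual datasets before plotting. A minimal NumPy sketch with made-up data (this is not the actual #1214 API):

```python
import numpy as np

# Made-up points from two separate runs of the same sweep
x1, y1 = np.array([0.0, 1.0]), np.array([10.0, 11.0])
x2, y2 = np.array([2.0, 3.0]), np.array([12.0, 13.0])

# The "union" of the datasets is the concatenation of their points,
# which can then be handed to a single plot call
x = np.concatenate([x1, x2])
y = np.concatenate([y1, y2])
```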
so far my 2 cents:
High priority
- [x] the Alazar multi-channel parameter can't be registered to a dataset with the `register_parameter` function, which means at the moment I can't save the magnitude AND phase data unless I take the data and then save it separately, which is baaad
- [x] loading the values of an array parameter from the dataset doesn't work with `dataset.get_data()`; is there another method (and if so, can an example go into the example notebook)? So far I'm unable to check if any of the array parameter datasets are correct :)
Low(er) priority
- [x] `plot_by_id` doesn't always do good axis scaling, so I end up with overlapping axis labels
- [x] `plot_by_id` doesn't have a title
another for lower priority:
- [x] it is possible for me to save strings as parameter values and as setpoints, but I am not clear on whether it is possible to save these values in the new dataset. It would be great if I could have strings as setpoints and axis labels (with the spacing just assumed to be linear for the plots)
@nataliejpg could you expand more on your requests (also, you can edit your first comment to incorporate the "another lower priority" item ;) ) in the following way:
- loading the values of an array parameter from the dataset - have you tried `get_data_by_id`?
- `plot_by_id` and axis scaling - does calling matplotlib's `tight_layout` after the call to `plot_by_id` help?
- `plot_by_id` doesn't have a title - what would you like to have there? run_id, experiment name, sample name, timestamp (of start or stop or both)?
- "another for lower priority" - I did not understand what you want to achieve and why. Could you explain it a bit more in detail? Is it actually multiple requests in one? :)
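For reference, the `tight_layout` suggestion looks like the following (a generic matplotlib sketch, not tied to `plot_by_id` itself):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs headless
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# Very small values produce long tick labels that tend to collide
ax.plot([0.0, 1e-8, 2e-8], [0.0, 1.0, 2.0])
fig.tight_layout()  # recompute margins so the labels get the space they need
```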
thank you for your input @astafan8
- I tried `load_by_id` to get the dataset, which is maybe what you meant? If not, then I think it's pretty problematic that there are two different functions. `load_by_id` will get me back the python dataset object, but then calling `get_data` or `get_values` to actually get the data itself fails for array parameters. If it really is just me making a silly mistake (and someone else has successfully done this), then I think an extra few lines in one of the example notebooks, where data is retrieved for the different paramspec types, would be very helpful.
- of course it is possible for me to rescale the axes myself; the issue is that in order for the new dataset to be useful in Copenhagen, there needs to be some plotting functionality alongside it which has good defaults and saves plots that contain some information. Running `plot_by_id` and then rescaling the axes afterwards every time isn't really an option, and `tight_layout` in this scenario isn't enough, as matplotlib by itself chooses really inconvenient decimals like 0.00000001, which are naturally problematic to have as tick labels.
- experiment_name, sample_name and run_id are probably enough for a title.
- is it possible to save strings as paramspec datapoints (or arrays) in the new dataset?
- if yes to the above, then we should also be able to plot with string tick labels, making some assumption about the values so that we can make a plot, e.g. `tick_labels = ["X_Y", "X_X", "X_I", "I_I"]` and `tick_values = np.arange(len(tick_labels))`
- if no, then it seems bad, as there are lots of qcodes parameters which have string values
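The mapping described here can be sketched as follows (the `results` values are made up; the commented lines show how a matplotlib `Axes` would consume it):

```python
import numpy as np

# String setpoints (gate combinations) and made-up measured results
tick_labels = ["X_Y", "X_X", "X_I", "I_I"]
tick_values = np.arange(len(tick_labels))  # evenly spaced positions 0..3
results = np.array([0.12, 0.45, 0.33, 0.91])

# With a matplotlib Axes `ax` one would then plot against the integer
# positions and relabel the ticks with the strings:
#   ax.plot(tick_values, results, "o")
#   ax.set_xticks(tick_values)
#   ax.set_xticklabels(tick_labels)
```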
you are very welcome, @nataliejpg !
- (about loading array parameters) Yes, but there is also the `qcodes.dataset.data_export.get_data_by_id` function - could you try it to see if it works for your case with array parameters? I agree with you that there should be one default way of retrieving data, and it should just work. @jenshnielsen, do your recent PRs solve this particular issue by chance?
- (about rescaling) ok, I thought it was a problem that is solvable by `tight_layout`, but what you are talking about seems to be one of the things that Thorvald mentioned to me the other day: he wants to have the axes automatically scaled to nice numbers (e.g. 0.000001 V -> 1 uV) (the old qcodes dataset has this feature). Am I correct?
- (about strings as datapoints) that's a bit unexpected; could you expand on what you are using it for? Also, it seems that one of @jenshnielsen's PRs allows this, am I right?
@astafan8
- will try it as soon as I can get back onto the measurement pc, so probably later this week. Does it work for you? Has anyone you know of actually used the "array" type?
- let's just say I don't want this:

- I have a set of gate combinations which are labeled 'X_X', 'X_Y', etc., and would like to plot the result as a function of gate combination. In this scenario the gate combinations are the setpoints, and I would like them to be saved as such and to appear on the axis of a plot. In the meantime I have instead been using an index, and then having to compare the index to the gate combination list to work out which result corresponds to which gate combination, but this seems like an unnecessary step. It is clear to me that plotting this is not currently supported, but it is not clear to me whether it is possible to save this in the database. Do you know? I would like both.
- (about rescaling and/or overlapping tick labels) I've just had a quick google: it's a known problem of matplotlib, and there is no smart way of fixing it due to the internals of matplotlib, so people end up manipulating these manually (rotating tick labels, reducing their number, etc.). Thorvald's request might help in some of these cases as well.
@astafan8 so to clarify, you are saying that it's not feasible to expect better labelling than this by default?
also, just to be extra clear: while I do find the plotting very important, the biggest thing for me right now is that the MultiChannel Alazar parameter can't be saved. For me this is number 1.
Yes. Unless... did you have a fix/workaround for the overlapping labels before (meaning, when you were working with the old dataset)? If so, could you give me a link?
@astafan8 it can't be true that qcodes solved one issue with the old dataset that we now have to convince you can be solved again for the new dataset. Changing to the new dataset should not put more work on us compared to the old dataset - otherwise we will just keep using the old one.
https://github.com/qdev-dk/qdev-wrappers/blob/master/qdev_wrappers/plot_functions.py
@ThorvaldLarsen please, do not get me wrong. With my questions to Nathalie about the "overlapping labels" issue, I'm trying to understand if it is purely the problem of the plotting utilities of the new dataset, or the plotting utilities of the old dataset had them as well. As I mentioned above, most probably, adding automatic "rescale axes to uV/nV/GHz/etc" (that you requested some days ago) for the new dataset (yes, this feature exists for the old dataset) will also help with "overlapping labels" because the labels will effectively become shorter.
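For what it's worth, matplotlib ships an engineering-notation formatter that does exactly this kind of rescaling. A minimal sketch (generic matplotlib, not the new dataset's plotting code):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch
import matplotlib.pyplot as plt
from matplotlib.ticker import EngFormatter

fig, ax = plt.subplots()
# Microvolt-scale data that would otherwise get tick labels like 0.000001
ax.plot([0.0, 1e-6, 2e-6], [0.0, 1.0, 2.0])
# EngFormatter renders 1e-6 with an SI prefix (micro) instead of 0.000001
ax.xaxis.set_major_formatter(EngFormatter(unit="V"))
```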
When multidimensional arrays are stored in the dataset, they are not completely converted to individual datapoints: only the outermost dimension is unrolled. This is fixed by #1207.
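To illustrate what "only the outermost dimension is unrolled" means, here is a small NumPy sketch with made-up shapes (not the actual storage code):

```python
import numpy as np

# A 2D array parameter: 3 sweeps (outermost dimension) of 4 points each
data = np.arange(12).reshape(3, 4)

# Unrolling only the outermost dimension yields 3 records,
# each of which is still a 4-element array, not 12 scalar datapoints
records = list(data)

# Full conversion to individual datapoints would instead flatten everything
datapoints = data.ravel()
```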
@jenshnielsen do I understand correctly that right now (before the merge of #1207) you really can't use the array paramspec, or is it that you can if you use this `get_data_by_id` function? Also, what is the motivation for having both `load_by_id` and `get_data_by_id`?
It only works correctly for 1D arrays before that PR is merged. `load_by_id` is as fast as possible because it loads the data as-is from the database, while `get_data_by_id` is more convenient, but we should probably get rid of this duplication.
I've summarized the descriptions for each of the mentioned items, and added them to our VSO. Soon they will be done :)
With all my understanding and respect, @ThorvaldLarsen and @nataliejpg, please, next time, make one GitHub issue per request/discussion. The set of issues above (because it is a set) was quite challenging to discuss and manage.
@astafan8 is there any sort of ETA for the alazar parameter one (or the others)? Thanks