virtualdatalab
Error Handling: Sequence count starting at one
Context
VDL expects datasets to start their sequence count (column sequence_pos) at 0. If a dataset whose sequence count starts at 1 is provided, virtualdatalab.benchmark.compare() fails with a cryptic error message about columns not being found.
Improvement
Add a check for the correct sequence count to virtualdatalab.synthesizers.utils.check_common_data_format(), and make sure the data format is checked either before compare() runs or after generation. This could be combined with issue #1 to enforce correct dataset formats.
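A rough sketch of what such a check could look like, using pandas. This is not VDL's existing code: the helper name check_sequence_pos_starts_at_zero and the id column name "id" are assumptions made for illustration; only sequence_pos comes from this issue. The sketch only verifies that every sequence starts at 0 and does not check for gaps.

```python
import pandas as pd


def check_sequence_pos_starts_at_zero(
    df: pd.DataFrame,
    id_col: str = "id",  # assumed name of the sequence identifier column
    pos_col: str = "sequence_pos",
) -> None:
    """Raise ValueError if any sequence does not start counting at 0.

    Hypothetical helper, not part of virtualdatalab.
    """
    # Accept the two fields either as regular columns or as index levels.
    if {id_col, pos_col} <= set(df.columns):
        flat = df
    else:
        flat = df.reset_index()

    for col in (id_col, pos_col):
        if col not in flat.columns:
            raise ValueError(f"missing required column '{col}'")

    # Smallest position per sequence; each one should be exactly 0.
    starts = flat.groupby(id_col)[pos_col].min()
    bad = starts[starts != 0]
    if not bad.empty:
        raise ValueError(
            f"'{pos_col}' must start at 0 for every sequence, but "
            f"{len(bad)} sequence(s) start at another value "
            f"(first offender: {id_col}={bad.index[0]}, starts at {bad.iloc[0]})"
        )
```

Called before compare() or right after generation, a check like this would turn the column-not-found failure into a message that names the actual problem.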
Benefits
Better user experience in case of wrong values in sequence_pos.