Don't display attributes expanded for dataset
When you look at a dataset derived from VCF in a notebook, you get this:
The attributes are automatically "open", and this means that the VCF header attribute (which will be several megabytes for large datasets) dominates.
I'm not sure this is something we can influence, but can we either truncate the VCF header attribute for display, or tweak the display of the dataset somehow to at least keep the attributes "closed" by default?
Alternatively we could discard the "#CHROM ..." line of the VCF header, since we can reproduce it using the sample_id variable. Also, it's wrong when we do a subset operation.
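For illustration, dropping and regenerating that line could look roughly like the sketch below. The helper names `strip_column_line` and `build_column_line` are hypothetical (not part of sgkit), and the sketch assumes the header is held as a single string and that the sample IDs are available as a sequence, e.g. from the sample_id variable.

```python
# Fixed columns that start every VCF column-header line.
VCF_FIXED_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def strip_column_line(header: str) -> str:
    """Return the header without its '#CHROM ...' line, keeping the '##' meta lines."""
    kept = [line for line in header.splitlines() if not line.startswith("#CHROM")]
    return "\n".join(kept) + "\n"


def build_column_line(sample_ids) -> str:
    """Rebuild the '#CHROM ...' line from the current sample IDs."""
    columns = list(VCF_FIXED_COLUMNS)
    # FORMAT and the per-sample columns are only present when there are samples.
    if len(sample_ids) > 0:
        columns.append("FORMAT")
        columns.extend(str(s) for s in sample_ids)
    return "\t".join(columns) + "\n"
```

Because the line is rebuilt from whatever sample IDs the dataset currently holds, it would stay correct after subsetting samples, which the stored copy does not.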
It can be controlled with an xarray setting: https://docs.xarray.dev/en/stable/generated/xarray.set_options.html#xarray-set-options
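A minimal sketch of what that looks like, using the `display_expand_attrs` option of `xarray.set_options`. The toy dataset and the `vcf_header` attribute name are assumptions for illustration, standing in for a real dataset derived from VCF.

```python
import xarray as xr

# Toy dataset; the "vcf_header" attribute name and contents are assumed here.
ds = xr.Dataset(
    {"call_genotype": (("variants", "samples"), [[0, 1], [1, 1]])},
    attrs={"vcf_header": "##fileformat=VCFv4.3\n" + "##contig=<ID=1>\n" * 1000},
)

# Collapse the attributes section only inside this block.
# Calling xr.set_options(display_expand_attrs=False) without "with"
# applies the setting for the rest of the session instead.
with xr.set_options(display_expand_attrs=False):
    collapsed = repr(ds)

print(collapsed)
```

This only changes how the dataset is displayed; the full header is still stored in `ds.attrs`.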
This originally came up here: https://github.com/pystatgen/sgkit/issues/463#issuecomment-827445369
As a quick aside @tomwhite, do we ever use the "#CHROM POS.." line from the VCF header? If not, I think we should discard it, as there's no real information there (I'll open an issue).
@jeromekelleher we used to use the "#CHROM POS.." line to support round-tripping of VCF -> Zarr -> VCF, but we can generate the header now, so it may not be necessary to store it. See https://github.com/pystatgen/sgkit/blob/2ab47b587768bed166d3c477694bed06250123c9/sgkit/io/vcf/vcf_writer.py#L412-L559