
metadata for cooler balance is hard to find

Open golobor opened this issue 8 years ago • 7 comments

Currently, the balancing parameters are stored as metadata on one particular column of the bins table, which is really hard to find. While this is technically valid, in practice the metadata of particular columns of particular datasets is hard to discover (for balance, it doesn't seem to be documented, or the documentation is hard to find). It would be really useful if cooler info could at least print all metadata of all columns.
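For illustration, a minimal sketch of how the per-column metadata could be listed today by reading the file directly with h5py. It assumes the columns live as datasets under a `bins` group and that the balancing parameters are ordinary HDF5 attributes on those datasets; the function names `column_attrs` and `print_bins_metadata` are made up for this example and are not part of cooler.

```python
def column_attrs(bins_group):
    """Return {column: {attribute: value}} for every column carrying attributes.

    Works on any mapping of name -> object with an `.attrs` mapping,
    which is what an h5py group of datasets provides.
    """
    found = {}
    for name, dataset in bins_group.items():
        attrs = dict(dataset.attrs)
        if attrs:
            found[name] = attrs
    return found


def print_bins_metadata(path, group="/"):
    """Print the attributes of all bins columns of the cooler stored at `group`."""
    # Third-party dependency, imported here so column_attrs stays dependency-free.
    import h5py

    with h5py.File(path, "r") as f:
        bins = f[group]["bins"]
        for column, attrs in column_attrs(bins).items():
            print(column)
            for key, value in attrs.items():
                print(f"  {key}: {value}")
```

Calling `print_bins_metadata("filename.cool")` would then print one block per column that has attributes. For a file holding several coolers, `group` would have to point at the one of interest.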

golobor avatar Oct 05 '17 16:10 golobor

On a related topic: is it even possible to quickly check via the CLI whether a given .cool is balanced at all?

cooler info does not show it ...

Right now people in the lab check whether there is, e.g., a weight column in clr.bins()[:10] ...
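That manual check can be wrapped in a small helper. This is only a sketch: `has_weights` and `check_file` are hypothetical names, and the default column name `weight` is an assumption that fails for files balanced into a differently named column.

```python
def has_weights(clr, name="weight"):
    """Return True if the bins table of `clr` has a column called `name`."""
    # Fetch a single row, as in the manual clr.bins()[:10] check,
    # and look only at the column labels.
    head = clr.bins()[:1]
    return name in head.columns


def check_file(uri, name="weight"):
    """Open a cooler by URI and report whether the weight column exists."""
    # Third-party dependency, imported here so has_weights stays dependency-free.
    import cooler

    return has_weights(cooler.Cooler(uri), name=name)
```

A True result only says that the column exists, not that balancing converged.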

Maybe there are some HDF5 CLI tools that let you peek into a given cooler URI to see the table headers or something like that?

sergpolly avatar Sep 17 '18 14:09 sergpolly

I'll leave it here, just in case: h5dump -n filename.cool lists the contents of the HDF5 file, e.g.:

HDF5 "blahblah_hg19.10000.cool" {
FILE_CONTENTS {
 group      /
 group      /bins
 dataset    /bins/chrom
 dataset    /bins/end
 dataset    /bins/start
 dataset    /bins/weight
 group      /chroms
 dataset    /chroms/length
 dataset    /chroms/name
 group      /indexes
 dataset    /indexes/bin1_offset
 dataset    /indexes/chrom_offset
 group      /pixels
 dataset    /pixels/bin1_id
 dataset    /pixels/bin2_id
 dataset    /pixels/count
 }
}

If the dataset /bins/weight is present, balancing was at least attempted.

sergpolly avatar Sep 17 '18 15:09 sergpolly

@sergpolly, there is cooler balance --check

nvictus avatar Sep 18 '18 17:09 nvictus

Oh my, that's handy! I wish I had read the docs more carefully. But this one would only look for the weight column in bins, right? It would return False if the balancing weights are in a column named something else, wouldn't it?

sergpolly avatar Sep 18 '18 17:09 sergpolly

As long as you know the name of the column, you can provide it as the --name parameter.

nvictus avatar Sep 18 '18 17:09 nvictus

See the new cooler tree and cooler attrs commands in 0.8.

nvictus avatar Dec 31 '18 16:12 nvictus

Could we have something similar in the Python API, pretty please? Again, the balancing parameters are just impossible to discover without googling.
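Until something like that exists, one possible workaround is to go through the HDF5 group that a Cooler object exposes via `open()`. The sketch below assumes the balancing parameters are stored as HDF5 attributes on the weight column, as described at the top of this issue; `balancing_params` and `describe` are hypothetical helpers, not part of cooler's API.

```python
def balancing_params(clr, name="weight"):
    """Return the attributes stored on bins column `name`, or None if it is absent."""
    # Cooler.open() hands back the underlying HDF5 group as a context manager.
    with clr.open("r") as grp:
        bins = grp["bins"]
        if name not in bins:
            return None
        return dict(bins[name].attrs)


def describe(uri, name="weight"):
    """Print the balancing parameters of the cooler at `uri`, if there are any."""
    # Third-party dependency, imported here so balancing_params stays dependency-free.
    import cooler

    params = balancing_params(cooler.Cooler(uri), name=name)
    if params is None:
        print(f"no '{name}' column found")
        return
    for key, value in params.items():
        print(f"{key}: {value}")
```

Returning None for a missing column keeps "not balanced" distinct from "balanced, but no parameters recorded", which comes back as an empty dict.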

golobor avatar Mar 11 '19 12:03 golobor