gosling.js

the aggregate property of x and xe channels worked unexpectedly

Open zhangzhen opened this issue 2 years ago • 4 comments

I used the following spec to draw a chromosome band plot for the whole genome. Only chr19 was drawn as two separate parts, which looks wrong.

{
  "arrangement": "vertical",
  "title": "Large Rearrangment Auditing",
  "assembly": "unknown",
  "spacing": 50,
  "xDomain": { "interval": [0, 20000] },
  "views": [
    {
      "alignment": "overlay",
      "data": {
        "url": "https://dataviz.brbiotech.com/RS20210907013FFP.panelData_.tsv",
        "type": "csv",
        "separator": "\t",
        "sampleLength": 20000,
        "genomicFields": ["index", "index2"],
        "quantitativeFields": ["CN"]
      },
      "tracks": [
        {
          "mark": "rect",
          "x": { "field": "index", "aggregate": "min", "type": "genomic" },
          "xe": { "field": "index2", "aggregate": "max", "type": "genomic" },
          "stroke": { "value": "white" },
          "strokeWidth": { "value": 2 },
          "color": {
            "field": "chr",
            "type": "nominal",
            "domain": [
              "1",
              "2",
              "3",
              "4",
              "5",
              "6",
              "7",
              "8",
              "9",
              "10",
              "11",
              "12",
              "13",
              "14",
              "15",
              "16",
              "17",
              "18",
              "19",
              "20",
              "21",
              "22"
            ],
            "range": ["#0072B2"]
          } 
        },
        {"mark": "brush", "x": { "linkingId": "detail" }}
      ],
      "width": 1000,
      "height": 30
    }
  ],
  ....
}

The red box in the following screenshot shows the incorrectly drawn chr19.

[image]

zhangzhen, Oct 20 '21 06:10

Hi @zhangzhen,

Would you be able to share the data you used (https://dataviz.brbiotech.com/RS20210907013FFP.panelData_.tsv) so that I can take a closer look at this issue?

In my example with ideograms, chr19 is displayed correctly with the aggregate property, so I wonder if this issue is related to the data you used.

Please refer to my example:

{
  "tracks": [
    {
      "data": {
        "url": "https://raw.githubusercontent.com/sehilyi/gemini-datasets/master/data/UCSC.HG38.Human.CytoBandIdeogram.csv",
        "type": "csv",
        "chromosomeField": "Chromosome",
        "genomicFields": ["chromStart", "chromEnd"]
      },
      "mark": "rect",
      "color": {
        "field": "Chromosome",
        "type": "nominal",
        "domain": [
          "chr1",
          "chr2",
          "chr3",
          "chr4",
          "chr5",
          "chr6",
          "chr7",
          "chr8",
          "chr9",
          "chr10",
          "chr11",
          "chr12",
          "chr13",
          "chr14",
          "chr15",
          "chr16",
          "chr17",
          "chr18",
          "chr19",
          "chr20",
          "chr21",
          "chr22",
          "chrX",
          "chrY"
        ],
        "range": ["#F6F6F6", "gray"]
      },
      "x": {"field": "chromStart", "type": "genomic", "aggregate": "min"},
      "xe": {"field": "chromEnd", "aggregate": "max", "type": "genomic"},
      "strokeWidth": {"value": 2},
      "stroke": {"value": "gray"},
      "style": {"outline": "white"},
      "width": 800,
      "height": 25
    }
  ]
}

[Screenshot: Screen Shot 2021-10-25 at 10 29 39 AM]

sehilyi, Oct 25 '21 14:10

> Would you be able to share the data you used (https://dataviz.brbiotech.com/RS20210907013FFP.panelData_.tsv) so that I can take a closer look at this issue?

I will send a data file containing only chr19 to your Harvard email address.

zhangzhen, Oct 26 '21 07:10

@zhangzhen, thanks for sharing your file. Using your file and my example above, I found that chromosomes are sometimes drawn as separate parts, depending on the zoom level, when a stroke is applied to the rect mark.

[Screenshot: Screen Shot 2021-11-15 at 8 06 18 PM]

I think this is due to the tiling approach: chromosome 19 spans two tiles, so the min and max values are calculated twice, once per tile.
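To illustrate the suspected cause, here is a minimal Python sketch of per-tile aggregation. It is not Gosling's actual implementation: the tile size, the segment coordinates, and the function names are all made up for illustration. It only shows how taking min and max separately in each tile yields two rectangles for a chromosome that crosses a tile boundary.

```python
from collections import defaultdict

# Hypothetical tile width in base pairs; real tile sizes depend on the zoom level.
TILE_SIZE = 1000

# Made-up (start, end) segments that all belong to one chromosome.
segments = [(100, 300), (400, 900), (1100, 1500), (1600, 1900)]


def aggregate(rows):
    """Collapse rows into one rectangle: min of the starts, max of the ends."""
    return (min(start for start, _ in rows), max(end for _, end in rows))


def aggregate_per_tile(rows, tile_size):
    """Group rows by the tile that holds their start, then aggregate each tile separately."""
    tiles = defaultdict(list)
    for start, end in rows:
        tiles[start // tile_size].append((start, end))
    return [aggregate(tiles[key]) for key in sorted(tiles)]


whole = aggregate(segments)
per_tile = aggregate_per_tile(segments, TILE_SIZE)

print(whole)     # one rectangle covering the whole chromosome
print(per_tile)  # one rectangle per tile, so the chromosome appears split
```

With a stroke on the rect mark, the border between the two per-tile rectangles would be drawn, which matches the split seen in the screenshots.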

Removing the strokes makes the chromosomes look contiguous, but this is only a workaround. Perhaps we should not recommend using the min and max aggregation functions for genomic fields, given that we use tiles.
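As a rough sketch of the stroke-removal workaround, the Python helper below drops the `stroke` and `strokeWidth` channels from every track in a spec held as a dictionary. The helper name `without_strokes` and the trimmed-down spec are assumptions for illustration; only the property names come from the specs in this thread.

```python
import copy
import json


def without_strokes(spec):
    """Return a deep copy of a spec dict with stroke channels removed from every track."""
    cleaned = copy.deepcopy(spec)

    def visit(node):
        if isinstance(node, dict):
            # Treat any object that declares a mark as a track.
            if "mark" in node:
                node.pop("stroke", None)
                node.pop("strokeWidth", None)
            for value in node.values():
                visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(cleaned)
    return cleaned


# Trimmed-down version of the spec from the original report (illustration only).
spec = {
    "views": [
        {
            "tracks": [
                {
                    "mark": "rect",
                    "x": {"field": "index", "aggregate": "min", "type": "genomic"},
                    "xe": {"field": "index2", "aggregate": "max", "type": "genomic"},
                    "stroke": {"value": "white"},
                    "strokeWidth": {"value": 2},
                },
                {"mark": "brush", "x": {"linkingId": "detail"}},
            ]
        }
    ]
}

cleaned = without_strokes(spec)
print(json.dumps(cleaned, indent=2))
```

This only hides the seam between the per-tile rectangles; the aggregation is still computed once per tile.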

sehilyi, Nov 16 '21 01:11

@zhangzhen, in your case, would it be better (in terms of rendering performance) to create a tiny file for this kind of track, i.e., a file that contains the start and end position of each chromosome?
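One possible way to produce such a file is sketched below in Python. It assumes the original TSV has `chr`, `index`, and `index2` columns, as the spec above suggests; the sample rows, the output column names `start` and `end`, and the function names are made up for illustration.

```python
import csv
import io

# Made-up rows standing in for the original TSV (columns assumed from the spec).
SAMPLE_TSV = (
    "chr\tindex\tindex2\tCN\n"
    "1\t0\t120\t2.0\n"
    "1\t120\t250\t2.1\n"
    "19\t5000\t5100\t1.9\n"
    "19\t5100\t5400\t2.0\n"
)


def chromosome_extents(tsv_text):
    """Map each chromosome to (smallest index, largest index2) over all of its rows."""
    extents = {}
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        start, end = int(row["index"]), int(row["index2"])
        low, high = extents.get(row["chr"], (start, end))
        extents[row["chr"]] = (min(low, start), max(high, end))
    return extents


def to_tsv(extents):
    """Serialize the extents as a tiny TSV with one row per chromosome."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["chr", "start", "end"])
    for chrom, (start, end) in extents.items():
        writer.writerow([chrom, start, end])
    return out.getvalue()


extents = chromosome_extents(SAMPLE_TSV)
tiny_tsv = to_tsv(extents)
print(tiny_tsv)
```

Because the resulting file has one row per chromosome, the `x` and `xe` channels could map to `start` and `end` directly, without the `aggregate` property.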

sehilyi, Nov 16 '21 01:11