bigquery-examples icon indicating copy to clipboard operation
bigquery-examples copied to clipboard

optimize all queries using FLATTEN

Open deflaux opened this issue 10 years ago • 3 comments

Restructure them so that the clause within the FLATTEN returns only the columns needed and also applies any applicable WHERE clauses.

deflaux avatar Aug 01 '14 22:08 deflaux

For example compare:

SELECT
  contig,
  position,
  reference_bases,
  GROUP_CONCAT(alternate_bases) WITHIN RECORD AS alt,
  vt,
  END,
  super_population,
  IF(af >= 0.05,
    TRUE,
    FALSE) AS is_common_variant,
  IF(genotype.first_allele > 0
    OR genotype.second_allele > 0,
    1,
    0) AS has_variant
FROM
  FLATTEN([google.com:biggene:1000genomes.variants1kG],
    genotype) AS samples
JOIN
  [google.com:biggene:1000genomes.sample_info] p
ON
  samples.genotype.sample_id = p.sample
WHERE
  contig = '17'
  AND position BETWEEN 41196312
  AND 41277500

to a better version of the same thing - notice that the inner most query is now returning a lot less data to the outer query:

SELECT
  contig,
  position,
  reference_bases,
  alt,
  vt,
  END,
  super_population,
  is_common_variant,
  has_variant
FROM
  FLATTEN(
    (
    SELECT
      contig,
      position,
      reference_bases,
      GROUP_CONCAT(alternate_bases) WITHIN RECORD AS alt,
      vt,
      END,
      IF(af >= 0.05,
        TRUE,
        FALSE) AS is_common_variant,
      genotype.sample_id,
      IF(genotype.first_allele > 0
        OR genotype.second_allele > 0,
        1,
        0) AS has_variant
    FROM
      [google.com:biggene:1000genomes.variants1kG]
    WHERE
      contig = '17'
      AND position BETWEEN 41196312
      AND 41277500),
    genotype) AS samples
JOIN
  [google.com:biggene:1000genomes.sample_info] p
ON
  samples.genotype.sample_id = p.sample

deflaux avatar Aug 01 '14 22:08 deflaux

IF(af >= 0.05, TRUE, FALSE) AS is_common_variant

same as

af >= 0.05 AS is_common_variant

?

fhoffa avatar Aug 02 '14 00:08 fhoffa

agreed, can be simplified

I'm pretty sure that pattern is in a few other places too, filed #14

please point out other things like this! Thanks Felipe

deflaux avatar Aug 02 '14 00:08 deflaux