seasonal-flu

Use metadata from NA segment in joined metadata when HA segment isn't available

Open huddlej opened this issue 10 months ago • 0 comments

Current Behavior

Our current approach to joining segment-level metadata records into isolate-level metadata records is HA-centric: an NA record without a matching HA record contributes none of its metadata to the isolate-level record.

Expected behavior

When HA records are missing, we still want to know as much as possible about the NA record, including the isolate id, the collection date, etc. We will use this information in segment-level analyses such as the flu_frequencies workflow, where we estimate NA-specific clade frequencies and want to use all available NA records.

Possible solution

One solution could be to update the join_metadata script to define all segment-specific columns (e.g., "passage_category" should be segment-specific) and then update the isolate-level metadata with the first set of remaining isolate-level columns that are present in a segment's record (e.g., date, region, country).
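
A minimal sketch of that idea, not the actual join_metadata script: the column names (`isolate_id`, `accession`, `passage_category`), the `_ha`/`_na` suffix convention, the function name, and the example records are all assumptions made up for illustration. Segment-specific columns are kept per segment, and each isolate-level column is filled from the first segment, in HA-then-NA order, that has a non-empty value for it, so NA-only isolates still get a date, region, etc.

```python
# Hypothetical column names; the real join_metadata script may use others.
ID_COLUMN = "isolate_id"
SEGMENT_SPECIFIC_COLUMNS = {"accession", "passage_category"}


def join_segment_metadata(records_by_segment, segment_order=("ha", "na")):
    """Join per-segment records into one record per isolate.

    records_by_segment maps a segment name to a list of dict records.
    Segments are visited in segment_order, so HA values win when both
    segments provide an isolate-level column, and NA fills the gaps.
    """
    isolates = {}
    for segment in segment_order:
        for record in records_by_segment.get(segment, []):
            isolate_id = record[ID_COLUMN]
            isolate = isolates.setdefault(isolate_id, {ID_COLUMN: isolate_id})
            for column, value in record.items():
                if column == ID_COLUMN:
                    continue
                if column in SEGMENT_SPECIFIC_COLUMNS:
                    # Keep one copy per segment, e.g. "passage_category_na".
                    isolate[f"{column}_{segment}"] = value
                elif value and not isolate.get(column):
                    # Isolate-level column: first non-empty value wins.
                    isolate[column] = value
    return isolates


# Made-up example records: isolate_2 has an NA record but no HA record.
records = {
    "ha": [
        {"isolate_id": "isolate_1", "date": "2024-01-05", "region": "Europe",
         "passage_category": "cell"},
    ],
    "na": [
        {"isolate_id": "isolate_1", "date": "2024-01-05", "region": "Europe",
         "passage_category": "egg"},
        {"isolate_id": "isolate_2", "date": "2024-02-10", "region": "Asia",
         "passage_category": "unpassaged"},
    ],
}
joined = join_segment_metadata(records)
```

Here `joined["isolate_2"]` carries the date and region from its NA record even though no HA record exists, which is the behavior the flu_frequencies use case needs. Filling column by column rather than copying one segment's whole record is a design choice of this sketch; it also covers the case where the HA record exists but has an empty field.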

huddlej, Apr 15 '24 18:04