SARS-CoV-2-Sequenzdaten_aus_Deutschland
implausible Omicron classification
I see a number of sequences from 2020 and early 2021 classified as Omicron. This does not look plausible to me. Could this be related to https://github.com/robert-koch-institut/SARS-CoV-2-Sequenzdaten_aus_Deutschland/issues/9 and be the result of a variant PCR?
Data in question:
date_draw | IMS_ID | lineage | scorpio_call | sequencing_lab_pc | sending_lab_pc | seq_type
---|---|---|---|---|---|---
2021-03-17 | IMS-10013-CVDP-BD31B70A-5B53-4588-B22A-E40EE489E32... | BA.1.1 | | 4779 | 73035 | ILLUMINA
2021-03-10 | IMS-10013-CVDP-7B4A9C47-3B1B-4F3A-85BC-94B4DA5FEB2... | BA.1.1 | | 4779 | 95448 | ILLUMINA
2021-01-04 | IMS-10013-CVDP-B219DF48-6F4F-4A17-B19D-98DC45AF974... | BA.1.1 | Omicron (BA.1-like) | 4779 | 4779 | ILLUMINA
2021-04-02 | IMS-10013-CVDP-4D759CD1-2209-41DE-980C-E98F38D54BA... | BA.1.1 | | 4779 | 28357 | ILLUMINA
2021-03-11 | IMS-10013-CVDP-333514E8-FCA9-49A4-BCE2-6F58419756B... | BA.1.1 | | 4779 | 95448 | ILLUMINA
2021-03-25 | IMS-10013-CVDP-EB74C98E-815A-445E-AB76-8C659BE07B3... | BA.1.1 | | 4779 | 28357 | ILLUMINA
2021-03-03 | IMS-10013-CVDP-035CC7B7-2FCA-4831-8089-3937D681718... | BA.1.1 | | 4779 | 66386 | ILLUMINA
2020-12-26 | IMS-10013-CVDP-D91BBF83-C15E-4280-825F-26E4B301A2F... | BA.1 | Omicron (BA.1-like) | 4779 | 28357 | ILLUMINA
2021-04-16 | IMS-10013-CVDP-4EFD6A2A-4346-433F-8D2C-A2AEC01E1E0... | BA.1.1 | | 4779 | 86154 | ILLUMINA
2021-03-10 | IMS-10013-CVDP-6D3F28AF-44EA-4BA8-B0D6-308DAD2E4CC... | BA.1.1 | | 4779 | 95448 | ILLUMINA
2021-03-24 | IMS-10013-CVDP-5E19AFA5-5AC8-4812-AD7D-C1B6F7ABD87... | BA.1.1 | | 4779 | 86154 | ILLUMINA
2021-03-05 | IMS-10013-CVDP-5FD87A5E-FAEA-43C1-B41E-D5F0BF2E0F8... | BA.1.1 | | 4779 | 81737 | ILLUMINA
2020-12-22 | IMS-10013-CVDP-652AEF69-8797-4473-9730-40C8422356E... | BA.1 | Omicron (BA.1-like) | 4779 | 28357 | ILLUMINA
2021-05-03 | IMS-10013-CVDP-D1C0DC48-97F7-483D-9248-05CCD4DCB36... | BA.1.1 | | 4779 | 4779 | ILLUMINA
2021-03-04 | IMS-10013-CVDP-4528BCA3-144F-47DB-BF9E-CCB3D373C74... | BA.1.1 | | 4779 | 1665 | ILLUMINA
2021-02-19 | IMS-10013-CVDP-F6D01735-2811-4A43-9656-F8AF4506AD0... | BA.1.1 | | 4779 | 81737 | ILLUMINA
2021-03-10 | IMS-10013-CVDP-9BE8CEF8-0042-48FE-B796-3475C6AA707... | BA.1.1 | | 4779 | 95448 | ILLUMINA
2021-06-08 | IMS-10013-CVDP-536E691D-7DA2-4D70-BE14-0C512D8DBB0... | BA.1.1 | | 4779 | 4779 | ILLUMINA
2021-03-17 | IMS-10013-CVDP-8D6464EE-0C90-48B8-8976-13714B74F79... | BA.1.1 | | 4779 | 86154 | ILLUMINA
2021-03-11 | IMS-10013-CVDP-7EA8A3B8-F7CF-4422-B96A-91B02ECA4CE... | BA.1.1 | | 4779 | 4779 | ILLUMINA
2021-03-23 | IMS-10013-CVDP-DABBDE37-F49C-4F25-B604-7DF183E3661... | BA.1.1 | | 4779 | 4779 | ILLUMINA
2021-03-10 | IMS-10013-CVDP-4FA094F7-E334-41F7-87F5-38A42C2F478... | BA.1.1 | | 4779 | 1665 | ILLUMINA
2021-09-02 | IMS-10013-CVDP-3445725E-9F15-4E2D-A4E4-F23949A8FEB... | BA.1.1 | | 4779 | 4779 | ILLUMINA
2021-04-03 | IMS-10004-CVDP-33332ED0-2EB6-42F6-9FDD-166D0C19CAD... | BA.1.1 | | 21502 | 21502 | ILLUMINA
2021-01-01 | IMS-10061-CVDP-D28E7308-BDB2-47C6-ABD9-A26778807F4... | BA.1.1 | Probable Omicron (BA.1-like) | 30159 | 30159 | ILLUMINA
It's striking that almost all affected samples were sequenced by the lab with ID 10013 / postal code 04779 (the four-digit postal code in the table confused me for a moment).
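German postal codes have five digits, so a leading zero is lost whenever the value is handled as a number. A minimal sketch of how the codes from the table could be rendered for display; the helper name `format_postal_code` is made up for illustration:

```python
def format_postal_code(value: int) -> str:
    """Render a numeric postal code as a five-digit string (hypothetical helper)."""
    # zfill pads with zeros on the left up to the requested width.
    return str(value).zfill(5)


# Postal codes as they appear in the table above.
for pc in (4779, 1665, 95448):
    print(pc, "->", format_postal_code(pc))
```

Codes that already have five digits pass through unchanged, so the helper can be applied to the whole column.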
Can you also add the processing date? I checked it manually for the bottom three entries of the table:
date_draw | PROCESSING_DATE | IMS_ID | lineage | scorpio_call | sequencing_lab_pc | sending_lab_pc | seq_type
---|---|---|---|---|---|---|---
2021-09-02 | 2021-09-20 | IMS-10013-CVDP-3445725E-9F15-4E2D-A4E4-F23949A8FEB... | BA.1.1 | | 4779 | 4779 | ILLUMINA
2021-04-03 | 2021-04-14 | IMS-10004-CVDP-33332ED0-2EB6-42F6-9FDD-166D0C19CAD... | BA.1.1 | | 21502 | 21502 | ILLUMINA
2021-01-01 | 2022-01-15 | IMS-10061-CVDP-D28E7308-BDB2-47C6-ABD9-A26778807F4... | BA.1.1 | Probable Omicron (BA.1-like) | 30159 | 30159 | ILLUMINA
For the last one, it may be just a typo in the year of the date_draw. For the other ones, date_draw and PROCESSING_DATE seem plausible in relation to each other.
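The comparison of date_draw and PROCESSING_DATE can also be done mechanically. Here is a rough sketch using the three rows above; the 90-day threshold and the names `delay_days` and `MAX_PLAUSIBLE_DELAY_DAYS` are my own assumptions, not anything defined by the data set:

```python
from datetime import date

# Hypothetical threshold: a longer gap between draw and processing is flagged.
MAX_PLAUSIBLE_DELAY_DAYS = 90

# (IMS_ID as truncated in the table, date_draw, PROCESSING_DATE)
rows = [
    ("IMS-10013-CVDP-3445725E-9F15-4E2D-A4E4-F23949A8FEB...", "2021-09-02", "2021-09-20"),
    ("IMS-10004-CVDP-33332ED0-2EB6-42F6-9FDD-166D0C19CAD...", "2021-04-03", "2021-04-14"),
    ("IMS-10061-CVDP-D28E7308-BDB2-47C6-ABD9-A26778807F4...", "2021-01-01", "2022-01-15"),
]


def delay_days(date_draw: str, processing_date: str) -> int:
    """Number of days between sample draw and processing (ISO date strings)."""
    return (date.fromisoformat(processing_date) - date.fromisoformat(date_draw)).days


# Keep only the rows whose delay exceeds the assumed threshold.
flagged = [
    (ims_id, delay_days(draw, processed))
    for ims_id, draw, processed in rows
    if delay_days(draw, processed) > MAX_PLAUSIBLE_DELAY_DAYS
]

for ims_id, days in flagged:
    print(f"{ims_id}: {days} days between date_draw and PROCESSING_DATE")
```

With this threshold only the IMS-10061 sample is flagged, which fits the suspicion of a wrong year in its date_draw.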
Indeed, the processing date is interesting - is somebody analyzing old samples?
I have removed sending_lab to keep the table from becoming too wide. If useful, I can export the data set. And sorry for the postcode confusion - I store the postal code as an integer column in the database to save space and gain speed, which drops the leading zero.
date_draw | processing_date | IMS_ID | lineage | seq_type | sequencing_lab_pc | |
---|---|---|---|---|---|---|
2020-12-22 | 2022-01-13 | IMS-10013-CVDP-652AEF69-8797-4473-9730-40C8422356E... | BA.1 | ILLUMINA | 4779 | |
2020-12-26 | 2022-01-13 | IMS-10013-CVDP-D91BBF83-C15E-4280-825F-26E4B301A2F... | BA.1 | ILLUMINA | 4779 | |
2021-01-04 | 2022-01-24 | IMS-10013-CVDP-B219DF48-6F4F-4A17-B19D-98DC45AF974... | BA.1.1 | ILLUMINA | 4779 | |
2021-02-19 | 2021-03-08 | IMS-10013-CVDP-F6D01735-2811-4A43-9656-F8AF4506AD0... | BA.1.1 | ILLUMINA | 4779 | |
2021-03-03 | 2021-03-22 | IMS-10013-CVDP-035CC7B7-2FCA-4831-8089-3937D681718... | BA.1.1 | ILLUMINA | 4779 | |
2021-03-04 | 2021-03-22 | IMS-10013-CVDP-4528BCA3-144F-47DB-BF9E-CCB3D373C74... | BA.1.1 | ILLUMINA | 4779 | |
2021-03-05 | 2021-03-22 | IMS-10013-CVDP-5FD87A5E-FAEA-43C1-B41E-D5F0BF2E0F8... | BA.1.1 | ILLUMINA | 4779 | |
2021-03-10 | 2021-03-22 | IMS-10013-CVDP-7B4A9C47-3B1B-4F3A-85BC-94B4DA5FEB2... | BA.1.1 | ILLUMINA | 4779 | |
2021-03-10 | 2021-03-22 | IMS-10013-CVDP-9BE8CEF8-0042-48FE-B796-3475C6AA707... | BA.1.1 | ILLUMINA | 4779 | |
2021-03-10 | 2021-03-22 | IMS-10013-CVDP-6D3F28AF-44EA-4BA8-B0D6-308DAD2E4CC... | BA.1.1 | ILLUMINA | 4779 | |
2021-03-10 | 2021-03-22 | IMS-10013-CVDP-4FA094F7-E334-41F7-87F5-38A42C2F478... | BA.1.1 | ILLUMINA | 4779 | |
2021-03-11 | 2021-03-22 | IMS-10013-CVDP-333514E8-FCA9-49A4-BCE2-6F58419756B... | BA.1.1 | ILLUMINA | 4779 | |
2021-03-11 | 2021-03-22 | IMS-10013-CVDP-7EA8A3B8-F7CF-4422-B96A-91B02ECA4CE... | BA.1.1 | ILLUMINA | 4779 | |
2021-03-17 | 2021-03-29 | IMS-10013-CVDP-BD31B70A-5B53-4588-B22A-E40EE489E32... | BA.1.1 | ILLUMINA | 4779 | |
2021-03-17 | 2021-03-25 | IMS-10013-CVDP-8D6464EE-0C90-48B8-8976-13714B74F79... | BA.1.1 | ILLUMINA | 4779 | |
2021-03-23 | 2021-04-16 | IMS-10013-CVDP-DABBDE37-F49C-4F25-B604-7DF183E3661... | BA.1.1 | ILLUMINA | 4779 | |
2021-03-24 | 2021-04-06 | IMS-10013-CVDP-5E19AFA5-5AC8-4812-AD7D-C1B6F7ABD87... | BA.1.1 | ILLUMINA | 4779 | |
2021-03-25 | 2021-04-06 | IMS-10013-CVDP-EB74C98E-815A-445E-AB76-8C659BE07B3... | BA.1.1 | ILLUMINA | 4779 | |
2021-04-02 | 2021-04-19 | IMS-10013-CVDP-4D759CD1-2209-41DE-980C-E98F38D54BA... | BA.1.1 | ILLUMINA | 4779 | |
2021-04-16 | 2021-04-29 | IMS-10013-CVDP-4EFD6A2A-4346-433F-8D2C-A2AEC01E1E0... | BA.1.1 | ILLUMINA | 4779 | |
2021-05-03 | 2021-05-17 | IMS-10013-CVDP-D1C0DC48-97F7-483D-9248-05CCD4DCB36... | BA.1.1 | ILLUMINA | 4779 | |
2021-06-08 | 2021-06-21 | IMS-10013-CVDP-536E691D-7DA2-4D70-BE14-0C512D8DBB0... | BA.1.1 | ILLUMINA | 4779 | |
2021-09-02 | 2021-09-20 | IMS-10013-CVDP-3445725E-9F15-4E2D-A4E4-F23949A8FEB... | BA.1.1 | ILLUMINA | 4779 | |
2021-04-03 | 2021-04-14 | IMS-10004-CVDP-33332ED0-2EB6-42F6-9FDD-166D0C19CAD... | BA.1.1 | ILLUMINA | 21502 | |
2021-01-01 | 2022-01-15 | IMS-10061-CVDP-D28E7308-BDB2-47C6-ABD9-A26778807F4... | BA.1.1 | ILLUMINA | 30159 |
Side note: this is the SQL I use. Both CSVs are imported as-is into separate tables.
SELECT rki_sequenzen_meta.date_draw, processing_date, rki_sequenzen.IMS_ID, lineage, seq_type, sequencing_lab_pc FROM rki_sequenzen INNER JOIN rki_sequenzen_meta ON rki_sequenzen_meta.IMS_ID = rki_sequenzen.IMS_ID AND rki_sequenzen_meta.date_draw <= '2021-11-01' WHERE (lineage = 'B.1.1.529' OR lineage LIKE 'BA.%') AND rki_sequenzen_meta.SEQ_REASON LIKE 'N%' ORDER BY rki_sequenzen_meta.sequencing_lab_pc ASC, date_draw
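For anyone who wants to reproduce this without a database server, here is a small sketch that runs an equivalent query against an in-memory SQLite database. Several things are assumptions on my side: which columns live in which table, the `'N'` value for SEQ_REASON (the thread does not show this column), and the two rows marked SYNTHETIC, which exist only to show that the filters exclude something. The draw-date cutoff is moved from the join condition into the WHERE clause, which is equivalent for an inner join.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Assumed table layout; the real CSVs contain more columns.
con.executescript("""
CREATE TABLE rki_sequenzen (ims_id TEXT, lineage TEXT);
CREATE TABLE rki_sequenzen_meta (
    ims_id TEXT, date_draw TEXT, processing_date TEXT,
    seq_type TEXT, seq_reason TEXT, sequencing_lab_pc INTEGER
);
""")

id_a = "IMS-10013-CVDP-652AEF69-8797-4473-9730-40C8422356E..."  # from the thread
id_b = "IMS-10004-CVDP-33332ED0-2EB6-42F6-9FDD-166D0C19CAD..."  # from the thread

con.executemany("INSERT INTO rki_sequenzen VALUES (?, ?)", [
    (id_a, "BA.1"),
    (id_b, "BA.1.1"),
    ("SYNTHETIC-LATE-DRAW", "BA.1"),         # synthetic: drawn after the cutoff
    ("SYNTHETIC-OTHER-LINEAGE", "B.1.1.7"),  # synthetic: not an Omicron lineage
])
# seq_reason 'N' is a placeholder value, not taken from the real data.
con.executemany("INSERT INTO rki_sequenzen_meta VALUES (?, ?, ?, ?, ?, ?)", [
    (id_a, "2020-12-22", "2022-01-13", "ILLUMINA", "N", 4779),
    (id_b, "2021-04-03", "2021-04-14", "ILLUMINA", "N", 21502),
    ("SYNTHETIC-LATE-DRAW", "2022-01-10", "2022-01-20", "ILLUMINA", "N", 4779),
    ("SYNTHETIC-OTHER-LINEAGE", "2021-03-01", "2021-03-10", "ILLUMINA", "N", 4779),
])

QUERY = """
SELECT m.date_draw, m.processing_date, s.ims_id, s.lineage,
       m.seq_type, m.sequencing_lab_pc
FROM rki_sequenzen AS s
INNER JOIN rki_sequenzen_meta AS m ON m.ims_id = s.ims_id
WHERE m.date_draw <= ?
  AND (s.lineage = 'B.1.1.529' OR s.lineage LIKE 'BA.%')
  AND m.seq_reason LIKE 'N%'
ORDER BY m.sequencing_lab_pc ASC, m.date_draw
"""

# ISO dates stored as text compare correctly as strings.
rows = con.execute(QUERY, ("2021-11-01",)).fetchall()
for row in rows:
    print(row)
```

Passing the cutoff date as a parameter makes it easy to try other dates without editing the query text.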