
Supporting NextSeq and MiniSeq

Open avilella opened this issue 7 years ago • 14 comments

Hi, what would be needed to support NextSeq and MiniSeq instruments? Anything I can provide?

avilella avatar Aug 14 '18 14:08 avilella

Hi!

Adding new instruments should be relatively simple. What is needed is:

  1. implement a new class for each new instrument here: https://github.com/Molmed/checkQC/blob/ccade4f13a191b8480ccea75ba65dbeba0263aae/checkQC/run_type_recognizer.py#L61
  2. add the correct instrument identifier prefix here: https://github.com/Molmed/checkQC/blob/ccade4f13a191b8480ccea75ba65dbeba0263aae/checkQC/run_type_recognizer.py#L187
  3. implement reasonable default QC criteria in the config file

I'd be happy to add support for them. What I would need from you is the prefix the instruments use and what you think would be reasonable default QC criteria. And since I don't have access to data from these instruments, it would be great if you could do some beta testing to make sure that everything works (or, if possible, send me some data that I could try it out on).

johandahlberg avatar Aug 14 '18 16:08 johandahlberg

Hm, I have a couple of NextSeq 500 runs here that I could get hands on.

apeltzer avatar Sep 12 '18 11:09 apeltzer

That's great, @apeltzer. I found some information which indicated that the NextSeq instruments have serial numbers that start with SN, is that correct? If I get a pre-release out, would you be willing to beta test it?

johandahlberg avatar Sep 12 '18 11:09 johandahlberg

I guess I could do that, yes. Regarding the serial number, I will check. It could very well be the case, though.

apeltzer avatar Sep 12 '18 11:09 apeltzer

This is what a normal FASTQ file looks like here:

@NS500559:25:HJHMNBGXX:1:11101:4226:1073 1:N:0:TTACTTCT+CTAACTTA
GATCTNGGTCTGGTTTCATCCGCGGCATTTTGCCACCCTGACCGGAGTGGTCTTTGCCGTCGGTTATCTGGGAAA
+
AAAAA#EEEEEEEEEEEEEEAEEEEEEE/A<EAE/EEAEEEEAEEEEEAEAA/E/EEEEEEEEEEAAA6EEEEE/
@NS500559:25:HJHMNBGXX:1:11101:18957:1076 1:N:0:TTACTTCT+CTAACTTA
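
For reference, the instrument ID is the first colon-separated field of the read header (`NS500559` in the lines above). A minimal sketch of pulling it out, assuming the header layout shown here; the function name is made up for illustration and is not part of checkQC:

```python
def instrument_id_from_fastq_header(header: str) -> str:
    """Return the instrument ID, i.e. the first colon-separated field
    of a FASTQ read header line (hypothetical helper, not a checkQC API)."""
    if not header.startswith("@"):
        raise ValueError(f"not a FASTQ header line: {header!r}")
    # Drop the leading '@', keep only the part before the first space
    # (the read identifier), then take its first colon-separated field.
    read_id = header[1:].split(" ", 1)[0]
    return read_id.split(":", 1)[0]


header = "@NS500559:25:HJHMNBGXX:1:11101:4226:1073 1:N:0:TTACTTCT+CTAACTTA"
print(instrument_id_from_fastq_header(header))
```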
 

apeltzer avatar Sep 12 '18 11:09 apeltzer

Thanks! Do you have any idea about what values can be used to differentiate between the High and Mid-output modes of the instrument? I'm guessing that information would be available somewhere like the runParameters.xml, but since I don't have a runfolder I can't check it.

johandahlberg avatar Sep 13 '18 08:09 johandahlberg

I'm linking in Stephen here, who should have access to such runParameters.xml files. Could you maybe make some available to Johan for that purpose? One for High and one for Mid output mode on a NextSeq 500?

@sc13-bioinf

apeltzer avatar Sep 13 '18 08:09 apeltzer

Here's some information based on our NextSeq 550 DX. The DX instrument version is certified for diagnostic use, so it has a different instrument ID, e.g. NDX550213 in our case.

I don't have access to medium output kit runs, but the high output ones have this in RunParameters.xml under the RunParameters node: <Chemistry>NextSeq High</Chemistry>

cbrueffer avatar May 15 '20 13:05 cbrueffer

Sorry for the very late reply, @cbrueffer, and thank you for the information. While we don't currently have the resources to implement this, we would very much welcome a PR to fix it.

There is a stale PR here https://github.com/Molmed/checkQC/pull/69 where I started work on this, which should basically take you through most of the changes that need to be made.

johandahlberg avatar May 28 '20 08:05 johandahlberg

No worries Johan; I haven't had time to look into this further yet (hopefully soon), but for now I can at least add some more information:

The mid output kit is marked as <Chemistry>NextSeq Mid</Chemistry> in RunParameters.xml.
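
As a rough illustration of how the two Chemistry values reported in this thread could be used to tell the modes apart, here is a sketch using only the Python standard library. The mode labels and the function name are assumptions made for the example, not checkQC identifiers, and only the `<Chemistry>` element is taken from the reports above:

```python
import xml.etree.ElementTree as ET

# Chemistry strings as reported in this thread; the labels on the right
# are illustrative names only, not checkQC identifiers.
CHEMISTRY_TO_MODE = {
    "NextSeq High": "high_output",
    "NextSeq Mid": "mid_output",
}


def output_mode_from_run_parameters(xml_text: str) -> str:
    """Map the <Chemistry> value of a RunParameters.xml document to a mode label."""
    root = ET.fromstring(xml_text)
    # Search anywhere below the root, since the exact nesting is an assumption.
    chemistry = root.findtext(".//Chemistry")
    if chemistry is None:
        raise ValueError("no <Chemistry> element found")
    try:
        return CHEMISTRY_TO_MODE[chemistry.strip()]
    except KeyError:
        raise ValueError(f"unrecognised chemistry: {chemistry!r}") from None


example = "<RunParameters><Chemistry>NextSeq Mid</Chemistry></RunParameters>"
print(output_mode_from_run_parameters(example))
```

Raising on an unknown value, rather than guessing a default, keeps an unsupported kit from silently getting the wrong QC thresholds.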

ID strings for the regular NextSeq start with @NS and @NB, according to https://github.com/OpenGene/fastp/blob/e30ec117f2dd45148942064128f0c9b3a48876e3/src/evaluator.cpp#L25
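
A prefix check along those lines might look like the sketch below. The prefix list only collects what has been reported in this thread (NS and NB for the regular NextSeq, NDX for our 550 DX) and is probably not exhaustive; the names are hypothetical and not taken from checkQC:

```python
# Prefixes mentioned in this thread; likely incomplete.
NEXTSEQ_PREFIXES = ("NS", "NB", "NDX")


def looks_like_nextseq(instrument_id: str) -> bool:
    """Return True if the instrument ID starts with a known NextSeq prefix.

    Expects the ID without the leading '@' of the FASTQ header line.
    """
    # str.startswith accepts a tuple and matches any of its entries.
    return instrument_id.startswith(NEXTSEQ_PREFIXES)


for instrument_id in ("NS500559", "NDX550213", "M01234"):
    print(instrument_id, looks_like_nextseq(instrument_id))
```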

cbrueffer avatar Oct 13 '20 14:10 cbrueffer

Nice! I can add that Illumina has NextSeq and MiniSeq data in their demo data collection (requires registration to access), which could perhaps serve as test data.

matrulda avatar Oct 13 '20 14:10 matrulda

I was wondering if there is any progress in supporting NextSeq?

maleasy avatar Sep 06 '21 09:09 maleasy

FYI, I started this PR: https://github.com/Molmed/checkQC/pull/119

lbeltrame avatar Feb 10 '25 07:02 lbeltrame

For those still following this issue: the PR has been merged, so once a new checkQC release is out, this can be closed.

lbeltrame avatar Feb 26 '25 06:02 lbeltrame