peppy
Allow a merged sample where some attributes are missing values
I ran into a problem where one sample in the sample_table has two runs, one paired-end and one single-end, and validation fails with the following error:
(databio) cgf8xr@cphg-fqvt2j3:~/databio/repos/pep-nextflow/pseudo_nextflow_task$ eido validate --st-index sample nextflow_files/samplesheet.csv -s samplesheet_schema.yaml -e
Found 1 samples with non-unique names: {'2612'}. Attempting to auto-merge.
Traceback (most recent call last):
File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/attmap/pathex_attmap.py", line 39, in __getattr__
v = super(PathExAttMap, self).__getattribute__(item)
AttributeError: 'Sample' object has no attribute 'fastq_2'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/attmap/ordattmap.py", line 46, in __getitem__
return super(OrdAttMap, self).__getitem__(item)
KeyError: 'fastq_2'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/attmap/pathex_attmap.py", line 42, in __getattr__
return self.__getitem__(item, expand)
File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/attmap/pathex_attmap.py", line 59, in __getitem__
v = super(PathExAttMap, self).__getitem__(item)
File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/attmap/ordattmap.py", line 48, in __getitem__
return AttMap.__getitem__(self, item)
File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/attmap/attmap.py", line 32, in __getitem__
return self.__dict__[item]
KeyError: 'fastq_2'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/cgf8xr/databio/venvs/databio/bin/eido", line 8, in <module>
sys.exit(main())
File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/eido/cli.py", line 104, in main
p = Project(cfg=args.pep, sample_table_index=args.st_index)
File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/peppy/project.py", line 138, in __init__
self.create_samples(modify=False if self[SAMPLE_TABLE_FILE_KEY] else True)
File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/peppy/project.py", line 164, in create_samples
self._auto_merge_duplicated_names()
File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/peppy/project.py", line 484, in _auto_merge_duplicated_names
flatten([getattr(s, attr) for s in dup_samples])
File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/peppy/project.py", line 484, in <listcomp>
flatten([getattr(s, attr) for s in dup_samples])
File "/home/cgf8xr/databio/venvs/databio/lib/python3.8/site-packages/attmap/pathex_attmap.py", line 46, in __getattr__
raise AttributeError(item)
AttributeError: fastq_2
I think peppy needs to handle this case.
Example file: samplesheet.csv
According to the Nextflow people (providers of the sample table) this example is valid because "You can sequence the same library across different platforms and chemistries, so you could have different run types for one or different libraries of the same sample (this regularly happens in aDNA)".
Ok, agreed.
Peppy should not choke on this or raise an error, but we should handle this and just add those attributes as appropriate.
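To make the intended behavior concrete, here is a minimal sketch of a merge that tolerates attributes missing on some of the duplicated rows. It is not peppy's implementation: `merge_duplicate_samples` is a hypothetical helper, samples are modeled as plain dicts rather than `Sample` objects, and the `fastq_1` column and the file names are made up for illustration. Only the sample name `2612` and the `fastq_2` attribute come from the traceback above.

```python
from collections import OrderedDict


def merge_duplicate_samples(samples, index="sample"):
    """Merge rows that share the same index value (hypothetical helper).

    Attributes that are absent or empty on some rows are skipped for those
    rows instead of raising, which is the behavior requested in this issue.
    """
    # Group rows by sample name, keeping first-seen order.
    grouped = OrderedDict()
    for row in samples:
        grouped.setdefault(row[index], []).append(row)

    merged = []
    for name, dups in grouped.items():
        if len(dups) == 1:
            merged.append(dict(dups[0]))
            continue

        # Union of attribute names across all duplicated rows.
        attrs = []
        for row in dups:
            for key in row:
                if key != index and key not in attrs:
                    attrs.append(key)

        out = {index: name}
        for attr in attrs:
            # Collect only the values that are actually present.
            values = [row[attr] for row in dups if row.get(attr) not in (None, "")]
            unique = list(dict.fromkeys(values))
            if not unique:
                continue  # no row has a value for this attribute
            # Keep a scalar when all rows agree, otherwise keep every value.
            out[attr] = unique[0] if len(unique) == 1 else values
        merged.append(out)
    return merged


# One paired-end run and one single-end run of the same sample.
rows = [
    {"sample": "2612", "fastq_1": "run1_R1.fastq.gz", "fastq_2": "run1_R2.fastq.gz"},
    {"sample": "2612", "fastq_1": "run2_R1.fastq.gz"},
]
merged = merge_duplicate_samples(rows)
```

In this sketch the single-end row simply contributes nothing to `fastq_2`, so the merged sample keeps one `fastq_2` value and a list of two `fastq_1` values. Whether a merged attribute should stay aligned with the runs (for example by padding with `None`) is a separate design choice that the sketch does not settle.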
https://github.com/pepkit/peppy/pull/396
After testing, this issue appears to be fixed. Closing.