peppy
peppy creates projects with no sample name column
Related to #473
Here's a CSV file, demo_fasta.csv:
assembly,local_file
demo0,data/demo/demo0.fasta
demo1,data/demo/demo1.fasta
demo2,data/demo/demo2.fasta
demo3,data/demo/demo3.fasta
demo4,data/demo/demo4.fasta
demo5,data/demo/demo5.fasta
demo6,data/demo/demo6.fasta
Here's a PEP YAML, demo_fasta.yaml:
sample_annotation: demo_fasta.csv
Watch this:
- Can't load the CSV file directly, because it has no sample_name column. This is correct:
p = peppy.Project("analysis/config/demo_fasta.csv")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/nsheff/.local/lib/python3.8/site-packages/peppy/project.py", line 163, in __init__
self.create_samples(modify=False if self[SAMPLE_TABLE_FILE_KEY] else True)
File "/home/nsheff/.local/lib/python3.8/site-packages/peppy/project.py", line 264, in create_samples
self._assert_samples_have_names()
File "/home/nsheff/.local/lib/python3.8/site-packages/peppy/project.py", line 561, in _assert_samples_have_names
raise InvalidSampleTableFileException(message)
peppy.exceptions.InvalidSampleTableFileException: sample_table is missing 'sample_name' column; you must specify sample_tables in sample_name or derive them
BUT, it's no problem going through the YAML (which provides nothing other than a pointer to the CSV):
p = peppy.Project("analysis/config/demo_fasta.yaml")
Config file does not have version key. Defaulting to 2.1.0
This happily creates a project with no samples in it, despite having the annotation table:
p
Project
_config_file: analysis/config/demo_fasta.yaml
_sample_table_path: null
_subsample_tables_path: null
_config:
sample_annotation: demo_fasta.csv
pep_version: 2.1.0
st_index: sample_name
sst_index:
- sample_name
- subsample_name
_samples: []
_samples_touched: False
is_private: False
progressbar: False
name: config
description: null
_sample_table: Empty DataFrame
Columns: []
Index: []
>>> p.samples
[]
Interesting. The error is actually that I mis-specified the sample_table attribute as sample_annotation.
So, the problem is actually that it's not warning me of a missing sample_table, leading to my confusion. When I correct that error, using the config does give the error I expect:
p = peppy.Project("analysis/config/demo_fasta.yaml")
Config file does not have version key. Defaulting to 2.1.0
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/nsheff/.local/lib/python3.8/site-packages/peppy/project.py", line 163, in __init__
self.create_samples(modify=False if self[SAMPLE_TABLE_FILE_KEY] else True)
File "/home/nsheff/.local/lib/python3.8/site-packages/peppy/project.py", line 262, in create_samples
self.modify_samples()
File "/home/nsheff/.local/lib/python3.8/site-packages/peppy/project.py", line 438, in modify_samples
self._assert_samples_have_names()
File "/home/nsheff/.local/lib/python3.8/site-packages/peppy/project.py", line 561, in _assert_samples_have_names
raise InvalidSampleTableFileException(message)
peppy.exceptions.InvalidSampleTableFileException: sample_table is missing 'sample_name' column; you must specify sample_tables in sample_name or derive them
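For illustration, the missing warning could look something like the sketch below. This is not peppy's code: `warn_if_no_sample_table` is a hypothetical helper, it operates on an already-parsed config dict rather than on a `Project`, and it assumes `sample_table` is the only key that points at the annotation CSV, as described above.

```python
import warnings

# Assumption: "sample_table" is the config key that points at the sample
# annotation CSV, as stated in this issue.
SAMPLE_TABLE_KEY = "sample_table"


def warn_if_no_sample_table(config: dict) -> bool:
    """Hypothetical helper (not part of peppy).

    Return True and emit a UserWarning when the parsed config has no
    non-empty sample_table entry; return False otherwise.
    """
    if config.get(SAMPLE_TABLE_KEY):
        return False
    warnings.warn(
        f"Config has no '{SAMPLE_TABLE_KEY}' key "
        f"(found keys: {sorted(config)}); the project will contain no samples.",
        UserWarning,
    )
    return True
```

Called with the mis-specified config from this report, `warn_if_no_sample_table({"sample_annotation": "demo_fasta.csv"})` warns and returns `True`; with the key corrected to `sample_table` it returns `False` without a warning.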
#473: That is not related. It is a red herring.