peppy
peppy copied to clipboard
the project missing `sample_name`, but with `sample_table_index` set correctly, cannot run in looper
Unclear if this is actually a looper issue
I just checked all peppy tests related to this issue, and it seems, this is not a peppy issue...
Is there some way of getting the sample's index (whether it be from sample_name
or some custom attribute) from a peppy sample object?
- You can specify different sample id. By default it is 'sample_name' (This functionality works fine)
- to get sample by id (whether it is
sample_name
or other id) you can use this function:prj.get_sample("sample_id")
- to get sample_table index:
prj.sample_table_index
- There is no function that will return id (if you don't know
sample_table_index
). So,sample.get_id()
is not implemented. -
Or, for example, does peppy allow you to use
sample.sample_name
when someone uses sample.sample as index withsample_table_index
: sample - this is not implemented either.
Traceback from Looper:
/home/drc/GITHUB/looper/master/looper/venv/bin/python -m looper run --looper-config .looper.yaml
Looper version: 1.6.0a2
Command: run
Using default divvy config. You may specify in env var: ['DIVCFG']
Pipestat compatible: False
Traceback (most recent call last):
File "/home/drc/GITHUB/looper/master/looper/venv/lib/python3.10/site-packages/peppy/simple_attr_map.py", line 45, in __getattr__
return self._mapped_attr[item]
KeyError: 'sample_name'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/drc/GITHUB/looper/master/looper/looper/__main__.py", line 8, in <module>
sys.exit(main())
File "/home/drc/GITHUB/looper/master/looper/looper/cli_looper.py", line 715, in main
return run(args, rerun=(args.command == "rerun"), **compute_kwargs)
File "/home/drc/GITHUB/looper/master/looper/looper/looper.py", line 457, in __call__
name=sample.sample_name,
File "/home/drc/GITHUB/looper/master/looper/venv/lib/python3.10/site-packages/peppy/simple_attr_map.py", line 47, in __getattr__
raise AttributeError(f"Attribute not found: {item}")
AttributeError: Attribute not found: sample_name
Process finished with exit code 1
This is if I attempt to use project config:
pep_version: 2.0.0
sample_table: sample_annotation.csv
sample_table_index: not_sample_name
with sample annotation:
not_sample_name,library,file,toggle
frog_1,anySampleType,data/frog1_data.txt,1
frog_2,anySampleType,data/frog2_data.txt,1
Confirmed that the issue is here in Peppy: https://github.com/pepkit/peppy/blob/c0affde7f93ef4cb084aeec25a8d82155d35299c/peppy/project.py#L641-L661
There is a reason why we don't want to do that. This change was made in this commit: https://github.com/pepkit/peppy/commit/3d5495b0241a117cdfb07ee60e15c6a3e2a516eb
So I think this is correct behaviour
After discussion, it was determined that this is something that should be solved within Peppy. We will return to using sample.sample_name
as a way to easily identify the sample identifier regardless of whether or not the user provided a different identifier via sample_table_index
. Generally, we want the user to call items via bracket notation sample[sample_name]
(since attmap has been removed). However, sample_name will be elevated such that the user can easily get the sample identifier via attribute access, e.g. sample.sample_name
.
This fix should happen in the above previously referenced Peppy code, and Looper changes should be reverted to access the same.sample_name
attribute.