peppy icon indicating copy to clipboard operation
peppy copied to clipboard

the project missing `sample_name`, but with `sample_table_index` set correctly, cannot run in looper

Open donaldcampbelljr opened this issue 1 year ago • 9 comments

donaldcampbelljr avatar Dec 05 '23 17:12 donaldcampbelljr

Unclear if this is actually a looper issue

donaldcampbelljr avatar Dec 05 '23 17:12 donaldcampbelljr

I just checked all peppy tests related to this issue, and it seems, this is not a peppy issue...

khoroshevskyi avatar Dec 07 '23 18:12 khoroshevskyi

Is there some way of getting the sample's index (whether it be from sample_name or some custom attribute) from a peppy sample object?

nsheff avatar Dec 18 '23 20:12 nsheff

  • You can specify different sample id. By default it is 'sample_name' (This functionality works fine)
  • to get sample by id (whether it is sample_name or other id) you can use this function: prj.get_sample("sample_id")
  • to get sample_table index: prj.sample_table_index

  • There is no function that will return id (if you don't know sample_table_index). So, sample.get_id() is not implemented.
  • Or, for example, does peppy allow you to use sample.sample_name when someone uses sample.sample as index with sample_table_index: sample - this is not implemented either.

khoroshevskyi avatar Dec 18 '23 21:12 khoroshevskyi

Traceback from Looper:

/home/drc/GITHUB/looper/master/looper/venv/bin/python -m looper run --looper-config .looper.yaml 
Looper version: 1.6.0a2
Command: run
Using default divvy config. You may specify in env var: ['DIVCFG']

Pipestat compatible: False
Traceback (most recent call last):
  File "/home/drc/GITHUB/looper/master/looper/venv/lib/python3.10/site-packages/peppy/simple_attr_map.py", line 45, in __getattr__
    return self._mapped_attr[item]
KeyError: 'sample_name'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/drc/GITHUB/looper/master/looper/looper/__main__.py", line 8, in <module>
    sys.exit(main())
  File "/home/drc/GITHUB/looper/master/looper/looper/cli_looper.py", line 715, in main
    return run(args, rerun=(args.command == "rerun"), **compute_kwargs)
  File "/home/drc/GITHUB/looper/master/looper/looper/looper.py", line 457, in __call__
    name=sample.sample_name,
  File "/home/drc/GITHUB/looper/master/looper/venv/lib/python3.10/site-packages/peppy/simple_attr_map.py", line 47, in __getattr__
    raise AttributeError(f"Attribute not found: {item}")
AttributeError: Attribute not found: sample_name

Process finished with exit code 1

This is if I attempt to use project config:

pep_version: 2.0.0
sample_table: sample_annotation.csv
sample_table_index: not_sample_name

with sample annotation:

not_sample_name,library,file,toggle
frog_1,anySampleType,data/frog1_data.txt,1
frog_2,anySampleType,data/frog2_data.txt,1

donaldcampbelljr avatar Dec 18 '23 22:12 donaldcampbelljr

image

donaldcampbelljr avatar Jan 16 '24 22:01 donaldcampbelljr

Confirmed that the issue is here in Peppy: https://github.com/pepkit/peppy/blob/c0affde7f93ef4cb084aeec25a8d82155d35299c/peppy/project.py#L641-L661

donaldcampbelljr avatar Jan 17 '24 17:01 donaldcampbelljr

There is a reason why we don't want to do that. This change was made in this commit: https://github.com/pepkit/peppy/commit/3d5495b0241a117cdfb07ee60e15c6a3e2a516eb

So I think this is correct behaviour

khoroshevskyi avatar Feb 15 '24 16:02 khoroshevskyi

After discussion, it was determined that this is something that should be solved within Peppy. We will return to using sample.sample_name as a way to easily identify the sample identifier regardless of whether or not the user provided a different identifier via sample_table_index. Generally, we want the user to call items via bracket notation sample[sample_name] (since attmap has been removed). However, sample_name will be elevated such that the user can easily get the sample identifier via attribute access, e.g. sample.sample_name.

This fix should happen in the above previously referenced Peppy code, and Looper changes should be reverted to access the same.sample_name attribute.

donaldcampbelljr avatar Mar 19 '24 16:03 donaldcampbelljr