grayskull
[FEATURE] Handling extras
Hi,
Would it please be possible to handle extras (like in pip install pettingzoo[atari]) when generating the recipe?
Ideally, doing grayskull pypi pettingzoo[atari] would fetch the extras and add them to the recipe.
Thanks,
Cyprien
hi @cyprienc ,
That is an interesting idea. It is not currently supported in grayskull, but it is a nice feature to have, indeed. I will add it to my todos :)
@marcelotrevisani Thanks for this awesome tool, first and foremost! Are you open to outside contributions? If so and if this feature isn't something you are actively working on, I'd love to help out.
Hi @zzhengnan , sorry for my delay. Yes, please. We would love some help, contributions are very welcome! :)
Apologies for the delay. I finally got some time to study the code today and want to jot down some notes in case someone else wants to look into this issue, since my bandwidth will be fairly limited through the end of the year.
As it stands today, the code is already capturing information about extra dependencies. To see this, we can place a breakpoint right after the following section and inspect pypi_metadata, sdist_metadata, and metadata.
https://github.com/conda-incubator/grayskull/blob/34d93e0ed57f68061cd6f23b967c3aee60a3fe81/grayskull/pypi/pypi.py#L555-L559
Using dask as an example (link to required dependencies, link to extra dependencies), I noticed the following
PyPI metadata contains info about extra dependencies as well as their categories
(Pdb) pypi_metadata['requires_dist']
['cloudpickle (>=1.1.1)',
'fsspec (>=0.6.0)',
'packaging (>=20.0)',
'partd (>=0.3.10)',
'pyyaml',
'toolz (>=0.8.2)',
"numpy (>=1.18) ; extra == 'array'",
"bokeh (!=2.0.0,>=1.0.0) ; extra == 'complete'",
"distributed (==2021.09.1) ; extra == 'complete'",
"jinja2 ; extra == 'complete'",
"numpy (>=1.18) ; extra == 'complete'",
"pandas (>=1.0) ; extra == 'complete'",
"numpy (>=1.18) ; extra == 'dataframe'",
"pandas (>=1.0) ; extra == 'dataframe'",
"bokeh (!=2.0.0,>=1.0.0) ; extra == 'diagnostics'",
"jinja2 ; extra == 'diagnostics'",
"distributed (==2021.09.1) ; extra == 'distributed'",
"pytest ; extra == 'test'",
"pytest-rerunfailures ; extra == 'test'",
"pytest-xdist ; extra == 'test'"]
sdist metadata contains info about just the categories
(Pdb) sdist_metadata['extras_require']
['diagnostics',
'array',
'complete',
'distributed',
'bag',
'delayed',
'dataframe',
'test']
After merging the two, the only info that's preserved is the categories
(Pdb) metadata['extras_require']
['diagnostics',
'array',
'complete',
'distributed',
'bag',
'delayed',
'dataframe',
'test']
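To illustrate what grouping the PyPI data by category could look like, here is a minimal sketch that works on strings shaped like the requires_dist entries above. The function name group_by_extra is hypothetical and not part of grayskull, and the regex only recognizes simple markers of the form extra == 'name'. Any other conditions in the same marker are dropped here, so a real implementation would probably want a proper requirement parser instead.

```python
import re
from collections import defaultdict

# Matches a marker clause such as: extra == 'complete'
# (assumption: only this simple form is handled)
_EXTRA_RE = re.compile(r"""extra\s*==\s*['"]([^'"]+)['"]""")


def group_by_extra(requires_dist):
    """Split requirement strings into required ones and per-extra groups.

    Hypothetical helper, not grayskull's API.
    """
    required = []
    extras = defaultdict(list)
    for entry in requires_dist:
        # Everything after the first ';' is the environment marker.
        requirement, _, marker = entry.partition(";")
        match = _EXTRA_RE.search(marker)
        if match:
            extras[match.group(1)].append(requirement.strip())
        else:
            # No extra marker: treat the entry as a regular dependency.
            required.append(entry.strip())
    return required, dict(extras)


sample = [
    "pyyaml",
    "toolz (>=0.8.2)",
    "numpy (>=1.18) ; extra == 'array'",
    "numpy (>=1.18) ; extra == 'dataframe'",
    "pandas (>=1.0) ; extra == 'dataframe'",
]
required, extras = group_by_extra(sample)
```

A mapping like this (category name to list of requirements) is one shape that could survive the merge with the sdist metadata, which currently keeps only the category names.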
In order to build the feature described in this issue, we need to achieve at least the following, possibly more
- Augment the command line interface so that specifications like grayskull pypi pettingzoo[atari] can be properly parsed
- Take the extra dependencies info from the PyPI metadata, group them by their category, and find a way to preserve that information after the two sets of metadata (i.e., PyPI and sdist) have been merged
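For the command line part, a rough sketch of splitting a specification such as pettingzoo[atari] into a package name and a list of extras might look like the following. The name split_extras is made up for illustration and this is not how grayskull parses its arguments today. It handles only a bare name with an optional bracketed list and rejects anything else, including version specifiers.

```python
import re

# name, optionally followed by [extra1, extra2]
# (assumption: no version specifier in the spec)
_SPEC_RE = re.compile(
    r"^\s*(?P<name>[A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:\[(?P<extras>[^\]]*)\])?\s*$"
)


def split_extras(spec):
    """Return (package_name, [extras]) for a spec like 'pettingzoo[atari]'.

    Hypothetical helper, not grayskull's API.
    """
    match = _SPEC_RE.match(spec)
    if match is None:
        raise ValueError(f"Unsupported package specification: {spec!r}")
    raw_extras = match.group("extras") or ""
    extras = [item.strip() for item in raw_extras.split(",") if item.strip()]
    return match.group("name"), extras
```

With this, split_extras("pettingzoo[atari]") yields the name pettingzoo plus the single extra atari, and the extras could then be used to select the matching groups from the PyPI metadata.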
Indeed, that can be achieved as you mentioned: we need to extend the CLI with new arguments that specify which extras we want to extract, and then add some logic to include those packages. Yeah, pretty much that. Very detailed explanation! :)
Are you willing to open a PR?
I'll likely have limited bandwidth in the next few months. If that's not going to be problematic, I'm happy to keep digging. On the other hand, I want to be respectful of everyone's time and don't want to be the bottleneck for this feature not being built. Please let me know either way.
Duplicate of https://github.com/conda-incubator/grayskull/issues/150