cudf icon indicating copy to clipboard operation
cudf copied to clipboard

Enable numpydoc validation

Open vyasr opened this issue 2 years ago • 17 comments

numpydoc supports docstring validation that we should enable to supplement things like pydocstyle checks. The numpydoc validation will help us with more semantic checking, such as ensuring that docstrings actually align with the function signatures that they are attached to.

vyasr avatar Jul 12 '22 20:07 vyasr

Hi, I'd like to help out on this. When you say enabling docstring validation, is this the same thing as running it?

ShahRishi avatar Jul 12 '22 22:07 ShahRishi

Hi @ShahRishi, we'd love your help! It's not quite that simple, although it's close. You'll need to build the documentation with the checks turned on (that's how you run it), but then you will also need to fix any issues that it finds. Here are the instructions on building the docs. My recommendation would probably be to clone the repository and install the conda environment as it recommends, but instead of building cudf from source just try to install cudf from the rapidsai-nightly conda channel so that you get the most up-to-date version of the package without having to build it yourself. Then you should be able to start from step 2 on that page and build the documentation as described. Does that make sense? Let me know if you have any further questions!

vyasr avatar Jul 12 '22 23:07 vyasr

Hi, just wanted to let you know that I'm still working on this. Sorry for taking so long, this is new for me!

ShahRishi avatar Jul 15 '22 17:07 ShahRishi

Hi, I think I realize the problem that I'm having. I am working on a 3060ti GPU, which is not listed as a RAPIDS compatible GPU. Am I able to build the documentation with this GPU?

ShahRishi avatar Jul 17 '22 19:07 ShahRishi

Hmm I don't think that should be a problem. We only require a Pascal architecture, and the 3060ti is Ampere. What errors are you running into? Did you actually build cudf before building the documentation? cudf must be installed before building the docs.

vyasr avatar Jul 18 '22 20:07 vyasr

Hi, just wanted to let you know that I'm still working on this. Sorry for taking so long, this is new for me!

No worries! Please feel free to keep asking questions.

vyasr avatar Jul 18 '22 20:07 vyasr

Just wanted to let you know I haven't abandoned this and am still working!

ShahRishi avatar Aug 01 '22 06:08 ShahRishi

No problem! Please let us know if you need help with anything. You can also find us on Slack.

shwina avatar Aug 01 '22 09:08 shwina

@ShahRishi were you able to resolve the build issues? Hope that isn't still holding you up.

vyasr avatar Aug 01 '22 19:08 vyasr

Hi, I apologize for the delay. I've had to put this on hold as I've been studying for the MCAT exam through August. I am still working and am learning a lot. I know this is taking me far longer than it should, but I am determined to figure this out. Thank you so much for your patience.

ShahRishi avatar Sep 08 '22 01:09 ShahRishi

Hi @ShahRishi - it's no problem! We'll keep this open and assigned to you.

shwina avatar Sep 08 '22 03:09 shwina

Hello, thank you again for your patience. I have successfully added the validation checks and rebuilt the docs. I realized that the problem I was having was that I had an outdated version of cmake. After running the validation checks, several thousands of warnings appeared. Should I fix all of these warnings? If not, among the full mapping of validation checks are there only certain ones I should address? Thank you!

ShahRishi avatar Sep 14 '22 03:09 ShahRishi

Hi @ShahRishi, I'm glad that you were able to get this working! Could you post a sample of some of the warnings that you are seeing? Since this is the first time that we're enabling this, I think what we will want to do is make a sequence of PRs where each one just fixes all instances of one or two errors. That will let us figure out which warnings are worth fixing and which ones (if any) we want to ignore. If you can post some examples on this issue, we can evaluate those warnings a few at a time.

vyasr avatar Sep 19 '22 20:09 vyasr

Hi @vyasr! Thank you for the feedback. I am unfortunately unable to access my computer right now, but I will paste all of the warnings delivered and post them after 9pm PST tonight! Thank you again!

ShahRishi avatar Sep 19 '22 20:09 ShahRishi

Great, thank you! If you want to paste all of the warnings, please paste them into a text file and attach it here rather than copying directly, otherwise it will be very difficult to parse through them on the Github UI.

vyasr avatar Sep 19 '22 21:09 vyasr

Hi! I have uploaded the warnings and added a file below. I thought it might be better to see the general scope of the warnings since I was afraid that I would not include some important warnings.

warnings_cudf_numpydoc.txt

ShahRishi avatar Sep 20 '22 02:09 ShahRishi

Awesome! Here's a command that gives a quick idea of what warnings are showing up:

grep -v WARNING warnings_cudf_numpydoc.txt | sort | uniq

To get just a list of error codes you can also do

grep -v WARNING warnings_cudf_numpydoc.txt | sort | uniq | awk '{printf $1"\n"}' | uniq

From that, I see:

ES01:
EX01:
GL01:
GL02:
GL03:
GL06:
GL07:
GL08:
PR01:
PR02:
PR03:
PR04:
PR05:
PR06:
PR07:
PR08:
PR09:
PR10:
RT01:
RT02:
RT03:
RT04:
RT05:
SA01:
SA02:
SA04:
SS03:
SS05:
SS06:
*
/home/rishis/anaconda3/envs/cudf_dev/lib/python3.9/site-packages/cudf/core/series.py:docstring

From looking through that output, these all look like real errors that we should fix (although a couple of them are very pedantic like ending every line with a period). We can save the last few of lines until the end since those represent issues beyond just the docs (mostly AttributeError from something failing to import). The numpydoc website has a full list of error codes to explain what each of these are. Since these all do look like real errors, my recommendation would be to pick one or two of the above error codes at a time, find all of the issues, and make a PR fixing just that issue. That should make it fairly easy to get your PRs reviewed quickly. Does that make sense?

vyasr avatar Sep 20 '22 23:09 vyasr

This makes perfect sense. I will work on an error and create a pull request once I am ready. Thank you!

ShahRishi avatar Sep 21 '22 21:09 ShahRishi

@ShahRishi did you ever have any luck with this?

vyasr avatar Aug 17 '23 21:08 vyasr

Hi @vyasr. Thank you for reaching out, I unfortunately have had issues with this, so I slowed down. Is this something that is still an open issue?

ShahRishi avatar Nov 04 '23 17:11 ShahRishi

Hi @ShahRishi, yes it is and we would still welcome your contributions here. If there's something we can do to help unblock you, please let us know!

vyasr avatar Nov 06 '23 23:11 vyasr