mne-python Add API entry list and map

Add API entry list and map

Open alexrockhill opened this issue 1 year ago • 5 comments

Related to https://github.com/sphinx-gallery/sphinx-gallery/pull/983. Just building first, can add a link in documentation after.

Aug 02 '22 22:08 alexrockhill

FYI the API entry list for this will look horrible because CircleCI won't build any examples given the current diff. You could push first with [circle front] to run just those examples to see if it's different, and if that works, a [circle full] commit should give us the true report

Aug 02 '22 22:08 larsoner

Yeah, doesn't look very good https://output.circle-artifacts.com/output/job/1ea52c68-79dc-4487-a8b1-b41a0606927a/artifacts/0/dev/auto_examples/sg_api_usage.html.

Aug 02 '22 23:08 alexrockhill

The graph doesn't look interpretable either. Maybe the connections should just be for the immediate module...

Aug 02 '22 23:08 alexrockhill

Ok, I ran it locally and the graphs just have too small of entries to read. However, the .dot files are also saved out so you can view those interactively with graphviz and I think it might look really cool but even the raw image files are too large for me to view interactively... it might just be too big of a graph... I'll think on it

Aug 03 '22 00:08 alexrockhill

Hmmm it's still saying all the API entries are used but when I build locally I get more reasonable results that ~25% of API entries are used https://output.circle-artifacts.com/output/job/e63a86c9-573a-4c81-8d66-76490af12c4e/artifacts/0/dev/auto_examples/sg_api_usage.html#. Maybe the latest development in the sphinx-gallery PR will fix that.

Aug 08 '22 16:08 alexrockhill

Looks okay, needed to remove __ methods though and not sure where the graphs went https://output.circle-artifacts.com/output/job/e8c84b67-a72c-413a-b2a6-3ab289f0fd4b/artifacts/0/dev/sg_api_usage.html

Aug 16 '22 02:08 alexrockhill

The failure is a timeout for downloading data. I'm going to try without circle full and I think that might work. https://app.circleci.com/pipelines/github/mne-tools/mne-python/15791/workflows/6093e339-3143-49eb-b904-95208495dc0b/jobs/47901

Aug 23 '22 16:08 alexrockhill

There is some warning you can fix. But I'll also manually restart the full build, since I think we do want to see it

Aug 23 '22 16:08 larsoner

Okay this is at least building:

https://app.circleci.com/pipelines/github/mne-tools/mne-python/15791/workflows/5517e63b-8b5f-4ff0-92ef-c7b1e9f392bd/jobs/48074

Even if it fails due to a warning, I think the artifacts could be interpretable if it at least finishes

Aug 23 '22 16:08 larsoner

Warning does seem related. I won't paste it here because it's a huge error message, but if you ctrl-F "warning" in the CircleCI log for the most recent commit you'll see it:

https://app.circleci.com/pipelines/github/mne-tools/mne-python/15868/workflows/d14a4555-8242-41de-90d8-81021df28fa0/jobs/48066?invite=true#step-139-2040

Aug 23 '22 16:08 larsoner

Warning does seem related. I won't paste it here because it's a huge error message, but if you ctrl-F "warning" in the CircleCI log for the most recent commit you'll see it:

https://app.circleci.com/pipelines/github/mne-tools/mne-python/15868/workflows/d14a4555-8242-41de-90d8-81021df28fa0/jobs/48066?invite=true#step-139-2040

Ah okay, so four of the graphs are too large but then they just get scaled. Maybe it's okay to suppress this warning then as long as they look okay?

Aug 23 '22 16:08 alexrockhill

Otherwise, looks pretty good! https://output.circle-artifacts.com/output/job/c9635f83-07fc-46b7-8744-95a231567fc5/artifacts/0/dev/sg_api_usage.html

Aug 23 '22 16:08 alexrockhill

IMO, the images for the used API entries should also be hidden inside the <details> when it's collapsed.

Also earlier you said

needed to remove __ methods though

but I still see them in the list of unused entries. is that correct? Did I miss something?

Aug 23 '22 17:08 drammock

Can you check the sizes on the produced images? I'm worried that we're hitting some limit.

Also, for me the "used nodes" graph looks like it ends up outside the "details":

To me the "Used nodes" is not super useful, since it's easy to see where mne.Annotations is used just by looking at it's rendered doc. Maybe we should make this entire section optional to save time/space/processing, not sure...

Aug 23 '22 17:08 larsoner

IMO, the images for the used API entries should also be hidden inside the <details> when it's collapsed.

Also earlier you said

needed to remove __ methods though

but I still see them in the list of unused entries. is that correct? Did I miss something?

Yeah not sure why that's the case, I think the re line might be incorrect.

Aug 23 '22 17:08 alexrockhill

Can you check the sizes on the produced images? I'm worried that we're hitting some limit.

Also, for me the "used nodes" graph looks like it ends up outside the "details":

To me the "Used nodes" is not super useful, since it's easy to see where mne.Annotations is used just by looking at it's rendered doc. Maybe we should make this entire section optional to save time/space/processing, not sure...

Agreed, although I think it would be cool to see. My motivation was the beamformer case and I thought it would be nice to see all the tutorials and examples that used both DICS and LCMV because I found new ones late on and I just thought it would be nice. I'll make it optional in the other PR and we can turn it off for MNE.

Aug 23 '22 17:08 alexrockhill

Okay, I think this is it https://output.circle-artifacts.com/output/job/b27285e1-829c-4e7b-af00-73e825d7955e/artifacts/0/dev/sg_api_usage.html

Aug 23 '22 20:08 alexrockhill

The biggest takeaway that I see right away is that there are a lot of documented API entries like in https://mne.tools/stable/generated/mne.beamformer.Beamformer.html where I think because the object inherits from a dictionary, you get things like fromkeys documented with the default dictionary documentation which might not be ideal.

Aug 23 '22 20:08 alexrockhill

you have the fromkeys etc for all containers that inherit from dict in mne. Beamformer is like Ingo or Projection etc. I would consider this a second order problem.

Message ID: @.***>

Aug 23 '22 21:08 agramfort

you have the fromkeys etc for all containers that inherit from dict in mne. Beamformer is like Ingo or Projection etc. I would consider this a second order problem. … Message ID: @.***>

Agreed, I wouldn't fix that here, it was just noticeable from the output. Might be nice to fix at some point.

Aug 23 '22 21:08 alexrockhill

You can see Forward does not suffer from this problem

https://mne.tools/dev/generated/mne.Forward.html#mne.Forward

It's because we use a special template for it:

https://github.com/mne-tools/mne-python/blame/main/doc/forward.rst#L9

If we do the same for the other classes that inherit from dict (or list), I think those methods will go away

Aug 23 '22 22:08 larsoner

I think this is a good POC for sphinx-gallery!

Do you want to work toward getting this in MNE? To me what is missing is "just" a correct regular expression to exclude stuff we don't care about ever documenting. To me this is:

dict methods of subclasses (.copy, .update, etc.) -- from a quick look they're probably all okay to exclude because the overlap(s) (maybe just copy?) is used all over the place so its use case is unambiguous
base classes like BaseEpochs
lesser-known variants of a well-known class, e.g., we probably don't care about MixedSourceEstimate or MixedVectorSourceEstimate, but probably do care about SourceEstimate and ~~MixedSourceEstimate~~ VectorSourceEstimate

Doing this will probably require use of (thing|other) where thing and other (and more of these) make use of .*, ^, and $ to exclude just the right items.

Whoever works on this, the easiest way to move forward would probably be to put all of the names currently in that HTML in a text file on your system, iterate using a python regex on it until the list looks like "what we actually want to document", then test this again here using the SG config key

Aug 23 '22 22:08 larsoner

Sure, I'm happy to do that, should just take a minute. Better make the PR in SG test the regular expression on the whole name (including extension classes) and not just the function name like it does now to facilitate this.

Aug 23 '22 22:08 alexrockhill

Ok, this gets it down to 248 unused entries. The entries in the mixins are tough to adjudicate but I assume if they're in a mixin, they are used once and so probably best to exclude them everywhere.

(.*__.*__|.*Base.*|.*Array.*|mne.Vector.*|mne.Mixed.*|mne.Vol.*|mne.coreg.Coregistration.*|.*utils.*|.*verbose\(\)|.*copy\(\)|.*update\(\)|.*save\(\)|.*get_data\(\)|.*add_channels\(\)|.*add_reference_channels\(\)|.*anonymize\(\)|.*apply_baseline\(\)|.*apply_function\(\)|.*apply_hilbert\(\)|.*as_type\(\)|.*decimate\(\)|.*drop\(\)|.*drop_channels\(\)|.*drop_log_stats\(\)|.*export\(\)|.*get_channel_types\(\)|.*get_montage\(\)|.*interpolate_bads\(\)|.*next\(\)|.*pick\(\)|.*pick_channels\(\)|.*pick_types\(\)|.*plot_sensors\(\)|.*rename_channels\(\)|.*reorder_channels\(\)|.*savgol_filter\(\)|.*set_eeg_reference\(\)|.*set_channel_types\(\)|.*set_meas_date\(\)|.*set_montage\(\)|.*shift_time\(\)|.*time_as_index\(\)|.*to_data_frame\(\)|.*clear\(\)|.*fromkeys\(\)|.*get\(\)|.*items\(\)|.*keys\(\)|.*pop\(\)|.*popitem\(\)|.*setdefault\(\)|.*values\(\)|.*apply\(\)|.*decision_function\(\)|.*fit\(\)|.*fit_transform\(\)|.*get_params\(\)|.*predict\(\)|.*predict_proba\(\)|.*set_params\(\)|.*transform\(\)|.*.remove.*|.*.write.*)

I'll rebuild so we can see.

Aug 23 '22 23:08 alexrockhill

I tried to format it nicely so that if dictionary inherited items are refactored, those could just be removed in the future. I'll update once it builds.

Aug 23 '22 23:08 alexrockhill

Traceback (most recent call last):
  File "/home/circleci/python_env/lib/python3.8/site-packages/sphinx/events.py", line 94, in emit
    results.append(listener.handler(self.app, *args))
  File "/home/circleci/python_env/lib/python3.8/site-packages/sphinx_gallery/gen_gallery.py", line 816, in write_api_entry_usage
    if re.match(gallery_conf['missing_doc_ignore'], entry) is not None:
  File "/usr/lib/python3.8/re.py", line 191, in match
    return _compile(pattern, flags).match(string)
  File "/usr/lib/python3.8/re.py", line 303, in _compile
    raise TypeError("first argument must be string or compiled pattern")
TypeError: first argument must be string or compiled pattern

So this needs to be fixed here. But also this should be validated / compiled in complete_gallery_conf in sphinx-gallery so that the error is raised at init instead of much later (here, after all examples have run)

Aug 24 '22 11:08 larsoner

Good point, I just added those checks to the other PR.

Aug 24 '22 17:08 alexrockhill

Huh, the regular expression didn't seem to have any effect... testing locally. It was after the whole name commit so I would have thought it would have worked...

Aug 24 '22 17:08 alexrockhill

Huh, the regular expression didn't seem to have any effect... testing locally. It was after the whole name commit so I would have thought it would have worked...

Ah okay the issue was that I shouldn't have escaped the parentheses. I'm not sure why that worked when I was testing... Anyway should be good now.

Aug 24 '22 18:08 alexrockhill

Ok, locally it looks great now! The biggest issue I think is that methods aren't properly associated with the examples that they are used in (e.g. mne.viz.Brain.add_head says it's unused but it's in https://mne.tools/dev/auto_examples/visualization/brain.html#sphx-glr-auto-examples-visualization-brain-py. This seems like a larger sphinx_gallery issue and one that might be hard to fix since it involves tracking the variable that the class is assigned to when it's made and associating that with methods when they are called... Not sure if anyone knows a solution to this.

Aug 24 '22 18:08 alexrockhill

mne-python mne-python copied to clipboard

Add API entry list and map

mne-python
mne-python copied to clipboard