#504

BREAKING CHANGES:

Schema change on database (fixed by auto-versioning)
SearchResult (only for the full API, not the simple one)

Other concerns:

Repeatedly stores module docstrings (we should probably move it to a separate table)
Increases the DB size (5mb vs 20mb same env)
There's some limitations on what we can and cannot get docstrings for (see typing). Since we use AST for these its incomplete and not always accurate (due to decorators or such)
Slightly increases cache time 1.33 sec to 1.43 sec on large env

Preview

Side note: I really love the ORM you made, its very cool

Checklist (delete if not relevant):

[x] Database migrator (aka check if it has the right columns and wipe otherwise)
[x] I have added tests that prove my fix is effective or that my feature works
[x] I have updated CHANGELOG.md
[x] I have made corresponding changes to user documentation for new features
[x] I have made corresponding changes to library documentation for API changes
[ ] I have added preferences to enable/disable this (blocked by #516)

Nov 03 '22 06:11 bagel897

Hi bagel, thank you for the contribution.

Since this database is actually really just a "cache"/"index" of the files on disk, I think we don't really need a full blown migration. So, rather than checking whether the database columns matches, we can just have a table to store metadata for the database file, e.g. the version of rope that created the database. If the rope version doesn't match, then just wipe the tables.

For convenience during development of rope itself, where version number may not reflect actual on-disk changes, we can just wipe the database when the hashes of the files in rope/contrib/autoimport directory changes, but I think even a manual wipe with rm -r .ropeproject would have been fine.

Nov 03 '22 07:11 lieryan

Good point - the tests are in memory so they should work but I'll look into the version table

Nov 03 '22 13:11 bagel897

Hi Bagel, thanks for this, the changes for collecting documentation looks reasonable to me. However, I'm not quite seeing the reason why it is necessary to index the docstrings, maybe you can help me understand here.

Autocompletions need to index names because it needs fast search of the python names as the user types, but once you get the possible matches, wouldn't it be possible to just retrieve the docstring when asked by the editor in real time rather than preemptively returning all docstrings? For example, LSP >= 3.16.0 allows lazy resolution of an autocompleted name's documentation.

Resolving in real-time would allow for more complex resolution strategies as well that wouldn't really be easy during indexing time, for example, if a name is aliased or re-exported by another module, or you can retrieve the documentation for global object instances by inferring the type of the object and finding the docstring of its class.

Nov 15 '22 04:11 lieryan

That is a valid point - this can't get more than like 40% of docstrings so an import based solution would be better for these. On the other hand, I think the increased DB size and time isn't too bad, so it may be worth it to add it here as a baseline and let other projects build atop it (perhaps they could import if its missing/blank?). Also importing has the arbitrary code execution issue while this is AST based.

Nov 15 '22 05:11 bagel897

initial docstring support

BREAKING CHANGES:

Other concerns:

Preview

Checklist (delete if not relevant):