Cross-link to external docsets

Open jpsim opened this issue 11 years ago • 22 comments

jpsim avatar Jul 03 '14 04:07 jpsim

Linking to Apple's docs and other projects should be possible, but will likely be a fairly difficult task.

jpsim avatar Nov 01 '14 02:11 jpsim

If you get the module that an external symbol refers to, this might be possible in cocoadocs

segiddins avatar Nov 01 '14 03:11 segiddins

Agreed.

jpsim avatar Nov 01 '14 03:11 jpsim

What's the progress on this?

istx25 avatar Nov 23 '16 14:11 istx25

No one's done any work on this as far as I can tell.

jpsim avatar Nov 26 '16 01:11 jpsim

Invoking a command like this:

xcrun docsetutil search -skip-text -query CLLocation ~/Library/Developer/Shared/Documentation/DocSets/com.apple.adc.documentation.iOS.docset

yields output like this:

 Swift/cl/-/CLLocation   documentation/CoreLocation/Reference/CLLocation_Class/index.html#//apple_ref/swift/cl/c:objc(cs)CLLocation
 Objective-C/cl/-/CLLocation   documentation/CoreLocation/Reference/CLLocation_Class/index.html#//apple_ref/occ/cl/CLLocation

We can construct a URL based on one of these file path–apple_ref combinations:

https://developer.apple.com/documentation/CoreLocation/Reference/CLLocation_Class/index.html#//apple_ref/occ/cl/CLLocation

which redirects to:

https://developer.apple.com/reference/corelocation/cllocation#//apple_ref/occ/cl/CLLocation

The query can be the name of a class, method, etc., but the redirect for a method only goes to the class reference. We can specify multiple symbols at a time, separated by spaces.
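A rough sketch of the URL construction described above, in Python. It only parses output that has already been captured from docsetutil; the two-column layout is assumed from the sample output, and the helper names (`parse_docsetutil_line`, `objc_url`) are made up for illustration.

```python
BASE_URL = "https://developer.apple.com/"

# Sample docsetutil output, copied from the comment above.
SAMPLE_OUTPUT = """\
 Swift/cl/-/CLLocation   documentation/CoreLocation/Reference/CLLocation_Class/index.html#//apple_ref/swift/cl/c:objc(cs)CLLocation
 Objective-C/cl/-/CLLocation   documentation/CoreLocation/Reference/CLLocation_Class/index.html#//apple_ref/occ/cl/CLLocation
"""


def parse_docsetutil_line(line):
    """Split one result line into (language, symbol_name, url).

    Assumes the format "<language>/<kind>/<scope>/<name>   <path>#<apple_ref>".
    """
    key, path = line.split(None, 1)
    language, _kind, _scope, name = key.split("/", 3)
    return language, name, BASE_URL + path.strip()


def objc_url(output, symbol):
    """Return the URL for the Objective-C entry of `symbol`, or None."""
    for line in output.splitlines():
        if not line.strip():
            continue
        language, name, url = parse_docsetutil_line(line)
        if language == "Objective-C" and name == symbol:
            return url
    return None


print(objc_url(SAMPLE_OUTPUT, "CLLocation"))
```

The resulting URL is the pre-redirect form; following the redirect is left to the browser.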

1ec5 avatar Jan 09 '17 08:01 1ec5

Nice finds @1ec5!

jpsim avatar Jan 10 '17 18:01 jpsim

With reference to my other comment (#190), should this be integrated into SourceKitten or jazzy? Additionally, do you think just reading the sqlite database and text-searching for the module, then the symbol, is enough, or should I try to reverse engineer the Xcode frameworks some more?

galli-leo avatar Jun 15 '18 16:06 galli-leo

I think it would go in Jazzy as a last step in resolving autolinks. Jazzy already has a dependency on the Ruby sqlite3 gem (for the docset builder) which groks that db fine, so should be fast to query it.

johnfairh avatar Jun 16 '18 09:06 johnfairh

@johnfairh Gotcha. Do you know if cocoadocs also stores the search json file? The few doc pages from there that I tried didn't seem to have it :(. If it does, we could even add doc links to all cocoapods.

galli-leo avatar Jun 16 '18 11:06 galli-leo

I will try to get something working today and throw up a pr for discussion

galli-leo avatar Jun 16 '18 11:06 galli-leo

Whooo some good news! Finally figured out how Xcode generates the uuids for the sqlite db. This should simplify this immensely. It's just a shortened sha1 of the usr of a symbol (provided it's actually a symbol).
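For illustration only, a "shortened sha1 of the usr" could look like the Python sketch below. The thread does not say how the digest is shortened or encoded, so the hex encoding and the `length` parameter are assumptions, and `short_usr_hash` is a hypothetical name.

```python
import hashlib


def short_usr_hash(usr: str, length: int) -> str:
    """Return the first `length` hex characters of the SHA-1 of a USR.

    Assumption: the real scheme may truncate or encode the digest differently.
    """
    digest = hashlib.sha1(usr.encode("utf-8")).hexdigest()
    return digest[:length]


# "s:SS" is the USR for String that appears later in this thread.
print(short_usr_hash("s:SS", 8))
```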

Regarding whether to implement this in jazzy or sourcekitten, we definitely have to rework autolinking in jazzy, because at that step we no longer have any usr information. Also, I think this is better suited to either sourcekitten or a separate app, since we need to execute cursorinfo for every symbol to get the usr.

Edit: After actually looking at the sourcekitten output, that should be enough to get it working (the fully annotated decl already contains the usr for any external symbol). However, imo it would be nice to integrate that into sourcekitten, since then others besides jazzy could also profit from it.

Edit 2: Hmm, it seems that everything contains a usr link (if available). Maybe it would be a good idea to use that for autolinking instead of searching by name?

galli-leo avatar Jun 16 '18 15:06 galli-leo

Ok so my current implementation idea would be:

  1. Use annotated_decl if available (looks like this: <Declaration>public override var description: <Type usr="s:SS">String</Type> { get }</Declaration>)
  2. Read that as xml and "strip" any tags that we don't need. Keep the Type tag (or any other tag with usr info, though I haven't come across one). Normalize it so it always has the same form and convert the usr to an external / internal link (e.g. <USRLINK url="...">...</USRLINK>).
  3. Write a custom rouge lexer that scans for that and returns a custom token. (Already implemented that)
  4. Modify the HTML lexer to output an a tag with the url. (Already implemented that)
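Steps 1 and 2 could be sketched roughly like this in Python (not the actual implementation, which lives in SourceKitten/jazzy). The function name `to_usrlink_markup` and the `resolve_url` callback are hypothetical; the sample declaration and the String URL are taken from this thread.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def to_usrlink_markup(annotated_decl, resolve_url):
    """Flatten an annotated declaration, keeping only resolvable usr info.

    Every element that carries a `usr` attribute and resolves to a URL becomes
    a <USRLINK> element; all other tags are stripped and only their text kept.
    """
    root = ET.fromstring(annotated_decl)
    parts = []

    def walk(node):
        usr = node.get("usr")
        url = resolve_url(usr) if usr else None
        if url:
            parts.append("<USRLINK usr=%s url=%s>" % (quoteattr(usr), quoteattr(url)))
        if node.text:
            parts.append(escape(node.text))
        for child in node:
            walk(child)
            if child.tail:
                parts.append(escape(child.tail))
        if url:
            parts.append("</USRLINK>")

    walk(root)
    return "".join(parts)


# Hypothetical lookup table standing in for the real usr-to-URL resolution.
KNOWN_URLS = {"s:SS": "https://developer.apple.com/documentation/swift/string"}

decl = ('<Declaration>public override var description: '
        '<Type usr="s:SS">String</Type> { get }</Declaration>')
print(to_usrlink_markup(decl, KNOWN_URLS.get))
```

A usr that does not resolve simply falls back to plain text, which keeps the output valid for the lexer step.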

With this "new system", should the old autolinking methods still be kept? Or should we replace anything we find with the same USRLINK tokens and let the lexer & highlighter handle everything (IMO the better option)? Unfortunately, I haven't found a way to convert the dot notation to a usr (if anyone has an idea that would be great), so autolinking to external docs will be difficult. We can still use just text searching, but that will probably not be as reliable.

Additionally, we could even start linking code blocks inside markdown by passing the code blocks to sourcekit's index request and then inserting the provided usrs. (Though that's probably out of scope)

galli-leo avatar Jun 16 '18 22:06 galli-leo

The existing autolinking methods are also used for references to symbols within backticks in documentation comments. I don’t think those references get marked up with USRs.

1ec5 avatar Jun 16 '18 23:06 1ec5

@1ec5 Yeah, that's what I meant by dot notation. So there we would either have to go back to the old system or resolve the usrs ourselves using the old methods.

galli-leo avatar Jun 16 '18 23:06 galli-leo

Nice detective work!

My 2p: I think that searching the DB (approx select reference_path from map where reference_path like '%/uiviewcontroller/%' etc. for method references) as an addition to the current autolink resolver would be a good first step. This would address Swift declarations, Objective-C declarations, and references to types/methods made from markdown docs or doc comments. I understand this might not be 100% accurate but I feel it will be pretty successful.
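As a small illustration of the DB search, here is a Python sketch against an in-memory SQLite database. Only the `map` table and `reference_path` column names come from the query above; the rest of the schema and the two sample rows are invented for the demo, and `find_reference_paths` is a hypothetical helper.

```python
import sqlite3


def find_reference_paths(conn, type_name):
    """Return reference paths that contain `/<type_name>/`, lowercased."""
    pattern = "%/" + type_name.lower() + "/%"
    rows = conn.execute(
        "SELECT reference_path FROM map WHERE reference_path LIKE ?",
        (pattern,),
    )
    return [row[0] for row in rows]


# Stand-in database: the real docs DB has more columns and real paths.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE map (reference_path TEXT)")
conn.executemany(
    "INSERT INTO map VALUES (?)",
    [
        ("documentation/uikit/uiviewcontroller/hypothetical-member",),
        ("documentation/uikit/uiview/hypothetical-member",),
    ],
)

print(find_reference_paths(conn, "UIViewController"))
```

Passing the pattern as a bound parameter avoids quoting problems, though `%` and `_` inside a symbol name would still act as wildcards.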

Then look at the USRLINK part, and at rearranging the data structures to be USR-based, as a separate piece for Swift. There are definitely good reasons for doing that: performance, features, that last few % of accuracy. But there are common places where this doesn't work (objc decls, markdown), so I'd prefer to do 'good enough' on the general case first.

On the sourcekitten/jazzy thoughts - maybe you could output an extra JSON object from sourcekitten doc that maps from USRs to either apple doc URLs or the DB uuid, where those USRs were accumulated and uniqued from the preceding doc json.
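A sketch of that extra mapping, in Python and purely for illustration: walk the doc JSON, accumulate every `usr="..."` attribute found in string values, unique them, and map each to a URL. The function names and the tiny sample document are made up; the USRs and the URL are the ones quoted elsewhere in this thread.

```python
import re

USR_ATTR_RE = re.compile(r'usr="([^"]+)"')


def collect_usrs(node, found=None):
    """Recursively gather unique USRs, in first-seen order, from parsed JSON."""
    if found is None:
        found = {}
    if isinstance(node, dict):
        for value in node.values():
            collect_usrs(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_usrs(item, found)
    elif isinstance(node, str):
        for usr in USR_ATTR_RE.findall(node):
            found.setdefault(usr, None)
    return found


def build_usr_map(doc, resolve_url):
    """Map every USR in `doc` to a URL (or None when it cannot be resolved)."""
    return {usr: resolve_url(usr) for usr in collect_usrs(doc)}


# Hypothetical, heavily trimmed stand-in for `sourcekitten doc` output.
doc = {"key.substructure": [
    {"key.annotated_decl": '<Declaration>var a: <Type usr="s:SS">String</Type></Declaration>'},
    {"key.annotated_decl": '<Declaration>var b: <Type usr="s:SS">String</Type></Declaration>',
     "key.substructure": [
         {"key.annotated_decl": '<Declaration>class T : <Type usr="s:s23CustomStringConvertibleP">CustomStringConvertible</Type></Declaration>'},
     ]},
]}
KNOWN_URLS = {"s:SS": "https://developer.apple.com/documentation/swift/string"}

print(build_usr_map(doc, KNOWN_URLS.get))
```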

johnfairh avatar Jun 17 '18 09:06 johnfairh

@johnfairh Maybe we could add another output to sourcekitten that's parsed_with_links? Using the USRLINK seems easier than doing a fuzzy text search, especially for types. (I mostly implemented the USRLINK approach already :P)

Yeah, it won't work for markdown comments or Objective-C, but we could run them through cursorinfo as well (if they are just types) or do a fuzzy text search to create a USRLINK object as well.

galli-leo avatar Jun 17 '18 11:06 galli-leo

So I got it working quite nicely in SourceKitten:

public class Test: Swift.CustomStringConvertible
"key.parsed_annotated_decl" : "public class Test : Swift.<USRLINK usr=\"s:s23CustomStringConvertibleP\" url=\"https:\/\/developer.apple.com\/documentation\/swift\/customstringconvertible\">CustomStringConvertible<\/USRLINK>"

"key.parsed_annotated_decl" : "public func test(first: <USRLINK usr=\"s:SS\" url=\"https:\/\/developer.apple.com\/documentation\/swift\/string\">String<\/USRLINK>, second: <USRLINK usr=\"s:SS\" url=\"https:\/\/developer.apple.com\/documentation\/swift\/string\">String<\/USRLINK>, closure: @escaping ((<USRLINK usr=\"s:SS\" url=\"https:\/\/developer.apple.com\/documentation\/swift\/string\">String<\/USRLINK>, <USRLINK usr=\"s:SS\" url=\"https:\/\/developer.apple.com\/documentation\/swift\/string\">String<\/USRLINK>) -> (<USRLINK usr=\"s:SS\" url=\"https:\/\/developer.apple.com\/documentation\/swift\/string\">String<\/USRLINK>, <USRLINK usr=\"s:SS\" url=\"https:\/\/developer.apple.com\/documentation\/swift\/string\">String<\/USRLINK>))) throws -> <USRLINK usr=\"s:SS\" url=\"https:\/\/developer.apple.com\/documentation\/swift\/string\">String<\/USRLINK>",
            "key.parsed_declaration" : "public func test(first: String, second: String, closure: @escaping ((String, String) -> (String, String))) throws -> String",

I even got it working for parsing arbitrary types quite easily (e.g. in a doc comment with Test), by just calling cursorinfo with the type as text. (So this works even with Apple types in doc comments!)
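To show how a consumer might turn the `key.parsed_annotated_decl` values above into HTML, here is a Python sketch that uses a plain regex substitution in place of the rouge lexer approach. It assumes USRLINK elements are never nested and that the surrounding text is already escaped; `usrlinks_to_html` is a hypothetical name.

```python
import json
import re

USRLINK_RE = re.compile(r'<USRLINK usr="[^"]*" url="([^"]*)">(.*?)</USRLINK>')


def usrlinks_to_html(decl):
    """Replace each <USRLINK ... url="U">text</USRLINK> with <a href="U">text</a>."""
    return USRLINK_RE.sub(r'<a href="\1">\2</a>', decl)


# The JSON string value from the first example above, decoded with json.loads
# so that the \" and \/ escapes are resolved first.
raw = (r'"public class Test : Swift.<USRLINK usr=\"s:s23CustomStringConvertibleP\" '
       r'url=\"https:\/\/developer.apple.com\/documentation\/swift\/customstringconvertible\">'
       r'CustomStringConvertible<\/USRLINK>"')

print(usrlinks_to_html(json.loads(raw)))
```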

The next step would be parsing arbitrary function or property references. My idea would be (instead of using the old system), to have SourceKitten "register" any usrs it finds, with their parent(s) and parameters, and when parsing doc comments, go back to that list and find anything that matches.
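One possible shape for that "register" idea, sketched in Python rather than Swift. Everything here is hypothetical: the `UsrIndex` class, the matching rule (exact dotted name first, then a unique suffix match), and the placeholder USR strings, which are not real USRs.

```python
class UsrIndex:
    """Maps dotted names such as `Test.test(first:second:closure:)` to USRs."""

    def __init__(self):
        self._by_name = {}

    def register(self, usr, name, parents=()):
        """Record a symbol under its fully qualified dotted name."""
        dotted = ".".join(list(parents) + [name])
        self._by_name.setdefault(dotted, usr)

    def lookup(self, reference):
        """Resolve a doc-comment reference; None if unknown or ambiguous."""
        if reference in self._by_name:
            return self._by_name[reference]
        suffix = "." + reference
        matches = [usr for dotted, usr in self._by_name.items()
                   if dotted.endswith(suffix)]
        return matches[0] if len(matches) == 1 else None


index = UsrIndex()
index.register("placeholder-usr-Test", "Test")
index.register("placeholder-usr-Test-test", "test(first:second:closure:)",
               parents=["Test"])

print(index.lookup("Test.test(first:second:closure:)"))
print(index.lookup("test(first:second:closure:)"))
```

Returning None on ambiguous matches lets the caller fall back to another strategy instead of linking to the wrong symbol.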

This way we can even add a new command to SourceKitten that resolves a usr or any bit of pseudocode to get a documentation reference and / or usr.

Let me know what you think. We can also just have SourceKitten link anything with a concrete usr and then autolink anything else in jazzy using the existing system. But I think implementing this in SourceKitten makes more sense, as other people can profit from it as well, and it can provide a replacement for the old docsetutil.

Should I create a pr with the SourceKitten changes already? It's really messy atm.

galli-leo avatar Jun 17 '18 15:06 galli-leo

If we have something that works everywhere (swift/objc/markdown) then I'm happy. A replacement for docsetutil sounds like a useful tool.

My only concern -- that I won't go on about any more after this because it feels like I'm just sitting here criticising while you do work!! -- is complexity: if we have the 'sql match' fallback we may as well just always use it. I pushed a small (hacky, incomplete, not widely tested...) sketch of this to this branch just to check we're on the same page. It resolves types (String, NSMutableArray, UIViewController etc.) but not method refs, e.g. UIViewController.prepare(for:sender:).

I think I understand your go-back-into-sourcekitten approach. A technical problem might be that type references in markdown / doc comments etc. may not actually compile (missing imports). I can see the attraction in treating code more like compilable code than parsable text.

If you think you're on the right track in sourcekitten then can't hurt to put up a PR so that team can look if they've bandwidth.

johnfairh avatar Jun 19 '18 13:06 johnfairh

@johnfairh I decided to throw up my current code in a pr for criticism (https://github.com/jpsim/SourceKitten/pull/537). Might want to take a look as well, since it's mostly about architectural decisions.

> My only concern -- that I won't go on about any more after this because it feels like I'm just sitting here criticising while you do work!! -- is complexity: if we have the 'sql match' fallback we may as well just always use it.

If you take a look at my pr, querying by reference path will not be used. Instead we create an index of all usrs and then can resolve any "dot notation" into a usr and use that to link (external or into the same module). Or use the usr if already available / use cursorinfo for types. Additionally, please criticise as much as you like, bad design doesn't help anyone :P.

> A technical problem might be that type references in markdown / doc comments etc. may not actually compile (missing imports). I can see the attraction in treating code more like compilable code than parsable text.

That's why I have "multiple" ways of finding a usr. First use cursorinfo. If it compiles and finds a type, great we are done here. Else search the "index" for any matches and then either add the url or create the reference to the usr.
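The fallback order described above can be sketched as a simple resolver chain. This is Python for illustration only: the two resolvers are dictionary-backed stand-ins (no real cursorinfo request is made), and all names plus the placeholder USR for `Test.description` are hypothetical.

```python
def resolve_usr(reference, resolvers):
    """Try each resolver in order and return the first USR found, else None."""
    for resolve in resolvers:
        usr = resolve(reference)
        if usr is not None:
            return usr
    return None


# Stand-in for cursorinfo: succeeds only for references that "compile".
CURSORINFO_RESULTS = {"String": "s:SS"}

# Stand-in for the usr index built while parsing declarations.
INDEX_RESULTS = {"Test.description": "placeholder-usr-Test-description"}

resolvers = [CURSORINFO_RESULTS.get, INDEX_RESULTS.get]

print(resolve_usr("String", resolvers))
print(resolve_usr("Test.description", resolvers))
print(resolve_usr("Unknown.thing", resolvers))
```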

Do you know if clang gives any info about usrs? I haven't had time to look at the non sourcekitten side.

Sidenote: By implementing this in sourcekitten we already have all the possible xcode paths in there, see the pr :).

galli-leo avatar Jun 19 '18 21:06 galli-leo

Libclang: see here.

johnfairh avatar Jun 20 '18 18:06 johnfairh

Thought I would post a quick update here. I got the SourceKitten implementation mostly working (i.e. parse doc comments and declarations). Additionally, I hacked jazzy to work with the new SourceKitten implementation and a live demo can be found here: http://galli.me/jazzy-demo/index.html.

galli-leo avatar Jun 24 '18 17:06 galli-leo