SciMLDocs icon indicating copy to clipboard operation
SciMLDocs copied to clipboard

Some third party Docs are external links, some are rendered inside

Open kellertuer opened this issue 2 years ago • 13 comments

I noticed, that some documentation of third arty tools like

https://docs.sciml.ai/ManifoldDiffEq/stable/ (thanks for seeing these already btw) or https://docs.sciml.ai/FFTW/stable/

are rendered (again) in this repo while they have their own docs ( https://juliamanifolds.github.io/ManifoldDiffEq.jl/stable/ or https://juliamath.github.io/FFTW.jl/stable/) while other menu entries for third party tools like Flux link to external docs.

Is there a rule when docs are “re-rendered” here and when they are linked? I feel linking should be the default if the standalone docs outside SciML exist?

kellertuer avatar Mar 10 '23 10:03 kellertuer

I'm not sure why Flux's ends up linking back.

ChrisRackauckas avatar Mar 10 '23 11:03 ChrisRackauckas

I think it would be nice for Documentations outside the SciML org to be links and not rendered MultiDocs-docs?

kellertuer avatar Mar 10 '23 12:03 kellertuer

But then the taskbar on the top would be gone. An embedding is probably best.

ChrisRackauckas avatar Mar 10 '23 12:03 ChrisRackauckas

But embedding means there is always 2 documentation links/URLs at least if not 2 different versions always, since they are rendered here and in their original place, I think?

kellertuer avatar Mar 10 '23 12:03 kellertuer

No, if it's in iframe it's just the same exact webpage, just embedded.

ChrisRackauckas avatar Mar 10 '23 12:03 ChrisRackauckas

Ah, ok, so outdated docs, which were my first worries, is not a problem. Then my only critique here is, that for external ones being “I-framed-into” the SciML-org, they might be confused to be not-so-third-party but actually SciML projects. This might be misleading for users? Th url is now

https://docs.sciml.ai/ManifoldDiffEq/stable/

from within the docs here but ManifoldDiffEq belongs to JuliaManifolds.

kellertuer avatar Mar 10 '23 12:03 kellertuer

No, if it's in iframe it's just the same exact webpage, just embedded.

I don't think that's correct, is it? I was just looking into MultiDocumenter for some personal packages and to me it seems it actually copies everything - no iframes, no links, just copies of the existing Documenter output with some tiny fixes and by default previews removed.

As an example, the MCMCDiagnosticTools docs (https://turinglang.org/MCMCDiagnosticTools.jl/stable/) are included in the ArviZ docs: https://julia.arviz.org/MCMCDiagnosticTools/stable/ But if you inspect the ArviZ docs, there are no iframes or embeddings, it's just a plain HTML page. And indeed, if you check the corresponding gh-pages branch (https://github.com/arviz-devs/ArviZJuliaDocs/tree/gh-pages/MCMCDiagnosticTools) you see that it contains a copy of the MCMCDiagnosticTools docs.

So to me it seems that indeed whenever you use Multidocumenter you copy all existing docs and republish them. I guess that's fine (but possibly messes with SEO?) when you "own" the re-published docs but seems a bit strange if it is a third-party project.

devmotion avatar Jul 05 '23 21:07 devmotion

It messes a bit with SEO, but the JuliaHub copy does that as well.

My main concern is really that this is a surely related but still non-SciML project and the docs currently indicate this differently, when browsing to https://docs.sciml.ai/ManifoldDiffEq/stable/; sure the menu states “Third party solvers”, but the page itself does not. Besides that I am not sure why a copy is necessary. Would a link not be enough?

I am also not so happy about the JuliaHub docs copy, because that leads to google linking to old version therein, at least when I tried a few searches. But sure neither of them is breaking any laws/copyrights. It is just that I am not a fan of either copies and do not see the reason why that should be necessary.

kellertuer avatar Jul 06 '23 08:07 kellertuer

No, it's not an iframe, it is a copy. But we could try to experiment with an iframe (and maybe have a small UI indicator saying that "hey, we're just mirroring someone else's docs here).

mortenpi avatar Jul 07 '23 06:07 mortenpi

At least an option would be great!

I think for example all SciML-internal packages that are “at home” in this MultiDoc do not necessarily need that indicator but external ones would? My personal preference would still be to not copy external ones, but that might just be my personal preference.

kellertuer avatar Jul 07 '23 06:07 kellertuer

I think for example all SciML-internal packages that are “at home” in this MultiDoc do not necessarily need that indicator but external ones would?

Yeah, I assume that https://docs.sciml.ai/ is the canonical home for most of the packages. It's just things like Flux and Ferrite that would get an iframe and an indicator. I have something in mind, so let me try to mock something up.

mortenpi avatar Jul 07 '23 06:07 mortenpi

Here's a rough mockup of what I had in mind (the colors might only work in dark mode): https://mortenpi.eu/mockups/sciml-ferrite/

image

A couple of potential issues that popped into my head though:

  • If it's an iframe, those docs will not be included in the site-wide search. You can still use the search within the iframe though.
  • The URL won't update as you browse the external package docs (e.g. it will always be https://docs.sciml.ai/Ferrite/stable/).
  • If the remote docs change their URL, then the iframe in the SciML docs immediately 404s. But this would just need a new aggregation build with a fix, so probably not a big issue.

So another thing we could do is just inject the small "mirrored documentation" notice, but otherwise keep copying the docs.

mortenpi avatar Jul 12 '23 03:07 mortenpi

I like this!

Just locally on my phone I noticed the one line remark at the top could be 2 lines on mobile (if with is quite small) otherwise for example the URL vanishes behind the content that follows below.

kellertuer avatar Jul 12 '23 07:07 kellertuer

I just googled for something related to Distributions - and all the top results were the SciML docs. I think that's really really bad and I don't see why SciML should rehost a possibly outdated copy of the Distributions docs. I very strongly believe the SciML docs should only link to docs of external tools.

devmotion avatar Aug 30 '24 11:08 devmotion

There's fixes coming in the form of https://github.com/JuliaComputing/MultiDocumenter.jl/issues/79. The issue is that multidocumenter doesn't let us use links...

ChrisRackauckas avatar Aug 30 '24 11:08 ChrisRackauckas

The linked "improvement" doesn't address my point. Marking pages as rehosted and performing some SEO tricks is not what I want - SciML should just not deal with and definitely not copy any external docs. I don't see any benefit over just linking to external docs, on the contrary it leads to inconsistencies and ia confusing for users. Distributions is not a SciML project.

devmotion avatar Aug 30 '24 12:08 devmotion

The issue is that multidocumenter doesn't let us use links...

There are links everywhere on every page? I imagine SciML docs could eg contain a page with noteworthy external packages and there link to the docs of these packages. Since these are external any tighter integration seems wrong.

devmotion avatar Aug 30 '24 12:08 devmotion

IANAL but I just came to think that the current behaviour of copying and rehosting external documentation is also potentially legally problematic. I know eg that Stan uses a quite strict license for their user guide which is different from the license used for the code. So even if all external packages use the MIT license for their code it is not clear a priori what license is used for the docs and whether it allows eg rehosting.

devmotion avatar Aug 30 '24 12:08 devmotion

If you get me a feature to put a link or iframe then I'll use it. I've been asking for it for years now.

ChrisRackauckas avatar Aug 30 '24 12:08 ChrisRackauckas

I think I don't understand what you want. A markdown page with a few links should be sufficient, shouldn't it? Why is that problematic?

devmotion avatar Aug 30 '24 12:08 devmotion

IMO in the worst case (which isn't too bad I think) it should just be removed without any replacement or alternative approach. Docs of SciML packages for which eg Distributions is useful can just link to it in their docs (which are then rehosted here).

devmotion avatar Aug 30 '24 12:08 devmotion

I would also prefer an external link actually and not something embedded. I do not see any good reason to have that embedded. None of the external projects is a SciML project so why should that be embedded?

kellertuer avatar Aug 31 '24 06:08 kellertuer

A markdown page with a few links should be sufficient, shouldn't it? Why is that problematic?

According to users, that is problematic. Remember, we did a full user study where we interviewed about 10 people from different companies and things that were highlighted as issues with documentation navigation were:

  1. Links in normal text to pages not findable from nav bars were unable to be found
  2. Values in code that are not searchable
  3. Examples of building packages on the code were unable to be found (to see if non-SciML code could be used in a project)

What you're saying we should do goes exactly against what we have collected a lot of evidence as what users requires, and does exactly what people have explicitly said drives them away. I think we can do it for practical reasons, but let's not act like it's not problematic when we have gone through all of the steps to thoroughly document why it's problematic. I'll see if I can open these interviews but it might be hard given companies can be fussy with recorded information.

IMO in the worst case (which isn't too bad I think) it should just be removed without any replacement or alternative approach. Docs of SciML packages for which eg Distributions is useful can just link to it in their docs (which are then rehosted here).

This directly goes against (2), as the whole point is that Distributions objects were one of the examples that were pointed out in an example code as rand(Normal()) had non-searchable things, i.e. searching the docs for Normal does not show anything. With the current form, the upper search bar in theory solves it, though its loading time is a bit too long to make that practical and the newer Documenter changes push people to use the other search (having two search bars is problematic for other reasons anyways).

We can remove it due to implementation issues, but let's not delude ourselves into thinking absolutely no good reason and that non-navbar external links work for this. That's very easily debunked with data and interviews that were already done. Keep the discussion non-emotion and in the real-world here. We know there's implementation issues, Documenter and MultiDocumenter were supposed to solve them, and now I don't think they will anytime soon so we need to accommodate that.

@mortenpi a minimal feature that we would need next is the ability to link to external docstring canonical locations, so for example I could have Normal link over to Distributions.jl, and if it shows up in examples we catch it so it shows up in the search as a link over to those docs. Otherwise we need to double up lots of docs in order to have any sense of completion. Also, links in the Multidocumenter top bars.

ChrisRackauckas avatar Sep 01 '24 22:09 ChrisRackauckas

IANAL but I just came to think that the current behaviour of copying and rehosting external documentation is also potentially legally problematic. I know eg that Stan uses a quite strict license for their user guide which is different from the license used for the code. So even if all external packages use the MIT license for their code it is not clear a priori what license is used for the docs and whether it allows eg rehosting.

Our legal council does not agree when it's an MIT license. Can you please share your legal analysis?

Again, I'm going to merge a PR that does a removal of the right stuff, so this isn't a no. But please keep to facts here.

ChrisRackauckas avatar Sep 01 '24 22:09 ChrisRackauckas

Sure, legally, this might all be fine. But from my perspective as one of the packages is from JuliaManifolds

  • it is not nice that SciML doc show up before the actual one
  • it seems they are currently up to date, but will they always be?
  • they are “rewrapped” and have a SciML banner atop

All this is negative I think. Maybe legal, but both not nice.

It is your organisation and you can of course run your development as you like, but I feel

  • you develop super fast
  • you break often (the only breaks on CI we ever head were from SciML packages but we have them a bit too often even)
  • the documentation is not that good (sure they're are tutorials but functions are merely never documented)
  • things are usually only half-finished as far as I see – does Maniopt.jl work with Optimisation.jl? I will write an issue about that in a second because it always reports failure, and is not mentioned in the readme.

And again, this is not against your work, you do very very much, answer super fast, all very nice – I value all the effort you put in. I just do not like the quality of the result.

So I would prefer that SciML would not try to convey the impression ManifoldsDiffEq is a SciML package, since it is in the docs and looks “like a chapter in the SciML book”.

And the least you can do (when you already interview users) – maybe ask the developers of the external packages you link?

kellertuer avatar Sep 02 '24 06:09 kellertuer

I'd be happy to go one-by-one fixing docs issues you bring up, but there is nothing concrete here, and again there is so much that is quantifiably so far from reality that I don't even know where to start on half of it. Can we please make things concrete?

you break often (the only breaks on CI we ever head were from SciML packages but we have them a bit too often even)

Can you please point to a break that you had from public API that's not in a NEWS.md with breaking release? We can correct that. For things that are public API, the DiffEq tutorial code from 2018 still works.

From https://github.com/JuliaManifolds/ManifoldDiffEq.jl/pull/28 I know though that ManifoldDiffEq is using internals and type pirates functionality in a way that directly contradicts our documented interfaces. From what I know, that's the only issue we had with Manifolds? And sorry but I don't see how that's not an expected outcome from extending internals?

the documentation is not that good (sure they're are tutorials but functions are merely never documented)

Can you point to a public API function that is missing a docstring? I'd be happy to fix it. "Merely never documented", given that it's rather straightforward to calculate that we have close to 100% coverage on public API docstrings (thanks to the new public API features!) at this point and are in the process of turning on tests to enforce that? I'd be happy to help figure out what is the issue but it's hard to make this comment actionable given that almost complete coverage makes it hard to find a public function that isn't documented, so I'd be really happy to find out which one you're thinking of!

things are usually only half-finished as far as I see – does Maniopt.jl work with Optimisation.jl?

I don't understand this comment. "things" refers to everything in a broad way, but then you bring up Optimization.jl as a typical example? We are very public about the fact that Optimization.jl is the newest solver library and hasn't caught up. It's specifically given the label "maturity level: low" with the definition:

"Low - these libraries are still in heavy development, with a major version update planned for the completion of the library. While the core methods are documented with tutorials, many of the extensive features are not as documented. The test suite covers the core areas but known edge cases exist."

One of the major points of the State of SciML talk was specifically that we plan to have a major push to finish Optimization.jl to be at the level of the other packages in 2025-2026. So it's very atypical, so atypical that it has been publicly labeled as our most in-development solver library right now.

Do you have a different concrete example of typical half-finished?

I will write an issue about that in a second because it always reports failure, and is not mentioned in the readme.

Please report, since it seems to be passing its tests in a rather recent merge https://github.com/SciML/Optimization.jl/actions/runs/10612194789/job/29413344610 so I don't know what you're referring to.

ChrisRackauckas avatar Sep 02 '24 07:09 ChrisRackauckas

Can you please point to a break that you had from public API that's not in a NEWS.md with breaking release? We can correct that. For things that are public API, the DiffEq tutorial code from 2018 still works.

I am not sure what the public API is, so probably all we did use was not public API, but we had no other way to use it, cf. also the discussion you linked to.

The specific thing that keeps me from working with DiffEq is that any function definition in the current ManifoldDiffEq like

https://github.com/JuliaManifolds/ManifoldDiffEq.jl/blob/af86f91ddeeb99856f4b1f15b3edea6f3dbde7b3/src/frozen_solvers.jl#L30-L46

is pure guesswork. If I look today at the variablaes, I do not remember what 80% of them are, this is not documented, probably considered internal and breaks every now and then.

Sure using the algorithms you provide might be stable since 2018 or so.

Defining own ones? I would not agree on that. So since that is all not considered public – is defining own solvers discouraged? Not possible? Not allowed? At least that struggle kept me from continuing on that, though I find geometric integrators interesting in general.

Concerning Optimisation, I opened an issue at https://github.com/SciML/Optimization.jl/issues/814 – and sure here it might be the case that that is the area I work in mainly, that I see a few things a bit different.

But yeah since those two are the only two contact points I have with SciML, maybe my impression might be a bit too focussed on things where I experienced only struggles.

kellertuer avatar Sep 02 '24 08:09 kellertuer

@mortenpi a minimal feature that we would need next is the ability to link to external docstring canonical locations, so for example I could have Normal link over to Distributions.jl, and if it shows up in examples we catch it so it shows up in the search as a link over to those docs.

@ChrisRackauckas I'm not 100% sure I understand the ask, but would https://github.com/JuliaDocs/DocumenterInterLinks.jl be an option here? It would allow you to link to canonical docs that are hosted elsewhere. As long as the other docs are using a recent enough Documenter, they will have the inventory file to make this work -- no need to do anything to make the upstream "opt-in".

mortenpi avatar Sep 02 '24 08:09 mortenpi

Defining own ones? I would not agree on that. So since that is all not considered public – is defining own solvers discouraged? Not possible? Not allowed? At least that struggle kept me from continuing on that, though I find geometric integrators interesting in general.

I will just remove the SciMLBase dependency from ManifoldDiffEq because its stable interface is insufficient to properly express our geometric integrators. I think that's the best solution here. I think we shouldn't spread this issue to unrelated discussions.

mateuszbaran avatar Sep 02 '24 08:09 mateuszbaran

Defining own ones? I would not agree on that. So since that is all not considered public – is defining own solvers discouraged? Not possible? Not allowed?

You just overload the high level API functions. It's in the dev docs: https://docs.sciml.ai/DiffEqDevDocs/stable/contributing/adding_packages/. There's then plenty of packages to look at that do this, from standard ones like Sundials.jl and SimpleDiffEq.jl to more obscure ones like DASKR.jl and IRKGaussLegendre.jl. I just noticed that DASKR's v2.5 from 2019 is still used in some benchmarks and it seems to work just fine, so that seems pretty stable.

Now if you're asking, can you add your own algorithms in a way that extends OrdinaryDiffEq.jl's internals? None of those solver packages dare to extend another package's internals, they simply stick to giving implementations of the interface.

Like anything digging into internals, that's really an at your own risk thing. If you really want to do that, we could add a downstream test to help keep it in sync, though I'd first want to find out what you're trying to do and why it needs to be separated. It's not expected that every internal function of OrdinaryDiffEq is not going to change: it's a package defining specific solvers and is not the API or interface. There are two packages that I know of, PositiveIntegrators.jl and ProbNumDiffEq.jl, which extend OrdinaryDiffEq.jl like this, and they keep alive mostly because there's a downstream test to them. But I wouldn't recommend doing this if you don't have to. The OrdinaryDiffEq integrator infrastructure is nice for handling lots of extra feature details it's not ready to be extended like that. Since we split OrdinaryDiffEq to be a monorepo with separate solver packages as libraries within it, I would rather recommend that anything relying on the internals be tested as part of the tests.

I will just remove the SciMLBase dependency from ManifoldDiffEq because its stable interface is insufficient to properly express our geometric integrators. I think that's the best solution here. I think we shouldn't spread this issue to unrelated discussions.

That's probably the right thing. If the interface doesn't cover what's required for full manifold definitions, then there's other issues that would need to be thought out, especially as we keep improving our tests that things don't break the interface 😅

ChrisRackauckas avatar Sep 02 '24 08:09 ChrisRackauckas