Why not use NoteBook formats, like Jupyter?

Open jobdiogenes opened this issue 6 years ago • 4 comments

From what I understand, Dar is a format that bundles data and code for scientific publishing. Jupyter Notebook does the same thing.

Why not use Jupyter?

jobdiogenes avatar Apr 16 '18 20:04 jobdiogenes

The Jupyter Notebook format does not model some contents which are important for an article to be considered a full-fledged manuscript (e.g. reference list, abstract with translations, journal specific metadata etc.). That's why we want to take it one step further and build on top of JATS-XML (an already established format at scientific journals). Additionally, we need to allow publications that consist of multiple documents (manuscript, notebooks, sheets) and assets (such as images, videos etc.).
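To make the gap concrete, here is a minimal sketch of the top-level layout of a notebook file (nbformat 4). The cell contents and the `MANUSCRIPT_PARTS` checklist are illustrative assumptions, not part of any specification. Notebook `metadata` can hold arbitrary keys, so such parts could be stashed there by convention, but the format itself gives them no dedicated place.

```python
# Minimal notebook in the nbformat 4 JSON layout (cell contents are made up).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {"kernelspec": {"name": "python3", "display_name": "Python 3"}},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Fish growth analysis"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print(1 + 1)"],
        },
    ],
}

# Hypothetical checklist of parts a journal manuscript needs (names are assumptions).
MANUSCRIPT_PARTS = ["abstract", "translated_abstracts", "reference_list", "journal_metadata"]


def missing_manuscript_parts(nb):
    """Return the manuscript parts that have no dedicated top-level slot in the notebook."""
    return [part for part in MANUSCRIPT_PARTS if part not in nb]


print(sorted(notebook))                    # the only top-level keys the format defines
print(missing_manuscript_parts(notebook))  # every manuscript part is unmodeled
```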

Our ultimate goal: Create a reproducible publication (.dar) and submit it to a journal, which can then review and publish the work as fast as possible (made possible by a Dar-compatible toolset).

We explained that in more detail in a recent webinar: https://youtu.be/oyBX9l9KzU8?t=446

The Jupyter team is actively involved in the discussion and we are considering different options to make things interoperable:

  • Convert from Jupyter to Dar easily (and possibly the other way around, which is harder as there may be data loss)
  • Team up with Jupyter (and other projects in the space) to work out how we can share an open archive format that all authoring tools (Jupyter, Stencila, ...) can use. Then we don't need to convert back and forth, and we open the path for more tools to come that support an expressive and flexible de facto standard.
  • Work together on a specification for open kernels / runtime environments (think Binder, but not just working for Jupyter-style notebooks but full-fledged manuscripts, including data, user functions, etc.)
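As a rough illustration of the first option, the sketch below maps notebook cells onto a JATS-like XML tree using only Python's standard library. The mapping (markdown cell to `<p>`, code cell to `<code>`) and the function name `notebook_to_jats_like` are assumptions made for this example; it is not Dar's actual schema or converter, and the output is not validated against JATS. It also shows where data loss creeps in: cell outputs and kernel metadata are simply dropped here.

```python
import xml.etree.ElementTree as ET


def notebook_to_jats_like(nb, title):
    """Build a JATS-like <article> element from an nbformat 4 notebook dict.

    Hypothetical mapping for illustration only: markdown cells become <p>,
    code cells become <code>, and everything else (outputs, raw cells,
    kernel metadata) is dropped.
    """
    article = ET.Element("article")
    front = ET.SubElement(article, "front")
    meta = ET.SubElement(front, "article-meta")
    title_group = ET.SubElement(meta, "title-group")
    ET.SubElement(title_group, "article-title").text = title

    body = ET.SubElement(article, "body")
    for cell in nb["cells"]:
        # "source" may be a string or a list of strings; join handles both.
        source = "".join(cell["source"])
        if cell["cell_type"] == "markdown":
            ET.SubElement(body, "p").text = source
        elif cell["cell_type"] == "code":
            ET.SubElement(body, "code", {"language": "python"}).text = source
    return article


nb = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["Mean length by species."]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["lengths.mean()"]},
    ],
}

article = notebook_to_jats_like(nb, "Example manuscript")
print(ET.tostring(article, encoding="unicode"))
```

Going the other way would need a place in the notebook for the front matter and references, which is why that direction is the lossy one.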

Here's a Google Doc where we are discussing this:

https://docs.google.com/document/d/1zIYXpbeUpFvfV5W0DR4S9Pd0PSm-nM5tYZIBqN2U2Zk/edit?usp=sharing

michael avatar Apr 16 '18 21:04 michael

Also see this recent demo of such a reproducible publication in action:

http://builds.stenci.la/stencila/reproducible-publication-example-2018-04-16-dcf17f9/example.html?archive=repro-pub

(note this is still a work in progress, and Python, R, etc. contexts are not exposed in this demo to keep things lightweight)

michael avatar Apr 16 '18 21:04 michael

Thanks very much, @michael.

You helped me a lot. I'm an IT guy at Nupelia (www.nupelia.uem.br), whose name could be translated as Research Institute for Limnology, Ichthyology (Fish) and Aquaculture. It was great to learn from the video that one of the founders of Stencila works in fisheries research.

I think Stencila is really on its way to filling the gap in reproducible research publishing. I just made a fork on GitHub, and I will try it out.

Right now I'm working on deploying FidusWriter connected with OJS (already working) and generating output in the SciELO schema (derived from JATS). People from FidusWriter pointed me to Dar and [https://www.le-tex.de/en/transpect.html].

I think FidusWriter does not have the same goal as Stencila (reproducible and live documents), but it could be a step forward from our current, common manual workflow (.doc/.odt) and OJS.

As I understand it, the Dar format could be used as an intermediate format to generate JATS? Do you think that, using Dar, I could generate XML that follows the SciELO JATS schema?

Thanks again.

jobdiogenes avatar Apr 18 '18 01:04 jobdiogenes

Hi @jobdiogenes, the Dar format actually IS valid JATS. We just enforce a stricter tagging schema. Once we have all use cases covered, SciELO, Erudit, eLife and others will be using Dar as the primary specification. It's actually based on SciELO's schema; we just try to make it generic where necessary.

Short term, you'll be able to create and quality-check static manuscripts (Texture editor). We are expecting a first stable release by September 2018. Long term (once matured), Stencila and other tools (probably even Jupyter) will allow creating reproducible publications that can be accepted by journals directly.

michael avatar Apr 19 '18 18:04 michael