parsers
Monorepo for a suite of `unified`-compatible converters for converting between, from, and to .docx, JATS XML, LaTeX, and PDF
JOTE Parsers
Monorepo for a suite of parsers used in the Journal of Trial and Error.
The goal is to automate the process of converting a manuscript from a word processor to a JATS XML file, which can then be used to generate a PDF and HTML version of the manuscript. Ideally this would allow authors to work in Word/Google Docs but still have the benefits of a modern publishing workflow.
The only current implementation is found at convert.centeroftrialanderror.com (very shoddy!).
Currently has 3 suites of parsers:
ooxast/reoff: Tools to parse, convert from, and create OOXML (.docx) XML. Currently only contains a parser and a converter to jats, plus some tools.
jats/rejour: Tools to parse, convert from, and create JATS XML. Currently contains a parser, stringifier, and a converter to texast, plus some tools.
texast/relatex: Tools to parse, convert from, and create LaTeX (DEPRECATED, use unified-latex instead). Only contains a way to generate LaTeX from texast ASTs.
Additionally, there are a few other tools:
citations-: Tools to parse and convert citations.
ojs-: Things to operate on the OJS API.
utils-: Various utilities.
Finally, there are the processors, which are basically convenient wrappers around the parsers.
There is also an app directory, which contains a few apps that use the parsers; at the moment this is mainly a very rough frontend for converting from .docx.
See below for more info.
Dependency Graph
View an interactive dependency graph here

Overview
Apps:
converter-frontend
converter-frontend-e2e
jote
react-pdf
react-pdf-e2e
citations
crossref-json
Type definitions for crossref api responses
crossref-to-csl
Convert crossref metadata to CSL
csl-consolidate
Try to resolve a list of CSL data with crossref metadata
csl-to-biblatex
Somewhat jank CSL-JSON to biblatex converter
ojs-types
Some TypeScript types for OJS API responses
parse-text-cite
Small tool that parses a string of text containing APA-style in-text citations, e.g. Jones (2020), and returns a rudimentary AST of the parsed citations.
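To make the idea concrete, here is a minimal sketch of pulling APA-style in-text citations out of a string. It is not the package's implementation: the function name `parseInTextCitations`, the `CitationNode` shape, and the regex-based approach are all assumptions for illustration, and the sketch only handles single-author citations such as Jones (2020) and (Smith, 2019).

```typescript
// Hypothetical node shape; the real parse-text-cite AST may look different.
interface CitationNode {
  type: 'narrative' | 'parenthetical'
  author: string
  year: string
  raw: string
}

// Illustrative only: a regex approximation covering two single-author forms.
function parseInTextCitations(text: string): CitationNode[] {
  // Alternative 1: Jones (2020)   Alternative 2: (Jones, 2020)
  const pattern =
    /([A-Z][A-Za-z'-]+) \((\d{4}[a-z]?)\)|\(([A-Z][A-Za-z'-]+), (\d{4}[a-z]?)\)/g
  const nodes: CitationNode[] = []
  let match: RegExpExecArray | null
  while ((match = pattern.exec(text)) !== null) {
    if (match[1] !== undefined) {
      // The author is part of the sentence, the year is in parentheses.
      nodes.push({ type: 'narrative', author: match[1], year: match[2], raw: match[0] })
    } else {
      // Author and year are both inside the parentheses.
      nodes.push({ type: 'parenthetical', author: match[3], year: match[4], raw: match[0] })
    }
  }
  return nodes
}
```

Multi-author citations, "et al.", page numbers, and citation lists separated by semicolons are deliberately left out of this sketch.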
hybrid-builder
src
jast
jast
Type definitions for jast (journal article/abstract syntax tree), a syntax for abstract syntax trees representing JATS XML, specifically the "Green" publishing tag set.
Transform a CSL list or object to a jast node.
jast-util-to-csl
Convert a jast citation syntax tree to list of csl objects.
jast-util-to-texast
Utility to convert a jast tree to a texast tree.
notion
html-to-notion-blocks
Transform HTML to Notion blocks
rehype-notion
Plugin for rehype to turn HTML into Notion blocks
ojs
ojs-client
new default(«destructured»: object = {}): default;
ojs-relatex
Convert ojs data to relatex
ooxast
ooxast
Type definitions for ooxast (Office Open XML abstract syntax tree), a syntax for abstract syntax trees representing Office Open XML documents in the unist format.
ooxast-util-citation-plugin
Small ooxast utility which scans the text to identify the citation plugin used: Mendeley, Zotero, EndNote, Citavi, native Word citations, or none at all. Its result is used as input for other tools.
ooxast-util-citations
This package is ESM only. In Node.js (version 12.20+, 14.14+, 16.0+, 18.0+), install as
ooxast-util-get-style
Get style from a w:p element.
ooxast-util-parse-bib
Find and convert raw references to CSL-JSON using anystyle.
ooxast-util-parse-bib-browser
Find and convert raw references to CSL-JSON.
ooxast-util-parse-bib-node
Find and convert raw references to CSL-JSON.
ooxast-util-properties
Return the properties of an ooxast node as a JSON object
ooxast-util-remove-rsid
Cleans all the rsid tags from an ooxast tree, and merges w:r elements if they only differ by rsid values.
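The cleaning step can be sketched on a simplified data model. The `Run` interface and the function names below are assumptions made for illustration; real ooxast nodes are xast elements with children, and the package may handle more cases. The sketch drops every attribute whose name starts with `w:rsid` and then merges neighbouring runs whose remaining attributes are identical.

```typescript
// Simplified stand-in for a w:r element; not the real ooxast node type.
interface Run {
  attributes: Record<string, string>
  text: string
}

// Return a copy of the attributes without any w:rsid* entries (e.g. w:rsidR, w:rsidRPr).
function stripRsid(attributes: Record<string, string>): Record<string, string> {
  const cleaned: Record<string, string> = {}
  for (const name of Object.keys(attributes)) {
    if (name.slice(0, 6) !== 'w:rsid') cleaned[name] = attributes[name]
  }
  return cleaned
}

// Two attribute maps are equal when they have the same keys with the same values.
function sameAttributes(a: Record<string, string>, b: Record<string, string>): boolean {
  const keysA = Object.keys(a)
  return keysA.length === Object.keys(b).length && keysA.every((key) => a[key] === b[key])
}

// Strip rsid attributes, then merge adjacent runs that no longer differ.
function removeRsidAndMerge(runs: Run[]): Run[] {
  const merged: Run[] = []
  for (const run of runs) {
    const cleaned: Run = { attributes: stripRsid(run.attributes), text: run.text }
    const previous = merged[merged.length - 1]
    if (previous !== undefined && sameAttributes(previous.attributes, cleaned.attributes)) {
      previous.text += cleaned.text // same formatting: extend the previous run
    } else {
      merged.push(cleaned)
    }
  }
  return merged
}
```

Word splits runs whenever an editing session touches the text, so merging after the rsid values are gone tends to give later conversion steps far fewer, longer text nodes to work with.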
ooxast-util-to-hast
Convert docx to html (Not working)
ooxast-util-to-jast
Util to convert ooxast syntax tree to jast syntax tree, allowing for .docx to JATS XML conversion.
ooxast-util-to-mdast
Convert ooxast syntax tree to mdast syntax tree.
ooxast-util-to-unified-latex
Convert ooxast syntax tree to unified-latex syntax tree.
plugins
better-nx-tsc
processors
docx-to-jats
processorsDocxToJats(): string;
docx-to-tex
DOCX to TeX converter
jats-to-tex
jatsToTex(jats: string): Promise<VFile>;
jote-docx-tex
docxToTex(input: Uint8Array, options: object = {}): Promise<VFile>;
rejour
rejour-frontmatter
rejourFrontmatter(): Function;
rejour-meta
Doesn't do anything atm
rejour-move-abstract
Really simple plugin for rejour that moves the abstract from the body to the front of a JATS document.
rejour-parse
Parser for rejour that parses the JATS document to a jast tree.
rejour-relatex
Plugin for rejour that transforms a jast syntax tree into a texast syntax tree, allowing for conversion between JATS XML and LaTeX.
rejour-stringify
Plugin for rejour that stringifies a jast syntax tree to a JATS XML document.
relatex
relatex-add-preamble
Plugin for relatex that adds a preamble to a texast syntax tree.
relatex-stringify
Plugin for relatex that stringifies a texast syntax tree to a LaTeX file.
reoff
docx-to-vfile
Reads a .docx file and stores its components in vfile format to be processed by other tools, like reoff-parse.
reoff-cite
Plugin for reoff that parses citations in the form of @cite{key} and @cite[page]{key} using ooxast-util-parse-bib and ooxast-util-parse-text-cite.
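As an illustration of the two citation forms mentioned above, the following sketch finds `@cite{key}` and `@cite[page]{key}` in plain text. The function name `findCiteCommands` and the `CiteMatch` shape are hypothetical; the actual plugin works on an ooxast tree and delegates to other packages, which this sketch does not attempt to reproduce.

```typescript
// Hypothetical result shape, for illustration only.
interface CiteMatch {
  key: string
  page?: string
  start: number // index of the "@" in the input
  end: number // index just past the closing "}"
}

function findCiteCommands(text: string): CiteMatch[] {
  // Optional [page] group, then a required {key} group.
  const pattern = /@cite(?:\[([^\]]*)\])?\{([^}]+)\}/g
  const matches: CiteMatch[] = []
  let match: RegExpExecArray | null
  while ((match = pattern.exec(text)) !== null) {
    const found: CiteMatch = {
      key: match[2],
      start: match.index,
      end: match.index + match[0].length,
    }
    // Only set page when the bracketed part was present.
    if (match[1] !== undefined) found.page = match[1]
    matches.push(found)
  }
  return matches
}
```

Recording `start` and `end` lets a caller replace each command with a resolved citation without searching the text a second time.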
reoff-clean
Plugin for reoff to clean the ooxast tree.
reoff-parse
Plugin for reoff to parse a .docx XML file into an ooxast AST. Ideally use docx-to-vfile to get to a parseable state.
reoff-parse-references
Plugin for reoff which tries to find a bibliography in the document and parse it using ooxast-util-parse-bib.
reoff-parse-references-browser
Plugin for reoff which tries to find a bibliography in the document and parse it using ooxast-util-parse-bib.
reoff-rejour
Plugin for reoff that transforms an ooxast syntax tree into a jast syntax tree, i.e. converting .docx to JATS XML.
reoff-remark
Plugin for reoff that takes an ooxast tree and turns it into a remark (mdast) tree, allowing for .docx to Markdown conversion.
reoff-unified-latex
Plugin for reoff that takes an ooxast tree and turns it into a unified-latex tree, allowing for .docx to .tex conversion
texast
texast
DEPRECATED: Type definitions for texast (LaTeX abstract syntax tree), a syntax for abstract syntax trees representing LaTeX documents in the unist format.
texast-util-add-preamble
Add a preamble to a texast syntax tree.
texast-util-to-latex
Convert a texast syntax tree to LaTeX.
unified-latex
unified-latex-stringify
Plugin for unified-latex that takes a unified-latex tree and turns it into LaTeX.
update-readme
src
utils
misc
tryCatchPromise<T>(promise: Promise<T>, errorHandler?: Function): Promise<[T | null, unknown | null]>;
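Going by the signature alone, the helper appears to turn a promise that may reject into a `[value, error]` tuple. The sketch below is one plausible reading, not the package's source: the handler type is narrowed from `Function` to `(error: unknown) => void`, and calling the handler before returning is an assumption.

```typescript
// Hypothetical re-implementation based only on the published signature.
async function tryCatchPromise<T>(
  promise: Promise<T>,
  errorHandler?: (error: unknown) => void,
): Promise<[T | null, unknown | null]> {
  try {
    // Success: value in the first slot, no error in the second.
    return [await promise, null]
  } catch (error) {
    // Failure: let the optional handler see the error, then return it in the second slot.
    if (errorHandler !== undefined) errorHandler(error)
    return [null, error]
  }
}

// Usage sketch: callers branch on the tuple instead of wrapping every await in try/catch.
async function loadOrDefault(load: () => Promise<string>): Promise<string> {
  const [value, error] = await tryCatchPromise(load())
  return error === null && value !== null ? value : 'fallback'
}
```

One caveat of this tuple shape is that a promise resolving to `null` is indistinguishable from a failure by looking at the first slot alone, so checking the error slot is the safer test.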
ojs-to-preamble
This package is ESM only. In Node.js (version 12.20+, 14.14+, 16.0+, 18.0+), install as
readme
This library was generated with Nx.
update-readme
xast
xast-util-has-attribute
Port of hast-util-has-property for xast
xast-util-is-element
Port of hast-util-is-element for xast
xast-util-select
Port of hast-util-select (https://github.com/syntax-tree/hast-util-select) for use with xast nodes.
License
GPL-3.0+ © Thomas F. K. Jorna