schema
ALTO XML schema - latest and all former versions
Currently there is no way of distinguishing hard and soft `HYP` elements. Example of a hard hyphen: ``` I separated the words by a non- breaking space. ``` Example of...
The `GlyphType` [documentation](https://github.com/altoxml/schema/blob/1a67f01c3689e5ff4b1714c4395b9fdcf668d93b/v4/alto-4-4.xsd#L1181) states: ``` Accordingly the value for the glyph element will be defined as follows: Pre-composed representation = base + combining character(s) (decomposed representation) See http://www.fileformat.info/info/unicode/char/0101/index.htm "U+0101" =...
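The pre-composed versus decomposed relationship quoted above can be checked with a short sketch. It uses Python's standard `unicodedata` module and the U+0101 character cited in the excerpt; it only illustrates Unicode normalization and is not part of the ALTO schema or its tooling.

```python
import unicodedata

# U+0101 (LATIN SMALL LETTER A WITH MACRON) is the pre-composed form
# cited in the GlyphType documentation excerpt.
precomposed = "\u0101"

# NFD normalization yields the decomposed representation:
# base character followed by its combining character(s).
decomposed = unicodedata.normalize("NFD", precomposed)
codepoints = [f"U+{ord(ch):04X}" for ch in decomposed]
print(codepoints)  # ['U+0061', 'U+0304']

# NFC normalization recomposes the sequence into the single code point.
recomposed = unicodedata.normalize("NFC", decomposed)
print(recomposed == precomposed)  # True
```

Here U+0061 is the base letter "a" and U+0304 is the combining macron, so the pre-composed character corresponds to base + combining character.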
Use of LC xlink instead of w3c xlink fails in mixed validation. Mixup/clashes in schema definitions.
ALTO (also true for MODS, METS and EAD) uses a LoC version of xlink.xsd (http://www.loc.gov/standards/xlink/xlink.xsd) but with a W3C namespace (https://www.w3.org/1999/xlink.xsd). When validating mixed XML we prefer to use the...
Starting from version 4.3 a better ordering/grouping mechanism was defined. Usage of ZORDER/IDNEXT as a parallel ordering option may lead to a lot of issues (define priority between two systems,...
As per the [2021-04-29 Board Meeting](/altoxml/board/blob/gh-pages/minutes/2021/2021-04-29%20ALTO%20Board%20Meeting%20Minutes.md), the _CC_ attribute: https://github.com/altoxml/schema/blob/831adab03d0b20d3d6a220f082ce496a18caef12/v4/alto-4-2.xsd#L525 will be marked as deprecated in the documentation by including this line. ```xml Deprecated. Where possible, the Glyph element should...
For example, the ALTO schema allows negative float values in attributes such as WIDTH, HEIGHT, HPOS and VPOS, where values should be positive. Validating against the schema doesn't catch documents where software has created...
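Until the schema itself restricts these values, a check outside of XSD validation can flag them. The following is a minimal sketch, assuming the four attribute names mentioned above are the ones of interest; the function name `find_negative_geometry` and the sample document are hypothetical and only serve as illustration.

```python
import xml.etree.ElementTree as ET

# Attributes named in the issue; extend the tuple if others matter.
GEOMETRY_ATTRS = ("WIDTH", "HEIGHT", "HPOS", "VPOS")


def find_negative_geometry(alto_xml: str):
    """Return (element, attribute, value) for each negative geometry value."""
    problems = []
    root = ET.fromstring(alto_xml)
    for elem in root.iter():
        # Strip the "{namespace}" prefix so the check works for any ALTO version.
        tag = elem.tag.rsplit("}", 1)[-1]
        for name in GEOMETRY_ATTRS:
            raw = elem.get(name)
            if raw is None:
                continue
            try:
                value = float(raw)
            except ValueError:
                continue  # non-numeric values are left to schema validation
            if value < 0:
                problems.append((tag, name, value))
    return problems


# Hypothetical, abbreviated sample; not a complete ALTO document.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Layout><Page ID="p1" WIDTH="2480" HEIGHT="3508">
    <PrintSpace>
      <TextBlock ID="b1" HPOS="-12.5" VPOS="40" WIDTH="300" HEIGHT="-1"/>
    </PrintSpace>
  </Page></Layout>
</alto>"""

problems = find_negative_geometry(SAMPLE)
print(problems)
```

For the sample above, the check reports the negative `HEIGHT` and `HPOS` on the `TextBlock` element, which plain schema validation would accept.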
This topic is derived from https://github.com/altoxml/schema/issues/49. In the previous issue we focused on changing the documentation and announcing the PointsType restrictions; in this topic we will discuss the implementation of the restrictions,...
``` html Submitter: CCS ([email protected]) Submitted: 2013-02 Status: Discussion Backwards compatible:**Yes (Only Annotation)** To ALTO Version: ? ``` For the page / word and character confidence the values for the...
At the face-to-face conference in Vienna the idea came up to create a conversion between PAGE and ALTO as a best-practice mapping between the different standard objects. If feasible, a transformation could...
At the [2019-05-07 ALTO face-to-face Board Meeting](/altoxml/board/edit/gh-pages/minutes/2019/2019-05-07%20ALTO%20Board%20Meeting%20Minutes.md) there was support for identifying a standard and interoperable lattice structure for encoding OCR uncertainty and alternative hypotheses, and the discussion since...