Is there a reason not to add the books from Dicta? As far as I can tell, the line breaks and highlighting follow the classes in the OCRData files that are in the JSON file there.