Amit Dovev comments

Results: 538 comments of Amit Dovev

Always do serialization in little endian order

The only **big endian** architecture that is supported by LTS Linux distros is s390x (IBM Mainframe).
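For illustration, writing data in a fixed little endian order regardless of the host CPU could look roughly like the sketch below. This is not Tesseract's actual code; `StoreLE32` and `LoadLE32` are hypothetical helper names. Because the bytes are produced with shifts rather than by copying the in-memory representation, the same file bytes come out on both little endian and big endian machines, and no swap flag is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers (not part of Tesseract's API).

// Write a 32-bit value as 4 bytes, least significant byte first.
// Shifts operate on the value, not on memory, so the result does not
// depend on the byte order of the host.
inline void StoreLE32(std::uint32_t value, unsigned char *out) {
  assert(out != nullptr);
  for (std::size_t i = 0; i < 4; ++i) {
    out[i] = static_cast<unsigned char>((value >> (8 * i)) & 0xFFu);
  }
}

// Read 4 bytes stored least significant byte first back into a value.
inline std::uint32_t LoadLE32(const unsigned char *in) {
  assert(in != nullptr);
  std::uint32_t value = 0;
  for (std::size_t i = 0; i < 4; ++i) {
    value |= static_cast<std::uint32_t>(in[i]) << (8 * i);
  }
  return value;
}
```

On a little endian host a compiler can typically reduce both helpers to a plain load or store, so the portable form should cost little there.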

IMO, we can just drop the big endian support.

Tesseract has no SIMD support for s390x, so it's a huge waste of time and money to run Tesseract on IBM mainframes.

@stweil, can you comment about this issue?

Since we are going to keep the BE support, maybe we should test it regularly with GitHub Actions using QEMU: https://github.com/uraimo/run-on-arch-action

> I doubt that there are critical applications using Tesseract OCR on big endian machines which would justify more efforts.

That's why I thought we can drop it...

> Drop endianness check and swapping when reading data on LE machines.... models which were created on a big endian machine... must be rejected with an error message.

How can you...

OK, I think I found the answer. Only check once during model loading. https://github.com/tesseract-ocr/tesseract/blob/d1912d70100d6e13ff0241c478735bbca7256d1e/src/ccutil/tessdatamanager.cpp#L120
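A rough sketch of what a one-time check at load time might look like follows. It assumes the file starts with a 32-bit entry count and that a count outside a small plausible range means the file was written with the opposite byte order. The names `kMaxEntries`, `ByteSwap32` and `ReadEntryCount` are hypothetical and only illustrate the idea; they are not taken from the linked source.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical plausibility bound for the entry count in a model header.
constexpr std::uint32_t kMaxEntries = 1000;

// Reverse the byte order of a 32-bit value.
inline std::uint32_t ByteSwap32(std::uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
         ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Reads the entry count from the first 4 bytes of the header in host order.
// If the value is implausible, assumes the file uses the opposite byte order
// and retries with swapped bytes. The decision is made once here; later
// reads would only consult *needs_swap.
// Returns false if the count is implausible in both byte orders.
inline bool ReadEntryCount(const unsigned char *header, std::uint32_t *count,
                           bool *needs_swap) {
  assert(header != nullptr && count != nullptr && needs_swap != nullptr);
  std::uint32_t raw = 0;
  std::memcpy(&raw, header, sizeof(raw));  // host byte order interpretation
  *needs_swap = raw > kMaxEntries;
  if (*needs_swap) {
    raw = ByteSwap32(raw);
    if (raw > kMaxEntries) {
      return false;  // neither byte order gives a plausible count
    }
  }
  *count = raw;
  return true;
}
```

Under this scheme, a loader on a little endian machine that wanted to reject models created on big endian machines could report an error whenever `*needs_swap` comes back true, instead of swapping.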

@stweil, I don't think we should bother with steps 2 and 3. What about doing just step 1? Another option is to keep the current code without making any changes...

segmentation fault shapeclustering -F

See #3925.