DWG large files errors

Open arturred opened this issue 5 years ago • 9 comments

Hi. Reading these files with the current binary asset reports many errors:

Warning: checksum: 0x31b7133b (calculated) mismatch

ERROR: read_R2004_section_info out of range
Warning: Failed to find section_info[7] with type 1
ERROR: Failed to read compressed Header section
ERROR: Invalid .props x 28191
Warning: Failed to find section_info[7] with type 3
ERROR: Failed to read compressed Classes section
Warning: Skip empty section 2329 AcDb:AcDbObjects
ERROR: Invalid opcode 0x0 in input stream at pos 294
ERROR: Failed to read compressed AcDbObjects section
Warning: Failed to find section_info[7] with type 2
ERROR: Failed to read uncompressed AuxHeader section
ERROR: Preview overflow > 29067
Warning: thumbnail.size mismatch: 29071 != 0
ERROR: Some section size or address out of bounds
ERROR: Template section not found
...

Please download the samples from https://easyupload.io/jvzytl (Teigha and AutoCAD can read them). If I convert them to other DWG versions (lower or higher), they open fine, so this seems to be a file-specific issue.

arturred avatar Oct 14 '20 11:10 arturred

Yes, this is the known section map bug #144

rurban avatar Oct 14 '20 16:10 rurban

Thanks for the info. This seems to have been a hard bug for a year now. I've also tested libdxfrw and tried to fix it there (using your suggestions about variable overflow), but no luck so far. I either get duplicated ids in the page map section or invalid addresses outside the buffer range. In other files, the page map seems to be correct but reading the section info fails. The problem is not only overflowing values but also that the decompressed buffer may contain gaps (negative page id?). I have no idea, so I hope that you will figure this out.
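The symptoms listed above (duplicated page ids, addresses outside the buffer, gaps) can be checked mechanically once the page map has been decompressed. What follows is only a minimal sketch under stated assumptions: entries are (page number, size) pairs, a negative page number marks a gap, and page addresses accumulate from 0x100. These assumptions follow the commonly documented R2004 layout, not libredwg's or libdxfrw's actual code, and `PageMapEntry` and `check_page_map` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PageMapEntry:
    number: int  # assumption: a negative number marks a gap (unused data)
    size: int    # page size in bytes


def check_page_map(entries, file_size, start_address=0x100):
    """Return a list of human-readable problems found in a decoded page map.

    Assumption: page addresses are not stored; they accumulate from
    start_address by adding each entry's size, gaps included.
    """
    problems = []
    seen = set()
    address = start_address
    for entry in entries:
        if entry.size < 0:
            problems.append(f"page {entry.number}: negative size {entry.size}")
            continue
        if entry.number >= 0:
            # Real pages must have unique ids; gaps are skipped here.
            if entry.number in seen:
                problems.append(f"page {entry.number}: duplicated id")
            seen.add(entry.number)
        if address + entry.size > file_size:
            problems.append(
                f"page {entry.number}: address 0x{address:x} + size "
                f"{entry.size} exceeds file size {file_size}"
            )
        address += entry.size
    return problems


if __name__ == "__main__":
    demo = [
        PageMapEntry(1, 0x100),
        PageMapEntry(-2, 0x80),   # gap
        PageMapEntry(1, 0x100),   # duplicated id
        PageMapEntry(3, 0x1000),  # runs past the end of the file
    ]
    for problem in check_page_map(demo, file_size=0x400):
        print(problem)
```

Running such a check on the failing samples would at least show whether the first inconsistency appears before or after the first gap entry.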

arturred avatar Oct 15 '20 08:10 arturred

Yes, a hard one. The more failing DWG examples we have, the easier it is to figure out the scheme. In principle, reproducing it needs a big DWG from which many entities are then deleted, which causes the gaps.

rurban avatar Oct 15 '20 20:10 rurban

I think I found a ton of files that fail in this same manner: https://www.3drotterdam.nl/downloads/#/

From HEAD build, Fedora 29, GCC 8.3.1:

curl -O https://www.3drotterdam.nl/downloads/global/download//DWG/Rotterdam_Centrum.zip
unzip Rotterdam_Centrum.zip
dwgread -O GeoJSON -o Cool.json Rotterdam_Centrum/Bomen/Cool.dwg
Warning: checksum: 0x28e2125d (calculated) mismatch

ERROR: Skip section A with size 89 > 1 * 0
ERROR: read_R2004_section_info out of range
Warning: Failed to find section_info[7] with type 1
ERROR: Failed to read compressed Header section
Warning: Failed to find section_info[7] with type 3
ERROR: Failed to read compressed Classes section
Warning: Failed to find section_info[7] with type 4
ERROR: Failed to read compressed Handles section
Warning: Failed to find section_info[7] with type 2
ERROR: Failed to read uncompressed AuxHeader section
ERROR: Preview overflow
ERROR: Invalid product_checksum size 16. Need min. 16 bits, have 65280 for .
ERROR: Template section not found

ERROR: Failed to decode file: Cool.dwg 0x941

ERROR 0x941
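The final code 0x941 is a bitmask, so it can be split into its individual flags. The sketch below assumes the bit values of the `Dwg_Error` enum in `include/dwg.h` as I remember them; please verify them against your checkout before relying on the names. `decode_dwg_error` and `is_critical` are hypothetical helpers, not part of libredwg.

```python
# Assumed bit values of libredwg's Dwg_Error enum (check include/dwg.h).
DWG_ERROR_BITS = {
    0x0001: "DWG_ERR_WRONGCRC",
    0x0002: "DWG_ERR_NOTYETSUPPORTED",
    0x0004: "DWG_ERR_UNHANDLEDCLASS",
    0x0008: "DWG_ERR_INVALIDTYPE",
    0x0010: "DWG_ERR_INVALIDHANDLE",
    0x0020: "DWG_ERR_INVALIDEED",
    0x0040: "DWG_ERR_VALUEOUTOFBOUNDS",
    0x0080: "DWG_ERR_CLASSESNOTFOUND",
    0x0100: "DWG_ERR_SECTIONNOTFOUND",
    0x0200: "DWG_ERR_PAGENOTFOUND",
    0x0400: "DWG_ERR_INTERNALERROR",
    0x0800: "DWG_ERR_INVALIDDWG",
    0x1000: "DWG_ERR_IOERROR",
    0x2000: "DWG_ERR_OUTOFMEM",
}

# Assumption: everything from DWG_ERR_CLASSESNOTFOUND upwards is critical.
DWG_ERR_CRITICAL = 0x0080


def decode_dwg_error(code):
    """Return the names of all flags set in code, lowest bit first."""
    return [name for bit, name in sorted(DWG_ERROR_BITS.items()) if code & bit]


def is_critical(code):
    """True if any flag at or above the assumed critical threshold is set."""
    return code >= DWG_ERR_CRITICAL


if __name__ == "__main__":
    for name in decode_dwg_error(0x941):
        print(name)
    print("critical:", is_critical(0x941))
```

Under these assumptions, 0x941 combines a wrong CRC, a value out of bounds, a missing section and an invalid DWG, which is consistent with the messages in the log above.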

markstock avatar May 03 '21 21:05 markstock

This seems to be a good example, thanks. No deleted pages, just a corrupt section_info[6] out of thin air. Interesting.

rurban avatar May 04 '21 04:05 rurban

It seems that many files have the checksum and other issues, including some that ship with libredwg:

root@695c816fa3f9:/libredwg/test/test-data# ../../programs/dwgread --format json example_2010.dwg 2>&1 | grep "Warning: checksum"
Warning: checksum: 0x2edd12f6 (calculated) mismatch
root@695c816fa3f9:/libredwg/test/test-data# ../../programs/dwgread --format json example_2013.dwg 2>&1 | grep "Warning: checksum"
Warning: checksum: 0x2c7512b9 (calculated) mismatch
root@695c816fa3f9:/libredwg/test/test-data# ../../programs/dwgread --format json example_2018.dwg 2>&1 | grep "Warning: checksum"
Warning: checksum: 0x2d1512c9 (calculated) mismatch
root@695c816fa3f9:/libredwg/test/test-data# ../../programs/dwgread --format json sample_2018.dwg 2>&1 | grep "Warning: checksum"
Warning: checksum: 0x2845124f (calculated) mismatch

And about a third of the sample files I am using to test.
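To compare many files, it helps to tally the messages instead of grepping for one of them. This is a rough sketch that only assumes the `ERROR: ...` and `Warning: ...` line shapes seen in the logs in this thread; `summarize_log` is a hypothetical helper, and the log text is expected to be captured separately (for example with `2>&1` as above).

```python
import re
from collections import Counter

# Matches the two message shapes that appear in the logs in this thread.
LINE_RE = re.compile(r"^(ERROR|Warning): (.*)$")
# Hex and decimal numbers are replaced so that similar messages group together.
NUMBER_RE = re.compile(r"0x[0-9a-fA-F]+|\d+")


def summarize_log(text):
    """Count (level, normalized message) pairs in captured dwgread output."""
    counts = Counter()
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip blank lines and anything without a known prefix
        level, message = match.groups()
        counts[(level, NUMBER_RE.sub("N", message))] += 1
    return counts


if __name__ == "__main__":
    sample = """\
Warning: checksum: 0x2edd12f6 (calculated) mismatch
Warning: checksum: 0x2c7512b9 (calculated) mismatch
ERROR: Template section not found
"""
    for (level, message), count in summarize_log(sample).most_common():
        print(f"{count:3d}  {level}: {message}")
```

Files that only show the checksum warning can then be separated from those that also fail with section errors.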

How much does this impact the ability to extract text from the file? Are we going to miss any sections due to this issue?

no-such-user avatar Oct 20 '21 18:10 no-such-user

Large files cause all the text to be garbled. Is there a way to solve this?

FishOrBear avatar Apr 01 '22 08:04 FishOrBear

FishOrBear @.***> wrote on Fri, Apr 1, 2022, 10:09:

Large files will cause all the text to be garbled, Is there a way to solve?

Only if many objects have been deleted. There is no way to solve it as of yet.

rurban avatar Apr 02 '22 12:04 rurban

Why does reading the file with dwggrep.exe not produce garbled characters?

FishOrBear avatar Apr 11 '22 05:04 FishOrBear