libredwg
libredwg copied to clipboard
Bugs in text splitting for DXF output
Invalid DXF file is produced when trying to convert DWG to DXF using dwg2dxf
. Long text (usually in group 1
) is split incorrectly. The continuation line does not have a preceding group code. Additionally unicode code points are split between the lines. Here is an example fragment of such result (sorry, I cannot share the full source file, it is proprietary):
XRECORD
5
E331
102
{ACAD_REACTORS
330
E2CB
102
}
330
E2CB
100
AcDbXrecord
280
1
40
401321167
1
{\fCalibri|b1|i0|c0|p34;\L3\fCalibri|b1|i0|c161|p34;. \fCalibri|b1|i0|c0|p34;ΟΡΟΙ ΔΟΜΗΣΗΣ\P\fCalibri|b0|i0|c0|p34;\l \P \fCalibri|b1|i0|c0|p34; 3.1 ΞΕΝΟΔΟΧΕΙΑΚΕΣ ΕΓΚΑΤΑΣΤΑΣΕΙΣ - ΕΚΤΟΣ ΣΧΕΔΙΟΥ\fCalibri|
b1|i0|c161|p34;\L\P\pxi-20.507,l23,t5.4931,2 # <<<< continuation without tag!
1
3,24;\fCalibri|b0|i0|c161|p34;\l1. ΘΕΣΗ ΓΗΠΕΔΟΥ \fCalibri|b1|i0|c161|p34;\L\P\fCalibri|b0|i0|c161|p34;\l2. ΝΟΜΟΙ & ΔΙΑΤΑΓΜΑΤΑ \P\pi0,l0,tz;\P\P\P\P\P\P\pi-20.507,l23,t5.4931,23,24;3. ΑΡΤΙΟΤΗΤΑ\P4. ΣΥΝΤΕΛ?
?ΣΤΗΣ ΔΟΜΗΣΗΣ \fCalibri|b0|i0|c0|p34; # <<<< continuation without tag! unicode symbol broken, part is left on the previous line.
Looking into the code, I suspect several problems: https://github.com/LibreDWG/libredwg/blob/07c078aca71840f0f9a0dffb3032056d043858b0/src/out_dxf.c#L1270
- [x] Continuation group code is written only if remaining length is greater than 255. It is probably a typo, and it should check for greater than 0, like in the block above it.
- [ ] Unicode is not handled in any way. Code point may be split at an arbitrary byte.
- [x] According to DXF specification text length limit is 250 characters, not 255.
- [x] According to DXF specification, if text field is split (group
1
), all partial fragments (starting from the first one) should have group3
, and should be terminated by last fragment with group1
. Seems there is no place for such logic in the current implementation. - [x] This 1024 bytes limit looks very bad. Shouldn't it be increased with dynamic buffer allocation?