iconv-lite

Improper encoding / decoding of some special 7-bit values in cp437, macintosh

Open rossj opened this issue 3 years ago • 3 comments

Hi there.

I've noticed that cp437 does not properly encode / decode the special symbols that are assigned to bytes 0x01-0x1F and 0x7F. When decoding, these bytes are incorrectly treated as-is and passed through as control characters. Similarly, when encoding, the special characters in this range are replaced with question marks.
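For reference, here is a minimal sketch of the mapping in question, written as a standalone lookup table. The names `CP437_LOW_GLYPHS`, `CP437_DEL_GLYPH`, `decodeCp437Glyph` and `encodeCp437Glyph` are hypothetical helpers made up for illustration; they are not part of iconv-lite, and this is not how the library generates its tables.

```javascript
// Hypothetical helpers, NOT part of iconv-lite.
// Graphical glyphs that cp437 assigns to bytes 0x01-0x1F, in byte order.
const CP437_LOW_GLYPHS =
    '\u263A\u263B\u2665\u2666\u2663\u2660\u2022\u25D8\u25CB\u25D9\u2642\u2640\u266A\u266B\u263C' + // 0x01-0x0F
    '\u25BA\u25C4\u2195\u203C\u00B6\u00A7\u25AC\u21A8\u2191\u2193\u2192\u2190\u221F\u2194\u25B2\u25BC'; // 0x10-0x1F

// Glyph that cp437 assigns to byte 0x7F (house symbol).
const CP437_DEL_GLYPH = '\u2302';

// Return the glyph for a special byte, or undefined for any other byte.
function decodeCp437Glyph(byte) {
    if (byte >= 0x01 && byte <= 0x1f) return CP437_LOW_GLYPHS[byte - 1];
    if (byte === 0x7f) return CP437_DEL_GLYPH;
    return undefined;
}

// Return the byte for a special glyph, or undefined for any other input.
function encodeCp437Glyph(char) {
    if (typeof char !== 'string' || char.length !== 1) return undefined;
    if (char === CP437_DEL_GLYPH) return 0x7f;
    const index = CP437_LOW_GLYPHS.indexOf(char);
    return index === -1 ? undefined : index + 1;
}
```

One caveat with a table like this: whether a byte such as 0x0A should be read as a glyph or as a line feed depends on the context the data came from, so a decoder that always applies the table would also turn line breaks into symbols.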

I've noticed a similar issue with the macintosh encoding, which has special symbols defined at 0x11-0x14.

As an example, the two tests below are currently failing:

import { decode, encode } from 'iconv-lite';

describe('encodings', () => {
    it('should encode special cp437 symbols that map to bytes 0x01-0x1F', () => {
        const input = '\u263A'; // A smiley face
        const result = encode(input, 'cp437');
        expect(result[0]).toEqual(1);
    });

    it('should decode cp437 bytes in range 0x01-0x1F', () => {
        const input = Buffer.from([1]);
        const result = decode(input, 'cp437');
        expect(result).toEqual('\u263A');
    });
});

rossj avatar Aug 06 '20 20:08 rossj

hmm yeah I think you're right. Thank you for filing this issue and the tests, really helpful! My current encoding generation code uses iconv project as the source, so it seems that it's wrong there too. Strange to see this in a relatively widely known encoding. I'll fix this soon.

ashtuchkin avatar Nov 22 '20 04:11 ashtuchkin

Came here to log exactly this. Any ETA? This would help a lot with enigma-bbs as well as a text mode RPG I'm working on!

NuSkooler avatar Jan 30 '21 01:01 NuSkooler

I double-checked, and the issue does indeed exist. I looked at the source code and found that the cp437 table is generated from a remote resource, but I guess that resource is missing part of the data. How about we add special handling for these special characters?

yosion-p avatar Aug 23 '21 02:08 yosion-p