Shaky attempt to add the HTML5 entities

Open StoneCypher opened this issue 2 years ago • 4 comments

Hello. I don't speak Ruby, and I don't know how to run the tests on this. Use stink-eye heavily please.

I wrote this patch to attempt to address #736 , that all HTML5 entities are missing. I want ⅓.

Two assumptions are made in this patch which I am concerned about, above and beyond whether I should be allowed anywhere near a keyboard.

  1. Extended combined entities may be represented in their long form. This is separately addressed as #737. I currently represent ⫅̸ as 10949, which is acceptable, but it should be 10949 824 (U+2AC5 U+0338) instead.
  2. Integer placements may be repeated. Some entities have multiple names, such as 10878, which can be called ⩾̸ or ⩾̸. I'm assuming I can just list them both, but I don't ruby, so I'm not 100% sure I'm reading that hash table's uniqueness criterion correctly.
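
To make assumption 1 concrete, here is a minimal sketch (plain JavaScript, not part of the patch) of the difference between the single code point 10949 and the full two code point sequence. It assumes the second code point is the combining long solidus overlay U+0338, which is 824 in decimal; the helper name `toCodePoints` is made up for illustration.

```javascript
// Decimal code points for the example: U+2AC5 (10949) plus the combining
// long solidus overlay U+0338 (824). Assumption: this is the intended pair.
const baseOnly = String.fromCodePoint(10949);
const withOverlay = String.fromCodePoint(10949, 824);

// Hypothetical helper: list the decimal code points of a string.
const toCodePoints = (s) => [...s].map((ch) => ch.codePointAt(0));

console.log(toCodePoints(baseOnly));    // [ 10949 ]
console.log(toCodePoints(withOverlay)); // [ 10949, 824 ]
```

A table that stores only 10949 renders the base character without the overlay, which is why the single-integer form is an approximation.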

Thanks kindly.

StoneCypher avatar Oct 30 '21 21:10 StoneCypher

If you're curious, the method for producing this patch was:

  1. Using firefox, control-select to pull the appropriate columns out of this table, after sorting on standard
  2. Postprocess with the below script in a browser console, because lazy
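```js
// Paste the copied table rows between the backticks.
const chars = `Lang  ⟪   U+27EA (10218)  HTML 5.0
Rang  ⟫   U+27EB (10219)  HTML 5.0
... rest of Firefox paste here ...
varsupsetneqq, vsupnE   ⫌︀  U+2ACC (10956), U+FE00 (65024)  HTML 5.0
nparsl  ⫽⃥  U+2AFD (11005), U+20E5 (8421)   HTML 5.0`
  .split('\n')
  // Columns are separated by a tab or by two or more whitespace characters.
  .map(row => row.trim().split(/\s{2,}|\t/))
  // Skip anything that is not a full table row (such as the placeholder line).
  .filter(cols => cols.length >= 3);

// One `[codepoint, 'name'],` line per entity name; names are comma-separated.
const makeRow = (name, num) => {
  const names = name.split(', ');
  return names.map(nm => `        [${num}, '${nm}'],`).join('\n');
};

console.log(
  chars
    // Only the first code point in parentheses is kept (see assumption 1).
    .map(row => makeRow(row[0], parseInt(row[2].split('(')[1].split(')')[0], 10)))
    .join('\n')
);
```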
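
The script keeps only the first code point of each row. For the multi-codepoint rows (the #737 case), a sketch of how every code point could be pulled from the same column text might look like this. `codePointsOf` is a hypothetical helper, not something the patch or kramdown provides.

```javascript
// Hypothetical helper, not part of the original script: collect every decimal
// code point that appears in parentheses in a "U+XXXX (NNNN)" column.
const codePointsOf = (column) =>
  [...column.matchAll(/\((\d+)\)/g)].map((m) => parseInt(m[1], 10));

console.log(codePointsOf('U+27EA (10218)'));                 // [ 10218 ]
console.log(codePointsOf('U+2ACC (10956), U+FE00 (65024)')); // [ 10956, 65024 ]
```

Single-codepoint rows still yield a one-element array, so the same helper would cover both kinds of row.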

StoneCypher avatar Oct 30 '21 21:10 StoneCypher

@gettalong

StoneCypher avatar Feb 17 '22 09:02 StoneCypher

@StoneCypher I will include this in the next release, though the expanded version that also handles multi-codepoint entities will have to wait.

gettalong avatar Mar 17 '22 22:03 gettalong

OK :)

StoneCypher avatar Mar 18 '22 10:03 StoneCypher

@StoneCypher Regarding your assumption 2: If multiple entries with the same code point exist, the mapping from code point to string representation uses the last entry. However, all string to code point representations are available.
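
As a rough illustration of that behavior, here is a small JavaScript model. It is only a sketch: it is not kramdown's Ruby code, and the table contents and the names `entityTable`, `nameToCodePoint` and `codePointToName` are assumptions made up for the example.

```javascript
// Hypothetical table in the patch's [codePoint, name] row shape; this models
// the behavior described above and is not kramdown's actual Ruby code.
const entityTable = [
  [10878, 'ges'],
  [10878, 'geqslant'],
  [8531, 'frac13'],
];

const nameToCodePoint = new Map();
const codePointToName = new Map();
for (const [codePoint, name] of entityTable) {
  nameToCodePoint.set(name, codePoint); // every name stays resolvable
  codePointToName.set(codePoint, name); // later rows overwrite earlier ones
}

console.log(nameToCodePoint.get('ges'));      // 10878
console.log(nameToCodePoint.get('geqslant')); // 10878
console.log(codePointToName.get(10878));      // geqslant
```

In this model, listing a code point twice is harmless for parsing; it only decides which name is used when going from a code point back to a name.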

I have merged your changes and they will be in the next release (see https://github.com/gettalong/kramdown/compare/master...devel), this time really! :smile:

gettalong avatar Mar 15 '23 22:03 gettalong

Thank you

StoneCypher avatar Mar 15 '23 22:03 StoneCypher