malteos
@jivatneet @ajoshi80 The issue is that under Python 3 all tokens end up as bytes objects rather than strings, so they are treated as OOV. Converting everything back to string representations seems to work.
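A minimal sketch of that mismatch, in case it helps. The vocabulary, the tokens, and the `lookup`/`to_str` helpers are made up for illustration and are not taken from the project's code; the only claim is that in Python 3 a `bytes` key never matches a `str` key in a dict.

```python
# Hypothetical vocabulary keyed by str (illustrative, not the project's real vocab).
vocab = {"hello": 0, "world": 1}

# Tokens as they might arrive when a file is read in binary mode under Python 3.
tokens = [b"hello", b"world", b"unseen"]

OOV_ID = -1  # assumed placeholder id for out-of-vocabulary tokens


def lookup(token, vocab, oov_id=OOV_ID):
    """Return the vocabulary id for a token, or the OOV id if it is missing."""
    return vocab.get(token, oov_id)


def to_str(token, encoding="utf-8"):
    """Decode bytes tokens to str; leave str tokens unchanged."""
    return token.decode(encoding) if isinstance(token, bytes) else token


# bytes keys never equal str keys, so every token looks OOV.
raw_ids = [lookup(t, vocab) for t in tokens]

# After decoding, known tokens resolve and only the unknown one stays OOV.
fixed_ids = [lookup(to_str(t), vocab) for t in tokens]

print(raw_ids)
print(fixed_ids)
```

Decoding once, right where the tokens are read, is probably cleaner than patching each lookup.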
@tilusnet Any progress? I tried to get HTML syntax working, but it doesn't seem to be trivial, especially serializing the text selection to send data back to Django. But...
Not true HTML support, but for Open Redact we've built an annotation tool based on React that supports paragraphs, which you can style with CSS. See https://github.com/openredact/openredact-app
Is there any progress on this? With the help of @jeffkhull's `docker-compose.yml` I managed to get everything running. However, I failed to create a user to log in because verification mails aren't...
The checkpoint for the 1B version is available on HF Hub: https://huggingface.co/bigscience/tr5b-1B3-multilingual-alpha-checkpoints/tree/global_step118500 (Note: you must select the branch that matches the global step; the main branch is empty).
You can convert the HF checkpoints back to Megatron-DeepSpeed. See this (a bit hacky) script: https://gist.github.com/malteos/c194368594e16439c101b7bf27195fd1
The script overwrites the weights of a DeepSpeed checkpoint directly on disk with the weights from an HF checkpoint. So you just need to save an untrained DS checkpoint and...
The CSV renderer requires rewriting the results into a simple table format. Postponed until we have a stable API.
What do you mean by "old documents"? Can you give me some more details?
Hi @dennlinger, thanks for your bug report. We are already aware of this bug but haven't been able to fix it yet (see https://github.com/openlegaldata/legal-reference-extraction/issues/1). If the original text without any annotation...