[solved] Usage example with CUDA
Thanks for the library. Nice approach.
The documentation is currently missing a complete working usage example, as well as instructions on how to run the scorer on a GPU (with CUDA).
Here's my solution. Feel free to use it in any way you like.
This isn't the shortest possible code, but rather what is IMHO a minimal, clean solution that can be quickly adapted for real-world use.
Note that because dehyphen only analyzes the two lines on either side of a paragraph seam, the paragraph joiner may combine paragraphs when it shouldn't. If you have known-good paragraphs, it's better to dehyphenate each of them separately, as sketched below.
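For illustration, a minimal per-paragraph sketch using the `dehyphenate` helper defined further down (assuming `raw_text` holds text whose blank-line paragraph breaks are already correct, and `scorer` is the FlairScorer instantiated as in `main()`):

    # known-good paragraphs: dehyphenate each one separately, then stitch the text back together
    paragraphs = raw_text.split("\n\n")
    fixed = dehyphenate(scorer, paragraphs)  # list in, list out
    result = "\n\n".join(fixed)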
Tested on Python 3.10, with CUDA.
TL;DR: You need to know how to tell Flair to load on the GPU, and how to load, on recent Torch versions, a .pt file that fails to load in weights-only mode. The main() function below shows how. If you instantiate a FlairScorer on a recent Torch without forcing Torch out of weights-only load mode, it will fail, complaining about pickle protocol versions and that the (nowadays default) weights-only mode is not supported for protocol 4. With the code below, Torch will emit a UserWarning (for good reason!), but Flair will load successfully.
"""Usage example for `dehyphen` package, with CUDA."""
import contextlib
import copy
import os
import textwrap
import threading
from typing import List, Union
import torch
import flair
import dehyphen
# Utility for cleanly loading *only* Flair in "no weights only" mode
_environ_lock = threading.Lock()
@contextlib.contextmanager
def environ_override(**bindings):
"""Context manager: Temporarily override OS environment variable(s).
When the `with` block exits, the previous state of the environment is restored.
Thread-safe, but blocks if the lock is already taken - only one set of overrides
can be active at any one time.
"""
with _environ_lock:
# remember old values, if any
old_bindings = {key: os.environ[key] for key in bindings.keys() if key in os.environ}
try:
# apply overrides
for key, value in bindings.items():
os.environ[key] = value
# let the caller do its thing
yield
finally:
# all done - restore old environment
for key in bindings.keys():
if key in old_bindings: # restore old value
os.environ[key] = old_bindings[key]
else: # this key wasn't there in the previous state, so pop it
os.environ.pop(key)
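# Quick illustrative check of `environ_override` (not executed here; `DEMO_VAR` is a made-up name):
#
#     os.environ["DEMO_VAR"] = "old"
#     with environ_override(DEMO_VAR="new"):
#         assert os.environ["DEMO_VAR"] == "new"
#     assert os.environ["DEMO_VAR"] == "old"  # previous value restored on exit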
def _join_paragraphs(scorer: dehyphen.FlairScorer, candidate_paragraphs: List[str]) -> List[str]:
"""Internal function; input/output format is as produced by `dehyphen.text_to_format`.
Essentially, `[[lines, of paragraph, one], [lines, of, paragraph, two], ...]`, where each of the lines is a string.
"""
if len(candidate_paragraphs) >= 2:
out = []
candidate1 = candidate_paragraphs[0]
j = 1
        # handle blank lines at the beginning of the input (guard `j` so we never run past the end)
        while not candidate1 and j < len(candidate_paragraphs):  # no lines in this paragraph?
            candidate1 = candidate_paragraphs[j]
            j += 1
        # ran out of paragraphs? (input was blank lines, except possibly the last paragraph)
        if j == len(candidate_paragraphs):
out.append(candidate1)
return out
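        # Invariant: `candidate1` accumulates the paragraph currently being built; each `candidate2`
        # either extends it (when the scorer says the two belong together) or closes it off and
        # starts the next one.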
while True:
candidate2 = candidate_paragraphs[j]
combined = scorer.is_split_paragraph(candidate1, candidate2) # -> combined paragraph or `None`
if j == len(candidate_paragraphs) - 1: # end of text: commit whatever we have left
if combined is None: # candidate1 is a complete paragraph (candidate2 starts a new paragraph)
out.append(candidate1)
out.append(candidate2)
else:
out.append(combined)
break
else: # general case: commit only when a paragraph is completed
if combined is None: # candidate1 is a complete paragraph (candidate2 starts a new paragraph)
out.append(candidate1)
candidate1 = candidate2
else: # keep combining
candidate1 = combined
j += 1
else:
out = copy.copy(candidate_paragraphs)
return out
def dehyphenate(scorer: dehyphen.FlairScorer, text: Union[str, List[str]]) -> Union[str, List[str]]:
"""High-level API for dehyphenation.
Returns `str` (one input) or `list` of `str` (more inputs).
"""
def doit(text: str) -> str:
        # Don't send a single-character input to the scorer, to avoid crashing `dehyphen`.
if len(text) == 1:
return text
data = dehyphen.text_to_format(text)
data = scorer.dehyphen(data)
data = _join_paragraphs(scorer, data)
paragraphs = [dehyphen.format_to_paragraph(lines) for lines in data]
output_text = "\n\n".join(paragraphs)
return output_text
if isinstance(text, list):
output_text = [doit(item) for item in text]
else: # str
output_text = doit(text)
return output_text
def main():
# How to set CPU/GPU mode for Flair (used by `dehyphen`).
# This needs to be done *before* instantiating the model.
# https://github.com/flairNLP/flair/issues/464
flair.device = torch.device("cuda:0")
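    # Optional (my own addition, not required by `dehyphen`): fall back to CPU when no GPU is available, e.g.
    #   flair.device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")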
# Flair requires "no weights only load" mode for Torch; but this is insecure,
# so only enable it temporarily while loading the Flair model.
# https://github.com/flairNLP/flair/issues/3263
# https://github.com/pytorch/pytorch/blob/main/torch/serialization.py#L1443
with environ_override(TORCH_FORCE_NO_WEIGHTS_ONLY_LOAD="1"):
scorer = dehyphen.FlairScorer(lang="multi")
# Neumann & Gros 2023, https://arxiv.org/abs/2210.00849
input_text = textwrap.dedent("""
The recent observation of neural power-law scaling relations has made a signifi-
cant impact in the field of deep learning. A substantial amount of attention has
been dedicated as a consequence to the description of scaling laws, although
mostly for supervised learning and only to a reduced extent for reinforcement
learning frameworks. In this paper we present an extensive study of performance
scaling for a cornerstone reinforcement learning algorithm, AlphaZero. On the ba-
sis of a relationship between Elo rating, playing strength and power-law scaling,
we train AlphaZero agents on the games Connect Four and Pentago and analyze
their performance. We find that player strength scales as a power law in neural
network parameter count when not bottlenecked by available compute, and as a
power of compute when training optimally sized agents. We observe nearly iden-
tical scaling exponents for both games. Combining the two observed scaling laws
we obtain a power law relating optimal size to compute similar to the ones ob-
served for language models. We find that the predicted scaling of optimal neural
network size fits our data for both games. We also show that large AlphaZero
models are more sample efficient, performing better than smaller models with the
same amount of training data.
""").strip()
output_text = dehyphenate(scorer, input_text)
print("=" * 80)
print(input_text)
print("-" * 80)
print(output_text)
print("-" * 80)
if __name__ == "__main__":
main()