
Feature/multilingual

Open SnowMasaya opened this issue 4 months ago • 4 comments

Add multilingual support

Verification

  • [pass] Run the tests and ensure they pass: python -m pytest tests/
  • [pass] Verify the thing does what it should: I added docs/source/translator.rst

Feature explanation

You need an API key for your preferred translation service.

DeepL

export DEEPL_API_KEY=xxxx
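A missing key is easier to catch before a run starts than in the middle of one. The sketch below is a hypothetical helper, not part of garak: it only uses the environment variable names that appear in this description (DEEPL_API_KEY and NIM_API_KEY) and assumes the local service needs no key, since the local examples set none.

```python
import os

# Environment variable names taken from the examples in this description.
# The mapping and the helper are illustrative, not garak's own code.
API_KEY_ENV_VARS = {
    "deepl": "DEEPL_API_KEY",
    "nim": "NIM_API_KEY",
    "local": None,  # assumption: local translation needs no API key
}


def missing_api_key(translation_service, environ=os.environ):
    """Return the name of the unset variable, or None if nothing is missing."""
    if translation_service not in API_KEY_ENV_VARS:
        raise ValueError(f"unknown translation service: {translation_service!r}")
    env_var = API_KEY_ENV_VARS[translation_service]
    # An empty string counts as unset.
    if env_var is None or environ.get(env_var):
        return None
    return env_var
```

Calling missing_api_key("deepl") before launching a run gives the name of the variable to export, or None when the key is already set.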

Config file

You can pass the translation service, source language, and target language as arguments.

  • translation_service: "nim", "deepl", or "local"
  • lang_spec: "ja", "ja,fr" etc. (you can set multiple language codes)

Note: The Helsinki-NLP/opus-mt-en-{lang} models use different language code formats, and the codes used in the model names are inconsistent. Two-letter codes can usually be found here, while three-letter codes require a search like "language code {code}".

run:
  translation_service: {your chosen translation service: "nim", "deepl", or "local"}
  lang_spec: {your chosen language code}
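To make the lang_spec format concrete, here is a minimal sketch of how a comma-separated value such as "ja,fr" could be split into individual codes, and how a code maps onto the Helsinki-NLP/opus-mt-en-{lang} naming pattern mentioned above. Both function names are hypothetical; this description does not show how garak parses the value internally.

```python
def parse_lang_spec(lang_spec):
    """Split a comma-separated lang_spec such as "ja,fr" into a list of codes."""
    codes = [code.strip() for code in lang_spec.split(",")]
    # Drop empty entries left by stray or trailing commas.
    return [code for code in codes if code]


def opus_mt_model_name(target_lang):
    """Fill the Helsinki-NLP/opus-mt-en-{lang} pattern for one target language."""
    return f"Helsinki-NLP/opus-mt-en-{target_lang}"


if __name__ == "__main__":
    for lang in parse_lang_spec("ja,fr"):
        print(lang, opus_mt_model_name(lang))
```

Because the codes in the model names are inconsistent, a name built this way is only a candidate and should be checked against the published models before use.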

Examples for multilingual support

DeepL

To use the translation option for garak, run the following command:

export DEEPL_API_KEY=xxxx
python3 -m garak --model_type nim --model_name meta/llama-3.1-8b-instruct --probes encoding --translation_service deepl --lang_spec ja

If you save the config file as "garak/configs/simple_translate_config_deepl.yaml", use this command:

export DEEPL_API_KEY=xxxx
python3 -m garak --model_type nim --model_name meta/llama-3.1-8b-instruct --probes encoding --config garak/configs/simple_translate_config_deepl.yaml

Example config file:

run:
  translation_service: "deepl"
  lang_spec: "ja"

NIM

For NIM, run the following command:

export NIM_API_KEY=xxxx
python3 -m garak --model_type nim --model_name meta/llama-3.1-8b-instruct --probes encoding --translation_service nim --lang_spec ja

If you save the config file as "garak/configs/simple_translate_config_nim.yaml", use this command:

export NIM_API_KEY=xxxx
python3 -m garak --model_type nim --model_name meta/llama-3.1-8b-instruct --probes encoding --config garak/configs/simple_translate_config_nim.yaml

Example config file:

run:
  translation_service: "nim"
  lang_spec: "ja"

Local

For local translation, use the following command:

python3 -m garak --model_type nim --model_name meta/llama-3.1-8b-instruct --probes encoding --translation_service local --lang_spec ja

If you save the config file as "garak/configs/simple_translate_config_local.yaml", use this command:

python3 -m garak --model_type nim --model_name meta/llama-3.1-8b-instruct --probes encoding --config garak/configs/simple_translate_config_local.yaml

Example config file:

run:
  translation_service: local
  local_model_name: "facebook/m2m100_418M"
  local_tokenizer_name: "facebook/m2m100_418M"
  lang_spec: "ja"

Specific Hardware Examples:

Local translation needs a GPU.

  • GPU related
    • CUDA support is required
    • Minimum GPU memory:
      • Helsinki-NLP/opus-mt-en-{}: around 1500MB
      • facebook/m2m100_418M: around 8000MB
      • facebook/m2m100_1.2B: around 15000MB
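The approximate figures above can be turned into a quick check of which local models are likely to fit on a given GPU. This is a sketch under the assumption that the listed numbers are rough minimums; the helper name is hypothetical and the available memory must be supplied by the caller.

```python
# Approximate minimum GPU memory in MB, copied from the list above.
APPROX_MIN_GPU_MEMORY_MB = {
    "Helsinki-NLP/opus-mt-en-{}": 1500,
    "facebook/m2m100_418M": 8000,
    "facebook/m2m100_1.2B": 15000,
}


def models_that_fit(available_mb):
    """Return the local models whose approximate minimum fits in available_mb."""
    return [
        name
        for name, needed_mb in APPROX_MIN_GPU_MEMORY_MB.items()
        if needed_mb <= available_mb
    ]


if __name__ == "__main__":
    # Example: a GPU with 8 GB of memory.
    print(models_that_fit(8192))
```

Since the figures are approximate, a model that only just fits may still run out of memory in practice.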

SnowMasaya avatar Oct 09 '24 00:10 SnowMasaya