
Tensor parallelism generates nonsensical outputs

Open · rasbt opened this issue 1 year ago · 1 comment

Bug description

For some reason, the tensor-parallel implementation generates nonsensical outputs:

⚡ python-api-tensor-parallel ~/litgpt litgpt generate_tp checkpoints/microsoft/phi-2 
...
Instruct: What food do llamas eat?
Output: When the
.

The first

.

The first

.

Time for inference 1: 1.31 sec total, 15.23 tokens/sec

Expected output (e.g., via base or sequential generation):

Instruct: What food do llamas eat?
Output: Llamas eat grass, shrubs, and other vegetation.
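
For reference, the baseline can also be checked through the Python API; a minimal sketch, assuming the `LLM.load` / `generate` interface from the README (the commented-out `distribute` call is only relevant if you want the sequential multi-GPU path, and its exact signature may differ between versions):

```python
from litgpt import LLM

# Non-TP baseline for comparison with `litgpt generate_tp`.
llm = LLM.load("microsoft/phi-2")

# Optional: spread the model across GPUs sequentially instead of using one device.
# llm.distribute(devices=2, generate_strategy="sequential")

print(llm.generate("What food do llamas eat?"))
```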

What operating system are you using?

Linux

LitGPT Version

Current main branch

rasbt · Aug 08 '24 15:08

It seems to be related to the MLP class:

Has problem:

  • microsoft/phi-2 (GptNeoxMLP)
  • EleutherAI/pythia-2.8b (GptNeoxMLP)
  • stabilityai/stablelm-base-alpha-7b (GptNeoxMLP)
  • google/gemma-2-2b (GemmaMLP)

Is fine:

  • meta-llama/Meta-Llama-3.1-8B-Instruct (LLaMAMLP)
  • openlm-research/open_llama_3b (LLaMAMLP)
  • microsoft/Phi-3-mini-4k-instruct (LLaMAMLP)
  • garage-bAInd/Platypus2-7B (LLaMAMLP)

This might get fixed automatically via #1421.
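
To make the suspicion above more concrete, here is a rough, self-contained sketch (not litgpt's actual TP code; the sizes and the single-process "ranks" are made up for illustration) of Megatron-style tensor parallelism applied to a GptNeoxMLP-shaped block (`fc` → GELU → `proj`). The point is that the row-parallel `proj` only produces correct activations after an all-reduce over the per-rank partial sums; if that reduction is not wired up for a particular MLP class, every hidden state is corrupted and generation degenerates into output like the one above:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy GptNeoxMLP-shaped block: fc -> GELU -> proj (names mirror the class; sizes are made up).
n_embd, n_hidden, world_size = 8, 32, 2
fc = nn.Linear(n_embd, n_hidden)
proj = nn.Linear(n_hidden, n_embd)

x = torch.randn(1, n_embd)
reference = proj(F.gelu(fc(x)))  # single-device result

# Megatron-style sharding: fc is split column-wise (output dim), proj row-wise (input dim).
fc_w, fc_b = fc.weight.chunk(world_size, dim=0), fc.bias.chunk(world_size, dim=0)
proj_w = proj.weight.chunk(world_size, dim=1)

# Each "rank" computes a partial sum of the down-projection.
partials = [
    F.gelu(x @ fc_w[r].T + fc_b[r]) @ proj_w[r].T
    for r in range(world_size)
]

# Correct TP: all-reduce (sum) the partials across ranks, then add the bias once.
print(torch.allclose(sum(partials) + proj.bias, reference, atol=1e-5))  # True

# If the reduction is never applied for this MLP class, each rank keeps only its
# partial sum, so the hidden states (and the generated text) are garbage.
print(torch.allclose(partials[0] + proj.bias, reference, atol=1e-5))    # False
```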

rasbt · Aug 08 '24 20:08