[Good First Issue]: Verify mpt-7b-chat with GenAI text_generation
Context
This task concerns enabling tests for mpt-7b-chat. You can find more details in the openvino_notebooks LLM chatbot README.md.
Please ask general questions in the main issue at https://github.com/openvinotoolkit/openvino.genai/issues/259
What needs to be done?
Described in the main Discussion issue at: https://github.com/openvinotoolkit/openvino.genai/issues/259
Example Pull Requests
Described in the main Discussion issue at: https://github.com/openvinotoolkit/openvino.genai/issues/259
Resources
- Contribution guide - start here!
- Intel DevHub Discord channel - engage in discussions, ask questions and talk to OpenVINO developers
Contact points
Described in the main Discussion issue at: https://github.com/openvinotoolkit/openvino.genai/issues/259
Ticket
No response
.take
Thank you for looking into this issue! Please let us know if you have any questions or require any help.
.take
Thanks for being interested in this issue. It looks like this ticket is already assigned to a contributor. Please communicate with the assigned contributor to confirm the status of the issue.
Hello @qxprakash, are you still working on this? Is there anything we could help you with?
Hello @p-wysocki, yes, I am working on it. I ran into an error while trying to convert mpt-7b-chat:
```
python3 ../../../llm_bench/python/convert.py --model_id mosaicml/mpt-7b-chat --output_dir ./MPT_CHAT --precision FP16
[ INFO ] Removing bias from module=LPLayerNorm((4096,), eps=1e-05, elementwise_affine=True).
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [01:02<00:00, 31.27s/it]
generation_config.json: 100%|██████████████████████████████████████████████████████████████████████████████████████████████| 121/121 [00:00<00:00, 781kB/s]
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/attention.py:87: UserWarning: Propagating key_padding_mask to the attention module and applying it within the attention module can cause unnecessary computation/memory usage. Consider integrating into attn_bias once and passing that to each attention module instead.
warnings.warn('Propagating key_padding_mask to the attention module ' + 'and applying it within the attention module can cause ' + 'unnecessary computation/memory usage. Consider integrating ' + 'into attn_bias once and passing that to each attention ' + 'module instead.')
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/modeling_mpt.py:311: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
assert S <= self.config.max_seq_len, f'Cannot forward input with seq_len={S}, this model only supports seq_len<={self.config.max_seq_len}'
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/modeling_mpt.py:253: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
_s_k = max(0, attn_bias.size(-1) - s_k)
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/attention.py:78: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
_s_q = max(0, attn_bias.size(2) - s_q)
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/attention.py:79: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
_s_k = max(0, attn_bias.size(3) - s_k)
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/attention.py:81: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if attn_bias.size(-1) != 1 and attn_bias.size(-1) != s_k or (attn_bias.size(-2) != 1 and attn_bias.size(-2) != s_q):
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/attention.py:89: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if is_causal and (not q.size(2) == 1):
/home/prakash/.cache/huggingface/modules/transformers_modules/mosaicml/mpt-7b-chat/1fe2374291e730f7c58ceb1bf49960082371b551/attention.py:90: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
s = max(s_q, s_k)
[ WARNING ] Failed to send event with the following error: <urlopen error EOF occurred in violation of protocol (_ssl.c:2426)>
[ WARNING ] Failed to send event with the following error: <urlopen error EOF occurred in violation of protocol (_ssl.c:2426)>
```
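For context on the next step: once the conversion completes, one quick way to confirm the IR is usable before moving to the C++ sample is to load it with optimum-intel and generate a few tokens. This is a minimal sketch, not the project's prescribed test; the `model_dir` path follows llm_bench's usual output layout (`<output_dir>/pytorch/dldt/<precision>`) and may differ on your setup.

```python
# Minimal smoke test for the converted IR.
# model_dir is an assumption based on llm_bench's usual layout;
# point it at wherever convert.py actually wrote the FP16 IR.
from optimum.intel.openvino import OVModelForCausalLM
from transformers import AutoTokenizer

model_dir = "./MPT_CHAT/pytorch/dldt/FP16"

# mpt-7b-chat ships custom modeling code, hence trust_remote_code
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = OVModelForCausalLM.from_pretrained(model_dir, trust_remote_code=True)

inputs = tokenizer("What is OpenVINO?", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

If this prints a sensible continuation, the IR itself is fine and any remaining failures are on the sample side.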
cc @pavel-esir
@pavel-esir I hope this is not a memory issue?
@qxprakash thanks for your update. I see only connection warnings in your logs. Did you get the converted IR? If so, did you run it with the C++ sample?
@qxprakash what is the progress?
Hello @qxprakash, are you still working on that issue? Do you need any help?
Hi @p-wysocki, I was stuck; currently I'm not working on it.
.take
Thanks for being interested in this issue. It looks like this ticket is already assigned to a contributor. Please communicate with the assigned contributor to confirm the status of the issue.
@p-wysocki I'd like to take upon this issue as well.
I reassigned this to @rk119 because I'm not sure if @Utkarsh-2002 is still here. If you still want to work on it, let us know. If @rk119 confirms they hadn't started working on it by the time @Utkarsh-2002 replies, @Utkarsh-2002 can take the task.
Hi @Wovchena,
Before raising a PR, I want to mention that for the prompt_lookup_decoding_lm and speculative_decoding_lm samples, I am getting the following error: `Exception from src/inference/src/cpp/infer_request.cpp:193: Check '::getPort(port, name, {_impl->get_inputs(), _impl->get_outputs()})' failed at src/inference/src/cpp/infer_request.cpp:193: Port for tensor name position_ids was not found.`
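For reference, this error typically means the sample feeds a tensor named position_ids while the exported model exposes no input with that name. A quick diagnostic sketch (the XML path below is an assumption; point it at your exported IR):

```python
# Print the input tensor names of the exported model to check whether
# position_ids is among them (path below is hypothetical).
import openvino as ov

core = ov.Core()
model = core.read_model("./MPT_CHAT/pytorch/dldt/FP16/openvino_model.xml")

for port in model.inputs:
    # each input port can carry several tensor names
    print(port.get_names(), port.get_partial_shape())
```

If position_ids is missing from the printed names, the model was likely exported without it, and the export settings and the sample need to be brought into agreement.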
.take
Thank you for looking into this issue! Please let us know if you have any questions or require any help.