# Add dora-lmdeploy Node for Efficient Multimodal LLM Inference
## Summary

This PR (#817) introduces the dora-lmdeploy node, which leverages LMDeploy's TurboMind engine for efficient inference of large language models. It supports both text and image inputs, making it suitable for multimodal tasks.
## Key Features
- Efficient Inference: Optimized for CUDA performance using Turbomind.
- Multimodal Support: Handles text and image inputs.
- Configurable Models: Supports various models like InternLM and Qwen.
- Conversation History: Maintains context for responses.
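As a sketch of how the conversation-history feature could work, the node could keep a bounded list of turns and replay them as context for each new request. The class and method names below are illustrative only, not the node's actual API:

```python
from collections import deque


class ConversationHistory:
    """Bounded chat history: keeps only the most recent turns as context.

    Illustrative sketch; the real dora-lmdeploy node may store history
    differently.
    """

    def __init__(self, max_turns: int = 8):
        # Oldest turns drop off automatically once the deque is full.
        self.turns = deque(maxlen=max_turns)

    def add(self, role: str, text: str) -> None:
        self.turns.append((role, text))

    def as_prompt(self, new_user_text: str) -> str:
        # Flatten prior turns plus the new message into a single prompt.
        lines = [f"{role}: {text}" for role, text in self.turns]
        lines.append(f"user: {new_user_text}")
        return "\n".join(lines)


history = ConversationHistory(max_turns=2)
history.add("user", "Hello")
history.add("assistant", "Hi! How can I help?")
prompt = history.as_prompt("What is LMDeploy?")
```

Bounding the history keeps the prompt within the model's context window (see `MAX_LENGTH` below) without the caller having to trim it manually.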
## Configuration Example

```yaml
env:
  MODEL_NAME: "internlm/internlm2-7b"
  MAX_LENGTH: 2048
  TEMPERATURE: 0.7
  TURBOMIND_CACHE_DIR: "./workspace"
```
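A minimal sketch of how the node might read and type-cast these environment variables (the variables are set inline here so the snippet is self-contained; in a dora dataflow they would come from the node's `env:` section):

```python
import os

# Simulate the `env:` section of the dataflow for this self-contained sketch.
os.environ["MODEL_NAME"] = "internlm/internlm2-7b"
os.environ["MAX_LENGTH"] = "2048"
os.environ["TEMPERATURE"] = "0.7"
os.environ["TURBOMIND_CACHE_DIR"] = "./workspace"

# Environment variables are always strings, so numeric settings must be cast.
model_name = os.environ["MODEL_NAME"]
max_length = int(os.environ["MAX_LENGTH"])      # generation token budget
temperature = float(os.environ["TEMPERATURE"])  # sampling temperature
cache_dir = os.environ["TURBOMIND_CACHE_DIR"]   # TurboMind workspace path
```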
## Testing Status

The node is in working condition; however, testing is pending because the current LMDeploy release is incompatible with my Mac. I am working on a solution to this issue.
## Checklist
- [x] Added tests
- [x] Updated documentation
- [ ] Example configurations included
@haixuanTao, LMDeploy doesn't support the macOS platform. How should I handle that? Can I make the node available for Windows and Linux only?
Please refer to this issue: https://github.com/InternLM/MindSearch/issues/46
Potential solution: https://github.com/InternLM/MindSearch/issues/46#issuecomment-2343880680
I think it would be better to implement this on a compatible OS; otherwise it's going to be very hard to debug.
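One possible way to handle the macOS limitation is to fail fast with a clear message rather than let users hit an opaque install or import error. This is a sketch only; the function names are illustrative, and the same effect could also be achieved at packaging time with an environment marker such as `lmdeploy; sys_platform != "darwin"`:

```python
import sys

# LMDeploy publishes wheels for Linux and Windows, not macOS.
SUPPORTED_PLATFORMS = ("linux", "win32")


def check_platform(platform: str) -> bool:
    """Return True if the dora-lmdeploy node can run on this platform."""
    return platform.startswith(SUPPORTED_PLATFORMS)


def require_supported_platform() -> None:
    # Fail fast on unsupported platforms with an actionable message.
    if not check_platform(sys.platform):
        raise RuntimeError(
            "dora-lmdeploy requires Linux or Windows; "
            f"LMDeploy does not support {sys.platform!r} (e.g. macOS)."
        )
```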