
Implement Cost Reduction Options for LLM Usage in OpenDevin

PierrunoYT opened this issue 8 months ago • 4 comments

What problem or use case are you trying to solve?

We are trying to reduce the costs associated with using Large Language Models (LLMs) in the OpenDevin project: optimizing LLM usage so that we avoid unnecessary expenses while maintaining the quality and efficiency of the system.

Describe the UX of the solution you'd like

The solution should be seamless and transparent to the end-users. Users should have the option to select cost-saving modes or configurations when initiating tasks that involve LLMs. This could be presented as a simple toggle or dropdown menu in the user interface where users can choose between different cost-saving strategies, such as using a less expensive model, limiting the number of tokens, or adjusting the model parameters.
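To make the tradeoffs concrete, each cost-saving mode could simply map to a preset bundle of parameters. A minimal sketch, where the preset names, models, and values are purely illustrative rather than existing OpenDevin settings:

```python
# Hypothetical presets a cost-saving toggle or dropdown could map to.
# Preset names, models, and values are illustrative, not existing OpenDevin settings.
COST_PRESETS = {
    "quality": {"model": "gpt-4o", "max_output_tokens": 4096, "temperature": 0.2},
    "economy": {"model": "gpt-3.5-turbo", "max_output_tokens": 1024, "temperature": 0.0},
}
```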

Do you have thoughts on the technical implementation?

Yes, here are some thoughts on the technical implementation:

  1. Adjust Configuration Parameters:

    • Modify the config.py file to set cost-conscious default values for parameters that drive spend: fewer retries, lower maximum input and output token counts, and tuned temperature and top_p values (see the config sketch after this list).
  2. Optimize Token Usage:

    • Implement a method in llm.py to truncate or summarize input text, reducing the number of tokens sent to the LLM (see the truncation sketch after this list).
  3. Use Cost-Efficient Models:

    • Implement a dynamic model selection mechanism in llm.py that chooses a model based on the task's complexity and cost constraints (see the routing sketch after this list).
  4. Monitor and Log Costs:

    • Continuously monitor the cost of LLM usage and log it for analysis. Implement alerts or thresholds that notify when spend exceeds a configured limit (see the tracking sketch after this list).
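
For (1), a minimal sketch of cost-conscious defaults. The field names below mirror common LiteLLM-style settings and may not match OpenDevin's actual config keys:

```python
# Sketch of cost-oriented LLM defaults for config.py.
# Field names follow common LiteLLM-style settings; actual OpenDevin keys may differ.
from dataclasses import dataclass

@dataclass
class LLMConfig:
    model: str = "gpt-3.5-turbo"   # cheaper default model
    num_retries: int = 2           # fewer retries -> fewer billed calls on failure
    max_input_tokens: int = 4096   # cap on prompt size
    max_output_tokens: int = 1024  # cap on completion size
    temperature: float = 0.0       # deterministic output reduces wasted re-runs
    top_p: float = 1.0
```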
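For (2), a truncation helper could count tokens with tiktoken and keep only the most recent context. The truncate_prompt() name and its defaults are illustrative, not an existing OpenDevin API:

```python
# Sketch of a token-budget helper for llm.py; truncate_prompt() is a hypothetical name.
import tiktoken

def truncate_prompt(text: str, model: str = "gpt-3.5-turbo", max_tokens: int = 4096) -> str:
    """Trim the prompt to max_tokens, keeping the most recent (trailing) context."""
    try:
        enc = tiktoken.encoding_for_model(model)
    except KeyError:
        enc = tiktoken.get_encoding("cl100k_base")  # fallback for unknown model names
    tokens = enc.encode(text)
    if len(tokens) <= max_tokens:
        return text
    return enc.decode(tokens[-max_tokens:])  # drop the oldest tokens first
```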
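For (3), routing could start with a crude complexity score and a threshold. estimate_complexity(), select_model(), and the model choices are assumptions for illustration, not an existing mechanism:

```python
# Sketch of complexity-based model routing; all names here are illustrative.
CHEAP_MODEL = "gpt-3.5-turbo"
STRONG_MODEL = "gpt-4o"

def estimate_complexity(task: str) -> float:
    """Crude proxy in [0, 1]: longer tasks and 'hard' keywords score higher."""
    score = min(len(task) / 2000.0, 0.7)
    if any(kw in task.lower() for kw in ("refactor", "debug", "architecture")):
        score += 0.3
    return min(score, 1.0)

def select_model(task: str, threshold: float = 0.6) -> str:
    """Route easy tasks to the cheap model and hard ones to the strong model."""
    return STRONG_MODEL if estimate_complexity(task) >= threshold else CHEAP_MODEL
```

A real implementation would want better signals (estimated token count, past failure rate), but even a simple threshold router avoids paying frontier-model prices for trivial steps.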
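For (4), since OpenDevin routes model calls through LiteLLM, per-call cost can be read with litellm.completion_cost(), which is a real LiteLLM helper; the CostTracker wrapper itself is hypothetical:

```python
# Sketch of a running cost tracker with a budget alert.
# litellm.completion_cost() is a real LiteLLM helper; CostTracker is hypothetical.
import logging
import litellm

logger = logging.getLogger(__name__)

class CostTracker:
    def __init__(self, budget_usd: float = 5.0):
        self.budget_usd = budget_usd
        self.total_usd = 0.0

    def record(self, response) -> None:
        """Add one completion's cost to the running total and warn past the budget."""
        cost = litellm.completion_cost(completion_response=response) or 0.0
        self.total_usd += cost
        logger.info("LLM call cost $%.4f (session total: $%.4f)", cost, self.total_usd)
        if self.total_usd > self.budget_usd:
            logger.warning(
                "LLM spend $%.2f exceeds budget $%.2f", self.total_usd, self.budget_usd
            )
```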

Describe alternatives you've considered

  1. Manual Cost Monitoring:

    • Manually monitor the costs and adjust the usage patterns based on periodic reviews. This approach is less efficient and prone to human error.
  2. Fixed Model Usage:

    • Always use a fixed, less expensive model regardless of the task complexity. This approach may compromise the quality of results for more complex tasks.
  3. Token Limitation Only:

    • Only implement token limitation without adjusting other parameters or model selection. This approach may not be as effective in reducing costs comprehensively.

Additional context

For more details on configuring LLMs and troubleshooting, refer to the project's documentation.

PierrunoYT · Jun 22 '24 20:06