experiment-tensorflow-lite
Assistance with Creating an LLM Wrapper Similar to Image Detection Example
Hello Tito,
I've been exploring your repository, particularly the example on image detection and classification using TensorFlow Lite. It's been incredibly helpful and informative. Inspired by it, I'm attempting to build a similar wrapper, but for Large Language Model (LLM) inference using the MediaPipe LLM Inference API in a Kivy application. The idea is to let the user type a prompt, run it through the Gemma 2B model, and display the generated text in the app.
Here's a brief outline of what I've done so far:
- Created an LLMWrapper class, analogous to the TFLWrapper in your example. The wrapper handles the setup and execution of the LLM inference task.
- Modified the main Kivy app code to use LLMWrapper instead of TFLWrapper (a schematic of that swap is just below).

I'm reaching out to ask whether you'd be open to adding a similar wrapper for LLM inference to your repository. It could be useful for others looking to integrate LLM functionality into their Kivy apps.
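For reference, this is a schematic of that swap in my app code. The module name, model path, and callback are placeholders from my own project, and I have left TFLWrapper's actual constructor arguments out since I only want to show where the change happens:

from llm_wrapper import LLMWrapper  # hypothetical module name holding the class shown below

def build_inference(on_result):
    # Previously: self.detector = TFLWrapper(...)   # image detector setup, arguments omitted
    # Now the app builds the LLM wrapper instead and passes a result callback.
    return LLMWrapper(
        model_path='/data/local/tmp/llm/gemma2b.bin',  # placeholder; wherever the model file sits on the device
        on_result=on_result,
    )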
Additionally, I would greatly appreciate it if you could take a look at the initial implementation of my LLMWrapper and the Kivy app example. Any insights, suggestions, or guidance you could provide would be invaluable. I'm particularly interested in ensuring that the wrapper is efficient and follows best practices similar to your TensorFlow Lite example.
Here is the code for the LLMWrapper; a sketch of how I wire it into the Kivy app follows after it:
from jnius import autoclass, cast, PythonJavaClass, java_method
from android.runnable import run_on_ui_thread
import threading

# Autoclass the necessary Java classes. The LLM Inference task lives in the
# genai.llminference package and its options type is a nested class, hence
# the '$' in the pyjnius name (this is my reading of the MediaPipe Android
# API docs; I have not verified it on a device yet).
LlmInference = autoclass('com.google.mediapipe.tasks.genai.llminference.LlmInference')
LlmInferenceOptions = autoclass('com.google.mediapipe.tasks.genai.llminference.LlmInference$LlmInferenceOptions')

# In a Kivy / python-for-android app the current activity is PythonActivity
PythonActivity = autoclass('org.kivy.android.PythonActivity')
context = PythonActivity.mActivity
class LLMWrapper(threading.Thread):
    """Background thread that owns the MediaPipe LLM Inference task and
    serves prompts handed to it via async_generate_response()."""

    def __init__(self, model_path, max_tokens=512, top_k=40,
                 temperature=0.8, random_seed=101, on_result=None):
        super().__init__()
        self.event = threading.Event()
        self.quit = False
        self.on_result = on_result
        self.async_running = False
        # NOTE: createFromOptions() loads the model, so this blocks whichever
        # thread constructs the wrapper until loading finishes.
        self.init_model(model_path, max_tokens, top_k, temperature, random_seed)

    def init_model(self, model_path, max_tokens, top_k, temperature, random_seed):
        options_builder = LlmInferenceOptions.builder()
        options_builder.setModelPath(model_path)
        options_builder.setMaxTokens(max_tokens)
        options_builder.setTopK(top_k)
        options_builder.setTemperature(temperature)
        options_builder.setRandomSeed(random_seed)
        options = options_builder.build()
        self.llm_inference = LlmInference.createFromOptions(context, options)

    def async_start(self):
        # Start the worker thread once; subsequent calls are no-ops.
        if self.async_running:
            return
        self.next_prompt = None
        self.daemon = True
        self.async_running = True
        self.start()

    def async_stop(self):
        self.quit = True

    def run(self):
        try:
            while not self.quit:
                # Event.wait() returns False on timeout and True once set,
                # so only fall through when a prompt has actually been posted.
                if not self.event.wait(0.5):
                    continue
                next_prompt, self.next_prompt = self.next_prompt, None
                self.event.clear()
                if next_prompt is None:
                    continue
                result = self.generate_response(next_prompt)
                if self.on_result:
                    # Called on this worker thread, not the Kivy main thread.
                    self.on_result(result)
        except Exception as e:
            print("Exception in LLMWrapper:", e)
            import traceback
            traceback.print_exc()

    def async_generate_response(self, prompt):
        # Queue a single prompt and wake the worker thread.
        self.async_start()
        self.next_prompt = prompt
        self.event.set()

    def generate_response(self, prompt):
        # Blocking call into the MediaPipe task; returns the generated text.
        return self.llm_inference.generateResponse(prompt)
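And here is a rough sketch of the Kivy side, assuming it lives in the same module as the wrapper above. The widget layout, model path, and handler names are just placeholders from my test app; the part I care most about is the Clock.schedule_once hop in on_llm_result, since on_result is invoked from the wrapper's worker thread rather than the Kivy main thread:

from kivy.app import App
from kivy.clock import Clock
from kivy.uix.boxlayout import BoxLayout
from kivy.uix.button import Button
from kivy.uix.label import Label
from kivy.uix.textinput import TextInput

class LLMDemoApp(App):
    def build(self):
        root = BoxLayout(orientation='vertical')
        self.prompt_input = TextInput(hint_text='Ask the model something', multiline=False)
        self.output_label = Label(text='')
        send_button = Button(text='Generate')
        send_button.bind(on_release=self.on_send)
        root.add_widget(self.prompt_input)
        root.add_widget(send_button)
        root.add_widget(self.output_label)
        # Placeholder path; model loading blocks here until createFromOptions returns.
        self.llm = LLMWrapper(model_path='/data/local/tmp/llm/gemma2b.bin',
                              on_result=self.on_llm_result)
        return root

    def on_send(self, *args):
        self.output_label.text = 'Generating...'
        # Hands the prompt to the worker thread and returns immediately.
        self.llm.async_generate_response(self.prompt_input.text)

    def on_llm_result(self, result):
        # Invoked on the LLMWrapper worker thread, so schedule the widget
        # update on the Kivy main thread instead of touching it directly.
        Clock.schedule_once(lambda dt: setattr(self.output_label, 'text', result))

    def on_stop(self):
        self.llm.async_stop()

if __name__ == '__main__':
    LLMDemoApp().run()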
Thank you for considering my request, and for the fantastic resources you've provided in your repository. I'm looking forward to potentially contributing and learning from your expertise.

Regards,
Shak Shat