Farook (EDev) Al-Sammarraie

Results: 33 comments by Farook (EDev) Al-Sammarraie

@Mostelk I've cleaned up and formatted the description a bit. Please let me know if I missed or misplaced anything.
## LLM Benchmark for Android *(4 weeks)*
### Dataset
-...

I've looked into how resources are acquired and cached. My findings are as follows:
## Requirements
As far as I can tell, and with the most liberal interpretation, the requirements...

# Update
## Conversion Memory Requirements
The memory requirements for conversion seem to scale linearly: 1B requires `x` RAM (30 GiB on average), 3B requires `3x` (90 GiB) and...
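The linear-scaling estimate above can be written out as a minimal sketch. It assumes the roughly 30 GiB per billion parameters quoted in the comment and that the trend continues for larger models, which is not verified here; the function and constant names are hypothetical and only illustrate the arithmetic.

```python
# Assumption: ~30 GiB of RAM per billion parameters, the average figure
# quoted in the comment for the 1B model. Not a measured constant.
GIB_PER_BILLION_PARAMS = 30.0


def estimated_conversion_ram_gib(params_billion: float) -> float:
    """Linearly extrapolate conversion RAM (GiB) from the model size."""
    if params_billion <= 0:
        raise ValueError("params_billion must be positive")
    return GIB_PER_BILLION_PARAMS * params_billion


if __name__ == "__main__":
    for size in (1, 3, 8):
        ram = estimated_conversion_ram_gib(size)
        print(f"{size}B -> ~{ram:.0f} GiB (linear estimate)")
```

This is an upper-level estimate only; actual usage depends on the converter configuration and the machine.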

> I was able to convert 3B on colab with 56 GiB (CPU instance + high RAM). Thus I guess you over-estimate the memory requirements. My impression is that 64...

I was able to run ai-edge-torch's converter for Llama 3.1 8B. The keyword here is *run*, because I still ran out of memory, but I believe the changed config will...

@freedomtan @Mostelk I could rent a server and run the conversion script; it'll cost less than $5 and should produce an 8B `.tflite` model. LMK if you'd like me to...

Looking at [loadgen's documentation](https://github.com/mlcommons/inference/tree/master/loadgen#upstream-all-local-modifications), they seem to require any changes to be upstreamed as a rule. Does having our own branch satisfy this rule? If not, then I don't think...

@anhappdev The current loadgen version we use already has token latencies, so we should be able to complete the LLM Android implementation without the need for an upgrade. In the same...

I'm not sure where to report this, but there was a question some weeks back about whether or not LoadGen excludes TTFT from TPS counting. I found [this line](https://github.com/mlcommons/inference/blob/27db0530255bb8e72fd7fdffaf0f0005b1f6461a/loadgen/logging.cc#L475-L476) which...
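To make the question concrete, here is a minimal sketch of the two candidate definitions of tokens per second (TPS): one that counts the time to first token (TTFT) and one that excludes it. The function names and the timings are hypothetical, and the code does not reproduce LoadGen's implementation; it only shows how far the two numbers can differ.

```python
def tps_including_ttft(n_tokens: int, total_latency_s: float) -> float:
    """All generated tokens divided by the full request latency."""
    if n_tokens < 1 or total_latency_s <= 0:
        raise ValueError("need at least one token and a positive latency")
    return n_tokens / total_latency_s


def tps_excluding_ttft(n_tokens: int, ttft_s: float, total_latency_s: float) -> float:
    """Tokens after the first, divided by the time spent after the first token."""
    decode_time_s = total_latency_s - ttft_s
    if n_tokens < 2 or decode_time_s <= 0:
        raise ValueError("need at least two tokens and total latency > TTFT")
    return (n_tokens - 1) / decode_time_s


if __name__ == "__main__":
    # Made-up example: 101 tokens, first token after 2 s, request done at 12 s.
    print(tps_including_ttft(101, 12.0))
    print(tps_excluding_ttft(101, 2.0, 12.0))
```

With a long prompt and a short completion, TTFT dominates the total latency, so the choice of definition matters most in exactly that case.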

This probably conflicts with #1040 to some extent. Whichever is merged first, the other will need master merged into it, along with some adjustments and merge-conflict resolution....