Niq Dudfield

Results: 309 comments of Niq Dudfield

Apparently the embeddings don't use the entire weights, so maybe there's a way. I'm very fuzzy on how those are created. I patched the proxy server to allow CORS, but...

```
const vectorStore = await MemoryVectorStore.fromDocuments(
  documents,
  new OllamaEmbeddings({
    baseUrl: OLLAMA_BASE_URL,
    model: OLLAMA_MODEL,
  }),
);
```

I think fromDocuments/OllamaEmbeddings runs serially anyway, so it may need some wrapper beyond the server...
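If the embedding calls really do run one at a time, the "wrapper" mentioned above could be a concurrency-limited map: fire several requests at once, but cap how many are in flight. This is a generic sketch, not from the original; `embedOne` below is a hypothetical stand-in for whatever actually POSTs to `/api/embeddings`.

```javascript
// Run fn over items with at most `limit` calls in flight at once,
// preserving input order in the results array.
async function mapWithConcurrency(items, limit, fn) {
  const results = new Array(items.length);
  let next = 0;
  async function worker() {
    while (next < items.length) {
      const i = next++; // claim an index synchronously, then await
      results[i] = await fn(items[i], i);
    }
  }
  const workers = Array.from(
    { length: Math.min(limit, items.length) },
    worker,
  );
  await Promise.all(workers);
  return results;
}

// Example with a fake per-string embedder (real one would hit the server).
const embedOne = async (s) => [s.length];
mapWithConcurrency(['a', 'bb', 'ccc'], 2, embedOne).then((vecs) => {
  // vecs preserves input order regardless of which request finished first
});
```

The worker-pool shape keeps the server from being flooded: with `limit` workers, a new request starts only when a previous one resolves.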

```
[GIN] 2024/01/31 - 10:15:31 | 200 | 1.460126417s | 127.0.0.1 | POST "/api/embeddings"
127.0.0.1 - - [31/Jan/2024 10:15:31] "POST /api/embeddings HTTP/1.1" 200 -
127.0.0.1 - - [31/Jan/2024 10:15:31] "POST...
```

Probably worth investigating yourself. I really like the magic / "just works" aspect of using RAG/embeddings, but it would be nice if it were a bit faster /somehow/

@andrewnguonly

> If there's a comparable Wikipedia article,
> how large is the content in your testing

I was kind of using random pages

> it seems to be constant...

I hacked the OllamaEmbeddings class (just the compiled code in node_modules):

```javascript
async _embed(strings) {
  console.log('hack is working!!')
  const embeddings = [];
  for await (const prompt of strings) {
    const...
```
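Rather than editing compiled files in node_modules, the same override could live in a subclass. The sketch below is self-contained: a stub stands in for the real OllamaEmbeddings (whose internal per-prompt request method may be named differently; `_request` here is a placeholder), and the subclass replaces the serial `for await` loop with parallel requests.

```javascript
// Stub mirroring the shape of the hacked class above; NOT the real library.
class OllamaEmbeddingsStub {
  async _request(prompt) {
    // Stand-in for the POST to /api/embeddings.
    return [prompt.length];
  }
  async _embed(strings) {
    const embeddings = [];
    for await (const prompt of strings) {
      embeddings.push(await this._request(prompt)); // serial, as in the hack
    }
    return embeddings;
  }
}

// Subclass override: all prompts in flight at once. In practice you'd
// likely want a concurrency cap so the server isn't overwhelmed.
class ParallelEmbeddings extends OllamaEmbeddingsStub {
  async _embed(strings) {
    return Promise.all(strings.map((prompt) => this._request(prompt)));
  }
}
```

The advantage over patching node_modules is that the change survives a reinstall and can be passed to `MemoryVectorStore.fromDocuments` in place of the original class.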

Maybe we should start a branch if we want to look at this seriously, but anyway, here are some more of the artifacts of my earlier investigations. The ini file I used:

```
[DefaultServer]...
```

> how large is the content in your testing

Enough that there were a lot of embedding requests anyway. This was one of the pages: https://news.ycombinator.com/item?id=39197619 But I suspect it's...

I would try hacking a separate pool of servers just for the embeddings, with the proxy running on a non-default port. I'm still not sure WHEN the full model weights...
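The "separate pool of servers" idea could be as simple as round-robin over a list of base URLs, one per embedding-only instance. The ports below are hypothetical; each would be a separate ollama (or proxy) process:

```javascript
// Round-robin picker over a pool of embedding-only servers.
function makeRoundRobin(baseUrls) {
  let i = 0;
  return () => baseUrls[i++ % baseUrls.length];
}

// Hypothetical pool: three instances on non-default ports.
const nextServer = makeRoundRobin([
  'http://127.0.0.1:11435',
  'http://127.0.0.1:11436',
  'http://127.0.0.1:11437',
]);

// Each embedding request would then use nextServer() as its baseUrl.
```

Combined with a parallel `_embed`, this spreads simultaneous requests across instances instead of queuing them all on one.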

In any case, it didn't seem to help much in the big picture. Maybe you can tweak the threading settings for each ollama instance or something.