llama-node
Code only using 4 CPUs when I have 16 CPUs
This is the code that I am using:
```js
import { RetrievalQAChain } from 'langchain/chains';
import { HNSWLib } from "langchain/vectorstores";
import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter';
import { LLamaEmbeddings } from "llama-node/dist/extensions/langchain.js";
import { LLM } from "llama-node";
import { LLamaCpp } from "llama-node/dist/llm/llama-cpp.js";
import * as fs from 'fs';
import * as path from 'path';

const txtFilename = "TrainData";
const txtPath = `./${txtFilename}.txt`;
const VECTOR_STORE_PATH = `${txtFilename}.index`;

const model = path.resolve(process.cwd(), './h2ogptq-oasst1-512-30B.ggml.q5_1.bin');
const llama = new LLM(LLamaCpp);

const config = {
  path: model,
  enableLogging: true,
  nCtx: 1024,
  nParts: -1,
  seed: 0,
  f16Kv: false,
  logitsAll: false,
  vocabOnly: false,
  useMlock: false,
  embedding: true,
  useMmap: true,
};

var vectorStore;

const run = async () => {
  await llama.load(config);

  if (fs.existsSync(VECTOR_STORE_PATH)) {
    // Reuse the previously saved index.
    console.log('Vector Exists..');
    vectorStore = await HNSWLib.fromExistingIndex(VECTOR_STORE_PATH, new LLamaEmbeddings({ maxConcurrency: 1 }, llama));
  } else {
    // Split the text file into chunks and embed them into a new index.
    console.log('Creating Documents');
    const text = fs.readFileSync(txtPath, 'utf8');
    const textSplitter = new RecursiveCharacterTextSplitter({ chunkSize: 1000 });
    const docs = await textSplitter.createDocuments([text]);
    console.log('Creating Vector');
    vectorStore = await HNSWLib.fromDocuments(docs, new LLamaEmbeddings({ maxConcurrency: 1 }, llama));
    await vectorStore.save(VECTOR_STORE_PATH);
  }

  console.log('Testing Vector via Similarity Search');
  const resultOne = await vectorStore.similaritySearch("what is a template", 1);
  console.log(resultOne);

  console.log('Testing Vector via RetrievalQAChain');
  const chain = RetrievalQAChain.fromLLM(llama, vectorStore.asRetriever());
  const res = await chain.call({
    query: "what is a template",
  });
  console.log({ res });
};

run();
```
It only uses 4 CPUs during the `vectorStore = await HNSWLib.fromDocuments(docs, new LLamaEmbeddings({maxConcurrency: 1}, llama));` step.
Can we change anything so that it uses more than 4 CPUs?
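For context, the only thread-related knob that shows up in the llama-node examples is an `nThreads` field on the per-call invocation params. Below is a minimal sketch, assuming `getEmbedding` accepts the same invocation params as `createCompletion` (including `nThreads`); it is not confirmed here whether the `LLamaEmbeddings` wrapper used above passes such a setting through.

```js
// Sketch only: nThreads is assumed to control how many CPU threads a single
// llama.cpp call uses; it is a different knob from langchain's maxConcurrency.
const embeddingParams = {
  nThreads: 16,        // assumed per-call thread count
  nTokPredict: 1024,
  topK: 40,
  topP: 0.1,
  temp: 0.2,
  repeatPenalty: 1,
  prompt: "what is a template",
};

// Assumes the loaded instance exposes getEmbedding(params); the LLamaEmbeddings
// extension may use its own default params instead of these.
llama.getEmbedding(embeddingParams).then((embedding) => {
  console.log(embedding.length);
});
```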
Not yet. llama.cpp does not seem to support parallel inference at the moment. I may look into other ways to do this (e.g. implementing a round robin at the Rust level).
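For illustration, a rough JS-level sketch of that round-robin idea (the real thing would live at the Rust level): keep several independently loaded model instances and rotate requests across them. This is a sketch only; it assumes each instance can serve calls on its own, ignores the cost of holding several copies of the model in memory, and the `embedOne` callback is a placeholder for whatever per-instance embedding call would actually be used.

```js
import { LLM } from "llama-node";
import { LLamaCpp } from "llama-node/dist/llm/llama-cpp.js";

// Hypothetical pool size: each slot is a separately loaded model instance.
const POOL_SIZE = 4;

const createPool = async (config) => {
  const instances = [];
  for (let i = 0; i < POOL_SIZE; i++) {
    const instance = new LLM(LLamaCpp);
    await instance.load(config); // each instance holds its own copy of the model
    instances.push(instance);
  }

  let next = 0;
  return {
    // Round robin: each call is handed to the next instance in the pool.
    acquire() {
      const instance = instances[next];
      next = (next + 1) % instances.length;
      return instance;
    },
  };
};

// embedOne(instance, text) is a placeholder for a per-instance embedding call;
// with a pool, up to POOL_SIZE texts could be embedded at the same time.
const embedAll = (pool, texts, embedOne) =>
  Promise.all(texts.map((text) => embedOne(pool.acquire(), text)));
```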
@hlhr202 any updates on that?
@hlhr202 u gotta hire us when you make it big :) LGTM