
Results 47 comments of wizd

Emulate the OpenAI text API, so tons of apps could support llama without any changes.
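A minimal sketch of the idea: if the local llama server accepts the same JSON shape as the OpenAI completions API, existing clients only need their base URL changed. The `model` name and field values below are illustrative assumptions, not from any particular server.

```typescript
// Sketch: build a request body in the OpenAI text-completion shape.
// An app that already speaks this format would work against a llama
// server emulating the same route, e.g. POST /v1/completions.
type CompletionRequest = {
  model: string;
  prompt: string;
  max_tokens: number;
  temperature: number;
};

function buildCompletionRequest(prompt: string): CompletionRequest {
  // Same field names the OpenAI text API expects, so clients need no code changes.
  return { model: "llama", prompt, max_tokens: 128, temperature: 0.7 };
}

const req = buildCompletionRequest("Hello");
console.log(JSON.stringify(req));
```

The app-side change is then just pointing the client at the local server's URL instead of api.openai.com.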

> Could this be related to the phenomenon of the so-called [glitch tokens](https://www.lesswrong.com/posts/8viQEp8KBg2QSW4Yc/solidgoldmagikarp-iii-glitch-token-archaeology) ? The research on those has been focused on GPT-3 and I've yet to find any information...

Hahaha, nice one. What's the temperature? Shouldn't it be set to 1 in this case?

> If you run on a single GPU does it generate valid responses? > > Can you share the server log so we can see which GPU(s) it's attaching to?...

This is the log with the 2 7900 XTX cards selected, with OLLAMA_DEBUG=1 set:

```
$ docker logs -f ollama
time=2024-03-24T14:39:32.931Z level=INFO source=images.go:806 msg="total blobs: 60"
time=2024-03-24T14:39:32.931Z level=INFO source=images.go:813 msg="total unused blobs removed:...
```

Use a creator function with parameters:

```
function createContactSaver(veid: string, ssoid: string) {
  const paramsSchema = z.object({
    party: z.string(),
    name: z.string().optional(),
    phoneNumber: z.string()
  })
  const name = 'saveContact';
  const description...
```

Sorry, but in fact we can get free context from the closures JavaScript provides:

```
createContactSaver = () => {
  const paramsSchema = z.object({
    contactName: z.string(),
    phoneNumber: z.string().optional(),
    party: z.string().optional(),...
```
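To make the closure point concrete, here is a minimal self-contained sketch without zod. The parameter names `veid`/`ssoid` are placeholders for per-session context; the returned function uses them without receiving them as arguments, because the closure captures them.

```typescript
// Sketch: the factory captures per-session values in a closure, so the
// returned tool function gets that context "for free".
function createContactSaver(veid: string, ssoid: string) {
  const name = "saveContact";
  // execute closes over veid and ssoid; callers never pass them in.
  const execute = (contactName: string, phoneNumber: string) =>
    `${name}: saved ${contactName} (${phoneNumber}) for veid=${veid}, ssoid=${ssoid}`;
  return { name, execute };
}

const saver = createContactSaver("v-123", "s-456");
console.log(saver.execute("Alice", "555-0100"));
```

Each call to the factory produces an independent function bound to its own session values, which is what makes this pattern convenient for per-user tool definitions.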

Use a proxy? https://github.com/songquanpeng/one-api

Sure, it's a good idea to hide the loooong procedure, which doesn't make much sense to users.

I agree. In today's world, a single CPU core is not particularly significant. However, for those who are unfortunate enough to have an Intel chip, it's a completely different story:...