Results 131 comments of setzer22

Hi @hlhr202! :wave: Thanks for bringing this to our attention. The code here doesn't look hard at all to port! We will add it to the repo since it makes...

Please check out #72. I implemented some code to extract embeddings, but we still need to validate if the results are correct, and how to best expose this to our...

This sounds like a good idea. But using signal interceptors like Ctrl-C or Ctrl-D and changing their semantics feels like a bit of a hack. I wonder if we could...

I already addressed the review feedback and removed the ad-hoc test code. So I take it a good plan now would be to merge this as-is and have embedding extraction...

> Is it a lot of data?

It is quite a lot of data for comfortably printing to stdout. It's 4096 floats per token. Not that it wouldn't work, but...
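To put the 4096-floats-per-token figure in perspective, here is a rough back-of-the-envelope sketch. It assumes each float is a 32-bit `f32`, which the comment does not state; `embedding_bytes` is a hypothetical helper written for illustration and is not part of the project's API.

```rust
/// Embedding width quoted in the comment (4096 floats per token).
const EMBEDDING_DIM: usize = 4096;

/// Raw size in bytes of the embeddings for `n_tokens` tokens.
/// Assumption: each float is stored as a 32-bit `f32`.
fn embedding_bytes(n_tokens: usize) -> usize {
    n_tokens * EMBEDDING_DIM * std::mem::size_of::<f32>()
}

fn main() {
    // Print the float count and raw byte size for a few prompt lengths.
    for n_tokens in [1usize, 32, 512] {
        println!(
            "{} tokens -> {} floats, {} bytes",
            n_tokens,
            n_tokens * EMBEDDING_DIM,
            embedding_bytes(n_tokens)
        );
    }
}
```

Under that assumption a single token is already 16 KiB of raw data, and a textual dump to stdout would likely be several times larger.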

I'm open to adding a way for the CLI to output embeddings if people find this is an interesting use case. The main blocker here is that the use case...

I'm a bit confused about this change. Does it increase quality? Because from what you're reporting, it seems to increase memory use *and* increase inference time. Probably needs some more...

> As far as I could tell, it was deterministic with a seed specified from the testing I did

Yup, we're just being a bit careful with promising determinism overall,...

I've been doing some tests, but it's hard to measure if inference speed has gotten slower with the code change because different prompts can make inference speed vary by up...

I couldn't notice any performance differences in my tests either, so I'd say we can merge as is. No need to put it behind a flag.

> I guess it's...