Sonny

Results: 20 comments by Sonny

Someone else had the same issue, which was solved here: https://discord.com/channels/1164200432894234644/1164200433779212400/1202511250760798318

The CUDA error "unsupported toolchain" suggests that the PTX (Parallel Thread Execution) code used by CUDA for kernels was compiled with a version of the toolchain (compiler, linkers, etc...

You are asking the wrong question. Running on multiple GPUs is a totally different ballgame from increasing the layer offloading....

Work in progress. Please test and report back: https://github.com/imartinez/privateGPT/issues/1521#issuecomment-1963085662

I have been trying to fix the embedding component for a couple of days now and am still getting the runtime error when using PyTorch DataParallel to use more than 1 GPU...

Answer: I couldn't find a workaround, hence I am working on a fix. We thought we had it here, but we might have an issue, since during some ingestions we got "Segmentation...

Hey! I hope you all had a great weekend. Can you please try out this code, which uses "DistributedDataParallel" instead? I cannot test it on my own. We took...

Thanks for confirming, @bharrison-it. @imartinez, I will set up a PR for this as soon as I get verification from Jeff as well. PyTorch "DataParallel" is unstable. The best approach...

Adding my changes: [changes_summarize_service.txt](https://github.com/user-attachments/files/17946492/changes_summarize_service.txt) [changes_ui.txt](https://github.com/user-attachments/files/17946495/changes_ui.txt)

So, the SummarizeService is retrieved from the request state using SummarizeService = request.state.injector.get(SummarizeService), and the streaming response uses to_openai_stream to convert the response to an SSE stream. The issue...
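
For anyone unfamiliar with what "convert the response to an SSE stream" means, here is a rough sketch of the general idea. It is not the project's actual to_openai_stream; the helper name to_sse_stream and the payload shape are assumptions made up for illustration, loosely modelled on OpenAI-style streaming chunks.

```python
import json
from typing import Iterable, Iterator


def to_sse_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Hypothetical helper (not the project's to_openai_stream):
    wrap plain text chunks as server-sent events."""
    for chunk in chunks:
        # Assumed payload shape, loosely following OpenAI-style deltas.
        payload = {"choices": [{"delta": {"content": chunk}}]}
        # An SSE event is a "data:" line terminated by a blank line.
        yield f"data: {json.dumps(payload)}\n\n"
    # OpenAI-style streams conventionally end with a [DONE] sentinel.
    yield "data: [DONE]\n\n"


if __name__ == "__main__":
    for event in to_sse_stream(["Hello", " world"]):
        print(event, end="")
```

In a FastAPI endpoint, a generator like this would typically be handed to StreamingResponse with media_type="text/event-stream", so the client receives each chunk as soon as it is produced.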