Adrien B comments

Results: 9 comments of Adrien B

Out of memory when the meta optimizer updates parameters

I have more or less the same issue. On the line of code `flat_params = self.f * flat_params - self.i * Variable(flat_grads)`, my computer takes a lot of time (making the...

Out of memory when the meta optimizer updates parameters

Never mind, that was not the problem. The problem was most likely a version change in PyTorch, so the operation `flat_params = self.f * flat_params - self.i * Variable(flat_grads)` produces a 25450*25450...
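The comment is cut off before it explains why the result is 25450*25450. One plausible cause, and only an assumption here, is broadcasting: if one operand has shape (N, 1) and the other has shape (N,), the elementwise result has shape (N, N). The sketch below uses plain Python to work out the broadcast shape and the memory a float32 result would need. The helper `broadcast_shape` and the example shapes are hypothetical and are not taken from the project's code.

```python
from itertools import zip_longest
from math import prod


def broadcast_shape(a, b):
    """Shape produced by NumPy/PyTorch-style broadcasting of shapes a and b."""
    out = []
    # Compare dimensions from the trailing end, padding the shorter shape with 1s.
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and 1 not in (x, y):
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
        out.append(y if x == 1 else x)
    return tuple(reversed(out))


n = 25450                     # size mentioned in the comment
column_shape = (n, 1)         # assumed shape of one operand (hypothetical)
flat_shape = (n,)             # assumed shape of the other operand (hypothetical)

result_shape = broadcast_shape(column_shape, flat_shape)
bytes_needed = prod(result_shape) * 4   # 4 bytes per float32 element

print(result_shape)                      # shape of the broadcast result
print(f"{bytes_needed / 1e9:.2f} GB")    # memory for a single float32 result
```

If this is indeed the cause, making both operands the same shape before the update (for example by flattening them to (N,)) keeps the result at N elements instead of N*N.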

Packaging and pypi ?

Perfect! I will try to be a worthy maintainer ^^

How to use this model for point cloud?

You can try: `model = Perceiver(input_channels = C, input_axis = 1, etc)`, where `input_channels` is the number of channels for each token of the input. It will assume you have only...

Allow cancellation of prediction while running prompt

Is it really up to OA to do this? It looks like OA depends heavily on https://github.com/huggingface/text-generation-inference for inference. If we want to have a proper...

Issue with logs when using torch.compile

@carmocca Yeah, this is exactly what I did! Simple workaround.

Create test cases (possibly even synthetic data for training) from ChatGPT failure examples (see repo)

Do you think it would be useful to gather data on OA failures too? (Scraping Discord "bad-message-ids"?)

Question about the architecture (graphTransformer)

I am running some experiments on my own graph dataset. Your implementation seems to be more performant than the standard graph transformer (at least the one I tried from DGL...

Direct Policy Optimization

@Reichenbachian I think there is currently a version of DPO under review in the TRL lib, if you want to check: https://github.com/lvwerra/trl/pull/416/files#diff-5bbdb5d54108f2162b47bc54dc23c7b8e7744d2941118e60a44c161a4acc0ee8