Gianfranco Demarco
The problem comes from the fact that all of the encoded predictions are kept in memory, so as more predictions are made, more RAM is needed. What you can do...
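A minimal sketch of one way to avoid this growth, not necessarily the approach the author goes on to describe: decode each batch as soon as it is predicted and append the text to a file, so only one batch of encoded predictions is held at a time. The names `stream_predictions`, `predict_batch`, `decode`, and the toy stand-ins are hypothetical and are not taken from the project's script.

```python
import json
import os
import tempfile
from typing import Callable, Iterable, List, Sequence


def stream_predictions(
    batches: Iterable[Sequence[str]],
    predict_batch: Callable[[Sequence[str]], List[List[int]]],
    decode: Callable[[List[int]], str],
    out_path: str,
) -> int:
    """Decode every batch right away and append it to a JSONL file.

    Encoded predictions are never accumulated in a list, so memory use
    stays roughly constant instead of growing with each prediction.
    """
    written = 0
    with open(out_path, "w", encoding="utf-8") as out_file:
        for batch in batches:
            # Hypothetical model call: returns token ids for this batch only.
            encoded = predict_batch(batch)
            for source, token_ids in zip(batch, encoded):
                record = {"input": source, "prediction": decode(token_ids)}
                out_file.write(json.dumps(record) + "\n")
                written += 1
            # `encoded` is rebound on the next iteration, so the previous
            # batch becomes eligible for garbage collection.
    return written


def toy_predict(batch: Sequence[str]) -> List[List[int]]:
    # Stand-in for a real model: "encodes" upper-cased text as code points.
    return [[ord(ch) for ch in text.upper()] for text in batch]


def toy_decode(token_ids: List[int]) -> str:
    # Stand-in for a tokenizer's decode step.
    return "".join(chr(i) for i in token_ids)
```

With a real model, `predict_batch` would wrap the generation call and `decode` would wrap the tokenizer; the point is only that nothing encoded outlives its batch.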
@zhenghao977 I've provided an outline you can follow to modify the script. Otherwise, I have the implementation in my fork of the project (even if I don't like to advertise it here...)....
@thomascong121 @WayneWong97 Our version is [here](https://github.com/gianfrancodemarco/mm-cot/blob/8e2f0eb24fa2e3c460031fe408ed4d8c7d655cb8/src/runner/chain_of_thought.py#L102)