feat: Physics verifier
Description
Implement PhysicsVerifier, which inherits from PythonVerifier and handles expressions, unit comparison, and unit conversion in both the solution and the reference answer.
Features:
- Evaluate expressions if needed
- Correctly parse and convert units
- If the solution's unit doesn't match the ground-truth unit, convert units so the two values can be compared (e.g. 1km vs 1000m, 200000 vs 2*10^5)
- Adjust the relative tolerance when comparing numerical results if needed (the default tolerance is 0.01, but the ground truth is sometimes given with lower precision, e.g. when the ground truth is 1.3e+02, answers within 0.05e+02 of it should be accepted)
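To make the unit and tolerance rules above concrete, here is a minimal sketch. It is not the code in this PR: the `TO_BASE` table and the names `to_base`, `implied_abs_tolerance`, and `answers_match` are hypothetical, and it assumes the reference answer is available as the original string so its written precision can be read off.

```python
import math
from decimal import Decimal

# Hypothetical conversion table: unit -> (base unit, factor to the base unit).
TO_BASE = {
    "m": ("m", 1.0), "km": ("m", 1000.0), "cm": ("m", 0.01),
    "s": ("s", 1.0), "ms": ("s", 0.001),
}


def to_base(value: float, unit: str) -> tuple[float, str]:
    """Convert a value to its base unit, e.g. (1, 'km') -> (1000.0, 'm')."""
    base_unit, factor = TO_BASE[unit]
    return value * factor, base_unit


def implied_abs_tolerance(reference: str) -> float:
    """Half a unit in the last digit written in the reference string."""
    exponent = Decimal(reference).as_tuple().exponent
    if not isinstance(exponent, int):  # NaN or infinity has no digit position
        raise ValueError(f"non-finite reference: {reference!r}")
    return 0.5 * 10.0 ** exponent


def answers_match(sol_value: float, sol_unit: str,
                  ref_text: str, ref_unit: str,
                  rel_tol: float = 0.01) -> bool:
    """Compare in base units, widening the tolerance to the reference's precision."""
    sol_base, sol_dim = to_base(sol_value, sol_unit)
    ref_base, ref_dim = to_base(float(ref_text), ref_unit)
    if sol_dim != ref_dim:
        return False  # e.g. metres vs seconds can never match
    # Scale the implied tolerance into base units as well.
    abs_tol = implied_abs_tolerance(ref_text) * TO_BASE[ref_unit][1]
    return math.isclose(sol_base, ref_base, rel_tol=rel_tol, abs_tol=abs_tol)


print(answers_match(1000.0, "m", "1", "km"))      # same length, different units
print(answers_match(134.0, "m", "1.3e+02", "m"))  # inside 0.05e+02 of the reference
print(answers_match(136.0, "m", "1.3e+02", "m"))  # outside both tolerances
```

Note that under this rule a reference written as `1` km implies a tolerance of 0.5 km, so coarse references accept a wide range by design.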
There is already float tolerance in the Python verifier (https://github.com/camel-ai/camel/pull/2133). Can we reuse it, or make it modular so different verifiers can share it? @GitHoobar @Ebony59 @hallerite
Not sure whether that makes sense. PythonVerifier also does float matching for different python objects, like sets, lists and dicts. I don't see a lot of code duplication if it is implemented for each verifier. Should be mostly 1-2 lines of code.
It's not just about code duplication; it's also about maintenance. It is better to have a single abstraction dealing with float tolerance, since it will be used in all the different verifiers: python, math, physics, bio and so on. We don’t want to implement it all over the place.
In that case I would still implement it normally in the Python verifier and build one abstraction for the rest, because the Python verifier is really different.
Maybe we can extract the comparison logic (float comparison and expression comparison) from PythonVerifier (make a PythonComparitor class or something like that), and these can be shared across different domains.
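A rough sketch of what such a shared comparison helper could look like, just to make the idea concrete. The class name `FloatComparator` and its fields are hypothetical and not taken from PythonVerifier. It only covers numbers, lists/tuples, and dicts; sets and everything else fall back to plain `==`.

```python
import math
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class FloatComparator:
    """Hypothetical shared helper: tolerance-aware equality for nested values."""

    rel_tol: float = 0.01
    abs_tol: float = 0.0

    def equal(self, a: Any, b: Any) -> bool:
        # bool is a subclass of int, so handle it before the numeric branch.
        if isinstance(a, bool) or isinstance(b, bool):
            return a == b
        if isinstance(a, (int, float)) and isinstance(b, (int, float)):
            return math.isclose(a, b, rel_tol=self.rel_tol, abs_tol=self.abs_tol)
        if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
            return len(a) == len(b) and all(
                self.equal(x, y) for x, y in zip(a, b)
            )
        if isinstance(a, dict) and isinstance(b, dict):
            return a.keys() == b.keys() and all(
                self.equal(a[k], b[k]) for k in a
            )
        # Anything else (strings, sets, None, ...) uses exact equality.
        return a == b


cmp = FloatComparator()
print(cmp.equal([1.0, {"x": 2.0}], [1.001, {"x": 2.01}]))
print(cmp.equal(1.0, 1.5))
```

Each verifier could then hold one comparator instance and pass in its own tolerances, so the tolerance logic lives in one place.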