lmql
Comparison with Guidance?
I appreciate your effort in making LMQL. I've played with it a bit and found that many of its concepts are similar to what Guidance does. But I'm not tech-savvy enough to know the technical differences between the two. Before starting my project with one of these frameworks, I wanted to ask whether you could provide a comparison of the two. I think that would clarify the strengths of each framework and help people like me make the right decision.
Thank you!
This is a good question; LMQL and Guidance do have some similarities. We agree that a deeper comparison would be useful and will add a corresponding chapter to our documentation. I will keep this issue open until then.
Hey, how about trying out a comparison between the two based on some of the simple BigBench tasks, maybe the anachronism task itself? I think Guidance claims to perform better than the plain few-shot setup without chain-of-thought output structure.
If possible, I don't mind working on this issue as well.
Please feel free to have a look :). From an LLM performance perspective, though, the implementation/tooling level will not really matter: as long as the concrete tokens/constraints you give to the model are the same, the results will be the same. So I would expect the same gains as in the Guidance experiments.
Still, by using the different capabilities of LMQL (e.g., scripting and constraining) in the right way, we were able to show significant gains on a wide range of benchmarks across several papers we are currently working on. The experimental evaluation will of course be published in due time.
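To make the scripting/constraining point above concrete, here is a minimal sketch of what a classification query with output constraints might look like in LMQL's query syntax (the model name and the exact prompt wording are illustrative assumptions, not taken from this thread):

```lmql
# minimal sketch: decode with argmax, constrain the answer to a fixed label set
argmax
    "Text: The knight checked his wristwatch before the joust.\n"
    "Q: Does this sentence contain an anachronism?\n"
    "A: [ANSWER]"
from
    "openai/text-davinci-003"  # illustrative model name
where
    ANSWER in ["Yes", "No"]    # constraint: only these tokens are allowed
```

The point of the reply above is that, given identical tokens and an equivalent constraint (here, restricting `ANSWER` to `"Yes"`/`"No"`), any framework enforcing it should yield the same model behavior; the frameworks differ in how such constraints are expressed and executed, not in the resulting distribution.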
We have added a comprehensive comparison with Guidance to the documentation. Please feel free to comment here with any additional questions that come up, so we can extend it.
https://docs.lmql.ai/en/latest/python/comparison.html
Thanks! This was informative. I admire the way you are working on LMQL; it feels like a serious package, with comprehensive documentation and examples. I hope it gets more traction from the community.