EduNLP
[Feature] Optimize Tokenization, including multi-mode problems, Parser and Formula optimization
Description
- Handle multi-mode problems
  - AST Graph
  - Image
- Handle noise when identifying $...$ spans in the Parser (better rules are needed)
- Handle Formula AST problems when identifying $AB=BC$ and $123$ (consider preprocessing)
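
To make the Parser and Formula items more concrete, here is a minimal sketch of what stricter $...$ rules plus a preprocessing step might look like. It is not EduNLP's existing API: `split_segments`, `FORMULA_RE` and `NUMBER_RE` are hypothetical names, and the rules (skip escaped `\$`, require a closing `$` on the same line, treat a bare number as a plain number instead of building a formula AST) are assumptions for discussion.

```python
import re

# Assumed rule: a formula span is a pair of unescaped `$` on one line,
# with no other `$` in between.
FORMULA_RE = re.compile(r"(?<!\\)\$([^$\n]+?)(?<!\\)\$")
# Assumed rule: integers and simple decimals count as bare numbers.
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")


def split_segments(item):
    """Split an item into (kind, content) pairs.

    kind is "text", "formula" or "number".
    """
    segments = []
    pos = 0
    for match in FORMULA_RE.finditer(item):
        if match.start() > pos:
            segments.append(("text", item[pos:match.start()]))
        body = match.group(1).strip()
        # Preprocessing: a bare number such as $123$ is kept as a number
        # and would not be sent to the formula AST builder.
        kind = "number" if NUMBER_RE.fullmatch(body) else "formula"
        segments.append((kind, body))
        pos = match.end()
    if pos < len(item):
        segments.append(("text", item[pos:]))
    return segments


if __name__ == "__main__":
    for segment in split_segments("If $AB=BC$ and x is $123$, find y."):
        print(segment)
```

An unpaired `$` (for example a price) is left as text, but two unescaped prices on the same line would still be paired up as one formula, so this alone does not solve the noise problem. Cases like $AB=BC$ would also need a further rule to decide whether `AB` is one symbol or a product of `A` and `B` before the AST is built.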
References
- https://huggingface.co/docs/transformers/tasks/image_classification
- http://home.ustc.edu.cn/~huangzhy/files/papers/ZhenyaHuang-SIGIR2020s.pdf