ReFinED
Inefficient Process for Adding New Entities in ReFinED
Adding about a dozen new entities by running preprocess_all.py requires downloading over 100 GB of data, which is far out of proportion to such a small addition.
This model cannot be considered to have zero-shot capabilities until there is a streamlined, bloat-free script for adding new entities into the system.
Steps to Reproduce:
- Clone the repository and set up the environment as per the documentation.
- Attempt to add a dozen new entities by running preprocess_all.py.
- Observe the data download requirements and inefficiency.
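To put a number on the last step, a small helper along these lines can report how much disk space the downloaded files take up. This is only a sketch: the directory name `refined_data` is a hypothetical placeholder, not a path taken from the ReFinED documentation, so point it at wherever preprocess_all.py actually writes its downloads.

```python
import os
import sys


def dir_size_bytes(root: str) -> int:
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so linked files are not counted twice.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def format_gb(num_bytes: int) -> str:
    """Format a byte count as decimal gigabytes (1 GB = 10**9 bytes)."""
    return f"{num_bytes / 1e9:.2f} GB"


if __name__ == "__main__":
    # "refined_data" is a hypothetical default; pass the real download
    # directory as the first command-line argument.
    data_dir = sys.argv[1] if len(sys.argv) > 1 else "refined_data"
    print(f"{data_dir}: {format_gb(dir_size_bytes(data_dir))}")
```

Running it before and after the preprocessing step and comparing the two figures gives the size of the download.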
Expected Behavior:
There should be a lightweight and efficient process for adding new entities without requiring extensive data downloads.
Actual Behavior:
Adding new entities requires downloading over 100 GB of data, which makes the process slow and cumbersome even for a handful of entities.
Environment:
- Google Colab
- Operating System: Linux
- Python Version: 3.10
Severity:
High - This issue severely impacts the usability and efficiency of adding new entities to the system and needs immediate attention.