extraction-framework
UnmodifiedFeeder re-reads entries, anomaly
- The feeder reads 5000 records and puts them in the queue
- these get extracted, but it seems they do not get updated in the cache DB
- on the next request the feeder gets the same 5000 records from the DB
Note: this bug is intermittent; most of the time the MySQL cache DB does get updated. We brought the count down from 3 million to 1 million, but there now seem to be more than 5k anomalies, which is exactly the number of records the feeder queries per request.
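
To make the suspected failure mode concrete, here is a minimal, self-contained simulation. It is not the `UnmodifiedFeeder` code: the class `FeederStallSketch`, the `Entry` type, and the assumption that the feeder picks the least recently updated rows first are all hypothetical and only illustrate why the feeder would stop making progress once the number of entries that never get updated reaches the batch size. Small numbers stand in for the 5000-record batch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class FeederStallSketch {

    // Hypothetical stand-in for one row of the cache DB.
    static final class Entry {
        final long pageId;
        long updatedAt;          // logical timestamp of the last successful cache update
        final boolean anomalous; // true if the cache update silently fails for this entry

        Entry(long pageId, long updatedAt, boolean anomalous) {
            this.pageId = pageId;
            this.updatedAt = updatedAt;
            this.anomalous = anomalous;
        }
    }

    // ASSUMPTION: the feeder query returns the `limit` least recently updated entries.
    static List<Entry> fetchOldest(List<Entry> cache, int limit) {
        return cache.stream()
                .sorted(Comparator.comparingLong((Entry e) -> e.updatedAt)
                        .thenComparingLong(e -> e.pageId))
                .limit(limit)
                .collect(Collectors.toList());
    }

    // One feeder round: fetch a batch, "extract" it, and refresh the timestamp
    // of every entry except the anomalous ones. Returns the page ids of the batch.
    static List<Long> runRound(List<Entry> cache, int limit, long now) {
        List<Long> ids = new ArrayList<>();
        for (Entry e : fetchOldest(cache, limit)) {
            ids.add(e.pageId);
            if (!e.anomalous) {
                e.updatedAt = now;
            }
        }
        return ids;
    }

    // Builds `total` entries; the `anomalies` oldest ones never get updated.
    static List<Entry> buildCache(int total, int anomalies) {
        List<Entry> cache = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            cache.add(new Entry(i, i, i < anomalies));
        }
        return cache;
    }

    // True if two consecutive rounds return exactly the same batch.
    static boolean stalls(int total, int anomalies, int limit) {
        List<Entry> cache = buildCache(total, anomalies);
        List<Long> first = runRound(cache, limit, total + 1L);
        List<Long> second = runRound(cache, limit, total + 2L);
        return first.equals(second);
    }

    public static void main(String[] args) {
        System.out.println(stalls(20, 5, 5)); // anomalies == batch size
        System.out.println(stalls(20, 4, 5)); // anomalies < batch size
    }
}
```

Under these assumptions, the feeder keeps receiving the identical batch as soon as the anomalous entries fill a whole batch, while with fewer anomalies than the batch size each round still lets some new entries through. If the real query orders or filters differently, the threshold may differ, so this is only a starting point for reasoning about the bug.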
Start debugging here: https://github.com/dbpedia/extraction-framework/blob/live-deployed/live/src/main/java/org/dbpedia/extraction/live/feeder/UnmodifiedFeeder.java#L77