CoAnSys
Fix timeout issue during the sim1_postprocess_s1_e1_filter_input phase
Originally reported in: https://github.com/openaire/iis/issues/1326
The document similarity algorithm fails when run on a non-deduplicated OpenAIRE Graph containing 300M publications (the deduplicated graph contained 200M).
An in-depth inspection, described in https://github.com/openaire/iis/issues/1326#issuecomment-1105186910, showed that the document similarity sources need to be modified by increasing the allowed timeout value, which should be defined in the sim1-postprocess-s1-e1-filter-sims.pig Pig script.
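As a rough sketch of what such a change might look like, assuming the timeout in question is the Hadoop task timeout (this issue does not say which property is involved), a Pig script can override it with a `SET` statement near the top of the file. The value below is purely illustrative and not taken from the linked discussion.

```pig
-- Assumption: the failing step is killed by the Hadoop task timeout,
-- whose default is 600000 ms (10 minutes).
-- 1800000 ms (30 minutes) is a hypothetical value chosen for illustration.
SET mapreduce.task.timeout 1800000;
```

On older Hadoop 1.x clusters the equivalent property is named `mapred.task.timeout`. Setting it in the script keeps the override local to this step instead of changing it cluster-wide.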