removal_least_used parameter is improperly used in document-similarity-s1-rank_filter
After comparing the two ranking scripts, document-similarity-s1-rank_filter.pig and document-similarity-s1-ship-rank_filter.pig, and inspecting document-similarity-s1-rank_filter.pig more closely, it appears that removal_least_used is used improperly: it should be compared against the number of documents referencing the term ($1) instead of the rank position ($0).
Currently far fewer terms are filtered out because of this bug. In most cases only terms referenced once are discarded: the rank index is not dense (tied terms share one rank and the next rank jumps by the size of the tie), and there are almost always more than 20 terms with a single document reference. In the current OpenAIRE document similarity configuration removal_least_used is set to 20, so all terms referenced in fewer than 20 documents should be filtered out.
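
To illustrate the difference, here is a minimal Python sketch (not the Pig script itself). It assumes the rank is computed in ascending order of document count with non-dense tie handling, and that the filter keeps terms whose compared value passes the threshold. The function names and the toy data are hypothetical, and the exact comparison operators used in the real script may differ.

```python
def competition_rank(doc_counts):
    """Non-dense ascending rank: ties share a rank, the next rank skips ahead."""
    ordered = sorted(doc_counts.items(), key=lambda kv: kv[1])
    ranks = {}
    prev_count, prev_rank = None, 0
    for position, (term, count) in enumerate(ordered, start=1):
        if count != prev_count:
            # First term of a new tie group takes its 1-based position as rank.
            prev_rank = position
            prev_count = count
        ranks[term] = prev_rank
    return ranks


def filter_by_rank(doc_counts, removal_least_used):
    """Assumed buggy behaviour: threshold compared against rank position ($0)."""
    ranks = competition_rank(doc_counts)
    return {term for term, rank in ranks.items() if rank > removal_least_used}


def filter_by_doc_count(doc_counts, removal_least_used):
    """Intended behaviour: threshold compared against document count ($1)."""
    return {term for term, count in doc_counts.items() if count >= removal_least_used}


# Toy data: 25 terms seen in 1 document, 3 terms in 5 documents, 2 terms in 40.
doc_counts = {f"rare{i}": 1 for i in range(25)}
doc_counts.update({f"mid{i}": 5 for i in range(3)})
doc_counts.update({f"common{i}": 40 for i in range(2)})

kept_buggy = filter_by_rank(doc_counts, 20)
kept_fixed = filter_by_doc_count(doc_counts, 20)
print(sorted(kept_buggy))
print(sorted(kept_fixed))
```

In this toy data all 25 single-reference terms share rank 1, so the next rank is 26 and the rank-based filter drops only those 25 terms, while the count-based filter also drops the terms referenced in 5 documents.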