iepy

Download 3rd party downloads things that are not used

Open rafacarrascosa opened this issue 10 years ago • 2 comments

The download 3rd party script downloads additional packages that are not needed, like:

  • Stanford NER
  • Stanford postagger
  • nltk stuff (perhaps not needed, really not sure).

The script should only download what is actually used by IEPY.
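One way this could look, as a rough sketch only: the script downloads a required set by default and fetches everything else on request. The package keys, the `--with` flag, and the `select_packages` helper below are all hypothetical, not what the actual script does, and `required-package` is a placeholder for whatever the default preprocessing really needs.

```python
import argparse

# Hypothetical package keys, for illustration only.
# Placeholder for whatever the default preprocessing actually needs.
REQUIRED = ["required-package"]
# The downloads this issue lists as unused by the default preprocessing.
OPTIONAL = ["stanford-ner", "stanford-postagger", "nltk-data"]


def select_packages(extras=()):
    """Return the packages to download: the required ones plus requested extras."""
    unknown = [name for name in extras if name not in OPTIONAL]
    if unknown:
        raise ValueError("unknown optional package(s): " + ", ".join(unknown))
    # Follow OPTIONAL's order so the result is stable and free of duplicates.
    return REQUIRED + [name for name in OPTIONAL if name in extras]


def main(argv=None):
    parser = argparse.ArgumentParser(description="Download third-party data")
    # "with" is a Python keyword, so the values are stored under "extras".
    parser.add_argument("--with", dest="extras", action="append", default=[],
                        choices=OPTIONAL,
                        help="also download an optional package")
    args = parser.parse_args(argv)
    for name in select_packages(args.extras):
        # Stand-in for the real download step.
        print("would download:", name)
```

With no arguments, `main([])` would only touch the required set; `main(["--with", "nltk-data"])` would add the NLTK data on top of it.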

rafacarrascosa avatar Nov 05 '14 17:11 rafacarrascosa

Good point, but still not completely sure.

The thing is, the currently provided preprocessing is not using the things you mention, but we didn't deprecate the other preprocessing pipeline (i.e., we still have the code in iepy.preprocess that may need them).

So, do you think we should also remove the built-in support for using those things (Stanford NER or postagger alone, nltk stuff)?

jmansilla avatar Nov 05 '14 17:11 jmansilla

Ahh good catch. Mmm I'm torn between two thoughts:

  • Unused code is code that has to go, and more so if it implies downloading many MB of things almost no one will use.
  • The code for those preprocessing stages works as documentation on how to write your own preprocessing stage.

Sooo... I'm just thinking out loud here, but it would be a good thing if we got to keep the documentation and somehow also remove the (most likely) unused code...

rafacarrascosa avatar Nov 05 '14 18:11 rafacarrascosa