skrub
API: shorter names / argument names
Long names lead to long lines of code. Beginners often do not know how to format such lines, so their code ends up hard to read and they struggle. I am starting to think that we should avoid long names when possible.
- [x] Rename all arguments in TableVectorizer that end with "_transformer" to drop that suffix, e.g. "high_card_cat_transformer" -> "high_card_cat"
- [ ] Use shorthands when they are obvious: "col" instead of "columns" (I have no clear list of where this should be done; maybe we should audit our API to find out)
- [ ] Use verbs instead of nouns? (this is more open IMHO)
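If arguments get renamed as in the first item, one possible way to keep old call sites working for a while is a small shim that accepts the old keyword, warns, and forwards it under the new name. The sketch below is only an illustration: `renamed_kwargs` and `make_vectorizer` are hypothetical names, the function is a stand-in and not the real TableVectorizer, and this is not necessarily how skrub handles the rename.

```python
import functools
import warnings


def renamed_kwargs(**old_to_new):
    """Accept old keyword names, warn, and forward them under the new names."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for old, new in old_to_new.items():
                if old in kwargs:
                    if new in kwargs:
                        raise TypeError(
                            f"Got both {old!r} and {new!r}; use only {new!r}."
                        )
                    warnings.warn(
                        f"{old!r} is deprecated, use {new!r} instead.",
                        FutureWarning,
                        stacklevel=2,
                    )
                    # Move the value over to the new, shorter name.
                    kwargs[new] = kwargs.pop(old)
            return func(*args, **kwargs)

        return wrapper

    return decorator


# Hypothetical stand-in for a constructor, NOT the real TableVectorizer.
@renamed_kwargs(high_card_cat_transformer="high_card_cat")
def make_vectorizer(high_card_cat=None, low_card_cat=None):
    return {"high_card_cat": high_card_cat, "low_card_cat": low_card_cat}
```

With this shim, `make_vectorizer(high_card_cat_transformer=...)` still works but emits a `FutureWarning`, while `make_vectorizer(high_card_cat=...)` is silent. Passing both names raises a `TypeError` rather than silently picking one.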
Regarding the TableVectorizer, do you think "high_cardinality"/"low_cardinality" would be better suited than "high_card_cat"/"low_card_cat"? It's more explicit but still short.
Let's make a decision on the TableVectorizer parameters before the next release.