libpostal
Training on a subset of countries
Thanks for this terrific project! I am trying to slim down the model because of resource constraints. Is it possible to get the steps to train on English addresses only? (I am hoping this will help.)
I have tried the steps listed in the README but could not figure out a way to filter the addresses and train the model.
Have you figured it out? It would help me out too.
+1. This would be very helpful, since not all users need support for all countries. The trained model is 1.8 GB, which is too much.
See "Splitting data files by country and language" for how to train on a subset of countries.
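For anyone who just wants to pre-filter a training file before following that guide, here is a rough sketch in Python. It is not libpostal's own tooling. It assumes the training data is a tab-separated text file with the language code in the first column and the country code in the second; check your files and adjust `lang_col` and `country_col` if the layout differs. The function name and parameters are made up for illustration.

```python
from typing import Iterable, Iterator, Optional, Set


def filter_training_lines(
    lines: Iterable[str],
    languages: Optional[Set[str]] = None,
    countries: Optional[Set[str]] = None,
    lang_col: int = 0,      # assumption: language code is the first column
    country_col: int = 1,   # assumption: country code is the second column
) -> Iterator[str]:
    """Yield only the lines whose language/country codes are in the given sets.

    Passing None for a set disables that filter. Lines with too few
    tab-separated fields are skipped.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= max(lang_col, country_col):
            continue  # malformed line, not enough columns
        if languages is not None and fields[lang_col] not in languages:
            continue
        if countries is not None and fields[country_col] not in countries:
            continue
        yield line


if __name__ == "__main__":
    # Hypothetical file names; replace with your own paths.
    with open("training_data.tsv", encoding="utf-8") as src, \
         open("training_data.en.tsv", "w", encoding="utf-8") as dst:
        dst.writelines(filter_training_lines(src, languages={"en"}))
```

The filter streams line by line, so it should cope with large files without loading them into memory. Whether a model trained on the filtered file ends up meaningfully smaller is something you would have to measure.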