Using this instead of building the entire list freed up a significant amount of memory with large datasets. Source: NLTK documentation.
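
The text above does not say which function "this" refers to, so the following is only a rough sketch of the general idea, assuming it is a lazy iterator (a generator) that yields items one at a time. `lazy_bigrams` is a hypothetical helper written for illustration, not an NLTK function, and the token data is made up.

```python
import sys


def lazy_bigrams(tokens):
    """Yield adjacent token pairs one at a time (hypothetical helper)."""
    have_prev = False
    prev = None
    for tok in tokens:
        if have_prev:
            yield (prev, tok)
        prev = tok
        have_prev = True


# Made-up token stream standing in for a large dataset.
tokens = [str(i) for i in range(100_000)]

# Eager: every pair is materialized and held in memory at once.
eager = list(zip(tokens, tokens[1:]))

# Lazy: only the generator object exists until it is iterated.
lazy = lazy_bigrams(tokens)

# getsizeof reports the container itself, not the tuples it references,
# so the real gap is larger than these two numbers suggest.
list_size = sys.getsizeof(eager)
gen_size = sys.getsizeof(lazy)
print(f"list: {list_size} bytes, generator: {gen_size} bytes")

# Consuming the generator in a loop never holds more than one pair at a time.
pair_count = sum(1 for _ in lazy)
```

The trade-off is that a generator can only be consumed once and has no `len()` or indexing, so this only helps when a single pass over the data is enough.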