progscrape
Long Reddit titles should split on `|`, `.`, `:` or other punctuation
Long Reddit titles have never looked that great. We should be heuristically splitting them on punctuation where it makes sense.
An experimental (and incomplete) implementation is in https://github.com/mmastrac/progscrape/commit/71ae991d92f5c19d93d3e2314c100b5946bbf907
Need to handle abbreviations:
https://gizmodo.com/drones-facial-recognition-us-air-force-realnetworks-1850163798
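For discussion, here is a rough sketch of what an abbreviation-aware split could look like. This is not the code from the commit above. The names `split_title`, `split_points`, `is_abbreviation` and the `max_len` threshold are all made up for illustration, and the abbreviation rule (a single-letter word, or a word that already contains a period) is only an assumption that would need tuning against real titles.

```rust
/// Assumption: a period closes an abbreviation when the word before it is
/// a single letter ("J. Smith") or already contains a period ("U.S.").
fn is_abbreviation(before: &str) -> bool {
    let word = before.rsplit(char::is_whitespace).next().unwrap_or("");
    word.chars().count() == 1 || word.contains('.')
}

/// Byte offsets of punctuation characters where the title may be split.
/// Only `|`, `:` and `.` followed by whitespace are considered.
fn split_points(title: &str) -> Vec<usize> {
    let chars: Vec<(usize, char)> = title.char_indices().collect();
    let mut points = Vec::new();
    for (i, &(pos, c)) in chars.iter().enumerate() {
        let followed_by_space = chars
            .get(i + 1)
            .map_or(false, |&(_, next)| next.is_whitespace());
        if !followed_by_space {
            continue;
        }
        let is_split = match c {
            '|' | ':' => true,
            // Skip periods that look like part of an abbreviation.
            '.' => !is_abbreviation(&title[..pos]),
            _ => false,
        };
        if is_split {
            points.push(pos);
        }
    }
    points
}

/// Splits a title into pieces, but only when it is longer than `max_len`
/// characters (hypothetical threshold). Short titles come back unchanged.
fn split_title(title: &str, max_len: usize) -> Vec<String> {
    if title.chars().count() <= max_len {
        return vec![title.trim().to_string()];
    }
    let mut parts = Vec::new();
    let mut start = 0;
    for pos in split_points(title) {
        let piece = title[start..pos].trim();
        if !piece.is_empty() {
            parts.push(piece.to_string());
        }
        // '|', ':' and '.' are each one byte in UTF-8.
        start = pos + 1;
    }
    let rest = title[start..].trim();
    if !rest.is_empty() {
        parts.push(rest.to_string());
    }
    parts
}
```

With an invented headline such as `The U.S. Air Force is testing drones: a long hypothetical headline | Example Site` and `max_len` of 40, this keeps `U.S.` intact and yields three pieces, split at the `:` and the `|`. Version numbers like `1.70` are left alone because the period is not followed by whitespace.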