Aaron Straup Cope
I re-generated the input data to filter out empty phone numbers, but the problem still manifests itself. Any thoughts about where or what the bad data (?) might be?
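For reference, a minimal sketch of the kind of filtering described above. The function name `drop_empty_phones` and the `phone` property key are assumptions made for illustration, and so is the choice to drop only the empty property rather than the whole record; this is not the script that was actually used.

```python
def drop_empty_phones(features, key="phone"):
    """Return copies of GeoJSON features with empty phone properties removed.

    Hypothetical helper: the "phone" key and the decision to keep the
    record (dropping only the property) are assumptions.
    """
    cleaned = []
    for feature in features:
        # Copy the properties so the input features are not mutated.
        props = dict(feature.get("properties") or {})
        value = props.get(key)
        # Treat None, "" and whitespace-only strings as empty.
        if value is None or not str(value).strip():
            props.pop(key, None)
        cleaned.append(dict(feature, properties=props))
    return cleaned
```

Working on copies keeps the original features available, which makes it easier to diff the input before and after filtering when hunting for bad records.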
After adding more verbose logging to `utils/enum.py`, this is what I see:
```
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "/mnt/yarn/usercache/hadoop/appcache/application_1532544793717_0001/container_1532544793717_0001_02_000004/pyspark.zip/pyspark/worker.py", line 174, in...
```
More example errors and details:
```
---- dupe class None == NEEDS_REVIEW a1 '{'house_number': u'400', 'house': u'Daubendiek Karen', 'lon': -121.502926, 'phone': u'+1 916 321 4500', 'postcode': u'95814', 'country': u'US', 'lat':...
```
More:
```
Traceback (most recent call last):
  File "/usr/bin/dedupe_geojson", line 420, in is_dupe = dupe_func(canonical, other, dupe_pairs, dupes, **dupe_func_kw)
  File "/usr/bin/dedupe_geojson", line 113, in is_name_address_dupe fuzzy_street_name=fuzzy_street_names)
  File "/usr/lib/python2.7/site-packages/lieu/dedupe.py", line 424,...
```
Once more, with `type(dupe_class) == types.NoneType` because...
```
Traceback (most recent call last):
  File "/usr/bin/dedupe_geojson", line 420, in is_dupe = dupe_func(canonical, other, dupe_pairs, dupes, **dupe_func_kw)
  File "/usr/bin/dedupe_geojson", line 113, in...
```
`type(dupe_class) == types.NoneType` appears to have fixed (or at least trapped) the problem.
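A minimal sketch of that kind of guard, for illustration only. The names `matches_class`, `NEEDS_REVIEW` (as a plain string) and the `dedupe_debug` logger are hypothetical stand-ins, not lieu's actual API. It uses `dupe_class is None`, which is the more idiomatic spelling of the same check and also works on Python 3 versions where `types.NoneType` is not available.

```python
import logging

logger = logging.getLogger("dedupe_debug")  # hypothetical logger name

# Hypothetical stand-in for the real classification value; not lieu's API.
NEEDS_REVIEW = "NEEDS_REVIEW"


def matches_class(dupe_class, expected):
    """Compare a classification to an expected value, trapping None first."""
    if dupe_class is None:
        # Same effect as type(dupe_class) == types.NoneType.
        logger.warning("dupe_class is None, treating as no match")
        return False
    return dupe_class == expected
```

Note that a guard like this only traps the symptom: it does not explain why the classification came back as `None` in the first place, so logging the offending pair is probably worth keeping.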
Similarly:
```
No handlers could be found for logger "mrjob.launch"
Traceback (most recent call last):
  File "dedupe_geojson.py", line 206, in DedupeGeoJSONJob.run()
  File "/usr/local/lib/python2.7/site-packages/mrjob/job.py", line 436, in run mr_job.execute()
  File "/usr/local/lib/python2.7/site-packages/mrjob/job.py",...
```
```
No handlers could be found for logger "mrjob.launch"
Traceback (most recent call last):
  File "dedupe_geojson.py", line 213, in DedupeGeoJSONJob.run()
  File "/usr/local/lib/python2.7/site-packages/mrjob/job.py", line 436, in run mr_job.execute()
  File "/usr/local/lib/python2.7/site-packages/mrjob/job.py", line...
```
FWIW, the following changes fix all the errors above, although I can't be sure I'm not glossing over some important details: https://github.com/openvenues/lieu/compare/master...sfomuseum:debug
See also: https://github.com/allmaps/render/blob/b4c71794526f8a40e063eff67939d70575035dd2/src/tiles.js#L162