experiment-impact-tracker
Error in "get_region_by_coords" on a remote computing cluster
Hi,
I am able to run the code smoothly on my local machine. The same code and environment, run inside a Singularity container on a remote computing cluster, fail with the following error:
loading region bounding boxes for computing carbon emissions region, this may take a moment...
454/454... rate=566.68 Hz, eta=0:00:00, total=0:00:00, wall=11:38 ESTT
Done!
INFO:Gathering system info for reproducibility...
ERROR:Status code Unknown from http://ipinfo.io/json: ERROR - HTTPConnectionPool(host='ipinfo.io', port=80): Max retries exceeded with url: /json (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x2ad6ba6184c0>: Failed to establish a new connection: [Errno 111] Connection refused'))
Traceback (most recent call last):
File "eval_with_tracker.py", line 565, in <module>
tracker = ImpactTracker(log_dir)
File "../../experiment-impact-tracker/experiment_impact_tracker/compute_tracker.py", line 246, in __init__
self.initial_info = gather_initial_info(logdir)
File "../../experiment-impact-tracker/experiment_impact_tracker/compute_tracker.py", line 225, in gather_initial_info
data[key] = info_["routing"]["function"]()
File "../../experiment-impact-tracker/experiment_impact_tracker/data_info_and_router.py", line 63, in <lambda>
"routing": {"function": lambda: get_current_region_info_cached()[0]},
File "../../experiment-impact-tracker/experiment_impact_tracker/emissions/get_region_metrics.py", line 65, in get_current_region_info_cached
return get_current_region_info(ttl_hash=get_ttl_hash(seconds=60 * 60))
File "../../experiment-impact-tracker/experiment_impact_tracker/emissions/get_region_metrics.py", line 43, in get_current_region_info
return get_zone_information_by_coords(get_current_location())
File "../../experiment-impact-tracker/experiment_impact_tracker/emissions/get_region_metrics.py", line 10, in get_zone_information_by_coords
region = get_region_by_coords(coords)
File "../../experiment-impact-tracker/experiment_impact_tracker/emissions/get_region_metrics.py", line 17, in get_region_by_coords
point = Point(lon, lat)
File "/usr/local/lib/python3.8/dist-packages/shapely/geometry/point.py", line 48, in __init__
self._set_coords(*args)
File "/usr/local/lib/python3.8/dist-packages/shapely/geometry/point.py", line 137, in _set_coords
self._geom, self._ndim = geos_point_from_py(tuple(args))
File "/usr/local/lib/python3.8/dist-packages/shapely/geometry/point.py", line 214, in geos_point_from_py
dx = c_double(coords[0])
TypeError: must be real number, not NoneType
I am able to ping ipinfo.io from the same node on the cluster:
ping ipinfo.io
PING ipinfo.io (216.239.34.21) 56(84) bytes of data.
64 bytes from any-in-2215.1e100.net (216.239.34.21): icmp_seq=1 ttl=111 time=0.655 ms
64 bytes from any-in-2215.1e100.net (216.239.34.21): icmp_seq=2 ttl=111 time=0.809 ms
64 bytes from any-in-2215.1e100.net (216.239.34.21): icmp_seq=3 ttl=111 time=0.836 ms
64 bytes from any-in-2215.1e100.net (216.239.34.21): icmp_seq=4 ttl=111 time=0.733 ms
64 bytes from any-in-2215.1e100.net (216.239.34.21): icmp_seq=5 ttl=111 time=0.797 ms
64 bytes from any-in-2215.1e100.net (216.239.34.21): icmp_seq=6 ttl=111 time=0.741 ms
64 bytes from any-in-2215.1e100.net (216.239.34.21): icmp_seq=7 ttl=111 time=0.762 ms
64 bytes from any-in-2215.1e100.net (216.239.34.21): icmp_seq=8 ttl=111 time=0.744 ms
64 bytes from any-in-2215.1e100.net (216.239.34.21): icmp_seq=9 ttl=111 time=0.749 ms
^C
--- ipinfo.io ping statistics ---
9 packets transmitted, 9 received, 0% packet loss, time 8008ms
rtt min/avg/max/mdev = 0.655/0.758/0.836/0.055 ms
Any suggestions? Thanks!
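A note on the ping result above: `ping` uses ICMP, whereas the failing lookup in the log is an HTTP request over TCP port 80 (`Connection refused`). A successful ping therefore does not show that the HTTP request can get through. A minimal sketch for checking TCP reachability from the same environment, for example inside the container on the compute node, could look like this. The helper name `can_connect` is made up for illustration and is not part of experiment-impact-tracker.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical diagnostic helper, not part of experiment-impact-tracker.
    """
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Calling `can_connect("ipinfo.io", 80)` in the job environment and again in an interactive session would show whether the two differ in outbound TCP access.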
Hi @nikhil153, did you resolve this issue? If so, what was the solution so that we can ensure people do not run into it again?
@Breakend this issue is a bit mysterious. It only happens on one specific HPC cluster, which could be blocking external IPs, although interactive sessions on the same cluster let me ping the Internet, so I am not exactly sure. I created a workaround for my situation by modifying the code to override the geo-location. However, that branch is still under development, so I haven't created a PR for this feature. If you think it would be useful in general, I will be happy to do so.
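For readers hitting the same traceback: the IP lookup fails, the coordinates come back as `None`, and `Point(lon, lat)` then raises the `TypeError`. The general shape of a geo-location override can be sketched as below. This is not the code from the branch mentioned above. The environment variable `TRACKER_COORDS_OVERRIDE`, the helper names, and the `(lat, lon)` tuple shape are all assumptions made for illustration.

```python
import os


def parse_coords(text):
    """Parse a 'lat,lon' string into a (lat, lon) tuple of floats."""
    lat_s, lon_s = text.split(",")
    lat, lon = float(lat_s), float(lon_s)
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError(f"coordinates out of range: {text!r}")
    return lat, lon


def resolve_coords(lookup, env_var="TRACKER_COORDS_OVERRIDE"):
    """Return (lat, lon), preferring a user-supplied override over `lookup`.

    `lookup` is any zero-argument callable that returns (lat, lon) or fails
    to find a location. `env_var` is a hypothetical variable name, not an
    option that experiment-impact-tracker is known to support.
    """
    override = os.environ.get(env_var)
    if override:
        # The user pinned a location, so skip the network lookup entirely.
        return parse_coords(override)
    coords = lookup()
    if coords is None or any(c is None for c in coords):
        # Fail with a clear message instead of a TypeError deep inside shapely.
        raise RuntimeError(
            f"could not determine location; set {env_var}='lat,lon' to override"
        )
    return coords
```

With the variable set to something like `45.5,-73.6`, `resolve_coords` never calls `lookup`, so a blocked network no longer matters. Without it, a failed lookup raises an error that points to the override.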
Hi @nikhil153, this sounds like it might be a common use case, so I'm going to go ahead and re-open. If you have the time, we would be happy to get a PR from you; otherwise we'll add it to our backlog.
@Breakend sure, I will create one soon, as I am actively working on it.