URL access error for EPA FBAs

Open catherinebirney opened this issue 2 years ago • 7 comments

As of yesterday, we cannot generate FBAs that pull from certain EPA sources. For EPA_FactsAndFigures, EPA_WFR, and TRACI (LCIAformatter), we see the error requests.exceptions.HTTPError: 403 Client Error: Forbidden for url.

We get a different error for EPA_NEI data: requests.exceptions.SSLError: HTTPSConnectionPool(host='gaftp.epa.gov', port=443): Max retries exceeded with url: /air/nei/2017/data_summaries/2017v1/2017neiApr_nonpoint.zip (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)')))

@bl-young @WesIngwersen

catherinebirney avatar Sep 07 '23 21:09 catherinebirney

this commit in esupy (develop) should resolve this issue: https://github.com/USEPA/esupy/commit/b481e35262c9387c2e14ddcc8fa114dafe1c641d

bl-young avatar Sep 11 '23 14:09 bl-young

I'm reopening this issue because I'm also running into an SSLError (certificate verify failed). I'm getting this with multiple FBAs that call different domains, so it's not limited to one site.

FYI, I'm using requests 2.32.3 and esupy 0.4.0.

WesIngwersen avatar May 14 '25 15:05 WesIngwersen

It looks like the best resolution is to set verify=False https://requests.readthedocs.io/en/latest/user/advanced/#ssl-cert-verification

The fix is likely to be done in esupy

WesIngwersen avatar May 14 '25 15:05 WesIngwersen

I would be concerned that this could lead to security issues - though I admit this is outside my wheelhouse. Is there an adjustment needed on your machine?
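One way to check for a machine-level cause is to look at which CA bundle settings Python sees. The following is only a rough diagnostic sketch using the standard library; the function name `ca_bundle_report` is made up for illustration and is not part of flowsa or esupy. By default requests verifies against the bundle shipped with certifi, but it will use `REQUESTS_CA_BUNDLE` or `CURL_CA_BUNDLE` if either is set, so a stale or proxy-specific value there could plausibly produce "unable to get local issuer certificate".

```python
import os
import ssl


def ca_bundle_report():
    """Collect CA-bundle settings that can influence certificate verification.

    Hypothetical helper for troubleshooting only.
    """
    paths = ssl.get_default_verify_paths()
    report = {
        # Defaults compiled into the OpenSSL build that Python links against;
        # each is None when the file or directory does not exist.
        "openssl_cafile": paths.cafile,
        "openssl_capath": paths.capath,
    }
    # REQUESTS_CA_BUNDLE and CURL_CA_BUNDLE are read by requests;
    # SSL_CERT_FILE is read by OpenSSL itself.
    for var in ("REQUESTS_CA_BUNDLE", "CURL_CA_BUNDLE", "SSL_CERT_FILE"):
        report[var] = os.environ.get(var)
    return report


for name, value in ca_bundle_report().items():
    print(f"{name}: {value}")
```

If one of the environment variables points at a file that no longer exists or lacks the needed issuer certificate, that would be worth ruling out before disabling verification.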

bl-young avatar May 14 '25 16:05 bl-young

> I would be concerned that this could lead to security issues - though I admit this is outside my wheelhouse. Is there an adjustment needed on your machine?

I have checked other dependencies and reverted to an earlier version of requests to see if that helped but it did not.

If the request is being made to known safe domains, which is the case with flowsa, I would say that this setup does not present a security issue.

r = requests.get("https://www2.census.gov/programs-surveys/arts/tables/2022restated/gm.xlsx")

requests.exceptions.SSLError: HTTPSConnectionPool(host='www2.census.gov', port=443): Max retries exceeded with url: /programs-surveys/arts/tables/2022restated/gm.xlsx (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)')))


r = requests.get("https://www2.census.gov/programs-surveys/arts/tables/2022restated/gm.xlsx", verify=False)

InsecureRequestWarning: Unverified HTTPS request is being made to host 'www2.census.gov'. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#tls-warnings
  warnings.warn(

# So this call with verify=False returns a valid response along with a warning
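To keep the opt-out limited to known domains rather than switching verification off everywhere, something like the following could work. This is only a sketch under assumptions: the allowlist contents and the helper name `request_kwargs` are hypothetical (the two hosts are simply the ones from the errors in this thread), and nothing here is existing flowsa or esupy code.

```python
from urllib.parse import urlparse

# Hypothetical allowlist: hosts for which skipping certificate verification
# is considered acceptable. These two are taken from the errors in this thread.
UNVERIFIED_HOSTS = frozenset({"www2.census.gov", "gaftp.epa.gov"})


def request_kwargs(url):
    """Return extra keyword arguments for a requests call to `url`.

    Verification is disabled only for allowlisted hosts; every other
    host gets no override and keeps the default (verify=True).
    """
    host = urlparse(url).hostname
    if host in UNVERIFIED_HOSTS:
        return {"verify": False}
    return {}
```

A call would then look like `requests.get(url, **request_kwargs(url))`, so any host not on the list still fails loudly on a certificate problem.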

WesIngwersen avatar May 14 '25 17:05 WesIngwersen

We can pass kwargs through to esupy, so let's try it just here in flowsa first rather than forcing it for all esupy calls.
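Roughly, the passthrough pattern looks like the sketch below. The function `fetch` and its `get` parameter are hypothetical stand-ins to show the idea, not the actual esupy signature; the 30 second default timeout is likewise just an illustrative assumption.

```python
from typing import Any, Callable


def fetch(url: str, get: Callable[..., Any], **kwargs: Any) -> Any:
    """Hypothetical library-level downloader.

    Extra keyword arguments are forwarded untouched to `get`, so the
    caller decides per call whether to pass something like verify=False.
    """
    # Apply a default only when the caller has not set one (assumed value).
    kwargs.setdefault("timeout", 30)
    return get(url, **kwargs)
```

With this shape, the caller would write something like `fetch(url, requests.get, verify=False)` for the affected sources, while every other call keeps the default verification behavior.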

bl-young avatar May 14 '25 18:05 bl-young

Unfortunately that didn't solve it for me.

WesIngwersen avatar May 14 '25 19:05 WesIngwersen