Chapter 2: error while fetching the housing dataset

Open Abhishek689 opened this issue 6 years ago • 5 comments

Hi, can you please help? I am getting an error with the code below:

import os
import tarfile
from six.moves import urllib
DOWNLOAD_ROOT = "https://raw.githubusercontent.com/ageron/handson-ml/master"
HOUSING_PATH = "datasets/housing"
HOUSING_URL = DOWNLOAD_ROOT + HOUSING_PATH + "/housing.tgz"

def fetch_housing_data(housing_url=HOUSING_URL, housing_path=HOUSING_PATH):
    if not os.path.isdir(housing_path):
        os.makedirs(housing_path)
    tgz_path=os.path.join(housing_path,"housing.tgz")
    urllib.request.urlretrieve(housing_url,tgz_path)
    housing_tgz=tarfile.open(tgz_path)
    housing_tgz.extractall(path=housing_path)
    housing_tgz.close()

fetch_housing_data()

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
<ipython-input-37-bd66b1fe6daf> in <module>
----> 1 fetch_housing_data()

<ipython-input-36-108fe88897be> in fetch_housing_data(housing_url, housing_path)
      3         os.makedirs(housing_path)
      4     tgz_path=os.path.join(housing_path,"housing.tgz")
----> 5     urllib.request.urlretrieve(housing_url,tgz_path)
      6     housing_tgz=tarfile.open(tgz_path)
      7     housing_tgz.extractall(path=housing_path)

~\Anaconda3\lib\urllib\request.py in urlretrieve(url, filename, reporthook, data)
    245     url_type, path = splittype(url)
    246 
--> 247     with contextlib.closing(urlopen(url, data)) as fp:
    248         headers = fp.info()
    249 

~\Anaconda3\lib\urllib\request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    220     else:
    221         opener = _opener
--> 222     return opener.open(url, data, timeout)
    223 
    224 def install_opener(opener):

~\Anaconda3\lib\urllib\request.py in open(self, fullurl, data, timeout)
    529         for processor in self.process_response.get(protocol, []):
    530             meth = getattr(processor, meth_name)
--> 531             response = meth(req, response)
    532 
    533         return response

~\Anaconda3\lib\urllib\request.py in http_response(self, request, response)
    639         if not (200 <= code < 300):
    640             response = self.parent.error(
--> 641                 'http', request, response, code, msg, hdrs)
    642 
    643         return response

~\Anaconda3\lib\urllib\request.py in error(self, proto, *args)
    567         if http_err:
    568             args = (dict, 'default', 'http_error_default') + orig_args
--> 569             return self._call_chain(*args)
    570 
    571 # XXX probably also want an abstract factory that knows when it makes

~\Anaconda3\lib\urllib\request.py in _call_chain(self, chain, kind, meth_name, *args)
    501         for handler in handlers:
    502             func = getattr(handler, meth_name)
--> 503             result = func(*args)
    504             if result is not None:
    505                 return result

~\Anaconda3\lib\urllib\request.py in http_error_default(self, req, fp, code, msg, hdrs)
    647 class HTTPDefaultErrorHandler(BaseHandler):
    648     def http_error_default(self, req, fp, code, msg, hdrs):
--> 649         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    650 
    651 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 404: Not Found

Abhishek689 avatar Jun 13 '19 06:06 Abhishek689

Hi @Abhishek689 ,

There's a slash / missing at the end of the DOWNLOAD_ROOT string:

DOWNLOAD_ROOT = "https://raw.githubusercontent.com/ageron/handson-ml/master/"

ageron avatar Jun 13 '19 09:06 ageron
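
To see why the missing slash leads to a 404, it can help to print the URL that the string concatenation actually builds. The snippet below is a minimal sketch using the constants from the original post; `DOWNLOAD_ROOT_NO_SLASH` and the `join_url` helper are hypothetical names introduced only for this illustration and are not part of the book's code.

```python
# The root exactly as written in the original post (no trailing slash).
DOWNLOAD_ROOT_NO_SLASH = "https://raw.githubusercontent.com/ageron/handson-ml/master"
# The corrected root, with the trailing slash.
DOWNLOAD_ROOT = DOWNLOAD_ROOT_NO_SLASH + "/"
HOUSING_PATH = "datasets/housing"

# Plain concatenation glues "master" and "datasets" together.
broken_url = DOWNLOAD_ROOT_NO_SLASH + HOUSING_PATH + "/housing.tgz"
fixed_url = DOWNLOAD_ROOT + HOUSING_PATH + "/housing.tgz"


def join_url(root, *parts):
    """Hypothetical helper: join URL pieces with exactly one slash between them."""
    return "/".join([root.rstrip("/")] + [part.strip("/") for part in parts])


if __name__ == "__main__":
    print(broken_url)  # contains ".../masterdatasets/housing/housing.tgz"
    print(fixed_url)
    print(join_url(DOWNLOAD_ROOT_NO_SLASH, HOUSING_PATH, "housing.tgz"))
```

A helper like `join_url` makes the result independent of whether the root ends with a slash, although simply fixing the constant, as suggested above, is enough here.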

Hey Ageron, it worked.

Abhishek689 avatar Jun 14 '19 04:06 Abhishek689

Hi, I am using the new 2nd edition of the book and its code:

import os
import tarfile
import urllib
DOWNLOAD_ROOT = "https://raw.githubusercontent.com/ageron/handson-ml2/master/"
HOUSING_PATH = os.path.join("datasets","housing")
HOUSING_URL = DOWNLOAD_ROOT + "datasets/housing/housing.tgz"

def fetch_housing_data(housing_url=HOUSING_URL, housing_path=HOUSING_PATH):
    os.makedirs(housing_path, exist_ok=True)
    tgz_path = os.path.join(housing_path, "housing.tgz")
    urllib.request.urlretrieve(housing_url, tgz_path)
    housing_tgz = tarfile.open(tgz_path)
    housing_tgz.extractall(path=housing_path)
    housing_tgz.close()
fetch_housing_data()
=====
---------------------------------------------------------------------------
SSLCertVerificationError                  Traceback (most recent call last)

=========

I did check, and it seems the certificate is in the certifi site package of the Python 3.7 installation on my Mac. Any ideas?

VikingHenk avatar Oct 31 '19 05:10 VikingHenk

Hi @VikingHenk,

Thanks for your question. Generally, when people have SSL issues when using Python 3.6+ on MacOSX, it's due to a missing step when installing Python. Check out this step in the installation instructions: https://github.com/ageron/handson-ml2/blob/master/INSTALL.md#python--required-libraries

You mention that you checked the certifi package, but did you run this installation step?

/Applications/Python\ 3.7/Install\ Certificates.command

Cheers, Aurélien

ageron avatar Oct 31 '19 06:10 ageron
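
If running the installation step is not an option, one possible workaround is to pass an explicit SSL context to `urllib`. The sketch below assumes the third-party `certifi` package is installed, as mentioned in the question, and falls back to the system defaults when it is not. The function names `make_ssl_context` and `download` are hypothetical and not part of the book's code. This is an alternative under those assumptions, not a replacement for the installation step above.

```python
import shutil
import ssl
import urllib.request


def make_ssl_context():
    """Build a verifying SSL context, preferring certifi's CA bundle if available."""
    try:
        import certifi  # third-party package; assumed to be installed
    except ImportError:
        # Fall back to whatever CA certificates the system provides.
        return ssl.create_default_context()
    return ssl.create_default_context(cafile=certifi.where())


def download(url, dest_path, context=None):
    """Download url to dest_path, verifying the server certificate."""
    if context is None:
        context = make_ssl_context()
    with urllib.request.urlopen(url, context=context) as response:
        with open(dest_path, "wb") as out_file:
            shutil.copyfileobj(response, out_file)
```

Inside `fetch_housing_data()`, a call such as `download(housing_url, tgz_path)` could then stand in for `urllib.request.urlretrieve(housing_url, tgz_path)`. Certificate verification stays enabled in both branches.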

I am very new to Python.

Can somebody help me debug the error below?

I am getting this while fetching the housing dataset:

    req = meth(req)
  File "C:\Users\HP\Anaconda3\envs\tf1\lib\urllib\request.py", line 1249, in do_request_
    raise TypeError(msg)
TypeError: POST data should be bytes, an iterable of bytes, or a file object. It cannot be of type str.

ajayjnv02 avatar Jul 15 '20 16:07 ajayjnv02
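
The post does not include the code that triggered this error, so the cause can only be guessed. `urllib` raises this `TypeError` whenever the `data` argument of a request is a `str`. One way that could happen here (an assumption) is calling `urllib.request.urlopen(housing_url, tgz_path)` instead of `urllib.request.urlretrieve(housing_url, tgz_path)`: the second positional argument of `urlopen` is `data`, not a file name. The sketch below reproduces the message; the URL and the function name `post_data_error` are hypothetical.

```python
import urllib.request

# Hypothetical URL; the check on `data` happens before any connection is made.
URL = "http://example.invalid/housing.tgz"


def post_data_error(data):
    """Return the TypeError message urllib raises for this data, or None."""
    try:
        urllib.request.urlopen(URL, data=data, timeout=5)
    except TypeError as exc:
        return str(exc)
    except OSError:
        # Network failures (URLError is a subclass of OSError) are not the point here.
        return None
    return None


if __name__ == "__main__":
    # A str where bytes are expected, e.g. a file path passed as `data`.
    print(post_data_error("datasets/housing/housing.tgz"))
```

If this matches the failing call, switching back to `urlretrieve(housing_url, tgz_path)` as in the book's code should avoid the error.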