python-for-data-and-media-communication-gitbook
Network issue - HTTPS certificate
The first two issues behave differently on different operating systems. The third is an encoding problem.
load csv from GitHub
Two Mac users cannot load the CSV with the following code, while two Windows users can.
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/hupili/python-for-data-and-media-communication/master/text-analysis/regular_reader_tweets.csv')
print('The length of df is {}'.format(len(df)))
df.head()
Error message:
<urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:833)>
Python version: both 3.6.5
Requests version: both 2.19.1
Wi-Fi environment: both CVA 808
import stopwords from nltk
Three Windows users can set up the stopwords with nltk, while two Mac users cannot.
import nltk
from nltk.corpus import stopwords
nltk.download('stopwords')
stopwords = stopwords.words('english')
Error message:
[nltk_data] Error loading stopwords: <urlopen error [SSL:
[nltk_data] CERTIFICATE_VERIFY_FAILED] certificate verify failed
[nltk_data] (_ssl.c:833)>
nltk version: both 3.4
import list of files by os.listdir with encoding error
You can refer here.
CERTIFICATE_VERIFY_FAILED is a common bug signature in CVA. Please run more tests and pin down the scope of the problem. Once confirmed, please add it to the FAQ.
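To help pin down the scope, each tester could run something like the sketch below and post the output. It uses only the standard library. The helper names (is_cert_failure, probe, environment_report) are made up for illustration, and the assumption is that comparing the OpenSSL version and CA bundle paths between Mac and Windows machines on the same network will show where they differ.

```python
import platform
import ssl
import sys
import urllib.request


def is_cert_failure(exc):
    """Return True if the exception text carries the CERTIFICATE_VERIFY_FAILED signature."""
    text = "{} {}".format(exc, getattr(exc, "reason", ""))
    return "CERTIFICATE_VERIFY_FAILED" in text


def probe(url, timeout=10):
    """Try one HTTPS request and classify the outcome (hypothetical helper)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return "ok (HTTP {})".format(resp.status)
    except OSError as exc:  # URLError, SSL errors and socket timeouts are all OSError
        if is_cert_failure(exc):
            return "certificate verification failed"
        return "other error: {}".format(exc)


def environment_report():
    """Collect the facts worth comparing between machines."""
    paths = ssl.get_default_verify_paths()
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "openssl": ssl.OPENSSL_VERSION,
        "cafile": paths.cafile,
        "capath": paths.capath,
    }


for key, value in environment_report().items():
    print("{}: {}".format(key, value))
```

Calling probe() with the raw.githubusercontent.com URL from the first issue, once on the CVA Wi-Fi and once on another network, would show whether the network is the only factor.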
The third problem has nothing to do with os.listdir. It is related to opening a file with open in "text mode" (the default) when the file's encoding is not the same as the system default encoding. You need to produce a minimal viable example, i.e. show that opening that single file leads to the error (even if you don't use listdir).
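A minimal example along those lines might look like the sketch below. The sample file and the try_read_text helper are invented for illustration and are not the file from the original report. The file is written as UTF-8 bytes, then opened in text mode with the system default encoding, with an explicit utf-8, and with ascii to force a mismatch.

```python
import locale
import os
import tempfile


def try_read_text(path, encoding=None):
    """Open a single file in text mode and report whether decoding succeeds."""
    try:
        with open(path, encoding=encoding) as f:
            f.read()
        return "ok"
    except UnicodeDecodeError as exc:
        return "UnicodeDecodeError: {}".format(exc.reason)


# Write a small UTF-8 file containing non-ASCII characters (illustrative data).
sample_path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(sample_path, "wb") as f:
    f.write("caf\u00e9 \u6570\u636e".encode("utf-8"))

# encoding=None means "use the system default", which differs between machines.
print("system default encoding:", locale.getpreferredencoding(False))
print("default:", try_read_text(sample_path))
print("utf-8  :", try_read_text(sample_path, encoding="utf-8"))
print("ascii  :", try_read_text(sample_path, encoding="ascii"))
```

No os.listdir call is involved. Whether the "default" line fails depends on the machine's default encoding, which is why the same code can work for one user and fail for another.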
After tests by two more Windows users and one more Mac user, the results are still the same.
Which issue do you refer to? There are three issues in OP.
The first two issues.
@ChicoXYC what is the conclusion? Is CVA the only factor leading to the error?
The first two issues are related to the network in CVA and to the different protocols used by the different operating systems, and may need further discussion if they come up again. The os.listdir issue is related to encoding and decoding.