Check decoding of files opened in text mode

Open megies opened this issue 8 years ago • 2 comments

The following occurrences in the code base open files for reading in text mode without specifying an encoding. That means the data get decoded with the platform's default encoding (ASCII under a C/POSIX locale), which will fail if the input data can contain non-ASCII characters (e.g. as in #1483).
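A minimal sketch of the failure mode, assuming Python 3. The file name and sample text are made up for illustration, and `encoding="ascii"` is passed explicitly only to simulate a system whose default encoding is ASCII:

```python
import os
import tempfile

# Hypothetical sample data containing non-ASCII characters.
text = "Région Rhône-Alpes\n"

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "names.asc")
    # Write the sample as UTF-8 bytes, as an input file might contain them.
    with open(path, "wb") as fh:
        fh.write(text.encode("utf-8"))

    # Simulate a text-mode open on a system whose default encoding is ASCII.
    try:
        with open(path, "rt", encoding="ascii") as fh:
            fh.read()
        decode_failed = False
    except UnicodeDecodeError:
        decode_failed = True

print("decoding failed:", decode_failed)
```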

It would probably be a good idea to go through these cases and either explicitly specify the encoding or open the file as binary and decode explicitly.
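The two options could look roughly like the sketch below. The helper names `read_text_explicit` and `read_text_via_binary` are hypothetical, and UTF-8 is only an assumed encoding; the right choice depends on each file format:

```python
import io
import os
import tempfile


def read_text_explicit(path, encoding="utf-8"):
    """Option 1: open in text mode with an explicit encoding."""
    # io.open is the same function as the built-in open on Python 3.
    with io.open(path, "rt", encoding=encoding) as fh:
        return fh.read()


def read_text_via_binary(path, encoding="utf-8"):
    """Option 2: open as binary and decode explicitly."""
    with open(path, "rb") as fh:
        return fh.read().decode(encoding)


# Hypothetical sample data with non-ASCII characters.
sample = "Zürich;Kōbe\n"

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "sample.txt")
    with open(path, "wb") as fh:
        fh.write(sample.encode("utf-8"))

    via_text = read_text_explicit(path)
    via_binary = read_text_via_binary(path)
```

One difference to keep in mind: text mode translates `\r\n` line endings to `\n` by default, while the binary variant returns them unchanged, so the two are only interchangeable for files with `\n` line endings.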

geodetics/flinnengdahl.py
88-            self.lons[quad] = lons
89-            self.fenums[quad] = fenums
90-
91:        with open(self.numbers_file, 'rt') as csvfile:
92-            fe_csv = csv.reader(csvfile, delimiter=native_str(';'),
93-                                quotechar=native_str('#'),
94-                                skipinitialspace=True)

io/sh/core.py
133-    .TEST..BHE | 2009-10-01T12:46:01.000000Z - ... | 20.0 Hz, 801 samples
134-    .WET..HHZ  | 2010-01-01T01:01:05.999000Z - ... | 100.0 Hz, 4001 samples
135-    """
136:    fh = open(filename, 'rt')
137-    # read file and split text into channels
138-    channels = []
139-    headers = {}
--
387-            raise IOError(msg % data_file)
388-        fh_data = open(data_file, 'rb')
389-    # loop through read header file
390:    fh = open(filename, 'rt')
391-    line = fh.readline()
392-    cmtlines = int(line[5:7]) - 1
393-    # comment lines

io/gse2/paz.py
62-    zeros = []
63-
64-    if isinstance(paz_file, (str, native_str)):
65:        with open(paz_file, 'rt') as fh:
66-            paz = fh.readlines()
67-    else:
68-        paz = paz_file.readlines()

io/ndk/core.py
98-    if not hasattr(filename, "readline"):
99-        # Check if it exists, otherwise assume its a string.
100-        try:
101:            with open(filename, "rt") as fh:
102-                first_line = fh.readline()
103-        except:
104-            try:
--
156-    if not hasattr(filename, "read"):
157-        # Check if it exists, otherwise assume its a string.
158-        try:
159:            with open(filename, "rt") as fh:
160-                data = fh.read()
161-        except:
162-            try:

io/ascii/core.py
76-    True
77-    """
78-    try:
79:        with open(filename, 'rt') as f:
80-            temp = f.readline()
81-    except:
82-        return False
--
102-    True
103-    """
104-    try:
105:        with open(filename, 'rt') as f:
106-            temp = f.readline()
107-    except:
108-        return False
--
134-    >>> from obspy import read
135-    >>> st = read('/path/to/slist.ascii')
136-    """
137:    with open(filename, 'rt') as fh:
138-        # read file and split text into channels
139-        buf = []
140-        key = False
--
202-    >>> from obspy import read
203-    >>> st = read('/path/to/tspair.ascii')
204-    """
205:    with open(filename, 'rt') as fh:
206-        # read file and split text into channels
207-        buf = []
208-        key = False

clients/arclink/client.py
140-            dcid_key_file = DCID_KEY_FILE
141-        # parse dcid_key_file
142-        try:
143:            with open(dcid_key_file, 'rt') as fp:
144-                lines = fp.readlines()
145-        except:
146-            pass

megies, Jul 29 '16 10:07

I ran into exactly that issue inside a Docker image using Python 2.7.

  File "/external/apps/py-env/lib/python2.7/site-packages/obspy/geodetics/flinnengdahl.py", line 41, in __init__
    with open(self.names_file, 'r') as fh:

Adding the encoding argument to each open(...) call in that file, i.e. open(..., encoding="utf-8"), solved the issue for me.
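To see which encoding an environment such as a minimal Docker image actually picks, something like the following sketch may help. It assumes Python 3, the probe file is a throwaway, and the reported default is normally the locale's preferred encoding:

```python
import locale
import os
import tempfile

# The locale's preferred encoding, which text-mode open() normally
# falls back to when no encoding is given.
default_encoding = locale.getpreferredencoding(False)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "probe.txt")
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("probe\n")

    # No encoding given: the file object reports what was chosen implicitly.
    with open(path, "rt") as fh:
        implicit_encoding = fh.encoding

    # Encoding given: the choice no longer depends on the environment.
    with open(path, "rt", encoding="utf-8") as fh:
        explicit_encoding = fh.encoding

print(default_encoding, implicit_encoding, explicit_encoding)
```

If the first two values name an ASCII variant, any text-mode open without an explicit encoding will fail on non-ASCII input in that environment.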

jourdain, Jan 10 '19 23:01

This is probably obsolete on master.

megies, Oct 08 '20 08:10