lasio
Read files with lots of blank lines in the data section
Describe the bug
Can't read in this .las file. This is an open source .las file from Oklahoma.
To Reproduce
Steps to reproduce the behavior: using the latest version of lasio (v0.29) on Windows.
The .las file is zipped: 1051985022.zip
It looks like the lines are double spaced, which seems to be causing a problem when it gets down to the ~A (data) section.
You might "just" need to strip those out first, write out a new file, and then it seems to work.
import lasio

forig = '1051985022.las'
with open(forig) as f:
    lines = f.readlines()

lines = lines[::2]  # keep every other line

with open('1051985022_cleaned.las', 'w') as f:
    for line in lines:
        f.write(f'{line}\n')

las = lasio.LASFile('1051985022_cleaned.las')
Would it make sense to strip out every blank line, or are those needed for other formatting reasons?
Check that. What I've done above will run, but the curves aren't being parsed correctly. I'll poke around some more.
Just reposting here the comments I made in the welly-and-lasio channel on Software Underground slack.
" ... this particular file is ugly. There are blank lines between the lines with text, but you only need to remove the blank lines in the data section. I've removed them in the header section as well, and unfortunately it's not every other line throughout the file. It shifts by one at the start of the ~A data, so striding over the whole file won't work. Beware of obi-wan!
This little code snippet will do what you need for this one well. It relies on finding the position of the ~A line in the file, but it gets you what you need.
I wonder if lasio could detect this and potentially handle it gracefully?
import lasio

def find_data_start(lines):
    """Return the index of the line just after the ~A (data) marker."""
    for i, line in enumerate(lines):
        if line.startswith('~A'):
            return i + 1

fname = '1051985022.las'
with open(fname) as f:
    lines = f.readlines()

# The blank-line pattern shifts by one at ~A, so stride each part separately.
start = find_data_start(lines)
stack1 = lines[:start:2]  # cleaned header
stack2 = lines[start::2]  # cleaned data
stack = stack1 + stack2

with open('1051985022_cleaned.las', 'w') as f:
    for line in stack:
        f.write(line)

las = lasio.LASFile('1051985022_cleaned.las')
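A stride-free alternative, as a minimal sketch: because the blank lines shift by one at the ~A marker, it may be simpler to drop every whitespace-only line instead of taking every other line. `drop_blank_lines` is a hypothetical helper, not part of lasio, and it assumes blank lines carry no meaning anywhere in this particular file. The sample text is made up to mimic the double-spaced shape of the problem file.

```python
def drop_blank_lines(text):
    """Return text with whitespace-only lines removed (hypothetical helper)."""
    kept = [line for line in text.splitlines() if line.strip()]
    return "\n".join(kept) + "\n"


# Made-up sample with the same double-spaced shape as the problem file.
sample = "~Version\n\n VERS. 2.0 :\n\n~A\n\n249.0 0.0574\n\n249.5 0.0630\n"
cleaned = drop_blank_lines(sample)
print(cleaned)
```

The cleaned text could then be written to a new file and opened with lasio.LASFile, as in the snippet above.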
Thanks for raising this issue and the example code. I'll take a look, it definitely should be readable.
I looked into this file today. Here are my findings from working with the file on a Mac:
- Opening the file with TextEdit, there are the extra blank lines.
- Opening the file in Vim, there aren't any blank lines, but there is a Windows carriage-return ^M character at the end of each line.
- With the following script, the file opens okay and I think it handles the header and data sections properly:
import lasio
las = lasio.LASFile("tmp/1051985022.las", index_unit='ft')
# Print the number of columns in a data row
# Output: 33
print(len(las.data[1]))
# Print number of data rows
# Output: 6657
print(len(las.data))
# Print the first two values of the last data row
# Output: array([3577. , 2004.0742])
las.data[-1][0:2]
- If someone could check whether the script has the same (or different) behavior on Windows, that will clarify if Lasio needs changes to handle the extra ^M characters when running on Windows.
- Also, if it succeeds in Lasio, it seems like the LAS object in Welly should work, if Welly is working with the lasio.LASFile() object, since the ^M characters should have been filtered out by Lasio's parsing of the file.
- Was there an error message when attempting to read the file?
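One way to check what is actually at the end of each line is to read the file as raw bytes and tally the line-ending sequences. The sketch below is only a diagnostic idea: `count_line_endings` is a hypothetical helper, and the sample bytes are made up, not taken from the real file. For context, Python's default text mode translates a lone `\r` into a line break as well as `\r\n`, so a `\r\r\n` sequence (if a file contained one) would read back as two line breaks, i.e. an extra blank line.

```python
import re
from collections import Counter


def count_line_endings(raw):
    """Count line-ending styles in raw bytes (hypothetical diagnostic helper)."""
    # Longest alternatives first so '\r\r\n' is not counted as '\r' plus '\r\n'.
    endings = re.findall(rb"\r\r\n|\r\n|\r|\n", raw)
    return Counter(endings)


# Made-up bytes for illustration, not taken from the real file.
sample = b"~A\r\r\n249.0 0.0574\r\r\n249.5 0.0630\r\n"
counts = count_line_endings(sample)
print(counts)
```

On a real file, the bytes would come from something like `open(fname, 'rb').read()`.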
@ThomasMGeo, @EvanBianco,
I continued to look into this today and have a couple of additional findings:
- There is a Lasio test case for a file with extra ^M characters in it:
pytest tests/test_open_file.py::test_open_url_different_newlines
This test passes on both Ubuntu and Windows in the GitHub Actions tests.
- I tried to duplicate the related Welly issue, https://github.com/agilescientific/welly/issues/220, by running this Welly script with Welly 0.5.2 and Lasio 0.30 on a macOS machine:
import lasio
from welly import Well, Project
#--------------------------------------------------------------------------------------
mywell = Well.from_las("tmp/1051985022.las", index_unit='ft')
gr = mywell.data['GR']
print(mywell.df())
Below is the resulting output. It indicates there are 6657 lines of data and that is the same number of data lines in the file.
TENS SPHI SP RXRT RXO RT90 RT60 RT30 RT20 ... GR DT DRHO DPHS DPHI DPHD DLIM CT90 CALI
DEPT ...
249.0 NaN 0.0574 125.5235 NaN NaN NaN NaN NaN NaN ... NaN 55.7208 NaN NaN NaN NaN NaN NaN NaN
249.5 NaN 0.0630 126.1887 NaN NaN NaN NaN NaN NaN ... NaN 56.5073 NaN NaN NaN NaN NaN NaN NaN
250.0 1365.0989 0.0638 127.0521 0.0 1999.9999 1999.9999 1999.9999 1999.9999 1999.9999 ... 37.7590 56.6280 NaN NaN NaN NaN NaN 0.5 8.1087
250.5 1363.3073 0.0610 127.5533 0.0 1999.9999 1999.9999 1999.9999 1999.9999 1999.9999 ... 41.9990 56.2275 NaN NaN NaN NaN NaN 0.5 8.1017
251.0 1362.9364 0.0471 127.7967 0.0 1999.9999 1999.9999 1999.9999 1999.9999 1999.9999 ... 46.8416 54.2585 NaN NaN NaN NaN NaN 0.5 8.1003
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
3575.0 2610.8960 NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN
3575.5 2517.2781 NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN
3576.0 2314.6697 NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN
3576.5 2151.5366 NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN
3577.0 2004.0742 NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN
[6657 rows x 32 columns]
Given these findings, it seems possible that the extra blank lines caused by the extra ^M characters aren't an issue for parsing this file with either Welly or Lasio. Maybe there is a different issue with the file. @ThomasMGeo, what was the error message when trying to bring the file into Welly/Lasio?
Thanks! DC
Hi DC, it was on a Windows machine. If it works for you now, feel free to close. I will look into it next week.
Best, Thomas
@ThomasMGeo, thanks for the update. If you will look into it next week, let's leave it open for your updates. If there is an issue, it will be good to resolve it.
This seems fixed after testing today! Thanks again. I am going to close it now, and will make a new issue if it comes up.
Thanks, -TM
@ThomasMGeo, good. Thank you for retesting and for the update! :-)