pygrib
pygrib.fromstring entire grib file?
Would it be possible (or difficult) to make fromstring() accept an entire GRIB file instead of a single message?
Not sure what the use case would be - fromstring is usually used to create a grib message object from a binary string. If you have a bunch of binary strings strung together, they could be split up by looking for the grib end section ('7777' when decoded to ascii), and then iterated over.
I would like to stream data from the internet and split the GRIB file without storing it on a file system. I guessed that there must be a message separator, but I did not know about the '7777' string. I tested your suggestion, but without success. Would you mind showing me an example? I can figure out the streaming from a URL, but I don't know how to split the messages. I would expect this line to give me the number of messages, but it does not. What am I doing wrong?
len(open('./nam_218_20160101_0000_001.grb','rb').read().split('7777'))
I think you need to add decode('ascii','ignore'), i.e.
(open('./nam_218_20160101_0000_001.grb','rb').read().decode('ascii','ignore')).split('7777')
If it works, each string should start with 'GRIB' and end with '7777'.
len(open('./nam_218_20160101_0000_001.grb','rb').read().decode('ascii','ignore').split('7777'))
gives a length of 7972, and only 448 of the pieces start with the string 'GRIB'.
Note that you'll have to add the 7777 back to each string, since the split will remove it.
I could not get the string split method to work - turns out grib messages sometimes have '7777' in the body of the message and not just at the end. Here's a script that does work for GRIB2 files (for me at least).
import pygrib, sys, struct
filename = sys.argv[1]
f = open(filename,'rb')
msgs = []
while 1:
    # find next occurrence of string 'GRIB' (or EOF).
    nbyte = f.tell()
    while 1:
        f.seek(nbyte)
        start = f.read(4).decode('ascii','ignore')
        if start == '' or start == 'GRIB': break
        nbyte = nbyte + 1
    if start == '': break # at EOF
    # otherwise, start (='GRIB') is the beginning of the indicator section (section 0)
    startpos = f.tell()-4
    f.seek(4,1) # skip octets 5-8 (reserved, discipline, edition number)
    # octets 9-16 hold the total length of the grib message
    lengrib = struct.unpack('>q',f.read(8))[0]
    # read in entire grib message, append to list.
    f.seek(startpos)
    gribmsg = f.read(lengrib)
    msgs.append(gribmsg)
# convert grib message string to grib message object
for msg in msgs:
    print(pygrib.fromstring(msg))
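To make the offsets in the script above easier to follow, here is a small sketch that unpacks the 16-octet GRIB2 indicator section in one go. The helper name `parse_grib2_indicator` is made up for illustration and is not part of pygrib; the sketch assumes Python 3 and a well-formed GRIB2 header.

```python
import struct

# GRIB2 section 0, big-endian, 16 octets in total:
#   octets 1-4  : the ASCII string 'GRIB'
#   octets 5-6  : reserved
#   octet  7    : discipline
#   octet  8    : edition number (2 for GRIB2)
#   octets 9-16 : total length of the message in octets
SECTION0 = struct.Struct('>4sHBBQ')

def parse_grib2_indicator(header):
    """Hypothetical helper: decode the first 16 octets of a GRIB2 message."""
    magic, _reserved, discipline, edition, total_length = SECTION0.unpack(header[:16])
    if magic != b'GRIB' or edition != 2:
        raise ValueError('not a GRIB2 indicator section')
    return discipline, edition, total_length
```

The `f.seek(4,1)` in the script skips the reserved, discipline and edition octets, so that the following 8-byte read lands on the total length.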
Here's a version that works for both GRIB1 and GRIB2:
import pygrib, sys, struct
filename = sys.argv[1]
f = open(filename,'rb')
msgs = []
while 1:
    # find next occurrence of string 'GRIB' (or EOF).
    nbyte = f.tell()
    while 1:
        f.seek(nbyte)
        start = f.read(4).decode('ascii','ignore')
        if start == '' or start == 'GRIB': break
        nbyte = nbyte + 1
    if start == '': break # at EOF
    # otherwise, start (='GRIB') is the beginning of the indicator section (section 0)
    startpos = f.tell()-4
    f.seek(3,1) # skip octets 5-7 (GRIB2: reserved and discipline; GRIB1: message length)
    # octet 8 is the grib version (edition) number
    vers = struct.unpack('>B',f.read(1))[0]
    # length of grib message
    if vers == 2:
        # GRIB2: octets 9-16 hold the total length
        lengrib = struct.unpack('>q',f.read(8))[0]
    elif vers == 1:
        # GRIB1: octets 5-7 hold the total length (3 bytes, padded to 4)
        f.seek(startpos+4)
        lengrib = struct.unpack('>i',b'\x00'+f.read(3))[0]
    # read in entire grib message, append to list.
    f.seek(startpos)
    gribmsg = f.read(lengrib)
    msgs.append(gribmsg)
# convert grib message string to grib message object
for msg in msgs:
    print(pygrib.fromstring(msg))
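For the in-memory use case asked about earlier (data streamed from a URL, nothing written to disk), the same logic can be applied to a bytes object instead of a file handle. The following is only a sketch under a few assumptions: Python 3, the whole download fits in memory, and messages are GRIB1 or GRIB2 with an ordinary length field. `split_grib_messages` is a hypothetical helper, not a pygrib function. It compares bytes directly rather than decoding to ASCII, and it checks that each candidate message ends in '7777' before accepting it.

```python
import struct

def split_grib_messages(data):
    """Hypothetical helper: split concatenated GRIB1/GRIB2 messages held in bytes."""
    msgs = []
    pos = 0
    while True:
        start = data.find(b'GRIB', pos)
        # Need at least 8 octets to read the edition number.
        if start == -1 or start + 8 > len(data):
            break
        edition = data[start + 7]
        if edition == 2 and start + 16 <= len(data):
            # GRIB2: octets 9-16 hold the total message length.
            (length,) = struct.unpack('>Q', data[start + 8:start + 16])
        elif edition == 1:
            # GRIB1: octets 5-7 hold the total message length.
            (length,) = struct.unpack('>I', b'\x00' + data[start + 4:start + 7])
        else:
            # Not a usable indicator section; keep scanning.
            pos = start + 1
            continue
        end = start + length
        if length < 12 or end > len(data) or data[end - 4:end] != b'7777':
            # Truncated message or a chance 'GRIB' match; keep scanning.
            pos = start + 1
            continue
        msgs.append(data[start:end])
        pos = end
    return msgs
```

The bytes could come from something like `urllib.request.urlopen(url).read()`, and each returned element can then be passed to `pygrib.fromstring()` as in the scripts above.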
I appreciate you taking the time. This answers my question. Thanks!
I finally got time to use this more extensively, but when I compared the results of the method you posted above (where I split the GRIB file in memory) with pygrib.open() on a physical file, I got a different number of messages. In fact, even the message lengths differ. Any idea why?
nope, no idea
@jparal I ran into the same error. I solved it like this:
import sys, struct
f = open('grib','rb')
msgs = []
# determine the file size, then rewind
f.seek(0, 2)
size = f.tell()
f.seek(0)
while 1:
    # find next occurrence of string 'GRIB' (or EOF).
    nbyte = f.tell()
    while 1:
        f.seek(nbyte)
        start = f.read(4).decode('ascii', 'ignore')
        if start == 'GRIB':
            break
        nbyte = nbyte + 1
        if nbyte >= size:
            break
    if nbyte >= size:
        break
    # otherwise, start (='GRIB') is the beginning of the indicator section (section 0)
    startpos = f.tell()-4
    f.seek(3,1) # skip octets 5-7 (GRIB2: reserved and discipline; GRIB1: message length)
    # octet 8 is the grib version (edition) number
    vers = struct.unpack('>B',f.read(1))[0]
    # length of grib message
    if vers == 2:
        lengrib = struct.unpack('>q',f.read(8))[0]
    elif vers == 1:
        f.seek(startpos+4)
        lengrib = struct.unpack('>i', b'\x00'+f.read(3))[0]
    # read in entire grib message, append to list.
    f.seek(startpos)
    gribmsg = f.read(lengrib)
    msgs.append(gribmsg)
The key difference is that I stop when nbyte reaches the file size, rather than when the 4-byte read decodes to an empty string.
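If the message count or lengths still disagree with pygrib.open(), one way to narrow it down is to sanity-check each split message: it should start with 'GRIB', end with '7777', and its declared length should match the number of bytes actually read. The helper below is a hypothetical diagnostic sketch (not part of pygrib) and assumes Python 3 and the `msgs` list of bytes built by the scripts above.

```python
import struct

def describe_message(msg):
    """Hypothetical check: return (edition, declared_length, looks_ok) for one message."""
    if len(msg) < 8 or msg[:4] != b'GRIB':
        return (None, None, False)
    edition = msg[7]
    if edition == 2 and len(msg) >= 16:
        # GRIB2: octets 9-16 hold the total message length.
        (declared,) = struct.unpack('>Q', msg[8:16])
    elif edition == 1:
        # GRIB1: octets 5-7 hold the total message length.
        (declared,) = struct.unpack('>I', b'\x00' + msg[4:7])
    else:
        return (edition, None, False)
    looks_ok = declared == len(msg) and msg[-4:] == b'7777'
    return (edition, declared, looks_ok)
```

Running `[i for i, m in enumerate(msgs) if not describe_message(m)[2]]` lists the indices of messages that fail the check, which is a reasonable place to start comparing against what pygrib.open() reports.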