Difference between `EclSumKeyWordVector` and `eclsum.keys()`

Open anders-kiaer opened this issue 3 years ago • 2 comments

```python
from ecl.summary import EclSum, EclSumKeyWordVector

eclsum = EclSum("./SOME.UNSMRY", lazy_load=False)

# Keys as seen through EclSumKeyWordVector vs. through EclSum.keys()
column_names1 = list(EclSumKeyWordVector(eclsum, add_keywords=True))
column_names2 = list(eclsum.keys())

# Total and unique counts for each
print(len(column_names1))
print(len(set(column_names1)))

print(len(column_names2))
print(len(set(column_names2)))
```

gives on a test model:

```
11260
10731
12895
12895
```

I.e. `EclSumKeyWordVector` gives fewer vectors than `eclsum.keys()` (11260 vs. 12895), and fewer still once duplicate keys are removed (10731 unique) 🤔

It looks like `EclSumKeyWordVector` is the one used in the `pandas.DataFrame` export function, while the `keys()` function is used in the CSV export function:

https://github.com/equinor/ecl/blob/3d6d17e1470419fc5af0c891560fb2211b2eda0d/python/ecl/summary/ecl_sum.py#L469-L525
https://github.com/equinor/ecl/blob/3d6d17e1470419fc5af0c891560fb2211b2eda0d/python/ecl/summary/ecl_sum.py#L1515-L1535

anders-kiaer, Jun 22 '21 08:06

We investigated this a bit further:

  • The vectors in `.keys()` but not in `EclSumKeyWordVector` end with `:INTEGER`, where `INTEGER` is probably an integer representing the total grid cell index. The corresponding vectors in the `:I,J,K` format are in both.
  • The duplicated vectors in `EclSumKeyWordVector` appear to be due to the user requesting them twice in the SUMMARY section of the Eclipse model.
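
For anyone who wants to repeat this check on their own model, here is a minimal sketch of the comparison. It works on plain lists of key strings, so it runs without `ecl` installed. The helper name `compare_key_lists` and the example keys are made up for illustration. The `:INTEGER` pattern is only a heuristic, since other key types that legitimately end in an integer would also match.

```python
import re
from collections import Counter

# Heuristic: key ends with ":<integer>" (e.g. "BPR:1234"),
# as opposed to an I,J,K triple (e.g. "BPR:10,20,3").
INTEGER_SUFFIX = re.compile(r":\d+$")


def compare_key_lists(vector_keys, all_keys):
    """Summarise how two lists of summary keys differ (hypothetical helper)."""
    counts = Counter(vector_keys)
    only_in_all = sorted(set(all_keys) - set(vector_keys))
    return {
        # Keys that occur more than once in the first list
        "duplicated": sorted(k for k, n in counts.items() if n > 1),
        # Keys present in one list but missing from the other
        "only_in_all": only_in_all,
        "only_in_vector": sorted(set(vector_keys) - set(all_keys)),
        # Subset of the missing keys that end with ":INTEGER"
        "integer_suffixed": [k for k in only_in_all if INTEGER_SUFFIX.search(k)],
    }


# Made-up keys mimicking the pattern described above
vector_keys = ["FOPT", "WOPR:OP_1", "WOPR:OP_1", "BPR:10,20,3"]
all_keys = ["FOPT", "WOPR:OP_1", "BPR:10,20,3", "BPR:1234"]

report = compare_key_lists(vector_keys, all_keys)
for name, keys in report.items():
    print(name, keys)
```

With a real summary file, the two inputs would be `list(EclSumKeyWordVector(eclsum, add_keywords=True))` and `list(eclsum.keys())` from the snippet in the issue description.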

anders-kiaer, Jun 22 '21 10:06

Thanks for reporting @anders-kiaer :)

markusdregi, Jun 24 '21 05:06