striplog
natural language processing for lithological descriptions
I saw in the SciPy 2015 video that there were ideas for using NLP to parse lithological descriptions, and I would be interested in that feature. I did not find an existing issue about it, so I am opening one.
To give a bit of context, I am currently developing this package for exploratory lithology analysis. I use striplog for some of it (abbreviations) and do "my own" regex-based analysis for lithology classification. I reckon this classification capability belongs in the striplog package rather than in mine.
If using striplog e.g. with
striplog.Component.from_text(u'black clay with layers of fine sands', lex)
I obtain:
| attr | val |
|---|---|
| colour | black |
| lithology | clay |
| grainsize | fine |
Here the sands are not captured as a separate component, and "fine", which describes the sands, ends up attached to the clay. More sophisticated regular expressions may alleviate such issues, but NLP is probably worth exploring these days (disclaimer: I am far from an NLP guru). I am trying to find project resources (e.g. an intern) to work on this. I am doing a bit of literature research, but I would value guidance or feedback (best case, something already implemented).
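To make the "more sophisticated regular expressions" idea concrete, here is a minimal sketch that splits a description into clauses before looking for attributes, so that each clause yields its own component. It uses only the standard `re` module and does not touch striplog. The vocabularies, the connective pattern, and the names `parse_description` and `SPLIT_RE` are all hypothetical, made up for illustration.

```python
import re

# Hypothetical, tiny vocabularies for illustration only (not striplog's lexicon).
COLOURS = ["black", "brown", "grey", "red"]
LITHOLOGIES = ["clay", "sand", "silt", "gravel"]
GRAINSIZES = ["fine", "medium", "coarse"]

# Assumption: these connectives introduce a secondary component.
SPLIT_RE = re.compile(r"\s+(?:with(?:\s+layers\s+of)?|and)\s+", re.IGNORECASE)


def _first_match(words, text):
    """Return the first vocabulary word found in text (plural 's' allowed)."""
    pattern = r"\b(" + "|".join(words) + r")s?\b"
    match = re.search(pattern, text, re.IGNORECASE)
    return match.group(1).lower() if match else None


def parse_description(text):
    """Split a description into clauses and extract attributes per clause."""
    components = []
    for clause in SPLIT_RE.split(text.strip()):
        attrs = {
            "colour": _first_match(COLOURS, clause),
            "lithology": _first_match(LITHOLOGIES, clause),
            "grainsize": _first_match(GRAINSIZES, clause),
        }
        # Keep only the attributes that were actually found.
        component = {key: val for key, val in attrs.items() if val is not None}
        if component:
            components.append(component)
    return components


components = parse_description("black clay with layers of fine sands")
for component in components:
    print(component)
```

With this approach "fine" stays with the sands rather than the clay, but it clearly breaks down on anything the connective list does not anticipate, which is where a proper NLP approach (e.g. dependency parsing) might do better.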