Florian Leitner

Results 35 comments of Florian Leitner

Yes, and it didn't help. But now, after several restarts, and without the old database, that message is gone again. So just an annoying and seemingly not reproducible issue...

Just to potentially mark this as "solved": The confusion/problem arises because OSX comes with its own CRF Suite installed (in `/usr/lib/libCRFSuite.dylib`), and that Apple-hacked version does have said assertion, explaining...

Yes, I agree - sentences in quotes that end in an exclamation or question mark and that are followed by a so-called speech tag should never be split, even if the...

Sadly, not at all. This tool requires punctuation markers to work.

> On 21 Oct 2019, at 11:19, Tortoise17 wrote:
>
> I have plain text without any...

Statistical tagging - but I don’t know where you get the training data from.

> On 21 Oct 2019, at 11:21, Tortoise17 wrote:
>
> any other guide...

Hi @pwichmann - have you seen the `--split-contractions` option here? https://github.com/fnl/segtok/blob/master/segtok/tokenizer.py#L344 Or the public `split_contractions` function to post-process tokens if you are using this programmatically, here? https://github.com/fnl/segtok/blob/master/segtok/tokenizer.py#L122 If you have,...

And you did the setup step to "create" your database, as recommended in Setup (`medic insert --url sqlite:///medline.db 123456`)?

Forgot to add, as per the Setup instructions: And you created the tables with `medic --url sqlite:///medline.db write 123`?
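One way to check whether that setup step actually created anything is to list the tables in the SQLite file directly. This is only a minimal sketch using Python's standard `sqlite3` module; it assumes the database file is `medline.db` in the current directory (as in the URL above), and the helper name `list_tables` is made up for illustration. It makes no assumption about which table names medic creates.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path.

    Note: sqlite3.connect() creates an empty file if db_path does not
    exist yet, so an empty result can also mean "wrong path".
    """
    con = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's built-in schema catalogue.
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        con.close()
    return [name for (name,) in rows]


# Hypothetical usage, with the file name taken from the URL above:
# print(list_tables("medline.db"))
```

If this returns an empty list, the tables were never created (or the path points at a different file than the one medic used).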

Seems very peculiar indeed, then. When I have some time to spare, I can try checking if medic still works with OSX and/or Linux. However, note, I have no access...

Could totally be the case; when I developed and worked with this tool, it was still version 0.8 (see the setup.py).