
bugfix/#78- Remove `File:` comments, not just `Image:` comments.

Open benldr opened this issue 3 years ago • 1 comments

This solves issue #78 (I have re-parsed a Wiktionary dump using the updated code and no longer get the issues explained in #78).

I am assuming that `[[File:` markup on a Wiktionary page is not used anywhere by the jwktl parser (I am not very familiar with the bulk of jwktl). Obviously, if it is used elsewhere, then my code change should not be approved!

Apologies if I have not followed the correct convention for contributing to the project; I am new to this. If so, feel free to close my pull request and make the edit yourself.

benldr avatar Apr 01 '21 20:04 benldr

Hello,

You're right, the parser should remove `File:` tags as well. However, I've just noticed another problem with the image removal: it fails when the image has nested links, for example

[[File:foo.png|thumb|Bla bla [[foo]]  bar]]

Currently, this results in `bar]]` being left in the output.
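
One way to handle the nested case is to count bracket depth instead of stopping at the first `]]`. The following is only a minimal sketch of that idea, not the code in jwktl: the class name `FileLinkStripper` and its methods are hypothetical, and the case-insensitive prefix match is an assumption.

```java
// Hypothetical helper, not part of jwktl: removes [[File:...]] and
// [[Image:...]] links, including ones that contain nested [[...]] links.
public final class FileLinkStripper {

    // Prefixes that mark a link to strip (matched case-insensitively; an assumption).
    private static final String[] PREFIXES = {"[[File:", "[[Image:"};

    private FileLinkStripper() {
    }

    public static String stripFileLinks(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            if (startsWithPrefix(text, i)) {
                int end = findMatchingClose(text, i);
                if (end >= 0) {
                    i = end; // skip the whole link, nested links included
                    continue;
                }
                // Unbalanced brackets: fall through and keep the text as-is.
            }
            out.append(text.charAt(i));
            i++;
        }
        return out.toString();
    }

    private static boolean startsWithPrefix(String text, int pos) {
        for (String prefix : PREFIXES) {
            if (text.regionMatches(true, pos, prefix, 0, prefix.length())) {
                return true;
            }
        }
        return false;
    }

    // Returns the index just past the "]]" that closes the "[[" at start,
    // or -1 if no matching close exists.
    private static int findMatchingClose(String text, int start) {
        int depth = 0;
        int i = start;
        while (i + 1 < text.length()) {
            if (text.startsWith("[[", i)) {
                depth++;
                i += 2;
            } else if (text.startsWith("]]", i)) {
                depth--;
                i += 2;
                if (depth == 0) {
                    return i;
                }
            } else {
                i++;
            }
        }
        return -1;
    }
}
```

With this approach, `FileLinkStripper.stripFileLinks("[[File:foo.png|thumb|Bla bla [[foo]]  bar]]")` yields an empty string, while ordinary links such as `[[foo]]` outside a file link are left untouched.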

jberkel avatar Jan 20 '22 19:01 jberkel