
Cannot reproduce the results of the paper

Open: xikakera opened this issue 3 years ago • 1 comment

The output of the kdd branch still has the same issue that was addressed in #25.

The raw data file raw_train.json is missing from the code, so the re-train process cannot be run.

output files:

results_remine/remine_result.txt

1	man | have , | medical
1	man | have , | neglect
1	man | have , | month
1	health | | Home
1	health | | operation
1	risk of Minnesota | | state
2	people | have , | Conakry
2	people | have , | capital
2	people | have , | Hospital
3	Democrat | | Nebraska
4	something | | for
4	New | by , , , | something
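
To make the problem in the output above easier to quantify, here is a minimal sketch that parses lines of this form and lists the triples whose relation field is empty. It assumes each line is `sentence_id<whitespace>subject | relation | object`, as in the sample above; the names `Triple`, `parse_result_line`, and `empty_relation_triples` are hypothetical helpers and are not part of ReMine.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    sentence_id: int
    subject: str
    relation: str
    obj: str


def parse_result_line(line: str) -> Triple:
    """Parse one line, assuming 'id<whitespace>subject | relation | object'."""
    # Split the leading sentence id from the rest of the line.
    sent_id, rest = line.strip().split(None, 1)
    # Assumption: exactly two '|' separators per line, as in the pasted output.
    subject, relation, obj = (part.strip() for part in rest.split("|"))
    return Triple(int(sent_id), subject, relation, obj)


def empty_relation_triples(lines: List[str]) -> List[Triple]:
    """Return the triples whose relation field is empty."""
    triples = [parse_result_line(line) for line in lines if line.strip()]
    return [t for t in triples if not t.relation]


if __name__ == "__main__":
    # A few lines copied from results_remine/remine_result.txt above.
    sample = [
        "1\tman | have , | medical",
        "1\thealth | | Home",
        "3\tDemocrat | | Nebraska",
    ]
    for triple in empty_relation_triples(sample):
        print(triple)
```

Running it over the full result file (for example with `open("results_remine/remine_result.txt").read().splitlines()`) would show how many extractions have no relation phrase at all.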

results_remine/remine_segmentation.txt

Gov. :RP]_[Tim :EP]_[Pawlenty of :BP]_[Minnesota :BP]_[order :EP]_[the :RP]_[state :EP]_[health :EP]_[department :EP]_[this :EP]_[month :EP]_[to :EP]_[monitor]_[day-to-day :BP]_[operation :EP]_[at :EP]_[the :RP]_[Minneapolis :BP]_[Veterans :EP]_[Home :EP]_[after :EP]_[state :EP]_[inspector]_[find :RP]_[that :RP]_[three :EP]_[man :EP]_[have :RP]_[die :EP]_[there :EP]_[in :EP]_[the :RP]_[previous :EP]_[month :EP]_[because :BP]_[of :EP]_[neglect :BP]_[or :EP]_[medical :BP]_[error :RP]_[. :RP]_[
the :RP]_[aid :RP]_[group]_[doctor :EP]_[without :EP]_[border :EP]_[say :BP]_[that :RP]_[since :RP]_[Saturday :BP]_[, :RP]_[more :RP]_[than :EP]_[275 :EP]_[wounded :BP]_[people :EP]_[have :RP]_[be]_[admit :BP]_[and :EP]_[treat :EP]_[at :EP]_[Donka]_[Hospital :EP]_[in :EP]_[the :RP]_[capital :EP]_[of :EP]_[Guinea :BP]_[, :RP]_[Conakry :BP]_[. :RP]_[
the :RP]_[american :BP]_[people :EP]_[can :EP]_[see :EP]_[what :EP]_[be]_[happen :BP]_[here :RP]_[, :RP]_[say :BP]_[Senator :BP]_[Ben :BP]_[Nelson]_[, :RP]_[Democrat :BP]_[of :EP]_[Nebraska :BP]_[. :RP]_[
for :BP]_[million :RP]_[, :RP]_[it :EP]_[be]_[a :EP]_[tough :BP]_[day :EP]_[of :EP]_[coping :BP]_[-- :BP]_[of :EP]_[watch :EP]_[floodwater :BP]_[pour]_[into :BP]_[home :EP]_[on :EP]_[the :RP]_[Raritan :BP]_[River :BP]_[in :EP]_[New :EP]_[Jersey :BP]_[from :RP]_[New :EP]_[Brunswick]_[to :EP]_[bind :EP]_[Brook :BP]_[and :EP]_[in :EP]_[the :RP]_[Westchester]_[suburb]_[of :EP]_[Mamaroneck :EP]_[and :EP]_[New :EP]_[Rochelle]_[, :RP]_[of :EP]_[sleep :EP]_[in :EP]_[a :EP]_[shelter :BP]_[or :EP]_[a :EP]_[airport :EP]_[, :RP]_[of :EP]_[tow :BP]_[a :EP]_[car :BP]_[and :EP]_[watch :EP]_[a :EP]_[refrigerator]_[float]_[by :RP]_[, :RP]_[of :EP]_[get :EP]_[to :EP]_[work]_[despite :EP]_[flood :BP]_[road :BP]_[and :EP]_[erratic :BP]_[train :BP]_[, :RP]_[of :EP]_[wait :BP]_[for :BP]_[power :BP]_[or :EP]_[a :EP]_[water :RP]_[pump :EP]_[or :EP]_[just :BP]_[something :BP]_[to :EP]_[hope :BP]_[for :BP]_[. :RP]_[

tmp_remine/remine_tokenized_segmented_sentences.txt

1	5663| 141 , | 983
1	5663| 141 , | 44243
1	5663| 141 , | 2668
1	1931| | 18561
1	1931| | 5053
1	2 3 72917| | 1519
2	3245| 141 , | 127954
2	3245| 141 , | 7358
2	3245| 141 , | 3303
3	60075| | 51787
4	18944| | 70
4	1269| 319 , 24 , | 18944
4	1269| | 21073
4	1269| | 89926

xikakera, Jul 15 '21 12:07

Maybe pre_train/segmentation.model is an outdated file.

Re-training the model needs the raw data file, or at least a description of what the raw data looks like.

Thanks!

xikakera, Jul 15 '21 19:07