ReMine
Cannot reproduce the results of the paper
The output of the kdd branch still has the same issue that was addressed in #25.
The raw data file raw_train.json is missing from the code, so the re-training process cannot be run.
Output files:
results_remine/remine_result.txt
1 man | have , | medical
1 man | have , | neglect
1 man | have , | month
1 health | | Home
1 health | | operation
1 risk of Minnesota | | state
2 people | have , | Conakry
2 people | have , | capital
2 people | have , | Hospital
3 Democrat | | Nebraska
4 something | | for
4 New | by , , , | something
results_remine/remine_segmentation.txt
Gov. :RP]_[Tim :EP]_[Pawlenty of :BP]_[Minnesota :BP]_[order :EP]_[the :RP]_[state :EP]_[health :EP]_[department :EP]_[this :EP]_[month :EP]_[to :EP]_[monitor]_[day-to-day :BP]_[operation :EP]_[at :EP]_[the :RP]_[Minneapolis :BP]_[Veterans :EP]_[Home :EP]_[after :EP]_[state :EP]_[inspector]_[find :RP]_[that :RP]_[three :EP]_[man :EP]_[have :RP]_[die :EP]_[there :EP]_[in :EP]_[the :RP]_[previous :EP]_[month :EP]_[because :BP]_[of :EP]_[neglect :BP]_[or :EP]_[medical :BP]_[error :RP]_[. :RP]_[
the :RP]_[aid :RP]_[group]_[doctor :EP]_[without :EP]_[border :EP]_[say :BP]_[that :RP]_[since :RP]_[Saturday :BP]_[, :RP]_[more :RP]_[than :EP]_[275 :EP]_[wounded :BP]_[people :EP]_[have :RP]_[be]_[admit :BP]_[and :EP]_[treat :EP]_[at :EP]_[Donka]_[Hospital :EP]_[in :EP]_[the :RP]_[capital :EP]_[of :EP]_[Guinea :BP]_[, :RP]_[Conakry :BP]_[. :RP]_[
the :RP]_[american :BP]_[people :EP]_[can :EP]_[see :EP]_[what :EP]_[be]_[happen :BP]_[here :RP]_[, :RP]_[say :BP]_[Senator :BP]_[Ben :BP]_[Nelson]_[, :RP]_[Democrat :BP]_[of :EP]_[Nebraska :BP]_[. :RP]_[
for :BP]_[million :RP]_[, :RP]_[it :EP]_[be]_[a :EP]_[tough :BP]_[day :EP]_[of :EP]_[coping :BP]_[-- :BP]_[of :EP]_[watch :EP]_[floodwater :BP]_[pour]_[into :BP]_[home :EP]_[on :EP]_[the :RP]_[Raritan :BP]_[River :BP]_[in :EP]_[New :EP]_[Jersey :BP]_[from :RP]_[New :EP]_[Brunswick]_[to :EP]_[bind :EP]_[Brook :BP]_[and :EP]_[in :EP]_[the :RP]_[Westchester]_[suburb]_[of :EP]_[Mamaroneck :EP]_[and :EP]_[New :EP]_[Rochelle]_[, :RP]_[of :EP]_[sleep :EP]_[in :EP]_[a :EP]_[shelter :BP]_[or :EP]_[a :EP]_[airport :EP]_[, :RP]_[of :EP]_[tow :BP]_[a :EP]_[car :BP]_[and :EP]_[watch :EP]_[a :EP]_[refrigerator]_[float]_[by :RP]_[, :RP]_[of :EP]_[get :EP]_[to :EP]_[work]_[despite :EP]_[flood :BP]_[road :BP]_[and :EP]_[erratic :BP]_[train :BP]_[, :RP]_[of :EP]_[wait :BP]_[for :BP]_[power :BP]_[or :EP]_[a :EP]_[water :RP]_[pump :EP]_[or :EP]_[just :BP]_[something :BP]_[to :EP]_[hope :BP]_[for :BP]_[. :RP]_[
tmp_remine/remine_tokenized_segmented_sentences.txt
1 5663| 141 , | 983
1 5663| 141 , | 44243
1 5663| 141 , | 2668
1 1931| | 18561
1 1931| | 5053
1 2 3 72917| | 1519
2 3245| 141 , | 127954
2 3245| 141 , | 7358
2 3245| 141 , | 3303
3 60075| | 51787
4 18944| | 70
4 1269| 319 , 24 , | 18944
4 1269| | 21073
4 1269| | 89926
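For anyone who wants to inspect the output in bulk, here is a minimal sketch that parses lines of results_remine/remine_result.txt. It assumes each line follows the `sentence_id subject | relation | object` layout seen in the sample above. The names `Triple`, `parse_line`, and `empty_relations` are hypothetical helpers written for this post, not part of ReMine.

```python
from typing import Iterable, List, NamedTuple


class Triple(NamedTuple):
    sent_id: int
    subject: str
    relation: str
    obj: str


def parse_line(line: str) -> Triple:
    # Assumed layout: "<sentence id> <subject> | <relation> | <object>"
    left, relation, obj = (part.strip() for part in line.split("|", 2))
    # The first whitespace-separated token is the sentence id; the rest is the subject.
    sent_id, subject = left.split(maxsplit=1)
    return Triple(int(sent_id), subject, relation, obj)


def empty_relations(lines: Iterable[str]) -> List[Triple]:
    # Keep only the triples whose relation field is empty.
    return [t for t in map(parse_line, lines) if not t.relation]


# Lines copied from the remine_result.txt sample above.
sample = [
    "1 man | have , | medical",
    "1 health | | Home",
    "4 New | by , , , | something",
]

for triple in empty_relations(sample):
    print(triple)
```

Running the same helper over the full file (one line per list entry) lists every triple with an empty relation field, which makes such lines easy to count.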
Maybe pre_train/segmentation.model is an outdated file.
Re-training the model requires the raw data file, or at least a description of what the raw data looks like.
Thanks!