berkeley-doc-summarizer

Alex

Open alex-bloom opened this issue 7 years ago • 1 comment

The joint model (COREF+NER+WIKI) of the Berkeley Entity Resolution System combines the output for all input documents (e.g. government.txt and music.txt) into a single file, output.conll. The output produced by the other models, meanwhile, does not exactly match the test files in the Berkeley Document Summarizer (e.g. the last two columns of government.txt are off). I would appreciate a clarification of the assumed data interface between the Berkeley Entity Resolution System and the Berkeley Document Summarizer.

alex-bloom, Mar 01 '17 00:03

Greg clarified that the utility class edu.berkeley.nlp.entity.preprocess.ConllDocSharder can be used to split the combined output.conll into one file per document.

alex-bloom, Mar 01 '17 18:03
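
For readers who want to see what such a split amounts to, here is a minimal Python sketch. It is not the ConllDocSharder implementation, and its behavior may differ. It assumes the combined file uses the usual CoNLL shared-task markers, `#begin document (<name>); part <NNN>` and `#end document`, to delimit documents. The function name `split_conll` is hypothetical.

```python
import re
from collections import OrderedDict

# Assumed document header, as in CoNLL-2011/2012 style files:
#   #begin document (government); part 000
BEGIN = re.compile(r"^#begin document \((.*)\); part (\d+)")


def split_conll(text):
    """Group the lines of a combined CoNLL file by (document name, part).

    Returns an ordered mapping from (name, part) to the list of lines
    belonging to that document, including its begin/end marker lines.
    Lines that fall outside any document are dropped.
    """
    docs = OrderedDict()
    current = None
    for line in text.splitlines():
        match = BEGIN.match(line)
        if match:
            # A new document starts here; open a fresh bucket for it.
            current = (match.group(1), int(match.group(2)))
            docs[current] = [line]
        elif current is not None:
            docs[current].append(line)
            if line.startswith("#end document"):
                # Close the current document.
                current = None
    return docs
```

Each value can then be joined with newlines and written to its own file, for example one file per document name, if that is the layout the downstream tool expects.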