webstruct

Add date features

Open Kebniss opened this issue 6 years ago • 3 comments

I added the features I created for Fireflax

Kebniss avatar Jan 30 '18 15:01 Kebniss

Codecov Report

Merging #58 into master will increase coverage by 0.13%. The diff coverage is 84.44%.

@@            Coverage Diff             @@
##           master      #58      +/-   ##
==========================================
+ Coverage   81.01%   81.14%   +0.13%     
==========================================
  Files          40       41       +1     
  Lines        2091     2180      +89     
==========================================
+ Hits         1694     1769      +75     
- Misses        397      411      +14

codecov[bot] avatar Jan 30 '18 15:01 codecov[bot]

Uh? Is it complaining because I did not write tests for the new features?

Kebniss avatar Jan 30 '18 15:01 Kebniss

I ran some tests to check how much these features help in identifying date objects, and the results were mixed:

  • When start and end dates were identified by a single entity, the extra features slightly worsened performance: the F1 scores for B-date and I-date moved from 0.567 and 0.628 to 0.548 and 0.611 respectively. Sequence accuracy remained the same.
  • When start and end dates were identified as two separate entities, the extra features slightly improved performance. The F1 score for B-END_DATE moved from 0.591 to 0.625, I-END_DATE went from 0.682 to 0.721, B-START_DATE went from 0.522 to 0.547 and I-START_DATE went from 0.667 to 0.690. Sequence accuracy went from 1.5% to 3.1%.

Scores were evaluated with 3-fold cross-validation on 45 labelled pages, using a CRF model.
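
For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of the metrics involved, not the actual evaluation script. It assumes each page is a list of token labels, and the helper names (`label_f1`, `sequence_accuracy`, `k_fold_indices`) are hypothetical, written in plain Python rather than taken from webstruct or any CRF library. Training the model on each fold is left out.

```python
def label_f1(y_true, y_pred, label):
    """Token-level F1 for a single label (e.g. "B-START_DATE") over all pages."""
    tp = fp = fn = 0
    for true_seq, pred_seq in zip(y_true, y_pred):
        for t, p in zip(true_seq, pred_seq):
            if p == label and t == label:
                tp += 1
            elif p == label:
                fp += 1
            elif t == label:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def sequence_accuracy(y_true, y_pred):
    """Fraction of pages whose label sequence is predicted exactly."""
    if not y_true:
        return 0.0
    exact = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return exact / len(y_true)


def k_fold_indices(n_items, k=3):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_items) if i < start or i >= start + size]
        yield train, test
        start += size


# Toy data (made up for illustration, not taken from the 45 labelled pages).
y_true = [["O", "B-START_DATE", "I-START_DATE"], ["O", "B-END_DATE", "O"]]
y_pred = [["O", "B-START_DATE", "I-START_DATE"], ["O", "O", "O"]]

start_f1 = label_f1(y_true, y_pred, "B-START_DATE")
end_f1 = label_f1(y_true, y_pred, "B-END_DATE")
seq_acc = sequence_accuracy(y_true, y_pred)
folds = list(k_fold_indices(45, k=3))
```

Sequence accuracy only counts a page as correct when every token label matches, which is why it stays very low even when the per-label F1 scores are moderate.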

Kebniss avatar May 03 '18 07:05 Kebniss