bsd Handling of quotations

How does BSD handle statements inside quotations?

Nov 21 '17 14:11 jpfairbanks

Currently, BSD doesn't do anything special with quoted material.
However, since BSD is engineered and tuned for statement/sentence-level analysis, it would be fairly trivial to extract (or at least flag) quoted material for further analysis, or to ignore it, depending on the use case. I can see a need for both options: journalists or researchers who don't want to penalize an article as biased simply because the quoted material is biased may want to ignore text in quotes, while other researchers may want to explore the delta between the objectivity/bias of the article text versus the quoted material within it, or make some other comparison. Implementation-wise: would it work for most purposes if we just add another routine (something like "isInQuote" for the input sentence) that outputs a boolean to the features dict returned by "extract_bias_features"?
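A minimal sketch of what such a routine might look like. This is not BSD's actual code: the names `is_in_quote` and `add_quote_flag`, the `is_in_quote` key, and the choice to look only for double-quoted spans (straight or curly) are all assumptions made for illustration.

```python
import re

# Assumption: a "quote" is any text between straight or curly double quotes.
QUOTE_RE = re.compile(r'"[^"]+"|\u201c[^\u201d]+\u201d')


def is_in_quote(sentence):
    """Return True if the sentence contains at least one double-quoted span."""
    return bool(QUOTE_RE.search(sentence))


def add_quote_flag(features, sentence):
    """Return a copy of a features dict with a hypothetical 'is_in_quote' key."""
    updated = dict(features)
    updated["is_in_quote"] = is_in_quote(sentence)
    return updated
```

A caller could then filter on the flag to drop quoted sentences, or split on it to compare quoted and unquoted text.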

Nov 21 '17 16:11 cjhutto

Just for the sake of documentation: on our recent call with GV, I proposed a few ways to handle quotes, to start with:

detect quotations and remove them and then do bias scoring
detect quotations and leave them in and then do bias scoring
introduce a weighting for how much the bias scores of quotes affect the overall article-level bias score (e.g., quotes might be given half as much importance as the article's own statements)
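The three options can be sketched as one weighted average, assuming each sentence already has a bias score and a quoted/unquoted flag. The function name `article_bias_score` and its parameters are hypothetical, and treating article-level bias as a mean of sentence scores is an assumption, not necessarily how BSD aggregates.

```python
def article_bias_score(sentence_scores, in_quote_flags, quote_weight=0.5):
    """Weighted mean of sentence bias scores.

    Unquoted sentences get weight 1.0; quoted sentences get `quote_weight`.
    """
    if len(sentence_scores) != len(in_quote_flags):
        raise ValueError("scores and flags must have the same length")
    weights = [quote_weight if quoted else 1.0 for quoted in in_quote_flags]
    total_weight = sum(weights)
    if total_weight == 0:
        # Nothing left to score (e.g. every sentence is quoted and weight is 0).
        return 0.0
    weighted_sum = sum(s * w for s, w in zip(sentence_scores, weights))
    return weighted_sum / total_weight
```

Under these assumptions, `quote_weight=0.0` approximates the first option (quotes removed), `quote_weight=1.0` the second (quotes left in), and `quote_weight=0.5` the half-importance example.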

Yes, agreed: there are different reasons why individuals would or wouldn't want quotes included; we discussed these briefly on the call. I'm running bsd on some articles to discuss with C from GV.

Yes, the "isInQuote" boolean would be useful to have 👍

Nov 21 '17 17:11 scottagt

We also discussed a median quote length feature, which detects adversarial styles such as scare quotes and quotes taken out of context.
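As a rough illustration, median quote length could be computed as below. Measuring length in whitespace-separated words and returning 0.0 when there are no quotes are both assumptions, and the function names are hypothetical.

```python
import re
from statistics import median

# Assumption: quotes are delimited by straight or curly double quotes.
QUOTE_RE = re.compile(r'"([^"]+)"|\u201c([^\u201d]+)\u201d')


def quote_lengths(text):
    """Return the word count of each quoted span in the text."""
    lengths = []
    for match in QUOTE_RE.finditer(text):
        span = match.group(1) or match.group(2)
        lengths.append(len(span.split()))
    return lengths


def median_quote_length(text):
    """Median word count of quoted spans, or 0.0 if the text has no quotes."""
    lengths = quote_lengths(text)
    return float(median(lengths)) if lengths else 0.0
```

A low median would point to many one- or two-word quotes, which is the pattern scare quotes tend to produce.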

Nov 28 '17 22:11 jpfairbanks

How do we want to move forward on this?

Dec 19 '17 16:12 jpfairbanks

I'm adding consideration for quotes in my next update to the features... hope to get it done by Sat night.

Dec 20 '17 17:12 cjhutto

Ok great, let me know if you need anything.

Dec 21 '17 14:12 jpfairbanks

Yes, let us know.

Dec 21 '17 14:12 scottagt

Made some big changes, and I'm continuing to do so as I work through my punch list. One of the changes addresses the request to account for quotes. So far, I've included new features such as has_quotes, mean_quote_length, and mean_nonquote_length.
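For readers following along, here is one way features with these names could be computed. It is a guess at their meaning, not the code from the commit: lengths are counted in words, "nonquote" segments are taken to be the stretches of text between quoted spans, and only straight double quotes are handled.

```python
import re

# Assumption: only straight double quotes delimit a quote in this sketch.
QUOTE_SPAN = r'"[^"]+"'


def _mean_word_count(segments):
    """Mean number of whitespace-separated words per segment (0.0 if empty)."""
    if not segments:
        return 0.0
    return sum(len(seg.split()) for seg in segments) / len(segments)


def quote_features(text):
    """Return a dict with has_quotes, mean_quote_length, mean_nonquote_length."""
    # Quoted spans with their surrounding quote marks stripped.
    quoted = [span[1:-1] for span in re.findall(QUOTE_SPAN, text)]
    # Text left over between quoted spans, ignoring empty pieces.
    nonquoted = [seg for seg in re.split(QUOTE_SPAN, text) if seg.strip()]
    return {
        "has_quotes": bool(quoted),
        "mean_quote_length": _mean_word_count(quoted),
        "mean_nonquote_length": _mean_word_count(nonquoted),
    }
```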

Dec 26 '17 15:12 cjhutto

Is there a PR coming?

Jan 02 '18 16:01 jpfairbanks

I merged and synced - should be there, right?

Jan 03 '18 14:01 cjhutto

@cjhutto it broke the build. Can you describe the changes you made?

Jan 18 '18 18:01 jpfairbanks

This commit deletes all of the ref_lexicons. https://github.com/cjhutto/bsd/commit/f58fabd16f07dfb6e3696f520fde3d150f76aba4

How do you want to handle lexicons going forward?

Jan 18 '18 18:01 jpfairbanks

I patched the setup.py to use the new ref_lexicons.

Feb 01 '18 14:02 jpfairbanks
