
Question: blacklining

Open numericlee opened this issue 7 years ago • 3 comments

Thanks for the great tool. Can textacy be used to blackline one string against another? Is there another NLP tool which does this already?

numericlee avatar Jan 23 '18 17:01 numericlee

Hi @numericlee , I don't know what "blackline" means, and I wasn't able to figure it out via web search. Could you clarify?

bdewilde avatar Jan 23 '18 17:01 bdewilde

Blacklining compares two versions of the same document. Generally, the second is a modification of the first.

Blacklining expresses the second document as the base document plus a series of insertions, deletions, and moves. This is a common feature in Microsoft Word.

Base: Through a number of successful wars he expanded the Tsardom into a much larger empire that became a major European power.

Revised: Through a number of wildly successful wars, Peter the Great expanded the Tsardom into a larger empire that became a European major power.

While it doesn't have to be in this format, something like this:

Through a number of [added:wildly] successful wars [deleted:he][added:Peter the Great] expanded the Tsardom into a [deleted:much] larger empire that became a European [moved:major] power.


numericlee avatar Jan 23 '18 23:01 numericlee

Hey @numericlee, sorry to leave you hanging. In short, textacy doesn't have this functionality built in, but it looks like Python does, more or less, via difflib. I've given some thought to how this could (and whether it should) fit into textacy, but haven't had time to really work on it. For now, difflib might be enough to get you started.

bdewilde avatar Feb 02 '18 15:02 bdewilde
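
For anyone landing here later, the difflib suggestion above could be sketched roughly as follows. This is not part of textacy and not code from the thread; the function name `blackline` and the `[added:...]` / `[deleted:...]` tag format (borrowed from the example in the question) are assumptions for illustration. It splits both strings on whitespace and uses `difflib.SequenceMatcher` to compare the word lists.

```python
import difflib


def blackline(base: str, revised: str) -> str:
    """Mark up `revised` against `base` with [added:...] and [deleted:...] tags.

    Hypothetical helper: the name and tag format are illustrative only.
    """
    # Compare word by word; splitting on whitespace is a simplification,
    # so punctuation stays attached to its word ("wars" != "wars,").
    base_words = base.split()
    revised_words = revised.split()
    matcher = difflib.SequenceMatcher(a=base_words, b=revised_words, autojunk=False)

    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.extend(base_words[i1:i2])
        # A "replace" opcode is rendered as a deletion followed by an addition.
        if tag in ("delete", "replace"):
            pieces.append("[deleted:" + " ".join(base_words[i1:i2]) + "]")
        if tag in ("insert", "replace"):
            pieces.append("[added:" + " ".join(revised_words[j1:j2]) + "]")
    return " ".join(pieces)


print(blackline("the quick brown fox", "the very quick fox"))
# prints: the [added:very] quick [deleted:brown] fox
```

Note that difflib has no notion of a move: a relocated word would show up as a deletion in one place and an addition in another, so a `[moved:...]` tag like the one in the question would need extra logic on top of this (for example, pairing up identical deleted and added spans).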