Andy Halterman

Results 45 issues of Andy Halterman

Occasionally something like "USAGOVMILGOV" will get coded. We want only one of each code. This can be done pretty easily in postprocess.py. A more sophisticated (and less pressing) improvement would...
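
As a rough illustration of the deduplication this issue asks for, here is a minimal sketch. It assumes a coded actor string is a plain concatenation of three-letter codes, so "USAGOVMILGOV" splits into USA, GOV, MIL, GOV. The function name `dedupe_codes` and the `width` parameter are hypothetical and are not taken from postprocess.py.

```python
def dedupe_codes(actor_code, width=3):
    """Drop repeated fixed-width chunks, keeping the first occurrence of each.

    Assumption: the code string is a concatenation of `width`-letter codes.
    """
    # Split the string into consecutive chunks of `width` characters.
    chunks = [actor_code[i:i + width] for i in range(0, len(actor_code), width)]
    seen = set()
    kept = []
    for chunk in chunks:
        if chunk not in seen:
            seen.add(chunk)
            kept.append(chunk)
    return "".join(kept)


print(dedupe_codes("USAGOVMILGOV"))  # prints USAGOVMIL
```

Order is preserved so the leading country code stays first. Whether every repeated code is really redundant is a judgment call the sketch does not address.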

Going through the Levant data I've seen a lot of sentences like "Talks begin _here_ today regarding....". The "here" is easy to resolve if you can see the dateline, but that's been lost...

enhancement

The functions that @philip-schrodt wrote to extract verb and actor phrases from coded sentences need documentation and unit tests.

Petrarch2's code should be distinct from the dictionaries it uses. To make changes to the dictionaries more visible and to make it easier to switch in custom dictionaries, take the...

@philip-schrodt is working on adding support in Petrarch2 for the entity coreference information that CoreNLP outputs. This should increase yield for later sentences in a story. Feel free to add...

enhancement

Goose seems to be getting lots of junk mixed in with the story text proper. I have no idea how to fix this. Many, many stories across various VOA regions...

bug

The WN sources seem to be getting links to things that aren't news. Here's one, for instance, that's definitely a Wikipedia page: 53a4e750421aa95075fcf436 **Title**: Mayor Katz address to Winnipeggers from...

Cut out cruft and make it readable enough that someone else might be able to read it and figure out how contributions work.

In many cases, it's silly to emphasize the country of origin of the paper (e.g. Al Jazeera). What we could do (though it would entail a lot of re-organization) is to...

enhancement

Now that UP is a little more stable, we need to start thinking about making it usable in production pipelines. In order for people (specifically the Spanish and Arabic teams)...

critical