Natural Language Processing Suggestions
While it is certainly possible for Chicago to use OpenNLP to build out its own full natural language processing system, I'm not sure that's wise. With the advent of chatbots, NLP is about to make some huge strides, and it looks like progress will mostly be concentrated among Google, Microsoft, Amazon, IBM Watson and Apple. At the moment, Microsoft Cognitive Services and IBM Watson are the most mature, so it seems wisest to use one of those and benefit from the progress they will undoubtedly make. I'm not saying Chicago couldn't build its own NLP, but it almost certainly couldn't improve it at the same rate as the big cloud providers.
Adding some further thoughts on this issue and how it can be tackled:
The advanced query consists of some basic parameters:
- Range of dates
- Selecting data sources (and criteria for each source)
- Select location parameters
Natural Language Processing (e.g., OpenNLP) can identify these principal components of the query.
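To make those components concrete, here is a minimal sketch of how a parsed query might be represented before it is turned into an API call. The class and field names (`SourceFilter`, `AdvancedQuery`, `criteria`, and so on) are assumptions made for illustration and do not come from the existing application.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class SourceFilter:
    """One data source plus the criteria applied to it (hypothetical shape)."""
    dataset: str                                  # e.g. "Crimes", "911p", "Twitter"
    criteria: dict = field(default_factory=dict)  # e.g. {"Primary Description": "Burglaries"}


@dataclass
class AdvancedQuery:
    """Holds the three components: date range, data sources, location."""
    sources: list                      # list of SourceFilter objects
    start_date: Optional[date] = None  # None means no date range was given
    end_date: Optional[date] = None
    location: Optional[dict] = None    # e.g. {"Community Area": "Rogers Park"}


# "911 calls in Rogers Park" expressed in this structure.
query = AdvancedQuery(
    sources=[SourceFilter("911p")],
    location={"Community Area": "Rogers Park"},
)
```

Keeping the date fields optional lets the application decide what to do when a request names no dates.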
Example syntax and the resulting queries.
- 911 calls in Rogers Park
Resulting query: Dataset == 911p AND Community Area == Rogers Park
- Burglaries in Rogers Park
Though similar to the previous example, this can be more complex. Burglaries could correspond to burglaries filed in the Crimes dataset or to 911 calls received about burglaries. We should over-identify and query both datasets.
In the absence of specific dates, the application could rely on our current protocol of displaying a fixed number (e.g., 6,000) of the most recent data points.
Resulting query: (Dataset == Crimes OR Dataset == 911p) WHERE Primary Description == Burglaries AND Community Area == Rogers Park
- Tweets around me
Resulting query: Dataset == Twitter AND geoWithin: {center, ([current location])}
- Tweets about Chicago Bulls on May 20, 2015
Resulting query: Dataset == Twitter WHERE Twitter.text == "Chicago Bulls" AND Date == "2015-05-20"
- Buses around City Hall
Resulting query: Dataset == CTA AND geoWithin: {center, (41.8657, -87.7611)}
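As a rough illustration of the examples above, the sketch below uses plain keyword and regular-expression rules rather than OpenNLP or a cloud NLP service, so it only shows the kind of output a real parser would need to produce. The keyword table, the `parse_query` function, and the output keys are all hypothetical. Place names such as "City Hall" would additionally need a geocoding step that is not shown here.

```python
import re
from datetime import datetime

DEFAULT_LIMIT = 6000  # fixed number of most recent data points when no date is given

# Hypothetical keyword table: phrase -> (datasets to search, extra criteria).
KEYWORDS = {
    "911 calls": (["911p"], {}),
    "burglaries": (["Crimes", "911p"], {"Primary Description": "Burglaries"}),
    "tweets": (["Twitter"], {}),
    "buses": (["CTA"], {}),
}


def parse_query(text):
    """Turn a short request into a dict of query components (illustrative only)."""
    parsed = {"datasets": [], "criteria": {}, "location": None, "date": None}
    lowered = text.lower()
    for phrase, (datasets, criteria) in KEYWORDS.items():
        if phrase in lowered:
            parsed["datasets"].extend(datasets)
            parsed["criteria"].update(criteria)

    # "in <Place>" is read as a community area.
    area = re.search(r"\bin ([A-Z][A-Za-z ]+?)(?= on |$)", text)
    if area:
        parsed["location"] = {"Community Area": area.group(1)}

    # "around me" is read as a radius search centred on the user.
    if re.search(r"\baround me\b", lowered):
        parsed["location"] = {"geoWithin": "current location"}

    # "about <topic>" becomes a text criterion.
    topic = re.search(r"\babout (.+?)(?= on | in |$)", text)
    if topic:
        parsed["criteria"]["text"] = topic.group(1)

    # "on <Month D, YYYY>" becomes an ISO date; otherwise fall back to the limit.
    day = re.search(r"\bon ([A-Z][a-z]+ \d{1,2}, \d{4})", text)
    if day:
        parsed["date"] = datetime.strptime(day.group(1), "%B %d, %Y").date().isoformat()
    else:
        parsed["limit"] = DEFAULT_LIMIT
    return parsed


burglaries = parse_query("Burglaries in Rogers Park")
bulls = parse_query("Tweets about Chicago Bulls on May 20, 2015")
```

Here "burglaries" deliberately maps to both the Crimes and 911p datasets, which reflects the over-identification suggested above. Rules like these are brittle, which is part of the argument for relying on a mature NLP service for the real feature.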
Developing and testing
Testing the NLP feature can be done against the developer API, using the corresponding API docs as a reference.