superfastmatch
/document/N/M/ format differs from previous API
Here is an example of the same document loaded into the old and new versions of superfastmatch and associated with itself, using these commands:
echo "Lorem ipsum dolor sit amet, consectetur adipiscing elit." > /tmp/loremipsum.txt
curl --data-urlencode "title=Lorem ipsum" --data-urlencode "text@/tmp/loremipsum.txt" http://localhost:8080/document/1/1/
curl -XPOST http://localhost:8080/association/1/1/
curl http://localhost:8080/document/1/1/ | python -m json.tool
Old format:
{
    "text": "Lorem ipsum dolor sit amet, consectetur adipiscing elit.\n",
    "title": "Lorem ipsum",
    "doctype": 1,
    "docid": 1,
    "characters": 57,
    "documents": {
        "rows": [],
        "metaData": {
            "fields": []
        }
    },
    "success": true
}
New format:
{
    "associations": {
        "Documents": [
            {
                "fragment_count": 1,
                "fragments": [
                    [
                        0,
                        0,
                        55,
                        239133750739
                    ]
                ],
                "valid": true,
                "characters": 57,
                "title": "Lorem ipsum",
                "id": {
                    "docid": 1,
                    "doctype": 1
                }
            }
        ],
        "Meta": {}
    },
    "valid": true,
    "characters": 57,
    "text": "Lorem ipsum dolor sit amet, consectetur adipiscing elit.\n",
    "title": "Lorem ipsum",
    "id": {
        "docid": 1,
        "doctype": 1
    }
}
A re-implementation of the old API would be best for compatibility.
- Move "doctype" and "docid" out of the "id" object, drop the "id" key
- Either drop or document the new "valid" key. Is this the same as the old "success" key?
- Capitalization changes, e.g. documents vs Documents
- Can we drop the new "associations" key that wraps the "Documents" key?
- What is the empty "Meta" object for? Is this supposed to be the old "metaData" object under "documents"?
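Until the old format is restored server-side, a client-side shim along these lines could cover the points above. This is only a sketch: `flatten_id` and `to_old_format` are hypothetical helpers, not part of superfastmatch. Mapping "valid" to "success" and "Meta" to "metaData" assumes the answers to the open questions above are yes, and since the old output here only shows an empty "rows" list, association entries are passed through with just their "id" flattened.

```python
import copy


def flatten_id(obj):
    """Return a copy of obj with the keys of its "id" object hoisted to the top level."""
    out = {k: v for k, v in obj.items() if k != "id"}
    out.update(obj.get("id", {}))
    return out


def to_old_format(new_doc):
    """Reshape a new-style /document/N/M/ response to resemble the old one."""
    old = flatten_id(new_doc)
    # ASSUMPTION: "valid" is the successor of the old "success" key.
    old["success"] = old.pop("valid", False)
    assoc = old.pop("associations", {})
    old["documents"] = {
        # ASSUMPTION: the old row layout is unknown; entries are kept as-is
        # apart from flattening their "id" object.
        "rows": [flatten_id(copy.deepcopy(d)) for d in assoc.get("Documents", [])],
        # ASSUMPTION: "Meta" corresponds to the old "metaData" object.
        "metaData": {"fields": [], **assoc.get("Meta", {})},
    }
    return old
```

Calling `to_old_format` on the parsed new-format response above would give top-level "doctype", "docid" and "success" keys and a lowercase "documents" wrapper, without the "id" and "associations" keys.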
It seems the old version didn't associate a document with itself, while the new one does. Since the full overlap of a document with itself can be assumed, the old behavior makes more sense to me.
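As a stopgap on the client side, the self-association could be filtered out of a new-format response before it is used. A minimal sketch, assuming the response has already been parsed into a dict; `drop_self_association` is a hypothetical helper name, and it relies on each association entry carrying the same "id" object as the document itself, as in the output above.

```python
def drop_self_association(doc):
    """Return a copy of a new-style response without the document's own entry."""
    own_id = doc.get("id")
    assoc = doc.get("associations", {})
    # Keep only associated documents whose (doctype, docid) differs from our own.
    others = [d for d in assoc.get("Documents", []) if d.get("id") != own_id]
    return {**doc, "associations": {**assoc, "Documents": others}}
```

The input dict is left untouched; only the returned copy has the shortened "Documents" list.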