couchdb-lucene
CouchDB 2.1.x and Lucene
Hi, I've been trying to get couchdb-lucene running and have hit a bit of a brick wall. I've followed the README, built couchdb-lucene and have it running, and added the relevant _fti handler to the correct section of the CouchDB local.ini (it shows in Fauxton). However, after setting up the required index in CouchDB I get no indexing activity (the Lucene indexes folder is empty), and if I hit the _fti endpoint I get back the standard CouchDB error {"error":"not_found","reason":"Database does not exist."}. This suggests the proxying isn't working (the Lucene logs are also empty). Here is some system info:
$ java -version
openjdk version "9.0.4"
OpenJDK Runtime Environment (build 9.0.4+12-Ubuntu-2ubuntu3)
OpenJDK 64-Bit Server VM (build 9.0.4+12-Ubuntu-2ubuntu3, mixed mode)
$ curl "http://localhost:5985"
{"couchdb-lucene":"Welcome","version":"2.1.0"}
{
"_id": "_design/search",
"fulltext": {
"by_title": {
"index": "function(doc) { var ret = null; if (doc.title) { ret = new Document(); ret.add(doc.title, {\"field\":\"title\", \"store\":\"yes\" }); return ret; } }"
}
}
}
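Because the index function is stored as an escaped string, it is hard to read in the design doc. Here is a minimal sketch of the same logic as plain JavaScript. The Document class below is only a local stand-in (an assumption, not the real class couchdb-lucene supplies), and the name byTitle is hypothetical; this version also returns null explicitly for docs without a title.

```javascript
// Local stand-in for the Document class that couchdb-lucene makes available
// to index functions. This stub is an assumption, for testing outside Lucene.
class Document {
  constructor() {
    this.fields = [];
  }
  add(value, options) {
    this.fields.push({ value, options });
  }
}

// Same logic as the "by_title" index function, formatted for readability.
function byTitle(doc) {
  if (!doc.title) {
    return null; // nothing to index for docs without a title
  }
  const ret = new Document();
  ret.add(doc.title, { field: "title", store: "yes" });
  return ret;
}
```

Running the function against a sample doc locally is a quick way to catch typos before saving the design doc.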
After saving a few docs in CouchDB, trying to hit the following endpoint:
curl http://****:****@localhost:5984/_fti/local/testDb/_design/search/by_title?q=captain
The CouchDB doc I would expect to be indexed is:
{
"_id": "a32629854547e9e3b50bdd75fd00cff7",
"_rev": "4-b2d552ac9ff8968189a3660dec987697",
"title": "Captain Jack Sparrow",
"slug": "captain-jack-sparrow",
"tags": [
"captain",
"jack",
"sparrow"
]
}
As far as I can tell everything is in order. Before I destroy everything and start again, I was wondering if there is something silly that I've missed?
Thanks.
Ahhhh, now I see. After digging into some of the other issues, I now get that with 2.x you have to query couchdb-lucene directly with something like:
curl http://localhost:5985/local/testDb/_design/search/by_title?q=title:captain
...which then gets me back...
{
"q": "title:captain",
"fetch_duration": 0,
"total_rows": 3,
"limit": 25,
"search_duration": 0,
"etag": "5633589417da1",
"skip": 0,
"rows": [
{
"score": 0.9384178519248962,
"id": "a32629854547e9e3b50bdd75fd00df8c",
"fields": {
"title": "Captain Hector Barbossa"
}
},
{
"score": 0.9384178519248962,
"id": "a32629854547e9e3b50bdd75fd00cff7",
"fields": {
"title": "Captain Jack Sparrow"
}
},
{
"score": 0.810458242893219,
"id": "a32629854547e9e3b50bdd75fd00ebff",
"fields": {
"title": "Captain John \"Calico Jack\" Rackham"
}
}
]
}
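For anyone scripting this, here is a rough sketch of building the direct couchdb-lucene query URL. The function name and parameter names are hypothetical; the host, port 5985, and the "local" key are taken from the curl command in this comment and may differ in your setup.

```javascript
// Build a URL that queries couchdb-lucene directly (not via CouchDB's port).
// All parameter names here are illustrative assumptions.
function directQueryUrl({ host = "localhost", port = 5985, key = "local", db, ddoc, index, q }) {
  const path = [key, db, "_design", ddoc, index].map(encodeURIComponent).join("/");
  const query = new URLSearchParams({ q }).toString(); // percent-encodes the ":" in the query
  return `http://${host}:${port}/${path}?${query}`;
}

console.log(directQueryUrl({ db: "testDb", ddoc: "search", index: "by_title", q: "title:captain" }));
```

The query string comes out percent-encoded (title%3Acaptain), which is equivalent to the unencoded form used with curl.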
I think it would be well worth adding a note to the README to put this detail in an easier-to-find place; it would have saved me a lot of time.
Still a very cool CouchDB addition. :+1:
Could you please send me your config?
I can't make it work:
http://localhost:5985/local/databaseName/_design/posts/by_content?q=this
but it returns to me:
{"q":"","fetch_duration":0,"total_rows":0,"limit":25,"search_duration":0,"etag":"d4a463039c2","skip":0,"rows":[]}
@huyentk happy to help... can you post the content of the view by_content?
@Crispy1975 Oh, I had a typo in my view, so it didn't work. After I fixed it, it works. Thank you so much!
@Crispy1975: Thank you for the help with the curl that needs to be used for lucene in 2.0. I've tried it, and it seems none of the fields are getting indexed in my case.
An example doc:
{
"_id": "939ff7c2a8asfkj908wgsgdgggdgg",
"_rev": "1-1a5adca8b3c229esf8ste7sgs7ts",
"sex": "M",
"age": 55.8,
"subject_id": "fsfj35355",
"subject_comment": "This is a test user.",
"type": "subject"
}
{
"_id": "_design/foo",
"_rev": "6-1b45904c9ce4325d27049e10102a441e",
"fulltext": {
"by_subject": {
"index": "function(doc) {var ret=new Document();ret.add(doc.subject_id,{'field':'subject_id', 'store':'yes'}); ret.add(doc.type,{'field':'type', 'store':'yes'}); return ret; }"
}
}
}
curl localhost:5985/local/dbName/_design/foo/by_subject
returns zero doc_count and empty fields:
{"current":true,"doc_count":0,"digest":"3063p3oazx5p237dqdgqtpnen","update_seq":"start","disk_size":0,"doc_del_count":0,"fields":[],"ref_count":2,"uuid":"18bd1
Not sure what I'm missing here.
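A small sketch of checking for this symptom in a script, assuming the index info response has already been fetched and parsed as JSON. The function name is hypothetical, and it only looks at the doc_count and fields keys shown in the response above.

```javascript
// Returns true when the index info reports no documents and no fields,
// which is the "nothing got indexed" symptom described above.
function looksUnindexed(info) {
  const fields = Array.isArray(info.fields) ? info.fields : [];
  return info.doc_count === 0 && fields.length === 0;
}

console.log(looksUnindexed({ current: true, doc_count: 0, fields: [] }));
```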
It was my bad: I had installed the earlier version of couchdb-lucene, and obviously it didn't work. Sorry. Here is the link to the CouchDB 2.0 compatible couchdb-lucene 2.0.0 release: https://github.com/rnewson/couchdb-lucene/releases/tag/v2.0.0 Thanks for all the help through the issues and comments.
I had the same problem and Crispy1975's post provided the solution. If, from CouchDB 2.0 onwards, the URL has the format http://localhost:5985/local/testDb/_design/search/by_title?q=title:captain then this should be added to the README. It would have saved me an hour. Should I make the edit and submit a PR for this?
@leforestier a PR would be welcome. It's not so much that the URL has changed, it's that you now have to call couchdb-lucene directly. The various schemes for proxying through CouchDB 2.0 no longer work.
Thank Google, I found this thread. I was tearing my hair out.
The README is downright wrong, actually. It states that the URL format is:
http://localhost:5984/_fti/local/dbname/_design/foo/view_name?q=field_name:value
Port 5984, of course, is the default port of CouchDB!!! I bet 9 out of 10 people would not know that we need to query Lucene directly when seeing 5984.
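To make the difference concrete, here is a rough sketch that rewrites a README-style _fti URL into the direct form used earlier in this thread. The function name is hypothetical, and 5985 is assumed as the couchdb-lucene port because that is what the commands in this thread use.

```javascript
// Turn a proxied-style URL (CouchDB port, /_fti prefix) into a direct
// couchdb-lucene URL. The default port 5985 is an assumption from this thread.
function toDirectUrl(ftiUrl, lucenePort = 5985) {
  const url = new URL(ftiUrl);
  if (!url.pathname.startsWith("/_fti/")) {
    throw new Error("not an _fti URL: " + ftiUrl);
  }
  url.pathname = url.pathname.slice("/_fti".length); // drop the proxy prefix
  url.port = String(lucenePort);                     // talk to couchdb-lucene itself
  return url.toString();
}

console.log(toDirectUrl("http://localhost:5984/_fti/local/dbname/_design/foo/view_name?q=field_name:value"));
```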