couchdb-lucene
issues when indexing a hash of hashes
I have a couchdb doc with the following attribute (a hash containing another hash):
"subscriptions": { "2999": { "subscription_type_id": 1, "source_keyword_id": 23629 }, "1668": { "subscription_type_id": 1, "source_keyword_id": 3099 } }
My fulltext search function always returns undefined for the inner hash (i.e. I can't access doc.subscriptions["2999"].subscription_type_id). Here is the relevant part:
if (doc.subscriptions && doc.subscriptions != null) {
    for (var sc in doc.subscriptions) {
        var subscription_campaign_id = parseInt(sc);
        if (doc.subscriptions[sc] != null) {
            log.info('subscription_type_id: ' + doc.subscriptions[sc].subscription_type_id);
            result.add(subscription_campaign_id, {'field': 'subscription_campaign_id', 'type': 'long'});
            result.add(doc.subscriptions[sc].subscription_type_id, {'field': 'subscription_type_id', 'type': 'long'});
            result.add(doc.subscriptions[sc].source_keyword_id, {'field': 'source_keyword_id', 'type': 'long'});
        } else {
            log.info('doc.subscriptions[sc] is null');
        }
    }
}
This same code works fine in plain JavaScript, so it seems like the problem is somewhere in couchdb-lucene. Let me know if you would like the full doc and search function.
Thanks, Dave
Anything in the logs? I threw a simple test together to see if nested structures can be emitted, and it passes:
@Test
public void testNested() throws Exception {
    final String fun = "function(doc) { var ret = new Document(); ret.add(doc.foo[\"bar\"]); return ret; }";
    final DocumentConverter converter = new DocumentConverter(context, view(fun));
    final Document[] result = converter.convert(
            doc("{_id:\"hi\", foo: { bar: \"baz\"}}"), settings(), null);
    assertThat(result.length, is(1));
    assertThat(result[0].get("default"), is("baz"));
}
No, just my log statements:
2010-10-26 16:41:30,386 INFO [lucene_test] Indexing from update_seq 0
2010-10-26 16:41:30,454 INFO [JSLog] doc.subscriptions[sc] is null
2010-10-26 16:41:30,455 INFO [JSLog] doc.subscriptions[sc] is null
2010-10-26 16:41:45,406 INFO [lucene_test] View[digest=a0ww492t5g4b1vhy7i2mhpl9w] now at update_seq 21
It seems to be because your keys are numbers.
That seems to be it. I'm actually saving the keys as strings (and the source from CouchDB shows them as strings), so
"subscriptions": { "a2999": { "subscription_type_id": 1, "source_keyword_id": 23629 }, "b1668": { "subscription_type_id": 1, "source_keyword_id": 3099 } }
works, but
"subscriptions": { "2999": { "subscription_type_id": 1, "source_keyword_id": 23629 }, "1668": { "subscription_type_id": 1, "source_keyword_id": 3099 } }
doesn't. When I log typeof(key) for the "2999" key, it is reported as a string. Any idea where it is getting converted back to an int?
This is where the JSON string from CouchDB is converted to an object in couchdb-lucene:
final JSONObject json = JSONObject.fromObject(line);
I'll check tomorrow, but perhaps this is where the string is being converted to an int?
The JSON library has changed recently; is this still an issue?
I also see the issue with a hash whose keys are numbers (though presented as strings in CouchDB). Note that CouchApps work perfectly well with this, so the problem is isolated to couchdb-lucene, probably 'lost in translation' between CouchDB, Java, and JavaScript. I am using a fairly recent Git clone (less than a week old).
I'm experiencing this as well with the current 0.10.0 snapshot from Git. I can iterate over objects with digit-containing property names, and log.info(typeof(key)) reports that the property names are in fact strings. Trying to actually access those properties produces errors like:

WARN [test] foo caused TypeError: Cannot read property "length" from undefined (unnamed script#3)
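For anyone who wants a minimal reproduction, an index function along these lines (just a sketch, run against a doc shaped like the one in the original report) is enough to hit it:

function (doc) {
    var ret = new Document();
    if (doc.subscriptions) {
        for (var key in doc.subscriptions) {
            // for-in hands back the key as a string...
            log.info('key: ' + key + ' (' + typeof key + ')');
            // ...but the lookup comes back undefined when the key is all digits
            log.info('value: ' + doc.subscriptions[key]);
        }
    }
    return ret;
}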
I also discovered a workaround; at the start of your index function, add:
doc = JSON.parse(JSON.stringify(doc));
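Applied to the original function, the workaround would look something like this (a sketch; the fields and the result Document follow the snippet from the original report):

function (doc) {
    // Round-trip the doc through JSON so nested objects with all-digit
    // keys become plain JavaScript objects that can be read normally.
    doc = JSON.parse(JSON.stringify(doc));
    var result = new Document();
    if (doc.subscriptions) {
        for (var sc in doc.subscriptions) {
            result.add(parseInt(sc), {'field': 'subscription_campaign_id', 'type': 'long'});
            result.add(doc.subscriptions[sc].subscription_type_id, {'field': 'subscription_type_id', 'type': 'long'});
            result.add(doc.subscriptions[sc].source_keyword_id, {'field': 'source_keyword_id', 'type': 'long'});
        }
    }
    return result;
}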
Having exactly the same issue with numeric keys: it works perfectly with text keys, but when it comes to numbers, the values are inaccessible.
@dylantack you are the man! It's an ugly workaround, but far more elegant than anything else we could do. Thanks for sharing.