couchdb-lucene

issues when indexing hash of hashes

Open sakrafd opened this issue 14 years ago • 9 comments

I have a couchdb doc with the following attribute (a hash containing another hash):

"subscriptions": { "2999": { "subscription_type_id": 1, "source_keyword_id": 23629 }, "1668": { "subscription_type_id": 1, "source_keyword_id": 3099 } }

My fulltext search function always returns undefined for the inner hash (i.e. I can't access doc.subscriptions["2999"].subscription_type_id). Here is the relevant part:

        if (doc.subscriptions) {
          for (var sc in doc.subscriptions) {
            // Keys arrive as strings; parse with an explicit radix.
            var subscription_campaign_id = parseInt(sc, 10);
            if (doc.subscriptions[sc] != null) {
              log.info('subscription_type_id ' + doc.subscriptions[sc].subscription_type_id);
              result.add(subscription_campaign_id, {'field':'subscription_campaign_id', 'type':'long'});
              result.add(doc.subscriptions[sc].subscription_type_id, {'field':'subscription_type_id', 'type':'long'});
              result.add(doc.subscriptions[sc].source_keyword_id, {'field':'source_keyword_id', 'type':'long'});
            } else {
              log.info('doc.subscriptions[sc] is null');
            }
          }
        }

This same code works fine in plain JavaScript, so the problem seems to be somewhere in couchdb-lucene. Let me know if you would like the full doc and search function.

Thanks, Dave

sakrafd avatar Oct 26 '10 22:10 sakrafd

Anything in the logs? I threw a simple test together to see if nested structures can be emitted, and it passes:

@Test
public void testNested() throws Exception {
    final String fun = "function(doc) { var ret = new Document(); ret.add(doc.foo[\"bar\"]); return ret; }";
    final DocumentConverter converter = new DocumentConverter(context,
            view(fun));
    final Document[] result = converter.convert(
            doc("{_id:\"hi\", foo: { bar: \"baz\"}}"), settings(), null);
    assertThat(result.length, is(1));
    assertThat(result[0].get("default"), is("baz"));
}

rnewson avatar Oct 26 '10 22:10 rnewson

No, just my log statements:

2010-10-26 16:41:30,386 INFO [lucene_test] Indexing from update_seq 0
2010-10-26 16:41:30,454 INFO [JSLog] doc.subscriptions[sc] is null
2010-10-26 16:41:30,455 INFO [JSLog] doc.subscriptions[sc] is null
2010-10-26 16:41:45,406 INFO [lucene_test] View[digest=a0ww492t5g4b1vhy7i2mhpl9w] now at update_seq 21

sakrafd avatar Oct 26 '10 22:10 sakrafd

It seems to be because your keys are numbers.

rnewson avatar Oct 26 '10 22:10 rnewson

That seems to be it. I'm actually saving the keys as strings (and the source from CouchDB shows them as strings), so

"subscriptions": { "a2999": { "subscription_type_id": 1, "source_keyword_id": 23629 }, "b1668": { "subscription_type_id": 1, "source_keyword_id": 3099 } }

works, but

"subscriptions": { "2999": { "subscription_type_id": 1, "source_keyword_id": 23629 }, "1668": { "subscription_type_id": 1, "source_keyword_id": 3099 } }

doesn't. When I log typeof(key) for "2999", it reports a string. Any idea where it is getting converted back to an int?
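For comparison, here is a minimal sketch of the behaviour expected from a plain JavaScript engine, using the same document shape as above. It only shows the baseline that couchdb-lucene deviates from and does not reproduce the bug itself; the variable names are illustrative.

```javascript
// Same shape as the failing document: digit-only keys, stored as strings.
var doc = {
  subscriptions: {
    "2999": { subscription_type_id: 1, source_keyword_id: 23629 },
    "1668": { subscription_type_id: 1, source_keyword_id: 3099 }
  }
};

// Collect what a for...in loop sees for each key.
var seen = [];
for (var key in doc.subscriptions) {
  seen.push({
    key: key,
    keyType: typeof key, // property names are always strings in for...in
    typeId: doc.subscriptions[key].subscription_type_id
  });
}

console.log(seen);
```

In a standard engine every `keyType` is `"string"` and the inner hash is reachable through `doc.subscriptions[key]`, which is what the index function above relies on.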

sakrafd avatar Oct 26 '10 23:10 sakrafd

This is where the JSON string from CouchDB is converted to an object in couchdb-lucene:

final JSONObject json = JSONObject.fromObject(line);

I'll check tomorrow, but perhaps this is where the string is being converted to an int?

rnewson avatar Oct 27 '10 01:10 rnewson

The JSON library has changed recently. Is this still an issue?

rnewson avatar Feb 14 '11 09:02 rnewson

I also see the issue with a hash whose keys are numbers (though presented as strings in CouchDB). Note that couchapps work perfectly well with this, so the problem is isolated to couchdb-lucene, probably 'lost in translation' between Couch, Java, and JavaScript. I am using a fairly recent git clone (less than a week old).

raniglas avatar Apr 11 '11 13:04 raniglas

I'm experiencing this as well with the current 0.10.0 snapshot from Git. I can iterate over objects with digit-containing properties, and log.info(typeof(key)) reports the property names are in fact strings. Trying to actually access those properties produces errors like WARN [test] foo caused TypeError: Cannot read property "length" from undefined (unnamed script#3).

I also discovered a workaround. At the start of your index function, add: doc = JSON.parse(JSON.stringify(doc));
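A minimal sketch of where the workaround line goes, assuming an index function shaped like the one in the original report. The `Document` constructor below is a hypothetical stand-in for the object couchdb-lucene supplies to index functions, included only so the snippet runs under plain Node.js, where the bug itself does not reproduce:

```javascript
// Hypothetical stand-in for couchdb-lucene's Document, for illustration only.
function Document() {
  this.fields = [];
}
Document.prototype.add = function (value, options) {
  this.fields.push({ value: value, options: options });
};

// Index function with the workaround applied as its first statement.
function index(doc) {
  // Round-trip through JSON so the rest of the function sees a plain object.
  doc = JSON.parse(JSON.stringify(doc));

  var result = new Document();
  for (var sc in doc.subscriptions) {
    var sub = doc.subscriptions[sc];
    if (sub == null) {
      continue; // skip entries with no inner hash
    }
    result.add(parseInt(sc, 10), { field: 'subscription_campaign_id', type: 'long' });
    result.add(sub.subscription_type_id, { field: 'subscription_type_id', type: 'long' });
    result.add(sub.source_keyword_id, { field: 'source_keyword_id', type: 'long' });
  }
  return result;
}

// Sample document taken from the original report.
var sample = {
  subscriptions: {
    "2999": { subscription_type_id: 1, source_keyword_id: 23629 },
    "1668": { subscription_type_id: 1, source_keyword_id: 3099 }
  }
};
var indexed = index(sample);
console.log(indexed.fields.length);
```

Under couchdb-lucene only the body of `index` would be used. The round-trip costs one extra serialization per document.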

dylantack avatar Feb 02 '13 07:02 dylantack

Having exactly the same issue with numeric indexes: it works perfectly with text, but when the keys are numbers, the values are inaccessible.

@dylantack you're the man! It's an ugly workaround, but far more elegant than anything else we could do. Thanks for sharing.

iby avatar Apr 09 '13 19:04 iby