clj-rethinkdb
get-all returns Cursor sometimes, PersistentVector others
It looks like when I call get-all with an index specified and include a key in the list that doesn't have any results in the database, I sometimes get a Cursor back and sometimes get a vector.
If I only ask for keys that exist in the index, I always get a vector.
Here's some code to recreate the issue:
(require '[rethinkdb.core :as rc])
(require '[rethinkdb.query :as r])

(defn recreate-bug []
  (let [db-name (-> (str (java.util.UUID/randomUUID))
                    (clojure.string/replace "-" ""))]
    (with-open [c (rc/connect)]
      (r/run (r/db-create db-name) c)
      (r/run (-> (r/db db-name)
                 (r/table-create "example")) c)
      (let [table (-> (r/db db-name)
                      (r/table "example"))]
        (-> (r/index-create table "by-race-id" (r/fn [row] (r/get-field row :race-id)))
            (r/run c))
        (-> (r/insert table {:test "document" :race-id "id"})
            (r/run c))
        (doseq [_ (range 50)]
          (println
           (-> (r/get-all table ["id" "cake"] {:index "by-race-id"})
               (r/run c)
               type)))))))
This produces output like this:
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
clojure.lang.PersistentVector
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
clojure.lang.PersistentVector
rethinkdb.net.Cursor
clojure.lang.PersistentVector
rethinkdb.net.Cursor
clojure.lang.PersistentVector
rethinkdb.net.Cursor
clojure.lang.PersistentVector
clojure.lang.PersistentVector
clojure.lang.PersistentVector
rethinkdb.net.Cursor
clojure.lang.PersistentVector
rethinkdb.net.Cursor
clojure.lang.PersistentVector
clojure.lang.PersistentVector
clojure.lang.PersistentVector
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
clojure.lang.PersistentVector
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
rethinkdb.net.Cursor
Another thing I'm seeing: sometimes I'll do a multiget on a couple of keys with no index, get a cursor back, call "seq" on it, and the call hangs forever. I don't see anything bad in the logs for RethinkDB. Any thoughts here?
Also, what's the reason for the Thread/sleep call in the seq implementation for Cursor?
@sritchie see #68 and follow the links to understand more about this until I have the chance to document it properly. I think that's part of the problem, but there may be more there. RethinkDB distinguishes between lazy seq operations and eagerly loaded vectors. Operations without an index require the dataset to be loaded into memory before returning and come back as an array; operations that can take advantage of the index come back as a seq.
However I think there's something going on in your code I don't understand yet.
The sleep call is to avoid overwhelming the RethinkDB server requesting changefeeds. It's being fixed in the new core.async implementation I'm working on at the moment by upgrading the driver version to v4 which handles changefeeds in a nicer way.
Okay, cool. My big issue is that when I call seq on the cursors they hang forever. (At least for 5 minutes before I killed the thread in the 2-item get-all that I tested out.)
Do you guys have any idea why calling seq causes an infinite hang?
Maybe a better question - is there some way to just prevent Cursors from ever coming back and force all calls to be synchronous?
@sritchie you could call coerce_to (http://rethinkdb.com/api/javascript/coerce_to/), although this isn't necessarily the best option, as it will result in higher memory usage on the server. Take a look at the different types that can be returned in the API (http://rethinkdb.com/api/javascript/) and how they depend on the input type. This should give you an intuition for when you'll get a Cursor or an array.
Now to your race condition bug, I'm 90% certain this is a bug in RethinkDB. I can reproduce this reliably, and this behaviour occurs even after running it on the same table several times (in case the index hadn't been completely built). Do you want to report this to RethinkDB? I'm happy to help.
If you can give me a minimal reproduction case for the seq hanging bug I can take a closer look at it.
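For reference, the coerce_to workaround suggested above might look roughly like this in clj-rethinkdb. This is only a sketch: it assumes `r/coerce-to` takes the query followed by the target type name, mirroring the JavaScript API, and `get-all-as-vector` is a hypothetical helper name, not something the library provides. It needs a running RethinkDB server and reuses the table and index from the reproduction code.

```clojure
(require '[rethinkdb.core :as rc])
(require '[rethinkdb.query :as r])

;; Hypothetical helper (not part of clj-rethinkdb): run a get-all against a
;; secondary index and ask the server to coerce the result to an array, so
;; the driver should hand back a vector instead of a Cursor.
(defn get-all-as-vector [conn table ks index-name]
  (-> (r/get-all table ks {:index index-name})
      (r/coerce-to "array")   ; assumed signature: query, then type name
      (r/run conn)))

;; Usage sketch. db-name stands for the randomly generated database name
;; from the reproduction code; substitute your own database here.
(comment
  (with-open [c (rc/connect)]
    (get-all-as-vector c
                       (-> (r/db db-name) (r/table "example"))
                       ["id" "cake"]
                       "by-race-id")))
```

As noted above, coercing to an array makes the server build the whole result in memory before returning it, so this is probably only reasonable for small lookups like the two-key case here.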
coerce-to seems to work. I'll go with that for now, since I'm only hitting this index for a couple of items. I'd love some help filing a bug with RethinkDB. Is there some way to log the commands we're sending over so we can just report that sequence? I'll follow your lead here.
@sritchie I've opened https://github.com/rethinkdb/rethinkdb/issues/4616 about this.
The full details are in https://github.com/rethinkdb/rethinkdb/issues/4616 and https://github.com/rethinkdb/docs/pull/843, but the short version is that this is a deficiency in the current driver. Both of those results should be returned through a uniform interface. The upcoming work on a core.async API should be helpful for this.
@danielcompton that's a fantastic bug report, and helps me understand much better what's going on. Thanks again for all your help!