yrb icon indicating copy to clipboard operation
yrb copied to clipboard

Hash stored in map causes panick, type inconsistency between JS and Ruby

Open wkirby opened this issue 1 year ago • 17 comments

We are attempting to modify the same ydoc in the browser using yjs, and on our server using yrb. I have a pretty simple setup client side (this is simplified further for the sake of this report, but captures our use case fairly):

const ydoc = new Y.Doc();
const localBinder = bindImmerYjs(ydoc.getMap("data"));

localBinder.update((s) => {
  return {
    testString: "foo",
    testInt: 1,
    testFloat: 3.1415,
    testBool: false,
    testArray: [1, "two", true],
    testHash: { a: 1, b: "2", c: false }
  }
})

This ydoc is encoded as a uint8array, base64'd, then sent to the server where it's saved to a file store. In ruby, we load this content and try to access these values. This is mostly working. Roughly, this looks like:

ydata = Base64.decode64(stored_data)
ydoc = Y::Doc.new
ydoc.transact { |tx| tx.apply(ydata.bytes.to_a) }

ymap = ydoc.get_map("data")

All that is working, but this is where we run into problems:

# Works as expected
ymap[:testString] # => "foo"
ymap[:testBool] # => false
ymap[:testFloat] # => 3.1415

# Works close enough, but still wrong
ymap[:testInt] # => 1.0 (notice this is a Float now)

# Panick!
ymap[:testArray]
ymap[:testHash]

These Panicks are identical, producing:

thread '<unnamed>' panicked at /bundle/ruby/3.1.0/gems/y-rb-0.5.2/ext/yrb/.rb-sys/stable/cargo/registry/src/index.crates.io-6f17d22bba15001f/yrs-0.16.9/src/doc.rs:775:41:
called `Option::unwrap()` on a `None` value

There's an obvious workaround here where we can encode our hashes and arrays as a JSON string, but if we're going to do that, we might as well encode the whole map as a JSON string instead of as a map in the first place.

Note that this is not a problem when we decode these values on the client side.

wkirby avatar Oct 13 '23 21:10 wkirby

@eliias should I open this as an issue upstream with yrs?

wkirby avatar Oct 18 '23 20:10 wkirby

@eliias should I open this as an issue upstream with yrs?

@wkirby I am traveling atm. I can take a look next week. If it's really an upstream issue I'll move the issue.

eliias avatar Oct 18 '23 21:10 eliias

@wkirby I recall having a discussion w/ the other yrs folks about support for nested “complex” data structures (like Array, and Object). I need to check if it's technically possible to do that as long as you know what you do (there will be no conflict-free replication for nested objects). Text editors usually utilize the XML-like structure to achieve proper replication with nested data-structures.

Works close enough, but still wrong

ymap[:testInt] # => 1.0 (notice this is a Float now)

There is technically no Integer in JavaScript, so not sure if there is an easy “fix” for this. Probably best to do value.to_i when you know it should be an Integer in Ruby.

eliias avatar Oct 23 '23 11:10 eliias

@wkirby I actually can't reproduce the panic, even using your example values. Could you provide me a value for ydata = Base64.decode64(stored_data) -> stored_data to allow me debug this further?

https://github.com/y-crdt/yrb/blob/main/spec/y/map_spec.rb#L179-L192

eliias avatar Oct 23 '23 11:10 eliias

@eliias Just got back from some time off myself. I'll send over the encoded document we have and work on setting up a more concise repro. There's always the chance that the error is on the encoding end with y-immer.

wkirby avatar Oct 26 '23 03:10 wkirby

@eliias attached is a file containing a Base64 string that presents the error. Here's how I reproduce with this data:

ydata = File.read('ydoc.base64.txt')
ydata = Base64.decode64(ydata)
ydoc = Y::Doc.new
ydoc.transact { |tx| tx.apply(ydata.bytes.to_a) }

ymap = ydoc.get_map("data")

ymap[:testInt] # => 1.0
ymap[:testFloat] # => 3.1415
ymap[:testString] # => "foo"
ymap[:testBool] # => false

ymap[:testArray] # => Panick!
ymap[:testHash] # => Panick!

ydoc.base64.txt

Because this blob is pulled directly from our application, the Map at data also contains the following keys:

title
fields
fieldValues
agents
signatories

In addition, the ydoc includes an XML fragment at content which is serialized tiptap state. I don't believe either of these are causing the issue, as the same problem was present without these values in the ydoc.

Here are the relevant lines from my Gemfile.lock:

    y-rb (0.5.2)
      rake (~> 13.0)
      rb_sys (~> 0.9.71)
    y-rb (0.5.2-x86_64-linux)
      rake (~> 13.0)
      rb_sys (~> 0.9.71)

And from package.json:

"immer-yjs": "^1.1.0",
...
"yjs": "^13.6.7",

On the client side, the code (roughly) to serialize this data:

const useSyncedData = (doc) => {
  const ydoc = useMemo(() => new Y.Doc(), [doc.id]);
  const [binder, setBinder] = useState(undefined);

  useEffect(() => {
    const localBinder = bindImmerYjs(ydoc.getMap("data"));

    if (localBinder) {
      setBinder(localBinder);

      // optionally set initial data
      localBinder.update((s) => {
        return {
          testString: "foo",
          testInt: 1,
          testFloat: 3.1415,
          testBool: false,
          testArray: [1, "two", true],
          testHash: { a: 1, b: "2", c: false }
        };
      });
    }

    return () => {
      localBinder.unbind();
    };
  }, [ydoc, doc]);

  return ydoc;
}

const MyReactComponent = () => {
  const { doc } = useDoc();
  const ydoc = useSyncedData(doc);

  console.log(ydoc);

  return <p>Oh hey, a component</p>
}

wkirby avatar Oct 26 '23 18:10 wkirby

@eliias hope I'm not bugging you, but were the above resources enough to reproduce? If not I'm happy to schedule a call to work through this together. Just shoot me an email at [email protected] if you'd like.

wkirby avatar Nov 01 '23 22:11 wkirby

title

Yeah, I am able to reproduce this in a test case with your data. I also found the actual issue, but I do not have an immediate solution that is shippable. The logic that converts the internal CRDT representation for a Map/Array requests a new read-only transaction on the CRDT store, which is usually fine. The Ruby library always maintains a read-write transaction, so it causes a panic in this case. It is also unclear to me why it only fails on a sync. It should fail in both cases.

FWIW… I was able to access your test data after a somewhat hacky fix, so it is definitely doable. https://github.com/y-crdt/yrb/pull/134

eliias avatar Nov 02 '23 09:11 eliias

@eliias I'm not super versed in Rust, so I can't comment on the hackiness present here. I really appreciate your attention on this, and do let me know if there's any other way I can assist!

wkirby avatar Nov 02 '23 18:11 wkirby

@wkirby I am going to release 0.5.3 later today that includes a fix. The spec passes for your test data. Would be great if you can confirm it works for you.

eliias avatar Nov 13 '23 17:11 eliias

@eliias sorry, we moved on in our project to other things, but are returning now. I will confirm within the next week or so whether your implementation fixes our issue. Thank you so much!

wkirby avatar Feb 09 '24 21:02 wkirby

@eliias sorry for re-using same issue for my problem, but I have feeling that it's same or very close

$ grep y-rb Gemfile.lock
    y-rb (0.5.3-x86_64-linux)
    y-rb_actioncable (0.1.6)
      y-rb (>= 0.4.5)
  y-rb_actioncable (~> 0.1.6)
<script type="module">
  import * as Y from "yjs";
  import {WebsocketProvider} from "@y-rb/actioncable";
  import {createConsumer} from "@rails/actioncable";

  const
    document = new Y.Doc(),
    consumer = createConsumer(),
    provider = new WebsocketProvider(
      document,
      consumer,
      'MyChannel',
      {signed_stream_name: '...'}
    ),
    yArray = document.getArray('array')
  const test = `test${Math.random()}`
  const yMap = new Y.Map()
  yMap.set(test, test)
  yArray.insert(0, [yMap])
  provider.connect()
</script>

Then in the Rails log

MyChannel#receive {"update"=>"AAFKAQK/lKqpBgAHAQVhcnJheQEoAL+UqqkGABZ0ZXN0MC42NjkwOTk1MzkyNDY1NjUyAXcWdGVzdDAuNjY5MDk5NTM5MjQ2NTY1MgA="}

thread '' panicked at 'called Option::unwrap() on a None value', /home/runner/work/yrb/yrb/tmp/cargo-vendor/yrs/src/doc.rs:775:41

and when I'm trying to reproduce it in the console

irb(main):004> require 'y-rb'; update = Base64.decode64("AAFKAQK/lKqpBgAHAQVhcnJheQEoAL+UqqkGABZ0ZXN0MC42NjkwOTk1MzkyNDY1NjUyAXcWdGVzdDAuNjY5MDk5NTM5MjQ2NTY1MgA=").unpack('C*'); doc = Y::Doc.new; doc.sync(update[3..-1]); doc.get_array('array').to_a
thread '<unnamed>' panicked at 'called `Option::unwrap()` on a `None` value', /home/runner/work/yrb/yrb/tmp/cargo-vendor/yrs/src/doc.rs:775:41
/home/fedora/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/y-rb-0.5.3-x86_64-linux/lib/y/array.rb:269:in `yarray_to_a': called `Option::unwrap()` on a `None` value (fatal)

(Why [3..-1] - because of the 1st byte is yrb-actioncable's message type, the 2nd byte is y-protocol message type and the 3rd byte is the length of the update)

If I put not an YMap into the array, but a primitive

  const test = `test${Math.random()}`
  const yMap = new Y.Map()
  yArray.insert(0, [test])
  provider.connect()

then it works like a charm

MyChannel#receive: {"update"=>"AAEpAQHs4JKHAgAIAQVhcnJheQF3FXRlc3QwLjU1NzkyODAyODMzNjEyMQA="}

MyChannel#receive synced: ["test0.557928028336121"]

and in the console

irb(main):005> require 'y-rb'; update = Base64.decode64("AAEpAQHs4JKHAgAIAQVhcnJheQF3FXRlc3QwLjU1NzkyODAyODMzNjEyMQA=").unpack('C*'); doc = Y::Doc.new; doc.sync(update[3..-1]); doc.get_array('array').to_a
=> ["test0.557928028336121"]

May be it's me doing something wrong?

Phaengris avatar Feb 19 '24 17:02 Phaengris

@Phaengris It is possible that Maps in an Array aren't handled properly (still). I am going to add your case as a test and try to fix.

eliias avatar Feb 19 '24 17:02 eliias

@Phaengris I identified the issue. Map/Array does not accept nested Y data-structures in yrb. It can be fixed, but it might take a while.

eliias avatar Feb 20 '24 09:02 eliias

@eliias thank you for investigating this :)

No rush. If you can find time to fix it, it would make me happy, but if no - no worries.

For now I switched to use xml fragments / elements which also supports data nesting, in it's own way. May be a possible drawback is some performance hit, but I didn't do any benchmarks yet to see if it is actually a problem for my project.

Phaengris avatar Feb 20 '24 18:02 Phaengris

@eliias came back to note that the most recent published version (0.5.4) is working roughly as expected. I do have a very similar issue to @Phaengris (same error message, different proximal cause), but with a clean workaround.

The structure of our Y:Map is (roughly):

{
  title: "string",
  order: 0,
  data: {},
  dataValues: []
}

I can access all these values individually:

ydoc = Y::Doc.new
ydata = get_file_from_s3()
ydoc.transact { |tx| tx.apply(ydata.bytes.to_a) }

ymap = ydoc.get_xml_map('data')

ymap['title'] # => 'string'
ymap['order'] # => 0.0
ymap['data'] # => {}
ymap['dataValues'] # => []

So far so good! The error arises from trying to get all the values:

ymap.to_h # => Panic!
ymap.to_json # => Panic!

Results in the same error as above:

thread '<unnamed>' panicked at /home/runner/work/yrb/yrb/tmp/cargo-vendor/yrs/src/doc.rs:775:41:
called `Option::unwrap()` on a `None` value

The workaround, though:

ymap.as_json.to_h # => { title: 'string', order: 0.0, data: {}, dataValues: [] }

Anyway, from my perspective this issue can be closed out. Appreciate all your hard work!

wkirby avatar Mar 01 '24 15:03 wkirby

So it looks like nested maps don't work as the only way to create a Y::Map object it to use Y::Doc#get_map as the Y::Map constructor crashes.

In javascript I can create and assign Y.Map instances without errors.

I can work around this by changing how I'm modelling data but I'd love to be told I'm wrong here!

writercoder avatar Aug 26 '24 08:08 writercoder