nodejs-driver

Fix typeName with trailing whitespace

Open FL4TLiN3 opened this issue 1 year ago • 4 comments

Problem:

While parseFqTypeName() is processing a vector field ( vector<float, x> ), the typeName it receives contains trailing whitespace, e.g. "org.apache.cassandra.db.marshal.FloatType ".

TypeError: Not a valid type "org.apache.cassandra.db.marshal.FloatType "
 ❯ parseFqTypeName node_modules/.pnpm/[email protected]/node_modules/cassandra-driver/lib/encoder.js:1372:13
 ❯ Encoder.parseVectorTypeArgs node_modules/.pnpm/[email protected]/node_modules/cassandra-driver/lib/encoder.js:994:24
 ❯ Encoder.encodeCustom node_modules/.pnpm/[email protected]/node_modules/cassandra-driver/lib/encoder.js:716:44
 ❯ Encoder.encode node_modules/.pnpm/[email protected]/node_modules/cassandra-driver/lib/encoder.js:1703:18
 ❯ node_modules/.pnpm/[email protected]/node_modules/cassandra-driver/lib/requests.js:458:71
 ❯ BatchRequest.eachQuery node_modules/.pnpm/[email protected]/node_modules/cassandra-driver/lib/requests.js:458:14
 ❯ BatchRequest.write node_modules/.pnpm/[email protected]/node_modules/cassandra-driver/lib/requests.js:440:18
 ❯ WriteQueue.process node_modules/.pnpm/[email protected]/node_modules/cassandra-driver/lib/writers.js:247:44

This problem is caused by Cassandra's prepared statement functionality. When the driver requests column info from Cassandra, the server sends a response like:

columns: [
  { name: 'document_id', type: { code: 13, type: null } },
  { name: 'chunk_number', type: { code: 2, type: null } },
  {
    name: 'embedding',
    type: {
      code: 0,
      type: null,
      info: 'org.apache.cassandra.db.marshal.VectorType(org.apache.cassandra.db.marshal.FloatType , 256)'
    }
  },
  { name: 'chunk', type: { code: 13, type: null } },
  { name: 'language', type: { code: 13, type: null } }
]

(Cassandra version 5.0.1)

The string literal "org.apache.cassandra.db.marshal.VectorType(org.apache.cassandra.db.marshal.FloatType , 256)" seems to be valid for type.info, so I believe the driver should trim trailing whitespace in parseFqTypeName().
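A minimal sketch of the trimming idea, for illustration only. The helper name parseVectorInfo and the constant VECTOR_PREFIX are hypothetical and are not part of cassandra-driver; the real fix would live inside the driver's own parsing code. The sketch splits the VectorType arguments at the last comma and trims each part, so the subtype name no longer carries the space that precedes the comma.

```javascript
'use strict';

// Hypothetical helper, NOT the driver's actual implementation.
// It only illustrates trimming whitespace around VectorType arguments.
const VECTOR_PREFIX = 'org.apache.cassandra.db.marshal.VectorType(';

function parseVectorInfo(info) {
  if (!info.startsWith(VECTOR_PREFIX) || !info.endsWith(')')) {
    throw new TypeError(`Not a vector type: "${info}"`);
  }

  // Everything between the opening and closing parentheses.
  const args = info.slice(VECTOR_PREFIX.length, -1);

  // Split at the LAST comma so a nested subtype that itself
  // contains commas stays in one piece.
  const lastComma = args.lastIndexOf(',');
  if (lastComma < 0) {
    throw new TypeError(`Missing dimension in "${info}"`);
  }

  // trim() removes the whitespace Cassandra emits around the comma.
  const subtype = args.slice(0, lastComma).trim();
  const dimension = Number.parseInt(args.slice(lastComma + 1).trim(), 10);
  if (!Number.isInteger(dimension) || dimension <= 0) {
    throw new TypeError(`Invalid dimension in "${info}"`);
  }

  return { subtype, dimension };
}

const parsed = parseVectorInfo(
  'org.apache.cassandra.db.marshal.VectorType(org.apache.cassandra.db.marshal.FloatType , 256)'
);
console.log(parsed);
```

With the info string from the response above, the subtype comes back without the trailing space, so a later lookup by exact type name would not be tripped up by it.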

FL4TLiN3 avatar Oct 27 '24 15:10 FL4TLiN3

Related Issue: https://datastax-oss.atlassian.net/browse/NODEJS-667

FL4TLiN3 avatar Oct 27 '24 15:10 FL4TLiN3

Thanks! I think this issue will go away with the NODEJS-666 PR.

SiyaoIsTraveling avatar Dec 18 '24 15:12 SiyaoIsTraveling

I believe @SiyaoIsHiding is correct; I expect this issue will be resolved by NODEJS-666. We should test to confirm, however.

absurdfarce avatar Jun 17 '25 23:06 absurdfarce

Oh, forgot to add this part too!

Greetings @FL4TLiN3 and thanks for the contribution!

Have you signed the Contributor License Agreement for contributions to DataStax open source projects? If not, you can find it at https://cla.datastax.com/. Thanks!

absurdfarce avatar Jun 17 '25 23:06 absurdfarce

Yeah, pretty sure this problem is already gone.

SiyaoIsHiding avatar Jun 25 '25 01:06 SiyaoIsHiding

I confirm that the following test passes with the latest driver, but fails on versions from before support for vectors of arbitrary types was added.

  it('should encode with whitespace in custom type name', function () {
    const vector = new Float32Array([1, 2, 3]);
    const typeObj = { code: 0, type: null, info: 'org.apache.cassandra.db.marshal.VectorType(org.apache.cassandra.db.marshal.FloatType , 3)' };
    const encoded = encoder.encode(vector, typeObj);
    const decoded = encoder.decode(encoded, typeObj);
    assert.strictEqual(decoded[0], vector[0]);
    assert.strictEqual(decoded[1], vector[1]);
    assert.strictEqual(decoded[2], vector[2]);
  });

Thus, we should close this PR, as this issue is already fixed.

SiyaoIsHiding avatar Jul 24 '25 06:07 SiyaoIsHiding