datajoint-matlab icon indicating copy to clipboard operation
datajoint-matlab copied to clipboard

Empty matrices are converted to NULL

Open dglee3 opened this issue 2 years ago • 2 comments

Bug Report

Description

In a newer version of DataJoint, empty matrices ([]) are inserted into a table as NULL rather than a longblob representing the empty set. This results in discrepancies when fetching data uploaded by different versions.

Reproducibility

  • OS: Windows Server 2016 Standard
  • MATLAB Version: R2022a
  • MySQL Version: 8.0.29
  • MySQL Deployment Strategy: Uncertain
  • DataJoint Version: 3.5.0
  • Minimum number of steps to reliably reproduce the issue:
  1. Create a table with a longblob attribute that is non-nullable. In this example the attribute is defined as puff: longblob
  2. Insert [] into this attribute
  • Complete error stack as a result of evaluating the above steps:
Error using dj.Relvar/insert/makePlaceholder
attribute `puff` is not nullable.

Error in dj.Relvar/insert (line 353)
                    [v, placeholder] = makePlaceholder(i, tuple.(header.attributes(i).name));

Error in experiment.TrialSummary/makeTuples (line 144)
            insert(self, key)

Error in dj.internal.AutoPopulate/taskCore (line 215)
                self.makeTuples(key)

Error in dj.internal.AutoPopulate>@(key,fun,args)fun(args{:}) (line 129)
                rels{i}.executionEngine = @(key, fun, args) fun(args{:});

Error in dj.internal.AutoPopulate/populate_ (line 327)
                                    self.executionEngine(key, @taskCore, {self, key})

Error in dj.internal.AutoPopulate/populate (line 130)
                [varargout{1:nargout}] = rels{i}.populate_(varargin{:});

Expected Behavior

When a [] is inserted, we expect it to be stored as a longblob which reads as [] when the data is fetched. This is the behavior of another machine with the following specifications:

  • OS: Windows 10 Pro
  • MATLAB Version: R2019a
  • MySQL Version: 8.0.29
  • MySQL Deployment Strategy: Uncertain
  • DataJoint Version: 3.3.1

dglee3 avatar Jun 01 '22 17:06 dglee3

Adding some documentation on reproducing this issue:

getSchema.m:

function obj = getSchema
persistent schemaObject
if isempty(schemaObject)
    schemaObject = dj.Schema(dj.conn, 'test', 'test');
end
obj = schemaObject;
end

Person.m:

%{
person: varchar(30)
-----
myblob: longblob
%}

classdef Person < dj.Manual
end

test_insert_empty_array.m:

clc
clear functions
dj.version
dj.conn
dj.createSchema('test', '/src/test', 'test')
test.Person
insert(test.Person, {'jeff', [1,2,3,4]})
insert(test.Person, {'jeff2', []})
schemaObject = dj.Schema(dj.conn, 'test', 'test');
fetch1(test.Person & 'person="jeff"', 'myblob')
fetch1(test.Person & 'person="jeff2"', 'myblob')
dropQuick(schemaObject)
When you run test_insert_empty_array.m on DJ Matlab version 3.3.1 and mym version 2.7.0 you get this output:
DataJoint version 3.3.1

Setting up mym...
No connections open
mym is now ready for use.
 0:  fakeservices.datajoint.io via TCP/IP   Server version 5.7.33 (encrypted)

  connection_id() 
 +---------------+
  84              


ans = 

  Connection with properties:

             host: 'fakeservices.datajoint.io'
             user: 'root'
        initQuery: ''
          use_tls: 'none'
    inTransaction: 0
           connId: 0
         packages: [0×1 containers.Map]
      foreignKeys: [0×0 struct]
      isConnected: 1

database created
Warning: Directory already exists. 
> In dj.createSchema (line 52)
  In test_insert_empty_array (line 5) 
test.getSchema.m already exists

ans = 


Object test.Person


<SQL>
CREATE TABLE `test`.`person` (
`person` varchar(30) NOT NULL COMMENT "",
`myblob` longblob NOT NULL COMMENT "",
PRIMARY KEY (`person`)
) ENGINE = InnoDB, COMMENT ""
</SQL>

 ::  ::

0 tuples (0.162 s)

closing DataJoint connection #0

ans =

     1     2     3     4


ans =

     []

done
When you run test_insert_empty_array.m on DJ Matlab version 3.5.0 and latest mym version you get this output:
DataJoint version 3.5.0

 0:  fakeservices.datajoint.io via TCP/IP   Server version 5.7.33 (encrypted)
database connection id: 19

ans = 

  Connection with properties:

             host: 'fakeservices.datajoint.io:3306'
             user: 'root'
        initQuery: []
          use_tls: 'none'
    inTransaction: 0
           connId: 0
         serverId: 19
         packages: [0×1 containers.Map]
          schemas: [1×1 struct]
      foreignKeys: [0×0 struct]
      isConnected: 1

database created
Warning: Directory already exists. 
> In dj.createSchema (line 56)
  In test_insert_empty_array (line 5) 
test.getSchema.m already exists

ans = 


Object test.Person

 ::  ::

0 tuples (0.193 s)

Error using dj.Relvar/insert/makePlaceholder (line 240)
attribute `myblob` is not nullable.

Error in dj.Relvar/insert (line 353)
                    [v, placeholder] = makePlaceholder(i, tuple.(header.attributes(i).name));

Error in test_insert_empty_array (line 8)
insert(test.Person, {'jeff2', []})

So that would confirm that there is a regression issue between the two version as you should be able to store and retrieve []

jverswijver avatar Jun 06 '22 21:06 jverswijver

Thanks @jverswijver for testing and collecting these details! This does confirm a regression issue which elevates the priority.

@dglee3 We are wrapping up a few features on datajoint/datajoint-python so we'll be able to address this soon thereafter. We'll have more updates on planning this activity next week.

guzman-raphael avatar Jun 06 '22 21:06 guzman-raphael