bulk_insert
bulk_insert copied to clipboard
Use #exec_insert over #exec_query because there's no field selections
What's up?
Hi there,
Thanks for the gem! It's awesome. While I was trying it, I got an error on calling worker.save!
.
NoMethodError: undefined method `fields' for nil:NilClass
from /Users/saw/.gem/ruby/2.3.6/gems/activerecord-3.2.22.9/lib/active_record/connection_adapters/mysql2_adapter.rb:219:in `exec_query'
from (irb):2
from /Users/saw/.gem/ruby/2.3.6/gems/railties-3.2.22.9/lib/rails/commands/console.rb:47:in `start'
from /Users/saw/.gem/ruby/2.3.6/gems/railties-3.2.22.9/lib/rails/commands/console.rb:8:in `start'
from /Users/saw/.gem/ruby/2.3.6/gems/railties-3.2.22.9/lib/rails/commands.rb:41:in `<top (required)>'
from script/rails:6:in `require'
from script/rails:6:in `<main>'
Investigation
It seems that the basic skeleton code of Connection#exec_query
expects a result set that has field selections. However, since we're doing a bulk insert, it doesn't seem like this will work.
def exec_query(sql, name = 'SQL', binds = [])
result = execute(sql, name)
ActiveRecord::Result.new(result.fields, result.to_a)
end
However, there is an Connection#exec_insert
method that does not expect return results.
I suspect exec_query
is used because of return_primary_keys
where we call a SELECT
after the insert.
What this does
- Use
exec_query
when there primary keys are required - Use
exec_insert
otherwise
Hi @jamis a small bump, hopefully you will have a chance to take a look in the coming days. Have a good weekend.
It seems that at least it didn't error on the mysql2
adapter.
I'll test it on Rails 4/5 and also Postgres and let you know what I find
It should be fine for postgresql I guess.
For MySQL is a little bit more complicated since you cannot emulate the behavior of postgresql's returning
out of the box:
Returning equivalent in MySQL https://stackoverflow.com/q/9027008/5687152
For a single insertion it would make sense to retrieve LAST_INSERT_ID()
.
However for bulk insert is more complicates:
https://stackoverflow.com/q/7333524/5687152
InnoDB guarantees sequential numbers for AUTO INCREMENT when doing bulk inserts, provided innodb_autoinc_lock_mode is set to 0 (traditional) or 1 (consecutive). Consequently you can get the first ID from
LAST_INSERT_ID()
and the last by addingROW_COUNT()-1
.
Even if we would like to implement this combination gets quite hard in cases like the one I am facing with a MySQL DB where the autoincrement offset is changing according to the table.
More details on the keywords from an old (and still used a lot) MySQL version doc:
https://dev.mysql.com/doc/refman/5.6/en/information-functions.html#function_last-insert-id
I am not excluding that there is a smart and straightforward way to make pg logic available in this cases. (E.g. it could make sense to use this combination if you can easily retrieve the autoincrement offset and you retrieve the row diff and the last id within the same transaction as the insert)
Sorry to be slow to chime in here. Yeah, the "return primary keys" is Postgres-only, and as @mberlanda has pointed out, it's non-trivial to implement in MySQL (or, AFAIK, other databases in general).
I think the right way forward, here, is to only use exec_query
when @return_primary_keys
is true, and the adapter is Postgres. Otherwise, it should use execute
or exec_insert
.
(Interestingly, the default implementation of exec_insert
calls exec_query
...though perhaps not in the MySQL adapter?)
I have checked the current implementation of exec_query
for MySQL adapter on the ActiveRecord master branch:
https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/mysql/database_statements.rb
def exec_query(sql, name = "SQL", binds = [], prepare: false)
if without_prepared_statement?(binds)
execute_and_free(sql, name) do |result|
ActiveRecord::Result.new(result.fields, result.to_a) if result
end
else
exec_stmt_and_free(sql, name, binds, cache_stmt: prepare) do |_, result|
ActiveRecord::Result.new(result.fields, result.to_a) if result
end
end
end
This is preventing from raising NoMethodError: undefined method 'fields' for nil:NilClass
. However, since no else is provided the result returned will be nil
.
I will suggest to create a custom error and raise when return primary_key is provided with adapters other than pg. I can do a quick pr!
Hey @mberlanda @jamis sorry I haven't dug into the code, I see that Mauro is on top of it already. To take a step back, the issue I uncovered was that undefined method
fields' for nil:NilClass` and it is caused bulk insert returning nothing (mysql) and that caused the error instantiating the result set.
At that point, my thinking is just that exec_query
is used when you except results and exec_insert
doesn't expect results.
Following Mauro's investigation and reading the code for the different adapters, that seems to be case. In PG's case, exec_query
just naively instantiates results. In MySQL's case (copied from above), it has a guard clause.
**NB: PG's exec_insert seem to have something that deals with the return_primary_key option. I'll take a look. **
I agree with @jamis 's comment, we should just use the two methods accordingly. I can update the PR to make sure it works with both PG + (return primary keys) and MySQL.
exec_query
PG
def exec_query(sql, name = "SQL", binds = [], prepare: false)
execute_and_clear(sql, name, binds, prepare: prepare) do |result|
types = {}
fields = result.fields
fields.each_with_index do |fname, i|
ftype = result.ftype i
fmod = result.fmod i
types[fname] = get_oid_type(ftype, fmod, fname)
end
ActiveRecord::Result.new(fields, result.values, types)
end
end
https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/postgresql/database_statements.rb#L80
MySQL
def exec_query(sql, name = "SQL", binds = [], prepare: false)
if without_prepared_statement?(binds)
execute_and_free(sql, name) do |result|
ActiveRecord::Result.new(result.fields, result.to_a) if result
end
else
exec_stmt_and_free(sql, name, binds, cache_stmt: prepare) do |_, result|
ActiveRecord::Result.new(result.fields, result.to_a) if result
end
end
end
https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/mysql/database_statements.rb#L31
Sorry I haven't acted on this. Will try to rally sometime to do it within this week.
ok! Don't worry! I hope I will have the time to finish soon #37 to allow testing against the real database without mocking the adapter. In this case we should be able to validate that this has the expected behaviour for mysql2
adapter 😄