rails
rails copied to clipboard
ActiveRecord::Sanitization::ClassMethods#sanitize_sql surrounds Numerics with quotes
Steps to reproduce
In a rails app that uses the mysql2 connection adapter run the following snippet.
ActiveRecord::Base.sanitize_sql(["LIMIT :offset, :per_page", { offset: 12, per_page: 32 }])
Expected behavior
It should generate the following:
"LIMIT 12, 32"
Which is a valid MySQL LIMIT
clause.
Actual behavior
It generates the following:
"LIMIT '12', '32'"
Which is not a valid MySQL LIMIT
clause.
System configuration
Rails version: 7.0.1
Ruby version: MRI 3.0.3
This was changed in https://github.com/rails/rails/pull/42440
Hum, that would be tricky to fix. The whole issue is that when using bound values like this, we don't know what the value is for, and if it's used in a comparison with a string then not quoting it would be unsafe.
So I don't think there's a proper fix for this, however I can offer you a somewhat dirty workaround:
class SafeInteger < Numeric
def initialize(int)
@int = int
end
def to_s
@int
end
end
>>ActiveRecord::Base.sanitize_sql(["LIMIT :offset, :per_page", { offset: SafeInteger.new(12), per_page: SafeInteger.new(32) }])
=> "LIMIT 12, 32"
Steps to reproduce
In a rails app that uses the mysql2 connection adapter run the following snippet.
ActiveRecord::Base.sanitize_sql(["LIMIT :offset, :per_page", { offset: 12, per_page: 32 }])
Expected behavior
It should generate the following:
"LIMIT 12, 32"
Which is a valid MySQL
LIMIT
clause.Actual behavior
It generates the following:
"LIMIT '12', '32'"
Which is not a valid MySQL
LIMIT
clause.System configuration
Rails version: 7.0.1
Ruby version: MRI 3.0.3
Hum, that would be tricky to fix. The whole issue is that when using bound values like this, we don't know what the value is for, and if it's used in a comparison with a string then not quoting it would be unsafe.
So I don't think there's a proper fix for this, however I can offer you a somewhat dirty workaround:
class SafeInteger < Numeric def initialize(int) @int = int end def to_s @int end end >>ActiveRecord::Base.sanitize_sql(["LIMIT :offset, :per_page", { offset: SafeInteger.new(12), per_page: SafeInteger.new(32) }]) => "LIMIT 12, 32"
It does seem a bit icky. I would rather do something like this in my app instead:
def generate_sql_limit(offset, per_page)
offset = Integer(offset, 10)
per_page = Integer(per_page, 10)
"LIMIT #{offset}, #{per_page}"
end
generate_limit(12, 32)
Could a solution be to add a sanitize_for_limit
method to ActiveRecord::Sanitization::ClassMethods
similar to ActiveRecord::Sanitization::ClassMethods.sanitize_sql_for_order(condition)
or perhaps even just an alias to it? Because looking at the documentation it seems like it pretty much already does what sanitize_for_limit
should do.
I would rather do something like this in my app instead:
Yes that works too. If you use it for limit and enforce integers, there's no inject risk.
Could a solution be to add
I'll have a look tomorrow.
@rbgrouleff as a slight side-track from the specifics of this issue, for general API-utility purposes, I'm interested in the circumstances that lead you to need to hand-sanitize a LIMIT clause. If you're able, I'd love to hear a bit more about the sort of query-building you're doing, and the reasons it's a better fit for your needs. (In any format you like -- here in a comment, in a gist, or e.g. via email if privacy makes it easier to show an example.)
@matthewd Of course, the limit clause is only a small part of this larger query I'm building:
sql = <<END
(
SELECT t.id
FROM transfers AS t
WHERE t.season = :season AND t.country_id_to = :country_id AND t.division_to = :division AND t.group_to = :group
)
UNION ALL
(
SELECT t.id
FROM transfers AS t
WHERE t.season = :season AND t.country_id_from = :country_id AND t.division_from = :division AND t.group_from = :group
)
ORDER BY id DESC
LIMIT :offset, :per_page
END
ids = ActiveRecord::Base.connection.select_values(ActiveRecord::Base.sanitize_sql([sql, {season: season, country_id: country_id, division: division, group: group, offset: (page - 1)*per_page, per_page: per_page}]))
The main reason to do it like this is to be able to page through the union of the two subqueries in SQL instead of finding the union in ruby.
Yeah it totally makes sense as a use case, I just don't see how we could actually support it without reopening the vulnerability.
I think in this case since offset is just an integer, it's fine to use #{Integer(offset)}
, even though it might trip some static analyzers into thinking you have a potential injection.
I was also bitten by this with some json manipulation - we have something like
foo = 1
ActiveRecord::Base.sanitize_sql(["data = JSON_SET('data', '$.count', ?)", foo])
which ends up writing a stringified "count": "1"
into our json data, which isn't ideal. Going to look into the SafeInteger workaround...
I'll be closing as I'm afraid there's really nothing we can do here.
That said if someone has ideas on how to make this more usable, I'm open to it.
My bad, re-reading the issue, I forgot to look at sanitize_for_limit
.
Reopening.
After thinking more about this, I think we should provide an Arel.integer() / Arel.float()
helper to mark these values as safe, which would essentially be the SafeInteger
workaround.
This issue has been automatically marked as stale because it has not been commented on for at least three months.
The resources of the Rails team are limited, and so we are asking for your help.
If you can still reproduce this error on the 7-0-stable
branch or on main
, please reply with all of the information you have about it in order to keep the issue open.
Thank you for all your contributions.
We just upgraded to Rails 7.0 and were bit by this issue.
In previous versions of Rails, a query like this:
Model.where('numeric_field <= ?', 10.0)
used to generate the following query:
SELECT `models`.* FROM `models` WHERE `models`.`numeric_field` <= 10.0
But in Rails 7.0.4, the generated SQL looks like this
SELECT `models`.* FROM `models` WHERE `models`.`numeric_field` <= '10.0'
We're using MySQL (8.0.19) and based on this it sounds like ~'10.0'
~ ~(surrounded by quotes) gets typecast by MySQL to~ ~0.0
~ ~internally since~ ~numeric_field
~ ~is an integer field type and this ends up producing invalid results for this query.~ See below.
I had to work-around this issue using this monkey-patch:
# config/initializers/activerecord_mysql_quoting_patch.rb
module CoreExtensions
module ActiveRecord
module ConnectionAdapters
module MySQL
module Quoting # :nodoc:
def quote_bound_value(value)
case value
when Numeric
value # We don't want to quote numeric values, since MySQL doesn't like this
when Rational
quote(value.to_f.to_s)
when ActiveSupport::Duration
quote(value.to_s)
when BigDecimal
quote(value.to_s('F'))
when true
"'1'"
when false
"'0'"
else
quote(value)
end
end
end
end
end
end
end
ActiveSupport.on_load(:active_record) do
ActiveRecord::ConnectionAdapters::MySQL::Quoting.prepend CoreExtensions::ActiveRecord::ConnectionAdapters::MySQL::Quoting
end
Would be great to have a way to avoid having to do this monkey patch.
it sounds like
'10.0'
(surrounded by quotes) gets typecast by MySQL to0.0
internally sincenumeric_field
is an integer field type and this ends up producing invalid results for this query.
That's not correct. It gets type cast to 0.0
only if it's not a numeric value.
WHERE `models`.`numeric_field` <= 10.0`
and
WHERE `models`.`numeric_field` <= '10.0'`
should produce exactly the same results.
I don't know what problem you ran into exactly, but this is not it.
it sounds like
'10.0'
(surrounded by quotes) gets typecast by MySQL to0.0
internally sincenumeric_field
is an integer field type and this ends up producing invalid results for this query.That's not correct. It gets type cast to
0.0
only if it's not a numeric value.
You're right. I just tested this out on a test MySQL DB. However, I realized that I had simplified my previous example a little too much. The issue actually doesn't happen with simple WHERE
clauses. It only seems to happen when IFNULL
is used. Here's what I found:
mysql> CREATE DATABASE test_typecasting;
Query OK, 1 row affected (0.01 sec)
mysql> USE test_typecasting;
Database changed
mysql> CREATE TABLE models (
-> numeric_field INT
-> );
Query OK, 0 rows affected (0.01 sec)
mysql> INSERT INTO models VALUES (10), (20), (30), (40), (NULL);
Query OK, 5 rows affected (0.00 sec)
Records: 5 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM models;
+---------------+
| numeric_field |
+---------------+
| 10 |
| 20 |
| 30 |
| 40 |
| NULL |
+---------------+
5 rows in set (0.00 sec)
mysql> SELECT * FROM models WHERE numeric_field >= 30.4;
+---------------+
| numeric_field |
+---------------+
| 40 |
+---------------+
1 row in set (0.00 sec)
-- ✅ Works as expected. I was wrong in my previous example.
mysql> SELECT * FROM models WHERE numeric_field >= '30.4';
+---------------+
| numeric_field |
+---------------+
| 40 |
+---------------+
1 row in set (0.00 sec)
-- ❌ Doesn't work as expected. Notice how `20` is missing from the results.
mysql> SELECT * FROM models WHERE IFNULL(numeric_field, '20.0') >= '20.0';
+---------------+
| numeric_field |
+---------------+
| 30 |
| 40 |
| NULL |
+---------------+
3 rows in set (0.00 sec)
-- ✅ But if I don't surround the 20.0 inside IFNULL with quotes, I get the expected results.
mysql> SELECT * FROM models WHERE IFNULL(numeric_field, 20.0) >= '20.0';
+---------------+
| numeric_field |
+---------------+
| 20 |
| 30 |
| 40 |
| NULL |
+---------------+
4 rows in set (0.00 sec)
-- ✅ Works as expected.
mysql> SELECT * FROM models WHERE IFNULL(numeric_field, 20.0) >= 20.0;
+---------------+
| numeric_field |
+---------------+
| 20 |
| 30 |
| 40 |
| NULL |
+---------------+
4 rows in set (0.00 sec)
This does sound like an issue with how MySQL handles type conversions in IFNULL()
... But it would be helpful if we can help prevent running into this issue from Rails somehow.
To summarize:
In previous version of Rails (<7.0):
Model.where("IFNULL(numeric_field, ?) >= ?", 20.0, 20.0)
used to generate:
SELECT * FROM models WHERE IFNULL(numeric_field, 20.0) >= 20.0
But in Rails 7.0 it generates:
SELECT * FROM models WHERE IFNULL(numeric_field, '20.0') >= '20.0'
which triggers an odd behavior in MySQL.
I see.
You can solve this issue without a monkey patch with a cast:
mysql> SELECT * FROM models WHERE IFNULL(numeric_field, CAST('20.0' AS DECIMAL)) >= '20.0';
+---------------+
| numeric_field |
+---------------+
| 20 |
| 30 |
| 40 |
| NULL |
+---------------+
4 rows in set (0.00 sec)
For the explanation: https://dev.mysql.com/doc/refman/5.7/en/flow-control-functions.html#function_ifnull
The default return type of IFNULL(expr1,expr2) is the more “general” of the two expressions, in the order STRING, REAL, or INTEGER.
So IFNULL(integer, string)
always returns a string
, that's why the behavior is a bit confusing, you end up with a string on both sides.
Interestingly this works as expected, even though both sides have a string:
mysql> SELECT '20.0' <= '20.0';
+------------------+
| '20.0' <= '20.0' |
+------------------+
| 1 |
+------------------+
1 row in set (0.00 sec)
So it seems to be something specific with IFNULL
.
You can solve this issue without a monkey patch with a cast
True, but I'm now wondering if there are similar oddities with other SQL functions, that get triggered by using quotes around numeric values. Need to audit the entire code base before I can remove the monkey patch. Feels like a big gotcha when upgrading Rails versions though, even though it's due to an underlying MySQL gotcha.
Interestingly this works as expected, even though one datatype is a string:
Yes, it has to do with how MySQL does casting, and it because of its casting rules we had to make this change.
SELECT '20.0' <= 20.0;
one side is a decimal, so MySQL will do a decimal comparison casting the string "20.0"
into 20.0
.
So it seems to be something specific with IFNULL.
Yes, as I said, since one of the two arguments of IFNULL
is a string, that IFNULL will always return a string, so:
SELECT IFNULL(20, "20.0") = "20.0"
can be reduced to SELECT "20" = "20.0"
(string = string).
SELECT IFNULL(20, "20.0") = "20.0"
can be reduced toSELECT "20" = "20.0"
(string = string).
Ah I see what you mean now. Thank you for that explanation!
While a fix is discussed, I just wanted to raise that the docs are currently incorrect: https://api.rubyonrails.org/classes/ActiveRecord/Sanitization/ClassMethods.html#method-i-sanitize_sql_array says that:
sanitize_sql_array(["name=:name and group_id=:group_id", name: "foo'bar", group_id: 4])
# => "name='foo''bar' and group_id=4"
But actually in Rails 7:
[3] pry(main)> ActiveRecord::Base.sanitize_sql_array(["name=:name and group_id=:group_id", name: "foo'bar", group_id: 4])
=> "name='foo\\'bar' and group_id='4'"
I just wanted to raise that the docs are currently incorrect:
They're not incorrect because this is a MySQL specific behavior. They could mention that of course, but the example you quote is correct on Postgres and Sqlite3
I get that it's a database specific behaviour, but omiting that makes the doc incorrect. As a dev, I see the examples and expect the function to behave like them - but it's doesn't to the point of breaking the query.
It would be nice if docs would not confuse 50% of Rails users who are running on MySQL.
See https://github.com/rails/rails/commit/50d59e9c3e2da833c0122112d744b34f543f4df2
We were affected by this as well, after upgrade to v7 such queries don't work anymore:
Model.where('JSON_CONTAINS("[1,2,3]", JSON_ARRAY(?))', [1,2]);
as AR converts [1, 2]
into JSON_ARRAY("1", "2")
not JSON_ARRAY(1, 2)
like before.
Another example of a generated query that causes a MySQL error due to an integer being encoded as a string in the query:
value = 339
ActiveRecord::Base.sanitize_sql(["ALTER TABLE tbl_name AUTO_INCREMENT = ?", value])
# ALTER TABLE tbl_name AUTO_INCREMENT = '339'
This bug breaks all Model.find_by_sequel
call when a numeric parameter for LIMIT
is passed. Example:
Discount.find_by_sql(['SELECT * FROM discounts LIMIT 1, ?', 1])
You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near ''1'' at line 1 (Mysql2::Error)
This is a huge regression.