rails icon indicating copy to clipboard operation
rails copied to clipboard

ActiveRecord::Sanitization::ClassMethods#sanitize_sql surrounds Numerics with quotes

Open rbgrouleff opened this issue 2 years ago • 21 comments

Steps to reproduce

In a rails app that uses the mysql2 connection adapter run the following snippet.

ActiveRecord::Base.sanitize_sql(["LIMIT :offset, :per_page", { offset: 12, per_page: 32 }])

Expected behavior

It should generate the following:

"LIMIT 12, 32"

Which is a valid MySQL LIMIT clause.

Actual behavior

It generates the following:

"LIMIT '12', '32'"

Which is not a valid MySQL LIMIT clause.

System configuration

Rails version: 7.0.1

Ruby version: MRI 3.0.3

rbgrouleff avatar Feb 02 '22 14:02 rbgrouleff

This was changed in https://github.com/rails/rails/pull/42440

fatkodima avatar Feb 03 '22 11:02 fatkodima

Hum, that would be tricky to fix. The whole issue is that when using bound values like this, we don't know what the value is for, and if it's used in a comparison with a string then not quoting it would be unsafe.

So I don't think there's a proper fix for this, however I can offer you a somewhat dirty workaround:

class SafeInteger < Numeric
  def initialize(int)
    @int = int
  end

  def to_s
    @int
  end
end

>>ActiveRecord::Base.sanitize_sql(["LIMIT :offset, :per_page", { offset: SafeInteger.new(12), per_page: SafeInteger.new(32) }])
=> "LIMIT 12, 32"

casperisfine avatar Feb 06 '22 11:02 casperisfine

Steps to reproduce

In a rails app that uses the mysql2 connection adapter run the following snippet.

ActiveRecord::Base.sanitize_sql(["LIMIT :offset, :per_page", { offset: 12, per_page: 32 }])

Expected behavior

It should generate the following:

"LIMIT 12, 32"

Which is a valid MySQL LIMIT clause.

Actual behavior

It generates the following:

"LIMIT '12', '32'"

Which is not a valid MySQL LIMIT clause.

System configuration

Rails version: 7.0.1

Ruby version: MRI 3.0.3

Jerjer3333333 avatar Feb 06 '22 11:02 Jerjer3333333

Hum, that would be tricky to fix. The whole issue is that when using bound values like this, we don't know what the value is for, and if it's used in a comparison with a string then not quoting it would be unsafe.

So I don't think there's a proper fix for this, however I can offer you a somewhat dirty workaround:

class SafeInteger < Numeric
  def initialize(int)
    @int = int
  end

  def to_s
    @int
  end
end

>>ActiveRecord::Base.sanitize_sql(["LIMIT :offset, :per_page", { offset: SafeInteger.new(12), per_page: SafeInteger.new(32) }])
=> "LIMIT 12, 32"

It does seem a bit icky. I would rather do something like this in my app instead:

def generate_sql_limit(offset, per_page)
  offset = Integer(offset, 10)
  per_page = Integer(per_page, 10)
  "LIMIT #{offset}, #{per_page}"
end

generate_limit(12, 32)

Could a solution be to add a sanitize_for_limit method to ActiveRecord::Sanitization::ClassMethods similar to ActiveRecord::Sanitization::ClassMethods.sanitize_sql_for_order(condition) or perhaps even just an alias to it? Because looking at the documentation it seems like it pretty much already does what sanitize_for_limit should do.

rbgrouleff avatar Feb 06 '22 16:02 rbgrouleff

I would rather do something like this in my app instead:

Yes that works too. If you use it for limit and enforce integers, there's no inject risk.

Could a solution be to add

I'll have a look tomorrow.

byroot avatar Feb 06 '22 16:02 byroot

@rbgrouleff as a slight side-track from the specifics of this issue, for general API-utility purposes, I'm interested in the circumstances that lead you to need to hand-sanitize a LIMIT clause. If you're able, I'd love to hear a bit more about the sort of query-building you're doing, and the reasons it's a better fit for your needs. (In any format you like -- here in a comment, in a gist, or e.g. via email if privacy makes it easier to show an example.)

matthewd avatar Feb 07 '22 10:02 matthewd

@matthewd Of course, the limit clause is only a small part of this larger query I'm building:

sql = <<END
(
  SELECT t.id
  FROM transfers AS t
  WHERE t.season = :season AND t.country_id_to = :country_id AND t.division_to = :division AND t.group_to = :group
)
UNION ALL
(
  SELECT t.id
  FROM transfers AS t
  WHERE t.season = :season AND t.country_id_from = :country_id AND t.division_from = :division AND t.group_from = :group
)
ORDER BY id DESC
LIMIT :offset, :per_page
END
ids = ActiveRecord::Base.connection.select_values(ActiveRecord::Base.sanitize_sql([sql, {season: season, country_id: country_id, division: division, group: group, offset: (page - 1)*per_page, per_page: per_page}]))

The main reason to do it like this is to be able to page through the union of the two subqueries in SQL instead of finding the union in ruby.

rbgrouleff avatar Feb 07 '22 17:02 rbgrouleff

Yeah it totally makes sense as a use case, I just don't see how we could actually support it without reopening the vulnerability.

I think in this case since offset is just an integer, it's fine to use #{Integer(offset)}, even though it might trip some static analyzers into thinking you have a potential injection.

casperisfine avatar Feb 07 '22 17:02 casperisfine

I was also bitten by this with some json manipulation - we have something like

foo = 1
ActiveRecord::Base.sanitize_sql(["data = JSON_SET('data', '$.count', ?)", foo])

which ends up writing a stringified "count": "1" into our json data, which isn't ideal. Going to look into the SafeInteger workaround...

jdelStrother avatar Mar 02 '22 09:03 jdelStrother

I'll be closing as I'm afraid there's really nothing we can do here.

That said if someone has ideas on how to make this more usable, I'm open to it.

byroot avatar May 12 '22 00:05 byroot

My bad, re-reading the issue, I forgot to look at sanitize_for_limit.

Reopening.

byroot avatar May 12 '22 00:05 byroot

After thinking more about this, I think we should provide an Arel.integer() / Arel.float() helper to mark these values as safe, which would essentially be the SafeInteger workaround.

casperisfine avatar May 12 '22 14:05 casperisfine

This issue has been automatically marked as stale because it has not been commented on for at least three months. The resources of the Rails team are limited, and so we are asking for your help. If you can still reproduce this error on the 7-0-stable branch or on main, please reply with all of the information you have about it in order to keep the issue open. Thank you for all your contributions.

rails-bot[bot] avatar Aug 10 '22 14:08 rails-bot[bot]

We just upgraded to Rails 7.0 and were bit by this issue.

In previous versions of Rails, a query like this:

Model.where('numeric_field <= ?', 10.0)

used to generate the following query:

SELECT `models`.* FROM `models` WHERE `models`.`numeric_field` <= 10.0

But in Rails 7.0.4, the generated SQL looks like this

SELECT `models`.* FROM `models` WHERE `models`.`numeric_field` <= '10.0'

We're using MySQL (8.0.19) and based on this it sounds like ~'10.0'~ ~(surrounded by quotes) gets typecast by MySQL to~ ~0.0~ ~internally since~ ~numeric_field~ ~is an integer field type and this ends up producing invalid results for this query.~ See below.

I had to work-around this issue using this monkey-patch:

# config/initializers/activerecord_mysql_quoting_patch.rb

module CoreExtensions
  module ActiveRecord
    module ConnectionAdapters
      module MySQL
        module Quoting # :nodoc:
          def quote_bound_value(value)
            case value
            when Numeric
              value # We don't want to quote numeric values, since MySQL doesn't like this
            when Rational
              quote(value.to_f.to_s)
            when ActiveSupport::Duration
              quote(value.to_s)
            when BigDecimal
              quote(value.to_s('F'))
            when true
              "'1'"
            when false
              "'0'"
            else
              quote(value)
            end
          end
        end
      end
    end
  end
end

ActiveSupport.on_load(:active_record) do
  ActiveRecord::ConnectionAdapters::MySQL::Quoting.prepend CoreExtensions::ActiveRecord::ConnectionAdapters::MySQL::Quoting
end

Would be great to have a way to avoid having to do this monkey patch.

jasonbosco avatar Sep 13 '22 21:09 jasonbosco

it sounds like '10.0' (surrounded by quotes) gets typecast by MySQL to 0.0 internally since numeric_field is an integer field type and this ends up producing invalid results for this query.

That's not correct. It gets type cast to 0.0 only if it's not a numeric value.

WHERE `models`.`numeric_field` <= 10.0`

and

WHERE `models`.`numeric_field` <= '10.0'`

should produce exactly the same results.

I don't know what problem you ran into exactly, but this is not it.

byroot avatar Sep 13 '22 21:09 byroot

it sounds like '10.0' (surrounded by quotes) gets typecast by MySQL to 0.0 internally since numeric_field is an integer field type and this ends up producing invalid results for this query.

That's not correct. It gets type cast to 0.0 only if it's not a numeric value.

You're right. I just tested this out on a test MySQL DB. However, I realized that I had simplified my previous example a little too much. The issue actually doesn't happen with simple WHERE clauses. It only seems to happen when IFNULL is used. Here's what I found:

mysql> CREATE DATABASE test_typecasting;
Query OK, 1 row affected (0.01 sec)

mysql> USE test_typecasting;
Database changed
mysql> CREATE TABLE models (
    ->     numeric_field INT
    -> );
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO models VALUES (10), (20), (30), (40), (NULL);
Query OK, 5 rows affected (0.00 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM models;
+---------------+
| numeric_field |
+---------------+
|            10 |
|            20 |
|            30 |
|            40 |
|          NULL |
+---------------+
5 rows in set (0.00 sec)

mysql> SELECT * FROM models WHERE numeric_field >= 30.4;
+---------------+
| numeric_field |
+---------------+
|            40 |
+---------------+
1 row in set (0.00 sec)

-- ✅ Works as expected. I was wrong in my previous example.
mysql> SELECT * FROM models WHERE numeric_field >= '30.4';
+---------------+
| numeric_field |
+---------------+
|            40 |
+---------------+
1 row in set (0.00 sec)

-- ❌ Doesn't work as expected. Notice how `20` is missing from the results.
mysql> SELECT * FROM models WHERE IFNULL(numeric_field, '20.0') >= '20.0';
+---------------+
| numeric_field |
+---------------+
|            30 |
|            40 |
|          NULL |
+---------------+
3 rows in set (0.00 sec)

-- ✅ But if I don't surround the 20.0 inside IFNULL with quotes, I get the expected results.
mysql> SELECT * FROM models WHERE IFNULL(numeric_field, 20.0) >= '20.0';
+---------------+
| numeric_field |
+---------------+
|            20 |
|            30 |
|            40 |
|          NULL |
+---------------+
4 rows in set (0.00 sec)

-- ✅ Works as expected.
mysql> SELECT * FROM models WHERE IFNULL(numeric_field, 20.0) >= 20.0;
+---------------+
| numeric_field |
+---------------+
|            20 |
|            30 |
|            40 |
|          NULL |
+---------------+
4 rows in set (0.00 sec)

This does sound like an issue with how MySQL handles type conversions in IFNULL()... But it would be helpful if we can help prevent running into this issue from Rails somehow.

To summarize:

In previous version of Rails (<7.0):

Model.where("IFNULL(numeric_field, ?) >= ?", 20.0, 20.0)

used to generate:

SELECT * FROM models WHERE IFNULL(numeric_field, 20.0) >= 20.0

But in Rails 7.0 it generates:

SELECT * FROM models WHERE IFNULL(numeric_field, '20.0') >= '20.0'

which triggers an odd behavior in MySQL.

jasonbosco avatar Sep 13 '22 22:09 jasonbosco

I see.

You can solve this issue without a monkey patch with a cast:

mysql> SELECT * FROM models WHERE IFNULL(numeric_field, CAST('20.0' AS DECIMAL)) >= '20.0';
+---------------+
| numeric_field |
+---------------+
|            20 |
|            30 |
|            40 |
|          NULL |
+---------------+
4 rows in set (0.00 sec)

byroot avatar Sep 13 '22 23:09 byroot

For the explanation: https://dev.mysql.com/doc/refman/5.7/en/flow-control-functions.html#function_ifnull

The default return type of IFNULL(expr1,expr2) is the more “general” of the two expressions, in the order STRING, REAL, or INTEGER.

So IFNULL(integer, string) always returns a string, that's why the behavior is a bit confusing, you end up with a string on both sides.

byroot avatar Sep 13 '22 23:09 byroot

Interestingly this works as expected, even though both sides have a string:

mysql> SELECT '20.0' <= '20.0';
+------------------+
| '20.0' <= '20.0' |
+------------------+
|                1 |
+------------------+
1 row in set (0.00 sec)

So it seems to be something specific with IFNULL.

You can solve this issue without a monkey patch with a cast

True, but I'm now wondering if there are similar oddities with other SQL functions, that get triggered by using quotes around numeric values. Need to audit the entire code base before I can remove the monkey patch. Feels like a big gotcha when upgrading Rails versions though, even though it's due to an underlying MySQL gotcha.

jasonbosco avatar Sep 13 '22 23:09 jasonbosco

Interestingly this works as expected, even though one datatype is a string:

Yes, it has to do with how MySQL does casting, and it because of its casting rules we had to make this change.

SELECT '20.0' <= 20.0; one side is a decimal, so MySQL will do a decimal comparison casting the string "20.0" into 20.0.

So it seems to be something specific with IFNULL.

Yes, as I said, since one of the two arguments of IFNULL is a string, that IFNULL will always return a string, so:

SELECT IFNULL(20, "20.0") = "20.0" can be reduced to SELECT "20" = "20.0" (string = string).

byroot avatar Sep 13 '22 23:09 byroot

SELECT IFNULL(20, "20.0") = "20.0" can be reduced to SELECT "20" = "20.0" (string = string).

Ah I see what you mean now. Thank you for that explanation!

jasonbosco avatar Sep 13 '22 23:09 jasonbosco

While a fix is discussed, I just wanted to raise that the docs are currently incorrect: https://api.rubyonrails.org/classes/ActiveRecord/Sanitization/ClassMethods.html#method-i-sanitize_sql_array says that:

sanitize_sql_array(["name=:name and group_id=:group_id", name: "foo'bar", group_id: 4])
# => "name='foo''bar' and group_id=4"

But actually in Rails 7:

[3] pry(main)> ActiveRecord::Base.sanitize_sql_array(["name=:name and group_id=:group_id", name: "foo'bar", group_id: 4])
=> "name='foo\\'bar' and group_id='4'"

viraptor avatar Dec 19 '22 10:12 viraptor

I just wanted to raise that the docs are currently incorrect:

They're not incorrect because this is a MySQL specific behavior. They could mention that of course, but the example you quote is correct on Postgres and Sqlite3

casperisfine avatar Dec 19 '22 12:12 casperisfine

I get that it's a database specific behaviour, but omiting that makes the doc incorrect. As a dev, I see the examples and expect the function to behave like them - but it's doesn't to the point of breaking the query.

viraptor avatar Dec 19 '22 20:12 viraptor

It would be nice if docs would not confuse 50% of Rails users who are running on MySQL.

claasz avatar Jan 04 '23 13:01 claasz

See https://github.com/rails/rails/commit/50d59e9c3e2da833c0122112d744b34f543f4df2

byroot avatar Jan 04 '23 13:01 byroot

We were affected by this as well, after upgrade to v7 such queries don't work anymore:

Model.where('JSON_CONTAINS("[1,2,3]", JSON_ARRAY(?))', [1,2]);

as AR converts [1, 2] into JSON_ARRAY("1", "2") not JSON_ARRAY(1, 2) like before.

aliaksandrb avatar Mar 30 '23 21:03 aliaksandrb

Another example of a generated query that causes a MySQL error due to an integer being encoded as a string in the query:

value = 339
ActiveRecord::Base.sanitize_sql(["ALTER TABLE tbl_name AUTO_INCREMENT = ?", value])
# ALTER TABLE tbl_name AUTO_INCREMENT = '339'

bfad avatar Aug 08 '23 13:08 bfad

This bug breaks all Model.find_by_sequel call when a numeric parameter for LIMIT is passed. Example:

Discount.find_by_sql(['SELECT * FROM discounts LIMIT 1, ?', 1])

You have an error in your SQL syntax; check the manual that corresponds to your MariaDB server version for the right syntax to use near ''1'' at line 1 (Mysql2::Error)

This is a huge regression.

Nowaker avatar Jan 27 '24 23:01 Nowaker