Sequel inserts "0" when insert
is given a Hash in MySQL
bug ruby
Note: This bug was found with an app that is backed by MySQL. It might apply to other sequel backends, I don’t know.
This is the root cause of an interesting bug a colleague and I had to solve at $work. If you want to skip to the MWE of the bug, have a look at the repo: https://github.com/nick96/bug-mwe.
The problem
We use Ruby along with the sequel
library for interacting with the database. Through a series mishandlings
(swiss cheese anyone?)
Dataset#insert
ended up being called with a hash as its argument, rather than
a JSON string. When given the hash, SQL converts it to a conditional, so
db[:table].insert({foo: 'bar', baz: 'quux'})
becomes:
INSERT INTO `table` (`text`) VALUES ((('foo' = 'bar') AND (('baz' = 'quuz'))));
the ((('foo' = 'bar') AND (('baz' = 'quuz'))))
part will evaluate to false
(0 in MySQL). I haven’t dug in to the MySQL source code but my understanding is
that it will happly convert INT
s to VARCHAR
s which is why “0” ends up being
inserted into that row.
The cause of this issue may seem relatively simple, though, as you probably know, things that are simple in an MWE like the one shown in the GitHub repo are not so simple in a moderately complex application!
Debugging
Debugging this issue was really interesting because I got to utilise tools that I’ve never used before. Going top down wasn’t working - things were a bit too opaque and we couldn’t see where the problem was. Instead, we went bottom up. You set set MySQL to log SQL it receives to a file with
SET global general_log = 1;
SET global log_output = 'file';
then use
SELECT @@general_log_file;
to find out which file it is being written to.
It’s then just a matter of tail -f
ing the general_log_file
and reproducing
the bug to see what SQL is being generated by sequel. In the case of the MWE
(and the actual code generated by our app was pretty similar, albeit a bit more
complex) it was:
SET @@wait_timeout = 2147483
SET SQL_AUTO_IS_NULL=0
SELECT version()
SELECT version()
SELECT NULL AS `nil` FROM `test` LIMIT 1
INSERT INTO `test` (`text`) VALUES ((('type' = 'doc') AND ('content' IN ((('type' = 'text') AND ('text' = 'first'))))))
SELECT * FROM `test` WHERE (`id` = 19) LIMIT 1
The line of interest is:
INSERT INTO `test` (`text`) VALUES ((('type' = 'doc') AND ('content' IN ((('type' = 'text') AND ('text' = 'first'))))))
Which you can see is pretty much the same as the example given above and will
evaluate to 0 which MySQL will the cast from an INT
to a VARCHAR
and viola,
“0” has been inserted in to the database.
The solution
The obvious part of the solution is to fix the app to not pass a hash in to
Dataset#insert
and we did do that. However, there is a deeper problem here:
not enforcing the types a function expects. Ruby is a dynamically typed
language so it’s very difficult to enforce functions being called with
arguments of the correct types. No, this is not where I got and recommend that
you go rewrite your application in a more strongly type language. We can get
some of those wins whilst still keeping our existing codebase. At $work we use
Sorbet which works pretty well. You’re able to
incrementally introduce it to your codebase which is really great for using it
on existing code. With the advent of Ruby 3 there is also RBS
(https://github.com/ruby/rbs) which appears to fill a similar niche as Sorbet,
though supported by the Ruby team itself.
I haven’t used RBS but I have used Sorbet and it has definitely saved us from this class of bugs, so if you haven’t already, I definitely recommend starting to migrate to using it.