Monkey-patching around a MySQL/UTF8 bug

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Monkey-patching around a MySQL/UTF8 bug

David Cantrell
This bug in DBD::mysql is apparently not going to be fixed:
  https://rt.cpan.org/Ticket/Display.html?id=60987

and it's preventing us from inserting, eg, an i-acute character into our
database. Our customer Mr. García is Quite Irritated at this. It appears
that any non-ASCII character with code-point below 0x100 is affected
(higher codepoints like ij and ψ and ☃ are OK). As a work-around I've
done this in my application:

BEGIN {
    my $old_ex = \&DBIx::Class::Storage::DBI::_dbh_execute;
    my $new_ex = sub {
        foreach (@{$_[3]}) {
            if(exists($_->[1]) && defined($_->[1])) {
                utf8::upgrade($_->[1])
            }
        }
        return $old_ex->(@_);
    };

    {
        no strict qw/ refs /;
        no warnings 'redefine';
        *DBIx::Class::Storage::DBI::_dbh_execute = $new_ex;
    }
}

And it appears to work. However, I don't like monkey-patching like that.
Is there a better way that I haven't been able to find in the
DBIx::Class doco?

--
David Cantrell | Hero of the Information Age

  Sobol's Law of Telecom Utilities:
    Telcos are malicious; cablecos are simply clueless.

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Reply | Threaded
Open this post in threaded view
|

Re: Monkey-patching around a MySQL/UTF8 bug

Hailin Hu
Give a try to connect with the option mysql_enable_utf8 = 1
http://search.cpan.org/~capttofu/DBD-mysql-4.029/lib/DBD/mysql.pm#DATABASE_HANDLES

On Thu, Feb 26, 2015 at 11:43 PM, David Cantrell <[hidden email]> wrote:

> This bug in DBD::mysql is apparently not going to be fixed:
>   https://rt.cpan.org/Ticket/Display.html?id=60987
>
> and it's preventing us from inserting, eg, an i-acute character into our
> database. Our customer Mr. García is Quite Irritated at this. It appears
> that any non-ASCII character with code-point below 0x100 is affected
> (higher codepoints like ij and ψ and ☃ are OK). As a work-around I've
> done this in my application:
>
> BEGIN {
>     my $old_ex = \&DBIx::Class::Storage::DBI::_dbh_execute;
>     my $new_ex = sub {
>         foreach (@{$_[3]}) {
>             if(exists($_->[1]) && defined($_->[1])) {
>                 utf8::upgrade($_->[1])
>             }
>         }
>         return $old_ex->(@_);
>     };
>
>     {
>         no strict qw/ refs /;
>         no warnings 'redefine';
>         *DBIx::Class::Storage::DBI::_dbh_execute = $new_ex;
>     }
> }
>
> And it appears to work. However, I don't like monkey-patching like that.
> Is there a better way that I haven't been able to find in the
> DBIx::Class doco?
>
> --
> David Cantrell | Hero of the Information Age
>
>   Sobol's Law of Telecom Utilities:
>     Telcos are malicious; cablecos are simply clueless.
>
> _______________________________________________
> List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
> IRC: irc.perl.org#dbix-class
> SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
> Searchable Archive: http://www.grokbase.com/group/dbix-class@...

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Reply | Threaded
Open this post in threaded view
|

Re: Monkey-patching around a MySQL/UTF8 bug

David Cantrell
On Thu, Feb 26, 2015 at 11:53:49PM +0900, Hailin Hu wrote:

> Give a try to connect with the option mysql_enable_utf8 = 1

We're already doing that.

--
David Cantrell | Reality Engineer, Ministry of Information

Disappointment:
  n: No results found for "priapic dwarf custard wrestling".

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Reply | Threaded
Open this post in threaded view
|

Re: Monkey-patching around a MySQL/UTF8 bug

David Cantrell
In reply to this post by David Cantrell
On Thu, Feb 26, 2015 at 04:55:10PM +0100, Peter Rabbitson wrote:
> On 02/26/2015 03:43 PM, David Cantrell wrote:
> >This bug in DBD::mysql is apparently not going to be fixed:
> >   https://rt.cpan.org/Ticket/Display.html?id=60987
> Before we go further - is there explicit statement from the DBD devs
> that this is not going to be fixed? Perhaps just forgotten...?

Not authoritative, no.

But it's several years old, and there are a few related tickets going
back years too. And I need a fix *now*, hence my monkey-patch. Some of
those tickets include claims that fixing it would break
backward-compatibility.

> There isn't an overrideable point exposed for this, because "ugh".

That is, alas, what I expected.

> Also it never came up until now.

My patch is an adaptation of one that someone else wrote that
monkey-patched DBI instead. I guess that it didn't come up because other
people are patching closer to the bug, or they're just not mentioning
it, because "ugh".

>                             Also (as noted in that RT) - this will badly
> break BLOB operations.

Good point. That prompted me to look over our code. We had two BLOB
fields, one of which can be replaced with a TEXT. The other really does
contain binary data, but it seems that it's getting corrupted anyway
even without my monkey-patch. Yuck.

--
header   FROM_DAVID_CANTRELL    From =~ /david.cantrell/i
describe FROM_DAVID_CANTRELL    Message is from David Cantrell
score    FROM_DAVID_CANTRELL    15.72 # This figure from experimentation

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Reply | Threaded
Open this post in threaded view
|

Inserting binary data without corrupting (was: Re: Monkey-patching around a MySQL/UTF8 bug)

Sam Kington
On 26 Feb 2015, at 18:14, David Cantrell <[hidden email]> wrote:
> On Thu, Feb 26, 2015 at 04:55:10PM +0100, Peter Rabbitson wrote:
[...]
>> Also (as noted in that RT) - this will badly
>> break BLOB operations.
>
> Good point. That prompted me to look over our code. We had two BLOB
> fields, one of which can be replaced with a TEXT. The other really does
> contain binary data, but it seems that it's getting corrupted anyway
> even without my monkey-patch. Yuck.

The problem appeared to be encoding issues between DBI and mysql. Delving into DBIx::Class::Storage::DBI:_dbh_execute I found that we were correctly passing a byte-encoded string from DBIx::Class, but it ended up corrupted in the database:

>   DB<11> x $sql                                                                
> 0  'INSERT INTO customer_static_file ( bytes, customer_id, date_created, date_updated, filename, user_id) VALUES ( ?, ?, ?, ?, ?, ? )'
>   DB<12> x $bind                                                                
> 0  ARRAY(0x122bf2b8)
>    0  ARRAY(0x122bef70)
>       0  HASH(0x122780a8)
>          'dbic_colname' => 'bytes'
>          'sqlt_datatype' => 'LONGBLOB'
>       1  "?PNG\cM\cJ\cZ\cJ??Stuff🝖"

DBI appears to know that this is a binary column, but that's not helping.

The solution I arrived at after some googling was as follows:

> __PACKAGE__->load_components('FilterColumn');
> __PACKAGE__->filter_column(
>     'bytes',
>     {
>         filter_from_storage => sub { pop },
>         filter_to_storage => sub {
>             my ($resultset, $binary_data) = @_;
>             my $encoded_string = sprintf(
>                 "x'%s'",
>                 join('',
>                     map { sprintf('%02x', ord($_)) } split('', $binary_data))
>             );
>             return \$encoded_string;
>         },
>     }
> );

which uses MySQL's preferred way of encoding binary data on the command line.

Is this the best way of handling this, or is there a simpler or purer way I've overlooked?

Sam
--
Website: http://www.illuminated.co.uk/


_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Reply | Threaded
Open this post in threaded view
|

Re: Inserting binary data without corrupting

Sam Kington
On 27 Feb 2015, at 07:54, Peter Rabbitson <[hidden email]> wrote:
> On 02/26/2015 11:38 PM, Sam Kington wrote:
>> Is this the best way of handling this, or is there a simpler or purer way I've overlooked?
>
> Interesting. This is one way of handling it, yes (btw `unpack 'H*', $binary data` would be much faster). If this definitely fixes things on a combination of a DBD and RDBMS version (you never said what you are using) - I can look into making this implicitly handled by the driver itself.

MySQL 5.1.73-1, DBIx::Class 0.082810, DBI 1.622, DBD::mysql 4.022.

Sam
--
Website: http://www.illuminated.co.uk/


_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...
Reply | Threaded
Open this post in threaded view
|

Re: Inserting binary data without corrupting

David Cantrell
On Fri, Feb 27, 2015 at 12:43:11PM +0000, Sam Kington wrote:
> On 27 Feb 2015, at 07:54, Peter Rabbitson <[hidden email]> wrote:
> > On 02/26/2015 11:38 PM, Sam Kington wrote:
> >> Is this the best way of handling this, or is there a simpler or purer way I've overlooked?
> >
> > Interesting. This is one way of handling it, yes (btw `unpack 'H*', $binary data` would be much faster). If this definitely fixes things on a combination of a DBD and RDBMS version (you never said what you are using) - I can look into making this implicitly handled by the driver itself.
>
> MySQL 5.1.73-1, DBIx::Class 0.082810, DBI 1.622, DBD::mysql 4.022.

Same happens in MySQL 5.5 with DBD::mysql $latest as well.

--
David Cantrell | London Perl Mongers Deputy Chief Heretic

You don't need to spam good porn

_______________________________________________
List: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/dbix-class
IRC: irc.perl.org#dbix-class
SVN: http://dev.catalyst.perl.org/repos/bast/DBIx-Class/
Searchable Archive: http://www.grokbase.com/group/dbix-class@...