mysql2sqlite
mysql2sqlite copied to clipboard
Hiccup while processing long comma-delimited string
as per "README.md" from author of "mysql2sqlite" we used these:
-
Dump MySQL DB mysqldump --skip-extended-insert --compact trading_db crb_portfolios_table
-
Convert the dump to SQLite3 DB ./mysql2sqlite crb_portfolios_table | sqlite3 trading_db
This short dump file will reproduce the error:
/*!40101 SET @saved_cs_client = @@character_set_client /;
/!40101 SET character_set_client = utf8 /;
CREATE TABLE crb_portfolios_table
(
portfolio_name
varchar(100) NOT NULL DEFAULT '',
portfolio_selections_string
blob,
PRIMARY KEY (portfolio_name
)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
/!40101 SET character_set_client = @saved_cs_client */;
INSERT INTO crb_portfolios_table
VALUES ('all active futures liquid 15-80 cash 5',_binary 'AA,AD,AZ,BO,BP,C-,CB,CC,CD,CL,CT,DF,EC,ED,ES,FF,FX,GC,HG,HH,HO,HS,JY,KC,KS,KW,LC,LH,LJ,MX,MZ,NG,NH,NN,NO,NP,NX,OX,PI,RB,RJ,S-,SB,SI,SM,SP,SS,SZ,TP,TT,VX,W-,X-,ZM');
INSERT INTO crb_portfolios_table
VALUES ('all active futures liquid 5-30 all inclusive',_binary '5L,A2,AA,AD,AZ,B-,BF,BG,BM,BO,BP,BS,BT,C-,CA,CB,CC,CD,CG,CK,CL,CT,DF,DI,DR,DX,EC,ED,ES,F-,FA,FB,FC,FF,FV,FX,FY,GC,GH,GM,GX,HG,HH,HN,HO,HS,IA,IB,IE,II,IL,IT,IX,JY,K6,K7,K8,KB,KC,KD,KL,KM,KS,KW,L-,LC,LF,LH,LJ,LL,LO,LQ,LW,ME,MH,MW,MX,MZ,ND,NE,NF,NG,NH,NI,NK,NN,NO,NP,NV,NX,OT,OX,PA,PI,PL,PO,PP,PU,QB,QE,R-,RB,RE,RJ,S-,SB,SF,SI,SM,SP,SS,SV,SZ,T2,TE,TO,TP,TQ,TS,TT,TU,TY,U0,UG,US,V4,V5,V9,VA,VC,VD,VH,VI,VO,VP,VR,VX,W-,WB,WC,WG,WI,WO,WQ,WR,X-,XO,XT,XU,XV,Y1,Y2,YC,YD,YE,YG,YK,YQ,YS,YT,YV,YW,YX,Z0,Z2,ZM,ZN,ZW');
INSERT INTO crb_portfolios_table
VALUES ('all active options liquid 15-80 cash 5',_binary 'AD,BO,BP,C-,CB,CC,CD,CL,CT,DF,EC,ED,ES,FF,FX,GC,HG,HO,HS,JY,KC,KW,LC,LH,MX,MZ,NG,NN,NX,RB,RJ,S-,SB,SI,SM,SP,SS,SZ,W-,X-,ZM');
INSERT INTO crb_portfolios_table
VALUES ('all futures active and liquid',_binary '2B,2C,2G,2H,2I,2J,2K,2L,2N,2S,2T,2U,2V,2W,5A,5D,5F,5G,5L,5P,5S,5V,5Z,AA,AD,AZ,B-,BF,BM,BO,BP,BS,BT,C-,CA,CB,CC,CD,CG,CK,CL,DF,EC,ED,EI,ES,F-,FA,FF,FV,FX,GC,GH,GM,HG,HH,HN,HO,HS,IL,JY,K6,K7,KB,KC,KD,KS,KW,L-,LC,LF,LH,LJ,MH,MX,MZ,ND,NG,NH,NN,NO,NP,NX,OT,OX,PI,PO,R-,RB,RE,RJ,S-,SB,SI,SM,SP,SS,SZ,TS,TT,TU,TY,U0,UG,US,VA,VC,VD,VH,VI,VP,VR,VX,W-,WG,WI,WJ,WO,WQ,WS,X-,XO,XT,XU,XV,YC,YG,YK,YQ,YS,Z2,ZM');
INSERT INTO crb_portfolios_table
VALUES ('all options active and liquid',_binary 'AD,B-,BF,BO,BP,C-,CB,CC,CD,CK,CL,CT,DF,EC,ED,ES,F-,FF,FV,FX,GC,GH,GM,HG,HO,HS,JY,KC,KW,L-,LC,LF,LH,MH,MX,MZ,NG,NN,NP,NX,R-,RB,RJ,S-,SB,SI,SM,SP,SS,SZ,TU,TY,UG,US,VD,VI,W-,WC,WI,X-,ZM');
INSERT INTO crb_portfolios_table
VALUES ('Currencies active and liquid futures',_binary '5A,5D,5F,5G,5P,5V,AA,AD,BP,CD,EC,JY,KD,NP,U0,VD');
Thanks for the report. Could you please paste the SQL code to a formatted paragraph (using 3 consecutive bacticks on a pre-paragraph line and another 3 on a post-paragraph line)? Currently when I formatted the SQL code, mysql2sqlite
worked for me without any issues.
Spoke with author of the above bug report.
more info as of "Fri Jan 10 21:00:37 CST 2020"
The tables that cause errors are crb_portfolios2_table crb_portfolios_table and are the only 2 tables that contain a column with the data type of "BLOB"'
It seems that mysqldump exports data of type "blob" by putting the string "_binary " before the data to be inserted It imports OK if I remove the "_binary " string from the "INSERT" directive
We are able to work around the problem.
The following are the only substitutions made to INSERT...
lines:
# first replace \\ by \_ that mysqldump never generates to deal with
# sequnces like \\n that should be translated into \n, not \<LF>.
# After we convert all escapes we replace \_ by backslashes.
gsub( /\\\\/, "\\_" )
# single quotes are escaped by another single quote
gsub( /\\'/, "''" )
gsub( /\\n/, "\n" )
gsub( /\\r/, "\r" )
gsub( /\\"/, "\"" )
gsub( /\\\032/, "\032" ) # substitute char
gsub( /\\_/, "\\" )
So I still can't find anything regarding _binary
. If you'll find some time, please post your SQL code here (but formatted please to not lose any information). If it's a bug in mysql2sqlite
, I'd like to fix it.
/*!40101 SET @saved_cs_client = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `crb_portfolios_table` (
`portfolio_name` varchar(100) NOT NULL DEFAULT '',
`portfolio_selections_string` blob,
PRIMARY KEY (`portfolio_name`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
/*!40101 SET character_set_client = @saved_cs_client */;
INSERT INTO `crb_portfolios_table` VALUES ('all active futures liquid 15-80 cash 5',_binary 'AA,AD,AZ,BO,BP,C-,CB,CC,CD,CL,CT,DF,EC,ED,ES,FF,FX,GC,HG,HH,HO,HS,JY,KC,KS,KW,LC,LH,LJ,MX,MZ,NG,NH,NN,NO,NP,NX,OX,PI,RB,RJ,S-,SB,SI,SM,SP,SS,SZ,TP,TT,VX,W-,X-,ZM');
INSERT INTO `crb_portfolios_table` VALUES ('all active futures liquid 5-30 all inclusive',_binary '5L,A2,AA,AD,AZ,B-,BF,BG,BM,BO,BP,BS,BT,C-,CA,CB,CC,CD,CG,CK,CL,CT,DF,DI,DR,DX,EC,ED,ES,F-,FA,FB,FC,FF,FV,FX,FY,GC,GH,GM,GX,HG,HH,HN,HO,HS,IA,IB,IE,II,IL,IT,IX,JY,K6,K7,K8,KB,KC,KD,KL,KM,KS,KW,L-,LC,LF,LH,LJ,LL,LO,LQ,LW,ME,MH,MW,MX,MZ,ND,NE,NF,NG,NH,NI,NK,NN,NO,NP,NV,NX,OT,OX,PA,PI,PL,PO,PP,PU,QB,QE,R-,RB,RE,RJ,S-,SB,SF,SI,SM,SP,SS,SV,SZ,T2,TE,TO,TP,TQ,TS,TT,TU,TY,U0,UG,US,V4,V5,V9,VA,VC,VD,VH,VI,VO,VP,VR,VX,W-,WB,WC,WG,WI,WO,WQ,WR,X-,XO,XT,XU,XV,Y1,Y2,YC,YD,YE,YG,YK,YQ,YS,YT,YV,YW,YX,Z0,Z2,ZM,ZN,ZW');
INSERT INTO `crb_portfolios_table` VALUES ('all active options liquid 15-80 cash 5',_binary 'AD,BO,BP,C-,CB,CC,CD,CL,CT,DF,EC,ED,ES,FF,FX,GC,HG,HO,HS,JY,KC,KW,LC,LH,MX,MZ,NG,NN,NX,RB,RJ,S-,SB,SI,SM,SP,SS,SZ,W-,X-,ZM');
INSERT INTO `crb_portfolios_table` VALUES ('all futures active and liquid',_binary '2B,2C,2G,2H,2I,2J,2K,2L,2N,2S,2T,2U,2V,2W,5A,5D,5F,5G,5L,5P,5S,5V,5Z,AA,AD,AZ,B-,BF,BM,BO,BP,BS,BT,C-,CA,CB,CC,CD,CG,CK,CL,DF,EC,ED,EI,ES,F-,FA,FF,FV,FX,GC,GH,GM,HG,HH,HN,HO,HS,IL,JY,K6,K7,KB,KC,KD,KS,KW,L-,LC,LF,LH,LJ,MH,MX,MZ,ND,NG,NH,NN,NO,NP,NX,OT,OX,PI,PO,R-,RB,RE,RJ,S-,SB,SI,SM,SP,SS,SZ,TS,TT,TU,TY,U0,UG,US,VA,VC,VD,VH,VI,VP,VR,VX,W-,WG,WI,WJ,WO,WQ,WS,X-,XO,XT,XU,XV,YC,YG,YK,YQ,YS,Z2,ZM');
INSERT INTO `crb_portfolios_table` VALUES ('all options active and liquid',_binary 'AD,B-,BF,BO,BP,C-,CB,CC,CD,CK,CL,CT,DF,EC,ED,ES,F-,FF,FV,FX,GC,GH,GM,HG,HO,HS,JY,KC,KW,L-,LC,LF,LH,MH,MX,MZ,NG,NN,NP,NX,R-,RB,RJ,S-,SB,SI,SM,SP,SS,SZ,TU,TY,UG,US,VD,VI,W-,WC,WI,X-,ZM');
INSERT INTO `crb_portfolios_table` VALUES ('Currencies active and liquid futures',_binary '5A,5D,5F,5G,5P,5V,AA,AD,BP,CD,EC,JY,KD,NP,U0,VD');
This is the script we use to work around the problem.
#!/opt/local/bin/perl
#
# crb.export.for.sqlite - export select tables for import to sqlite
#
# 20200105-01 original incarnation
# 20200106-01 added code (line 28) so that Mike can put the output wherever he likes - currently set to the users home directory
# 20200112-01 added code to remove "_binary " string from output files that contain "blob" data type
#
# use diagnostics;
use strict;
use warnings;
use 5.010;
if( $#ARGV > -1 ) { # if value was passed
print "[$ARGV[0]] passed as argument\n"; # print debug message
die "no argumants allowed\n"; # and exit
}
print "crb.export.for.sqlite .................................";
my($DATE) = &time_stamp() ;
print " - START $DATE\n";
my($TIMESTAMP) = &time_stamp() ; # create time stamp
my($DAY_OF_WEEK) = `date +%a|tr -d "\n"`; # get day of week
my($LOG_STAMP) = "$TIMESTAMP $DAY_OF_WEEK"; # create log timestamp
my($CUTOFF_DATE) = &cutoff_date() ; # set CUTOFF_DATE back 10 days to rule out symbols no longer trading
# my($BASE_DIRECTORY) = "/trading/data/crb"; # where the crb files reside - ben
my($BASE_DIRECTORY) = $ENV{'HOME'}; # where the crb files reside - mike
if ( ! -d "$BASE_DIRECTORY" ) { # if directory does not exist
die"[$BASE_DIRECTORY] does not exist\n\n"; # then complain and exit
}
my($EXPORT_DIRECTORY) = "$BASE_DIRECTORY/sqlite.export"; # where the sqlite.export files reside
if ( ! -d "$EXPORT_DIRECTORY" ) { # if directory does not exist
mkdir("$EXPORT_DIRECTORY"); # then make one
}
my($SQLITE3_DIRECTORY) = "$BASE_DIRECTORY/sqlite3.database"; # where the sqlite database files reside
if ( ! -d "$SQLITE3_DIRECTORY" ) { # if directory does not exist
mkdir("$SQLITE3_DIRECTORY"); # then make one
}
my(@LIST_OF_TABLES) = `echo "SHOW TABLES" | /usr/local/mysql/bin/mysql --skip-column-names trading_db`; # get list of table names
my(@LIST_OF_ACTIVE_OPTIONS) = `echo "select symbol
FROM crb_active_options_table
WHERE f_max_date>\\\"$CUTOFF_DATE\\\"
AND (ccm_exchange=\\\"CBOT\\\"
OR ccm_exchange=\\\"CME\\\"
OR ccm_exchange=\\\"ICE-US\\\"
OR ccm_exchange=\\\"NYMEX\\\")
;
" | /usr/local/mysql/bin/mysql --skip-column-names trading_db`; # get list of table names for US exchanges
my($TABLE_NAME);
my($ACTIVE_OPTION);
my($test_symbol);
my(@EXPORT_TABLE_NAMES);
my($LINE_COUNTER)=0;
# print "-----------\n";
foreach $ACTIVE_OPTION (@LIST_OF_ACTIVE_OPTIONS) { # convert symbols to table name endings
chomp($ACTIVE_OPTION);
$ACTIVE_OPTION =~ tr/-/_/;
# $LINE_COUNTER++;
# print "$LINE_COUNTER [$ACTIVE_OPTION]\n";
}
# print "-----------\n";
#exit;
print "-----------\n";
foreach $TABLE_NAME (@LIST_OF_TABLES) { # only pass names of tables to export
chomp($TABLE_NAME);
if (index($TABLE_NAME, "crb_avg_history_table") != -1) { # skip crb_avg_history_table
next;
}
if (index($TABLE_NAME, "crb_futures_table") == 0) { # if crb_futures_table
$test_symbol=substr($TABLE_NAME,-2); # get symbol
if (grep { $test_symbol eq $_ } @LIST_OF_ACTIVE_OPTIONS) {
;
# print "test_symbol $test_symbol is in LIST_OF_ACTIVE_OPTIONS\n";
} else {
next;
}
}
if (index($TABLE_NAME, "crb_options_table") == 0) { # if crb_options_table
$test_symbol=substr($TABLE_NAME,-2); # get symbol
if (grep { $test_symbol eq $_ } @LIST_OF_ACTIVE_OPTIONS) {
;
# print "test_symbol $test_symbol is in LIST_OF_ACTIVE_OPTIONS\n";
} else {
next;
}
}
$LINE_COUNTER++;
my(@BLOB_INDICATOR) = `echo "describe \\\`$TABLE_NAME\\\`
" | /usr/local/mysql/bin/mysql -N trading_db`;
chomp(@BLOB_INDICATOR);
## if ($LINE_COUNTER < 169) { # code for testing to limit run to a specific subset of tables
## next;
## }
# print "$LINE_COUNTER [$TABLE_NAME]\n";
# print "$LINE_COUNTER [mysqldump trading_db $TABLE_NAME > $EXPORT_DIRECTORY/$TIMESTAMP.$TABLE_NAME]\n"
$DATE = &time_stamp() ;
printf '%4s',$LINE_COUNTER;
print " $DATE.$TABLE_NAME";
if (grep { (index($_, "blob") != -1) } @BLOB_INDICATOR) {
print " - [contains blob]";
# remove string "_binary " from output file;
`mysqldump --skip-extended-insert --compact trading_db $TABLE_NAME | sed 's/_binary //' > $EXPORT_DIRECTORY/$TIMESTAMP.$TABLE_NAME`; # mysql2sqlite version
} else {
`mysqldump --skip-extended-insert --compact trading_db $TABLE_NAME > $EXPORT_DIRECTORY/$TIMESTAMP.$TABLE_NAME`; # mysql2sqlite version
}
print "\n";
`mysql2sqlite $EXPORT_DIRECTORY/$TIMESTAMP.$TABLE_NAME | sqlite3 $SQLITE3_DIRECTORY/trading_db`; # import data into sqlite3
`rm $EXPORT_DIRECTORY/$TIMESTAMP.$TABLE_NAME`; # delete dump file after import into sqlite3 database
## if ($LINE_COUNTER >= 5) { # code for testing to limit run to a set number of tables
## exit;
## }
}
print "-----------\n";
print "crb.export.for.sqlite .................................";
$DATE = &time_stamp() ;
print " - DONE - $DATE\n";
exit;
# ____________________________________________________________________________________
# subroutines
# _____________
sub time_stamp {
my ($stamp);
my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
$year += 1900;
$mon++;
$stamp = sprintf("%4d%2.2d%2.2d-%2.2d%2.2d%2.2d",$year,$mon,$mday,$hour,$min,$sec);
return($stamp);
}
# _____________
sub cutoff_date {
my ($now) = time();
my ($cutoff_date) = $now - (10 * 86400);
my ($day, $mon, $year) = (localtime($cutoff_date))[3, 4, 5];
$cutoff_date=sprintf("%04d-%02d-%02d", $year+1900, $mon+1, $day);
return($cutoff_date)
}
@crenelle thanks for the additional info. I wasn't fully aware of this complication with _binary
. I think a better solution than removing _binary
could be to use mysqldump --hex-blob ...
- could you test it on your DB?
On 1/12/20 11:10 AM, dumblob wrote:
@crenelle https://github.com/crenelle thanks for the additional info. I wasn't fully aware of this complication with |_binary|. I think a better solution than removing |_binary| could be to use |mysqldump --hex-blob ...| - could you test it on your DB?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/dumblob/mysql2sqlite/issues/66?email_source=notifications&email_token=AACK6RO42NR7TDYTZEFQTYDQ5NTK7A5CNFSM4KEAP3X2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEIXBWQY#issuecomment-573446979, or unsubscribe https://github.com/notifications/unsubscribe-auth/AACK6RPSB74QP5SGLAC2F5DQ5NTK7ANCNFSM4KEAP3XQ.
Sure can. Ben has been doing the wild-eyed scripting, I'll tell him about the different approach tomorrow!
Tnx,
-Mike
--
Michael Brian Bentley Application Development Consulting Crenelle Technologies LLC
[email protected] +1 (206) 601-7758 17823 McLean Rd SW Vashon, WA 98070-5421
@crenelle thanks for the additional info. I wasn't fully aware of this complication with
_binary
. I think a better solution than removing_binary
could be to usemysqldump --hex-blob ...
- could you test it on your DB?
after using --hex-blob in mysqldump command, mysql2sqlite script will warn WARN Hex number trimmed (length longer than 16 chars).
and then truncate the blob data.
can i fix issue by adding gsub( /^_binary/, "" ) after ?
# first replace \\ by \_ that mysqldump never generates to deal with
# sequnces like \\n that should be translated into \n, not \<LF>.
# After we convert all escapes we replace \_ by backslashes.
gsub( /\\\\/, "\\_" )
# single quotes are escaped by another single quote
gsub( /\\'/, "''" )
gsub( /\\n/, "\n" )
gsub( /\\r/, "\r" )
gsub( /\\"/, "\"" )
gsub( /\\\032/, "\032" ) # substitute char
gsub( /\\_/, "\\" )
I'm really busy nowadays. Could you prepare a PR like https://github.com/dumblob/mysql2sqlite/pull/68/files (i.e. the patch + test cases) so that I can test it quickly? Thanks!