percona-toolkit
Patch newlines in table columns
MySQL 5.6.40 allows newlines in column names; however, the following code:
my @defs = $ddl =~ m/^(\s+`.*?),?$/gm;
breaks because it treats every newline as the end of a line. The 'm' modifier at the end causes this: it makes ^ and $ match at every embedded newline, so a column name containing a newline gets cut in two.
To correct this issue I've made use of zero-length assertions known as "positive lookbehind":
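To make the failure concrete, here is a minimal sketch that runs the original pattern against a made-up table definition. The DDL string and its column names are hypothetical, chosen only so that one column name contains a literal newline.

```perl
use strict;
use warnings;

# Hypothetical DDL: the second column name contains a literal newline.
my $ddl = "CREATE TABLE `t` (\n"
        . "  `id` int(11) NOT NULL,\n"
        . "  `bad\nname` varchar(10) DEFAULT NULL,\n"
        . "  `c` int(11) DEFAULT NULL\n"
        . ") ENGINE=InnoDB";

# Original pattern: with the m modifier, ^ and $ match around every
# embedded newline, including the one inside the column name.
my @defs = $ddl =~ m/^(\s+`.*?),?$/gm;

print scalar(@defs), " definitions found\n";
print "[$_]\n" for @defs;
```

With this input the second captured definition stops at the newline inside the column name, so the rest of that column's definition is lost.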
https://www.regular-expressions.info/lookaround.html
What does it do?
m/(?:(?<=,\n)|(?<=\(\n))(\s+`(?:.|\n)+?`.+?),?\n/g;
TLDR:
Treat the string as one long string and don't treat \n as the end of a line.
look for (\s+`(?:.|\n)+?`.+?),?\n
if one of those matches look at what precedes the string
if it's ',\n' or '(\n' the string matches. Only save what's in (\s+`(?:.|\n)+?`.+?)
m/ is declaring this a matching regex.
(?:(?<=,\n)|(?<=\(\n))
This is an OR statement containing two look-behind clauses. The ?: tells the enclosing parentheses not to store the result as a capture. The two look-behinds in this OR statement are explained below:
(?<=,\n) Look behind the matched string for a comma followed by a newline; the comma must be there for this look-behind to match.
(?<=\(\n) Look behind the matched string for an open parenthesis followed by a newline; the open parenthesis must be there.
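As a small illustration of how a look-behind asserts on the preceding text without making it part of the match, consider this sketch. The input string is invented for the example.

```perl
use strict;
use warnings;

# Hypothetical input: a comma and newline, then a word character.
my $text = "a,\nb";

my ($matched, $offset);
if ($text =~ /(?<=,\n)(\w)/) {
    $matched = $1;      # the captured character
    $offset  = $-[0];   # where the overall match starts
    print "matched '$matched' at offset $offset\n";
}
```

The match starts at the character after the newline: the comma and newline are required, but they are not consumed and do not appear in the captured text.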
(\s+`(?:.|\n)+?`.+?),?\n
This is the actual match: one or more whitespace characters, followed by a backtick, followed by one or more characters that may each be any character or a newline (matched lazily, so it takes as little as possible while still letting the rest of the pattern match), followed by a closing backtick and then one or more non-newline characters, again matched lazily. The match stops at the first point after the closing backtick and at least one further character where an optional comma is followed by a newline.
,?\n matches a comma that may not be there, followed by a newline. /g means don't stop at the first match; keep looking for more matches through to the end of the string.
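The same pattern can be written with the x modifier so each piece carries its own comment. This is only a sketch for reading and experimenting, not the code in the patch; the DDL string and the variable name $column_def_re are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical DDL: the second column name contains a literal newline.
my $ddl = "CREATE TABLE `t` (\n"
        . "  `id` int(11) NOT NULL,\n"
        . "  `bad\nname` varchar(10) DEFAULT NULL,\n"
        . "  `c` int(11) DEFAULT NULL\n"
        . ") ENGINE=InnoDB";

# Same pattern as above, spread out with the x modifier for readability.
my $column_def_re = qr/
    (?: (?<=,\n) | (?<=\(\n) )   # must follow a comma or open paren, then newline
    (                            # capture the column definition
        \s+ `                    # indentation and the opening backtick
        (?: . | \n )+?           # column name, newlines allowed, lazy
        `                        # closing backtick
        .+?                      # rest of the definition, lazy, no newlines
    )
    ,? \n                        # optional trailing comma, then the line end
/x;

my @defs = $ddl =~ /$column_def_re/g;

print scalar(@defs), " definitions found\n";
print "[$_]\n" for @defs;
```

On this input the column whose name spans two lines is captured as a single definition, newline included, alongside the two ordinary columns.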
Does anyone have a quick one-liner they use to run the tests? After reading the Makefile and editing the way it globs test files, I got /usr/bin/perl -MExtUtils::Command::MM -MTest::Harness -e "undef Test::Harness::Switches; test_harness(0, 'blib/lib', 'blib/arch')" t//*.t
But every test fails, so I don't think this is right.
I may have a test that works: I took the test for Bug 821675 (can't parse column names containing periods) and copied it to test for newlines instead. But I'll be honest, I am out of practice with this stuff and have no idea how to run the test suite to figure out whether my test is good or not.
In https://github.com/percona/percona-toolkit/blob/3.0/CONTRIBUTE.md#setting-up-the-development-environment you can find instructions about how to set up the sandbox and run the tests.
Thanks.
Sorry guys, I ran out of time to work on this issue and won't be coming back to it.