Added ignoreQuoteInToken support to ignore quotes in strings
Added ignoreQuoteInToken support to ignore quotes inside strings, even when other encapsulated tokens in the same row contain a comma. This helps when parsing CSV rows such as abc,"xyz" 123 bar,3,11961034,"First author, Second Author"
Coverage decreased (-0.2%) to 92.618% when pulling ad01ee10977550e226fff4b968b49105fe34170a on ranjithrp:master into 7754cd4c84299e72043067501d2965f55e7ff769 on apache:master.
Coverage increased (+0.08%) to 92.913% when pulling a381d53772829ea17938d48618c5e1dec661179d on ranjithrp:master into 7754cd4c84299e72043067501d2965f55e7ff769 on apache:master.
@ranjithrp, you'll need to provide unit tests so we can clearly assess what you are trying to achieve. Thank you, Gary
I have added the JUnit tests.
Would you mind showing an example with actual expected rows please? It is not clear to me yet if this is a good thing.
In the actual expected row, one column has a value that contains quotes, and another column has a value that contains a comma.
Example row:
abc,"xyz" 123 bar,3,11961034,"First author, Second Author"
Here the second column has the value "xyz" 123 bar, which contains quotes inside the token. The fifth column has the value First author, Second Author, which contains a comma inside the token.
If I use withQuote(null), the quotes around the fifth column are ignored and that value is split in two. If I do not use withQuote(null), the lexer calls parseEncapsulatedToken for the second column and throws an exception when it sees a character other than the delimiter or a newline after the closing quote.
This is the problem we were facing, and I made the above change to handle it.
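To make the expected row concrete, here is a minimal self-contained sketch of the behavior described above. It does not use Commons CSV and is not the code in this pull request: the class name LenientCsvSketch, the method split, and the rule it applies (a leading quote only counts as an encapsulator when its first closing quote is directly followed by the delimiter or the end of the line) are assumptions made for illustration, and escaped quotes and multi-line values are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not the Commons CSV Lexer and not the PR's implementation.
class LenientCsvSketch {

    // Splits one line on commas. A field that starts with a quote is treated as
    // encapsulated only if its first closing quote is immediately followed by a
    // comma or the end of the line; otherwise the quotes are kept as literal text.
    static List<String> split(String line) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos <= line.length()) {
            String token = null;
            int end = -1;
            if (pos < line.length() && line.charAt(pos) == '"') {
                int close = line.indexOf('"', pos + 1);
                boolean encapsulated = close >= 0
                        && (close + 1 == line.length() || line.charAt(close + 1) == ',');
                if (encapsulated) {
                    // Strip the surrounding quotes; commas inside are preserved.
                    token = line.substring(pos + 1, close);
                    end = close + 1;
                }
            }
            if (token == null) {
                // Plain field, or a field whose quotes are ignored: read up to the next comma.
                int comma = line.indexOf(',', pos);
                end = comma < 0 ? line.length() : comma;
                token = line.substring(pos, end);
            }
            tokens.add(token);
            pos = end + 1; // skip the delimiter
        }
        return tokens;
    }
}
```

Under these assumptions, splitting the example row gives five values: abc, then "xyz" 123 bar with its quotes kept, then 3, then 11961034, and finally First author, Second Author without the surrounding quotes.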
@garydgregory I hope the explanation above answers your question. Please let me know if you need any additional details. Thanks
@ranjithrp I would prefer to see a unit test that compares your example input with an actual parsed row, where each column value is asserted for correctness. Thank you, Gary
@garydgregory I have tried to do that in src/test/java/org/apache/commons/csv/LexerTest.java. In one test case, I set the boolean to true, show that the lexer retrieves each token, and assert the values. In another test case, I set the boolean to false and show that the parser throws an exception while parsing a token with quotes inside it. Please let me know if you are looking for anything specific beyond this.
@ranjithrp Could you please rebase on master?