sub and gsub: how to use "\n" in replace

Open aborruso opened this issue 7 months ago • 8 comments

Hi, if I run echo "a=lorem" | mlr --ocsv sub -a "r" "\n"

I get this output, with a literal \n instead of a line break:

a
lo\nem

Shouldn't sub and gsub also let me use these special characters in the replacement?

Thank you

aborruso avatar May 10 '25 17:05 aborruso

Does csv "standard" support newlines in cells? I think it does, but maybe check with a less strict format.

janxkoci avatar May 22 '25 10:05 janxkoci

Does csv "standard" support newlines in cells?

Yes, it does, and I think this is a small bug. But maybe I am wrong, so I am waiting for @johnkerl to check when he can.

aborruso avatar May 22 '25 10:05 aborruso

Yes, RFC-4180 CSV supports newline within double-quoted cells: https://miller.readthedocs.io/en/6.13.0/file-formats/#csvtsvasvusvetc

johnkerl avatar May 23 '25 19:05 johnkerl

Note that normally \r and/or \n are not within cells, they're delimiters between them.

It's a bug that

echo "a=lorem" | mlr --ocsv sub -a "r" "\n"

produces a cell with \n contained within it, and that that cell isn't being double-quoted as it should be.

johnkerl avatar May 23 '25 19:05 johnkerl

It's a bug that

echo "a=lorem" | mlr --ocsv sub -a "r" "\n"

produces a cell with \n contained within it, and that that cell isn't being double-quoted as it should be.

Do I need to open a new issue for this bug, or can we use this?

Thank you @johnkerl

aborruso avatar May 25 '25 07:05 aborruso

I'll use this issue -- thanks @aborruso !

johnkerl avatar May 27 '25 13:05 johnkerl

A workaround is to pass an actual embedded newline, e.g.:

$ echo "a=lorem" | mlr --ocsv sub -a "r" $'\n'
a
"lo
em"

or:

$ echo "a=lorem" | mlr --ocsv sub -a "r" '
'
a
"lo
em"
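
Both forms work because the shell, not Miller, produces the line feed: mlr receives a real newline character as its argument, whereas "\n" in double quotes arrives as two characters, a backslash and the letter n. The sketch below shows that difference. The variable names escaped and newline are illustrative only, and since $'...' is not available in every shell, the newline is built here in a portable way instead. The final mlr call assumes Miller is installed and is skipped otherwise; given the output shown for $'\n', it should print the same quoted two-line cell, but that is an expectation rather than a tested result.

```shell
# "\n" inside double quotes stays as two characters: backslash and n.
escaped="\n"

# Build a real line feed portably. Command substitution strips trailing
# newlines, so a sentinel "x" is appended and then removed.
newline=$(printf '\nx')
newline=${newline%x}

escaped_len=${#escaped}
newline_len=${#newline}
echo "escaped length: $escaped_len"   # 2
echo "newline length: $newline_len"   # 1

# Assumption: Miller (mlr) is on PATH; skipped silently otherwise.
if command -v mlr >/dev/null 2>&1; then
  echo "a=lorem" | mlr --ocsv sub -a "r" "$newline"
fi
```

Keeping the newline in a variable can be easier to read in scripts than a quoted string that spans two lines.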

agguser avatar Jun 27 '25 03:06 agguser

Thank you very much @agguser

aborruso avatar Jun 27 '25 06:06 aborruso