
Find and replace special character & with and using ssub

Open karudonaldson opened this issue 1 year ago • 2 comments

Check Column D (four) for special characters:

  1. Replace & with string AND
  2. Remove characters ' ( ) /

$ cat example.csv

one,two,three,four,five
pan,pan,1,&,10
wye,wye,1,&,20
eks,wye,1,'',10
zee,pan,1,(test),60

After transformation:

$ cat example.csv

one,two,three,four,five
pan,pan,1,and,10
wye,wye,1,and,20
eks,wye,1,,10
zee,pan,1,,60

karudonaldson avatar Apr 18 '24 02:04 karudonaldson

@karudonaldson please always describe what you tried and what didn't work for you, not just what you want.

That way it will be possible to give you a better answer. Thank you.

aborruso avatar Apr 18 '24 06:04 aborruso

@aborruso I think this question is well-posed

@karudonaldson how about:

cat 1546.csv

one,two,three,four,five
pan,pan,1,&,10
wye,wye,1,&,20
eks,wye,1,'',10
zee,pan,1,(test),60

cat 1546.mlr

for (k in ["four"]) {
    v = $[k];
    if (typeof(v) == "string") {
        v = ssub(v, "&", "AND");
        v = gsub(v, "['()/]", "");
    }
    $[k] = v;
}

mlr --csv --from 1546.csv put -f 1546.mlr

one,two,three,four,five
pan,pan,1,AND,10
wye,wye,1,AND,20
eks,wye,1,,10
zee,pan,1,test,60

johnkerl avatar Apr 18 '24 12:04 johnkerl

I believe this is resolved -- @karudonaldson if I'm mistaken please let me know and we can re-open. Thank you!

johnkerl avatar Jun 09 '24 00:06 johnkerl

Apologies for not responding sooner; this rule hadn't been confirmed yet.

The objective should be: if a value in column three or four starts with 123 (e.g. 123abc or 123def), replace the whole value with 123replaced.

cat input.csv

one,two,three,four,five
pan,pan,123xx,123ab,30
wye,wye,123,123,y20
eks,wye,123yz,123xy,10
zee,pan,123test,123test2,60

Expected output:

cat output.csv

one,two,three,four,five
pan,pan,123replaced,123replaced,30
wye,wye,123replaced,123replaced,y20
eks,wye,123replaced,123replaced,10
zee,pan,123replaced,123replaced,60

Using ssub:

Test 1 - failed, no fields were changed:

mlr --csv --from ./input.csv put -f ./1546.mlr > output.csv

cat 1546.mlr

'
for (k in ["three"] || ["four"]) {
    v = $[k];
    if (typeof(v) == "string") {
        v = ssub(v, "^123", "123replaced");
    }
    $[k] = v;
}
'

Test 2 - failed, no fields were changed:

cat 1546.mlr

'
for (k in ["three"] || ["four"]) {
    v = $[k];
    if (typeof(v) == "string") {
        v = ssub(v, "123*", "123replaced");
    }
    $[k] = v;
}
'

Presumably it's the value1 string I'm passing to ssub(v, value1, value2) in the MLR file that is incorrect. I thought we had covered this in a previous issue, but apparently not!

karudonaldson avatar Jun 24 '24 04:06 karudonaldson

My bad, I fixed this by removing the '' from the mlr file.

karudonaldson avatar Jun 26 '24 00:06 karudonaldson
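
For anyone landing here with the same 123 rule, here is a minimal sketch of a 1546.mlr, not a confirmed fix from this thread. It assumes columns three and four are present in every record and that the whole value should be replaced. A few points it relies on: ssub does no regex matching, so "^123" and "123*" are searched for as literal text, whereas sub takes a regex; the two field names go in a single array, ["three", "four"]; and a file passed with put -f should not be wrapped in ' quotes, which are only shell quoting for an inline put '...' expression. string() is used because Miller infers a bare 123 as an int rather than a string.

```
# Sketch: replace any value starting with "123" in columns three and four.
for (k in ["three", "four"]) {
    # string() lets sub see numeric-looking values such as 123 as text.
    # "^123.*" matches the whole value when it starts with 123;
    # values that do not match are left as they were.
    $[k] = sub(string($[k]), "^123.*", "123replaced");
}
```

Run it the same way as before, with mlr --csv --from input.csv put -f 1546.mlr. It is intended to produce the expected output listed above for input.csv.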