Add toUpper and toLower methods

Open pasamio opened this issue 10 years ago • 19 comments

Looking through the system, support for toUpper and toLower might make some folks' lives easier. In my case, I have some keys that are upper case, some that are mixed case, and others that are lower case, and I'd like to use a single match on them by being able to use the same case.

pasamio avatar Jul 18 '14 22:07 pasamio

Yeah, this would be nice. Generic Unicode case conversion is a complex thing to do, and we don't really want either to write a Unicode library for jq or to take on a run-time dependency on an external library, unless we can link it statically into the standalone jq executable. There are several options, though, and I agree that this is somewhat important. We might well end up having to have a Unicode library in jq.

nicowilliams avatar Jul 18 '14 23:07 nicowilliams

@pasamio wrote:

I'd like to use a single match on them by being able to use the same case.

If you have access to a recent (github) version of jq, you can use regular expressions with the "ignore case" option, e.g.

$ jq -n '"abC"| test( "abc";"i" )'
true
$ jq -n ' {"abC":1} | keys[] | select( test("abc";"i"))  '
"abC"
$ jq -n '{"abC":1} | keys[] | if test("abc";"i") then "abc" else empty end'
"abc"

There's also a regex filter named match.
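
As a rough illustration of the two filters mentioned above (a sketch assuming jq 1.5 or later built with regex support; the sample object is made up): `match` emits an object describing each match, and `test` combined with `with_entries` keeps every entry whose key matches regardless of case.

```shell
# match emits one object per match; the "i" flag makes the regex case-insensitive
m=$(jq -n -c '"abC" | match("abc"; "i") | [.offset, .length, .string]')
echo "$m"      # [0,3,"abC"]

# keep only the entries whose key equals "abc" in any letter case
# (the sample object is made up for illustration)
kept=$(echo '{"ABC":2,"abC":1,"xyz":3}' | jq -c 'with_entries(select(.key | test("^abc$"; "i")))')
echo "$kept"   # {"ABC":2,"abC":1}
```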

pkoppstein avatar Jul 19 '14 02:07 pkoppstein

:+1: I really need this as well!

nikolay avatar Jan 29 '15 20:01 nikolay

@nikolay - ascii_upcase and ascii_downcase have been in "master" on github since Dec 27. Their definitions can also be used in jq 1.4, as explained in the FAQ.
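
A minimal sketch of what such a definition can look like, assuming `explode` and `implode` are available in your jq version; `my_downcase` is a hypothetical name chosen so it does not clash with the builtin. It shifts only the code points for A-Z, which is why non-ASCII letters pass through untouched. The second command (jq 1.5 or later) applies the builtin to every top-level key so that mixed-case keys can be matched uniformly.

```shell
# ASCII-only lowercasing: add 32 to code points 65..90 (A-Z), leave the rest alone
lowered=$(jq -n -r '
  def my_downcase:
    explode | map(if 65 <= . and . <= 90 then . + 32 else . end) | implode;
  "Hello World" | my_downcase')
echo "$lowered"      # hello world

# normalize all top-level keys to lower case (sample data is made up)
keys_lower=$(echo '{"ABC":1,"Def":2,"ghi":3}' | jq -c 'with_entries(.key |= ascii_downcase)')
echo "$keys_lower"   # {"abc":1,"def":2,"ghi":3}
```

Note that if two keys differ only by case, they collapse into a single key after normalization and the later entry wins.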

pkoppstein avatar Jan 29 '15 20:01 pkoppstein

Are there any plans to implement "upcase" and "downcase" functions with UTF-8 support, similar to awk's "tolower"?

With jq 1.5:
echo '"ÁgUA"' | jq '. | ascii_downcase'
Output: "Água"

With GNU Awk 4.1.3:
echo '"ÁgUA"' | awk '{print(tolower($0))}'
Output: "água"

gabrielmagno avatar Mar 21 '16 19:03 gabrielmagno

This :point_up: would be great!

paulochf avatar Nov 16 '17 20:11 paulochf

Currently I have to create multiple jq queries and chain them, or use Linux commands with regular expressions, to lowercase only the target fields.

AndresPineros avatar Mar 14 '18 14:03 AndresPineros

Waiting for this one.

eintr avatar May 09 '18 03:05 eintr

Started today using jq and found myself needing this.

flaviotordini avatar Jun 18 '18 08:06 flaviotordini

:+1:

LordMike avatar Dec 07 '18 16:12 LordMike

My wish would be to support the regexp substitution escapes from perlre: \l \u \L \U \E
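
Until something like that exists, a possible workaround (a sketch assuming jq 1.5 or later with regex support) is to pass a named capture through `ascii_upcase` in the replacement argument of `sub`. The capture name `c` is arbitrary, and the conversion is still ASCII-only.

```shell
# roughly what Perl's s/^([a-z])/\u$1/ would do: upper-case the first letter
cap=$(jq -n -r '"hello world" | sub("^(?<c>[a-z])"; .c | ascii_upcase)')
echo "$cap"   # Hello world
```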

mmacek7 avatar Jun 18 '19 13:06 mmacek7

I found ascii_downcase from builtin.jq to do just this

dsynkd avatar Jun 05 '20 01:06 dsynkd

I found ascii_downcase from builtin.jq to do just this

Yes, but as the name suggests, it works only for ASCII characters. For instance, accented letters in a string (e.g. "Á" or "Ñ") are simply ignored and returned unchanged, still in uppercase.

The current workaround, as stated in my 2016 comment in this issue, is using awk's tolower, which properly supports accented letters.
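
One way to wire that workaround into a pipeline, as a sketch only: pull the field out with jq, lowercase it with awk, and write it back with `--arg`. The field name `name` and the sample object are made up. Whether accented letters are converted depends on the awk implementation and locale (the GNU Awk result earlier in this thread is one data point), so the sample sticks to ASCII.

```shell
input='{"name":"AGUA","id":7}'

# extract the raw string and lowercase it outside jq
lower=$(printf '%s\n' "$input" | jq -r '.name' | awk '{print tolower($0)}')

# write the converted value back into the original object
fixed=$(printf '%s\n' "$input" | jq -c --arg v "$lower" '.name = $v')
echo "$fixed"   # {"name":"agua","id":7}
```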

I hope that something similar will be implemented in jq someday.

gabrielmagno avatar Jun 05 '20 02:06 gabrielmagno

how do I use it?

ralyodio avatar Oct 22 '21 21:10 ralyodio

It seems that UTF-8 case folding should be fairly straightforward using utf8proc (https://juliastrings.github.io/utf8proc/doc/). We might give it a go; is anyone interested in taking a look if we do?

liquidaty avatar Mar 01 '22 01:03 liquidaty

I've implemented this as a couple of built-ins, utf8tolower and utf8toupper. It works like so:

> echo '"DŽIDŽA"' | jq '.|utf8tolower'
"džidža"

The code changes are small and easily toggled. Is there any interest from this repo's maintainers (or others) to incorporate these changes? If so pls lmk and I will generate a pull request.

liquidaty avatar Jun 02 '22 15:06 liquidaty

@nicowilliams @gabrielmagno @flaviotordini @eintr @paulochf @LordMike: given that it seems from your comments (years ago by now...) that you'd have an interest in a solution-- any thoughts on the above?

liquidaty avatar Jun 09 '22 19:06 liquidaty

Hey @liquidaty. Well, in terms of functionality, it looks good to me. In terms of the implementation itself, I'm afraid I will not be able to judge, because I'm not familiar with jq's codebase. By the way, maybe you could point us to your repo, so that more people could see it?

gabrielmagno avatar Jun 09 '22 20:06 gabrielmagno

@gabrielmagno, anyone else: here's a diff from the latest jq version pulled today: https://github.com/liquidaty/jq/commit/9f0bde5f5a560e021bc05be98906a0e555270f4e

Note that the autoconf changes include settings that appear to support building utf8proc, but that part doesn't actually work; rather, the build just looks for an already-available utf8proc and uses it if found.

liquidaty avatar Jun 09 '22 21:06 liquidaty

Pull request: https://github.com/stedolan/jq/pull/2547

liquidaty avatar Mar 02 '23 17:03 liquidaty