jq Add toUpper and toLower methods

Looking through the system, support for toUpper and toLower might make some folks' lives easier. In my case I have some keys that are upper case, some that are mixed case and others that are lower case and I'd like to use a single match on them by being able to use the same case.

Jul 18 '14 22:07 pasamio

Yeah, this would be nice. Generic Unicode case conversion is a complex thing to do, and we don't really want to either write a Unicode library for jq or have a run-time dependency on an external library, unless we can link it statically into the standalone jq executable. There are several options though, and I agree that this is somewhat important. We might well end up having to have a Unicode library in jq.

Jul 18 '14 23:07 nicowilliams

@pasamio wrote:

I'd like to use a single match on them by being able to use the same case.

If you have access to a recent (github) version of jq, you can use regular expressions with the "ignore case" option, e.g.

$ jq -n '"abC"| test( "abc";"i" )'
true
$ jq -n ' {"abC":1} | keys[] | select( test("abc";"i"))  '
"abC"
$ jq -n '{"abC":1} | keys[] | if test("abc";"i") then "abc" else empty end'
"abc"

There's also a regex filter named match.

Jul 19 '14 02:07 pkoppstein

:+1: I really need this as well!

Jan 29 '15 20:01 nikolay

@nikolay - ascii_upcase and ascii_downcase have been in "master" on github since Dec 27. Their definitions can also be used in jq 1.4, as explained in the FAQ.

Jan 29 '15 20:01 pkoppstein
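
For readers on an older jq, here is a minimal sketch of how ASCII-only case conversion can be defined in jq itself with explode and implode. The names my_downcase and my_upcase are hypothetical, chosen so they do not shadow the built-ins, and the definitions are an illustration rather than a verbatim copy of what the FAQ recommends.

```sh
# Hypothetical helper names; only A-Z (65..90) and a-z (97..122) are shifted,
# every other codepoint passes through unchanged.
JQ_CASE_DEFS='
  def my_downcase: explode | map(if 65 <= . and . <= 90  then . + 32 else . end) | implode;
  def my_upcase:   explode | map(if 97 <= . and . <= 122 then . - 32 else . end) | implode;
'

# Prints "abc" and then "ABC", one per line.
jq -n -r "$JQ_CASE_DEFS"' "abC" | my_downcase, my_upcase'
```

Because the mapping only covers the ASCII range, a letter such as "Á" is returned as-is.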

Are there any plans to implement "upcase" and "downcase" functions with UTF-8 support, similar to awk's "tolower"?

With jq 1.5:

$ echo '"ÁgUA"' | jq '. | ascii_downcase'
"Água"

With GNU Awk 4.1.3:

$ echo '"ÁgUA"' | awk '{print(tolower($0))}'
"água"

Mar 21 '16 19:03 gabrielmagno

This :point_up: would be great!

Nov 16 '17 20:11 paulochf

Currently I have to create multiple jq queries and chain them, or use Linux commands with regular expressions, to lowercase only the target fields.

Mar 14 '18 14:03 AndresPineros
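
For ASCII data, lowercasing only selected fields (or only the keys) can be done in a single jq program. This is a minimal sketch assuming jq 1.5 or later, where ascii_downcase is built in. The shell function names lower_name and lower_keys and the sample field names are made up for illustration.

```sh
# Hypothetical helpers; both rely on the ascii_downcase built-in (jq 1.5+).

# Lowercase the value of one target field, leaving everything else alone.
lower_name() { jq -c '.name |= ascii_downcase'; }

# Lowercase every top-level key; -S sorts keys so the output order is stable.
lower_keys() { jq -c -S 'with_entries(.key |= ascii_downcase)'; }

echo '{"name":"ALICE","City":"Paris"}' | lower_name
# {"name":"alice","City":"Paris"}

echo '{"NAME":1,"City":2,"zip":3}' | lower_keys
# {"city":2,"name":1,"zip":3}
```

If two keys differ only by case (say "Name" and "NAME"), they collapse into one key and one value silently overwrites the other, so this only suits data where that cannot happen.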

Waiting for this one.

May 09 '18 03:05 eintr

Started today using jq and found myself needing this.

Jun 18 '18 08:06 flaviotordini

:+1:

Dec 07 '18 16:12 LordMike

My wish would be to support regexp substitutions from perlre: \l \u \L \U \E

Jun 18 '19 13:06 mmacek7

I found ascii_downcase from builtin.jq to do just this

Jun 05 '20 01:06 dsynkd

@dsynkd wrote:

I found ascii_downcase from builtin.jq to do just this

Yes, but as the name suggests, it works only for ASCII characters. For instance, accented letters in a string (e.g. "Á" or "Ñ") are simply left unchanged, so they stay in uppercase.

The current workaround, as stated in my 2016 comment in this issue, is using awk's tolower, which properly supports accented letters.

I hope that something similar will be implemented in jq someday.

Jun 05 '20 02:06 gabrielmagno
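
The awk workaround mentioned above can be combined with jq by extracting the field as raw text and piping it through tolower. This is a rough sketch: lower_field is a hypothetical helper name, and Unicode-aware behaviour is an assumption that holds for GNU Awk in a UTF-8 locale but not necessarily for other awk implementations.

```sh
# Hypothetical helper: $1 is a jq filter selecting a string; the result is
# lowercased by awk. Non-ASCII letters are only converted if the awk in use
# is locale-aware (e.g. gawk with a UTF-8 locale).
lower_field() { jq -r "$1" | awk '{ print tolower($0) }'; }

echo '{"name":"ÁgUA"}' | lower_field '.name'
```

The output is plain text rather than JSON, so this fits best at the end of a pipeline.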

how do I use it?

Oct 22 '21 21:10 ralyodio

It seems that UTF-8 case folding should be fairly straightforward using utf8proc (https://juliastrings.github.io/utf8proc/doc/). We might give it a go. Is anyone interested in taking a look if we do?

Mar 01 '22 01:03 liquidaty

I've implemented this as a couple of built-ins, utf8tolower and utf8toupper. It works like so:

> echo '"ǄIǄA"' | jq '.|utf8tolower'
"ǆiǆa"

The code changes are small and easily toggled. Is there any interest from this repo's maintainers (or others) to incorporate these changes? If so pls lmk and I will generate a pull request.

Jun 02 '22 15:06 liquidaty

@nicowilliams @gabrielmagno @flaviotordini @eintr @paulochf @LordMike: given that it seems from your comments (years ago by now...) that you'd have an interest in a solution-- any thoughts on the above?

Jun 09 '22 19:06 liquidaty

Hey @liquidaty. Well, in terms of functionality, it looks good to me. In terms of the implementation itself, I'm afraid I will not be able to judge, because I'm not familiar with jq's codebase. By the way, maybe you could point us to your repo, so that more people could see it?

Jun 09 '22 20:06 gabrielmagno

@gabrielmagno, anyone else: here's a diff from the latest jq version pulled today: https://github.com/liquidaty/jq/commit/9f0bde5f5a560e021bc05be98906a0e555270f4e

Note that the autoconf changes include settings that appear to support building utf8proc, but that part doesn't actually work. Instead, the build just looks for utf8proc already being available and uses it if found.

Jun 09 '22 21:06 liquidaty

Pull request: https://github.com/stedolan/jq/pull/2547

Mar 02 '23 17:03 liquidaty
