postgresql-container icon indicating copy to clipboard operation
postgresql-container copied to clipboard

Add support for locale and encoding, fix #406

Open mscherer opened this issue 3 years ago • 26 comments

mscherer avatar Jul 22 '21 19:07 mscherer

Can one of the admins verify this patch?

centos-ci avatar Jul 22 '21 19:07 centos-ci

Can one of the admins verify this patch?

centos-ci avatar Jul 22 '21 19:07 centos-ci

Can one of the admins verify this patch?

centos-ci avatar Jul 22 '21 19:07 centos-ci

Tagged as draft, since the tests are missing.

mscherer avatar Jul 22 '21 19:07 mscherer

Thanks for this contribution. It makes sense to me to have such options.

hhorak avatar Sep 15 '21 09:09 hhorak

[test]

hhorak avatar Sep 15 '21 09:09 hhorak

What I'm thinking about is whether this shouldn't be set earlier, for the whole DB dir, as https://www.postgresql.org/docs/13/multibyte.html says: "The default character set is selected while initializing your PostgreSQL database cluster using initdb".

hhorak avatar Sep 15 '21 09:09 hhorak

@fila43 what do you think?

hhorak avatar Sep 15 '21 09:09 hhorak

oups, seems I pushed the wrong branch and removed my own code...

mscherer avatar Sep 15 '21 14:09 mscherer

12:06:00  + run_locales_test
12:06:00  + local name=pg-test-locales
12:06:00  + DOCKER_ARGS='-e POSTGRESQL_ENCODING=ISO_8859_6 -e POSTGRESQL_LOCALE=C'
12:06:00  + create_container pg-test-locales
12:06:00  + local name=pg-test-locales
12:06:00  + shift
12:06:00  + local 'cargs=-e POSTGRESQL_ENCODING=ISO_8859_6 -e POSTGRESQL_LOCALE=C'
12:06:00  ++ echo '-e POSTGRESQL_ENCODING=ISO_8859_6 -e POSTGRESQL_LOCALE=C'
12:06:00  ++ tr '\n' ' '
12:06:00  + cargs='-e POSTGRESQL_ENCODING=ISO_8859_6 -e POSTGRESQL_LOCALE=C '
12:06:00  + cidfile=/tmp/tmp.oNB18ocbyupostgresql_test_cidfiles/pg-test-locales
12:06:00  + eval 'docker run -e POSTGRESQL_ENCODING=ISO_8859_6 -e POSTGRESQL_LOCALE=C  --cidfile $cidfile -d $IMAGE_NAME "$@"'
12:06:00  ++ docker run -e POSTGRESQL_ENCODING=ISO_8859_6 -e POSTGRESQL_LOCALE=C --cidfile /tmp/tmp.oNB18ocbyupostgresql_test_cidfiles/pg-test-locales -d f31/postgresql:0
12:06:00  7cb14cec8112f6789df61a5cd5a1f1240f0077946da4bc497f421de838334773
12:06:01  ++ cat /tmp/tmp.oNB18ocbyupostgresql_test_cidfiles/pg-test-locales
12:06:01  Created container 7cb14cec8112f6789df61a5cd5a1f1240f0077946da4bc497f421de838334773
12:06:01  + echo 'Created container 7cb14cec8112f6789df61a5cd5a1f1240f0077946da4bc497f421de838334773'
12:06:01  + wait_ready pg-test-locales
12:06:01  ++ get_cid pg-test-locales
12:06:01  ++ local id=pg-test-locales
12:06:01  ++ shift
12:06:01  +++ cat /tmp/tmp.oNB18ocbyupostgresql_test_cidfiles/pg-test-locales
12:06:01  ++ echo 7cb14cec8112f6789df61a5cd5a1f1240f0077946da4bc497f421de838334773
12:06:01  + docker exec 7cb14cec8112f6789df61a5cd5a1f1240f0077946da4bc497f421de838334773 /usr/libexec/check-container
12:06:01  rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:1122: getting init process 7285 start time caused "read /proc/7285/stat: no such process"
12:06:01

I am not sure exactly what to do with that error message

mscherer avatar Sep 15 '21 14:09 mscherer

So using LANG, one can influence the default encoding and collate. However, this doesn't work that well.

I tried to create a database with LANG=C.UTF8, and it set the server encoding correctly, but not LC_COLLATE.

So I guess we need initdb + the createdb options.

mscherer avatar Sep 15 '21 16:09 mscherer

And also, postgresql crash if I set LC_COLLATE to UTF8, so I suspect that using LANG and similar hacks is out of question, as this seems quite fragile and obscure.

mscherer avatar Sep 15 '21 16:09 mscherer

initdb will derive the encoding from the locales (here ). And this code answer a hardcoded SQL_ASCII answer when the locale is C which mean that I can't use LANG only to get what I need for Synapse (encoding = UTF8, locale=C).

mscherer avatar Sep 15 '21 18:09 mscherer

12:06:01 rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:1122: getting init process 7285 start time caused "read /proc/7285/stat: no such process" I am not sure exactly what to do with that error message

I read this as the container's process ended before the exec was done, which might be what you refer to with "postgresql crash if I set LC_COLLATE".

hhorak avatar Sep 17 '21 12:09 hhorak

So I pushed a fix, but as the CI is manually triggered, if someone could trigger for me, it would help (as I have no idea how to do it locally)

mscherer avatar Sep 20 '21 20:09 mscherer

Thanks. [test]

hhorak avatar Sep 22 '21 16:09 hhorak

So I fixed my tests (and my code), now it fail on:

ERROR: File /help.1 does not include 'POSTGRESQL\_ADMIN\_PASSWORD'.

It also fail on the same error for the master branch, so I guess that's not my code causing that.

mscherer avatar Oct 19 '21 20:10 mscherer

The issue reported in my last comment is tracked on #421

mscherer avatar Oct 20 '21 11:10 mscherer

[test-all]

phracek avatar Sep 14 '22 12:09 phracek

@mscherer Can you please rebased it against master and also run these two commands:

$ make clean-versions
$ make generate-all

And commit all changes?

phracek avatar Sep 14 '22 12:09 phracek

Done

mscherer avatar Sep 14 '22 17:09 mscherer

[test-all]

phracek avatar Mar 15 '23 15:03 phracek

@fila43 @hhorak Please folks, go through this pull request and let me know if everything is fine, you are experts in PostgreSQL then /me.

phracek avatar Mar 15 '23 15:03 phracek

I rebased (before rediffing the other PR)

mscherer avatar Oct 05 '23 18:10 mscherer