flatpak-xdg-utils icon indicating copy to clipboard operation
flatpak-xdg-utils copied to clipboard

flatpak-spawn: don't use locale conversion for env and sandbox-expose

Open 2-www opened this issue 2 years ago • 14 comments

this should allow using non-ascii text in environment variables (and sandbox paths) even if the locale failed to load where the flatpak-spawn helper is run

previously: https://github.com/flatpak/flatpak/pull/4138, but this was on the portal side

related: https://github.com/flathub/org.gnome.Epiphany/issues/21 https://github.com/flatpak/flatpak/issues/5398 is one of the reasons why the locale might be unavailable

2-www avatar Apr 26 '23 19:04 2-www

Hmm. This makes sense for sandbox-expose. I'm not sure env makes sense to consider filename encoding.

TingPing avatar Apr 26 '23 19:04 TingPing

I'm not sure env makes sense to consider filename encoding.

afaik --env=PS1=[emoji] is what causes the problem in epiphany

2-www avatar Apr 26 '23 19:04 2-www

sandbox-expose, while referring to a file, is not a filesystem path (that is sandbox-expose-path). I don't think it should be in that encoding.

TingPing avatar Apr 26 '23 19:04 TingPing

afaik --env=PS1=[emoji] is what causes the problem in epiphany

I think this is just a workaround that happens to work when .Locale is missing.

TingPing avatar Apr 26 '23 19:04 TingPing

sandbox-expose, while referring to a file, is not a filesystem path (that is sandbox-expose-path). I don't think it should be in that encoding.

this is just a flag glib uses to copy the string without locale conversion before passing it to the callback on unix: https://gitlab.gnome.org/GNOME/glib/-/blob/main/glib/goption.c#L1439 sandbox-expose is a relative file path, which can be non-ascii: https://docs.flatpak.org/en/latest/portal-api-reference.html#gdbus-method-org-freedesktop-portal-Flatpak.Spawn

sandbox-expose as

A list of filenames for files inside the sandbox that will be exposed to the new sandbox, for reading and writing. Note that absolute paths or subdirectories are not allowed.

The files must be in the sandbox subdirectory of the instance directory (i.e. ~/.var/app/$APP_ID/sandbox).

2-www avatar Apr 26 '23 19:04 2-www

Ah. Yes you are right about sandbox-expose.

However I still disagree with env. The flatpak CLI for example always considers them UTF-8.

TingPing avatar Apr 26 '23 20:04 TingPing

Ah. Yes you are right about sandbox-expose.

However I still disagree with env. The flatpak CLI for example always considers them UTF-8.

yes, but it runs on host, where thing are stable, but flatpak-spawn usually runs in a container, where extensions may be missing, so it might consider valid utf-8 invalid

2-www avatar Apr 26 '23 22:04 2-www

However I still disagree with env. The flatpak CLI for example always considers them UTF-8.

Strictly speaking, that's actually wrong, and they should be FILENAME. Unix environment variables are in the same encoding as Unix filenames: "bytestrings" containing anything except \0, encoded in an unspecified superset of ASCII, which is often UTF-8 but not always, and in pathological cases might not even be consistent within one process's environment (you can have one environment variable with a UTF-8 value and another with a Latin-1 value). In really pathological situations, even the names of environment variables don't have to be UTF-8.

Unfortunately, flatpak-session-helper and flatpak-portal use a{ss} to represent the environment, which cannot represent environment variable names or values that are not valid UTF-8. We should ideally do a new revision of those D-Bus APIs in a future Flatpak release that can represent these.

When communicating with a version of Flatpak that doesn't have that new API, flatpak-spawn should check that the environment variables it's given are valid UTF-8, and if not, either warn and ignore or fail with an error. I'm not sure which one of those is better.

New D-Bus APIs that fix this could add env-bytes aay and unset-env-bytes aay to the options, with the equivalent of env -u UNSET_THIS -u AND_THIS HELLO=world X=y encoded like this (in GVariant text notation):

options = {'env-bytes': <[b'HELLO=world', b'X=y']>, 'unset-env-bytes': <[b'UNSET_THIS', b'AND_THIS']>}

(reminder that in GVariant text notation, b'X=y' is the same array of 4 bytes as [byte 88, 61, 121, 0], with an implicit trailing \0.)

(Or alternatively env-bytes could be a single byte-array in env -0//proc/*/environ/flatpak run --env-fd format, but I think unset-env-bytes only really makes sense as an array of names to unset.)

smcv avatar Apr 27 '23 14:04 smcv

bump, bug happened again

2-www avatar May 22 '23 21:05 2-www

As Simon said:

flatpak-session-helper and flatpak-portal use a{ss} to represent the environment, which cannot represent environment variable names or values that are not valid UTF-8.

So part of this change is incorrect. The part that happens to bypass the original issue.

TingPing avatar May 23 '23 17:05 TingPing

Patrick @.***> wrote:

As Simon said:

> flatpak-session-helper and flatpak-portal use a{ss} to represent the environment, which cannot represent environment variable names or values that are not valid UTF-8.

So part of this change is incorrect. The part that happens to bypass the original issue.

-- Reply to this email directly or view it on GitHub: https://github.com/flatpak/flatpak-xdg-utils/pull/65#issuecomment-1559866352 You are receiving this because you authored the thread.

Message ID: @.***>

without this change, it's not possible for flatpak-spawn to recognize valid utf-8 after updating locales

2-www avatar May 23 '23 22:05 2-www

without this change, it's not possible for flatpak-spawn to recognize valid utf-8 after updating locales

That is a different bug. The locale should never be missing.

TingPing avatar May 24 '23 01:05 TingPing

That is a different bug. The locale should never be missing.

yes, but if you want to test something with LANG=C? this is currently not possible because of this error

2-www avatar Jun 22 '24 13:06 2-www

and again: this is a real bug that is probably affecting all non-ascii users. localization is never technically perfect. this is what we can do now so that english speaking users don't get browsers broken. please merge this

2-www avatar Jun 22 '24 13:06 2-www