Bug: Logflare UTF-8 Error on Docker 1.5.32V

Open myagizmaktav opened this issue 1 year ago • 2 comments

Bug report

  • [ x ] I confirm this is a bug with Supabase, not with my own application.
  • [ x ] I confirm I have searched the Docs, GitHub Discussions, and Discord.

Describe the bug

When using Logflare 1.5.32 on Docker, I encounter a UTF-8 error. My PostgreSQL database has tables whose text cells contain multiple languages (Danish, Russian, Indian, Chinese, etc.). Logflare attempts to log these tables, and when it hits the error it retries the read repeatedly, driving my CPU usage up to 90%.

To Reproduce

  1. Create a database.
  2. Create a table with multi-lang text cells.
  3. Add 10,000 words in each language.
  4. Restart the Supabase Docker container.
  5. Monitor Logflare logs.

Expected behavior

Logflare should not throw a UTF-8 error, and in case of failure, it should not repeatedly attempt to reread.

System information

  • OS: Ubuntu 22.04
  • Logflare Version: 1.5.32
  • CPU: Intel Core i7-13650HX

Additional context

The same problem also occurs on version 1.4.

Logflare Logs
 (ecto_sql 3.11.0) lib/ecto/adapters/sql.ex:1054: Ecto.Adapters.SQL.raise_sql_call_error/1
    (ecto 3.11.0) lib/ecto/repo/schema.ex:775: Ecto.Repo.Schema.apply/4
    (ecto 3.11.0) lib/ecto/repo/schema.ex:377: anonymous fn/15 in Ecto.Repo.Schema.do_insert/4
    (broadway 1.0.7) lib/broadway/message.ex:76: Broadway.Message.update_data/2
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:159: anonymous fn/6 in Broadway.Topology.ProcessorStage.handle_messages/4
    (telemetry 0.4.3) /app/deps/telemetry/src/telemetry.erl:272: :telemetry.span/3
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:146: Broadway.Topology.ProcessorStage.handle_messages/4
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:65: anonymous fn/2 in Broadway.Topology.ProcessorStage.handle_events/3
20:23:00.260 [error] ** (Postgrex.Error) ERROR 22P05 (untranslatable_character) unsupported Unicode escape sequence
\u0000 cannot be converted to text.
    (ecto_sql 3.11.0) lib/ecto/adapters/sql.ex:1054: Ecto.Adapters.SQL.raise_sql_call_error/1
    (ecto 3.11.0) lib/ecto/repo/schema.ex:775: Ecto.Repo.Schema.apply/4
    (ecto 3.11.0) lib/ecto/repo/schema.ex:377: anonymous fn/15 in Ecto.Repo.Schema.do_insert/4
    (broadway 1.0.7) lib/broadway/message.ex:76: Broadway.Message.update_data/2
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:159: anonymous fn/6 in Broadway.Topology.ProcessorStage.handle_messages/4
    (telemetry 0.4.3) /app/deps/telemetry/src/telemetry.erl:272: :telemetry.span/3
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:146: Broadway.Topology.ProcessorStage.handle_messages/4
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:65: anonymous fn/2 in Broadway.Topology.ProcessorStage.handle_events/3
20:23:00.260 [error] ** (Postgrex.Error) ERROR 22P05 (untranslatable_character) unsupported Unicode escape sequence
\u0000 cannot be converted to text.
    (ecto_sql 3.11.0) lib/ecto/adapters/sql.ex:1054: Ecto.Adapters.SQL.raise_sql_call_error/1
    (ecto 3.11.0) lib/ecto/repo/schema.ex:775: Ecto.Repo.Schema.apply/4
    (ecto 3.11.0) lib/ecto/repo/schema.ex:377: anonymous fn/15 in Ecto.Repo.Schema.do_insert/4
    (broadway 1.0.7) lib/broadway/message.ex:76: Broadway.Message.update_data/2
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:159: anonymous fn/6 in Broadway.Topology.ProcessorStage.handle_messages/4
    (telemetry 0.4.3) /app/deps/telemetry/src/telemetry.erl:272: :telemetry.span/3
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:146: Broadway.Topology.ProcessorStage.handle_messages/4
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:65: anonymous fn/2 in Broadway.Topology.ProcessorStage.handle_events/3
20:23:00.261 [error] ** (Postgrex.Error) ERROR 22P05 (untranslatable_character) unsupported Unicode escape sequence
\u0000 cannot be converted to text.
    (ecto_sql 3.11.0) lib/ecto/adapters/sql.ex:1054: Ecto.Adapters.SQL.raise_sql_call_error/1
    (ecto 3.11.0) lib/ecto/repo/schema.ex:775: Ecto.Repo.Schema.apply/4
    (ecto 3.11.0) lib/ecto/repo/schema.ex:377: anonymous fn/15 in Ecto.Repo.Schema.do_insert/4
    (broadway 1.0.7) lib/broadway/message.ex:76: Broadway.Message.update_data/2
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:159: anonymous fn/6 in Broadway.Topology.ProcessorStage.handle_messages/4
    (telemetry 0.4.3) /app/deps/telemetry/src/telemetry.erl:272: :telemetry.span/3
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:146: Broadway.Topology.ProcessorStage.handle_messages/4
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:65: anonymous fn/2 in Broadway.Topology.ProcessorStage.handle_events/3
20:23:00.261 [error] ** (Postgrex.Error) ERROR 22P05 (untranslatable_character) unsupported Unicode escape sequence
\u0000 cannot be converted to text.
    (ecto_sql 3.11.0) lib/ecto/adapters/sql.ex:1054: Ecto.Adapters.SQL.raise_sql_call_error/1
    (ecto 3.11.0) lib/ecto/repo/schema.ex:775: Ecto.Repo.Schema.apply/4
    (ecto 3.11.0) lib/ecto/repo/schema.ex:377: anonymous fn/15 in Ecto.Repo.Schema.do_insert/4
    (broadway 1.0.7) lib/broadway/message.ex:76: Broadway.Message.update_data/2
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:159: anonymous fn/6 in Broadway.Topology.ProcessorStage.handle_messages/4
    (telemetry 0.4.3) /app/deps/telemetry/src/telemetry.erl:272: :telemetry.span/3
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:146: Broadway.Topology.ProcessorStage.handle_messages/4
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:65: anonymous fn/2 in Broadway.Topology.ProcessorStage.handle_events/3
20:23:00.261 [error] ** (Postgrex.Error) ERROR 22P05 (untranslatable_character) unsupported Unicode escape sequence
\u0000 cannot be converted to text.
    (ecto_sql 3.11.0) lib/ecto/adapters/sql.ex:1054: Ecto.Adapters.SQL.raise_sql_call_error/1
    (ecto 3.11.0) lib/ecto/repo/schema.ex:775: Ecto.Repo.Schema.apply/4
    (ecto 3.11.0) lib/ecto/repo/schema.ex:377: anonymous fn/15 in Ecto.Repo.Schema.do_insert/4
    (broadway 1.0.7) lib/broadway/message.ex:76: Broadway.Message.update_data/2
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:159: anonymous fn/6 in Broadway.Topology.ProcessorStage.handle_messages/4
    (telemetry 0.4.3) /app/deps/telemetry/src/telemetry.erl:272: :telemetry.span/3
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:146: Broadway.Topology.ProcessorStage.handle_messages/4
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:65: anonymous fn/2 in Broadway.Topology.ProcessorStage.handle_events/3
20:23:00.261 [error] ** (Postgrex.Error) ERROR 22P05 (untranslatable_character) unsupported Unicode escape sequence
\u0000 cannot be converted to text.
    (ecto_sql 3.11.0) lib/ecto/adapters/sql.ex:1054: Ecto.Adapters.SQL.raise_sql_call_error/1
    (ecto 3.11.0) lib/ecto/repo/schema.ex:775: Ecto.Repo.Schema.apply/4
    (ecto 3.11.0) lib/ecto/repo/schema.ex:377: anonymous fn/15 in Ecto.Repo.Schema.do_insert/4
    (broadway 1.0.7) lib/broadway/message.ex:76: Broadway.Message.update_data/2
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:159: anonymous fn/6 in Broadway.Topology.ProcessorStage.handle_messages/4
    (telemetry 0.4.3) /app/deps/telemetry/src/telemetry.erl:272: :telemetry.span/3
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:146: Broadway.Topology.ProcessorStage.handle_messages/4
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:65: anonymous fn/2 in Broadway.Topology.ProcessorStage.handle_events/3
20:23:00.262 [error] ** (Postgrex.Error) ERROR 22P05 (untranslatable_character) unsupported Unicode escape sequence
\u0000 cannot be converted to text.
    (ecto_sql 3.11.0) lib/ecto/adapters/sql.ex:1054: Ecto.Adapters.SQL.raise_sql_call_error/1
    (ecto 3.11.0) lib/ecto/repo/schema.ex:775: Ecto.Repo.Schema.apply/4
    (ecto 3.11.0) lib/ecto/repo/schema.ex:377: anonymous fn/15 in Ecto.Repo.Schema.do_insert/4
    (broadway 1.0.7) lib/broadway/message.ex:76: Broadway.Message.update_data/2
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:159: anonymous fn/6 in Broadway.Topology.ProcessorStage.handle_messages/4
    (telemetry 0.4.3) /app/deps/telemetry/src/telemetry.erl:272: :telemetry.span/3
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:146: Broadway.Topology.ProcessorStage.handle_messages/4
    (broadway 1.0.7) lib/broadway/topology/processor_stage.ex:65: anonymous fn/2 in Broadway.Topology.ProcessorStage.handle_events/3

myagizmaktav avatar Mar 04 '24 20:03 myagizmaktav

Do you have an example of the utf-8 error?

"it should not repeatedly attempt to reread"

I do not quite understand what you mean by "reread". Does it try to log the fields?

Ziinc avatar Apr 13 '24 17:04 Ziinc

Hello, the last part of the issue shows the logs. Thank you.

myagizmaktav avatar Apr 14 '24 01:04 myagizmaktav

Hi, this error is caused by Postgres rejecting the \u0000 Unicode escape sequence, because the body of the logged event is stored in the body jsonb field. You will need to sanitise your data before it is stored.
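
A minimal sketch of such a sanitising step, assuming the offending character is U+0000 (the `\u0000` named in the log output above) and that the payload is a JSON-like structure of dicts, lists, and strings. The helper name `strip_nul` and the sample `event` are hypothetical and not part of Logflare or Supabase:

```python
import json
from typing import Any

NUL = "\u0000"  # the character the Postgres error message complains about


def strip_nul(value: Any) -> Any:
    """Recursively remove U+0000 from every string in a JSON-like value."""
    if isinstance(value, str):
        return value.replace(NUL, "")
    if isinstance(value, dict):
        # Keys can also be strings, so clean them as well.
        return {strip_nul(k): strip_nul(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [strip_nul(item) for item in value]
    # Numbers, booleans, and None pass through unchanged.
    return value


# Hypothetical payload containing the problem character.
event = {"message": "caf\u0000é", "tags": ["ok", "bad\u0000"], "count": 3}
clean_event = strip_nul(event)
print(json.dumps(clean_event, ensure_ascii=False))
```

Removing the character outright is only one option; replacing it with a visible placeholder would keep a trace that something was stripped.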

The infinite loop is expected: we are essentially routing the Postgres error logs to the Analytics container, so if the character shows up in a Postgres error log when we attempt to insert that log into Postgres, the insert fails again and produces another error log.
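
The feedback loop described here can be illustrated with a toy model. This is a sketch under stated assumptions, not Logflare's actual pipeline: `insert_into_jsonb` is a hypothetical stand-in for the Postgres insert, the error log is assumed to echo the offending payload, and `max_steps` exists only so the example terminates:

```python
from collections import deque

NUL = "\u0000"


def insert_into_jsonb(event: str) -> None:
    """Hypothetical stand-in for the insert: rejects events containing U+0000."""
    if NUL in event:
        raise ValueError("unsupported Unicode escape sequence")


def run_pipeline(first_event: str, max_steps: int = 5) -> int:
    """Process events until the queue drains or max_steps is reached.

    Returns the number of failed inserts.
    """
    queue = deque([first_event])
    errors = 0
    steps = 0
    while queue and steps < max_steps:
        steps += 1
        event = queue.popleft()
        try:
            insert_into_jsonb(event)
        except ValueError as exc:
            errors += 1
            # Assumption: the error log includes the rejected payload, so the
            # new log event carries the same character back into the queue.
            queue.append(f"[error] {exc}: {event}")
    return errors


looping_errors = run_pipeline("bad\u0000payload")
clean_errors = run_pipeline("good payload")
print(looping_errors, clean_errors)
```

In this model every failed insert produces a new event that fails the same way, so only the step cap stops it. An event without the character is stored on the first attempt.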

If you wish to avoid this, I would suggest using the BigQuery backend to store the logging data instead, so that the two concerns are kept separate.

Relevant further reading:

https://www.postgresql.org/message-id/368156.1677514339%40sss.pgh.pa.us

Ziinc avatar Sep 05 '24 03:09 Ziinc

I'll leave this open, as we could improve the Logflare Postgres adapter to filter out these invalid characters on insert.

Ziinc avatar Sep 05 '24 03:09 Ziinc

Thank you.

myagizmaktav avatar Sep 06 '24 16:09 myagizmaktav