fix: encoding problem when exporting CSV

Open nykolaslima opened this issue 1 year ago • 10 comments

Problem

CSV exports have an encoding problem: special characters are misinterpreted when the generated file is opened in Microsoft Excel.

Changes

Added a BOM to CSV exports to resolve encoding issues with special characters in Microsoft Excel.

Introduces a Byte Order Mark (BOM) at the start of CSV files to improve the handling of special characters across various platforms. The UTF-8 BOM, a specific byte sequence (EF BB BF), is used as a signal to software that the file is encoded in UTF-8, ensuring consistent interpretation of Unicode characters, especially on systems where UTF-8 is not the default encoding.

This change aims to enhance cross-platform compatibility and prevent misinterpretation of special characters in CSV exports.
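As a minimal sketch of the idea (not the actual PostHog export code; the helper name `rows_to_csv_bytes` and the sample rows are hypothetical), prepending the UTF-8 BOM to encoded CSV output could look like this:

```python
import codecs
import csv
import io


def rows_to_csv_bytes(rows):
    """Serialize rows to CSV and prepend the UTF-8 BOM (EF BB BF).

    Hypothetical helper for illustration only.
    """
    # newline="" lets the csv module control line endings itself.
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    # codecs.BOM_UTF8 is the byte sequence b"\xef\xbb\xbf".
    return codecs.BOM_UTF8 + buffer.getvalue().encode("utf-8")


data = rows_to_csv_bytes([["name", "response"], ["survey", "你好吗"]])
```

Encoding the text with Python's `utf-8-sig` codec instead of `utf-8` should produce the same leading bytes, and decoding with `utf-8-sig` strips the BOM again when reading the file back.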

How did you test this code?

  1. Generate a Survey
  2. Add a response with some special characters, e.g. "你好吗"
  3. Export the Survey responses
  4. Open in Microsoft Excel
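Besides opening the file in Excel, the presence of the BOM can be checked directly on the exported bytes. The sketch below is only an illustration; the function names are hypothetical and it assumes the export has been saved to a local file:

```python
import codecs


def starts_with_utf8_bom(raw: bytes) -> bool:
    """Return True if the byte string begins with the UTF-8 BOM (EF BB BF)."""
    return raw.startswith(codecs.BOM_UTF8)


def file_has_utf8_bom(path: str) -> bool:
    """Read the first three bytes of a file and check them for the BOM."""
    with open(path, "rb") as handle:
        return starts_with_utf8_bom(handle.read(len(codecs.BOM_UTF8)))
```

The file must be opened in binary mode here, because text-mode reads may decode or hide the leading bytes.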

Fixes #19580

@liyiy could you please help with a review here?

nykolaslima avatar Jan 09 '24 02:01 nykolaslima

This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in another week.

posthog-bot avatar Jan 16 '24 07:01 posthog-bot

@liyiy could you help with a review here?

nykolaslima avatar Jan 19 '24 02:01 nykolaslima

@pauldambra @daibhin could you please help me with a review/feedback here?

nykolaslima avatar Jan 23 '24 14:01 nykolaslima

@liyiy fixed lint error on commit/PR message

nykolaslima avatar Jan 26 '24 21:01 nykolaslima

@liyiy rebased with main. could you take a look please? thanks!

nykolaslima avatar Jan 29 '24 15:01 nykolaslima

Hey @nykolaslima

To make this pass the mypy check (mypy -p posthog | mypy-baseline filter), you'll need to change a few lines higher to: render_context: dict = {}

I'd be happy to merge it in with the change, though in my testing locally I didn't actually see the BOM symbol inside the exported CSV files, when looking at them with a hex editor... and I'm not sure why 🤔. Did you test locally that this works?

Also, the real answer is probably to implement XLS exports 😅

mariusandra avatar Jan 30 '24 12:01 mariusandra

@mariusandra I had to test it with Microsoft Excel. When importing into Google Sheets, for example, it works, but in Microsoft Excel it didn't.

I will fix what you mentioned and let you know when it's done.

nykolaslima avatar Jan 30 '24 22:01 nykolaslima

This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in another week.

posthog-bot avatar Feb 07 '24 07:02 posthog-bot

This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in another week.

posthog-bot avatar Feb 15 '24 07:02 posthog-bot

I would agree with @mariusandra here that implementing an option for XLS exports is a much more proper solution than introducing a BOM to the file, especially since there's no pressing urgency for a quick fix 👍 Did you want to take a go at that?

liyiy avatar Feb 15 '24 17:02 liyiy

This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in another week.

posthog-bot avatar Feb 23 '24 07:02 posthog-bot

FYI see #20568 for xlsx support

webjunkie avatar Feb 28 '24 10:02 webjunkie

This PR hasn't seen activity in a week! Should it be merged, closed, or further worked on? If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in another week.

posthog-bot avatar Mar 07 '24 07:03 posthog-bot

This PR was closed due to lack of activity. Feel free to reopen if it's still relevant.

posthog-bot avatar Mar 14 '24 07:03 posthog-bot