superset icon indicating copy to clipboard operation
superset copied to clipboard

feat(reports): xlsx attachments

Open cemremengu opened this issue 2 years ago • 4 comments

SUMMARY

Most people and departments in enterprise companies are addicted to spreadsheets. Currently our customers are heavily demanding Excel reports and, unfortunately, we have failed to persuade them to continue with CSV.

So here I am with a PR that adds the ability to attach XLSX files as report data and notifications (slack). If this one gets approved, I will also attempt to implement PDF in a similar way.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

image

image

TESTING INSTRUCTIONS

Try sending an xlsx report or notification. I wasn't able to test slack since I don't use it actively but implemented it nevertheless for completeness.

Known Problems

Currently the first column is displayed like

image

It can be monkey patched but it should be fixed naturally once/if #19735 lands.

ADDITIONAL INFORMATION

  • [ ] Has associated issue:
  • [ ] Required feature flags:
  • [x] Changes UI
  • [ ] Includes DB Migration (follow approval process in SIP-59)
    • [ ] Migration is atomic, supports rollback & is backwards-compatible
    • [ ] Confirm DB migration upgrade and downgrade tested
    • [ ] Runtime estimates and downtime expectations provided
  • [x] Introduces new feature or API
  • [ ] Removes existing feature or API

cemremengu avatar Apr 21 '22 18:04 cemremengu

Codecov Report

Merging #19810 (77b9047) into master (62b4564) will decrease coverage by 0.05%. The diff coverage is 76.36%.

:exclamation: Current head 77b9047 differs from pull request most recent head 3edb8d5. Consider uploading reports for the commit 3edb8d5 to get more accurate results

@@            Coverage Diff             @@
##           master   #19810      +/-   ##
==========================================
- Coverage   68.98%   68.93%   -0.05%     
==========================================
  Files        1903     1904       +1     
  Lines       74126    74101      -25     
  Branches     8110     8120      +10     
==========================================
- Hits        51138    51085      -53     
- Misses      20876    20904      +28     
  Partials     2112     2112              
Flag Coverage Δ
hive 53.85% <30.76%> (-0.16%) :arrow_down:
javascript 55.65% <86.20%> (+0.02%) :arrow_up:
mysql ?
postgres 79.32% <65.38%> (-0.07%) :arrow_down:
presto 53.78% <30.76%> (-0.16%) :arrow_down:
python 83.25% <65.38%> (-0.11%) :arrow_down:
sqlite 77.82% <65.38%> (-0.07%) :arrow_down:
unit 54.54% <30.76%> (+0.03%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...acy-preset-chart-deckgl/src/components/Tooltip.tsx 0.00% <0.00%> (ø)
...lugins/plugin-chart-table/src/utils/formatValue.ts 66.66% <0.00%> (-3.93%) :arrow_down:
...c/components/Chart/DrillDetail/DrillDetailPane.tsx 75.32% <ø> (ø)
...rset-frontend/src/components/ReportModal/index.tsx 79.03% <ø> (ø)
...-frontend/src/components/TableCollection/index.tsx 100.00% <ø> (ø)
...mponents/DataTablesPane/components/SamplesPane.tsx 97.67% <ø> (ø)
...ataTablesPane/components/SingleQueryResultPane.tsx 85.71% <ø> (ø)
...-frontend/src/features/alerts/AlertReportModal.tsx 54.44% <ø> (ø)
superset/datasets/api.py 87.95% <ø> (ø)
superset/security/manager.py 93.83% <ø> (ø)
... and 13 more

... and 7 files with indirect coverage changes

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

codecov[bot] avatar Apr 21 '22 18:04 codecov[bot]

Any update on this?

smartsense-as avatar Jun 23 '22 13:06 smartsense-as

@cemremengu , I've posted this in superset slack (https://apache-superset.slack.com/archives/CCKHMGRRB/p1662748936923259) and there's some feedback. Hopefully get some 👀 on it. Thank you for creating this PR!

etadelta222 avatar Sep 13 '22 20:09 etadelta222

You are welcome! I am happy to continue once maintainers have time to chime in.

cemremengu avatar Sep 14 '22 15:09 cemremengu

Hi @cemremengu. Any updates?

EugeneTorap avatar Oct 12 '22 12:10 EugeneTorap

@EugeneTorap thanks for the heads up! Just pushed the updates, my testing environment is limited now so apologies for any inconvenience

cemremengu avatar Oct 12 '22 15:10 cemremengu

@cemremengu Thanks. Can you format a file with black to fix CI: Python Misc / pre-commit (3.8)?

pip install black
black superset/tests/integration_tests/reports/commands_tests.py

EugeneTorap avatar Oct 13 '22 07:10 EugeneTorap

@cemremengu Can you also add Excel support in Download on explore page for consistency?

Screenshot 2022-10-13 at 10 17 45

EugeneTorap avatar Oct 13 '22 07:10 EugeneTorap

@cemremengu you forgot to add image

to the file superset/reports/commands/execute.py

Otherwise, email reports look like this: image

artemonsh avatar Nov 09 '22 10:11 artemonsh

@cemremengu Thanks. Can you format a file with black to fix CI: Python Misc / pre-commit (3.8)?

pip install black
black superset/tests/integration_tests/reports/commands_tests.py

EugeneTorap avatar Dec 06 '22 10:12 EugeneTorap

@cemremengu We use Excel name instead of XLSX in our https://github.com/apache/superset/pull/22006 PR Can we use the Excel name for email alert/report UI logic?

image

EugeneTorap avatar Dec 06 '22 11:12 EugeneTorap

@EugeneTorap you mean replacing all occurrences of XLSX with Excel like EXPORT_TO_EXCEL right ?

cemremengu avatar Dec 06 '22 11:12 cemremengu

I think yes, need to replace all UI label from XLSX to Excel in this PR for name consistency. @villebro @zhaoyongjie @michael-s-molina What do you think?

EugeneTorap avatar Dec 06 '22 11:12 EugeneTorap

Just to clarify @EugeneTorap should something like this be

<StyledRadio value="EXCEL">{t('Send as Excel')}</StyledRadio>

or

<StyledRadio value="Excel">{t('Send as Excel')}</StyledRadio>

cemremengu avatar Dec 06 '22 11:12 cemremengu

@cemremengu Value must be XLSX but label is Excel

<StyledRadio value="XLSX">{t('Send as Excel')}</StyledRadio>

EugeneTorap avatar Dec 06 '22 11:12 EugeneTorap

@EugeneTorap Fixed 2 of them, I agree it is better like this.

cemremengu avatar Dec 06 '22 11:12 cemremengu

/testenv up

rusackas avatar Dec 13 '22 00:12 rusackas

@rusackas Ephemeral environment spinning up at http://35.86.251.160:8080. Credentials are admin/admin. Please allow several minutes for bootstrapping and startup.

github-actions[bot] avatar Dec 13 '22 00:12 github-actions[bot]

/testenv up FEATURE_ALERT_REPORTS=true

rusackas avatar Jan 19 '23 16:01 rusackas

@cemremengu it looks like there was a PR by @lyndsiWilliams where some translations are moved into constants in one of the files. Would you mind rebasing off master to conform to this change? Sorry we didn't get this one merged first ;)

rusackas avatar Jan 19 '23 16:01 rusackas

@rusackas Ephemeral environment spinning up at http://34.209.245.171:8080. Credentials are admin/admin. Please allow several minutes for bootstrapping and startup.

github-actions[bot] avatar Jan 19 '23 16:01 github-actions[bot]

@EugeneTorap here is my attempt, still needs to be checked for runtime errors in a test-env though.

I (if I got you correctly) basically combined _get_embedded_data and _get_csv_data into one function as _get_data to obtain the dataframe directly. Then from there, I applied the conversion according to ReportDataFormat

cemremengu avatar Jan 29 '23 14:01 cemremengu

@geido @dpgaspar Can you review it? Maybe you have a good suggestion to improve the code?

EugeneTorap avatar Jan 30 '23 16:01 EugeneTorap

@cemremengu Btw, python integration tests are currently failing for this, superset.reports.commands.exceptions.ReportScheduleUnexpectedError: argument of type 'method' is not iterable

This is because there is line self._report_schedule.report_format in ReportDataFormat.table_like: it should be self._report_schedule.report_format in ReportDataFormat.table_like():

kakoni avatar Feb 06 '23 13:02 kakoni

@kakoni thank you for the tip!

cemremengu avatar Feb 09 '23 14:02 cemremengu

@cemremengu did you get time to fix this ?

mdeshmu avatar May 18 '23 14:05 mdeshmu

Not sure what I need to fix. As far as I remember failing tests were flaky (I cannot see the CI log anymore)

@EugeneTorap @rusackas how should we proceed with this PR?

cemremengu avatar Jun 09 '23 09:06 cemremengu

Easiest way to get CI logs again is to close/reopen it, or apply new commits (like fixing the merge conflicts would do)

rusackas avatar Jun 12 '23 18:06 rusackas

@rusackas rebased, not sure what the failing tests are about.

@kakoni had doubts about _get_data vs _get_csv_data in previous discussions. Once we settle that I think we can merge?

cemremengu avatar Jun 14 '23 09:06 cemremengu

The test-mysql, test-postgres, and test-sqlite actions are failing due to this:

FAILED tests/integration_tests/reports/commands_tests.py::test_email_chart_report_schedule_with_csv - superset.reports.commands.exceptions.ReportScheduleDataFrameFailedError: Failed generating dataframe Input string must be text, not bytes

the pre-commit check is failing because of a linting error. You can probably just run pre-commit run --all-files and commit any changes to clear that up.

rusackas avatar Jun 14 '23 15:06 rusackas

@rusackas thanks for pointing these out! Cypress tests are failing but those seem to be caused by something else?

I decided to revert back to the old _get_csv_data way to benefit from the post processing. The only problem now seems that the excel contains the index column even though I export it with Index=False. Should just force drop the first column? Any ideas about that?

image

cemremengu avatar Jun 15 '23 11:06 cemremengu

@rusackas can we please unblock this by answering @cemremengu's question in last comment.

mdeshmu avatar Oct 05 '23 00:10 mdeshmu