[4.1.0.rc3] Error to download PDF or Image

Open jacob-roldan opened this issue 1 year ago • 7 comments

Bug description

Go to a dashboard, click the three dots at the top right, then choose Download / Export to PDF or Download as Image.

An error message is shown. See screenshot.

I tested this in the environment at http://35.87.42.94:8080/

Screenshots/recordings

image

Superset version

master / latest-dev

Python version

3.9

Node version

16

Browser

Chrome

Additional context

No response

Checklist

  • [ ] I have searched Superset docs and Slack and didn't find a solution to my problem.
  • [ ] I have searched the GitHub issue tracker and didn't find a similar bug report.
  • [ ] I have checked Superset's logs for errors and if I found a relevant Python stacktrace, I included it here as text in the "additional context" section.

jacob-roldan avatar Oct 17 '24 15:10 jacob-roldan

Hey @jacob-roldan! 👋 I'm here to help you with any bugs, questions, or becoming a contributor. Let's squash those bugs together!

I found a similar open issue that might be related to the problem you're experiencing: Upon download as PDF, charts on a long dashboard render as loading image. This issue suggests that the problem might be due to the virtualization feature, which off-loads charts that are away from the viewport. A workaround is to disable the virtualization feature by updating the Superset configuration:

  1. Open the superset_config_docker.py file.

  2. Update the FEATURE_FLAGS variable as follows:

    FEATURE_FLAGS = {
        "DASHBOARD_VIRTUALIZATION": False,
    }
    

This change ensures that the dashboard loads all charts at once, preventing loading spinners from appearing in the downloaded PDF or image [1].

To continue talking to Dosu, mention @dosu.

dosubot[bot] avatar Oct 17 '24 15:10 dosubot[bot]

@geido I can't remember if I should have spun this up with a feature flag or not.

sadpandajoe avatar Oct 17 '24 16:10 sadpandajoe

@jacob-roldan I ran this on an instance I have and it looks like I do see that message for a bit (maybe a minute or two) but eventually the dashboard does download. How long did you wait until moving away from the dashboard?

sadpandajoe avatar Oct 17 '24 16:10 sadpandajoe

@sadpandajoe I can also confirm this problem using our internal test environment.

4.0.2

https://github.com/user-attachments/assets/fc5f4562-be15-4cdf-9a05-1811cdec1ea4

4.1.0rc3

https://github.com/user-attachments/assets/787c9648-dc1f-49e6-9c03-df56e54c2ebd

michael-s-molina avatar Oct 17 '24 19:10 michael-s-molina

Is there any configuration change between 4.0.2 and 4.1.0rc3 that is needed to generate the screenshots?

michael-s-molina avatar Oct 17 '24 19:10 michael-s-molina

@michael-s-molina @jacob-roldan are there any logs? We've actually been running this code on prod for a bit and haven't gotten this issue. Trying to debug this but can't seem to repro it on our end.

sadpandajoe avatar Oct 18 '24 16:10 sadpandajoe

@sadpandajoe @geido @eschutho I was able to pinpoint the problem. The failure occurs because screenshot generation on 4.1.0 RC3 caches the screenshots using THUMBNAIL_CACHE_CONFIG, which is a NullCache by default. A NullCache does not cache anything, which is why the frontend cannot find the screenshots and enters a loop. The fix would be to make the default THUMBNAIL_CACHE_CONFIG similar to what we do for the Explore form data and save the thumbnails in the database using the SupersetMetastoreCache:

THUMBNAIL_CACHE_CONFIG = {
    "CACHE_TYPE": "SupersetMetastoreCache",
    "CACHE_DEFAULT_TIMEOUT": ...,  # placeholder: set a value
    # Should the timeout be reset when retrieving a cached value?
    "REFRESH_TIMEOUT_ON_RETRIEVAL": True,
    # The following parameter only applies to `MetastoreCache`:
    # How should entries be serialized/deserialized?
    "CODEC": ...,  # placeholder: define the appropriate codec
}
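
To illustrate why a NullCache would leave the frontend polling forever, here is a minimal, self-contained sketch. It does not use Superset's or Flask-Caching's actual classes: `NullCache`, `DictCache`, `generate_screenshot`, and `poll_for_screenshot` are hypothetical stand-ins written only for this example, and the bounded retry count is an assumption added so the demo terminates.

```python
from typing import Dict, Optional


class NullCache:
    """Hypothetical stand-in for a cache backend that stores nothing."""

    def set(self, key: str, value: bytes) -> None:
        # The write is silently discarded.
        pass

    def get(self, key: str) -> Optional[bytes]:
        # Nothing was stored, so every lookup is a miss.
        return None


class DictCache:
    """Hypothetical stand-in for a cache backend that really stores values."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def set(self, key: str, value: bytes) -> None:
        self._store[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._store.get(key)


def generate_screenshot(cache, key: str) -> None:
    """Pretend worker: 'renders' a screenshot and writes it to the cache."""
    cache.set(key, b"fake-image-bytes")


def poll_for_screenshot(cache, key: str, max_attempts: int = 5) -> Optional[bytes]:
    """Pretend frontend: retries the lookup until the screenshot appears.

    The attempt limit is an assumption for this sketch; without it, a cache
    that never returns a value would keep this loop running indefinitely.
    """
    for _ in range(max_attempts):
        image = cache.get(key)
        if image is not None:
            return image
    return None


null_cache, dict_cache = NullCache(), DictCache()
generate_screenshot(null_cache, "dashboard-1")
generate_screenshot(dict_cache, "dashboard-1")

null_result = poll_for_screenshot(null_cache, "dashboard-1")  # always a miss
dict_result = poll_for_screenshot(dict_cache, "dashboard-1")  # found on first try
```

With the storing cache the lookup succeeds immediately, while with the null cache every attempt misses, which matches the looping behaviour described above.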

When I talked to @villebro about this issue, he raised a good point: previously, Celery workers were not a hard requirement for installing Superset but an optional feature. If screenshot generation always requires Celery workers from now on, that could constitute a breaking change. Let me know your thoughts.

michael-s-molina avatar Oct 18 '24 21:10 michael-s-molina

Thanks @michael-s-molina, we are currently discussing what the next steps should be for keeping Celery optional and for the cache.

geido avatar Oct 22 '24 13:10 geido