animint2 icon indicating copy to clipboard operation
animint2 copied to clipboard

Add file size columns to download status table

Open ANAMASGARD opened this issue 4 months ago • 20 comments

Fixes #249

Summary

Adds file size information (total_MB, mean_MB, and rows) to the download status table, helping users understand disk usage for each geom.

Problem

Currently, the download status table shows how many TSV files need to be downloaded, but doesn't show file sizes. Users can't see if they're downloading 1MB or 100MB.

Test output: Screenshot From 2025-10-29 10-50-36

Only added the test case in this first commit in the next commit will try to fix the issue !

ANAMASGARD avatar Oct 29 '25 05:10 ANAMASGARD

Add download size info to status table (#249)

Adds total_MB, mean_MB, and rows columns to the download status table. Calculates total data size and row count for each geom by iterating through all downloaded chunks. Handles both array and object data structures. Values update as data is downloaded and displayed.

Browser screenshot verification included showing new columns working. Screenshot From 2025-10-30 21-04-08

ANAMASGARD avatar Oct 30 '25 15:10 ANAMASGARD

Codecov Report

:x: Patch coverage is 99.00990% with 1 line in your changes missing coverage. Please review. :white_check_mark: Project coverage is 77.89%. Comparing base (68b074e) to head (69bc4c7).

Files with missing lines Patch % Lines
R/z_animintHelpers.R 88.88% 1 Missing :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #272      +/-   ##
==========================================
+ Coverage   73.06%   77.89%   +4.83%     
==========================================
  Files         164      164              
  Lines        8758     8845      +87     
  Branches        0      564     +564     
==========================================
+ Hits         6399     6890     +491     
+ Misses       2359     1955     -404     
Flag Coverage Δ
javascript 95.57% <100.00%> (+14.82%) :arrow_up:
r 69.59% <93.75%> (+0.05%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

codecov[bot] avatar Oct 30 '25 16:10 codecov[bot]

No obvious timing issues in HEAD=add-download-size-info Comparison Plot

Generated via commit 69bc4c74942ca94c0d7053d57669151ded098ad0

Download link for the artifact containing the test results: ↓ atime-results.zip

Task Duration
R setup and installing dependencies 2 minutes and 22 seconds
Installing different package versions 14 seconds
Running and plotting the test cases 3 minutes and 40 seconds

tdhock avatar Oct 31 '25 09:10 tdhock

Thanks! This is a good start. I see the result below using https://github.com/animint/animint2/blob/master/inst/examples/WorldBank-paper.R image

Can you please make the following changes?

  • remove selected chunk and status columns
  • right justify numeric columns
  • remove mean MB column
  • combine downloaded and total columns into one column named "files" value "4/52"
  • same for MB and rows, value should be " # downloaded / # possible "
  • This data viz, for geom3 and geom4, makes 52 numbered chunk files (chunk1 to chunk52.tsv) plus a chunk_common.tsv file. Currently the common file is counted for the numbr downloaded "4" but not for the total number "52" but we should change so that the common file is counted for both. (and for each attribute, files/MB/rows)

tdhock avatar Oct 31 '25 14:10 tdhock

please click Resolve so I can see which previous comments you have already addressed

tdhock avatar Nov 11 '25 16:11 tdhock

can you please post a screenshot like this one https://github.com/animint/animint2/pull/272#issuecomment-3473335127 but created with the new code?

tdhock avatar Nov 11 '25 16:11 tdhock

Sir @tdhock Here's the updated screenshot with the new download status table format:

Screenshot From 2025-11-16 21-28-41

Changes implemented:

  • ✅ Removed "selected chunk" and "status" columns
  • ✅ Removed "mean MB" column
  • ✅ Combined columns into "files", "MB", and "rows" format (downloaded/total)
  • ✅ Right-justified all numeric columns
  • ✅ Common chunk now counted in BOTH downloaded and total for all metrics

All values update correctly as data downloads, and common chunk is properly counted in both downloaded and total counts.

Please give your feedback and suggest what changes I need to make so that the failing test can be passed on the GitHub , Thank You !

ANAMASGARD avatar Nov 16 '25 16:11 ANAMASGARD

https://github.com/animint/animint2/pull/272#issuecomment-3538898963 seems to have the wrong old screenshot, please revise

tdhock avatar Nov 17 '25 15:11 tdhock

Updated screenshot with WorldBank-paper.R example showing new download status table format:

Screenshot From 2025-11-28 21-40-21
  • "files", "MB", and "rows" columns in "downloaded/total" format
  • Removed "selected chunk", "status", and "mean MB" columns
  • Right-justified numeric columns
  • Common chunk counted in both downloaded and total counts

ANAMASGARD avatar Nov 28 '25 16:11 ANAMASGARD

that new screenshot looks good, thanks! please fix test failures.

tdhock avatar Nov 28 '25 16:11 tdhock

  • Added id="download_status" to the table element for test discovery
  • Added initial display values ("0 / X" format) for files, MB, and rows columns

The remaining test failures in CI (ghpages permission issues, knit-print duplicate IDs) are unrelated to this PR's download status table changes.

ANAMASGARD avatar Nov 29 '25 05:11 ANAMASGARD

this needs to be fixed

── Failure ('test-renderer3-knit-print.R:67:3'): id property is unique over entire page ──
Expected `length(not.unique)` to equal 0.
Differences:
1/1 mismatches
[1] 1 - 0 == 1

tdhock avatar Dec 01 '25 20:12 tdhock

using this branch I see image

  • large number of rows would benefit from having commas, like 4,321 instead of 4321 -- can you please fix?
  • sometimes 0.00 is displayed for MB which could be confusing. I'm not sure what the best fix for this would be. Either change to 0 (without .00) or change column heading to "disk space" and use KB or MB in each row (depending on how big they are).

also please change the computation or column name "MB" to be consistent with "man du" which says

       The SIZE argument is an integer and optional unit (example: 10K is
       10*1024).  Units are K,M,G,T,P,E,Z,Y,R,Q (powers of 1024) or
       KB,MB,... (powers of 1000).  Binary prefixes can be used, too:
       KiB=K, MiB=M, and so on.

in the example above we see 15.62 MB for geom2. but looking at the file sizes in R we see:

> sum(size.dt[grep("geom2",name), R_bytes/1000/1000])
[1] 16.38009
> sum(size.dt[grep("geom2",name), R_bytes/1024/1024])
[1] 15.62127

15.62 is the number you get when you divide by 1024, which should be MiB (not MB as currently written in column header) so please

  • change MB to MiB in header, or
  • change divisor from 1024 to 1000

thanks!

tdhock avatar Dec 10 '25 17:12 tdhock

  • Updated NEWS/DESCRIPTION to version 2025.12.4 with the final download table description.
  • Download status table now shows files/disk/rows in “downloaded / total”, rows use commas, disk uses KiB/MiB (1024 divisor, man du consistent), and all numeric columns are right‑aligned.
  • Tests now target rendered HTML (headers and alignment) and are passing; full CI (JS/R/CRAN) is green.

Sir @tdhock can you please again review this PR .

ANAMASGARD avatar Dec 15 '25 13:12 ANAMASGARD

Sir @tdhock Updated screenshot showing the new download status table format:

Screenshot From 2025-12-15 21-54-33

ANAMASGARD avatar Dec 15 '25 16:12 ANAMASGARD

the updated screenshot looks better, thanks! it would be easier to read if there were a grey or black line to separate each row/column in the table. Can you add that please?

tdhock avatar Dec 15 '25 18:12 tdhock

also can you please post updated screenshot?

tdhock avatar Dec 16 '25 15:12 tdhock

Yes Sir @tdhock I will definitely create a function and post the updated screenshot. Actually my laptop screen is broken, ASAP it is fixed I will make the changes.

ANAMASGARD avatar Dec 16 '25 17:12 ANAMASGARD

Sir @tdhock Updated screenshot with table borders :- Screenshot From 2025-12-19 15-51-47

ANAMASGARD avatar Dec 19 '25 10:12 ANAMASGARD

Sir @tdhock The CI test failures in test-compiler-ghpages.R are not related to this PR's changes.

  • The failures are caused by a race condition: both JS_coverage and R_coverage jobs run in parallel and use the same test repo (animint-test/animint2pages_test_repo), causing them to interfere with each other.

  • I can fix this by making each test suite use a unique repo name based on the TEST_SUITE environment variable. Should I push that fix to this PR, or would you prefer to handle it separately (Like I raise an issue regarding this one )?

  • The download status table feature is complete and working. The test failures are flaky tests that sometimes pass (run #1335 passed, #1339 failed with same code).

ANAMASGARD avatar Dec 19 '25 11:12 ANAMASGARD