learningtower
Package size is too large
After adding the 2022 data, the package currently exceeds 5MB. Possible resolutions:
- Re-curate the 2022 data
- Explore alternative data compression for the subset data creation. https://github.com/kevinwang09/learningtower/blob/master/inst/sampling.R
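To make the compression option concrete, here is a rough sketch that compares the on-disk size of the same object saved with each `compress` method that base R's `save()` supports. The `student` data frame is a made-up stand-in, not the package data, and the sizes it produces say nothing about what the real data would do.

```r
# Hypothetical stand-in for one of the in-package data sets.
set.seed(1)
student <- data.frame(
  year  = rep(2022L, 5000),
  score = round(rnorm(5000, mean = 500, sd = 100))
)

# Save the same object once per compression method and record the file size.
sizes <- numeric(0)
for (method in c("gzip", "bzip2", "xz")) {
  path <- tempfile(fileext = ".rda")
  save(student, file = path, compress = method)
  sizes[method] <- file.size(path)
  unlink(path)  # clean up the temporary file
}

print(sizes)  # bytes on disk, named by compression method
```

Which method wins depends on the data, so it is worth running this kind of comparison on the actual subset files before committing to one.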
We can fix this by making the samples for each year smaller.
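A minimal base R sketch of per-year downsampling, assuming a `year` column and a hypothetical cap `n_per_year`. This is only an illustration on toy data, not the procedure used in `inst/sampling.R`.

```r
set.seed(2024)

# Toy data: 100 rows for each of two survey years.
student <- data.frame(
  year  = rep(c(2018L, 2022L), each = 100),
  score = rnorm(200, mean = 500, sd = 100)
)

n_per_year <- 10  # hypothetical cap on rows kept per year

# Split row indices by year, then draw at most n_per_year indices from each group.
# sample.int() on the group length avoids the sample(x, ...) surprise
# when a group happens to contain a single index.
rows_by_year <- split(seq_len(nrow(student)), student$year)
keep <- unlist(lapply(rows_by_year, function(i) {
  i[sample.int(length(i), min(n_per_year, length(i)))]
}))

student_small <- student[keep, ]
table(student_small$year)
```

Lowering `n_per_year` shrinks every year by the same amount, which keeps the years comparable in size.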
cheers, Di
Dianne Cook @.***
@dicook, the installed package is about 5.1MB even after reducing the number of rows.
I suggest that we subset the data to just the OECD countries: https://www.oecd.org/en/about/members-partners.html.
Procedures: https://github.com/kevinwang09/learningtower/blob/master/inst/sampling_student_and_school.R
That’s a reasonable approach
cheers, Di
Dianne Cook @.***
The use of factor columns was the main cause of the large package size. After re-curation:
- We ensured that year and school_id are now integer and character columns, respectively.
- We limited the in-package data to OECD countries. The full data remains intact.
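For readers who want to see the two changes above in code, here is a small sketch on toy data. The column names (`year`, `country`, `school_id`) and the short `oecd` vector are assumptions for illustration only. The real member list is on the OECD page linked earlier, and the actual procedure is in `inst/sampling_student_and_school.R`.

```r
# Toy stand-in for the student data, with factor columns as before re-curation.
student <- data.frame(
  year      = factor(c("2018", "2018", "2022", "2022")),
  country   = c("AUS", "BRA", "AUS", "NZL"),
  school_id = factor(c("0001", "0002", "0001", "0003")),
  stringsAsFactors = FALSE
)

# Convert via as.character() first: calling as.integer() directly on a factor
# returns the internal level codes (1, 2, ...), not the years.
student$year      <- as.integer(as.character(student$year))
student$school_id <- as.character(student$school_id)

# Illustrative and deliberately incomplete list of OECD members (ISO3 codes).
oecd <- c("AUS", "NZL")

# Keep only rows from OECD countries; the original `student` stays untouched.
student_oecd <- student[student$country %in% oecd, ]

str(student_oecd)
```

Keeping `student` unchanged and writing the subset to a new object mirrors the note that the full data remains intact.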