[R][C++] "negative buffer resize" error with arrow and dplyr in R

Open chenyiwrites opened this issue 1 year ago • 6 comments

Hi everyone,

I was working on a large dataset with over 1 billion observations, stored in 3040 parquet files, with 41 variables. I read the data with open_dataset() and then wanted to apply dplyr functions:

individual_positions %>% 
  group_by(user_id) %>% 
  summarize(n_positions = n()) %>% 
  count(n_positions, sort = TRUE) %>% 
  collect()

individual_positions is my dataset; each row is a job position that a user held at some point in their career. I wanted to understand the distribution of the number of positions each user ever held, but I got the following error message:

Error in `compute.arrow_dplyr_query()`:
! Invalid: Negative buffer resize: -2147483584
Backtrace:
 1. ... %>% collect()
 3. arrow:::collect.arrow_dplyr_query(.)
 4. arrow:::compute.arrow_dplyr_query(x)

I googled what "negative buffer resize" really means, but it was in vain. Can anyone please help me with the interpretation and provide any solutions? I know it's possible to process the dataset in SAS, but I'm an R lover and I really want to stick with it. Thanks a lot!

Important update: the observations seem to be spread across the parquet files at random (i.e., a single user's position-level observations may sit in different files), so I think group_by() has to read all of the parquet files rather than only the few it needs. This might overwhelm the memory. Do I have to repartition the data? Thanks!

Component(s)

R, C++

chenyiwrites avatar Feb 02 '24 13:02 chenyiwrites

Thanks for reporting this @chenyiwrites.

That error comes from somewhere in the Arrow C++ codebase, to do with memory allocation, though not somewhere I'm personally familiar with. I can ping one of the C++ folks to see if anything looks familiar. Which version of the package are you using?
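
A side note on the number in the error message: the base R arithmetic below only shows how close -2147483584 sits to the limits of a signed 32-bit integer. Reading this as a 32-bit size calculation that overflowed is an assumption, not something confirmed in this thread; the variable names are illustrative.

```r
# Value reported in "Negative buffer resize: -2147483584"
reported <- -2147483584

# Limits of a signed 32-bit integer
int32_min <- -2^31      # -2147483648
int32_max <- 2^31 - 1   #  2147483647

# Distance of the reported value from the 32-bit minimum
gap <- reported - int32_min

# ASSUMPTION: if a positive size had wrapped around in a 32-bit integer,
# this is the size that would have been requested originally
wrapped_from <- reported + 2^32

cat("gap above int32 minimum:", format(gap), "\n")
cat("size before a hypothetical wrap-around:", format(wrapped_from, scientific = FALSE), "\n")
cat("excess over int32 maximum:", format(wrapped_from - int32_max), "\n")
```

If that reading is right, the failing request was only slightly larger than 2 GiB, but that remains a guess until someone familiar with the C++ code confirms it.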

thisisnic avatar Feb 02 '24 18:02 thisisnic

Thanks for following up, Nic! I am using arrow 14.0.0.2 and tidyverse 2.0.0. Please see the detailed session info below:

─ Session info ──────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.2 (2023-10-31 ucrt)
 os       Windows 11 x64 (build 22621)
 system   x86_64, mingw32
 ui       RStudio
 language (EN)
 collate  Chinese (Simplified)_China.utf8
 ctype    Chinese (Simplified)_China.utf8
 tz       Asia/Shanghai
 date     2024-02-03
 rstudio  2023.12.1+402 Ocean Storm (desktop)
 pandoc   NA

─ Packages ──────────────────────────────────────────────────────────────────────────────────────────────────────
 package     * version  date (UTC) lib source
 arrow       * 14.0.0.2 2023-12-02 [1] CRAN (R 4.3.2)
 assertthat    0.2.1    2019-03-21 [1] CRAN (R 4.3.2)
 bit           4.0.5    2022-11-15 [1] CRAN (R 4.3.2)
 bit64         4.0.5    2020-08-30 [1] CRAN (R 4.3.2)
 cli           3.6.2    2023-12-11 [1] CRAN (R 4.3.2)
 clipr         0.8.0    2022-02-22 [1] CRAN (R 4.3.2)
 colorspace    2.1-0    2023-01-23 [1] CRAN (R 4.3.2)
 dplyr       * 1.1.4    2023-11-17 [1] CRAN (R 4.3.2)
 fansi         1.0.6    2023-12-08 [1] CRAN (R 4.3.2)
 forcats     * 1.0.0    2023-01-29 [1] CRAN (R 4.3.2)
 generics      0.1.3    2022-07-05 [1] CRAN (R 4.3.2)
 ggplot2     * 3.4.4    2023-10-12 [1] CRAN (R 4.3.2)
 glue          1.7.0    2024-01-09 [1] CRAN (R 4.3.2)
 gtable        0.3.4    2023-08-21 [1] CRAN (R 4.3.2)
 hms           1.1.3    2023-03-21 [1] CRAN (R 4.3.2)
 knitr         1.45     2023-10-30 [1] CRAN (R 4.3.2)
 lifecycle     1.0.4    2023-11-07 [1] CRAN (R 4.3.2)
 lubridate   * 1.9.3    2023-09-27 [1] CRAN (R 4.3.2)
 magrittr      2.0.3    2022-03-30 [1] CRAN (R 4.3.2)
 munsell       0.5.0    2018-06-12 [1] CRAN (R 4.3.2)
 pillar        1.9.0    2023-03-22 [1] CRAN (R 4.3.2)
 pkgconfig     2.0.3    2019-09-22 [1] CRAN (R 4.3.2)
 purrr       * 1.0.2    2023-08-10 [1] CRAN (R 4.3.2)
 R6            2.5.1    2021-08-19 [1] CRAN (R 4.3.2)
 readr       * 2.1.5    2024-01-10 [1] CRAN (R 4.3.2)
 rlang         1.1.3    2024-01-10 [1] CRAN (R 4.3.2)
 rstudioapi    0.15.0   2023-07-07 [1] CRAN (R 4.3.2)
 scales        1.3.0    2023-11-28 [1] CRAN (R 4.3.2)
 sessioninfo   1.2.2    2021-12-06 [1] CRAN (R 4.3.2)
 stringi       1.8.3    2023-12-11 [1] CRAN (R 4.3.2)
 stringr     * 1.5.1    2023-11-14 [1] CRAN (R 4.3.2)
 tibble      * 3.2.1    2023-03-20 [1] CRAN (R 4.3.2)
 tidyr       * 1.3.1    2024-01-24 [1] CRAN (R 4.3.2)
 tidyselect    1.2.0    2022-10-10 [1] CRAN (R 4.3.2)
 tidyverse   * 2.0.0    2023-02-22 [1] CRAN (R 4.3.2)
 timechange    0.3.0    2024-01-18 [1] CRAN (R 4.3.2)
 tzdb          0.4.0    2023-05-12 [1] CRAN (R 4.3.2)
 utf8          1.2.4    2023-10-22 [1] CRAN (R 4.3.2)
 vctrs         0.6.5    2023-12-01 [1] CRAN (R 4.3.2)
 withr         3.0.0    2024-01-16 [1] CRAN (R 4.3.2)
 xfun          0.41     2023-11-01 [1] CRAN (R 4.3.2)

 [1] C:/Users/sibo/AppData/Local/R/win-library/4.3
 [2] C:/Program Files/R/R-4.3.2/library

─────────────────────────────────────────────────────────────────────────────────────────────────────────────────

chenyiwrites avatar Feb 03 '24 03:02 chenyiwrites

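For anyone who runs into the same issue: I added to_duckdb() to the pipeline (the function comes from the arrow package and needs the duckdb package installed), and the query now runs successfully!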

individual_positions %>% 
  to_duckdb() %>%
  group_by(user_id) %>% 
  summarize(n_positions = n()) %>% 
  count(n_positions, sort = TRUE) %>% 
  collect()

chenyiwrites avatar Feb 07 '24 09:02 chenyiwrites

Hi @chenyiwrites, is there any chance this is a publicly available dataset? I think the best next step here is for us to reproduce your issue. Re-partitioning your dataset may make the issue go away, but what you ran into is definitely a bug and it needs fixing.

If you aren't able to share the dataset, could you give us a bit more information about the structure of it? Two bits of information would be useful:

  1. Your schema: Call individual_positions$schema and share the output.
  2. Statistics on your files and their number of rows and row groups. The output of the below will be long so let us know if every value is the same or what the range/distribution is.
    num_rows <- vapply(individual_positions$files, function(f) { ParquetFileReader$create(f)$num_rows }, 0, USE.NAMES=FALSE)
    num_row_groups <- vapply(individual_positions$files, function(f) { ParquetFileReader$create(f)$num_row_groups }, 0, USE.NAMES=FALSE)
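
One way to condense those two vectors into the range/distribution asked for is sketched below, using base R only. The five values in each vector are placeholders (they match the start of the output posted in the reply below) standing in for the full vectors, and rows_per_group and total_rows are names introduced here for illustration.

```r
# Placeholder vectors; in practice these come from the two vapply() calls above
num_rows <- c(720896, 421888, 438272, 454656, 503808)
num_row_groups <- c(5, 7, 7, 7, 5)

# Range and five-number summary of rows per file
print(range(num_rows))
print(summary(num_rows))

# How many files have each row-group count
print(table(num_row_groups))

# Average rows per row group in each file, and total rows across files
rows_per_group <- num_rows / num_row_groups
print(summary(rows_per_group))
total_rows <- sum(num_rows)
print(total_rows)
```

Sharing these summaries instead of the raw dput() output keeps the reply short while still showing whether the files are uniform.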
    

amoeba avatar Feb 15 '24 03:02 amoeba

Hi @amoeba, thanks for reaching out! Sorry that I did not check GitHub for a while.

I'm sorry that I'm unable to share the data at this moment, as it is still academic work in progress. But of course I can provide aggregate information about the dataset to help fix the bug.

  1. The schema:
Schema
user_id: int64
position_id: int64
company_raw: string
company_linkedin_url: string
company_cleaned: string
location_raw: string
region: string
country: string
state: string
metro_area: string
startdate: string
enddate: string
jobtitle_raw: string
mapped_role: string
job_category: string
role_k50: string
role_k150: string
role_k300: string
role_k500: string
role_k1000: string
remote_suitability: float
weight: float
description: string
start_mean_sampled_salary: double
end_mean_sampled_salary: double
seniority: int16
salary: float
rn: int16
rcid: int64
company_name: string
ultimate_parent_rcid: int64
ultimate_parent_company_name: string
onet_code: string
onet_title: string
ticker: string
exchange: string
cusip: string
naics: string
naics_desc: string
final_parent_factset_id: string
final_parent_factset_name: string
  2. Statistics on the parquet files:
> dput(num_rows)
c(720896, 421888, 438272, 454656, 503808, 376832, 380928, 700416, 
491520, 430080, 458752, 405504, 471040, 425984, 425984, 598016, 
425984, 569344, 434176, 327680, 425984, 368640, 425984, 446464, 
425984, 417792, 483328, 454656, 413696, 479232, 491520, 487424, 
430080, 315392, 586567, 413696, 450560, 438272, 491520, 430080, 
421888, 409600, 424274, 442368, 430080, 421888, 438272, 507904, 
667648, 458752, 479232, 425984, 421888, 430080, 409600, 344064, 
430080, 421888, 413696, 503808, 606208, 442368, 421888, 435194, 
450722, 434176, 421888, 700416, 430080, 692224, 405504, 249856, 
483328, 495616, 434176, 397312, 438272, 634880, 421888, 421888, 
425984, 418277, 618496, 434176, 421888, 421888, 417792, 413696, 
430080, 425984, 442368, 409600, 466944, 413696, 385024, 430080, 
581632, 425984, 24576, 413696, 425984, 471040, 466944, 421888, 
413696, 425984, 425984, 463557, 548864, 417792, 421367, 411289, 
200704, 512256, 458752, 430080, 421888, 425728, 421888, 421888, 
413696, 425984, 425984, 405504, 421888, 409600, 458752, 94208, 
356352, 65536, 438355, 365346, 425984, 453920, 671744, 409600, 
416410, 458752, 540672, 442368, 413696, 446464, 434176, 421888, 
434176, 421888, 258048, 417102, 700416, 401408, 466944, 422917, 
458752, 430080, 425984, 421888, 417792, 438272, 425984, 421888, 
393216, 380928, 421888, 446464, 298752, 438272, 435689, 614656, 
454656, 430080, 434720, 417792, 442368, 496960, 430080, 421888, 
368640, 491776, 440563, 430080, 433920, 466944, 417792, 438272, 
427535, 413696, 552960, 761856, 425984, 442368, 523107, 430080, 
421888, 425984, 421888, 434176, 593920, 372736, 421888, 430080, 
430080, 415313, 430080, 458752, 425984, 376832, 442368, 425984, 
417792, 430080, 413696, 442368, 385024, 344064, 421888, 466944, 
540672, 373889, 438272, 438272, 413696, 442368, 413696, 536576, 
421888, 638976, 467578, 421888, 487424, 413696, 405504, 475136, 
544768, 356352, 425984, 143360, 430080, 430080, 438272, 282624, 
442368, 425984, 717858, 385024, 413696, 430080, 425059, 356352, 
499712, 428485, 421888, 385024, 339968, 425984, 421888, 602112, 
421888, 421888, 425984, 417792, 438272, 380928, 643072, 422417, 
212992, 372736, 412213, 466944, 434176, 430080, 434176, 638976, 
417792, 450560, 405504, 643072, 417433, 425984, 466944, 421888, 
442368, 491520, 430080, 425984, 356352, 430080, 421888, 425984, 
442368, 421888, 421888, 540672, 446464, 438272, 462848, 421888, 
421888, 425984, 491776, 429498, 422579, 409344, 393216, 372736, 
421888, 405504, 389120, 397312, 446464, 421888, 430080, 614400, 
413696, 422760, 520192, 471040, 425984, 470110, 417792, 434176, 
118784, 421888, 421888, 442368, 421888, 417792, 413696, 757760, 
421888, 450560, 421888, 430080, 425984, 520192, 417792, 425984, 
430080, 573440, 421888, 430080, 430080, 421888, 413696, 446464, 
389120, 462848, 647168, 479232, 446464, 438272, 532480, 417792, 
430080, 557900, 421888, 425984, 454656, 516096, 425984, 442368, 
479232, 421888, 409600, 569344, 425984, 425984, 433471, 446464, 
421888, 417792, 397312, 438272, 368640, 749568, 421888, 458752, 
401408, 466944, 421888, 593920, 438272, 421888, 405504, 327680, 
434176, 471040, 548864, 425984, 442780, 389120, 389120, 430080, 
425984, 421888, 425984, 598016, 438272, 454656, 552359, 704512, 
393216, 401408, 397728, 413696, 401408, 487424, 544768, 409600, 
356352, 479232, 422057, 462848, 425984, 421888, 421888, 413696, 
446464, 420633, 368640, 458752, 417792, 397312, 516096, 389120, 
401408, 503808, 413696, 413696, 434176, 425984, 446464, 548864, 
425984, 356352, 442368, 572061, 442368, 421888, 614400, 466944, 
420842, 458752, 438272, 462848, 303104, 573440, 430080, 421888, 
446464, 417792, 413696, 417792, 434176, 430080, 528384, 442368, 
425984, 425984, 520192, 438272, 425984, 409600, 425984, 421888, 
434176, 421415, 421888, 450560, 425984, 430080, 237568, 376832, 
418061, 479232, 430080, 430698, 348160, 421888, 413696, 430080, 
434176, 569344, 409600, 425984, 475136, 438272, 192512, 401853, 
372736, 525229, 499712, 454656, 479232, 122880, 417792, 425984, 
421888, 708608, 430080, 430080, 491520, 421888, 462848, 503808, 
425984, 425984, 278528, 475136, 434176, 434176, 425984, 425984, 
425984, 471040, 438272, 425984, 434176, 397312, 466944, 421888, 
696320, 421888, 483328, 463369, 348160, 331776, 393216, 419225, 
397312, 548864, 450560, 458694, 421888, 512000, 413696, 425984, 
540672, 425984, 450560, 417792, 420504, 417792, 430080, 417792, 
428813, 131072, 409600, 452236, 464803, 421888, 415714, 430080, 
348160, 442368, 425984, 734870, 544768, 446464, 425984, 532480, 
417792, 507904, 462848, 442368, 438272, 577536, 393216, 430080, 
430080, 385024, 449275, 438272, 749568, 454656, 417792, 401408, 
453173, 364544, 417792, 417792, 442368, 315392, 430080, 428711, 
400467, 409600, 417792, 438272, 430080, 417792, 417792, 589824, 
466944, 409600, 393216, 491520, 413696, 405504, 466944, 430080, 
483328, 450560, 417792, 204800, 540112, 417792, 454656, 434176, 
700416, 417792, 421888, 458752, 434176, 421888, 425984, 430080, 
413696, 589824, 417792, 421888, 77824, 421888, 425984, 430080, 
389120, 430080, 425984, 425984, 454656, 425984, 577536, 491776, 
425984, 421888, 397056, 421888, 417792, 430080, 425984, 425732, 
438272, 438272, 458752, 471040, 417792, 397312, 401408, 507648, 
425984, 417792, 467200, 458752, 454656, 430080, 425984, 434176, 
393216, 454656, 421888, 155648, 425984, 425984, 450560, 430080, 
421888, 438272, 516096, 471040, 425984, 643072, 430080, 430080, 
430080, 417792, 434176, 446704, 512000, 421888, 249856, 434176, 
466746, 438272, 385024, 425984, 430080, 458752, 544768, 485954, 
446464, 626688, 417792, 401644, 425984, 456211, 421888, 454656, 
413696, 450560, 20480, 430080, 430080, 499712, 430080, 740116, 
413696, 360448, 421888, 430080, 512000, 561152, 421888, 466944, 
482292, 425984, 424987, 430080, 450560, 446464, 430080, 417792, 
28672, 431465, 409600, 446464, 491520, 602112, 417792, 458752, 
454656, 438272, 430080, 425984, 417792, 430080, 430080, 430080, 
491520, 69632, 393216, 471040, 483328, 425984, 507904, 425984, 
424300, 291527, 434176, 630784, 458752, 446464, 409600, 398362, 
413696, 413696, 434176, 421888, 540672, 417792, 421888, 413696, 
438272, 466944, 462848, 425984, 483328, 462848, 454656, 434176, 
442368, 421888, 479232, 421888, 471040, 434176, 425984, 393216, 
417233, 421888, 413696, 487424, 438272, 495616, 397312, 581632, 
397312, 413696, 496964, 417792, 421888, 499712, 352256, 416727, 
253952, 421888, 438272, 434176, 417792, 418176, 462848, 430080, 
425984, 412697, 425984, 298752, 421888, 434176, 602368, 434176, 
421888, 413696, 409600, 655360, 417792, 417792, 409600, 503808, 
434176, 520192, 45056, 601136, 410686, 375029, 405504, 442368, 
425984, 417792, 450560, 413818, 466944, 417792, 430080, 348160, 
438272, 430080, 462848, 466944, 425984, 430080, 425984, 430080, 
417792, 430080, 434176, 434176, 454656, 532480, 352256, 319232, 
446464, 425984, 454912, 425984, 434176, 438272, 655360, 430080, 
409600, 98304, 749568, 540672, 462848, 446464, 434176, 425984, 
471040, 397312, 437445, 290816, 428294, 434176, 491520, 430080, 
430080, 434176, 425984, 421888, 434176, 425984, 462848, 434176, 
425984, 413696, 421888, 425984, 559851, 421888, 618496, 430945, 
409600, 466944, 421888, 393216, 409600, 421888, 491520, 278528, 
393216, 524288, 405504, 425984, 421888, 409600, 417792, 442368, 
430080, 434176, 65536, 475136, 450560, 417792, 446464, 446464, 
440646, 438272, 430080, 425984, 458752, 417792, 421888, 421888, 
737280, 405504, 458752, 454656, 430080, 548864, 421888, 421888, 
417792, 417792, 434176, 630784, 102400, 761856, 94208, 417792, 
446464, 418637, 401408, 573440, 483328, 425984, 28672, 368640, 
412617, 442368, 421888, 425984, 425984, 385024, 417769, 380928, 
446464, 421888, 421888, 426906, 421193, 430080, 425984, 503808, 
421888, 450560, 442368, 421888, 438272, 442368, 454656, 409600, 
499712, 532480, 487424, 45056, 561152, 151552, 430080, 647168, 
421888, 397312, 77824, 438272, 421261, 651264, 376832, 454656, 
73728, 434176, 442368, 540672, 421888, 421888, 651264, 430080, 
430080, 73728, 425984, 421888, 655360, 393216, 495983, 643072, 
614400, 436224, 657152, 466944, 421888, 461293, 438272, 483328, 
458752, 450560, 397312, 409600, 427575, 430080, 512000, 352256, 
405504, 413696, 458752, 425984, 413696, 438272, 425984, 425984, 
430080, 425984, 458752, 457880, 421888, 434176, 417792, 434176, 
409600, 487424, 405504, 454656, 393984, 507648, 458752, 388352, 
495872, 479232, 417792, 438272, 471040, 421888, 441819, 425984, 
503808, 253952, 548864, 446464, 405504, 419033, 454656, 503808, 
409600, 380928, 405504, 421888, 470173, 421888, 421888, 458752, 
425984, 425984, 425984, 438272, 417792, 417792, 450560, 440363, 
389120, 430080, 417792, 536576, 507648, 364544, 393216, 430336, 
446464, 426661, 415116, 425984, 578203, 417792, 577536, 81920, 
311303, 221184, 398772, 462848, 417792, 438272, 421888, 417792, 
368640, 430080, 409600, 421888, 516096, 434176, 430080, 420699, 
430080, 450560, 425984, 417792, 417792, 402417, 417792, 430080, 
458752, 430080, 454656, 561152, 671744, 483328, 409600, 544768, 
593920, 602112, 475136, 192512, 131072, 364544, 421888, 421888, 
425984, 569344, 420923, 417792, 422128, 446464, 430080, 430080, 
430080, 425984, 438272, 421888, 446464, 417792, 425984, 454656, 
421888, 417792, 422329, 425984, 425984, 417792, 528128, 479232, 
344064, 393472, 479232, 585728, 409600, 421888, 471040, 790528, 
364544, 589824, 131072, 98304, 237568, 462848, 446464, 438272, 
421888, 471040, 417792, 448915, 446464, 417792, 417792, 442368, 
413696, 413696, 442368, 434176, 425984, 430080, 421888, 413696, 
462848, 438272, 430080, 446464, 429863, 372736, 733184, 442368, 
487424, 409600, 581632, 411594, 462848, 413696, 421888, 503808, 
499712, 262144, 417792, 401408, 385024, 417792, 461470, 441557, 
434176, 421888, 417792, 413696, 421888, 413422, 409600, 430080, 
434176, 434176, 430080, 428964, 425984, 430080, 430080, 417792, 
426286, 425984, 339968, 589824, 600209, 458752, 434176, 436157, 
430080, 643072, 409600, 634880, 131072, 626688, 282624, 421888, 
45056, 417792, 430080, 475136, 434176, 430080, 430080, 442368, 
434176, 454656, 434176, 430080, 475136, 438272, 421888, 430080, 
434176, 471040, 421888, 438272, 413696, 430592, 421888, 724992, 
409600, 516096, 442531, 368640, 421888, 516513, 548864, 434176, 
208896, 438272, 438272, 430080, 425984, 237568, 438272, 311296, 
409600, 421888, 454656, 438272, 417792, 499712, 430080, 438272, 
442368, 425984, 421888, 430080, 417792, 421888, 434176, 479232, 
434176, 409600, 438272, 692224, 417792, 434176, 499712, 454656, 
581632, 425984, 543570, 364544, 409600, 450560, 430080, 479232, 
452692, 438272, 425984, 458752, 135168, 437541, 413696, 430080, 
413696, 487424, 430080, 421888, 421888, 413696, 421888, 438272, 
417792, 385024, 446464, 466944, 413696, 413696, 421073, 712704, 
438272, 430080, 614400, 483328, 425984, 503808, 425984, 548864, 
417792, 680992, 446464, 434176, 380928, 431377, 425984, 425984, 
643072, 425984, 454656, 458752, 413696, 466944, 405504, 442368, 
425984, 430080, 442368, 438272, 434176, 425984, 413548, 462848, 
614400, 454656, 425984, 417792, 417792, 425984, 417792, 389120, 
425984, 589824, 417792, 552960, 339968, 606208, 442368, 430080, 
507904, 368640, 438272, 503808, 454656, 438272, 602112, 65536, 
434176, 49152, 405504, 430080, 450560, 393216, 450560, 417792, 
425984, 421888, 421888, 430080, 515840, 451959, 413696, 454912, 
425984, 432009, 438272, 417792, 430080, 475136, 434176, 425984, 
389120, 389120, 773847, 421888, 778240, 466944, 409600, 389120, 
638976, 434176, 577536, 114688, 426956, 110592, 413696, 413696, 
450560, 417792, 430080, 440875, 421888, 430080, 434176, 425984, 
430080, 430080, 421888, 515840, 393216, 417792, 463104, 421888, 
442368, 417792, 499712, 421888, 487424, 430080, 286929, 212992, 
592466, 626688, 417792, 475136, 491520, 450560, 435566, 368640, 
409600, 319488, 425984, 440099, 450560, 450560, 419516, 479232, 
438272, 436390, 498325, 471040, 430080, 434176, 487424, 425984, 
446464, 430080, 720896, 430080, 454656, 380928, 589824, 430080, 
487424, 425984, 540672, 200704, 360448, 479232, 413696, 552960, 
495616, 446464, 426474, 317236, 443456, 503808, 425984, 417792, 
138851, 425984, 372736, 434176, 405504, 430080, 450257, 446464, 
423933, 417792, 450560, 417792, 401408, 458752, 598460, 413914, 
434176, 479232, 434176, 417792, 409600, 434176, 592145, 450560, 
528453, 607922, 315778, 610319, 552960, 360448, 430080, 303104, 
418905, 577536, 430080, 425984, 462848, 417792, 446464, 438272, 
417792, 430080, 446464, 442368, 422333, 536162, 411825, 421291, 
434501, 401408, 454656, 421888, 425984, 593920, 471040, 420710, 
446464, 425984, 745107, 528384, 524288, 483328, 266240, 409600, 
487424, 528384, 507904, 131072, 417792, 512000, 421888, 425984, 
413696, 430080, 425984, 491520, 413696, 417792, 421888, 413696, 
421888, 434176, 421888, 430080, 430080, 421156, 421888, 438272, 
417792, 487424, 458752, 434176, 434176, 393216, 372736, 409600, 
585728, 557056, 692430, 421888, 354234, 451701, 450560, 647168, 
397312, 135168, 405504, 421888, 450560, 434176, 421888, 434176, 
425984, 450560, 454656, 425984, 417792, 475136, 417792, 425984, 
425984, 425984, 425984, 458752, 425984, 581632, 438272, 413696, 
466944, 425984, 446464, 438272, 663552, 491520, 524288, 421888, 
425984, 118784, 413270, 393612, 413696, 483328, 69632, 454656, 
176128, 462848, 434176, 458752, 389120, 413696, 434176, 438272, 
434176, 430080, 421888, 430080, 451001, 434176, 430080, 430080, 
429233, 438272, 417792, 421888, 638976, 421888, 417792, 450560, 
421888, 671744, 434176, 573440, 446464, 487358, 487424, 397007, 
262144, 602112, 233472, 466944, 425984, 417792, 438272, 434176, 
421888, 425984, 442368, 421888, 430080, 455980, 425984, 425984, 
413696, 405504, 421888, 439802, 446891, 430080, 475136, 421888, 
430080, 442368, 421888, 565248, 530256, 327680, 450560, 454656, 
442368, 471040, 495771, 413696, 557056, 290816, 540672, 135168, 
413696, 349347, 421888, 446464, 528384, 417792, 430080, 434176, 
425984, 425984, 425984, 417792, 479232, 446464, 405504, 438272, 
421888, 421888, 458752, 425984, 425984, 435013, 446464, 438272, 
360448, 421888, 667648, 409600, 339968, 401408, 446464, 507904, 
425984, 524288, 503808, 487424, 364544, 409600, 438272, 434176, 
466944, 425984, 430080, 413696, 442368, 454656, 409600, 425984, 
401408, 434176, 417792, 405504, 434176, 431797, 466944, 430080, 
425984, 417792, 466944, 413696, 430080, 417792, 581632, 561152, 
491520, 421888, 421888, 449283, 425984, 548864, 503808, 430080, 
114688, 331776, 434176, 462848, 434176, 427280, 466944, 413696, 
417792, 540672, 425984, 409600, 491520, 430080, 417792, 413696, 
430080, 458752, 450560, 442368, 413696, 483328, 434176, 413696, 
536576, 430080, 557056, 700416, 761856, 421888, 479232, 376832, 
425984, 593607, 315392, 430080, 311296, 409600, 163840, 442368, 
450560, 430080, 438272, 417792, 421888, 430080, 425984, 430080, 
425984, 425064, 430080, 421888, 421888, 421888, 421888, 421888, 
430080, 421888, 466944, 361864, 380928, 466944, 425984, 507904, 
458752, 487424, 434176, 413696, 397312, 422989, 532480, 430080, 
536576, 425984, 430080, 425984, 446464, 544768, 466944, 421888, 
454656, 479232, 430080, 421888, 425984, 425984, 430080, 405504, 
442368, 421888, 434029, 421888, 418731, 430080, 430080, 434176, 
401408, 438272, 413696, 364544, 475136, 356352, 298752, 430080, 
483328, 463104, 585728, 438272, 438272, 421888, 387301, 466747, 
462848, 413696, 450560, 430080, 450560, 53248, 348160, 417792, 
446464, 425984, 434176, 417792, 436428, 425984, 430080, 434176, 
417792, 438272, 475136, 417792, 434176, 348160, 430080, 450560, 
565248, 466944, 421888, 425984, 671744, 425984, 461031, 520192, 
458752, 421888, 442368, 397312, 413696, 487424, 430080, 450560, 
430080, 532480, 434176, 458752, 40960, 454656, 430080, 479232, 
540672, 413696, 436192, 499712, 438272, 409600, 425984, 393216, 
413696, 495360, 421888, 413696, 422144, 434176, 430080, 421888, 
493444, 425984, 591240, 445319, 421888, 417792, 475136, 295168, 
453194, 471040, 454400, 470289, 360448, 425984, 409937, 438272, 
561152, 421888, 430080, 475136, 364544, 425984, 434176, 491520, 
413696, 425984, 417792, 417792, 425984, 421888, 425984, 417792, 
507904, 413696, 430080, 389120, 434176, 430080, 450560, 626688, 
352256, 458583, 416347, 582519, 466944, 540672, 450560, 421888, 
233472, 421888, 482941, 425984, 311296, 421888, 438272, 421888, 
401408, 417792, 421888, 499712, 458480, 421888, 421888, 421888, 
442368, 311040, 417792, 438272, 401664, 429807, 417792, 425984, 
430080, 430080, 561152, 671744, 450560, 167936, 398811, 540148, 
450560, 419136, 417792, 434176, 385024, 421888, 483328, 421888, 
281589, 471040, 425984, 438272, 462848, 475136, 421888, 412290, 
430080, 413696, 450560, 479232, 430080, 433388, 430080, 425984, 
667648, 409600, 421888, 405504, 421888, 417792, 475136, 503808, 
430080, 452694, 389120, 462848, 446464, 431122, 475136, 438272, 
569344, 417792, 434176, 421888, 577536, 462848, 421888, 36864, 
524288, 443754, 475136, 487424, 421888, 421888, 435432, 430080, 
417792, 389120, 425984, 417792, 457990, 417792, 450560, 459008, 
417792, 417792, 421888, 352256, 434176, 491520, 667648, 421888, 
438272, 458752, 591404, 425984, 475136, 352256, 430080, 12288, 
462848, 348160, 417792, 430080, 417792, 430129, 490096, 512000, 
417792, 520192, 483328, 380928, 8192, 434176, 421888, 405504, 
364544, 425984, 438272, 524288, 430080, 446464, 417792, 425984, 
413696, 434176, 409344, 569344, 241664, 590080, 425984, 451253, 
413696, 434176, 425984, 455365, 446464, 421888, 315392, 491520, 
421036, 454656, 389120, 393257, 430080, 471040, 479232, 425984, 
348160, 413696, 405504, 450560, 393216, 417792, 462848, 524288, 
450560, 331776, 394766, 421888, 430080, 454656, 421888, 417792, 
409600, 306944, 434176, 438272, 532736, 438272, 434176, 401408, 
430080, 376832, 442368, 442368, 434176, 159744, 421888, 430080, 
484741, 487424, 405504, 434176, 602112, 417792, 450560, 421888, 
446464, 425984, 446464, 753664, 434176, 385024, 438272, 413696, 
434721, 471040, 430080, 679936, 434687, 425984, 81920, 409600, 
630442, 405504, 423788, 528384, 442368, 430080, 454656, 442368, 
434176, 286720, 401408, 425984, 462848, 428217, 430080, 446464, 
421888, 472926, 417792, 425984, 573440, 417792, 421888, 598016, 
414414, 589824, 430080, 425984, 528384, 319488, 434176, 421888, 
463851, 462848, 409600, 417792, 549204, 425984, 413696, 561152, 
499712, 421888, 450556, 94208, 417792, 507728, 466944, 413696, 
159744, 421888, 421888, 462848, 421888, 461822, 417792, 385024, 
577536, 446464, 417792, 468012, 487424, 413696, 421888, 413696, 
430080, 413696, 417792, 430080, 413696, 430080, 765952, 425984, 
425984, 475715, 434176, 475136, 475136, 421888, 520192, 397312, 
421888, 424754, 516096, 421888, 425984, 20480, 425984, 339968, 
450560, 405644, 385484, 462848, 544768, 397537, 421888, 481140, 
434176, 425984, 540672, 348160, 425984, 225280, 430080, 438903, 
458752, 425984, 450560, 417792, 425984, 425984, 413696, 438272, 
541787, 425984, 602112, 409600, 450560, 471040, 389120, 409600, 
544657, 397312, 434176, 433597, 405504, 427994, 430080, 413696, 
655360, 417792, 333050, 622592, 421888, 483328, 557056, 425984, 
421888, 425984, 441361, 446464, 483328, 430080, 405504, 430080, 
421888, 479232, 417792, 430080, 430080, 442368, 745472, 430080, 
417792, 421888, 425984, 421888, 540672, 368640, 417792, 57344, 
593920, 557056, 458752, 430080, 442368, 430080, 385024, 413696, 
442368, 425984, 638976, 421888, 450560, 405504, 421888, 507904, 
417792, 442368, 360448, 438272, 442368, 405504, 418880, 466944, 
430080, 487424, 450560, 495616, 425984, 557056, 425984, 421888, 
421888, 450560, 421888, 389120, 442368, 605309, 45056, 446464, 
425984, 466944, 565248, 421888, 417792, 523246, 323584, 421888, 
434176, 462848, 430080, 430080, 413696, 421888, 450560, 421888, 
421888, 425984, 385024, 425984, 430080, 438272, 425984, 425984, 
417792, 344064, 446464, 432996, 425984, 425984, 442368, 489108, 
325607, 515840, 417792, 487424, 418048, 442368, 417792, 418557, 
73728, 430080, 494894, 450560, 450560, 90112, 438272, 438272, 
471040, 454656, 438272, 422198, 413696, 421888, 417792, 425984, 
307200, 434176, 421888, 421888, 442368, 425984, 430080, 421888, 
454656, 421888, 450560, 462848, 425984, 417792, 405504, 417792, 
503808, 425984, 430080, 450560, 425984, 344064, 417792, 434176, 
434176, 389120, 442368, 306944, 421888, 315743, 405760, 421888, 
487424, 434176, 344064, 475136, 466944, 430080, 430080, 524288, 
425984, 454656, 102400, 417792, 458752, 385024, 430080, 434176, 
425984, 421888, 425984, 417792, 610304, 471040, 435610, 466944, 
622592, 421888, 434176, 450560, 413696, 647168, 430080, 471040, 
28672, 462848, 425984, 425984, 434176, 478976, 385024, 524288, 
372992, 421888, 4096, 425984, 425984, 434176, 487424, 425984, 
454656, 155648, 438272, 417792, 425984, 413696, 438272, 458752, 
466944, 356352, 417792, 679936, 413696, 421888, 413696, 417792, 
421888, 421888, 434176, 466944, 503808, 466944, 430080, 32768, 
528384, 421888, 447392, 417792, 593920, 468730, 422271, 446464, 
430080, 364544, 421888, 434717, 425820, 342737, 451475, 438272, 
434176, 438272, 28672, 425984, 425984, 466944, 450560, 438272, 
446464, 462848, 532480, 425984, 430080, 430080, 430080, 360448, 
438272, 430080, 421888, 499712, 430080, 423324, 126976, 458752, 
577536, 450560, 421888, 466944, 430080, 655360, 417792, 446464, 
581632, 552960, 425984, 425984, 430080, 421888, 438272, 446464, 
421888, 385024, 421888, 393216, 438272, 475136, 421888, 90112, 
352256, 516096, 438272, 430080, 468421, 417792, 393216, 524288, 
417792, 430080, 331776, 413696, 548634, 450560, 356352, 438272, 
430080, 458752, 421888, 413696, 430080, 503808, 425984, 430080, 
159744, 434176, 327680, 438272, 442368, 586945, 425984, 425984, 
430080, 425984, 393216, 425984, 425984, 360448, 421888, 438272, 
614400, 421888, 425984, 442368, 299008, 356352, 499741, 446464, 
417792, 237870, 417792, 441796, 466944, 454656, 413440, 446464, 
430080, 425984, 425984, 421888, 569344, 421888, 425984, 434176, 
356352, 421888, 434176, 430080, 430080, 417792, 438272, 331776, 
417792, 413696, 552960, 417792, 532480, 417792, 434176, 442368, 
425984, 425984, 438272, 421888, 425984, 454656, 421888, 331520, 
413696, 417792, 557312, 430080, 466944, 421888, 421888, 442368, 
421888, 417792, 425984, 540672, 420006, 425984, 409600, 430080, 
434176, 417792, 417792, 417792, 471040, 417792, 466944, 425984, 
378453, 381987, 425984, 544768, 729088, 446464, 421888, 529136, 
425984, 557056, 434176, 413696, 167936, 425984, 360448, 466944, 
417792, 667648, 446464, 409600, 425984, 438272, 53248, 532480, 
434176, 421888, 16384, 425984, 439270, 425984, 430080, 431489, 
398672, 442368, 421888, 430080, 417792, 430080, 460523, 331776, 
421888, 450560, 706684, 421888, 450560, 491520, 421888, 430080, 
548864, 741376, 430080, 45056, 421888, 507648, 471040, 417792, 
577792, 430080, 425984, 413696, 421888, 491971, 421888, 544768, 
339968, 212992, 425984, 442368, 417792, 358024, 430080, 417792, 
409600, 360448, 446464, 425984, 420348, 421888, 667275, 417792, 
454656, 450764, 434176, 474735, 491520, 430080, 466944, 98908, 
425984, 442368, 516096, 458752, 286720, 434176, 450257, 417792, 
450560, 425984, 421888, 434176, 286720, 425984, 483328, 430080, 
446464, 417792, 446464, 159744, 425984, 495616, 425984, 460118, 
413696, 407356, 630784, 499711, 425984, 479232, 446464, 434176, 
431936, 500160, 430080, 126976, 458752, 409600, 475136, 724992, 
380928, 425984, 458752, 430080, 421888, 417792, 286084, 451700, 
336060, 413696, 442368, 352256, 413696, 421888, 430080, 417792, 
438272, 401408, 665750, 417792, 479232, 376832, 442368, 401408, 
417792, 442368, 421888, 425984, 499712, 454656, 417792, 651264, 
401408, 417792, 81912, 614400, 479232, 479232, 438272, 487424, 
450560, 552960, 417792, 425783, 454656, 446464, 421428, 434176, 
401408, 352256, 483328, 442368, 446464, 415296, 425984, 430080, 
430080, 241664, 446464, 425984, 421888, 663552, 421888, 430080, 
434176, 487424, 421888, 491520, 417792, 442368, 180224, 475136, 
425984, 430080, 417792, 421888, 372736, 417792, 427554, 417792, 
684032, 483328, 548864, 57344, 622592, 442368, 446464, 335872, 
573440, 438272, 430080, 421888, 454656, 438272, 483328, 495616, 
425984, 487424, 421888, 450560, 446464, 446464, 425984, 499712, 
430080, 417792, 303104, 360448, 425984, 417792, 327680, 438272, 
442368, 552960, 417792, 385024, 421888, 430080, 643072, 528384, 
757760, 421888, 385024, 425984, 438272, 450560, 446464, 446464, 
528384, 532480, 479232, 425984, 417792, 610304, 413696, 438272, 
577536, 450560, 430080, 487424, 421888, 405504, 122880, 425984
)
> dput(num_row_groups)
c(5, 7, 7, 7, 5, 6, 6, 5, 8, 7, 7, 4, 3, 7, 7, 8, 6, 4, 7, 3, 
7, 3, 6, 4, 7, 6, 8, 6, 7, 7, 8, 8, 7, 2, 5, 3, 7, 7, 3, 7, 7, 
7, 7, 5, 7, 7, 4, 8, 4, 7, 7, 7, 4, 7, 5, 2, 7, 7, 6, 4, 5, 7, 
7, 7, 7, 7, 7, 8, 7, 5, 7, 3, 8, 5, 7, 6, 7, 7, 7, 7, 7, 7, 5, 
6, 6, 7, 3, 7, 7, 7, 6, 7, 7, 7, 4, 7, 5, 7, 1, 7, 4, 7, 6, 7, 
7, 7, 6, 4, 8, 6, 7, 4, 3, 4, 6, 7, 6, 4, 7, 7, 7, 7, 7, 4, 6, 
5, 7, 1, 6, 1, 7, 6, 7, 5, 4, 7, 7, 7, 5, 5, 7, 7, 7, 7, 7, 7, 
3, 7, 5, 5, 7, 7, 5, 7, 7, 7, 7, 7, 7, 7, 3, 3, 7, 7, 2, 7, 7, 
4, 7, 7, 7, 7, 7, 7, 7, 7, 1, 4, 7, 5, 3, 7, 7, 7, 7, 5, 6, 5, 
6, 6, 4, 7, 7, 7, 7, 7, 5, 6, 7, 7, 7, 7, 7, 4, 7, 6, 3, 7, 7, 
7, 6, 7, 5, 2, 7, 7, 6, 4, 7, 7, 7, 7, 7, 5, 7, 6, 3, 7, 7, 7, 
6, 5, 7, 3, 7, 2, 7, 7, 7, 2, 7, 7, 3, 6, 6, 7, 7, 6, 3, 7, 7, 
6, 2, 7, 7, 3, 7, 7, 7, 6, 7, 4, 4, 7, 1, 3, 6, 5, 7, 7, 7, 5, 
7, 7, 6, 4, 7, 7, 5, 7, 7, 8, 7, 7, 1, 4, 7, 7, 5, 7, 7, 5, 6, 
7, 4, 7, 7, 7, 4, 7, 5, 4, 6, 6, 7, 6, 6, 7, 7, 7, 5, 5, 7, 5, 
6, 7, 7, 4, 7, 7, 1, 7, 7, 7, 7, 7, 6, 5, 7, 7, 6, 7, 7, 4, 7, 
7, 6, 5, 7, 7, 3, 7, 7, 7, 6, 7, 8, 4, 6, 7, 4, 7, 7, 8, 7, 7, 
1, 7, 7, 7, 4, 7, 6, 5, 7, 7, 7, 7, 7, 7, 6, 7, 3, 5, 7, 7, 6, 
7, 7, 7, 7, 7, 7, 2, 7, 7, 5, 7, 7, 6, 6, 7, 7, 7, 7, 7, 7, 7, 
5, 4, 6, 5, 6, 7, 5, 7, 4, 7, 3, 6, 5, 6, 7, 7, 7, 7, 6, 7, 4, 
6, 7, 5, 4, 6, 6, 5, 7, 7, 7, 7, 7, 8, 7, 6, 7, 5, 7, 7, 5, 7, 
7, 7, 7, 7, 3, 5, 7, 7, 6, 7, 7, 7, 6, 7, 3, 7, 7, 7, 6, 7, 7, 
4, 7, 7, 7, 7, 7, 7, 7, 7, 3, 2, 7, 7, 3, 7, 6, 7, 7, 7, 7, 5, 
6, 4, 4, 7, 2, 6, 6, 3, 8, 7, 7, 1, 7, 7, 7, 6, 7, 7, 4, 7, 7, 
8, 7, 6, 4, 7, 7, 7, 7, 7, 6, 4, 7, 7, 5, 6, 7, 7, 5, 7, 7, 6, 
6, 3, 6, 7, 6, 6, 7, 7, 7, 4, 7, 7, 5, 7, 6, 7, 7, 7, 6, 7, 7, 
2, 7, 7, 7, 7, 7, 7, 2, 7, 7, 5, 5, 7, 7, 5, 7, 8, 7, 7, 4, 4, 
6, 6, 7, 4, 7, 7, 4, 7, 7, 5, 7, 6, 7, 7, 7, 3, 7, 6, 6, 6, 7, 
7, 7, 7, 7, 5, 2, 6, 5, 5, 6, 5, 7, 6, 7, 7, 7, 1, 3, 7, 6, 5, 
3, 6, 7, 5, 7, 7, 7, 6, 7, 2, 7, 7, 1, 7, 7, 7, 5, 7, 7, 6, 7, 
7, 4, 4, 7, 5, 3, 7, 7, 7, 7, 3, 4, 7, 7, 6, 7, 6, 6, 5, 7, 7, 
4, 6, 7, 7, 7, 5, 6, 7, 7, 3, 7, 7, 7, 7, 7, 7, 4, 4, 7, 3, 3, 
7, 7, 7, 7, 5, 6, 7, 3, 5, 7, 7, 6, 7, 7, 7, 4, 7, 7, 5, 7, 7, 
7, 6, 7, 7, 7, 7, 1, 7, 7, 6, 7, 4, 7, 2, 5, 7, 4, 3, 7, 7, 7, 
7, 7, 7, 7, 6, 5, 6, 1, 7, 6, 5, 7, 3, 7, 7, 5, 7, 7, 7, 7, 7, 
7, 7, 4, 1, 6, 5, 7, 7, 7, 7, 7, 1, 7, 6, 7, 7, 5, 5, 7, 7, 7, 
7, 4, 7, 7, 7, 7, 7, 7, 6, 5, 7, 7, 3, 7, 7, 7, 6, 4, 7, 7, 4, 
3, 7, 7, 6, 7, 7, 6, 5, 3, 7, 6, 6, 7, 7, 2, 7, 4, 7, 7, 7, 7, 
7, 6, 7, 7, 5, 7, 2, 7, 7, 5, 7, 7, 7, 6, 4, 7, 7, 4, 2, 7, 6, 
1, 3, 3, 5, 5, 7, 7, 7, 7, 7, 6, 7, 7, 3, 7, 7, 7, 7, 7, 7, 7, 
7, 7, 6, 7, 7, 7, 4, 6, 2, 5, 7, 3, 7, 7, 6, 1, 7, 4, 2, 6, 5, 
6, 6, 4, 7, 6, 6, 7, 3, 6, 7, 8, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 
7, 7, 7, 4, 7, 5, 4, 7, 3, 7, 6, 6, 7, 5, 3, 5, 4, 6, 7, 6, 6, 
7, 6, 6, 7, 1, 7, 7, 6, 7, 7, 7, 7, 7, 7, 7, 6, 7, 7, 4, 6, 4, 
6, 7, 3, 7, 7, 7, 7, 7, 7, 2, 5, 1, 7, 6, 5, 5, 6, 7, 7, 1, 6, 
7, 7, 7, 7, 7, 6, 7, 4, 7, 7, 7, 7, 7, 7, 7, 4, 7, 4, 3, 7, 4, 
7, 7, 5, 7, 2, 7, 1, 2, 1, 7, 1, 6, 6, 2, 7, 7, 1, 6, 6, 2, 7, 
7, 2, 7, 4, 1, 7, 7, 2, 7, 7, 1, 5, 4, 1, 4, 5, 1, 5, 7, 4, 7, 
6, 6, 4, 1, 7, 7, 7, 5, 6, 6, 7, 7, 7, 6, 7, 7, 7, 7, 7, 7, 7, 
7, 7, 7, 7, 6, 8, 6, 6, 2, 4, 6, 4, 4, 4, 7, 7, 6, 7, 5, 7, 5, 
2, 6, 6, 6, 7, 7, 7, 6, 5, 6, 7, 7, 6, 6, 7, 7, 7, 7, 7, 7, 7, 
7, 7, 6, 7, 7, 5, 4, 2, 4, 5, 4, 7, 7, 7, 6, 6, 8, 2, 2, 1, 5, 
7, 6, 7, 7, 7, 6, 7, 5, 7, 7, 7, 7, 7, 7, 6, 7, 7, 7, 5, 7, 7, 
7, 7, 4, 5, 4, 5, 5, 3, 8, 7, 7, 2, 2, 3, 4, 7, 7, 7, 7, 7, 7, 
7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 4, 2, 2, 3, 
5, 4, 6, 7, 7, 8, 6, 8, 1, 1, 2, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 
6, 7, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 2, 4, 4, 5, 6, 6, 7, 
7, 7, 7, 4, 3, 3, 7, 4, 5, 7, 7, 7, 7, 7, 6, 7, 7, 7, 5, 7, 7, 
7, 7, 7, 7, 7, 7, 7, 7, 7, 2, 5, 5, 5, 6, 3, 7, 7, 4, 7, 1, 6, 
2, 7, 1, 7, 7, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 6, 
7, 7, 7, 7, 5, 6, 5, 7, 2, 6, 7, 4, 7, 2, 6, 5, 5, 6, 2, 7, 3, 
6, 7, 6, 7, 7, 8, 7, 7, 7, 7, 7, 5, 7, 7, 7, 7, 7, 6, 7, 6, 4, 
7, 6, 5, 5, 7, 4, 2, 3, 4, 7, 7, 5, 4, 7, 7, 2, 7, 4, 6, 7, 7, 
7, 7, 7, 7, 7, 7, 7, 6, 7, 7, 7, 7, 7, 5, 7, 7, 5, 7, 6, 6, 6, 
5, 6, 5, 4, 7, 5, 7, 7, 7, 6, 7, 5, 7, 7, 7, 6, 7, 7, 7, 7, 7, 
7, 7, 7, 7, 5, 7, 7, 6, 7, 7, 7, 6, 7, 7, 7, 5, 5, 5, 5, 7, 4, 
4, 7, 8, 4, 7, 5, 1, 7, 1, 6, 7, 7, 6, 7, 6, 7, 7, 7, 7, 5, 7, 
7, 3, 6, 5, 7, 7, 7, 8, 7, 7, 3, 5, 5, 6, 5, 7, 5, 5, 6, 7, 6, 
1, 7, 1, 7, 7, 7, 7, 7, 6, 7, 7, 6, 7, 7, 5, 7, 4, 4, 7, 4, 7, 
6, 7, 6, 7, 5, 7, 2, 1, 5, 5, 6, 4, 7, 7, 5, 3, 7, 3, 7, 7, 7, 
7, 7, 6, 7, 6, 8, 7, 7, 7, 7, 7, 7, 7, 6, 7, 7, 4, 8, 7, 7, 7, 
5, 3, 2, 5, 6, 4, 7, 7, 7, 4, 7, 4, 6, 7, 2, 7, 6, 5, 7, 7, 6, 
7, 7, 7, 7, 7, 6, 7, 5, 7, 7, 4, 7, 7, 7, 7, 5, 5, 6, 4, 2, 5, 
7, 6, 4, 3, 6, 7, 6, 7, 7, 7, 6, 7, 7, 7, 7, 7, 7, 7, 6, 7, 7, 
6, 7, 7, 7, 5, 6, 7, 4, 7, 5, 8, 5, 7, 3, 5, 7, 6, 7, 1, 7, 1, 
7, 7, 6, 7, 6, 8, 7, 7, 7, 7, 7, 7, 7, 7, 4, 7, 7, 7, 7, 5, 7, 
7, 5, 6, 2, 6, 5, 4, 6, 6, 6, 7, 7, 6, 6, 1, 6, 7, 7, 7, 7, 7, 
7, 7, 7, 7, 7, 6, 7, 7, 7, 7, 7, 7, 7, 4, 6, 7, 6, 7, 4, 7, 5, 
4, 8, 6, 7, 1, 5, 5, 6, 8, 1, 7, 1, 7, 7, 7, 6, 7, 5, 7, 7, 6, 
7, 7, 6, 7, 7, 7, 7, 7, 7, 7, 4, 7, 6, 6, 7, 4, 7, 5, 6, 7, 6, 
6, 1, 7, 2, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 6, 7, 7, 7, 7, 7, 7, 
7, 7, 6, 7, 6, 6, 7, 5, 4, 2, 6, 7, 3, 7, 8, 7, 8, 1, 5, 1, 6, 
2, 7, 7, 3, 7, 7, 6, 7, 7, 7, 6, 6, 7, 6, 6, 7, 7, 7, 7, 7, 7, 
7, 7, 2, 7, 5, 5, 2, 6, 7, 5, 7, 7, 8, 6, 1, 3, 7, 7, 5, 7, 6, 
7, 7, 7, 7, 7, 6, 7, 7, 3, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 5, 5, 
4, 6, 6, 5, 7, 6, 8, 4, 1, 3, 6, 7, 7, 7, 7, 6, 7, 7, 7, 7, 7, 
7, 7, 7, 7, 7, 6, 7, 7, 7, 6, 6, 8, 7, 5, 5, 5, 5, 6, 5, 7, 8, 
3, 7, 1, 6, 1, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 
7, 7, 7, 7, 5, 5, 4, 7, 4, 4, 4, 5, 7, 5, 7, 6, 7, 5, 7, 7, 1, 
5, 4, 5, 7, 6, 7, 7, 7, 7, 7, 7, 6, 7, 7, 7, 7, 7, 7, 7, 7, 6, 
4, 7, 6, 6, 6, 2, 7, 5, 4, 5, 4, 7, 7, 6, 7, 7, 4, 2, 7, 7, 1, 
6, 7, 7, 7, 7, 7, 7, 7, 7, 6, 6, 7, 7, 7, 7, 2, 7, 7, 5, 7, 7, 
7, 2, 7, 4, 4, 2, 6, 7, 4, 7, 5, 7, 7, 7, 7, 7, 7, 1, 7, 7, 7, 
7, 7, 7, 8, 7, 7, 7, 5, 6, 3, 7, 7, 5, 7, 7, 7, 3, 7, 7, 3, 7, 
7, 7, 2, 7, 7, 4, 7, 4, 7, 6, 6, 6, 7, 7, 6, 6, 7, 7, 8, 6, 7, 
7, 7, 7, 4, 7, 7, 5, 6, 7, 6, 7, 7, 4, 5, 6, 7, 6, 6, 7, 6, 4, 
7, 1, 7, 6, 7, 3, 7, 7, 7, 6, 7, 7, 8, 7, 7, 7, 7, 7, 2, 7, 7, 
4, 7, 7, 7, 7, 7, 7, 5, 7, 1, 6, 5, 5, 7, 4, 6, 3, 7, 6, 7, 3, 
7, 7, 7, 7, 7, 7, 6, 7, 7, 7, 7, 7, 7, 6, 7, 6, 7, 7, 6, 5, 7, 
7, 5, 7, 2, 5, 4, 6, 7, 3, 7, 4, 7, 6, 6, 8, 6, 7, 1, 5, 7, 7, 
7, 7, 5, 5, 7, 7, 4, 7, 7, 4, 7, 7, 5, 7, 7, 7, 2, 7, 5, 5, 7, 
7, 7, 5, 7, 6, 2, 7, 1, 7, 6, 6, 7, 7, 7, 8, 7, 7, 4, 8, 6, 1, 
7, 7, 5, 2, 7, 7, 4, 7, 7, 7, 7, 6, 7, 2, 7, 3, 4, 7, 7, 7, 7, 
7, 7, 7, 7, 1, 4, 7, 6, 3, 4, 7, 7, 6, 6, 6, 6, 7, 6, 5, 7, 7, 
5, 7, 4, 4, 7, 7, 7, 7, 6, 3, 2, 7, 6, 6, 6, 7, 6, 6, 6, 5, 6, 
7, 3, 7, 6, 6, 7, 6, 7, 5, 7, 7, 4, 7, 7, 6, 5, 7, 6, 6, 6, 7, 
5, 7, 6, 6, 7, 1, 6, 6, 6, 7, 4, 7, 7, 7, 7, 7, 3, 6, 7, 7, 7, 
7, 7, 7, 7, 7, 7, 4, 7, 7, 6, 7, 5, 6, 7, 4, 2, 7, 7, 7, 7, 6, 
6, 2, 7, 6, 2, 4, 7, 7, 2, 7, 3, 7, 7, 3, 7, 7, 6, 7, 7, 7, 6, 
5, 7, 6, 4, 7, 7, 7, 7, 7, 4, 7, 7, 6, 7, 5, 4, 7, 6, 7, 4, 7, 
6, 4, 5, 7, 7, 7, 7, 7, 1, 7, 5, 7, 7, 5, 7, 5, 5, 6, 4, 7, 7, 
7, 6, 6, 1, 7, 3, 7, 7, 7, 7, 7, 7, 7, 7, 5, 7, 5, 5, 7, 4, 6, 
6, 7, 6, 7, 3, 5, 7, 6, 7, 5, 5, 3, 3, 7, 6, 4, 7, 7, 7, 7, 7, 
7, 7, 6, 7, 7, 7, 7, 7, 7, 7, 5, 7, 7, 4, 6, 7, 6, 2, 7, 1, 4, 
5, 7, 7, 4, 7, 6, 7, 6, 7, 3, 7, 7, 4, 7, 4, 7, 7, 5, 7, 7, 6, 
7, 7, 7, 7, 4, 7, 7, 4, 7, 7, 7, 6, 7, 4, 7, 5, 1, 6, 4, 6, 5, 
7, 7, 4, 3, 7, 6, 7, 7, 7, 6, 7, 7, 7, 7, 5, 6, 7, 4, 7, 7, 7, 
7, 2, 7, 6, 5, 7, 7, 7, 5, 5, 7, 7, 6, 4, 7, 7, 1, 7, 7, 4, 4, 
1, 6, 5, 7, 7, 7, 7, 7, 7, 7, 7, 3, 7, 7, 7, 7, 7, 7, 7, 6, 6, 
7, 4, 7, 7, 3, 7, 5, 7, 7, 7, 7, 6, 7, 7, 7, 6, 7, 2, 7, 4, 4, 
7, 7, 7, 2, 7, 7, 7, 7, 6, 7, 7, 1, 7, 7, 3, 7, 7, 7, 7, 7, 7, 
5, 7, 7, 6, 4, 7, 6, 5, 7, 1, 7, 7, 1, 7, 7, 7, 7, 4, 6, 5, 4, 
7, 1, 7, 7, 7, 3, 7, 7, 1, 7, 7, 7, 7, 6, 7, 7, 6, 7, 5, 7, 7, 
4, 7, 7, 7, 7, 7, 8, 4, 7, 1, 4, 7, 7, 7, 4, 7, 7, 5, 7, 6, 7, 
7, 7, 4, 6, 7, 7, 7, 1, 7, 7, 4, 7, 7, 4, 7, 4, 7, 7, 6, 7, 6, 
7, 7, 7, 8, 7, 6, 2, 6, 4, 7, 7, 7, 7, 4, 7, 7, 2, 4, 7, 7, 7, 
7, 7, 7, 7, 6, 7, 5, 7, 7, 7, 1, 6, 5, 6, 7, 5, 7, 6, 8, 7, 7, 
4, 6, 5, 7, 6, 4, 6, 6, 7, 7, 7, 8, 7, 7, 1, 7, 2, 7, 7, 4, 7, 
7, 7, 7, 6, 7, 7, 2, 7, 7, 4, 6, 7, 7, 5, 6, 7, 7, 7, 3, 7, 4, 
7, 7, 5, 7, 7, 7, 7, 7, 5, 7, 6, 7, 6, 7, 7, 7, 7, 6, 7, 2, 7, 
7, 5, 7, 5, 7, 5, 6, 7, 7, 7, 7, 7, 7, 7, 2, 6, 5, 4, 7, 6, 7, 
7, 7, 7, 7, 7, 4, 7, 7, 7, 7, 7, 6, 7, 7, 7, 7, 7, 7, 2, 6, 7, 
3, 4, 7, 7, 4, 7, 6, 7, 5, 3, 7, 2, 7, 7, 3, 6, 2, 7, 6, 1, 6, 
7, 7, 1, 7, 6, 5, 7, 7, 6, 6, 7, 7, 7, 7, 7, 2, 7, 7, 3, 7, 7, 
8, 7, 7, 2, 5, 7, 1, 6, 4, 7, 7, 3, 7, 7, 7, 6, 8, 5, 7, 3, 2, 
7, 7, 7, 6, 7, 6, 5, 6, 7, 7, 7, 7, 4, 7, 7, 3, 7, 7, 8, 7, 7, 
2, 7, 7, 7, 4, 2, 7, 4, 3, 7, 7, 7, 7, 4, 7, 6, 6, 7, 7, 7, 2, 
7, 8, 3, 7, 7, 7, 4, 7, 7, 6, 4, 7, 6, 8, 7, 2, 7, 7, 7, 5, 3, 
7, 7, 7, 7, 7, 2, 7, 5, 3, 5, 6, 7, 7, 7, 7, 7, 5, 5, 7, 6, 3, 
4, 4, 6, 3, 7, 7, 8, 7, 7, 1, 4, 6, 1, 4, 7, 7, 4, 7, 7, 8, 6, 
7, 3, 4, 6, 7, 3, 6, 8, 7, 7, 7, 6, 7, 7, 2, 7, 7, 5, 4, 7, 7, 
5, 8, 7, 8, 7, 7, 3, 4, 7, 7, 3, 7, 5, 7, 7, 5, 6, 7, 8, 1, 4, 
7, 4, 3, 6, 7, 7, 7, 7, 4, 6, 7, 6, 4, 7, 7, 3, 6, 5, 8, 6, 7, 
4, 6, 7, 6, 2, 6, 6, 4, 7, 6, 6, 7, 7, 7, 5, 7, 3, 5, 6, 6, 7, 
7, 2, 5, 7, 7, 7, 5, 7, 7, 5, 7, 7, 7, 7, 7, 2, 7)

Hope these could help :)

chenyiwrites avatar Mar 01 '24 09:03 chenyiwrites

I'm getting the same error message when trying to count or run distinct() on a 1B+ row parquet dataset. Adding to_duckdb() fixed it. I also can't share the dataset now, but here is the metadata requested.

> ds$schema
Schema
cvr_id: double
precinct: string
pres: string
pid: string
column: double
item: string
choice: string
choice_id: double
office_type: string
dist: string
party: string
incumbent: double
measure: double
place: string
topic: string
unexp_term: double
num_votes: double
state: string
county: string

See $metadata for additional Schema metadata
> 
> n_rows = vapply(ds$files, function(f) { ParquetFileReader$create(f)$num_rows }, 0, USE.NAMES=FALSE)
> n_rowgrps = vapply(ds$files, function(f) { ParquetFileReader$create(f)$num_row_groups }, 0, USE.NAMES=FALSE)
> summary(n_rows); sum(n_rows)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
     2588    177040    500624   2773980   1776367 140958860 
[1] 1137331911
> summary(n_rowgrps); sum(n_rowgrps)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.00    6.00   16.00   85.16   54.75 4302.00 
[1] 34916
> packageVersion("arrow")
[1] ‘17.0.0.1’

kuriwaki avatar Oct 17 '24 13:10 kuriwaki

I'm seeing the same issue when filtering a ~400M row dataset to remove rows where a column is duplicated. I'm running R 4.4.1 with Arrow 17.0.0.1 on a Mac. Is there another way to do this within Arrow that I'm missing? Here's the code that produces the error:

  group_by(timestamp) |> 
  mutate(duplicate = n()) |> 
  filter(duplicate == 1) |> 
  ungroup()

Converting with to_duckdb() gets around this error, but the query then fails for lack of memory. One workaround is to let DuckDB work on disk, but that takes longer.

blongworth avatar Oct 25 '24 13:10 blongworth

This issue is definitely related to data size. Breaking it up into groups smaller than 170M rows works with my data.

Here's a working duckDB solution:

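library(DBI)     # dbConnect()
library(duckdb)  # duckdb()
library(arrow)   # to_duckdb(), to_arrow()
library(dplyr)

# On-disk DuckDB database, so intermediate results can spill to disk
con <- dbConnect(duckdb(), dbdir = "my-db.duckdb", read_only = FALSE)

ds_filt <- ds_filt |> 
  to_duckdb(con = con) |> 
  group_by(timestamp) |> 
  mutate(duplicate = n()) |>   # number of rows sharing each timestamp
  filter(duplicate == 1) |>    # keep only timestamps that occur once
  ungroup() |> 
  to_arrow()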

duckDB needs an on-disk store to finish this, at least on my 32GB mac.
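
A note on the number in the error, offered as an assumption rather than a confirmed diagnosis: -2147483584 is exactly the value you get when 2^31 + 64 is stored in a signed 32-bit integer. That would fit a buffer size that grew just past the 2 GiB limit of that type. The Python sketch below only does the arithmetic; `unwrap_int32` is a hypothetical helper, not an Arrow function.

```python
# Assumption: the size in the error message is a byte count that overflowed
# a signed 32-bit integer. This is a guess, not a confirmed diagnosis.
INT32_MAX = 2**31 - 1


def unwrap_int32(reported: int) -> int:
    """Return the non-negative value that wraps to `reported` in 32-bit two's complement."""
    return reported % 2**32


requested = unwrap_int32(-2147483584)
print(requested)              # 2147483712
print(requested - 2**31)      # 64
print(requested > INT32_MAX)  # True
```

If this reading is right, it would be consistent with the error showing up only above a certain data size.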

blongworth avatar Oct 25 '24 15:10 blongworth

@blongworth Since DuckDB can now read Parquet datasets natively, the arrow package may not be needed here. Would it work to run duckdb on its own, without arrow? https://duckdb.org/docs/api/r#dbplyr

eitsupi avatar Oct 26 '24 03:10 eitsupi

@blongworth are you able to share your data? And, just to be clear, you get the same type of error as the OP? "! Invalid: Negative buffer resize: -2147483584"

amoeba avatar Oct 26 '24 04:10 amoeba

Here's the full error, so same as OP:

Error in `compute.arrow_dplyr_query()`:
! Invalid: Negative buffer resize: -2147483584
Backtrace:
 1. dplyr::collect(...)
 2. arrow:::collect.arrow_dplyr_query(...)
 3. arrow:::compute.arrow_dplyr_query(x)

Here's some info about the data:

> nrow(ds)
[1] 400909276
> schema(ds)
Schema
timestamp: timestamp[us, tz=UTC]
pressure: double
u: double
v: double
w: double
amp1: int32
amp2: int32
amp3: int32
corr1: int32
corr2: int32
corr3: int32
temp: double
DO_percent: double
ph_counts: int32
ox_umol_l: double
pH: double
duplicate: int32
pH_cal: double
ox_umol_l_cal: double
year: int32
month: int32

blongworth avatar Oct 28 '24 11:10 blongworth

Hey all, ran into the same issue, and I actually can share the dataset :) 500M rows, spread across a bunch of gzipped CSV files, approx 5.x GB. Where do I put them?

Query:


ds_multi_follows_final = open_dataset("./multi_follows_final",
                                      format="csv",
                                      schema = schema(
                                        did=arrow::utf8(),
                                        multi_follow_id=arrow::uint64(),
                                        follow_created_at=arrow::utf8(),
                                        follow_subject=arrow::utf8(),
                                        sp_list_uri=arrow::utf8(),
                                        match_score=arrow::float64()
                                      ),
                                      skip=1)

ds_multi_follows_final %>%
  group_by(did,follow_subject) %>%
  summarize() %>%
  collect() %>%
  write_csv("all_multifollow_edges.csv.gz")

Backtrace:

> rlang::last_trace(drop=FALSE)
<error/rlang_error>
Error in `compute.arrow_dplyr_query()`:
! Invalid: Negative buffer resize: -2147483584
---
Backtrace:
     ▆
  1. ├─... %>% write_csv("all_multifollow_edges.csv.gz")
  2. ├─readr::write_csv(., "all_multifollow_edges.csv.gz")
  3. │ └─readr::write_delim(...)
  4. │   ├─base::stopifnot(is.data.frame(x))
  5. │   └─base::is.data.frame(x)
  6. ├─dplyr::collect(.)
  7. └─arrow:::collect.arrow_dplyr_query(.)
  8.   └─arrow:::compute.arrow_dplyr_query(x)
  9.     └─base::tryCatch(...)
 10.       └─base (local) tryCatchList(expr, classes, parentenv, handlers)
 11.         └─base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
 12.           └─value[[3L]](cond)
 13.             └─arrow:::augment_io_error_msg(e, call, schema = schema())
 14.               └─rlang::abort(msg, call = call)

Session info:

> sessionInfo()
R version 4.4.2 (2024-10-31)
Platform: x86_64-pc-linux-gnu
Running under: Ubuntu 22.04.5 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8       
 [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
 [1] tidyjson_0.3.2  scales_1.3.0    stringr_1.5.1   readr_2.1.5     xtable_1.8-4    forcats_1.0.0  
 [7] lubridate_1.9.4 tidyr_1.3.1     ggplot2_3.5.1   arrow_18.1.0    pracma_2.4.4    dplyr_1.1.4    

loaded via a namespace (and not attached):
 [1] bit_4.5.0.1      jsonlite_1.8.9   gtable_0.3.6     crayon_1.5.3     compiler_4.4.2   renv_1.0.11     
 [7] tidyselect_1.2.1 parallel_4.4.2   assertthat_0.2.1 R6_2.5.1         labeling_0.4.3   generics_0.1.3  
[13] tibble_3.2.1     munsell_0.5.1    pillar_1.10.1    tzdb_0.4.0       rlang_1.1.4      utf8_1.2.4      
[19] stringi_1.8.4    bit64_4.5.2      timechange_0.3.0 cli_3.6.3        withr_3.0.2      magrittr_2.0.3  
[25] grid_4.4.2       vroom_1.6.5      hms_1.1.3        lifecycle_1.0.4  vctrs_0.6.5      glue_1.8.0      
[31] farver_2.1.2     colorspace_2.1-1 purrr_1.0.2      tools_4.4.2      pkgconfig_2.0.3 

I compiled/installed arrow with renv like so:

Sys.setenv(ARROW_WITH_ZLIB="ON")
Sys.setenv("LIBARROW_MINIMAL" = FALSE)
Sys.setenv("LIBARROW_BINARY" = FALSE)
Sys.setenv("ARROW_R_DEV" = TRUE)
Sys.setenv(MAKEFLAGS = sprintf("-j%d", parallel::detectCores()))
options(renv.config.pak.enabled = TRUE)
install.packages(c("dplyr","pracma","arrow","ggplot2","tidyr","lubridate","forcats","xtable","readr","stringr","scales","tidyjson"))

mrd0ll4r avatar Jan 08 '25 15:01 mrd0ll4r

That's great, thanks for the info @mrd0ll4r. You can try uploading to my Dropbox at https://www.dropbox.com/request/dR1ACeYDzZvj9b5Qsjbn. I can move them somewhere else after.

amoeba avatar Jan 14 '25 17:01 amoeba

Oh, that returns "This file request has been closed, deleted, or never existed".

mrd0ll4r avatar Jan 14 '25 18:01 mrd0ll4r

Whoops, can you try https://www.dropbox.com/request/oAfPalVlE9utSmVMJJMj?

amoeba avatar Jan 14 '25 19:01 amoeba

Got it, thanks so much. I'm able to reproduce the issue.

amoeba avatar Jan 14 '25 21:01 amoeba

Hi @zanmato1984, I initially thought I'd have time to look into this but haven't yet. Do you have any interest in taking a look at this one? I can send you a link to the files if so.

amoeba avatar Feb 06 '25 03:02 amoeba

Hi @amoeba, sorry for the late reply, I was on vacation for the last few days. I can take a look, but I may have trouble reproducing this in R because I have hardly developed anything in R. Do you think it would be possible to create an equivalent C++ or Python reproduction? Or do you have a link for setting up a local R environment? Thanks.

zanmato1984 avatar Feb 09 '25 08:02 zanmato1984

Oh hi @amoeba, one more thing worth checking: which arrow version did you use to reproduce? If it's not 19.0.0, could you try that? I recall a fix between 18.1.0 and 19.0.0 that addressed a similar issue. Thanks.

zanmato1984 avatar Feb 10 '25 10:02 zanmato1984

Hi @zanmato1984, the hardest part will probably be finding a minimal dataset that triggers the issue. Once we have one, reproducing in C++ or Python would help isolate it. LMK if I can help with setting up or testing in R.

For my data, I've whittled it down to the summarize step that counts elements in each group:

dsd <- ds |> 
  group_by(timestamp) |>
  summarize(n = n()) |>
  collect() 

I'm not sure whether it's the summarizing or the counting. I still see the problem in arrow 19.0.0.

blongworth avatar Feb 10 '25 15:02 blongworth

Hi @blongworth, thanks for the information. Seeing it in 19.0.0 rules out my earlier guess that an existing fix covers this (which is helpful to know!). I think I'll just wait for @amoeba's data and try to reproduce it locally.

zanmato1984 avatar Feb 10 '25 15:02 zanmato1984

Similar issue, but in Python with pyarrow 20.0.0. Works fine with 30M fewer rows.

>>> mydf.shape
(338440930, 3)
>>> mydf.dtypes                                                                    
sample     category
peptide    category
N             Int32
dtype: object
>>> mydf.to_parquet('pep_all_pandas.parquet', index=False, row_group_size=8192*8192)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/protobios/gmm-pipeline/.venv/lib/python3.12/site-packages/pandas/util/_decorators.py", line 333, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/protobios/gmm-pipeline/.venv/lib/python3.12/site-packages/pandas/core/frame.py", line 3113, in to_parquet
    return to_parquet(
           ^^^^^^^^^^^
  File "/opt/protobios/gmm-pipeline/.venv/lib/python3.12/site-packages/pandas/io/parquet.py", line 480, in to_parquet
    impl.write(
  File "/opt/protobios/gmm-pipeline/.venv/lib/python3.12/site-packages/pandas/io/parquet.py", line 228, in write
    self.api.parquet.write_table(
  File "/opt/protobios/gmm-pipeline/.venv/lib/python3.12/site-packages/pyarrow/parquet/core.py", line 1909, in write_table
    writer.write_table(table, row_group_size=row_group_size)
  File "/opt/protobios/gmm-pipeline/.venv/lib/python3.12/site-packages/pyarrow/parquet/core.py", line 1115, in write_table
    self.writer.write_table(table, row_group_size=row_group_size)
  File "pyarrow/_parquet.pyx", line 2226, in pyarrow._parquet.ParquetWriter.write_table
  File "pyarrow/error.pxi", line 92, in pyarrow.lib.check_status
pyarrow.lib.ArrowInvalid: Negative buffer resize: -2008352576
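
For readers unfamiliar with `row_group_size`: in pyarrow's `write_table` it is the maximum number of rows per row group, so the call above asks for groups of up to 8192*8192 = 67,108,864 rows. The sketch below works out how the 338,440,930 rows would be split. `row_group_layout` is a hypothetical helper for illustration and does not call pyarrow.

```python
def row_group_layout(n_rows: int, row_group_size: int) -> tuple[int, int]:
    """Return (number of row groups, rows in the last group)."""
    if n_rows == 0:
        return 0, 0
    n_groups = -(-n_rows // row_group_size)  # integer ceiling division
    last = n_rows - (n_groups - 1) * row_group_size
    return n_groups, last


n_groups, last = row_group_layout(338_440_930, 8192 * 8192)
print(n_groups, last)  # 6 2896610
```

Whether a smaller `row_group_size` avoids the error is not established in this thread.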

adlerpriit avatar May 26 '25 14:05 adlerpriit