Implement `data_arrange()`

Open etiennebacher opened this issue 2 years ago • 7 comments

Closes #193 (see the issue for the discussion of this function's design).

library(datawizard)

# for comparison
head(mtcars)
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

data_arrange(head(mtcars), "carb")
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
data_arrange(head(mtcars), "gear", "carb")
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
data_arrange(head(mtcars), "-carb")
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
data_arrange(head(mtcars), "-gear", "carb")
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
data_arrange(head(mtcars), "foo")
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
data_arrange(head(mtcars), "foo", safe = FALSE)
#> Error: The following column(s) don't exist in the dataset: foo.

Created on 2022-07-06 by the reprex package (v2.0.1)

etiennebacher avatar Jul 06 '22 12:07 etiennebacher

Codecov Report

Merging #195 (d8b7b8a) into main (2ef538c) will increase coverage by 0.11%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #195      +/-   ##
==========================================
+ Coverage   84.45%   84.56%   +0.11%     
==========================================
  Files          52       53       +1     
  Lines        3402     3427      +25     
==========================================
+ Hits         2873     2898      +25     
  Misses        529      529              

Impacted Files      Coverage Δ
R/data_arrange.R    100.00% <100.00%> (ø)

codecov-commenter avatar Jul 06 '22 12:07 codecov-commenter

I added `data_arrange2()` as an alternative with a different syntax, using `select` etc.:

library(datawizard)

x1 <- data_arrange(head(mtcars), "gear", "carb")
x2 <- data_arrange2(head(mtcars), select = c("gear", "carb"))
identical(x1, x2)
#> [1] TRUE

x1 <- data_arrange(head(mtcars), "-carb")
x2 <- data_arrange2(head(mtcars), select = "carb", descending = "carb")
identical(x1, x2)
#> [1] TRUE

x1 <- data_arrange(head(mtcars), "gear", "-carb")
x2 <- data_arrange2(head(mtcars), select = c("gear", "carb"), descending = "carb")
identical(x1, x2)
#> [1] TRUE

x1 <- data_arrange(head(mtcars), "gear", "-carb", "am")
x2 <- data_arrange2(head(mtcars), select = c("gear", "carb", "am"), descending = "carb")
identical(x1, x2)
#> [1] TRUE

x1 <- data_arrange(head(mtcars), "gear", "-carb", "-am")
x2 <- data_arrange2(head(mtcars), select = c("gear", "carb", "am"), ascending = "gear")
identical(x1, x2)
#> [1] TRUE

Created on 2022-07-06 by the reprex package (v2.0.1)

strengejacke avatar Jul 06 '22 12:07 strengejacke

@etiennebacher Can you please resolve the merge conflict here?

IndrajeetPatil avatar Jul 25 '22 18:07 IndrajeetPatil

> I added `data_arrange2()` as an alternative with a different syntax, using `select` etc.:

Given that the main function uses the dots, maybe we could just combine both syntaxes into one function?

What do you think?
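
To make the suggestion concrete, here is a minimal sketch of how one function could accept both calling styles. `collect_sort_keys()` is a hypothetical helper written only for illustration (it is not part of datawizard or of this PR); it gathers quoted column names from the dots and from a `select` argument into a single character vector.

```r
# Hypothetical helper (an assumption, not datawizard code): accept column names
# either through the dots or through `select` and return one character vector.
collect_sort_keys <- function(..., select = NULL) {
  # Names passed as separate arguments, e.g. f("-gear", "carb")
  from_dots <- unlist(list(...), use.names = FALSE)
  # Names passed as one vector, e.g. f(select = c("-gear", "carb"))
  keys <- c(from_dots, select)
  if (!is.null(keys) && !is.character(keys)) {
    stop("Column names must be quoted character strings.", call. = FALSE)
  }
  keys
}

# Both calling styles yield the same sort keys
identical(
  collect_sort_keys("-gear", "carb"),
  collect_sort_keys(select = c("-gear", "carb"))
)
```

With something like this, the sorting code would only ever see one vector of keys, whichever syntax the user prefers.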

bwiernik avatar Jul 25 '22 18:07 bwiernik

@strengejacke Should this be part of the next release?

IndrajeetPatil avatar Aug 07 '22 09:08 IndrajeetPatil

I'd say yes. However, we haven't made a final decision on the function design yet. I think we can use the quoted-variable approach from this PR, but instead of dots we could have a `select` argument, which would behave differently from the `select` argument in other functions. I don't think that's a problem as long as we document it properly.

So instead of

data_arrange(head(mtcars), "-gear", "carb")

we'd have

data_arrange(head(mtcars), select = c("-gear", "carb"))
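
For illustration only, the quoted-variable approach with a "-" prefix could be handled roughly as in the sketch below, using base R alone. `arrange_sketch()` is a hypothetical name, and this is an assumption about one possible approach rather than the code in this PR; the `safe` argument and the error message simply mirror the behaviour shown in the reprex.

```r
# Hypothetical sketch, not the datawizard implementation: sort a data frame by
# quoted column names, where a leading "-" requests descending order.
arrange_sketch <- function(data, select, safe = TRUE) {
  descending <- startsWith(select, "-")
  columns <- sub("^-", "", select)

  missing_columns <- setdiff(columns, colnames(data))
  if (length(missing_columns) > 0) {
    if (!safe) {
      stop(
        "The following column(s) don't exist in the dataset: ",
        paste(missing_columns, collapse = ", "), ".",
        call. = FALSE
      )
    }
    # With safe = TRUE, unknown columns are silently dropped
    keep <- columns %in% colnames(data)
    columns <- columns[keep]
    descending <- descending[keep]
  }
  if (length(columns) == 0) {
    return(data)
  }

  # xtfrm() maps each column to a numeric sort key, so negating it reverses
  # the order for that column only
  keys <- lapply(seq_along(columns), function(i) {
    key <- xtfrm(data[[columns[i]]])
    if (descending[i]) -key else key
  })
  # order() leaves ties in their original order; row names are preserved
  data[do.call(order, keys), , drop = FALSE]
}

arrange_sketch(head(mtcars), select = c("-gear", "carb"))
```

Because `order()` keeps ties in their original order, the call above should give the same row order as the `"-gear", "carb"` example in the first comment.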

strengejacke avatar Aug 07 '22 10:08 strengejacke

Okay, I think we don't need to make a decision in a hurry. We can make this part of the next release.

I do want to get datawizard out ASAP, though, since other updates depend on it.

IndrajeetPatil avatar Aug 07 '22 13:08 IndrajeetPatil

@etiennebacher What do you think? See my comment https://github.com/easystats/datawizard/pull/195/#issuecomment-1207376045

strengejacke avatar Aug 21 '22 07:08 strengejacke

Can we merge this? I think we should go with the design that @etiennebacher suggested first; for consistency, I just used a `select` argument instead of dots.

library(datawizard)

# for comparison
head(mtcars)
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

data_arrange(head(mtcars), "carb")
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
data_arrange(head(mtcars), c("gear", "carb"))
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
data_arrange(head(mtcars), "-carb")
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
data_arrange(head(mtcars), c("-gear", "carb"))
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
data_arrange(head(mtcars), "foo")
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
data_arrange(head(mtcars), "foo", safe = FALSE)
#> Error: The following column(s) don't exist in the dataset: foo.

Created on 2022-08-24 by the reprex package (v2.0.1)

strengejacke avatar Aug 24 '22 06:08 strengejacke

> Can we merge this? I think we should go with the design that @etiennebacher suggested first; for consistency, I just used a `select` argument instead of dots.

I'm OK with this, I'll merge this PR.

etiennebacher avatar Aug 24 '22 12:08 etiennebacher