frictionless-r
`add_resource(replace = TRUE)` fails silently if resource does not exist
Hi,
Is this desired? I would expect that replace = TRUE should not prevent the addition of a new (not-yet-existing) resource.
library(frictionless)
package <- example_package()
# Create a data frame
df <- data.frame(
multimedia_id = c(
"aed5fa71-3ed4-4284-a6ba-3550d1a4de8d",
"da81a501-8236-4cbd-aa95-4bc4b10a05df"
),
x = c(718, 748),
y = c(860, 900)
)
# Add the resource "positions" from the data frame
add_resource(package, "positions", data = df)
#> A Data Package with 4 resources:
#> • deployments
#> • observations
#> • media
#> • positions
#> Use `unclass()` to print the Data Package as a list.
add_resource(package, "positions", data = df, replace = TRUE)
#> A Data Package with 3 resources:
#> • deployments
#> • observations
#> • media
#> Use `unclass()` to print the Data Package as a list.
I didn't look at the function, so I'm not sure why this is happening.
You didn't assign your first add_resource() to package (it is not updated in place). So the following works:
library(frictionless)
package <- example_package()
# Create a data frame
df <- data.frame(
multimedia_id = c(
"aed5fa71-3ed4-4284-a6ba-3550d1a4de8d",
"da81a501-8236-4cbd-aa95-4bc4b10a05df"
),
x = c(718, 748),
y = c(860, 900)
)
package <- add_resource(package, "positions", data = df)
add_resource(package, "positions", data = df, replace = TRUE)
#> A Data Package with 4 resources:
#> • deployments
#> • observations
#> • media
#> • positions
#> Use `unclass()` to print the Data Package as a list.
Created on 2024-08-28 with reprex v2.1.0
However - and I'm not sure that this is what you wanted to point out - if you replace a resource that is not there, then nothing happens. It's probably better if:
- add_resource() just adds the resource, or alternatively,
- add_resource() returns an error to say it can't find the resource
I prefer the latter.
Yes, that's what I wanted to show (it works without replace = TRUE but fails with it).
OK, I personally prefer the traditional overwrite = FALSE default, where setting it to TRUE allows the modification even if the element does not exist. But it's up to you!
Changed my mind, I agree that overwrite = TRUE should mainly be used to avoid overwriting resources by accident. It can silently add the resource if it isn't there yet.
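In the meantime, a small wrapper could give the "add if missing, replace if present" behaviour regardless of how replace = TRUE treats a missing resource. This is only a sketch: add_or_replace_resource() is a hypothetical helper, not part of frictionless, and it assumes resource names can be read from package$resources as in the Data Package structure.

```r
library(frictionless)

# Hypothetical helper, not part of frictionless: add the resource if it is
# missing, replace it if it already exists.
add_or_replace_resource <- function(package, resource_name, data, ...) {
  # Read the existing resource names straight from the package list
  existing <- vapply(package$resources, function(r) r$name, character(1))
  add_resource(
    package,
    resource_name,
    data = data,
    # Only ask for a replacement when there is something to replace
    replace = resource_name %in% existing,
    ...
  )
}

package <- example_package()
df <- data.frame(x = c(718, 748), y = c(860, 900))

# The first call adds "positions", the second call replaces it
package <- add_or_replace_resource(package, "positions", df)
package <- add_or_replace_resource(package, "positions", df[1, ])
```

Note that the result is assigned back to package each time, since add_resource() does not update the package in place.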
What about the case where you might want to literally add data to an existing resource (i.e., with bind_rows())? Should this be part of the scope of this function?
@Rafnuss No, data manipulation should be done outside of add_resource():
# 1. Read
df <- read_resource(x, "my_data")
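# 2. Manipulate df, e.g. with filter(), mutate(), bind_rows()
updated_df <- df # placeholder: apply your own manipulation here
# 3. Attach (pass the package as first argument and assign the result)
x <- add_resource(x, "my_data", data = updated_df, replace = TRUE)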
That gives users full flexibility on how to manipulate the data (with e.g. dplyr). A simple schema will be created (again) when (re)adding the resource or - if they want more control - they can provide a schema to add_resource() with their data (using the schema parameter).
Ok, makes sense. I guess the functionality I was thinking of is more precisely what bind_rows() does: just adding data, without modifying the existing data. Maybe another function, bind_resources(), but I'm not sure it would be useful in many cases.
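For what it's worth, the read / manipulate / attach steps above can already be wrapped in a few lines, so such a function may not need to live in the package. A minimal sketch, assuming the new rows have the same columns as the existing resource; append_to_resource() is a hypothetical name, and the schema is simply recreated from the combined data.

```r
library(frictionless)

# Hypothetical helper, not part of frictionless: append rows to an
# existing resource by reading it, binding the rows and re-attaching it.
append_to_resource <- function(package, resource_name, new_rows) {
  # 1. Read the current data
  current <- read_resource(package, resource_name)
  # 2. Manipulate: append the new rows
  combined <- dplyr::bind_rows(current, new_rows)
  # 3. Attach: replace the resource (a new schema is created from the data)
  add_resource(package, resource_name, data = combined, replace = TRUE)
}

package <- example_package()
df <- data.frame(x = c(718, 748), y = c(860, 900))
package <- add_resource(package, "positions", data = df)

# Append one more row (here simply a copy of the first row, for illustration)
package <- append_to_resource(package, "positions", df[1, ])
```

Anything beyond a plain append (deduplicating, recoding, keeping a custom schema) would still be done explicitly between the read and attach steps.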