frictionless-r

`add_resource(replace = TRUE)` fails silently if resource does not exist

Open Rafnuss opened this issue 1 year ago • 6 comments

Hi, is this desired? I would expect that replace = TRUE would not prevent adding a new (not-yet-existing) resource.

library(frictionless)

package <- example_package()

# Create a data frame
df <- data.frame(
  multimedia_id = c(
    "aed5fa71-3ed4-4284-a6ba-3550d1a4de8d",
    "da81a501-8236-4cbd-aa95-4bc4b10a05df"
  ),
  x = c(718, 748),
  y = c(860, 900)
)

# Add the resource "positions" from the data frame
add_resource(package, "positions", data = df)
#> A Data Package with 4 resources:
#> • deployments
#> • observations
#> • media
#> • positions
#> Use `unclass()` to print the Data Package as a list.

add_resource(package, "positions", data = df, replace = TRUE)
#> A Data Package with 3 resources:
#> • deployments
#> • observations
#> • media
#> Use `unclass()` to print the Data Package as a list.

I didn't look at the function, so not sure why this is happening.

Rafnuss avatar Aug 28 '24 12:08 Rafnuss

You didn't assign your first add_resource() to package (it is not updated in place). So the following works:

library(frictionless)
package <- example_package()

# Create a data frame
df <- data.frame(
  multimedia_id = c(
    "aed5fa71-3ed4-4284-a6ba-3550d1a4de8d",
    "da81a501-8236-4cbd-aa95-4bc4b10a05df"
  ),
  x = c(718, 748),
  y = c(860, 900)
)

package <- add_resource(package, "positions", data = df)

add_resource(package, "positions", data = df, replace = TRUE)
#> A Data Package with 4 resources:
#> • deployments
#> • observations
#> • media
#> • positions
#> Use `unclass()` to print the Data Package as a list.

Created on 2024-08-28 with reprex v2.1.0

However - and I'm not sure that this is what you wanted to point out - if you replace a resource that is not there, then nothing happens. It's probably better if:

  1. add_resource() just adds the resource then
  2. Or alternatively, add_resource() returns an error to say it can't find the resource

I prefer the latter.

peterdesmet avatar Aug 28 '24 13:08 peterdesmet
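
For illustration only, the two options listed above can be modelled on a plain named list instead of a real Data Package. `put_resource()` and its `if_absent` argument are hypothetical names invented for this sketch; they are not part of frictionless and do not reflect how `add_resource()` is implemented.

```r
# Hypothetical model: a "package" is just a named list of resources.
put_resource <- function(resources, name, value, replace = FALSE,
                         if_absent = c("add", "error")) {
  if_absent <- match.arg(if_absent)
  present <- name %in% names(resources)

  # Guard against overwriting an existing resource by accident
  if (present && !replace) {
    stop("Resource `", name, "` already exists. Use replace = TRUE.")
  }
  # Option 2: refuse to "replace" a resource that is not there
  if (!present && replace && if_absent == "error") {
    stop("Can't find resource `", name, "` to replace.")
  }
  # Option 1 (and the normal add/replace path): set the resource
  resources[[name]] <- value
  resources
}

pkg <- list(deployments = 1, observations = 2, media = 3)

# Option 1: replace = TRUE adds the missing resource
opt1 <- put_resource(pkg, "positions", 4, replace = TRUE, if_absent = "add")

# Option 2: replace = TRUE on a missing resource is an error
opt2 <- tryCatch(
  put_resource(pkg, "positions", 4, replace = TRUE, if_absent = "error"),
  error = function(e) conditionMessage(e)
)
```

Either choice avoids the silent no-op reported in this issue; they differ only in whether a missing resource is treated as a mistake.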

Yes, that's what I wanted to show (it works without replace = TRUE but fails with it).

Ok, I personally prefer the traditional overwrite=FALSE, which allows the modification even if the element does not exist. But up to you!

Rafnuss avatar Aug 28 '24 13:08 Rafnuss

Changed my mind, I agree that overwrite = TRUE should mainly be used to avoid overwriting resources by accident. It can silently add the resource if it isn't there yet.

peterdesmet avatar Aug 29 '24 11:08 peterdesmet

What about the case where you want to literally add data to an existing resource (i.e., with bind_rows())? Should this be within the scope of this function?

Rafnuss avatar Jan 13 '25 12:01 Rafnuss

@Rafnuss No, data manipulation should be done outside of add_resource():

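# 1. Read
df <- read_resource(x, "my_data")

# 2. Manipulate df, e.g. with dplyr's filter(), mutate() or bind_rows()
updated_df <- df

# 3. Attach (the package is the first argument; assign the result back)
x <- add_resource(x, "my_data", data = updated_df, replace = TRUE)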

That gives users full flexibility on how to manipulate the data (with e.g. dplyr). A simple schema will be created (again) when (re)adding the resource or - if they want more control - they can provide a schema to add_resource() with their data (using the schema parameter).

peterdesmet avatar Jan 29 '25 15:01 peterdesmet
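
A minimal sketch of the schema route mentioned above, assuming `create_schema()` from frictionless is used to generate a starting schema that is then edited by hand. The data frame is the `positions` example from the start of this thread; the description text is made up for the example.

```r
library(frictionless)

package <- example_package()

df <- data.frame(
  multimedia_id = c(
    "aed5fa71-3ed4-4284-a6ba-3550d1a4de8d",
    "da81a501-8236-4cbd-aa95-4bc4b10a05df"
  ),
  x = c(718, 748),
  y = c(860, 900)
)

# First add: a simple schema is created from the data frame
package <- add_resource(package, "positions", data = df)

# More control: start from the generated schema and adjust it
schema <- create_schema(df)
schema$fields[[1]]$description <- "Identifier of the media file (example text)"

# Re-add the resource with the edited schema, replacing the existing one
package <- add_resource(
  package, "positions",
  data = df, schema = schema, replace = TRUE
)
```

Here the resource already exists when `replace = TRUE` is used, so the sketch does not depend on how a missing resource is handled.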

Ok, makes sense. I guess the functionality I was thinking of is more precisely what bind_rows() does: just adding data, without modifying the existing data. Maybe another function, bind_resources(), but I'm not sure it would be useful in many cases.

Rafnuss avatar Jan 31 '25 08:01 Rafnuss
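
As a rough sketch of what such a helper could look like in user code, following the read, manipulate, attach pattern described earlier in the thread: `bind_resource()` below is a hypothetical name, not a frictionless function, and it assumes the new rows have columns compatible with the existing data.

```r
library(frictionless)

# Hypothetical helper (not part of frictionless): append rows to a resource
bind_resource <- function(package, resource_name, new_rows) {
  existing <- read_resource(package, resource_name)    # 1. read
  combined <- dplyr::bind_rows(existing, new_rows)     # 2. manipulate
  add_resource(                                        # 3. attach
    package, resource_name,
    data = combined, replace = TRUE
  )
}

package <- example_package()
positions <- data.frame(
  multimedia_id = c(
    "aed5fa71-3ed4-4284-a6ba-3550d1a4de8d",
    "da81a501-8236-4cbd-aa95-4bc4b10a05df"
  ),
  x = c(718, 748),
  y = c(860, 900)
)
package <- add_resource(package, "positions", data = positions)

# Append one more (made-up) row to the existing resource
extra <- data.frame(multimedia_id = "example-id", x = 700, y = 880)
package <- bind_resource(package, "positions", extra)
appended <- read_resource(package, "positions")
```

Because the schema is recreated from the combined data frame, any custom schema on the original resource would need to be passed again via the schema parameter.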