diesel icon indicating copy to clipboard operation
diesel copied to clipboard

Allow to ignore field of struct

Open DavidBM opened this issue 7 years ago • 61 comments

Actually I have this struct

#[derive(Clone, Debug, Queryable, Serialize, AsChangeset, Identifiable, Associations)]
#[has_many(authors)]
pub struct User {
	pub id: i32,
	pub name: String,
	pub email: String,
	#[serde(skip_serializing)]
	pub password: String,
	pub created_at: DateTime<UTC>,
	pub author: Option<Author>
}

The column author is not in the database, but when I return the model, I would like to put inside of the struct the Author (a user can or not to have an author). For that I want to tell diesel that ignores the field author. I'm searching if something like #[ignore_field].

	#[ignore_field]
	pub author: Option<Author>

DavidBM avatar Apr 15 '17 00:04 DavidBM

I'd call it #[diesel(skip_deserializing)] to be similar to serde. And we should probably only allow that for fields whose type implements Default.

killercup avatar Apr 17 '17 08:04 killercup

Umm, I don't know if I explain correctly the idea, maybe the serde annotation in my code confuses more than helps.

What I was searching is a way to totally ignore that field in the database. That means, Diesel to load (in that case) always None in the author field. For being able to add data after query to the struct.

Other question than can help. Having this struct:

#[derive(Clone, Debug, Queryable, Serialize, AsChangeset, Identifiable, Associations)]
#[has_many(authors)]
pub struct User {
	pub id: i32,
	pub email: String,
	pub author: Vec<Author>
}

is possible to have something like #[loads_has_many(authors)] to tell diesel to load all the authors in that field?

DavidBM avatar Apr 17 '17 14:04 DavidBM

What I was searching is a way to totally ignore that field in the database. That means, Diesel to load (in that case) always None in the author field. For being able to add data after query to the struct.

Oh, yes, I think that's what I understood as well. I was suggesting

  1. an attribute name that would not collide with other crates,

  2. and to use the Default trait to figure out what the value of that field should be when diesel tries to deserialize into a User.

    (We could specialize this to Option<_>, but I'd rather not, as this is exactly what Default is for. Option<_>::default() == None, but also String::default() == "" and Vec::default() == vec![].)

Did I get that right?

is possible to have something like #[loads_has_many(authors)] to tell diesel to load all the authors in that field?

Currently, it's possible to get (User, Vec<Author>) but you'd have to write a bit of boilerplate to put that into a new struct. I can see why this is nice to have (and a lot of ORMs have this), but I'm not quite sure how it'd fit Diesel's design. This is usually the "feature" that leads to n+1 query bugs, and we currently try to stay away from mangling models with association data. So, I'd try to work with tuples (using type aliases liberally), and see how it goes.

killercup avatar Apr 17 '17 16:04 killercup

Ok, you understand correctly :)

Thanks! I will check this frequently!

DavidBM avatar Apr 17 '17 21:04 DavidBM

I've definitely warmed to the idea of marking a field as ignorable for update/insert (to skip a field for update, you can always use my hilarious, hacky workaround for now. I'm not sure I see the use case for skipping something on Queryable though. If you want to populate other data after loading, it seems like it'd make more sense to have another struct that is struct Foo { data_from_database: DbModel, other_data: whatever }

sgrif avatar May 09 '17 19:05 sgrif

I think the only real use case for skipping a field on Queryable would be to imitate the classic ORM approach to associations.

Using the example code from the association docs:

let user = try!(users::find(1).first(&connection));
let posts = Post::belonging_to(&user).load(&connection);

then you would make posts a proper child of user:

user.posts = posts;

The docs say that you could pass it around as (User, Vec<Post>) but that doesn't exactly play nicely with serializing to JSON, which seems to be the initial reason behind this issue. The typical JSON representation would be:

{
   "id": 0,
   "name": "Foo",
   "posts": []
}

But passing it around as a tuple would have it represented as:

{
  "user": {
     "id": 0,
     "name": "Foo"
  },
  "posts": []
}

matt4682 avatar May 16 '17 01:05 matt4682

My use case. I have a struct:

#[derive(Deserialize, AsChangeset)]
#[table_name = "users"]
pub struct UpdateUserData {
    username: Option<String>,
    email: Option<String>,
    bio: Option<String>,
    image: Option<String>,
    #[skip_somehow_plz]
    password: Option<String>,
}

I use this struct to parse POST params and then I'd like to use it as a changeset. But my table doesn't have a password field. It has hash field.

So I want to skip password field in the changeset and set hash manually based on password.

TatriX avatar Jan 20 '18 19:01 TatriX

I have the same use case as @TatriX, the reason I want the field to be in the same struct as Insertable is because I use it for field validations as well.

carlosdp avatar Jan 29 '18 05:01 carlosdp

As far as I know this depends on #1512 Original use case currently can be fulfilled with nested structs and query with join. Example from @weiznich:

#[derive(Queryable)]
struct Foo {
    id: i32,
    name: String,
    bar: Bar,
}

#[derive(Queryable)]
struct Bar {
     id: i32,
    name: String
}

let r: Vec<Foo> = foo::table.inner_join(bar::table).select((foo::id, foo::name, (bar::id, bar::name)).load(conn)?;

I'm ready to take this issue if you think this should be implemented.

TatriX avatar Jan 29 '18 06:01 TatriX

It's not blocked on that PR. It just hasn't been a priority. As I mentioned in https://github.com/diesel-rs/diesel/issues/860#issuecomment-300280659 though, if/when this lands, it'll be for AsChangeset and Insertable, not Queryable.

sgrif avatar Jan 29 '18 22:01 sgrif

It has a good use case to be implemented on Quaryable too. Basically it's the same as with Insertable. I want to be able to reuse the same struct to do select queries, then fill the Option<NestedStruct> from another select and pass it to the json serializer.

TatriX avatar Jan 30 '18 06:01 TatriX

I've definitely warmed to the idea of marking a field as ignorable for update/insert (to skip a field for update, you can always use my hilarious, hacky workaround for now. I'm not sure I see the use case for skipping something on Queryable though. If you want to populate other data after loading, it seems like it'd make more sense to have another struct that is struct Foo { data_from_database: DbModel, other_data: whatever }

The link is broken. Could you repeat your solution? I have a structure with only one field other in update and create action (e.g. upsert action), and I want to use one struct with more than 40 fields in two cases.

misha-krainik avatar Feb 04 '19 14:02 misha-krainik

@mikhail-krainik That was this thing. But be warned: I'm quite sure doing that is a bad idea…

weiznich avatar Feb 04 '19 14:02 weiznich

I believe I have another use case where it would be helpful for marking fields of a struct to be ignored for a derived AsChangeset: timestamps that are automatically updated by the database.

For example, most of my tables track when they were created and the most recent modification time in two columns, created and modified. I never want diesel to update those columns. It would be very handy to mark those columns as ignored during an update, like so:

#[derive(Identifiable, AsChangeset, PartialEq, Debug, Serialize)]
#[table_name = "bodies"]
pub struct Body {
	pub id: Uuid,
	pub name: String,
	#[ignore_during_update]
	pub created: NaiveDateTime,
	#[ignore_during_update]
	pub modified: NaiveDateTime,
}

TrevorFSmith avatar Sep 23 '19 18:09 TrevorFSmith

@TrevorFSmith This would be lovely, or maybe detect on update fields in schema and then skip them automatically.

lamka02sk avatar Dec 23 '19 19:12 lamka02sk

Interested in this feature. My use case is that I want to take a nested JSON object returned by an API and generate a number of insertable structs. I want the parent struct to have the child struct as a Vec field so I can deserialize the JSON, but that makes it so I can't insert the struct into the database.

alexwennerberg avatar Dec 31 '19 05:12 alexwennerberg

Another use case:

Reuse of objects between Diesel and Juniper

dyst5422 avatar Feb 17 '20 20:02 dyst5422

@dyst5422 I've written wundergraph for this use case as you have to think about much more than a few field attributes for providing a graphql API that does not suffer from N+1 query problems. The normal juniper API generates N+1 queries for a nested entity.

weiznich avatar Feb 17 '20 20:02 weiznich

I have to add some std::marker::PhantomData<T> into my struct. I can't achieve this without this function.

tuxzz avatar May 18 '20 13:05 tuxzz

@tuxzz Just commenting here that you need a certain feature will help that the feature appears in a magical way. You either need to implement all traits manually, which would allow you to do whatever you want there, or sit down and implement that feature for diesel or find a other solution.

weiznich avatar May 18 '20 13:05 weiznich

TL;DR: I have an initial implementation of something that could address this issue.

I think there are a number of user cases that are fairly common throughout different projects. Perhaps one of the main underlying reasons is that nobody likes having to create a bunch of different structs to deal with the same "object" (e.g. user, account, etc.) I believe there are reasons for Diesel separating the Queryables from the Insertables, and I don't have any basis to argue that there's a better way - even if I'd prefer a more unified approach, purely out of laziness.

However, it is often the case that the struct that you get from DB needs to be "augmented" with additional information before it becomes complete, and that process would (ideally) be lazy in nature. A typical scenario (at least the one I'm in) is: DB -> struct -> logic -> struct -> JSON. By the time the "object" coming from DB is fully formed to serialise into JSON it might have required the addition of other bits of data. Now, you can do that by "collating" the main struct and the additional info into another, bigger struct. However, given Rust's ownership rules, I've found this to be quite challenging without cloning data all over the place. I really dislike cloning data for no good reason; and I have the impression that's pretty common amongst Rust users.

It would be really useful if the more experienced Diesel users could share their best practices to overcome the challenge described above. It could alleviate the problem of not having optional fields and, most importantly, show me some proper Rust coding practices! :-)

So, now that the long and winded intro is out of the way, I've actually gone ahead and created my version of Queryable that allows for optional fields. It looks something like this:

#[derive(PartialQueryable, Serialize)]
pub struct Account {
    pub id: u32,
    pub name: String,
    #[table_name = "group_acc"]
    #[column_name = "group_id"]
    pub id_of_group = u32,
    
    // ... other fields ...
    
    #[diesel(skip)]
    pub activity_dates: Vec<AccActivityDates>
}

For now I'm calling it PartialQueryable and it allows you to add a #[diesel(skip)] attribute to any struct field, which would not be included in the derive's code generation. Furthermore, in order to simplify the query generation code, it creates a tuple with all the columns as defined in the struct. For the example above it would create Account::ALL_COLUMNS, so you can write something like this:

account::table
    .inner_join(group_acc::table)
    .filter(account::id.eq_any(acc_ids))
    // This is equivalent to (accounts::id, accounts::name, ...) - `activity_dates` not included
    .select(Account::ALL_COLUMNS)
    .load::<Account>(&conn)
    .expect("Something went wrong.")

This is working in my own project, but I have to clarify that I'm only using it in one place at the moment.

In fact, it includes some additional options, which you might have noticed in the example. In order to allow you to include fields from other tables in your struct - and still be able to auto-generate ::ALL_COLUMNS -, you can use the table_name and column_name attributes for fields. In the example above I'm including the Account's id_of_group in the struct, which is comming from a second table inner-join'd in the query. I've used a weird field name to illustrate using column_name. So Account::ALL_COLUMNS is actually something like (accounts::id, accounts::name, group_acc::group_id, ...).

My implementation is fairly basic and probably doesn't cover a number of corner cases. Things might (will?) start to fall apart if you start doing fancy stuff with these additional "powers". In any case, I'll be happy to create a PR if the maintainers think it is worth considering for review and potential inclusion - as an extension to Queryable or as a separate derive.

At the moment this is just on a local repo that's simply a manual copy of enough code from Queryable to make it work. But I could prepare it for a PR if it's going to be considered.

Cheers!

alfonso-eusebio avatar May 24 '20 11:05 alfonso-eusebio

@alfonso-eusebio As a general note at the top: As pointed out in this comment (https://github.com/diesel-rs/diesel/issues/860#issuecomment-300280659) and this PR (https://github.com/diesel-rs/diesel/pull/1582) we are not interested in having an option to ignore fields for Queryable as it breaks the basic contract that Queryable (and similar functionality) is a simple direct mapping from results to a struct. It wouldn't be clear anymore what would be used to initialize the other fields. This has been discussed a few times before and I honestly do not see any new arguments here. For the general case there are already several solutions for solving the perceived difference between the data structure returned by the query and the data structure to return a certain serialization format:

  • For anything using serde (so serializing to json) it is as simple as using #[serde(flatten)] on a field on an outer struct that contains the data loaded from the database. So something like this already just works out of the box: (In decreasing order of relevance)
#[derive(Queryable, Serialize)]
struct User {
    id: i32,
    name: String,
    // …
}

#[derive(Serialize)]
struct OuterUser {
    #[serde(flatten)]
    user: User
    other_additional_field: Foo,
}

// serializing OuterUser results in
// { "id": 42, "name": "John", "other_additional_field": {}}

(This case already covers >90% of the use cases of the requested feature in my experience)

  • Request a default value via query: users::table.select((users::id, users::name, 1.into_sql::<Integer>()).load::<UserWithOtherField>(&conn)?;
  • Manually implementing Queryable, that allows you to specify whatever constant value you want to use to initialize a "ignored" struct field
  • Provide a third party derive that does the last thing automatically

The following section addresses specific parts of the comment above:

Now, you can do that by "collating" the main struct and the additional info into another, bigger struct. However, given Rust's ownership rules, I've found this to be quite challenging without cloning data all over the place. I really dislike cloning data for no good reason; and I have the impression that's pretty common amongst Rust users.

I'm not sure where exactly additional cloning is involved to include additional information. The workflow with your proposed extension is roughly the following:

#[derive(PartialQueryable, Serialize)]
pub struct Account {
    pub id: u32,
    pub name: String,  
    // ... other fields ...   
    #[diesel(skip)]
    pub activity_dates: Vec<AccActivityDates>
}

let mut db_result = accounts::table.load::<Account>(&conn)?;
for a in &mut db_result {
    // just a unrelated side note:
    // if this accesses the database you get a N+1 query bug
    a.get_activity_dates();
}

while using the built in API + serde gives you this:

#[derive(Queryable, Serialize)]
pub struct Account {
    pub id: u32,
    pub name: String,  
}
#[derive(Serialize)]
pub struct JsonAccount {
    #[serde(flatten)]
    pub user: Account,
    pub activity_dates: Vec<AccActivityDates>
}

let db_results = accounts::table.load::<Account>(&conn);

db_results.into_iter().map(|user| { 
    // same about N+1 queries applies here
    // but if you need to load stuff from the database here
    // you can now use the `diesel::associations` api to separately load
    // those data at once and then use `.zip()` instead to merge the lists
    JsonAccount {
        user,
        activity_dates: get_activity_dates()
    }
}).collect(); 

In fact, it includes some additional options, which you might have noticed in the example. In order to allow you to include fields from other tables in your struct - and still be able to auto-generate ::ALL_COLUMNS -, you can use the table_name and column_name attributes for fields. In the example above I'm including the Account's id_of_group in the struct, which is comming from a second table inner-join'd in the query. I've used a weird field name to illustrate using column_name. So Account::ALL_COLUMNS is actually something like (accounts::id, accounts::name, group_acc::group_id, ...).

That is a completely separate feature in my opinion. In fact there is already something like that in development with #2367. I would appreciate feedback about the usability of this API there. Also if someone comes up with good idea that does not require a separate query method to implement this feature that would also be fantastic.

weiznich avatar May 26 '20 09:05 weiznich

@weiznich Thanks for the detailed response and all the examples and references. I'll make sure to review in detail and try to find a solution within the available functionality.

My apologies if it seems like I was trying to push on a direction not aligned with the project. I probably misinterpreted your previous comment about trying to offer a potential solution for this functionality.

One the flip side, I've now learnt a couple of things about proc_macro, which hopefully might come in handing for future (hopefully more useful) contributions.

Cheers

alfonso-eusebio avatar May 26 '20 10:05 alfonso-eusebio

@weiznich

This has been discussed a few times before and I honestly do not see any new arguments here.

Can you point me to where this was discussed? #1582 doesn't seem to elaborate what the problems are. I've found this:

With the feature in general. I think this affects more things in non-obvious ways than #[serde(skip)] does.

Which still leaves me wondering what the problem actually is.

From a high level perspective, #[derive(Default)] (or manually implementing Default) is idiomatic Rust. And it seems a combo of Default + some way to mark that a Queryable field should init from Default rather than the database would have both predictable behavior and be ergonomic.

But there must be some discussion elsewhere that I missed, about what more things would be affected in "non-obvious" ways.

lolgesten avatar May 26 '20 10:05 lolgesten

@alfonso-eusebio There is no need to apologies here. Anyone that comes with a well written proposal as yours is well come to ask questions and propose improvements.

@lolgesten

Which still leaves me wondering what the problem actually is. From a high level perspective, #[derive(Default)] (or manually implementing Default) is idiomatic Rust. And it seems a combo of Default + some way to mark that a Queryable field should init from Default rather than the database would have both predictable behavior and be ergonomic. But there must be some discussion elsewhere that I missed, about what more things would be affected in "non-obvious" ways.

At least some of the discussion was made in our internal channel. In the end the argument was just: It would change the mapping from obvious to not so obvious (I consider materializing values out of nowhere more nonobvious than just having database values) without anyone having an good use case for that, that cannot be solved by existing variants listed above. Additionally there is this potential N+1 issue I've pointed out in the examples above. So it's just a that the costs (maintenance, documentation, complexity to learn for new users, …) of a potential are higher than the potential gains we see. Again if someone has a different example that cannot be easily solved with some of the solutions above I'm willing to reconsider that, but the basic workflow with this feature seems for me to be just the same as just using #[serde(flatten)] in terms of complexity.

weiznich avatar May 26 '20 12:05 weiznich

Throwing my hat in. I would love to be able to use serde's flatten micro without it needing to be in my table/schema

#[derive(Debug, Serialize, Deserialize, Queryable, AsChangeset)]
#[table_name = "router"]
pub struct DbRouter { 
  pub router_id: u64,
  //pub rack_device_id: u64,
  //pub inventory_id: String,
  //pub hostname: String,
  #[serde(flatten)]
  pub new_db_router: NewDbRouter,
  pub deploy_date: chrono::NaiveDate
}

table! {
    router (router_id) { 
        router_id -> Unsigned<Bigint>,
        rack_device_id -> Unsigned<Bigint>,
        inventory_id -> Varchar,
        hostname -> Varchar,
        deploy_date -> Date,
    }
}   

error[E0412]: cannot find type `new_db_router` in module `router`
  --> src/models.rs:73:6
   |
73 |     pub new_db_router: NewDbRouter,
   |         ^^^^^^^^^^^^^ not found in `router`

error[E0425]: cannot find value `new_db_router` in module `router`
  --> src/models.rs:73:6
   |
73 |     pub new_db_router: NewDbRouter,
   |         ^^^^^^^^^^^^^ not found in `router`

twiclo avatar Dec 14 '20 22:12 twiclo

Are there any updates? It really is a very useful feature.

ModPhoenix avatar Feb 06 '21 21:02 ModPhoenix

@ModPhoenix If you don't have arguments to add, which are not already listed above please don't comment on old issues asking for updates. If there the issue has no update for some time there is no update.

weiznich avatar Feb 07 '21 10:02 weiznich

I have read through most of the comments here and hopefully adding this is productive, but I myself have another use case for this when it comes to inserting a row. I am using NewArticle as a Json type in a rocket api so I can send a POST request with the title, description, tag, and the body itself of the article. I don't want to store the body in the database, and therefore I want to ignore this field, however I need to handle it on a request to store it in a file system.

This may be a small niche but I think it is definitely a worthwhile feature adding.

#[derive(Insertable, Debug, Deserialize)]
#[table_name="articles"]
pub struct NewArticle {
    pub title: String,
    pub description: String,
    pub tag: String,
    // I want to ignore the body of the article, but i still want it to be apart of my JSON request and be able to handle it.
    pub body: String
}

Jackbaude avatar Feb 22 '21 05:02 Jackbaude

@Jackbaude That's a classical case that could easily be solved by using #[serde(flatten)]:

#[derive(Insertable, Debug, Deserialize)]
#[table_name = "articles"]
pub struct NewDbArticle {
    pub title: String,
    pub description: String,
    pub tag: String,
}

#[derive(Debug, Deserialize)]
pub struct NewArticle {
    #[serde(flatten)]
    pub db: NecDbArticle,
    pub body: String
}

weiznich avatar Feb 22 '21 08:02 weiznich