rust-csv icon indicating copy to clipboard operation
rust-csv copied to clipboard

Newbie question: implementation of `_::_serde::Deserialize` is not general enough

Open lucabol opened this issue 3 years ago • 9 comments

Sorry for the Rust newbie question. I am trying to port existing working code to be zero-copy.

I have spent a fair amount adding/removing lifetimes, without success. Everything commented out I have already tried. The full error is at the end. Perhaps it is obvious stuff to a more Rust-savy person.

Any help appreciated.

#[derive(Debug, Deserialize)]
#[serde(rename_all = "PascalCase")]
pub struct Trade<'a> {
    pub account: &'a str,
    #[serde(with = "my_date_format")]
    pub date: DateTime<Utc>,
    pub r#type: TradeType,
    pub stock: &'a str,
    pub units: f64,
    pub price: Option<f64>,
    pub fees: Option<f64>,
    pub split: f64,
    pub currency: f64,
}

impl Store<'_> {
    pub fn load_trades(&self) -> Result<Vec<Trade>> {
        self.load_csv(TRADES_FILE)
    }
    pub fn load_stocks(&self) -> Result<Vec<Stocks>> {
        self.load_csv(STOCKS_FILE)
    }

    fn load_csv<T>(&self, data: &str) -> Result<Vec<T>>
    where
        //T: for<'de> Deserialize<'de>,
        T: serde::de::DeserializeOwned,
    {
        let mut rdr = csv::ReaderBuilder::new()
            .delimiter(b'\t')
            .flexible(true)
            .trim(csv::Trim::All)
            .comment(Some(b'#'))
            .from_reader(data.as_bytes());

        /*
        let raw_record = csv::StringRecord::new();
        let headers = rdr.headers().chain_err(|| "Can't get headers?")?.clone();

        let mut res = Vec::with_capacity(1024);
        while rdr
            .read_record(&mut raw_record)
            .chain_err(|| "Csv not well formed")?
        {
            let record: T = raw_record
                .deserialize(Some(&headers))
                .chain_err(|| "Csv not well formed")?;
            res.push(record);
        }
Ok(res)
        */
        rdr.deserialize()
            .map(|r| r.chain_err(|| "Badly formatted csv."))
            .collect::<Result<Vec<T>>>()
    }
}
error[E0277]: the trait bound `for<'de> T: _::_serde::Deserialize<'de>` is not satisfied
     --> src/lib.rs:196:14
      |
  196 |             .collect::<Result<Vec<T>>>()
      |              ^^^^^^^ the trait `for<'de> _::_serde::Deserialize<'de>` is not implemented for `T`
      |
      = note: required because of the requirements on the impl of `DeserializeOwned` for `T`
      = note: required because of the requirements on the impl of `Iterator` for `DeserializeRecordsIter<'_, &[u8], T>`
      = note: 1 redundant requirements hidden
      = note: required because of the requirements on the impl of `Iterator` for `std::iter::Map<DeserializeRecordsIter<'_, &[u8], T>, [closure@src/lib.rs:195:18: 195:60]>`
  help: consider further restricting this bound
      |
  168 |         T: Deserialize<'b> + for<'de> _::_serde::Deserialize<'de>,
      |                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

lucabol avatar Apr 15 '21 13:04 lucabol

I'm happy to help. Could you please provide a complete program that I can attempt to compile on my own? This should include a Cargo.toml with any relevant dependencies.

BurntSushi avatar Apr 15 '21 13:04 BurntSushi

Thx a bunch.

https://github.com/lucabol/lupo

lucabol avatar Apr 15 '21 13:04 lucabol

@lucabol I cloned your repo but I'm seeing a different error than you after running cargo build.

error[E0308]: mismatched types
   --> src/lib.rs:160:14
    |
160 |         self.load_csv(TRADES_FILE)
    |              ^^^^^^^^ one type is more general than the other
    |
    = note: expected trait `_::_serde::Deserialize<'de>`
               found trait `_::_serde::Deserialize<'_>`

error[E0308]: mismatched types
   --> src/lib.rs:163:14
    |
163 |         self.load_csv(STOCKS_FILE)
    |              ^^^^^^^^ one type is more general than the other
    |
    = note: expected trait `_::_serde::Deserialize<'de>`
               found trait `_::_serde::Deserialize<'_>`

error: aborting due to 2 previous errors

For more information about this error, try `rustc --explain E0308`.
error: could not compile `lupo`

To learn more, run the command again with --verbose.

I cannot stress enough the importance of making it easy for others to reproduce what you're seeing.

BurntSushi avatar Apr 15 '21 14:04 BurntSushi

For me cloning https://github.com/lucabol/lupo and 'cargo build' shows the errors I reported above.

I am running

cargo 1.51.0
rustc 1.51.0

I get:

  Compiling lupo v0.1.0 (/home/lucabol/dev/temp/lupo)
error: implementation of `_::_serde::Deserialize` is not general enough
   --> src/lib.rs:160:14
    |
160 |           self.load_csv(TRADES_FILE)
    |                ^^^^^^^^ implementation of `_::_serde::Deserialize` is not general enough
    |
   ::: /home/lucabol/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.124/src/de/mod.rs:530:1
    |
530 | / pub trait Deserialize<'de>: Sized {
531 | |     /// Deserialize this value from the given Serde deserializer.
532 | |     ///
533 | |     /// See the [Implementing `Deserialize`][impl-deserialize] section of the
...   |
568 | |     }
569 | | }
    | |_- trait `_::_serde::Deserialize` defined here
    |
    = note: `Trade<'_>` must implement `_::_serde::Deserialize<'0>`, for any lifetime `'0`...
    = note: ...but `Trade<'_>` actually implements `_::_serde::Deserialize<'1>`, for some specific lifetime `'1`

error: implementation of `_::_serde::Deserialize` is not general enough
   --> src/lib.rs:163:14
    |
163 |           self.load_csv(STOCKS_FILE)
    |                ^^^^^^^^ implementation of `_::_serde::Deserialize` is not general enough
    |
   ::: /home/lucabol/.cargo/registry/src/github.com-1ecc6299db9ec823/serde-1.0.124/src/de/mod.rs:530:1
    |
530 | / pub trait Deserialize<'de>: Sized {
531 | |     /// Deserialize this value from the given Serde deserializer.
532 | |     ///
533 | |     /// See the [Implementing `Deserialize`][impl-deserialize] section of the
...   |
568 | |     }
569 | | }
    | |_- trait `_::_serde::Deserialize` defined here
    |
    = note: `Stocks<'_>` must implement `_::_serde::Deserialize<'0>`, for any lifetime `'0`...
    = note: ...but `Stocks<'_>` actually implements `_::_serde::Deserialize<'1>`, for some specific lifetime `'1`

error: aborting due to 2 previous errors

error: could not compile `lupo`

To learn more, run the command again with --verbose.

lucabol avatar Apr 15 '21 15:04 lucabol

OK, thanks, I was using nightly Rust and it looks like the error message has changed here.

Looking at your code, I don't see how you expect this to work. The zero-copy APIs specifically borrow from the CSV record. So once you go to re-use that record, if the Rust compiler allowed what you were trying to do, you'd just overwrite your data.

When you use the zero-copy APIs, you need to be able to process the stream of records as they are parsed. It's fundamentally incompatible with returning a Vec<T>.

BurntSushi avatar Apr 15 '21 15:04 BurntSushi

Thx. As I said, I am not familiar with either Rust or this lib. My overall goal is to load the whole csv file in memory at application start time and then create a Vec<Trade> where each Trade contains refs to the memory (aka nothing more is heap-allocated).

Is there a way to do that?

lucabol avatar Apr 15 '21 15:04 lucabol

Yes. Your original comment specifically said zero copy though, so that's what I thought your goal was.

If you just want it to work, then the simplest thing in your current design is to abandon zero copy and used owned values everywhere. I sent you a patch: https://github.com/lucabol/lupo/pull/1

BurntSushi avatar Apr 15 '21 16:04 BurntSushi

Thanks a bunch. I will look at it tomorrow as it is getting late in my locale. Cheers.

lucabol avatar Apr 15 '21 16:04 lucabol

I don't think that's a valid answer as @lucabol specifically asked for "where each Trade contains refs to the memory (aka nothing more is heap-allocated).", that's basically still zero-copy, using owned Strings creates copies and introduces (more) heap allocation.

I think it's currently impossible due to the fact that the API is based on io::Read, as a counter example, serde_json has a from_str method which will work in similar scenario.

Jimmy-Z avatar Aug 31 '22 13:08 Jimmy-Z