unrar.rs icon indicating copy to clipboard operation
unrar.rs copied to clipboard

Add a method to get the bytes as a vec for an entry in the rar file.

Open casret opened this issue 6 years ago • 5 comments

I needed this for my use case (and didn't want to go the route of creating a temporary directory to extract too), so thought it might be useful for other people as well.

casret avatar Jul 05 '18 20:07 casret

Hey there, thanks for this very nice contribution. I see a lot of repetition to what I had written and I tried to circumvent this a long time ago by splitting the extraction process into read_header and process and give the caller more control over what happens with each file on a more granular basis.

I had taken the wrong approach and encountered some compile errors which I couldn't resolve (I pushed the non-compiling code to the branch draft-wrong-approach if you wanna check it out)

Either way, I had actually forgotten that the data has to be streamed, i.e. you cannot ReadHeader countless times and Process them all one by one (at least I'm pretty sure, correct me if I'm wrong... since you're using the Operation::Skip I'm pretty certain though)

So my thought of a better way would be:

in next, don't process the file after reading the header. Instead, set a struct field current: Option<Entry> (initiated with None and set after each ReadHeader).

Add a process method to Entry which consumes (or invalidates) the entry and is tracked inside the OpenArchive#next. If the caller decides to not call process on the current Entry, then the Entry is processed with Operation::Skip before continuing reading header.

Pseudo code:

fn next(&mut self) {
  if let Some(entry) = self.current && entry.processable {
    entry.process(Operation::Skip)
  }
  let entry = Entry::from(native::RARReadHeader(self.handle)?);
  self.current = Some(entry)
  Some(entry)
}

I think you could then use this function instead and allow for less duplicate code... i.e. (again pseudo-code):

archive.drop_while(|e| e.filename != my_file_name).nth(0).unwrap().process(Operation::Read)

Thoughts?

muja avatar Jul 05 '18 22:07 muja

Yeah, that makes a lot of sense. I can merge it in with your approach if you get it working, or I can try to refactor it myself if I get some time.

casret avatar Jul 06 '18 06:07 casret

Unfortunately, changing the locale of the entire process's C library is not acceptable as a side effect in a library, which is why I was very declined against merging this.

What problem did you try to fix with that? Maybe we can get a failing test in and try to solve it some other way?

As for the bytes method, I believe providing the filename and having the archive object iterate over all entries looking for it, is way too specific.

Are you still interested in getting this merged?

muja avatar Dec 21 '19 09:12 muja

The tests are in the merge request.

casret avatar Dec 21 '19 16:12 casret

Are you still interested in getting this merged?

@muja I am 🙂 I'll try to get a functional implementation over the next few months, following your detailed explanation above

aaronleopold avatar May 05 '22 02:05 aaronleopold

There are some new upcoming changes to the library. With the 0.5 release (hopefully this or next month) it should be possible to get the bytes of specific entries as Vec and to be more flexible what exactly happens with each entry. I will close this PR as this implementation here was too specific.

muja avatar Jan 09 '23 08:01 muja

This is now possible in v0.5, see the read_named example.

muja avatar Jun 22 '23 21:06 muja