The problem of creating a structure with the Select field
Hello, I want to create an iterator that returns my structure based on the HTML passed in, but I run into a problem when creating the iterator (fn init). Please tell me how to implement this correctly and how to set up the lifetimes. I have tried different approaches, but none of them worked. Here is a sample code:
struct Film {
    name: String,
}
struct FilmParser<'a> {
    html: Html,
    foo_selector: Selector,
    search_iter: scraper::html::Select<'a, 'a>,
}
impl<'a> FilmParser<'a> {
    fn init(page_body: &str) -> Self {
        let html = Html::parse_document(page_body);
        let foo_selector = Selector::parse("foo").unwrap();
        Self {
            html,
            foo_selector,
            search_iter: html.select(&foo_selector), // the problem is here
        }
    }
}
impl<'a> Iterator for FilmParser<'a> {
    type Item = Film;
    fn next(&mut self) -> Option<Self::Item> {
        self.search_iter.next().map(|element| Film {
            name: element.inner_html(),
        })
    }
}
This is a fundamental limitation of Rust's type system: You are trying to create a self-referential struct, in this case because search_iter references the other two fields.
This is not possible in safe Rust and you will probably need to restructure your code to store html and foo_selector separately from search_iter. There are crates to produce self-referential structs, but they are often tricky to use or have soundness bugs.
Please tell me how to organize the code properly so that the conversion from ElementRef to Film is hidden and an iterator can be used to avoid unnecessary conversions. It would be great if the selector could be given instead of being used by reference.
it would be great if the selector could be given instead of being used by reference
But I think this is the crux of it, i.e. you will need to reference the selector from elsewhere if you want to wrap Select to produce an Iterator<Item=Film>.
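To illustrate what "reference the selector from elsewhere" can look like, here is a minimal std-only sketch. It assumes hypothetical stand-ins, since it does not depend on scraper: Page plays the role of Html, PrefixSelector plays the role of Selector, and str::lines plays the role of select. The films function is also a made-up name. The point is only that the caller owns both values and the returned iterator borrows them, while the conversion to Film stays hidden inside the function.

```rust
#[derive(Debug, PartialEq)]
struct Film {
    name: String,
}

/// Hypothetical stand-in for `scraper::Html`: owns the page text.
struct Page {
    body: String,
}

/// Hypothetical stand-in for `scraper::Selector`: matches lines by prefix.
struct PrefixSelector {
    prefix: String,
}

/// Both arguments are borrowed for `'a`, so the returned iterator is allowed
/// to borrow from them. The caller keeps `page` and `selector` alive while
/// iterating, and only ever sees `Film` values.
fn films<'a>(
    page: &'a Page,
    selector: &'a PrefixSelector,
) -> impl Iterator<Item = Film> + 'a {
    page.body
        .lines()
        // Keep only the lines the "selector" matches, minus the prefix.
        .filter_map(move |line| line.strip_prefix(selector.prefix.as_str()))
        // Hide the conversion from the raw match to `Film`.
        .map(|name| Film {
            name: name.trim().to_owned(),
        })
}
```

With scraper the shape would presumably be the same: the function takes &'a Html and &'a Selector, and the closure calls element.inner_html() instead of trimming a line.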
need to reference the selector
I mean that the select method could take ownership of the selector, for example pub fn select<'a, 'b>(&'a self, selector: Selector)
In general, it's better to use references when ownership is not needed. That's why we do not take ownership of the Selector here.
You should probably separate the parser state from the iterator state.
struct FilmParser {
    html: Html,
    foo_selector: Selector,
}
struct FilmIterator<'a> {
    search_iter: scraper::html::Select<'a, 'a>,
}
Then implement the two separately, like so:
impl FilmParser { /* ... */ }
impl<'a> Iterator for FilmIterator<'a> { /* ... */ }
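A minimal std-only sketch of how the two halves might fit together. The assumptions here are loud ones: a plain String stands in for Html plus Selector, std::str::Lines<'a> stands in for scraper::html::Select<'a, 'a>, and the films method is a hypothetical name. The parser owns the data, and the iterator is created afterwards and only borrows from it, so no struct ever refers to its own fields.

```rust
#[derive(Debug, PartialEq)]
struct Film {
    name: String,
}

/// Owns the parsed data. A `String` stands in for `Html` + `Selector`
/// (assumption, to keep the sketch free of external crates).
struct FilmParser {
    body: String,
}

/// Borrows from a `FilmParser`. `Lines<'a>` stands in for
/// `scraper::html::Select<'a, 'a>` (assumption).
struct FilmIterator<'a> {
    search_iter: std::str::Lines<'a>,
}

impl FilmParser {
    fn init(page_body: &str) -> Self {
        FilmParser {
            body: page_body.to_owned(),
        }
    }

    /// Hypothetical method: the iterator is built from `&self`, after the
    /// parser already exists, so the borrow is an ordinary one.
    fn films(&self) -> FilmIterator<'_> {
        FilmIterator {
            search_iter: self.body.lines(),
        }
    }
}

impl<'a> Iterator for FilmIterator<'a> {
    type Item = Film;

    fn next(&mut self) -> Option<Self::Item> {
        // The conversion to `Film` is hidden inside the iterator.
        self.search_iter.next().map(|line| Film {
            name: line.trim().to_owned(),
        })
    }
}
```

Usage would be along the lines of: let parser = FilmParser::init(body); for film in parser.films() { ... }. The parser has to outlive the loop, which is exactly the constraint the lifetime 'a expresses.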
I will close this one, because the solutions provided work.