Scraping function works on web, but not on mobile
Problem
I am trying to create a mobile app that scrapes a website and then shows the data in the app. When I run it on web it works fine, but when I try it on mobile it doesn't work. I tried adding various prints around the code, but when the component that handles the scraping is called, none of the prints output anything.
Steps To Reproduce
Steps to reproduce the behavior:
- create a new project using `dx new app`, choosing fullstack, with router
- copy the code provided
- run the code using `dx serve --platform android`
- wait for it to start
- the data is not showing
Expected behavior
The scraped data is shown.
Here's the code I use:
ui/src/logic/scrape.rs:

```rust
use reqwest::Client;
use scraper::{Html, Selector};
use std::collections::HashMap;

#[cfg(target_arch = "wasm32")]
use lol_alloc::{AssumeSingleThreaded, FreeListAllocator};

// Set up the global allocator for WebAssembly
#[cfg(target_arch = "wasm32")]
#[global_allocator]
static ALLOCATOR: AssumeSingleThreaded<FreeListAllocator> =
    unsafe { AssumeSingleThreaded::new(FreeListAllocator::new()) };

pub async fn scrape(url: &str) -> Result<Vec<HashMap<String, String>>, String> {
    let client = Client::new();

    // Use allorigins as a CORS proxy
    let proxy_base = "https://api.allorigins.win/raw?url=";
    let target_url = if url.is_empty() {
        format!("{}https://website.com/", proxy_base)
    } else {
        format!("{}https://website.com/{}", proxy_base, url)
    };
    println!("Scraping: {}", target_url);

    // Set a size limit for the response
    let res = match client.get(&target_url).send().await {
        Ok(response) => {
            // Check response size before processing
            let content_length = response.content_length().unwrap_or(0);
            if content_length > 5_000_000 {
                // 5 MB limit
                return Err("Response too large".to_string());
            }
            response
        }
        Err(e) => return Err(format!("Failed to send request: {}", e)),
    };

    let body = match res.text().await {
        Ok(text) => text,
        Err(e) => return Err(format!("Failed to get response body: {}", e)),
    };

    // Process in chunks if document is large
    let document = Html::parse_document(&body);

    // More specific selector to reduce memory usage
    let links_selector = match Selector::parse("div.content > a") {
        Ok(selector) => selector,
        Err(e) => return Err(format!("Invalid selector: {}", e)),
    };

    // Limit result size
    let mut results: Vec<HashMap<String, String>> = Vec::with_capacity(100);

    // Process limited number of links
    for (i, link) in document.select(&links_selector).enumerate() {
        if i >= 100 {
            // Limit to 100 results max
            break;
        }
        let mut item = HashMap::new();
        // Extract only what's needed
        if let Some(href) = link.value().attr("href") {
            item.insert("href".to_string(), href.to_string());
        }
        // Avoid collecting all text nodes into a vector first
        let text = link.text().collect::<String>().trim().to_string();
        item.insert("title".to_string(), text);
        results.push(item);
    }

    Ok(results)
}
```
ui/src/favorites.rs:

```rust
use dioxus::prelude::*;
use serde_json;
use serde::Deserialize;
use std::fs::File;
use std::io::{BufReader, Read};
use crate::logic::scrape::scrape;
use std::collections::HashMap;
use wasm_bindgen_futures::spawn_local;

const FAVORITES_CSS: Asset = asset!("./assets/styling/favorites.css");

#[derive(Deserialize, PartialEq)]
struct Course {
    title: String,
    favorite: bool,
}

#[derive(Deserialize, PartialEq)]
struct CourseData {
    courses: Vec<Course>,
}

#[component]
pub fn FavoriteList() -> Element {
    let mut scraped_data = use_signal(|| Vec::<HashMap<String, String>>::new());

    use_effect(move || {
        to_owned![scraped_data];
        spawn_local(async move {
            match scrape("").await {
                Ok(data) => {
                    scraped_data.set(data);
                }
                Err(err) => {
                    scraped_data.set(vec![{
                        let mut map = HashMap::new();
                        map.insert("title".to_string(), err);
                        map
                    }]);
                }
            }
        });
    });

    rsx! {
        document::Link { rel: "stylesheet", href: FAVORITES_CSS }
        meta { name: "viewport", content: "width=device-width, initial-scale=1" }
        div {
            h2 {
                class: "favorites-title",
                "Favorites"
            }
            if !scraped_data.read().is_empty() {
                div {
                    class: "favorites-courses-list",
                    h3 { "Favorites:" }
                    for (index, item) in scraped_data.read().iter().enumerate() {
                        if let Some(title) = item.get("title") {
                            div {
                                class: "favorites-course",
                                svg {
                                    xmlns: "http://www.w3.org/2000/svg",
                                    view_box: "0 0 24 24",
                                    width: "24",
                                    height: "24",
                                    fill: "#FF0000",
                                    stroke: "none",
                                    path {
                                        d: "M12 21.35l-1.45-1.32C5.4 15.36 2 12.28 2 8.5 2 5.42 4.42 3 7.5 3c1.74 0 3.41.81 4.5 2.09C13.09 3.81 14.76 3 16.5 3 19.58 3 22 5.42 22 8.5c0 3.78-3.4 6.86-8.55 11.54L12 21.35z"
                                    }
                                }
                                p { "{title}" }
                                p {
                                    class: "favorites-clickable",
                                    ">"
                                }
                            }
                        }
                    }
                }
            }
        }
    }
}
```
mobile/src/main.rs:

```rust
use dioxus::html::completions::CompleteWithBraces::progress;
use dioxus::prelude::*;
use ui::{ProgressTracker, search_bar::search_bar, FavoriteList};

#[component]
pub fn Home() -> Element {
    rsx! {
        search_bar {},
        ProgressTracker { id: 1, progress: 5 },
        FavoriteList {}
    }
}
```
Screenshots
Phone:
Web:
Environment:
- Dioxus version: 0.6.3
- Rust version: 1.82.0
- OS info: Android 15 (Emulator), Windows 11 Pro 24H2
- App platform: web, android
> I tried adding various prints around the code, but when the component that will handle the scraping is called, none of the prints do anything
println! will not work on mobile or wasm. You need to use tracing::info! to log in Dioxus. None of the scraping logic is related to Dioxus. If you think that is what is failing here, it doesn't sound like a bug with Dioxus.
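As a minimal sketch of the suggested change: the `println!` in `scrape` would be swapped for `tracing::info!` (this assumes the `tracing` crate is added to Cargo.toml; recent Dioxus versions install a subscriber during launch, so the output should then show up in the `dx serve` console or logcat instead of being silently dropped). `build_target_url` below is a hypothetical helper, not in the original code, that just factors out the URL construction so it can be exercised on any platform:

```rust
// Hypothetical sketch of the logging fix. Inside scrape(), the line
//
//     println!("Scraping: {}", target_url);
//
// would become (requires `tracing` in Cargo.toml):
//
//     tracing::info!("Scraping: {}", target_url);
//
// The URL construction itself is plain std and can be factored out:
fn build_target_url(url: &str) -> String {
    // Use allorigins as a CORS proxy, as in the original code.
    let proxy_base = "https://api.allorigins.win/raw?url=";
    if url.is_empty() {
        format!("{}https://website.com/", proxy_base)
    } else {
        format!("{}https://website.com/{}", proxy_base, url)
    }
}
```

On Android the `tracing` output should then be visible via `adb logcat` or the `dx serve` console, which makes it possible to tell whether the request itself is failing.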