
questions about wordcount example

Open: Bran-Sun opened this issue 4 years ago • 1 comment

I wrote a WordCount example with your framework, shown below. It only processes a 17-line text file but takes 240s to finish on my computer. Why does it run so slowly?

use vega::io::*;
use vega::*;

fn main() -> Result<()> {
    let context = Context::new()?;

    let num_splits = 4;
    let deserializer = Fn!(|file: Vec<u8>| {
        String::from_utf8(file)
            .unwrap()
            .lines()
            .map(|s| s.to_string())
            .collect::<Vec<_>>()
    });
    let lines = context
        .read_source(LocalFsReaderConfig::new("./README.md"), deserializer)
        .flat_map(Fn!(|lines: Vec<String>| {
            Box::new(lines.into_iter()) as Box<dyn Iterator<Item = _>>
        }));

    let words = lines.flat_map(Fn!(|line: String| {
        Box::new(
            line.split(' ')
                .map(|s| (s.to_string(), 1))
                .collect::<Vec<_>>()
                .into_iter(),
        ) as Box<dyn Iterator<Item = _>>
    }));

    let result = words.reduce_by_key(Fn!(|(a, b)| a + b), num_splits);

    let output = result.collect().unwrap();

    println!("result: {:?}", output);

    Ok(())
}

Bran-Sun, Dec 17 '20 07:12

Hello, sorry for the very late reply. I took a break from maintaining the public branch of this library for a while, hence the delay.

240s doesn't seem right for an input this small. Can you provide more details, such as how you built and ran the program? Maybe the figure also includes the initial compilation time?

rajasekarv, May 10 '21 13:05
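
One way to check whether the 240s figure includes compilation is to time the job from inside the program, so that only the run itself is measured. The sketch below is a minimal illustration using only the standard library: `timed` and `word_count` are hypothetical helper names, and the in-memory `word_count` is a stand-in for the vega pipeline above, not part of vega's API. In the original program, the same `timed` wrapper could presumably be placed around the `collect()` call.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Runs `job` once and returns its result together with the elapsed wall-clock time.
/// (Hypothetical helper, not part of vega.)
fn timed<T>(job: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = job();
    (out, start.elapsed())
}

/// Stand-in for the distributed pipeline: counts whitespace-separated words in memory.
fn word_count(text: &str) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        *counts.entry(word.to_string()).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let text = "a b a\nc b a";
    // Only the closure is timed; compilation happened before the binary started.
    let (counts, elapsed) = timed(|| word_count(text));
    println!("result: {:?}", counts);
    println!("job took {:?}", elapsed);
}
```

Running `cargo build --release` first and then timing a separate `cargo run --release` also keeps the first-time dependency build out of the measurement, since the second command only has to start the already-built binary.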