Scrapegraph-ai
Scrapegraph-ai copied to clipboard
Prallel Execution of Nodes
Hey Folks,
this looks like a great project. I am not sure if its a bug or a feature.
Describe the bug
Executes the node's logic to instantiate and run multiple graph instances in parallel. Stated in here
To Reproduce Can someone explain to me, where the parallelism kicks in? As far as i can tell, it runs in a for loop:
# run the graph for each url and use tqdm for progress bar
graphs_answers = []
for graph in tqdm(graphs_instances, desc="Processing Graph Instances", disable=not self.verbose):
result = graph.run()
graphs_answers.append(result)
Expected behavior I would expect that the nodes are executed in parallel and joined back together for speedup .
hey @ChrisDelClea Chris, yes it is a feature to implement! Right now it is iterating over the SmartScraper instances so it is sequential but we could implement an async
/await
mechanism on the run method of the graphs
This would be great.
@PeriniM I might take care of it as soon as I'm done running tests for #147.
I would use a semaphore with a param specifying the number of concurrent graphs, not to exhaust too many resources.
@DiTo97 That would be sweet!
@PeriniM opened a PR; code should be working, but better if you test it first hand before merging
Closed with #213