Dieter Weber

225 comments by Dieter Weber

Integration work by @shabihsherjeel depends on this feature. If I understood correctly, the timeline is flexible for now? In any case, it is an important feature for the next release and should...

Discussion with @sk1p: We should also provide an interface to start a cluster or connect to one without user interaction for integrations like this.

Moved to the 0.10 milestone so that we get the executor and Dask integration work out first.

Requirements include the following points:

* Read bulk input data by opening a file on each node, rather than sending the data itself. We should only send task descriptors and...
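The first requirement can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not part of any existing API: the `TaskDescriptor` class, the helper names, the flat float64 file layout, and the use of threads as stand-ins for cluster nodes. The point is only that the scheduler hands out small descriptors, and each worker opens the file itself and reads just its own partition.

```python
import os
import tempfile
from array import array
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskDescriptor:
    """Small, cheap-to-send description of the work (hypothetical name)."""
    path: str   # file that the worker opens itself
    start: int  # index of the first item in the partition
    count: int  # number of items in the partition


def run_task(task: TaskDescriptor) -> float:
    """Runs on the worker: open the file locally and read only the partition."""
    items = array("d")
    with open(task.path, "rb") as f:
        f.seek(task.start * items.itemsize)
        items.fromfile(f, task.count)
    return sum(items)


def make_tasks(path: str, n_items: int, n_tasks: int) -> list:
    """Split n_items into contiguous partitions; no data is read here."""
    step = -(-n_items // n_tasks)  # ceiling division
    return [
        TaskDescriptor(path, start, min(step, n_items - start))
        for start in range(0, n_items, step)
    ]


def demo() -> float:
    # Write a small test file holding the numbers 0..999 as float64.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            array("d", range(1000)).tofile(f)
        tasks = make_tasks(path, n_items=1000, n_tasks=4)
        # Threads stand in for cluster nodes in this sketch.
        with ThreadPoolExecutor(max_workers=4) as pool:
            partials = list(pool.map(run_task, tasks))
        return sum(partials)
    finally:
        os.remove(path)
```

Calling `demo()` sums the partitions independently and combines the partial results. On a real cluster the descriptors would be serialized and sent over the network, which stays cheap regardless of how large the input file is.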

More requirements:

* Easy to deploy on a regular PC
* Reasonable to deploy on existing HPC clusters
* Development know-how available
* Reasonable to build a DIY cluster based...

# What is the problem, actually

Express a data-intensive algorithm in simple terms and then parallelize and execute it efficiently on a distributed-memory cluster with distributed, chunked storage. Two extremes:...

[Link to Spark machine learning](https://spark.apache.org/docs/2.2.0/ml-guide.html)

Requirement: Offer a good selection of optimized routines for linear algebra and machine learning, or a middleware that is exceptionally good at optimizing such operations when they are implemented at...

HDF5 in a Hadoop context: https://stackoverflow.com/questions/28565970/reading-hdf5-files-in-apache-spark

It seems difficult because Hadoop needs to connect the index range within the data set to offsets within the file in order to figure...
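To make the index-to-offset mapping concrete, here is a minimal sketch for the simplest possible case: a dataset stored contiguously, uncompressed and in C (row-major) order. The function name and the `data_offset` parameter are assumptions for illustration; `data_offset` stands for the byte position where the raw data starts, which would have to come from the file's metadata. Chunked or compressed HDF5 datasets do not follow this formula and require looking up each chunk in the file's chunk index instead.

```python
from math import prod


def row_range_to_byte_range(data_offset, shape, itemsize, row_start, row_stop):
    """Map rows [row_start, row_stop) of a contiguous C-order array
    to the half-open byte range [begin, end) within the file.

    data_offset: byte position of the first array element (assumed known)
    shape:       full shape of the dataset, e.g. (frames, height, width)
    itemsize:    size of one element in bytes
    """
    if not 0 <= row_start <= row_stop <= shape[0]:
        raise ValueError("row range outside of dataset")
    # All trailing dimensions together make up one "row" along axis 0.
    row_bytes = prod(shape[1:]) * itemsize
    begin = data_offset + row_start * row_bytes
    end = data_offset + row_stop * row_bytes
    return begin, end
```

For example, `row_range_to_byte_range(2048, (100, 128, 128), 4, 10, 20)` gives the byte range that holds frames 10 to 19 of a stack of 128×128 float32 frames whose data starts 2048 bytes into the file.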