Dieter Weber comments

Results: 225 comments of Dieter Weber

Frameworks to distribute computation on many workers

From a brief look into Lustre, it doesn't seem to have the Hadoop functionality of keeping data directly on the client node to get massive collective IO bandwidth very cheaply. Lustre...

Frameworks to distribute computation on many workers

Essentially, it seems to boil down to two options for good scaling that each have pros and cons.

# HDF5, MPI and Lustre

++ Works on regular HPC clusters
++...

Frameworks to distribute computation on many workers

My feeling is that the first choice looks like the easier path in the short term, but we would start running into limits in the medium and long term when we...

Frameworks to distribute computation on many workers

An example of what working with Apache Spark et al. would look like: https://de.slideshare.net/KevinMader/interactive-scientific-image-analysis-using-spark

Frameworks to distribute computation on many workers

On a single node, dask is actually doing very well under good conditions: https://github.com/LiberTEM/LiberTEM/issues/14#issuecomment-369186198 The problems in hyperspy/hyperspy#1840 are not a fundamental dask issue. In general, the first step should...

Frameworks to distribute computation on many workers

Apache Spark as part of the Hadoop ecosystem looks like the system of choice for analytics and machine learning on very large datasets. From what we can see, it has...

Frameworks to distribute computation on many workers

Comparison of an Apache Spark solution with our [requirements](https://github.com/uellue/opixtem/wiki/Requirements): https://github.com/LiberTEM/LiberTEM/projects/1 For GUI and live acquisition, only the requirements that relate to data processing and storage are considered.