Question about development: How to introduce heterogeneous parallelism?
I'm an MPhil student doing research on GPU databases. I want to extend HeavyDB's execution engine to introduce heterogeneous parallelism and a new decision model. Here's what I want to achieve:
- Place a pipeline (execution unit) on the CPU or the GPU at execution time.
- Concurrently execute different pipelines.
I'm new to HeavyDB and have found the codebase quite complicated. Where should I start, and which components should I focus on?
Hi @TKONIY,
I think we did something similar a long time ago (running the same kernel on the CPU and the GPU to speed up a single query execution). Soon we'll release something that will significantly change the part of the software you want to contribute to, so it's better to wait. I'll get back to you when the code is in the master branch of the public repo.
P.S. I'm sorry for the huge delay in replying to your question.
Thanks! I'm looking forward to it.
Hi @TKONIY,
The commit related to the heterogeneous parallelism we were discussing in this thread should now have landed in the open-source repository.
The new code enables one or more queries to run on the CPU concurrently with one query on the GPU, so I think it's a good starting point for the project you were talking about.
You can find the commit here:
https://github.com/heavyai/heavydb/commit/57a1d07ae547ac93c0e2eade79a1d05974f8e363
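To make the policy described above concrete (one query on the GPU, one or more on the CPU at the same time), here is a minimal toy sketch in Python. It is not HeavyDB's code or API: `DeviceSlots`, `run_pipeline`, and the `cpu_slots` parameter are hypothetical names invented for illustration, and the "GPU if free, otherwise CPU" rule is only an assumed placeholder for a real decision model.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class DeviceSlots:
    """Hypothetical slot tracker: one GPU slot, several CPU slots."""

    def __init__(self, cpu_slots=4):
        self._gpu = threading.Lock()                     # at most one GPU query
        self._cpu = threading.BoundedSemaphore(cpu_slots)

    def acquire(self):
        # Assumed placement rule: take the GPU if it is free right now,
        # otherwise fall back to a CPU slot (blocking until one is free).
        if self._gpu.acquire(blocking=False):
            return "GPU"
        self._cpu.acquire()
        return "CPU"

    def release(self, device):
        if device == "GPU":
            self._gpu.release()
        else:
            self._cpu.release()


def run_pipeline(slots, name, work, log):
    """Place one pipeline on a device, run it, then free the slot."""
    device = slots.acquire()
    try:
        log.append((name, device))
        return work(device)
    finally:
        slots.release(device)


if __name__ == "__main__":
    slots = DeviceSlots(cpu_slots=2)
    log = []

    def fake_work(device):
        time.sleep(0.01)  # stand-in for real query execution
        return device

    with ThreadPoolExecutor(max_workers=4) as pool:
        for i in range(6):
            pool.submit(run_pipeline, slots, f"pipeline-{i}", fake_work, log)

    # The placement order depends on thread timing, so it varies per run.
    print(log)
```

A real decision model would replace the "GPU if free" rule in `acquire` with something cost-based (data size, memory availability, and so on); the slot bookkeeping around it would stay the same.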
Candido
I'm really happy that you still remember my issue. Thanks a lot! This is a very exciting feature for users. As far as I know, HeavyDB is the first GPU database to address the problem of "runtime resource availability". I think I've finally found a way into my research problem. I would like to keep track of its development and hope I can contribute.
Hi @TKONIY,
Well, I'm sorry it took so much time to release this segment of code, but we have been having some issues moving the code from the internal repository to the open-source one; in the end, this feature is in release 6.4.
I'm happy you are happy. I do my best to keep track of and help our fellow users; sometimes I forget someone, but I'll improve for sure (I think I should be a little more methodical) ;)
Let me know if you have trouble with the resource manager; it's a little tricky to test because CPU and GPU execution run concurrently, mainly when you use a cross-filtered dashboard.
Candido.