Question about development: How to introduce heterogeneous parallelism?

Open TKONIY opened this issue 3 years ago • 5 comments

I'm an MPhil student doing research on GPU databases. I want to extend heavydb's execution engine to introduce heterogeneous parallelism and a new decision model. Here's what I want to achieve.

  • Place each pipeline (execution unit) on the CPU or the GPU at execution time.
  • Execute different pipelines concurrently.

I'm new to heavydb and find the codebase quite complicated. Where should I start, and which components should I focus on?

TKONIY avatar Oct 26 '22 01:10 TKONIY

Hi @TKONIY,

I think that in the distant past we did something similar (running the same kernel on both the CPU and the GPU to speed up the execution of a single query). We will also soon release something that substantially changes the part of the software you want to contribute to, so it's better to wait. I'll get back to you when the code is in the master branch of the public repo.

P.S. I'm sorry for the long delay in replying to your question.

cdessanti avatar Nov 18 '22 07:11 cdessanti

Thanks! I'm looking forward to it.

TKONIY avatar Nov 18 '22 07:11 TKONIY

Hi @TKONIY,

The commit related to the heterogeneous parallelism we discussed in this thread should now have landed in the open-source repository.

The new code enables one or more queries to run concurrently on the CPU alongside one on the GPU, so I think it's a good starting point for the project you were talking about.

You can find the commit here:

https://github.com/heavyai/heavydb/commit/57a1d07ae547ac93c0e2eade79a1d05974f8e363
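
To make the behaviour described above concrete (one or more CPU queries running alongside a single GPU query), here is a minimal toy sketch. It is not HeavyDB's code and does not reflect the linked commit: `ToyResourceManager`, `fake_query`, and the slot counts are hypothetical names and values chosen for illustration, and the "queries" are just sleeps.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ToyResourceManager:
    """Hypothetical admission controller: several CPU slots, one GPU slot."""

    def __init__(self, cpu_slots=2, gpu_slots=1):
        # One counting semaphore per device caps how many queries run there.
        self._slots = {
            "cpu": threading.Semaphore(cpu_slots),
            "gpu": threading.Semaphore(gpu_slots),
        }
        self._lock = threading.Lock()
        self._running = {"cpu": 0, "gpu": 0}
        self.peak = {"cpu": 0, "gpu": 0}  # highest concurrency seen per device

    def run(self, device, query_fn):
        # Block until a slot on the requested device is free, then run.
        with self._slots[device]:
            with self._lock:
                self._running[device] += 1
                self.peak[device] = max(self.peak[device], self._running[device])
            try:
                return query_fn()
            finally:
                with self._lock:
                    self._running[device] -= 1


def fake_query(name, seconds=0.05):
    # Stand-in for real query execution (assumption: just sleeps).
    time.sleep(seconds)
    return name


def demo():
    rm = ToyResourceManager(cpu_slots=2, gpu_slots=1)
    jobs = [("gpu", "g1"), ("cpu", "c1"), ("cpu", "c2"), ("gpu", "g2"), ("cpu", "c3")]
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        futures = [
            pool.submit(rm.run, device, lambda n=name: fake_query(n))
            for device, name in jobs
        ]
        results = [f.result() for f in futures]
    return results, rm.peak


if __name__ == "__main__":
    results, peak = demo()
    print(results, peak)
```

In this sketch the GPU never runs more than one query at a time, while CPU queries proceed in parallel up to their own limit. A real scheduler would also have to account for memory and for deciding which device a query should go to, which this toy leaves out.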

Candido

cdessanti avatar Jan 16 '23 08:01 cdessanti

I'm really happy that you still remember my issue. Thanks a lot! This is a very exciting feature for users. As far as I know, HeavyDB is the first GPU database to address the problem of "runtime resource availability". I think I have finally found a way into my research problem. I would like to keep track of its development and hope I can contribute.

TKONIY avatar Jan 20 '23 18:01 TKONIY

Hi @TKONIY,

Well, I'm sorry it has taken so long to release this part of the code; we had some issues moving it from the internal repository to the open-source one. In the end, the feature is in version 6.4.

I'm happy you are happy. I do my best to keep track of and help our fellow users; sometimes I forget someone, but I'll improve for sure (I think I should be a little more methodical) ;)

Let me know if you have trouble with the resource manager; it's a little tricky to test because the CPU and GPU execute concurrently, mainly when you use a cross-filtered dashboard.

Candido.

cdessanti avatar Jan 23 '23 15:01 cdessanti