clickhouse-cpp

New feature needed: support coroutines with co_yield (C++20)

Open asialiugf opened this issue 2 years ago • 7 comments

    client.Select("SELECT id, name FROM default.numbers", [] (const Block& block) -> std::generator<Block>
        {
            for (size_t i = 0; i < block.GetRowCount(); ++i) {
                std::cout << block[0]->As<ColumnUInt64>()->At(i) << " "
                          << block[1]->As<ColumnString>()->At(i) << "\n";
                co_yield block;
            }
        }
    );

asialiugf, May 09 '23 03:05

Would you mind giving more context about what you need to implement?

ljluestc, May 21 '23 17:05

I have a websocket server event loop (a single thread for all users and all client sockets, just like Node.js). When the event loop receives a user's request, it selects data from ClickHouse and sends the response.

The single event loop runs again and again to serve the users' sockets. (I will call the iterations the first loop, the second loop, and so on.)

If one user's result set is huge, the other users are blocked, because there is only one event loop for all sockets.

So I hope coroutines can help me. For example: if the data returned from ClickHouse for one user is large, it can be split into several parts that are sent one by one. After the first part is sent back to the user in the first loop, the ClickHouse select is suspended with co_yield, and the event loop moves on to the other users' requests. When the first loop ends, the event loop runs the second loop, in which the select is resumed to send the next part of the data to this user, and so on until all parts have been sent.
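The interleaving described here can be sketched without coroutines, as a rough illustration only: each pending response keeps a cursor, and the single-threaded loop sends one part per turn before moving to the next user. `PendingResponse`, `SendNextPart` and `RunLoop` are hypothetical names, the "socket" is just a vector of strings, and nothing in this sketch is clickhouse-cpp API.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one user's pending reply: the rows to send
// and a cursor remembering where the previous turn stopped.
struct PendingResponse {
    std::string user;
    std::vector<std::string> rows;
    std::size_t next_row = 0;

    bool Done() const { return next_row >= rows.size(); }
};

// Send at most `part_size` rows, then return so other users get a turn.
// `sent` stands in for the socket: it records "user:row" in send order.
void SendNextPart(PendingResponse& r, std::size_t part_size,
                  std::vector<std::string>& sent) {
    for (std::size_t n = 0; n < part_size && !r.Done(); ++n) {
        sent.push_back(r.user + ":" + r.rows[r.next_row++]);
    }
}

// One thread, many responses: each turn handles one part of the response
// at the front of the queue and re-queues it if rows remain.
std::vector<std::string> RunLoop(std::deque<PendingResponse> queue,
                                 std::size_t part_size) {
    assert(part_size > 0);  // a zero-sized part would never make progress
    std::vector<std::string> sent;
    while (!queue.empty()) {
        PendingResponse r = std::move(queue.front());
        queue.pop_front();
        SendNextPart(r, part_size, sent);
        if (!r.Done()) {
            queue.push_back(std::move(r));  // resume on a later turn
        }
    }
    return sent;
}
```

With a large response queued before a small one and `part_size` set to 2, the small user is served after the first two rows of the large one instead of waiting for all of them. A coroutine with co_yield would replace the explicit `next_row` cursor with the suspended coroutine frame.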

With a coroutine generator, the handler function must have a coroutine return type: std::generator<Block>

But here the handler is a lambda passed to client.Select( {} ); and I don't know how to define the generator.

If you have time, you can refer to this: https://github.com/lewissbaker/generator. An example:

        int intn = 10;
        std::generator<int> g_int = [&]() -> std::generator<int> {
            for (int i = 0; i < intn; i++) co_yield i;
        }();
        size_t count_int = 0;
        auto it_int = g_int.begin();
        for (int i = 0; i < 30; ++i) {
            if (it_int == g_int.end()) break;
            decltype(auto) m_int = *it_int;
            it_int++;
            printf("%d ---- int !!\n", m_int);
        }

Thank you!

asialiugf, May 24 '23 12:05

A good idea

1261385937, Jun 10 '23 09:06

The checklist:

  1. An async interface is a large feature with a lot of networking code, so I plan to use standalone contrib/asio.
  2. Replace the present networking code (retaining the sync interfaces).
  3. Add async interfaces (query, insert and so on).
  4. Provide cluster interfaces (health detection, load-balanced writes, etc.) based on the asynchronous interfaces.
  5. Coroutine interfaces based on asio are cheap.

@Enmk Please give me your advice, thanks

1261385937, Jun 10 '23 10:06

Because a single request-response involves repeated network interaction, the async interfaces will use coroutines internally. So C++20 is necessary, and clickhouse-cpp will go to v3.0 😅

1261385937, Jun 12 '23 06:06

  • An async interface is a large feature with a lot of networking code, so I plan to use standalone contrib/asio.
  • Replace the present networking code (retaining the sync interfaces).
  • Add async interfaces (query, insert and so on).
  • Provide cluster interfaces (health detection, load-balanced writes, etc.) based on the asynchronous interfaces.
  • Coroutine interfaces based on asio are cheap.

I don't think that health checks (of CH server, I assume) and load balancing (aside from #310) should be implemented in this library.

Also, since one Client instance means one connection (Client effectively uses one socket), IMO having an async interface in the Client itself might be a high-effort, low-benefit endeavor.

Perhaps one might want to implement some sort of AsyncClient on top of the Client, using one Client per request, either in a throw-away manner or from a pool (it is important to re-use only Client instances that didn't raise exceptions). That also doesn't break existing user code and doesn't require a full rewrite of the networking code.
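A minimal sketch of the pool variant, under assumptions: `ClientPool`, `Run`, `IdleCount` and `FakeClient` are hypothetical names and not part of clickhouse-cpp. The pool is a template so the sketch compiles without the library; the idea is that a client goes back into the pool only when the request returned without throwing.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical pool: one client per request, reused only after success.
template <typename ClientT>
class ClientPool {
public:
    using Factory = std::function<std::unique_ptr<ClientT>()>;

    explicit ClientPool(Factory factory) : factory_(std::move(factory)) {}

    // Runs `request` on an idle client, or on a fresh one if none is idle.
    template <typename Request>
    void Run(Request&& request) {
        std::unique_ptr<ClientT> client = Acquire();
        request(*client);            // if this throws, `client` is destroyed
        Release(std::move(client));  // reached only when no exception escaped
    }

    std::size_t IdleCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return idle_.size();
    }

private:
    std::unique_ptr<ClientT> Acquire() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (!idle_.empty()) {
                std::unique_ptr<ClientT> client = std::move(idle_.back());
                idle_.pop_back();
                return client;
            }
        }
        std::unique_ptr<ClientT> fresh = factory_();  // created outside the lock
        assert(fresh != nullptr);  // the factory must return a client
        return fresh;
    }

    void Release(std::unique_ptr<ClientT> client) {
        std::lock_guard<std::mutex> lock(mutex_);
        idle_.push_back(std::move(client));
    }

    Factory factory_;
    mutable std::mutex mutex_;
    std::vector<std::unique_ptr<ClientT>> idle_;
};

// Hypothetical stand-in so the sketch can run without a ClickHouse server.
struct FakeClient {
    int id;
};
```

In real use the template argument would presumably be `clickhouse::Client`, with the factory constructing it from a `ClientOptions`, and `Run` could be dispatched from worker threads to get the asynchronous behaviour. That part is left out here because it depends on the application's threading model.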

Enmk, Jun 14 '23 11:06

Because a single request-response involves repeated network interaction, the async interfaces will use coroutines internally. So C++20 is necessary, and clickhouse-cpp will go to v3.0 😅

Please do not use asio coroutines or your own coroutines; instead, provide a callback API, which can be used with any coroutines.

kelbon, Nov 19 '24 17:11