
Client and server interaction has high latency.

Open Daniel-blue opened this issue 1 year ago • 11 comments

Describe your problem


During testing, we found that the interaction between the client and the server (vineyard instance) has a significant impact on latency, especially in scenarios involving multiple consecutive interactions. What is the purpose of breaking the process between the client and the server down into multiple interactions? Is there potential for improvement?

Daniel-blue avatar Nov 08 '24 01:11 Daniel-blue

Hi @Daniel-blue, thanks for the report.

How did you start vineyard? It would help to include your test code here so we can identify the source of the latency.

dashanji avatar Nov 16 '24 09:11 dashanji

put: create_buffer_request --> seal_request --> create_data_request --> persist_request --> put_name_request
get: get_name_request --> get_data_request --> get_buffers_request

(latencies measured in ms; result screenshots omitted)
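Each request in the put/get flows is a separate client-server round-trip, so the fixed protocol overhead grows linearly with the number of requests. A back-of-the-envelope sketch (the round-trip cost is an illustrative assumption, not a measurement):

```python
# Each request in the put/get flows is one client-server round-trip.
RTT_MS = 0.5  # assumed cost per round-trip over the IPC socket (illustrative)

PUT_REQUESTS = ["create_buffer", "seal", "create_data", "persist", "put_name"]
GET_REQUESTS = ["get_name", "get_data", "get_buffers"]

put_overhead = len(PUT_REQUESTS) * RTT_MS
get_overhead = len(GET_REQUESTS) * RTT_MS
print(f"put protocol overhead: {put_overhead} ms")  # put protocol overhead: 2.5 ms
print(f"get protocol overhead: {get_overhead} ms")  # get protocol overhead: 1.5 ms
```

Collapsing several of these steps into one request (or skipping persist/put_name when they are not needed) removes whole round-trips rather than merely shrinking each one.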

import datetime
import numpy as np
import vineyard

block_size_MB = 30

def calculate_time_difference(label, start_time, end_time):
    time_diff = (end_time - start_time).total_seconds() * 1000  # Convert to milliseconds
    print(f"{label}: {time_diff:.2f} ms")

def create_data(block_size_MB, num_blocks):
    bytes_per_mb = 1024**2
    num_elements = (block_size_MB * bytes_per_mb) // 8
    data = np.random.rand(num_elements * num_blocks)
    return data

t1 = datetime.datetime.now()
client = vineyard.connect("/var/run/vineyard.sock")
t2 = datetime.datetime.now()
calculate_time_difference("client connect", t1, t2)
t3 = datetime.datetime.now()
data = create_data(block_size_MB, 32)  
object_id1 = client.put(data, persist=True, name="obj1")
t4 = datetime.datetime.now()
calculate_time_difference("put", t3, t4)
client.status
client.clear()
client.close()
# second script: read the object back by name
import datetime
import numpy as np
import vineyard

def calculate_time_difference(label, start_time, end_time):
    time_diff = (end_time - start_time).total_seconds() * 1000  # Convert to milliseconds
    print(f"{label}: {time_diff:.2f} ms")

client = vineyard.connect("/var/run/vineyard.sock")
t9 = datetime.datetime.now()
object_ = client.get(name="obj1")  # look up by the name given at put time
t10 = datetime.datetime.now()
calculate_time_difference("get", t9, t10)
client.status
client.close()

Daniel-blue avatar Nov 17 '24 12:11 Daniel-blue

IPC, 4K scenario: 32 blocks of 30 MB; 8K scenario: 64 blocks of 30 MB

Pre-fill 0 data objects (get): result screenshot omitted

Pre-fill 10000 data objects (get): result screenshot omitted

import numpy as np
import hashlib
import vineyard
client = vineyard.connect('/var/run/vineyard.sock')

def create_block(size_MB):
    bytes_per_mb = 1024**2  
    num_elements = (size_MB * bytes_per_mb) // 8 
    
    block = np.random.rand(num_elements)
    return block

def hash_block(block):
    hasher = hashlib.sha256()
    hasher.update(block.tobytes())
    return hasher.hexdigest()

def process_blocks(num_blocks, block_size_MB):
    hash_list = []
    for _ in range(num_blocks):
        block = create_block(block_size_MB)
        block_hash = hash_block(block)
        client.put(block, name=block_hash, persist=True)
        hash_list.append(block_hash)
    return hash_list


block_size_MB = 30  
num_blocks_4k = 32  
num_blocks_8k = 64  

hash_list_4k = process_blocks(num_blocks_4k, block_size_MB)
print(f"4K scenario data hashes: {hash_list_4k}")

hash_list_8k = process_blocks(num_blocks_8k, block_size_MB)
print(f"8K scenario data hashes: {hash_list_8k}")

import time

def read_blocks(hash_list):
    start_time = time.time()
    for hash_value in hash_list:
        client.get(name=hash_value)
    end_time = time.time()
    elapsed_time = end_time - start_time
    return elapsed_time

time_4k = read_blocks(hash_list_4k)
print(f"Time to read 4K data: {time_4k} seconds")
time_8k = read_blocks(hash_list_8k)
print(f"Time to read 8K data: {time_8k} seconds")

Daniel-blue avatar Nov 18 '24 02:11 Daniel-blue

Hi @Daniel-blue, how did you start vineyardd? Could you please share the details?

dashanji avatar Nov 18 '24 02:11 dashanji

Have you started etcd?

dashanji avatar Nov 18 '24 02:11 dashanji

We deployed the Vineyard server and client following the guide at https://v6d.io/docs.html, then used kubectl exec to enter the client and ran the scripts with Python 3. We are using redis, and it works; all other components are running normally. (screenshot omitted)

Daniel-blue avatar Nov 18 '24 02:11 Daniel-blue

Hi @Daniel-blue. Thanks for the details. Basically, the latency comes from two parts: memory allocation in the vineyard server (put) or the vineyard client (get), and the metadata sync (persist / put_name).

For the first part, you can reduce memory allocation in the vineyard server by adding --reserve_memory=True when starting vineyardd. As for the vineyard client, we don't pre-allocate memory for vineyard objects at present.
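For example, a local vineyardd could be started with the flag mentioned above; a hypothetical invocation (only --reserve_memory comes from the advice here, the socket path and pool size are placeholders for your deployment):

```shell
# Reserve the whole shared-memory pool at startup so put() does not pay
# allocation cost on the hot path. Socket path and size are placeholders.
vineyardd --socket=/var/run/vineyard.sock \
          --size=8Gi \
          --reserve_memory=True
```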

For the second part, persist and put_name are turned into calls to the metadata service, which causes high latency. If your client and server run against the same vineyard instance, you can simply drop the persist and name options to reduce the latency. If you can make sure the objects will be put into a single vineyard instance, you can use a stream object to bypass the metadata sync, as in the following example. If your client and server are distributed, it may be possible to reduce the latency of get by putting multiple get operations into a single batch, with one metadata sync per batch.

import vineyard
import numpy as np
import time
from threading import Thread

from vineyard.io.recordbatch import RecordBatchStream

chunk_size = 1000

def stream_producer(vineyard_client):
    data = np.random.rand(10, 10).astype(np.float32)
    
    stream = RecordBatchStream.new(vineyard_client)
    vineyard_client.persist(stream.id)
    vineyard_client.put_name(stream.id, "stream11")
    chunk_list = []
    for _ in range(chunk_size):
        chunk_id = vineyard_client.put(data)
        chunk_list.append(chunk_id)
    start = time.time() 
    writer = stream.open_writer(vineyard_client)
    for chunk_id in chunk_list:
        writer.append(chunk_id)
    writer.finish()
 
    end = time.time()
    per_chunk = (end - start) / chunk_size
    print(f"Producer sent {chunk_size} chunks in {end - start:.5f} seconds, per chunk cost {per_chunk:.5f} seconds")

def stream_consumer(vineyard_client):
    start = time.time()
    
    stream_id = vineyard_client.get_name("stream11", wait=True)
    stream = vineyard_client.get(stream_id)
    reader = stream.open_reader(vineyard_client)
    
    count = 0
    while True:
        try:
            chunk_id = reader.next_chunk_id()
            # data = vineyard_client.get(chunk_id)
            count += 1
        except StopIteration:
            break
    
    end = time.time()
    per_chunk = (end - start) / chunk_size
    print(f"Consumer received {count} chunks in {end - start:.5f} seconds, per chunk cost {per_chunk:.5f} seconds")

if __name__ == "__main__":
    endpoint = "172.20.6.103:9600"
    rpc_client = vineyard.connect(endpoint=endpoint)

    producer_thread = Thread(target=stream_producer, args=(rpc_client,))
    producer_thread.start()
    producer_thread.join()

    consumer_thread = Thread(target=stream_consumer, args=(rpc_client,))
    consumer_thread.start()
    consumer_thread.join()
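To see why batching helps in the distributed case, here is a toy latency model in plain Python (no vineyard required; the millisecond costs are illustrative assumptions, not measurements):

```python
META_MS = 5.0  # assumed cost of one metadata-service sync (illustrative)
DATA_MS = 1.0  # assumed cost of transferring one object's payload (illustrative)

def individual_gets(n: int) -> float:
    # every get pays its own metadata sync plus the payload transfer
    return n * (META_MS + DATA_MS)

def batched_get(n: int) -> float:
    # one metadata sync amortized over the whole batch
    return META_MS + n * DATA_MS

n = 64
print(individual_gets(n))  # 384.0
print(batched_get(n))      # 69.0
```

The payload cost is unchanged; only the per-object metadata syncs are collapsed into one, which is exactly what "one metadata sync per batch" buys.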

dashanji avatar Nov 18 '24 03:11 dashanji

Would it be effective to merge the request flow into fewer round trips, and to change the ordered map used for names to an unordered_map? Do the vineyard client and server support concurrent put and get operations?

Daniel-blue avatar Nov 18 '24 06:11 Daniel-blue

Hi @Daniel-blue. Thanks for the details. Basically, the latency comes from two parts: memory allocation in the vineyard server (put) or the vineyard client (get), and the metadata sync (persist / put_name).

For the first part, you can reduce memory allocation in the vineyard server by adding --reserve_memory=True when starting vineyardd. As for the vineyard client, we don't pre-allocate memory for vineyard objects at present.

For the second part, persist and put_name are turned into calls to the metadata service, which causes high latency. If your client and server run against the same vineyard instance, you can simply drop the persist and name options to reduce the latency. If you can make sure the objects will be put into a single vineyard instance, you can use a stream object to bypass the metadata sync, as in the preceding example. If your client and server are distributed, it may be possible to reduce the latency of get by putting multiple get operations into a single batch, with one metadata sync per batch.

Our scenario is probably closest to the third case, where the client and server are distributed. Does 'putting multiple get operations into a single batch' mean that the metadata does not include the data object? How can this be done?

Daniel-blue avatar Nov 18 '24 07:11 Daniel-blue

Would it be effective to merge the request flow into fewer round trips, and to change the ordered map used for names to an unordered_map?

It's hard to say whether that would reduce latency by much.
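A quick sanity check supports this: even a tree-based (ordered) lookup over a large name table costs on the order of microseconds, while the latencies reported above are milliseconds, so the lookup structure is unlikely to dominate. A rough Python stand-in (dict for unordered_map, binary search over a sorted list for an ordered map; the table size is arbitrary):

```python
import bisect
import time

# 100k synthetic object names
names = [f"obj-{i:06d}" for i in range(100_000)]
table = {name: i for i, name in enumerate(names)}  # stands in for unordered_map
sorted_names = sorted(names)                       # stands in for an ordered map

def mean_lookup_us(fn, keys):
    # average lookup time in microseconds over the sample keys
    start = time.perf_counter()
    for k in keys:
        fn(k)
    return (time.perf_counter() - start) / len(keys) * 1e6

keys = names[::1000]  # 100 sample keys
hash_us = mean_lookup_us(lambda k: table[k], keys)
tree_us = mean_lookup_us(lambda k: bisect.bisect_left(sorted_names, k), keys)
print(f"hash lookup ~{hash_us:.3f} us, ordered lookup ~{tree_us:.3f} us")
```

Both come out far below a millisecond, so changing the map type cannot recover much of the observed per-request latency.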

Do the vineyard client and server support concurrent put and get operations?

Yes, you can try it from multiple threads.

Does 'putting multiple get operations into a single batch' mean that the metadata does not include the data object? How can this be done?

You can replace get_object with get_objects. Unfortunately, we haven't integrated it into get yet; that could be an enhancement for the future.

https://github.com/v6d-io/v6d/blob/main/python/vineyard/core/client.py#L600-L606

dashanji avatar Nov 18 '24 07:11 dashanji

/cc @sighingnow, this issue/PR has had no activity for a long time; please help review its status and assign people to work on it.

github-actions[bot] avatar Dec 12 '24 00:12 github-actions[bot]