Xiao

Results: 58 comments of Xiao

I added a timer in `/llama_index/llms/huggingface/base.py`:

```
@llm_completion_callback()
def complete(
    self, prompt: str, formatted: bool = False, **kwargs: Any
) -> CompletionResponse:
    """Completion endpoint."""
    full_prompt = prompt

    def getlen(s):
        return...
```
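For reference, a minimal sketch of the kind of timing I mean, assuming a standard LlamaIndex `llm.complete()` call (the `timed_complete` helper and its print format are mine, not part of the library):

```
import time

def timed_complete(llm, prompt, **kwargs):
    # Hypothetical wrapper: measure wall-clock latency of a single completion call.
    start = time.perf_counter()
    response = llm.complete(prompt, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"complete() took {elapsed:.3f}s for a {len(prompt)}-char prompt")
    return response
```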

I found that if I re-run `query_engine.query(tmp_query)`, the latency is larger than that of a new query. Part of the log is below. My original query is `query*******:Based on the abstract of...

Does LlamaIndex support batch queries? Currently my code handles a single query; if I want to do batch queries, how should I do it?
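A minimal sketch of one way to fan queries out yourself, assuming `query_engine.query()` is safe to call concurrently in your setup (the `batch_query` helper is hypothetical, not a LlamaIndex API):

```
from concurrent.futures import ThreadPoolExecutor

def batch_query(query_engine, queries, max_workers=4):
    # Run each query through the same engine concurrently;
    # results come back in the same order as the input list.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(query_engine.query, queries))

responses = batch_query(query_engine, ["query A", "query B", "query C"])
```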

I tried https://github.com/hpcaitech/ColossalAI/tree/main/examples and ran into lots of problems. BTW, I am working on a benchmark and trying to implement GPT and Swin Transformer using SP+PP in ColossalAI. I'm...

> Actually, I am not an expert on SP and PP. I can help you contact the author of the SP paper. @FrankLeeeee can you help with this project?...

Then I changed `get_batch_for_sequence_parallel` to generate synthetic data:

```
def get_batch_for_sequence_parallel(data_iterator):
    global_rank = torch.distributed.get_rank()
    local_world_size = 1 if not gpc.is_initialized(ParallelMode.TENSOR) else gpc.get_world_size(ParallelMode.TENSOR)
    seq_length = gpc.config.SEQ_LENGTH
    dp_size = gpc.get_world_size(ParallelMode.DATA)
    global_batch_size = gpc.config.GLOBAL_BATCH_SIZE...
```
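Stripped of the ColossalAI plumbing, the synthetic-data part reduces to something like the sketch below (the function name, shapes, and `vocab_size` are illustrative; the real version reads its sizes from `gpc.config`):

```
import torch

def make_synthetic_batch(micro_batch_size, seq_length, vocab_size=50257, device="cuda"):
    # Stand-in micro-batch: random token ids, copied labels,
    # and an all-ones attention mask.
    tokens = torch.randint(0, vocab_size, (micro_batch_size, seq_length), device=device)
    labels = tokens.clone()
    attention_mask = torch.ones(micro_batch_size, seq_length, dtype=torch.bool, device=device)
    return tokens, labels, attention_mask
```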

I run this code on 1 GPU. When GLOBAL_BATCH_SIZE = 1 it can run, but when I set GLOBAL_BATCH_SIZE to 32, 64, 128 or bigger, it goes OOM...

This is our understanding of the config:

```
Consider no pipeline:
The meaning of global batch size: the total number of samples all GPUs process in one forward pass.
The meaning of...
```
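A small worked example of the no-pipeline reading above (numbers are illustrative):

```
# No pipeline: every data-parallel rank runs one forward on its local batch,
# so one global step sees dp_size * per_gpu_batch_size samples in total.
dp_size = 4              # data-parallel world size
per_gpu_batch_size = 16  # samples each GPU pushes through one forward
global_batch_size = dp_size * per_gpu_batch_size
assert global_batch_size == 64
```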

> This issue has been stale for a long time. Global batch size = data parallel size * num_micro_batch * micro_batch_size.

I get it. BTW, is there any mistake about...
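For completeness, the quoted relation with concrete (illustrative) numbers:

```
# Global batch size = data parallel size * num_micro_batch * micro_batch_size
data_parallel_size = 2
num_micro_batch = 8
micro_batch_size = 4
global_batch_size = data_parallel_size * num_micro_batch * micro_batch_size
assert global_batch_size == 64  # 2 * 8 * 4
```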