namespace-Pt
This is a weird issue. I'll try it myself very soon. Could you please share your training log? Does the loss look normal when training for more than one epoch?
I think it would work. Please report your result here if you give it a try :)
Hi, thanks for your interest.
- LLM-Embedder is able to encode the conversation context as the input query because it has been trained on QReCC, a conversational search dataset. You...
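For illustration, here is a minimal sketch of encoding a multi-turn conversation as a single retrieval query with a bi-encoder. The "role: text" flattening and the CLS pooling below are assumptions for the sketch, not the exact LLM-Embedder usage (check the model card for the official query instructions):

```python
# Sketch: embed a conversation as one query with a generic HF bi-encoder.
# The conversation-flattening format and CLS pooling are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BAAI/llm-embedder")
model = AutoModel.from_pretrained("BAAI/llm-embedder")
model.eval()

# Flatten the conversation history into one query string (hypothetical format).
conversation = [
    ("user", "Who proposed the QReCC dataset?"),
    ("assistant", "It is a conversational search benchmark."),
    ("user", "Can I use it to train a retriever?"),
]
query = " ".join(f"{role}: {text}" for role, text in conversation)

inputs = tokenizer(query, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state
# CLS pooling + L2 normalization, a common choice for BERT-style retrievers.
query_embedding = torch.nn.functional.normalize(hidden[:, 0], dim=-1)
print(query_embedding.shape)  # (1, hidden_size)
```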
Hi, maybe you need to further fine-tune the model to retrieve based on the user input.
Hi, that's odd; unsloth was able to use DDP two months ago :anguished:. Maybe you should wait for a newer version of unsloth, or use an LLM framework like Megatron for...
Try the following?
```bash
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node 1 -m main.train \
  --data_root /data/long-llm \
  --output_dir data/outputs/$output_name \
  --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct \
  --train_data long-llm:gpt/one_detail_book.train.64K.json \
    long-llm:gpt/one_detail_paper.train.64K.json \
    long-llm:gpt/multi_detail_book.train.json \
    long-llm:gpt/multi_detail_paper_short.train.json \
    long-llm:gpt/multi_detail_paper_long.train.json \
    long-llm:gpt/bio_book.train.json \
    long-llm:longalpaca/train.json \
    long-llm:redpajama/train.json[5000] \
    ...
```
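Note that `CUDA_VISIBLE_DEVICES=0` together with `--nproc_per_node 1` launches the run on a single GPU, which sidesteps the DDP issue above; if you later want data-parallel training, expose more devices and raise `--nproc_per_node` to match (standard torchrun behavior, independent of this repo).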
Currently no. You can try using [easy_context](https://github.com/jzhang38/EasyContext) with our training data.
Hi, those two checkpoints were deleted from the server a while ago and we never retrained them. You can train them yourself directly with the script~
Hi,
- All books are from the books3 subset of the Pile.
- All papers are from the arxiv subset of the Pile.
Hi, not at the moment. You can directly evaluate the checkpoint we released. That said, we will update the paper and model soon, with better results.