南栖 issues

Results: 28 issues of 南栖

[character_AI_open] Related work, with open-sourced code, data, and models

https://github.com/Minami-su/character_AI_open

Related Works

[character_AI_open] Related work with open-source code, data, and models, and it is two months ahead.

https://github.com/Minami-su/character_AI_open

Related Works

How to accelerate inference for a 1-bit + LoRA model

Inference is very slow: a 34B model with 1-bit + LoRA runs at about 1 token/s.

enhancement

When I used GaLore, the learning rate was set to 8e-6, but the learning rate reported during training was 0.001

```
import os
import sys
from typing import List

import fire
import torch
import transformers
from datasets import load_dataset
import os
# os.environ["NCCL_P2P_DISABLE"] = "1"
# os.environ["NCCL_IB_DISABLE"] = "1"
"""...
```

Support 3-bit QuIP# models.

As title.

Add Amara-o1-7B-Qwen and Amara-o2-7B-Qwen to AlpacaEval

Add the results of Amara-o1-7B-Qwen and Amara-o2-7B-Qwen to AlpacaEval 2.0. Amara-o* is a powerful language model that continuously enhances its capabilities through the integration of Monte Carlo Tree Search (MCTS) and...

Qwen3 support

    model = AutoAWQForCausalLM.from_pretrained(model_path, low_cpu_mem_usage=True, use_cache=False, device_map={"": "cpu"}, offload_folder="offload_folder", max_memory=memory_config)
  File "/data/jcxy/haolu/anaconda3/envs/haolu/lib/python3.10/site-packages/awq/models/auto.py", line 70, in from_pretrained
    model_type = check_and_get_model_type(
  File "/data/jcxy/haolu/anaconda3/envs/haolu/lib/python3.10/site-packages/awq/models/auto.py", line 48, in check_and_get_model_type
    raise TypeError(f"{config.model_type} isn't supported yet.")
TypeError: qwen3 isn't...

Add Semantic memory

This commit introduces a new implementation of the semantic memory module, replicating the original functionality using the GPT-4.1-mini model. The primary goal was to establish a performance baseline with the...