Enrico Shippole
Add GooseAI integration, documentation, and tests. Hopefully, I am not missing anything.

Usage:

```python
import os

from langchain.llms import GooseAI
from langchain import PromptTemplate, LLMChain

os.environ["GOOSEAI_API_KEY"] = ""

llm = ...
```
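For reference, a minimal sketch of how the truncated usage snippet presumably continues; the prompt template and question are illustrative assumptions, not part of the PR:

```python
import os

from langchain.llms import GooseAI
from langchain import PromptTemplate, LLMChain

os.environ["GOOSEAI_API_KEY"] = ""  # set your GooseAI API key

# Instantiate the GooseAI LLM wrapper added in this PR
# (defaults are used here; model/parameters are assumptions).
llm = GooseAI()

# Hypothetical prompt for illustration.
template = "Question: {question}\n\nAnswer:"
prompt = PromptTemplate(template=template, input_variables=["question"])

chain = LLMChain(prompt=prompt, llm=llm)
print(chain.run("What is the capital of France?"))
```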
Add GooseAI, CerebriumAI, Petals, ForefrontAI
### System Info

```Shell
Accelerate config:

compute_environment: LOCAL_MACHINE
deepspeed_config: {}
distributed_type: MULTI_GPU
downcast_bf16: 'no'
fsdp_config: {}
gpu_ids: all
machine_rank: 0
main_process_ip: ''
main_process_port: ''
main_training_function: main
megatron_lm_config: {}
mixed_precision: bf16
...
```
### System Info

```Shell
compute_environment: LOCAL_MACHINE
deepspeed_config: {}
distributed_type: FSDP
downcast_bf16: 'no'
fsdp_config: {}
gpu_ids: all
machine_rank: 0
main_process_ip: ''
main_process_port: ''
main_training_function: main
megatron_lm_config: {}
mixed_precision: bf16
num_machines: 8
...
```
Hi all,

I was wondering if you could give any input on whether the standard PyTorch FSDP wrapper is compatible with Hugging Face `accelerate.prepare()`? For example:

```python
import torch
from accelerate...
```
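For context, a minimal sketch of the combination the question is asking about; the toy model, optimizer, and the assumption that the script runs under a distributed launcher (e.g. `torchrun` or `accelerate launch`) are mine, not from the original post:

```python
import torch
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from accelerate import Accelerator

# Assumes the script is launched with a distributed launcher
# (e.g. `accelerate launch`) so the process group is initialized.
accelerator = Accelerator()

model = nn.Linear(1024, 1024).to(accelerator.device)

# Wrap with the standard PyTorch FSDP wrapper first...
model = FSDP(model)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# ...then pass the already-wrapped model through accelerate.prepare(),
# which is exactly the compatibility being asked about.
model, optimizer = accelerator.prepare(model, optimizer)
```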
Hello,

Is there a proper way to handle INT8/UINT8 quantization? I am attempting to reproduce the functions below in order to quantize flash-attention with Triton.

```python
def quantize_to_int8(tensor, clip_max,...
```
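For reference, a minimal sketch of what such a pair of functions might look like; the symmetric linear scaling scheme and the remainder of the signature (which is truncated above) are my assumptions:

```python
import torch

def quantize_to_int8(tensor: torch.Tensor, clip_max: float):
    """Symmetric linear quantization of a float tensor into int8.

    Values are clipped to [-clip_max, clip_max] and mapped onto
    [-127, 127]; the scale is returned for dequantization.
    (Sketch only; the referenced code may differ.)
    """
    scale = clip_max / 127.0
    q = torch.clamp(tensor, -clip_max, clip_max) / scale
    return q.round().to(torch.int8), scale

def dequantize_from_int8(q: torch.Tensor, scale: float) -> torch.Tensor:
    """Inverse of quantize_to_int8, up to rounding error."""
    return q.to(torch.float32) * scale

# Usage example.
x = torch.randn(4, 4)
q, scale = quantize_to_int8(x, clip_max=3.0)
print(dequantize_from_int8(q, scale))
```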
**Short description**

```
raise ValueError(f'Shapes {shape1} and {shape2} must have the same rank')
ValueError: Failed to encode example: {'id': '7ba1e8f4261d3170fcf42e84a81dd749116fae95', 'title': 'Brain', 'context': 'Another approach to brain function is to...
```
Kosmos 2.5
Hello,

Thank you for all of your great research. I was wondering whether there are plans to release Kosmos 2.5 on Hugging Face, similar to how Kosmos 2 was released.

Thank you,
...
Hello @ShengdingHu,

Are you able to confirm whether Flash Attention will be compatible with OpenDelta LoRA? For example:

```python
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-1.4b")
tokenizer.pad_token = tokenizer.mask_token
model = GPTNeoXForCausalLM.from_pretrained("EleutherAI/pythia-1.4b")
max_positions...
```
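For completeness, a sketch of how the LoRA delta would presumably be attached with OpenDelta before swapping in a Flash Attention kernel; the choice of `"query_key_value"` as the target module (GPT-NeoX's fused attention projection) is an assumption on my part:

```python
from transformers import AutoTokenizer, GPTNeoXForCausalLM
from opendelta import LoraModel

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-1.4b")
tokenizer.pad_token = tokenizer.mask_token
model = GPTNeoXForCausalLM.from_pretrained("EleutherAI/pythia-1.4b")

# Attach LoRA via OpenDelta; targeting the fused attention projection
# "query_key_value" is an assumption about the intended modules.
delta_model = LoraModel(backbone_model=model, modified_modules=["query_key_value"])

# Train only the LoRA deltas, freezing the backbone.
delta_model.freeze_module(exclude=["deltas"])
delta_model.log()
```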
Hello,

I was wondering if you knew whether Apex's tensor parallelism is compatible with LoRA? Would tensor parallelism work for both the base model and the LoRA weights? I appreciate your time...