hyperformer
Off-the-shelf generation from a trained hyperformer++
```python
from hyperformer.adapters import AdapterController, AutoAdapterConfig, MetaAdapterConfig
from hyperformer.third_party.models import T5Config, T5ForConditionalGeneration
from transformers import AutoTokenizer, set_seed
import os

os.environ["CUDA_VISIBLE_DEVICES"] = "0"
set_seed(42)

config = T5Config.from_pretrained('t5-3b', cache_dir="/local/nlpswordfish/tuhin/")
tokenizer = AutoTokenizer.from_pretrained('t5-3b', cache_dir="/local/nlpswordfish/tuhin/")
adapter_config = AutoAdapterConfig.get('meta-adapter')

#####################
# data_args / training_args / adapter_args come from the training setup and are not
# defined in this standalone script -- see the questions below.
adapter_config.input_dim = 1024
adapter_config.tasks = data_args.tasks
adapter_config.device = training_args.device
adapter_config.task_to_adapter = {task: adapter for task, adapter in zip(data_args.tasks, data_args.adapters)} if data_args.adapters is not None else None
adapter_config.task_to_embeddings = {task: embedding for task, embedding in zip(data_args.tasks, data_args.task_embeddings)} if data_args.task_embeddings is not None else None
######################

# Copy the remaining adapter hyper-parameters from the training arguments onto the adapter config.
extra_adapter_params = ("task_embedding_dim", "add_layer_norm_before_adapter", "add_layer_norm_after_adapter",
                        "reduction_factor", "hidden_dim", "non_linearity", "train_task_embeddings",
                        "projected_task_embedding_dim", "task_hidden_dim", "conditional_layer_norm",
                        "train_adapters_blocks", "unique_hyper_net", "unique_hyper_net_layer_norm",
                        "efficient_unique_hyper_net")
for p in extra_adapter_params:
    if hasattr(adapter_args, p) and hasattr(adapter_config, p):
        setattr(adapter_config, p, getattr(adapter_args, p))

model = T5ForConditionalGeneration.from_pretrained("/mnt/swordfish-datastore/tuhin/hyperformer++",
                                                   from_tf=False, config=config,
                                                   cache_dir="/local/nlpswordfish/tuhin/",
                                                   adapter_config=adapter_config)
model.cuda()

inputs = tokenizer.encode("it 's a charming and often affecting journey .", return_tensors="pt")
gen_kwargs = {"max_length": 256, "num_beams": 1}
gen_kwargs["task"] = "sst"
gen_kwargs["task_embedding"] = model.task_embedding_controller("sst") if (config.train_adapters and isinstance(adapter_config, MetaAdapterConfig)) else None
outputs = model.generate(input_ids=inputs.cuda(), **gen_kwargs)
answer = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Predicted output", answer)
```
@rabeehk can you help me with the required parameters inside the ##### block?
Do we need these for inference?
`adapter_config.tasks = data_args.tasks` and `adapter_config.device = training_args.device`
Hi @tuhinjubcse,
As far as I remember, `task_to_adapter` specifies a mapping from the inference tasks to the trained adapters. For instance, let's assume you train the model with adapters [X, Y, Z], so during training you have these tasks, and then you want to run inference for tasks [F, G]. Here I was specifying a list of how F and G should be mapped to the trained, available adapters. For instance, if we have:
data_args.tasks = [F, G]
data_args.adapters = [X, Z]
it means I am using the following adapters for each task during inference:
task_to_adapter = {F: X, G: Z}
Similarly, `task_to_embeddings` specifies which task embedding to use at inference time. Basically, I added these because the training and inference tasks can be different, and one needs to know the mapping between the inference tasks and the trained adapters/task embeddings.
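A minimal sketch with these hypothetical task/adapter names (the dict construction mirrors the `zip` comprehensions in the snippet above):

```python
# Hypothetical names: the model was trained with adapters "X", "Y", "Z";
# at inference we run tasks "F" and "G" on the trained adapters "X" and "Z".
tasks = ["F", "G"]            # data_args.tasks at inference time
adapters = ["X", "Z"]         # data_args.adapters: trained adapters to reuse
task_embeddings = ["X", "Z"]  # data_args.task_embeddings: trained embeddings to reuse

task_to_adapter = dict(zip(tasks, adapters))            # {"F": "X", "G": "Z"}
task_to_embeddings = dict(zip(tasks, task_embeddings))  # {"F": "X", "G": "Z"}
print(task_to_adapter, task_to_embeddings)
```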
Best,
Rabeeh
Yes, but in https://github.com/rabeehk/hyperformer/blob/main/hyperformer/configs/hyperformer%2B%2B.json we only pass `data_args.tasks`.
Do you know what the values should be for `data_args.adapters` and `data_args.task_embeddings`? It seems `data_args.adapters` is just an array of task names, but what is the value of `data_args.task_embeddings`, and how do I specify it?
Can you also tell me what value to use in
`adapter_config.input_dim = config.d_model`?
I also have another question: I tried modifying your config to add a dropout of 0.1, but I get

```
Traceback (most recent call last):
  File "./finetune_t5_trainer.py", line 311, in <module>
    assert hasattr(config, p), f"({config.__class__.__name__}) doesn't have a `{p}` attribute"
AssertionError: (T5Config) doesn't have a `dropout` attribute
```
Hi,
I only pass `data_args.tasks` because, by default, if one trains and tests on the same tasks (as in the samples I shared), one does not need to set `data_args.adapters` or `data_args.task_embeddings` in the code. These are only useful if one wants to train on some tasks and test on other tasks, and they would need to be set as lists of strings. If these values are not passed, default ones are set, I think here:
https://github.com/rabeehk/hyperformer/blob/cb218076b5598839a6ba3ea1fc4eedf8f414312c/hyperformer/adapters/adapter_controller.py#L22 and https://github.com/rabeehk/hyperformer/blob/cb218076b5598839a6ba3ea1fc4eedf8f414312c/hyperformer/adapters/adapter_utils.py#L70
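If I read those lines correctly, the fallback is an identity mapping from each task to an adapter/embedding of the same name; roughly like this sketch (not the exact repo code, hypothetical task names):

```python
# Rough sketch of the fallback, assuming an identity mapping; hypothetical task names.
tasks = ["mnli", "sst2", "qqp"]
task_to_adapter = None  # i.e. data_args.adapters was not passed

if task_to_adapter is None:
    task_to_adapter = {task: task for task in tasks}
print(task_to_adapter)  # {'mnli': 'mnli', 'sst2': 'sst2', 'qqp': 'qqp'}
```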
About the dropout, I do not have access to run the code right now, but to debug you can check the attribute names and make sure the attribute exists. Basically, you need to check the T5Config version used in this code and make sure the names are the same.
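For example, something like the following quick check (assuming the standard Hugging Face naming, where the T5 dropout field is `dropout_rate` rather than `dropout`; the fork used here may differ):

```python
# Inspect which dropout-related attributes the loaded T5Config actually exposes.
from hyperformer.third_party.models import T5Config

config = T5Config.from_pretrained("t5-3b")
print(hasattr(config, "dropout"))       # likely False
print(hasattr(config, "dropout_rate"))  # likely True for the standard HF T5Config
print([name for name in vars(config) if "drop" in name])
```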
`config.d_model` is the embedding dimension of the T5 model; small and large models have different embedding sizes.
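So rather than hard-coding 1024, you can read it from the loaded config (1024 matches t5-large/t5-3b; t5-small and t5-base use 512 and 768):

```python
# Set the adapter input dimension from the model config instead of hard-coding it.
from hyperformer.adapters import AutoAdapterConfig
from hyperformer.third_party.models import T5Config

config = T5Config.from_pretrained("t5-3b")
adapter_config = AutoAdapterConfig.get("meta-adapter")
adapter_config.input_dim = config.d_model  # 1024 for t5-3b
print(config.d_model)
```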
`task_embeddings` is just a list of strings, as defined here: https://github.com/rabeehk/hyperformer/blob/cb218076b5598839a6ba3ea1fc4eedf8f414312c/hyperformer/training_args.py#L134
OK, you are awesome. This seems like something that will be very useful for zero-shot inference :) where the train and test tasks are different.
You're very welcome.