TokenGT
Model description
Adding the TokenGT graph transformer model with @Raman-Kumar (see Graphormer issue)
@Raman-Kumar I'll create a PR with what I had ported of TokenGT at the end of the week, to give you a starting point! You'll need to read this first, to get an idea of the steps we follow when integrating a model. Then, the first step will be checking the code against a checkpoint, so you need to look for one and download it, to compare results with the original implementation. Does that work for you?
Open source status
- [X] The model implementation is available
- [X] The model weights are available
Provide useful links for the implementation
No response
@clefourrier For sure, that will work.
Thanks for assigning me, @clefourrier 😊 I am still examining and experimenting more...
Ping me if you need help! :smile:
😢 I nearly gave up figuring things out on my own, given my level: I was not familiar with the transformer architecture, collators, etc., or with other models like BERT. Now I have studied them, along with the TokenGT model's theoretical aspects.
I have downloaded the checkpoint folder from the Drive link in the original repo.
Now I have to run both the PR code with the checkpoint and the original repo.
Can you share the script you used for Graphormer? @clefourrier
Ok so you will need to do something similar to this:
import argparse
import os, sys
from pathlib import Path
import torch
from torch import nn
from torch.hub import load_state_dict_from_url
# Here, you need to import the transformers version of the TokenGT code (from the PR)
from transformers import (
AutoModel,
GraphormerConfig,
GraphormerForGraphClassification,
GraphormerModel,
# GraphormerCollator
)
from transformers.utils import logging
from transformers.models.graphormer.collating_graphormer import preprocess_item, GraphormerDataCollator
# Here, you need to import the original TokenGT code instead of Graphormer
sys.path.append("path to Graphormer/")
import graphormer
import graphormer.tasks.graph_prediction
import graphormer.models.graphormer
from graphormer.evaluate.evaluate import convert_namespace_to_omegaconf, tasks, options
from fairseq import utils
from fairseq.logging import progress_bar
# You will likely have to change some of these depending on the error messages you get when loading the checkpoint to transformers format
rename_keys = [
("encoder.lm_output_learned_bias", "classifier.lm_output_learned_bias"),
("encoder.embed_out.weight", "classifier.classifier.weight"),
#("encoder.embed_out.weight", "classifier.embed_out.weight"),
#("encoder.embed_out.bias", "classifier.embed_out.bias"),
]
def remove_ignore_keys_(state_dict):
ignore_keys = [
"encoder.version",
"decoder.version",
"encoder.masked_lm_pooler.bias", # to check
"encoder.masked_lm_pooler.weight", # to check
"_float_tensor",
]
for k in ignore_keys:
state_dict.pop(k, None)
def rename_key(dct, old, new):
val = dct.pop(old)
dct[new] = val
def make_linear_from_emb(emb):
vocab_size, emb_size = emb.weight.shape
lin_layer = nn.Linear(vocab_size, emb_size, bias=False)
lin_layer.weight.data = emb.weight.data
return lin_layer
# In this section, you need to replace calls to Graphormer by calls to TokenGT models.
# Graphormer model gets replaced by the original TokenGT model
# Transformers model gets replaced by the format in Transformers
@torch.no_grad()
def convert_graphormer_checkpoint(
args, checkpoint_name, pytorch_dump_folder_path
):
pytorch_dump_folder_path = f"{pytorch_dump_folder_path}/{checkpoint_name}"
cfg = convert_namespace_to_omegaconf(args)
task = tasks.setup_task(cfg.task)
# Graphormer model
graphormer_model = task.build_model(cfg.model)
graphormer_state = torch.load(checkpoint_name)["model"]
graphormer_model.load_state_dict(graphormer_state, strict=True, model_cfg=cfg.model)
graphormer_model.upgrade_state_dict(graphormer_model.state_dict())
# Transformers model
config = GraphormerConfig(
num_labels=1,
share_input_output_embed=False,
num_layers=12,
embedding_dim=768,
ffn_embedding_dim=768,
num_attention_heads=32,
dropout=0.0,
attention_dropout=0.1,
activation_dropout=0.1,
encoder_normalize_before=True,
pre_layernorm=False,
apply_graphormer_init=True,
activation_fn="gelu",
no_token_positional_embeddings=False,
)
transformers_model = GraphormerForGraphClassification(config)
# We copy the state dictionary from the original model to our format
state_dict = graphormer_model.state_dict()
remove_ignore_keys_(state_dict)
for src, dest in rename_keys:
rename_key(state_dict, src, dest)
transformers_model.load_state_dict(state_dict)
# Check results
graphormer_model.eval()
transformers_model.eval()
split = args.split
task.load_dataset(split)
batch_iterator = task.get_batch_iterator(
dataset=task.dataset(split),
max_tokens=cfg.dataset.max_tokens_valid,
max_sentences=2, #cfg.dataset.batch_size_valid,
max_positions=utils.resolve_max_positions(
task.max_positions(),
graphormer_model.max_positions(),
),
ignore_invalid_inputs=cfg.dataset.skip_invalid_size_inputs_valid_test,
required_batch_size_multiple=cfg.dataset.required_batch_size_multiple,
seed=cfg.common.seed,
num_workers=cfg.dataset.num_workers,
epoch=0,
data_buffer_size=cfg.dataset.data_buffer_size,
disable_iterator_cache=False,
)
itr = batch_iterator.next_epoch_itr(
shuffle=False, set_dataset_epoch=False
)
progress = progress_bar.progress_bar(
itr,
log_format=cfg.common.log_format,
log_interval=cfg.common.log_interval,
default_log_format=("tqdm" if not cfg.common.no_progress_bar else "simple")
)
# Inference
collator = GraphormerDataCollator() #on_the_fly_processing=True)
ys_graphormer = []
ys_transformers = []
with torch.no_grad():
for i, sample in enumerate(progress):
y_graphormer = graphormer_model(**sample["net_input"])[:, 0, :].reshape(-1)
ys_graphormer.extend(y_graphormer.detach())
#print(sample["net_input"]["batched_data"])
transformer_sample = sample["net_input"]["batched_data"] # data is already collated - collator(sample["net_input"]["batched_data"])
transformer_sample.pop("idx")
transformer_sample["labels"] = transformer_sample.pop("y")
transformer_sample["node_input"] = transformer_sample.pop("x")
torch.set_printoptions(profile="full")
y_transformer = transformers_model(**transformer_sample)["logits"] #[:, 0, :].reshape(-1)
ys_transformers.extend(y_transformer.detach())
ys_graphormer = torch.stack(ys_graphormer)
ys_transformers = torch.stack(ys_transformers).squeeze(-1)
assert ys_graphormer.shape == ys_transformers.shape
assert (ys_graphormer == ys_transformers).all().item()
print("All good :)")
Path(pytorch_dump_folder_path).mkdir(exist_ok=True)
transformers_model.save_pretrained(pytorch_dump_folder_path)
transformers_model.push_to_hub(checkpoint_name, use_auth_token="replace by your token")
if __name__ == "__main__":
parser = options.get_training_parser()
# Required parameters
parser.add_argument(
"--checkpoint_name",
type=str,
help="name of a model to load", # path to a model.pt on local filesystem."
)
parser.add_argument(
"--pytorch_dump_folder_path",
default=None,
type=str,
help="Path to the output PyTorch model.",
)
parser.add_argument(
"--split",
type=str,
)
parser.add_argument(
"--metric",
type=str,
)
args = options.parse_args_and_arch(parser, modify_parser=None)
print(args)
#args = parser.parse_args()
convert_graphormer_checkpoint(
args,
args.checkpoint_name,
args.pytorch_dump_folder_path,
)
I am new to deep learning, and I am using a MacBook Air M1.
While running pip install -e ".[dev]" for the transformers repo, I get an error related to TensorFlow, so I am using pip install -e ".[dev-torch]" instead, which works fine.
What argument list do you supply when running the above script for Graphormer? @clefourrier
Hi @Raman-Kumar! I don't think the tensorflow error is very important atm, don't worry :smile:
Here is my argument list: --checkpoint_name Name_of_the_checkpoint_you_downloaded_for_tokenGT --pytorch_dump_folder_path tmp --user-dir "Directory where you cloned the code from the TokenGT repository" --num-workers 16 --ddp-backend=legacy_ddp --dataset-name MUTAG_0 --user-data-dir "custom_datasets" --task graph_prediction --criterion l1_loss --arch graphormer_base --num-classes 1 --batch-size 64 --pretrained-model-name pcqm4mv1_graphormer_base --load-pretrained-model-output-layer --split valid --seed 1
From ddp-backend on, you will need to adapt the parameters to launch one of the available datasets in TokenGT, or you could add a custom_datasets loader in tokengt/data/predict_custom.
For the latter, I think there is a sample script, but if not you can take inspiration from this, which loads MUTAG from the hub to load it in TokenGT:
from datasets import load_dataset
from tokengt.data import register_dataset
from tokengt.data.pyg_datasets.pyg_dataset import TokenGTPYGDataset
import torch
from torch_geometric.data import Data, Dataset, InMemoryDataset
import numpy as np
class TmpDataset(InMemoryDataset):
def __init__(self, root, data_list):
self.data_list = data_list
super().__init__(root, None, None, None)
@property
def raw_file_names(self):
return []
@property
def processed_file_names(self):
return ["data.pt"]
def len(self):
return len(self.data_list)
def get(self, idx):
data = self.data_list[idx]
return data
def create_customized_dataset(dataset_name, ix_xval):
graphs_dataset = load_dataset(f"graphs-datasets/{dataset_name}")
graphs_dataset = graphs_dataset.shuffle(0)
key = "full" if "full" in graphs_dataset.keys() else "train"
graphs_list = [
Data(
**{
"edge_index": torch.tensor(graph["edge_index"], dtype=torch.long),
"y": torch.tensor(graph["y"], dtype=torch.long),
"num_nodes": graph["num_nodes"],
#"x": torch.ones(graph["num_nodes"], 1, dtype=torch.long), # same embedding for all
#"edge_attr": torch.ones(len(graph["edge_index"][0]), 1, dtype=torch.long), # same embedding for all
"x": torch.tensor(graph["node_feat"], dtype=torch.long) if "node_feat" in graph.keys() else torch.ones(graph["num_nodes"], 1, dtype=torch.long), # same embedding for all
"edge_attr": torch.tensor(graph["edge_attr"], dtype=torch.long) if "edge_attr" in graph.keys() else torch.ones(len(graph["edge_index"][0]), 1, dtype=torch.long), # same embedding for all
}
)
for graph in graphs_dataset[key]
]
len_dataset = len(graphs_dataset[key])
len_xval_batch = int(len_dataset / 10)
cur_val_range_int = list(range(ix_xval * len_xval_batch, (ix_xval + 1) * len_xval_batch))
cur_val_range = np.array(cur_val_range_int, dtype=np.int64)
cur_train_range = np.array(
[ix for ix in range(len_dataset) if ix not in cur_val_range_int], dtype=np.int64
)
dataset = TmpDataset("", graphs_list)
return {
"dataset": TokenGTPYGDataset(
dataset=dataset,
seed=0,
train_idx=torch.tensor([0]), #cur_train_range),
valid_idx=torch.tensor(cur_val_range),
test_idx=torch.tensor(cur_val_range),
),
"source": "pyg",
"train_idx":torch.tensor(cur_train_range),
"valid_idx":torch.tensor(cur_val_range),
"test_idx":torch.tensor(cur_val_range),
}
@register_dataset("MUTAG_0")
def m0():
return create_customized_dataset("MUTAG", 0)
Tell me if anything is unclear! :hugs:
Right now I am running this script
script.py
import argparse
import os, sys
from pathlib import Path
import torch
from torch import nn
from torch.hub import load_state_dict_from_url
from transformers.utils import logging
import tokengt
import tokengt.tasks.graph_prediction
import tokengt.models.tokengt
from tokengt.evaluate.evaluate import convert_namespace_to_omegaconf, tasks, options
from fairseq import utils
from fairseq.logging import progress_bar
@torch.no_grad()
def convert_tokengt_checkpoint(
args, checkpoint_name, pytorch_dump_folder_path
):
pytorch_dump_folder_path = f"{pytorch_dump_folder_path}/{checkpoint_name}"
cfg = convert_namespace_to_omegaconf(args)
# task = tasks.setup_task(cfg.task)
if __name__ == "__main__":
parser = options.get_training_parser()
# Required parameters
parser.add_argument(
"--checkpoint_name",
type=str,
help="name of a model to load", # path to a model.pt on local filesystem."
)
parser.add_argument(
"--pytorch_dump_folder_path",
default=None,
type=str,
help="Path to the output PyTorch model.",
)
parser.add_argument(
"--split",
type=str,
)
parser.add_argument(
"--metric",
type=str,
)
args = options.parse_args_and_arch(parser, modify_parser=None)
print(args.pytorch_dump_folder_path)
args = parser.parse_args()
convert_tokengt_checkpoint(
args,
args.checkpoint_name,
args.pytorch_dump_folder_path,
)
with command
.....script.py --checkpoint_name pcqv2-tokengt-orf64-trained --pytorch_dump_folder_path tmp --user-dir "../tokengt" --num-workers 16 --ddp-backend=legacy_ddp --dataset-name PCQM4Mv2 --user-data-dir "tokengt/data/ogb_datasets" --task graph_prediction --criterion l1_loss --arch tokengt_base --num-classes 1 --batch-size 64 --pretrained-model-name mytokengt --load-pretrained-model-output-layer --split valid --seed 1
In the line cfg = convert_namespace_to_omegaconf(args), I am getting this error:
2023-02-09 13:05:21 | ERROR | fairseq.dataclass.utils | Error when composing. Overrides: ['common.no_progress_bar=False', 'common.log_interval=100', 'common.log_format=null', 'common.log_file=null', 'common.aim_repo=null', 'common.aim_run_hash=null', 'common.tensorboard_logdir=null', 'common.wandb_project=null', 'common.azureml_logging=False', 'common.seed=1', 'common.cpu=False', 'common.tpu=False', 'common.bf16=False', 'common.memory_efficient_bf16=False', 'common.fp16=False', 'common.memory_efficient_fp16=False', 'common.fp16_no_flatten_grads=False', 'common.fp16_init_scale=128', 'common.fp16_scale_window=null', 'common.fp16_scale_tolerance=0.0', 'common.on_cpu_convert_precision=False', 'common.min_loss_scale=0.0001', 'common.threshold_loss_scale=null', 'common.amp=False', 'common.amp_batch_retries=2', 'common.amp_init_scale=128', 'common.amp_scale_window=null', "common.user_dir='../tokengt'", 'common.empty_cache_freq=0', 'common.all_gather_list_size=16384', 'common.model_parallel_size=1', 'common.quantization_config_path=null', 'common.profile=False', 'common.reset_logging=False', 'common.suppress_crashes=False', 'common.use_plasma_view=False', "common.plasma_path='/tmp/plasma'", 'common_eval.path=null', 'common_eval.post_process=null', 'common_eval.quiet=False', "common_eval.model_overrides='{}'", 'common_eval.results_path=null', 'distributed_training.distributed_world_size=1', 'distributed_training.distributed_num_procs=1', 'distributed_training.distributed_rank=0', "distributed_training.distributed_backend='nccl'", 'distributed_training.distributed_init_method=null', 'distributed_training.distributed_port=-1', 'distributed_training.device_id=0', 'distributed_training.distributed_no_spawn=False', "distributed_training.ddp_backend='legacy_ddp'", "distributed_training.ddp_comm_hook='none'", 'distributed_training.bucket_cap_mb=25', 'distributed_training.fix_batches_to_gpus=False', 'distributed_training.find_unused_parameters=False', 'distributed_training.gradient_as_bucket_view=False', 'distributed_training.fast_stat_sync=False', 'distributed_training.heartbeat_timeout=-1', 'distributed_training.broadcast_buffers=False', 'distributed_training.slowmo_momentum=null', "distributed_training.slowmo_base_algorithm='localsgd'", 'distributed_training.localsgd_frequency=3', 'distributed_training.nprocs_per_node=1', 'distributed_training.pipeline_model_parallel=False', 'distributed_training.pipeline_balance=null', 'distributed_training.pipeline_devices=null', 'distributed_training.pipeline_chunks=0', 'distributed_training.pipeline_encoder_balance=null', 'distributed_training.pipeline_encoder_devices=null', 'distributed_training.pipeline_decoder_balance=null', 'distributed_training.pipeline_decoder_devices=null', "distributed_training.pipeline_checkpoint='never'", "distributed_training.zero_sharding='none'", 'distributed_training.fp16=False', 'distributed_training.memory_efficient_fp16=False', 'distributed_training.tpu=False', 'distributed_training.no_reshard_after_forward=False', 'distributed_training.fp32_reduce_scatter=False', 'distributed_training.cpu_offload=False', 'distributed_training.use_sharded_state=False', 'distributed_training.not_fsdp_flatten_parameters=False', 'dataset.num_workers=16', 'dataset.skip_invalid_size_inputs_valid_test=False', 'dataset.max_tokens=null', 'dataset.batch_size=64', 'dataset.required_batch_size_multiple=8', 'dataset.required_seq_len_multiple=1', 'dataset.dataset_impl=null', 'dataset.data_buffer_size=10', "dataset.train_subset='train'", 
"dataset.valid_subset='valid'", 'dataset.combine_valid_subsets=null', 'dataset.ignore_unused_valid_subsets=False', 'dataset.validate_interval=1', 'dataset.validate_interval_updates=0', 'dataset.validate_after_updates=0', 'dataset.fixed_validation_seed=null', 'dataset.disable_validation=False', 'dataset.max_tokens_valid=null', 'dataset.batch_size_valid=null', 'dataset.max_valid_steps=null', 'dataset.curriculum=0', "dataset.gen_subset='test'", 'dataset.num_shards=1', 'dataset.shard_id=0', 'dataset.grouped_shuffling=False', 'dataset.update_epoch_batch_itr=null', 'dataset.update_ordered_indices_seed=False', 'optimization.max_epoch=0', 'optimization.max_update=0', 'optimization.stop_time_hours=0.0', 'optimization.clip_norm=0.0', 'optimization.sentence_avg=False', 'optimization.update_freq=[1]', 'optimization.lr=[0.25]', 'optimization.stop_min_lr=-1.0', 'optimization.use_bmuf=False', 'optimization.skip_remainder_batch=False', "checkpoint.save_dir='checkpoints'", "checkpoint.restore_file='checkpoint_last.pt'", 'checkpoint.continue_once=null', 'checkpoint.finetune_from_model=null', 'checkpoint.reset_dataloader=False', 'checkpoint.reset_lr_scheduler=False', 'checkpoint.reset_meters=False', 'checkpoint.reset_optimizer=False', "checkpoint.optimizer_overrides='{}'", 'checkpoint.save_interval=1', 'checkpoint.save_interval_updates=0', 'checkpoint.keep_interval_updates=-1', 'checkpoint.keep_interval_updates_pattern=-1', 'checkpoint.keep_last_epochs=-1', 'checkpoint.keep_best_checkpoints=-1', 'checkpoint.no_save=False', 'checkpoint.no_epoch_checkpoints=False', 'checkpoint.no_last_checkpoints=False', 'checkpoint.no_save_optimizer_state=False', "checkpoint.best_checkpoint_metric='loss'", 'checkpoint.maximize_best_checkpoint_metric=False', 'checkpoint.patience=-1', "checkpoint.checkpoint_suffix=''", 'checkpoint.checkpoint_shard_count=1', 'checkpoint.load_checkpoint_on_all_dp_ranks=False', 'checkpoint.write_checkpoints_asynchronously=False', 'checkpoint.model_parallel_size=1', 'bmuf.block_lr=1.0', 'bmuf.block_momentum=0.875', 'bmuf.global_sync_iter=50', 'bmuf.warmup_iterations=500', 'bmuf.use_nbm=False', 'bmuf.average_sync=False', 'bmuf.distributed_world_size=1', 'generation.beam=5', 'generation.nbest=1', 'generation.max_len_a=0.0', 'generation.max_len_b=200', 'generation.min_len=1', 'generation.match_source_len=False', 'generation.unnormalized=False', 'generation.no_early_stop=False', 'generation.no_beamable_mm=False', 'generation.lenpen=1.0', 'generation.unkpen=0.0', 'generation.replace_unk=null', 'generation.sacrebleu=False', 'generation.score_reference=False', 'generation.prefix_size=0', 'generation.no_repeat_ngram_size=0', 'generation.sampling=False', 'generation.sampling_topk=-1', 'generation.sampling_topp=-1.0', 'generation.constraints=null', 'generation.temperature=1.0', 'generation.diverse_beam_groups=-1', 'generation.diverse_beam_strength=0.5', 'generation.diversity_rate=-1.0', 'generation.print_alignment=null', 'generation.print_step=False', 'generation.lm_path=null', 'generation.lm_weight=0.0', 'generation.iter_decode_eos_penalty=0.0', 'generation.iter_decode_max_iter=10', 'generation.iter_decode_force_max_iter=False', 'generation.iter_decode_with_beam=1', 'generation.iter_decode_with_external_reranker=False', 'generation.retain_iter_history=False', 'generation.retain_dropout=False', 'generation.retain_dropout_modules=null', 'generation.decoding_format=null', 'generation.no_seed_provided=False', 'generation.eos_token=null', 'eval_lm.output_word_probs=False', 'eval_lm.output_word_stats=False', 
'eval_lm.context_window=0', 'eval_lm.softmax_batch=9223372036854775807', 'interactive.buffer_size=0', "interactive.input='-'", 'ema.store_ema=False', 'ema.ema_decay=0.9999', 'ema.ema_start_update=0', 'ema.ema_seed_model=null', 'ema.ema_update_freq=1', 'ema.ema_fp32=False', 'task=graph_prediction', 'task._name=graph_prediction', "task.dataset_name='PCQM4Mv2'", 'task.num_classes=1', 'task.max_nodes=128', "task.dataset_source='pyg'", 'task.num_atoms=4608', 'task.num_edges=1536', 'task.num_in_degree=512', 'task.num_out_degree=512', 'task.num_spatial=512', 'task.num_edge_dis=128', 'task.multi_hop_max_dist=5', 'task.spatial_pos_max=1024', "task.edge_type='multi_hop'", 'task.seed=1', "task.pretrained_model_name='mytokengt'", 'task.load_pretrained_model_output_layer=True', 'task.train_epoch_shuffle=True', "task.user_data_dir='tokengt/data/ogb_datasets'", 'criterion=l1_loss', 'criterion._name=l1_loss', 'lr_scheduler=fixed', 'lr_scheduler._name=fixed', 'lr_scheduler.force_anneal=null', 'lr_scheduler.lr_shrink=0.1', 'lr_scheduler.warmup_updates=0', 'lr_scheduler.lr=[0.25]', 'scoring=bleu', 'scoring._name=bleu', 'scoring.pad=1', 'scoring.eos=2', 'scoring.unk=3']
Traceback (most recent call last):
File "/Users/ramankumar/OpenSource/transformers/.env/lib/python3.9/site-packages/hydra/_internal/config_loader_impl.py", line 513, in _apply_overrides_to_config
OmegaConf.update(cfg, key, value, merge=True)
File "/Users/ramankumar/OpenSource/transformers/.env/lib/python3.9/site-packages/omegaconf/omegaconf.py", line 613, in update
root.__setattr__(last_key, value)
File "/Users/ramankumar/OpenSource/transformers/.env/lib/python3.9/site-packages/omegaconf/dictconfig.py", line 285, in __setattr__
raise e
File "/Users/ramankumar/OpenSource/transformers/.env/lib/python3.9/site-packages/omegaconf/dictconfig.py", line 282, in __setattr__
self.__set_impl(key, value)
File "/Users/ramankumar/OpenSource/transformers/.env/lib/python3.9/site-packages/omegaconf/dictconfig.py", line 266, in __set_impl
self._set_item_impl(key, value)
File "/Users/ramankumar/OpenSource/transformers/.env/lib/python3.9/site-packages/omegaconf/basecontainer.py", line 398, in _set_item_impl
self._validate_set(key, value)
File "/Users/ramankumar/OpenSource/transformers/.env/lib/python3.9/site-packages/omegaconf/dictconfig.py", line 143, in _validate_set
self._validate_set_merge_impl(key, value, is_assign=True)
File "/Users/ramankumar/OpenSource/transformers/.env/lib/python3.9/site-packages/omegaconf/dictconfig.py", line 156, in _validate_set_merge_impl
self._format_and_raise(
File "/Users/ramankumar/OpenSource/transformers/.env/lib/python3.9/site-packages/omegaconf/base.py", line 95, in _format_and_raise
format_and_raise(
File "/Users/ramankumar/OpenSource/transformers/.env/lib/python3.9/site-packages/omegaconf/_utils.py", line 694, in format_and_raise
_raise(ex, cause)
File "/Users/ramankumar/OpenSource/transformers/.env/lib/python3.9/site-packages/omegaconf/_utils.py", line 610, in _raise
raise ex # set end OC_CAUSE=1 for full backtrace
omegaconf.errors.ValidationError: child 'dataset.update_epoch_batch_itr' is not Optional
full_key: dataset.update_epoch_batch_itr
reference_type=DatasetConfig
object_type=DatasetConfig
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/ramankumar/OpenSource/script.py", line 106, in <module>
convert_graphormer_checkpoint(
File "/Users/ramankumar/OpenSource/transformers/.env/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/Users/ramankumar/OpenSource/script.py", line 74, in convert_graphormer_checkpoint
cfg = convert_namespace_to_omegaconf(args)
File "/Users/ramankumar/OpenSource/transformers/.env/lib/python3.9/site-packages/fairseq/dataclass/utils.py", line 399, in convert_namespace_to_omegaconf
composed_cfg = compose("config", overrides=overrides, strict=False)
File "/Users/ramankumar/OpenSource/transformers/.env/lib/python3.9/site-packages/hydra/experimental/compose.py", line 31, in compose
cfg = gh.hydra.compose_config(
File "/Users/ramankumar/OpenSource/transformers/.env/lib/python3.9/site-packages/hydra/_internal/hydra.py", line 507, in compose_config
cfg = self.config_loader.load_configuration(
File "/Users/ramankumar/OpenSource/transformers/.env/lib/python3.9/site-packages/hydra/_internal/config_loader_impl.py", line 151, in load_configuration
return self._load_configuration(
File "/Users/ramankumar/OpenSource/transformers/.env/lib/python3.9/site-packages/hydra/_internal/config_loader_impl.py", line 277, in _load_configuration
ConfigLoaderImpl._apply_overrides_to_config(config_overrides, cfg)
File "/Users/ramankumar/OpenSource/transformers/.env/lib/python3.9/site-packages/hydra/_internal/config_loader_impl.py", line 520, in _apply_overrides_to_config
raise ConfigCompositionException(
hydra.errors.ConfigCompositionException: Error merging override dataset.update_epoch_batch_itr=null
child 'dataset.update_epoch_batch_itr' is not Optional ?? @clefourrier
I think you read the error correctly, apparently for TokenGT+fairseq it does not seem to be.
You could try passing it as False (I think it's a boolean), or looking for it either in the loading scripts or config files to see how it is managed for the project.
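If passing it on the command line is awkward, here is a minimal sketch of that workaround in the conversion script, assuming the field just needs a concrete boolean on the parsed argparse namespace before fairseq builds the hydra overrides (not verified against the TokenGT code):
# Hypothetical workaround: give update_epoch_batch_itr a concrete value so that
# convert_namespace_to_omegaconf does not emit a "null" override for a non-Optional field.
args = options.parse_args_and_arch(parser, modify_parser=None)
if getattr(args, "update_epoch_batch_itr", None) is None:
    args.update_epoch_batch_itr = False  # assumption: False matches the intended default
cfg = convert_namespace_to_omegaconf(args)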
Could you explain once more how to supply datasets in an argument?
I created a file predict_custom.py alongside (in the same folder as) the conversion script.py and pasted in all the code you gave:
from datasets import load_dataset
....
class TmpDataset(InMemoryDataset):
....
def create_customized_dataset(dataset_name, ix_xval):
....
@register_dataset("MUTAG_0")
def m0():
return create_customized_dataset("MUTAG", 0)
--dataset-name --MUTAG_0 --user-data-dir "/tokengt/data/ogb_datasets" How should I write it here? @clefourrier
The simplest would be to do what you did initially, and use one of the native datasets for TokenGT with --dataset-name PCQM4Mv2.
If you want to use custom datasets, your --user-data-dir must point to the folder containing your dataset script, if I remember well.
🙂 I got familiar with PyTorch Geometric and graph neural networks, and I read about the parameters and datasets for graphs in the Graphormer docs.
At tokengt/large-scale-regression/scripts there is a training script for TokenGT that uses fairseq-train with an argument list.
Initially, I assumed that argument list was only used with fairseq-train, but no, the same applies to the conversion script as well (I just had not tried it 😕 so sad!!).
Now everything works fine. yay 😊
Congratulations, that's very cool! :hugs:
Do you know what your next steps are?
Next, I added some import-related code in the transformers folder, such as in src/transformers/__init__.py and other files (taking the Graphormer PR as a guide).
After that, I was successfully able to import the HF 🤗 TokenGT classes in my conversion script.py:
from transformers import (
AutoModel,
TokenGTConfig,
TokenGTForGraphClassification,
)
tokengt_model = task.build_model(cfg.model)
tokengt_state = torch.load(checkpoint_name)["model"]
tokengt_model.load_state_dict(tokengt_state, strict=True, model_cfg=cfg.model)
tokengt_model.upgrade_state_dict(tokengt_model.state_dict())
# up to this point everything works fine, no error
# Transformers model
config = TokenGTConfig(
tasks_weights=None, # added this
num_labels=1,
share_input_output_embed=False,
num_layers=12,
embedding_dim=768,
ffn_embedding_dim=768,
num_attention_heads=32,
dropout=0.0,
attention_dropout=0.1,
activation_dropout=0.1,
encoder_normalize_before=True,
pre_layernorm=False,
apply_graphormer_init=True,
activation_fn="gelu",
no_token_positional_embeddings=False,
)
transformers_model = TokenGTForGraphClassification(config)
state_dict = tokengt_model.state_dict()
transformers_model.load_state_dict(state_dict) # this line raises the following error
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for TokenGTForGraphClassification:
Missing key(s) in state_dict: "decoder.lm_output_learned_bias", "decoder.embed_out.weight".
Unexpected key(s) in state_dict: "encoder.lm_output_learned_bias", "encoder.embed_out.weight", "encoder.graph_encoder.final_layer_norm.weight", "encoder.graph_encoder.final_layer_norm.bias", "encoder.graph_encoder.graph_feature.orf_encoder.weight", "encoder.graph_encoder.graph_feature.order_encoder.weight".
size mismatch for encoder.graph_encoder.graph_feature.edge_encoder.weight: copying a param with shape torch.Size([1536, 768]) from checkpoint, the shape in current model is torch.Size([2048, 768]).
There are two checkpoints, lap16 and orf64. Both give the same error, except that one reports "encoder.graph_encoder.graph_feature.lap_encoder.weight" and the other "encoder.graph_encoder.graph_feature.orf_encoder.weight".
The errors are: Missing key(s), Unexpected key(s), and size mismatch.
I need help @clefourrier
Edit: adding num_edges=1536 to the config removed the size mismatch error.
I think this should be managed with the remove_ignore_keys_ and rename_keys parts: you need to find what the "unexpected keys" in the original checkpoint map to in the new format, and rename them accordingly. In essence, you are going from one format (tokenGT format) to another format (transformers style) for your checkpoint, so you need to do this mapping.
Congrats on debugging the other error! :clap:
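A small sketch of one way to find what still needs mapping, assuming the variable names used in the conversion script above (tokengt_model and transformers_model):
# Diff the two state dict key sets to see which entries need renaming or removing.
orig_keys = set(tokengt_model.state_dict().keys())
hf_keys = set(transformers_model.state_dict().keys())
print("Only in the original checkpoint:", sorted(orig_keys - hf_keys))
print("Only in the transformers model:", sorted(hf_keys - orig_keys))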
Initially, I had no idea how to map them or to what; I didn't even know what they meant. So I spent some time studying transformers and looking at the code.
Then I suddenly thought: let's print the models. So I printed both the original model and the HF 🤗 model,
print(transformers_model)
print(tokengt_model)
and compared the differences. Accordingly, I added these arguments to the config
# config for lap16
config = TokenGTConfig(
...
lap_node_id=True,
lap_node_id_k=16,
id2label = {"1":"className"}, # I added a dictionary explained below why I did this
type_id=True,
prenorm=True,
...
)
and renamed keys
rename_keys = [
("encoder.embed_out.weight", "decoder.embed_out.weight"),
# I did not find lm_output_learned_bias in models So, I checked code and doing this made most sense
("encoder.lm_output_learned_bias", "decoder.lm_output_learned_bias"),
]
Doing this works fine, no error.
If I don't add id2label = {"1":"className"}, passing the argument num_labels = 1 to config = TokenGTConfig(...) has no effect, because num_labels gets a default value of 2 in PretrainedConfig (the superclass of TokenGTConfig(PretrainedConfig), see the code below), which would give a size mismatch error.
https://github.com/huggingface/transformers/blob/9d1116e9951686f937d17697820117636bfc05a5/src/transformers/configuration_utils.py#L326-L330
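A minimal illustration of that PretrainedConfig behaviour, independent of the TokenGT code (in the base class, num_labels is a property derived from id2label, which is why overriding id2label changes the classifier head size):
from transformers import PretrainedConfig

cfg = PretrainedConfig()
print(cfg.num_labels)  # 2: the default id2label has two entries

cfg = PretrainedConfig(id2label={1: "className"})
print(cfg.num_labels)  # 1: inferred as len(id2label)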
It's really great to see your motivation, good job! :sparkles:
I'll try to check the code to confirm the key renames you made, but I think they do make sense because of the naming changes between the original and new models.
For the id2label, I don't think it is such a good idea to modify things outside of the TokenGT files - normally the parent class (PretrainedConfig) is overwritten by the child class (TokenGTConfig), are you sure this modification is happening here?
I think you could also try changing the TokenGTConfig num_labels default value to 1 instead of None and see what happens.
Yes, I am sure
Hi @Raman-Kumar ! I took some time to clean the code a bit and edited some parts, it should be better now for the problems you mentioned. If problems occur in the future, fyi the Graphormer code which was integrated in the lib is quite similar to this one, so you can look at how they are managed there.
Because of a mixup on my github I had to create a new PR for this https://github.com/huggingface/transformers/pull/21745 and this is where you'll find the new code. Hoping it helps you! :hugs:
Hi, @clefourrier I had already figured it out, but I was very sick for a few days 😔.
In the previous PR, I made three changes, after which it printed "All good :)":
- changing num_labels to num_classes (after that there is no need to add id2label, which you suggested not to add)
- in the file models/tokengt/configuration_tokengt.py, import torch.nn.functional as F is missing
- the decode name was wrongly written in the TokenGTForGraphClassification class's forward function
I was just about to upload the newly created config.json and pytorch_model.bin files to my Hugging Face account.
Now I will look at the new PR and will send changes with tests and docs to it.
That sounds good, these changes sound similar to the ones in the new PR.
I hope you take rest and get better soon :hugs:
Hi, back again. I uploaded the converted checkpoints and configs:
lap - https://huggingface.co/raman-ai/tokengt-base-lap-pcqm4mv2
orf - https://huggingface.co/raman-ai/tokengt-base-orf-pcqm4mv2
Now I am writing tests.
I tried to push some changes to the PR, but it says authentication failed, I do not have permission, etc.
How should I push new commits to your PR? @clefourrier You would need to add me as a collaborator to your forked repo.
In my terminal:
$ git remote -v
github-desktop-clefourrier https://github.com/clefourrier/transformers.git (fetch)
github-desktop-clefourrier https://github.com/clefourrier/transformers.git (push)
origin https://github.com/Raman-Kumar/transformers.git (fetch)
origin https://github.com/Raman-Kumar/transformers.git (push)
upstream https://github.com/huggingface/transformers.git (fetch)
upstream https://github.com/huggingface/transformers.git (push)
@Raman-Kumar added you to my fork!
I created a new PR #22042 just for making a lot of commits and seeing where CircleCI fails, so I can correct it. Later I will do a single commit in your PR.
I have added a new dependency, einops, in setup.py. This is the first time it is used in the entire repo, in the TokenGT model.
I added TokenGTModelIntegrationTest, and now it passes all CircleCI checks.
I have a question, @clefourrier.
How do I know the shapes of the inputs node_data, num_nodes, edge_index, edge_data, edge_num, in_degree, out_degree, lap_eigvec, lap_eigval, and labels of TokenGT for the ids_tensor() function?
Like in Graphormer:
attn_bias = ids_tensor(
[self.batch_size, self.graph_size + 1, self.graph_size + 1], self.num_atoms
) # Def not sure here
attn_edge_type = ids_tensor([self.batch_size, self.graph_size, self.graph_size, 1], self.num_edges)
spatial_pos = ids_tensor([self.batch_size, self.graph_size, self.graph_size], self.num_spatial)
in_degree = ids_tensor([self.batch_size, self.graph_size], self.num_in_degree)
out_degree = ids_tensor([self.batch_size, self.graph_size], self.num_out_degree)
input_nodes = ids_tensor([self.batch_size, self.graph_size, 1], self.num_atoms)
input_edges = ids_tensor(
[self.batch_size, self.graph_size, self.graph_size, self.multi_hop_max_dist, 1], self.num_edges
)
labels = ids_tensor([self.batch_size], self.num_classes)
Ok, great for the PR, and congrats on the tests! For einops, do you need a lot of code? It would be better to copy-paste the functions we will need (citing them, and if the license allows of course), as we only allow new dependencies for very specific cases.
For TokenGT, are you talking about the shape of inputs provided to the test suite?
Most attributes will have the same shape as for Graphormer (batch_size in position one, then graph_size or something linked to it for inputs which look over the whole graph, like those pertaining to edges/nodes, including the degrees for example). The collation function should be able to help you with the specifics, since the shape must be provided there. As a last resort, to confirm your intuition, you can also print all the dimensions for the elements you want.
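As a very rough sketch, by analogy with the Graphormer snippet above, the TokenGT tester inputs might look something like the following; every shape here is an assumption to check against the TokenGT collator, and floats_tensor (from the same test utilities as ids_tensor) is only a guess for the continuous eigenvector/eigenvalue inputs:
# Hypothetical shapes, modeled on Graphormer; verify against the TokenGT collation function.
node_data = ids_tensor([self.batch_size, self.graph_size, 1], self.num_atoms)
num_nodes = ids_tensor([self.batch_size], self.graph_size)
edge_index = ids_tensor([self.batch_size, 2, self.graph_size], self.graph_size)
edge_data = ids_tensor([self.batch_size, self.graph_size, 1], self.num_edges)
edge_num = ids_tensor([self.batch_size], self.graph_size)
in_degree = ids_tensor([self.batch_size, self.graph_size], self.num_in_degree)
out_degree = ids_tensor([self.batch_size, self.graph_size], self.num_out_degree)
lap_eigvec = floats_tensor([self.batch_size, self.graph_size, self.lap_node_id_k])
lap_eigval = floats_tensor([self.batch_size, self.graph_size, self.lap_node_id_k])
labels = ids_tensor([self.batch_size], self.num_classes)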
What is the current status of TokenGT on Hugging Face? Is it possible to use this for token/node classification tasks? If so, could someone point me to a good starting point or example for figuring that out? I would love to try to use this on protein data through Hugging Face for node/token classification :)
Hi @Amelie-Schreiber ! Raman has been working on this integration in their spare time, but I don't think it's complete yet. One of the latest PRs was here if you want to take a look too :)
Hey, I am resuming this. I lost touch for some time, but I will keep contributing to it.
@clefourrier I may ask questions if I get stuck.
Cool! Feel free to ask questions! I'm no longer actively working on graphs, but I'll do my best to answer within a reasonable delay.
How is it going now? Does it work? 🫥