nucleotide-transformer

question about the arguments within the get_pretrained_model() function

Open hongruhu opened this issue 1 year ago • 0 comments

Hi,

When I looked at some examples of getting the pretrained models, I saw:

from nucleotide_transformer.pretrained import get_pretrained_model

# Save embeddings from layer 20 only
parameters, forward_fn, tokenizer, config = get_pretrained_model(
    model_name="500M_human_ref",
    embeddings_layers_to_save=(20,),
    max_positions=32,
)

# A second model, with a different configuration
parameters, forward_fn, tokenizer, config = get_pretrained_model(
    model_name="500M_1000G",
    # Get embeddings at layers 5 and 20
    embeddings_layers_to_save=(5, 20),
    # Get attention map number 4 at layer 1 and attention map number 14
    # at layer 12
    attention_maps_to_save=((1, 4), (12, 14)),
    max_positions=128,
)
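
For reference, here is the full pipeline I pieced together from the README, in case my usage itself is off (the two sequences are just dummies, and I am assuming the saved embeddings come back under the key "embeddings_20"):

import haiku as hk
import jax
import jax.numpy as jnp
from nucleotide_transformer.pretrained import get_pretrained_model

# Load the model, saving embeddings from layer 20
parameters, forward_fn, tokenizer, config = get_pretrained_model(
    model_name="500M_human_ref",
    embeddings_layers_to_save=(20,),
    max_positions=32,
)
forward_fn = hk.transform(forward_fn)

# Tokenize two dummy sequences
sequences = ["ATTCCGATTCCGATTCCG", "ATTTCTCTCTCTCTCTGAGATCGATCGATCGAT"]
tokens_ids = [b[1] for b in tokenizer.batch_tokenize(sequences)]
tokens = jnp.asarray(tokens_ids, dtype=jnp.int32)

# Run the forward pass and read back the layer-20 embeddings
random_key = jax.random.PRNGKey(0)
outs = forward_fn.apply(parameters, random_key, tokens)
print(outs["embeddings_20"].shape)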

From these examples it seems that different pretrained models take different configurations. I was wondering if you could add more detailed guidance on how to choose embeddings_layers_to_save and max_positions. I also saw some issues mentioning that we might need to set max_positions to 1000? I'm a bit confused, and it would be great if the authors could provide the suggested configuration for each pretrained model somewhere in the tutorial or README files.
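
To make the max_positions part concrete, here is the rule of thumb I have been assuming (a guess on my part: 6-mer tokenization plus one class token, so please correct me if this is wrong):

import math

# Assumption (mine, not from the docs): each sequence is tokenized into
# 6-mers and one <CLS> token is prepended, so a sequence of L nucleotides
# needs about ceil(L / 6) + 1 positions.
sequences = ["ATTCCGATTCCGATTCCG", "ATTTCTCTCTCTCTCTGAGATCGATCGATCGAT"]
needed = max(math.ceil(len(s) / 6) for s in sequences) + 1  # +1 for <CLS>
print(needed)  # I would then set max_positions to at least this value

Is the 1000 mentioned in other issues the models' training context length in tokens? If so, stating that explicitly in the README would already help a lot.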

hongruhu · Jul 31 '24 17:07