Akshay
I am facing this same issue. I deleted a lot of the tasks, but it still shows the storage is exceeded. @oren-allegro, could you check and let me...
@rwightman Thanks for the detailed reply. I guess the OpenAI team, through empirical analysis, might have seen that there is no difference in adding bias or making the MLP...
Thanks @rom1504 for the logs. Isn't the logit_scale going towards 100 as the loss decreases in this case, not to 1?
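For reference, a minimal sketch of how the temperature is handled in CLIP/open_clip, which is why it drifts toward 100 rather than 1 (the standalone parameter here is illustrative; in practice it lives on the model):

```
import math
import torch
from torch import nn

# CLIP keeps the temperature as a learnable parameter in log space,
# initialized to log(1/0.07) ~= 2.659.
logit_scale = nn.Parameter(torch.ones([]) * math.log(1 / 0.07))

# After each optimizer step, open_clip clamps it so the effective scale
# exp(logit_scale) never exceeds 100.
with torch.no_grad():
    logit_scale.clamp_(0, math.log(100))

print(logit_scale.exp())  # the value that multiplies the logits, capped at 100
```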
This works for me sometimes: on the server, delete the “~/.vscode-server/” folder and reconnect via VS Code.
Based on experiments, it was found that GELU has a significantly smoother gradient transition; it's not abrupt or sharp like ReLU. If you look at both the...
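To see the difference concretely, here is a small autograd check (illustrative, not from the thread) that evaluates both derivatives around zero: ReLU's gradient jumps from 0 to 1 at the origin, while GELU's passes smoothly through 0.5.

```
import torch
import torch.nn.functional as F

# Evaluate d/dx of ReLU and GELU on a few points around zero.
x = torch.linspace(-2, 2, 9, requires_grad=True)

relu_grad = torch.autograd.grad(F.relu(x).sum(), x)[0]
gelu_grad = torch.autograd.grad(F.gelu(x).sum(), x)[0]

# ReLU's derivative is a step (0 then 1); GELU's rises smoothly.
for xi, rg, gg in zip(x.tolist(), relu_grad.tolist(), gelu_grad.tolist()):
    print(f"x={xi:+.1f}  relu'={rg:.3f}  gelu'={gg:.3f}")
```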
It also works if you do this:
```
import math

import torch
from torch import nn, GradScaler
from torch.utils.data import TensorDataset, DataLoader


class TestModule(nn.Module):
    def __init__(self, in_dim=512, out_dim=16):
        super().__init__()
        self.in_dim...
```
Thanks, but ideally I would want to know the prompt that I am sending beforehand, before sending it to the LLM, even for tracing. Currently editing the...
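One way to see the prompt before it leaves the process is to wrap the call site so every outgoing prompt passes through a hook first. A minimal sketch; `send_to_llm` and `my_client.complete` are hypothetical placeholders, not part of any specific library:

```
from typing import Callable


def with_prompt_hook(send_to_llm: Callable[[str], str],
                     on_prompt: Callable[[str], None]) -> Callable[[str], str]:
    """Wrap an LLM call so the final prompt is visible before it is sent.

    `send_to_llm` is a hypothetical function that takes the rendered prompt
    and returns the model's response; `on_prompt` is any callback, e.g. a
    logger or tracer.
    """
    def wrapped(prompt: str) -> str:
        on_prompt(prompt)           # inspect/log the exact prompt first
        return send_to_llm(prompt)  # then forward it unchanged
    return wrapped


# Usage: print every prompt before it goes out.
# traced_call = with_prompt_hook(my_client.complete, print)
# traced_call("Summarize this document: ...")
```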
The issue lies here, and it gets solved when you do `highres_in_channels = encoder_channels[-3]` and `high_res_features = self.block1(features[-3])`. Not sure if it's a good workaround; would love to hear...
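For context, a minimal sketch of where the two changed lines would sit in a decoder's `__init__` and `forward`. The surrounding class is illustrative only; just the two `[-3]` indexing lines come from the comment above:

```
import torch
from torch import nn


class DecoderSketch(nn.Module):
    """Illustrative stand-in for the decoder being discussed."""

    def __init__(self, encoder_channels, highres_out_channels=48):
        super().__init__()
        # Workaround: take the channel count from the third-to-last
        # encoder stage instead of the index the decoder used originally.
        highres_in_channels = encoder_channels[-3]
        self.block1 = nn.Conv2d(highres_in_channels, highres_out_channels, 1)

    def forward(self, *features):
        # Matching change in the forward pass: index the same stage.
        high_res_features = self.block1(features[-3])
        return high_res_features
```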