Charles Foster
Is there any plan to support the adjacent sibling combinator? I'm working on something with an HTML-to-text component, and selecting adjacent `` tags would be immensely useful for this.
As in the CLIP paper, we should clip the contrastive loss temperature so that the logits are not scaled by more than 100. Should be relatively easy.
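A minimal sketch of what that clipping could look like, assuming the temperature is stored as a learnable log-scale parameter and exponentiated before it multiplies the logits. The names `clipped_logit_scale` and `MAX_LOGIT_SCALE` are made up for illustration and are not taken from this codebase.

```python
import math

# Assumption: the cap of 100 on the logit multiplier, as described above.
MAX_LOGIT_SCALE = 100.0


def clipped_logit_scale(log_scale: float, max_scale: float = MAX_LOGIT_SCALE) -> float:
    """Turn a log-parameterized temperature into a logit multiplier, capped at max_scale."""
    return min(math.exp(log_scale), max_scale)


# A log-scale equivalent to a temperature of 0.07 stays well under the cap.
initial_log_scale = math.log(1.0 / 0.07)
scale_at_init = clipped_logit_scale(initial_log_scale)

# A log-scale that has drifted upward during training gets capped.
scale_after_drift = clipped_logit_scale(10.0)
```

In a real training loop the same cap would more likely be applied to the parameter itself after each optimizer step, so that it cannot keep drifting past the limit.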
CLIP's objective involves a symmetric cross-entropy loss between the representations of the text and the images (spectrograms, in our case) in a batch. It benefits from very large batch...
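For reference, here is a rough pure-Python sketch of a symmetric cross-entropy loss over one batch. It assumes the embeddings are already L2-normalized and that row `i` of the text batch is paired with row `i` of the spectrogram batch. The function names are hypothetical, and a real implementation would use a tensor library rather than nested lists.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax of a list of floats."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def symmetric_contrastive_loss(text_emb, spec_emb, logit_scale=1.0):
    """Average of text->spectrogram and spectrogram->text cross entropy.

    Assumes text_emb[i] and spec_emb[i] form the matching pair.
    """
    n = len(text_emb)
    # logits[i][j] = logit_scale * <text_i, spec_j>
    logits = [
        [logit_scale * sum(a * b for a, b in zip(t, s)) for s in spec_emb]
        for t in text_emb
    ]
    # Text -> spectrogram: each row should put its mass on the diagonal.
    loss_text = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Spectrogram -> text: same idea over the columns.
    columns = [list(col) for col in zip(*logits)]
    loss_spec = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return 0.5 * (loss_text + loss_spec)


# Toy batch of two perfectly aligned pairs.
text = [[1.0, 0.0], [0.0, 1.0]]
spec = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = symmetric_contrastive_loss(text, spec)
# Swapping the spectrograms mismatches every pair, so the loss goes up.
swapped_loss = symmetric_contrastive_loss(text, spec[::-1])
```

Every other item in the batch acts as a negative for a given pair, so the loss value depends on how many items the batch contains.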
We can start thinking about how to evaluate our models once they are trained. The simplest approach would be the contrastive loss over a held-out set with a fixed batch size. Another would...
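The held-out contrastive loss idea could be sketched roughly as follows. This assumes a `loss_fn(texts, specs)` callable that returns the batch loss as a float; that callable and the name `heldout_contrastive_loss` are placeholders, not existing functions in this project. The trailing partial batch is dropped so that every batch has the same number of negatives and the numbers stay comparable across runs.

```python
def heldout_contrastive_loss(pairs, batch_size, loss_fn):
    """Mean contrastive loss over full, fixed-size batches of (text, spectrogram) pairs.

    loss_fn is a hypothetical callable: loss_fn(texts, specs) -> float.
    """
    batch_losses = []
    # Step through full batches only; a smaller final batch is skipped.
    for start in range(0, len(pairs) - batch_size + 1, batch_size):
        batch = pairs[start:start + batch_size]
        texts = [t for t, _ in batch]
        specs = [s for _, s in batch]
        batch_losses.append(loss_fn(texts, specs))
    if not batch_losses:
        raise ValueError("held-out set is smaller than one batch")
    return sum(batch_losses) / len(batch_losses)


# Usage with a stand-in loss that just records the batch sizes it sees.
seen_batch_sizes = []


def dummy_loss(texts, specs):
    seen_batch_sizes.append(len(texts))
    return float(len(texts))


dummy_pairs = [(i, i) for i in range(10)]
mean_loss = heldout_contrastive_loss(dummy_pairs, batch_size=4, loss_fn=dummy_loss)
```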