Jong Wook Kim

86 comments by Jong Wook Kim

@maxrmorrison Thanks for the context! I hadn't thought of it that way, i.e. interpreting the salience matrix as the observation probability distribution, but it does look like a better way to...

Hi, there can be numerical differences that we cannot fully control, e.g. different CUDA and driver versions, batch sizes, hardware, etc., that may cause the 0.5% difference in evals. That...

The conditional code can be found here in `clip.load()` and `clip.load("")` should work; please let me know if it doesn't.

By `` I meant the models downloaded under `~/.cache/clip`. Let me know what the stacktrace looks like if you see an error loading those models with `clip.load()`.

The paper reports FLOPs during a forward pass, and we used fvcore's flop counting tool to get those numbers. The actual wall time might depend on various factors such as...

Hi, thanks for pointing out some of the details where we were cursory or that we left out; upon investigating, we found that: 1. Facial Emotion Recognition 2013: We noticed an error in the...

Hi, 1. Yes, but we later found that a PyTorch version can work equivalently on linear probes. 2. Please see for more details

"Accuracy" during training probably meant the proportion of training examples for which the contrastive label was predicted correctly, e.g.:

    contrastive_label = torch.arange(batch_size)
    image_loss = cross_entropy(image_logits, contrastive_label)
    text_loss = cross_entropy(text_logits, contrastive_label)
    ...
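
A minimal sketch of how such a training accuracy could be computed, assuming square `(batch_size, batch_size)` logit matrices in which the matching pair for row `i` sits in column `i`. The function name `contrastive_accuracy`, the averaging of the two losses, and the random features in the usage lines are illustrative assumptions, not taken from the CLIP codebase.

```python
import torch
import torch.nn.functional as F


def contrastive_accuracy(image_logits: torch.Tensor, text_logits: torch.Tensor):
    """Return (loss, image_acc, text_acc) for a batch of paired logits.

    Assumes both inputs are (batch_size, batch_size) and that the correct
    match for row i is column i.
    """
    batch_size = image_logits.shape[0]
    contrastive_label = torch.arange(batch_size, device=image_logits.device)

    image_loss = F.cross_entropy(image_logits, contrastive_label)
    text_loss = F.cross_entropy(text_logits, contrastive_label)
    loss = (image_loss + text_loss) / 2  # assumed symmetric average

    # Fraction of rows whose highest logit lands on the matching column.
    image_acc = (image_logits.argmax(dim=1) == contrastive_label).float().mean()
    text_acc = (text_logits.argmax(dim=1) == contrastive_label).float().mean()
    return loss, image_acc, text_acc


# Hypothetical usage with random, L2-normalized features.
torch.manual_seed(0)
image_features = F.normalize(torch.randn(8, 16), dim=-1)
text_features = F.normalize(torch.randn(8, 16), dim=-1)
image_logits = 100.0 * image_features @ text_features.t()
text_logits = image_logits.t()
loss, image_acc, text_acc = contrastive_accuracy(image_logits, text_logits)
```

With untrained (random) features the accuracy should hover around `1 / batch_size`, so a value well above that during training suggests the model is learning to match pairs.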

Can you try reshaping the array to `(batch_size * num_class, n_ctx)` and feeding it to the model?
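
The reshape can be sketched as follows, assuming the tokens start out as a `(batch_size, num_class, n_ctx)` tensor and the model expects 2-D input. `ToyTextEncoder` is a hypothetical stand-in used only to keep the example self-contained, and the sizes are illustrative.

```python
import torch
import torch.nn as nn


class ToyTextEncoder(nn.Module):
    """Hypothetical stand-in for a text encoder that expects (N, n_ctx) token ids."""

    def __init__(self, vocab_size: int = 100, width: int = 32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, width)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        assert tokens.dim() == 2, "expected (N, n_ctx)"
        return self.embed(tokens).mean(dim=1)  # (N, width)


batch_size, num_class, n_ctx = 4, 10, 77  # illustrative sizes
tokens = torch.randint(0, 100, (batch_size, num_class, n_ctx))

encoder = ToyTextEncoder()

# Fold the class dimension into the batch dimension before the forward pass.
flat_tokens = tokens.reshape(batch_size * num_class, n_ctx)
flat_features = encoder(flat_tokens)  # (batch_size * num_class, width)

# Unfold again so features[b, c] corresponds to tokens[b, c].
features = flat_features.reshape(batch_size, num_class, -1)
```

Because `reshape` keeps row-major order, row `b * num_class + c` of the flattened output maps back to `features[b, c]`.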

Do you have a local directory named `clip` or a file named `` in the same directory?
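
One way to check for this kind of name clash, assuming the concern is that a local entry is being imported instead of the installed package. The helper names `find_shadowing_paths` and `resolved_origin` are hypothetical; only standard-library calls are used.

```python
import importlib.util
import os


def find_shadowing_paths(module_name: str, directory: str = "."):
    """List entries in `directory` that could be imported as `module_name`
    when `directory` comes first on sys.path."""
    candidates = [
        os.path.join(directory, module_name),          # a local directory (package)
        os.path.join(directory, module_name + ".py"),  # a local single-file module
    ]
    return [path for path in candidates if os.path.exists(path)]


def resolved_origin(module_name: str):
    """Return the file Python would load for `module_name`, or None if the
    module cannot be found or has no file origin."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


# Hypothetical usage: inspect what `import clip` would resolve to.
print(find_shadowing_paths("clip"))
print(resolved_origin("clip"))
```

If the printed origin points into the working directory rather than `site-packages`, a local file or directory is likely taking precedence over the installed package.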