Kaiyu Shi

45 comments of Kaiyu Shi

I'm going to raise a warning for this situation until the multi-layered version is ready.
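Roughly what I have in mind, as a sketch with illustrative names rather than the actual code:

```python
import warnings

def check_num_layers(num_layers):
    # Hypothetical guard: warn and fall back to a single layer
    # until the multi-layered version is ready.
    if num_layers > 1:
        warnings.warn(
            "multi-layer support is not ready yet; using a single layer instead",
            UserWarning,
        )
        num_layers = 1
    return num_layers
```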

Well, since the actual PPL of `index GRU` is hard to compute, the printed loss is simply the NCE loss, which is not comparable with the *CrossEntropy* loss.
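For reference, perplexity is only directly recoverable from a cross-entropy loss as PPL = exp(mean CE); a rough sketch of the distinction, with purely illustrative tensor sizes:

```python
import torch
import torch.nn.functional as F

# Cross-entropy over the full vocabulary: PPL = exp(mean CE).
logits = torch.randn(32, 10000)            # (batch, vocab), illustrative sizes
targets = torch.randint(0, 10000, (32,))
ce_loss = F.cross_entropy(logits, targets)
ppl = torch.exp(ce_loss)                   # a valid perplexity

# The NCE objective is a binary classification loss over target vs. noise
# samples, so exp(nce_loss) is NOT a perplexity and the two numbers are
# not comparable.
```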

Hi, Eric. I failed to reproduce the PPL of 165 on my server; could you please delete `data/penn/vocab.pkl` and run again to see if it happens again? I suspect...

@chaoqing `squeeze(0)` is definitely a better choice; as you said, `squeeze` will remove all dims with *size=1*, which is unexpected for *N=1*. A PR is appreciated. For the non-zero elements...
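A quick illustration of the difference (shapes are illustrative):

```python
import torch

# A leading singleton dim plus a batch dim that happens to be 1 (N=1).
x = torch.zeros(1, 1, 5)

x.squeeze(0).shape   # torch.Size([1, 5]) - only the leading dim is dropped
x.squeeze().shape    # torch.Size([5])    - the N=1 batch dim is dropped too,
                     #                      which breaks code expecting (N, 5)
```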

Yes, I was following the tutorial. I found that the `AT_DISPATCH_FLOATING_TYPES_AND_HALF` macro should do the magic to support the half scalar type, but it is not documented in the tutorial. Should...

I deleted the pre-built dashboards on `grafana`, and the job `export-datasources-and-dashboards` does exit with 0. It seems like the entry script tries to add the dashboard over and over again.

Yes, that's what I meant. But after using your `Embedding` class, I think it's far simpler than the one from `gensim`. I'm wondering if we can provide a simple way to...
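One possible shape for that, as a rough sketch only: it assumes gensim 4.x `KeyedVectors` and a plain `nn.Embedding` rather than your `Embedding` class, and the file path and lookup word are placeholders:

```python
import torch
import torch.nn as nn
from gensim.models import KeyedVectors

# Load pretrained vectors (path and format are placeholders).
kv = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)

# Copy the weight matrix into a frozen nn.Embedding.
weights = torch.FloatTensor(kv.vectors)
embedding = nn.Embedding.from_pretrained(weights, freeze=True)

# Look up a word through gensim's vocabulary mapping
# (use any word actually present in your vocabulary).
idx = torch.tensor([kv.key_to_index["example"]])
vec = embedding(idx)
```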

PyTorch caches CUDA memory to avoid the cost of repeated allocations; you can find more information here: https://pytorch.org/docs/stable/notes/cuda.html#cuda-memory-management In your case, the reserved bytes should be the peak memory usage before `checkpointing`,...
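For inspecting the two quantities, the counters from the linked page can be read directly (requires a CUDA device):

```python
import torch

# Memory occupied by live tensors on the current device.
allocated = torch.cuda.memory_allocated()

# Memory held by the caching allocator, including cached blocks that are
# free but not returned to the driver; this is what `nvidia-smi` reports.
reserved = torch.cuda.memory_reserved()

# Peak values since the start of the program (or the last reset).
peak_allocated = torch.cuda.max_memory_allocated()
peak_reserved = torch.cuda.max_memory_reserved()

torch.cuda.reset_peak_memory_stats()  # reset the peak counters
```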

> Q1: Do you know how to explain this: If I keep the same batch-size, but change how I partition the self.features internally (into checkpointed segments), the active_bytes of the...
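For concreteness, partitioning a sequential `features` module into checkpointed segments usually looks roughly like the sketch below; the module, sizes, and segment count are illustrative, not your actual model:

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint_sequential

features = nn.Sequential(
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
)

x = torch.randn(8, 512, requires_grad=True)

# Split `features` into 3 segments; only segment-boundary activations are
# kept, and the rest are recomputed during backward. Changing the number
# of segments changes which activations are stored, and hence active_bytes.
out = checkpoint_sequential(features, 3, x)
```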

Chrome on Windows 10 has the same problem.