Results: 36 issues by Gwern Branwen

I experimented with using [facedetect](https://github.com/wavexx/facedetect)'s cropping script on anime images to try to crop faces into separate files (the goal here being to experiment with modeling faces using [WGAN](https://github.com/martinarjovsky/WassersteinGAN) to...

In the [BigGAN paper](https://arxiv.org/pdf/1809.11096.pdf), one of the important features is the use of large minibatches, _n_=2048 for most of the results. Table 1 shows that FID/Inception improve considerably with, among...

One useful trick for data cleaning is taking a trained Discriminator and using it to find the 'worst' samples and either manually reviewing them or automatically deleting them. I modified...
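The ranking step of this trick can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the modified script the issue refers to: `score_fn` is a hypothetical stand-in for a trained Discriminator's score on one sample, `rank_worst` is an illustrative helper name, and a lower score is assumed to mean "looks less real".

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def rank_worst(
    samples: Iterable[T],
    score_fn: Callable[[T], float],
    k: int,
) -> List[Tuple[float, T]]:
    """Return the k lowest-scoring samples as (score, sample), lowest first.

    Assumption: score_fn wraps a trained Discriminator and returns a higher
    value for samples it considers more realistic.
    """
    scored = [(score_fn(sample), sample) for sample in samples]
    # Sort on the score only, so samples themselves never need to be comparable.
    scored.sort(key=lambda pair: pair[0])
    return scored[:k]


# Toy usage: a dict of made-up scores stands in for the Discriminator.
toy_scores = {"a.jpg": 0.9, "b.jpg": 0.1, "c.jpg": 0.5}
candidates = rank_worst(toy_scores, toy_scores.__getitem__, k=2)
```

The returned candidates could then be reviewed by hand or deleted automatically, as the issue describes; which of the two is appropriate depends on how much the scores are trusted.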

The [official BigGAN generator pretrained models](https://tfhub.dev/s?q=biggan) were released 2 or 3 weeks ago & [can be used on Colab](https://colab.research.google.com/github/tensorflow/hub/blob/master/examples/colab/biggan_generation_with_tf_hub.ipynb) for free evaluations, and made a big splash as people started...

When extracting abstracts from PubMed, the section headers/labels are erased entirely: they are neither present anywhere in the resulting objects nor inlined. This makes abstracts substantially harder to read. They should...

When I run IntroVAE on 1 GPU (to test how it works on my [anime faces](https://www.gwern.net/Faces#introvae)), I get indexing/scalar errors from PyTorch (`TypeError: only integer scalar arrays can be converted...

The anime face model used looks like it has interesting results, but is also not the highest quality one. Have you considered using either my [BigGAN](https://www.gwern.net/Faces#biggan) or StyleGAN 1/2 pretrained...

I was recently working with a megawarc of 25GB or so from the Google Reader crawl on an Amazon EC2 server. This took a few hours to download,...

help wanted

In some of the mega WARCs produced by Archive Team, extracting all the WARCs to save just a few is infeasible as it can take at least 2 days to...

enhancement

In dealing with a megawarc, any reasonably broad set of results will have many hits, possibly too many to hand-write dd calls to extract efficiently (see https://github.com/chfoo/warcat/issues/7 ). It would...

enhancement
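
For the megawarc extraction request above, the effect of many hand-written dd calls can be approximated by seeking to each byte range in turn. The sketch below assumes the offset and length of every wanted record are already known from some index; `extract_ranges` is a hypothetical helper for illustration, not part of warcat.

```python
from typing import BinaryIO, Iterable, Iterator, Tuple


def extract_ranges(
    archive: BinaryIO,
    ranges: Iterable[Tuple[int, int]],
) -> Iterator[bytes]:
    """Yield the raw bytes of each (offset, length) range from a seekable file.

    Assumption: the offsets and lengths come from an external index; this
    function does not parse WARC records itself.
    """
    for offset, length in ranges:
        archive.seek(offset)
        data = archive.read(length)
        if len(data) != length:
            # A short read means the range extends past the end of the file.
            raise ValueError(f"range {offset}+{length} runs past end of file")
        yield data
```

Each yielded chunk could be written to its own file, so only the wanted records are copied and the rest of the archive is never unpacked.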