
Download more samples and test them

Open kokes opened this issue 4 years ago • 2 comments

I downloaded a sample from this piece of research by JetBrains (link); it will be useful for testing.

It's a public S3 bucket, so it was as easy as running aws s3 sync s3://github-notebooks-update1/ data/ (you need to Control-C it at some point, since there's a lot of data).
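Instead of interrupting the sync by hand, a bounded download could be scripted. This is only a rough sketch, assuming boto3 is installed and that the bucket allows anonymous listing; the helper names make_anonymous_client and download_sample are made up for illustration, and the limit of 1000 is arbitrary.

```python
import os


def make_anonymous_client():
    """Build an unsigned S3 client (assumption: boto3 is installed)."""
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    return boto3.client("s3", config=Config(signature_version=UNSIGNED))


def download_sample(client, bucket, dest, limit=1000):
    """Download at most `limit` listed objects from `bucket` into `dest`."""
    os.makedirs(dest, exist_ok=True)
    paginator = client.get_paginator("list_objects_v2")
    # MaxItems caps the total number of keys returned across all pages.
    pages = paginator.paginate(Bucket=bucket, PaginationConfig={"MaxItems": limit})
    downloaded = []
    for page in pages:
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                continue  # skip "directory" placeholder keys
            # Flattens the key to its basename; keys sharing a basename
            # would overwrite each other.
            target = os.path.join(dest, os.path.basename(key))
            client.download_file(bucket, key, target)
            downloaded.append(target)
    return downloaded
```

It would be called as download_sample(make_anonymous_client(), "github-notebooks-update1", "data", limit=500). Passing the client in keeps the download loop easy to exercise with a stub.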

Originally posted by @kokes in https://github.com/kokes/nbviewer.js/issues/48#issuecomment-779466979

kokes avatar Feb 15 '21 22:02 kokes

Not all the notebooks in the S3 bucket are valid UTF-8 and well-formed JSON, so I filtered out the broken ones:

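import os
import json
from json.decoder import JSONDecodeError

# Directory holding the downloaded notebooks
dirname = "tmp"
for filename in os.listdir(dirname):
    full_filename = os.path.join(dirname, filename)
    # Open in binary mode; json.load decodes the bytes itself
    with open(full_filename, "rb") as f:
        try:
            json.load(f)
        except (JSONDecodeError, UnicodeDecodeError):
            # Not parseable as JSON or not decodable: delete the file
            print("removing", full_filename)
            os.remove(full_filename)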
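If deleting files outright feels too aggressive, a dry-run variant of the loop above might look like the following. It is just a sketch: the function name find_invalid_notebooks is hypothetical, and it only reports the files that fail to parse rather than removing them.

```python
import json
import os
from json.decoder import JSONDecodeError


def find_invalid_notebooks(dirname):
    """Return sorted paths of files in `dirname` that fail to parse as JSON."""
    invalid = []
    for filename in sorted(os.listdir(dirname)):
        full_filename = os.path.join(dirname, filename)
        if not os.path.isfile(full_filename):
            continue  # skip subdirectories
        with open(full_filename, "rb") as f:
            try:
                json.load(f)
            except (JSONDecodeError, UnicodeDecodeError):
                invalid.append(full_filename)
    return invalid
```

The returned list can be reviewed first and then passed to os.remove one path at a time.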

kokes avatar Feb 16 '21 07:02 kokes

Also, in my testing, I only check for notebooks that throw exceptions, but that still misses:

  1. that one console.error
  2. extensions throwing errors (especially katex)

We should check for those as well.

kokes avatar Feb 16 '21 07:02 kokes