nbdev
Option To Generate "Clean" notebooks for docs / tutorials for Google Colab/PaperSpace etc
- I've met with Jonathan Whitaker, who runs a very popular course on AI Art with thousands of students. He is using nbdev & fastai for the next iteration of his course, which is to be released in a couple of months; the course is also sponsored by W&B. He has "run in colab" badges on his notebooks, but he wants a version of each notebook that is stripped of all directives, plus the option to add Colab-specific cells or hide certain things for Colab (like the pip install bits, etc).
- W&B docs also have "run in colab" on their docs, but they end up repeating themselves because they keep a slightly different version of the code for Google Colab.
I see that this is quite a common pattern for people wanting to make rich tutorials and such with nbdev. We have to think about the design a bit. Rough sketch:
front_matter:
---
copy_nb_path: ....
---
directives:
#|copy_exclude: marks cells that should be excluded from the copied notebook.
Open question: should we have a way to exclude markdown? Perhaps this is possible with conditional rendering?
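To make the rough sketch concrete, here is a minimal Python sketch of what the copy step could do to a notebook's JSON: drop code cells marked with the proposed `#|copy_exclude` directive and strip the remaining `#|` directive lines. This is an illustration only, not nbdev's implementation; `clean_notebook` and its helpers are hypothetical names, and `copy_exclude` is the directive proposed above, not one that exists today.

```python
import copy

DIRECTIVE_PREFIX = "#|"
EXCLUDE = "copy_exclude"  # directive proposed in this issue; hypothetical


def _lines(source):
    """Notebook JSON stores `source` as a string or as a list of lines."""
    text = source if isinstance(source, str) else "".join(source)
    return text.splitlines(keepends=True)


def _directive(line):
    """Return the directive name on a `#|` line, or None for ordinary lines."""
    stripped = line.strip()
    if not stripped.startswith(DIRECTIVE_PREFIX):
        return None
    parts = stripped[len(DIRECTIVE_PREFIX):].replace(":", " ").split()
    return parts[0] if parts else ""


def clean_notebook(nb):
    """Return a copy of `nb` without excluded cells or directive lines."""
    out = copy.deepcopy(nb)  # leave the caller's notebook untouched
    kept = []
    for cell in out["cells"]:
        lines = _lines(cell.get("source", ""))
        if cell.get("cell_type") == "code":
            names = [_directive(line) for line in lines]
            if EXCLUDE in names:
                continue  # drop the whole cell
            lines = [line for line, name in zip(lines, names) if name is None]
        cell["source"] = "".join(lines)
        kept.append(cell)
    out["cells"] = kept
    return out
```

Markdown cells pass through unchanged in this sketch, which leaves the open question about excluding markdown untouched.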
cc: @jph00 @seeM
Another thread https://twitter.com/charles_irl/status/1563335298213220352?s=21&t=oGbErx4SNkDUiqiI04G7iA
@hamelsmu At my org, we use Colab a lot. nbdev2 relies on raw cells if we wish to skip all tests and execution of cells while building docs. I mostly miss the #all_slow tag from nbdev1, which was very handy for testing nbs. As of now, I'm not aware of a way to add raw cells in Google Colab.
p.s. please let me know, if I can help with creation of a Colab tutorial! I'd be more than happy to help
You don't actually have to use raw cells at all, if you'd rather not. I don't use them myself. Instead, I use an alternate format, which is a markdown cell in this format:
# title
> description
- yamlkey1: something
- yamlkey2: other
Please give that a go and let me know if you have any issues.
I checked the nbs in the nbdev repo and couldn't find a comprehensive example using this format. I found that most nbs have this format.
# title
> description
- order: 1
I'm not sure what I would have to do if I wanted to skip execution during tests and while building docs. Would it be something like this?
# title
> description
- execute:
- eval: false
Also, about the raw cells: I got that notion from the "migrating from nbdev1" nb, so this would probably be a good addition to that section.
Good guess! It's actually: - skip_exec: true.
More details on this migration here: https://nbdev.fast.ai/top/migrating.html#update-directive-names
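For readers who want to see how the markdown-cell format discussed in this thread maps to key/value metadata, here is a small Python sketch that parses a `# title` / `> description` / `- key: value` cell and checks for `skip_exec`. It is an illustration under simple assumptions (one key per bullet, values kept as strings), not nbdev's actual parser; `parse_title_cell` and `should_skip_exec` are hypothetical names.

```python
def parse_title_cell(source):
    """Split a `# title / > description / - key: value` cell into a dict."""
    meta = {}
    for raw in source.splitlines():
        line = raw.strip()
        if line.startswith("# ") and "title" not in meta:
            meta["title"] = line[2:].strip()
        elif line.startswith(">"):
            meta["description"] = line[1:].strip()
        elif line.startswith("- ") and ":" in line:
            key, value = line[2:].split(":", 1)
            meta[key.strip()] = value.strip()
    return meta


def should_skip_exec(meta):
    """True when the cell carries `- skip_exec: true`."""
    return str(meta.get("skip_exec", "false")).lower() == "true"
```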
p.s. please let me know, if I can help with creation of a Colab tutorial! I'd be more than happy to help
We'd be delighted to see a colab tutorial :D
From the Forums


I'll be working on this soon. I've been chatting with the Quarto folks about this, and they have created a paved path that will make this possible! More to come soon
Excellent! Please let me know how I can be helpful - happy to test things, write docs etc.
Notes For Creating A Google Colab Shortcode
- To save "rendered" copies of notebooks, in addition to web pages for your docs, modify your `_quarto.yml` as shown below. By default, files are written with `out` inserted before the extension; for example, `docs.ipynb` is written to `_site/docs.out.ipynb`. You can change this with the `output-ext` field in `_quarto.yml`:
format:
  html:
    theme: cosmo
+ ipynb:
+   output-ext: output.ipynb # this is optional (you will likely leave this out and write `ipynb: {}` instead)
- Another option is that you don't have to specify this in `_quarto.yml` at all, but can render notebooks from the CLI with the `-M` flag. Note that we can leave out `about.ipynb` in the example below if we like:
quarto render about.ipynb --to ipynb -M output-ext:output.ipynb
- You can set repo URL metadata (the `repo-url` field under `website`), which we can automatically populate from `settings.ini`. I think we should do this by default so people can enable the "edit this page" button, which is very useful! We should also set the `repo-branch` and potentially the `repo-subdir` options.
- You can access metadata in a shortcode, but you have to make sure your entry point includes all three args, like this: `function(args, kwargs, meta)`
- You can use `PANDOC_STATE.output_file` to get the filename. Note that Quarto will process the document twice, once to HTML and once to .ipynb, so you will have to write an if statement to check for that.
- You can use the `QUARTO_PROJECT_DIR` env variable to get the root of the Quarto project directory, which gives you a full path (which I'm not sure is helpful yet). You can get this value in Lua with `os.getenv("QUARTO_PROJECT_DIR")`, and yes, it appears to be the same syntax as Python :p
I could not find a way to get the path of the current file in a shortcode, which is necessary for constructing the URL for the Colab badge, so I emailed JJ to ask
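As a rough illustration of the two rendering notes above, the Python sketch below computes the name a rendered notebook would get for a given `output-ext` and builds the `quarto render` command line using only the flags shown in these notes. `rendered_name` and `render_command` are hypothetical helpers, and the naming rule simply follows the `docs.ipynb` to `docs.out.ipynb` example rather than anything verified against Quarto's source.

```python
from pathlib import Path


def rendered_name(nb_path, output_ext="out.ipynb"):
    """docs.ipynb -> docs.out.ipynb, following the default described above."""
    return f"{Path(nb_path).stem}.{output_ext}"


def render_command(nb_path, output_ext=None):
    """Build the argv for rendering one notebook to ipynb from the CLI."""
    cmd = ["quarto", "render", str(nb_path), "--to", "ipynb"]
    if output_ext is not None:
        # -M passes a metadata override, as in the CLI example above
        cmd += ["-M", f"output-ext:{output_ext}"]
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, check=True)` on a machine where Quarto is installed.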
Design
- User specifies `copy_nb_dir` in `settings.ini`. If they set this variable, the following front matter gets injected into all notebooks, where `nbs_path/path_to_nb/file_name.ipynb` is the path to the current notebook relative to the root directory:
copy_nb_loc: nbs_path/path_to_nb/file_name.ipynb
- If the user has specified `copy_nb_dir`, they can now install an extension and put the shortcode `{{< colab >}}` on any notebook. This will construct the right URL, which will be something like this:
...github.com/owner/repo/blob/branch/{copy_nb_loc}
and render the Colab badge from it.
- In the future, because we have that frontmatter added automatically, we can add other kinds of badges if we like.
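The injection step in this design could look roughly like the Python sketch below, which derives `copy_nb_loc` for a notebook relative to the repo root and formats the front matter block. This is a sketch under assumptions: `copy_nb_loc` and `front_matter` are hypothetical helper names, paths are treated as POSIX-style strings, and nothing here touches the notebook file itself.

```python
from pathlib import PurePosixPath


def copy_nb_loc(nb_file, repo_root):
    """Path of the notebook relative to the repo root, with forward slashes."""
    return PurePosixPath(nb_file).relative_to(PurePosixPath(repo_root)).as_posix()


def front_matter(nb_file, repo_root):
    """Front matter block to inject, using the key name from the design above."""
    return f"---\ncopy_nb_loc: {copy_nb_loc(nb_file, repo_root)}\n---\n"
```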
Lua shortcode prototype for colab badges
-- colab.lua
local str = pandoc.utils.stringify
-- path of the output file within the project
local file = quarto.doc.project_output_file()
local prefix = 'https://colab.research.google.com/github/'
local img = '<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab" style="max-width: 100%;">'

-- shortcode entry point; all three args are required to get at `meta`
function colab(args, kwargs, meta)
  -- the document is rendered twice (html and ipynb); only emit the badge for ipynb
  if quarto.doc.isFormat('ipynb') then
    local path = str(meta['colab.gh-repo'])..'/blob/'..str(meta['colab.branch'])..'/'..file
    return pandoc.RawBlock('html', '<a href="'..prefix..path..'" rel="nofollow">'..img..'</a>')
  end
end
Corresponding `_quarto.yml` fields:
colab:
  gh-repo: hamelsmu/quarto_nbcopy
  branch: main
  exported_dir: colab/
Need to do the following:
- [ ] Need to fix the badge url to point to `exported_dir`
- [ ] Copy the `*.out.ipynb` files into `exported_dir`
- [ ] Set up machinery via `settings.ini` that sets the default `_quarto.yml` properly if `colab: True`
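The second checklist item (copying the rendered notebooks into the exported directory) could be sketched in Python as follows. This is only one possible shape for that step: `export_rendered` is a hypothetical helper, and the `*.out.ipynb` pattern and directory arguments are taken from the notes in this thread rather than from any existing nbdev command.

```python
import shutil
from pathlib import Path


def export_rendered(site_dir, exported_dir, pattern="*.out.ipynb"):
    """Copy rendered notebooks from the site dir, keeping their sub-paths."""
    site, dest = Path(site_dir), Path(exported_dir)
    copied = []
    # sorted() materialises the matches first, so copies are never re-scanned
    for src in sorted(site.rglob(pattern)):
        target = dest / src.relative_to(site)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target)
        copied.append(target)
    return copied
```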
ok new sketch
ipynb:
  output-ext: colab.ipynb
colab:
  exported-dir: colab/
And the lua
local str = pandoc.utils.stringify
local file = quarto.doc.project_output_file()
local colab_prefix = 'https://colab.research.google.com/github/'
local colab_img = pandoc.Image('', 'https://colab.research.google.com/assets/colab-badge.svg', 'Open in Colab')

-- make the trailing slash consistent
local function slash(s) return string.gsub(str(s), '/$', '')..'/' end

local function branch(meta)
  -- get the target repo's branch, giving precedence to the colab: branch field,
  -- then website: repo-branch, and finally falling back to 'main'
  local branch = meta['colab.branch']
  local web_branch = meta['website.repo-branch'] -- value set automatically by nbdev
  if branch == nil then
    if web_branch ~= nil then branch = web_branch else branch = 'main' end
  end
  return slash(branch)
end

local function repo(meta)
  -- get the name of the repo, giving precedence to the colab: github-repo field,
  -- but defaulting to parsing the website: repo-url field
  local repo = meta['colab.github-repo']
  local web_repo = meta['website.repo-url'] -- value set automatically by nbdev
  if repo == nil and web_repo ~= nil then
    repo = str(web_repo):gsub('https://github.com/', '')
  end
  return slash(repo)
end

local function subdir(meta)
  -- get the directory of the exported notebook
  local nbdir = meta['colab.exported-dir']
  if nbdir == nil then return ''
  else return slash(nbdir)
  end
end

function colab(args, kwargs, meta)
  -- construct the colab badge
  if quarto.doc.isFormat('html') then
    local path = repo(meta)..'blob/'..branch(meta)..subdir(meta)..file
    return pandoc.Div(pandoc.Link(colab_img, colab_prefix..path))
  end
end
Here is the repo with this code
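For checking the badge URL outside of a Quarto render, the precedence rules described in the Lua comments can be mirrored in plain Python. The sketch below is a hypothetical port for experimentation, not part of the extension: `colab_url` and its parameter names are made up here, and it assumes the intended order is the colab field first, then the website field, then `main`.

```python
COLAB_PREFIX = "https://colab.research.google.com/github/"


def _slash(s):
    """Make the trailing slash consistent."""
    return s.rstrip("/") + "/"


def colab_url(file, repo=None, branch=None, exported_dir=None,
              web_repo_url=None, web_branch=None):
    """Build the Colab URL for `file` using the fallbacks sketched in the Lua."""
    if repo is None:
        if web_repo_url is None:
            raise ValueError("need either a repo or a website repo-url")
        repo = web_repo_url.replace("https://github.com/", "")
    branch = branch or web_branch or "main"
    subdir = _slash(exported_dir) if exported_dir else ""
    return f"{COLAB_PREFIX}{_slash(repo)}blob/{_slash(branch)}{subdir}{file}"
```

For example, `colab_url("docs.colab.ipynb", repo="hamelsmu/quarto_nbcopy", exported_dir="colab")` points at `blob/main/colab/docs.colab.ipynb` in that repo.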