docs: add tutorial / how-to guide for embedding time-series data, e.g., using CausalCNNs
We now have a bunch of new embedding nets for processing sequential data, e.g., the CausalCNNEmbedding or the TransformerEmbedding (see also #1499, #1494, #1512).
It would be great to add a short how-to guide for using these embeddings, e.g., with a well-known time-series model such as SIR or Lotka-Volterra.
See also https://github.com/sbi-dev/sbi/pull/1499#issuecomment-2743020418 for inspiration.
Hello! Can I work on this, please?
Hello @satwiksps , thanks for offering to work on this 🙏
To get started, please have a look at our existing how-to guides here: https://sbi.readthedocs.io/en/latest/how_to_guide.html
and at the extended (quite long) tutorial here: https://sbi.readthedocs.io/en/latest/advanced_tutorials/04_embedding_networks.html
The how-to guide should give a short explanation and actionable code so that users can move on quickly. Let's discuss here how the guide for the embedding nets could look.
Please have a look at our contribution workflow as well; it's here: https://sbi.readthedocs.io/en/latest/contributing.html
let me know if you have any questions :)
Thanks, @janfb
I have gone through the existing sbi documentation and tutorials, including the how-to guides, the advanced tutorial on embedding networks, and the contributing guide, to understand the structure, tone, and technical workflow of the docs.
Here’s my plan for this how-to guide:
I’ll add a new Jupyter notebook under docs/how_to_guide/ titled “20_time_series_embedding.ipynb”. The goal is to provide a short, runnable example showing how to handle sequential simulator outputs with the new embedding networks introduced in PRs #1499 (CausalCNNEmbedding) and #1494 (TransformerEmbedding).
The notebook will:
- Use a simple SIR time-series simulator as the example.
- Demonstrate how to instantiate and use both CausalCNNEmbedding and TransformerEmbedding for sequential data.
- Integrate the chosen embedding into an NPE workflow (posterior_nn, simulate_for_sbi, train, and build_posterior).
- Briefly discuss when to prefer each embedding type (e.g., CNN for local patterns, Transformer for long-range dependencies).
- Follow the concise, example-oriented style of the existing how-to guides and link to the advanced embedding tutorial for deeper explanation.
I plan to include both embeddings in the same notebook, since they share the same workflow and purpose, and presenting them side by side will make it easier for users to compare their usage without repeating setup code (a sketch of the shared workflow follows below). Once the notebook is complete, I’ll add it to the how-to guide index (index.rst) and verify that the docs build correctly with Sphinx.
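To make the plan concrete, here is a rough sketch of the core cells I have in mind. Note that the import path and constructor arguments for CausalCNNEmbedding are my assumptions until I check the implementation from #1499; the rest follows the standard NPE workflow from the existing tutorials.

```python
import torch

from sbi.inference import NPE, simulate_for_sbi
from sbi.neural_nets import posterior_nn
from sbi.neural_nets.embedding_nets import CausalCNNEmbedding  # path assumed, see #1499
from sbi.utils import BoxUniform

# Toy setup: 2 parameters, simulator returns a length-100 "time series".
prior = BoxUniform(low=-torch.ones(2), high=torch.ones(2))

def simulator(theta: torch.Tensor) -> torch.Tensor:
    # Synthetic sequential output: random noise shifted by the parameter mean.
    return theta.mean(dim=-1, keepdim=True) + torch.randn(theta.shape[0], 100)

# NOTE: constructor arguments are placeholders; the actual signature is in #1499.
embedding_net = CausalCNNEmbedding(input_shape=(100,))

# Plug the embedding net into the density estimator and run the usual NPE loop.
density_estimator_builder = posterior_nn(model="nsf", embedding_net=embedding_net)
inference = NPE(prior=prior, density_estimator=density_estimator_builder)

theta, x = simulate_for_sbi(simulator, proposal=prior, num_simulations=1000)
inference.append_simulations(theta, x).train()
posterior = inference.build_posterior()
```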
If you approve, I will be happy to continue with this work.
Hi all!
- Unless there is a really good reason to do it, I would not use the SIR simulator for a how-to guide (it is too much boilerplate code). I would prefer to just use torch.randn.
- Keep the discussion of "when to prefer which embedding type" to the tutorials.
Thanks! Michael
Thanks for the summary and plan, @satwiksps, sounds good!
Good point by @michaeldeistler , given that the how-to-guide should be quite concise, we shouldn't use the SIR here. We could really just use a simulator that returns torch.randn(100) or so, to mimic the format of a time series.
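Something along these lines would be enough (just a sketch):

```python
import torch

def simulator(theta: torch.Tensor) -> torch.Tensor:
    # Dummy "time series" of shape (batch, 100); the content does not matter,
    # it only has to have the sequential format the embedding nets expect.
    return torch.randn(theta.shape[0], 100)
```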
We should also note that depending on the dimensionality of the data and the resulting dimensionality of the CNN or Transformer architectures, a GPU could be useful here.
And regarding Michael's other point about the discussion of what to use when: yes, let's not do this in the how-to guide, to keep it concise.
Thanks, @janfb and @michaeldeistler for guiding me.
I will proceed with those adjustments:
- I will replace the SIR simulator with a simple function that returns synthetic time-series data using torch.randn(), to keep the guide minimal and focused on the embedding workflow.
- The notebook will demonstrate both CausalCNNEmbedding and TransformerEmbedding, showing how to integrate them with posterior_nn and NPE.
- I will remove the discussion about when to use which embedding type to keep it concise.
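Swapping in the second embedding should then only touch the embedding_net line, roughly like this (again, the constructor arguments are placeholders until I verify the API from #1494):

```python
from sbi.neural_nets import posterior_nn
from sbi.neural_nets.embedding_nets import TransformerEmbedding  # path assumed, see #1494

# NOTE: constructor arguments are placeholders; the actual signature is in #1494.
embedding_net = TransformerEmbedding(input_shape=(100,))

# The rest of the workflow (NPE, simulate_for_sbi, train, build_posterior) is unchanged.
density_estimator_builder = posterior_nn(model="nsf", embedding_net=embedding_net)
```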