awesome-whole-cell-simulation
awesome-whole-cell-simulation copied to clipboard
MOVING TO: https://github.com/cirosantilli/cirosantilli.github.io/ SEE README. Because sometimes you just want to simulate single prokaryotic biological living whole cell models starting from DNA to m...
= Awesome whole cell simulation :idprefix: :idseparator: - :sectanchors: :sectlinks: :sectnumlevels: 6 :sectnums: :toc: macro :toclevels: 6 :toc-title:
MOVING TO: https://github.com/cirosantilli/cirosantilli.github.io/ as part of Ciro's plan to unify his educational repos as much as possible so that they will survive the apocalypse when GitHub makes a backup only of every proejct with more than 1000 stars.
Because sometimes you just want to simulate single prokaryotic biological living whole cell models starting from DNA to minute detail to understand how it works and predict simple experimental observations.
toc::[]
== The idea
I was thinking maybe something along:
- precalculated interactivity of the molecules (simulated?!) to speed things up
- then discretize cell spatially. TODO necessary? if yes, maybe a 2D slice type of model, on GPU? Model molecule concentration with time in each slot.
- Then modify genome, determine what proteins come out. Protein folding simulations?!
- precalculate interactions of new proteins, and loop
Difficulties:
- can you get enough data out of a single cell? How long until it doubles? Do the cells interact with one another, and or change the extracelullar medium noticeably? Having single cells would be simpler and more precise.
Another option is to make on huge cell in a <
== Cell free
Since cells are themselves to complicated, maybe we can start with simplified DNA systems?
- https://www.synbio.cam.ac.uk/initiatives/cell-free-synthetic-biology possibly by mashing up many cells. I need some references on that.
- https://en.wikipedia.org/wiki/Cell-free_system
- https://en.wikipedia.org/wiki/Cell-free_protein_synthesis
- Paper based medium: ** Paper-Based Synthetic Gene Networks. 2014. Keith Pardee, Alexander A. Green, Tom Ferrante, D. Ewen Cameron, Ajay DaleyKeyser, Peng Yin, and James J. Collins, * Wyss Institute for Biological.
Maybe we can make insulin more efficiently by selecting only the parts of the cell we care about!
- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5553329/ J. Porter Hunt, Seung Ook Yang, Kristen M. Wilding, and Bradley C. Bundy
Papers:
- Compartmentalization of an all-E. coli Cell-Free Expression System for the Construction of a Minimal Cell. 2016. Filippo Caschera, Vincent Noireaux
- Engine out of the chassis: Cell-free protein synthesis and its uses. 2013. Gabriel Rosenblum, Barry S. Cooperman.
- Rapid and Scalable Preparation of Bacterial Lysates for Cell-Free Gene Expression. 2017. Andriy Didovyk, Taishi Tonooka, Lev Tsimring, and Jeff Hasty.
- Developing cell-free biology for industrial applications. 2006. Jim Swartz.
- Portable, On-Demand Biomolecular Manufacturing. 2016. Keith Pardee, Shimyn Slomovic, Peter Q. Nguyen, Christopher N. Boddy, Neel S. Joshi, James J. Collins.
=== Minimal biology
I like this.
- http://www.bristol.ac.uk/news/2019/march/minimal-biology.html New Max Planck-Bristol Centre for Minimal Biology launched - 27 March 2019 ** https://www.youtube.com/watch?v=Fs4X7qbpc18
- https://www.ethz.ch/en/news-and-events/eth-news/news/2019/03/bacterial-genome-created-with-computer.html First bacterial genome created entirely with a computer - 2019-04-01. ** https://en.wikipedia.org/wiki/Caulobacter_crescentus 4,000 genes in nature, minimized to 680. One year with manufacturing costs of 120,000 Swiss francs.”
== Projects with source code
Shut up and give me the fine source.
- http://www.wholecellsimdb.org
- https://github.com/BertrandDechoux/OrganicBuilder
- https://github.com/HaseloffLab/CellModeller
- https://github.com/ecell/ecell4
- https://github.com/virtualcell/vcel
- https://github.com/idekerlab/DCell DCell browser and gene deletion simulator. DCell is an application to provide an easy-to-use user interface and interpretable neural network structure for modeling cell structure and function. ** https://github.com/idekerlab/DCell/issues/33#issuecomment-385191883 online demo does not work ** https://github.com/idekerlab/DCell/issues/36 local running does not work
== Videos
Shut up and show me a visualisation.
- https://www.youtube.com/watch?v=PSDd3oHj548 DOE CSGF 2016: Towards a Whole-cell Model of Escherichia coli. 2016. Derek Macklin. Covert lab.
- https://www.youtube.com/watch?v=haRYF73GV3M Simulations of Biological Processes on the Whole Cell Level. Blue Waters Symposium presentation. Project PI: Zaida Ann Luthey-Schulten, University of Illinois at Urbana-Champaign. 2014.
- https://www.youtube.com/watch?v=R6EwzkGyRJ0 Simulating whole cell with E-Cell System by Koichi Takahashi. 2013.
- https://www.youtube.com/watch?v=j84sF_81gCo E.coli in Action: GPU Technology Emables Whole Cell Simulation. 2010.
== Research areas
- https://en.wikipedia.org/wiki/Cellular_model
- https://en.wikipedia.org/wiki/Modelling_biological_systems
- https://en.wikipedia.org/wiki/Systems_biology
- https://en.wikipedia.org/wiki/Bioinformatics https://en.wikipedia.org/wiki/Computational_biology https://www.reddit.com/r/bioinformatics/new/
- https://en.wikipedia.org/wiki/Molecular_dynamics This is interesting on the simulate proteins point of view. The ex wall street dude agrees: ** https://en.wikipedia.org/wiki/D._E.Shaw_Research ** https://www.deshawresearch.com/ ** Dude has custom silicon for it, amazing: *** https://www.nextplatform.com/2016/02/04/anton-sequel-makes-stronger-case-for-custom-supercomputing/ *** https://en.wikipedia.org/wiki/Anton(computer)
=== Cell minimization
- https://en.wikipedia.org/wiki/Artificial_cell#The_minimal_cell
- https://en.wikipedia.org/wiki/Mycoplasma_laboratorium#Minimal_genome_project
=== Gene editing
Ah, it would be even more awesome if we could hack up the cells and see them do stuff.
Heart only in second half 2010's did it become possible to edit genes, but coding the entire DNA from scratch is still too expensive.
- https://en.wikipedia.org/wiki/Genome_editing
Previously, you would have to:
- shine life with UV to get random modifications
- inject plasmids by electrict or heat shocks: https://en.wikipedia.org/wiki/Plasmid
and then kill ones that didn't get the gene, which is less reliable.
https://en.wikipedia.org/wiki/Genome_Project-Write
==== Exciting gene edited systems
- 2019-06 https://www.ethz.ch/en/news-and-events/eth-news/news/2019/06/making-systems-robust.html "ETH researchers add integral feedback control mechanisms in bacteria that maintains constant GFP levels"
== Papers
I guess this is what researchers do instead of blog posts. Go figure!
- The principles of whole-cell modeling. Jonathan R Karr, Koichi Takahashi and Akira Funahashi
- The Future of Whole-Cell Modeling. Derek N. Macklin, Nicholas A. Ruggero, and Markus W. Covert
- Paper-Based Synthetic Gene Networks. Keith Pardee, Alexander A. Green, Tom Ferrante D. Ewen Cameron, Ajay DaleyKeyser, Peng Yin, and James J. Collins Wyss
- Paper as a novel material platform for devices. Jason P. Rolland and Devin A. Mourey
- link:++http://www.cell.com/abstract/S0092-8674(12)00776-3++[] https://www.youtube.com/watch?v=AYC5lE0b8os A Whole-Cell Computational Model Predicts Phenotype from Genotype. Jonathan R. Karr, Jayodita C. Sanghvi, Derek N. Macklin, Miriam V. Gutschow, Markus Covert. Notes: Mycoplasma genitalium. Model apparently at: https://simtk.org/projects/wholecell
== Lectures
- Genomics, Epigenetics & Synthetic Biology. Jim Haseloff. ** http://data.plantsci.cam.ac.uk/Haseloff/education/synbio_index/index.html ** http://data.plantsci.cam.ac.uk/Haseloff/resources/Part2SynBio_refs/PlantSyntheticBiology2018_Lect3s.pdf
- Spatially Distributed Stochastic Dynamical Systems in Biology https://www.newton.ac.uk/event/sdbw04 2016 Michael J. Hallock (University of Illinois at Urbana-Champaign), Joseph R. Peterson (University of Illinois at Urbana-Champaign), John A. Cole (University of Illinois at Urbana-Champaign), Tyler M. Earnest (University of Illinois at Urbana-Champaign), John E. Stone (University of Illinois at Urbana-Champaign)
== Q&A
- https://www.quora.com/How-well-can-whole-cell-simulations-model-the-effects-of-mutated-genes-SNPs
- https://www.quora.com/What-are-some-simulations-used-for-whole-cell-simulation
- https://www.quora.com/unanswered/What-can-we-learn-from-whole-cell-simulations
- https://discuss.biomake.space/t/whole-cell-modelling-simulation-and-verification-experiments/841
- https://www.quora.com/unanswered/How-far-are-we-from-fully-understanding-and-mathematically-modeling-the-metabolism-of-a-bacteria-like-E-coli-or-mycoplasma
- https://www.quora.com/unanswered/Why-would-you-study-eukaryotes-in-system-biology-instead-of-prokaryotes-which-are-much-simpler
- https://www.quora.com/unanswered/Why-would-you-study-eukaryotes-in-system-biology-instead-of-prokaryotes-which-are-much-simpler
== Projects without source code
- Bio cell https://www.youtube.com/watch?v=PSDd3oHj548
== Big projects, institutes and companies
- http://www.sanger.ac.uk ** http://www.sanger.ac.uk/science/groups/single-cell-genomics-core-facility *** https://www.sanger.ac.uk/science/collaboration/sanger-institute-ebi-single-cell-genomics-centre Single-Cell Genomics Centre ** http://www.sanger.ac.uk/science/groups/parts-group Genetic screens of cellular traits ** https://www.sanger.ac.uk/science/groups/voet-group Single-cell genomics ** https://www.sanger.ac.uk/science/groups/hemberg-group Quantitative models of gene expression ** https://www.sanger.ac.uk/science/groups/marioni-group Single cell genomics
- https://www.jic.ac.uk/
- https://en.wikipedia.org/wiki/Horizon_Discovery
- https://www.openplant.org/
- https://www.broadinstitute.org/about-us "Assemble a complete picture of the molecular components of life". Found through their awesome YouTube channel: https://www.youtube.com/channel/UCv4IbnP9j9RC_aZAs8wqdeQ Which does not allows comments lol.
- https://en.wikipedia.org/wiki/Cold_Spring_Harbor_Laboratory ** http://meetings.cshl.edu/SingleCell18 Single Cell Analysis Workshop
- https://en.wikipedia.org/wiki/National_Center_for_Biotechnology_Information
== Experimental
Visibility:
- https://en.wikipedia.org/wiki/Single_cell_sequencing
- Can't see cells on traditional electron microscopes: ** https://newatlas.com/quantum-electron-microscope/13056/ ** https://www.researchgate.net/post/Can_living_cells_be_studied_with_electron_microscopy ** SEM: nm resolution
- Protein measurement ** https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4844680/ Real-time quantification of protein expression at the single-cell level via dynamic protein synthesis translocation reporters - 2016 - Delphine Aymoz ** https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3910158/ What is the total number of protein molecules per cell volume? A call to rethink some published values - 2013 - Ron Milo ** https://www.youtube.com/watch?v=lJ2T8r1xB1A Progress, challenges and standards for single cell proteomics | Nikolai Slavov | SCP2018. There's a converence just for that! https://www.northeastern.edu/scp2018/
- REAP-seq and CITE-seq: https://www.nature.com/articles/d41586-018-05214-w
Manipulate individual cells:
- mother machine: allows to observe and hold indivual bacteria ** https://jun.ucsd.edu/mother_machine.php ** https://www.youtube.com/watch?v=yrJzMW5jcbM
- https://www.youtube.com/watch?v=adCjRYpDSxM Abnova - Micro-Manipulator - Single Cell Collection - Microscope and pipette
DNA sequencing:
- https://nanoporetech.com/how-it-works
=== Single cell stuff
Protein measurement:
- https://www.sciencedirect.com/science/article/pii/S2211124715013637 Quantification of Protein Levels in Single Living Cells Chiu-AnLo13IbrahimKays13FaridaEmran1Tsung-JungLin1VedranaCvetkovska1Brian EdwinChen1
Companies:
- https://www.berkeleylights.com/ | https://www.crunchbase.com/organization/berkeley-lights Founded 2011, 225 USD investment by 2018, single cell manipulation ** https://www.youtube.com/watch?v=6gTGJhja0oI Speeding Cell Development with Berkeley Lights - 2017
- https://www.spherefluidics.com/company/about-us/ | https://www.youtube.com/watch?v=N9XpZHvnzys | Cambridge, UK | Founded: 2010.
- https://www.labcyte.com/
- https://www.10xgenomics.com/ ** This is what many people are using for single cell RNAseq ** HQ: Pleasanton, CA ** Founded: 2012 ** https://www.crunchbase.com/organization/10x-genomics#section-overview ** Investment: 250M in 2019-01 ** Technique: https://www.10xgenomics.com/solutions/single-cell-cnv/ Separate cells into droplets. Add unique genetic barcode to each droplet.
=== Mass spectroscopy
Potentially measure the quantities of every substance in the cell?
- https://www.quora.com/unanswered/Why-would-you-study-eukaryotes-in-system-biology-instead-of-prokaryotes-which-are-much-simpler
- https://www.youtube.com/watch?v=D4JtnM-4Lds Single Cell Proteomics by Mass-spec | CSHL Meeting: Single Cell Analyses 2017 - Nikolai Slavov
- https://www.youtube.com/watch?v=PFOodSbH9IY
== Cell behaviour
Random list of interesting cell behaviour that we have to model and might verify, in particular what kind of external environment they expect to encounter:
- https://en.wikipedia.org/wiki/Toxin-antitoxin_system
- Movement: ** https://www.quora.com/Does-bacteria-move-If-it-does-how ** https://www.quora.com/How-do-bacteria-know-what-to-do
- https://en.wikipedia.org/wiki/CRISPR prokaryote immune system
- https://en.wikipedia.org/wiki/Bacterial_circadian_rhythm cyanobacteria have a circadian rhythm
- https://en.wikipedia.org/wiki/Non-coding_RNA
- https://en.wikipedia.org/wiki/Budding
== Books
Questions:
- https://www.quora.com/What-are-some-good-books-on-molecular-biology
== People
- https://en.wikipedia.org/wiki/Craig_Venter
https://motherboard.vice.com/en_us/article/jpgpz8/craig-venter-created-the-simplest-living-organism-possible-in-a-laboratory + https://youtu.be/HdgfzdlgUHw?t=90 TEDxCaltech - Future Biology, J. Craig Venter, 2011. Managed a full genome transplant and de-novo synthesis?
- Jim Swarts Oxford
- Markus Covert, Stanford. https://www.youtube.com/watch?v=P4OZUFCew0U https://en.wikipedia.org/wiki/Markus_W._Covert
Cambridge UK:
- https://www.sysbiol.cam.ac.uk/Investigators/steve-oliver yeast ** https://www.bioc.cam.ac.uk/research/uto/oliver
- https://ralser-sysbiol.crick.ac.uk/ yeast, mass spectrometry ** https://www.bioc.cam.ac.uk/research/uto/ralser
- https://www.slcu.cam.ac.uk/directory/locke-james
London:
- https://crick.ac.uk
== Courses and conferences
- 2019 05 14-15 - 5th Annual Single Cell Analysis USA Congress - Boston, USA
- 2019 01 13-17 - Keystone Symposia - Single Cell Biology, Colorado, USA - http://www.keystonesymposia.org/19L1 | http://web.archive.org/web/20181229084812/http://www.keystonesymposia.org/19L1
- 2018 09 20-21 - Single Cell Europe Conference, BIOCEV, Prague, Czech Republic - https://singlecell2018.eu/ | http://web.archive.org/web/20181229085120/https://singlecell2018.eu/
- 2018 10 29-31 - Single Cell Genomics 2018 - Broad Institute of MIT and Harvard - http://www.weizmann.ac.il/conferences/SCG2018/program
- 2018 - Single Cell Biology - Welcome Genome Campus, Cambridge, UK - https://coursesandconferences.wellcomegenomecampus.org/our-events/single-cell-biology-2018/
- 2018 - Single cell ecology - The Royal Society - https://royalsociety.org/science-events-and-lectures/2018/12/single-cell/ | http://web.archive.org/web/20181229090134/https://royalsociety.org/science-events-and-lectures/2018/12/single-cell/
== Vendors
- https://en.wikipedia.org/wiki/ATCC_(company)
== Data sources
- https://www.ncbi.nlm.nih.gov/ ** https://www.ncbi.nlm.nih.gov/genbank/ | https://en.wikipedia.org/wiki/GenBank ** NCBI RefSeq: reference genome sequences, the most highly curated genomes they have available, and likely what you want to start with: *** https://en.wikipedia.org/wiki/Reference_genome *** https://en.wikipedia.org/wiki/Genome_Reference_Consortium *** https://www.ncbi.nlm.nih.gov/projects/genome/guide/human/index.shtml
- http://www.genome.jp/kegg/ Kyoto Encyclopedia of Genes and Genomes. KEGG is a database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism and the ecosystem, from molecular-level information, especially large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies.
- https://www.ebi.ac.uk/biomodels-main/ all in <
> format apparently? ** https://www.ebi.ac.uk/intact/ IntAct provides a freely available, open source database system and analysis tools for molecular interaction data. All interactions are derived from literature curation or direct user submissions and are freely available. ** https://www.uniprot.org/ | https://en.wikipedia.org/wiki/UniProt The mission of UniProt is to provide the scientific community with a comprehensive, high-quality and freely accessible resource of protein sequence and functional information. ** https://www.ebi.ac.uk/interpro/ https://en.wikipedia.org/wiki/InterPro ** http://pfam.xfam.org/ The Pfam database is a large collection of protein families, each represented by multiple sequence alignments and hidden Markov models (HMMs). - https://reactome.org/ Reactome is a free, open-source, curated and peer-reviewed pathway database.
- https://www.imexconsortium.org/ https://en.wikipedia.org/wiki/International_Molecular_Exchange_Consortium A non-redundant set of physical molecular interaction data from a broad taxonomic range of organisms.
- http://www.proteomexchange.org/ The ProteomeXchange Consortium has been set up to provide a globally coordinated submission of mass spectrometry proteomics data to the main existing proteomics repositories, and to encourage optimal data dissemination.
- https://www.ensembl.org/index.html | https://en.wikipedia.org/wiki/Ensembl_genome_database_project
- https://www.ddbj.nig.ac.jp/index-e.html
- https://www.wwpdb.org/ Since 1971, the Protein Data Bank archive (PDB) has served as the single repository of information about the 3D structures of proteins, nucleic acids, and complex assemblies.
- http://phobius.sbc.su.se/ A combined transmembrane topology and signal peptide predictor.
Questions that beg for a database answer:
- https://www.quora.com/How-quickly-do-bacteria-reproduce
=== How to download the genomes?
It is freaking hard to get the FASTA with wget links? OMG it is so bad.
http://seqanswers.com/forums/showthread.php?t=18354
Best way so far is to get accession number of type NC_001416.1 and then:
.... wget -O NC_001416.1.fasta 'https://www.ncbi.nlm.nih.gov/search/api/sequence/NC_001416.1/?report=fasta' ....
TODO:
- where is that API documented?
- how to download zipped data?
- data sources?
- how is population genetic variation accounted for?
- what do the
NNNNmean? Uknown? Present on human genome. - what are the "unlocalized genomic scaffold" regions?
=== Cool genomes
==== Lambda phage
https://en.wikipedia.org/wiki/Lambda_phage
.... wget https://www.ncbi.nlm.nih.gov/nuccore/NC_001416.1?report=fasta&log$=seqview&format=text ....
=== Data formatss
==== FASTA
Just raw sequence + origin / id metadata.
https://en.wikipedia.org/wiki/FASTA_format
==== FASTQ
FASTA + unstandardized ASCII scores for base pair calls.
Widely output by sequencing machines as of 2010's.
https://en.wikipedia.org/wiki/FASTQ_format
==== SBML
http://sbml.org/Main_Page
A file format for models?!
== Low entry barrier
DIY off topic you don't need to be a PhD type of resources for people like me
- https://en.wikipedia.org/wiki/Do-it-yourself_biology
== Molecular dynamics
- lists: ** https://youtu.be/yaLPLRO1FLE?t=2075 Introduction to Molecular Dynamics Simulations - Ali Kerrache, 2017, WestGrid ** https://en.wikipedia.org/wiki/Comparison_of_software_for_molecular_mechanics_modeling ** https://www.quora.com/How-can-I-know-or-predict-the-various-chemical-properties-of-all-elements ** https://www.quora.com/How-are-the-various-physical-and-chemical-properties-of-elements-and-compounds-predicted ** https://en.wikipedia.org/wiki/Ab_initio_quantum_chemistry_methods
- protein folding ** https://scicomp.stackexchange.com/questions/1179/are-open-source-codes-available-to-study-protein-folding ** https://www.cresset-group.com/products/ Flare, commercial: https://www.youtube.com/watch?v=E0_pc1qMvWk
- general molecular dynamics: ** http://lammps.sandia.gov/ | https://en.wikipedia.org/wiki/LAMMPS ** http://www.gromacs.org/ | https://en.wikipedia.org/wiki/GROMACS by European universities ** https://github.com/OpenMD/OpenMD ** http://ambermd.org/GetAmber.php freemium, GPL base
- quantum: ** toys: *** https://www.youtube.com/watch?v=jHyO0A7C86E Quantum simulation 1 - double slit experiment 0 - shinzon0 *** http://www.falstad.com/mathphysics.html ** http://www.quantum-espresso.org/ ** http://www.mpqc.org/ Last Update: 2013-08-16.
- algorithms ** https://en.wikipedia.org/wiki/Car%E2%80%93Parrinello_molecular_dynamics Car-Parrinello, looks like the big one.
- people ** the 1998 Nobel prize of chemistry was for computational chemistry: *** https://en.wikipedia.org/wiki/Walter_Kohn *** https://en.wikipedia.org/wiki/John_Pople
In particular, he created a neat little diagram that summarizes the computational efforst vs precision tradeoff of certain classes of algorithms: https://en.wikipedia.org/wiki/File:Pople_diagram_reverse_final.pdf
=== Molecule design
Games:
- https://eternagame.org/home/ Eterna RNA design
== Other random bioinformatics topics that should not be in this repo until I rename it
- http://rosalind.info/problems/topics/ bioinformatics HackerRank with a few dozen problems
Awesome lists:
- https://github.com/danielecook/Awesome-Bioinformatics
How to do s**t in bioinformatics repos:
- link:https://github.com/stephenturner/oneliners[]: too much POSIX that I already know :-)
- https://github.com/rasbt/protein-science
=== Just some awesome visualizations
- https://micro.magnet.fsu.edu/ a piece of early 2000's beauty
=== Awesome history of biology
- https://www.youtube.com/user/webofstories insanely awesome interview with famous people, bio, scientists and more
== Techniques
=== DNA synthesis
Companies:
- https://en.wikipedia.org/wiki/Twist_Bioscience San Francisco, founded 2013, raised 190M USD by 2018, silicon arrays.
- https://www.evonetix.com/technology/ Cambridge, founded 2015, UK, raised 14M USD by 2018, silicon arrays
- http://dnascript.co/ Paris, enzymatic, raised 24M USD by 2018.
- https://www.nuclera.com/ Cambride, UK, enzymatic, raised 1M USD by 2018
- https://www.ansabio.com/ USA East coast
- http://molecularassemblies.com/ https://www.crunchbase.com/organization/molecular-assemblies 7M raised by 2018, founded 2013, Sann Diego, USA.
Tools:
- https://en.wikipedia.org/wiki/BioBrick
Summary of all enzymatic companies in 2019 at Synbiobeta 2019: https://twitter.com/jiahao28/status/1179532081958809602 (http://archive.is/XAUt2[archive])
=== Cell synthesis
Models need experimental data, experimental data needs models:
- https://www.sciencemag.org/news/2018/11/biologists-create-most-lifelike-artificial-cells-yet
- https://en.wikipedia.org/wiki/Genome_Project-Write
=== DNA sequencing
DNA microarray:
- https://en.wikipedia.org/wiki/DNA_microarray
- https://bitesizebio.com/7463/how-dna-microarrays-are-built/
- https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4011503/ - DNA microarrays: Types, Applications and their future - 2013 - Roger Bumgarner
== Action
=== Bowtie2
.... git clone https://github.com/BenLangmead/bowtie2 cd bowtie2 BT2_HOME="$(pwd)" git checkout f5d794d7588a5ce4a7e735c42667be5abe0cdaf2 make mkdir tmp cd tmp "${BT2_HOME}/bowtie2-build" "${BT2_HOME }/example/reference/lambda_virus.fa" lambda_virus "$BT2_HOME/bowtie2" -x lambda_virus -U "${BT2_HOME}/example/reads/reads_1.fq" -S eg1.sam ....
What happened:
example/reference/lambda_virus.fais the input <> file with the reference genome reads_1.fqis a <> file with the reads and the base call quality.
The bowtie2 manual says that these were just generated from the reference genome input, and are not real read data. + This program can also generate such fake data from reference genomes: https://github.com/nh13/DWGSIM
eg1.samis the output, which says where each read is most likely to go. It is documented at: https://github.com/samtools/hts-specs
TODO: how to:
- visualize
eg1.samalignment? Possibly: https://github.com/igvteam/igv/ - convert
eg1.saminto the most likely FASTA?
== Bibliography
- http://book.bionumbers.org/ Google keeps sending me there.