DL reviews

Eraslan, G., Avsec, Ž., Gagneur, J. et al. Deep learning: new computational modelling techniques for genomics. Nat Rev Genet 20, 389–403 (2019)

Koo, Peter K., and Matt Ploenzke. “Deep learning for inferring transcription factor binding sites.” Current opinion in systems biology 19 (2020)

Wong, A.K., Sealfon, R.S.G., Theesfeld, C.L. et al. Decoding disease: from genomes to networks to phenotypes. Nat Rev Genet 22, 774–790 (2021)

AlQuraishi, M., Sorger, P.K. Differentiable biology: using deep learning for biophysics-based and data-driven modeling of molecular mechanisms. Nat Methods 18, 1169–1180 (2021)

Whalen, S., Schreiber, J., Noble, W.S. et al. Navigating the pitfalls of applying machine learning in genomics. Nat Rev Genet 23, 169–181 (2022)

Biology reviews

Zeitlinger J. Seven myths of how transcription factors read the cis-regulatory code. Curr Opin Syst Biol. 2020;23: 22–31.

Preissl, S., Gaulton, K.J. & Ren, B. Characterizing cis-regulatory elements using single-cell epigenomics. Nat Rev Genet (2022)

Isbel, L., Grand, R.S. & Schübeler, D. Generating specificity in genome regulation through transcription factor sensitivity to chromatin. Nat Rev Genet (2022)

Kwasnieski JC, Mogno I, Myers CA, Corbo JC, Cohen BA. Complex effects of nucleotide variants in a mammalian cis-regulatory element. Proc Natl Acad Sci. 2012;109: 19498–19503.

Jolma A, Yan J, Whitington T, Toivonen J, Nitta KR, Rastas P, et al. DNA-Binding Specificities of Human Transcription Factors. Cell. 2013;152: 327–339.

Papers

Alipanahi B, Delong A, Weirauch MT, Frey BJ. Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning. Nat Biotechnol. 2015;33: 831–838.

Kelley DR, Snoek J, Rinn JL. Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks. Genome Res. 2016;26: 990–999.

Zhou J, Troyanskaya OG. Predicting effects of noncoding variants with deep learning-based sequence model. Nat Methods. 2015;12: 931–934.
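
The three models above share a common skeleton: DNA is one-hot encoded and passed through convolution, pooling, and dense layers, with the first-layer filters acting as learned motif scanners. A minimal PyTorch sketch of that pattern (layer sizes are illustrative, not the published architectures):

    import torch
    import torch.nn as nn

    def one_hot(seq):
        # Encode an ACGT string as a (4, L) tensor, one channel per base.
        idx = {"A": 0, "C": 1, "G": 2, "T": 3}
        x = torch.zeros(4, len(seq))
        for i, base in enumerate(seq):
            x[idx[base], i] = 1.0
        return x

    # Illustrative Basset-style stack: conv filters scan for motifs,
    # max-pooling keeps the best match per window, dense layers combine them.
    model = nn.Sequential(
        nn.Conv1d(4, 64, kernel_size=19, padding=9),
        nn.ReLU(),
        nn.MaxPool1d(8),
        nn.Flatten(),
        nn.LazyLinear(32), nn.ReLU(),
        nn.LazyLinear(1),  # e.g. an accessibility logit for one cell type
    )

    x = one_hot("ACGT" * 250).unsqueeze(0)  # batch of one 1-kb sequence
    print(model(x).shape)                   # torch.Size([1, 1])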

Zhou J, Park CY, Theesfeld CL, Yuan Y, Sawicka K, Darnell JC, et al. Whole-genome deep learning analysis reveals causal role of noncoding mutations in autism. Nat Genet. 2019;51: 973–980.

Agarwal V, Shendure J. Predicting mRNA abundance directly from genomic sequence using deep convolutional neural networks. Cell Rep. 2020;31: 107663.

Bogard N, Linder J, Rosenberg AB, Seelig G. A Deep Neural Network for Predicting and Engineering Alternative Polyadenylation. Cell. 2019;178: 91–106.

Jaganathan K, Kyriazopoulou Panagiotopoulou S, McRae JF, Darbandi SF, Knowles D, Li YI, et al. Predicting Splicing from Primary Sequence with Deep Learning. Cell. 2019;176: 535-548.e24.

Atak ZK, Taskiran II, Demeulemeester J, Flerin C, Mauduit D, Minnoye L, et al. Interpretation of allele-specific chromatin accessibility using cell state-aware deep learning. Genome Res. 2021;31: 1082–1096.

Fudenberg, G., Kelley, D.R. & Pollard, K.S. Predicting 3D genome folding from DNA sequence with Akita. Nat Methods 17, 1111–1117 (2020)

Kim DS, Risca V, Reynolds D, Chappell J, Rubin A. The dynamic, combinatorial cis-regulatory lexicon of epidermal differentiation. Nat Genet. 2021;53: 1564–1576.

Dey KK, van de Geijn B, Kim SS, Hormozdiari F, Kelley DR, Price AL. Evaluating the informativeness of deep learning annotations for human complex diseases. Nat Commun. 2020;11: 4703.

Kelley DR. Cross-species regulatory sequence activity prediction. PLoS Comput Biol. 2020;16: e1008050.

Quang D, Xie X. FactorNet: a deep learning framework for predicting cell type specific transcription factor binding from nucleotide-resolution sequential data. Methods. 2019;166: 40–47.

Maslova A, Ramirez RN, Ma K, Schmutz H, Wang C, Fox C, et al. Deep learning of immune cell differentiation. Proc Natl Acad Sci. 2020;117: 25655–25666.

Zheng A, Lamkin M, Zhao H, Wu C, Su H, Gymrek M. Deep neural networks identify sequence context features predictive of transcription factor binding. Nat Mach Intell. 2021;3: 172–180.

Karbalayghareh A, Sahin M, Leslie CS. Chromatin interaction-aware gene regulatory modeling with graph attention networks. Genome Res. 2022;32: 930–944.

Avsec Ž, Weilert M, Shrikumar A, Krueger S, Alexandari A, Dalal K, et al. Base-resolution models of transcription-factor binding reveal soft motif syntax. Nat Genet. 2021;53: 354–366.

de Almeida BP, Reiter F, Pagani M, Stark A. DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of synthetic enhancers. Nat Genet. 2022;54: 613–624.

Avsec Ž, Agarwal V, Visentin D, Ledsam JR, Grabska-Barwinska A, Taylor KR, et al. Effective gene expression prediction from sequence by integrating long-range interactions. Nat Methods. 2021;18: 1196–1203.

Zhou J. Sequence-based modeling of three-dimensional genome architecture from kilobase to chromosome scale. Nat Genet. 2022;54: 725–734.

Toneyan S, Tang Z, Koo PK. Evaluating deep learning for predicting epigenomic profiles. bioRxiv. 2022. [Preprint] DOI: 10.1101/2022.04.29.490059

Yuan H, Kelley DR. scBasset: sequence-based modeling of single-cell ATAC-seq using convolutional neural networks. bioRxiv. 2021. [Preprint] https://www.biorxiv.org/content/10.1101/2021.09.08.459495v1 (to reproduce)

Transformers for DNA

Cross-species

TF binding

In silico mutagenesis w/ compressed sensing

Hi-C

Interpretability

Design approach

Koo PK, Eddy SR. Representation learning of genomic sequence motifs with convolutional neural networks. PLoS Comput Biol. 2019;15: e1007560. PMCID: PMC6941814

Koo PK, Ploenzke M. Improving representations of genomic sequence motifs in convolutional networks with exponential activations. Nat Mach Intell. 2021;3: 258–266. PMCID: PMC8315445

Liu Y, Barr K, Reinitz J. Fully interpretable deep learning model of transcriptional control. Bioinformatics. 2020;36: i499–i507.

Ullah F, Ben-Hur A. A self-attention model for inferring cooperativity between regulatory features. Nucleic Acids Res. 2021;49: e77.

Li J, Pu Y, Tang J, Zou Q, Guo F. DeepATT: a hybrid category attention neural network for identifying functional effects of DNA sequences. Brief Bioinform. 2021;22.

Attribution analysis

Simonyan K, Vedaldi A, Zisserman A. Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps. arXiv. 2013. [Preprint] DOI: 10.48550/arXiv.1312.6034

Zeiler MD, Fergus R. Visualizing and Understanding Convolutional Networks. Computer Vision – ECCV 2014. Springer International Publishing; 2014. pp. 818–833.

Yosinski, Jason, et al. “Understanding neural networks through deep visualization.” arXiv preprint arXiv:1506.06579 (2015).

Selvaraju, Ramprasaath R., et al. “Grad-CAM: Why did you say that?.” arXiv preprint arXiv:1611.07450 (2016).

Ribeiro, Marco Tulio, Sameer Singh, and Carlos Guestrin. “Model-agnostic interpretability of machine learning.” arXiv preprint arXiv:1606.05386 (2016).

Shrikumar A, Greenside P, Kundaje A. Learning Important Features Through Propagating Activation Differences. arXiv. 2017. [Preprint] DOI: 10.48550/arXiv.1704.02685

Sundararajan M, Taly A, Yan Q. Axiomatic Attribution for Deep Networks. arXiv. 2017. [Preprint] DOI: 10.48550/arXiv.1703.01365

Smilkov D, Thorat N, Kim B, Viégas F, Wattenberg M. SmoothGrad: removing noise by adding noise. arXiv. 2017. [Preprint] DOI: 10.48550/arXiv.1706.03825
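
The gradient-based methods above reduce to a few lines each. A sketch of integrated gradients (Sundararajan et al.) with optional SmoothGrad-style noise averaging, assuming a PyTorch model that maps a (1, 4, L) one-hot tensor to a scalar:

    import torch

    def integrated_gradients(model, x, baseline=None, steps=50):
        # Approximate the path integral of gradients from a baseline
        # (all-zeros here) to the input x.
        if baseline is None:
            baseline = torch.zeros_like(x)
        total = torch.zeros_like(x)
        for alpha in torch.linspace(0.0, 1.0, steps):
            xi = (baseline + alpha * (x - baseline)).requires_grad_(True)
            model(xi).sum().backward()
            total += xi.grad
        # Completeness: attributions sum to roughly f(x) - f(baseline).
        return (x - baseline) * total / steps

    def smoothgrad(attr_fn, model, x, n=20, sigma=0.1):
        # SmoothGrad: average attributions over noisy copies of the input.
        return torch.stack([
            attr_fn(model, x + sigma * torch.randn_like(x)) for _ in range(n)
        ]).mean(0)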

Lundberg SM, Lee S-I. A Unified Approach to Interpreting Model Predictions. arXiv. 2017. [Preprint] DOI: 10.48550/arXiv.1705.07874

Montavon, Grégoire, et al. “Explaining nonlinear classification decisions with deep taylor decomposition.” Pattern recognition 65 (2017): 211-222.

Koh, Pang Wei, and Percy Liang. “Understanding black-box predictions via influence functions.” International conference on machine learning. PMLR, 2017.

Adebayo, Julius, et al. “Sanity checks for saliency maps.” Advances in neural information processing systems 31 (2018).

Alvarez-Melis D, Jaakkola TS. On the robustness of interpretability methods. arXiv. 2018. [Preprint] DOI: 10.48550/arXiv.1806.08049

Ghorbani, Amirata, Abubakar Abid, and James Zou. “Interpretation of neural networks is fragile.” Proceedings of the AAAI conference on artificial intelligence. Vol. 33. No. 01. 2019.

Wang Z, Wang H, Ramkumar S, Fredrikson M, Mardziel P, Datta A. Smoothed geometry for robust attribution. arXiv. 2020. [Preprint] DOI: 10.48550/arXiv.2006.06643

Erion G, Janizek JD, Sturmfels P, Lundberg SM, Lee S-I. Improving performance of deep learning models with axiomatic attribution priors and expected gradients. Nat Mach Intell. 2021;3: 620–631.

Jha A, Aicher JK, Gazzara MR, Singh D, Barash Y. Enhanced Integrated Gradients: improving interpretability of deep learning models using splicing codes as a case study. Genome Biol. 2020;21: 149.

Finnegan A, Song JS. Maximum entropy methods for extracting the learned features of deep neural networks. PLoS Comput Biol. 2017;13: e1005836.

Nair S, Shrikumar A, Schreiber J, Kundaje A. fastISM: Performant in-silico saturation mutagenesis for convolutional neural networks. Bioinformatics. 2022;38: 2397–2403.
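
In silico saturation mutagenesis itself is simple to state (fastISM's contribution is speed, by reusing convolutional activations that a single-base edit cannot change). A naive reference version, again assuming a scalar-output model over (1, 4, L) one-hot inputs:

    import torch

    @torch.no_grad()
    def naive_ism(model, x):
        # Score every single-base substitution by its change in model output.
        # Returns a (4, L) matrix of prediction deltas.
        ref = model(x).item()
        L = x.shape[-1]
        scores = torch.zeros(4, L)
        for pos in range(L):
            for base in range(4):
                if x[0, base, pos] == 1:
                    continue  # reference base: delta is zero by definition
                mut = x.clone()
                mut[0, :, pos] = 0
                mut[0, base, pos] = 1
                scores[base, pos] = model(mut).item() - ref
        return scores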

Tseng AM, Shrikumar A, Kundaje A. Fourier-transform-based attribution priors improve the interpretability and stability of deep learning models for genomics. bioRxiv. 2020. [Preprint] DOI: 10.1101/2020.06.11.147272

Linder J, La Fleur A, Chen Z, Ljubetič A, Baker D, Kannan S, et al. Interpreting neural networks for biological sequences by learning stochastic masks. Nat Mach Intell. 2022;4: 41–54.

2nd order attributions

Greenside P, Shimko T, Fordyce P, Kundaje A. Discovering epistatic feature interactions from neural network models of regulatory DNA sequences. Bioinformatics. 2018;34:i629-i637.

Liu G, Zeng H, Gifford DK. Visualizing complex feature interactions and feature sharing in genomic deep neural networks. BMC Bioinformatics. 2019;20: 401.

Janizek, Joseph D., Pascal Sturmfels, and Su-In Lee. “Explaining Explanations: Axiomatic Feature Interactions for Deep Networks.” J. Mach. Learn. Res. 22.104 (2021): 1–54.
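
The papers above generalize single-site perturbations to pairs. One common working definition, sketched here: the interaction between two substitutions is whatever remains after subtracting their individual effects (zero if the effects are additive).

    import torch

    @torch.no_grad()
    def pairwise_epistasis(model, x, mut_a, mut_b):
        # mut_a, mut_b: (base, position) substitutions on a (1, 4, L) one-hot x.
        def apply(seq, muts):
            y = seq.clone()
            for base, pos in muts:
                y[0, :, pos] = 0
                y[0, base, pos] = 1
            return y
        f = lambda s: model(s).item()
        wt = f(x)
        a, b = f(apply(x, [mut_a])), f(apply(x, [mut_b]))
        ab = f(apply(x, [mut_a, mut_b]))
        return ab - a - b + wt  # deviation from additivity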

Global interpretability

Kim, Been, et al. “Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCAV).” International conference on machine learning. PMLR, 2018.

Koo PK, Majdandzic A, Ploenzke M, Anand P, Paul SB. Global importance analysis: An interpretability method to quantify importance of genomic features in deep neural networks. PLoS Comput Biol. 2021;17: e1008925. PMCID: PMC8118286
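
Global importance analysis (Koo et al., above) measures a feature's effect on average over a population of sequences rather than per example. A sketch under the simplest design choices (uniform-random backgrounds, one motif embedded at a fixed position):

    import torch

    @torch.no_grad()
    def global_importance(model, motif, n=1000, L=200, pos=100):
        # motif: (4, m) one-hot pattern. Compare mean predictions with and
        # without the motif embedded in random background sequences.
        bg = torch.eye(4)[torch.randint(0, 4, (n, L))].permute(0, 2, 1)
        with_motif = bg.clone()
        with_motif[:, :, pos:pos + motif.shape[1]] = motif
        return (model(with_motif) - model(bg)).mean().item()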

Hammelman J, Gifford DK. Discovering differential genome sequence activity with interpretable and efficient deep learning. PLoS Comput Biol. 2021;17: e1009282.

Shrikumar A, Tian K, Avsec Ž, Shcherbina A, Banerjee A, Sharmin M, et al. Technical Note on Transcription Factor Motif Discovery from Importance Scores (TF-MoDISco) version 0.5.6.5. arXiv. 2018. [Preprint] DOI: 10.48550/arXiv.1811.00416

Lab papers

Deep learning – generalization

Zhang C, Bengio S, Hardt M, Recht B, Vinyals O. Understanding deep learning requires rethinking generalization. arXiv. 2016. [Preprint] DOI: 10.48550/arXiv.1611.03530

Li Z, Zhou Z-H, Gretton A. Towards an understanding of benign overfitting in neural networks. arXiv. 2021. [Preprint] DOI: 10.48550/arXiv.2106.03212

Keskar NS, Mudigere D, Nocedal J, Smelyanskiy M, Tang PTP. On large-batch training for Deep Learning: Generalization gap and sharp minima. arXiv. 2016. [Preprint] DOI: 10.48550/arXiv.1609.04836

Li H, Xu Z, Taylor G, Studer C, Goldstein T. Visualizing the loss landscape of neural nets. arXiv. 2017. [Preprint] DOI: 10.48550/arXiv.1712.09913

Wu L, Zhu Z, E W. Towards understanding generalization of deep learning: Perspective of loss landscapes. arXiv. 2017. [Preprint] DOI: 10.48550/arXiv.1706.10239

Izmailov P, Podoprikhin D, Garipov T, Vetrov D, Wilson AG. Averaging weights leads to wider optima and better generalization. arXiv. 2018. [Preprint] DOI: 10.48550/arXiv.1803.05407

Wortsman M, Horton M, Guestrin C, Farhadi A, Rastegari M. Learning Neural Network Subspaces. arXiv. 2021. [Preprint] DOI: 10.48550/arXiv.2102.10472

Nakkiran, Preetum, et al. “Deep double descent: Where bigger models and more data hurt.” Journal of Statistical Mechanics: Theory and Experiment 2021.12 (2021): 124003.

Frankle, Jonathan, and Michael Carbin. “The lottery ticket hypothesis: Finding sparse, trainable neural networks.” arXiv preprint arXiv:1803.03635 (2018).

Kalimeris, Dimitris, et al. “SGD on neural networks learns functions of increasing complexity.” Advances in neural information processing systems 32 (2019).

Geirhos, R., Jacobsen, JH., Michaelis, C. et al. Shortcut learning in deep neural networks. Nat Mach Intell 2, 665–673 (2020).

Deep learning – robustness

Madry A, Makelov A, Schmidt L, Tsipras D, Vladu A. Towards Deep Learning Models Resistant to Adversarial Attacks. arXiv. 2017. [Preprint] DOI: 10.48550/arXiv.1706.06083
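
Madry et al.'s projected gradient descent attack is the workhorse in this subsection; adversarial training wraps this inner maximization inside the usual training step. A sketch for an L-infinity ball (input-range clamping, e.g. to [0, 1] for images, omitted):

    import torch

    def pgd_attack(model, loss_fn, x, y, eps=0.03, alpha=0.007, steps=10):
        # Iterated gradient ascent on the loss, projected back into the
        # L-infinity ball of radius eps around the clean input x.
        x_adv = x.clone().detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = loss_fn(model(x_adv), y)
            grad, = torch.autograd.grad(loss, x_adv)
            x_adv = x_adv.detach() + alpha * grad.sign()
            x_adv = x + torch.clamp(x_adv - x, -eps, eps)
        return x_adv.detach()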

Cohen JM, Rosenfeld E, Zico Kolter J. Certified Adversarial Robustness via Randomized Smoothing. arXiv. 2019. [Preprint] DOI: 10.48550/arXiv.1902.02918

Verma V, Lamb A, Beckham C, Najafi A, Mitliagkas I, Courville A, et al. Manifold Mixup: Better representations by interpolating hidden states. arXiv. 2018. [Preprint] DOI: 10.48550/arXiv.1806.05236

Etmann C, Lunz S, Maass P, Schönlieb C-B. On the connection between adversarial robustness and saliency map interpretability. arXiv. 2019. [Preprint] DOI: 10.48550/arXiv.1905.04172

Ross AS, Doshi-Velez F. Improving the adversarial robustness and interpretability of deep neural networks by regularizing their input gradients. arXiv. 2017. [Preprint] DOI: 10.48550/arXiv.1711.09404

Tsipras D, Santurkar S, Engstrom L, Turner A, Madry A. Robustness may be at odds with accuracy. arXiv. 2018. [Preprint] DOI: 10.48550/arXiv.1805.12152

Yoshida Y, Miyato T. Spectral norm regularization for improving the generalizability of deep learning. arXiv. 2017. [Preprint] DOI: 10.48550/arXiv.1705.10941

Ilyas, Andrew, et al. “Adversarial examples are not bugs, they are features.” Advances in neural information processing systems 32 (2019).

Deep learning – foundational

Glorot, Xavier, and Yoshua Bengio. “Understanding the difficulty of training deep feedforward neural networks.” Proceedings of the thirteenth international conference on artificial intelligence and statistics. JMLR Workshop and Conference Proceedings, 2010.

Szegedy, Christian, et al. “Intriguing properties of neural networks.” arXiv preprint arXiv:1312.6199 (2013).

Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R. Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research. 2014;15: 1929–1958.

Ioffe S, Szegedy C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. arXiv. 2015. [Preprint] DOI: 10.48550/arXiv.1502.03167

Yu F, Koltun V. Multi-Scale Context Aggregation by Dilated Convolutions. arXiv. 2015. [Preprint] DOI: 10.48550/arXiv.1511.07122

Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. “Distilling the knowledge in a neural network.” arXiv preprint arXiv:1503.02531 2.7 (2015).

He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. Proceedings of the IEEE conference on computer vision and pattern recognition. 2016: 770–778.

He, Kaiming, et al. “Spatial pyramid pooling in deep convolutional networks for visual recognition.” IEEE transactions on pattern analysis and machine intelligence 37.9 (2015): 1904-1916.

Yu F, Koltun V, Funkhouser T. Dilated residual networks. Proceedings of the IEEE conference on computer vision and pattern recognition. 2017: 472–480.
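
Residual connections and dilated convolutions (the entries above) are also the two ingredients behind the large receptive fields of genomics models such as Basenji. A minimal dilated residual block with illustrative sizes:

    import torch
    import torch.nn as nn

    class DilatedResBlock(nn.Module):
        # Doubling the dilation each block grows the receptive field
        # exponentially; the identity shortcut keeps gradients well-behaved.
        def __init__(self, channels, dilation):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv1d(channels, channels, kernel_size=3,
                          dilation=dilation, padding=dilation),
                nn.BatchNorm1d(channels),
                nn.ReLU(),
            )

        def forward(self, x):
            return x + self.body(x)

    tower = nn.Sequential(*[DilatedResBlock(64, 2 ** i) for i in range(6)])
    print(tower(torch.randn(1, 64, 1000)).shape)  # torch.Size([1, 64, 1000])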

Huang, Gao, et al. “Densely connected convolutional networks.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.

Chen, Ricky TQ, et al. “Neural ordinary differential equations.” Advances in neural information processing systems 31 (2018).

Tan, Mingxing, and Quoc Le. “EfficientNet: Rethinking model scaling for convolutional neural networks.” International conference on machine learning. PMLR, 2019.

Devlin, Jacob, et al. “BERT: Pre-training of deep bidirectional transformers for language understanding.” arXiv preprint arXiv:1810.04805 (2018).

Liu, Yinhan, et al. “RoBERTa: A robustly optimized BERT pretraining approach.” arXiv preprint arXiv:1907.11692 (2019).

Yang, Zhilin, et al. “XLNet: Generalized autoregressive pretraining for language understanding.” Advances in neural information processing systems 32 (2019).

Xie, Qizhe, et al. “Self-training with noisy student improves ImageNet classification.” Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020.

Chen, Ting, et al. “A simple framework for contrastive learning of visual representations.” International conference on machine learning. PMLR, 2020.

Dosovitskiy, Alexey, et al. “An image is worth 16x16 words: Transformers for image recognition at scale.” arXiv preprint arXiv:2010.11929 (2020).

Liu, Ze, et al. “Swin transformer: Hierarchical vision transformer using shifted windows.” Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021.

Caron, Mathilde, et al. “Emerging properties in self-supervised vision transformers.” Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021.

Liu, Zhuang, et al. “A ConvNet for the 2020s.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.

Rogers, Anna, Olga Kovaleva, and Anna Rumshisky. “A primer in BERTology: What we know about how BERT works.” Transactions of the Association for Computational Linguistics 8 (2020): 842-866.

Deep generative models

Vincent, Pascal, et al. “Extracting and composing robust features with denoising autoencoders.” Proceedings of the 25th international conference on Machine learning. 2008.

Welling, Max, and Yee W. Teh. “Bayesian learning via stochastic gradient Langevin dynamics.” Proceedings of the 28th international conference on machine learning (ICML-11). 2011.

Kingma, Diederik P., and Max Welling. “Auto-encoding variational bayes.” arXiv preprint arXiv:1312.6114 (2013).
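
The VAE objective from Kingma and Welling, in code: a reconstruction term plus a KL penalty, made differentiable through the reparameterization trick. A sketch with a Gaussian encoder and Bernoulli decoder (single linear layers and sizes are illustrative):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class VAE(nn.Module):
        def __init__(self, d_in=784, d_z=16):
            super().__init__()
            self.enc = nn.Linear(d_in, 2 * d_z)  # predicts mean and log-variance
            self.dec = nn.Linear(d_z, d_in)

        def forward(self, x):
            mu, logvar = self.enc(x).chunk(2, dim=-1)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
            return self.dec(z), mu, logvar

    def negative_elbo(x, logits, mu, logvar):
        # x is assumed binary (e.g. binarized MNIST) for the Bernoulli decoder.
        recon = F.binary_cross_entropy_with_logits(logits, x, reduction="sum")
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
        return recon + kl  # minimize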

Kingma, Durk P., et al. “Semi-supervised learning with deep generative models.” Advances in neural information processing systems 27 (2014).

Goodfellow, Ian, et al. “Generative adversarial nets.” Advances in neural information processing systems 27 (2014).

Rezende, Danilo, and Shakir Mohamed. “Variational inference with normalizing flows.” International conference on machine learning. PMLR, 2015.

Oord, Aaron van den, et al. “WaveNet: A generative model for raw audio.” arXiv preprint arXiv:1609.03499 (2016).

Kingma, Durk P., et al. “Improved variational inference with inverse autoregressive flow.” Advances in neural information processing systems 29 (2016).

Van Den Oord, Aaron, and Oriol Vinyals. “Neural discrete representation learning.” Advances in neural information processing systems 30 (2017).

Gulrajani, Ishaan, et al. “Improved training of Wasserstein GANs.” Advances in neural information processing systems 30 (2017).

Brock, Andrew, Jeff Donahue, and Karen Simonyan. “Large scale GAN training for high fidelity natural image synthesis.” arXiv preprint arXiv:1809.11096 (2018).

Ho, Jonathan, Ajay Jain, and Pieter Abbeel. “Denoising diffusion probabilistic models.” Advances in Neural Information Processing Systems 33 (2020): 6840-6851.

Nichol, Alexander Quinn, and Prafulla Dhariwal. “Improved denoising diffusion probabilistic models.” International Conference on Machine Learning. PMLR, 2021.

RC equivariance

Zhou H, Shrikumar A, Kundaje A. Towards a better understanding of reverse-complement equivariance for deep learning models in regulatory genomics. bioRxiv. 2020. [Preprint] DOI: 10.1101/2020.11.04.368803

Brown RC, Lunter G. An equivariant Bayesian convolutional network predicts recombination hotspots and accurately resolves binding motifs. Bioinformatics. 2019;35: 2177–2184.
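
The simplest of the options discussed above is post-hoc strand averaging, which makes predictions exactly invariant to reverse complementation. A sketch for the A,C,G,T channel ordering used earlier:

    import torch

    def reverse_complement(x):
        # With channels ordered A,C,G,T, flipping the channel axis
        # complements each base; flipping the last axis reverses the sequence.
        return x.flip(dims=[1, 2])

    def rc_averaged_predict(model, x):
        # Identical output for a sequence and its reverse complement.
        return 0.5 * (model(x) + model(reverse_complement(x)))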

Motif analysis

Ge W, Meier M, Roth C, Söding J. Bayesian Markov models improve the prediction of binding motifs beyond first order. NAR Genom Bioinform. 2021;3: lqab026.

Keilwagen J, Grau J. Varying levels of complexity in transcription factor binding motifs. Nucleic Acids Res. 2015;43: e119.

Interesting papers

Prakash E, Shrikumar A, Kundaje A. Towards more realistic simulated datasets for benchmarking deep learning models in regulatory genomics. bioRxiv. 2021. [Preprint] DOI: 10.1101/2021.12.26.474224