Deep Learning Roadmap

투푸월드 2023. 7. 4. 17:30

My own deep learning mastery roadmap, inspired by Deep Learning Papers Reading Roadmap.

It differs from the original in a few ways:

- Not only academic papers but also blog posts, online courses, and other references are included.
- Customized for my own plans, so it may not include RL, NLP, etc.
- Updated for 2019 SOTA.

Introductory Courses

Basic CNN Architectures

AlexNet (2012) [paper]
- Alex Krizhevsky et al. "ImageNet Classification with Deep Convolutional Neural Networks"
ZFNet (2013) [paper]
- Zeiler et al. "Visualizing and Understanding Convolutional Networks"
VGG (2014)
- Simonyan et al. "Very Deep Convolutional Networks for Large-Scale Image Recognition" (2014) [Google DeepMind & Oxford's Visual Geometry Group (VGG)] [paper]
- VGG-16: Zhang et al. "Accelerating Very Deep Convolutional Networks for Classification and Detection" [paper]
GoogLeNet, a.k.a Inception v.1 (2014) [paper]
- Szegedy et al. "Going Deeper with Convolutions" [Google]
- Original LeNet page from Yann LeCun's homepage.
- Inception v.2 and v.3 (2015) Szegedy et al. "Rethinking the Inception Architecture for Computer Vision" [paper]
- Inception v.4 and InceptionResNet (2016) Szegedy et al. "Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning" [paper]
- "A Simple Guide to the Versions of the Inception Network" [blogpost]
ResNet (2015) [paper]
- He et al. "Deep Residual Learning for Image Recognition"
Xception (2016) [paper]
- Chollet, Francois - "Xception: Deep Learning with Depthwise Separable Convolutions"
MobileNet (2017) [paper]
- Howard et al. "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications"
- A nice paper about reducing CNN parameter count and computation with depthwise separable convolutions while keeping accuracy close to that of larger models.
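As a rough illustration of where the parameter savings come from, the sketch below counts weights for a standard convolution and for a depthwise separable one (the building block used in Xception and MobileNets). The layer sizes are hypothetical and biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 128 input channels, 256 output channels.
std = standard_conv_params(3, 128, 256)
sep = depthwise_separable_params(3, 128, 256)
print(std, sep, sep / std)
```

The ratio works out to 1/c_out + 1/k^2, so for a 3 x 3 kernel the separable layer needs a bit more than one ninth of the weights.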
DenseNet (2016) [paper]
- Huang et al. "Densely Connected Convolutional Networks"

Generative adversarial networks

GAN (2014.6) [paper]
- Goodfellow et al. "Generative Adversarial Networks"
DCGAN (2015.11) [paper]
- Radford et al. "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks"
InfoGAN (2016.6) [paper]
- Chen et al. "InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets"
Improved Techniques for Training GANs (2016.6) [paper]
- Salimans et al. "Improved Techniques for Training GANs"
- This paper suggests multiple GAN training techniques such as feature matching, minibatch discrimination, one-sided label smoothing, and virtual batch normalization.
- It also proposes a well-known generator performance metric, the Inception Score.
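The Inception Score is exp(E_x[KL(p(y|x) || p(y))]): it is high when each image gets a confident class prediction and the predictions are spread over many classes. A minimal sketch in plain Python, assuming the per-image class probabilities have already been computed (the paper takes them from a pretrained Inception network; the `probs` input here is made up for illustration):

```python
import math


def inception_score(probs):
    """exp of the mean KL divergence between p(y|x) and the marginal p(y).

    `probs` is a list of per-image class-probability rows, each summing to 1.
    """
    n = len(probs)
    num_classes = len(probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(row[c] for row in probs) / n for c in range(num_classes)]
    total_kl = 0.0
    for row in probs:
        # KL(p(y|x) || p(y)); terms with p(y|x) = 0 contribute nothing.
        total_kl += sum(p * math.log(p / m) for p, m in zip(row, marginal) if p > 0)
    return math.exp(total_kl / n)


# Hypothetical predictions: confident and diverse vs. completely uninformative.
print(inception_score([[1.0, 0.0], [0.0, 1.0]]))
print(inception_score([[0.5, 0.5], [0.5, 0.5]]))
```

The score ranges from 1 (uninformative predictions) up to the number of classes.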
f-GAN (2016.6) [paper]
- Nowozin et al. "f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization"
Unrolled GAN (2016.7) [paper]
- Metz et al. "Unrolled Generative Adversarial Networks"
ACGAN (2016.10) [paper]
- Odena et al. "Conditional Image Synthesis With Auxiliary Classifier GANs"
LSGAN (2016.11) [paper]
- Mao et al. "Least Squares Generative Adversarial Networks"
Pix2Pix (2016.11) [paper]
- Isola et al. "Image-to-Image Translation with Conditional Adversarial Networks"
EBGAN (2016.11) [paper]
- Zhao et al. "Energy-based Generative Adversarial Network"
WGAN (2017.4) [paper]
- Arjovsky et al., "Wasserstein GAN"
WGAN-GP (2017.5) [paper]
- Gulrajani et al., "Improved Training of Wasserstein GANs"
- Improves training stability by adding a gradient penalty (GP) on the critic to the loss function, replacing WGAN's weight clipping.
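The gradient penalty pushes the norm of the critic's gradient toward 1 at points interpolated between real and generated samples. The sketch below is only a toy: it uses a central-difference gradient as a stand-in for autograd, the `critic` is any scalar function of a list of floats, and `lam=10.0` is an illustrative weight. In practice the penalty is added to the critic loss E[D(fake)] - E[D(real)].

```python
import math
import random


def numerical_grad(f, x, h=1e-5):
    """Central-difference gradient of scalar f at point x (stand-in for autograd)."""
    grad = []
    for i in range(len(x)):
        up = x[:i] + [x[i] + h] + x[i + 1:]
        down = x[:i] + [x[i] - h] + x[i + 1:]
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def gradient_penalty(critic, real, fake, lam=10.0, rng=random):
    """lam * mean((||grad critic(x_hat)||_2 - 1)^2) over interpolated samples."""
    total = 0.0
    for x_r, x_f in zip(real, fake):
        eps = rng.random()  # uniform mixing coefficient in [0, 1)
        x_hat = [eps * r + (1 - eps) * f for r, f in zip(x_r, x_f)]
        norm = math.sqrt(sum(g * g for g in numerical_grad(critic, x_hat)))
        total += (norm - 1.0) ** 2
    return lam * total / len(real)


# Hypothetical 2-D samples and a linear critic whose gradient norm is 5.
real = [[1.0, 2.0], [0.0, 1.0]]
fake = [[0.5, 0.5], [2.0, -1.0]]
print(gradient_penalty(lambda x: 3 * x[0] + 4 * x[1], real, fake))
```

A critic with unit gradient norm everywhere, such as `lambda x: 0.6 * x[0] + 0.8 * x[1]`, gets a penalty of roughly zero.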
BEGAN (2017.5) [paper]
- Berthelot et al. "BEGAN: Boundary Equilibrium Generative Adversarial Networks"
- Introduces a diversity ratio, an equilibrium hyperparameter that controls the tradeoff between image variety and quality, and proposes a convergence measure based on it.
CycleGAN (2017.5) [paper]
- DiscoGAN (2017.5) [paper]
- DiscoGAN and CycleGAN propose essentially the same learning technique for GAN-based style transfer (unpaired image-to-image translation); they were developed independently at about the same time.
Frechet Inception Distance (FID) (2017.6) [paper]
- Heusel et al. "GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium"
- The paper's main contribution is a technique called the Two Time-Scale Update Rule (TTUR), but it is mostly known for the Frechet Inception Distance, a metric that measures the distance between the distributions of Inception activations for real and generated images.
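FID fits a Gaussian to each set of activations and computes the Frechet distance ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 (C_a C_b)^(1/2)). The sketch below is a deliberate simplification: it assumes diagonal covariances so that no matrix square root is needed, and the feature vectors are hypothetical stand-ins for Inception activations.

```python
import math


def mean_and_var(features):
    """Per-dimension mean and (population) variance of a list of feature vectors."""
    n = len(features)
    dims = range(len(features[0]))
    mean = [sum(f[d] for f in features) / n for d in dims]
    var = [sum((f[d] - mean[d]) ** 2 for f in features) / n for d in dims]
    return mean, var


def frechet_distance_diag(feats_a, feats_b):
    """Frechet distance between two Gaussians with DIAGONAL covariance.

    Simplification: the real FID uses full covariance matrices and a matrix
    square root; with diagonal covariances the trace term reduces to
    sum((sqrt(var_a) - sqrt(var_b))^2).
    """
    mu_a, var_a = mean_and_var(feats_a)
    mu_b, var_b = mean_and_var(feats_b)
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))
    cov_term = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(var_a, var_b))
    return mean_term + cov_term


# Hypothetical 2-D "activations": same spread, means shifted by 3 in one dimension.
feats_real = [[0.0, 0.0], [2.0, 2.0]]
feats_fake = [[3.0, 0.0], [5.0, 2.0]]
print(frechet_distance_diag(feats_real, feats_fake))
```

Lower is better, and two identical sets of activations give a distance of zero.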
ProGAN (2017.10) [paper]
- Karras et al. "Progressive Growing of GANs for Improved Quality, Stability, and Variation"
PacGAN (2017.12) [paper]
- Lin et al. "PacGAN: The power of two samples in generative adversarial networks"
BigGAN (2018) [paper]
GauGAN (2019.3) [paper]
- Park et al. "Semantic Image Synthesis with Spatially-Adaptive Normalization"

Advanced GANs

DRAGAN (2017.5) [paper]
- Kodali et al. "On Convergence and Stability of GANs"
Are GANs Created Equal? (2017.11) [paper]
- Lucic et al. "Are GANs Created Equal? A Large-Scale Study"
SGAN (2017.12) [paper]
- Chavdarova et al. "SGAN: An Alternative Training of Generative Adversarial Networks"
MaskGAN (2018.1) [paper]
- Fedus et al. "MaskGAN: Better Text Generation via Filling in the _____"
Spectral Normalization (2018.2) [paper]
- Miyato et al. "Spectral Normalization for Generative Adversarial Networks"
SAGAN (2018.5) [paper] [tensorflow]
- Zhang et al. "Self-Attention Generative Adversarial Networks"
The Unusual Effectiveness of Averaging in GAN Training (2018) [paper]
- "Benefitting from training on past snapshots."
- Uses an exponential moving average (EMA) of the generator parameters.
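A minimal sketch of the EMA idea, assuming the weights are a flat list of floats and that `decay=0.999` is just an illustrative value: a second copy of the weights is kept and nudged toward the current weights after every training step, and that averaged copy is the one used for sampling.

```python
def ema_update(avg_weights, new_weights, decay=0.999):
    """Blend the running average toward the latest weights, in place."""
    for i, w in enumerate(new_weights):
        avg_weights[i] = decay * avg_weights[i] + (1.0 - decay) * w
    return avg_weights


# Hypothetical training loop: the "weights" oscillate, the average smooths them.
weights = [0.0, 0.0]
averaged = list(weights)
for step in range(1000):
    weights = [1.0 + (-1) ** step * 0.5, -2.0]   # stand-in for an optimizer step
    ema_update(averaged, weights, decay=0.99)
print(averaged)
```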
Disconnected Manifold Learning (2018.6) [paper]
- Khayatkhoei et al. "Disconnected Manifold Learning for Generative Adversarial Networks"
A Note on the Inception Score (2018.6) [paper]
- Barratt et al., "A Note on the Inception Score"
Which Training Methods for GANs do actually Converge? (2018.7) [paper]
- Mescheder et al., "Which Training Methods for GANs do actually Converge?"
GAN Dissection (2018.11) [paper]
- Bau et al. "GAN Dissection: Visualizing and Understanding Generative Adversarial Networks"
Improving Generalization and Stability for GANs (2019.2) [paper]
- Thanh-Tung et al., "Improving Generalization and Stability of Generative Adversarial Networks"
Augustus Odena - "Open Questions about GANs" (2019.4) [distill.pub]
- Very nice article about the current state of GAN research that discusses problems yet to be solved.

Autoencoders

Original autoencoder (1986) [paper]
- Rumelhart, Hinton, and Williams, "Learning Internal Representations by Error Propagation"
AutoEncoder (2006) [science]
- Hinton and Salakhutdinov, "Reducing the Dimensionality of Data with Neural Networks"
Denoising Autoencoders (2008) [paper]
- Vincent et al. "Extracting and Composing Robust Features with Denoising Autoencoders"
Wasserstein Autoencoder (2017) [paper]
- Tolstikhin et al. "Wasserstein Auto-Encoders"

Autoregressive models

PixelCNN (2016) [paper]
- van den Oord et al. "Conditional image generation with PixelCNN decoders."
WaveNet (2016) [paper]
- van den Oord et al. "WaveNet: A Generative Model for Raw Audio"
tacotron?

Layer Normalizations

Batch Normalization (2015.2) [paper]
- Ioffe et al. "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift"
Group Norm
Instance Normalization (2016.7) [paper]
- Ulyanov et al. "Instance Normalization: The Missing Ingredient for Fast Stylization"
Santurkar et al. "How does Batch Normalization help Optimization?" (2018.5) [paper]
Switchable Normalization (2019) [paper]
- Luo et al. "Differentiable Learning-to-Normalize via Switchable Normalization"
Weight Standardization (2019.3) [paper]
- Qiao et al. "Weight Standardization"

Initializations

Xavier Initialization (2010) [paper]
- Glorot et al., "Understanding the difficulty of training deep feedforward neural networks"
Kaiming (He) Initialization (2015.2) [paper]
- He et al., "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification"
All you need is a good init (2015.11) [paper]
- Mishkin et al., "All you need is a good init"
All you need is beyond a good init (2017.4) [paper]
- Xie et al. "All You Need is Beyond a Good Init: Exploring Better Solution for Training Extremely Deep Convolutional Neural Networks with Orthonormality and Modulation"

Dropouts

Dropout (2014) [paper]
- Srivastava et al. "Dropout: A Simple Way to Prevent Neural Networks from Overfitting"
Inverted Dropouts [notes on CS231n]
- Scales the kept activations by 1/keep_prob during training so that the expected activations stay consistent at inference (test) time, where no extra scaling is then needed.
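A minimal sketch of inverted dropout on a plain list of activations. The function name and the `rng` parameter are made up for illustration; real frameworks apply the same idea to tensors.

```python
import random


def inverted_dropout(activations, keep_prob, training, rng=random):
    """Apply inverted dropout to a list of activations."""
    if not training:
        return list(activations)  # inference: no mask, no rescaling
    out = []
    for a in activations:
        keep = rng.random() < keep_prob
        # Surviving units are scaled by 1 / keep_prob so the expected value
        # matches the unmodified activation used at inference.
        out.append(a / keep_prob if keep else 0.0)
    return out


rng = random.Random(0)
train_out = inverted_dropout([1.0] * 10000, keep_prob=0.8, training=True, rng=rng)
print(sum(train_out) / len(train_out))   # close to 1.0 on average
print(inverted_dropout([1.0, 2.0], keep_prob=0.8, training=False))
```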
Li et al., "Understanding the Disharmony between Dropout and Batch Normalization by Variance Shift" (2018.1) [paper]

Meta-Learning / Representation Learning (Zero-Shot learning, Few-Shot learning)

Zero-Data Learning (2008) [paper]
- Larochelle et al., "Zero-data Learning of New Tasks"
Palatucci et al., "Zero-shot Learning with Semantic Output Codes" (NIPS 2009) [paper]
Socher et al., "Zero-Shot Learning Through Cross-Modal Transfer" (2013.1) [paper]
Lampert et al., "Attribute-Based Classification for Zero-Shot Visual Object Categorization" (2013.7) [paper]
Dinu et al., "Improving zero-shot learning by mitigating the hubness problem" (2014.12) [paper]
Romera-Paredes et al. - "An embarrassingly simple approach to zero-shot learning" (2015) [paper]
Prototypical Networks (2017.3) [paper]
- Snell et al., "Prototypical Networks for Few-shot Learning"
Zero-Shot Learning - The Good, the Bad and the Ugly (2017.3) [paper]
- Xian et al., "Zero-Shot Learning - The Good, the Bad and the Ugly"
In Defense of the Triplet Loss (2017.3) [paper]
- Hermans et al., "In Defense of the Triplet Loss for Person Re-Identification"
MAML (2017.3) [paper]
- Finn et al., "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks"
Triplet Loss and Online Triplet Mining in TensorFlow (2018.3) [Olivier Moindrot Blog]
Few-Shot learning Survey (2019.4) [paper]
- Wang et al. "Few-shot Learning: A Survey"

Transfer learning

Survey 2018 (2018) [paper]
- Tan et al. "A Survey on Deep Transfer Learning"

Geometric learning

Geometric Deep Learning (2016) [paper]
- Bronstein et al. "Geometric deep learning: going beyond Euclidean data"

Variational Autoencoders (VAE)

VQ-VAE (2017.11) [paper]
- van den Oord et al., "Neural Discrete Representation Learning"
Semi-Amortized Variational Autoencoders (2018.2) [paper]
- Kim et al. "Semi-Amortized Variational Autoencoders"

Object detection

RCNN: https://arxiv.org/abs/1311.2524
Fast-RCNN: https://arxiv.org/abs/1504.08083
Faster-RCNN: https://arxiv.org/abs/1506.01497
SSD: https://arxiv.org/abs/1512.02325
YOLO: https://arxiv.org/abs/1506.02640
YOLO9000: https://arxiv.org/abs/1612.08242

Semantic Segmentation

FCN: https://arxiv.org/abs/1411.4038
SegNet: https://arxiv.org/abs/1511.00561
UNet: https://arxiv.org/abs/1505.04597
PSPNet: https://arxiv.org/abs/1612.01105
DeepLab: https://arxiv.org/abs/1606.00915
ICNet: https://arxiv.org/abs/1704.08545
ENet: https://arxiv.org/abs/1606.02147
Nice survey

Sequential Model

Seq2Seq (2014) [paper]
- Sutskever et al. "Sequence to sequence learning with neural networks."

Neural Turing Machine

Neural Turing Machines (2014) [paper]
- Graves et al., "Neural turing machines."
Pointer Networks (2015) [paper]
- Vinyals et al., "Pointer networks."

Attention / Question-Answering

NMT (Neural Machine Translation) (2014) [paper]
- Bahdanau et al., "Neural Machine Translation by Jointly Learning to Align and Translate"
Stanford Attentive Reader (2016.6) [paper]
- Chen et al. "A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task"
BiDAF (2016.11) [paper]
- Seo et al. "Bidirectional Attention Flow for Machine Comprehension"
DrQA or Stanford Attentive Reader++ (2017.3) [paper]
- Chen et al. "Reading Wikipedia to Answer Open-Domain Questions"
Transformer (2017.8) [paper] [google ai blog]
- Vaswani et al. "Attention is all you need"
[read] Lilian Weng - "Attention? Attention!" (2018) [blog_post]
- A nice explanation of the attention mechanism and its concepts.
BERT (2018.10) [paper]
- Devlin et al., "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"
GPT-2 (2019) [paper (pdf)]
- Radford et al. "Language Models are Unsupervised Multitask Learners"

Advanced RNNs

Unitary evolution RNNs : https://arxiv.org/abs/1511.06464
Recurrent Batch Norm : https://arxiv.org/abs/1603.09025
Zoneout : https://arxiv.org/abs/1606.01305
IndRNN : https://arxiv.org/abs/1803.04831
DilatedRNNs : https://arxiv.org/abs/1710.02224

Model Compression

MobileNet (2017) (see above: Basic CNN Architectures)
ShuffleNet (2017)
- Zhang et al. "ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices"

Neural Processes

Neural Processes (2018) [paper]
- Garnelo et al. "Neural Processes"
Attentive Neural Processes (2019) [paper]
- Kim et al. "Attentive Neural Processes"
A Visual Exploration of Gaussian Processes (2019) [Distill.pub]
- Not a neural process, but gives very nice intuition about Gaussian Processes. Good Read.

Self-supervised learning

Denoising AE https://www.iro.umontreal.ca/~vincentp/Publications/denoising_autoencoders_tr1316.pdf
Exemplar Nets https://arxiv.org/abs/1406.6909
Co-occ https://arxiv.org/abs/1511.06811
Egomotion https://arxiv.org/abs/1505.01596
Jigsaw https://arxiv.org/abs/1603.09246
Context Encoders https://arxiv.org/abs/1604.07379
Split-brain autoencoders https://arxiv.org/abs/1611.09842
multi-task self-supervised learning https://arxiv.org/abs/1708.07860
Audio-visual scene analysis https://arxiv.org/abs/1804.03641
a survey https://slideplayer.com/slide/13195863/
Supervising unsupervised learning https://arxiv.org/abs/1709.05262
Unsupervised Representation Learning by Predicting Image Rotations https://arxiv.org/abs/1803.07728
Mahjourian et al., "Unsupervised Learning of Depth and Ego-Motion from Monocular Video Using 3D Geometric Constraints" (2018.2) [paper]
Gordon et al., "Depth from Videos in the Wild: Unsupervised Monocular Depth Learning from Unknown Cameras" (2019.4) [paper]

Data Augmentation

Shake Shake Regularization (2017.5) [paper]
- Gastaldi, Xavier - "Shake-Shake Regularization"

Interpretation and Theory on Generalization, Overfitting, and Learning Capacity

MDL (Minimum Description Length)
- Peter Grunwald - "A tutorial introduction to the minimum description length principle" (2004) [paper]
Grunwald et al., "Shannon Information and Kolmogorov Complexity" (2010) [paper]
Dauphin et al. "Identifying and attacking the saddle point problem in high-dimensional non-convex optimization" (2014.6) [paper]
Choromanska et al. "The Loss Surfaces of Multilayer Networks" (2014.11) [paper]
- Argues that non-convexity in neural networks is not a major problem.
Knowledge Distillation (2015.3) [paper]
- Hinton et al., "Distilling the Knowledge in a Neural Network"
3-Part Learning Theory by Mostafa Samir
- part 1: Introduction
- part 2: Generalization Bounds
- part 3: Regularization and Variance-Bias Tradeoff
Deconvolution and Checkerboard Artifacts - Odena (2016) [distill.pub article]
Keskar et al. "On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima" (2016.9) [paper]
Rethinking Generalization (2016.11) [paper]
- Zhang et al. "Understanding deep learning requires rethinking generalization"
Information Bottleneck (2017) [paper] [original paper on information bottleneck (2000)] [youtube-talk] [article in quantamagazine]
- Shwartz-Ziv and Tishby, "Opening the Black Box of Deep Neural Networks via Information"
Neyshabur et al, "Exploring Generalization in Deep Learning" (2017.7) [paper]
Sun et al., "Revisiting Unreasonable Effectiveness of Data in Deep Learning Era" (2017.7) [paper]
Super-Convergence (2017.8) [paper]
- Smith et al. - "Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates"
Don't Decay the Learning Rate, Increase the Batch Size (2017.11) [paper]
- Smith et al. "Don't Decay the Learning Rate, Increase the Batch Size"
Hestness et al. "Deep Learning Scaling is Predictable, Empirically" (2017.12) [paper]
Visualizing loss landscape of neural nets (2018) [paper]
Olson et al., "Modern Neural Networks Generalize on Small Data Sets" (NeurIPS 2018) [paper]
Lottery Ticket Hypothesis (2018.3) [paper]
- Frankle et al., "The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks"
- Empirically showed that pruning the smallest-magnitude weights after training, rewinding the remaining weights to their initial values, and then re-training the pruned network can match or even exceed the original accuracy.
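A minimal sketch of one prune-and-rewind round, assuming the weights are flat lists of floats and skipping the actual training. The helper names and the numbers are hypothetical; the paper repeats this round several times with re-training in between.

```python
def magnitude_mask(trained, prune_fraction):
    """Return a 0/1 mask that drops the smallest-magnitude trained weights."""
    n_prune = int(len(trained) * prune_fraction)
    # Indices sorted by |weight|, smallest first.
    order = sorted(range(len(trained)), key=lambda i: abs(trained[i]))
    mask = [1] * len(trained)
    for i in order[:n_prune]:
        mask[i] = 0
    return mask


def rewind(initial, mask):
    """Reset surviving weights to their initial values; pruned ones stay zero."""
    return [w0 if m else 0.0 for w0, m in zip(initial, mask)]


# Hypothetical flat weight vectors before and after training.
initial = [0.10, -0.20, 0.05, 0.30]
trained = [0.90, -0.01, 0.02, -0.70]
mask = magnitude_mask(trained, prune_fraction=0.5)
ticket = rewind(initial, mask)
print(mask, ticket)
```

During re-training the mask would be re-applied after every update so that pruned weights stay at zero.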
Intrinsic Dimension (2018.4) [paper]
- Li et al., "Measuring the Intrinsic Dimension of Objective Landscapes"
Geirhos et al. "ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness" (2018.11) [paper]
Belkin et al. "Reconciling modern machine learning and the bias-variance trade-off" (2018.12) [paper]
Graetz - "How to visualize convolution features in 40 lines of code" (2019) [medium]
Geiger et al. "Scaling description of generalization with number of parameters in deep learning" (2019.1) [paper]
Are all layers created equal? (2019.2) [paper]
- Zhang et al. "Are all layers created equal?"
Lilian Weng - "Are Deep Neural Networks Dramatically Overfitted?" (2019.4) [lil'log]
- Excellent article about generalization and overfitting of deep neural networks

Adversarial Attacks and Defense against attacks (RobustML)

RobustML site
Adversarial Examples (2013.12) [paper]
- Szegedy et al., "Intriguing Properties of Neural Networks"
- Induces misclassification by applying small perturbations to the input.
- This paper was the first to coin the term "adversarial example".
Fast Gradient Sign Method (FGSM) (2014.12)
- Goodfellow et al., "Explaining and Harnessing Adversarial Examples" (ICLR 2015) [paper]
- This paper presented the famous "panda example" (also seen in the PyTorch tutorial).
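FGSM perturbs the input in the direction of the sign of the loss gradient: x_adv = x + epsilon * sign(grad_x J(theta, x, y)). The sketch below uses a tiny logistic-regression classifier so the gradient can be written by hand; the weights, input, and epsilon are hypothetical, and a real attack would get the gradient from autograd.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sign(v):
    return (v > 0) - (v < 0)


def predict(x, w, b):
    """Probability of class 1 under logistic regression."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fgsm(x, y, w, b, epsilon):
    """One FGSM step against a logistic-regression classifier.

    For cross-entropy loss, d(loss)/dx = (p - y) * w, where p is the
    predicted probability of class 1.
    """
    p = predict(x, w, b)
    grad_x = [(p - y) * wi for wi in w]
    # Move every input coordinate by epsilon in the loss-increasing direction.
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad_x)]


w, b = [2.0, -1.0], 0.0          # hypothetical trained weights
x, y = [1.0, 0.5], 1             # input with true label 1
x_adv = fgsm(x, y, w, b, epsilon=0.5)
print(predict(x, w, b), predict(x_adv, w, b))
```

The second probability is lower than the first, so the perturbed input is classified less confidently as the true class.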
Kurakin et al., "Adversarial Machine Learning at Scale" (2016.11) [paper]
Madry et al., "Towards Deep Learning Models Resistant to Adversarial Attacks" (2017.6) [paper]
Carlini et al., "Audio Adversarial Examples: Targeted Attacks on Speech-to-Text" (2018.1) [paper]

Neural architecture search (NAS) and AutoML

GREAT AutoML Website [site]
- They maintain a blog, a list of NAS literature, an analysis page, and a web book.
AdaNet (2016.7) [paper] [GoogleAI blog]
- Cortes et al. "AdaNet: Adaptive Structural Learning of Artificial Neural Networks"
NAS (2016.12) [paper]
- Zoph et al. "Neural Architecture Search with Reinforcement Learning"
PNAS (2017.12) [paper]
- Liu et al. "Progressive Neural Architecture Search"
ENAS (2018.2) [paper]
- Pham et al. "Efficient Neural Architecture Search via Parameter Sharing"
DARTS (2018.6) [paper]
- Liu et al. "DARTS: Differentiable Architecture Search"
- Uses a continuous relaxation over the discrete neural architecture space.
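The continuous relaxation replaces the discrete choice of one operation per edge with a softmax-weighted mixture of all candidates, so the architecture parameters can be trained by gradient descent. A toy sketch, with scalar functions as hypothetical stand-ins for candidate operations such as convolutions or skip connections:

```python
import math


def softmax(alphas):
    m = max(alphas)
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_op(x, ops, alphas):
    """Softmax-weighted sum of every candidate operation applied to x."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, ops))


# Hypothetical scalar stand-ins for the candidate operations on one edge.
ops = [lambda x: x, lambda x: 0.0, lambda x: 2.0 * x]   # identity, zero, scale
alphas = [0.0, 0.0, 0.0]                                 # architecture parameters
print(mixed_op(3.0, ops, alphas))

# After the search, the edge keeps the operation with the largest alpha.
learned_alphas = [0.1, -1.0, 2.3]                        # made-up values
best = max(range(len(ops)), key=lambda i: learned_alphas[i])
print(best)
```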
RandWire (2019) [paper]
- Xie et al. "Exploring Randomly Wired Neural Networks for Image Recognition" [Facebook AI Research]
A Survey on Neural Architecture Search (2019) [paper]
- Wistuba et al., "A Survey on Neural Architecture Search"

Practical Techniques

Andrej Karpathy - "A recipe for training neural networks" (2019) [Andrej Karpathy Blog Post]

DL roadmap reference

https://github.com/songrotek/Deep-Learning-Papers-Reading-Roadmap
https://github.com/terryum/awesome-deep-learning-papers
which DL algorithms should I implement to learn? https://www.reddit.com/r/MachineLearning/comments/8vmuet/d_what_deep_learning_papers_should_i_implement_to/

Theory

Resources

A Selective Overview of Deep Learning (2019) [paper]
- Fan et al. "A Selective Overview of Deep Learning"
- A nice overview paper on deep learning up to early 2019 (about 30 pages)
