This page provides slides, videos, and a virtual machine for the course EE-559 “Deep Learning”, taught by François Fleuret at the School of Engineering of the École Polytechnique Fédérale de Lausanne, Switzerland.

EPFL students can access the course's description and its Moodle page, which provides instructions for using the VM on the school's machines.

This course is a thorough introduction to deep learning, with examples in the PyTorch framework:

- machine learning objectives and main challenges,
- tensor operations,
- automatic differentiation, gradient descent,
- deep-learning specific techniques (batchnorm, dropout, residual networks),
- image understanding,
- generative models, adversarial generative models,
- recurrent models, attention models, NLP.

You can check the pre-requisites.

The materials from 2018 include 16h of voice-overs. However, the course's structure was slightly different at the time, and it used a now-obsolete version of PyTorch.

Thanks to Adam Paszke, Jean-Baptiste Cordonnier, Alexandre Nanchen, Xavier Glorot, Andreas Steiner, Matus Telgarsky, Diederik Kingma, Nikolaos Pappas, Soumith Chintala, and Shaojie Bai for their answers or comments.

The slide pdfs are the ones I use for the lectures. They are in landscape format, with overlays and font coloring to facilitate the presentation. The handout pdfs are compiled without these fancy effects and with two slides per page in portrait for off-line reading and note-taking.

You can get archives with all the files:

- ee559-slides-all.zip (108.0Mb)
- ee559-handout-all.zip (99.0Mb)

or the individual lectures:

- 1 – Introduction. (86 slides)
- 1.1 – From neural networks to deep learning. (slides, handout – 18 slides)
- 1.2 – Current applications and success. (slides, handout – 25 slides)
- 1.3 – What is really happening? (slides, handout – 10 slides)
- 1.4 – Tensor basics and linear regression. (slides, handout – 12 slides)
- 1.5 – High dimension tensors. (slides, handout – 16 slides)
- 1.6 – Tensor internals. (slides, handout – 5 slides)

- 2 – Machine learning fundamentals. (72 slides)
- 2.1 – Loss and risk. (slides, handout – 12 slides)
- 2.2 – Over and under fitting. (slides, handout – 25 slides)
- 2.3 – Bias-variance dilemma. (slides, handout – 10 slides)
- 2.4 – Proper evaluation protocols. (slides, handout – 6 slides)
- 2.5 – Basic clustering and embeddings. (slides, handout – 19 slides)

- 3 – Multi-layer perceptron and back-propagation. (68 slides)
- 3.1 – The perceptron. (slides, handout – 16 slides)
- 3.2 – Probabilistic view of a linear classifier. (slides, handout – 8 slides)
- 3.3 – Linear separability and feature design. (slides, handout – 10 slides)
- 3.4 – Multi-Layer Perceptrons. (slides, handout – 10 slides)
- 3.5 – Gradient descent. (slides, handout – 13 slides)
- 3.6 – Back-propagation. (slides, handout – 11 slides)

- 4 – Graphs of operators, autograd, and convolutional layers. (86 slides)
- 4.1 – DAG networks. (slides, handout – 11 slides / video – 20min33s)
- 4.2 – Autograd. (slides, handout – 20 slides / video – 22min12s)
- 4.3 – PyTorch modules and batch processing. (slides, handout – 15 slides / video – 14min58s)
- 4.4 – Convolutions. (slides, handout – 23 slides / video – 22min47s)
- 4.5 – Pooling. (slides, handout – 7 slides / video – 5min13s)
- 4.6 – Writing a PyTorch module. (slides, handout – 10 slides / video – 9min45s)

- 5 – Initialization and optimization. (82 slides)
- 5.1 – Cross-entropy loss. (slides, handout – 9 slides / video – 16min30s)
- 5.2 – Stochastic gradient descent. (slides, handout – 17 slides / video – 25min48s)
- 5.3 – PyTorch optimizers. (slides, handout – 8 slides / video – 6min1s)
- 5.4 – L₂ and L₁ penalties. (slides, handout – 11 slides / video – 13min14s)
- 5.5 – Parameter initialization. (slides, handout – 21 slides / video – 19min13s)
- 5.6 – Architecture choice and training protocol. (slides, handout – 9 slides / video – 13min11s)
- 5.7 – Writing an autograd function. (slides, handout – 7 slides / video – 8min16s)

- 6 – Going deeper. (85 slides)
- 6.1 – Benefits of depth. (slides, handout – 12 slides / video – 23min44s)
- 6.2 – Rectifiers. (slides, handout – 7 slides / video – 3min46s)
- 6.3 – Dropout. (slides, handout – 11 slides / video – 12min58s)
- 6.4 – Batch normalization. (slides, handout – 16 slides / video – 18min51s)
- 6.5 – Residual networks. (slides, handout – 20 slides / video – 21min32s)
- 6.6 – Using GPUs. (slides, handout – 19 slides / video – 18min10s)

- 7 – Autoencoders. (84 slides)
- 7.1 – Transposed convolutions. (slides, handout – 14 slides / video – 13min42s)
- 7.2 – Autoencoders. (slides, handout – 19 slides / video – 15min54s)
- 7.3 – Denoising autoencoders. (slides, handout – 36 slides / video – 32min55s)
- 7.4 – Variational autoencoders. (slides, handout – 15 slides / video – 19min18s)

- 8 – Computer vision. (85 slides)
- 8.1 – Computer vision tasks. (slides, handout – 14 slides / video – 20min14s)
- 8.2 – Networks for image classification. (slides, handout – 34 slides / video – 43min34s)
- 8.3 – Networks for object detection. (slides, handout – 15 slides / video – 20min38s)
- 8.4 – Networks for semantic segmentation. (slides, handout – 9 slides / video – 11min7s)
- 8.5 – DataLoader and neuro-surgery. (slides, handout – 13 slides / video – 13min23s)

- 9 – Under the hood. (76 slides)
- 9.1 – Looking at parameters. (slides, handout – 11 slides / video – 10min26s)
- 9.2 – Looking at activations. (slides, handout – 20 slides / video – 23min28s)
- 9.3 – Visualizing the processing in the input. (slides, handout – 22 slides / video – 23min20s)
- 9.4 – Optimizing inputs. (slides, handout – 23 slides / video – 24min53s)

- 10 – Generative models. (83 slides)
- 11 – Generative adversarial models. (91 slides)
- 11.1 – Generative Adversarial Networks. (slides, handout – 33 slides / video – 30min25s)
- 11.2 – Wasserstein GAN. (slides, handout – 20 slides / video – 23min37s)
- 11.3 – Conditional GAN and image translation. (slides, handout – 29 slides / video – 20min27s)
- 11.4 – Model persistence and checkpoints. (slides, handout – 9 slides / video – 7min51s)

- 12 – Recurrent models and NLP. (72 slides)
- 13 – Attention models. (79 slides)

- Practical 1 (solution)
- Practical 2 (solution)
- Practical 3 (solution)
- Practical 4 (solution)
- Practical 5 (solution)
- Practical 6 (solution)

- Linear algebra (vectors, matrices, Euclidean spaces),
- differential calculus (Jacobian, Hessian, chain rule),
- Python programming,
- basics in probabilities and statistics (discrete and continuous distributions, law of large numbers, conditional probabilities, Bayes, PCA),
- basics in optimization (notion of minima, gradient descent),
- basics in algorithmics (computational costs),
- basics in signal processing (Fourier transform, wavelets).

You may have to look at the Python, Jupyter notebook, and PyTorch documentation.

Helper Python prologue for the practical sessions: dlc_practical_prologue.py

This prologue parses command-line arguments as follows:

    usage: dummy.py [-h] [--full] [--tiny] [--seed SEED] [--cifar]
                    [--data_dir DATA_DIR]

    DLC prologue file for practical sessions.

    optional arguments:
      -h, --help           show this help message and exit
      --full               Use the full set, can take ages (default False)
      --tiny               Use a very small set for quick checks (default False)
      --seed SEED          Random seed (default 0, < 0 is no seeding)
      --cifar              Use the CIFAR data-set and not MNIST (default False)
      --data_dir DATA_DIR  Where are the PyTorch data located (default
                           $PYTORCH_DATA_DIR or './data')
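As an illustration only, a parser producing a similar interface could be sketched with the standard `argparse` module as below. This is a reconstruction from the help text, not the actual source of `dlc_practical_prologue.py`; the names `build_parser` and `resolve_data_dir` are hypothetical, and the fallback logic for the data directory is an assumption based on the stated default.

```python
import argparse
import os


def build_parser():
    # Hypothetical reconstruction from the help text; the real prologue may differ.
    parser = argparse.ArgumentParser(
        description='DLC prologue file for practical sessions.')
    parser.add_argument('--full', action='store_true', default=False,
                        help='Use the full set, can take ages (default False)')
    parser.add_argument('--tiny', action='store_true', default=False,
                        help='Use a very small set for quick checks (default False)')
    parser.add_argument('--seed', type=int, default=0,
                        help='Random seed (default 0, < 0 is no seeding)')
    parser.add_argument('--cifar', action='store_true', default=False,
                        help='Use the CIFAR data-set and not MNIST (default False)')
    parser.add_argument('--data_dir', type=str, default=None,
                        help='Where are the PyTorch data located')
    return parser


def resolve_data_dir(args):
    # Assumed precedence: explicit flag, then $PYTORCH_DATA_DIR, then './data'.
    if args.data_dir is not None:
        return args.data_dir
    return os.environ.get('PYTORCH_DATA_DIR', './data')


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(['--tiny', '--seed', '3'])
print(args.full, args.tiny, args.seed, args.cifar)
```

Passing an explicit list to `parse_args` keeps the example independent of how the script is launched; a real script would call `parse_args()` with no argument.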

The prologue provides the function

load_data(cifar = None, one_hot_labels = False, normalize = False, flatten = True)

which downloads the data when required, reshapes the images to 1d vectors if flatten is True, and narrows to a small subset of samples if --full is not selected.

It returns a tuple of four tensors: train_data, train_target, test_data, and test_target.

If cifar is True, the data-base used is CIFAR10; if it is False, MNIST is used; if it is None, the argument --cifar is taken into account.

If one_hot_labels is True, the targets are converted to 2d torch.Tensor with as many columns as there are classes, and -1 everywhere except the coefficients [n, y_n], equal to 1.
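A minimal sketch of such a conversion, following the description above (-1 everywhere, 1 at [n, y_n]), could look like this. The function name `to_one_hot` and the explicit `num_classes` argument are assumptions made for the example; the prologue's own implementation may differ.

```python
import torch


def to_one_hot(target, num_classes):
    # target: 1d tensor of N integer class indices y_n.
    # Start from an N x num_classes tensor filled with -1 ...
    one_hot = torch.full((target.size(0), num_classes), -1.0)
    # ... and write 1 at position [n, y_n] of every row n.
    one_hot.scatter_(1, target.view(-1, 1), 1.0)
    return one_hot


target = torch.tensor([2, 0, 1])
print(to_one_hot(target, 3))
```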

If normalize is True, the data tensors are normalized according to the mean and variance of the training one.
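One common way to do this, shown here as a sketch, is to subtract the training mean and divide by the training standard deviation, applying the same two scalars to the test data. The function name `normalize_with_train_stats` and the use of a single scalar mean and standard deviation over the whole training tensor are assumptions for the example.

```python
import torch


def normalize_with_train_stats(train_input, test_input):
    # Statistics are computed on the training data only,
    # then applied identically to both sets.
    mu, std = train_input.mean(), train_input.std()
    return (train_input - mu) / std, (test_input - mu) / std


train_input = torch.tensor([[0.0, 2.0], [4.0, 6.0]])
test_input = torch.tensor([[3.0, 3.0]])
train_n, test_n = normalize_with_train_stats(train_input, test_input)
print(train_n.mean().item(), train_n.std().item())
```

Using the training statistics for the test set keeps the two sets on the same scale without letting any information from the test data leak into preprocessing.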

If flatten is True, the data tensors are flattened into 2d tensors of dimension N × D, discarding the image structure of the samples. Otherwise they are 4d tensors of dimension N × C × H × W.
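The two layouts can be illustrated with placeholder tensors of MNIST-like size (N = 1000, C = 1, H = W = 28). This is only a sketch of the reshaping, using zero-filled data rather than the prologue itself.

```python
import torch

# Placeholder data in N x C x H x W layout (MNIST-sized images, all zeros).
images = torch.zeros(1000, 1, 28, 28)

# Flatten every sample into a vector of D = C * H * W = 784 values.
flat = images.view(images.size(0), -1)
print(flat.size())

# The image structure can be restored when the original shape is known.
restored = flat.view(-1, 1, 28, 28)
print(restored.size())
```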

    import dlc_practical_prologue as prologue

    train_input, train_target, test_input, test_target = prologue.load_data()

    print('train_input', train_input.size(), 'train_target', train_target.size())
    print('test_input', test_input.size(), 'test_target', test_target.size())

prints

    * Using MNIST
    ** Reduce the data-set (use --full for the full thing)
    ** Use 1000 train and 1000 test samples
    train_input torch.Size([1000, 784]) train_target torch.Size([1000])
    test_input torch.Size([1000, 784]) test_target torch.Size([1000])

A Virtual Machine (VM) is software that simulates a complete computer. The one we provide here includes a Linux operating system and all the tools needed to use PyTorch from a web browser (*e.g.* Mozilla Firefox or Google Chrome).

- Download and install Oracle's VirtualBox,
- download the virtual machine OVA package (1.52Gb), and
- open the latter in VirtualBox with File → Import Appliance.

You should now see an entry in the list of VMs. The first time it starts, it provides a menu to choose the keyboard layout you want to use (you can force the configuration later by running the command sudo set-kbd).

**If the VM does not start and VirtualBox complains that VT-x is not enabled, you have to activate the virtualization capabilities of your CPU in the BIOS of your computer.**

The VM automatically starts JupyterLab on port 8888 and exports that port to the host. This means that you can access this JupyterLab with a web browser on the machine running VirtualBox at http://localhost:8888/ and use Python notebooks, view files, start terminals, and edit source files. Typing !bye in a notebook or bye in a terminal will shut down the VM.
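If the browser cannot connect, a quick way to check from the host whether the exported port answers is a plain TCP connection test. The helper below is a sketch using only the standard library; the name `port_is_open` is made up for the example, and it only tells whether something is listening on the port, not whether it is JupyterLab.

```python
import socket


def port_is_open(host='localhost', port=8888, timeout=1.0):
    # Try to open a TCP connection; any failure (refused, timeout,
    # unreachable) is reported as "not open".
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print('Port 8888 reachable:', port_is_open())
```

The same helper can be called with `port=2022` to check the ssh port that the VM exports.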

You can run a terminal and a text editor from inside the Jupyter notebook for exercises that require more than the notebook itself. Source files can be executed by running the Python command in a terminal, with the source file name as argument. Both can be done from the main Jupyter window with:

- New → Text File to create the source code, or selecting the file and clicking Edit to edit an existing one.
- New → Terminal to start a shell from which you can run Python.

**Files saved in the VM are erased when the VM is
re-installed, which happens for each session on the EPFL
machines. So you should download files you want to keep from
the Jupyter notebook to your account and re-upload them later
when you need them.**

This VM also exports an ssh port to the port 2022 on the host, which allows you to log in with standard ssh clients on Linux and OSX, and with applications such as PuTTY on Windows. The default login is 'dave' with password 'dummy'; the root account uses the same password.

Note that computation will be slower than with PyTorch installed natively on your machine. In particular, the VM does not take advantage of a GPU if you have one.

**Finally, please also note that this VM is configured in a convenient but highly insecure manner, with easy-to-guess passwords, including for root, and network-accessible unprotected Jupyter notebooks.**

This VM is built on Debian Linux, with miniconda, PyTorch, MNIST, CIFAR10, and many Python utility packages installed.

My own materials on this page are licensed under the Creative Commons BY-NC-SA 4.0 International License.

More simply: I am okay with this material being used for regular academic teaching, but definitely not for a book / youtube loaded with ads / whatever monetization model I am not aware of.