How to Understand Deep Learning (in a Shallow Way)

Increased usage of ‘deep learning’ from Google ngram viewer

A Short History of the Black Box

The original Perceptron. Source: Wikimedia
Reproduced with kind permission from The Neural Network Zoo
  • Speech recognition: sound waves of speech are processed to output a sequence of words.
  • Image recognition: a set of pixels is processed to give the class of objects in the image.
  • Autonomous vehicle control: imagery from a front-facing vehicle camera can be used to output controls for the steering and brakes.

Non-Linearity (A More Technical Aside)

Sigmoid. Source: Wikimedia
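As a minimal sketch (using NumPy), the sigmoid squashes any real-valued input into the range (0, 1), and it is this squashing that gives a stack of layers its non-linearity:

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid: maps any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

# Large negative inputs saturate near 0, large positive inputs near 1,
# and the midpoint sigmoid(0) is exactly 0.5.
print(sigmoid(np.array([-5.0, 0.0, 5.0])))
```

Without a non-linearity like this between layers, any stack of layers would collapse to a single linear transformation.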

How to Learn


The Ghost in the Machine and How to Find it

How to Get Deep — And Estimated Time Investments

  • 1 month: The canonical starting point for machine learning (and also one of the very first MOOCs) is Andrew Ng’s course on Machine Learning. It’s long but it covers the bread and butter of ML and being able to listen to one of the masters speak for several hours is invaluable.
  • 10 minutes: Yann LeCun at Facebook has a series of great educational vignettes on AI concepts for those who learn visually.
  • 1 hour: The single best resource I found for walking through the core components of deep learning is the first chapter of Michael Nielsen’s book.
  • 2 hours: For the more practically minded, Caffe in a Day focuses on how to code things in Caffe and troubleshoot non-converging models.
  • 30 minutes: A high-level nine-page review of deep learning appeared in Nature recently.
  • Unlimited: The TensorFlow Playground is one of the best ways to understand how neural networks converge and learn.
  • 20 minutes per week: The Machine Learning subreddit is full of interesting and practical questions on best practices in ML that are usually patiently answered.
  • 30 minutes: Practical advice on getting deep neural nets to converge efficiently is hard to come by; this is very useful content.


  • Deep learning: Artificial Neural Networks (ANN) with complex non-linear structure.
  • Shallow learning: Artificial Neural Networks with simple structures and architectures.
  • Convolutional Neural Networks: Neural Networks that have convolutional layers that essentially apply a filter (through convolution, thus the name) to identify important features. The critical part here is that the importance of each pixel is also coupled to related pixels nearby.
  • Recurrent Neural Networks: An ANN that processes a stream of data, rather than a single snapshot of data. For example, speech is made up of a sound signal over time. The frequency profile at one snapshot is used as a vector, but the result of the application of the RNN to that snapshot is partially determined by previous snapshots. For that reason RNNs are stateful.
Reproduced with kind permission from The Neural Network Zoo
  • Long Short-Term Memory: LSTM is a particular RNN architecture, i.e. one example of a recurrent network.
Reproduced with kind permission from The Neural Network Zoo
  • Hidden layer: a layer between the input and output that applies a non-linear activation function, smoothly approximating the effect of a threshold.
  • Epoch: a training epoch refers to one complete pass through the training data while tuning your model. When one epoch ends, the next begins. The number of epochs is a hyperparameter.
  • Rectified Linear Unit (ReLU): an activation function used in hidden layers; it outputs zero for negative inputs and passes positive inputs through unchanged. It is commonly preferred to a sigmoid or logistic activation function.
  • Feature Map: the result of the application of a function to a data vector.
  • Pooling: Pooling is a process by which a high-dimensional feature map is reduced in resolution, e.g. 128x128 pixels reduced to 32x32 pixels. There are various rules for doing so, for example taking the highest value (max-pooling) or the average value (average-pooling) of each 2x2 subset to represent an output pixel. The step size by which the pooling window moves is the stride.
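To make two of the terms above concrete, here is a small NumPy sketch of ReLU and 2x2 max-pooling; the toy feature map is invented purely for illustration:

```python
import numpy as np

def relu(x):
    # Rectified Linear Unit: zero for negative inputs, identity otherwise.
    return np.maximum(0, x)

def max_pool_2x2(feature_map):
    # Max-pooling with a 2x2 window and stride 2: each output pixel is
    # the largest value in its corresponding 2x2 block of the input.
    h, w = feature_map.shape
    return feature_map.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# A made-up 4x4 feature map for illustration.
fm = np.array([[ 1, -2,  3, 0],
               [ 4,  5, -6, 1],
               [ 0,  2,  1, 1],
               [-1,  0,  2, 8]], dtype=float)

pooled = max_pool_2x2(relu(fm))
print(pooled)  # 2x2 map: [[5. 3.] [2. 8.]]
```

Note how ReLU first zeros out the negative entries, and pooling then keeps only the strongest response in each 2x2 neighbourhood, shrinking the map from 4x4 to 2x2.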


  • Caffe — optimised for image processing and developed by Berkeley.
  • Torch — developed by Facebook & NYU; the API is written in Lua (over a C backend).
  • TensorFlow — Google’s masterpiece.
  • CNTK — from Microsoft.
  • Keras — seems to be single-handedly maintained by Google engineer François Chollet. A nice high-level interface to TensorFlow, Theano and CNTK.
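As a hedged illustration of why Keras counts as a nice high-level interface, a minimal feed-forward model can be assembled in a few lines; the layer sizes below are arbitrary choices for illustration, not taken from any of the resources above:

```python
# A minimal feed-forward network in Keras. The input width (16),
# hidden width (32) and number of classes (10) are invented here
# purely to show the API shape.
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(16,)),                   # 16 input features
    keras.layers.Dense(32, activation="relu"),         # one hidden layer
    keras.layers.Dense(10, activation="softmax"),      # 10-class output
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.summary()
```

The same model in raw TensorFlow would take considerably more code, which is the trade-off Keras makes: less control in exchange for brevity.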




Data, science, data science and trace amounts of the Middle East and the UN

Alex Rutherford
