Deep Learning

Keith Dillon
Fall 2019

Topic 1: Introduction

This topic:¶

Syllabus discussion
Textbooks
A.I.: What works and why
Machine Learning overview
Software installation

Reading:

Chollet Chapter 1 (What is deep learning?)
Google Machine Learning Crash Course https://developers.google.com/machine-learning/crash-course/

Outcomes¶

Highlight important rules that people trip up on
Motivate deep learning in the big picture
Get software installed and working

I. Syllabus Discussion¶

Homework and readings will be provided at end of class or via announcement later that evening. Due at beginning of following class.
No text, however a computer is needed.
Attendance is mandatory. If you must miss class, please have a good reason and prior permission, or at least evidence. (Cold and flu ARE good reasons.)
Academic integrity

Prerequisites¶

Programming skills. We will be using Python, which is easier than any other language you know. If you can use MATLAB, that will be enough. Concurrent enrollment in DSCI 6001 will work well.
Vector geometry & calculus.
Undergraduate Linear Algebra. Some exposure to probability and statistics will also help.

If you are weak in an area you will need to devote extra time to keeping up. The university has plenty of resources for tutoring in the undergraduate subjects listed above.

Prerequisites: exist for a reason¶

Exams will only test knowledge and skills you were taught in class, but may require knowledge from prerequisites.

Example: perhaps in class you were shown a method that involves derivatives, and the in-class examples used single variables, but the homework or test requires you to be able to take the derivative of a multivariable function.

Picking up things as you go¶

This is not a course in the following topics, which you are nonetheless expected to learn as needed (sometimes you will need to look them up on your own; often all it takes is reading the manual):

Python programming
Common Python libraries (numpy, scipy, scikit-learn)
Specialized Deep Learning frameworks (Keras, TensorFlow)
Machine Learning basics (what it means, what it is for)

Grading¶

Quizzes/Homework/Labs/Participation - 50%
Midterm - 30%
Project - 20%

II. Texts¶

“Deep Learning with Python”, François Chollet, Manning 2017.

“Deep Learning”, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, MIT Press 2016. https://www.deeplearningbook.org/

“Hands-On Machine Learning with Scikit-Learn and TensorFlow, Concepts, Tools, and Techniques to Build Intelligent Systems”, Géron, O'Reilly 2017. (New edition to be released soon).

Chollet Book (Google employee who worked on Keras)¶

Tries to be very practical, non-mathematical (uses code to explain concepts).

By the end the code is probably moving too fast unless you are also studying Keras programming.

Geron Book¶

Second edition (Keras-based rewrite) coming out imminently

GBC Book¶

Very advanced and technical. Graduate level.

Notes vs. Slides¶

Slides are a prop to help discussions and lectures, not a replacement for notes.

Math does not work well via slides. Derivations and problems will often be given only on the board, requiring note-taking.

If you have a computer science background this is a skill you may need to re-learn.

When you ask "what will be on the exam", I will say "the topics I emphasized in class".

Attendance¶

"The instructor has the right to dismiss from class any student who has been absent more than two weeks (pro-rated for terms different from that of the semester). A dismissed student will receive a withdrawal (W) from the course if they are still eligible for a withdrawal per the University “Withdrawal from a Course” policy, or a failure (F) if not." - Student Handbook

If you don't attend class you are responsible for learning the material on your own.

If you are becoming a burden on the class due to absences I will have you removed.

Participation¶

Active-learning techniques will be used regularly in class, requiring students to work individually and/or with other students.

Refusal to participate (or consistent failure to pay attention) will be treated as absence from class and ultimately lead to dismissal from the class.

Project-based Learning¶

This does not mean typical projects, where you pick something you already know how to do.

It means challenging projects you don't know how to solve yet. Then you may either be led to figure it out, or you may be given the solution now that you are better prepared to understand the difficulties it solves.

You may find it frustrating at times.

Topics¶

... see syllabus...

III. Artificial Intelligence: what works and why¶

Headlines¶

“A.I.” currently means Deep Learning¶

Read the fine print in those articles...

A.I. versus Deep Learning¶

Deep Learning: State-of-the-art¶

Near-human-level image classification
Near-human-level speech recognition
Near-human-level handwriting transcription
Improved machine translation
Improved text-to-speech conversion
Digital assistants such as Google Now and Amazon Alexa
Near-human-level autonomous driving
Improved ad targeting, as used by Google, Baidu, and Bing
Improved search results on the web
Ability to answer natural-language questions
Superhuman Go playing

“Deep Learning with Python”, Chollet 2018

Computer Vision: the killer app¶

Why it works - Theory¶

An artificial neural network is a universal function approximator: a sufficiently complex (i.e. deep) network can find any parametric relation between input data and desired behavior, as long as one exists.
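The approximation idea can be illustrated without any training. The sketch below is an assumption-laden toy, not how networks are built in practice: it sets the weights of a one-hidden-layer ReLU network by hand so that the network interpolates a target function (here `math.sin`), and it grows the network in width rather than depth. The function names `build_relu_net` and `max_error` are made up for this example.

```python
import math

def relu(z):
    return max(0.0, z)

def build_relu_net(target, lo, hi, n_hidden):
    """One-hidden-layer ReLU net that interpolates `target` at evenly spaced knots.

    Weights are set by hand (not learned): each hidden unit switches on at one
    knot, and its output weight is the change in slope needed at that knot.
    """
    knots = [lo + (hi - lo) * k / n_hidden for k in range(n_hidden + 1)]
    values = [target(t) for t in knots]
    slopes = [(values[k + 1] - values[k]) / (knots[k + 1] - knots[k])
              for k in range(n_hidden)]
    weights = [slopes[0]] + [slopes[k] - slopes[k - 1] for k in range(1, n_hidden)]
    bias = values[0]

    def net(x):
        # zip stops at the shorter list, so the last knot gets no unit
        return bias + sum(w * relu(x - t) for w, t in zip(weights, knots))

    return net

def max_error(target, net, lo, hi, n_test=1000):
    xs = [lo + (hi - lo) * i / n_test for i in range(n_test + 1)]
    return max(abs(target(x) - net(x)) for x in xs)

# More hidden units give a closer fit to sin(x) on [0, 2*pi]
errors = {}
for n in (4, 16, 64):
    net = build_relu_net(math.sin, 0.0, 2 * math.pi, n)
    errors[n] = max_error(math.sin, net, 0.0, 2 * math.pi)
    print(n, "hidden units, max error:", errors[n])
```

The worst-case error shrinks as units are added, which is the sense in which more model complexity buys a better approximation.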

Q: How do we know a function exists?¶

A: If a human can do it...¶

Then we can assume a “function” exists which we can approximate with a sufficiently-complex network. The more complex the network, the more:

Data
Processing power

are required to “fit” the approximation.

Why it works - Scalability¶

With the backpropagation algorithm, implementing simple gradient descent (batchwise)

can exploit lots of data in series
can exploit lots of parallel processors
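A minimal sketch of batchwise gradient descent follows, under simplifying assumptions: the model is a two-parameter line `w*x + b`, so the gradient is written out by hand instead of being computed by backpropagation, and the data are synthetic and noise-free. The learning rate and batch size are arbitrary illustrative choices.

```python
import random

random.seed(0)

# Synthetic data from a known line y = 3x + 1 (an assumption for this demo)
xs = [random.uniform(-1.0, 1.0) for _ in range(200)]
data = [(x, 3.0 * x + 1.0) for x in xs]

w, b = 0.0, 0.0          # parameters to learn
learning_rate = 0.1
batch_size = 20

for epoch in range(200):
    random.shuffle(data)
    # Data are consumed in series, one small batch at a time
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        # Gradient of the mean squared error over this batch;
        # the per-sample terms are independent, so they could run in parallel
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in batch) / len(batch)
        grad_b = sum(2.0 * (w * x + b - y) for x, y in batch) / len(batch)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

print("learned w, b:", w, b)
```

Each update touches only one batch, so memory use does not grow with the dataset, and the sums inside a batch are the part that parallel hardware speeds up.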

How it works – Feature engineering¶

From another perspective, Deep Learning brought a huge leap forward because the hardest part of machine learning, feature engineering, could be mostly automated by using lots of data.

View a deep network as a shallow machine learning model (typically logistic regression) plus a bunch of preceding layers for representation learning.
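This split can be sketched on the XOR problem, which logistic regression cannot solve on the raw inputs. In the toy below, all weights are hand-picked for illustration (a real network would learn them from data): a hidden ReLU layer computes a new representation, and a plain logistic regression on top of it does the classification.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hidden_layer(x1, x2):
    """Representation learning part (weights hand-picked, not learned)."""
    return relu(x1 + x2), relu(x1 + x2 - 1.0)

def logistic_regression(h1, h2):
    """Shallow model applied to the learned features instead of the raw inputs."""
    return sigmoid(10.0 * (h1 - 2.0 * h2) - 5.0)

def network(x1, x2):
    return logistic_regression(*hidden_layer(x1, x2))

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
predictions = [round(network(a, b)) for a, b in inputs]
print(predictions)  # XOR of each input pair
```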

“mostly automated”¶

We still can’t really exploit enough data to do truly “data driven” training (i.e. devoid of expertise)

regularization – impose simple forms of prior knowledge on network (“dropout”, weight decay, sparsity)
data augmentation – impose desired invariances by generating new data
choosing architecture is a way of limiting model complexity
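As a rough sketch of data augmentation, the code below generates extra training samples from one tiny "image" stored as a nested list. The helper names and the choice of transformations (a horizontal flip and a one-pixel shift) are assumptions made for this example; which invariances are appropriate depends on the task.

```python
def flip_horizontal(image):
    """Mirror each row; the label should not change for most natural images."""
    return [row[::-1] for row in image]

def shift_right(image, fill=0):
    """Shift every row one pixel to the right, padding with `fill`."""
    return [[fill] + row[:-1] for row in image]

def augment(dataset):
    """Return the original samples plus transformed copies with the same labels."""
    augmented = []
    for image, label in dataset:
        augmented.append((image, label))
        augmented.append((flip_horizontal(image), label))
        augmented.append((shift_right(image), label))
    return augmented

image = [[0, 1, 2],
         [3, 4, 5]]
bigger = augment([(image, "cat")])
for sample, label in bigger:
    print(label, sample)
```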

= Jobs for data scientists

“mostly automated” II¶

We still have no good idea how to choose the network architecture – various ideas are tried with little theory behind them

= more jobs

Overconfidence¶

State of the art models aren’t nearly as smart as they think they are

Adversarial examples
Vulnerable to hacking
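The mechanism behind adversarial examples can be sketched on a linear classifier, which is a deliberate simplification of a deep network. The weights and the input below are made up for illustration. Each feature is nudged by a small `epsilon` in the direction that lowers the score (the sign of the gradient, which for a linear model is the sign of the weight), in the spirit of the fast gradient sign method.

```python
def score(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    return 1 if score(w, b, x) > 0 else 0

def sign(v):
    return (v > 0) - (v < 0)

# Hypothetical classifier and input, chosen only for this demo
w, b = [1.0, -2.0, 0.5], 0.0
x = [0.3, 0.1, 0.2]
epsilon = 0.1

# Move every feature by at most epsilon, each in its most damaging direction
x_adv = [xi - epsilon * sign(wi) for xi, wi in zip(x, w)]

print("original:   ", predict(w, b, x), score(w, b, x))
print("adversarial:", predict(w, b, x_adv), score(w, b, x_adv))
```

Small per-feature changes add up across all features, which is why high-dimensional inputs such as images are especially exposed.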

= yet more jobs

Single-pixel attack¶

IV. Machine Learning overview¶

Machine Learning¶

Algorithms or programs that can learn from data - classify, predict, cluster
Goal: Application of result to new data - Generalization

Step 1: Use some algorithm to train a model with the data

Step 2: Use the model for something...

Supervised Learning¶

The most important class of methods, includes classification and regression

Here we will focus on using supervised learning for Classification

Given a training set consisting of:

input samples $\mathbf x_i$ such as images, text strings, measurements of some features
output labels $y_i$ corresponding to each input sample, such as cat vs. dog or disease vs. healthy; this is the sample's class
Learn a function $f(\mathbf x) = y$ that can predict the label for a new unlabeled sample, i.e. classify it.
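The train-then-predict pattern can be sketched with one of the simplest classifiers, a nearest-centroid rule. This is a stand-in chosen for brevity, not a method the course prescribes, and the two-feature samples are invented for the example. `train` plays the role of learning $f$, and `predict` applies it to a new unlabeled sample.

```python
def train(samples, labels):
    """Learn one centroid (mean feature vector) per class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + xi for s, xi in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def predict(model, x):
    """Return the label whose centroid is closest to x."""
    def squared_distance(centroid):
        return sum((xi - ci) ** 2 for xi, ci in zip(x, centroid))
    return min(model, key=lambda y: squared_distance(model[y]))

# Made-up training set: two features per sample
X_train = [[1.0, 1.2], [0.8, 1.0], [3.0, 3.1], [3.2, 2.9]]
y_train = ["cat", "cat", "dog", "dog"]

model = train(X_train, y_train)      # Step 1: train a model with the data
print(predict(model, [2.9, 3.0]))    # Step 2: classify a new, unlabeled sample
```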

Classification discussion¶

Consider how this may be applied (what is $\mathbf x$, $y$, $f$, and what is "learned"?) in fields like:

computer security
online sales (e.g. Amazon)
finance

Machine Learning State of the Art¶

"Structured Data" - Important features are used

Classic machine learning methods - Decision trees, ensembles.

"Unstructured Data" - raw info: signals, images, text

Deep Learning dominates (when there's a lot of data) - especially "Convolutional Neural Nets"
ImageNet dominance since 2012, plus Kaggle for images

What is a Deep Neural Network?¶

A universal function approximator $f(\mathbf x) \approx y$

$\mathbf x$ = image of dog or cat
$y$ = 1 if image is dog, 0 if image is cat

The strength of Deep Learning is in methods to adapt this function using data.

Final Word:¶

The #1 reason deep neural nets have been so successful is... what?

...Scalability!¶

Able to exploit massively parallel processing

Able to exploit big data

V. Software Installation¶

Virtualenv¶

A tool to create isolated Python environments

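 cd $ML_PATH               # Your ML working directory (e.g., $HOME/ml)
 virtualenv env            # Create the isolated environment (only needed once)
 source env/bin/activate   # Activate it in the current shell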

Can create in Anaconda Navigator

Install Jupyter, Tensorflow, & Keras¶

Ideally using Anaconda: https://www.anaconda.com/download/

Or via command line:

conda create -n tensorflow_py36 python=3.6
conda activate tensorflow_py36
conda install jupyter matplotlib numpy scipy scikit-learn pandas ...
conda install tensorflow
conda install keras

Or from within Anaconda-Navigator using GUI.

Test Installation¶

python3 -c 'import tensorflow; print(tensorflow.__version__)'
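To check several packages at once, something like the following sketch may help. It only tests whether each package can be found, without importing it, and the list of names is an assumption based on the install commands above (note that scikit-learn is imported as `sklearn`).

```python
import importlib.util

def check_packages(names):
    """Map each top-level package name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

# Assumed list of import names; adjust to match what you installed
packages = ["tensorflow", "keras", "numpy", "scipy", "sklearn", "pandas", "matplotlib"]

for name, found in check_packages(packages).items():
    print(f"{name:12s} {'OK' if found else 'MISSING'}")
```

Run it inside the activated environment; a MISSING entry usually means the package was installed into a different environment.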

Plan B: Cloud-based Notebooks¶

Google CoLab: https://colab.research.google.com/notebooks/welcome.ipynb

Kaggle Kernel: https://www.kaggle.com/kernels

Jupyter - "notebooks" for inline code + LaTeX math + markup, etc.¶

A single document containing a series of "cells", each containing code which can be run, or images and other documentation.

Run a cell via [shift] + [Enter] or "play" button in the menu.

Will execute code and display result below, or render markup etc.

In [2]:

import datetime

print("This code is run right now (" + str(datetime.datetime.now()) + ")")

'hi'

This code is run right now (2019-01-06 17:18:52.917748)

Out[2]:

'hi'

In [5]:

x=1+2+2

print(x)

Python Help Tips¶

Get help on a function or object via [shift] + [tab] after the opening parenthesis function(

Can also get help by executing function?
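Outside of Jupyter, where `function?` is not available, the same information can be reached from plain Python. The small function `scale` below is a made-up example used only to have something to inspect.

```python
import inspect

def scale(x, factor=2.0):
    """Multiply x by factor and return the result."""
    return x * factor

print(inspect.signature(scale))   # the parameter list
print(inspect.getdoc(scale))      # the docstring
# help(scale) prints both together and works in any Python session
```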