Deep Learning

Keith Dillon
Fall 2019

Topic 1: Introduction

This topic:

  1. Syllabus discussion
  2. Textbooks
  3. A.I.: What works and why
  4. Machine Learning overview
  5. Software installation

Reading:

Outcomes

  • Highlight important rules that people trip up on
  • Motivate Deep Learning in the big picture
  • Get software installed and working

I. Syllabus Discussion

  • Homework and readings will be provided at the end of class or via announcement later that evening, and are due at the beginning of the following class.
  • No required text; however, a computer is needed.
  • Attendance is mandatory. If you must miss class, please have a good reason and prior permission, or at least evidence (cold and flu ARE good reasons).
  • Academic integrity

Prerequisites

  • Programming skills. We will be using Python, which is easier than any other language you know. If you can use MATLAB, that will be enough. Concurrent enrollment in DSCI 6001 will work well.

  • Vector geometry & calculus.

  • Undergraduate Linear Algebra. Some exposure to probability and statistics will also help.

If you are weak in an area you will need to devote extra time to keeping up. The university has plenty of resources for tutoring in the undergraduate subjects listed above.

Prerequisites: exist for a reason

Exams will only test knowledge and skills you were taught in class, but may require knowledge from prerequisites.

Example: perhaps in class you were shown a method that involves derivatives, and the in-class examples used single variables, but the homework or test requires you to be able to take the derivative of a multivariable function.

Picking up things as you go

This is not a course in the following topics, which you are nonetheless expected to learn as needed (sometimes you will need to look them up on your own; often all it takes is reading the manual).

  • Python programming
  • Common Python libraries (numpy, scipy, scikit-learn)
  • Specialized Deep Learning frameworks (Keras, TensorFlow)
  • Machine Learning basics (what it means, what it is for)

Grading

  • Quizzes/Homework/Labs/Participation - 50%
  • Midterm - 30%
  • Project - 20%

II. Texts

“Deep Learning with Python”, François Chollet, Manning 2017.

“Deep Learning”, Ian Goodfellow, Yoshua Bengio, and Aaron Courville, MIT Press 2016. https://www.deeplearningbook.org/

“Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems”, Aurélien Géron, O'Reilly 2017. (New edition to be released soon).

Chollet Book (Google employee who worked on Keras)

Tries to be very practical, non-mathematical (uses code to explain concepts).

By the end the code is probably moving too fast unless you are also studying Keras programming.

Géron Book

Second edition (Keras-based rewrite) coming out imminently

GBC Book

Very advanced and technical. Graduate level.

Notes vs. Slides

Slides are a prop to help discussions and lectures, not a replacement for notes.

Math does not work well via slides. Derivations and problems will often be given only on the board, requiring note-taking.

If you have a computer science background this is a skill you may need to re-learn.

When you ask "what will be on the exam", I will say "the topics I emphasized in class".

Attendance

"The instructor has the right to dismiss from class any student who has been absent more than two weeks (pro-rated for terms different from that of the semester). A dismissed student will receive a withdrawal (W) from the course if they are still eligible for a withdrawal per the University “Withdrawal from a Course” policy, or a failure (F) if not." - Student Handbook

If you don't attend class you are responsible for learning the material on your own.

If you are becoming a burden on the class due to absences I will have you removed.

Participation

Active-learning techniques will be used regularly in class, requiring students to work individually and/or with other students.

Refusal to participate (or consistent failure to pay attention) will be treated as absence from class and ultimately lead to dismissal from the class.

Project-based Learning

This does not mean typical projects, where you pick something you already know how to do.

It means challenging projects you don't know how to solve yet. You may then either be led to figure out the solution, or be given it once you are better prepared to understand the difficulties it solves.

You may find it frustrating at times.

Topics

  1. ... see syllabus...

III. Artificial Intelligence: what works and why

Headlines

“A.I.” currently means Deep Learning

Read the fine print in those articles...

A.I. versus Deep Learning

Deep Learning: State-of-the-art

  • Near-human-level image classification
  • Near-human-level speech recognition
  • Near-human-level handwriting transcription
  • Improved machine translation
  • Improved text-to-speech conversion
  • Digital assistants such as Google Now and Amazon Alexa
  • Near-human-level autonomous driving
  • Improved ad targeting, as used by Google, Baidu, and Bing
  • Improved search results on the web
  • Ability to answer natural-language questions
  • Superhuman Go playing

“Deep Learning with Python”, Chollet 2018

Computer Vision: the killer app

Why it works - Theory

An artificial neural network is a universal function approximator: a sufficiently complex (i.e. deep) network can approximate any parametric relation between input data and desired behavior, as long as one exists.

Q: How do we know a function exists?

A: If a human can do it...

Then we can assume a “function” exists which we can approximate with a sufficiently-complex network. The more complex the network, the more:

  • Data
  • Processing power

are required to “fit” the approximation.
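
A rough illustration of the approximation claim, as a sketch only: the network below is built by hand rather than trained, and the names (`build_relu_net`, `predict`, `max_error`) and the choice of sin(x) as the target are assumptions made for this example. It shows a one-hidden-layer ReLU network whose worst-case error shrinks as hidden units are added.

```python
import math

def relu(z):
    return max(0.0, z)

def build_relu_net(f, a, b, n_hidden):
    """One-hidden-layer ReLU net that linearly interpolates f on [a, b]."""
    step = (b - a) / n_hidden
    knots = [a + i * step for i in range(n_hidden)]           # one knot per hidden unit
    values = [f(a + i * step) for i in range(n_hidden + 1)]    # targets at the knots
    slopes = [(values[i + 1] - values[i]) / step for i in range(n_hidden)]
    # Output weight of each unit = change in slope at its knot.
    weights = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, n_hidden)]
    return values[0], knots, weights

def predict(net, x):
    bias, knots, weights = net
    return bias + sum(w * relu(x - k) for w, k in zip(weights, knots))

def max_error(f, net, a, b, n_test=1000):
    xs = [a + (b - a) * i / n_test for i in range(n_test + 1)]
    return max(abs(f(x) - predict(net, x)) for x in xs)

errors = {}
for n_hidden in (4, 16, 64):
    net = build_relu_net(math.sin, 0.0, 2 * math.pi, n_hidden)
    errors[n_hidden] = max_error(math.sin, net, 0.0, 2 * math.pi)
    print(n_hidden, "hidden units -> max error", round(errors[n_hidden], 4))
```

In practice the weights are found by training on data, not by construction, which is where the data and processing-power requirements come from.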

Why it works - Scalability

With the backpropagation algorithm, implementing simple gradient descent (batchwise)

  • can exploit lots of data in series
  • can exploit lots of parallel processors
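
A minimal sketch of batchwise gradient descent, assuming a toy linear model y = w·x + b so that the gradient can be written by hand (in a deep network, backpropagation computes the corresponding gradient for every layer). The data, learning rate, and batch size are arbitrary choices for illustration.

```python
import random

random.seed(0)

# Synthetic data from y = 2x + 1 (an arbitrary choice for illustration).
xs = [random.uniform(-1.0, 1.0) for _ in range(200)]
data = [(x, 2.0 * x + 1.0) for x in xs]

def minibatch_gradient_descent(data, lr=0.1, batch_size=20, epochs=100):
    w, b = 0.0, 0.0
    data = list(data)
    for _ in range(epochs):
        random.shuffle(data)
        # Batches are consumed one after another: data is exploited in series.
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradient of the mean squared error over this batch only.
            # Each sample's term is independent of the others, so the sum
            # could be split across parallel processors.
            grad_w = sum(2.0 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2.0 * (w * x + b - y) for x, y in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

w, b = minibatch_gradient_descent(data)
print(w, b)
```

Memory use depends on the batch size rather than the dataset size, which is what lets the same loop run on arbitrarily large datasets.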

How it works – Feature engineering

From another perspective, Deep Learning brought a huge leap forward because the hardest part of machine learning, feature engineering, could be mostly automated by using lots of data.

A deep network can be viewed as a shallow machine learning model (typically logistic regression) plus a stack of preceding layers for representation learning.
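
A small sketch of this view on the XOR problem, which logistic regression cannot solve from the raw inputs because the two classes are not linearly separable. Everything here is an assumption for illustration: the weights are set by hand, whereas a real network would learn both the representation and the final classifier from data.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def representation(x1, x2):
    """Hidden layer: maps the raw inputs to two new features."""
    return relu(x1 + x2), relu(x1 + x2 - 1.0)

def logistic_regression(features, weights, bias):
    """Shallow model applied to whatever features it is given."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return sigmoid(z)

def deep_predict(x1, x2):
    h = representation(x1, x2)
    return logistic_regression(h, weights=(10.0, -20.0), bias=-5.0)

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
predictions = [int(deep_predict(x1, x2) > 0.5) for x1, x2 in inputs]
print(predictions)  # [0, 1, 1, 0], i.e. XOR of the two inputs
```

The final layer is ordinary logistic regression; all of the difficulty has moved into choosing the features it sees.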

“mostly automated”

We still can’t really exploit enough data to do truly “data driven” training (i.e. devoid of expertise)

  • regularization – impose simple forms of prior knowledge on network (“dropout”, weight decay, sparsity)
  • data augmentation – impose desired invariances by generating new data
  • choosing architecture is a way of limiting model complexity
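
The first two items can be sketched in a few lines of plain Python. The function names and parameter values below are hypothetical and chosen for this example; frameworks such as Keras provide their own versions of these operations.

```python
import random

def horizontal_flip(image):
    """Data augmentation: a mirrored image keeps its label."""
    return [list(reversed(row)) for row in image]

def sgd_step_with_weight_decay(weights, grads, lr=0.1, weight_decay=0.01):
    """Gradient step with an extra pull of every weight toward zero."""
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]

def dropout(activations, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

image = [[1, 2, 3],
         [4, 5, 6]]
flipped = horizontal_flip(image)
decayed = sgd_step_with_weight_decay([1.0, -2.0], grads=[0.0, 0.0], lr=0.1, weight_decay=0.5)
dropped = dropout([1.0] * 8, rate=0.5, rng=random.Random(0))
print(flipped)
print(decayed)
print(dropped)
```

Each function encodes a piece of expertise: flipping says that left-right orientation should not matter, weight decay says that small weights are preferred, and dropout says that no single unit should be relied on.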

= Jobs for data scientists

“mostly automated” II

We still have no good idea how to choose the network architecture – various ideas are tried with little theory behind them

= more jobs

Overconfidence

State-of-the-art models aren’t nearly as smart as they think they are

  • Adversarial examples
  • Vulnerable to hacking

= yet more jobs

Single-pixel attack

IV. Machine Learning overview

Machine Learning

  • Algorithms or programs that can learn from data - classify, predict, cluster

  • Goal: Application of result to new data - Generalization

Step 1: use some algorithm to train a model with the data

Step 2: Use the model for something...

Supervised Learning

The most important class of methods; includes classification and regression.

Here we will focus on using supervised learning for Classification

Given a training set consisting of:

  • input samples $\mathbf x_i$, such as images, text strings, or measurements of some features
  • output labels $y_i$ corresponding to each input sample, such as cat vs. dog or disease vs. healthy; this is the sample's class

learn a function $f(\mathbf x) = y$ that can predict the label for a new unlabeled sample, i.e. classify it.
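
One of the simplest possible instances of this setup is a nearest-centroid classifier, sketched below. The feature values and the `fit`/`predict` names are made up for illustration; this is not the method used later in the course.

```python
def fit(samples, labels):
    """Step 1 (training): compute the mean feature vector of each class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + xi for s, xi in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def predict(centroids, x):
    """Step 2 (use): return the label of the closest class centroid."""
    def squared_distance(c):
        return sum((ci - xi) ** 2 for ci, xi in zip(c, x))
    return min(centroids, key=lambda y: squared_distance(centroids[y]))

# Made-up training set: two features per sample (say, weight in kg, height in cm).
train_x = [(4.0, 25.0), (5.0, 23.0), (30.0, 60.0), (25.0, 55.0)]
train_y = ["cat", "cat", "dog", "dog"]
model = fit(train_x, train_y)

# New, unlabeled samples that were not in the training set.
print(predict(model, (4.5, 24.0)), predict(model, (28.0, 58.0)))
```

Here the "learned" function f is fully described by the two centroids; a deep network plays the same role with many more parameters.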

Classification discussion

Consider how this may be applied (what is $\mathbf x$, $y$, $f$, and what is "learned"?) in fields like:

  • computer security
  • online sales (e.g. Amazon)
  • finance

Machine Learning State of the Art

"Structured Data" - the important features have already been extracted

  • Classic machine learning methods - Decision trees, ensembles.

"Unstructured Data" - raw info: signals, images, text

  • Deep Learning dominates (when there's a lot of data) - especially "Convolutional Neural Nets"
  • ImageNet dominance since 2012, plus Kaggle for images

What is a Deep Neural Network?

A universal function approximator $f(\mathbf x) \approx y$

  • $\mathbf x$ = image of dog or cat
  • $y$ = 1 if image is dog, 0 if image is cat

The strength of Deep Learning is in methods to adapt this function using data.

Final Word:

The #1 reason deep neural nets have been so successful is... what?

...Scalability!

Able to exploit massively parallel processing

Able to exploit big data

V. Software Installation

Virtualenv

A tool to create isolated Python environments

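 $ cd $ML_PATH               # Your ML working directory (e.g., $HOME/ml)
 $ virtualenv env            # Create the isolated environment (first time only)
 $ source env/bin/activate   # Activate it in the current shell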

Environments can also be created in Anaconda Navigator.

Install Jupyter, TensorFlow, & Keras

Ideally using Anaconda: https://www.anaconda.com/download/

Or via command line:

conda create -n tensorflow_py36 python=3.6
conda activate tensorflow_py36
conda install jupyter matplotlib numpy scipy scikit-learn pandas ...
conda install tensorflow
conda install keras

Or from within Anaconda-Navigator using GUI.

Test Installation

python3 -c 'import tensorflow; print(tensorflow.__version__)' 
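
To check the whole environment at once, a short script along these lines may help. It is a sketch: the `check_packages` helper and the list of names are assumptions, so adjust the list to whatever you installed.

```python
import importlib.util

def check_packages(names):
    """Map each import name to True if Python can find it, without importing it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

# Import names, which can differ from install names (scikit-learn imports as sklearn).
wanted = ["numpy", "scipy", "sklearn", "pandas", "matplotlib", "tensorflow", "keras"]
for name, found in check_packages(wanted).items():
    print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

Run it with the environment activated; anything reported as MISSING was not installed into that environment.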

Plan B: Cloud-based Notebooks

Google Colab: https://colab.research.google.com/notebooks/welcome.ipynb

Kaggle Kernel: https://www.kaggle.com/kernels

Jupyter - "notebooks" for inline code + LaTeX math + markup, etc.

A notebook is a single document containing a series of "cells", each containing code that can be run, or images and other documentation.

  • Run a cell via [shift] + [Enter] or the "play" button in the menu. This executes the code and displays the result below the cell, or renders markup, etc.

In [2]:
import datetime

print("This code is run right now (" + str(datetime.datetime.now()) + ")")

'hi'
This code is run right now (2019-01-06 17:18:52.917748)
Out[2]:
'hi'
In [5]:
x=1+2+2

print(x)
5

Python Help Tips

  • Get help on a function or object by pressing [shift] + [tab] after typing the opening parenthesis, e.g. function(
  • Can also get help by executing function? in a cell
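
Outside a notebook, the built-in help(obj) gives the same kind of information. The sketch below uses the standard inspect module to fetch a signature and docstring programmatically; the `describe` helper is a hypothetical name for this example.

```python
import inspect

def describe(obj):
    """Return the signature (if available) and docstring of a callable."""
    try:
        signature = str(inspect.signature(obj))
    except (TypeError, ValueError):
        # Some built-ins do not expose a signature.
        signature = "(signature unavailable)"
    return f"{obj.__name__}{signature}\n{inspect.getdoc(obj)}"

print(describe(describe))
print(describe(len))
```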