Mathematical Methods for Data Science

Keith Dillon
Spring 2019

Topic 3: "Tensors" and Vectorization

This topic:

  1. Vectorization and Matricization
  2. Kronecker product

Reading:

  • Strang section IV.3
  • "The ubiquitous Kronecker product", Van Loan, 2000

I. Vectorization, Matricization, and Reshaping

Motivation

  • In machine learning, and especially deep learning, data is commonly structured as so-called tensors.
  • Behind the scenes, however, we commonly hear that linear algebra is applied to this data.

Q: But isn't linear algebra mainly just vector and matrix operations?

  • A: Yes. The tensors everyone uses are processed using matrices and vectors.

(There are genuine higher-order versions of linear algebra, referred to as tensor operations, such as Tucker decompositions, but this is a small research niche that is not related to the math behind TensorFlow etc.)

What is a "Tensor"? (Computer Science version)

Behind the scenes, it's just another list of numbers with some extra information about its dimensions.
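
A minimal sketch of this idea, assuming NumPy's `ndarray` as the concrete "tensor" type: the same flat run of numbers is given a shape, and the array only records how to step through that buffer. The variable names here are illustrative.

```python
import numpy as np

# A flat "list of numbers" ...
flat = np.arange(24)

# ... viewed as a 3 x 2 x 4 array. Only the dimension info changes.
T = flat.reshape(3, 2, 4)

print(T.shape)    # the extra info: size along each axis
print(T.strides)  # bytes to step along each axis of the flat buffer

# For a contiguous array like this one, reshape returns a view, not a copy.
print(np.shares_memory(flat, T))

# Flattening gives back the original list of numbers.
recovered = T.ravel()
print(np.array_equal(recovered, flat))
```

The stride values depend on the element size (`T.itemsize`), so they can differ between platforms and dtypes.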

Ndarray

https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.ndarray.html

A $d$-dimensional data structure containing $n_1 \times n_2 \times \cdots \times n_d$ numbers.

In [3]:
import numpy as np
np.random.rand(3)
Out[3]:
array([0.57100166, 0.07586713, 0.90062779])
In [5]:
np.random.rand(3,2) # note: dimensions are passed as separate arguments, not as a tuple (unlike many other NumPy functions)
Out[5]:
array([[0.49662458, 0.23158612],
       [0.94269648, 0.84962285],
       [0.68847847, 0.73942405]])
In [7]:
np.random.rand(3,2,4)
Out[7]:
array([[[0.09840227, 0.26310991, 0.58589047, 0.33265107],
        [0.89329091, 0.99117784, 0.49475921, 0.08047502]],

       [[0.09570909, 0.28074458, 0.15119848, 0.99682941],
        [0.46506878, 0.0333583 , 0.08488412, 0.26735574]],

       [[0.65350538, 0.43238726, 0.79292367, 0.58734204],
        [0.3019322 , 0.21985196, 0.41835067, 0.44614492]]])
  • The outermost level is a list of length 3.
  • Each of those 3 elements is a list of length 2.
  • Each of those 2 elements is a list of length 4.
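
The nesting described in these bullets can be checked directly. This is a small sketch with illustrative variable names, using `len` on successive levels of a freshly generated array:

```python
import numpy as np

A = np.random.rand(3, 2, 4)

outer = len(A)         # number of elements at the outermost level
middle = len(A[0])     # length of each of those elements
inner = len(A[0][0])   # length of each element one level further down

print(outer, middle, inner)

# Converting to plain Python lists keeps the same nested structure.
nested = A.tolist()
print(len(nested), len(nested[0]), len(nested[0][0]))
```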
In [9]:
T = np.random.rand(3,2,4,3,2)
T.shape
Out[9]:
(3, 2, 4, 3, 2)
In [10]:
T.ndim
Out[10]:
5
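
As a quick check of the $n_1 \times n_2 \times \cdots \times n_d$ count given earlier, the sketch below compares the product of the entries of `shape` with the `size` attribute for an array of the same shape as `T`. The name `total` is illustrative.

```python
import numpy as np

T = np.random.rand(3, 2, 4, 3, 2)

# Total number of stored values is the product of the dimension sizes.
total = int(np.prod(T.shape))

print(total)
print(T.size == total)
print(T.ndim == len(T.shape))
```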

Vectorization and Matricization
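
A minimal sketch of both operations using `reshape`. It assumes the common convention that vectorization stacks the columns of a matrix (column-major order, `order="F"` in NumPy). The matricization shown is only one simple choice; other conventions order the columns differently.

```python
import numpy as np

X = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

# Vectorization: stack the columns of X into one long vector.
vec_X = X.reshape(-1, order="F")

# Reshaping with the same ordering undoes the vectorization.
X_back = vec_X.reshape(2, 3, order="F")

# Matricization of a 3-way array: keep the first axis as rows and
# flatten the remaining axes into columns (NumPy's default row-major order).
T = np.arange(24).reshape(3, 2, 4)
T_mat = T.reshape(3, -1)

print(vec_X)
print(T_mat.shape)
```

In each case the numbers themselves are untouched; only the bookkeeping of how they are indexed changes.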

II. Kronecker Product "$\otimes$"
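
A small sketch using NumPy's `np.kron`. The Kronecker product $A \otimes B$ replaces each entry $a_{ij}$ of $A$ with the block $a_{ij} B$. The second half checks the standard identity $\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X)$, assuming column-stacking vectorization. The helper `vec` and the example matrices are illustrative.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

# Block structure: [[1*B, 2*B], [3*B, 4*B]]
K = np.kron(A, B)
print(K)

def vec(M):
    """Stack the columns of M into a single vector (column-major order)."""
    return M.reshape(-1, order="F")

# Check vec(A X B) = (B^T kron A) vec(X) on a random X.
rng = np.random.default_rng(0)
X = rng.random((2, 2))
lhs = vec(A @ X @ B)
rhs = np.kron(B.T, A) @ vec(X)
print(np.allclose(lhs, rhs))
```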
