BDS 761: Data Science and Machine Learning I
Topic 4: Python Math Basics
This topic:¶
- Basic math in python
- Vectors and Matrices in Python
Readings:
- data science tutor GPT: https://chatgpt.com/g/g-SSBhmwHol-introductory-data-science-tutor
- I2ALA Chapter 1
- Chapter zero of "Coding the Matrix"
0. Motivation¶
Hugging Face Pipelines: Base class implementing NLP operations. Pipeline workflow is defined as a sequence of the following operations:
- A tokenizer in charge of mapping raw textual input to token --> string (or other data format) processing
- A model to make predictions from the inputs --> linear algebra
- Some (optional) post processing for enhancing model’s output --> misc
https://huggingface.co/docs/transformers/en/main_classes/pipelines
Inside the Model¶
Inside the GPU¶
General Matrix Multiplication (GEMM) ~ $C = \alpha AB + \beta C$
https://docs.nvidia.com/deeplearning/performance/dl-performance-matrix-multiplication/index.html
Basic Abstract Mathematics¶
- Sets
- Functions
- Procedure
- Inverse function
Sets¶
- Elements
- Subsets
- Cardinality
- Reals
- Cartesian products of sets
Function¶
- Rule that assigns input to output
- Mapping between sets
- Image
- Pre-image
- Domain
- Co-domain
- Composition
- Probablity as a function
Procedure¶
- Description of a computation
- Also called "function"
- Versus "computational problem"
Inverse function¶
- Forward versus Backward Problem
- One-to-one
- Onto
- Of composition
Fields¶
A set with "+" and "x" operations (that work properly).
- Reals $\bf R$
- Complex numbers $\bf C$
- GF(2)
I. Python Math¶
- Data structures
- Programming
Numerical precision¶
Bytes/words, ints, floats, doubles... how big are they?
IEEE 745 standard numerical arithmetic
nan, inf, eps
"Hard zeros"
Basic Python Data Structures ("collections")¶
- Sets
- Lists
- Tuples
- Dictionaries
Sets {item1, item2, item3}¶
- Order is not necessarily maintained
- Each item occurs at most once
- Mutable
Create a set with a repeated value in and print it
S={1,2,2,3}
print(S)
{1, 2, 3}
Compute the cardinality of your set
print(len(S))
3
T = S.copy()
T.add(6)
print(T)
print(S)
{1, 2, 3, 6} {1, 2, 3}
Sets ...continued¶
Other handy functions
- sum()
- logical test for membership
- add(), remove(), update()
- copy() -- NOTE: Python "=" binds LHS to RHS by reference, it is not a copy.
Q: Is a set a good data structure to make vectors and do linear algebra?
Lists [item1, item2, item3]¶
- Sequence of values - order & repeats ok
- Mutable
- Concatenate lists with "+"
- Index with mylist[index] - note zero based
Convert your set to a list, print list, its length, it's first and last elements
L = list(S)
print(L)
print("length =",len(L))
print(L[0],L[2])
[1, 2, 3] length = 3 1 3
[1,2,3,4,5][3]
4
Slices - mylist[start:end:step]¶
Matlabesque way to select sub-sequences from list
- If first index is zero, can omit - mylist[:end:step]
- If last index is length-1, can omit - mylist[::step]
- If step is 1, can omit mylist[start:end]
Make slices for even and odd indexed members of your list.
Q: Is this a good data structure to make vectors and do linear algebra?
Tuples (1,2,3)¶
- Immutable version of list basically
(1,2,3)[1] # note can access directly
2
- Handy for packing info
- sometimes omit the parenthesis
1,2
(1, 2)
The constructors¶
- convert other collections (and iterators) into collections
L = list(S)
S = set(L)
T = tuple(S)
Dictionaries {key1:value1,key2:value2}¶
- key:value ~ word:definition
- access like a generalized list
mydict = {0:100,1:200,3:300}
mydict[0]
100
mydict = {'X':'hello','Y':'goodbye'}
mydict['X']
'hello'
mydict.items()
dict_items([('X', 'hello'), ('Y', 'goodbye')])
Basic Python Programming¶
- Comprehensions
- Loops
- Whitespace
- Conditional statements
- Functions
Comprehensions¶
- Use iterators for generating sets/lists/tuples
- Similar to mathematical set notation
- popular in "Coding the Matrix"
L=list(i for i in {1,2,3})
print(L)
[1, 2, 3]
dict((mydict[key],key) for key in mydict)
{'hello': 'X', 'goodbye': 'Y'}
Iterate over list and produce list squared
L=[1,1,2,3,4,4,5]
LL = list([x,y] for x in [1,2,3] for y in [1,2,3])
LL
[[1, 1], [1, 2], [1, 3], [2, 1], [2, 2], [2, 3], [3, 1], [3, 2], [3, 3]]
Loops¶
for x in {1,2,3}:
print(x)
print(x*x)
1 1 2 4 3 9
Whitespace¶
- used for grouping code
- same indent = same group
- nested code = increased indent
- must be used properly
for x in {1,2,3}:
print(x)
for y in {1,2,3}:
print('...',x,y)
1 ... 1 1 ... 1 2 ... 1 3 2 ... 2 1 ... 2 2 ... 2 3 3 ... 3 1 ... 3 2 ... 3 3
x=0
while x<5:
print(x)
x=x+1
0 1 2 3 4
Range Function¶
list(range(10,0,-1))
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
for x in range(0,10):
print(x)
0 1 2 3 4 5 6 7 8 9
Conditional Statements¶
if not (2+2 == 4):
print('tree')
else:
print('hi')
hi
Defining functions¶
def myfunc(x):
print(x)
return x**2
myfunc(4)
4
16
III. Vectors & Matrices¶
Vector - Multiple numbers "drawn from a field"¶
A $k$-dimensional vector $y$ is an ordered collection of $k$ numbers $y_1 , y_2 , . . . , y_k$ written as $\textbf{y} = (y_1,y_2,...,y_k)$.
The numbers $y_j$, for $j = 1,2,...,k$, are called the $\textbf{components}$ of the vector $y$.
Note boldface for vectors and italic for scalars, a popular convention.
It can be written either as rows (math, engineering) or columns (CS, statistics), and we won't worry about this.
$$\textbf{y} = \begin{bmatrix} y_1 \\y_2 \\ \vdots \\ y_k \end{bmatrix} = [y_1,y_2,...,y_k]^{T} $$
(Swapping rows and columns = transposing. 1st column = top. 1st row = left.)
Vector Examples¶
string?
Coordinates
Direction
GPS entry
Collection of clinical study data for single individual ("sample" in machine learning).
An image
A music or voice signal
A block of binary data
Consider the numbers we want to use for each.
Drawn from a Field?¶
Recall a field is a set. So a vector has each member from the set.
Vector of real numbers $\mathbf v \in \mathbf R^n$
Vector of binary numbers $\mathbf b \in GF(2)^n$
What is the notation for our previous examples?
Data Structure Options¶
- Python Sets
- Python Lists
- Python Tuples
- Python Dictionaries
Consider how you would use each of these to make a vector of coordinates.
Recall the other kinds of info we make into vectors. Does the data structure work for all of them?
Vector Addition $\mathbf a + \mathbf b$¶
Addition of two k-dimensional vectors $\textbf{x} = (x_1, x_2, ... , x_k)$ and $\textbf{y} = (y_1,y_2,...,y_k)$ is defined as a new vector $\textbf{z} = (z_1,z_2,...,z_k)$, denoted $\textbf{z} = \textbf{x}+\textbf{y}$,with components given by $z_j = x_j+y_j$.
Vector Addition $\mathbf a + \mathbf b$¶
Addition of corresponding entries.
Geometrical perspective.
Consider in terms of applications.
Scalar-Vector multiplication $\alpha \mathbf y$¶
Scalar multiplication of a vector $\textbf{y} = (y_1, y_2, . . . , y_k)$ and a scalar α is defined to be a new vector $\textbf{z} = (z_1,z_2,...,z_k)$, written $\textbf{z} = \alpha\ \textbf{y}$ or $\textbf{z} = \textbf{y} \alpha$, whose components are given by $z_j = \alpha y_j$.
Code these operations in Python¶
Consider vectors describing distance travelled in 2D.
Make two functions:
multiply a vector of data by a scalar (i.e. "scale it")
add two vectors.
Make it work for any number of dimensions
The Dot Product $\mathbf x \cdot \mathbf y$¶
If we have two vectors: ${\bf{x}} = (x_1, x_2, ... , x_k)$ and ${\bf{y}} = (y_1,y_2,...,y_k)$
The dot product is written: ${\bf{x}} \cdot {\bf{y}} = x_{1}y_{1}+x_{2}y_{2}+\cdots+x_{k}y_{k}$
If $\mathbf{x} \cdot \mathbf{y} = 0$ then $x$ and $y$ are orthogonal
What is $\mathbf{x} \cdot \mathbf{x}$?
Code the dot product in Python¶
Choose appropriate data structures for your vectors
Make it work for any number of dimensions
Properties of Dot Product¶
- Commutative
- Homogeneous
- Distributes over vector addition
Test them with your code and example vectors
Homogeneity: $(\alpha \mathbf x) \cdot \mathbf y = \alpha (\mathbf x \cdot \mathbf y)$¶
Consider what this means for using the dot product to measure similarity.
Use your functions to implement this both ways.
Classes in Python¶
Clunky but handy to encapsulate related code
Note odd way that constructor is defined
Also need to pass "self" argument to every function
class Complex:
def __init__(self, realpart, imagpart):
self.r = realpart
self.i = imagpart
x = Complex(3.0, -4.5)
x.r, x.i
(3.0, -4.5)
Python Lab¶
Make a class for "dense" vectors that contains all your functions thus far.
Add methods for handling sparse vectors:
- Vector addition
- Scalar-vector multiplication
- Dot product
Compare speed to large dense vectors for different levels of density.
Advanced: compare to sparse vectors in numpy.
Matrices¶
A matrix $\mathbf A$ is a rectangular array of numbers, of size $m \times n$ as follows:
$\mathbf A = \begin{bmatrix} A_{1,1} & A_{1,2} & A_{1,3} & \dots & A_{1,n} \\ A_{2,1} & A_{2,2} & A_{2,3} & \dots & A_{2,n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ A_{m,1} & A_{m,2} & A_{m,3} & \dots & A_{m,n} \end{bmatrix}$
Where the numbers $A_{ij}$ are called the elements of the matrix. We describe matrices as wide if $n > m$ and tall if $n < m$. They are square iff $n = m$.
NOTE: naming convention for scalars vs. vectors vs. matrices.
Scalar Multiplication¶
Scalar multiplication of a matrix $\textit{A}$ and a scalar α is defined to be a new matrix $\textit{B}$, written $\textit{B} = \alpha\ \textit{A}$ or $\textit{B} = \textit{A} \alpha$, whose components are given by $b_{ij} = \alpha a_{ij}$.
Matrix Addition¶
Addition of two $m \times n$ -dimensional matrices $\textit{A}$ and $\textit{B}$ is defined as a new matrix $\textit{C}$, written $\textit{C} = \textit{A} + \textit{B}$, whose components $c_{ij}$ are given by addition of each component of the two matrices, $c_{ij} = a_{ij}+b_{ij}$.
Matrix Equality¶
Two matrices are equal when they share the same dimensions and all elements are equal. I.e.: $a_{ij}=b_{ij}$ for all $i \in I$ and $j \in J$.
Properties of Matrices¶
For three matrices $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{C}$ we have the following properties
Commutative Law of Addition: $\mathbf{A} + \mathbf{B} = \mathbf{B} + \mathbf{A}$
Associative Law of Addition: $(\mathbf{A} + \mathbf{B}) + \mathbf{C} = \mathbf{A} + (\mathbf{B} + \mathbf{C})$
Associative Law of Multiplication: $\mathbf{A}(\mathbf{B}\mathbf{C}) = (\mathbf{A}\mathbf{B})\mathbf{C}$
Distributive Law: $\mathbf{A}(\mathbf{B} + \mathbf{C}) = \mathbf{A}\mathbf{B} + \mathbf{A}\mathbf{C}$
Identity: There is the matrix equivalent of one. We define a matrix $\mathbf{I_n}$ of dimension $n \times n$ such that the elements of $\mathbf{I_n}$ are all zero, except the diagonal elements $i=j$; where $I_{i,i} = 1$
Zero: We define a matrix $\mathbf 0$ of $m \times n$ dimension as the matrix where all components $(\mathbf 0)_{i,j}$ are 0
Identity Matrix¶
$I_3 = \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0 \\ 0 & 0 & 1\\ \end{bmatrix}$
Here we can write $\textit{I}\textit{B} = \textit{B}\textit{I} = \textit{B}$ or $\textit{I}\textit{I} = \textit{I}$
Again, it is important to reiterate that matrices are not in general commutative with respect to multiplication. That is to say that the left and right products of matrices are, in general different.
$AB \neq BA$
Matrix Transpose¶
The transpose of a matrix $\mathbf A$ is formed by interchanging the rows and columns of $\mathbf A$. That is
$(\mathbf A)^T_{ij} = A_{ji}$
Example 2:¶
$\mathbf B = \begin{bmatrix} 1 & 2 \\ 0 & -3 \\ 3 & 1 \\ \end{bmatrix}$
$\mathbf{B}^{T} = \begin{bmatrix} 1 & 0 & 3 \\ 2 & -3 & 1 \\ \end{bmatrix}$
BONUS:¶
Show that $(\mathbf{A}\mathbf{B})^{T} = \mathbf{B}^{T}\mathbf{A}^{T}$.
Hint: the $ij$th element on both sides is $\sum_{k}A_{jk}A_{ki}$
Matrix-Vector Multiplication¶
Two perspectives:
Linear combination of columns
Dot product of vector with rows of matrix
$\begin{bmatrix} 2 & -6 \\ -1 & 4\\ \end{bmatrix} \begin{bmatrix} 2 \\ -1 \\ \end{bmatrix} = ?$
Test both ways out.
Lab: Matrix-vector multiplication¶
We just defined two different procedures for computing the matrix-vector product. Let us write them in Python.
Suppose we defined vectors as lists, and a matrix as a list of vectors.
- Write code to compute the matrix vector product assuming $\mathbf A$ is a list of rows
- Write code to compute the matrix vector product assuming $\mathbf A$ is a list of columns
Matrix Multiplication¶
Multiplication of an $m \times n$ -dimensional matrices $\textit{A}$ and a $n \times k$ matrix $\textit{B}$ is defined as a new matrix $\textit{C}$, written $\textit{C} = \textit{A}\textit{B}$, whose elements $C_{ij}$ are
$$ C_{i,j} = \sum_{l=1}^n A_{i,l}B_{l,j} $$
This can be memorized as row by column multiplication, where the value of each cell in the result is achieved by multiplying each element in a given row $i$ of the left matrix with its corresponding element in the column $j$ of the right matrix and adding the result of each operation together. This sum is the value of the new the new component $c_{ij}$.
Note that the product of matrices and vectors is a special case, under the assumption that the vector is oriented correctly and is of correct dimension (same rules as a matrix). In this case, we simply treat the vector as a $n \times 1$ or $1 \times n$ matrix.
Also note that $\textit{B}\textit{A} \neq \textit{A}\textit{B}$ in general.
Example 1:¶
$\textit{C} = \textit{A}\textit{B} = \begin{bmatrix} 1 & 2 \\ 0 & 1 \\ \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 1 & 4 \\ \end{bmatrix} = \begin{bmatrix} 4 & 8 \\ 1 & 4 \\ \end{bmatrix}$
Example 2:¶
$\begin{bmatrix} 1 & 2 \\ 0 & -3 \\ 3 & 1 \\ \end{bmatrix} \begin{bmatrix} 2 & 6 & -3 \\ 1 & 4 & 0 \\ \end{bmatrix} = ?$
Example 2:¶
$\begin{bmatrix} 1 & 2 \\ 0 & -3 \\ 3 & 1 \\ \end{bmatrix} \begin{bmatrix} 2 & 6 & -3 \\ 1 & 4 & 0 \\ \end{bmatrix} = \begin{bmatrix} 4 & 14 & -3\\ -3 & -12 & 0 \\ 7 & 22 & -9\\ \end{bmatrix}$
Example 3:¶
$\begin{bmatrix} 2 & 6 & -3 \\ 1 & 4 & 0 \\ \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 0 & -3 \\ 3 & 1 \\ \end{bmatrix} = ?$
Example 3:¶
$\begin{bmatrix} 2 & 6 & -3 \\ 1 & 4 & 0 \\ \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 0 & -3 \\ 3 & 1 \\ \end{bmatrix} = \begin{bmatrix} -7 & -17\\ 1 & -10 \\ \end{bmatrix}$
Example 4:¶
$\begin{bmatrix} 2 & -6 \\ -1 & 4\\ \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ \end{bmatrix} = ?$
Example 4:¶
$\begin{bmatrix} 2 & -6 \\ -1 & 4\\ \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ \end{bmatrix} = \begin{bmatrix} 2\\ -1\\ \end{bmatrix}$
QUIZ:¶
Can you take the product of $\begin{bmatrix} 2 & -6 \\ -1 & 4\\ \end{bmatrix}$ and $\begin{bmatrix} 12 & 46 \\ \end{bmatrix}$ ?
If question seems vague, list all possible ways to address this question.
Lab: Matrix-matrix multiplication¶
There are yet more ways to programmatically implement matrix multiplication
Let us focus on just the two which are direct extensions of the preview matrix-vector multiplication methods
Use your dense vector functions to perform vector matrix multiplication - both by columns and by rows