Face Recognition using Principal Component Analysis
Last Updated on October 30, 2021
By Nick Cotes
Recent advances in machine learning have made face recognition a far less difficult problem than it once was. In the past, however, researchers made various attempts and developed various techniques to make computers capable of identifying people. One early attempt with moderate success is eigenface, which is based on linear algebra techniques.
In this tutorial, we will see how to build a primitive face recognition system with simple linear algebra techniques such as principal component analysis.
After completing this tutorial, you will know:
The development of the eigenface technique
How to use principal component analysis to extract characteristic images from an image dataset
How to express any image as a weighted sum of the characteristic images
How to compare the similarity of images from the weight of principal components
Let’s get started.
Face Recognition using Principal Component Analysis Photo by Rach Teo, some rights reserved.
Tutorial overview
This tutorial is divided into 3 parts; they are:
Image and Face Recognition
Overview of Eigenface
Implementing Eigenface
Image and Face Recognition
In a computer, pictures are represented as a matrix of pixels, with each pixel holding a particular color encoded as numerical values. It is natural to ask whether a computer can read a picture and understand what it shows, and if so, whether we can describe the logic using matrix mathematics. To be less ambitious, people limited the scope of this problem to identifying human faces. An early approach to face recognition was to treat the matrix as high-dimensional data, infer a lower-dimensional information vector from it, and then try to recognize the person in the lower dimension. This was necessary in the early days because computers were not powerful and memory was very limited. However, by exploring how to compress an image to a much smaller size, researchers developed a way to tell whether two images portray the same human face even if the pictures are not identical.
In 1987, a paper by Sirovich and Kirby considered the idea that every picture of a human face can be expressed as a weighted sum of a few “key pictures”. Sirovich and Kirby called these key pictures “eigenpictures”, as they are the eigenvectors of the covariance matrix of the mean-subtracted pictures of human faces. In effect, the paper provided the algorithm of principal component analysis for a face picture dataset in matrix form, and the weights used in the weighted sum correspond to the projection of the face picture onto each eigenpicture.
In 1991, a paper by Turk and Pentland coined the term “eigenface”. They built on the idea of Sirovich and Kirby and used the weights and eigenpictures as characteristic features to recognize faces. The paper laid out a memory-efficient way to compute the eigenpictures. It also proposed an algorithm for how a face recognition system can operate, including how to update the system to include new faces and how to combine it with a video capture system. The same paper also pointed out that the concept of eigenface can help reconstruct a partially obstructed picture.
Overview of Eigenface
Before we jump into the code, let’s outline the steps in using eigenface for face recognition, and point out how some simple linear algebra techniques can help with the task.
Assume we have a set of pictures of human faces, all with the same pixel dimensions (e.g., all are r×c grayscale images). If we have M different pictures and vectorize each picture into L=r×c pixels, we can represent the entire dataset as an L×M matrix (let’s call it matrix $A$), where each element of the matrix is a pixel’s grayscale value.
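As a small illustration of this vectorization step, the sketch below builds the L×M matrix from a stack of toy pictures. The sizes and the random pixel values are assumptions made purely for illustration, not part of the dataset used later.

```python
import numpy as np

# Hypothetical toy data: M = 5 grayscale "pictures", each r x c = 4 x 3 pixels
rng = np.random.default_rng(0)
r, c, M = 4, 3, 5
pictures = rng.integers(0, 256, size=(M, r, c), dtype=np.uint8)

# Vectorize each picture into L = r*c pixels and stack the vectors as columns
L = r * c
A = np.stack([p.flatten() for p in pictures], axis=1).astype(float)
print(A.shape)  # (12, 5)

# Column j of A, reshaped to r x c, gives back picture j
restored = A[:, 2].reshape(r, c)
```

Note that this follows the column-per-picture convention of the text. The implementation later in this tutorial uses the transpose, one picture per row, because that is the layout scikit-learn expects.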
Recall that principal component analysis (PCA) can be applied to any matrix, and the result is a set of vectors called the principal components. Each principal component has the same length as a column of the matrix. The principal components of a matrix are orthogonal to each other, meaning that the dot product of any two of them is zero. Therefore the principal components span a vector space in which each mean-subtracted column of the matrix can be represented as a linear combination (i.e., weighted sum) of the principal components.
This is done by first taking $C=A - a$, where $a$ is the mean vector of the matrix $A$. That is, $C$ is the matrix obtained by subtracting the mean vector $a$ from each column of $A$. Then the covariance matrix is
$$S = C\cdot C^T$$
from which we find the eigenvectors and eigenvalues. The principal components are these eigenvectors in decreasing order of their eigenvalues. Because $S$ is an L×L matrix, which can be very large, we may instead find the eigenvectors of the M×M matrix $C^T\cdot C$: an eigenvector $v$ of $C^T\cdot C$ can be transformed into an eigenvector $u$ of $C\cdot C^T$ by $u=C\cdot v$, except that we usually prefer to write $u$ as a normalized vector (i.e., the norm of $u$ is 1).
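The relation between the two eigenproblems can be checked numerically. The following is a minimal sketch using numpy on a random toy matrix (the sizes and data are assumptions for illustration); it decomposes the small M×M matrix and verifies that the mapped vector is an eigenvector of the large L×L matrix with the same eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(42)
L, M = 200, 6                      # many pixels, few pictures (toy sizes)
A = rng.normal(size=(L, M))

# Mean-subtract: a is the mean picture, C holds the centered columns
a = A.mean(axis=1, keepdims=True)
C = A - a

# Eigen-decomposition of the small M x M matrix C^T C
eigvals, V = np.linalg.eigh(C.T @ C)        # eigh returns ascending order
order = np.argsort(eigvals)[::-1]           # re-sort to descending order
eigvals, V = eigvals[order], V[:, order]

# Map the leading eigenvector back to pixel space and normalize: u = C v
v = V[:, 0]
u = C @ v
u /= np.linalg.norm(u)

# u should be an eigenvector of S = C C^T with the same eigenvalue
S = C @ C.T
print(np.allclose(S @ u, eigvals[0] * u))
```

The saving comes from never having to decompose $S$ itself: here the decomposition is done on a 6×6 matrix rather than a 200×200 one, and $S$ is formed only to verify the result.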
The physical meaning of the principal component vectors of $A$, or equivalently the eigenvectors of $S=C\cdot C^T$, is that they are the key directions from which we can construct the columns of matrix $A$. The relative importance of the principal component vectors can be inferred from the corresponding eigenvalues: the greater the eigenvalue, the more useful the principal component vector (i.e., the more information about $A$ it holds). Hence we can keep only the first K principal component vectors. If matrix $A$ is a dataset of face pictures, the first K principal component vectors are the K most important “face pictures”. We call them the eigenfaces.
For any given face picture, we can project its mean-subtracted version onto an eigenface using the vector dot product. The result measures how closely this face picture is related to the eigenface. If the face picture is totally unrelated to the eigenface, we would expect the result to be zero. For K eigenfaces, we can compute K dot products for any given face picture. We can present the results as the weights of this face picture with respect to the eigenfaces. The weights are usually presented as a vector.
Conversely, if we have a weight vector, we can add up the eigenfaces, each scaled by its weight, and reconstruct a new face. Let’s denote the eigenfaces as matrix $F$, which is an L×K matrix, and the weight vector $w$ as a column vector of length K. Then for any $w$ we can construct the picture of a face as
$$z=F\cdot w$$
where $z$ is a column vector of length L. Because we are using only the top K principal component vectors, we should expect the resulting face picture to be distorted but to retain some facial characteristics.
Since the eigenface matrix is constant for the dataset, a varying weight vector $w$ means a varying face picture. Therefore we can expect pictures of the same person to produce similar weight vectors, even if the pictures are not identical. As a result, we may use the distance between two weight vectors (such as the L2-norm) as a metric of how closely two pictures resemble each other.
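The projection and reconstruction described above can be sketched with numpy on toy data. The sizes and random values below are assumptions for illustration. The sketch obtains the eigenvectors of $C\cdot C^T$ from the singular value decomposition of $C$, and it adds the mean back to $z$ to return to pixel values, since the weights were computed from the mean-subtracted picture.

```python
import numpy as np

rng = np.random.default_rng(1)
L, M, K = 64, 10, 4                # toy sizes: pixels, pictures, eigenfaces kept
A = rng.normal(size=(L, M))
a = A.mean(axis=1, keepdims=True)  # mean picture
C = A - a

# Left singular vectors of C are the eigenvectors of C C^T, most important first
U, s, _ = np.linalg.svd(C, full_matrices=False)
F = U[:, :K]                       # L x K matrix playing the role of eigenfaces

# Weights: dot product of the mean-subtracted picture with each eigenface
x = A[:, [0]]                      # one picture as an L x 1 column
w = F.T @ (x - a)                  # K x 1 weight vector

# Reconstruction: z = F w, then add the mean to get back to pixel values
z = F @ w
x_approx = z + a
err_K = np.linalg.norm(x - x_approx)

# Keeping every non-trivial component (M - 1 of them after centering)
# reproduces the picture up to rounding error
F_all = U[:, :M - 1]
x_full = F_all @ (F_all.T @ (x - a)) + a
err_all = np.linalg.norm(x - x_full)
print(err_K, err_all)
```

The first error is the distortion caused by keeping only K components; the second shows that nothing is lost when all components are kept.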
Implementing Eigenface
Now we attempt to implement the idea of eigenface with numpy and scikit-learn. We will also make use of OpenCV to read picture files. You may need to install the relevant package with the pip command:
pip install opencv-python
The dataset we use is the ORL Database of Faces, which is quite old, but we can download it from Kaggle:
The file is a zip file of around 4MB. It has pictures of 40 persons, with 10 pictures per person, for a total of 400 pictures. In the following, we assume the file has been downloaded to the local directory and named attface.zip.
We may extract the zip file to get the pictures, or we can also make use of the zipfile package in Python to read the contents from the zip file directly:
import cv2
import zipfile
import numpy as np

faces = {}
with zipfile.ZipFile("attface.zip") as facezip:
    for filename in facezip.namelist():
        if not filename.endswith(".pgm"):
            continue # not a face picture
        with facezip.open(filename) as image:
            # If we extracted files from zip, we can use cv2.imread(filename) instead
            faces[filename] = cv2.imdecode(np.frombuffer(image.read(), np.uint8), cv2.IMREAD_GRAYSCALE)
The above reads every PGM file in the zip. PGM is a grayscale image file format. We extract each PGM file into a byte string through image.read() and convert it into a numpy array of bytes. Then we use OpenCV to decode the byte string into an array of pixels using cv2.imdecode(). The file format is detected automatically by OpenCV. We save each picture into a Python dictionary faces for later use.
Here we can take a look at these pictures of human faces using matplotlib:
To illustrate the capability of using eigenface for recognition, we want to hold out some of the pictures before we generate our eigenfaces. We hold out all the pictures of one person as well as one picture for another person as our test set. The remaining pictures are vectorized and converted into a 2D numpy array:
...
# Take classes 1-39 for eigenfaces, keep entire class 40 and
# image 10 of class 39 as out-of-sample test
facematrix = []
facelabel = []
for key, val in faces.items():
    if key.startswith("s40/"):
        continue # this is our test set
    if key == "s39/10.pgm":
        continue # this is our test set
    facematrix.append(val.flatten())
    facelabel.append(key.split("/")[0])

# Create facematrix as (n_samples,n_pixels) matrix
facematrix = np.array(facematrix)
Now we can perform principal component analysis on this dataset matrix. Instead of computing the PCA step by step, we make use of the PCA function in scikit-learn, from which we can easily retrieve all the results we need:
...
# Apply PCA to extract eigenfaces
from sklearn.decomposition import PCA

pca = PCA().fit(facematrix)
We can identify how significant each principal component is from the explained variance ratio:
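With a fitted scikit-learn PCA object, these ratios are available as the attribute pca.explained_variance_ratio_. The sketch below shows how such ratios arise and how they can guide the choice of K. It is written with numpy only, on a synthetic matrix standing in for facematrix; the data, the 95% target, and the variable names are assumptions for illustration, and the values should agree with what scikit-learn reports for the same input.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for facematrix: 30 samples x 20 "pixels", with most
# of the variation concentrated in 3 hidden directions plus a little noise
latent = rng.normal(size=(30, 3)) * np.array([10.0, 5.0, 2.0])
X = latent @ rng.normal(size=(3, 20)) + 0.1 * rng.normal(size=(30, 20))

# Singular values of the centered data give the variance along each component
Xc = X - X.mean(axis=0)
s = np.linalg.svd(Xc, compute_uv=False)
explained_variance_ratio = s**2 / np.sum(s**2)

# Smallest K whose components together explain at least 95% of the variance
cumulative = np.cumsum(explained_variance_ratio)
K = int(np.searchsorted(cumulative, 0.95) + 1)
print(K, cumulative[K - 1])
```

A cumulative-variance target like this is one common way to pick the number of components; the threshold itself is a judgment call.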
Alternatively, we can simply pick a moderate number, say 50, and take that many principal component vectors as the eigenfaces. For convenience, we extract the eigenfaces from the PCA result and store them as a numpy array. Note that the eigenfaces are stored as rows of a matrix. We can convert each one back to 2D if we want to display it. Below, we show some of the eigenfaces to see what they look like:
...
# Take the first K principal components as eigenfaces
n_components = 50
eigenfaces = pca.components_[:n_components]
From this picture, we can see that eigenfaces are blurry faces, but each eigenface indeed holds some facial characteristics that can be used to build a picture.
Since our goal is to build a face recognition system, we first calculate the weight vector for each input picture:
...
# Generate weights as a KxN matrix where K is the number of eigenfaces and N the number of samples
weights = eigenfaces @ (facematrix - pca.mean_).T
The above code uses matrix multiplication to replace loops. It is roughly equivalent to the following:
...
weights = []
for i in range(facematrix.shape[0]):
    weight = []
    for j in range(n_components):
        w = eigenfaces[j] @ (facematrix[i] - pca.mean_)
        weight.append(w)
    weights.append(weight)
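To see in what sense the two forms are only roughly equivalent, the check below runs both on small random arrays (hypothetical stand-ins for facematrix, pca.mean_, and eigenfaces). The values agree, but the loop builds one row per sample, so its result is the transpose of the K×N matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-ins: N=6 samples of 8 pixels each, K=3 eigenfaces
facematrix = rng.normal(size=(6, 8))
mean = facematrix.mean(axis=0)          # plays the role of pca.mean_
eigenfaces = rng.normal(size=(3, 8))    # any K x n_pixels matrix works for this check
n_components = eigenfaces.shape[0]

# Matrix form: K x N
weights_matrix = eigenfaces @ (facematrix - mean).T

# Loop form: a list of N rows with K entries each
weights_loop = []
for i in range(facematrix.shape[0]):
    weight = []
    for j in range(n_components):
        weight.append(eigenfaces[j] @ (facematrix[i] - mean))
    weights_loop.append(weight)
weights_loop = np.array(weights_loop)   # N x K

print(weights_matrix.shape, weights_loop.shape)
print(np.allclose(weights_matrix, weights_loop.T))
```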
At this point, our face recognition system is complete. We used pictures of 39 persons to build our eigenfaces. We now use the test picture that belongs to one of these 39 persons (the one held out from the matrix that trained the PCA model) to see if the system can successfully recognize the face:
Above, we first subtract from the vectorized image the average vector retrieved from the PCA result. Then we compute the projection of this mean-subtracted vector onto each eigenface and take it as the weight for this picture. Afterwards, we compare the weight vector of the picture in question to that of each existing picture and find the one with the smallest L2 distance as the best match. We can see that it indeed successfully finds the closest match in the same class:
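A minimal, self-contained sketch of this matching step follows. It runs on small synthetic arrays rather than the face dataset, so the data, the helper name best_match, and the use of numpy's SVD in place of the fitted PCA object are all assumptions for illustration. On the real data, the same steps would use faces["s39/10.pgm"], pca.mean_, eigenfaces, weights, and facelabel from the earlier listings.

```python
import numpy as np

def best_match(query, eigenfaces, mean, weights, labels):
    """Return (label, distance) of the stored picture closest to query.

    query:      flattened picture, shape (n_pixels,)
    eigenfaces: shape (K, n_pixels)
    mean:       shape (n_pixels,)
    weights:    shape (K, N), one column per stored picture
    labels:     list of N labels
    """
    query_weight = eigenfaces @ (query - mean).reshape(-1, 1)   # K x 1
    distances = np.linalg.norm(weights - query_weight, axis=0)  # length N
    idx = int(np.argmin(distances))
    return labels[idx], float(distances[idx])

# Hypothetical toy data: two "persons", each a base pattern plus small noise
rng = np.random.default_rng(0)
base = {"s1": rng.normal(size=16), "s2": rng.normal(size=16)}
labels = ["s1", "s1", "s1", "s2", "s2", "s2"]
facematrix = np.array([base[k] + 0.05 * rng.normal(size=16) for k in labels])

# Principal components via SVD of the centered data (rows of Vt)
mean = facematrix.mean(axis=0)
_, _, Vt = np.linalg.svd(facematrix - mean, full_matrices=False)
eigenfaces = Vt[:3]                                   # top 3 components
weights = eigenfaces @ (facematrix - mean).T          # K x N

# A new, unseen picture of person "s2"
query = base["s2"] + 0.05 * rng.normal(size=16)
label, distance = best_match(query, eigenfaces, mean, weights, labels)
print("Best match %s with Euclidean distance %f" % (label, distance))
```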
Best match s39 with Euclidean distance 1559.997137
and we can visualize the result by comparing the closest match side by side:
We can try again with a picture of the 40th person, whom we held out from the PCA. We can never get it correct because this is a new person to our model. However, we want to see how wrong the result can be, as well as the value of the distance metric:
We can see that its best match has a greater L2 distance:
Best match s5 with Euclidean distance 2690.209330
but we can see that the mistaken result has some resemblance to the picture in question:
In the paper by Turk and Pentland, it is suggested that we set a threshold for the L2 distance. If the best match’s distance is less than the threshold, we consider the face recognized as that person. If the distance is above the threshold, we claim the picture is of someone we have never seen, even if a best match can be found numerically. In this case, we may consider including this as a new person in our model by remembering the new weight vector.
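The thresholding idea can be sketched as follows. The function names recognize and enroll, the toy weight vectors, and the threshold value are all hypothetical; a usable threshold has to be tuned on real data.

```python
import numpy as np

def recognize(query_weight, weights, labels, threshold):
    """Return the best-matching label, or None if no stored picture is close enough.

    query_weight: shape (K,), weights: shape (K, N), labels: list of N labels.
    """
    distances = np.linalg.norm(weights - query_weight.reshape(-1, 1), axis=0)
    idx = int(np.argmin(distances))
    if distances[idx] < threshold:
        return labels[idx]
    return None

def enroll(query_weight, label, weights, labels):
    """Remember a new person by appending the weight vector as a new column."""
    weights = np.hstack([weights, query_weight.reshape(-1, 1)])
    return weights, labels + [label]

# Hypothetical weight vectors with K=2 for two known persons
weights = np.array([[0.0, 0.1, 5.0],
                    [0.0, 0.2, 5.0]])
labels = ["s1", "s1", "s2"]
threshold = 1.0   # assumed value for this toy example only

known = np.array([0.1, 0.1])
stranger = np.array([-6.0, 7.0])
print(recognize(known, weights, labels, threshold))     # s1
print(recognize(stranger, weights, labels, threshold))  # None

# Add the stranger as a new person; the same picture is then recognized
weights, labels = enroll(stranger, "s3", weights, labels)
print(recognize(stranger, weights, labels, threshold))  # s3
```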
We can go one step further and generate new faces using eigenfaces, although the result is not very realistic. Below, we generate one using a random weight vector and show it side by side with the “average face”:
How good is eigenface? It performs surprisingly well given the simplicity of the model. However, Turk and Pentland tested it under various conditions and found that its accuracy was “an average of 96% with light variation, 85% with orientation variation, and 64% with size variation.” Hence it may not be very practical as a face recognition system. After all, the picture as a matrix will be distorted a lot in the principal component domain after zooming in or out. Therefore the modern alternative is to use convolutional neural networks, which are more tolerant of various transformations.
Putting everything together, the following is the complete code:
import zipfile
import cv2
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Read face image from zip file on the fly
faces = {}
with zipfile.ZipFile("attface.zip") as facezip:
    for filename in facezip.namelist():
        if not filename.endswith(".pgm"):
            continue # not a face picture
        with facezip.open(filename) as image:
            # If we extracted files from zip, we can use cv2.imread(filename) instead
            faces[filename] = cv2.imdecode(np.frombuffer(image.read(), np.uint8), cv2.IMREAD_GRAYSCALE)