Last Updated on October 29, 2021

Singular value decomposition is a very popular linear algebra technique that breaks down a matrix into the product of a few smaller matrices. In fact, it is a technique that has many uses. One example is that we can use SVD to discover relationships between items. A recommender system can be built easily from this.

In this tutorial, we will see how a recommender system can be built using nothing more than linear algebra techniques.

After completing this tutorial, you will know:

  • What singular value decomposition does to a matrix
  • How to interpret the result of singular value decomposition
  • What data a simple recommender system requires, and how we can use SVD to analyze it
  • How we can use the result from SVD to make recommendations

Let’s get started.

Using Singular Value Decomposition to Build a Recommender System

Photo by Roberto Arias, some rights reserved.

Tutorial overview

This tutorial is divided into 3 parts; they are:

  • Review of Singular Value Decomposition
  • The Meaning of Singular Value Decomposition in Recommender System
  • Implementing a Recommender System

Review of Singular Value Decomposition

Just as a number such as 24 can be decomposed into factors 24=2×3×4, a matrix can also be expressed as a multiplication of some other matrices. Because matrices are arrays of numbers, they have their own rules of multiplication. Consequently, they also have different ways of factorization, known as decomposition. QR decomposition and LU decomposition are common examples. Another example is singular value decomposition, which places no restriction on the shape or properties of the matrix to be decomposed.

Singular value decomposition assumes a matrix $M$ (for example, an $m\times n$ matrix) is decomposed as
$$M = U \cdot \Sigma \cdot V^T$$
where $U$ is an $m\times m$ matrix, $\Sigma$ is an $m\times n$ diagonal matrix, and $V^T$ is an $n\times n$ matrix. The diagonal matrix $\Sigma$ is an interesting one: it can be non-square, but only the entries on its diagonal can be non-zero. The matrices $U$ and $V^T$ are orthonormal matrices, meaning the columns of $U$ and the rows of $V^T$ are (1) orthogonal to each other and (2) unit vectors. Vectors are orthogonal to each other if the dot product of any two of them is zero. A vector is a unit vector if its L2-norm is 1. An orthonormal matrix has the property that its transpose is its inverse. In other words, since $U$ is an orthonormal matrix, $U^T = U^{-1}$, or $U\cdot U^T=U^T\cdot U=I$, where $I$ is the identity matrix.

Singular value decomposition gets its name from the diagonal entries of $\Sigma$, which are called the singular values of matrix $M$. They are, in fact, the square roots of the eigenvalues of the matrix $M\cdot M^T$. Just like a number factorized into primes, the singular value decomposition of a matrix reveals a lot about the structure of that matrix.

What we described above is called the full SVD. There is another version called the reduced SVD or compact SVD. We still write $M = U\cdot\Sigma\cdot V^T$, but here $\Sigma$ is an $r\times r$ square diagonal matrix with $r$ the rank of matrix $M$, which is usually less than or equal to the smaller of $m$ and $n$. The matrix $U$ is then an $m\times r$ matrix and $V^T$ is an $r\times n$ matrix. Because the matrices $U$ and $V^T$ are non-square, they are called semi-orthonormal, meaning $U^T\cdot U=I$ and $V^T\cdot V=I$, where $I$ is in both cases an $r\times r$ identity matrix.
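The difference between the full and the reduced SVD can be checked in a few lines of NumPy; the small matrix below is arbitrary and any non-square matrix would do:

```python
# Full vs. reduced SVD with NumPy on a small 2x3 matrix of rank 2
import numpy as np

M = np.array([[1., 2., 3.],
              [4., 5., 6.]])

# Full SVD: U is 2x2, s holds the singular values, Vt is 3x3
U, s, Vt = np.linalg.svd(M, full_matrices=True)
print(U.shape, s.shape, Vt.shape)    # (2, 2) (2,) (3, 3)

# Reduced SVD: Vt shrinks to 2x3 (r = rank = 2)
U, s, Vt = np.linalg.svd(M, full_matrices=False)
print(U.shape, s.shape, Vt.shape)    # (2, 2) (2,) (2, 3)

# Semi-orthonormal: U^T U = I and Vt Vt^T = I (both 2x2)
print(np.allclose(U.T @ U, np.eye(2)))       # True
print(np.allclose(Vt @ Vt.T, np.eye(2)))     # True

# Reconstruction: M = U Sigma V^T
print(np.allclose(U @ np.diag(s) @ Vt, M))   # True
```

Note that the reduced version reproduces $M$ exactly while storing smaller matrices, which is why we will prefer it later in this tutorial.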

The Meaning of Singular Value Decomposition in Recommender System

If the matrix $M$ is of rank $r$, then we can prove that the matrices $M\cdot M^T$ and $M^T\cdot M$ are both of rank $r$. In singular value decomposition (the reduced SVD), the columns of matrix $U$ are eigenvectors of $M\cdot M^T$ and the rows of matrix $V^T$ are eigenvectors of $M^T\cdot M$. What's interesting is that $M\cdot M^T$ and $M^T\cdot M$ are potentially of different sizes (because matrix $M$ can be non-square), but they have the same set of non-zero eigenvalues, which are the squares of the values on the diagonal of $\Sigma$.

This is why the result of singular value decomposition can reveal a lot about the matrix $M$.

Imagine we collected some book reviews such that books are columns and people are rows, and each entry is the rating that a person gave to a book. In that case, $M\cdot M^T$ would be a person-to-person table in which each entry is the dot product of two people's rating rows, i.e., how much the ratings one person gave match those given by another person. Similarly, $M^T\cdot M$ would be a book-to-book table in which each entry measures how much the ratings received by one book match those received by another book. What can be the hidden connection between people and books? That could be the genre, or the author, or something of a similar nature.
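Both claims above are easy to verify numerically. The ratings below are made up: 3 people (rows) rating 4 books (columns), with 0 meaning "not rated":

```python
import numpy as np

# Toy ratings matrix: 3 people x 4 books, made-up star ratings
M = np.array([[5., 4., 0., 1.],
              [4., 5., 1., 0.],
              [0., 1., 5., 4.]])

# Entry (i, j) of M M^T is the dot product of person i's and person j's
# rating rows -- a raw person-to-person similarity
MMt = M @ M.T
print(np.allclose(MMt[0, 1], np.dot(M[0], M[1])))   # True

# The eigenvalues of M M^T are the squares of the singular values of M
eigvals = np.sort(np.linalg.eigvalsh(MMt))[::-1]    # sorted descending
s = np.linalg.svd(M, compute_uv=False)
print(np.allclose(eigvals, s**2))                   # True
```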

Implementing a Recommender System

Let’s see how we can make use of the result from SVD to build a recommender system. First, let’s download the dataset from this link (caution: it is 600 MB).

This dataset is the “Social Recommendation Data” from “Recommender Systems and Personalization Datasets“. It contains the reviews given by users on books on LibraryThing. What we are interested in is the number of “stars” a user gave to a book.

If we open up this tar file, we will see a large file named reviews.json inside. We can extract it, or read the included file on the fly:
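Python's tarfile module can read a member of the archive without extracting it first. The member path `lthing_data/reviews.json` and the archive name `lthing_data.tar.gz` are assumptions based on the dataset description; to keep the snippet self-contained, it first builds a tiny two-line stand-in archive, which you would skip when working with the real download:

```python
import io
import tarfile

# --- mock: a tiny two-line reviews.json packed into a tar.gz, standing in
#     for the real 600 MB download (skip this part with the real file) ---
mock = (b"{'user': 'alice', 'work': '100', 'stars': 4.0}\n"
        b"{'user': 'bob', 'work': '200', 'stars': 5.0}\n")
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    info = tarfile.TarInfo("lthing_data/reviews.json")
    info.size = len(mock)
    tar.addfile(info, io.BytesIO(mock))
buf.seek(0)

# --- read the member on the fly; with the real data, use
#     tarfile.open("lthing_data.tar.gz") instead of fileobj=buf ---
with tarfile.open(fileobj=buf, mode="r:gz") as tar:
    with tar.extractfile("lthing_data/reviews.json") as fp:
        for i, line in enumerate(fp):
            if i >= 3:          # show only the first few lines
                break
            print(line.decode("utf8").strip())
```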


Each line in reviews.json is one record. We are going to extract the “user”, “work”, and “stars” fields of each record, as long as none of these three is missing. Despite the file name, the records are not well-formed JSON strings (most notably, they use single quotes rather than double quotes). Therefore, we cannot use the json package from Python; instead, we use ast to decode such strings:
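The idea is sketched below on a single hand-written record in the same single-quoted format; `ast.literal_eval()` safely evaluates a Python dict literal that `json.loads()` would reject:

```python
import ast

# A record in the style of reviews.json: a Python dict literal with
# single quotes, which json.loads() would reject
line = "{'user': 'alice', 'work': '12345', 'stars': 4.5, 'comment': 'great'}"
record = ast.literal_eval(line)

# Keep the record only if all three fields of interest are present
if all(key in record for key in ("user", "work", "stars")):
    row = (record["user"], record["work"], record["stars"])
    print(row)    # ('alice', '12345', 4.5)
```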

Now we should make a matrix of how different users rate each book. We make use of the pandas library to help convert the data we collected into a table:

To save time and memory, we will not use all of the data. Here we consider only those users who reviewed more than 50 books and those books that are reviewed by more than 50 users. This way, we trim our dataset to less than 15% of its original size:
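A sketch of the mechanics with pandas, on a made-up five-row DataFrame and a threshold of 1 instead of 50 so the effect is visible at this scale:

```python
import pandas as pd

# Made-up review rows standing in for the parsed reviews.json data
reviews = pd.DataFrame({
    "user":  ["alice", "alice", "bob", "bob", "carol"],
    "work":  ["w1", "w2", "w1", "w2", "w3"],
    "stars": [4.0, 5.0, 3.0, 4.0, 5.0],
})

threshold = 1   # use 50 with the real dataset
user_counts = reviews["user"].value_counts()
work_counts = reviews["work"].value_counts()
# Keep only rows whose user and book both clear the threshold
keep = (reviews["user"].isin(user_counts[user_counts > threshold].index)
        & reviews["work"].isin(work_counts[work_counts > threshold].index))
reviews = reviews[keep]
print(reviews)   # carol and w3 are filtered out
```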

Then we can make use of the pivot_table() function in pandas to convert this into a matrix:
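On a toy DataFrame (the real data produces the much larger matrix described below), the pivot looks like this; books a user never rated are filled with 0:

```python
import pandas as pd

# Toy (user, work, stars) rows
reviews = pd.DataFrame({
    "user":  ["alice", "alice", "bob", "carol"],
    "work":  ["w1", "w2", "w1", "w2"],
    "stars": [4.0, 5.0, 3.0, 2.0],
})

# One row per user, one column per book; unrated entries become 0
matrix = reviews.pivot_table(index="user", columns="work",
                             values="stars", fill_value=0)
print(matrix)
```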

The result is a matrix of 5593 rows and 2898 columns.

Here we represented 5593 users and 2898 books in a matrix. Then we apply the SVD (this will take a while):
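The decomposition itself is one call to scipy.linalg.svd() with full_matrices=False for the reduced version; the 4×3 matrix here is a toy stand-in for the real 5593×2898 one:

```python
import numpy as np
from scipy.linalg import svd

# Toy users-by-books ratings matrix (4 users x 3 books)
matrix = np.array([[5., 4., 0.],
                   [4., 5., 1.],
                   [0., 1., 5.],
                   [1., 0., 4.]])

# Reduced SVD: u is 4x3, s holds 3 singular values, vh is 3x3
u, s, vh = svd(matrix, full_matrices=False)
print(u.shape, s.shape, vh.shape)   # (4, 3) (3,) (3, 3)
```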

By default, svd() returns a full singular value decomposition. We choose a reduced version so we can use smaller matrices and save memory. The columns of vh correspond to the books. Based on the vector space model, we can find which books are most similar to the one we are looking at:
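A sketch of this comparison on a toy matrix standing in for the real one (the ratings and book indices are made up); each book is a column of vh, and we rank the other books by their cosine similarity to book 0:

```python
import numpy as np
from scipy.linalg import svd

# Toy ratings matrix: 3 users x 4 books
matrix = np.array([[5., 4., 0., 1.],
                   [4., 5., 1., 0.],
                   [0., 1., 5., 4.]])

u, s, vh = svd(matrix, full_matrices=False)   # vh is 3x4: one column per book

def cosine_similarity(v, w):
    """Cosine of the angle between vectors v and w."""
    return np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))

# Compare book 0 (column 0 of vh) against every other book
sims = [cosine_similarity(vh[:, 0], vh[:, j]) for j in range(1, vh.shape[1])]
print("most similar to book 0:", 1 + int(np.argmax(sims)))
```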

And in the above example, we try to find the book that best matches the first column.

In a recommender system, when a user picks a book, we may show her a few other books that are similar to the one she picked, based on the cosine distance calculated above.

Depending on the dataset, we may use truncated SVD to reduce the dimension of the matrix vh. In essence, this means removing from vh the rows whose corresponding singular values in s are small, before using it to compute the similarity. This would likely make the prediction more accurate, as the less important features of a book are removed from consideration.
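Truncation amounts to keeping only the first k rows of vh, since svd() returns the singular values in descending order. The toy matrix and the choice k=2 below are for illustration; with the real matrix, k might be in the hundreds. With the noisy small-singular-value directions removed, the two books with near-identical ratings come out as most similar:

```python
import numpy as np
from scipy.linalg import svd

# Toy ratings matrix: 3 users x 4 books; books 0 and 1 are rated alike
matrix = np.array([[5., 4., 0., 1.],
                   [4., 5., 1., 0.],
                   [0., 1., 5., 4.]])

u, s, vh = svd(matrix, full_matrices=False)

k = 2                 # number of singular values to keep (an illustration)
vh_k = vh[:k, :]      # drop the rows tied to the smaller singular values

def cosine_similarity(v, w):
    return np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))

# Compare book 0 against every other book in the truncated space
sims = [cosine_similarity(vh_k[:, 0], vh_k[:, j]) for j in range(1, 4)]
print("most similar to book 0:", 1 + int(np.argmax(sims)))
```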

Note that, in the decomposition $M=U\cdot\Sigma\cdot V^T$, we know the rows of $U$ correspond to the users and the columns of $V^T$ correspond to the books, but we cannot identify the meanings of the columns of $U$ or the rows of $V^T$ (or, equivalently, of $\Sigma$). We know they could be genres, for example, that provide some underlying connections between the users and the books, but we cannot be sure what exactly they are. However, this does not stop us from using them as features in our recommender system.

Tying all of this together, the following is the complete code:
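The sketch below reassembles the steps as small functions, one per stage. The archive name and the member path inside the tar are assumptions based on the dataset description, so check them against your download:

```python
import ast
import tarfile

import numpy as np
import pandas as pd
from scipy.linalg import svd

def parse_reviews(lines):
    """Extract (user, work, stars) rows from lines of reviews.json."""
    rows = []
    for line in lines:
        record = ast.literal_eval(line)   # records use single quotes, not JSON
        if all(key in record for key in ("user", "work", "stars")):
            rows.append((record["user"], record["work"], record["stars"]))
    return pd.DataFrame(rows, columns=["user", "work", "stars"])

def filter_active(reviews, threshold=50):
    """Keep only users and books with more than `threshold` reviews."""
    users = reviews["user"].value_counts()
    works = reviews["work"].value_counts()
    keep = (reviews["user"].isin(users[users > threshold].index)
            & reviews["work"].isin(works[works > threshold].index))
    return reviews[keep]

def build_matrix(reviews):
    """Pivot into a user-by-book ratings matrix; unrated books become 0."""
    return reviews.pivot_table(index="user", columns="work",
                               values="stars", fill_value=0)

def most_similar(vh, col, topk=5):
    """Indices of the books most similar to book `col`, by cosine over vh."""
    target = vh[:, col]
    sims = vh.T @ target / (np.linalg.norm(vh, axis=0)
                            * np.linalg.norm(target))
    order = np.argsort(-sims)             # best match first
    return [j for j in order if j != col][:topk]

if __name__ == "__main__":
    # Assumed file and member names -- adjust to match your download
    with tarfile.open("lthing_data.tar.gz") as tar:
        with tar.extractfile("lthing_data/reviews.json") as fp:
            reviews = parse_reviews(line.decode("utf8") for line in fp)
    reviews = filter_active(reviews)
    matrix = build_matrix(reviews)
    u, s, vh = svd(matrix.values, full_matrices=False)
    print("Books similar to", matrix.columns[0])
    for j in most_similar(vh, 0):
        print(" ", matrix.columns[j])
```

Because each stage is a function, the pipeline is easy to test on a handful of hand-written records before pointing it at the 600 MB file.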

Further reading

This section provides more resources on the topic if you are looking to go deeper.


Summary

In this tutorial, you discovered how to build a recommender system using singular value decomposition.

Specifically, you learned:

  • What singular value decomposition does to a matrix
  • How to interpret the result of a singular value decomposition
  • How to find similarity from the columns of matrix $V^T$ obtained from singular value decomposition, and how to make recommendations based on that similarity
