By Nick Cotes
Optimization is a field of mathematics concerned with finding a good or best solution among many candidates.
It is an important foundational topic required in machine learning as most machine learning algorithms are fit on historical data using an optimization algorithm. Additionally, broader problems, such as model selection and hyperparameter tuning, can also be framed as optimization problems.
Although having some background in optimization is critical for machine learning practitioners, it can be a daunting topic given that it is often described using highly mathematical language.
In this post, you will discover top books on optimization that will be helpful to machine learning practitioners.
Let’s get started.
Books on Optimization for Machine Learning. Photo by Patrick Alexander, some rights reserved.
Overview
The field of optimization is enormous as it touches many other fields of study.
As such, there are hundreds of books on the topic, and most are textbooks filled with math and proofs. This is fair enough given that it is a highly mathematical subject.
Nevertheless, there are books that provide a more approachable description of optimization algorithms.
Not all optimization algorithms are relevant to machine learning; instead, it is useful to focus on a small subset of algorithms.
Frankly, it is hard to neatly group optimization algorithms, as they can be organized along many different dimensions. Nevertheless, it is important to have some idea of the optimization that underlies simpler algorithms, such as linear regression and logistic regression (e.g., convex optimization, least squares, Newton's method, etc.), and neural networks (first-order methods, gradient descent, etc.).
These are foundational optimization algorithms covered in most optimization textbooks.
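To make the first-order case concrete, here is a minimal sketch of gradient descent fitting a simple linear regression by least squares; the data, learning rate, and step count are invented for the example:

```python
# Illustrative only: batch gradient descent for simple linear regression,
# minimizing mean squared error over slope w and intercept b.
def fit_linear(xs, ys, lr=0.01, steps=5000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of MSE = (1/n) * sum((w*x + b - y)^2)
        grad_w = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2.0 / n) * sum(w * x + b - y for x, y in zip(xs, ys))
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Noise-free data from y = 2x + 1, so the fit should recover w=2, b=1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = fit_linear(xs, ys)
print(round(w, 2), round(b, 2))  # close to 2.0 and 1.0
```

A real implementation would vectorize the gradient computation with a library like NumPy, but the update rule is the same one that trains neural networks at scale.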
Not all optimization problems in machine learning are well behaved; the optimization problems that arise in AutoML and hyperparameter tuning are good examples. Therefore, knowledge of stochastic optimization algorithms (simulated annealing, genetic algorithms, particle swarm, etc.) is required. Although these are optimization algorithms, they are also a type of learning algorithm referred to as biologically inspired computation or computational intelligence.
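To illustrate the stochastic side, here is a rough sketch of simulated annealing on a one-dimensional multimodal function; the objective, cooling schedule, and step size are all arbitrary choices for the demonstration, not anything prescribed by the books below:

```python
import math
import random

# Illustrative only: the objective has many local minima; its global
# minimum is f(0) = 0.
def objective(x):
    return x * x + 10.0 * (1.0 - math.cos(2.0 * math.pi * x))

def anneal(f, x0, steps=30000, temp0=10.0, sigma=0.5, seed=1):
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, fbest = x, fx
    for i in range(steps):
        temp = temp0 * (1.0 - i / steps) + 1e-12  # linear cooling schedule
        cand = x + rng.gauss(0.0, sigma)          # random neighbor
        fc = f(cand)
        # Always accept improvements; accept uphill moves with
        # Boltzmann probability exp(-(fc - fx) / temp).
        if fc < fx or rng.random() < math.exp((fx - fc) / temp):
            x, fx = cand, fc
            if fx < fbest:
                best, fbest = x, fx
    return best, fbest

best, fbest = anneal(objective, x0=4.0)
print(round(best, 2), round(fbest, 2))
```

The willingness to accept occasional uphill moves, throttled by the falling temperature, is what lets the search escape local minima that would trap plain gradient descent.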
Therefore, we will take a look both at books that cover classical optimization algorithms and at books on alternate optimization algorithms.
In fact, the first book we will look at covers both types of algorithms, and much more.
This book, Algorithms for Optimization by Mykel J. Kochenderfer and Tim A. Wheeler, might be one of the very few textbooks I've seen that broadly cover the field of optimization techniques relevant to modern machine learning.
This book provides a broad introduction to optimization with a focus on practical algorithms for the design of engineering systems. We cover a wide variety of optimization topics, introducing the underlying mathematical problem formulations and the algorithms for solving them. Figures, examples, and exercises are provided to convey the intuition behind the various approaches.
Importantly, the algorithms range from univariate methods (bisection, line search, etc.) to first-order methods (gradient descent), second-order methods (Newton's method), direct methods (pattern search), stochastic methods (simulated annealing), and population methods (genetic algorithms, particle swarm), and much more.
It includes both technical descriptions of algorithms with references and worked examples of algorithms in Julia. It’s a shame the examples are not in Python as this would make the book near perfect in my eyes.
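For a taste of the bracketing methods the book opens with, here is a Python sketch of golden-section search; the book's own worked examples are in Julia, so this code is my own, not the book's:

```python
import math

# Illustrative only: golden-section search, a bracketing method that
# minimizes a unimodal function on [a, b] without derivatives.
def golden_section(f, a, b, tol=1e-6):
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # ~0.618
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while (b - a) > tol:
        # Shrink to whichever bracket contains the smaller interior value.
        if f(c) < f(d):
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:
            a, c = c, d
            d = a + inv_phi * (b - a)
    return (a + b) / 2.0

# Minimum of (x - 2)^2 + 1 on [0, 5] is at x = 2.
x_star = golden_section(lambda x: (x - 2.0) ** 2 + 1.0, 0.0, 5.0)
print(round(x_star, 4))  # ≈ 2.0
```

This version re-evaluates both interior points each iteration for clarity; production code would cache one of the two evaluations, which is the whole point of the golden-ratio spacing.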
The complete table of contents for the book is listed below.
Chapter 01: Introduction
Chapter 02: Derivatives and Gradients
Chapter 03: Bracketing
Chapter 04: Local Descent
Chapter 05: First-Order Methods
Chapter 06: Second-Order Methods
Chapter 07: Direct Methods
Chapter 08: Stochastic Methods
Chapter 09: Population Methods
Chapter 10: Constraints
Chapter 11: Linear Constrained Optimization
Chapter 12: Multiobjective Optimization
Chapter 13: Sampling Plans
Chapter 14: Surrogate Models
Chapter 15: Probabilistic Surrogate Models
Chapter 16: Surrogate Optimization
Chapter 17: Optimization under Uncertainty
Chapter 18: Uncertainty Propagation
Chapter 19: Discrete Optimization
Chapter 20: Expression Optimization
Chapter 21: Multidisciplinary Optimization
I like this book a lot; it is full of valuable practical advice. I highly recommend it!
This book, Numerical Optimization by Jorge Nocedal and Stephen J. Wright, is focused on the math and theory of the optimization algorithms presented and does cover many of the foundational techniques used by common machine learning algorithms. It may be a little too heavy for the average practitioner.
The book is intended as a textbook for graduate students in mathematical subjects.
We intend that this book will be used in graduate-level courses in optimization, as offered in engineering, operations research, computer science, and mathematics departments.
Even though it is highly mathematical, the descriptions of the algorithms are precise and may provide a useful alternative description to complement the other books listed.
An excerpt of the table of contents for the book is listed below.
Chapter 01: Introduction
Chapter 02: Fundamentals of Unconstrained Optimization
Chapter 13: Linear Programming: The Simplex Method
Chapter 14: Linear Programming: Interior-Point Methods
Chapter 15: Fundamentals of Algorithms for Nonlinear Constrained Optimization
Chapter 16: Quadratic Programming
Chapter 17: Penalty and Augmented Lagrangian Methods
Chapter 18: Sequential Quadratic Programming
Chapter 19: Interior-Point Methods for Nonlinear Programming
It’s a solid textbook on optimization.
If you prefer a more theoretical approach to the subject, another widely used mathematical book on optimization is “Convex Optimization,” written by Stephen Boyd and Lieven Vandenberghe and published in 2004.
This book, Computational Intelligence: An Introduction by Andries Engelbrecht, is far less mathematical than the previous textbooks; it is more focused on the biological metaphors behind each algorithm and on how to configure and use the specific algorithms, with lots of pseudocode explanations.
While the material is introductory in nature, it does not shy away from details, and does present the mathematical foundations to the interested reader. The intention of the book is not to provide thorough attention to all computational intelligence paradigms and algorithms, but to give an overview of the most popular and frequently used models.
Algorithms like genetic algorithms, genetic programming, evolutionary strategies, differential evolution, and particle swarm optimization are useful to know for machine learning model hyperparameter tuning and perhaps even model selection. They also form the core of many modern AutoML systems.
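As a toy illustration of how such an algorithm might drive hyperparameter tuning, here is a (1+lambda) evolution strategy tuning a single learning-rate hyperparameter; `validation_loss` is a made-up stand-in for training a model and scoring it on held-out data:

```python
import random

# Illustrative only: a (1+lambda) evolution strategy tuning a single
# hyperparameter (the log10 of a learning rate). `validation_loss` is a
# hypothetical surrogate for the real train-and-evaluate loop; here it
# is a made-up function minimized near log10(lr) = -1, i.e. lr = 0.1.
def validation_loss(log_lr):
    return (log_lr + 1.0) ** 2 + 0.5

def one_plus_lambda(f, x0, lam=8, sigma=0.3, generations=40, seed=0):
    rng = random.Random(seed)
    parent, f_parent = x0, f(x0)
    for _ in range(generations):
        # Mutate the parent lam times; keep the best child only if it improves.
        children = [parent + rng.gauss(0.0, sigma) for _ in range(lam)]
        best_child = min(children, key=f)
        if f(best_child) < f_parent:
            parent, f_parent = best_child, f(best_child)
    return parent, f_parent

log_lr, loss = one_plus_lambda(validation_loss, x0=-3.0)
print(round(10.0 ** log_lr, 3), round(loss, 3))
```

Searching in log space and never requiring gradients is what makes this family of methods a natural fit for hyperparameter tuning, where each objective evaluation is a full training run.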
The complete table of contents for the book is listed below.
Part I Introduction
Chapter 01: Introduction to Computational Intelligence