# Ben Cook: NumPy meshgrid: Understanding np.meshgrid()

The meshgrid operation creates multi-dimensional coordinate arrays:

``````import numpy as np

x = np.arange(2)
y = np.arange(3)

ii, jj = np.meshgrid(x, y, indexing='ij')

ii

# Expected result
# array([[0, 0, 0],
#        [1, 1, 1]])
``````

`np.meshgrid()` returns a tuple with the inputs broadcast across rows and columns (and higher dimensions if you pass more vectors in). Here’s what the `jj` array looks like:

``````jj

# Expected result
# array([[0, 1, 2],
#        [0, 1, 2]])
``````

I’ll talk about the `indexing='ij'` argument below. But first, let’s take a look at an example.

### A realistic use case

Ok, let’s look at an example. Say we have a simple linear regression model `y = w0 + w1 * x` where `w0` is the intercept and `w1` is the slope. If `x` is a 1-dimensional NumPy array of inputs and `y` is a 1-dimensional NumPy array of targets, we can calculate the mean-squared error (MSE) loss for a given `w0` and `w1` with the following:

``````yhat = w0 + w1 * x
loss = np.square(yhat - y).mean()
``````

Notice, if our data is fixed, then the loss becomes a function of `w0` and `w1`. That means we can plot the loss surface to see how it varies with `w0` and `w1`. This is a perfect use case for `meshgrid()`.

First, let’s fix `w0=3` and `w1=2.5` and then generate 50 data points using this model plus some random noise:

``````x = np.random.normal(0, 1, size=50)
y = 3 + 2.5 * x + np.random.normal(0, 1, size=50)
``````

A scatter plot of these points points will look something like this. You can see visually that the y-intercept is roughly 3 and the slope is roughly 2.5.

Now we want to create all possible values of `w0` and `w1` in the range `[-5, 5]` so we can compute the loss value for those pairs:

``````w0, w1 = np.meshgrid(
np.linspace(-5, 5),
np.linspace(-5, 5),
indexing='ij',
)

w0.shape

# Expected result
# (50, 50)
``````

At this point `w0` and `w1` both have shape `(50, 50)`. That’s because `np.linspace()` creates a 50 evenly spaced values by default. If we flatten `w0` and `w1` and line up the elements, we get 2500 evenly spaced points on a grid.

Let’s take a look at the first 10 points:

``````list(zip(w0.ravel(), w1.ravel()))[:10]

# Expected result
# [(-5.0, -5.0),
#  (-5.0, -4.795918367346939),
#  (-5.0, -4.591836734693878),
#  (-5.0, -4.387755102040816),
#  (-5.0, -4.183673469387755),
#  (-5.0, -3.979591836734694),
#  (-5.0, -3.7755102040816326),
#  (-5.0, -3.571428571428571),
#  (-5.0, -3.36734693877551),
#  (-5.0, -3.163265306122449)]
``````

Now, we can compute the model for each `(w0, w1)` pair for all 50 random data points. That allows us to compute MSE for each of the weight values:

``````yhat = w0[:, None] + w1[:, None] * x
loss = np.square(y - yhat).mean(-1)
``````

Here we expand the `w0` and `w1` arrays so that we can use NumPy broadcasting to compute `yhat` for all points at the same time. The result `yhat` has shape `(2500, 50)`: it’s the predicted value for all pairs of `(w0, w1)` for all 50 points. After that, we use the mean function to reduce the column dimension and compute MSE for each point on the grid.

Now, we can reshape loss to be `(50, 50)` and plot loss for each pair of weights:

``````loss = loss.reshape(w0.shape)
plt.contourf(w0, w1, loss, 30)
plt.plot(3, 2.5, 'x', c='white', ms=7)
plt.xlabel("w0")
plt.ylabel("w1")
plt.title("Loss surface");
``````

That’s a nice looking loss surface! I’ve added a mark at the true values for `w0` and `w1`. The loss surface will never have its optimum at exactly the true values since it’s learning the function on a random sample of all data, but this is pretty close!

### Indexing

Ok let’s revisit indexing. The `indexing='ij'` argument tells NumPy to use matrix indexing instead of Cartesian indexing. For 2-dimensional meshgrids, matrix indexing produces coordinate arrays that are the transpose of Cartesian coordinate arrays. The question is whether those coordinate arrays should be interpreted as the indices of a matrix (matrix indexing) or points in Euclidean space (Cartesian indexing).

Here’s a diagram that displays the index pairs for `xy` versus `ij` indexing:

Notice for Cartesian indexing, the x-axis increases as your move “to the right” and the y-axis increases as you move “down”.

### Other tensor libraries

In NumPy, you actually don’t need meshgrid() but it’s good to know it because the same function exists in PyTorch and TensorFlow. Although the API is more limited in both cases (and the default indexing for PyTorch is different) the basic usage is roughly same:

``````import torch

x = torch.arange(2)
y = torch.arange(2)

xx, yy = torch.meshgrid(x, y)

xx

# Expected result
# tensor([[0, 0],
#         [1, 1]])
``````

And:

``````import tensorflow as tf

x = tf.range(2)
y = tf.range(2)

xx, yy = tf.meshgrid(x, y, indexing='ij')

xx

# Expected result
# <tf.Tensor: shape=(2, 2), dtype=int32, numpy=
# array([[0, 0],
#        [1, 1]], dtype=int32)>
``````

One thing to watch out for: in NumPy and TensorFlow, the default indexing is Cartesian, whereas in PyTorch, the default indexing is matrix. If you want to avoid confusion, you can plan to set `indexing='ij'` whenever you call `meshgrid()` in NumPy or TensorFlow. You probably don’t want to force yourself to remember which indexing gets returned by default.