By Trey Hunner
Let's talk about the difference between strings and bytes in Python.
Creating bytes objects in Python
Strings represent text (human language, that is).
For example, here we have a string named text:
>>> text = "hello"
But there's another type that's closely associated with strings. Making one looks like making a string with a b prefix in front of it.
That b is sort of like the f before an f-string, or the r before a raw string.
But that b doesn't actually make a string; it makes a bytes object:
>>> data = b"hello"
>>> data
b'hello'
>>> type(data)
<class 'bytes'>
Strings represent text, bytes objects represent binary data
If we loop over a string in Python, we'll get back one-character strings representing each of the characters in that string:
>>> text = "hello"
>>> list(text)
['h', 'e', 'l', 'l', 'o']
What do you think we'll get if we loop over a bytes object?
>>> data = b"hello"
>>> list(data)
Since bytes objects represent binary data, when we loop over them we get back numbers (from 0 to 255) representing each of the bytes in that binary data:
>>> data = b"hello"
>>> list(data)
[104, 101, 108, 108, 111]
We can also do the opposite of this.
We can take an iterable of numbers and turn it into a bytes object by passing it to the bytes constructor:
>>> nums = [0, 65, 97, 255]
>>> bytes(nums)
b'\x00Aa\xff'
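The same numbers-versus-bytes distinction shows up when indexing and slicing. Here is a minimal sketch, reusing the data value from above (the names first, chunk, round_trip, and rejected are only for illustration):

```python
data = b"hello"

# Indexing a bytes object gives back an integer, just like looping does
first = data[0]
print(first)  # 104

# Slicing, on the other hand, gives back another bytes object
chunk = data[:1]
print(chunk)  # b'h'

# Turning the bytes into numbers and back again produces an equal object
round_trip = bytes(list(data))
print(round_trip == data)  # True

# Every number must fit in a single byte (0 to 255)
try:
    bytes([256])
except ValueError:
    rejected = True
else:
    rejected = False
print(rejected)  # True
```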
Where are bytes objects used in Python?
All data that comes from outside of our Python process starts as bytes.
But if that data represents text (and Python knows it), Python will decode it into strings automatically.
If we use the urllib module in Python to do an HTTP request, the data that we get back is not represented as a string:
>>> from urllib.request import urlopen
>>> data = urlopen('https://pseudorandom.name').read()
>>> data
b'Grace Jones\n'
>>> type(data)
<class 'bytes'>
The data we get back is represented as a bytes object because it might not even represent text.
After all, an HTTP response can contain any data, even arbitrary binary data.
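When a response does represent text, its Content-Type header often names the character encoding to decode with. The sketch below assumes a hypothetical helper named decode_body and a utf-8 fallback, neither of which comes from the original example. It parses the header with the standard library's email.message.Message, which is the same class that urlopen response headers are built on.

```python
from email.message import Message


def decode_body(body, content_type, fallback="utf-8"):
    """Decode response bytes using the charset named in a Content-Type value.

    Hypothetical helper for illustration; falls back to utf-8 (an assumption)
    when the header does not name a charset.
    """
    message = Message()
    message["Content-Type"] = content_type
    charset = message.get_content_charset() or fallback
    return body.decode(charset)


print(repr(decode_body(b"Grace Jones\n", "text/plain; charset=utf-8")))
# 'Grace Jones\n'
```

With a real response object from urlopen, calling response.headers.get_content_charset() should give the same charset lookup without building a Message by hand.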
If we open up a file with the mode "rb", we're opening that file not in the default read-text mode, but instead in read-binary mode.
>>> with open("avatar.jpg", mode="rb") as jpg_file:
...     jpg_data = jpg_file.read()
...
So when we read from that file, the data that we get out of it will not be a string; it'll be a bytes object.
>>> type(jpg_data)
<class 'bytes'>
In fact, in this case where we're opening up a jpg file, we get a bytes object with a lot of bytes in it, because it takes a lot of bytes to represent an image:
>>> len(jpg_data)
1108051
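To see the two modes side by side, here is a small sketch that reads the same file both ways. The file name example.txt and its contents are made up for illustration, and the file lives in a temporary directory so nothing is left behind.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as directory:
    path = Path(directory) / "example.txt"  # hypothetical file name
    path.write_bytes(b"caf\xc3\xa9\n")  # the text "café" encoded as UTF-8

    # Read-binary mode: we get the raw bytes exactly as stored on disk
    with open(path, mode="rb") as binary_file:
        raw = binary_file.read()

    # Read-text mode: Python decodes the bytes into a string for us
    with open(path, mode="r", encoding="utf-8") as text_file:
        decoded = text_file.read()

print(type(raw).__name__, len(raw))          # bytes 6
print(type(decoded).__name__, len(decoded))  # str 5
```

The lengths differ because the é character takes two bytes in UTF-8 but counts as one character in the string.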
How to convert bytes into a string
If you end up with a bytes object in Python, and you know that the object represents text, you can turn it into a string by calling its decode method:
>>> data = b"bytes! \xe2\x9c\xa8"
>>> data.decode()
'bytes! ✨'
The decode method (without any arguments passed to it) uses a default character encoding of utf-8.
Even if we know that the data we're working with uses that default character encoding of utf-8, it's considered a best practice to always specify the encoding of our bytes:
>>> text = data.decode("utf-8")
>>> text
'bytes! ✨'
As the Zen of Python says, "Explicit is better than implicit".
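Being explicit matters because decoding with the wrong encoding either fails or quietly produces the wrong text. Here is a minimal sketch of both outcomes, using the same bytes as above (the names failed, replaced, and wrong are only for illustration):

```python
data = b"bytes! \xe2\x9c\xa8"

# ASCII cannot represent byte values above 127, so a strict decode fails
try:
    data.decode("ascii")
except UnicodeDecodeError:
    failed = True
else:
    failed = False
print(failed)  # True

# errors="replace" swaps each undecodable byte for U+FFFD instead of failing
replaced = data.decode("ascii", errors="replace")
print(len(replaced))  # 10

# latin-1 accepts every byte value, so this "works" but gives the wrong text
wrong = data.decode("latin-1")
print(wrong == data.decode("utf-8"))  # False
```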
If for some reason you have a string that you want to turn into bytes, you can call the encode method on that string:
>>> text.encode()
b'bytes! \xe2\x9c\xa8'
Just like decode, the encode method defaults to using utf-8, but you could specify a different character encoding if you wanted to:
>>> text.encode("utf-8")
b'bytes! \xe2\x9c\xa8'
>>> text.encode("utf-16-le")
b"b\x00y\x00t\x00e\x00s\x00!\x00 \x00('"
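The choice of encoding changes how many bytes the same text takes up, but decoding with the matching encoding always gets the original string back. Here is a short sketch using the same text as above (the names utf8 and utf16 are only for illustration):

```python
text = "bytes! \u2728"  # the same string as above, with the emoji escaped

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")

print(len(text))   # 8 characters
print(len(utf8))   # 10 bytes: 7 one-byte characters plus a 3-byte emoji
print(len(utf16))  # 16 bytes: 2 bytes for each of the 8 characters

# Decoding with the encoding that was used to encode round-trips the text
print(utf8.decode("utf-8") == text)       # True
print(utf16.decode("utf-16-le") == text)  # True
```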
Summary
Strings represent text-based data, while bytes represent binary data (e.g. images, video, or anything else you could represent on a computer).
Depending on what you use Python for, you probably won't encounter bytes objects very often.
But when you do, the one thing you'll probably want to do with them is call their decode method to turn them into a string (assuming those bytes represent text).