Dictionary Merge and Update Operators in Python 3.9

Introduction

Python 3.9 was released on Oct. 5, 2020, and it introduced some neat features and optimizations, including PEP 584, "Add Union Operators To dict": the so-called Dictionary Merge and Update Operators.

In this blog post, we will go over the new operators to see whether they have any advantages or disadvantages compared with the earlier ways of merging and updating dictionaries.

The Dictionary Merge Operator

A merge takes two or more dictionaries and combines them into a single new one.

Let's start by diving into a short example demonstrating the old way of merging two dictionaries:

x = {"key1": "value1 from x", "key2": "value2 from x"}
y = {"key2": "value2 from y", "key3": "value3 from y"}

z = {**x, **y}
print(z)
# {'key1': 'value1 from x', 'key2': 'value2 from y', 'key3': 'value3 from y'}

This merge creates an entirely new dict object z, into which we unpack every key-value pair from x and y. This way of merging two dictionaries feels unnatural and is hardly obvious. If both dictionaries share a key, the value from x is overwritten by the value from y.

According to Guido:

"I'm sorry for PEP 448, but even if you know about **d in simpler
contexts, if you were to ask a typical Python user how to combine two dicts
into a new one, I doubt many people would think of {**d1, **d2}. I know I
myself had forgotten about it when this thread started! If you were to ask
a newbie who has learned a few things (e.g. sequence concatenation) they
would much more likely guess d1+d2."

Here's an example demonstrating the new dictionary merge operator, |:

x = {"key1": "value1 from x", "key2": "value2 from x"}
y = {"key2": "value2 from y", "key3": "value3 from y"}

z = x | y
print(z)
# {'key1': 'value1 from x', 'key2': 'value2 from y', 'key3': 'value3 from y'}

But remember that the merge operator creates a new dictionary and leaves both operands unchanged:

# before merging
x = {"key1": "value1 from x", "key2": "value2 from x"}
y = {"key2": "value2 from y", "key3": "value3 from y"}
print(id(x))  # 2670466407744
print(id(y))  # 2670466407808

# after merging
x | y
print(id(x))  # 2670466407744
print(id(y))  # 2670466407808

# assigning the expression to the variable `z`
z = x | y
print(id(z))  # 2670466542912
print(z is x)  # False
print(z is y)  # False

If the result of the expression isn't assigned to a variable, it is discarded.

The same concept applies to the legacy merging method:

# before merging
x = {"key1": "value1 from x", "key2": "value2 from x"}
y = {"key2": "value2 from y", "key3": "value3 from y"}
print(id(x))  # 2670466407744
print(id(y))  # 2670466407808

# after merging
{**x, **y}
print(id(x))  # 2670466407744
print(id(y))  # 2670466407808

# assigning the expression to the variable `z`
z = {**x, **y}
print(id(z))  # 2670466553600
print(z is x)  # False
print(z is y)  # False

So the two approaches behave similarly, and the merge operator | lets us write cleaner code. But besides this, what is it good for?

The operators have also been implemented for several other mapping classes in the standard library, such as those in the collections module.

To demonstrate the usefulness of the merge operator, |, let's take a look at the following example using defaultdict:

from collections import defaultdict

user_not_found_message = 'Could not find any user matching the specified user id.'

ceo = defaultdict(
    lambda: user_not_found_message,
    {'id': 1, 'name': 'Jose', 'title': 'Instructor'}
)

author = defaultdict(
    lambda: user_not_found_message,
    {'id': 2, 'name': 'Vlad', 'title': 'Teaching Assistant'}
)

Merging the two dictionaries with the double asterisk, **, works, but unpacking into a dict display does not preserve the type of the operands, so we end up with a plain dictionary instead:

print({**author, **ceo})
# {'id': 1, 'name': 'Jose', 'title': 'Instructor'}

print({**ceo, **author})
# {'id': 2, 'name': 'Vlad', 'title': 'Teaching Assistant'}

The power of the merge operator | is that it preserves the type of its operands. As such, a defaultdict is returned:

print(author | ceo)
# defaultdict(<function <lambda> at 0x000002212125DE50>, {'id': 1, 'name': 'Jose', 'title': 'Instructor'})

print(ceo | author)
# defaultdict(<function <lambda> at 0x000002212127A3A0>, {'id': 2, 'name': 'Vlad', 'title': 'Teaching Assistant'})

Note: the order of the operands matters. When both dictionaries share a key, the value from the right-hand operand overwrites the value from the left-hand one. The example above uses both orders to show the difference.
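To make the effect of operand order concrete, here is a minimal sketch using two small defaultdict objects with different default factories. The names left and right and their contents are made up for illustration, and the sketch assumes Python 3.9 or later.

```python
from collections import defaultdict

# Hypothetical data: both mappings share the key "b".
left = defaultdict(int, {"a": 1, "b": 2})
right = defaultdict(list, {"b": 20, "c": 30})

# The right-hand value wins for "b"; the result keeps the
# left-hand operand's key order and default factory.
merged = left | right
print(dict(merged))            # {'a': 1, 'b': 20, 'c': 30}
print(merged.default_factory)  # <class 'int'>

# Swapping the operands swaps which value and which factory survive.
swapped = right | left
print(dict(swapped))            # {'b': 2, 'c': 30, 'a': 1}
print(swapped.default_factory)  # <class 'list'>
```

So with two defaultdict operands, the left-hand one decides the default factory and the starting key order, while the right-hand one decides the values of shared keys.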

Another advantage of the new dictionary merge operator | is that expressions can be chained: dict4 = dict1 | dict2 | dict3 is equivalent to dict4 = {**dict1, **dict2, **dict3} for plain dictionaries.

Let us show a practical example:

basic_data = {'id': 1, 'name': 'Vlad'}
get_role = {'title': 'Teaching Assistant'}
details = {'country': 'Denmark', 'active': True}

vlad_info = basic_data | get_role | details
print(vlad_info)
# {'id': 1, 'name': 'Vlad', 'title': 'Teaching Assistant', 'country': 'Denmark', 'active': True}
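When the number of dictionaries is not known in advance, the same chaining can be expressed with functools.reduce. The sketch below is one possible approach, assuming Python 3.9 or later and plain dict inputs; the helper name merge_all is hypothetical and not part of the standard library.

```python
from functools import reduce
from operator import or_


def merge_all(dicts):
    """Merge an iterable of dicts left to right; later values win."""
    # Starting from an empty dict means an empty input yields {}.
    return reduce(or_, dicts, {})


parts = [
    {'id': 1, 'name': 'Vlad'},
    {'title': 'Teaching Assistant'},
    {'country': 'Denmark', 'active': True},
]

print(merge_all(parts))
# {'id': 1, 'name': 'Vlad', 'title': 'Teaching Assistant', 'country': 'Denmark', 'active': True}
```

Because every step uses |, none of the input dictionaries is modified.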

The Dictionary Update Operator

In the following example, the dictionary x is being updated by the dictionary y, demonstrating the .update() method:

x = {"key1": "value1 from x", "key2": "value2 from x"}
y = {"key2": "value2 from y", "key3": "value3 from y"}

print(x.update(y))
# None

print(x)
# {'key1': 'value1 from x', 'key2': 'value2 from y', 'key3': 'value3 from y'}

The built-in .update() method operates in place: the dictionary x is updated, but the method itself returns None.

Using the update operator, |=, we can achieve the same functionality with a cleaner syntax:

x = {"key1": "value1 from x", "key2": "value2 from x"}
y = {"key2": "value2 from y", "key3": "value3 from y"}

# |= is a statement, not an expression, so this line is rejected:
# print(x |= y)
# SyntaxError: invalid syntax

print(x)
# {'key1': 'value1 from x', 'key2': 'value2 from x'}

x |= y
print(x)
# {'key1': 'value1 from x', 'key2': 'value2 from y', 'key3': 'value3 from y'}

However, unlike the legacy .update() method, the new dictionary update operator helps prevent misuse. An augmented assignment is a statement rather than an expression, so placing it inside a print() call raises a SyntaxError and the dictionary is not updated. Written on its own line, x |= y updates the dictionary as expected.
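A SyntaxError is raised when Python compiles the code, before any of it runs, so in a script the offending line stops the whole file from starting. The sketch below is one way to observe this safely: it hands the line to the built-in compile() as a string, so the error can be caught. The file name "<example>" is just a placeholder label.

```python
# The offending line is kept in a string so this file itself still compiles.
source = "print(x |= y)"

try:
    compile(source, "<example>", "exec")
    rejected = False
except SyntaxError:
    rejected = True

print(rejected)  # True

# Used as a statement on its own line, the operator works as expected.
x = {"key1": "value1 from x"}
y = {"key2": "value2 from y"}
x |= y
print(x)  # {'key1': 'value1 from x', 'key2': 'value2 from y'}
```

If the merged result is needed inside an expression, the merge operator can be used instead, for example print(x | y).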

Let's see how updating the dictionary, either with the legacy .update() method or with the update operator |=, neither changes the object's id nor creates a new object:

# before update
x = {"key1": "value1 from x", "key2": "value2 from x"}
y = {"key2": "value2 from y", "key3": "value3 from y"}
print(id(x))  # 2627652603136
print(id(y))  # 2627652603200

# after update
x |= y
print(id(x))  # 2627652603136
print(id(y))  # 2627652603200

# after legacy update
x.update(y)
print(id(x))  # 2627652603136
print(id(y))  # 2627652603200
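The unchanged id has a practical consequence: every other name bound to the same dictionary sees an in-place update, whereas x = x | y rebinds x to a new object and leaves the other names alone. Here is a small sketch of the difference, assuming Python 3.9 or later; the name alias is made up for illustration.

```python
x = {"a": 1}
alias = x  # a second name for the same dict object

# In-place update: the change is visible through both names.
x |= {"b": 2}
print(alias)       # {'a': 1, 'b': 2}
print(alias is x)  # True

# Merge and rebind: x now refers to a new dict, alias is untouched.
x = x | {"c": 3}
print(x)           # {'a': 1, 'b': 2, 'c': 3}
print(alias)       # {'a': 1, 'b': 2}
print(alias is x)  # False
```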

Another example is extending dictionaries with a list of tuples by using the update operator |=:

author = {'id': 1, 'name': 'Vlad'}
author |= [('title', 'Teaching Assistant')]

print(author)
# {'id': 1, 'name': 'Vlad', 'title': 'Teaching Assistant'}
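Note that this flexibility applies to |= only. Like .update(), the update operator accepts any iterable of key-value pairs, but the merge operator | requires a dictionary on both sides. The following sketch shows the difference, assuming Python 3.9 or later; the variable pairs is introduced here for illustration.

```python
author = {'id': 1, 'name': 'Vlad'}
pairs = [('title', 'Teaching Assistant')]

# The merge operator does not accept a list of tuples.
try:
    author | pairs
    merge_failed = False
except TypeError:
    merge_failed = True

print(merge_failed)  # True

# Converting the pairs to a dict first makes the merge work.
print(author | dict(pairs))
# {'id': 1, 'name': 'Vlad', 'title': 'Teaching Assistant'}

# The update operator accepts the list of tuples directly.
author |= pairs
print(author)
# {'id': 1, 'name': 'Vlad', 'title': 'Teaching Assistant'}
```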

The example above is equivalent to the following code using the legacy .update() method:

author = {'id': 1, 'name': 'Vlad'}
new_key = dict([('title', 'Teaching Assistant')])
author.update(new_key)

print(author)
# {'id': 1, 'name': 'Vlad', 'title': 'Teaching Assistant'}

Besides its cleaner syntax, the new dictionary update operator |= is harder to misuse: using it inside print() raises a SyntaxError, whereas print(x.update(y)) silently prints None.

Older versions

The dictionary update operator |= and the merge operator | are new features in Python 3.9. If you try to use them in an earlier version, you will encounter an error similar to the following, so make sure you are running Python 3.9 or later:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for |=: 'dict' and 'dict'
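If the same code has to run on both older and newer interpreters, one possible workaround is to check the version at runtime and fall back to unpacking. This is only a sketch: the helper name merge_dicts is hypothetical, and on older versions the fallback returns a plain dict even when the inputs are dict subclasses.

```python
import sys


def merge_dicts(a, b):
    """Return a new mapping with b's values taking precedence over a's."""
    if sys.version_info >= (3, 9):
        # Python 3.9+: use the merge operator, which preserves subclasses.
        return a | b
    # Older versions: unpacking always produces a plain dict.
    return {**a, **b}


x = {"key1": "value1 from x", "key2": "value2 from x"}
y = {"key2": "value2 from y", "key3": "value3 from y"}

print(merge_dicts(x, y))
# {'key1': 'value1 from x', 'key2': 'value2 from y', 'key3': 'value3 from y'}
```

The a | b line is valid syntax on older versions too, so the file still compiles there; the line only fails if it is actually executed with dictionaries.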

Summary

The new operators are not here to replace the existing ways of merging and updating, but rather to complement them. Some of the major takeaways are:

  • the merge operator, |, preserves the type of its operands, offers a cleaner syntax, and creates a new object.
  • the update operator, |=, operates in place, cannot be misused as an expression, and doesn't create a new object.
  • the operators are new features in Python 3.9.
