# Calculate difference in keys contained in two Python dictionaries

Questions:

Suppose I have two Python dictionaries – `dictA` and `dictB`. I need to find out if there are any keys which are present in `dictB` but not in `dictA`. What is the fastest way to go about it?

Should I convert the dictionary keys into sets and then compare them?

Apologies for not stating my question properly.
My scenario is this: `dictA` can be the same as `dictB`, it may be missing some keys that `dictB` has, or the values of some keys might differ, in which case they have to be set to the value of the `dictA` key.

The problem is that the dictionaries have no fixed structure, and their values can themselves be dicts of dicts.

Say

``````dictA={'key1':a, 'key2':b, 'key3':{'key11':cc, 'key12':dd}, 'key4':{'key111':{....}}}
dictB={'key1':a, 'key2':newb, 'key3':{'key11':cc, 'key12':newdd, 'key13':ee}.......
``````

So ‘key2’ value has to be reset to the new value and ‘key13’ has to be added inside the dict.
The key value does not have a fixed format. It can be a simple value or a dict or a dict of dict.

You can use set operations on the keys:

``````diff = set(dictB.keys()) - set(dictA.keys())
``````
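In Python 3, dictionary key views support set operators directly, so the intermediate `set(...)` calls are optional. A minimal sketch; the sample dictionaries are made up for illustration:

```python
# Hypothetical sample data; only the key-view operations matter here.
dictA = {"key1": 1, "key2": 2}
dictB = {"key1": 1, "key2": 20, "key3": 3}

# Keys present in dictB but missing from dictA.
missing_from_a = dictB.keys() - dictA.keys()

# Keys present in both dictionaries.
common = dictA.keys() & dictB.keys()

print(sorted(missing_from_a))  # ['key3']
print(sorted(common))          # ['key1', 'key2']
```

Both results are ordinary `set` objects, so they can be passed on to any code that expects a set.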

Here is a class to find all the possibilities: what was added, what was removed, which key-value pairs are the same, and which key-value pairs are changed.

``````class DictDiffer(object):
    """
    Calculate the difference between two dictionaries as:
    (1) items added
    (2) items removed
    (3) keys same in both but changed values
    (4) keys same in both and unchanged values
    """
    def __init__(self, current_dict, past_dict):
        self.current_dict, self.past_dict = current_dict, past_dict
        self.set_current, self.set_past = set(current_dict.keys()), set(past_dict.keys())
        self.intersect = self.set_current.intersection(self.set_past)
    def added(self):
        return self.set_current - self.intersect
    def removed(self):
        return self.set_past - self.intersect
    def changed(self):
        return set(o for o in self.intersect if self.past_dict[o] != self.current_dict[o])
    def unchanged(self):
        return set(o for o in self.intersect if self.past_dict[o] == self.current_dict[o])
``````

Here is some sample output:

``````>>> a = {'a': 1, 'b': 1, 'c': 0}
>>> b = {'a': 1, 'b': 2, 'd': 0}
>>> d = DictDiffer(b, a)
>>> print "Added:", d.added()
Added: set(['d'])
>>> print "Removed:", d.removed()
Removed: set(['c'])
>>> print "Changed:", d.changed()
Changed: set(['b'])
>>> print "Unchanged:", d.unchanged()
Unchanged: set(['a'])
``````

Available as a github repo:
https://github.com/hughdbrown/dictdiffer

Questions:

In case you want the difference recursively, I have written a package for Python:
https://github.com/seperman/deepdiff

## Installation

Install from PyPI:

``````pip install deepdiff
``````

## Example usage

Importing

``````>>> from deepdiff import DeepDiff
>>> from pprint import pprint
>>> from __future__ import print_function # In case running on Python 2
``````

Same object returns empty

``````>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = t1
>>> print(DeepDiff(t1, t2))
{}
``````

Type of an item has changed

``````>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:"2", 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{ 'type_changes': { 'root[2]': { 'newtype': <class 'str'>,
                                 'newvalue': '2',
                                 'oldtype': <class 'int'>,
                                 'oldvalue': 2}}}
``````

Value of an item has changed

``````>>> t1 = {1:1, 2:2, 3:3}
>>> t2 = {1:1, 2:4, 3:3}
>>> pprint(DeepDiff(t1, t2), indent=2)
{'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}
``````

Item added and/or removed

``````>>> t1 = {1:1, 2:2, 3:3, 4:4}
>>> t2 = {1:1, 2:4, 3:3, 5:5, 6:6}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff)
{'dic_item_added': ['root[5]', 'root[6]'],
 'dic_item_removed': ['root[4]'],
 'values_changed': {'root[2]': {'newvalue': 4, 'oldvalue': 2}}}
``````

String difference

``````>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world"}}
>>> t2 = {1:1, 2:4, 3:3, 4:{"a":"hello", "b":"world!"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { 'root[2]': {'newvalue': 4, 'oldvalue': 2},
                      "root[4]['b']": { 'newvalue': 'world!',
                                        'oldvalue': 'world'}}}
``````

String difference 2

``````>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world!\nGoodbye!\n1\n2\nEnd"}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n1\n2\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'values_changed': { "root[4]['b']": { 'diff': '--- \n'
                                                '+++ \n'
                                                '@@ -1,5 +1,4 @@\n'
                                                '-world!\n'
                                                '-Goodbye!\n'
                                                '+world\n'
                                                ' 1\n'
                                                ' 2\n'
                                                ' End',
                                        'newvalue': 'world\n1\n2\nEnd',
                                        'oldvalue': 'world!\n'
                                                    'Goodbye!\n'
                                                    '1\n'
                                                    '2\n'
                                                    'End'}}}

>>>
>>> print (ddiff['values_changed']["root[4]['b']"]["diff"])
---
+++
@@ -1,5 +1,4 @@
-world!
-Goodbye!
+world
 1
 2
 End
``````

Type change

``````>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":"world\n\n\nEnd"}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'type_changes': { "root[4]['b']": { 'newtype': <class 'str'>,
                                      'newvalue': 'world\n\n\nEnd',
                                      'oldtype': <class 'list'>,
                                      'oldvalue': [1, 2, 3]}}}
``````

List difference

``````>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3, 4]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{'iterable_item_removed': {"root[4]['b'][2]": 3, "root[4]['b'][3]": 4}}
``````

List difference 2:

``````>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'iterable_item_added': {"root[4]['b'][3]": 3},
  'values_changed': { "root[4]['b'][1]": {'newvalue': 3, 'oldvalue': 2},
                      "root[4]['b'][2]": {'newvalue': 2, 'oldvalue': 3}}}
``````

List difference ignoring order or duplicates: (with the same dictionaries as above)

``````>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, 3]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 3, 2, 3]}}
>>> ddiff = DeepDiff(t1, t2, ignore_order=True)
>>> print (ddiff)
{}
``````

List that contains dictionary:

``````>>> t1 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:1, 2:2}]}}
>>> t2 = {1:1, 2:2, 3:3, 4:{"a":"hello", "b":[1, 2, {1:3}]}}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (ddiff, indent = 2)
{ 'dic_item_removed': ["root[4]['b'][2][2]"],
  'values_changed': {"root[4]['b'][2][1]": {'newvalue': 3, 'oldvalue': 1}}}
``````

Sets:

``````>>> t1 = {1, 2, 8}
>>> t2 = {1, 2, 3, 5}
>>> ddiff = DeepDiff(t1, t2)
>>> pprint (DeepDiff(t1, t2))
``````

Named Tuples:

``````>>> from collections import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> t1 = Point(x=11, y=22)
>>> t2 = Point(x=11, y=23)
>>> pprint (DeepDiff(t1, t2))
{'values_changed': {'root.y': {'newvalue': 23, 'oldvalue': 22}}}
``````

Custom objects:

``````>>> class ClassA(object):
...     a = 1
...     def __init__(self, b):
...         self.b = b
...
>>> t1 = ClassA(1)
>>> t2 = ClassA(2)
>>>
>>> pprint(DeepDiff(t1, t2))
{'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}
``````

Object attribute added:

``````>>> t2.c = "new attribute"
>>> pprint(DeepDiff(t1, t2))
{'attribute_added': ['root.c'],
 'values_changed': {'root.b': {'newvalue': 2, 'oldvalue': 1}}}
``````

Questions:

Not sure whether it’s “fast” or not, but normally one can do this:

``````dicta = {"a":1,"b":2,"c":3,"d":4}
dictb = {"a":1,"d":2}
for key in dicta.keys():
    if key not in dictb:
        print(key)
``````

Questions:

As Alex Martelli wrote, if you simply want to check if any key in B is not in A, `any(True for k in dictB if k not in dictA)` would be the way to go.

To find the keys that are missing:

``````diff = set(dictB)-set(dictA) #sets

C:\Dokumente und Einstellungen\thc>python -m timeit -s "dictA =
dict(zip(range(1000),range
(1000))); dictB = dict(zip(range(0,2000,2),range(1000)))" "diff=set(dictB)-set(dictA)"
10000 loops, best of 3: 107 usec per loop

diff = [ k for k in dictB if k not in dictA ] #lc

C:\Dokumente und Einstellungen\thc>python -m timeit -s "dictA =
dict(zip(range(1000),range
(1000))); dictB = dict(zip(range(0,2000,2),range(1000)))" "diff=[ k for k in dictB if
k not in dictA ]"
10000 loops, best of 3: 95.9 usec per loop
``````

So those two solutions are pretty much the same speed.
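The same comparison can be reproduced from within Python using the standard `timeit` module. This is a sketch under the same setup as the command lines above; the repetition count of 1000 is an arbitrary choice, and the absolute numbers will vary by machine and Python version.

```python
import timeit

# Same dictionaries as in the command-line runs above.
setup = (
    "dictA = dict(zip(range(1000), range(1000)));"
    "dictB = dict(zip(range(0, 2000, 2), range(1000)))"
)

# Total seconds for 1000 executions of each statement.
set_time = timeit.timeit("set(dictB) - set(dictA)", setup=setup, number=1000)
lc_time = timeit.timeit("[k for k in dictB if k not in dictA]", setup=setup, number=1000)

print("set difference:     %.1f usec per loop" % (set_time / 1000 * 1e6))
print("list comprehension: %.1f usec per loop" % (lc_time / 1000 * 1e6))
```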

Questions:

If you really mean exactly what you say (that you only need to find out IF “there are any keys” in B and not in A, not WHICH ONES might those be if any), the fastest way should be:

``````if any(True for k in dictB if k not in dictA): ...
``````

If you actually need to find out WHICH KEYS, if any, are in B and not in A, and not just “IF” there are such keys, then existing answers are quite appropriate (but I do suggest more precision in future questions if that’s indeed what you mean;-).
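To illustrate the early exit, here is a small sketch. `has_missing_keys` is a hypothetical helper name, and `any(k not in dict_a for k in dict_b)` is an equivalent spelling of the expression above.

```python
def has_missing_keys(dict_a, dict_b):
    """Return True if dict_b has at least one key that dict_a lacks."""
    # any() stops consuming the generator at the first True,
    # so no set or list of all missing keys is ever built.
    return any(k not in dict_a for k in dict_b)


print(has_missing_keys({"a": 1, "b": 2}, {"a": 1}))          # False
print(has_missing_keys({"a": 1, "b": 2}, {"a": 1, "c": 3}))  # True
```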

Questions:
``````set(dictA.keys()).intersection(dictB.keys())
``````

Questions:

There is another question on Stack Overflow about this topic, and it describes a simple solution: the `datadiff` library for Python helps print the difference between two dictionaries.

Questions:

Here’s a way that will work, allows for keys that evaluate to `False`, and still uses a generator expression to fall out early if possible. It’s not exceptionally pretty though.

``````any(map(lambda x: True, (k for k in b if k not in a)))
``````

EDIT:

THC4k posted a reply to my comment on another answer. Here’s a better, prettier way to do the above:

``````any(True for k in b if k not in a)
``````

Not sure how that never crossed my mind…

Questions:

This is an old question and asks a little bit less than what I needed so this answer actually solves more than this question asks. The answers in this question helped me solve the following:

1. (asked) Record differences between two dictionaries
2. Merge differences from #1 into base dictionary
3. (asked) Merge differences between two dictionaries (treat dictionary #2 as if it were a diff dictionary)
4. Try to detect item movements as well as changes
5. (asked) Do all of this recursively

All this combined with JSON makes for a pretty powerful configuration storage support.

The solution (also on github):

``````from collections import OrderedDict
from pprint import pprint

class izipDestinationMatching(object):
    __slots__ = ("attr", "value", "index")

    def __init__(self, attr, value, index):
        self.attr, self.value, self.index = attr, value, index

    def __repr__(self):
        return "izip_destination_matching: found match by '%s' = '%s' @ %d" % (self.attr, self.value, self.index)

def izip_destination(a, b, attrs, addMarker=True):
    """
    Returns zipped lists, but final size is equal to b with (if shorter) a padded with nulls
    Additionally also tries to find item reallocations by searching child dicts (if they are dicts) for attribute, listed in attrs)
    When addMarker == False (patching), final size will be the longer of a, b
    """
    for idx, item in enumerate(b):
        try:
            attr = next((x for x in attrs if x in item), None)  # See if the item has any of the ID attributes
            match, matchIdx = next(((orgItm, idx) for idx, orgItm in enumerate(a) if attr in orgItm and orgItm[attr] == item[attr]), (None, None)) if attr else (None, None)
            if match and matchIdx != idx and addMarker: item[izipDestinationMatching] = izipDestinationMatching(attr, item[attr], matchIdx)
        except:
            match = None
        yield (match if match else a[idx] if len(a) > idx else None), item
    if not addMarker and len(a) > len(b):
        for item in a[len(b) - len(a):]:
            yield item, item

def dictdiff(a, b, searchAttrs=[]):
    """
    returns a dictionary which represents difference from a to b
    the return dict is as short as possible:
    equal items are removed
    added / changed items are listed
    removed items are listed with value=None
    Also processes list values where the resulting list size will match that of b.
    It can also search said list items (that are dicts) for identity values to detect changed positions.
    In case such identity value is found, it is kept so that it can be re-found during the merge phase
    @param a: original dict
    @param b: new dict
    @param searchAttrs: list of strings (keys to search for in sub-dicts)
    @return: dict / list / whatever input is
    """
    if not (isinstance(a, dict) and isinstance(b, dict)):
        if isinstance(a, list) and isinstance(b, list):
            return [dictdiff(v1, v2, searchAttrs) for v1, v2 in izip_destination(a, b, searchAttrs)]
        return b
    res = OrderedDict()
    if izipDestinationMatching in b:
        keepKey = b[izipDestinationMatching].attr
        del b[izipDestinationMatching]
    else:
        keepKey = izipDestinationMatching
    for key in sorted(set(a) | set(b)):  # union of keys; works on Python 2 and 3
        v1 = a.get(key, None)
        v2 = b.get(key, None)
        if keepKey == key or v1 != v2: res[key] = dictdiff(v1, v2, searchAttrs)
    if len(res) <= 1: res = dict(res)  # This is only here for pretty print (OrderedDict doesn't pprint nicely)
    return res

def dictmerge(a, b, searchAttrs=[]):
    """
    Returns a dictionary which merges differences recorded in b to base dictionary a
    Also processes list values where the resulting list size will match that of a
    It can also search said list items (that are dicts) for identity values to detect changed positions
    @param a: original dict
    @param b: diff dict to patch into a
    @param searchAttrs: list of strings (keys to search for in sub-dicts)
    @return: dict / list / whatever input is
    """
    if not (isinstance(a, dict) and isinstance(b, dict)):
        if isinstance(a, list) and isinstance(b, list):
            return [dictmerge(v1, v2, searchAttrs) for v1, v2 in izip_destination(a, b, searchAttrs, False)]
        return b
    res = OrderedDict()
    for key in sorted(set(a) | set(b)):  # union of keys; works on Python 2 and 3
        v1 = a.get(key, None)
        v2 = b.get(key, None)
        #print "processing", key, v1, v2, key not in b, dictmerge(v1, v2)
        if v2 is not None: res[key] = dictmerge(v1, v2, searchAttrs)
        elif key not in b: res[key] = v1
    if len(res) <= 1: res = dict(res)  # This is only here for pretty print (OrderedDict doesn't pprint nicely)
    return res
``````

Questions:

What about the standard approach (comparing the FULL object)?

PyDev->new PyDev Module->Module: unittest

``````import unittest

class Test(unittest.TestCase):

    def testName(self):
        obj1 = {1:1, 2:2}
        obj2 = {1:1, 2:2}
        self.maxDiff = None # sometimes useful
        self.assertDictEqual(obj1, obj2)

if __name__ == "__main__":
    #import sys;sys.argv = ['', 'Test.testName']

    unittest.main()
``````

Questions:

If on Python ≥ 2.7:

``````# update different values in dictB
# I would assume only dictA should be updated,
# but the question specifies otherwise

for k in dictA.viewkeys() & dictB.viewkeys():
    if dictA[k] != dictB[k]:
        dictB[k]= dictA[k]

# add missing keys to dictA

dictA.update( (k,dictB[k]) for k in dictB.viewkeys() - dictA.viewkeys() )
``````
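The snippet above only looks at top-level keys, while the question mentions nested dictionaries. A minimal recursive sketch in Python 3 follows. `sync_nested` is a hypothetical helper, and the direction of the update (copying from `source` into `target`) is an assumption about what the asker wants.

```python
import copy


def sync_nested(target, source):
    """Copy new keys and changed values from source into target, in place.

    Nested dicts are merged recursively. Keys that exist only in target
    are left untouched.
    """
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            # Both sides hold a dict: descend instead of overwriting.
            sync_nested(target[key], value)
        else:
            # New key, changed value, or type mismatch: take source's value.
            # deepcopy avoids sharing nested objects between the two dicts.
            target[key] = copy.deepcopy(value)
    return target


dictA = {"key1": "a", "key2": "b", "key3": {"key11": "cc", "key12": "dd"}}
dictB = {"key1": "a", "key2": "newb",
         "key3": {"key11": "cc", "key12": "newdd", "key13": "ee"}}
sync_nested(dictA, dictB)
print(dictA == dictB)  # True
```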

Questions:

Here is a solution for deep comparing 2 dictionaries keys:

``````def compareDictKeys(dict1, dict2):
    if type(dict1) != dict or type(dict2) != dict:
        return False

    keys1, keys2 = dict1.keys(), dict2.keys()
    diff = set(keys1) - set(keys2) or set(keys2) - set(keys1)

    if not diff:
        for key in keys1:
            if (type(dict1[key]) == dict or type(dict2[key]) == dict) and not compareDictKeys(dict1[key], dict2[key]):
                diff = True
                break

    return not diff
``````

Questions:

here’s a solution that can compare more than two dicts:

``````def diff_dict(dicts, default=None):
    diff_dict = {}
    # add 'list()' around 'd.keys()' for python 3 compatibility
    for k in set(sum([d.keys() for d in dicts], [])):
        # we can just use "values = [d.get(k, default) ..." below if
        # we don't care that d1[k]=default and d2[k]=missing will
        # be treated as equal
        if any(k not in d for d in dicts):
            diff_dict[k] = [d.get(k, default) for d in dicts]
        else:
            values = [d[k] for d in dicts]
            if any(v != values[0] for v in values):
                diff_dict[k] = values
    return diff_dict
``````

usage example:

``````import matplotlib.pyplot as plt
diff_dict([plt.rcParams, plt.rcParamsDefault, plt.matplotlib.rcParamsOrig])
``````
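The matplotlib example needs that package installed. For a smaller illustration with plain dicts, here is a self-contained Python 3 sketch of the same idea. `diff_many` is a hypothetical name, and it gathers the keys with `set().union(*dicts)` instead of the `sum(...)` idiom.

```python
def diff_many(dicts, default=None):
    """Map each key whose value differs across `dicts` to the list of its values."""
    result = {}
    # set().union(*dicts) collects every key from every dict.
    for k in set().union(*dicts):
        values = [d.get(k, default) for d in dicts]
        missing_somewhere = any(k not in d for d in dicts)
        if missing_somewhere or any(v != values[0] for v in values):
            result[k] = values
    return result


configs = [{"a": 1, "b": 2}, {"a": 1, "b": 3}, {"a": 1, "c": 4}]
print(diff_many(configs) == {"b": [2, 3, None], "c": [None, None, 4]})  # True
```

Key `"a"` is left out because all three dicts agree on it.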

Questions:

Not sure if this is still relevant, but I came across the same problem. In my situation I just needed to return a dictionary of the changes across all nested dictionaries. I could not find a good solution out there, so I ended up writing a simple function to do this. Hope this helps.
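The function from this answer is not included here. As a stand-in, here is a minimal sketch of one way to collect changes across nested dictionaries. `nested_changes` and its output format are assumptions for illustration, not the original author's code.

```python
def nested_changes(old, new):
    """Return a dict describing how `new` differs from `old`.

    Added or changed keys map to their new value, removed keys map to None,
    and nested dicts are compared recursively. Note that a removed key and
    a key whose new value is None look the same in this simple format.
    """
    changes = {}
    for key in old.keys() | new.keys():
        if key not in new:
            changes[key] = None
        elif key not in old:
            changes[key] = new[key]
        elif isinstance(old[key], dict) and isinstance(new[key], dict):
            sub = nested_changes(old[key], new[key])
            if sub:  # record nested dicts only when something changed
                changes[key] = sub
        elif old[key] != new[key]:
            changes[key] = new[key]
    return changes


old = {"key1": "a", "key2": "b", "key3": {"key11": "cc", "key12": "dd"}}
new = {"key1": "a", "key2": "newb",
       "key3": {"key11": "cc", "key12": "newdd", "key13": "ee"}}
expected = {"key2": "newb", "key3": {"key12": "newdd", "key13": "ee"}}
print(nested_changes(old, new) == expected)  # True
```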

Questions:

If you want a built-in solution for a full comparison with arbitrary dict structures, @Maxx’s answer is a good start.

``````import unittest

test = unittest.TestCase()
test.assertEqual(dictA, dictB)
``````
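If the goal is to get the difference as text rather than have an exception propagate, the assertion can be wrapped. This is a sketch assuming Python 3, where `unittest.TestCase()` can be instantiated directly; `dict_diff_message` is a hypothetical helper name.

```python
import unittest


def dict_diff_message(dict_a, dict_b):
    """Return unittest's failure message for two dicts, or '' if they are equal."""
    case = unittest.TestCase()
    case.maxDiff = None  # do not truncate long diffs
    try:
        case.assertEqual(dict_a, dict_b)
    except AssertionError as exc:
        return str(exc)
    return ""


print(dict_diff_message({"a": 1, "b": 2}, {"a": 1, "b": 3}))
```

The exact wording of the message comes from `unittest` and may differ between Python versions.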

Questions:

``````dicta = {"a":1,"d":2}
dictb = {"a":5,"d":2}

for value in dicta.values():
    if value not in dictb.values():
        print(value)
``````

This prints each value of dicta that does not appear among the values of dictb.

Questions:

Below I created two dictionaries. I need to return the key and value differences between them, and I am stuck. I am not sure which way is correct. I want to first check whether they are the same and, if they are not, print the key-value differences. I don't want to use deepdiff. How do I compare them to see whether they are the same?

``````num_list = [1,2]
val_list = [0,1]
dict1 = dict(zip(num_list,val_list))
print dict1

num_list2= [1,2]
val_list2 = [0,6]
dict2 = dict(zip(num_list2,val_list2))
print dict2
if dict1 == dict2
``````

output : currently
{1: 0, 2: 1}
{1: 0, 2: 6}

Questions:

My recipe of symmetric difference between two dictionaries:

``````def find_dict_diffs(dict1, dict2):
    unequal_keys = []
    unequal_keys.extend(set(dict1.keys()).symmetric_difference(set(dict2.keys())))
    for k in dict1.keys():
        if dict1.get(k, 'N\A') != dict2.get(k, 'N\A'):
            unequal_keys.append(k)
    if unequal_keys:
        print 'param', 'dict1\t', 'dict2'
        for k in set(unequal_keys):
            print str(k)+'\t'+dict1.get(k, 'N\A')+'\t '+dict2.get(k, 'N\A')
    else:
        print 'Dicts are equal'

dict1 = {1:'a', 2:'b', 3:'c', 4:'d', 5:'e'}
dict2 = {1:'b', 2:'a', 3:'c', 4:'d', 6:'f'}

find_dict_diffs(dict1, dict2)
``````

And result is:

``````param   dict1   dict2
1       a       b
2       b       a
5       e       N\A
6       N\A     f
``````

Questions:

As mentioned in other answers, unittest produces some nice output for comparing dicts, but in this example we don’t want to have to build a whole test first.

Scraping the unittest source, it looks like you can get a fair solution with just this:

``````import difflib
import pprint

def diff_dicts(a, b):
    if a == b:
        return ''
    return '\n'.join(
        difflib.ndiff(pprint.pformat(a, width=30).splitlines(),
                      pprint.pformat(b, width=30).splitlines())
    )
``````

so

``````dictA = dict(zip(range(7), map(ord, 'python')))
dictB = {0: 112, 1: 'spam', 2: [1,2,3], 3: 104, 4: 111}
print diff_dicts(dictA, dictB)
``````

Results in:

``````{0: 112,
-  1: 121,
-  2: 116,
+  1: 'spam',
+  2: [1, 2, 3],
3: 104,
-  4: 111,
?        ^

+  4: 111}
?        ^

-  5: 110}
``````

Where:

• ‘-’ indicates key/values in the first but not second dict
• ‘+’ indicates key/values in the second but not the first dict

Like in unittest, the only caveat is that the final entry can be reported as a difference because of the trailing comma/bracket.

Questions:

Try this to find the intersection, that is, the keys that are in both dictionaries. If you want the keys that are not found in the second dictionary, just use `not in` instead:

``````intersect = filter(lambda x, dictB=dictB.keys(): x in dictB, dictA.keys())
``````

Questions:

@Maxx has an excellent answer, use the `unittest` tools provided by Python:

``````import unittest

class Test(unittest.TestCase):
    def runTest(self):
        pass

    def testDict(self, d1, d2, maxDiff=None):
        self.maxDiff = maxDiff
        self.assertDictEqual(d1, d2)
``````

Then, anywhere in your code you can call:

``````try:
    Test().testDict(dict1, dict2)
except Exception as e:
    print(e)
``````

The resulting output looks like the output from `diff`, pretty-printing the dictionaries with `+` or `-` prepending each line that is different.