Removing duplicates from a list of dictionaries in Python using deepcopy
I have a list of dictionaries:
list_1 = [{'account': '1234', 'email' : 'abc@xyz.com'}, ... , ...]
I wanted to remove the entries with duplicate emails in the list.
import copy
list_2 = copy.deepcopy(list_1)
for i in mainList:
    for j in range(len(list_2)-1, -1, -1):
        if ((list_2[j]["email"] == mainList[i])):
            list_1.remove(list_1[j])
Here, mainList is the list of emails against which I am comparing values. It looks like this:
['abc@xyz.com', 'efg@cvb.com', ..., ...]
The main problem is that list_1 is not coming out correctly. If I use list(), slicing, or even a list comprehension to copy it, it comes out empty.
The final result should be list_1 containing only one dictionary for each email.
Using copy or deepcopy at least gives me something. It also seems like sometimes I am getting an indexing error.
Using for x in list_2: instead returns list_1 with only one item.
The closest I got to the correct answer was iterating over list_1 itself while removing items, but it was not 100% correct.
3 answers
-
answered 2022-01-25 13:46
Vishal Singh
Iterate over your list of dictionaries and save each one in a new dictionary keyed by its email, but only if that email is not already present.
temp = dict()
list_1 = [{'account': '1234', 'email': 'abc@xyz.com'}]
for d in list_1:
    if d['email'] in temp:
        continue
    else:
        temp[d['email']] = d
final_list = list(temp.values())
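The same idea can also be written with a set of already-seen emails, which may read more directly when the goal is to leave the result in list_1 itself. This is only a sketch: the helper name dedupe_by_email and the sample rows (including account '5678') are made up for illustration and are not from the question.

```python
def dedupe_by_email(rows):
    """Return a new list keeping only the first dictionary seen for each email."""
    seen = set()      # emails encountered so far
    unique = []       # rows kept, in their original order
    for row in rows:
        email = row['email']
        if email not in seen:
            seen.add(email)
            unique.append(row)
    return unique


# Hypothetical sample data; the second row repeats the first row's email.
list_1 = [
    {'account': '1234', 'email': 'abc@xyz.com'},
    {'account': '5678', 'email': 'abc@xyz.com'},
    {'account': '4321', 'email': 'efg@cvb.com'},
]

# Slice assignment replaces the contents of list_1 in place, so nothing is
# removed from the list while it is being iterated over.
list_1[:] = dedupe_by_email(list_1)
print(list_1)
```

Building a new list and assigning it back avoids the skipped-element and index problems that come from calling remove() inside a loop over the same list.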
-
answered 2022-01-25 13:51
RAJ KUMAR NAYAK
It seems like you want to remove duplicate dictionaries. Please also include the duplicate dictionaries in the question.
di = [{'account': '1234', 'email' : 'abc@xyz.com'}, {'account1': '12345', 'email1' : 'abcd@xyz.com'}, {'account': '1234', 'email' : 'abc@xyz.com'}]
# keep a dictionary only if no identical dictionary appears later in the list
s = [i for n, i in enumerate(di) if i not in di[n + 1:]]
print(s)
This would give you the required output:
[{'account1': '12345', 'email1': 'abcd@xyz.com'}, {'account': '1234', 'email': 'abc@xyz.com'}]
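One caveat worth checking with this approach: it compares whole dictionaries, so it only drops entries that match on every key. A rough sketch with made-up rows (account '9999' is hypothetical) suggests that two entries sharing an email but differing in account would both survive:

```python
# Hypothetical sample data, not taken from the question.
rows = [
    {'account': '1234', 'email': 'abc@xyz.com'},
    {'account': '9999', 'email': 'abc@xyz.com'},  # same email, different account
    {'account': '1234', 'email': 'abc@xyz.com'},  # exact duplicate of the first row
]

# Same whole-dictionary comparison as in the answer above.
whole_dict_unique = [row for n, row in enumerate(rows) if row not in rows[n + 1:]]

# Only the exact duplicate is dropped; the email still appears twice.
emails = [row['email'] for row in whole_dict_unique]
print(len(whole_dict_unique))
print(emails)
```

If the requirement is one entry per email, comparing on the 'email' key alone is probably the safer choice.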
-
answered 2022-01-25 13:56
JonSG
The easiest way, I feel, to accomplish this is to create an indexed version of list_1 (a dictionary) based on your key.
list_1 = [
    {'account': '1234', 'email' : 'abc@xyz.com'},
    {'account': '1234', 'email' : 'abc@xyz.com'},
    {'account': '4321', 'email' : 'zzz@xyz.com'},
]
list_1_indexed = {}
for row in list_1:
    list_1_indexed.setdefault(row['email'], row)
list_2 = list(list_1_indexed.values())
print(list_2)
This will give you:
[ {'account': '1234', 'email': 'abc@xyz.com'}, {'account': '4321', 'email': 'zzz@xyz.com'} ]
I'm not sure I would recommend it, but if you wanted to use a comprehension you might do:
list_2 = list({row['email']: row for row in list_1}.values())
Note that with the first strategy the first row seen for each key wins, whereas with the comprehension the last row seen for each key wins.
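To make that difference concrete, here is a small sketch with two made-up rows that share an email but have different accounts (account '5678' is hypothetical and not from the question):

```python
# Hypothetical sample data: same email, different accounts.
rows = [
    {'account': '1234', 'email': 'abc@xyz.com'},
    {'account': '5678', 'email': 'abc@xyz.com'},
]

# setdefault() only stores a row if the key is not present yet,
# so the first row for each email is kept.
first_wins = {}
for row in rows:
    first_wins.setdefault(row['email'], row)

# A dict comprehension overwrites earlier values for the same key,
# so the last row for each email is kept.
last_wins = {row['email']: row for row in rows}

print(first_wins['abc@xyz.com']['account'])  # 1234
print(last_wins['abc@xyz.com']['account'])   # 5678
```

Which behaviour is preferable depends on whether the earlier or the later entry should be treated as authoritative.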