Removing duplicates from a list of dictionaries in Python using deepcopy

I have a list of dictionaries:

list_1 = [{'account': '1234', 'email' : 'abc@xyz.com'}, ... , ...]

I want to remove the entries with duplicate emails from the list. This is what I tried:

import copy
list_2 = copy.deepcopy(list_1)
for i in mainList:
    for j in range(len(list_2)-1, -1, -1):
        if list_2[j]["email"] == mainList[i]:
            list_1.remove(list_1[j])

mainList here is the list of emails with which I am comparing values. mainList looks like this:

['abc@xyz.com', 'efg@cvb.com', ..., ...]

The main problem is that list_1 does not come out correctly. The final result should leave list_1 with exactly one dictionary for each email. If I copy the list with list(), slicing, or a list comprehension, list_1 comes out empty. Using copy or deepcopy at least gives me something, but I sometimes get an index error. Using for x in list_2: instead leaves list_1 with only one item. The closest I got to the correct answer was iterating over list_1 itself while removing items, but the result was still not fully correct.

3 answers

  • answered 2022-01-25 13:46 Vishal Singh

    Iterate over your list of dictionaries and store each dictionary in a new dictionary keyed by its email, but only if that email is not already present.

    temp = dict()
    list_1 = [{'account': '1234', 'email': 'abc@xyz.com'}]
    for d in list_1:
        if d['email'] in temp:
            continue
        else:
            temp[d['email']] = d
    final_list = list(temp.values())
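
    The same idea can be wrapped in a small helper. This is only a sketch: the name dedupe_by_key and the sample rows are made up for illustration, and it assumes every dictionary contains the key. It tracks the values already seen in a set and keeps the first row for each value, without modifying the input list.

```python
def dedupe_by_key(rows, key):
    """Return a new list keeping the first row seen for each value of rows[i][key]."""
    seen = set()   # key values encountered so far
    unique = []    # rows kept, in their original order
    for row in rows:
        value = row[key]  # assumes every row has this key
        if value not in seen:
            seen.add(value)
            unique.append(row)
    return unique


# Hypothetical sample data: two rows share the same email.
list_1 = [
    {'account': '1234', 'email': 'abc@xyz.com'},
    {'account': '5678', 'email': 'abc@xyz.com'},
    {'account': '4321', 'email': 'zzz@xyz.com'},
]

final_list = dedupe_by_key(list_1, 'email')
print(final_list)
```

    Because the helper builds a new list instead of removing items from list_1 while looping, it avoids the shifting-index problems described in the question.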
    

  • answered 2022-01-25 13:51 RAJ KUMAR NAYAK

    It seems you want to remove duplicate dictionaries. Please also include the duplicate dictionaries in the question.

    di = [
        {'account': '1234', 'email': 'abc@xyz.com'},
        {'account1': '12345', 'email1': 'abcd@xyz.com'},
        {'account': '1234', 'email': 'abc@xyz.com'},
    ]
    # Keep a dictionary only if no identical dictionary appears later in the list.
    s = [i for n, i in enumerate(di) if i not in di[n + 1:]]
    print(s)
    

    This gives the required output:

    [{'account1': '12345', 'email1': 'abcd@xyz.com'}, {'account': '1234', 'email': 'abc@xyz.com'}]
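
    One caveat worth illustrating: this technique compares whole dictionaries, so two entries that share an email but differ in another field both survive. The sketch below uses made-up sample data (the '9999' account is hypothetical) to show that case.

```python
# Hypothetical data: the second entry shares the email but has a different account.
di = [
    {'account': '1234', 'email': 'abc@xyz.com'},
    {'account': '9999', 'email': 'abc@xyz.com'},  # same email, different account
    {'account': '1234', 'email': 'abc@xyz.com'},  # exact duplicate of the first entry
]

# Drop an entry only when an identical dictionary appears later in the list.
whole_dict_unique = [d for n, d in enumerate(di) if d not in di[n + 1:]]

# Collect the distinct emails that remain.
remaining_emails = {d['email'] for d in whole_dict_unique}

print(whole_dict_unique)
print(remaining_emails)
```

    Two dictionaries remain even though only one distinct email is left, so deduplicating by email needs a comparison on the 'email' value rather than on the whole dictionary.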
    

  • answered 2022-01-25 13:56 JonSG

    The easiest way to accomplish this, I feel, is to build an indexed version of list_1 (a dictionary) keyed on the field you want to be unique, here the email.

    list_1 = [
        {'account': '1234', 'email' : 'abc@xyz.com'},
        {'account': '1234', 'email' : 'abc@xyz.com'},
        {'account': '4321', 'email' : 'zzz@xyz.com'},
    ]
    
    list_1_indexed = {}
    for row in list_1:
        list_1_indexed.setdefault(row['email'], row)
    list_2 = list(list_1_indexed.values())
    
    print(list_2)
    

    This will give you:

    [
        {'account': '1234', 'email': 'abc@xyz.com'},
        {'account': '4321', 'email': 'zzz@xyz.com'}
    ]
    

    I'm not sure I would recommend it, but if you wanted to use a comprehension you might do:

    list_2 = list({row['email']: row for row in list_1}.values())
    

    Note that with the first strategy (setdefault) the first row seen for each key wins, whereas with the comprehension the last row for each key wins.
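
    To make the difference concrete, here is a small sketch that runs both strategies on the same input. The two account numbers are made up for illustration so that the rows can be told apart.

```python
# Hypothetical data: two rows with the same email but different accounts.
list_1 = [
    {'account': '1111', 'email': 'abc@xyz.com'},
    {'account': '2222', 'email': 'abc@xyz.com'},
]

# setdefault only stores a row if the key is not present yet: first row wins.
first_wins = {}
for row in list_1:
    first_wins.setdefault(row['email'], row)

# A dict comprehension overwrites earlier entries for the same key: last row wins.
last_wins = {row['email']: row for row in list_1}

print(list(first_wins.values()))
print(list(last_wins.values()))
```

    Which behaviour is preferable depends on whether the earliest or the most recent record for an email should be treated as authoritative.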
