How to remove partial duplicates in a list of lists

I have a list of lists that looks like this:

[[1,'a',2],[1,'b',2],[1,'a',3]]

I want to remove a sublist when its second element is the same as another sublist's (e.g. both are 'a').

I want to create output that looks like:

[[1,'a',2],[1,'b',2]]

where the first of each group of duplicates is kept.

3 answers

  • answered 2018-10-11 19:33 user3483203

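    You can build a dictionary keyed on the second element, iterating over the reversed list, to drop duplicates. Because later assignments overwrite earlier ones, going in reverse means the first occurrence in the original list is written last and is the one kept: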

    dct = {j: (i, k) for i, j, k in reversed(L)}
    

    {'a': (1, 2), 'b': (1, 2)}
    

    Getting the result back as a list:

    [[i, j, k] for j, (i, k) in dct.items()]
    

    [[1, 'a', 2], [1, 'b', 2]]
    

    While this solution always keeps the first occurrence of each duplicate, the relative order of the elements in the final result is not guaranteed to match the original list.
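
    If the original order matters, one possible variant walks the list forward and uses dict.setdefault, which stores a value only when the key is not already present. This is a minimal sketch that assumes Python 3.7+ (where dicts keep insertion order); the name first_by_key is illustrative and not part of the answer above.

```python
L = [[1, 'a', 2], [1, 'b', 2], [1, 'a', 3]]

# Hypothetical name: maps each second element to the first row that contains it.
first_by_key = {}
for row in L:
    # setdefault stores the row only if this key has not been seen yet,
    # so later duplicates are ignored.
    first_by_key.setdefault(row[1], row)

# Dict values come back in insertion order (Python 3.7+),
# i.e. the order in which each key first appeared in L.
result = list(first_by_key.values())
print(result)
```

    Iterating forward avoids the reversal, so the kept rows appear in the same order as their first occurrences in L.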

  • answered 2018-10-11 19:33 Jean-François Fabre

    This is a variant of "How do you remove duplicates from a list whilst preserving order?".

    You can use a marker set to track the second elements you have already seen. Strings are immutable and therefore hashable, so they can be stored in a set:

    lst = [[1,'a',2],[1,'b',2],[1,'a',3]]
    
    marker_set = set()
    
    result = []
    
    for sublist in lst:
        second_elt = sublist[1]
        if second_elt not in marker_set:
            result.append(sublist)
            marker_set.add(second_elt)
    
    print(result)
    

    prints:

    [[1, 'a', 2], [1, 'b', 2]]
    

    (Using a marker set rather than a list gives an average O(1) lookup instead of O(N).)
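
    The same marker-set loop can be wrapped in a reusable function. The sketch below is one way to do it; unique_by and its key parameter are hypothetical names introduced for illustration, and it assumes that whatever key returns is hashable.

```python
def unique_by(rows, key):
    """Keep the first row for each distinct key(row), preserving order."""
    seen = set()      # keys encountered so far
    result = []
    for row in rows:
        k = key(row)  # must be hashable to be stored in the set
        if k not in seen:
            seen.add(k)
            result.append(row)
    return result


lst = [[1, 'a', 2], [1, 'b', 2], [1, 'a', 3]]
# Deduplicate on the second element of each sublist.
print(unique_by(lst, key=lambda row: row[1]))
```

    Passing a different key, such as lambda row: (row[0], row[1]), would deduplicate on a combination of columns instead.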

  • answered 2018-10-11 19:37 vash_the_stampede

    lst = [[1,'a',2],[1,'b',2],[1,'a',3]]
    res = []
    for i in lst:
        # Compare second elements only; `i[1] in j` would match any position in j.
        if not any(i[1] == j[1] for j in res):
            res.append(i)
    
    print(res)
    # [[1, 'a', 2], [1, 'b', 2]]