MongoDB query to find all documents containing duplicate fields in an array

    [
    { "_id" : ObjectId("0fffa133x"),
    "properties" : [
        {
            "key" : "1",
            "value" : "a"
        },
        {
            "key" : "1",
            "value" : "b"
        },...
    ]},
    { "_id" : ObjectId("0fffa132x"),
    "properties" : [
        {
            "key" : "1",
            "value" : "a"
        },
        {
            "key" : "2",
            "value" : "b"
        },...
    ]},....
    ]

I'm relatively new to MongoDB, so I'm having trouble with this query. I need a MongoDB query that returns all documents whose properties array contains duplicate keys. In the example above, the query should return the document with id 0fffa133x, since key "1" appears twice in its array. Any help is appreciated!

1 answer

  • answered 2021-04-08 04:32 Dheemanth Bhat

    SOLUTION #1: If you are using a version of MongoDB older than 4.4

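    db.collection.aggregate([
        {
            // Record how many elements the properties array holds.
            $addFields: {
                pSize: { $size: "$properties" }
            }
        },
        {
            // Build the set of distinct keys: start from the first key and
            // fold in every key with $setUnion, which drops duplicates.
            $addFields: {
                uniqueKeys: {
                    $reduce: {
                        input: "$properties",
                        initialValue: [{ $arrayElemAt: ["$properties.key", 0] }],
                        in: {
                            $setUnion: ["$$value", ["$$this.key"]]
                        }
                    }
                }
            }
        },
        {
            // Fewer distinct keys than elements means a key is repeated.
            $match: {
                $expr: {
                    $ne: ["$pSize", { $size: "$uniqueKeys" }]
                }
            }
        },
        {
            // Hide the helper fields from the result.
            $project: { "pSize": 0, "uniqueKeys": 0 }
        }
    ]);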
    

    SOLUTION #2: If you are using MongoDB version >= 4.4

    db.collection.aggregate([
        {
            $addFields: {
                pSize: { $size: "$properties" }
            }
        },
        {
            $addFields: {
                uniqueKeys: {
                    $reduce: {
                        input: "$properties",
                        initialValue: { $first: [["$properties.key"]] },
                        in: {
                            $setUnion: ["$$value", ["$$this.key"]]
                        }
                    }
                }
            }
        },
        {
            $match: {
                $expr: {
                    $ne: ["$pSize", { $size: "$uniqueKeys" }]
                }
            }
        },
        {
            $unset: ["pSize", "uniqueKeys"]
        }
    ]);
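
A more compact variant of the two pipelines above, offered as a sketch rather than a tested replacement: since $setUnion already removes duplicates, the $reduce step can probably be skipped and the comparison done in a single $match stage. It assumes every document has a properties array ($size raises an error when the field is missing or not an array), and the variable name pipeline is only illustrative.

```javascript
// Hypothetical variable name; in the shell it would be run as
// db.collection.aggregate(pipeline).
const pipeline = [
    {
        $match: {
            $expr: {
                // Number of elements vs. number of distinct keys.
                $ne: [
                    { $size: "$properties" },
                    { $size: { $setUnion: ["$properties.key", []] } }
                ]
            }
        }
    }
];
```

Because no helper fields are added, no $project or $unset stage is needed at the end.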
    

    Output:

    {
        "_id" : ObjectId("606f4794bc7414255cc3d49c"),
        "properties" : [
            {
                "key" : "1",
                "value" : "a"
            },
            {
                "key" : "2",
                "value" : "b"
            },
            {
                "key" : "1",
                "value" : "c"
            }
        ]
    }
    

    Test data in collection:

    [
        {
            properties: [
                { key: "1", value: "a" },
                { key: "2", value: "b" },
                { key: "1", value: "c" }
            ]
        },
        {
            properties: [
                { key: "1", value: "a" },
                { key: "2", value: "b" }
            ]
        }
    ]
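
The comparison both solutions rely on (array length versus number of distinct keys) can be sketched in plain JavaScript, outside MongoDB, using the test data above. The helper name hasDuplicateKeys and the variables testData and flags are made up for this illustration; it only mirrors the idea and is not a substitute for running the query on the server.

```javascript
// Hypothetical helper: true when any "key" occurs more than once
// in doc.properties. A missing array is treated as empty.
function hasDuplicateKeys(doc) {
    const keys = (doc.properties || []).map((p) => p.key);
    // A Set keeps one copy of each key, so a smaller size means a repeat.
    return new Set(keys).size !== keys.length;
}

const testData = [
    { properties: [{ key: "1", value: "a" }, { key: "2", value: "b" }, { key: "1", value: "c" }] },
    { properties: [{ key: "1", value: "a" }, { key: "2", value: "b" }] }
];

const flags = testData.map(hasDuplicateKeys);
console.log(flags); // [ true, false ]
```

Only the first document is flagged, which matches the output shown for the aggregation.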