Select rows where a field is associated with more than one predefined value

I had a hard time choosing a title for this question. I am not sure it describes accurately what I want, so I will be grateful if instead of a downvote you will help to improve the title. :)

I have the following (sample) tables:

USERS:
+----------+
|    id    |
+----------+
|     1    |
------------
|     3    |
------------
|     4    |
+----------+
URLS:
+----------+----------+
|    id    |    url   |
+----------+----------+
|     1    |   a.com  |
-----------------------
|     1    |   b.com  |
-----------------------
|     1    |   d.com  |
-----------------------
|     2    |   a.com  |
-----------------------
|     2    |   e.com  |
-----------------------
|     3    |   a.com  |
-----------------------
|     3    |   e.com  |
-----------------------
|     3    |   f.com  |
-----------------------
|     3    |   g.com  |
-----------------------
|     4    |   a.com  |
-----------------------
|     4    |   e.com  |
+----------+----------+

I want to select from URLS all the URL's that corresponds to at least two ids that are in USERS. So for eample, the result of such query with these sample table:

+----------+
|    url   |
+----------+
|   a.com  | <- associated with ids: 1, 3, 4
------------
|   e.com  | <- associated with ids: 3, 4
+----------+

As you can see, b.com, for example, is associated with 1 which is in the USERS table. However, because it is not associated with any other id in USERS then it is not in the result. How can I achieve this?

3 answers

  • answered 2018-12-05 20:19 Matt Cremeens

    You will have more than one id associated with a url if the min does not equal the max for each, and we can exclude ids not in the USERS table by doing an inner join, so perhaps something like this

    select url
    from URLS as a
    inner join USERS as b
    on a.id=b.id
    group by url
    having max(b.id)<>min(b.id)
    

  • answered 2018-12-05 20:23 Zeki Gumus

    If you want to see all associated Ids in same column you can use STUFF() . Here is your sample:

    --DROP TABLE Users;
    CREATE TABLE Users
    (
        id  INT
    );
    
    
    --DROP TABLE Urls;
    CREATE TABLE Urls
    (
        id  INT
        ,Url    VARCHAR(100)
    );
    
    INSERT INTO Users VALUES(1),(2),(3),(4)
    INSERT INTO URLs  VALUES
     (1,'a.com')
    ,(1,'b.com')
    ,(1,'d.com')
    ,(2,'a.com')
    ,(2,'e.com')
    ,(3,'a.com')
    ,(3,'e.com')
    ,(3,'f.com')
    ,(3,'g.com')
    ,(4,'a.com')
    ,(4,'e.com')
    
    SELECT *,
            STUFF((SELECT distinct ',' + CAST(ID AS VARCHAR)
                        from URLs U2
                        WHERE U1.Url=U2.Url
                        AND EXISTS (SELECT 1 FROM Users US WHERE U2.id=US.id)
                FOR XML PATH(''), TYPE
                ).value('.', 'NVARCHAR(MAX)') 
            ,1,1,'')
    
    FROM 
        (SELECT DISTINCT Url FROM Urls) U1
    

  • answered 2018-12-05 20:35 Sami

    You can use EXISTS() as

    SELECT Url
    FROM Urls T
    WHERE EXISTS(
                  SELECT 1
                  FROM Urls
                  WHERE ID != T.ID
                        AND
                        Url = T.Url
                )
    GROUP BY Url;
    

    Or

    SELECT T1.Url
    FROM Urls T1 JOIN Urls T2
    ON T1.ID != T2.ID
    AND T1.Url = T2.Url
    GROUP BY T1.Url;
    

    Update:

    Since you need to join with Users table

    SELECT T1.Url
    FROM Urls T1 INNER JOIN Users T2
    ON T1.ID = T2.ID
    WHERE EXISTS(
                  SELECT 1
                  FROM Urls
                  WHERE ID != T1.ID
                        AND
                        Url = T1.Url
                )
    GROUP BY T1.Url;
    
    SELECT T1.Url
    FROM Urls T1 INNER JOIN Urls T2
    ON T1.ID != T2.ID
    AND T1.Url = T2.Url
    INNER JOIN Users U ON T1.ID = U.ID
    GROUP BY T1.Url;
    

    Demo