Presto query to exclude rows based on the value in a specific column
This seems like it should be simple but I guess I'm missing something.
I've got a query where I'm selecting a bunch of columns from my DB, and I want to exclude any rows where (for example) the province = 'ontario' or 'quebec'. I've tried it many ways, with different results - none of which are what I want.
select .... from fulldata WHERE substr(code, 1, 5) <> '10000' AND (province <> 'ontario' OR province <> 'quebec')
This does work to exclude rows where province is one of those values, but it also excludes all rows where province is null/blank - no idea why.
AND NOT ( stateprovince = 'ontario' OR stateprovince = 'quebec') with the same result.
AND stateprovince <> ANY(VALUES 'ontario','quebec') and it doesn't have any impact (all data shows up, nothing is filtered out)
Even breaking it down to something as simple as
AND stateprovince <> 'ontario' does get rid of ontario data, but also get's rid of blanks which isn't the intended result.
In writing this up I've kind of come up with a workaround which is to change the query to this:
... AND (stateprovince IS NULL OR stateprovince <> 'ontario') AND (stateprovince IS NULL OR stateprovince <> 'quebec')
Which works, but doesn't seem like the "right" way. Any idea what's up?
I'm using Presto on Athena.
NULL(think: "unknown") then this:
(province <> 'ontario' OR province <> 'quebec')
will evaluate to
NULL, filtering out given row. Since
NOT (NULL)is also
NULL, there isn't a simple way to avert this. Thus you need an explicit
(province IS NULL OR province NOT IN('ontario', 'quebec'))
Alternatively you can write this this way:
(coalesce(province, '') NOT IN('ontario', 'quebec'))