Amazon Athena partition with colon(:) is not working
When creating partition in Athena, I tried to use the date in the format (yyyy-MM-ddTHH:mm:ssZ) then I am not able to query the data
Step 1: Create table
CREATE EXTERNAL TABLE my_info ( id STRING, name STRING ) PARTITIONED BY ( part string ) STORED AS ORC LOCATION 's3://bucket1/data' tblproperties ("orc.compress"="SNAPPY");
Step 2: Create folder like below and added the files.
Step 3: Refresh partition MSCK REPAIR TABLE my_info
Step 4: Query the data SELECT * FROM my_info
With this I am not able to query any data
If I change the folder to format (yyyy-MM-ddTHH)
without ’:’ in Step 2
Then I am able to get the results.
Any idea about why this is not working.
This is because when you create the partitioned table the partitioning is implemented as part of the S3 path e.g. for
part=2019-11-12T14:15:16Zsection is an S3 path that Athena interprets as a partition when querying the data.
S3 path names have some restrictions on the characters that can be used:
The following characters in a key name might require additional code handling and likely need to be URL encoded or referenced as HEX. Some of these are non-printable characters and your browser might not handle them, which also requires special handling:
Ampersand ("&") Dollar ("$") ASCII character ranges 00–1F hex (0–31 decimal) and 7F (127 decimal) 'At' symbol ("@") Equals ("=") Semicolon (";") Colon (":") Plus ("+") Space – Significant sequences of spaces may be lost in some uses (especially multiple spaces) Comma (",") Question mark ("?")
In this case it's probably the colons in the path that are not being interpreted by Presto/Athena. To work around this you can use an alternative dividing character in the timestamp e.g.
part=2019-11-12--14-15-16or omit it altogether.