How to set a session property when executing pd.to_sql? (error: "Required field 'numDVs' is unset!")

I get an error when inserting DataFrame data into a Presto database.

The error message is:

{'message': "Required field 'numDVs' is unset! Struct:LongColumnStatsData(lowValue:0, highValue:2, numNulls:0, numDVs:0),
'errorCode': 16777216, 
'errorName': 'HIVE_METASTORE_ERROR', 
'errorType': 'EXTERNAL' ..."

I think

SET SESSION COLLECT_COLUMN_STATISTICS_ON_WRITE = FALSE

should be executed before inserting the data, but I can't find a way to do that. Is there any way to set a session property before executing pd.to_sql?

from pyhive import presto

# result is a pandas DataFrame; engine is the SQLAlchemy engine for Presto (creation not shown)
result.to_sql(table_name, engine, if_exists='append', index=False)

Table definition:

CREATE TABLE tablename (
similarity DOUBLE,
cluster INTEGER,
member_cnt INTEGER,
member_list VARCHAR,
mean_similarity DOUBLE,
ym VARCHAR(6)
)
WITH (format = 'ORC', PARTITIONED_BY = ARRAY['ym'])

1 answer

  • answered 2019-11-05 12:05 ebyhr

    You can set session properties through the session_props argument of presto.connect, as below.

    from pyhive import presto
    cursor = presto.connect('localhost', session_props={'hive.collect_column_statistics_on_write': 'false'}).cursor()
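Since pd.to_sql takes a SQLAlchemy engine rather than a cursor, the same session_props can probably be passed through the engine's connect_args, which PyHive's SQLAlchemy dialect forwards to presto.connect. The following is a minimal sketch under assumptions: the helper names session_connect_args and make_engine, the connection URL, and the catalog name "hive" in the property key are all hypothetical and should be adapted to your setup.

```python
def session_connect_args(props):
    """Build connect_args carrying Presto session properties.

    Hypothetical helper: values are sent as strings, so booleans
    are rendered as 'true' / 'false'.
    """
    rendered = {}
    for key, value in props.items():
        if isinstance(value, bool):
            rendered[key] = "true" if value else "false"
        else:
            rendered[key] = str(value)
    return {"session_props": rendered}


def make_engine(url, props):
    """Create a SQLAlchemy engine whose connections carry the session properties.

    Hypothetical helper; requires sqlalchemy and pyhive to be installed.
    """
    from sqlalchemy import create_engine  # imported lazily so the helper above works without it

    return create_engine(url, connect_args=session_connect_args(props))


# Usage (assumed URL; needs a reachable Presto coordinator):
# engine = make_engine(
#     "presto://localhost:8080/hive/default",
#     {"hive.collect_column_statistics_on_write": False},
# )
# result.to_sql(table_name, engine, if_exists="append", index=False)
```

The property key is prefixed with the catalog name, so if your Hive catalog is not called "hive" the prefix needs to change accordingly.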