Teradata: How to remove the characters \x00 & \x02 from data

I am loading data from a Teradata database like this:

df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:teradata://url_of_teradata_db/MAYBENULL=ON,TYPE=FASTEXPORT,charset=ASCII")
    .option("dbtable", "({}) as subq".format(req))
    .option("driver", "com.teradata.jdbc.TeraDriver")
    .option("user", my_user)
    .option("password", my_password)
    .load()
)

In my data, I get unwanted characters such as \x00 and \x02.

How can I read those characters correctly with spark.read.format?

I found NULLBYTEPREFIX, but I am not sure how to use it.


My query is really simple:


I asked Teradata for more information about this column and got:

| Column Name     | Type | Nullable | Format | Max length | 
| My_TABLE.MY_ROW | CV   | N        | X(100) | 100        |

1 answer

  • answered 2022-05-04 13:19 soniabhishek36

    Could you check the character set of the database columns and specify that character set in your query above?
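
As a rough illustration of that suggestion, the sketch below rebuilds the JDBC URL from the question with a different session character set. The helper name teradata_jdbc_url is hypothetical, and switching to CHARSET=UTF8 is only an assumption to try: it will not help if the \x00 and \x02 bytes are actually stored in the column.

```python
def teradata_jdbc_url(host, **params):
    """Hypothetical helper: build a Teradata JDBC URL with comma-separated parameters."""
    pairs = ",".join("{}={}".format(key, value) for key, value in params.items())
    return "jdbc:teradata://{}/{}".format(host, pairs)


# Same parameters as in the question, but with UTF8 instead of ASCII (an assumption to test).
url = teradata_jdbc_url(
    "url_of_teradata_db",
    MAYBENULL="ON",
    TYPE="FASTEXPORT",
    CHARSET="UTF8",
)
print(url)
```

The resulting string would be passed to .option("url", url) in place of the hard-coded URL.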

    You can refer to this link if you are unsure about character sets in Teradata.
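
If the control characters turn out to be present in the stored data itself, one possible workaround is to strip them after loading. The sketch below is a minimal example, not a confirmed fix for this case: strip_control_chars and clean_column are hypothetical names, and the pattern only covers the two characters mentioned in the question.

```python
import re

# Character class matching NUL (\x00) and STX (\x02).
# The \xhh escape is understood by both Python's re module and Java regex,
# so the same string can be given to Spark's regexp_replace.
CONTROL_CHARS = r"[\x00\x02]"


def strip_control_chars(value):
    """Remove NUL/STX from a string; None is returned unchanged."""
    if value is None:
        return None
    return re.sub(CONTROL_CHARS, "", value)


def clean_column(df, column_name):
    """Hypothetical helper: apply the same cleanup to one Spark DataFrame column."""
    from pyspark.sql import functions as F  # imported lazily; requires pyspark

    return df.withColumn(
        column_name, F.regexp_replace(F.col(column_name), CONTROL_CHARS, "")
    )
```

For example, strip_control_chars("ab\x00c\x02") returns "abc", and clean_column(df, "MY_ROW") would apply the same replacement to the column from the question.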

