Google Translate API - detect language + translate document (xlsx, csv)

I'm trying to use the Google Cloud Translation API to translate an Excel (or CSV) document that contains text in multiple languages; my target language is English.

I would like to use the "Translate text in batches (Advanced edition only)" code sample, but that sample has a line that defines the source language, so there can only be one source language.

But I need to detect the language of the text in the document first and then translate it to English. There is a code sample for detecting the language of a simple text string, "Detecting languages (Advanced)", but I need to combine the first sample (which translates documents but has a single hard-coded source language) with language detection, so that no fixed source language has to be defined.

Is there a code sample like this in the resources? How could this be solved?

Here is the sample code in question:

from google.cloud import translate


def batch_translate_text(input_uri, output_uri, project_id, timeout=180):
    """Translates a batch of texts on GCS and stores the result in a GCS location."""

    client = translate.TranslationServiceClient()

    location = "us-central1"
    # Supported file types:
    gcs_source = {"input_uri": input_uri}

    input_configs_element = {
        "gcs_source": gcs_source,
        "mime_type": "text/plain",  # Can be "text/plain" or "text/html".
    }
    gcs_destination = {"output_uri_prefix": output_uri}
    output_config = {"gcs_destination": gcs_destination}
    parent = f"projects/{project_id}/locations/{location}"

    # Supported language codes:
    operation = client.batch_translate_text(
        request={
            "parent": parent,
            "source_language_code": "en",
            "target_language_codes": ["ja"],  # Up to 10 language codes here.
            "input_configs": [input_configs_element],
            "output_config": output_config,
        }
    )

    print("Waiting for operation to complete...")
    response = operation.result(timeout)  # timeout is in seconds

    print("Total Characters: {}".format(response.total_characters))
    print("Translated Characters: {}".format(response.translated_characters))

1 answer

  • answered 2021-07-28 03:28 Ricco D

    Unfortunately, it is not possible to pass an array of values to the source_language_code field when using batchTranslateText. What I could suggest instead is to perform detectLanguage and translateText per file.

    What the code below does is:

    1. It extracts the content to be translated. For testing purposes, the CSV files used have only one column; the content of sample1.csv is in tl (Tagalog) and the content of sample2.csv is in es (Spanish).
    2. It passes the extracted content to detect_language() to get the detected language code.
    3. It passes all the required parameters to translate_text() to translate the content.

    NOTE: The code below has only been tested with CSV files that have one column. Edit main() to select the column you would like to extract data from.
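For files with more than one column, the extraction step could look roughly like the sketch below. It is only an illustration: `read_column`, its `column` index and its `has_header` flag are hypothetical names that are not part of the tested code, and it assumes the files are UTF-8 encoded.

```python
import csv


def read_column(path, column=0, has_header=False):
    """Return the non-empty cells of one CSV column as a list of strings.

    Hypothetical helper: `column` is a zero-based index chosen by the caller.
    """
    cells = []
    # newline="" is what the csv module expects; UTF-8 is an assumption here.
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        if has_header:
            next(reader, None)  # skip the header row if the file has one
        for row in reader:
            # Skip short rows and blank cells instead of raising IndexError.
            if len(row) > column and row[column].strip():
                cells.append(row[column].strip())
    return cells
```

The returned list can be joined into one string for detect_language(), or kept as separate strings if each cell should be handled on its own.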

    from google.cloud import translate
    import csv

    def listToString(s):
        """Transform list to string."""
        str1 = " "
        return str1.join(s)

    def detect_language(project_id, content):
        """Detecting the language of a text string."""
        client = translate.TranslationServiceClient()
        location = "global"
        parent = f"projects/{project_id}/locations/{location}"
        response = client.detect_language(
            content=content,
            parent=parent,
            mime_type="text/plain",  # mime types: text/plain, text/html
        )
        # Return the first (most probable) detected language.
        for language in response.languages:
            return language.language_code

    def translate_text(text, project_id, source_lang):
        """Translating Text."""
        client = translate.TranslationServiceClient()
        location = "global"
        parent = f"projects/{project_id}/locations/{location}"
        # Detail on supported types can be found here:
        response = client.translate_text(
            request={
                "parent": parent,
                "contents": [text],
                "mime_type": "text/plain",  # mime types: text/plain, text/html
                "source_language_code": source_lang,
                "target_language_code": "en-US",
            }
        )
        # Display the translation for each input text provided
        for translation in response.translations:
            print("Translated text: {}".format(translation.translated_text))

    def main():
        project_id = "YOUR_PROJECT_ID"  # placeholder: replace with your own project ID
        csv_files = ["sample1.csv", "sample2.csv"]
        # Perform your content extraction here if you have a different file format #
        for csv_path in csv_files:
            content_csv = []
            with open(csv_path, newline="", encoding="utf-8") as csv_file:
                read_csv = csv.reader(csv_file)
                for row in read_csv:
                    content_csv.extend(row)  # collect every cell of the row
            content = listToString(content_csv)  # convert list to string
            detect = detect_language(project_id=project_id, content=content)
            translate_text(text=content, project_id=project_id, source_lang=detect)

    if __name__ == "__main__":
        main()




    Sample input line (Spanish, as in sample2.csv):

    cómo estás

    Output using the code above:

    Translated text: how are you okay
    Translated text: how are you ok
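If a single file mixes several languages, detecting one language for the whole file is not enough. A possible variation, sketched below, sends the cells as separate strings and leaves out source_language_code, which the translateText documentation describes as making the service detect the source language itself; each returned translation then carries a detected_language_code. The names `chunked` and `translate_rows` and the `batch_size` default are assumptions made for this sketch, and it has not been run against the live API.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate_rows(client, project_id, rows, target="en", batch_size=50):
    """Translate each string in `rows`, letting the service detect its language.

    `client` is expected to behave like translate.TranslationServiceClient().
    Returns (detected_language_code, translated_text) tuples in input order.
    batch_size=50 is an arbitrary choice for this sketch, not an API limit.
    """
    parent = f"projects/{project_id}/locations/global"
    results = []
    for batch in chunked(rows, batch_size):
        response = client.translate_text(
            request={
                "parent": parent,
                "contents": batch,
                "mime_type": "text/plain",
                # "source_language_code" is left out on purpose so that the
                # source language is detected instead of being fixed.
                "target_language_code": target,
            }
        )
        for translation in response.translations:
            results.append(
                (translation.detected_language_code, translation.translated_text)
            )
    return results


# Intended usage (needs google-cloud-translate and credentials):
#   from google.cloud import translate
#   client = translate.TranslationServiceClient()
#   translate_rows(client, "YOUR_PROJECT_ID", ["cómo estás"])
```

Passing the client in as an argument keeps the helper easy to try out with a stand-in object before spending API quota.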
