How to copy files from a Kubernetes pod to a Google Cloud bucket programmatically

I am trying to copy files from a Kubernetes pod to a GCP bucket. I can get the path of my file, but how can I do the copy programmatically using Python?

I get my buckets using gcsfs. How can I copy a file in my program without using kubectl?

Is there any way to do this through Python?

1 answer

  • answered 2021-06-17 08:06 Dawid Kruk

    I agree with the comment made by @anemyte:

    There is this question about how to copy a file from a pod. You can download it and then use your code to upload it to the bucket.


    I see 2 possible solutions to this question:

    • Use GCS Fuse and Python code to copy the file from your Pod to GCS bucket
    • Use the Python library to connect to the GCS bucket without GCS Fuse

    Use GCS Fuse and Python code to copy the file from your Pod to GCS bucket

    Assuming that your Pod is configured with GCS Fuse and the mount is working correctly, you can use the following code snippet to copy the files (where dst is the full target file path inside the mounted directory of the bucket):

    from shutil import copyfile

    # src: path of the file in the Pod; dst: target file path under the GCS Fuse mount
    copyfile(src, dst)
    

    -- Stackoverflow.com: Questions: 123198: How can a file be copied
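
    As a slightly fuller sketch of the same idea, the helper below copies a file into a directory where the bucket is assumed to be mounted by GCS Fuse. The function name copy_to_mounted_bucket and the paths in the usage note are hypothetical, and the code assumes the mount behaves like a regular directory:

```python
import shutil
from pathlib import Path


def copy_to_mounted_bucket(src, mount_dir):
    """Copy src into mount_dir, keeping the original file name.

    mount_dir is assumed to be the directory where GCS Fuse has
    mounted the bucket (hypothetical; adjust to your Pod spec).
    """
    mount_path = Path(mount_dir)
    # Fail early if the mount point is missing, e.g. the bucket is not mounted.
    if not mount_path.is_dir():
        raise FileNotFoundError(f"Mount directory not found: {mount_dir}")

    # shutil.copyfile needs a full target file path, not a directory.
    dst = mount_path / Path(src).name
    shutil.copyfile(src, dst)
    return dst
```

    For example, copy_to_mounted_bucket("/data/report.csv", "/mnt/gcs") would write /mnt/gcs/report.csv, which GCS Fuse is expected to store as the object report.csv in the bucket.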


    Use the Python library to connect to the GCS bucket without GCS Fuse

    As pointed out by community member @anemyte, you can use the Cloud Storage client libraries to programmatically address your question:

    The following Python code snippet performs the upload operation:

    from google.cloud import storage
    
    
    def upload_blob(bucket_name, source_file_name, destination_blob_name):
        """Uploads a file to the bucket."""
        # The ID of your GCS bucket
        # bucket_name = "your-bucket-name"
        # The path to your file to upload
        # source_file_name = "local/path/to/file"
        # The ID of your GCS object
        # destination_blob_name = "storage-object-name"
    
        storage_client = storage.Client()
        bucket = storage_client.bucket(bucket_name)
        blob = bucket.blob(destination_blob_name)
    
        blob.upload_from_filename(source_file_name)
    
        print(
            "File {} uploaded to {}.".format(
                source_file_name, destination_blob_name
            )
        )
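
    Since the question mentions gcsfs, here is a hedged sketch of the same upload done with gcsfs instead of the google-cloud-storage client. It assumes gcsfs is installed and can find credentials in the Pod. The function name upload_with_gcsfs and the optional fs parameter (for reusing a filesystem object you already created) are hypothetical:

```python
def upload_with_gcsfs(local_path, bucket_name, object_name, fs=None):
    """Upload one local file to a bucket through a gcsfs filesystem object."""
    if fs is None:
        # Imported lazily so the function can also be called with an
        # existing filesystem object passed in as fs.
        import gcsfs

        # Assumption: default credentials are available in the environment.
        fs = gcsfs.GCSFileSystem()

    # gcsfs addresses objects as "<bucket>/<object path>".
    remote_path = f"{bucket_name}/{object_name}"
    fs.put(local_path, remote_path)
    return remote_path
```

    Called as upload_with_gcsfs("/data/report.csv", "your-bucket-name", "reports/report.csv"), this would upload the local file to the object reports/report.csv, provided the credentials allow writing to that bucket.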
    

    Please keep in mind that you will need appropriate permissions to use the GCS bucket. You can read more about it by following the link below:

    A side note!

    You can also use Workload Identity as one of the ways to assign required permissions to your Pod.


    Additional resources:

    It crossed my mind that you might want to use Python outside of the Pod (for example, from your laptop) to copy the file from the Pod to the GCS bucket. I reckon you could follow this example:
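
    Separately from the example referenced above, here is a minimal sketch of how this could look with the official Kubernetes Python client, without kubectl. It is an assumption-heavy illustration: it reads the file by running cat in the Pod, so it only suits text files, the container must have cat, and a local kubeconfig plus GCS credentials must be available. All function names below are hypothetical:

```python
def read_text_file_from_pod(pod_name, namespace, pod_path):
    """Return the contents of a text file in a Pod by running `cat` in it."""
    from kubernetes import client, config
    from kubernetes.stream import stream

    config.load_kube_config()  # assumes a working local kubeconfig
    core_v1 = client.CoreV1Api()
    # Only stdout is captured so error messages are not mixed into the data.
    return stream(
        core_v1.connect_get_namespaced_pod_exec,
        pod_name,
        namespace,
        command=["cat", pod_path],
        stderr=False,
        stdin=False,
        stdout=True,
        tty=False,
    )


def upload_text(bucket_name, blob_name, text):
    """Upload a string as an object in the given bucket."""
    from google.cloud import storage

    bucket = storage.Client().bucket(bucket_name)
    bucket.blob(blob_name).upload_from_string(text)


def copy_pod_file_to_bucket(pod_name, namespace, pod_path, bucket_name,
                            blob_name, read_fn=read_text_file_from_pod,
                            upload_fn=upload_text):
    """Read a file from a Pod and upload it; returns the number of characters."""
    text = read_fn(pod_name, namespace, pod_path)
    upload_fn(bucket_name, blob_name, text)
    return len(text)
```

    The read_fn and upload_fn parameters exist only so the two steps can be swapped or tested separately; for binary or large files, downloading the file first and then calling upload_blob is likely the safer route.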