Python - List all the files and blobs inside an Azure Storage container

This is my first post here on StackOverflow; I hope it respects the guidelines of this community.

I'm trying to accomplish a simple task in Python because even though I'm really new to it, I found it very easy to use. I have a storage account on Azure, with a lot of containers inside. Each container contains some random files and/or blobs.

What I'm trying to do is get the names of all these files and/or blobs and write them to a file.

So far, I have this:

import os, uuid
import sys
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient, __version__
connection_string = "my_connection_string"
blob_svc = BlobServiceClient.from_connection_string(conn_str=connection_string)


try:

    print("Azure Blob Storage v" + __version__ + " - Python quickstart sample")
    print("\nListing blobs...")
    containers = blob_svc.list_containers()
    list_of_blobs = []


    for c in containers:
      container_client = blob_svc.get_container_client(c)
      blob_list = container_client.list_blobs()
      for blob in blob_list:
        list_of_blobs.append(blob.name)
      file_path = 'C:/my/path/to/file/randomfile.txt'
      sys.stdout = open(file_path, "w")
      print(list_of_blobs)

except Exception as ex:
    print('Exception:')
    print(ex) 

But I'm having 3 problems:

  1. I'm getting <name_of_the_blob>/<name_of_the_file_inside>: I would like to have just the name of the file inside the blob.

  2. If a container holds a blob (or more than one blob) plus a random file, this script prints only the name of the blob and the name of the file inside it, skipping the other files outside the blobs.

  3. I would like to put all the names of the blobs/files in a .csv file.

But I'm not sure how to do point 3, and how to resolve points 1 and 2.

Could someone help with this?

Thanks!

Edit:

I'm adding an image here just to clarify a little what I mean when I talk about blob/files

Example of one of the containers inside the Azure storage account

1 answer

  • answered 2022-05-06 13:07 SwethaKandikonda-MT

    Just to clarify: Blob Storage does not distinguish between files and blobs. The files stored in Blob Storage are called blobs. Below is the hierarchy that you can observe in Blob Storage.

    Blob Storage > Containers > Directories/Virtual Folders > Blobs

    I'm getting <name_of_the_blob>/<name_of_the_file_inside>: I would like to have just the name of the file inside the blob

    For this, you can iterate through your container using list_blobs(<Container_Name>) and take only the name of each blob, i.e. blob.name. Here is the code for listing all the blob names inside a container.

    generator = blob_service.list_blobs(CONTAINER_NAME)
    for blob in generator:
        # blob.name is the blob's full path inside the container
        print("\t Blob name: " + CONTAINER_NAME + '/' + blob.name)
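Since blob.name includes any virtual folder prefix (for example folder/file.txt), getting just the file name means keeping the last path segment. A minimal sketch, assuming blob names use "/" as the separator; the helper name file_name_only and the sample names are made up for illustration:

```python
import posixpath


def file_name_only(blob_name: str) -> str:
    """Return the last '/'-separated segment of a blob name.

    Hypothetical helper: 'reports/2022/summary.pdf' becomes 'summary.pdf',
    and a name without any folder prefix is returned unchanged.
    """
    return posixpath.basename(blob_name)


# Sample blob names (invented for this sketch)
names = ["reports/2022/summary.pdf", "readme.txt"]
print([file_name_only(n) for n in names])
```

In the loop above, this would mean printing file_name_only(blob.name) instead of blob.name.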
    

    If a container holds a blob (or more than one blob) plus a random file, this script prints only the name of the blob and the name of the file inside it, skipping the other files outside the blobs.

    You can iterate over the containers using list_containers(), then use list_blobs(<Container_Name>) to iterate over the blob names, and finally write the blob names to a local file.

    I would like to put all the names of the blobs/files in a .csv file.

    A simple with open('<filename>.csv', 'w') as f block followed by f.write() is enough. Below is the sample code:

    with open('BlobsNames.csv', 'w') as f:
         f.write(<statements>)
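If blob names can contain commas or quotes, plain f.write() may produce a malformed CSV. One possible alternative is Python's standard csv module, which handles the quoting. This is only a sketch: the column names, the helper names and the sample rows are assumptions, not part of the original code.

```python
import csv
import io


def names_to_csv_text(rows):
    """Render (container, blob_name) pairs as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["container", "blob_name"])  # assumed column names
    writer.writerows(rows)
    return buf.getvalue()


def write_names_to_csv(rows, csv_path):
    """Write the same CSV text to csv_path (hypothetical helper)."""
    # newline="" stops Python from translating the line endings again
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        f.write(names_to_csv_text(rows))


# Sample data invented for this sketch
print(names_to_csv_text([("container1", "folder/a.txt"), ("container2", "b,c.txt")]))
```

The rows can be collected in the same loop that currently calls f.write(), then written once at the end with write_names_to_csv(rows, 'BlobsNames.csv').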
    

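    Here is the complete sample code that worked for us, where each blob from every folder is listed. Note that it uses BlockBlobService from the legacy azure-storage-blob 2.x SDK, not the v12 BlobServiceClient used in the question.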

    from azure.storage.blob import BlockBlobService
    
    ACCOUNT_NAME = "<ACCOUNT_NAME>"
    SAS_TOKEN = '<YOUR_SAS_TOKEN>'
    
    blob_service = BlockBlobService(account_name=ACCOUNT_NAME, account_key=None, sas_token=SAS_TOKEN)
    
    print("\nList blobs in the container")
    with open('BlobsNames.txt', 'w') as f:
        # Walk every container in the account, then every blob in each container
        containers = blob_service.list_containers()
        for c in containers:
            generator = blob_service.list_blobs(c.name)
            for blob in generator:
                print("\t Blob name: " + c.name + '/' + blob.name)
                # One "<container>/<blob>" entry per line
                f.write(c.name + '/' + blob.name)
                f.write('\n')
    

    This works even when there are folders in containers.

    NOTE: You can just remove c.name while printing the blob to file if your requirement is to just pull out the blob names.
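For readers on the v12 SDK that the question imports (BlobServiceClient), the same traversal could look roughly like the sketch below. It is not the code that was tested for this answer: the function names iter_blob_names and main are made up, and the connection string is the placeholder from the question.

```python
def iter_blob_names(service_client):
    """Yield (container_name, blob_name) for every blob in the account.

    service_client is expected to behave like a v12 BlobServiceClient:
    list_containers() and get_container_client(name).list_blobs().
    """
    for container in service_client.list_containers():
        container_client = service_client.get_container_client(container.name)
        for blob in container_client.list_blobs():
            yield container.name, blob.name


def main():
    # Imported here so iter_blob_names can be tried without the SDK installed
    from azure.storage.blob import BlobServiceClient

    connection_string = "my_connection_string"  # placeholder from the question
    service_client = BlobServiceClient.from_connection_string(connection_string)

    # Open the output file once, outside the loops, so nothing is overwritten
    with open("BlobsNames.txt", "w") as f:
        for container_name, blob_name in iter_blob_names(service_client):
            f.write(f"{container_name}/{blob_name}\n")
```

Calling main() with a real connection string writes one <container>/<blob> line per blob. Opening the file a single time before the loops avoids the situation in the question's code, where the file is reopened in "w" mode for every container.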
