
ListBlobsSegmented - perfect for iterating over large containers - Azure Storage Blobs

In this post we will talk about the 'ListBlobsSegmented' command, which allows us to get blobs from a container.
When is this command useful?
'ListBlobsSegmented' is used when we need to fetch the list of blobs under a container in Azure Storage. The command does not fetch the content of the blobs, only their metadata. Based on this information, we can trigger a download if needed.

An important thing to know is the number of blobs fetched per call: a single call returns the metadata of at most 5,000 blobs. If the container has more than 5,000 items, the response also contains a BlobContinuationToken.
This token can be used to fetch the next batch of up to 5,000 blobs from the container. The 5,000 ceiling is imposed by the service and cannot be raised; the full 'ListBlobsSegmented' overload does accept a maxResults parameter, but only to request smaller pages.
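As a sketch (assuming the classic Microsoft.WindowsAzure.Storage SDK, where the long 'ListBlobsSegmented' overload exposes a maxResults parameter), a call requesting smaller pages could look like this:

```csharp
using Microsoft.WindowsAzure.Storage.Blob;

// 'blobContainer' is an existing CloudBlobContainer reference.
// Request pages of at most 500 blobs instead of the default 5,000.
BlobResultSegment segment = blobContainer.ListBlobsSegmented(
    prefix: null,                               // no name filter
    useFlatBlobListing: false,                  // keep virtual directory grouping
    blobListingDetails: BlobListingDetails.None,
    maxResults: 500,                            // page size cap (service maximum is 5,000)
    currentToken: null,                         // null means "start from the first page"
    options: null,
    operationContext: null);
```

Values above 5,000 do not help: the service still returns at most 5,000 items per page.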

Example:
BlobContinuationToken continuationToken = null;
do
{
    BlobResultSegment blobResultSegment = blobContainer.ListBlobsSegmented(continuationToken);

    // process blobs - blobResultSegment.Results
    Console.WriteLine(blobResultSegment.Results.Count());

    continuationToken = blobResultSegment.ContinuationToken;
} while (continuationToken != null);
The code above would print the following information to the console for a container with 24,340 blobs:
5000
5000
5000
5000
4340

This command is very fast; a call usually takes less than a second. It is important to know that the container's 'ListBlobs' method uses 'ListBlobsSegmented' behind the scenes to fetch content. Once the first 5,000 blobs of the 'ListBlobs' result have been enumerated, the next 5,000 items are fetched behind the scenes.
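As a sketch of that lazy behaviour (same assumption of the classic Microsoft.WindowsAzure.Storage SDK), simply enumerating 'ListBlobs' pages through the whole container without any token handling:

```csharp
using Microsoft.WindowsAzure.Storage.Blob;

// ListBlobs returns a lazy IEnumerable<IListBlobItem>; each new page of up
// to 5,000 blobs is requested from the service only when the enumeration
// reaches it, so the tokens are handled for us behind the scenes.
foreach (IListBlobItem item in blobContainer.ListBlobs())
{
    Console.WriteLine(item.Uri);
}
```

The trade-off is control: with 'ListBlobs' we cannot stop cleanly at a page boundary or hand the continuation point to another worker, which is exactly what 'ListBlobsSegmented' gives us.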

There are three important tips that we should keep in mind related to ListBlobsSegmented:
ContinuationToken is not valid on another instance
This token can be used only with the same instance of CloudBlobClient. If you create another process, which will have another instance of CloudBlobClient, then you will not be able to use the token that you retrieved before.
This means that if you want to send the tokens to a queue, for example, so that other systems can process them, you will get a surprise: the token will return 0 items every time.
Why? Because the token is valid only in the same context - on the same instance of CloudBlobClient.
You could do some magic with OperationContext.

Type of blobs that are returned in the result
Keep in mind that a container can hold different types of blobs (block, page, and append). Because of this, you should check the type of each blob before assuming that every item in the result is a 'CloudBlockBlob'.
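A minimal sketch of such a type check (blob types as named in the classic Microsoft.WindowsAzure.Storage SDK):

```csharp
using Microsoft.WindowsAzure.Storage.Blob;

// Inspect each result item before treating it as a block blob.
foreach (IListBlobItem item in blobResultSegment.Results)
{
    if (item is CloudBlockBlob blockBlob)
    {
        Console.WriteLine($"Block blob: {blockBlob.Name}");
    }
    else if (item is CloudPageBlob pageBlob)
    {
        Console.WriteLine($"Page blob: {pageBlob.Name}");
    }
    else if (item is CloudAppendBlob appendBlob)
    {
        Console.WriteLine($"Append blob: {appendBlob.Name}");
    }
    else if (item is CloudBlobDirectory directory)
    {
        // Virtual directories also appear here when flat listing is off.
        Console.WriteLine($"Virtual directory: {directory.Prefix}");
    }
}
```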

Iterate in containers where we have virtual directories
There are times when we want to iterate over a container that has virtual directories, but we don't care about them - we want all blobs in the container, including the ones inside virtual directories. For this case, we need to set 'useFlatBlobListing' to TRUE.
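A sketch using the overload that exposes 'useFlatBlobListing' (parameter names assumed from the classic Microsoft.WindowsAzure.Storage SDK):

```csharp
using Microsoft.WindowsAzure.Storage.Blob;

// With useFlatBlobListing = true, the result is a flat list of every blob
// in the container, including blobs nested inside virtual directories;
// no CloudBlobDirectory items appear in the results.
BlobResultSegment segment = blobContainer.ListBlobsSegmented(
    prefix: null,
    useFlatBlobListing: true,
    blobListingDetails: BlobListingDetails.None,
    maxResults: null,                  // use the service default of 5,000
    currentToken: null,
    options: null,
    operationContext: null);
```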

In conclusion, we can say that 'ListBlobsSegmented' is a great method when we need to support pagination or iteration over large containers that hold hundreds of thousands of blobs.
