
ListBlobsSegmented - perfect for iterating over large containers - Azure Storage Blobs

In this post we will talk about the 'ListBlobsSegmented' command, which allows us to get blobs from a container.
When is this command useful?
'ListBlobsSegmented' is used when we need to fetch the list of blobs that sit under an Azure Storage container. This command does not fetch the content of the blobs, only their metadata. Based on this information, we can trigger a download if needed.

An important thing is related to the number of blobs that are fetched per call. A single call retrieves the metadata of at most 5,000 blobs. If the container has more than 5,000 items, the response also contains a BlobContinuationToken.
This token can be used to fetch the next 5,000 blobs from the container. The maximum size of a segment is fixed; we cannot increase this value.

Example:
// Start with a null token and loop until no continuation token is returned,
// so the last (partial) segment is processed as well.
BlobContinuationToken continuationToken = null;
do
{
    BlobResultSegment blobResultSegment = blobContainer.ListBlobsSegmented(continuationToken);
    // process blobs - blobResultSegment.Results
    Console.WriteLine(blobResultSegment.Results.Count());
    continuationToken = blobResultSegment.ContinuationToken;
} while (continuationToken != null);
The code above would print the following to the console for a container with 24,340 blobs:
5000
5000
5000
5000
4340

This command is very fast, usually taking less than a second. It is important to know that the container's 'ListBlobs' method uses 'ListBlobsSegmented' behind the scenes to fetch content. Once 5,000 blobs from the 'ListBlobs' result have been consumed, the next 5,000 items are fetched behind the scenes.
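A minimal sketch of this lazy behavior, assuming blobContainer is an existing CloudBlobContainer from the classic Microsoft.WindowsAzure.Storage SDK:

// ListBlobs returns a lazily evaluated IEnumerable; additional segments of up
// to 5,000 items are requested transparently as the enumeration advances.
foreach (IListBlobItem item in blobContainer.ListBlobs())
{
    Console.WriteLine(item.Uri);
}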

There are three important tips that we should keep in mind related to ListBlobsSegmented:
ContinuationToken is not valid on another instance
This token can be used only under the same instance of CloudBlobClient. If you create another process, which will have another instance of CloudBlobClient, then you will not be able to use the token that you retrieved before.
This means that if you want to send the tokens to a queue, for example, so that other systems can process them, you will have a surprise. The token will return 0 items all the time.
Why? Because the token is valid only in the same context - on the same instance of CloudBlobClient.
You could do some magic with OperationContext.

Type of blobs that are returned in the result
Keep in mind that a container can hold different types of blobs. Because of this, you should check the type of each item before assuming that everything in the result is a 'CloudBlockBlob', as in the sketch below.
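A minimal sketch of such a type check, assuming blobResultSegment is a segment fetched as in the earlier example:

// An IListBlobItem can be a block blob, a page blob, an append blob or a virtual directory.
foreach (IListBlobItem item in blobResultSegment.Results)
{
    if (item is CloudBlockBlob blockBlob)
    {
        Console.WriteLine($"Block blob: {blockBlob.Name}");
    }
    else if (item is CloudPageBlob pageBlob)
    {
        Console.WriteLine($"Page blob: {pageBlob.Name}");
    }
    else if (item is CloudBlobDirectory directory)
    {
        Console.WriteLine($"Virtual directory: {directory.Prefix}");
    }
}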

Iterate in containers where we have virtual directories
There are moments when we want to iterate over a container that has virtual directories, but we don't care about them. We want all blobs under the container, including the ones inside virtual directories. For this case, we need to set 'useFlatBlobListing' to TRUE, as in the example below.
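A minimal sketch of a flat listing, using the overload of 'ListBlobsSegmented' from the classic Microsoft.WindowsAzure.Storage SDK that exposes 'useFlatBlobListing':

BlobContinuationToken continuationToken = null;
do
{
    // useFlatBlobListing: true flattens virtual directories, so every blob in the
    // container is returned regardless of the 'folder' it lives in.
    BlobResultSegment segment = blobContainer.ListBlobsSegmented(
        prefix: null,
        useFlatBlobListing: true,
        blobListingDetails: BlobListingDetails.None,
        maxResults: null,
        currentToken: continuationToken,
        options: null,
        operationContext: null);

    Console.WriteLine(segment.Results.Count());
    continuationToken = segment.ContinuationToken;
} while (continuationToken != null);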

In conclusion, we can say that 'ListBlobsSegmented' is a great method when we need to support pagination or iteration over large containers that hold hundreds of thousands of blobs.
