
ListBlobsSegmented - perfect to iterate in large containers - Azure Storage Blobs

In this post we will talk about the 'ListBlobsSegmented' command, which allows us to get blobs from a container.
When is this command useful?
'ListBlobsSegmented' is used when we need to fetch the list of blobs that are under a container of Azure Storage. This command does not fetch the content of the blobs; only the blob metadata is fetched. Based on this information, we can trigger a download if needed.

An important thing to know is the number of blobs that are fetched when we make a call. A single call retrieves the metadata of at most 5,000 blobs. If the container has more than 5,000 items, the response will also contain a BlobContinuationToken.
This token can be used to fetch the next 5,000 blobs from the container. The size of a result segment cannot be increased beyond this limit.

Example:
// requires 'using System.Linq;' for Count() and the Microsoft.WindowsAzure.Storage.Blob namespace
BlobContinuationToken continuationToken = null;
do
{
    // fetch the next segment (at most 5,000 blobs); pass null for the first call
    BlobResultSegment blobResultSegment = blobContainer.ListBlobsSegmented(continuationToken);

    // process blobs - blobResultSegment.Results
    Console.WriteLine(blobResultSegment.Results.Count());

    continuationToken = blobResultSegment.ContinuationToken;
} while (continuationToken != null);
The above code would print the following information to the Console for a container with 24,340 blobs:
5000
5000
5000
5000
4340

This command is very fast, usually taking less than a second. It is important to know that the 'ListBlobs' method of the container uses 'ListBlobsSegmented' behind the scenes to fetch content. Once 5,000 blobs from the 'ListBlobs' result have been consumed, the next 5,000 items will be fetched behind the scenes.
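As a small illustration (a sketch, assuming the same 'blobContainer' used in the example above), enumerating 'ListBlobs' directly looks like this; the segmented requests happen transparently while we iterate:

// lazy enumeration - the enumerator requests new segments of up to 5,000
// blobs from the service only as we advance through the sequence
foreach (IListBlobItem item in blobContainer.ListBlobs())
{
    Console.WriteLine(item.Uri);
}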

There are three important tips that we should keep in mind related to ListBlobsSegmented:
ContinuationToken is not valid on another instance
This token can be used only with the same instance of CloudBlobClient. If you create another process, which will have another instance of CloudBlobClient, then you will not be able to use the token that you retrieved before.
This means that if you want to send the tokens to a queue, for example, so that other systems can process them, you will have a surprise: the token will return 0 items all the time.
Why? Because the token is valid only in the same context - on the same instance of CloudBlobClient.
You could do some magic with OperationContext.

Types of blobs that are returned in the result
Keep in mind that a container can hold different types of blobs. Because of this, you should check the type of each blob before assuming that every item in the result is a 'CloudBlockBlob'.
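A minimal sketch of such a check, assuming the 'blobResultSegment' from the example above and the classic Microsoft.WindowsAzure.Storage.Blob types:

foreach (IListBlobItem item in blobResultSegment.Results)
{
    CloudBlockBlob blockBlob = item as CloudBlockBlob;
    if (blockBlob != null)
    {
        Console.WriteLine("Block blob: " + blockBlob.Name);
        continue;
    }

    CloudBlobDirectory directory = item as CloudBlobDirectory;
    if (directory != null)
    {
        Console.WriteLine("Virtual directory: " + directory.Prefix);
        continue;
    }

    // other possible types: CloudPageBlob, CloudAppendBlob
    Console.WriteLine("Other blob type: " + item.Uri);
}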

Iterating over containers that have virtual directories
There are moments when we want to iterate over a container that has virtual directories, but we don't care about the directory structure. We want all blobs under the container, even the ones inside virtual directories. For this case, we need to set 'useFlatBlobListing' to TRUE.
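A possible sketch, assuming the 'ListBlobsSegmented' overload that takes (prefix, useFlatBlobListing, blobListingDetails, maxResults, currentToken, options, operationContext):

BlobContinuationToken token = null;
do
{
    // prefix = null (whole container), useFlatBlobListing = true so blobs from
    // all virtual directories are returned in one flat list
    BlobResultSegment segment = blobContainer.ListBlobsSegmented(
        null, true, BlobListingDetails.None, null, token, null, null);

    foreach (IListBlobItem item in segment.Results)
    {
        Console.WriteLine(item.Uri);
    }

    token = segment.ContinuationToken;
} while (token != null);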

In conclusion, we can say that 'ListBlobsSegmented' is a great method when we need to support pagination or iteration over large containers that hold hundreds of thousands of blobs.
