ListBlobsSegmented - perfect to iterate in large containers - Azure Storage Blobs

In this post we will talk about the 'ListBlobsSegmented' command, which allows us to list the blobs in a container.
When is this command useful?
'ListBlobsSegmented' is used when we need to fetch the list of blobs in an Azure Storage container. The command does not fetch the content of the blobs; only their metadata is returned. Based on this information we can trigger a download if needed.

An important detail is the number of blobs returned by one call. A single call returns the metadata of at most 5000 blobs. If the container has more than 5000 items, the response also contains a BlobContinuationToken.
This token can be used to fetch the next segment of up to 5000 blobs. The upper limit cannot be raised; a smaller segment size can be requested through the 'maxResults' parameter.

BlobContinuationToken continuationToken = null; // null starts the listing from the beginning
do
{
    BlobResultSegment blobResultSegment = blobContainer.ListBlobsSegmented(continuationToken);
    // process blobs - blobResultSegment.Results
    continuationToken = blobResultSegment.ContinuationToken; // null once the last segment was returned
} while (continuationToken != null);
For a container with 24340 blobs, the code above would need five calls if every call returns a full segment: four segments of 5000 blobs and a final one with the remaining 4340.

This command is very fast and usually takes less than a second. It is important to know that the 'ListBlobs' method of the container uses 'ListBlobsSegmented' behind the scenes to fetch the content. Once the first 5000 blobs of a 'ListBlobs' result have been enumerated, the next 5000 items are fetched behind the scenes.
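As a rough sketch of this behaviour, assuming the classic WindowsAzure.Storage client library on the full .NET Framework (where the synchronous methods are available), a plain foreach over 'ListBlobs' is enough to walk the whole container. The helper name 'CountItems' is hypothetical and used only for illustration; the segment requests happen inside the enumerator.

```csharp
using System;
using Microsoft.WindowsAzure.Storage.Blob;

public static class ListBlobsExample
{
    // Hypothetical helper: walks every item returned by ListBlobs and counts them.
    public static int CountItems(CloudBlobContainer blobContainer)
    {
        int count = 0;

        // ListBlobs returns a lazy IEnumerable<IListBlobItem>; further segments
        // are requested from the service as the enumeration advances.
        foreach (IListBlobItem item in blobContainer.ListBlobs())
        {
            Console.WriteLine(item.Uri);
            count++;
        }

        return count;
    }
}
```

With the default arguments the listing is hierarchical, so virtual directories are returned as items of their own rather than being expanded.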

There are three important things to keep in mind when working with 'ListBlobsSegmented':
ContinuationToken is not valid on another instance
This token can be used only with the same instance of CloudBlobClient. If you create another process, which will have its own instance of CloudBlobClient, you will not be able to use a token that you retrieved before.
This means that if you send the tokens to a queue, for example, so that other systems can process them, you will have a surprise: the token will return 0 items every time.
Why? Because the token is valid only in the same context, on the same instance of CloudBlobClient.
You could do some magic with OperationContext.

Type of blobs that are returned in the result
Keep in mind that a container can hold different types of blobs. Because of this, you should check the type of each item before assuming that every item in the result is a 'CloudBlockBlob'.
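A minimal sketch of such a check, assuming the classic WindowsAzure.Storage client library. The class and method names ('BlobItemKinds', 'Describe', 'PrintSegment') are hypothetical helpers, not part of the library; newer versions of the library also add further blob types, which fall into the "other" branch here.

```csharp
using System;
using Microsoft.WindowsAzure.Storage.Blob;

public static class BlobItemKinds
{
    // Hypothetical helper: returns a short label for an item from a listing.
    public static string Describe(IListBlobItem item)
    {
        if (item is CloudBlockBlob) return "block blob";
        if (item is CloudPageBlob) return "page blob";
        if (item is CloudBlobDirectory) return "virtual directory";
        return "other";
    }

    // Prints one line per item of a segment, using blob details only when
    // the item really is a block blob.
    public static void PrintSegment(BlobResultSegment segment)
    {
        foreach (IListBlobItem item in segment.Results)
        {
            var blockBlob = item as CloudBlockBlob;
            if (blockBlob != null)
            {
                Console.WriteLine("{0}: {1} ({2} bytes)",
                    Describe(item), blockBlob.Name, blockBlob.Properties.Length);
            }
            else
            {
                Console.WriteLine("{0}: {1}", Describe(item), item.Uri);
            }
        }
    }
}
```

Using 'as' followed by a null check, instead of a direct cast, avoids an InvalidCastException when the listing contains page blobs or virtual directories.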

Iterate in containers where we have virtual directories
There are times when we want to iterate over a container that has virtual directories, but we don't care about them. We want all the blobs in the container, including the ones under virtual directories. For this case, we need to set 'useFlatBlobListing' to true.
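The flat listing can be sketched as follows, assuming the classic WindowsAzure.Storage client library and its 'ListBlobsSegmented' overload that takes a prefix, the flat-listing flag, listing details, a maximum result count, the continuation token, request options and an operation context. The class and method names ('FlatListing', 'ListAllBlobUris') are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.WindowsAzure.Storage.Blob;

public static class FlatListing
{
    // Hypothetical helper: collects the URI of every blob in the container,
    // including blobs that sit under virtual directories.
    public static List<Uri> ListAllBlobUris(CloudBlobContainer blobContainer)
    {
        var uris = new List<Uri>();
        BlobContinuationToken continuationToken = null; // null starts from the beginning

        do
        {
            BlobResultSegment segment = blobContainer.ListBlobsSegmented(
                null,                    // prefix: no name filter
                true,                    // useFlatBlobListing: ignore the directory hierarchy
                BlobListingDetails.None, // no snapshots or extra details
                null,                    // maxResults: service default, at most 5000 per call
                continuationToken,       // where the previous segment stopped
                null,                    // request options: library defaults
                null);                   // operation context: not needed here

            foreach (IListBlobItem item in segment.Results)
            {
                uris.Add(item.Uri);
            }

            continuationToken = segment.ContinuationToken;
        } while (continuationToken != null);

        return uris;
    }
}
```

With the flat listing enabled the results contain only blobs, so no 'CloudBlobDirectory' items need to be expanded by hand.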

In conclusion, 'ListBlobsSegmented' is a great method when we need to support pagination or iteration over large containers that hold many thousands of blobs.

