
Azure Storage: How to reduce distribution time from 16 days to 8h with only 11 cents

The cloud is sold as an environment without limits, where you can have as many resources as you want. The truth is that each service offered by a cloud provider has limitations, like any other service, and for each service there are different ways to mitigate these boundaries.
But what happens when we reach the limits of a service? For some services there are documented best practices for scaling beyond a given limitation. For others we need to put our minds to work and find our own ways to scale above the cloud limits.
In today's post we will talk about Azure Storage and what we can do when we need to scale beyond its limits.
Before jumping to the problem, let's look at some of the limitations of Azure Storage.

Context
The biggest limitation is not the storage capacity, which can reach 500 TB of data per storage account – the biggest problem is the bandwidth. Depending on the next two characteristics, the available outbound bandwidth can vary from 10 gigabits to 30 gigabits per second:
  • Region (location where the account was created)
  • Type of storage redundancy

Ten gigabits per second of outbound traffic already sounds very good, but we should take into account that this value is per Azure Storage account and not per file. On top of the total bandwidth that can be reached by a storage account, there are also limitations at partition level.
Before going deeper into this topic, let's see what a partition means in this case. A partition is the unit of Azure Storage that determines how load balancing and traffic distribution are done.
For each component of Azure Storage, the partition is represented by a different scalability unit:
  1. Azure Blobs - A partition is represented by Container + Blob
  2. Azure Tables – A partition is represented by Table + Partition Key
  3. Azure Queues – A partition is represented by a Queue

What does this mean? It means that there is a maximum traffic limit that each partition can reach, and we need to be aware of it.
At this moment, the maximum throughput limits that can be reached per partition are:
  1. Azure Blobs – 60 MB/s or 500 requests/s
  2. Azure Tables – 2,000 entities/s
  3. Azure Queues – 2,000 messages/s

We have already identified some limitations at partition level that can affect our system. Let's see what we can do about them.

The Problem
Imagine a system that stores large amounts of data on Azure Storage. This content consists of software updates that will reach millions of devices worldwide. Because we need a secure mechanism to access this content (Shared Access Signature – token-based access), we need to store it in blobs.

Azure CDN is not used because it has no support for Shared Access Signature or any other security mechanism. A possible solution would be to encrypt the content and distribute it over the Azure CDN network, but the cost of decryption at device level is too high (from a CPU perspective).
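For reference, this is a minimal sketch of how such a Shared Access Signature URL could be issued, assuming the current azure-storage-blob Python SDK and placeholder account, container and blob names:

    from datetime import datetime, timedelta
    from azure.storage.blob import generate_blob_sas, BlobSasPermissions

    # Placeholder values – replace with the real account, key, container and blob.
    ACCOUNT_NAME = "packagestore01"
    ACCOUNT_KEY = "<storage-account-key>"
    CONTAINER = "packages"
    BLOB_NAME = "update-1.0.0.zip"

    # Issue a read-only token that expires after one hour.
    sas_token = generate_blob_sas(
        account_name=ACCOUNT_NAME,
        container_name=CONTAINER,
        blob_name=BLOB_NAME,
        account_key=ACCOUNT_KEY,
        permission=BlobSasPermissions(read=True),
        expiry=datetime.utcnow() + timedelta(hours=1),
    )

    download_url = f"https://{ACCOUNT_NAME}.blob.core.windows.net/{CONTAINER}/{BLOB_NAME}?{sas_token}"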
Because of this, we are limited to 60 MB/s or up to 500 requests/s per blob. If, on average, the download bandwidth of each device is 512 KB/s, we can have a maximum of 120 devices downloading the content at full speed, or 500 devices downloading the content at a rate of about 122 KB/s:
  • 120 devices at 512 KB/s
  • 500 devices at 122 KB/s

Going further, a 100 MB package will reach at most 500 devices in about 14 minutes. If we put on the table the network connectivity issues that can appear at device level, we end up with roughly 300 devices that successfully download the package every 14 minutes.
But in the world of IoT, where we can easily have 50,000 devices, this means we would be able to distribute the package to all our devices in about 39 hours.
Oops… as a client I expect more than that; I need to be able to push a package to these devices in a shorter period of time.

Before going further, let's summarize.
The download limit for a blob is 60 MB/s or up to 500 requests per second. A 100 MB package will be delivered to 50,000 devices in about 39 hours.
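The numbers above come from simple arithmetic; a quick back-of-the-envelope calculation (a sketch in Python, using the 300-successful-downloads-per-cycle assumption from above) looks like this:

    # Rough estimate for a single blob (one partition).
    PACKAGE_MB = 100
    BLOB_MBPS = 60              # max throughput per blob
    MAX_REQUESTS = 500          # max parallel downloads per blob
    SUCCESS_PER_CYCLE = 300     # devices that finish a cycle despite connectivity issues

    # With 500 parallel downloads, each device gets ~122 KB/s.
    per_device_kbps = BLOB_MBPS * 1024 / MAX_REQUESTS         # ~122.9 KB/s
    cycle_minutes = PACKAGE_MB * 1024 / per_device_kbps / 60  # ~14 minutes

    devices = 50_000
    total_hours = devices / SUCCESS_PER_CYCLE * cycle_minutes / 60
    print(f"~{cycle_minutes:.0f} min per cycle, ~{total_hours:.0f} hours for {devices} devices")
    # ~14 min per cycle, ~39 hours for 50000 devices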

Solution
To reduce the package distribution time we need to be clever. Having only one location where the content is stored is simple to maintain, but it doesn't allow us to scale beyond the account limitations.
Nothing stops us from copying a package to multiple regions and datacenters. In this way we can have the same package distributed across 3 different datacenters using 3 different storage accounts. From a performance perspective, this means we can distribute a package to 900 devices every 14 minutes, so we would be able to reach all 50,000 devices in only about 13 hours.
  • 1 replica … 300 devices reached every 14 minutes
  • 3 replicas … 900 devices reached every 14 minutes


But what if we need to distribute the same package to 500,000 devices? We can apply the same technique, but in a smarter way. Let's use the following deployment configuration:

  • Replicate the content in 5 datacenters
  • In each datacenter, replicate the content to 3 different Azure Storage accounts

Remark: We could replicate the content within a single Azure Storage account in 3 different blobs, but personally I prefer to use different storage accounts (this way the risk of hitting the account-level limits is lower).
Having the content replicated in 5 datacenters, each with 3 storage accounts, means the content is replicated in 15 different locations.
The above configuration would allow us to reach:
  • 4,500 devices every 14 minutes
  • 50,000 devices in about 2.5 hours
  • 100,000 devices in about 5.2 hours
  • 500,000 devices in about 26 hours

At this moment there are 16 Azure datacenters publicly available worldwide. Imagine deploying a package to all of them, with 3 storage accounts per datacenter. This would allow us to reach 14,400 devices every 14 minutes and 500,000 devices in about 8.2 hours.
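The same estimate, generalized over the number of replicas (a rough sketch, assuming each replica sustains about 300 successful downloads per 14-minute cycle, as above):

    def distribution_hours(devices, replicas, per_replica_per_cycle=300, cycle_minutes=14):
        """Rough estimate of how long it takes to reach all the devices."""
        devices_per_cycle = replicas * per_replica_per_cycle
        cycles = devices / devices_per_cycle
        return cycles * cycle_minutes / 60

    for replicas in (1, 3, 15, 48):
        print(replicas, round(distribution_hours(500_000, replicas), 1))
    # 1 -> ~388.9 h (~16 days), 3 -> ~129.6 h, 15 -> ~25.9 h, 48 -> ~8.1 h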
From a technical perspective, it is pretty clear that we add some complexity to our system – no pain, no gain.
There will be two components that will become more complex:
The first one is the component that manages the package upload. It needs to be able to replicate the package payload to all the datacenters and storage accounts where we want to store the package.
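A minimal sketch of what this upload/replication component could look like, assuming the current azure-storage-blob Python SDK and a hypothetical list of replica storage accounts:

    from azure.storage.blob import BlobServiceClient

    # Hypothetical replica list: one entry per storage account, across datacenters.
    REPLICAS = [
        ("https://pkgnortheurope01.blob.core.windows.net", "<key-1>"),
        ("https://pkgwesteurope01.blob.core.windows.net", "<key-2>"),
        ("https://pkgeastus01.blob.core.windows.net", "<key-3>"),
    ]

    def replicate_package(local_path, container, blob_name):
        """Upload the same package payload to every replica storage account."""
        for account_url, key in REPLICAS:
            service = BlobServiceClient(account_url=account_url, credential=key)
            blob = service.get_blob_client(container=container, blob=blob_name)
            with open(local_path, "rb") as data:
                blob.upload_blob(data, overwrite=True)

    replicate_package("update-1.0.0.zip", "packages", "update-1.0.0.zip")

In practice, a server-side copy between storage accounts (for example with start_copy_from_url) would avoid pushing the payload over the wire once per replica.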
The second component that becomes more complex is the one that resolves the blob download URL for each device. It needs to distribute the load uniformly across all available replicas. If you can partition your devices based on region, location or anything else, this is pretty simple to do.
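And a sketch of the URL-resolver component: plain Python, with hypothetical replica URLs, that simply hashes the device id onto one of the available replicas (the Shared Access Signature would then be appended to the resolved URL):

    import hashlib

    # Hypothetical base URLs of the blob replicas (one per storage account).
    REPLICA_URLS = [
        "https://pkgnortheurope01.blob.core.windows.net/packages/update-1.0.0.zip",
        "https://pkgwesteurope01.blob.core.windows.net/packages/update-1.0.0.zip",
        "https://pkgeastus01.blob.core.windows.net/packages/update-1.0.0.zip",
    ]

    def resolve_download_url(device_id):
        """Spread devices uniformly across replicas by hashing the device id."""
        digest = hashlib.sha256(device_id.encode("utf-8")).hexdigest()
        index = int(digest, 16) % len(REPLICA_URLS)
        return REPLICA_URLS[index]

    print(resolve_download_url("device-000042"))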
In the near future I will write posts that cover these two components in more detail.

Cost
Let's see how the cost is affected by a solution that replicates the content in multiple datacenters – specifically, how much it would cost for 500,000 devices to download 100 MB of data from Azure Blobs.
Remark: Because download costs differ by Zone (datacenter location), we will assume that all datacenters are in Zone 1, with around 60 TB of traffic per month – an outbound traffic cost of €0.0522 per GB. We also assume a 30 TB storage account with Locally Redundant Storage – a storage cost of €0.022 per GB per month.
Download cost:
     500,000 devices × 100 MB / 1024 = 48,828.125 GB ≈ 47.68 TB
     48,829 GB × €0.0522 ≈ €2,548.87

Storage cost:
     For 100 MB:
     With only one copy: 100 MB = 0.098 GB × €0.022 = €0.002156
     With 3 replicas: €0.006468
     With 15 replicas: €0.03234
     With 48 replicas: €0.103488 (16 datacenters × 3 storage accounts per datacenter)
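The same figures can be reproduced in a few lines (a sketch, using the Zone 1 prices assumed in the remark above):

    # Assumed Zone 1 prices: €0.0522 per GB downloaded, €0.022 per GB stored per month.
    EGRESS_PER_GB = 0.0522
    STORAGE_PER_GB_MONTH = 0.022

    package_gb = 100 / 1024                                       # ~0.098 GB
    download_cost = 500_000 * package_gb * EGRESS_PER_GB          # ~€2,548.83
    storage_cost_48 = 48 * package_gb * STORAGE_PER_GB_MONTH      # ~€0.10 per month

    print(f"download ~€{download_cost:.2f}, storage (48 replicas) ~€{storage_cost_48:.3f}/month")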

As you can see, the cost of storage is only a tiny part of the total price; the download cost is the one that dominates.
If we consider that an extra cost of less than €0.11 reduces the distribution time for 500,000 devices from about 16 days to 8.2 hours, it is pretty amazing what we can do with a clever solution.
We can keep the package content replicated in 48 storage accounts only at the moment when we release the package. Once it has been released and deployed, we can keep the content replicated in only 3 locations, for example.

Conclusion
In conclusion, even if Azure Storage accounts have some limitations, with 16 datacenters now available worldwide we can do pretty amazing things if we scale horizontally.

With only 11 cents of extra storage cost per month, we can reduce the distribution time of a package from about 16 days to 8.2 hours. This is the power of a system that is distributed worldwide.
