(Part 3) Testing the limits of Windows Azure Service Bus

In the previous posts of this series about the limits of Windows Azure Service Bus, we saw what the maximum number of messages is that we can process through a single topic (1.000.000 messages every 30 minutes) and what kind of worker role we should use to process these messages in the optimum way (Medium size).
In this post I will try to answer another question that naturally comes up when we use a cloud solution.
Can I scale the number of instances that consume messages from the topic?
Environment:
Each message added to Service Bus was pretty small: around 100 characters in UTF7 and 3 properties added to each BrokeredMessage.
We ran the tests with one, two and three subscribers with different filter rules, and the results were similar.
Each message received from Service Bus requires a custom action to be executed. This action is fairly complicated and consumes CPU power. The logic also needs to access remote services (hosted in the same data center).
For all these tests we used a medium size worker role.
Action:
In the first phase we pushed 100.000 messages to the topic and measured how long it took to process all of them. We tried to identify the best number of instances to run in parallel from the perspective of cost and time.
The next phase was to detect the optimum number of messages that can be processed using the number of instances found in the first phase.
Results:
Phase 1: We processed 100.000 messages with different numbers of worker role instances. The size of all instances was Medium. The results of these tests were:
  • 1 instance – 18 minutes and 10 seconds
  • 2 instances – 11 minutes and 40 seconds
  • 3 instances – 7 minutes and 15 seconds
  • 4 instances – 4 minutes and 15 seconds
  • 5 instances – 3 minutes and 20 seconds
  • 6 instances – 2 minutes and 59 seconds
  • 7 instances – 2 minutes and 27 seconds
  • 8 instances – 2 minutes and 5 seconds
We can observe that while scaling out, the processing time decreases by roughly 30-40% with each added instance until we reach 4-5 instances, after which it starts to decrease more slowly. This happens because when a topic is used intensively, its response time increases. Don’t expect the same latency when the topic is hit 10 times per second as when it is hit 1000 times per second.
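The step-by-step reduction can be checked directly from the phase 1 measurements. The following is only a small sketch of that arithmetic; the function names are made up for illustration and are not part of the test harness we used.

```python
# Phase 1 durations from the list above, as (minutes, seconds) per instance count.
phase1 = {
    1: (18, 10),
    2: (11, 40),
    3: (7, 15),
    4: (4, 15),
    5: (3, 20),
    6: (2, 59),
    7: (2, 27),
    8: (2, 5),
}


def to_seconds(duration):
    """Convert a (minutes, seconds) pair to seconds."""
    minutes, seconds = duration
    return minutes * 60 + seconds


def step_reductions(measurements):
    """Percentage drop in processing time when going from n-1 to n instances."""
    counts = sorted(measurements)
    reductions = {}
    for prev, cur in zip(counts, counts[1:]):
        before = to_seconds(measurements[prev])
        after = to_seconds(measurements[cur])
        reductions[cur] = 100.0 * (before - after) / before
    return reductions


if __name__ == "__main__":
    for n, pct in step_reductions(phase1).items():
        print(f"{n - 1} -> {n} instances: {pct:.1f}% less time")
```

With the numbers above, each of the first three added instances cuts the time by more than 35%, while the steps from 5 instances onward each save less than 25%.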
When calculating the price, we should take into account two different prices. The first represents the cost of running N instances only for the period of time in which messages from the topic are processed. The second is the absolute price, which represents the cost of running N instances for the minimum billable period of time (in the case of Windows Azure the smallest time unit is one hour).
Taking all these things into account, we observed that 4 instances of medium size can process our messages in the shortest period of time at the best cost.
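The two prices described above can be sketched as follows. This is a simplified model, not our actual cost calculation: `HOURLY_RATE` is a hypothetical placeholder (not a real Azure price), and the function names are invented for the example. The durations are the phase 1 values converted to seconds.

```python
import math

# Phase 1 durations in seconds per instance count (from the measurements above).
PHASE1_SECONDS = {1: 1090, 2: 700, 3: 435, 4: 255, 5: 200, 6: 179, 7: 147, 8: 125}

# Hypothetical price of one instance-hour; an assumption, NOT a real Azure price.
HOURLY_RATE = 1.0


def consumed_cost(instances, seconds, rate=HOURLY_RATE):
    """Cost of the time actually spent processing messages."""
    return instances * (seconds / 3600.0) * rate


def billed_cost(instances, seconds, rate=HOURLY_RATE):
    """Cost when every started hour of every instance is billed in full."""
    return instances * math.ceil(seconds / 3600.0) * rate


if __name__ == "__main__":
    for n, secs in PHASE1_SECONDS.items():
        print(f"{n} instances: consumed {consumed_cost(n, secs):.3f}, "
              f"billed {billed_cost(n, secs):.1f}")
```

Under this model, 4 instances consume 1020 instance-seconds, less than the 1090 of a single instance, while the billed cost with hourly rounding is simply proportional to the instance count because every run finishes within one hour.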
Having this “magic” number of instances and their size, we took the next step. We measured how long it takes our application, deployed on 4 medium size worker roles, to process different numbers of messages. The results of this test were:
  • 100.000 messages – 4 minutes
  • 200.000 messages – 8 minutes
  • 300.000 messages – 11 minutes
  • 500.000 messages – 19 minutes
  • 1.000.000 messages – 34 minutes
  • 3.000.000 messages – 2 hours and 32 minutes
The results were pretty interesting. The performance stays roughly the same until we reach a critical point. Do you remember the first post, where I mentioned that 1.000.000 messages per 30 minutes seems to be the maximum number of messages that we can process in an optimum way?
This is the point where the performance starts to go down. The difference between processing 1.000.000 messages and 3.000.000 messages is pretty big: the last test needed almost 4.5x more time for 3x the messages.
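To make the drop visible, the 4-instance measurements above can be turned into messages per minute. This is just a sketch of the arithmetic, with illustrative function names.

```python
# (messages, minutes) measured with 4 medium instances, from the list above.
FOUR_INSTANCES = [
    (100_000, 4),
    (200_000, 8),
    (300_000, 11),
    (500_000, 19),
    (1_000_000, 34),
    (3_000_000, 2 * 60 + 32),
]


def throughput_per_minute(runs):
    """Average number of messages processed per minute for each run."""
    return {messages: messages / minutes for messages, minutes in runs}


def time_ratio(runs, smaller, larger):
    """How many times longer the larger run took compared to the smaller one."""
    minutes = dict(runs)
    return minutes[larger] / minutes[smaller]


if __name__ == "__main__":
    for messages, rate in throughput_per_minute(FOUR_INSTANCES).items():
        print(f"{messages:>9} messages: {rate:,.0f} messages/minute")
    ratio = time_ratio(FOUR_INSTANCES, 1_000_000, 3_000_000)
    print(f"3.000.000 vs 1.000.000 messages: {ratio:.2f}x the time")
```

Up to 1.000.000 messages the throughput stays between 25.000 and roughly 29.400 messages per minute; for 3.000.000 messages it falls to roughly 19.700.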
At the end of the tests we decided to use only one medium size worker role to see how long it takes to process different numbers of messages. The results were:
  • 100.000 messages – 18 minutes
  • 200.000 messages – 32 minutes
  • 300.000 messages – 51 minutes
  • 500.000 messages – 1 hour and 31 minutes
  • 1.000.000 messages – 2 hours and 58 minutes
What I liked about this result was how the duration increased: it grows almost linearly with the number of messages.
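A quick way to check this near-linear growth is to normalize the single-instance results above to minutes per 100.000 messages. The sketch below does only that; the function name is illustrative.

```python
# (messages, minutes) measured with one medium instance, from the list above.
SINGLE_INSTANCE = [
    (100_000, 18),
    (200_000, 32),
    (300_000, 51),
    (500_000, 60 + 31),
    (1_000_000, 2 * 60 + 58),
]


def minutes_per_100k(runs):
    """Processing time normalized to batches of 100.000 messages."""
    return [(messages, minutes / (messages / 100_000)) for messages, minutes in runs]


if __name__ == "__main__":
    for messages, normalized in minutes_per_100k(SINGLE_INSTANCE):
        print(f"{messages:>9} messages: {normalized:.1f} minutes per 100.000")
```

If the growth were perfectly linear, every run would give the same value. Here the values stay between 16 and about 18 minutes per 100.000 messages.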
Conclusion
For our business problem we observed that having 4 medium size worker roles on the same topic is the best configuration from the time and cost perspective. This result is extremely important because we now know the point at which we need to scale in a different way.
In the next post related to this topic we will talk about costs.
Remarks: Usually all the duration values are rounded to minutes (without seconds and milliseconds). This does not mean that we don’t have these values.

Part 4
