Skip to main content

(Part 3) Testing the limits of Windows Azure Service Bus

In the latest post series about the limits of Windows Azure Service Bus, we saw that is the maximum number of messages that we can process through a single topic (1.000.000 messages every 30 minutes) and what kind of worker role we should use to process this messages in the optimum way (Medium size).
In this post I will try to respond to another question that is normal to appear when we are using a cloud solution.
Can I scale the number of instances that consume messages from the topic?
Environment:
Each message that was added to Service Bus was pretty small. We had around 100 characters in UTF7 and 3 properties added to each BrokeredMessage.
We run the tests with one; two and three subscribers with different filter rules and the result were similar.
Each message that is received from Service Bus required a custom action to be executed. This action is pretty complicated and consumes CPU power. Also the logic requires to access remote services (that are stored in the same data-center).
For all this tests we used a medium size worker role.
Action:
In the first phase we pushed on the topic 100.000 messages and measure how long it takes to process all the available messages. We tried to identify what is the best number of instances that we can have in parallel from the perspective of costs and time.
The next phase was to see detect what is the optimum number of messages that can be processed using the given number of instances that we found in the first phase.
Results:
Phase 1: We tried to process 100.000 of messages with different number of worker roles instances. The size of all instances was Medium. The results of these tests were:
  • 1 instance – 18 minutes and 10 seconds
  • 2 instance – 11 minutes and 40 seconds
  • 3 instance – 7 minutes and 15 seconds
  • 4 instance – 4 minutes and 15 seconds
  • 5 instance – 3 minutes and 20 seconds
  • 6 instance – 2 minutes and 59 seconds
  • 7 instance – 2 minutes and 27 seconds
  • 8 instance – 2 minutes and 5 seconds
We can observe that at while scaling up, the time decrease with around 30% percent until we reach 4-5 instances, when the time start to decrease more slowly. This happens because when a topic is intensive used, the response time increase. Don’t expect to have the same latency when the topic is hit 10 times per seconds or 1000 per seconds.
When calculating the price, we should take into account to different prices. The first price represents the cost of running N instances for the period of time when messages from topic are processed. The second price is the obsolete price, which represents the cost of running N instances for the minimum period of time – in the case of Windows Azure the smallest time unit is hour.
Taking all this things into account, we observed that 4 instance of medium size can process our messages in the shortest period of time with the best costs.
Having this “magic” number of instances and the size of them we made the next step. We measure how long it takes for our application that is deployed on 4 worker roles with medium size to process different number of messages. The results of this test were:
  • 100.000 messages – 4 minutes
  • 200.000 messages – 8 minutes
  • 300.000 messages – 11 minutes
  • 500.000 messages – 19 minutes
  • 1.000.000 messages – 34 minutes
  • 3.000.000 messages – 2 hours and 32 minutes
The results were pretty interesting. The performance is also the same until we reach a critical point. Do you remember the first post when I mentioned that we observe that 1.000.000 messages per 30 minutes seems to be the maximum number of messages that we can process in an optimum way?
This is the point when the performance starts to go down. We can observe that the difference between processing 1.000.000 messages and 3.000.000 messages is pretty big. From the time perspective the last test requested almost 4.5x more time.
At the end of the tests we decided to use only one worker role of medium size to see how long it takes to process different number of messages. The time results were:
  • 100.000 messages – 18 minutes
  • 200.000 messages – 32 minutes
  • 300.000 messages – 51 minutes
  • 500.000 messages – 1 hour and 31 minutes
  • 1.000.000 messages – 2 hour and 58 minutes
What I liked at this result it was how the duration increased. The duration time increases almost like the number of messages.
Conclusion
For our business problems we observed that having 4 worker roles of medium size of the same topic is the best configuration that we can have from time and costs perspective. This result is extremely important because we know where is the point when we need to scale in a different way.
In the next post related to this topic we will talked about costs.
Remarks: Usually all the duration values are rounded to minutes (without seconds and milliseconds). This does not mean that we don’t have these values.

Part 4

Comments

Popular posts from this blog

Windows Docker Containers can make WIN32 API calls, use COM and ASP.NET WebForms

After the last post , I received two interesting questions related to Docker and Windows. People were interested if we do Win32 API calls from a Docker container and if there is support for COM. WIN32 Support To test calls to WIN32 API, let’s try to populate SYSTEM_INFO class. [StructLayout(LayoutKind.Sequential)] public struct SYSTEM_INFO { public uint dwOemId; public uint dwPageSize; public uint lpMinimumApplicationAddress; public uint lpMaximumApplicationAddress; public uint dwActiveProcessorMask; public uint dwNumberOfProcessors; public uint dwProcessorType; public uint dwAllocationGranularity; public uint dwProcessorLevel; public uint dwProcessorRevision; } ... [DllImport("kernel32")] static extern void GetSystemInfo(ref SYSTEM_INFO pSI); ... SYSTEM_INFO pSI = new SYSTEM_INFO(...

How to audit an Azure Cosmos DB

In this post, we will talk about how we can audit an Azure Cosmos DB database. Before jumping into the problem let us define the business requirement: As an Administrator I want to be able to audit all changes that were done to specific collection inside my Azure Cosmos DB. The requirement is simple, but can be a little tricky to implement fully. First of all when you are using Azure Cosmos DB or any other storage solution there are 99% odds that you’ll have more than one system that writes data to it. This means that you have or not have control on the systems that are doing any create/update/delete operations. Solution 1: Diagnostic Logs Cosmos DB allows us activate diagnostics logs and stream the output a storage account for achieving to other systems like Event Hub or Log Analytics. This would allow us to have information related to who, when, what, response code and how the access operation to our Cosmos DB was done. Beside this there is a field that specifies what was th...

Cloud Myths: Cloud is Cheaper (Pill 1 of 5 / Cloud Pills)

Cloud Myths: Cloud is Cheaper (Pill 1 of 5 / Cloud Pills) The idea that moving to the cloud reduces the costs is a common misconception. The cloud infrastructure provides flexibility, scalability, and better CAPEX, but it does not guarantee lower costs without proper optimisation and management of the cloud services and infrastructure. Idle and unused resources, overprovisioning, oversize databases, and unnecessary data transfer can increase running costs. The regional pricing mode, multi-cloud complexity, and cost variety add extra complexity to the cost function. Cloud adoption without a cost governance strategy can result in unexpected expenses. Improper usage, combined with a pay-as-you-go model, can result in a nightmare for business stakeholders who cannot track and manage the monthly costs. Cloud-native services such as AI services, managed databases, and analytics platforms are powerful, provide out-of-the-shelve capabilities, and increase business agility and innovation. H...