Skip to main content

My thoughts about Azure Cache for Redis (Enterprise)

In today's post, we talk about the flavours of Redis Cache for Microsoft Azure and how to decrypt undocumented errors that we can receive from Redis during the provisioning phase.

When using Microsoft Azure, we have two main options for using Redis Cache:

- Azure Cache for Redis: a SaaS service provided by Microsoft that uses OSS Redis (Redis open-source)

- Azure Cache for Redis Enterprise: fully managed by Microsoft that uses Redis Enterprise

In 90% of the cases, Azure Cache for Redis it's the best-managed cache solution available in Microsoft Azure, offering a 99.9% availability SLA, supporting 120 GB of memory and 40k client connection. I had a great experience with it, as long as you understand the connection concept of Redis.

Azure Cache for Redis Enterprise provides more power, up to 13TB of memory, 2M client connection, a 99.999 availability SLA, 1M operations per second and all the features of Redis Enterprise like active-geo, modules, time series support and Redis on Flash. 

Going with the Redis Enterprise tiers comes with a price, but it is a good offer, especially when you need active-geo replication. You need to consider that active-geo replication requires 2 instances of Redis Enterprise. The pricing model includes both replicas, because active-geo is the most common motivation to go with the Enterprise tier.

From the performance point of view, you should expect up to 70-75% more operations per second and 40% better latency when you compare the Premium tier of Azure Cache for Redis with Azure Cache for Redis Enterprise. 

From the cost point of view, it is hard to compare, but if we compare the P5 tier of Premium offer of Azure Cache for Redis with E100 that is similar from the cache size point of view, your running cost is almost double, BUT you get two data nodes. The real cost hit is when you use the C5 or C5 tier standard tier, and you need to go with the enterprise one for active-geo, for example, when the running cost is 7-10 times more.

An important difference between the two services is who provides support for it. Azure Cache for Redis is fully managed by Microsoft and well documented. Azure Cache for Redis Enterprise is managed by Microsoft, and you get good support from the Redis team, but you need to consider that it is not directly from Microsoft.


When should I use the Enterprise tier?

The no. 1 feature of the Enterprise tier is the active-geo (active geo-replication) that makes the customer move to this tier, together with JSON and time series features.  The performance provided by Azure Cache for the Redis Premium tier is very good. Until now, I was not involved in a project where migration to the Enterprise tier was caused by performance. Yes, we were using 2-4-6 instances of Premium tier deployed across regions without issues. But when geo-replication was required, the Enterprise tier was the best option, even in comparison with other solutions provided by the market. 

When you consider active geo-replication of Redis, there are 3 main cases when you can use it:

(1) Geo-distributed applications: where you want to replicate content across multiple locations in near-real time

(2) Handle region failures: where you ensure a failover node, that is fully replicated

(3) Roaming user sessions: across 2 different locations, having the ability to serve the user from two different locations

Issues with Enterprise tier

A few weeks ago, we had an interesting experience with Redis Enterprise. For a few days, the team received the below error when they were trying to spin up an instance of Redis Enterprise.

    "status": "Failed",

    "error": {

        "code": "ResourceDeploymentFailure",

        "message": "The resource operation completed with terminal provisioning state 'Failed'."

    }

}

The error message is encrypted, and it is not very clear. You can make a lot of assumptions and you don't know if the problem is from your side or from Redis Enterprise.

No additional information was provided, and the ARM scripts were correct. The same error message was provided when a new Redis Enterprise instance was created from the Azure Portal. We were trying to do a PoC, targeting the active-geo feature, and it was not the best experience for the technical team that was stuck. We opened an incident ticket to Redis related to it.

The cause of the incident was a lack of resources available to Redis in the given region. After a few days, we were able to create a new instance without a problem, but I still have a concern related to - What if this would happen in the production environment during an incident? Would be the customer solution down for a few days?

Comments

Popular posts from this blog

Windows Docker Containers can make WIN32 API calls, use COM and ASP.NET WebForms

After the last post , I received two interesting questions related to Docker and Windows. People were interested if we do Win32 API calls from a Docker container and if there is support for COM. WIN32 Support To test calls to WIN32 API, let’s try to populate SYSTEM_INFO class. [StructLayout(LayoutKind.Sequential)] public struct SYSTEM_INFO { public uint dwOemId; public uint dwPageSize; public uint lpMinimumApplicationAddress; public uint lpMaximumApplicationAddress; public uint dwActiveProcessorMask; public uint dwNumberOfProcessors; public uint dwProcessorType; public uint dwAllocationGranularity; public uint dwProcessorLevel; public uint dwProcessorRevision; } ... [DllImport("kernel32")] static extern void GetSystemInfo(ref SYSTEM_INFO pSI); ... SYSTEM_INFO pSI = new SYSTEM_INFO(...

How to audit an Azure Cosmos DB

In this post, we will talk about how we can audit an Azure Cosmos DB database. Before jumping into the problem let us define the business requirement: As an Administrator I want to be able to audit all changes that were done to specific collection inside my Azure Cosmos DB. The requirement is simple, but can be a little tricky to implement fully. First of all when you are using Azure Cosmos DB or any other storage solution there are 99% odds that you’ll have more than one system that writes data to it. This means that you have or not have control on the systems that are doing any create/update/delete operations. Solution 1: Diagnostic Logs Cosmos DB allows us activate diagnostics logs and stream the output a storage account for achieving to other systems like Event Hub or Log Analytics. This would allow us to have information related to who, when, what, response code and how the access operation to our Cosmos DB was done. Beside this there is a field that specifies what was th...

Cloud Myths: Cloud is Cheaper (Pill 1 of 5 / Cloud Pills)

Cloud Myths: Cloud is Cheaper (Pill 1 of 5 / Cloud Pills) The idea that moving to the cloud reduces the costs is a common misconception. The cloud infrastructure provides flexibility, scalability, and better CAPEX, but it does not guarantee lower costs without proper optimisation and management of the cloud services and infrastructure. Idle and unused resources, overprovisioning, oversize databases, and unnecessary data transfer can increase running costs. The regional pricing mode, multi-cloud complexity, and cost variety add extra complexity to the cost function. Cloud adoption without a cost governance strategy can result in unexpected expenses. Improper usage, combined with a pay-as-you-go model, can result in a nightmare for business stakeholders who cannot track and manage the monthly costs. Cloud-native services such as AI services, managed databases, and analytics platforms are powerful, provide out-of-the-shelve capabilities, and increase business agility and innovation. H...