My thoughts about Azure Cache for Redis (Enterprise)

In today's post, we talk about the flavours of Redis Cache for Microsoft Azure and how to decipher the undocumented errors that we can receive from Redis during the provisioning phase.

When using Microsoft Azure, we have two main options for using Redis Cache:

- Azure Cache for Redis: a SaaS service provided by Microsoft that uses OSS Redis (Redis open-source)

- Azure Cache for Redis Enterprise: a service fully managed by Microsoft that uses Redis Enterprise

In 90% of cases, Azure Cache for Redis is the best managed cache solution available in Microsoft Azure, offering a 99.9% availability SLA and supporting up to 120 GB of memory and 40k client connections. I have had a great experience with it, as long as you understand the connection concept of Redis.

Azure Cache for Redis Enterprise provides more power: up to 13 TB of memory, 2M client connections, a 99.999% availability SLA, 1M operations per second, and all the features of Redis Enterprise, such as active-geo, modules, time series support and Redis on Flash.

Going with the Redis Enterprise tiers comes with a price, but it is a good offer, especially when you need active-geo replication. You need to consider that active-geo replication requires 2 instances of Redis Enterprise. The pricing model includes both replicas, because active-geo is the most common motivation to go with the Enterprise tier.

From the performance point of view, you should expect up to 70-75% more operations per second and 40% better latency when you compare the Premium tier of Azure Cache for Redis with Azure Cache for Redis Enterprise. 

From the cost point of view, it is hard to compare. If we compare the P5 tier of the Premium offer of Azure Cache for Redis with E100, which is similar in cache size, the running cost is almost double, BUT you get two data nodes. The real cost hit comes when you use a C5 Standard tier and need to move to the Enterprise tier, for example for active-geo: the running cost is then 7-10 times higher.

An important difference between the two services is who provides support for them. Azure Cache for Redis is fully managed by Microsoft and well documented. Azure Cache for Redis Enterprise is managed by Microsoft, and you get good support from the Redis team, but you need to consider that this support does not come directly from Microsoft.


When should I use the Enterprise tier?

The no. 1 feature that makes customers move to the Enterprise tier is active-geo (active geo-replication), together with the JSON and time series features. The performance provided by the Premium tier of Azure Cache for Redis is very good. Until now, I have not been involved in a project where migration to the Enterprise tier was driven by performance. Yes, we were using 2, 4 or 6 instances of the Premium tier deployed across regions without issues. But when geo-replication was required, the Enterprise tier was the best option, even in comparison with other solutions available on the market.

When you consider active geo-replication of Redis, there are 3 main cases when you can use it:

(1) Geo-distributed applications: where you want to replicate content across multiple locations in near-real time

(2) Handle region failures: where you keep a fully replicated failover node

(3) Roaming user sessions: across 2 different locations, having the ability to serve the user from two different locations

Issues with Enterprise tier

A few weeks ago, we had an interesting experience with Redis Enterprise. For a few days, the team received the error below whenever they tried to spin up an instance of Redis Enterprise.

    {
        "status": "Failed",
        "error": {
            "code": "ResourceDeploymentFailure",
            "message": "The resource operation completed with terminal provisioning state 'Failed'."
        }
    }

The error message is cryptic. You can make a lot of assumptions, but you cannot tell whether the problem is on your side or on the Redis Enterprise side.
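When a deployment fails like this, a first step is to check whether the response carries anything beyond the top-level code. The snippet below is a minimal sketch, not an official tool: it assumes the response follows the usual ARM error shape, where an error object may contain a nested `details` array, and the helper name `flatten_arm_error` is made up for illustration. The payload is the one from this post.

```python
import json


def flatten_arm_error(error, depth=0):
    """Yield (depth, code, message) for an error object and any nested details.

    Assumption: nested errors, when present, live under a "details" list,
    as in the common ARM error response shape.
    """
    yield depth, error.get("code"), error.get("message")
    for child in error.get("details") or []:
        yield from flatten_arm_error(child, depth + 1)


# The failed response quoted in this post.
response = json.loads("""
{
    "status": "Failed",
    "error": {
        "code": "ResourceDeploymentFailure",
        "message": "The resource operation completed with terminal provisioning state 'Failed'."
    }
}
""")

rows = list(flatten_arm_error(response["error"]))
for depth, code, message in rows:
    # Indent nested entries so the hierarchy is visible.
    print("  " * depth + f"{code}: {message}")
```

For the response above, the walk finds a single top-level entry and no nested details, which is exactly the problem: there is nothing further to inspect.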

No additional information was provided, and the ARM scripts were correct. The same error message was returned when a new Redis Enterprise instance was created from the Azure Portal. We were trying to do a PoC targeting the active-geo feature, and it was not the best experience for the technical team, which was stuck. We opened an incident ticket with Redis about it.

The cause of the incident was a lack of resources available to Redis in the given region. After a few days, we were able to create a new instance without a problem, but I still have a concern: what if this happened in the production environment during an incident? Would the customer's solution be down for a few days?
