Azure CosmosDB is the non-relational database service available inside Azure, offering multiple data models and global distribution. Documents are stored in collections, which can be queried to fetch content.
In general, people draw the following comparison between DocumentDB and SQL:
- A Document is similar to a Row
- A Collection is similar to a Table
- A Database is similar to a SQL DB
Even so, when you talk about scaling and throughput, things are a little more complicated inside Azure CosmosDB - there are 2 different dimensions at which throughput can be provisioned.
The first one is at the container level. What is a container? Well, for DocumentDB it is represented by the collection itself. You have the ability to specify the number of resources (RUs - Request Units) that you want to reserve for a specific container.
When you specify the level of throughput, you are also required to specify a partition key, which is used to generate the logical partitions. These are generated behind the scenes; a logical partition is a group of documents that share the same partition key value, and logical partitions are used to distribute the load across the collection.
For one or more logical partitions, Azure CosmosDB creates physical partitions, each mapped to one or more logical partitions. There is no control over the number of physical partitions - they are fully managed by Azure CosmosDB. Each replica has the same number of physical partitions, with the same amount of resources reserved.
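The two levels of partitioning above can be sketched with a toy routing function. The partition key name (`city`), the documents, and the hash-modulo placement are all illustrative assumptions - the real placement algorithm is internal to Azure CosmosDB:

```python
import hashlib
from collections import defaultdict

def logical_partitions(documents, partition_key):
    """Group documents into logical partitions by their partition key value."""
    groups = defaultdict(list)
    for doc in documents:
        groups[doc[partition_key]].append(doc)
    return groups

def physical_partition_for(key_value, physical_count):
    """Toy hash routing of a logical partition onto a physical partition."""
    digest = hashlib.sha256(key_value.encode()).digest()
    return int.from_bytes(digest[:4], "big") % physical_count

docs = [
    {"id": "1", "city": "Seattle"},
    {"id": "2", "city": "London"},
    {"id": "3", "city": "Seattle"},
]
logical = logical_partitions(docs, "city")
# Documents sharing a partition key value form one logical partition, and the
# whole logical partition always lands on the same physical partition.
```

The key takeaway: you choose the partition key, but the mapping of logical partitions to physical partitions is not something you control.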
When we allocate resources at the container level, we reserve resources for a collection. These resources are shared between all the physical partitions of that collection. Because of this, if one partition has a high load, the other partitions will suffer from a lack of resources. We do not have the ability to reserve resources and lock them at the partition level.
You should be aware that the partition key you specify at the collection level is used by the container to partition the data within it; it does not reserve dedicated resources per partition. The resources are per collection.
The second dimension is at the database level. Resources are shared across all the collections under the database. You might have lower costs, but no predictable performance at the collection level. The performance can vary depending on the load at the database level, being affected by:
- The number of containers
- The number of partition keys per collection
- The load distribution across logical partitions
Mixing database-level and container-level scalability
There is the ability to reserve dedicated resources at the database and the container level at the same time. Let's assume that you have a database (D1) with 4 collections (C1, C2, C3, C4). We can reserve 10,000 RU/s at the database level and an additional 2,000 RU/s for C2.
Doing such provisioning means that:
- You pay for 12,000 RU/s
- 10,000 RU/s are shared between the C1, C3 and C4 collections
- 2,000 RU/s are fully dedicated to C2, with a clear SLA and response time for that collection
When the load on C2 exceeds the 2,000 RU/s reserved, a throttling error (HTTP 429, request rate too large) is generated, even if resources are still available at the database level.
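The mixed provisioning above can be sketched with a small simulation. The container names and RU values come from the example; the per-second accounting is a simplifying assumption, not how the service actually meters requests:

```python
class ThroughputBudget:
    """A per-second RU budget that raises when exhausted (simplified model)."""
    def __init__(self, limit_rus):
        self.limit_rus = limit_rus
        self.used = 0

    def charge(self, rus):
        if self.used + rus > self.limit_rus:
            raise RuntimeError("429: request rate too large")
        self.used += rus

# 10,000 RU/s shared by C1, C3, C4; 2,000 RU/s dedicated to C2.
shared = ThroughputBudget(10_000)
c2_dedicated = ThroughputBudget(2_000)
budgets = {"C1": shared, "C2": c2_dedicated, "C3": shared, "C4": shared}

budgets["C1"].charge(6_000)      # C1 draws from the shared pool
budgets["C2"].charge(1_500)      # C2 draws only from its dedicated budget
try:
    budgets["C2"].charge(1_000)  # exceeds C2's 2,000 RU/s -> throttled,
except RuntimeError as err:      # even though the shared pool has 4,000 RU/s free
    print(err)
```

Note that C2's requests never spill over into the shared pool - the dedicated budget is both a guarantee and a ceiling.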
In the initial phases of the project, you can take the approach of allocating all resources at the database level. This is a big advantage for DEV and TEST environments, where you can limit the CosmosDB costs. Once you identify a collection where the load is high and the query complexity requires more resources, you can allocate dedicated resources to that collection.
A common mistake is to start with resources allocated at the container level. This forces you to have high initial costs, with no ability to share resources across collections that have a low load.