From Azure Event Grid to AWS Kinesis with Azure Functions

When we talk about cloud integration, many people think first about connecting APIs. But the real challenge appears when the system has to scale: how do we make sure it keeps working when traffic grows, when events arrive faster, or when one side of the system scales differently from the other?

In this post, I want to show how we can connect Azure Event Grid, Azure Functions, and AWS Kinesis in a way that scales correctly. The idea is simple — we read events from Event Grid, transform them in an Azure Function, and push them into Kinesis — but making it scale well across clouds needs some attention.

Azure Event Grid — it’s all about scalability

Many people believe that Azure Event Grid scales automatically without limits. That is not really true. Event Grid can handle a very high event rate, but it has its own capacity model. A custom topic is internally partitioned to allow more parallel processing, yet there are still hard limits per topic. For example:

> Each custom topic can handle around 5,000 events per second per region.

> Each event can have a size up to 1 MB.

> To go beyond that, you can scale with additional throughput units, or by spreading the load across multiple topics or event domains.

> It can deliver events in batches — up to 1,000 events per batch — which helps a lot for throughput.

So, it does scale, but not “magically.” If your subscribers (like an Azure Function) cannot process fast enough, Event Grid will retry and queue events for some time, but eventually, you will need to scale the rest of the system to keep up.

Event Grid uses throughput units (TUs) to define how much event traffic a single topic or domain can handle. Each throughput unit gives you roughly 1 MB per second of publishing capacity and supports around 1,000 events per second. Event Grid automatically adds or removes throughput units as your event rate changes, so you don’t need to configure them manually. However, scaling happens per topic or domain, not globally — if one topic is busy, it won’t borrow capacity from another. For high event volumes, you can spread your load across multiple topics or use an Event Grid Domain for better horizontal scaling.
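To make the publishing side concrete, here is a minimal sketch of what batched publishing to a custom topic can look like with the azure-eventgrid Python SDK. The topic endpoint, access key, and event shape are placeholders, and real code would load the key from configuration rather than hard-coding it.

```python
# Minimal sketch: publishing a batch of events to a custom Event Grid topic.
# The endpoint, key, and event payload are placeholders.
from azure.core.credentials import AzureKeyCredential
from azure.eventgrid import EventGridEvent, EventGridPublisherClient

TOPIC_ENDPOINT = "https://<your-topic>.<region>-1.eventgrid.azure.net/api/events"  # placeholder
TOPIC_KEY = "<your-topic-access-key>"  # placeholder

client = EventGridPublisherClient(TOPIC_ENDPOINT, AzureKeyCredential(TOPIC_KEY))

# Sending a list publishes the events as one batch, which costs one HTTP call
# instead of one per event and helps you stay within the per-topic rate limit.
events = [
    EventGridEvent(
        subject=f"orders/{i}",
        event_type="Demo.OrderCreated",
        data={"orderId": i, "amount": 42},
        data_version="1.0",
    )
    for i in range(100)
]
client.send(events)
```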

AWS Kinesis — shards, shards and shards

Kinesis is very powerful, but it scales differently. It uses shards to define how much data it can handle in parallel. Each shard supports:

> 1 MB per second (or about 1,000 records per second) for writes.

> 2 MB per second for reads.

If your data flow is 10 MB per second, you will need 10 shards. You can split or merge shards as your data grows or shrinks. This is not automatic by default, but AWS provides examples where a small Lambda function monitors the stream through CloudWatch metrics and adjusts the shard count automatically.

It is a bit more manual than Event Grid, but it gives you very precise control. You can decide when and how much you want to scale.
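The pattern AWS describes boils down to something like the sketch below: a scheduled Lambda reads the stream's CloudWatch metrics and calls UpdateShardCount when the write rate drifts away from the current capacity. The stream name and headroom factor are illustrative, and error handling is left out.

```python
# Sketch of a scheduled Lambda that resizes a Kinesis stream based on its
# recent write rate. Stream name and headroom factor are illustrative.
import datetime
import boto3

STREAM_NAME = "events-stream"          # placeholder
BYTES_PER_SHARD_PER_SEC = 1_000_000    # 1 MB/sec write limit per shard

kinesis = boto3.client("kinesis")
cloudwatch = boto3.client("cloudwatch")

def handler(event, context):
    # Average incoming bytes over the last five minutes.
    now = datetime.datetime.now(datetime.timezone.utc)
    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/Kinesis",
        MetricName="IncomingBytes",
        Dimensions=[{"Name": "StreamName", "Value": STREAM_NAME}],
        StartTime=now - datetime.timedelta(minutes=5),
        EndTime=now,
        Period=300,
        Statistics=["Sum"],
    )
    total_bytes = sum(point["Sum"] for point in stats["Datapoints"])
    bytes_per_sec = total_bytes / 300

    # One shard per MB/sec of writes, with ~20% headroom.
    target = max(1, int(bytes_per_sec * 1.2 / BYTES_PER_SHARD_PER_SEC) + 1)

    current = kinesis.describe_stream_summary(StreamName=STREAM_NAME)[
        "StreamDescriptionSummary"]["OpenShardCount"]

    # UpdateShardCount only allows roughly halving or doubling per call,
    # so clamp the target and let repeated runs converge on the right size.
    target = max(min(target, current * 2), max(1, current // 2))

    if target != current:
        kinesis.update_shard_count(
            StreamName=STREAM_NAME,
            TargetShardCount=target,
            ScalingType="UNIFORM_SCALING",
        )
```

Because resharding is also rate-limited per day, a real scaler would avoid reacting to every small fluctuation and only resize when the gap is significant.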

Azure Function — the bridge or the bottleneck

When the traffic grows, the Azure Function must keep up. On the Consumption plan, Azure automatically creates more instances as more events arrive, so you don't need to manage the scaling yourself. However, this plan has a limit: usually around 200 concurrent instances per region.

The Premium plan is a better choice if you need more stable or predictable performance. It lets you define the number of pre-warmed instances and gives more consistent behaviour.

It’s also essential to use asynchronous code when sending to Kinesis; otherwise, you can block threads and slow down processing. And remember that Event Grid retries automatically if the Function gets too busy, so you don’t need to add your own retry logic.
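To make the bridge concrete, here is a minimal sketch of an Event Grid-triggered Function that forwards each event to Kinesis, using the Python programming model for Azure Functions and boto3. The stream name, region, and credential setup are placeholders. Note that boto3 is synchronous, so a high-throughput version would batch records with put_records (and handle its partial failures) or switch to an asynchronous AWS client.

```python
# Minimal sketch: an Event Grid-triggered Azure Function that forwards each
# event to a Kinesis stream. Names, region, and credentials are placeholders;
# a production version would batch with put_records and handle partial failures.
import json
import os

import boto3
import azure.functions as func

app = func.FunctionApp()

# boto3 reads AWS credentials from the Function App's application settings.
kinesis = boto3.client("kinesis", region_name=os.environ.get("AWS_REGION", "eu-west-1"))
STREAM_NAME = os.environ.get("KINESIS_STREAM", "events-stream")  # placeholder

@app.event_grid_trigger(arg_name="event")
def forward_to_kinesis(event: func.EventGridEvent):
    payload = {
        "id": event.id,
        "subject": event.subject,
        "eventType": event.event_type,
        "data": event.get_json(),
    }
    # The partition key decides which shard receives the record; using the
    # subject keeps related events together while spreading load across shards.
    kinesis.put_record(
        StreamName=STREAM_NAME,
        Data=json.dumps(payload).encode("utf-8"),
        PartitionKey=event.subject or event.id,
    )
```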

When we put everything in place, the scaling chain looks like this:

(1) Event Grid pushes events at the rate you configured (up to 5,000 per second per topic).

(2) The Function scales out automatically to process all events in parallel.

(3) The Kinesis shard count grows automatically (if you configured auto-scaling) or is increased manually when needed.

If one component slows down, the others can compensate — Event Grid retries, Functions scale out, and Kinesis can expand shards. The result is a pipeline that keeps running even when the load changes.



Scaling rhythm

The most important part is to make the system proportional. You don’t need infinite scale — you just need each part to scale in the same rhythm.

For example:

> If Event Grid sends 5,000 events/sec, ensure your Function can handle at least that many invocations.

> If each event is 200 KB, 5,000 events/sec works out to roughly 1 GB per second of data, and at 1 MB/sec of write capacity per shard that means on the order of 1,000 Kinesis shards (plus some headroom) to handle it comfortably.

This way, you have a balance between input, processing, and output. The system breathes naturally with the traffic.
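As a quick sanity check, that proportionality fits in a few lines of arithmetic; the numbers below are the same illustrative ones used above.

```python
# Back-of-the-envelope sizing for the pipeline, using the numbers above.
events_per_sec = 5_000        # Event Grid custom topic limit
event_size_kb = 200           # illustrative event size
shard_write_mb_per_sec = 1    # Kinesis write limit per shard

throughput_mb_per_sec = events_per_sec * event_size_kb / 1024
shards_needed = throughput_mb_per_sec / shard_write_mb_per_sec

print(f"Throughput: ~{throughput_mb_per_sec:.0f} MB/sec")  # ~977 MB/sec
print(f"Kinesis shards needed: ~{shards_needed:.0f}")      # ~977, call it 1,000 with headroom
```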

Final Thoughts

What I like about this setup is how it combines the best of both clouds. Azure Event Grid provides fast, reliable event delivery. Azure Functions provides flexibility and transformation logic without managing servers. AWS Kinesis is a strong backbone for streaming and analytics.

When you understand how each part scales — not just in theory, but in numbers — you can build something that runs smoothly even when the data flow doubles or triples. This is not only about connecting Azure and AWS; it’s about making them work together as if they were one system.
