Azure Event Hub (Day 3 of 31)

Short Description 
Event Hub is an event aggregator built specifically for ingress scenarios. It fits use cases where millions of events arrive from different sources and we need to collect and process them. For the event collection part it can be the best solution.
All these benefits come at a price. Features such as dead letter queues, transaction support and delivery guarantees are not available. When we need them, we should use a Service Bus Queue or Topic instead (but they are not as fast as Event Hub).

Main Features 
Streaming Capability
Event Hub has streaming capability: we can consume events as an event stream and connect it to different big data systems.

AMQP support
Event Hub can be accessed through the REST API over HTTP. AMQP, one of the most widely used message queue protocols, is also supported.

Events are grouped into partitions. On each partition, events are appended to the end of the sequence in the order they arrive. Each partition is independent, with its own growth rate and retention policy. The number of partitions is directly influenced by how many consumers we have.
At this moment the number of partitions can be between 8 and 32, and it cannot be changed once the Event Hub is created.

Event Data
Represents the content of an event (the equivalent of the message body in Service Bus). Event data arrive and are stored in partitions. We can specify the lifetime of event data.

Event Consumer
Is an application that consumes events from a partition. Each partition should have only one event consumer at a time.

Consumer Group
Is the mechanism used to deliver the same content to multiple consumers. Each consumer group has its own offset on the Event Hub. For example, if we have 4 consumer groups, the same event data will be received by all 4 of them. It is similar to the Subscriptions concept from Service Bus Topics.
All event data sent to the Event Hub will be received by each consumer group.
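The idea that every consumer group keeps its own offset can be sketched with a small in-memory model. This is only an illustration: `PartitionLog` and the group names are hypothetical and are not part of the Event Hub SDK.

```csharp
using System;
using System.Collections.Generic;

// Toy in-memory model of one partition read by several consumer groups.
// Illustration only; PartitionLog is a hypothetical type, not an SDK class.
public class PartitionLog
{
    private readonly List<string> events = new List<string>();

    // One independent read position per consumer group name.
    private readonly Dictionary<string, int> offsets = new Dictionary<string, int>();

    // New events are always appended at the end of the sequence.
    public void Append(string eventData)
    {
        events.Add(eventData);
    }

    // Returns the next unread event for the given group,
    // or null when that group has caught up with the partition.
    public string ReadNext(string consumerGroup)
    {
        int offset;
        offsets.TryGetValue(consumerGroup, out offset); // stays 0 for a new group
        if (offset >= events.Count)
        {
            return null;
        }
        offsets[consumerGroup] = offset + 1;
        return events[offset];
    }
}
```

Reading through one group does not move the position of another group, which is why each group sees every event.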

Partition Offsets
Each consumer can set the stream offset on a partition. This value can be based on a timestamp or on an offset value. It is important to remember that this value is managed by the consumer.
It is recommended to use timestamps, because they are easier to manage.

Checkpoints
Can be used to ‘commit’ how far a consumer group has read from a partition. Using checkpoints we record how many events we were able to consume successfully. If something bad happens, we can use these checkpoints to continue processing event data from the last checkpoint.
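A minimal sketch of the checkpoint idea, assuming an in-memory partition and a hypothetical `CheckpointedReader` class (real checkpoints are persisted outside the process, which this sketch does not do):

```csharp
using System;
using System.Collections.Generic;

// Illustration only: a reader that can resume from its last checkpoint.
// CheckpointedReader is a hypothetical type, not an SDK class.
public class CheckpointedReader
{
    private readonly IList<string> partition;
    private int position;   // index of the next event to read
    private int checkpoint; // index of the first event not yet confirmed as processed

    public CheckpointedReader(IList<string> partition)
    {
        this.partition = partition;
    }

    // Returns the next event, or null at the end of the partition.
    public string Read()
    {
        return position < partition.Count ? partition[position++] : null;
    }

    // Confirms that everything read so far was processed successfully.
    public void Checkpoint()
    {
        checkpoint = position;
    }

    // Simulates a crash and restart: reading resumes at the last checkpoint.
    public void Restart()
    {
        position = checkpoint;
    }
}
```

Events read after the last checkpoint are delivered again after a restart, so processing has to tolerate duplicates.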

Event Publisher
Are the devices (senders) that send events to Event Hub.

Access to the Event Hub (for event publishers) is based on Shared Access Signature (SAS). Don’t forget that a token created for an event publisher should grant only send rights. The token can already contain the partition key; otherwise, we need to specify the partition key when we send data to the Event Hub.
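As a rough sketch, a SAS token for Service Bus and Event Hub is usually built by signing the URL-encoded resource URI and an expiry time with HMAC-SHA256. The class name `SasTokenSketch`, the key name and the resource URI used below are assumptions made for illustration; the key itself should belong to a rule that grants only send rights.

```csharp
using System;
using System.Globalization;
using System.Net;
using System.Security.Cryptography;
using System.Text;

// Illustration only: builds a SAS token string for a given resource URI.
// SasTokenSketch is a hypothetical helper, not an SDK class.
public static class SasTokenSketch
{
    public static string Create(string resourceUri, string keyName, string key, DateTime expiresUtc)
    {
        // Expiry is expressed as seconds since the Unix epoch.
        var epoch = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);
        long expiry = (long)(expiresUtc - epoch).TotalSeconds;

        string encodedUri = WebUtility.UrlEncode(resourceUri);
        string stringToSign = encodedUri + "\n" + expiry.ToString(CultureInfo.InvariantCulture);

        using (var hmac = new HMACSHA256(Encoding.UTF8.GetBytes(key)))
        {
            byte[] hash = hmac.ComputeHash(Encoding.UTF8.GetBytes(stringToSign));
            string signature = Convert.ToBase64String(hash);

            return string.Format(
                CultureInfo.InvariantCulture,
                "SharedAccessSignature sr={0}&sig={1}&se={2}&skn={3}",
                encodedUri, WebUtility.UrlEncode(signature), expiry, keyName);
        }
    }
}
```

A per-device token would be created for a publisher-specific URI, for example `https://contoso.servicebus.windows.net/foohub/publishers/device-9997` (a hypothetical namespace and device id), so that each device can later be blocked individually.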

Events Data Size
The maximum size of a single event data, or of a batch that contains multiple event data, is 256KB.
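One way to respect this limit on the sender side is to split outgoing payloads into batches that stay under 256KB. The sketch below counts only body bytes; real event data also carry properties and protocol overhead, so a production sender would need some safety margin. `BatchSplitter` is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;

// Illustration only: groups payloads into batches that fit the size limit.
// BatchSplitter is a hypothetical helper, not an SDK class.
public static class BatchSplitter
{
    public const int MaxBatchBytes = 256 * 1024;

    public static List<List<byte[]>> Split(IEnumerable<byte[]> payloads, int maxBatchBytes = MaxBatchBytes)
    {
        var batches = new List<List<byte[]>>();
        var current = new List<byte[]>();
        int currentSize = 0;

        foreach (var payload in payloads)
        {
            if (payload.Length > maxBatchBytes)
            {
                throw new ArgumentException("A single event exceeds the size limit.");
            }

            // Close the current batch when the next payload would not fit.
            if (current.Count > 0 && currentSize + payload.Length > maxBatchBytes)
            {
                batches.Add(current);
                current = new List<byte[]>();
                currentSize = 0;
            }

            current.Add(payload);
            currentSize += payload.Length;
        }

        if (current.Count > 0)
        {
            batches.Add(current);
        }
        return batches;
    }
}
```

With three payloads of 100KB each, the first two share a batch and the third goes into a second batch.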

The first connection over AMQP is more expensive, because it is a bidirectional socket on which a secure channel has to be established. Once created, it is less expensive to send event data (like a session, we don’t need to set up the secure channel for each new event data).
HTTP/S is less expensive for sending a single event data, but the secure channel has to be established for each event data.
AMQP works well when we send data to Event Hub at a constant rate. HTTP should be used when we send data only from time to time.

Partitions (Partition Keys) 
Are used to group events for event consumers. We can group events based on our own business model (device type, location and so on).
The partitions themselves are not relevant for event publishers; they only need to send the partition key that is used to group event data into partitions.
For consumer groups, partitions are very important, because the consumers in a consumer group read messages from specific partitions.

Throughput Capacity
Each Event Hub has Throughput Units that control how much event data it can process. At this moment one Throughput Unit is defined as:

  • 1MB or 1K events per second for ingress
  • 2MB per second for egress

From the management portal we have full control over how many Throughput Units we want to allocate.
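The limits above can be turned into a rough capacity estimate: the number of units is driven by whichever of the three limits is hit first. The helper below is a back-of-the-envelope sketch with hypothetical names, not an official sizing formula.

```csharp
using System;

// Illustration only: estimates Throughput Units from the limits listed above
// (1 MB/s or 1000 events/s ingress, 2 MB/s egress per unit).
// ThroughputEstimator is a hypothetical helper.
public static class ThroughputEstimator
{
    public static int RequiredUnits(
        double ingressMBPerSecond,
        double ingressEventsPerSecond,
        double egressMBPerSecond)
    {
        double byIngressSize = ingressMBPerSecond / 1.0;
        double byIngressCount = ingressEventsPerSecond / 1000.0;
        double byEgress = egressMBPerSecond / 2.0;

        // The most demanding dimension decides how many units are needed.
        double needed = Math.Max(byIngressSize, Math.Max(byIngressCount, byEgress));
        return Math.Max(1, (int)Math.Ceiling(needed));
    }
}
```

For example, 3,500 small events per second at 0.5MB per second of ingress and 1MB per second of egress are limited by the event count, so `RequiredUnits(0.5, 3500, 1)` returns 4.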

Black List
We can put a device (the access token of the device) on a black list. A device on the black list can no longer access the Event Hub (send event data).

Partition Key
It is used to distribute event data across partitions. All event data with the same partition key will be sent to the same partition. If we don’t specify a partition key, event data will be sent to partitions in a round-robin manner.
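The routing rule can be sketched as follows. The hash used here (an FNV-1a style hash over the key's characters) is an assumption made for illustration; the hash that Event Hub applies internally is not described in this post. `PartitionRouter` is a hypothetical type.

```csharp
using System;

// Illustration only: same key -> same partition, no key -> round robin.
// PartitionRouter is a hypothetical type, not an SDK class.
public class PartitionRouter
{
    private readonly int partitionCount;
    private int nextRoundRobin;

    public PartitionRouter(int partitionCount)
    {
        if (partitionCount <= 0)
        {
            throw new ArgumentOutOfRangeException("partitionCount");
        }
        this.partitionCount = partitionCount;
    }

    public int Route(string partitionKey)
    {
        if (partitionKey == null)
        {
            // No key: spread events evenly over all partitions.
            int partition = nextRoundRobin;
            nextRoundRobin = (nextRoundRobin + 1) % partitionCount;
            return partition;
        }

        // Stable hash, so the same key always maps to the same partition.
        // (string.GetHashCode is not guaranteed to be stable across processes.)
        uint hash = 2166136261;
        unchecked
        {
            foreach (char c in partitionKey)
            {
                hash ^= c;
                hash *= 16777619;
            }
        }
        return (int)(hash % (uint)partitionCount);
    }
}
```

Using the device id as the key keeps all events of one device in order on a single partition, while events without a key are balanced across partitions.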

This interface can be used with success when we want to define Event Consumers entities.

Message Retention
Messages are stored on Event Hub for a specific time interval. Once this interval expires, messages are removed automatically. Besides the automatic cleanup, a benefit of this mechanism is that we can process messages more than once by using the partition offset.

The maximum size of event data (256KB) can be seen as a limitation, but it is not. We are talking about a transport platform that needs to manage millions of messages per second, so it makes sense to work with small units.
The number of partitions (32) and of Service Bus brokered connections (1000) is limited, but we can request more.
Only one consumer per consumer group and partition. Is that okay? Yes. Why? We have the stream capability, and a stream cannot be split. The concept is different from a Service Bus Topic or Queue.
There is no support for sequencing, dead-lettering, transactions or strong delivery assurances.
The maximum number of consumers on a partition from a consumer group is 5. The recommended value is one.

Applicable Use Cases 
Below you can find 4 use cases where Event Hub can be used with success.

Telemetry data
If you need to collect telemetry data from devices, you can collect all of it through Event Hub. Event Hub can be the perfect channel for collecting data, and we can plug it into different big data ingestion systems.

Audit Information
Event Hub can be used to collect audit data from devices that are in the field. It is useful when we scale from 10k devices to 1M devices, or when we execute commands on devices and the audit volume increases drastically.

GPS Location
It can be the perfect channel for collecting the GPS location of devices. This is a use case where we can afford to lose 1 or 2 GPS positions from time to time.

Device Status
When we need to collect the device status at a specific time interval, Event Hub is a cheap and simple way to collect it.

Code Sample 
// Create event hub description (16 partitions)
EventHubDescription hubDescr = new EventHubDescription("foohub");
hubDescr.PartitionCount = 16;

// Create publisher
EventHubClient hubClient = EventHubClient.Create("foohub");

// Send event data
FooEventData fooData = new FooEventData()
{
    DeviceId = 9997,
    Location = "12345.242423"
};
// fooSerialized is the serializer for FooEventData, assumed to be defined elsewhere
EventData data = new EventData(fooData, fooSerialized)
{
    // Events from the same device end up on the same partition
    PartitionKey = fooData.DeviceId.ToString()
};
await hubClient.SendAsync(data);

// Create consumer for messages from last day
EventHubConsumerGroup defaultConsumerGroup = hubClient.GetDefaultConsumerGroup();
EventHubReceiver hubConsumer = await defaultConsumerGroup.CreateReceiverAsync(
        fooPartitionId,                 // partition to read from
        DateTime.UtcNow.AddDays(-1));   // starting point, in UTC
// Consume a message
var message = await hubConsumer.ReceiveAsync();

// Connect to an event processor using a consumer group
EventProcessorHost host = new EventProcessorHost(
    WorkerName, EventHubName, defaultConsumerGroup.GroupName, eventHubConnectionString, blobConnectionString);

Pros and Cons 


Pros

  • Log millions of events per second 
  • Simple authorization mechanism 
  • Time-based event buffering 
  • Elastic scale 
  • Pluggable adapters for other cloud services


Cons

  • Not all features from Service Bus Topic exist (but this is acceptable)
  • The size of the event data is limited to 256KB (the size is pretty okay)
  • Number of partitions is limited to maximum 32

When we calculate the price of Event Hub we should take into account the following components:

  • Outbound traffic
  • Throughput units
  • Ingress Events count
  • Event Data Storage (1 day is free, additional days cost extra)
  • Number of Service Bus brokered connections needed

In conclusion, I would say that Event Hub is the perfect solution for the IoT world and can be very useful when we need to manage millions of events (messages) per second.

