Azure Time Series Insights is a great tool for monitoring your systems in near-real time and getting useful insights into what is happening behind the scenes.
In this post, we will talk about the things we need to consider before preparing a demo with it. Some behaviors are perfectly normal when Time Series Insights runs in production, but they can be annoying when we prepare a demo or want to push mock data into it (especially when we do not understand the behavior).
Context
I had to deliver 2 presentations and 3 demos about Time Series Insights, and I learned these things the hard way, during nights before the demos that ran far too long. Based on that experience I made a checklist that I hope will be useful to others as well.
1. Ingress of mock events
One of the most important things to keep an eye on is the moment when we push mock data into Time Series Insights. In general, we want to push a few days' or weeks' worth of data in a short period.
There are two tiers available in Time Series Insights (S1 and S2), which allow us to push 720 events per minute and 7200 events per minute, respectively, on each Ingress Unit. This means that pushing 1M events in a short period is possible, but we will need to wait a while. More exactly, with a single Ingress Unit you'll need about 23 hours on S1 (1,000,000 / 720 ≈ 1,389 minutes), or about 2.3 hours on S2, until all the events are ingested.
You can improve the ingress time by increasing the number of Ingress Units from 1 to 5 or even 10. Do not forget that you pay for Ingress Units per day, not per hour or per minute.
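The waiting time is simply the number of events divided by the per-minute limit. Here is a minimal sketch of that calculation. It assumes the per-minute limits quoted in this post and that throughput scales linearly with the number of Ingress Units; IngressEstimator and its members are names made up for illustration, not part of any Azure SDK.

```csharp
using System;

// Hypothetical helper for this post; not part of any Azure SDK.
public static class IngressEstimator
{
    // Per-minute limits per Ingress Unit, as quoted in this post (assumption).
    public const int S1EventsPerMinute = 720;
    public const int S2EventsPerMinute = 7200;

    // Estimated time to ingest eventCount events, assuming that
    // throughput grows linearly with the number of Ingress Units.
    public static TimeSpan EstimateIngressTime(long eventCount, int eventsPerMinutePerUnit, int ingressUnits)
    {
        if (eventCount < 0) throw new ArgumentOutOfRangeException(nameof(eventCount));
        if (eventsPerMinutePerUnit < 1) throw new ArgumentOutOfRangeException(nameof(eventsPerMinutePerUnit));
        if (ingressUnits < 1) throw new ArgumentOutOfRangeException(nameof(ingressUnits));

        double eventsPerMinute = (double)eventsPerMinutePerUnit * ingressUnits;
        return TimeSpan.FromMinutes(eventCount / eventsPerMinute);
    }
}
```

Calling IngressEstimator.EstimateIngressTime(1000000, IngressEstimator.S1EventsPerMinute, 1) gives about 1,389 minutes (roughly 23 hours); with 10 Ingress Units the same call gives about 139 minutes.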
2. Number of mock events
When you prepare a demo, you can easily reach more than 1M events. When you push all these events into Time Series Insights, be aware that there is a limit of 1M events per day per Ingress Unit for S1 (10M events for S2).
During a demo, you will be disappointed to find out that only the first million events are available.
To avoid this, either start pushing data a few days in advance or increase the number of Ingress Units to more than one. I recommend the second option.
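Both options can be put into numbers. The sketch below estimates how many Ingress Units are needed to push a given number of events within a given number of days. It assumes the daily limits quoted in this post and an even spread of events across the days; IngressUnitPlanner is a hypothetical name used only for illustration.

```csharp
using System;

// Hypothetical helper for this post; not part of any Azure SDK.
public static class IngressUnitPlanner
{
    // Daily limits per Ingress Unit, as quoted in this post (assumption).
    public const long S1EventsPerDayPerUnit = 1000000;
    public const long S2EventsPerDayPerUnit = 10000000;

    // Ingress Units needed to push eventCount events over the given number
    // of days, assuming the events are spread evenly across those days.
    public static int UnitsNeeded(long eventCount, long eventsPerDayPerUnit, int days = 1)
    {
        if (eventCount < 0) throw new ArgumentOutOfRangeException(nameof(eventCount));
        if (eventsPerDayPerUnit < 1) throw new ArgumentOutOfRangeException(nameof(eventsPerDayPerUnit));
        if (days < 1) throw new ArgumentOutOfRangeException(nameof(days));

        // Integer "round up" division: first events per day, then units per day.
        long eventsPerDay = (eventCount + days - 1) / days;
        long units = (eventsPerDay + eventsPerDayPerUnit - 1) / eventsPerDayPerUnit;
        return (int)Math.Max(1, units);
    }
}
```

For example, 3.5M events on S1 need 4 Ingress Units if pushed in one day, but only 2 if you start two days earlier.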
3. Size of the events
In general, you do not care about the size of the events you push into a system. The size becomes relevant when there are throughput limits that take it into account.
Inside Azure Time Series Insights, the number of events you can push per day is limited per Ingress Unit to 1M for S1 or 10M for S2. Events are counted in blocks of 1KB. This means that an event of 0.5KB is counted as one event, but an event of 1.1KB is counted as 2 events.
Because of this, if you know that your events are around 1KB or larger, you should increase the number of Ingress Units a little to account for it.
In general, I randomly select 1000 events and calculate their average, maximum and minimum size, and the number of events that are larger than a round value like 1KB, 2KB and so on. This gives me a better estimate from a capacity perspective.
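The counting rule and the sampling step can be sketched as follows. This is only an illustration: it assumes that 1KB means 1024 bytes and that the size is measured on the UTF-8 encoded JSON, neither of which is confirmed here; EventSizeStats is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper for this post; not part of any Azure SDK.
public static class EventSizeStats
{
    // Assumption: events are counted in blocks of 1024 bytes.
    private const int BlockBytes = 1024;

    // How many events a single payload of the given size counts as.
    public static long CountedEvents(int sizeInBytes)
    {
        return Math.Max(1, (sizeInBytes + BlockBytes - 1) / BlockBytes);
    }

    // Total counted events for a sample of serialized (JSON) events.
    public static long TotalCountedEvents(IEnumerable<string> jsonEvents)
    {
        return jsonEvents.Sum(e => CountedEvents(Encoding.UTF8.GetByteCount(e)));
    }

    // Number of events in the sample that are larger than the threshold.
    public static int CountLargerThan(IEnumerable<string> jsonEvents, int thresholdBytes)
    {
        return jsonEvents.Count(e => Encoding.UTF8.GetByteCount(e) > thresholdBytes);
    }
}
```

Dividing TotalCountedEvents by the number of events in the sample gives a multiplier that you can apply to the expected event count before sizing the Ingress Units.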
4. Custom timestamp
By default, Time Series Insights treats the time when the event was enqueued in the event source (Event Hub or IoT Hub) as the moment when the event happened.
In production this might (or might not) be acceptable, but when you are preparing a demo you want to be able to play a little with time and use a field from the event as the event time. When you configure an event source in Time Series Insights this option is already there, but be aware of the time format.
The timestamp field must use a specific format. If you don't use the exact format, you'll have NO data inside Time Series Insights and, of course, no error. The format is “yyyy-MM-ddTHH:mm:ss.FFFFFFFK” and you should keep it exactly this way. Any other format will cause your events to be lost.
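Because a wrong format fails silently, it can help to format and check timestamps locally before pushing anything. The sketch below uses the standard .NET date formatting and parsing APIs with the format string from this post. TsiTimestamp is a hypothetical name, and the check only confirms that a string matches the pattern in .NET; it is not a guarantee about what the service itself accepts.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for this post; not part of any Azure SDK.
public static class TsiTimestamp
{
    // The format string quoted in this post.
    public const string Pattern = "yyyy-MM-ddTHH:mm:ss.FFFFFFFK";

    // Formats a DateTime with the pattern. Invariant culture avoids
    // locale-specific separators. Note that "K" emits "Z" for UTC values,
    // an offset for local values, and nothing for DateTimeKind.Unspecified.
    public static string ToEventTime(DateTime value)
    {
        return value.ToString(Pattern, CultureInfo.InvariantCulture);
    }

    // Local sanity check: does the text match the pattern exactly?
    public static bool MatchesPattern(string text)
    {
        DateTime parsed;
        return DateTime.TryParseExact(text, Pattern, CultureInfo.InvariantCulture,
            DateTimeStyles.RoundtripKind, out parsed);
    }
}
```

Running MatchesPattern over a handful of generated events before a demo is a cheap way to catch a format mistake that would otherwise only show up as an empty environment.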
Sample code
Below you can find some of the scripts I used to feed sample data into the service. You might find them useful.
The code below populates the Event Hub with live data. It can be used successfully during a demo.
```csharp
// Namespaces used by this snippet. EventHubClient and EventData are assumed to come
// from the WindowsAzure.ServiceBus NuGet package (Microsoft.ServiceBus.Messaging).
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;
using Microsoft.ServiceBus.Messaging;

private void LiveDeviceData()
{
    Random random = new Random();

    // "Run" appears several times so that most devices report a running state.
    List<string> statusList = new List<string>()
    {
        "Stopped", "Block", "Run", "Run", "Run", "Run", "Pause",
    };

    EventHubClient eventHubClient = EventHubClient.CreateFromConnectionString(@"Endpoint=sb://...");

    // Runs until the process is stopped; one event is sent per iteration.
    while (true)
    {
        int deviceId = random.Next(1, 20);
        DeviceStatus deviceStatus = new DeviceStatus()
        {
            Date = DateTime.Now,
            DeviceId = deviceId,
            SiteId = deviceId % 4 + 1,
            HealthLevel = random.Next(3, 10),
            NumberOfUnits = random.Next(20, 1000),
            // The upper bound of Random.Next is exclusive, so pass Count (not Count - 1);
            // otherwise the last status ("Pause") is never picked.
            Status = statusList[random.Next(statusList.Count)]
        };

        // The message body is the UTF-8 encoded JSON of the event.
        byte[] body = Encoding.UTF8.GetBytes(deviceStatus.ToEventData());
        eventHubClient.Send(new EventData(body));

        System.Console.Write('.');

        // Random pause between events so the data does not look artificial.
        Thread.Sleep(random.Next(10, 200));
    }
}
```
The function below pushes content to the Event Hub. Serialization is done by hand, not because there are no NuGet packages available for it, but because we wanted to show some issues that exist with the current implementation, where StringBuilder is the father of all serialization steps.
```csharp
// Namespaces used by this snippet. EventHubClient and EventData are assumed to come
// from the WindowsAzure.ServiceBus NuGet package (Microsoft.ServiceBus.Messaging).
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;
using Microsoft.ServiceBus.Messaging;

private static void PushToEventHub(List<DeviceStatus> deviceStatusList)
{
    EventHubClient eventHubClient = EventHubClient.CreateFromConnectionString(@"Endpoint=sb://...");
    const int batchSize = 100;

    // Walk the list from the last item to the first, one batch per Event Hub message.
    int toProcessCount = deviceStatusList.Count - 1;
    while (toProcessCount >= 0)
    {
        // Stop at index 0 so that a final, smaller batch does not run past the list.
        int lastIndex = Math.Max(toProcessCount - batchSize + 1, 0);

        // Each message body is a JSON array of events.
        StringBuilder sb = new StringBuilder();
        sb.Append("[");
        for (int i = toProcessCount; i >= lastIndex; i--)
        {
            // Separate the items with a comma (every item except the first one in the batch).
            if (i < toProcessCount)
            {
                sb.Append(",");
            }
            sb.Append(deviceStatusList[i].ToEventData());
        }
        sb.Append("]");

        byte[] body = Encoding.UTF8.GetBytes(sb.ToString());
        eventHubClient.Send(new EventData(body));

        toProcessCount = lastIndex - 1;
        System.Console.WriteLine($"Batch push: {toProcessCount + 1} events left");
    }
}

// ...

public class DeviceStatus
{
    public DateTime Date { get; set; }
    public int DeviceId { get; set; }
    public int SiteId { get; set; }
    public string Status { get; set; }
    public int HealthLevel { get; set; }
    public int NumberOfUnits { get; set; }

    public string ToEventData()
    {
        // Invariant culture keeps the ':' separators regardless of the machine's locale.
        string date = Date.ToString("yyyy-MM-ddTHH:mm:ss.FFFFFFFK", CultureInfo.InvariantCulture);
        string eventInJson = $"{{ \"Date\": \"{date}\", \"DeviceId\": \"{DeviceId}\",\"SiteId\": \"{SiteId}\",\"Status\": \"{Status}\",\"HealthLevel\": \"{HealthLevel}\",\"NumberOfUnits\": \"{NumberOfUnits}\" }}";
        return eventInJson;
    }
}
```
The last piece of code I want to share shows how you can populate the Event Hub for Time Series Insights with historical data.
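```csharp
// Namespaces used by this snippet.
using System;
using System.Collections.Generic;

// originalDeviceEndTime and oldDateDeviceBatchSize are fields defined elsewhere
// in the class (not shown here).
private static void PopulateWithOldDeviceData()
{
    DateTime currentDate = originalDeviceEndTime;
    Random random = new Random();
    List<DeviceStatus> bufferList = new List<DeviceStatus>();
    List<string> statusList = new List<string>()
    {
        "Stopped", "Block", "Run", "Run", "Run", "Run", "Pause",
    };

    // Events are generated from the newest to the oldest: each one is moved
    // a little further back in time than the previous one.
    for (int i = 0; i < oldDateDeviceBatchSize; i++)
    {
        currentDate = currentDate
            // The upper bound of Random.Next is exclusive: Next(-1, 1) returns -1 or 0.
            .AddSeconds(random.Next(-1, 1))
            .AddMilliseconds(random.Next(-1010, 0));

        int deviceId = random.Next(1, 20);
        DeviceStatus deviceStatus = new DeviceStatus()
        {
            Date = currentDate,
            DeviceId = deviceId,
            SiteId = deviceId % 4 + 1,
            HealthLevel = random.Next(3, 10),
            NumberOfUnits = random.Next(20, 1000),
            // Pass Count (not Count - 1), otherwise "Pause" is never picked.
            Status = statusList[random.Next(statusList.Count)]
        };
        bufferList.Add(deviceStatus);
    }

    // PushToEventHub walks the list from the end, so the oldest events are sent first.
    PushToEventHub(bufferList);
}
```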
Conclusion
Time Series Insights is a great and powerful service. You just need to keep a few small tips and tricks in mind when you prepare a demo for it. These kinds of things can block you for a few hours and can be extremely frustrating.