
Azure Tables Storage (Day 19 of 31)

List of all posts from this series: http://vunvulearadu.blogspot.ro/2014/11/azure-blog-post-marathon-is-ready-to.html

Short Description
Azure Table Service is part of Azure Storage and gives us the possibility to store large amounts of data in a structured way. It allows us to store collections of entities in so-called tables. If we look at this service from the NoSQL side, we could say that the so-called tables of Azure Tables are the collections of the NoSQL world.


Main Features 
Some of the features of Azure Tables are similar to those of Azure Blob Storage. This is because both services are built over the same infrastructure and have a common base.
Different entity schemas in the same table
We are allowed to store entities with different schemas in the same table. For example, we can store a ‘Car’ entity and a ‘House’ entity in the same table, as long as at the moment when we retrieve an entity we can detect its type.
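As a sketch of how this could look with the .NET storage client library: the entity classes and the `Kind` discriminator property below are our own convention for this example, not an SDK feature.

```csharp
using Microsoft.WindowsAzure.Storage.Table;

// Two entity types with different schemas, stored in the same table.
// The custom 'Kind' property lets us detect the entity type at read time.
public class CarEntity : TableEntity
{
    public CarEntity() { }
    public CarEntity(string owner, string plate)
        : base(owner, plate) { Kind = "Car"; }

    public string Kind { get; set; }
    public int HorsePower { get; set; }
}

public class HouseEntity : TableEntity
{
    public HouseEntity() { }
    public HouseEntity(string owner, string address)
        : base(owner, address) { Kind = "House"; }

    public string Kind { get; set; }
    public int Rooms { get; set; }
}
```

When we read such a table with `DynamicTableEntity`, inspecting `entity.Properties["Kind"].StringValue` tells us which concrete type to materialize.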
Size
The size of tables is unlimited. We can store large collections of entities in an Azure Table without any kind of problem (a table can grow to more than 1 TB). The maximum size limit is the same as for Azure Blob Storage – 500 TB per storage account (for now).
RESTful and OData
All access is made over a REST API that supports the OData standard. This means that we can query the storage very easily from any kind of device.
Table Count
There is no real limit on the number of tables that we can create under the same storage account. Theoretically, we can create as many tables as we want, as long as we have enough space.
Simple hierarchy
The hierarchy of the Azure Table database is very simple. Under a storage account we have 0..N tables. Each table can contain 0..M entities. Each entity can have from 3 to 255 properties (including the three system properties: PartitionKey, RowKey and Timestamp).
Partition Key
It is used to group similar entities within a table. The partition key is used by Azure Tables to spread a table across different nodes when the table becomes too big.
Row Key
The unique ID of an entity within a partition. In a table, we can have multiple entities with the same row key value as long as the partition key of each entity is different.
Entity Key (ID)
The unique ID of each entity is constructed from the partition key + row key. This key is unique per table and can be used to retrieve entities directly.
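Because the service is exposed over REST, the entity key maps directly to an address. A minimal sketch of how that address is built (the account name `myaccount` and the table `people` are placeholders):

```csharp
using System;

public class EntityKeyDemo
{
    // Builds the REST address of a single entity, identified by its
    // entity key (partition key + row key).
    public static string EntityUri(string account, string table,
                                   string partitionKey, string rowKey)
    {
        return string.Format(
            "https://{0}.table.core.windows.net/{1}(PartitionKey='{2}',RowKey='{3}')",
            account, table,
            Uri.EscapeDataString(partitionKey),
            Uri.EscapeDataString(rowKey));
    }

    public static void Main()
    {
        Console.WriteLine(EntityUri("myaccount", "people", "Smith", "Ben"));
        // https://myaccount.table.core.windows.net/people(PartitionKey='Smith',RowKey='Ben')
    }
}
```

With the .NET client library, the same point lookup is expressed as `TableOperation.Retrieve<T>(partitionKey, rowKey)`.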
Timestamp
This is the property that tells us when the entity was changed for the last time. It can be used with success to detect whether the entity value changed during a specific time period (ETag).
This value cannot be set manually. It is set automatically by the system, and any change made to this property by the client will be ignored.
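A sketch of how the ETag enables optimistic concurrency with the .NET client library (the table, keys and property names are placeholders):

```csharp
using System.Net;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

// Retrieve storage account from connection string
CloudStorageAccount account = CloudStorageAccount.Parse(
    CloudConfigurationManager.GetSetting("StorageConnectionString"));
CloudTable table = account.CreateCloudTableClient().GetTableReference("people");

// Read the entity; the result carries the ETag of the version we read.
TableOperation retrieve = TableOperation.Retrieve<DynamicTableEntity>("Smith", "Ben");
DynamicTableEntity entity = (DynamicTableEntity)table.Execute(retrieve).Result;

entity.Properties["Email"] = EntityProperty.GeneratePropertyForString("ben@contoso.com");
try
{
    // Replace sends the ETag as an If-Match header; the update only
    // succeeds if the entity was not modified since we read it.
    table.Execute(TableOperation.Replace(entity));
}
catch (StorageException ex)
{
    if (ex.RequestInformation.HttpStatusCode == (int)HttpStatusCode.PreconditionFailed)
    {
        // 412: somebody else changed the entity in the meantime.
    }
    else
    {
        throw;
    }
}
```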
Batch Support
We have the ability to execute a batch of operations against an Azure Table. You can have a maximum of 100 actions in a batch, and the changes from a batch can affect only one partition of a table.
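A minimal batch sketch with the .NET client library; the table name and the partition key are placeholders. Note that every entity in the batch shares the same partition key:

```csharp
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

// Retrieve storage account from connection string
CloudStorageAccount account = CloudStorageAccount.Parse(
    CloudConfigurationManager.GetSetting("StorageConnectionString"));
CloudTable table = account.CreateCloudTableClient().GetTableReference("logs");

// Insert up to 100 entities in one round trip; all of them must belong
// to the same partition ("logs-2014-11-19" here).
TableBatchOperation batch = new TableBatchOperation();
for (int i = 0; i < 100; i++)
{
    DynamicTableEntity entity =
        new DynamicTableEntity("logs-2014-11-19", "entry-" + i.ToString("D5"));
    entity.Properties["Message"] =
        EntityProperty.GeneratePropertyForString("log message " + i);
    batch.Insert(entity);
}
table.ExecuteBatch(batch);
```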
Supported Property Types
The range of types supported for properties is quite large, from int and string to bool or array of bytes. Below you can find the list of all supported types:

  • byte[]
  • bool
  • DateTime
  • Double
  • Guid
  • String
  • Int32
  • Int64
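A hypothetical entity class covering each supported type might look like this (the class and property names are illustrative, not part of the SDK):

```csharp
using System;
using Microsoft.WindowsAzure.Storage.Table;

// One property for each type supported by Azure Tables.
public class SampleEntity : TableEntity
{
    public byte[] Payload { get; set; }    // byte[]
    public bool IsActive { get; set; }     // bool
    public DateTime CreatedOn { get; set; }// DateTime
    public double Price { get; set; }      // Double
    public Guid ExternalId { get; set; }   // Guid
    public string Name { get; set; }       // String
    public int Count { get; set; }         // Int32
    public long TotalBytes { get; set; }   // Int64
}
```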

Security
There are different methods to manage the access to your table content:

  • Shared Access Signature – gives you full control to manage read, write and delete access at table level (and even over ranges of partition and row keys). In this way different users can have different access levels
  • ACL (stored access policies) – similar to Shared Access Signature, but it gives us a better management mechanism: for a specific SAS key we can manage the access rights from the backend without having to revoke it

Pagination Support
When we execute queries that retrieve large amounts of data, we can use the continuation token to iterate over all the entities returned by the query. This token contains 3 parts:

  • NextTableName
  • NextPartitionKey
  • NextRowKey
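With the .NET client library, the continuation token is handled through segmented queries. A sketch, reusing the `people` table from the post's code sample:

```csharp
using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

// Retrieve storage account from connection string
CloudStorageAccount account = CloudStorageAccount.Parse(
    CloudConfigurationManager.GetSetting("StorageConnectionString"));
CloudTable table = account.CreateCloudTableClient().GetTableReference("people");

// Iterate over a large result set segment by segment. The continuation
// token returned with each segment encodes where the next segment starts
// (NextTableName, NextPartitionKey, NextRowKey).
TableQuery<DynamicTableEntity> query = new TableQuery<DynamicTableEntity>();
TableContinuationToken token = null;
do
{
    TableQuerySegment<DynamicTableEntity> segment =
        table.ExecuteQuerySegmented(query, token);
    foreach (DynamicTableEntity entity in segment.Results)
    {
        Console.WriteLine(entity.RowKey);
    }
    token = segment.ContinuationToken; // null when there are no more pages
} while (token != null);
```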

Query
Azure Tables have query support. The queries support the following components:

  • $filter – used to filter entities based on client rules
  • $top – returns the first N entities
  • $select – returns only the properties that the client requests

$filter
Has support for the basic filter rules: equal, greater than, greater than or equal, less than, less than or equal, not equal. On top of this we have support for Boolean operators (and, or, not).
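Since the filter is just an OData expression, we can sketch how such an expression is built for the REST API. The helper names below are our own; the .NET client library provides `TableQuery.GenerateFilterCondition` and `TableQuery.CombineFilters` for the same job.

```csharp
using System;

public class ODataFilterDemo
{
    // 'eq' comparison; single quotes inside the value are escaped by doubling.
    public static string Eq(string property, string value)
    {
        return string.Format("{0} eq '{1}'", property, value.Replace("'", "''"));
    }

    // Combines two filter expressions with the Boolean 'and' operator.
    public static string And(string left, string right)
    {
        return string.Format("({0}) and ({1})", left, right);
    }

    public static void Main()
    {
        string filter = And(Eq("PartitionKey", "Smith"),
                            Eq("Email", "ben@contoso.com"));
        Console.WriteLine(filter);
        // (PartitionKey eq 'Smith') and (Email eq 'ben@contoso.com')
    }
}
```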
Redundancy
Azure Table storage gives us multiple options when we talk about redundancy. By default we have redundancy at data center level – this means that at any moment in time there are 3 copies of the same content (and you pay for only one). On top of this there are other redundancy options that I will describe below (plus the one that we already talked about):

  • LRS (Locally Redundant Storage) – Content is replicated 3 times in the same data center (a facility unit of a region)
  • ZRS (Zone Redundant Storage) – Content is replicated 3 times across two or three facilities, within a single region or across two regions
  • GRS (Geo Redundant Storage) – Content is replicated 6 times across 2 regions (3 times in the primary region and 3 times in a secondary region)
  • RA-GRS (Read Access Geo Redundant Storage) – Content is replicated in the same way as for GRS, but you also have read-only access in the secondary region. With GRS, even if the data exists in the secondary region, you cannot access it directly.

Limitations
  • Like other NoSQL storages, Azure Tables cannot handle complex joins, foreign keys or stored procedures. At this moment you have support for querying, but you should keep in mind that queries are fast only on the indexed properties (partition key and row key).
  • Each entity can have a maximum of 255 properties (including the 3 system properties). The maximum size of an entity is 1 MB.
  • A batch can contain a maximum of 100 actions and can be executed only over the same partition of a table, with a maximum payload size of 4 MB.
  • The length of a table name can be between 3 and 63 characters.
  • The maximum size of the partition key or row key is 1 KB (each of them).

Applicable Use Cases
Below you can find 3 use cases where I would use Azure Tables.
Logs
I would use Azure Tables to store large amounts of logs, split into different tables based on date and source. In this way the traceability and cleanup steps are pretty simple.
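A sketch of a possible key design for such a log table (the naming conventions are my own assumptions): partition by day, and use "inverted ticks" as row key so that the newest entries come back first, since Azure Tables sorts rows lexicographically by row key.

```csharp
using System;

public class LogKeyDemo
{
    // One partition per day keeps related log entries together.
    public static string PartitionKeyFor(DateTime timestamp)
    {
        return timestamp.ToString("yyyy-MM-dd");
    }

    // "Inverted ticks", zero-padded to a fixed width: a later timestamp
    // produces a lexicographically smaller row key, so the newest log
    // entries are returned first.
    public static string RowKeyFor(DateTime timestamp)
    {
        return (DateTime.MaxValue.Ticks - timestamp.Ticks).ToString("D19");
    }

    public static void Main()
    {
        DateTime older = new DateTime(2014, 11, 19, 10, 0, 0);
        DateTime newer = new DateTime(2014, 11, 19, 11, 0, 0);
        // The newer entry sorts before the older one:
        Console.WriteLine(string.CompareOrdinal(RowKeyFor(newer), RowKeyFor(older)) < 0); // True
    }
}
```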
Software update assignments
If we have a system that needs to store the software update assignments for each device, then we could think about using Azure Tables. We can store in a separate table the assignments for each system, and we can group the assignments of each system based on the application type using the partition key.
Content tags and descriptions
If we are using Azure Blobs, for example, to store our binary content, then we may need a place to store the list of tags for each piece of binary content, plus descriptions and other information. For this case we could use Azure Tables with success.

Code Sample 


// Retrieve storage account from connection string
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(
    CloudConfigurationManager.GetSetting("StorageConnectionString"));

// Create the table client
CloudTableClient tableClient = storageAccount.CreateCloudTableClient();

// Create the CloudTable that represents the "people" table.
CloudTable table = tableClient.GetTableReference("people");

// Define the query, and only select the Email property
TableQuery<DynamicTableEntity> projectionQuery = new TableQuery<DynamicTableEntity>().Select(new string[] { "Email" });

// Define an entity resolver to work with the entity after retrieval.
EntityResolver<string> resolver = (pk, rk, ts, props, etag) => props.ContainsKey("Email") ? props["Email"].StringValue : null;

foreach (string projectedEmail in table.ExecuteQuery(projectionQuery, resolver, null, null))
{
    Console.WriteLine(projectedEmail);
}

Source: http://azure.microsoft.com/en-us/documentation/articles/storage-dotnet-how-to-use-tables/
Pros and Cons 
Pros

  • Unlimited number of tables and entities
  • Very high maximum database size (the same as for Azure Storage)
  • Fast and easy to query
  • Querying over partition key and row key (plus timestamp) is very fast
  • Cheap

Cons

  • Not all properties are indexed (only the partition key and row key)
  • Limited number of properties per entity
  • Querying on properties that are not indexed requires bringing the entities from Azure to the client machine


Pricing
When you start to calculate the cost of Azure Table Storage you should take into account the following things:

  • Capacity (size)
  • Number of Transactions
  • Outbound traffic
  • Traffic between facilities (data centers)


Conclusion
Yes, Azure Tables are a good option if we need a NoSQL solution to store content in (key, value) format. If you need more than that, I recommend looking at DocumentDB. There are use cases where Azure Tables are the perfect solution for you.
Personal note: Take into account the number of transactions that you are executing, because it can affect the costs at the end of the month.
