
Azure Tables Storage (Day 19 of 31)

List of all posts from this series: http://vunvulearadu.blogspot.ro/2014/11/azure-blog-post-marathon-is-ready-to.html

Short Description
Azure Table Service is part of Azure Storage and gives us the possibility to store large amounts of data in a structured way. It allows us to store collections of entities in so-called tables. If we look at this service from the NoSQL side, we could say that the so-called tables of Azure Tables are the collections of the NoSQL world.


Main Features 
Some of the features of Azure Tables are similar to those of Azure Blob Storage. This is because both services are built over the same infrastructure and have a common base.
Different entity schemas in the same table
We are allowed to store entities with different schemas in the same table. For example, we can store a ‘Car’ entity and a ‘House’ entity in the same table, as long as at the moment when we retrieve an entity we can detect its type.
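As a sketch of how this could look with the .NET storage client library: the entity classes and the `Kind` discriminator property below are our own convention for this example, not an SDK feature.

```csharp
using Microsoft.WindowsAzure.Storage.Table;

// Two entity types with different schemas, stored in the same table.
// The custom 'Kind' property lets us detect the entity type at read time.
public class CarEntity : TableEntity
{
    public CarEntity() { }
    public CarEntity(string owner, string plate)
        : base(owner, plate) { Kind = "Car"; }

    public string Kind { get; set; }
    public int HorsePower { get; set; }
}

public class HouseEntity : TableEntity
{
    public HouseEntity() { }
    public HouseEntity(string owner, string address)
        : base(owner, address) { Kind = "House"; }

    public string Kind { get; set; }
    public int Rooms { get; set; }
}
```

When we read such a table with `DynamicTableEntity`, inspecting `entity.Properties["Kind"].StringValue` tells us which concrete type to materialize.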
Size
The size of tables is unlimited. We can store large collections of entities in an Azure Table without any kind of problem (a table can grow to more than 1 TB). The maximum size limit is the same as for Azure Blob Storage – 500 TB per storage account (for now).
RESTful and OData
All access is made over a REST API that supports the OData standard. This means that we can query the storage very easily from any kind of device.
Table Count
There is no real limit on the number of tables that we can create under the same storage account. Theoretically, we can create as many tables as we want, as long as we have enough space.
Simple hierarchy
The hierarchy of the Azure Table database is very simple. Under a storage account we have 0..N tables. Each table can contain 0..M entities. Each entity can have from 3 to 255 properties (including the three system properties: PartitionKey, RowKey and Timestamp).
Partition Key
It is used to group similar entities within a table. The partition key is used by Azure Tables to spread a table across different nodes when the table becomes too big.
Row Key
The unique ID of an entity within a partition. In a table, we can have multiple entities with the same row key value as long as the partition key of each entity is different.
Entity Key (ID)
The unique ID of each entity is constructed from the partition key + row key. This key is unique per table and can be used to retrieve entities directly.
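Because the service is exposed over REST, the entity key maps directly to an address. A minimal sketch of how that address is built (the account name `myaccount` and the table `people` are placeholders):

```csharp
using System;

public class EntityKeyDemo
{
    // Builds the REST address of a single entity, identified by its
    // entity key (partition key + row key).
    public static string EntityUri(string account, string table,
                                   string partitionKey, string rowKey)
    {
        return string.Format(
            "https://{0}.table.core.windows.net/{1}(PartitionKey='{2}',RowKey='{3}')",
            account, table,
            Uri.EscapeDataString(partitionKey),
            Uri.EscapeDataString(rowKey));
    }

    public static void Main()
    {
        Console.WriteLine(EntityUri("myaccount", "people", "Smith", "Ben"));
        // https://myaccount.table.core.windows.net/people(PartitionKey='Smith',RowKey='Ben')
    }
}
```

With the .NET client library, the same point lookup is expressed as `TableOperation.Retrieve<T>(partitionKey, rowKey)`.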
Timestamp
This is the property that tells us when the entity was changed for the last time. It can be used with success to detect whether the entity value changed during a specific time period (ETag).
This value cannot be set manually. It is set automatically by the system, and any change made to this property by the client will be ignored.
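A sketch of how the ETag enables optimistic concurrency with the .NET client library (the table, keys and property names are placeholders):

```csharp
using System.Net;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

// Retrieve storage account from connection string
CloudStorageAccount account = CloudStorageAccount.Parse(
    CloudConfigurationManager.GetSetting("StorageConnectionString"));
CloudTable table = account.CreateCloudTableClient().GetTableReference("people");

// Read the entity; the result carries the ETag of the version we read.
TableOperation retrieve = TableOperation.Retrieve<DynamicTableEntity>("Smith", "Ben");
DynamicTableEntity entity = (DynamicTableEntity)table.Execute(retrieve).Result;

entity.Properties["Email"] = EntityProperty.GeneratePropertyForString("ben@contoso.com");
try
{
    // Replace sends the ETag as an If-Match header; the update only
    // succeeds if the entity was not modified since we read it.
    table.Execute(TableOperation.Replace(entity));
}
catch (StorageException ex)
{
    if (ex.RequestInformation.HttpStatusCode == (int)HttpStatusCode.PreconditionFailed)
    {
        // 412: somebody else changed the entity in the meantime.
    }
    else
    {
        throw;
    }
}
```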
Batch Support
We have the ability to execute a batch of operations against an Azure Table. You can have a maximum of 100 actions in a batch, and the changes from a batch can affect only one partition of a table.
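A minimal batch sketch with the .NET client library; the table name and the partition key are placeholders. Note that every entity in the batch shares the same partition key:

```csharp
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

// Retrieve storage account from connection string
CloudStorageAccount account = CloudStorageAccount.Parse(
    CloudConfigurationManager.GetSetting("StorageConnectionString"));
CloudTable table = account.CreateCloudTableClient().GetTableReference("logs");

// Insert up to 100 entities in one round trip; all of them must belong
// to the same partition ("logs-2014-11-19" here).
TableBatchOperation batch = new TableBatchOperation();
for (int i = 0; i < 100; i++)
{
    DynamicTableEntity entity =
        new DynamicTableEntity("logs-2014-11-19", "entry-" + i.ToString("D5"));
    entity.Properties["Message"] =
        EntityProperty.GeneratePropertyForString("log message " + i);
    batch.Insert(entity);
}
table.ExecuteBatch(batch);
```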
Supported Property Types
The range of types supported for properties is quite large, from int and string to bool or array of bytes. Below you can find the list of all supported types:

  • byte[]
  • bool
  • DateTime
  • Double
  • Guid
  • String
  • Int32
  • Int64
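A hypothetical entity class covering each supported type might look like this (the class and property names are illustrative, not part of the SDK):

```csharp
using System;
using Microsoft.WindowsAzure.Storage.Table;

// One property for each type supported by Azure Tables.
public class SampleEntity : TableEntity
{
    public byte[] Payload { get; set; }    // byte[]
    public bool IsActive { get; set; }     // bool
    public DateTime CreatedOn { get; set; }// DateTime
    public double Price { get; set; }      // Double
    public Guid ExternalId { get; set; }   // Guid
    public string Name { get; set; }       // String
    public int Count { get; set; }         // Int32
    public long TotalBytes { get; set; }   // Int64
}
```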

Security
There are different methods to manage the access to your table content:

  • Shared Access Signature – gives you full control to manage read, write and delete access at table level (and even over ranges of partition and row keys). In this way different users can have different access levels
  • ACL (stored access policies) – similar to Shared Access Signature, but it gives us a better management mechanism: for a specific SAS key we can manage the access rights from the backend without having to revoke it

Pagination Support
When we execute queries that retrieve large amounts of data, we can use the continuation token to iterate over all the entities returned by the query. This token contains 3 parts:

  • NextTableName
  • NextPartitionKey
  • NextRowKey
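With the .NET client library, the continuation token is handled through segmented queries. A sketch, reusing the `people` table from the post's code sample:

```csharp
using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

// Retrieve storage account from connection string
CloudStorageAccount account = CloudStorageAccount.Parse(
    CloudConfigurationManager.GetSetting("StorageConnectionString"));
CloudTable table = account.CreateCloudTableClient().GetTableReference("people");

// Iterate over a large result set segment by segment. The continuation
// token returned with each segment encodes where the next segment starts
// (NextTableName, NextPartitionKey, NextRowKey).
TableQuery<DynamicTableEntity> query = new TableQuery<DynamicTableEntity>();
TableContinuationToken token = null;
do
{
    TableQuerySegment<DynamicTableEntity> segment =
        table.ExecuteQuerySegmented(query, token);
    foreach (DynamicTableEntity entity in segment.Results)
    {
        Console.WriteLine(entity.RowKey);
    }
    token = segment.ContinuationToken; // null when there are no more pages
} while (token != null);
```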

Query
Azure Tables have query support. The queries support the following components:

  • $filter – used to filter entities based on client rules
  • $top – returns the first N entities
  • $select – returns only the properties that the client requests

$filter
Has support for the basic filter rules: equal, greater than, greater than or equal, less than, less than or equal, not equal. On top of this we have support for Boolean operators (and, or, not).
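Since the filter is just an OData expression, we can sketch how such an expression is built for the REST API. The helper names below are our own; the .NET client library provides `TableQuery.GenerateFilterCondition` and `TableQuery.CombineFilters` for the same job.

```csharp
using System;

public class ODataFilterDemo
{
    // 'eq' comparison; single quotes inside the value are escaped by doubling.
    public static string Eq(string property, string value)
    {
        return string.Format("{0} eq '{1}'", property, value.Replace("'", "''"));
    }

    // Combines two filter expressions with the Boolean 'and' operator.
    public static string And(string left, string right)
    {
        return string.Format("({0}) and ({1})", left, right);
    }

    public static void Main()
    {
        string filter = And(Eq("PartitionKey", "Smith"),
                            Eq("Email", "ben@contoso.com"));
        Console.WriteLine(filter);
        // (PartitionKey eq 'Smith') and (Email eq 'ben@contoso.com')
    }
}
```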
Redundancy
Azure Table storage gives us multiple options when we talk about redundancy. By default we have redundancy at data center level – this means that at any moment in time there are 3 copies of the same content (and you pay for only one). On top of this there are other redundancy options that I will describe below (plus the one that we already talked about):

  • LRS (Locally Redundant Storage) – Content is replicated 3 times in the same data center (a facility unit of a region)
  • ZRS (Zone Redundant Storage) – Content is replicated 3 times across two or three facilities, within a single region or across two regions
  • GRS (Geo Redundant Storage) – Content is replicated 6 times across 2 regions (3 times in the primary region and 3 times in a secondary region)
  • RA-GRS (Read Access Geo Redundant Storage) – Content is replicated in the same way as for GRS, but you also have read-only access in the secondary region. With GRS, even if the data exists in the secondary region, you cannot access it directly.

Limitations
  • Like other NoSQL storages, Azure Tables cannot handle complex joins, foreign keys or stored procedures. At this moment you have support for querying, but you should keep in mind that queries are fast only on the indexed properties (partition key and row key).
  • Each entity can have a maximum of 255 properties (including the 3 system properties). The maximum size of an entity is 1 MB.
  • A batch can contain a maximum of 100 actions and can be executed only over the same partition of a table, with a maximum payload size of 4 MB.
  • The length of a table name can be between 3 and 63 characters.
  • The maximum size of the partition key or row key is 1 KB (each of them).

Applicable Use Cases
Below you can find 3 use cases where I would use Azure Tables.
Logs
I would use Azure Tables to store large amounts of logs, split into different tables based on date and source. In this way the traceability and cleanup steps are pretty simple.
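A sketch of a possible key design for such a log table (the naming conventions are my own assumptions): partition by day, and use "inverted ticks" as row key so that the newest entries come back first, since Azure Tables sorts rows lexicographically by row key.

```csharp
using System;

public class LogKeyDemo
{
    // One partition per day keeps related log entries together.
    public static string PartitionKeyFor(DateTime timestamp)
    {
        return timestamp.ToString("yyyy-MM-dd");
    }

    // "Inverted ticks", zero-padded to a fixed width: a later timestamp
    // produces a lexicographically smaller row key, so the newest log
    // entries are returned first.
    public static string RowKeyFor(DateTime timestamp)
    {
        return (DateTime.MaxValue.Ticks - timestamp.Ticks).ToString("D19");
    }

    public static void Main()
    {
        DateTime older = new DateTime(2014, 11, 19, 10, 0, 0);
        DateTime newer = new DateTime(2014, 11, 19, 11, 0, 0);
        // The newer entry sorts before the older one:
        Console.WriteLine(string.CompareOrdinal(RowKeyFor(newer), RowKeyFor(older)) < 0); // True
    }
}
```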
Software update assignments
If we have a system that needs to store the software update assignments for each device, then we could think about using Azure Tables. We can store in a separate table the assignments for each system, and we can group the assignments of each system based on the application type using the partition key.
Content tags and descriptions
If we are using Azure Blobs, for example, to store our binary content, then we may need a place to store the list of tags for each piece of binary content, plus descriptions and other information. For this case we could use Azure Tables with success.

Code Sample 


// Retrieve storage account from connection string
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(
    CloudConfigurationManager.GetSetting("StorageConnectionString"));

// Create the table client
CloudTableClient tableClient = storageAccount.CreateCloudTableClient();

// Create the CloudTable that represents the "people" table.
CloudTable table = tableClient.GetTableReference("people");

// Define the query, and only select the Email property
TableQuery<DynamicTableEntity> projectionQuery = new TableQuery<DynamicTableEntity>().Select(new string[] { "Email" });

// Define an entity resolver to work with the entity after retrieval.
EntityResolver<string> resolver = (pk, rk, ts, props, etag) => props.ContainsKey("Email") ? props["Email"].StringValue : null;

foreach (string projectedEmail in table.ExecuteQuery(projectionQuery, resolver, null, null))
{
    Console.WriteLine(projectedEmail);
}

Source: http://azure.microsoft.com/en-us/documentation/articles/storage-dotnet-how-to-use-tables/
Pros and Cons 
Pros

  • Unlimited number of tables and entities
  • Very high maximum database size (the same as for Azure Storage)
  • Fast and easy to query
  • Querying over partition key and row key (plus timestamp) is very fast
  • Cheap

Cons

  • Not all properties are indexed (only the partition key and row key)
  • Limited number of properties per entity
  • Querying on properties that are not indexed requires bringing the entities from Azure to the client machine


Pricing
When you start to calculate the cost of Azure Table Storage you should take into account the following things:

  • Capacity (size)
  • Number of Transactions
  • Outbound traffic
  • Traffic between facilities (data centers)


Conclusion
Yes, Azure Tables are a good option if we need a NoSQL solution to store content in (key, value) format. If you need more than that, I recommend looking at DocumentDB. There are use cases where Azure Tables are the perfect solution for you.
Personal note: Take into account the number of transactions that you are executing, because it can affect the costs at the end of the month.
