
How we can remove millions of entities from a Windows Azure Table (part 2)

In one of my latest posts I talked about how we can remove a large number of entities from a Windows Azure table by deleting the table itself.
This solution works, but it comes with an odd problem. Deleting a table only marks it for deletion: clients can no longer access it, and Windows Azure removes the data in the background. With a table that holds millions of entities this becomes a real issue, because until Windows Azure finishes the deletion we cannot create a new table with the same name.
This can be a blocker. We don’t want to wait 4 hours for our table to be deleted before we can recreate it.
And here is a big BUT. We can be smart and use ListTables. Basically, we create the table with a name like [MyTableName]+[Guid] and use the ListTables method, with [MyTableName] as the prefix, to retrieve the actual name of our table.
Using this solution, we can mark our old table for deletion and create a new table with a similar name where we can store our entities. In this way we can remove all the items from a table in only a few seconds. Even though this is only a “virtual” clean, it works great and can speed up our applications.
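The following code can be used to delete a table:
// Needs: using System.Linq; using Microsoft.WindowsAzure.StorageClient; (1.x storage client library)
CloudTableClient tableStorage = new CloudTableClient([tableUri],[credential]);
// ListTables treats its argument as a name prefix, so this finds [MyTableName]+[Guid].
string tableName = tableStorage.ListTables([MyTableName]).FirstOrDefault();
if (tableName != null)
{
  // Only marks the table for deletion; Windows Azure removes the data in the background.
  tableStorage.DeleteTableIfExist(tableName);
}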
The following code can be used to retrieve the name of our table and to create a new table if it doesn’t exist:
string tableName = tableStorage.ListTables([MyTableName]).FirstOrDefault();
if (tableName == null)
{
  // Table names must be alphanumeric, so format the Guid without hyphens ("N").
  tableName = [MyTableName] + Guid.NewGuid().ToString("N");
  tableStorage.CreateTableIfNotExist(tableName);
}
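The delete and recreate steps can also be combined into a single rotation helper. What follows is only a minimal sketch: ITableStore, InMemoryTableStore and TableRotation are hypothetical names introduced here for illustration. The interface mirrors just the three calls used in this post, and the in-memory stand-in lets the sketch run without a storage account.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical abstraction over the three table operations used in this post.
public interface ITableStore
{
    IEnumerable<string> ListTables(string prefix);
    bool DeleteTableIfExist(string tableName);
    bool CreateTableIfNotExist(string tableName);
}

// In-memory stand-in, for illustration only; not part of any Azure SDK.
public sealed class InMemoryTableStore : ITableStore
{
    private readonly HashSet<string> tables =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);

    // Returns a snapshot of the table names that start with the prefix.
    public IEnumerable<string> ListTables(string prefix)
    {
        return tables
            .Where(t => t.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            .ToList();
    }

    public bool DeleteTableIfExist(string tableName)
    {
        return tables.Remove(tableName);
    }

    public bool CreateTableIfNotExist(string tableName)
    {
        return tables.Add(tableName);
    }
}

public static class TableRotation
{
    // Drops the current table (if any) and creates a fresh one with the same prefix.
    public static string Rotate(ITableStore store, string baseName)
    {
        string current = store.ListTables(baseName).FirstOrDefault();
        if (current != null)
        {
            store.DeleteTableIfExist(current);
        }

        // "N" format: 32 hex digits without hyphens, because table names must be alphanumeric.
        string fresh = baseName + Guid.NewGuid().ToString("N");
        store.CreateTableIfNotExist(fresh);
        return fresh;
    }
}
```

Because ListTables matches by prefix, a base name that is itself a prefix of another table's name (for example, hypothetical tables Updates and UpdatesArchive) could return the wrong table. The sketch does not guard against that.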

The only downside of this solution is that for a few seconds there is no table that clients can access. If we can live with this and handle the gap on the client side, then we can use the solution without any problem.
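One way a client might cope with those few seconds is to retry the name lookup a handful of times before giving up. This is only a sketch under assumptions: TableNameResolver and WaitForTable are hypothetical names, and the number of attempts and the delay are values the caller has to choose.

```csharp
using System;
using System.Threading;

public static class TableNameResolver
{
    // Calls lookup until it returns a non-null table name or the attempts run out.
    // Returns null if no table showed up in time.
    public static string WaitForTable(Func<string> lookup, int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            string name = lookup();
            if (name != null)
            {
                return name;
            }

            // Sleep only between attempts, not after the last one.
            if (attempt < maxAttempts)
            {
                Thread.Sleep(delay);
            }
        }

        return null;
    }
}
```

The lookup delegate would wrap the same call used earlier in this post, tableStorage.ListTables([MyTableName]).FirstOrDefault(), so the client picks up the new table name as soon as the table has been recreated.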

Comments

  1. The question is: why/when would you need to do something like this in a real application? :)
    (sure, for testing/debugging scenarios it might be useful, but other than that..)

    Replies
    1. There are times when you need to clean a table very fast. For example, a table that stores the assignment between updates and the systems where each update is available.

