Performance

Jun 7, 2013 at 11:55 AM
Hi there,

I am curious as to how you achieve such excellent performance on Azure - what, in short, is the secret? I know I can download the source code and try to find out myself, but I am very keen to learn more about how this great performance is achieved.

When working with SQL Azure "normally", the performance for large OLTP datasets is terrible - and when doing large inserts/updates, I am often met by the throttling logic, which locks me out for a random amount of time.

Thanks in advance for sharing the secrets within - and do keep up the good work :-)
Coordinator
Jun 7, 2013 at 3:34 PM
Hi,

This might be a great question to ask on the Azure forums: http://social.msdn.microsoft.com/Forums/en-US/ssdsgetstarted/threads

That said, in regard to SQLAzureMW: the first part of the program just creates databases, tables, stored procedures ... etc. That in and of itself is pretty fast. The problem comes in when doing inserts, updates, reads, and deletes. In the case of SQLAzureMW, I am using BCP, and it is about the fastest way to upload data. On top of that, I kick off parallel BCP threads to upload data. Inside the SQLAzureMW.exe.config file, you can set the max number of parallel threads for the upload process. This is how I get most of my speed. The default is 8 threads. Note that I have uploaded some large databases, and it has taken many days to upload tables with several hundred million records.
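To make that concrete, the rough shape of it looks something like the sketch below. This is not SQLAzureMW's actual code - the server name, credentials, table list and file names are placeholders only:

// Illustrative sketch only: upload several tables in parallel by running
// multiple bcp.exe processes at once.
using System.Diagnostics;
using System.Threading.Tasks;

class ParallelBcpSketch
{
    static void Main()
    {
        var tables = new[] { "dbo.Orders", "dbo.OrderLines", "dbo.Customers" };

        // Cap the degree of parallelism, similar in spirit to the max-thread
        // setting in SQLAzureMW.exe.config (default of 8 mentioned above).
        var options = new ParallelOptions { MaxDegreeOfParallelism = 8 };

        Parallel.ForEach(tables, options, table =>
        {
            // -n = native format, -b = rows per committed batch
            string args = string.Format(
                "MyDb.{0} in {1}.dat -n -b 10000 -S myserver.database.windows.net -U myuser -P mypassword",
                table, table.Replace('.', '_'));

            var psi = new ProcessStartInfo("bcp.exe", args) { UseShellExecute = false };
            using (var bcp = Process.Start(psi))
            {
                bcp.WaitForExit();
            }
        });
    }
}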

Part of the problem is getting large amounts of data from an on-premise database to an Azure data center. One thing that you can do here is compress your data, upload it to an IaaS VM (that has SQL Server 2008 R2 SP1 or greater), and then run SQLAzureMW on that VM to upload the data to SQL Database. Note that just because you do the import process from an IaaS VM in the same data center as your SQL Database does not mean that you won't be throttled. You will still be throttled, so it is very important that you have retry logic built into your code.

Throttling by SQL Database (Azure) has always been kind of a pain, and what I did was implement something similar to the MSDN Transient Fault Handling Application Block http://msdn.microsoft.com/en-us/library/hh680934(v=PandP.50).aspx. It really simplifies the retry process.
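If you roll your own instead of using the application block, the core of it is just retry with backoff around the transient error codes. A minimal sketch of the idea - the error numbers listed are only a small, illustrative subset of what SQL Database can return:

// Minimal retry-with-backoff sketch; the Transient Fault Handling Application
// Block gives you this (and much more) out of the box.
using System;
using System.Data.SqlClient;
using System.Threading;

static class RetrySketch
{
    // Illustrative subset of transient/throttling error numbers (e.g. 40501 = throttled).
    static readonly int[] TransientErrors = { 40501, 40197, 40613, 10053, 10054, 10060 };

    public static void ExecuteWithRetry(string connectionString, string commandText, int maxAttempts = 5)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                using (var conn = new SqlConnection(connectionString))
                using (var cmd = new SqlCommand(commandText, conn))
                {
                    conn.Open();
                    cmd.ExecuteNonQuery();
                    return;
                }
            }
            catch (SqlException ex)
            {
                bool transient = Array.IndexOf(TransientErrors, ex.Number) >= 0;
                if (!transient || attempt >= maxAttempts)
                    throw;

                // Exponential backoff: wait 2, 4, 8, 16 ... seconds before retrying.
                Thread.Sleep(TimeSpan.FromSeconds(Math.Pow(2, attempt)));
            }
        }
    }
}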

Now, I have my own application running in Windows Azure which talks to a SQL Database. I have around 20 tables, and one table has close to 600 million rows. In my case, I can have around 57,000,000 records uploaded per day. These records come from multiple sources, and I use a Windows Azure worker role as my upload point to avoid having 1000s of clients trying to stuff my database with records directly. In my worker role, I batch up around 10,000 to 15,000 records into a single merge process (see: http://msdn.microsoft.com/en-us/library/bb510625.aspx) using Table Valued Parameters (see: http://msdn.microsoft.com/en-us/library/bb510489.aspx). If you have not played with these, check them out. I love them and use them all the time.
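Roughly, the batching looks like the sketch below - the table type, stored procedure and column names are made up for the example, not the actual ones from my application:

// Sketch of the batching idea: send 10,000-15,000 rows at a time as a Table-Valued
// Parameter to a stored procedure that MERGEs them into the target table.
//
// Assumed T-SQL on the server side (roughly):
//   CREATE TYPE dbo.RecordTableType AS TABLE (Id BIGINT, Payload NVARCHAR(400));
//   CREATE PROCEDURE dbo.MergeRecords @Records dbo.RecordTableType READONLY AS
//     MERGE dbo.Records AS t
//     USING @Records AS s ON t.Id = s.Id
//     WHEN MATCHED THEN UPDATE SET t.Payload = s.Payload
//     WHEN NOT MATCHED THEN INSERT (Id, Payload) VALUES (s.Id, s.Payload);
using System.Data;
using System.Data.SqlClient;

static class TvpMergeSketch
{
    public static void MergeBatch(string connectionString, DataTable batch)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("dbo.MergeRecords", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;

            // Pass the whole batch as a single structured (TVP) parameter.
            SqlParameter p = cmd.Parameters.AddWithValue("@Records", batch);
            p.SqlDbType = SqlDbType.Structured;
            p.TypeName = "dbo.RecordTableType";

            conn.Open();
            cmd.ExecuteNonQuery();
        }
    }
}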

The key thing is to have a very well tuned database: make sure that you are hitting indexes, and avoid deadlocks, heavy use of temp tables and long transactions. I do 99% of my tuning on a local SQL Server database (looking at reads, writes, CPU ...). Once I have a rock solid db, then I migrate it.

Anyway, this is just my $0.02 worth. I would love to hear from others on their words of wisdom :-).

I hope this helps,
George
Jun 10, 2013 at 8:58 AM
Hi George,

Thank you so much for your feedback - it is highly appreciated. I will investigate BCP some more, as well as your hint regarding Table Valued Parameters, since the performance of the current implementation under heavy OLTP usage is simply awful. Again, mind you, this is not the case on-premise.

To give you some insight, I use the ThreadPool to execute SQL scripts against Azure. Very soon this is detected by the Azure throttling service, and in the worst case the SQL Server instance is shut down for approximately 60 seconds before being started again. This is not the case with your BCP approach, which is why I find it very interesting .. and it is also at least 10-20 times faster than the other approach (probably because of the many transactions, throttling and the like).

Lastly - in regard to the Transient Fault Handling, this is already implemented, as it was one of my first .. hmm .. WTF experiences; it was never an issue with on-premise servers. Cloud services (Azure, Amazon, Google, etc.) are here to stay, and I am happy for all the new knowledge gathered by using them.

Once again - thank you for taking the time to give valued feedback - we can all learn from you, George.
Coordinator
Jun 10, 2013 at 2:52 PM
I hear you on performance. One big difference between on-premise SQL Server and Windows Azure SQL Database is that SQL Database is a shared service, and thus everybody is sharing the same resources on a commodity class processor. IMO, SQL Database is great for mom-and-pop style databases, but if an enterprise wants to take a large database from an enterprise class server to the cloud (onto a commodity class server), they will be quite disappointed in the performance. The general rule of thumb used to be to throw hardware at the database (a bigger and bigger box) and scale out the web farm with commodity processors. With SQL Database, it is really going to be all about scaling out as well.

When I wrote my first application using Azure, I ran into a couple of your WTF moments (too funny). I wrote a very high I/O application, and within the first minute I ran into the throttling issue. I did a lot of tuning, and my final application had 4 medium-size Azure worker roles (running 10 threads each) going against one Azure SQL Database. I had specific tasks that needed to be completed, and one task would take 7 days to run. I was so worried about paying for space that as soon as my task was done, I would save my data off to my local machine and then do my BI on the data. To make a long story short, when I purged my Azure SQL Database, I also purged my local machine and lost 7 days of data. Freaked me out, and I needed the data BAD. I could not afford to run another 7 days to get my data, so I spent all night and redesigned my architecture to scale out the database. My final configuration was 125 Azure worker roles with 64 Azure SQL Databases. I reran my task and it completed in 4 hours.

Anyway, long story short: if you need high I/O, look at scaling out. If you can't scale out and SQL Database can't handle your database workload, you might want to look at using Azure IaaS and hosting your own SQL Server.

Also, something else that I do: in some cases I know I am always running a set of queries back to back. What I do is put the queries into a single batch and return multiple result sets. This cuts back on the number of connections I need to make to the database server. I also rely on caching. The key point here is to reduce the number of trips to the database as much as possible. Note that even if you cache for one or two minutes, on a very high I/O application, you are going to save yourself a lot of trouble by avoiding a lot of performance issues.
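As a rough illustration of the batching (table and column names are placeholders), the pattern is simply one command with several statements and NextResult() between the result sets:

// Two back-to-back queries sent as one batch and read with NextResult();
// this costs one round trip instead of two.
using System.Data.SqlClient;

static class MultiResultSetSketch
{
    public static void LoadDashboard(string connectionString)
    {
        const string batch =
            "SELECT Id, Name FROM dbo.Customers WHERE IsActive = 1; " +
            "SELECT TOP (50) Id, CustomerId, Total FROM dbo.Orders ORDER BY Id DESC;";

        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(batch, conn))
        {
            conn.Open();
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    // ... consume the customer rows ...
                }

                reader.NextResult();   // move to the second result set (the orders)
                while (reader.Read())
                {
                    // ... consume the order rows ...
                }
            }
        }
    }
}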

I hope this helps.

Regards,
George
Jun 10, 2013 at 3:54 PM
Edited Jun 10, 2013 at 3:58 PM
ghuey wrote:
Note that even if you cache for one or two minutes, on a very high I/O application, you are going to save yourself a lot of trouble by avoiding a lot of performance issues.
We have built an API following REST principles, so hopefully the consuming clients will honor the caching headers sent; otherwise, several of the service definitions are cached in memory. Like yourself, I soon realized that SQL Azure is not for everybody - and for enterprise-like companies it is a no-go. Long story short, though, we were transferred to a new hosting provider that was just plain unprofessional, and the performance was close to Azure's (and here we are talking on-premise in the new hosting center). I then advocated for a pure cloud solution, which has the benefit of being the future - now.

We must simply learn and adapt from all these challenges .. but if anyone asks me about SQL Azure, I will hold to my claim that it is not for everyone - your mom-and-pop-style example was brilliant :-)

I have considered reserved SQL Azure, but it is too expensive and a little too obvious a move for Microsoft to make. Federations also got a look, but the idea of chunking your database across several instances is a bit of a pain in the b***.

Again, thanks for your input - I cannot say how much it is appreciated.

Cheers,

Michael

PS
Right now I am investigating whether SqlBulkCopy is the way to go - from what I can gather on the web, it is the engine behind BCP? If it is the right way to go, do you have a hint on how to transfer data when there are parent-child relations? To elaborate, it's the classic model: PK [id] int, FK [parentId] int, where I do not know the [parentId] when initializing a reader or DataTable. Don't spend time on it - just if you have a hint or two off the top of your head - or a link.
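For reference, the rough shape of what I am experimenting with looks something like this (table, column and connection details are placeholders only):

// Minimal SqlBulkCopy sketch - just the basic API, nothing clever about
// the parent-child problem yet.
using System.Data;
using System.Data.SqlClient;

static class BulkCopySketch
{
    public static void Upload(string connectionString, DataTable rows)
    {
        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "dbo.ChildTable";
            bulk.BatchSize = 10000;      // rows per batch sent to the server
            bulk.BulkCopyTimeout = 0;    // no timeout for long-running uploads

            // Map source columns to destination columns explicitly.
            bulk.ColumnMappings.Add("id", "id");
            bulk.ColumnMappings.Add("parentId", "parentId");

            bulk.WriteToServer(rows);
        }
    }
}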

PPS
Respect for your work - I have used your program quite a lot over the last year.