Missing millions of rows.

Feb 20, 2013 at 8:36 AM
I have a table with over 5 million rows. When I run through the scripting portion, it sometimes errors halfway through, like this:

1000 rows successfully bulk-copied to host-file. Total received: 2721000
SQLState = S1000, NativeError = 0
Error = [Microsoft][SQL Server Native Client 11.0]Protocol error in TDS stream
SQLState = 08S01, NativeError = -2147467259
Error = [Microsoft][SQL Server Native Client 11.0]The connection is no longer usable because the server response for a previously executed statement was incorrectly formatted.
2721614 rows copied.

Sometimes there are no errors; it just ends at some point, often nowhere near 5 million.

Help!
Coordinator
Feb 20, 2013 at 8:06 PM
Hi,

That looks like you are getting throttled. I will run a couple of tests on my end and see if I can't figure out why SQLAzureMW is giving up. One of the things I would have you look at is the SQLAzureMW.exe.config file. Edit the file and look for this section:
<add key="TimeBetweenChunks" value="15000"/>                            <!-- Number of miliseconds to sleep between BCP chuncks -->
<add key="ChunkSize" value="500000"/>                                           <!-- Number of records to process per BCP itteration (Uploading to Azure) -->
<add key="BCPArgsIn" value="{0} in {1} -E -n -b 10000 -a 16384"/>  <!-- BCP in command parameters -->
<add key="BCPArgsOut" value="&quot;{0}&quot; out {1} -E -n"/>     <!-- BCP out command parameters -->
You will note three key areas:

TimeBetweenChunks -- 15000 translates to 15 seconds.
ChunkSize -- This is the total number of records you want BCP to upload in one try. In this case, 500,000 records.
BCPArgsIn -- This is the batch size before BCP sends a commit. In this case, BCP commits every 10,000 records.

You can try changing the chunk size to something smaller (say 250000) and changing the batch (-b) size to 1000 and see what happens. Also, check BCPArgsIn and BCPArgsOut and see if you have the -n parameter set; I might have left it at -w. -n is more efficient.
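To make that concrete, here is a rough sketch of what the edited section might look like with those suggested values (just an illustration; your file may have slightly different defaults):

<add key="TimeBetweenChunks" value="15000"/>                     <!-- Milliseconds to sleep between BCP chunks -->
<add key="ChunkSize" value="250000"/>                            <!-- Smaller chunk: 250,000 records per BCP iteration -->
<add key="BCPArgsIn" value="{0} in {1} -E -n -b 1000 -a 16384"/> <!-- -b 1000: commit every 1,000 records; -n: native format -->
<add key="BCPArgsOut" value="&quot;{0}&quot; out {1} -E -n"/>    <!-- Keep native (-n) format on the way out -->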

Also, something that will save you time: after the export process, click on the Script tab and hit the Save button to save your scripts. That way, if the import process dies a most horrible death, you don't have to go through the export process again; just select the option to run the TSQL without analyzing and pick the script that you saved.

So far, in my tests, I have uploaded a little over a million records (from Nashville to the Chicago datacenter). Out of curiosity, which Azure datacenter are you running out of?

Regards,
George
Feb 20, 2013 at 9:15 PM
Good call on the config values. I noticed those, but thought I might not need to change anything. Will look at that.

You might be right about throttling. I moved the tool to a VM on AWS because I wondered if being closer to the source might make a difference. Turns out I was right, and it worked.

We are in the W1B datacenter.
Feb 20, 2013 at 9:27 PM
Edited Feb 20, 2013 at 9:29 PM
One strange thing: we have one table with 4032 rows, and all but one make it over when we export. Any suggestions on what to look for to figure out why this one row didn't make it?

The results show it should have copied!

2/20/2013 9:32:18 AM --> Copied 4032 of 4032 (100%)
Coordinator
Feb 21, 2013 at 5:40 PM
Well, good question and I don't have a good answer. I take it that you did a SELECT COUNT(*) on your source table and verified that you actually had 4032 records, and then did a SELECT COUNT(*) on your SQL Database (Azure) table and verified that you only had 4031 records.

One thing you can do is create a dummy database on your local machine and create the table in it. Since you have the BCP files on your machine, you should be able to run BCP from a command prompt, upload the 4032 records into the dummy database, and then do a SELECT COUNT(*) and see what you get. If you find that you have 4032 records (like you would expect), you could run the same simple test against SQL Database and see what you get.

If the data is not top secret, you could share the data and the CREATE TABLE script with me and I can do some testing on my machine and SQL Database.
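Roughly what I have in mind (just a sketch; the database name TestDB, table name dbo.MyTable, and data file MyTable.dat are placeholders, so substitute your own names):

REM Load the native-format BCP data file into the local dummy database
bcp TestDB.dbo.MyTable in MyTable.dat -E -n -S localhost -T

-- Then verify the row count (in SSMS or sqlcmd)
SELECT COUNT(*) FROM dbo.MyTable;   -- you would expect 4032

The -E and -n switches match what SQLAzureMW uses in BCPArgsIn; -S and -T just point bcp at your local server with a trusted connection. If that gives you 4032 locally but 4031 against SQL Database, then we know the data file is fine and the problem is on the upload side.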

Regards,
George