Friday, August 18, 2006

bcp BULK COPY

I don't usually like the all caps thing but "bulk copy" seems to me to deserve a more grandiose introduction. BULK COPY just seems to be more fitting to me. Imagine being able to copy everything out of your database with a simple little script. Think of the years you have spent amassing the gigabytes of information in your database. And here we come with a little command line tool that allows us to suck all of that data right out. I think that deserves all caps.

BULK COPY or bcp as it is known (and will be known for the rest of this article because that whole typing in caps thing is a pain) doesn't ever seem to get the love and respect it deserves. I think there are a couple of factors. First it is a command line utility and that just seems wrong to most people. As Microsoft has always made the database interaction available via a GUI, the idea of interaction with the db from the command line is foreign to most and down right scary to others. I like to count this as one of those rare times in technology where being older has it advantages. When I started working with DBMS' there were no GUI's. It was all command line, so bcp came very naturally to me and any DBA working in that era. We either worked on DBMS' that didn't provide a GUI or provided one that was so sore and painful to use that it was easier to chisel the commands into a stone tablet.

So my first order of business is to encourage you to not be afraid of the command line. I encourage you to use the osql utility provided by Microsoft to connect to you database and move around a little. It isn't as pretty, but it is fast and for short answers or quick lookups, it is much faster then opening the whole MS SQL interface. If you need help or direction using the osql utility, drop me a line or post a question and I will be happy to help.

So back to the topic at hand, bcp. The thing to keep in mind about the bcp utility is that it works only on a table by table basis. You can only bcp out or in one table at a time. This is extremely useful as most of the time you will use bcp to copy large amounts of data from your larger tables to a file to be used any number of ways. One of the most popular uses of bcp is copying large amounts of data from a production system to a development system. Bcp is a good choice for this as it is fast and very unobtrusive. Also because the bcp command only copies you don't run the risk of accidentally moving or deleting the production data. If you mess up a bcp command the worse thing that generally happens is you don't get the data that you thought you were going to get because either the bcp command failed or you got data from the wrong table. Both of these are easily addressed and cause no harm to the production data. ***Notice: I am not condoning or encouraging you to go and get production data with out the ok of the people that are responsible for the production data. *** (Personally I could care less, but professionally I have to pretend that I do).

The bpc command itself is very simple and looks something like this:
bcp .. [in/out] "" login and options.

So in the real world it might look like this:
bcp mikedb..table1 out "c:\bcpFiles\table1.bcp" -SmikeDBMSserver -Umike -Ppassword

The result of that command will be a file called table1.bcp in my bcpFiles folder on my c: drive. Pretty simple eh. So now if I want to copy the data back in, I simply change the out to an in and away we go. Now of course the table has to exist and it has to be the same table with the same attributes and all of that good stuff but again, that is why it is called BULK COPY.

OK so what if you want to copy all of the contents of all of the tables in a db. Why would you want to do that? Well again, there is the production to development reason but also there are times when a database is corrupt or in a state that won't allow a backup or a restore and is for the most part useless. How are you going to save your db. How about using bcp to get out all of the data you can, rebuilding the database (hopefully on better hardware) and then bcp'ing back in the data. (Of course you have to be careful of the evil identity column, but that is another post).

Here is a handy dandy script I use that will allow you to build the command to run a bcp out for all of your tables. It will allow you to save them into save them into a file that you can then run as a batch command.

--This will give a result set of the bcp commands which you can copy to a .bat file and run.
select 'bcp ' + db_name() + '..' + name + ' out "c:\bcpFiles\' + name + '.bcp" -S' + @@servername + ' -N -T' from sysobjects where xtype = 'U'

So the first thing that should jump out at you is that this is a query. Indeed it is and needs to be run as such. The output is what will be captured in the file and run as the batch file. The output would look like:

bcp mikedb..table1 out "c:\bcpFiles\table1.bcp" -SmikeDBMSserver -N -T
bcp mikedb..table2 out "c:\bcpFiles\table2.bcp" -SmikeDBMSserver -N -T
......
bcp mikedb..table(n) out "c:\bcpFiles\table(n).bcp" -SmikeDBMSserver -N -T

where -N= keep non-text native and -T= trusted connection. There are many options available all of which are listed by simply typing bcp at the command line.

So there you have it. A simple introduction to bcp and how it might be of use to you. There are many ways to use bcp and I touched on just one. Again, as I don't like to type and regurgitating information for the sake of watching myself type seems like a waste of time. Here is a great link that can help further your understanding and use of the bcp command line utility. I think their summary hits the nail on the head.

A DBA today may use DTS, Database Restore, and/or attaching and detaching databases to copy data from one server to another. All these tools will get the job done. However, if you want to copy large amounts of SQL Server data quickly, BCP is still one of the best tools available. Once you get past the initial learning curve, you will find that BCP offers a highly efficient way to copy data from one database to another.

http://www.databasejournal.com/features/mssql/article.php/3391761



-mike

2 comments:

Anonymous said...

so how about some handy dandy info on osql? seems interesting, I've never used it before.

wait, let me googel...

mhh, can set environmental variables to remember servers and login names

what default rows and columns should I use in my command line prompt for best use, 80 columns doesn't seem to be enough

Anonymous said...

OK, OK... If I gave the impression that the command line was better, I didn't mean to. I am just saying you don't have to be afraid of it and it does have some value.