20100529

How to Plan SQL Server Database Files

One of the most important tasks you can do in SQL Server is to setup your data and log files.  Not getting these files setup correctly can be one of the biggest causes for production problems whether it’s disk contention, space usage, or something else.  And honestly this level of planning is often overlooked and by the time the problem is discovered the application and its users have already suffered.  So let’s get into some good discussion about how to setup your database files.

Log files

We’re going to start with logs first because they’re the easiest and probably the ones you’ll touch most frequently.  And we need to talk about placing your log files on disk first.  In general, you’ll want to place your log files on a different physical partition than your data files.  This is for 2 reasons.  The first is disk contention.  Every transaction has to write something to the log file before it can write it to the data file and if both files are on the same disk, then the disk arm has to work twice has hard because it has to jump over here to write the log and then jump over there to write to the table.  Putting a log file on its own disk is also better for the performance of the log because logs write sequentially so if the log file is on its own disk the disk arm has very little moving to do to get to the next place it needs to write so it’s much faster.  The second reason you want data and log files on separate disks is for recovery.  If the data partition fails you’ll want the log on a separate set of disks so you can still recover the logs and roll the last transactions forward so you don’t lose any data.

Several log files

It’s a common misconception that you will get a performance gain by using several log files.  This is something we see quite often and it simply isn’t true.  Log files are written sequentially which means that each log file is filled up before the next one is written to.  So if you have 4 log files (Log1, Log2, Log3, Log4), SQL Server 2008 will fill up Log1, then fill up Log2, etc.  This is different from the way data files behave and we’ll discuss that in a minute.  The only reason to have multiple log files on multiple partitions is for space.  You may need more disk than a single partition can provide.  Oh, and putting multiple log files on a single partition is just dumb.  It gains you nothing.

Data files

Now let’s talk about where you’ll place your data files.  As we’ve already said it’s a good idea to place them on a different physical partition from your log files. And we know we already said that, but experience has shown us that we could fill an entire page with that exact advice and a good portion of you still won’t grasp the importance of it.  So we’re really going to hammer that point home in this article.
So while your data files are separated from your log files you may feel free to have multiple data files on multiple physical partitions to take advantage of performance gains.  Now, we have to say again that these have to be physical partitions because logical partitions are still on the same physical disk and it’s the disk arm contention you’re trying to avoid.  Data files, unlike log files, use what’s known as an equal fill algorithm.  This means that all of the files are filled equally as much as possible.  So let’s say you have 4 data files (Data1, Data2, Data3, Data4).  When you write to the database, SQL Server 2008 will write to these files in a round robin fashion and they should grow at more or less the same rate.  You can also place tables inside specific files so you have a good level of control over how you split up the I/O workload in your database.  And while the number of files you need to optimize your workload is a subject that’s up for debate in the community, for most systems it won’t make that much difference.  However, it is a good idea that no matter how many data files you decide to have, to make them all the same size.  This goes back to that equal fill algorithm we were talking about before.

File growth

Another good topic is file growth.  This is a mistake that many beginners make.  They accept the defaults for file growth and that’s just asking for trouble.  We’re going to give you some advice on how to set your file growth and you can adjust it to suit your needs, but at least you’ll know the arguments.
  1. The best thing to do is to set both your data and your log files as large as you ever want them to be.  Depending on your version of SQL Server, it can be very expensive to grow files so setting their size ahead of time can alleviate performance problems before they even start.  So if you’ve got a single partition dedicated to your log file, then go ahead and make your log the same size as your partition.  If it’s dedicated then you’ve got nothing to lose.  And the same goes for your data files.  If they’re on dedicated partitions (and they should be), then you’ve got nothing to lose.
  2. The next best thing is to set your files to autogrow by fairly large predictable increments.  The default autogrowth is 1MB for data files and 10% for log files.  I always grow data files by at least 1GB and often times even more.  If you’re going to suffer the expense of growing a file you don’t want to do it 5 times a day, so make it worth your while.  And 10% is too unpredictable.  As the log grows the 10% marker is going to get bigger too so you’ll actually be growing your log more and more each time.
  3. Set all your data files to grow at the same rate, and all of your log files to grow at the same rate.  Your data files don’t have to grow at the same rate as your log files, but they should grow at the same rate as each other.

No comments:

Post a Comment