4.1. System Sizing Considerations

We're sure that all readers of this book are hoping for a simple chart that will specify what kind of server machine they should buy. Sorry, it's not going to happen: The problem is too difficult to reduce to a simple matrix or tool. The best we can do is describe the different options and parameters, and the kinds of operations that will push you to require bigger and more expensive hardware.

There are four main factors that will push your project to more expensive hardware: data volumes, usage complexity, number of simultaneous users, and system availability requirements. These factors are illustrated in Figure 4.2, and discussed in the following sections.

4.1.1. Calculating Data Volumes

The first and most obvious characteristic of your DW/BI system that will affect your hardware purchases is data volumes. By the time you've finished your logical model and initial data profiling, you should have enough information to estimate how big your DW/BI system is going to be. Later in the project your DBAs will calculate database sizes in great detail, but for now you can just think about fact table row counts. Unless you have a monster dimension of 50–100 million rows, dimension sizes are insignificant.

For starters, just figure out what order of magnitude number of rows you'll have for your initial historical load of fact data. Multiply that number by 100 bytes, which is our generous rule-of-thumb for fact row sizes. You can get more precise if you ...

Get The Microsoft® Data Warehouse Toolkit: With SQL Server™ 2005 and the Microsoft® Business Intelligence Toolset now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.