The SQL Team blog just released a new blog on the new feature Columnstore Indexes in the new release of SQL Server 2011 (Denali). This is a new feature that organize indexed data column-wise instead of row-wise.
- Compression. Most general-purpose relational database management systems, including SQL Server, store data in row-wise fashion. This organization is sometimes called a row store. Both heaps and B-trees are row stores because they store the values from each column in a given row contiguously. When you want to find all the values associated with a row, having the data stored together on one page is very efficient. Storing data by rows is less ideal for compressing the data. Most compression algorithms exploit the similarities of a group of values. The values from different columns usually are not very similar. When data is stored row-wise, the number of rows per page is relatively few, so the opportunities to exploit similarity among values are limited. A column store organizes data in column-wise fashion. Data from a single column are stored contiguously. Usually there is repetition and similarity among values within a column. The column store organization allows compression algorithms to exploit that similarity.
- Fetching only needed columns. When data is stored column-wise, each column can be accessed independently of the other columns. If a query touches only a subset of the columns in a table, IO is reduced. Data warehouse fact tables are often wide as well long. Typical queries touch only 10 – 15% of the columns. That means a column store can reduce IO by 85 – 90%, a huge speedup in systems that are often IO bound, meaning the query speed is limited by the speed at which needed data can be transferred from disk into memory.
This will be interesting to try out when I have installed my Denali copy on a WMware instans.