Columnstore Indexing at SQL Saturday #104

SQL Server 2012 will give us more index options by adding the Columnstore index. This post is an overview of this new feature, and it’s also a prelude to the presentation I’m delivering on Columnstore indexing January 7, 2011 at SQL Saturday #104 in Colorado Springs, CO.

My presentation, Using Columnstore Indexes in SQL Server 2012, will include demos that cover the important things to know about Columnstore indexing. If you haven’t heard much about this type of index structure, here are a few things to know:

–          Intended for Data Warehousing, Reporting and Ad-hoc Queries
–          Does not use B-Tree structure like row indexing
–          Can only have 1 per table
–          Does not allow table to be updated
–          Best built with columns that have non-unique, repetitive data
–          More advantage is gained from bigger data
–          Column order doesn’t matter
–          Maximum benefit gained with queries that aggregate data
–          Don’t use for single-row type queries
–          Memory-intensive to build and execute

It’s Fast

Why build a Columnstore index? It can be very fast to query, much faster than the typical row-based indexing we are used to. They’ll make reporting faster, and they can be part of the solution for the elusive ad-hoc query challenge that power business users want.

Creating a Columnstore

A Columnstore can and probably should contain every column in the table with a few exceptions.  Unique data columns in some cases should be avoided.  Also, creating a Columnstore index requires a lot of memory, so you can reduce the stress on your system by not including every column.

Using the Index

The SQL Server optimizer will decide if it can benefit from using the index just like it does with other indexes. You’ll be able to check the execution plan to see if SQL Server decided to use it.

Here’s a preview of the type of comparisons, analysis and best practices I’ll be presenting at SQL Saturday #104. I’ll focus on comparing this new type of index with row-based indexes and I’ll help you determine when the right time to use it is.

This query uses the AdventureWorksDW and is a common type of query when using a dimensional model. It aggregates data from a fact table while joining to dimension tables:

SELECT dp.EnglishProductName, dp.ProductLine, dd.CalendarQuarter,
sum(SalesAmount) as TotalSales
FROM FactInternetSales f
inner join DimProduct dp on dp.ProductKey = f.ProductKey
inner join DimDate dd on dd.DateKey = f.OrderDateKey
where dd.CalendarYear = 2004
group by dp.EnglishProductName, dp.ProductLine, dd.CalendarQuarter
order by dp.EnglishProductName, dp.ProductLine, dd.CalendarQuarter

This is the I/O and CPU usage:

CPU time = 200 ms,  elapsed time = 493 ms.
Table ‘DimProduct’. Scan count 0, logical reads 516, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘FactInternetSales’. Scan count 1, logical reads 2062, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘DimDate’. Scan count 1, logical reads 66, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

And the execution plan which is showing us that SQL Server is doing a clustered index scan on the fact table:

Columnstore Query

I’ll run the same query after adding a Columnstore index that includes every column in the FactInternetSales table.  Below is an abbreviated version of the DDL:

CREATE NONCLUSTERED COLUMNSTORE INDEX CX_FACT_INTERNET_SALES ON dbo.FactInternetSales
(
ProductKey,
OrderDateKey,
DueDateKey,

)

Executing the same query produces the following results:

CPU time = 80 ms,  elapsed time = 343 ms.
Table ‘DimProduct’. Scan count 0, logical reads 516, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘FactInternetSales’. Scan count 1, logical reads 64, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘DimDate’. Scan count 1, logical reads 66, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

As you can see from the execution plan, the query optimizer used the Columnstore index this time.  The CPU utilization was reduced from 200ms to 80ms and the logical reads went from 2062 to 64 on the FactInternetSales table. The disclaimer is that this is not a scientific test, but it’s a quick preview of what Columnstore indexes can do.

SQL Saturday #104

This is a snippet of what you’ll see if you attend my presentation in Colorado Springs. I’ll use larger tables to demonstrate how to get the most out of Columnstore indexes, how to use them with a partial set of columns and with partitioning, and I’ll compare them to their equivalent row-based queries.

Advertisements

One response

  1. […] Click here for a blog I wrote earlier last week with more details about my session. I’ll see you in the Springs! Advertisement LD_AddCustomAttr("AdOpt", "1"); LD_AddCustomAttr("Origin", "other"); LD_AddCustomAttr("theme_bg", "ffffff"); LD_AddCustomAttr("theme_text", "656565"); LD_AddCustomAttr("theme_link", "707070"); LD_AddCustomAttr("theme_border", "c9d9e6"); LD_AddCustomAttr("theme_url", "516f86"); LD_AddCustomAttr("LangId", "1"); LD_AddCustomAttr("Autotag", "technology"); LD_AddCustomAttr("Tag", "columnstore-indexing"); LD_AddCustomAttr("Tag", "sql-saturday-104"); LD_AddCustomAttr("Tag", "sql-server-2012"); LD_AddSlot("wpcom_below_post"); LD_GetBids(); Like this:LikeBe the first to like this post. « SQL Saturday #104 has BI, Prizes and a Ski Trip […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: