Sunday, December 27, 2009

Bisnis Pulsa Elektrik - Harga Murah Bersahabat

"Selamat Datang di Website NataTronik | Natatronik Reloads"

Type your summary here

Type rest of the post here

Saturday, December 13, 2008

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns By Jeff Moden, 2008/08/19

Do you get stuck to find the way converting rows to collumns ?? so.. this is the sollutions :

Introduction:


(This article is dedicated to a good friend and fellow T-SQL warrior, Katrina Wright. We've fought and won many battles together.)


I looked for a definition of what a "Cross Tab" actually is and, after a slight modification, couldn't find a better one than what's in SQL Server 2000 Books Online...


"Sometimes it is necessary to rotate results so that [the data in] columns are presented horizontally and [the data in] rows are presented vertically. This is known as creating a PivotTable®, creating a cross-tab report, or rotating data."




In other words, you can use a Cross Tab or Pivot to convert or transpose information from rows to columns either for reporting or to convert some special long skinny tables known as EAV's or NVP's into a more typical form data.


The purpose of this article is to provide an introduction to Cross Tabs and Pivots and how they can be used to "rotate" data...


Before you say anything...


The reason I'm writing a series of articles on the simple concept of Cross Tabs and Pivots is because of the recent number of requests for this type of information on the SQL Server Central forums... there was a while when not a day went by when two or three such requests were posted each day.


Also, yes, I aware that a lot of this type of "formatting" should be done in the GUI, reporting tool, or maybe even a Spreadsheet. I'm also aware that using EAV/NVP tables isn't considered to be a "best practice". But, like I said about the number of recent number of posts, folks get forced into a corner by their bosses and, if they have to do such a thing, I thought they could use a little help.


Notes of Interest:


I wrote all of the example code and data using Temp Tables just to be safe. Sure, I could have used Table Variables, but they don't really allow for people to do partial runs and they don't all people to look and see what's in the Table Variable after each section. Also, some of the data we'll end up using is a wee bit bigger than what I would normally use a table variable for.


Last but not least, I currently only have SQL Server 2000 and 2005 installed. I indicate which rev each section of code will run on in parenthesis. I'm pretty sure that most of this will work on 2008 and that a good portion of the code for Cross Tabs will also work on 7... but I don't have access to either which means I haven't tested it.


Also, for your convenience, all of the code has been attached in the "Resources" section near the end of the article.



Ok... let's get started...


A simple introduction to Cross Tabs:


The Cross Tab Report example from Books Online is very simple and easy to understand. I've shamelessly borrowed from it to explain this first shot at a Cross Tab.


The Test Data


Basically, the table and data looks like this...



--===== Sample data #1 (#SomeTable1)
--===== Create a test table and some data
CREATE TABLE #SomeTable1
(
Year SMALLINT,
Quarter TINYINT,
Amount DECIMAL(2,1)
)
GO
INSERT INTO #SomeTable1
(Year, Quarter, Amount)
SELECT 2006, 1, 1.1 UNION ALL
SELECT 2006, 2, 1.2 UNION ALL
SELECT 2006, 3, 1.3 UNION ALL
SELECT 2006, 4, 1.4 UNION ALL
SELECT 2007, 1, 2.1 UNION ALL
SELECT 2007, 2, 2.2 UNION ALL
SELECT 2007, 3, 2.3 UNION ALL
SELECT 2007, 4, 2.4 UNION ALL
SELECT 2008, 1, 1.5 UNION ALL
SELECT 2008, 3, 2.3 UNION ALL
SELECT 2008, 4, 1.9
GO




Every row in the code above is unique in that each row contains ALL the information for a given quarter of a given year. Unique data is NOT a requirement for doing Cross Tabs... it just happens to be the condition that the data is in. Also, notice that the 2nd quarter for 2008 is missing.


The goal is to make the data look more like what you would find in a spreadsheet... 1 row for each year with the amounts laid out in columns for each quarter with a grand total for the year. Kind of like this...


... and, notice, we've plugged in a "0" for the missing 2nd quarter of 2008.


Year   1st Qtr 2nd Qtr 3rd Qtr 4th Qtr Total 
------ ------- ------- ------- ------- -----
2006 1.1 1.2 1.3 1.4 5.0
2007 2.1 2.2 2.3 2.4 9.0
2008 1.5 0.0 2.3 1.9 5.7


The KEY to Cross Tabs!


Let's start out with the most obvious... we want a Total for each year. This isn't required for Cross Tabs, but it will help demonstrate what the key to making a Cross Tab is.



To make the Total, we need to use the SUM aggregate and a GROUP BY... like this...


--===== Simple sum/total for each year
SELECT Year,
SUM(Amount) AS Total
FROM #SomeTable1
GROUP BY Year
ORDER BY Year


And, that returns the following...


Year   Total                                    
------ ----------------------------------------
2006 5.0
2007 9.0
2008 5.7


Not so difficult and really nothing new there. So, how do we "pivot" the data for the Quarter?


Let's do this by the numbers...



  1. How many quarters are there per year? Correct, 4.

  2. How many columns do we need to show the 4 quarters per year? Correct, 4.

  3. How many times do we need the Quarter column to appear in the SELECT list to make it show up 4 times per year? Correct, 4.

  4. Now, look at the total column... it gives the GRAND total for each year. What would we have to do to get it to give us, say, the total just for the first quarter for each year? Correct... we need a CASE statement inside the SUM.



Number 4 above is the KEY to doing this Cross Tab... It should be a SUM and it MUST have a CASE to identify the quarter even though each quarter only has 1 value. Yes, if each quarter had more than 1 value, this would still work! If any given quarter is missing, a zero will be substituted.


To emphasize, each column for each quarter is just like the Total column, but it has a CASE statement to trap info only for the correct data for each quarter's column. Here's the code...


--===== Each quarter is just like the total except it has a CASE
-- statement to isolate the amount for each quarter.
SELECT Year,
SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END) AS [1st Qtr],
SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END) AS [2nd Qtr],
SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END) AS [3rd Qtr],
SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END) AS [4th Qtr],
SUM(Amount) AS Total
FROM #SomeTable1
GROUP BY Year




... and that gives us the following result in the text mode (modified so it will fit here)...


Year   1st Qtr 2nd Qtr 3rd Qtr 4th Qtr Total
------ ------- ------- ------- ------- -----
2006 1.1 1.2 1.3 1.4 5.0
2007 2.1 2.2 2.3 2.4 9.0
2008 1.5 .0 2.3 1.9 5.7

Also notice... because there is only one value for each quarter, we could have gotten away with using MAX instead of SUM. Go ahead... try it. We'll use a similar method for normalizing an EAV table in the future.


For most applications, that's good enough. If it's supposed to represent the final output, we might want to make it a little prettier. The STR function inherently right justifies, so we can use that to make the output a little prettier. Please, no hate mail here! I'll be one of the first that formatting of this nature is supposed to be done in the GUI!


--===== We can use the STR function to right justify data and make it prettier.
-- Note that this should really be done by the GUI or Reporting Tool and
-- not in T-SQL
SELECT Year,
STR(SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END),5,1) AS [1st Qtr],
STR(SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END),5,1) AS [2nd Qtr],
STR(SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END),5,1) AS [3rd Qtr],
STR(SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END),5,1) AS [4th Qtr],
STR(SUM(Amount),5,1) AS Total
FROM #SomeTable1
GROUP BY Year


The code above gives us the final result we were looking for...


Year   1st Qtr 2nd Qtr 3rd Qtr 4th Qtr Total 
------ ------- ------- ------- ------- -----
2006 1.1 1.2 1.3 1.4 5.0
2007 2.1 2.2 2.3 2.4 9.0
2008 1.5 0.0 2.3 1.9 5.7

Just to emphasize what the very simple KEY to making a Cross Tab is... it's just like making a Total using SUM and Group By, but we've added a CASE statement to isolate the data for each Quarter.


A simple introduction to Pivots:


Microsoft introduced the PIVOT function in SQL Server 2005. It works about the same (has some limitations) as a Cross Tab. Using the same test table we used in the Cross Tab examples above, let's see how to use PIVOT to do the same thing...


--===== Use a Pivot to do the same thing we did with the Cross Tab
SELECT Year, --(4)
[1] AS [1st Qtr], --(3)
[2] AS [2nd Qtr],
[3] AS [3rd Qtr],
[4] AS [4th Qtr],
[1]+[2]+[3]+[4] AS Total --(5)
FROM (SELECT Year, Quarter,Amount FROM #SomeTable1) AS src --(1)
PIVOT (SUM(Amount) FOR Quarter IN ([1],[2],[3],[4])) AS pvt --(2)
ORDER BY Year


Ok... let's break that code down and figure out what each part does... the items below have numbers in the code above so you can more easily see what's going on...



  1. The FROM clause is actually a derived table. It very simply contains the columns that we want to use in the cross tab from the source table we want to use the pivot on. It will sometimes work as a normal FROM clause with just the table listed instead of a derived table, but most of the time it will not and is unpredictable when it does work.

  2. This is the "Pivot" line. It identifies the aggregate to be used, the column to pivot in the FOR clause, and the list of values that we want to pivot in the IN clause... in this case, the quarter number. Also notice that you must treat those as if they were column names. They must either be put in brackets or double quotes (if the quoted identifier setting is ON).

  3. This is the pivoted SELECT list. Notice that you have to bring everything in the IN clause from (2) up to the SELECT list. Aliasing the column names is optional but usually a good thing to do just to make the output obvious.

  4. You must also bring Year up as the row identifier in the pivot. Think of this as your "anchor" for the rows.

  5. Last but not least, if you want a total for each row in the pivot, you can no longer use just an aggregate. Instead, you must add all the columns together.



When you run the code, you get this...


Year   1st Qtr 2nd Qtr 3rd Qtr 4th Qtr Total
------ ------- ------- ------- ------- -----
2006 1.1 1.2 1.3 1.4 5.0
2007 2.1 2.2 2.3 2.4 9.0
2008 1.5 NULL 2.3 1.9 NULL

Notice the NULL's where there are no values or where a NULL has been added into a total. Remember that anything plus a NULL is still a NULL. All of this occurs because the Pivot doesn't do any substitutions like the Case statements we used in the Cross Tab. To fix this little problem, we have to use COALESCE (or ISNULL) on the columns... every bloody column! So, you end up with code that looks like this...



--===== Converting NULLs to zero's in the Pivot using COALESCE
SELECT Year,
COALESCE([1],0) AS [1st Qtr],
COALESCE([2],0) AS [2nd Qtr],
COALESCE([3],0) AS [3rd Qtr],
COALESCE([4],0) AS [4th Qtr],
COALESCE([1],0) + COALESCE([2] ,0) + COALESCE([3],0) + COALESCE([4],0) AS Total
FROM (SELECT Year, Quarter,Amount FROM #SomeTable1) AS src
PIVOT (SUM(Amount) FOR Quarter IN ([1],[2],[3],[4])) AS pvt
ORDER BY Year





That finally gives us the same result as a Cross Tab sans any right hand justification... again, you'd need to add the STR function to the code to do that.


Year   1st Qtr 2nd Qtr 3rd Qtr 4th Qtr Total
------ ------- ------- ------- ------- -----
2006 1.1 1.2 1.3 1.4 5.0
2007 2.1 2.2 2.3 2.4 9.0
2008 1.5 .0 2.3 1.9 5.7

Readability Comparison


Just for grins, here are both the Cross Tab and the Pivot code real close together so that you can do a comparison...


--===== The Cross Tab example 
SELECT Year,
SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END) AS [1st Qtr],
SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END) AS [2nd Qtr],
SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END) AS [3rd Qtr],
SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END) AS [4th Qtr],
SUM(Amount) AS Total
FROM #SomeTable1
GROUP BY Year

--===== The Pivot Example
SELECT Year,
COALESCE([1],0) AS [1st Qtr],
COALESCE([2],0) AS [2nd Qtr],
COALESCE([3],0) AS [3rd Qtr],
COALESCE([4],0) AS [4th Qtr],
COALESCE([1],0) + COALESCE([2] ,0) + COALESCE([3],0) + COALESCE([4],0) AS Total
FROM (SELECT Year, Quarter,Amount FROM #SomeTable1) AS src
PIVOT (SUM(Amount) FOR Quarter IN ([1],[2],[3],[4])) AS pvt
ORDER BY Year




I'm sure that you'll have a preference, but I like the Cross Tab code better for two reasons... the Cross Tab code is simpler, in my eyes... all I have to remember how to do are those very simple Case statements, I only have to list the values of the pivot columns once, and I don't have to use COALESCE anywhere. The second reason is how simple it is to do a row total.


There's actually several other reasons and one of them is performance. We'll get to performance later, but first let's talk about...


Multiple Aggregations In a Cross Tab (or, "The Problem with Pivots")


We're going to do this section backwards from what we've been doing... we're going to cover how to Pivot multiple aggregations before we cover the equivalent Cross Tab.


A "multiple aggregation Pivot" is just that... we want to show two different aggregates in the Pivot something like this (notice both the Qty and Amt columns have been aggregated)...


Company Year   Q1Amt Q1Qty Q2Amt Q2Qty Q3Amt Q3Qty Q4Amt Q4Qty TotalAmt TotalQty
------- ------ ----- ----- ----- ----- ----- ----- ----- ----- -------- --------
ABC 2006 1.1 2.2 1.2 2.4 1.3 1.3 1.4 4.2 5.0 10.1
ABC 2007 2.1 2.3 2.2 3.1 2.3 2.1 2.4 1.5 9.0 9.0
ABC 2008 1.5 5.1 0.0 0.0 2.3 3.3 1.9 4.2 5.7 12.6
XYZ 2006 2.1 3.6 2.2 1.8 3.3 2.6 2.4 3.7 10.0 11.7
XYZ 2007 3.1 1.9 1.2 1.2 3.3 4.2 1.4 4.0 9.0 11.3
XYZ 2008 2.5 3.9 3.5 2.1 1.3 3.9 3.9 3.4 11.2 13.3


This type of Pivot is a common request so that both aggregates can be viewed for the same time period at the same time. Otherwise, you'd have two completely separate Pivots and you'd have to visually scan back and forth to make simple comparisons. As you'll see the "Problem with Pivots" is that each Pivot can only aggregate one column. To do something like this using Pivots, you have two use two Pivots.


The Test Data


Before we begin, we need some data to test with...


--===== Sample data #2 (#SomeTable2)
--===== Create a test table and some data
CREATE TABLE #SomeTable2
(
Company VARCHAR(3),
Year SMALLINT,
Quarter TINYINT,
Amount DECIMAL(2,1),
Quantity DECIMAL(2,1)
)
GO
INSERT INTO #SomeTable2
(Company,Year, Quarter, Amount, Quantity)
SELECT 'ABC', 2006, 1, 1.1, 2.2 UNION ALL
SELECT 'ABC', 2006, 2, 1.2, 2.4 UNION ALL
SELECT 'ABC', 2006, 3, 1.3, 1.3 UNION ALL
SELECT 'ABC', 2006, 4, 1.4, 4.2 UNION ALL
SELECT 'ABC', 2007, 1, 2.1, 2.3 UNION ALL
SELECT 'ABC', 2007, 2, 2.2, 3.1 UNION ALL
SELECT 'ABC', 2007, 3, 2.3, 2.1 UNION ALL
SELECT 'ABC', 2007, 4, 2.4, 1.5 UNION ALL
SELECT 'ABC', 2008, 1, 1.5, 5.1 UNION ALL
SELECT 'ABC', 2008, 3, 2.3, 3.3 UNION ALL
SELECT 'ABC', 2008, 4, 1.9, 4.2 UNION ALL
SELECT 'XYZ', 2006, 1, 2.1, 3.6 UNION ALL
SELECT 'XYZ', 2006, 2, 2.2, 1.8 UNION ALL
SELECT 'XYZ', 2006, 3, 3.3, 2.6 UNION ALL
SELECT 'XYZ', 2006, 4, 2.4, 3.7 UNION ALL
SELECT 'XYZ', 2007, 1, 3.1, 1.9 UNION ALL
SELECT 'XYZ', 2007, 2, 1.2, 1.2 UNION ALL
SELECT 'XYZ', 2007, 3, 3.3, 4.2 UNION ALL
SELECT 'XYZ', 2007, 4, 1.4, 4.0 UNION ALL
SELECT 'XYZ', 2008, 1, 2.5, 3.9 UNION ALL
SELECT 'XYZ', 2008, 2, 3.5, 2.1 UNION ALL
SELECT 'XYZ', 2008, 3, 1.3, 3.9 UNION ALL
SELECT 'XYZ', 2008, 4, 3.9, 3.4
GO




The Multi-Aggregate Pivot


Like I said... we'll do the Pivot first this time... then we'll show you how easy it is to do using a Cross Tab.


In order to do a single Pivot, you have to have a derived table and a Pivot clause. The "Problem with Pivots" is that you can only Pivot one aggregate per Pivot clause. If you want to Pivot two aggregates as shown at the beginning of this section, you have to make two Pivots and join them as well as adding the necessary columns to the Select list. You already know how to use a single Pivot... Here's how we would do a double Pivot using the data above...


--===== The "Problem with Pivots" is you need to do one Pivot for each aggregate.
-- This code Pivots the Amt and Qty values by quarter.
SELECT amt.Company,
amt.Year,
COALESCE(amt.[1],0) AS Q1Amt,
COALESCE(qty.[1],0) AS Q1Qty,
COALESCE(amt.[2],0) AS Q2Amt,
COALESCE(qty.[2],0) AS Q2Qty,
COALESCE(amt.[3],0) AS Q3Amt,
COALESCE(qty.[3],0) AS Q3Qty,
COALESCE(amt.[4],0) AS Q4Amt,
COALESCE(qty.[4],0) AS Q4Qty,
COALESCE(amt.[1],0)+COALESCE(amt.[2],0)+COALESCE(amt.[3],0)+COALESCE(amt.[4],0) AS TotalAmt,
COALESCE(qty.[1],0)+COALESCE(qty.[2],0)+COALESCE(qty.[3],0)+COALESCE(qty.[4],0) AS TotalQty
FROM (SELECT Company, Year, Quarter, Amount FROM #SomeTable2) t1
PIVOT (SUM(Amount) FOR Quarter IN ([1], [2], [3], [4])) AS amt
INNER JOIN
(SELECT Company, Year, Quarter, Quantity FROM #SomeTable2) t2
PIVOT (SUM(Quantity) FOR Quarter IN ([1], [2], [3], [4])) AS qty
ON qty.Company = amt.Company
AND qty.Year = amt.Year
ORDER BY amt.Company, amt.Year


I don't know about you, but my personal feeling is that's starting to look a bit complicated and it's starting to be more difficult to read. Certainly, if we tried to convert this to dynamic SQL, you'd have your work cut out for you.


Notice that the FROM clause has two nearly identical derived tables and the only difference in the Pivot clauses are the columns being SUMmed. And, take a look at the row totals in the Select list... thank goodness this example only has 4 columns each for Quantity and Amount.


The Multi-Aggregate Cross Tab


We saw how complicated Multi-Aggregate Pivots can get. And the example above was just for two 4 column aggregates... just image what it might look like for three 12 column aggregates!


Let's see how complicated it might be in a Cross Tab...ready?


--===== Doing multiple aggregations in Cross Tabs is as simple as CPR
-- (CPR = Cut, Paste, Replace). AND, the table is "dipped" only
-- once instead of twice so there are NO JOINS to worry about!
SELECT Company,
Year,
SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END) AS Q1Amt,
SUM(CASE WHEN Quarter = 1 THEN Quantity ELSE 0 END) AS Q1Qty,
SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END) AS Q2Amt,
SUM(CASE WHEN Quarter = 2 THEN Quantity ELSE 0 END) AS Q2Qty,
SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END) AS Q3Amt,
SUM(CASE WHEN Quarter = 3 THEN Quantity ELSE 0 END) AS Q3Qty,
SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END) AS Q4Amt,
SUM(CASE WHEN Quarter = 4 THEN Quantity ELSE 0 END) AS Q4Qty,
SUM(Amount) AS TotalAmt,
SUM(Quantity) AS TotalQty
FROM #SomeTable2
GROUP BY Company, Year
ORDER BY Company, Year


How easy is that!? There're no derived tables... no fancy Pivot clauses... no huge lines of code to make simple row totals... and no joins!. It's a breeze to make using a little CPR (Copy, Paste, Replace).


Go back and compare the incredible simplicity of this Cross Tab with the relatively complex Pivot code to do the same thing. I don't know about you, but I won't be using Pivot to do such a simple thing.


"Pre-Aggregation"


I found something very handy in the past... I call it "Pre-Aggregation" and it can be used on either a Cross Tab or a Pivot.


The general purpose of pre-aggregation is to make it very easy to summarize the data and then format the data for display. Sometimes you'll have some complex aggregations that are a bit difficult or impossible to do when mixed with the rotation in the Select list, so the best thing to do is to do the aggregations as a derived table and then rotate the results. For example, if you want to aggregate dates by month, you'll find it's much easier to pre-aggregate the data using a formula to convert all dates to the first of the month. We'll cover more on that subject in the next article on Cross Tabs.


Pre-aggregation is nothing more than doing the aggregation as part of a derived table and then doing a Cross Tab or Pivot on that result. That's all it is.



You'll find pre-aggregation code for both Cross Tabs and Pivots in the next section of code where you'll also find another really good reason for doing pre-aggregation even if you don't need it to solve complexity...


Performance


Ah yes... what about performance? Just because the code looks simple or complex doesn't necessarily mean faster or slower nor fewer or more resources. Here's the full test code I used... I intentionally did NOT calculate Quarters from the date in the Cross Tabs or the Pivots because I wanted to show you just how much of a performance difference a simple tweak here and there can make... the biggest tweaks I made was the use of pre-aggregation and the use of CTE's...


--===== Create and populate a 1,000,000 row test table.
-- Column "RowNum" has a range of 1 to 1,000,000 unique numbers
-- Column "Company" has a range of "AAA" to "BBB" non-unique 3 character strings
-- Column "Amount has a range of 0.0000 to 9999.9900 non-unique numbers
-- Column "Quantity" has a range of 1 to 50,000 non-unique numbers
-- Column "Date" has a range of >=01/01/2000 and <01/01/2010 non-unique date/times
-- Columns Year and Quarter are the similarly named components of Date
-- Jeff Moden


SELECT TOP 1000000 --<<Look! Change this number for testing different size tables
RowNum = IDENTITY(INT,1,1),
Company = CHAR(ABS(CHECKSUM(NEWID()))%2+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%2+65)
+ CHAR(ABS(CHECKSUM(NEWID()))%2+65),
Amount = CAST(ABS(CHECKSUM(NEWID()))%1000000/100.0 AS MONEY),
Quantity = ABS(CHECKSUM(NEWID()))%50000+1,
Date = CAST(RAND(CHECKSUM(NEWID()))*3653.0+36524.0 AS DATETIME),
Year = CAST(NULL AS SMALLINT),
Quarter = CAST(NULL AS TINYINT)
INTO #SomeTable3
FROM Master.sys.SysColumns t1
CROSS JOIN
Master.sys.SysColumns t2

--===== Fill in the Year and Quarter columns from the Date column
UPDATE #SomeTable3
SET Year = DATEPART(yy,Date),
Quarter = DATEPART(qq,Date)

--===== A table is not properly formed unless a Primary Key has been assigned
-- Takes about 1 second to execute.
ALTER TABLE #SomeTable3
ADD PRIMARY KEY CLUSTERED (RowNum)
CREATE NONCLUSTERED INDEX IX_#SomeTable3_Cover1
ON dbo.#SomeTable3 (Company, Year)
INCLUDE (Amount, Quantity, Quarter)
GO
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
GO
---------------------------------------------------------------------------------------------------
--===== "Normal" Cross Tab
PRINT REPLICATE('=',100)
PRINT '=============== "Normal" Cross Tab ==============='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT Company,
Year,
SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END) AS Q1Amt,
SUM(CASE WHEN Quarter = 1 THEN Quantity ELSE 0 END) AS Q1Qty,
SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END) AS Q2Amt,
SUM(CASE WHEN Quarter = 2 THEN Quantity ELSE 0 END) AS Q2Qty,
SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END) AS Q3Amt,
SUM(CASE WHEN Quarter = 3 THEN Quantity ELSE 0 END) AS Q3Qty,
SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END) AS Q4Amt,
SUM(CASE WHEN Quarter = 4 THEN Quantity ELSE 0 END) AS Q4Qty,
SUM(Amount) AS TotalAmt,
SUM(Quantity) AS TotalQty
FROM #SomeTable3
GROUP BY Company, Year
ORDER BY Company, Year
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
---------------------------------------------------------------------------------------------------
--===== "Normal" Pivot
PRINT REPLICATE('=',100)
PRINT '=============== "Normal" Pivot ==============='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT amt.Company,
amt.Year,
COALESCE(amt.[1],0) AS Q1Amt,
COALESCE(qty.[1],0) AS Q1Qty,
COALESCE(amt.[2],0) AS Q2Amt,
COALESCE(qty.[2],0) AS Q2Qty,
COALESCE(amt.[3],0) AS Q3Amt,
COALESCE(qty.[3],0) AS Q3Qty,
COALESCE(amt.[4],0) AS Q4Amt,
COALESCE(qty.[4],0) AS Q5Qty,
COALESCE(amt.[1],0)+COALESCE(amt.[2],0)+COALESCE(amt.[3],0)+COALESCE(amt.[4],0) AS TotalAmt,
COALESCE(qty.[1],0)+COALESCE(qty.[2],0)+COALESCE(qty.[3],0)+COALESCE(qty.[4],0) AS TotalQty
FROM (SELECT Company, Year, Quarter, Amount FROM #SomeTable3) t1
PIVOT (SUM(Amount) FOR Quarter IN ([1], [2], [3], [4])) AS amt
INNER JOIN
(SELECT Company, Year, Quarter, Quantity FROM #SomeTable3) t2
PIVOT (SUM(Quantity) FOR Quarter IN ([1], [2], [3], [4])) AS qty
ON qty.Company = amt.Company
AND qty.Year = amt.Year
ORDER BY amt.Company, amt.Year
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
---------------------------------------------------------------------------------------------------
--===== "Pre-aggregated" Cross Tab
PRINT REPLICATE('=',100)
PRINT '=============== "Pre-aggregated" Cross Tab ==============='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT Company,
Year,
SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END) AS Q1Amt,
SUM(CASE WHEN Quarter = 1 THEN Quantity ELSE 0 END) AS Q1Qty,
SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END) AS Q2Amt,
SUM(CASE WHEN Quarter = 2 THEN Quantity ELSE 0 END) AS Q2Qty,
SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END) AS Q3Amt,
SUM(CASE WHEN Quarter = 3 THEN Quantity ELSE 0 END) AS Q3Qty,
SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END) AS Q4Amt,
SUM(CASE WHEN Quarter = 4 THEN Quantity ELSE 0 END) AS Q4Qty,
SUM(Amount) AS TotalAmt,
SUM(Quantity) AS TotalQty
FROM (SELECT Company,Year,Quarter,SUM(Amount) AS Amount,SUM(Quantity) AS Quantity
FROM #SomeTable3 GROUP BY Company,Year,Quarter) d
GROUP BY Company, Year
ORDER BY Company, Year
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
---------------------------------------------------------------------------------------------------
--===== "Pre-aggregated" Pivot
PRINT REPLICATE('=',100)
PRINT '=============== "Pre-aggregated" Pivot ==============='
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT amt.Company,
amt.Year,
COALESCE(amt.[1],0) AS Q1Amt,
COALESCE(qty.[1],0) AS Q1Qty,
COALESCE(amt.[2],0) AS Q2Amt,
COALESCE(qty.[2],0) AS Q2Qty,
COALESCE(amt.[3],0) AS Q3Amt,
COALESCE(qty.[3],0) AS Q3Qty,
COALESCE(amt.[4],0) AS Q4Amt,
COALESCE(qty.[4],0) AS Q5Qty,
COALESCE(amt.[1],0)+COALESCE(amt.[2],0)+COALESCE(amt.[3],0)+COALESCE(amt.[4],0) AS TotalAmt,
COALESCE(qty.[1],0)+COALESCE(qty.[2],0)+COALESCE(qty.[3],0)+COALESCE(qty.[4],0) AS TotalQty
FROM (SELECT Company, Year, Quarter, SUM(Amount) AS Amount FROM #SomeTable3 GROUP BY Company, Year, Quarter) t1
PIVOT (SUM(Amount) FOR Quarter IN ([1], [2], [3], [4])) AS amt
INNER JOIN
(SELECT Company, Year, Quarter, SUM(Quantity) AS Quantity FROM #SomeTable3 GROUP BY Company, Year, Quarter) t2
PIVOT (SUM(Quantity) FOR Quarter IN ([1], [2], [3], [4])) AS qty
ON qty.Company = amt.Company
AND qty.Year = amt.Year
ORDER BY amt.Company, amt.Year
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
---------------------------------------------------------------------------------------------------
--===== "Pre-aggregated" Cross Tab with CTE
PRINT REPLICATE('=',100)
PRINT '=============== "Pre-aggregated" Cross Tab with CTE ==============='
SET STATISTICS IO ON
SET STATISTICS TIME ON
;WITH
ctePreAgg AS
(SELECT Company,Year,Quarter,SUM(Amount) AS Amount,SUM(Quantity) AS Quantity
FROM #SomeTable3
GROUP BY Company,Year,Quarter
)
SELECT Company,
Year,
SUM(CASE WHEN Quarter = 1 THEN Amount ELSE 0 END) AS Q1Amt,
SUM(CASE WHEN Quarter = 1 THEN Quantity ELSE 0 END) AS Q1Qty,
SUM(CASE WHEN Quarter = 2 THEN Amount ELSE 0 END) AS Q2Amt,
SUM(CASE WHEN Quarter = 2 THEN Quantity ELSE 0 END) AS Q2Qty,
SUM(CASE WHEN Quarter = 3 THEN Amount ELSE 0 END) AS Q3Amt,
SUM(CASE WHEN Quarter = 3 THEN Quantity ELSE 0 END) AS Q3Qty,
SUM(CASE WHEN Quarter = 4 THEN Amount ELSE 0 END) AS Q4Amt,
SUM(CASE WHEN Quarter = 4 THEN Quantity ELSE 0 END) AS Q4Qty,
SUM(Amount) AS TotalAmt,
SUM(Quantity) AS TotalQty
FROM ctePreAgg
GROUP BY Company, Year
ORDER BY Company, Year
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
---------------------------------------------------------------------------------------------------
--===== "Pre-aggregated" Pivot with CTE
PRINT REPLICATE('=',100)
PRINT '=============== "Pre-aggregated" Pivot with CTE ==============='
SET STATISTICS IO ON
SET STATISTICS TIME ON
;WITH
ctePreAgg AS
(SELECT Company,Year,Quarter,SUM(Amount) AS Amount,SUM(Quantity) AS Quantity
FROM #SomeTable3
GROUP BY Company,Year,Quarter
)
SELECT amt.Company,
amt.Year,
COALESCE(amt.[1],0) AS Q1Amt,
COALESCE(qty.[1],0) AS Q1Qty,
COALESCE(amt.[2],0) AS Q2Amt,
COALESCE(qty.[2],0) AS Q2Qty,
COALESCE(amt.[3],0) AS Q3Amt,
COALESCE(qty.[3],0) AS Q3Qty,
COALESCE(amt.[4],0) AS Q4Amt,
COALESCE(qty.[4],0) AS Q5Qty,
COALESCE(amt.[1],0)+COALESCE(amt.[2],0)+COALESCE(amt.[3],0)+COALESCE(amt.[4],0) AS TotalAmt,
COALESCE(qty.[1],0)+COALESCE(qty.[2],0)+COALESCE(qty.[3],0)+COALESCE(qty.[4],0) AS TotalQty
FROM (SELECT Company, Year, Quarter, Amount FROM ctePreAgg) AS t1
PIVOT (SUM(Amount) FOR Quarter IN ([1], [2], [3], [4])) AS amt
INNER JOIN
(SELECT Company, Year, Quarter, Quantity FROM ctePreAgg) AS t2
PIVOT (SUM(Quantity) FOR Quarter IN ([1], [2], [3], [4])) AS qty
ON qty.Company = amt.Company
AND qty.Year = amt.Year
ORDER BY amt.Company, amt.Year
SET STATISTICS TIME OFF
SET STATISTICS IO OFF


The test code was executed 10 times each for 10k, 100k, and 1 million rows both with and without the index created at the beginning of the code. The averaged results, calculated from a profiler table (not included in the code), are fascinating. The light green cells indicate the fastest run times or the least number of reads. The light blue indicate the second place for the same thing...



Notice that even for "normal" Cross Tabs and Pivots that the only place a Pivot wins in any category is in the paltry 10k row test. The Cross Tab wins everywhere else. That's good news for SQL Server 2000 users because you won't want to change your code if and when you upgrade to SQL Server 2005. Using CTE's helps a bit but not as much as pre-aggregation on the larger row counts does. Again, that's good news for SQL Server 2000 users.


Review


In this article, we learned the basis of how to change rows to columns using both Cross Tabs and Pivot. We've discovered that Cross Tabs are nothing more than simple aggregations that have a built in selection condition in the form of a simple Case statement. We've seen how to use a Pivot to do the same thing as a Cross Tab and, in the process, discovered that they're a bit more complicated to create, read, and understand especially when compared to the simplicity of the Cross Tab code. We've been introduced to the concept of "pre-aggregation and the fact that pre-aggregation can make more complex aggregations both easier to read and to contrive. Through testing, we've found that the Cross Tab beats Pivot code in all but the smallest of tables. Through that same testing, we've also found that pre-aggregation adds a substantial performance gain in all but the smallest of tables.


Last but certainly not least, we've discovered that there's no reason to rewrite properly written Cross Tabs to become Pivots when shifting from SQL Server 2000 to SQL Server 2005. To do so would actually cause a negative impact to performance most of the time.


In the Works...



Coming soon to a forum near you... Dynamic Cross Tabs, EAV/NVP conversions, and more on pre-aggregation.


Thanks for listening folks.


--Jeff Moden


Tuning SQL Server performance via disk arrays and disk partitioning

One of my collections, about MSSQL Server Tunning. Enjoy it...

Author : Denny Cherry, 08.19.2008

As a DBA, much of your focus is on performance tuning SQL Server. But have you spent time to tune the hardware supporting your SQL Server system? Are you using the optimal disk array configuration? Are the disk partitions aligned? This tip discusses how to get your SQL Server hardware performance in top shape – whether the system is already in operation or it's a new setup.


With the massive amount of raw horse power available in today's server class hardware, it's easy to skip over the hardware when it comes to performance tuning the SQL Server database. After all, with so much power available, who cares if something takes a few extra milliseconds to complete? Is anyone really going to notice that extra millisecond?




But what happens when you perform an operation that takes 10 extra milliseconds to complete; and it needs to perform 100 times an hour, for a year? All of a sudden, that 10 milliseconds turns into 2.4 hours. If you perform that operation 1,000 times an hour -- which isn't all that unheard of in a smaller OLTP database -- you are now looking at more than 24 hours of wasted time.


In my particular environment, we run the same stored procedure at least 2,000 times per minute. If that stored procedure takes an extra 10 milliseconds to complete, we are looking at eight hours of lost time daily, or 121 days of lost time per year.


Tune SQL Server by tuning disk arrays


There are a few places to check hardware components when tuning your SQL Server system. The easiest components to check are disk arrays. Typically, disk arrays are where the most time is expended
























More on tuning SQL Server performance and hardware:
  • Determining SQL Server database storage requirements

  • Optimize disk configuration in SQL Server

  • Storage area network (SAN) basics every DBA must know



  • waiting for something to happen. There are a couple of ways to tune the disks to improve SQL Server performance. The first is to make sure your disk array has enough spindles to handle the workload that will be placed on it. Second, make sure the disk arrays are in the correct RAID level to offer the best support level for the database.



    While it is true that RAID 10 offers better write performance, in most cases, RAID 10 isn't required for the data files. That said, you should use RAID 10 for your transaction logs and tempdb database, as they are mostly write files. The reason I say not to use RAID 10 for all database files is that RAID 10 is very costly to implement in large disk sizes. This is because for each spindle used for data, a second spindle is used for redundancy.


    Finding out if you need more spindles on an existing system is easy. Open Performance Monitor on the server and add the "Physical Disk" object and the "Current Disk Queue Length" counter. Some queuing is OK; however, there is a tipping point. To find out where the tipping point of "OK queuing" and "too much queuing" is, take the number of disks in the array and multiply it by two. If the result is greater than the maximum value in Performance Monitor, then you have too much queuing. When we talk about the number of disks, we're referring to the number of disks that are actively working with data. If you have a RAID 10 array, this is half the number of disks in the array.


    "Number of Disks" x 2 = Maximum Allowable Queuing


    How to configure the disk array on your new SQL Server system


    When working on a new system without any load on it, making sure you configure your disk array correctly is a little more challenging. If you have another system with the same amount of load on it, you can use that system as a guide. However, if this is the first large system in your environment, then getting it correct can be a bit harder.


    You'll need to look at your database system and the expected transactions per second, and make an educated guess on how many spindles you'll need. When dealing with high-end drives, expect each drive to give you about 100 IOPs to 120 IOPs per second per disk in an OLTP environment. When dealing with SATA drives, expect each drive to give you about 60 IOPs to 80 IOPs per second per disk in an OLTP environment. Those numbers will go up when working in an OLAP environment because OLAP databases put a different kind of load on the disk system. It's a more sequential load, whereas OLTP databases put a random load on the disks.


    Disk partition alignment improves SQL Server performance



    Once you set up your disk array, you'll want to make sure the partition you create is correctly aligned. By default, when Windows (or any other operating system for that matter) creates the partition on a disk or array, the partition is not correctly aligned for peak performance. Disk drives are made up of 1K blocks. The physical disks like to do all their operations in 64-block chunks called clusters. Conveniently, SQL Server likes to do all its operations in 64 K operations -- there are eight 8K blocks in each extent, and SQL does its reads one extent at a time. When the partition is created, the boot sector is created at the beginning of the partition. This boot sector is 32 K in size, causing the 64K operations to be spread between two 64K clusters. This then causes each logical operation to take twice as many physical operations as needed.


    You can see your current alignment offset by using the diskpart application. Open a command prompt and run diskpart.exe. When you are prompted with a DISKPART> prompt, type SELECT DISK n where n is the disk number you want to look at -- the command LIST DISK will give you a list of disks in the machine. After selecting the disk, type in "LIST PARTITION" to get the partition information, including the offset.



    Using disk partition in SQL Server

    Figure 1: Run the DISKPART application to view disk alignment.

    In order to create the partition, you'll need to use the CREATE PARTITION command. The full command is CREATE PARTITION PRIMARY ALIGN=64. This creates the new partition with the 64K offset, aligning the partition into the optimum position for maximum performance.



    Check out part two in this tip series, configuring memory and CPU processing for improved SQL Server performance.

    Tuning SQL Server performance via memory and CPU processing

    After a long journey browsing on internet, finally I found this interesting and usefull article for tunning our MSSQL Database Server. So in this place i re posting to share with you guys.... enjoy it ;)

    Author : Denny Cherry, 09.09.2008
    In my previous tip, Tuning SQL Server performance via disk arrays and disk partitioning, we talked about how important it is to ensure that your storage was set up correctly to optimize SQL Server performance. However, storage isn't the only part of SQL Server hardware that needs special consideration when designing your infrastructure.



    SQL Server memory can also impact performance. While having too much memory in a SQL Server system is a waste of money, having too little memory is extremely detrimental to performance. Unfortunately, determining when you need more memory in the system can be a bit tricky. When memory problems begin, you'll start to see an increase in disk I/O, as well as an increase in disk queuing. You'll also see a decrease in the buffer cache hit ratio and page life expectancy. As memory requirements increase, you may begin to see these error messages in the log file:




  • A significant part of SQL Server process memory has been paged out. This may result in a performance degradation. Duration: %n seconds. Working set (KB): %w, committed (KB): %c, memory utilization: %u.


  • SQL Server has encountered %o occurrence(s) of IO requests taking longer than 15 seconds to complete on file [%f] in database [%d] (%i). The OS file handle is %h. The offset of the latest long IO is: %l.


  • Unfortunately, this is not the only time these errors are reported, so you have to use them along with the performance monitor metrics to determine that memory is actually low.



    When dealing with SQL Server memory issues, there are a few options to resolve the problems. The easiest solution is to increase server memory, which increases the amount of buffer























    More on improving SQL Server performance:
  • Memory configurations for procedure cache and buffer cache
  • Configuring SQL Server memory settings

  • Clustered and non-clustered indexes in SQL Server




  • cache available. This adds to the amount of data in memory and reduces your disk I/O. Other potential solutions include removing clustered indexes for extremely large tables and using only nonclustered indexes for the table, including the Primary Key.



    This will only make a difference when the clustered index is being used for lookups, and clustered index seeks are used. If another index is in use, it will not relieve any memory pressure, as the clustered index won't be in memory. If you're using clustered index scans, then this turns into table scans that load the table into memory instead of the index. If clustered index scans are being performed, then a new nonclustered index may help the situation without removing the index.



    How to monitor CPU queuing



    The CPU is another piece of hardware that can cause potential performance problems. Most people only look at the speed of or number of CPUs. However, just like disks, CPUs can become bottlenecked. If there is a CPU bottleneck, you may not even see the CPU performance at 100%. CPUs have command queues in much the same way that disks have I/O queues. Commands are loaded into a CPU queue and the operation waits for the CPU to become available before performing the operation. As CPUs became faster, we could do things much faster within the CPU, but we could still only do the same number of things at one time. Now, as dual-core, tri-core and quad-core CPUs become available, we can process more commands at one time.



    You can monitor your CPU queue using SQL Server Performance Monitor. You'll find PerfMon under the System object, with the counter name "Processor Queue Length." Pretty much any queue length other than zero indicates a need to increase the number of operations that SQL Server can perform at any one time. It doesn't indicate a need for faster CPUs, but a need for more CPU cores. Today's newest servers support 32 cores per server, and some of the most advanced servers support up to 64 cores -- when chases are scaled together support for 64 cores can be built (available only from certain vendors).



    In parts one and two, I've pointed out a variety of places within the hardware that impact whether your SQL Server system will run at peak performance. These tips are not the be-all, end-all solutions to performance problems. Table design and index tuning always have been and will continue to be extremely important. Today's SQL Server is expected to do more work for more hours of the day, which makes hardware tuning more important to the success of the database platform. With these tools in your arsenal to combat performance problems, you'll be able to get every ounce of performance from the existing hardware with no or minimal hardware upgrades to the platform. But when you do need to make those purchasing decisions, use these tips to make the correct purchasing decisions to get the most upgrade for your dollars spent.


    Tuesday, September 23, 2008

    Export data from SQL Server to Excel

    Problem
    Exporting data from SQL Server to Excel seems like a reasonably simple request. I just need to write out a few reports for users on a regular basis, nothing too fancy, the same basic report with a few different parameters. What native SQL Server options are available to do so? Do I need to learn another tool or can I use some T-SQL commands? Does SQL Server 2005 offer any new options to enhance this process?

    ..Read More

    Solution
    Exporting data from SQL Server to Excel can be achieved in a variety of ways. Some of these options include Data Transformation Services (DTS), SQL Server Integration Services (SSIS) and Bulk Copy (BCP). Data Transformation Services (SQL Server 2000) and SQL Server Integration Services (SQL Server 2005) offers a GUI where widgets can be dragged and dropped Each option has advantages and disadvantages, but all can do the job. It is just a matter of your comfort level with the tools and the best solution to meet the need.

    Another option that is available directly via the T-SQL language is the OPENROWSET command (SQL Server 2000 and SQL Server 2005). This command can be called directly in any stored procedure, script or SQL Server Job from T-SQL. Below outlines the full syntax available:


    Source - SQL Server 2005 Books Online

    Below is a simple example of writing out the Job name and date to Sheet1 of an Excel spreadsheet in either SQL Server 2005 or 2000:





    INSERT INTO OPENROWSET('Microsoft.Jet.OLEDB.4.0',
    'Excel 8.0;Database=C:\testing.xls;',
    'SELECT Name, Date FROM [Sheet1$]')
    SELECT [Name], GETDATE() FROM msdb.dbo.
    sysjobs
    GO

    Using the OPENROWSET command creates two caveats. The first caveat is the need to have an Excel spreadsheet serve as a template in the needed directory with the correct worksheets and columns. Without this the you would receive an error message. The second caveat is that it is necessary to enable the OPENROWSET command with the SQL Server 2005 Surface Area Configuration utility. Note - This step is not needed for SQL Server 2000. With the loop example you could copy and paste the Excel spreadsheet and load the data as needed.

    Although the example above is very simple, you could integrate this simple example into your code. For example, if you had the need to write out a number of reports in the same format, you could loop (WHILE loop or Cursor) over the records and write them out to different Excel spreadsheets based on the name or report type. In addition, you could integrate this code with either SQL Server 2000 mail or Database mail (SQL Server 2005) and mail the results to the users with very little effort and build the process entirely with T-SQL.

    Next Steps