Tuesday, October 9, 2007

Calculate Percentiles with SQL Server 2005

"When you need to analyze numerical data, percentiles are handy for understanding the distribution. Oracle has been providing analytical functions such as NTILE, PERCENTILE_CONT, and RANK since version 8, while SQL Server has been playing catch up. With SQL Server 2005, Microsoft finally added NTILE, RANK, DENSE_RANK, and ROW_NUMBER to SQL Server's T-SQL functions. Unfortunately, the SQL Server 2005 still does not provide PERCENTILE_CONT, and neither will SQL Server 2008. They do, however, feature the new SQL Server Common Table Expression (CTE), which you can use to calculate percentiles."

"This article presents a solution for calculating percentiles in SQL Server 2005 using SQL Server CTE. Based on a solution described in Joe Celko's SQL For Smarties, the idea is to calculate the percentile (percent rank) of each row; if one of the rows has the percentile value that you are looking for, you return it. Otherwise, you perform an interpolation between a value that is slightly more than the desired percentile and one that is slightly less."
 
 
I recently used this approach for some new functionality in one of my projects and it really saved alot of intensive and tedious queries to come up with the right value.

0 comments: