OK, so won’t actually generate a stack overflow error in SQL Server, but it makes for a good title. Some time ago I was asked to write a custom deferred revenue integration into GP. There were some quirks to their recognition schedule that prevented them from using the standard deferred revenue module.
For those of you not familiar with deferred revenue, here’s a quick primer. If you pay $24 for a 12 month magazine subscription, the publisher is not allowed to recognize and report to their stockholders the entire $24 when they receive it. They can only recognize revenue as they’ve delivered the service, in this case each magazine, to you. As such, they will recognize and report $2 in revenue each month.
OK, so how do you write a SQL statement that takes a set of rows and creates multiple rows from each individual row. Let’s say that we have a sales table, tblSales, that has an amount to be recognized, a starting month and a number of months to be recognized over as follows:
OrderNumber | Amount | StartingMonth | RecognitionMonths |
1 | $120.00 | 3/2010 | 3 |
2 | $257.00 | 2/2010 | 7 |
To make this easy, we’ll recognize the revenue equally over the number of months. Often revenue needs to be recognized according to the number of days in the month and take into account starting and ending dates that are mid-month. In this case we’ll use a simple model and simply need to return a dataset of 10 rows as follows:
OrderNumber | AmountRecognized | MonthRecognized |
1 | $40.00 | 3/2010 |
1 | $40.00 | 4/2010 |
1 | $40.00 | 5/2010 |
2 | $36.71 | 2/2010 |
2 | $36.71 | 3/2010 |
2 | $36.71 | 4/2010 |
2 | $36.71 | 5/2010 |
2 | $36.71 | 6/2010 |
2 | $36.71 | 7/2010 |
2 | $36.74 | 8/2010 |
So how, in a SQL statement, can we get from 2 rows to 10? As with any problem, I’m sure we could come up with any number of solutions using WHILE loops and table variables, but I’m going to use the opportunity to introduce recursion in SQL.
Recursion is a feature of the T-SQL Common Table Expression. It is essentially a common table expression (CTE) that references itself. Let’s look at the following statement:
WITH DeferredRevenueEntries(
OrderNumber
,AmountRecognized
,MonthRecognized
,RecognitionMonths
,CurrentMonthIndex) AS
(
-- Anchor member
-- The anchor member creates the entry for the first month.
SELECT
OrderNumber
,AmountRecognized = Amount / RecognitionMonths
,MonthRecognized = StartingMonth
,RecognitionMonths
,1 AS CurrentMonthIndex
FROM tblSales
UNION ALL
-- Recursive member
-- Recurse one time for each additional month to be recognized
SELECT
OrderNumber
,AmountRecognized
,MonthRecognized = DATEADD(month, 1, MonthRecognized)
,RecognitionMonths
,CurrentMonthIndex = CurrentMonthIndex + 1
FROM DeferredRevenueEntries
WHERE CurrentMonthIndex < RecognitionMonths
)SELECT
OrderNumber
,AmountRecognized
,MonthRecognized
FROM DeferredRevenueEntries
ORDER BY
OrderNumber
,MonthRecognized
Here we’re defining a CTE, DeferredRevenueEntries. This is a UNION query. the first part of the union is called the anchor member. This is a SELECT statement against our source table, tblSales. This part of will return the first month rows for each of the sales records, with the amount to be recognized being the total amount divided by the number of months.
The second part is the recursive member. Note that the FROM clause is referencing the CTE itself. This will take the initial two records from the first part of the union, and keep returning them, adding one month each time. To make sure that we only return the number of records matching the number of months to be recognized, we add the field CurrentMonthIndex, increment it each time we recurse and make sure we only return records where the CurrentMonthIndex is less than the number of months to be recognized.
OK, so what happens if we forget to put in the WHERE clause in the recursive member? Will we in fact cause the aforementioned stack overflow? The answer is no. SQL is too smart for that. Instead you will recieve the following error:
Msg 530, Level 16, State 1, Line 1
The statement terminated. The maximum recursion 100 has been exhausted before statement completion.
By default, SQL Server limits you to a maximum recursion level of 100. It is possible to override that limit by using the MAXRECURSION option in the final SELECT statement such that it looks as follows:
SELECT
OrderNumber
,AmountRecognized
,MonthRecognized
FROM DeferredRevenueEntries
ORDER BY
OrderNumber
,MonthRecognized
OPTION (MAXRECURSION 1000)
Now you might say, well what if I put in a MAXRECURSION of 10000000000? Is there some way I can create a stack overflow? Again, SQL will outwit you. The maximum MAXRECURSION limit is 32767.
OK, so there’s one final issue. You’ll note in my results example above that the recognition amount for order 2 was $36.71 except for the last entry where it’s $36.74. This is because never met an accountant who could deal with as much as a single penny off no matter how high the total. To fix that I create the following modified version of the statement:
WITH DeferredRevenueEntries(
OrderNumber
,AmountRecognized
,MonthRecognized
,RecognitionMonths
,CurrentMonthIndex) AS
(
-- Anchor member
-- The anchor member creates the entry for the first month.
SELECT
OrderNumber
,AmountRecognized = ROUND((Amount / RecognitionMonths), 2)
,MonthRecognized = StartingMonth
,RecognitionMonths
,1 AS CurrentMonthIndex
FROM tblSales
UNION ALL
-- Recursive member
-- Recurse one time for each additional month to be recognized
SELECT
OrderNumber
,AmountRecognized
,MonthRecognized = DATEADD(month, 1, MonthRecognized)
,RecognitionMonths
,CurrentMonthIndex = CurrentMonthIndex + 1
FROM DeferredRevenueEntries
WHERE CurrentMonthIndex < (RecognitionMonths - 1)
)SELECT
OrderNumber
,AmountRecognized
,MonthRecognized
FROM DeferredRevenueEntriesUNION ALL
-- Create last entry manually to adjust for rounding errors.
SELECT
tblSales.OrderNumber
,AmountRecognized = ROUND(tblSales.Amount - DeferredRevenueEntryTotals.TotalAmountRecognized, 2)
,MonthRecognized = DATEADD(month, 1, DeferredRevenueEntryTotals.LastMonthRecognized)
FROM tblSales
INNER JOIN (
SELECT
OrderNumber
,TotalAmountRecognized = SUM(AmountRecognized)
,LastMonthRecognized = MAX(MonthRecognized)
FROM DeferredRevenueEntries
GROUP BY OrderNumber) AS DeferredRevenueEntryTotals
ON tblSales.OrderNumber = DeferredRevenueEntryTotals.OrderNumberORDER BY
OrderNumber
,MonthRecognizedYou’ll notice a few changes here. The first is that I round the AmountRecognized value to 2 digits. The second is that the CTE is now only going to recurse through the number of months – 1. This is because we’re create the final entry in the last select in the UNION query by taking the total sales amount from tblSales and then subtracting the total recognized prior to the final month, giving us an amount that is adjusted for any rounding issues.
I hope that this has given you a brief look at the power of the recursive CTE and that you are reassured that you will not break SQL Server by using it. There are certainly many other ways to use the, such as traversing hierarchical lists. Go ahead, recurse away!
Great article. Hadn't thought of the "tail" record thing for rounding. Nice!
ReplyDelete