Monday, January 18, 2010

Common Table Expressions – Dispensing with the Sub-Query

The Common Table Expression (CTE) is a construct that was introduced in SQL Server 2005. It allows you to define a SELECT statement outside of your main query and then reference it within the query. It’s a great replacement for sub-queries, which can be difficult to debug and maintain, especially once their nested. You can think of CTEs as SQL views that exist only within the scope of a particular SQL statement.

Let’s start with an query example similar to one we were asked to do recently. Let’s imagine you’re asked to write an inventory analysis report that requires the following columns:

  • Item ID
  • Item Description
  • Item Class
  • Standard Cost
  • Quantity On Hand
  • Count Sold – The total number of items sold.
  • Average Unit Price – The average unit price that the
  • Unique Customers Count – The count if individual customers who have purchased the item
  • Percentage of Customers – The percentage of all customers who have purchased the item

The report will have start and end date range parameters (Quantity On Hand will be the current regardless of the date range.

Item ID, Item Description, Item Class and Standard Cost come from the item master table. Quantity On Hand comes from the item quantity master table. Count Sold, Average Unit Price, Count of Unique Customers and Percentage of Customers come from a combination of the Sales Order and Customer Master tables.

Now I suspect there may be a way to do this without sub-queries using aggregates and CASE statements but that’s a challenge for another day. When solving a problem I like to break it into manageable chunks. In this case I’d looking at breaking this into two pieces:

  • Returning the core item information, Item ID, Item Description, Standard Cost and Quantity On Hand.
  • Returning the sales information, Count Sold and Average Unit Price, Unique Customer Count and Percentage of Customers.

Let’s start with the core item info. That query would look as follows:

SELECT
        ItemNumber = ISNULL(ItemMaster.ITEMNMBR, QuantityMaster.ITEMNMBR)
        ,ItemDescription = ItemMaster.ITEMDESC
        ,ItemClass = ItemMaster.ITMCLSCD
        ,StandardCost = ItemMaster.STNDCOST
        ,QuantityOnHand = ISNULL(QuantityMaster.QTYONHND, 0)
    FROM IV00101 AS ItemMaster
        FULL JOIN IV00102 AS QuantityMaster
            ON ItemMaster.ITEMNMBR = QuantityMaster.ITEMNMBR
    WHERE ISNULL(QuantityMaster.RCRDTYPE, 1) = 1

Here we’re joining the item master (IV00101) with the item quantity master (IV00102). A couple of notes:

  • We’re using a FULL JOIN to ensure we pick up all items. I have found examples where there are records in the item master or quantity master but not a corresponding record in the other.
  • We’re restricting the quantity master records to a record type (RCRDTYPE) of 1, which only returns quantities that apply to all sites.

Next we’ll look at the statistics. The query would look as follows:

SELECT
        ItemNumber = SOPLineItems.ITEMNMBR
        ,CountSold = SUM(SOPLineItems.QUANTITY)
        ,AverageUnitPrice =
            CASE
                WHEN SUM(SOPLineItems.QUANTITY) = 0 THEN 0.00
                ELSE SUM(SOPLineItems.QUANTITY * SOPLineItems.UNITPRCE) / SUM(SOPLineItems.QUANTITY)
            END
        ,CustomerCount = COUNT(DISTINCT CUSTNMBR)
        ,CustomerPercentage = CAST(COUNT(DISTINCT CUSTNMBR) AS float) /
            CAST ((SELECT COUNT(*) FROM RM00101) AS float)
    FROM SOP30300 AS SOPLineItems
        INNER JOIN SOP30200 AS SOPHeaders
            ON SOPLineItems.SOPTYPE = SOPHeaders.SOPTYPE
                AND SOPLineItems.SOPNUMBE = SOPHeaders.SOPNUMBE
    WHERE SOPHeaders.SOPTYPE = 3
        AND SOPHeaders.GLPOSTDT BETWEEN @StartDate AND @EndDate
    GROUP BY SOPLineItems.ITEMNMBR

Here we’re pulling the data from SOP line items and SOP headers with a total count from the customer master. A couple of notes:

  • We’re using a CASE statement in the Average Unit Price to ensure we don’t get a divide by zero error.
  • We’re filtering the rows on the SOP header GL posting date (GLPOSTDT).
  • We’re getting the item customer count using the COUNT(DISTINCT ) construct.
  • While getting the customer percentage we’re explicitly casting the counts as floats so that it doesn’t perform integer division and only return 0 and 1.

So traditionally, we would combine these together as two sub-queries as below:

SELECT
        ItemNumber = Items.ItemNumber
        ,ItemClass = Items.ItemClass
        ,ItemDescription = Items.ItemDescription
        ,StandardCost = Items.StandardCost
        ,QuantityOnHand = Items.QuantityOnHand
        ,CountSold = ISNULL(ItemStatistics.CountSold, 0)
        ,AverageUnitPrice = ISNULL(ItemStatistics.AverageUnitPrice, 0)
        ,CustomerCount = ISNULL(ItemStatistics.CustomerCount, 0)
        ,CustomerPercentage = ISNULL(ItemStatistics.CustomerPercentage, 0)
    FROM (
            SELECT
                    ItemNumber = ISNULL(ItemMaster.ITEMNMBR, QuantityMaster.ITEMNMBR)
                    ,ItemDescription = ItemMaster.ITEMDESC
                    ,ItemClass = ItemMaster.ITMCLSCD
                    ,StandardCost = ItemMaster.STNDCOST
                    ,QuantityOnHand = ISNULL(QuantityMaster.QTYONHND, 0)
                FROM IV00101 AS ItemMaster
                    FULL JOIN IV00102 AS QuantityMaster
                        ON ItemMaster.ITEMNMBR = QuantityMaster.ITEMNMBR
                WHERE ISNULL(QuantityMaster.RCRDTYPE, 1) = 1) AS Items
        LEFT JOIN (
                SELECT
                        ItemNumber = SOPLineItems.ITEMNMBR
                        ,CountSold = SUM(SOPLineItems.QUANTITY)
                        ,AverageUnitPrice =
                            CASE
                                WHEN SUM(SOPLineItems.QUANTITY) = 0 THEN 0.00
                                ELSE SUM(SOPLineItems.QUANTITY * SOPLineItems.UNITPRCE) / SUM(SOPLineItems.QUANTITY)
                            END
                        ,CustomerCount = COUNT(DISTINCT CUSTNMBR)
                        ,CustomerPercentage = CAST(COUNT(DISTINCT CUSTNMBR) AS float) /
                            CAST ((SELECT COUNT(*) FROM RM00101) AS float)
                    FROM SOP30300 AS SOPLineItems
                        INNER JOIN SOP30200 AS SOPHeaders
                            ON SOPLineItems.SOPTYPE = SOPHeaders.SOPTYPE
                                AND SOPLineItems.SOPNUMBE = SOPHeaders.SOPNUMBE
                    WHERE SOPHeaders.SOPTYPE = 3
                        AND SOPHeaders.GLPOSTDT BETWEEN @StartDate AND @EndDate
                    GROUP BY SOPLineItems.ITEMNMBR) AS ItemStatistics
                        ON Items.ItemNumber = ItemStatistics.ItemNumber

Although this works, it’s starting to get a little difficult to read. If there were additional sub-queries, particularly nested ones, it would be much harder to follow and debug. If we’re using CTEs we would create the query below:

-- Items common table expression returns item list with quantities
-- for all sites.
WITH Items
AS (
    SELECT
            ItemNumber = ISNULL(ItemMaster.ITEMNMBR, QuantityMaster.ITEMNMBR)
            ,ItemDescription = ItemMaster.ITEMDESC
            ,ItemClass = ItemMaster.ITMCLSCD
            ,StandardCost = ItemMaster.STNDCOST
            ,QuantityOnHand = ISNULL(QuantityMaster.QTYONHND, 0)
        FROM IV00101 AS ItemMaster
            FULL JOIN IV00102 AS QuantityMaster
                ON ItemMaster.ITEMNMBR = QuantityMaster.ITEMNMBR
        WHERE ISNULL(QuantityMaster.RCRDTYPE, 1) = 1
),

-- Item statistics common table expression returns the relevant
-- statistics for each item.
ItemStatistics
AS (
    SELECT
            ItemNumber = SOPLineItems.ITEMNMBR
            ,CountSold = SUM(SOPLineItems.QUANTITY)
            ,AverageUnitPrice =
                CASE
                    WHEN SUM(SOPLineItems.QUANTITY) = 0 THEN 0.00
                    ELSE SUM(SOPLineItems.QUANTITY * SOPLineItems.UNITPRCE) / SUM(SOPLineItems.QUANTITY)
                END
            ,CustomerCount = COUNT(DISTINCT CUSTNMBR)
            ,CustomerPercentage = CAST(COUNT(DISTINCT CUSTNMBR) AS float) /
                CAST ((SELECT COUNT(*) FROM RM00101) AS float)
        FROM SOP30300 AS SOPLineItems
            INNER JOIN SOP30200 AS SOPHeaders
                ON SOPLineItems.SOPTYPE = SOPHeaders.SOPTYPE
                    AND SOPLineItems.SOPNUMBE = SOPHeaders.SOPNUMBE
        WHERE SOPHeaders.SOPTYPE = 3
            AND SOPHeaders.GLPOSTDT BETWEEN @StartDate AND @EndDate
        GROUP BY SOPLineItems.ITEMNMBR
)

-- Final query joins the two CTEs together.
SELECT
        ItemNumber = Items.ItemNumber
        ,ItemClass = Items.ItemClass
        ,ItemDescription = Items.ItemDescription
        ,StandardCost = Items.StandardCost
        ,QuantityOnHand = Items.QuantityOnHand
        ,CountSold = ISNULL(ItemStatistics.CountSold, 0)
        ,AverageUnitPrice = ISNULL(ItemStatistics.AverageUnitPrice, 0)
        ,CustomerCount = ISNULL(ItemStatistics.CustomerCount, 0)
        ,CustomerPercentage = ISNULL(ItemStatistics.CustomerPercentage, 0)
    FROM Items
        LEFT JOIN ItemStatistics
            ON Items.ItemNumber = ItemStatistics.ItemNumber

In the query above we have two common table expressions, Items and ItemStatistics. We declare the first CTE using the WITH clause. Subsequent CTEs are separated with a comma. Note that all CTEs must be declared at the beginning of the query. The final query simply joins the two CTEs together

The biggest advantage to CTEs is their readability, manageability and the ease of troubleshooting. It’s much easier to break the query down and troubleshoot its individual components than to do the same with nested sub-queries. Other things to note about CTEs:

  • A CTE can reference other CTEs defined above it.
  • A query can join to a CTE multiple times. i.e., wherever you might use the same sub-query multiple times you can now define it once in a CTE and simply reference the CTE multiple times.
  • You can reference parameters and variables in a CTE and thus optimize your queries by filtering out records prior to joining them in your main query.
  • If you have any statements above a CTE, such as DECLARE or SET statements in a stored procedure, you must terminate those prior statements using a semicolon.

So let us bid farewell to the much vaunted and occasionally overused sub-query and hello to our new friend, the Common Table Expression.

Next up, the Recursive Common Table Expression.

Thursday, January 14, 2010

Excellent Consulting Tips Article

I’m veering off the technical path here, but I came across this great article on the things NOT to do as a consultant.

10 things you should never do on a consulting job (http://blogs.techrepublic.com.com/10things/?p=1290&tag=nl.e550)

Wednesday, January 13, 2010

My Adventures using eConnect Stored Procedures – Part 2, Error Handling

I’m a big believer in doing things in a consistent fashion, be it formatting SQL code or loading the dishwasher.  To that end, I like to handle errors consistently in my applications and that means using C#’s standard method using Try/Catch. 

The eConnect stored procedures don’t raise an error if one occurs but rather, as mentioned in Part 1, pass back error information in two output parameters, @O_iErrorState, an integer and @oErrString a 255 character varchar. 

Now unfortunately, neither of these variables on their own provide much interesting information.  @O_iErrorState returns a single integer representing an error.  @oErrString sends a list of integers, separated by spaces, if more than one error is occurs.  So where do I get the text that corresponds to the numbers?

The answer is in the taErrorCode table in the DYNAMICS database.  The error codes returned by the eConnect stored procedures correspond to the values in the ErrorCode column.  Using that you can return the stored procedure the error code relates to from the SourceProc column and the error description from the ErrorDesc column.

So that solves the first part of my problem, which is how to retrieve readable text from the error codes, but how do I then go and throw an exception with all of the error information instead of relying on the output parameters?  The first step is to consolidate the information for one or more errors into a single string.  For this I created a user defined scalar function that takes an error code list, as returned by the @oErrString output parameter and returns a single string as follows (best read by cutting and pasting into Management Studio):

CREATE FUNCTION [dbo].[fnvGeteConnectErrorMessage]
(
    @eConnectErrorString nvarchar(255)
)
RETURNS nvarchar(max)
AS
BEGIN
    DECLARE @ErrorString nvarchar(255)
    DECLARE @ErrorMessage nvarchar(max)
    DECLARE @DelimiterIndex int
    DECLARE @CurrentErrorCode nvarchar(10)
    DECLARE @CRLF nvarchar(2)

--Initialize variables
    SET @CRLF = CHAR(13) + CHAR(10)
    SET @ErrorString = @eConnectErrorString
    SET @ErrorMessage = 'The following error(s) were returned:' + @CRLF + @CRLF
    SET @ErrorString = LTRIM(RTRIM(@ErrorString))

--If no errors then return empty string
    IF LEN(@ErrorString) = 0
    BEGIN
        RETURN N''
    END

--Find first delimiter
    SET @DelimiterIndex = CHARINDEX(' ', @ErrorString, 0)
    IF @DelimiterIndex = 0
    BEGIN
--If no delimiter, return entire string
        SET @CurrentErrorCode = @ErrorString
    END
    ELSE
    BEGIN
--If delimiter found, set current error code to first error code in string
        SET @CurrentErrorCode = SUBSTRING(@ErrorString, 1, (@DelimiterIndex - 1))
    END

--Loop until there are no error codes remaining
    WHILE LEN(@CurrentErrorCode) > 0
    BEGIN
--Return error stored procedure and message and append to error message to be returned
        SELECT
                @ErrorMessage = @ErrorMessage + 'Source procedure - ' + ISNULL(SourceProc, '')
                    + @CRLF + 'Error - ' + ISNULL(ErrorDesc, '') + @CRLF + @CRLF
            FROM DYNAMICS.dbo.taErrorCode
            WHERE ErrorCode = @CurrentErrorCode

--Remove current error code from string and get next error code
        SET @ErrorString = RTRIM(LTRIM(SUBSTRING(@ErrorString, (LEN(@CurrentErrorCode) + 1), (LEN(@ErrorString) - LEN(@CurrentErrorCode)))))
        SET @DelimiterIndex = CHARINDEX(' ', @ErrorString, 0)
        IF @DelimiterIndex = 0
        BEGIN
            SET @CurrentErrorCode = @ErrorString
        END
        ELSE
        BEGIN
            SET @CurrentErrorCode = SUBSTRING(@ErrorString, 1, (@DelimiterIndex - 1))
        END
    END

    -- Return the result of the function
    RETURN @ErrorMessage
END

What this function is ultimately doing is taking the space delimited list of error codes, looping through them, retrieving the source procedure and error description, appending it to the overall error message and then returning the whole thing.

Now that I can return a consolidated error string, how do I throw an error?  I handled this by creating a wrapper stored procedure.  This not only allows me to handle errors the way I want, but also have a stored procedure with only the parameters I need in my code, with readable names.  Below is the wrapper stored procedure I created to insert a GL transaction header:

CREATE PROCEDURE [dbo].[spvGLInsertGPGLHeader]
    @BatchNumber char(15)
    ,@JournalEntry int
    ,@JournalReference char(30)
    ,@TransactionDate datetime
AS

    DECLARE @RC int
    DECLARE @ErrorState int
    DECLARE @ErrorString varchar(255)
    DECLARE @ErrorMessage nvarchar(max)

--Execute the taGLTransactionHeaderInsert eConnect stored
--procedure with passed parameters and required defaults.
    EXECUTE @RC = [PHI2].[dbo].[taGLTransactionHeaderInsert]
        @I_vBACHNUMB = @BatchNumber
        ,@I_vJRNENTRY = @JournalEntry
        ,@I_vREFRENCE = @JournalReference
        ,@I_vTRXDATE = @TransactionDate
        ,@I_vTRXTYPE =0
        ,@I_vSERIES = 2
        ,@I_vRequesterTrx = 1
        ,@O_iErrorState = @ErrorState OUTPUT
        ,@oErrString = @ErrorString OUTPUT

--If any errors are returned, get the consolidated error message
--and raise a SQL error.
    IF LEN(@ErrorString) <> 0
    BEGIN
        SET @ErrorMessage = dbo.fnvGeteConnectErrorMessage(@ErrorString)
        RAISERROR (@ErrorMessage, 18, 1)

        RETURN
    END

As you can see, we’re using TSQL’s RAISERROR (doesn’t the missing “E” in the middle bug you a little) to throw an error back to your calling code.  Also note that we have a much cleaner stored procedure to reference from our calling application.

Next up, calling this from an ADO.NET table adaptor in a Dynamics GP .NET add-in.

Tuesday, January 12, 2010

My Adventures using eConnect Stored Procedures – Part 1

OK. So over a month ago I talked about doing an integration using the eConnect stored procedures directly instead of the COM+ or Message Queue interfaces. Well, I’m happy to report that the first one is completed. It’s a GL integration with AA (Analytical Accounting) transactions from an Excel file. Each row represents a single journal entry distribution and has two columns with AA dimensions. Each AA column represents a different dimension with a code for each that is allocated 100% of the distribution amount.

As I noted in the previous post, each eConnect XML node represents a single stored procedure of the same name. In this integration we’re using the following nodes:

  • taGLTransactionHeaderInsert
  • taGLTransactionLineInsert
  • taAnalyticsDistribution

Each of these stored procedures has a set of parameters that matches the parameters of the XML node and follows the same rules in terms of defaults and whether it’s required or not. In addition there are two output parameters at the end of the parameters list, @O_iErrorState, an integer and @oErrString a 255 character varchar. These parameters return information an any errors that occurred in processing.

In addition to the stored procedures that represent the eConnect XML nodes, there was an additional stored procedure, taGetNextJournalEntry, I used to return the next journal entry number. This stored procedure has the following three parameters:

  • @I_vInc_Dec – An integer that is either 1 or 0 to indicate whether to increment or decrement the number (in case you’re rolling back a transaction)
  • @O_vJournalEntryNumber – A 13 character output parameter that returns the next journal entry number.
  • @O_iErrorState – An integer output parameter that returns an error number or 0 if no errors occurred.

A lesson I learned was the importance of running the stored procedures in the correct order. My gut told me to create the header first, then the transaction lines and AA distributions. My gut was wrong.

Actually after creating the header and then the transaction lines I was mostly OK. The transactions did get created with the proper distributions. The only clue that something was wrong was that the batch transaction count and total was incorrect. However, after adding the AA logic I got no AA distributions.

Integrating AA has never failed to frustrate me. A couple of hours of foul language and troubleshooting later I ended up writing a little test app that wrote added the transactions through the COM+ interface. After running a SQL trace in SQL Server Profiler, I discovered that the transaction header insert stored procedure needs to be run last, not first. Once the calls were rearranged all was happy!

What did I learn? When you avoid the COM+/XML parsing/DTC overhead it’s incredibly fast. That you should have a test application ready to create the transactions using the COM+ interface to determine the correct stored procedure order. That you’ll need to handle transactions in your own code instead of relying on the COM+ object and DTC. And that it’s really not that hard to do and a good option if you have direct access to the database.

In future posts I’ll talk about some of the other things I did in this integration including handling the errors returned from the eConnect stored procedures and connecting to the database using the end-user’s GP credentials.

SSRS Expressions Tips & Tricks

Wow!  Time flies when you’re having fun.  Anyway, between taking some time off for the holidays, preparing to take time off for the holidays and digging out of the email from taking time off for the holidays, I guess I’ve neglected the blog a bit.

This will just be a short post.  A colleague of mine, Liz Pfahler, passed this link along to me.  It’s a good comprehensive list of tips and tricks for SSRS expressions.  Everything from how to set alternating row colors, to returning the first or last day of the month. 

There’s a few things I’ve used in the past but there’s definitely some new ones and it’s great to have them all in one place.

Check it out.

http://www.sqlservercentral.com/articles/Reporting+Services+(SSRS)/67660/