Finding Solutions In Problems

I’m always looking for novel ways to accomplish something, even it’s totally inefficient… This sounds familiar.

Interestingly, the solution I was experimenting with in the last post is closely related to a problem I took up for resolution today.  In the inefficiency of the code, I found a very clever and unique way of splitting a string on a delimiter like a comma.  Now in this particular case, there was a much more efficient way to accomplish the end result, but I can appreciate the creativity in this query.

This is the code that was focused on:

declare @SortExpression VARCHAR(100) = 'state,zip',@SortOrder VARCHAR(10) = 'ASC'

SELECT STUFF(
(SELECT ', ' + Parsed 
FROM (
    SELECT SUBSTRING(@SortExpression + ',', Number,
        CHARINDEX(',', @SortExpression + ',',Number) - Number) + ' ' + @SortOrder AS Parsed
    FROM Common.dbo.Numbers
    WHERE Number <= LEN(@SortExpression)
    AND SUBSTRING(',' + @SortExpression, Number, 1) = ','
    ) t2
FOR XML PATH(''),TYPE).value('.[1]', 'varchar(MAX)'),1,2,'')

Whenever you see three SELECT keywords and XML together to result in a single value, something seems out of place.  This code takes two strings, “state,zip” and “ASC” and turns it into a single string “state ASC, zip ASC”.  That’s a lot of code for just that.

The interesting part of that query is in the middle, where the query accesses the a database with a table called Numbers which just has a list of numbers from 1 to 10000.  In another post, I created a list of numbers, the exact same thing, in memory using UNPIVOT and a CTE.  I could’ve done the exact same thing to fix this, but the eventual solution I came up with does the same with no recordsets at all to deal with.

Drilling into the inner query,

SELECT SUBSTRING(@SortExpression + ',', Number,
    CHARINDEX(',', @SortExpression + ',',Number) - Number) + ' ' + @SortOrder AS Parsed
FROM Common.dbo.Numbers
WHERE Number <= LEN(@SortExpression)
AND SUBSTRING(',' + @SortExpression, Number, 1) = ','

the thing I find so clever about this query, despite its inefficiency, is the approach it takes.  It uses a sliding window of characters, breaking on the comma, and looking for a trailing comma (that was manually added as a terminator).  When you see the results without the SUBSTRING criteria in the WHERE clause, it looks like:

Parsed

state ASC
state ASC
tate ASC
ate ASC
te ASC
e ASC
ASC
zip ASC
ip ASC
p ASC

And with the SUBSTRING check, the only rows returned are the ones that have a trailing comma.  Code like this makes your head hurt.  In fact, it’s so complex, you almost have to accept it at face value, thinking it’s just so complex, you’d better not touch it.  That is when you isolate and experiment, get the same results, prove it out with different values, and improve it.

So what’s the best replacement I could come up with?

select replace(@SortExpression,',',' ' +@Sortorder + ',') +
case when @SortExpression<>'' then ' ' + @Sortorder else '' end

One complete statement.  Whee!