When you read sorted UNIQUEIDENTIFIER values in the query results from Microsoft SQL Server, it's not immediately clear how they may be ordered. Alberto Ferrari wrote a blog post about this subject a while ago. However, I didn't find that Alberto's article highlighted the answer to my fundamental questions about UNIQUEIDENTIFIER sorting. Numbering the octets in the UNIQUEIDENTIFIER from left to right as 0..9 then A..F, what is the Most Significant Byte (MSB) to Least Significant Byte (LSB) ordering of the type from SQL Server's perspective?
You can really think of a UNIQUEIDENTIFIER like a really big integer: 128 bits or 16 bytes wide. However, when you look at the 32 hexadecimal digits of a UNIQUEIDENTIFIER expressed as a hyphenated string, it's not clear that SQL is treating this type like a number that we would read from left to right. For example, look at the following two UNIQUEIDENTIFIERs generated by SQL Server's NEWID() function:
Do these look like large integers? Maybe if you took the hyphens out. If they were numbers, which of them would be the larger one? Reading left to right, anyone with a cursory understanding of hexadecimal notation might assume that the first value beginning with EB is larger than the one beginning with 4F. But that's not correct from SQL Server's perspective. The second one is the larger UNIQUEIDENTIFIER to SQL Server. To prove this, run the following query in SQL Server Management Studio:
WITH [UIDs] AS ( -- 0 1 2 3 4 5 6 7 8 9 A B C D E F
SELECT [ID] = '0', [UID] = CAST('01000000-0000-0000-0000-000000000000' AS UNIQUEIDENTIFIER)
UNION SELECT [ID] = '1', [UID] = CAST('00010000-0000-0000-0000-000000000000' AS UNIQUEIDENTIFIER)
UNION SELECT [ID] = '2', [UID] = CAST('00000100-0000-0000-0000-000000000000' AS UNIQUEIDENTIFIER)
UNION SELECT [ID] = '3', [UID] = CAST('00000001-0000-0000-0000-000000000000' AS UNIQUEIDENTIFIER)
UNION SELECT [ID] = '4', [UID] = CAST('00000000-0100-0000-0000-000000000000' AS UNIQUEIDENTIFIER)
UNION SELECT [ID] = '5', [UID] = CAST('00000000-0001-0000-0000-000000000000' AS UNIQUEIDENTIFIER)
UNION SELECT [ID] = '6', [UID] = CAST('00000000-0000-0100-0000-000000000000' AS UNIQUEIDENTIFIER)
UNION SELECT [ID] = '7', [UID] = CAST('00000000-0000-0001-0000-000000000000' AS UNIQUEIDENTIFIER)
UNION SELECT [ID] = '8', [UID] = CAST('00000000-0000-0000-0100-000000000000' AS UNIQUEIDENTIFIER)
UNION SELECT [ID] = '9', [UID] = CAST('00000000-0000-0000-0001-000000000000' AS UNIQUEIDENTIFIER)
UNION SELECT [ID] = 'A', [UID] = CAST('00000000-0000-0000-0000-010000000000' AS UNIQUEIDENTIFIER)
UNION SELECT [ID] = 'B', [UID] = CAST('00000000-0000-0000-0000-000100000000' AS UNIQUEIDENTIFIER)
UNION SELECT [ID] = 'C', [UID] = CAST('00000000-0000-0000-0000-000001000000' AS UNIQUEIDENTIFIER)
UNION SELECT [ID] = 'D', [UID] = CAST('00000000-0000-0000-0000-000000010000' AS UNIQUEIDENTIFIER)
UNION SELECT [ID] = 'E', [UID] = CAST('00000000-0000-0000-0000-000000000100' AS UNIQUEIDENTIFIER)
UNION SELECT [ID] = 'F', [UID] = CAST('00000000-0000-0000-0000-000000000001' AS UNIQUEIDENTIFIER)
)
SELECT [ID], [UID] FROM [UIDs] ORDER BY [UID] DESC
This query is based on the one that Alberto Ferrari showed in his blog. But this one goes after the answer that I'm interested in a bit more directly. The query creates a rowset that associates each distinct octet in UNIQUEIDENTIFIERs with an order identifier, 0..9 then A..F. Then, by ordering the results of the query by the the UNIQUEIDENTIFIER values, we should be able to tell which octets are more significant numerically than the others according to SQL Server. Here's what the results look like:
A 00000000-0000-0000-0000-010000000000
B 00000000-0000-0000-0000-000100000000
C 00000000-0000-0000-0000-000001000000
D 00000000-0000-0000-0000-000000010000
E 00000000-0000-0000-0000-000000000100
F 00000000-0000-0000-0000-000000000001
8 00000000-0000-0000-0100-000000000000
9 00000000-0000-0000-0001-000000000000
7 00000000-0000-0001-0000-000000000000
6 00000000-0000-0100-0000-000000000000
5 00000000-0001-0000-0000-000000000000
4 00000000-0100-0000-0000-000000000000
3 00000001-0000-0000-0000-000000000000
2 00000100-0000-0000-0000-000000000000
1 00010000-0000-0000-0000-000000000000
0 01000000-0000-0000-0000-000000000000
Do you see the pattern? Reading the IDs in the left column from top to bottom, we can see that the most significant octet is in the A position and the least significant one is at position 0. The pattern that represents the most significant to the least significant bytes reads as ABCDEF8976543210 which is not exactly what you might expect. Looking back at the new UNIQUEIDENTIFIER values shown above, now we know why EBC23DE8-DC16-4A8B-8E11-6B1509B0DAED is considered to be less than 4F899E16-9D3E-4EA6-8A32-749A3FCAD865 numerically by SQL Server. The 6B octet at position A in the first UNIQUEIDENTIFER is less than the 74 octet at the same position in the second value. In fact, if we were to reorder the all of the bytes in those two UNIQUEIDENTIFIERs left to right according to the way that SQL Server really "sees" them numerically, they would need to undergo these transformations:
So, now we can read SQL Server UNIQUEIDENTIFIERs numerically. That will come in handy one day so tuck it into the back of your brain. Enjoy.