Question Recursive CTE Solution

elroyskimms

Member
Joined
Jul 10, 2015
Messages
19
Programming Experience
10+
I'm having trouble wrapping my head around Recursive CTE's in SQL Server. My dataset (simplified version below for copy/paste) is a large table containing hundreds of thousands of chains. Each chain is made up of a number of components, linked together in a specific order for each chain. I've added both a ParentID column AND a DisplayOrder column to the table. I can keep one or both of those columns as needed.

The goal is to take a 2[SUP]nd[/SUP] list of chains (input) and search/compare them to the existing chains. The input chains can be 1 component chains or they can be multiple components linked together in a specific order. The end results need to show which chains in the main table contain matches for the input chains, and they must match the components in the same order.

If Chain #1 in the table is A, B, C, D, E and I am searching for B, it will match. If I am searching for C, D, E, it will also match. But if I am searching for A, C, D, it will not match as the input chain is not the same order as the chain in the table. I also want to be able to use multiple chains in the input at once. So I can search for B and B, C, D and also A, C, D at the same time and generate the results.

I have a solution already for this using a recursive function in VB.Net which grabs each match where the ParentID = 0 (start of a chain) and then navigates down each chain until it reaches the end OR it finds something that doesn't match. It's more or less a brute force procedural solution. It works, but it handles one chain at a time. That's not very efficient when matching a list of 30,000 chains on the input and 500,000 chains in the database table. I can also do this with an SQL Cursor, but again, that's a very procedural solution. I'm hoping to find a Recursive CTE that will be more efficient and allow me to search multiple input chains at the same time, rather than iterating through each input chain individually. Inline table functions might as well be sorcery to me. This is not my area of expertise. Any help would be greatly appreciated.

-E

Copy/Paste

VB.NET:
DECLARE @Chains TABLE (ID int, Component varchar(20), ParentID int, DisplayOrder int, ChainID int)
DECLARE @SearchChains TABLE (Component varchar(20), DisplayOrder int, GroupID int)
 
INSERT INTO @Chains (ID, Component, ParentID, DisplayOrder, ChainID) VALUES (1, 'Head bone', 0, 0, 1)
INSERT INTO @Chains (ID, Component, ParentID, DisplayOrder, ChainID) VALUES (2, 'Neck bone', 1, 1, 1)
INSERT INTO @Chains (ID, Component, ParentID, DisplayOrder, ChainID) VALUES (3, 'Shoulder bone', 2, 2, 1)
INSERT INTO @Chains (ID, Component, ParentID, DisplayOrder, ChainID) VALUES (4, 'Back bone', 3, 3, 1)
INSERT INTO @Chains (ID, Component, ParentID, DisplayOrder, ChainID) VALUES (5, 'Hip bone', 4, 4, 1)
INSERT INTO @Chains (ID, Component, ParentID, DisplayOrder, ChainID) VALUES (6, 'Thigh bone', 5, 5, 1)
INSERT INTO @Chains (ID, Component, ParentID, DisplayOrder, ChainID) VALUES (7, 'Knee bone', 6, 6, 1)
INSERT INTO @Chains (ID, Component, ParentID, DisplayOrder, ChainID) VALUES (8, 'Shin bone', 7, 7, 1)
INSERT INTO @Chains (ID, Component, ParentID, DisplayOrder, ChainID) VALUES (9, 'Ankle bone', 8, 8, 1)
INSERT INTO @Chains (ID, Component, ParentID, DisplayOrder, ChainID) VALUES (10, 'Heel bone', 9, 9, 1)
INSERT INTO @Chains (ID, Component, ParentID, DisplayOrder, ChainID) VALUES (11, 'Foot bone', 10, 10, 1)
INSERT INTO @Chains (ID, Component, ParentID, DisplayOrder, ChainID) VALUES (12, 'Toe bone', 11, 11, 1)
INSERT INTO @Chains (ID, Component, ParentID, DisplayOrder, ChainID) VALUES (13, 'Knee bone', 0, 0, 2)
INSERT INTO @Chains (ID, Component, ParentID, DisplayOrder, ChainID) VALUES (14, 'Shin bone', 13, 1, 2)
INSERT INTO @Chains (ID, Component, ParentID, DisplayOrder, ChainID) VALUES (15, 'Head bone', 0, 0, 3)
INSERT INTO @Chains (ID, Component, ParentID, DisplayOrder, ChainID) VALUES (16, 'Toe bone', 15, 1, 3)
 
 
INSERT INTO @SearchChains(Component, DisplayOrder, GroupID) VALUES ('Head bone', 0, 1)
INSERT INTO @SearchChains(Component, DisplayOrder, GroupID) VALUES ('Neck bone', 1, 1)
INSERT INTO @SearchChains(Component, DisplayOrder, GroupID) VALUES ('Shoulder bone', 2, 1)
 
--Perform recursive search with 1 3-part chain
--Should only match ChainID 1
 
DELETE FROM @SearchChains
 
INSERT INTO @SearchChains(Component, DisplayOrder, GroupID) VALUES ('Head bone', 0, 2)
 
--Perform recursive search with 1 1-part chain
--Should match both ChainID 1 and ChainID 3
 
DELETE FROM @SearchChains
 
INSERT INTO @SearchChains(Component, DisplayOrder, GroupID) VALUES ('Shin bone', 0, 3)
INSERT INTO @SearchChains(Component, DisplayOrder, GroupID) VALUES ('Head bone', 0, 4)
INSERT INTO @SearchChains(Component, DisplayOrder, GroupID) VALUES ('Neck bone', 1, 4)
INSERT INTO @SearchChains(Component, DisplayOrder, GroupID) VALUES ('Shoulder bone', 2, 4)
 
--Perform recursive search with both 1-part and 3-part chains
--SearchChain GroupID 3 should match ChainID 1 and ChainID 2
--SearchChain GroupID 4 should match ChainID 1
 
Working Backward From Results

Sometimes I find it helpful to generate the desired output and work backwards. In this case, it's not helping me very much, but I thought I'd add this here in case it helps anyone else.

VB.NET:
|              INPUT               |                 DATABASE CHAINS                |
|Component    |DisplayOrder|GroupID|ID  |Component    |DisplayOrder|ChainID|IsMatch?|
|Shin bone    |0           |3      |8   |Shin bone    |7           |1      |1       |
|Shin bone    |0           |3      |14  |Shin bone    |1           |2      |1       |
|Head bone    |0           |4      |1   |Head bone    |0           |1      |1       |
|Neck bone    |1           |4      |2   |Neck bone    |1           |1      |1       |
|Shoulder bone|2           |4      |3   |Shoulder bone|2           |1      |1       |
|Head bone    |0           |4      |1   |Head bone    |0           |3      |1       |
|Neck bone    |1           |4      |2   |Toe bone     |1           |3      |0       |
|Shoulder bone|2           |4      |NULL|NULL         |NULL        |3      |0       |

This would be the results that I am trying to achieve. From here, I can do a bitwise AND on the IsMatch column (or SUM and compare it to a COUNT) to make sure that all of the items in the Input chain are matched and thus eliminate anything that is not a complete match. Of course, there may be a better alternative to all of this, and I'm happy to consider it.

I can get the desired output for the Anchor query of the CTE. I can use:

VB.NET:
SELECT SC.Component, SC.DisplayOrder, SC.GroupID, C.ID, C.Component, C.ParentID, C.DisplayOrder, C.ChainID
FROM @SearchChains SC
LEFT OUTER JOIN @Chains C ON SC.Component = C.Component
    AND SC.DisplayOrder = 0
ORDER BY SC.GroupID ASC, C.ChainID ASC, SC.DisplayOrder ASC

and get

VB.NET:
Component            DisplayOrder GroupID     ID          Component            ParentID    DisplayOrder ChainID
-------------------- ------------ ----------- ----------- -------------------- ----------- ------------ -----------
Neck bone            1            1           NULL        NULL                 NULL        NULL         NULL
Shoulder bone        2            1           NULL        NULL                 NULL        NULL         NULL
Head bone            0            1           1           Head bone            0           0            1

But using an outer join in the recursive query of the CTE is not allowed. So I'm not sure how to traverse the Chain and fill in the NULL fields from the child components. Maybe a PARTITION is needed here? I have no idea. Set theory is a class I wish they would have offered back when I was in college.

-E
 
Close But No Cigar

I've made a little bit of progress on this one. I feel like there is something simple I am missing, and I'll probably look like an idiot when the solution presents itself. I've reduced the sample dataset and query and the results are ALMOST what I am looking for.

VB.NET:
DECLARE @Chains TABLE (ID int, Component varchar(20), ParentID int, DisplayOrder int, ChainID int)
DECLARE @SearchChains TABLE (Component varchar(20), DisplayOrder int, GroupID int)

INSERT INTO @Chains (ID, Component, ParentID, DisplayOrder, ChainID) VALUES (1, 'Head bone', 0, 0, 1)
INSERT INTO @Chains (ID, Component, ParentID, DisplayOrder, ChainID) VALUES (2, 'Neck bone', 1, 1, 1)

INSERT INTO @SearchChains(Component, DisplayOrder, GroupID) VALUES ('Head bone', 0, 1)
INSERT INTO @SearchChains(Component, DisplayOrder, GroupID) VALUES ('Neck bone', 1, 1)
INSERT INTO @SearchChains(Component, DisplayOrder, GroupID) VALUES ('Shoulder bone', 2, 1);

WITH cteMatching (ID, Component, ParentID, DisplayOrder, ChainID, RecLevel)
AS
(
SELECT  C.ID, C.Component, C.ParentID, C.DisplayOrder, C.ChainID, 1 as RecLevel
FROM @Chains C
WHERE DisplayOrder = 0
UNION ALL
SELECT C.ID, C.Component, C.ParentID, C.DisplayOrder, C.ChainID, cte.RecLevel + 1
FROM @Chains C
INNER JOIN cteMatching cte ON C.ParentID = cte.ID
)
SELECT SC.Component, SC.DisplayOrder, SC.GroupID, cte.ID, cte.Component, cte.ParentID,
 ISNULL(cte.DisplayOrder, 2147483647) as DisplayOrder, cte.ChainID, cte.RecLevel,
 ISNULL((SELECT 1 WHERE SC.Component = cte.Component), 0) as IsMatch
FROM @SearchChains SC
LEFT OUTER JOIN cteMatching cte ON SC.Component = cte.Component
ORDER BY SC.GroupID ASC, cte.ChainID ASC, SC.DisplayOrder ASC, cte.DisplayOrder ASC

The results:
VB.NET:
[SIZE=1][SIZE=1]Component            DisplayOrder GroupID     ID          Component            ParentID    DisplayOrder ChainID     RecLevel    IsMatch
-------------------- ------------ ----------- ----------- -------------------- ----------- ------------ ----------- ----------- -----------
Shoulder bone        2            1           NULL        NULL                 NULL        2147483647   NULL        NULL        0
Head bone            0            1           1           Head bone            0           0            1           1           1
Neck bone            1            1           2           Neck bone            1           1            1           2           1
[/SIZE][/SIZE]

The missing piece is that I need the ChainID that is null to match the ChainID of the rest of the Chain. I need to know that this particular chain wasn't long enough and therefore it is not a match. Any ideas?

-E
 
Final Answer

I was able to get some help on StackOverflow. You can read all the details here: sql server - SQL Recursive CTE Linked Chains - Stack Overflow The end results is that while a Recursive CTE would work, their suggestion was a different approach entirely. Here is the final result:

VB.NET:
DECLARE @Chains TABLE (ID int IDENTITY(1,1), ComponentID int, DisplayOrder int, ChainID int)
DECLARE @SearchChains TABLE (ComponentID int, DisplayOrder int, GroupID int)
DECLARE @Components TABLE (ID int IDENTITY(10,1), Component varchar(15))

INSERT INTO @Components (Component) VALUES ('Head bone')
INSERT INTO @Components (Component) VALUES ('Neck bone')
INSERT INTO @Components (Component) VALUES ('Shoulder bone')
INSERT INTO @Components (Component) VALUES ('Back bone')
INSERT INTO @Components (Component) VALUES ('Hip bone')
INSERT INTO @Components (Component) VALUES ('Thigh bone')
INSERT INTO @Components (Component) VALUES ('Knee bone')
INSERT INTO @Components (Component) VALUES ('Shin bone')
INSERT INTO @Components (Component) VALUES ('Ankle bone')
INSERT INTO @Components (Component) VALUES ('Heel bone')
INSERT INTO @Components (Component) VALUES ('Foot bone')
INSERT INTO @Components (Component) VALUES ('Toe bone')
INSERT INTO @Components (Component) VALUES ('Leg bone')

INSERT INTO @Chains (ComponentID, DisplayOrder, ChainID) VALUES (10, 0, 1)
INSERT INTO @Chains (ComponentID, DisplayOrder, ChainID) VALUES (11, 1, 1)
INSERT INTO @Chains (ComponentID, DisplayOrder, ChainID) VALUES (12, 2, 1)
INSERT INTO @Chains (ComponentID, DisplayOrder, ChainID) VALUES (13, 3, 1)
INSERT INTO @Chains (ComponentID, DisplayOrder, ChainID) VALUES (14, 4, 1)
INSERT INTO @Chains (ComponentID, DisplayOrder, ChainID) VALUES (10, 0, 2)
INSERT INTO @Chains (ComponentID, DisplayOrder, ChainID) VALUES (12, 1, 2)

INSERT INTO @SearchChains(ComponentID, DisplayOrder, GroupID) VALUES (13, 0, 1)
INSERT INTO @SearchChains(ComponentID, DisplayOrder, GroupID) VALUES (14, 1, 1)
INSERT INTO @SearchChains(ComponentID, DisplayOrder, GroupID) VALUES (22, 2, 1)
INSERT INTO @SearchChains(ComponentID, DisplayOrder, GroupID) VALUES (10, 0, 2)
INSERT INTO @SearchChains(ComponentID, DisplayOrder, GroupID) VALUES (11, 1, 2)
INSERT INTO @SearchChains(ComponentID, DisplayOrder, GroupID) VALUES (10, 0, 3)
;

WITH ChainPath AS
(
    SELECT ChainID
        ,(
            SELECT CAST('|' + co.Component AS NVARCHAR(MAX))
            FROM @Chains AS ch
                INNER JOIN @Components co ON ch.ComponentID = co.ID
            WHERE ch.ChainID = cc.ChainID
            FOR XML PATH('')
        ) as ChainPath
    FROM @Chains cc
    GROUP BY ChainID
)
,SearchPath AS
(
    SELECT GroupID
          ,(
            SELECT CAST('|' + c.Component AS NVARCHAR(MAX)) 
            FROM @SearchChains AS p
                INNER JOIN @Components c ON p.ComponentID = c.ID
            WHERE p.GroupID=sc.GroupID
            FOR XML PATH('')
           ) AS SearchPath
    FROM @SearchChains AS sc
    GROUP BY GroupID
)


SELECT SP.*, CP.*
FROM ChainPath CP
CROSS JOIN SearchPath SP
WHERE ChainPath + '|' LIKE '%' + SearchPath + '|' + '%'
 
Back
Top