Home » excel » vba – how to build parent-child data table in excel?

vba – how to build parent-child data table in excel?

Posted by: admin March 9, 2020 Leave a comment


I have data in this fashion:

Parent  |  Data
Root    | AAA  
AAA     | BBB  
AAA     | CCC  
AAA     | DDD  
BBB     | EEE  
BBB     | FFF  
CCC     | GGG  
DDD     | HHH  

Which needs to be converted into a below like fashion. This basically needs to end up in an excel spreadsheet. How can I convert the above data into the following:


1   |  2  | 3

AAA | BBB |  
AAA | CCC |  
AAA | DDD |  
How to&Answers:

I started and finished the answer below late last night. In the cold light of day it needs at least some expansion.

Sheet2, source data, before the macro is run:

Sheet2, source data, before the macro is run

Sheet3, result, after the macro is run:

Sheet3, result, after the macro is run

The basis of the method is to create arrays that link each child to its parent. The macro then follows the chain from each child up its ancesters growing a string: child, parent|child, grandparent|parent|child, … After sorting, this is the result ready for saving.

With the example data, Steps 1 and 3 could be combined because all the names and rows are in alphabetic order. Building the list of names in one step and linking them in another makes for a simple macro regardless of the sequence. On reflection, I am not sure if step 2, sorting the names, is necessary. Sorting the ancester name lists, step 5, is necessary. Sorting Sheet3 after output is not possible because there might be more than three levels.

I am not sure if this counts as an elegant solution but its pretty simple.

I have placed the source data in worksheet Sheet2 and I output to Sheet3.

There are 7 stages:

  1. Build array Child containing every name.
  2. Sort array Child. I have provided a simple sort which is adequate for a demonstration. Better sorts are available on the internet if you have enough names to require it.
  3. Build array Parent such that Parent(N) is the index within Child of the parent of Child(N).
  4. Build array ParentName by following the pointers in array Parent from child to parent to grandparent to … While doing this, determine the maximum number of levels.
  5. Sort array ParentName.
  6. Build a header row in the output sheet.
  7. Copy ParentName to the output sheet.

I believe I have included enough comments for the code to be understandable.

Option Explicit
Sub CreateParentChildSheet()

  Dim Child() As String
  Dim ChildCrnt As String
  Dim InxChildCrnt As Long
  Dim InxChildMax As Long
  Dim InxParentCrnt As Long
  Dim LevelCrnt As Long
  Dim LevelMax As Long
  Dim Parent() As Long
  Dim ParentName() As String
  Dim ParentNameCrnt As String
  Dim ParentSplit() As String
  Dim RowCrnt As Long
  Dim RowLast As Long

  With Worksheets("Sheet2")
    RowLast = .Cells(Rows.Count, 1).End(xlUp).Row
    ' If row 1 contains column headings, if every child has one parent
    ' and the ultimate ancester is recorded as having a parent of "Root",
    ' there will be one child per row
    ReDim Child(1 To RowLast - 1)

    InxChildMax = 0
    For RowCrnt = 2 To RowLast
      ChildCrnt = .Cells(RowCrnt, 1).Value
      If LCase(ChildCrnt) <> "root" Then
        Call AddKeyToArray(Child, ChildCrnt, InxChildMax)
      End If
      ChildCrnt = .Cells(RowCrnt, 2).Value
      If LCase(ChildCrnt) <> "root" Then
        Call AddKeyToArray(Child, ChildCrnt, InxChildMax)
      End If

    ' If this is not true, one of the assumptions about the
    ' child-parent table is false
    Debug.Assert InxChildMax = UBound(Child)

    Call SimpleSort(Child)

    ' Child() now contains every child plus the root in
    ' ascending sequence.

    ' Record parent of each child
      ReDim Parent(1 To UBound(Child))
      For RowCrnt = 2 To RowLast
        If LCase(.Cells(RowCrnt, 1).Value) = "root" Then
          ' This child has no parent
          Parent(InxForKey(Child, .Cells(RowCrnt, 2).Value)) = 0
          ' Record parent for child
          Parent(InxForKey(Child, .Cells(RowCrnt, 2).Value)) = _
                           InxForKey(Child, .Cells(RowCrnt, 1).Value)
        End If

  End With

  ' Build parent chain for each child and store in ParentName
  ReDim ParentName(1 To UBound(Child))

  LevelMax = 1

  For InxChildCrnt = 1 To UBound(Child)
    ParentNameCrnt = Child(InxChildCrnt)
    InxParentCrnt = Parent(InxChildCrnt)
    LevelCrnt = 1
    Do While InxParentCrnt <> 0
      ParentNameCrnt = Child(InxParentCrnt) & "|" & ParentNameCrnt
      InxParentCrnt = Parent(InxParentCrnt)
      LevelCrnt = LevelCrnt + 1
    ParentName(InxChildCrnt) = ParentNameCrnt
    If LevelCrnt > LevelMax Then
      LevelMax = LevelCrnt
    End If

  Call SimpleSort(ParentName)

  With Worksheets("Sheet3")
    For LevelCrnt = 1 To LevelMax
      .Cells(1, LevelCrnt) = "Level " & LevelCrnt
    ' Ignore entry 1 in ParentName() which is for the root
    For InxChildCrnt = 2 To UBound(Child)
      ParentSplit = Split(ParentName(InxChildCrnt), "|")
      For InxParentCrnt = 0 To UBound(ParentSplit)
        .Cells(InxChildCrnt, InxParentCrnt + 1).Value = _

  End With

End Sub

Sub AddKeyToArray(ByRef Tgt() As String, ByVal Key As String, _
                                                  ByRef InxTgtMax As Long)

  ' Add Key to Tgt if it is not already there.

  Dim InxTgtCrnt As Long

  For InxTgtCrnt = LBound(Tgt) To InxTgtMax
    If Tgt(InxTgtCrnt) = Key Then
      ' Key already in array
      Exit Sub
    End If
  ' If get here, Key has not been found
  InxTgtMax = InxTgtMax + 1
  If InxTgtMax <= UBound(Tgt) Then
    ' There is room for Key
    Tgt(InxTgtMax) = Key
  End If

End Sub

Function InxForKey(ByRef Tgt() As String, ByVal Key As String) As Long

  ' Return index entry for Key within Tgt

  Dim InxTgtCrnt As Long

  For InxTgtCrnt = LBound(Tgt) To UBound(Tgt)
    If Tgt(InxTgtCrnt) = Key Then
      InxForKey = InxTgtCrnt
      Exit Function
    End If

  Debug.Assert False        ' Error

End Function
Sub SimpleSort(ByRef Tgt() As String)

  ' On return, the entries in Tgt are in ascending order.

  ' This sort is adequate to demonstrate the creation of a parent-child table
  ' but much better sorts are available if you google for "vba sort array".

  Dim InxTgtCrnt As Long
  Dim TempStg As String

  InxTgtCrnt = LBound(Tgt) + 1
  Do While InxTgtCrnt <= UBound(Tgt)
    If Tgt(InxTgtCrnt - 1) > Tgt(InxTgtCrnt) Then
      ' The current entry belongs before the previous entry
      TempStg = Tgt(InxTgtCrnt - 1)
      Tgt(InxTgtCrnt - 1) = Tgt(InxTgtCrnt)
      Tgt(InxTgtCrnt) = TempStg
      ' Check the new previous enty against its previous entry if there is one.
      InxTgtCrnt = InxTgtCrnt - 1
      If InxTgtCrnt = LBound(Tgt) Then
        ' Prevous entry is start of array
        InxTgtCrnt = LBound(Tgt) + 1
      End If
      ' These entries in correct sequence
      InxTgtCrnt = InxTgtCrnt + 1
    End If

End Sub


I have a simpler solution using TreeView object. If you don’t mind the order of the nodes to be difference and using MSCOMCTL.OCX, please use below code.

Requires MSOCOMCTL.OCX to be registered.
enter image description here

Consider this data:

Using a TreeView (adding to a UserForm for visualization, code not shown):

Code to dump the tree data (normal module, use TreeToText):

Option Explicit

Private oTree As TreeView

Private Sub CreateTree()
    On Error Resume Next ' <-- To keep running even error occurred
    Dim oRng As Range, sParent As String, sChild As String

    Set oRng = ThisWorkbook.Worksheets("Sheet1").Range("A2") ' <-- Change here to match your Root cell
    Do Until IsEmpty(oRng)
        sParent = oRng.Value
        sChild = oRng.Offset(0, 1).Value
        If InStr(1, sParent, "root", vbTextCompare) = 1 Then
            oTree.Nodes.Add Key:=sChild, Text:=sChild
            oTree.Nodes.Add Relative:=oTree.Nodes(sParent).Index, Relationship:=tvwChild, Key:=sChild, Text:=sChild
        End If
        '--[ ERROR HANDLING HERE ]--
        ' Invalid (Repeating) Child will have the Row number appended
        If Err.Number = 0 Then
            Set oRng = oRng.Offset(1, 0) ' Move to Next Row
            oRng.Offset(0,1).Value = sChild & " (" & oRng.Row & ")"
        End If
    Set oRng = Nothing
End Sub

Sub TreeToText()
    Dim oRng As Range, oNode As Node, sPath As String, oTmp As Variant

    ' Create Tree from Data
    Set oTree = New TreeView
    ' Range to dump Tree Data
    Set oRng = ThisWorkbook.Worksheets("Sheet1").Range("D2") ' <-- Change here
    For Each oNode In oTree.Nodes
        sPath = oNode.FullPath
        If InStr(1, sPath, oTree.PathSeparator, vbTextCompare) > 0 Then
            oTmp = Split(sPath, oTree.PathSeparator)
            oRng.Resize(, UBound(oTmp) + 1).Value = oTmp
            Set oRng = oRng.Offset(1, 0)
        End If
    Set oRng = Nothing
    Set oTree = Nothing
End Sub

Output of code (hard code to D2):
Macro Output

If you have a very large data, you better off load the Range to memory first.