Home » excel » excel – recursive tree-like parsing with VBA

excel – recursive tree-like parsing with VBA

Posted by: admin May 14, 2020 Leave a comment

Questions:

I have the following input & output data (Sheet1 from 1-19, Sheet2 from 21+, followed by output)
https://ethercalc.org/bzrwyz8bsail (Note that the children are aligned to the right instead of having 2 spaces which is the formatting the script does)

I have the following VBA script which parses the parents and items and writes to Sheet 2:

Sub newlist()
    Dim w1 As Worksheet
    Dim w2 As Worksheet
    Dim Ide As String
    Dim k As Long
    Dim kk As Long
    Dim n As Long
    Dim entity As String

    Set w1 = Sheets("Sheet1")
    Set w2 = Sheets("Sheet2")
    w2.Cells(1, 1).Value = w1.Cells(1, 8).Value
    w2.Cells(1, 2).Value = w1.Cells(1, 10).Value
    c = 0
    Ide = Cells(1, 1).Value
    w1.Activate
    n = Cells(Rows.Count, 1).End(xlUp).row
    k = 3
    kk = 1
    For i = 2 To n
        If w1.Cells(i, 8).Value = Ide Then
            entity= w1.Cells(i, 10).Value
            entityString = "  " & entity
            w2.Cells(kk + 1, 1).Value = entityString
            kk = kk + 1
            k = k + 1
        Else
            kk = kk + 1
            k = 3
            Ide = w1.Cells(i, 8).Value
            entity= w1.Cells(i, 10).Value
            w2.Cells(kk, 1).Value = Ide
            kk = kk + 1
            entityString = "  " & entity
            w2.Cells(kk, 1).Value = entityString
        End If
        Next
    End Sub

As one can see in the output, parents that are a child themselves are not written under their parent. For instance, the first Papa’s children should be written under Root’s child Papa, and no name should be repeated twice. Another instance is how Echo’s children should be under Echo, rather than being repeated.

How would I go about handling this using recursion? It seems like iteration is not effective.

How to&Answers:

I’m really weak about recursion, but here’s a go at it. Output below is from the Debug.Print statement:

Root
  Lima
    Delta
    Echo
      Foxtrot
      Golf
      Hotel
      India
      Juliett
      Kilo
  Mike
  November
  Oscar
    Papa
      Alpha
      Bravo
      Charlie
    Quebec

And the output on Sheet 2 with the indent levels:

enter image description here

Recursion happens when the method ProcessItem calls itself within the For Each v In dict(name) loop:

Option Explicit
Sub newlist()
    Dim w1 As Worksheet, w2 As Worksheet
    Dim num_rows
    Dim parent As Range, parentName As String
    Dim parentRange As Range, childrenRange As Range
    Dim childCount As Long
    Dim p As Variant

    Dim dict As Object
    Set dict = CreateObject("Scripting.Dictionary")

    Set w1 = Sheets("Sheet6")
    num_rows = w1.Cells(Rows.Count, 1).End(xlUp).row
    'If there's no parentName column, we can't continue.
    If w1.Rows(1).Find("parentName") Is Nothing Then Exit Sub
    Set parentRange = w1.Rows(1).Find("parentName").Offset(1).Resize(num_rows - 1, 1)
    'If there's no Root level, how do we know where to start?
    If parentRange.Find("Root") Is Nothing Then Exit Sub

    For Each parent In parentRange
        If Not dict.Exists(parent.Value) Then
            childCount = Application.WorksheetFunction.CountIf(parentRange, parent.Value)
            Set childrenRange = parent.Offset(, 2).Resize(childCount, 1)
            dict.Add parent.Value, Application.Transpose(Application.Transpose(childrenRange.Value))
        End If
    Next
    Set w2 = Sheets.Add
    ' Recursive method to traverse our dictionary, beginning at Root element.
    Call ProcessItem("Root", dict, w2, 2)

    w2.Cells(1, 1).Value = w1.Cells(1, 8).Value
    w2.Cells(1, 2).Value = w1.Cells(1, 10).Value

End Sub
Private Sub ProcessItem(name As String, dict As Object, ws As Worksheet, row_num As Long, Optional indent As Long = 0)
Dim output As String, v
' add spaces to indent the output string:
output = WorksheetFunction.Rept(" ", indent) & name
Debug.Print output
' write output to the new worksheet:
ws.Cells(row_num, 1).Value = output
row_num = row_num + 1
If Not dict.Exists(name) Then
    'we're at a terminal element, a child with no children.
    Exit Sub
Else
    For Each v In dict(name)
        ' ## RECURSION ##
        Call ProcessItem(CStr(v), dict, ws, row_num, indent + 2)
    Next
End If

End Sub

Follow-up:

If you wanted to track the parent name also (ex. “parent.child”) then I think you could do it like this (untested):

Make your initial call like so — you don’t actually need to name the argument in the function call, but I notated it like this just to illustrate:

Call ProcessItem(parentName:="", "Root", dict, w2, 2)

Then the function needs to be modified slightly:

Private Sub ProcessItem(parentName as String, name As String, dict As Object, ws As Worksheet, row_num As Long, Optional indent As Long = 0)
Dim output As String, v
output = IIF(parentName = "", name, parentName & "." & name)
output = WorksheetFunction.Rept(" ", indent) & output
Debug.Print output
' write output to the new worksheet:
ws.Cells(row_num, 1).Value = output
row_num = row_num + 1
If Not dict.Exists(name) Then
    'we're at a terminal element, a child with no children.
    Exit Sub
Else
    For Each v In dict(name)
        ' ## RECURSION ##
        Call ProcessItem(name, CStr(v), dict, ws, row_num, indent + 2)
    Next
End If

End Sub