C#: efficient way to convert a list of paths to XML

Posted by: admin, February 24, 2020

Questions:

I have a list of folder paths coming from a database that I need to export to xml. The raw data looks as follows:

[image of the raw data not included]

What I need to do is create a tree-like structure in the XML, similar to this:

 - networkAdd
    - users
        - test1
           - delete unicode character test
                 - character test 1
                        - linked to folder
                 - character test 2
                 - character test 3
 - sp2013 
    - newTestsite
        - newTestLib
           - sampleFolder
           - Renamed at source again
           - SecurityTest2013Folder
        - Shared Documents
           - sample.folder

I currently have an efficient XML write method available, but it requires a TreeView. I converted the list above (coming from the database) into a TreeView that works with this method, but having to build the TreeView first is inefficient. I use this code:

public static TreeView PopulateTreeView(IEnumerable<FolderInfo> paths)
{
    var treeView = new TreeView();
    treeView.PathSeparator = "\\";

    TreeNode lastNode = null;
    string subPathAgg;
    string lastRootFolder = null;

    foreach (var item in paths)
    {
        var path = item.FolderName; // folder path.

        if (lastRootFolder != item.FolderRoot)
        {
            lastRootFolder = item.FolderRoot;
            lastNode = null;
        }

        subPathAgg = string.Empty;
        foreach (string subPath in path.Split('\\'))
        {
            if (subPath.Length > 0)
            {
                subPathAgg += subPath + "\\";
                TreeNode[] nodes = treeView.Nodes.Find(subPathAgg, true);

                var newNode = new TreeNode
                {
                    Name = subPathAgg,
                    Text = subPath,
                    ImageIndex = 2,
                    ToolTipText = item.FullFolderPath
                };

                if (nodes.Length == 0)
                {
                    if (lastNode == null)
                        treeView.Nodes.Add(newNode);
                    else
                        lastNode.Nodes.Add(newNode);

                    lastNode = newNode;
                }
                else
                    lastNode = nodes[0];
            }
        }
    }

    return treeView;
}

This line of code becomes very slow to execute when I have over 10 million records to process:
TreeNode[] nodes = treeView.Nodes.Find(subPathAgg, true);

It would be much more efficient for me to convert straight from DB to XML (without the treeview middle man).

Has anyone any advice on an alternative way of parsing folder paths into xml, taking nesting into consideration? Thanks for any pointers in advance!

Answers:

Turns out, if you can ensure that your strings are properly sorted (which should be easy if they’re coming from a DB), this is fairly straightforward if you work directly with an XmlWriter. Something like:

using System;
using System.IO;
using System.Xml;

var strings = new[]
{
    @"\networkAdd",
    @"\networkAdd\users",
    @"\networkAdd\users\test1\",
    @"\networkAdd\users\test1\delete unicode character test",
    @"\networkAdd\users\test1\delete unicode character test\character test 1",
    @"\networkAdd\users\test1\delete unicode character test\character test 1\linked to folder",
    @"\networkAdd\users\test1\delete unicode character test\character test 2",
    @"\networkAdd\users\test1\delete unicode character test\character test 3",
    @"http:\sp2013",
    @"http:\sp2013\newTestsite",
    @"http:\sp2013\newTestlib",
    @"http:\sp2013\newTestlib\sampleFolder",
};
// Obviously, stream it out to a file rather than an in-memory string
using (var stringWriter = new StringWriter())
using (var writer = new XmlTextWriter(stringWriter))
{
    writer.WriteStartDocument();
    writer.WriteStartElement("Items");

    var previous = Array.Empty<string>();
    foreach (var str in strings)
    {
        var current = str.Split('\\', StringSplitOptions.RemoveEmptyEntries);
        int i;
        // Find where the first difference from the previous element is
        for (i = 0; i < Math.Min(current.Length, previous.Length); i++)
        {
            if (current[i] != previous[i])
            {
                break;
            }
        }
        // i now contains the index of the first difference
        // First, close off anything in previous which isn't in the current
        for (int j = i; j < previous.Length; j++)
        {
            writer.WriteEndElement();
        }
        // Then, any new elements
        for (int j = i; j < current.Length; j++)
        {
            writer.WriteStartElement("Item");
            writer.WriteAttributeString("value", current[j]);
        }

        previous = current;
    }

    writer.WriteEndDocument();
}

Gives (indented here for readability):

<?xml version="1.0" encoding="utf-16"?>
<Items>
    <Item value="networkAdd">
        <Item value="users">
            <Item value="test1">
                <Item value="delete unicode character test">
                    <Item value="character test 1">
                        <Item value="linked to folder" />
                    </Item>
                    <Item value="character test 2" />
                    <Item value="character test 3" />
                </Item>
            </Item>
        </Item>
    </Item>
    <Item value="http:">
        <Item value="sp2013">
            <Item value="newTestsite" />
            <Item value="newTestlib">
                <Item value="sampleFolder" />
            </Item>
        </Item>
    </Item>
</Items>
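
One caveat on the "properly sorted" requirement: the loop above only works if every folder sorts directly before its own descendants. A plain string sort does not always guarantee that, because characters such as a space or a dot order before the backslash, so `\a\test1 x` can land between `\a\test1` and `\a\test1\sub` and the `test1` element would then be written twice. A minimal sketch of a segment-by-segment ordering follows; `PathOrdering`, `CompareBySegments` and `SortPaths` are hypothetical names, and for millions of rows you would probably want the database to produce an equivalent order rather than sorting in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PathOrdering
{
    // Hypothetical helper: compares two already-split paths segment by
    // segment (ordinal), so a folder sorts directly before its descendants.
    public static int CompareBySegments(string[] a, string[] b)
    {
        int shared = Math.Min(a.Length, b.Length);
        for (int i = 0; i < shared; i++)
        {
            int cmp = string.CompareOrdinal(a[i], b[i]);
            if (cmp != 0)
            {
                return cmp;
            }
        }
        // All shared segments match: the shorter path (the parent) goes first.
        return a.Length.CompareTo(b.Length);
    }

    // Splits each path on backslashes and sorts the results segment-wise.
    public static List<string[]> SortPaths(IEnumerable<string> paths)
    {
        var split = paths
            .Select(p => p.Split(new[] { '\\' }, StringSplitOptions.RemoveEmptyEntries))
            .ToList();
        split.Sort(CompareBySegments);
        return split;
    }
}
```

The arrays returned by `SortPaths` are already split, so each one could be used directly as `current` in the loop above instead of calling `Split` again.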

The XmlWriter loop still needs a bit of work to handle things like :// in URLs, but the basic principle should be sound.
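
As a rough sketch of that remaining work, a small splitting helper could keep a leading `scheme://host` together as one root segment rather than breaking it into `http:` and the host name. Everything here is an assumption rather than part of the original answer: the names `PathSplitter` and `SplitSegments` are made up, the real URLs are assumed to look like `http://sp2013/newTestsite`, and both `\` and `/` are treated as separators.

```csharp
using System;
using System.Collections.Generic;

public static class PathSplitter
{
    private static readonly char[] Separators = { '\\', '/' };

    // Hypothetical helper: splits a folder path or URL into tree segments.
    // Assumption: a leading "scheme://host" should stay one root segment.
    public static string[] SplitSegments(string path)
    {
        var segments = new List<string>();
        string rest = path;

        int schemeEnd = path.IndexOf("://", StringComparison.Ordinal);
        int firstSeparator = path.IndexOfAny(Separators);

        // Only treat "://" as a scheme marker if no separator comes before it.
        if (schemeEnd > 0 && firstSeparator > schemeEnd)
        {
            int hostStart = schemeEnd + 3;
            int hostEnd = path.IndexOfAny(Separators, hostStart);
            if (hostEnd < 0)
            {
                hostEnd = path.Length;
            }
            segments.Add(path.Substring(0, hostEnd)); // e.g. "http://sp2013"
            rest = path.Substring(hostEnd);
        }

        segments.AddRange(rest.Split(Separators, StringSplitOptions.RemoveEmptyEntries));
        return segments.ToArray();
    }
}
```

With something like this, the `str.Split(...)` call in the loop could become `PathSplitter.SplitSegments(str)`, and `http://sp2013/newTestsite` would produce a root item `http://sp2013` with `newTestsite` nested under it.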