Question How do I create new nodes based on comma delimited values?

zi_zu

I have XML input like example 1.
For each comma-delimited PROPERTY value I want to create a new XmlDocument,
with new nodes for every STRUCTURE value,
and for every STRUCTURE every FLOOR value as well.
The result should look like example 2.


example 1:
HTML:
<?xml version="1.0" encoding="UTF-8"?>
<object-data>
    <object-class name="PROPERTY">
        <object-attrs>
            <object-attr name="PROP_NAME" value="FAST 1, FAST 2"/>
        </object-attrs>
        <object-class name="STRUCTURE">
            <object-attrs>
                <object-attr name="STRUCT_NAME" value="HUS 1, HUS 2, HUS 3"/>
            </object-attrs>
            <object-class name="FLOOR">
                <object-attrs>
                    <object-attr name="FLOOR_NAME" value="PLAN 1, PLAN 2, PLAN 3, PLAN 4"/>
                </object-attrs>
            </object-class>
        </object-class>
    </object-class>
</object-data>


example 2:
HTML:
<?xml version="1.0" encoding="UTF-8"?>
<object-data>
    <object-class name="PROPERTY">
        <object-attrs>
            <object-attr name="PROP_NAME" value="FAST 1"/>
        </object-attrs>
        <object-class name="STRUCTURE">
            <object-attrs>
                <object-attr name="STRUCT_NAME" value="HUS 1"/>
            </object-attrs>
            <object-class name="FLOOR">
                <object-attrs>
                    <object-attr name="FLOOR_NAME" value="PLAN 1"/>
                </object-attrs>
            </object-class>
        </object-class>
        <object-attrs>
            <object-attr name="PROP_NAME" value="FAST 1"/>
        </object-attrs>
        <object-class name="STRUCTURE">
            <object-attrs>
                <object-attr name="STRUCT_NAME" value="HUS 1"/>
            </object-attrs>
            <object-class name="FLOOR">
                <object-attrs>
                    <object-attr name="FLOOR_NAME" value="PLAN 2"/>
                </object-attrs>
            </object-class>
        </object-class>
        <object-attrs>
            <object-attr name="PROP_NAME" value="FAST 1"/>
        </object-attrs>
        <object-class name="STRUCTURE">
            <object-attrs>
                <object-attr name="STRUCT_NAME" value="HUS 1"/>
            </object-attrs>
            <object-class name="FLOOR">
                <object-attrs>
                    <object-attr name="FLOOR_NAME" value="PLAN 3"/>
                </object-attrs>
            </object-class>
        </object-class>
        <object-attrs>
            <object-attr name="PROP_NAME" value="FAST 1"/>
        </object-attrs>
        <object-class name="STRUCTURE">
            <object-attrs>
                <object-attr name="STRUCT_NAME" value="HUS 1"/>
            </object-attrs>
            <object-class name="FLOOR">
                <object-attrs>
                    <object-attr name="FLOOR_NAME" value="PLAN 4"/>
                </object-attrs>
            </object-class>
        </object-class>
        <object-attrs>
            <object-attr name="PROP_NAME" value="FAST 2"/>
        </object-attrs>
        <object-class name="STRUCTURE">
            <object-attrs>
                <object-attr name="STRUCT_NAME" value="HUS 1"/>
            </object-attrs>
            <object-class name="FLOOR">
                <object-attrs>
                    <object-attr name="FLOOR_NAME" value="PLAN 1"/>
                </object-attrs>
            </object-class>
        </object-class>
        <object-attrs>
            <object-attr name="PROP_NAME" value="FAST 2"/>
        </object-attrs>
        <object-class name="STRUCTURE">
            <object-attrs>
                <object-attr name="STRUCT_NAME" value="HUS 1"/>
            </object-attrs>
            <object-class name="FLOOR">
                <object-attrs>
                    <object-attr name="FLOOR_NAME" value="PLAN 2"/>
                </object-attrs>
            </object-class>
        </object-class>
        <object-attrs>
            <object-attr name="PROP_NAME" value="FAST 2"/>
        </object-attrs>
        <object-class name="STRUCTURE">
            <object-attrs>
                <object-attr name="STRUCT_NAME" value="HUS 1"/>
            </object-attrs>
            <object-class name="FLOOR">
                <object-attrs>
                    <object-attr name="FLOOR_NAME" value="PLAN 3"/>
                </object-attrs>
            </object-class>
        </object-class>
        <object-attrs>
            <object-attr name="PROP_NAME" value="FAST 2"/>
        </object-attrs>
        <object-class name="STRUCTURE">
            <object-attrs>
                <object-attr name="STRUCT_NAME" value="HUS 1"/>
            </object-attrs>
            <object-class name="FLOOR">
                <object-attrs>
                    <object-attr name="FLOOR_NAME" value="PLAN 4"/>
                </object-attrs>
            </object-class>
        </object-class>

...and so on...

    </object-class>
</object-data>
 
HTML:
        <object-attrs>
            <object-attr name="PROP_NAME" value="FAST 1, FAST 2"/>
        </object-attrs>
Why should 'FAST 1' become this:
HTML:
        <object-attrs>
            <object-attr name="PROP_NAME" value="FAST 1"/>
        </object-attrs>
        <object-attrs>
            <object-attr name="PROP_NAME" value="FAST 1"/>
        </object-attrs>
        <object-attrs>
            <object-attr name="PROP_NAME" value="FAST 1"/>
        </object-attrs>
        <object-attrs>
            <object-attr name="PROP_NAME" value="FAST 1"/>
        </object-attrs>
It should structurally expand to the following:
HTML:
        <object-attrs>
            <object-attr name="PROP_NAME" value="FAST 1"/>
            <object-attr name="PROP_NAME" value="FAST 2"/>
        </object-attrs>
In your example you are duplicating unrelated nodes based on the 'expansion' of other nodes. Why?
 
There you have it: I am too unsure how to do this, so I used my own thinking
about the structure.

Example 1 contains 2 PROPERTY values (FAST 1, FAST 2),
3 STRUCTURE values (HUS 1, HUS 2, HUS 3) and 4 FLOOR values
(PLAN 1, PLAN 2, PLAN 3, PLAN 4).
This would generate 2 x 3 x 4 = 24 combinations in total (if my math serves me...)
Those 24 combinations should be the result, since I need to do some lookups
in data tables for each and every combination.

All input on my lack of skill is welcome, I'm ready to learn!
 
As I see it, a node should only be duplicated if its value attribute contains comma-separated values, which would give you results like what I suggested. There's no reason to expand and duplicate more than that, as it would serve no purpose.
In code, such a transformation can be done, for example, with XDocument and LINQ like this:
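' Requires Imports System.Xml.Linq (and System.Linq for the query).
Dim doc = XDocument.Load("source.xml")
' Select every element whose value attribute holds a comma-separated list.
Dim selection = From node In doc.Descendants Where node.@value IsNot Nothing AndAlso node.@value.Contains(", ")
' ToArray takes a snapshot so the tree can be modified while looping.
For Each node In selection.ToArray
    Dim values = node.@value.Split(New String() {", "}, StringSplitOptions.None)
    For Each value In values
        ' Clone the element once per value; the clone is appended to the end of the parent.
        Dim clone As New XElement(node)
        clone.@value = value
        node.Parent.Add(clone)
    Next
    ' Remove the original element that held the whole list.
    node.Remove()
Next
doc.Save("transformed.xml")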

The complete result would look like this:
HTML:
<?xml version="1.0" encoding="utf-8"?>
<object-data>
  <object-class name="PROPERTY">
    <object-attrs>
      <object-attr name="PROP_NAME" value="FAST 1" />
      <object-attr name="PROP_NAME" value="FAST 2" />
    </object-attrs>
    <object-class name="STRUCTURE">
      <object-attrs>
        <object-attr name="STRUCT_NAME" value="HUS 1" />
        <object-attr name="STRUCT_NAME" value="HUS 2" />
        <object-attr name="STRUCT_NAME" value="HUS 3" />
      </object-attrs>
      <object-class name="FLOOR">
        <object-attrs>
          <object-attr name="FLOOR_NAME" value="PLAN 1" />
          <object-attr name="FLOOR_NAME" value="PLAN 2" />
          <object-attr name="FLOOR_NAME" value="PLAN 3" />
          <object-attr name="FLOOR_NAME" value="PLAN 4" />
        </object-attrs>
      </object-class>
    </object-class>
  </object-class>
</object-data>
No information in the source document has been lost in this transformation.
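
The split in the code above expects each separator to be exactly a comma followed by one space. If the input might have inconsistent spacing, a more tolerant split could look like the sketch below. `SplitValues` is a hypothetical helper name used only for illustration, and dropping empty entries is an assumption about what the data needs.

```vbnet
Imports System
Imports System.Linq

Module SplitSketch

    ' Hypothetical helper: splits on "," alone, trims each piece,
    ' and drops pieces that are empty after trimming.
    Function SplitValues(ByVal raw As String) As String()
        Return raw.Split(","c) _
                  .Select(Function(s) s.Trim()) _
                  .Where(Function(s) s.Length > 0) _
                  .ToArray()
    End Function

    Sub Main()
        ' Deliberately uneven spacing around the commas.
        For Each v As String In SplitValues("HUS 1,HUS 2 , HUS 3")
            Console.WriteLine("[" & v & "]")
        Next
    End Sub

End Module
```

In the transformation loop, `SplitValues(node.@value)` could replace the `Split` call, and the `Contains(", ")` filter would then need to look for `","` instead.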
 
Thanks!
I see your point and it looks very neat.
Since there could be more than three "levels" in the input, this is a great help.
Now I'll find out how to extract all combinations, the 24 I mentioned.
Still learning...
 
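Basically you need three nested loops, one over each object-attrs collection.
        ' Child axes must start from the object-class elements,
        ' because object-attr elements have no object-class children.
        Dim propClass = doc.<object-data>.<object-class>
        Dim structClass = propClass.<object-class>
        Dim floorClass = structClass.<object-class>
        Dim props = propClass.<object-attrs>.<object-attr>
        Dim structs = structClass.<object-attrs>.<object-attr>
        Dim floors = floorClass.<object-attrs>.<object-attr>

        ' Visit every PROPERTY / STRUCTURE / FLOOR combination.
        For Each x In props
            For Each y In structs
                For Each z In floors
                    Debug.WriteLine(String.Format("{0} : {1} : {2}", x.@value, y.@value, z.@value))
                Next
            Next
        Next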
 
Thanks again!
I've been looking for a way to handle a variable number of sub-node levels.
As it is now, it's props-structs-floors, but it could be props-structs-floors-spaces or even
more levels of data, like city-props-structs-floors-spaces.
Finding all combinations without knowing how many levels there are made me look
at reading XML files recursively, but I haven't found anything helpful so far.
 
Combining groups like that is called a Cartesian product, and there is much to be found on the web about it.
For example, I found a LINQ extension method in C# that I converted to VB.NET, from here: Computing a Cartesian Product with LINQ - Fabulous Adventures In Coding - MSDN Blogs
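Imports System.Collections.Generic
Imports System.Linq
Imports System.Runtime.CompilerServices

Module Extensions

    ' Returns every combination that takes one item from each input sequence.
    <Extension()> _
    Public Function CartesianProduct(Of T)(ByVal sequences As IEnumerable(Of IEnumerable(Of T))) _
    As IEnumerable(Of IEnumerable(Of T))
        ' Seed: a single empty combination.
        Dim emptyProduct As IEnumerable(Of IEnumerable(Of T)) = New T()() {New T() {}}
        ' For each sequence, extend every combination so far with every item of that sequence.
        Return sequences.Aggregate(emptyProduct, Function(accumulator, sequence) _
                                                     From accseq In accumulator _
                                                     From item In sequence _
                                                     Select accseq.Concat(New T() {item}))
    End Function
End Module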
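
If the Aggregate call is hard to follow, the same idea can be written with explicit loops. The following is only an illustrative sketch, not the code from the linked article: `CartesianLoopSketch` and `Product` are hypothetical names, the element type is fixed to String for brevity, and the sample values are taken from example 1.

```vbnet
Imports System
Imports System.Collections.Generic

Module CartesianLoopSketch

    ' Start with one empty combination, then extend every combination
    ' built so far with every item of the next sequence.
    Function Product(ByVal sequences As IEnumerable(Of IEnumerable(Of String))) _
    As List(Of List(Of String))
        Dim result As New List(Of List(Of String))
        result.Add(New List(Of String))
        For Each sequence As IEnumerable(Of String) In sequences
            Dim extended As New List(Of List(Of String))
            For Each combo As List(Of String) In result
                For Each item As String In sequence
                    ' Copy the existing combination and append the new item.
                    Dim longer As New List(Of String)(combo)
                    longer.Add(item)
                    extended.Add(longer)
                Next
            Next
            result = extended
        Next
        Return result
    End Function

    Sub Main()
        Dim levels As New List(Of IEnumerable(Of String))
        levels.Add(New String() {"FAST 1", "FAST 2"})
        levels.Add(New String() {"HUS 1", "HUS 2", "HUS 3"})
        levels.Add(New String() {"PLAN 1", "PLAN 2", "PLAN 3", "PLAN 4"})

        Dim combos As List(Of List(Of String)) = Product(levels)
        Console.WriteLine(combos.Count)                             ' 24
        Console.WriteLine(String.Join(" : ", combos(0).ToArray()))  ' FAST 1 : HUS 1 : PLAN 1
    End Sub

End Module
```

Unlike the LINQ version, this sketch builds all combinations in memory at once, which is fine for small inputs but not lazy.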

Then, to get the dynamic groups, make a query like the one below against the transformed document, where each object-attr holds a single value. It reads the value attribute from every object-attrs/object-attr, grouped by object-class, which fits the extension method with T as String:
        Dim groups = doc...<object-class> _
                    .Select(Function(x) x.<object-attrs>.<object-attr> _
                    .Select(Function(y) y.@value))

        For Each prod In groups.CartesianProduct
            Debug.WriteLine(String.Join(", ", prod.ToArray))
        Next
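
Since the original question asked for XML output per combination, each combination could also be wrapped back into nested object-class elements. This is a rough sketch under assumptions: `BuildChain` is a hypothetical helper, and the level names are hard-coded here, although they could equally be read from the name attributes in the document.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Xml.Linq

Module CombinationSketch

    ' Hypothetical helper: wraps one chosen value per level into nested
    ' <object-class> elements. All three lists are ordered outermost level first.
    Function BuildChain(ByVal classNames As IList(Of String), _
                        ByVal attrNames As IList(Of String), _
                        ByVal values As IList(Of String)) As XElement
        Dim current As XElement = Nothing
        ' Build from the innermost level outwards so each level wraps the previous one.
        For i As Integer = values.Count - 1 To 0 Step -1
            Dim level As New XElement("object-class", _
                New XAttribute("name", classNames(i)), _
                New XElement("object-attrs", _
                    New XElement("object-attr", _
                        New XAttribute("name", attrNames(i)), _
                        New XAttribute("value", values(i)))))
            If current IsNot Nothing Then level.Add(current)
            current = level
        Next
        Return current
    End Function

    Sub Main()
        ' Level names assumed from example 1.
        Dim classNames As String() = {"PROPERTY", "STRUCTURE", "FLOOR"}
        Dim attrNames As String() = {"PROP_NAME", "STRUCT_NAME", "FLOOR_NAME"}
        ' One combination, as it would come out of the Cartesian product.
        Dim combo As String() = {"FAST 1", "HUS 2", "PLAN 3"}

        Dim result As New XDocument( _
            New XElement("object-data", BuildChain(classNames, attrNames, combo)))
        Console.WriteLine(result)
    End Sub

End Module
```

Inside the loop over `groups.CartesianProduct`, `prod.ToList()` could be passed as the `values` argument to get one document per combination.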
 