Community Pick: Many members of our community have endorsed this article.
Editor's Choice: This article has been selected by our editors as an exceptional contribution.

Expanding a Hierarchical Data Structure

mbizup
CERTIFIED EXPERT
Published:
Updated:
Introduction

One of the earliest concepts learned by database developers is parent-child (one-to-many) relationships.  These relationships are integral in countless database applications.  A purchase request containing one or more items ordered is a very common example.  

Child records are identifiable through key fields, such as Purchase Order ID, which link them back to the Parent record.    This relationship can easily be displayed using a main form for top-level purchase order information with a subform to list the items ordered.

This scenario gets a little more complex if any of the child records contain children of their own (i.e.,  grandkids).   This type of data structure is common in engineering and manufacturing, where individual parts can make up sub-assemblies which are pieced together to form some final product.  

For example, a bicycle is a top-level assembly that contains parts such as posts, wheels brakes and more.  Each of these parts can also contain multiple parts.   A wheel is a sub-assembly of the bicycle which  is composed of a rim, tire, spokes etcetera.  These sub-assemblies can further contain component parts and sub-assemblies of their own (e.g.,  hubs, chain rings and bearings).    

The connection between each assembly and its parts can be defined with a one-to-many relationship as described previously with the purchase order example.   A Parts List table is used to store child records for the parts, relating them to parent records with a key field such as Parent Assembly ID.   Going a step further than the Purchase Order example, the ID of any component can in turn appear in the Parent Assembly ID field, which defines that part as an assembly having parts of it's own.    

Identifying all parts in a hierarchical structure like this is a common requirement.  Tracking quantities is also important in many applications, for example to determine how much of each part to order when building a given number of a certain assembly.  A listing of materials, sub-assemblies and their associated quantities is called a Bill of Materials (BOM).  Such a listing can be used when ordering parts needed to build the end product.   Additionally, a numeric Indenture Level  which defines each part's place relative to the top of the structure is useful information.  

For example (Indenture Level in parentheses):

                    -Bicycle (0)
                          - Wheel assembly (1)
                                - Rims (2)
                                - Tires (2)
                                - Hub assembly (2)
                                      - Sprockets (3)
                                      - Derailier (3)
                                      - Etc. (3)
                          - Brake assembly (1)
                                - Pads (2)
                                - Cables (2)
                                - Etc. (2)
                         - Etc (1)

A form/subform combination can easily be used to display a top-level parts list for an assembly, with assembly information shown in the main form and its highest level parts shown in the subform.  However reporting the entire structure from the top down goes beyond the capabilities of subforms or subreports.

To report this type of structure, the data for all parts in the hierarchy needs to be extracted from the database.   This can be accomplished in different ways, via code or query.  One advantage of using a code-based method is that other tasks can be accomplished through code at each iteration as the list is being built.

This article focuses on one Visual Basic-based technique that can be used to extract a top-down listing of assemblies and parts while tracking an indenture level defining each part or assembly's place in the hierarchy.  Extracting this data is what is meant by Expanding a Hierarchical Data Structure, in that each sub-assembly is successively expanded to reveal all of the parts that make up the end product.

The Sample Database
ExpandPartsList.mdb

The attached database is a two-table example with Visual Basic code to extract a top-down listing of the parts in an assembly.  The data included with the sample defines a wheel, with sub-assemblies and parts.  The data is not meant to be realistic; it simply is there for visual example.  The sample form and code allow the user to expand a parts listing for the wheel or for any of the sub-assemblies.  The rest of this article dissects the sample to explain the concepts used.
 

1. The Tables

Design and relationships

One table is used for a general parts repository, containing information pertinent to each part.  For this purpose, even assemblies are considered parts.  Continuing with the bicycle example, the bicycle, wheels and brakes are "parts" that should be included in this table along with the nuts and bolts that are used to build them.  The structure of the parts repository table follows:
Design of the parts repository table (tblPartMain)Although not included in this simple sample database, additional fields such as manufacturer information and units would be useful.  The sample assumes that the units for all parts is "Each".  A units field, however would help ensure that the users understood how to enter quantities (e.g.,  feet vs spools of wire).  Keeping units consistent is of particular importance for summing quantities to get totals per assembly.
 
A second table is used to store assembly parts, relating each part to its parent assembly.  While the main Parts List table is simply a collection of all assemblies and piece parts, this component table distinguishes assemblies from simple parts.  A ParentID field is used to relate parts to their associated assemblies.  Any part that appears in this ParentID field is by definition an assembly.  The structure of this table follows:
Design of table listing components of the final product, including subassemblies and piece parts (tblSubAssemblyInfo)
The two tables are linked by a simple one-to many relationship.  The parts which appear uniquely in tblPartMain can occur multiple times in tblSubassemblyInfo, because the same part can appear throughout the overall structure on many different assemblies.  There is also a many-to-many relationship present between the piece parts and parent assemblies in tblSubAssemblyInfo.  Any part can appear on many different assemblies, and assemblies can contain multiple parts.   For this reason, neither the PartID field nor the ParentAssemblyID field can be unique.  The following query output illustrates this point, in that 1/4" Bolts appear on multiple assemblies, and the parent assemblies also appear as having multiple child records.
SELECT tblSubAssemblyInfo.ComponentID, tblPartMain_1.PartDescription AS Part, tblPartMain.PartDescription AS [Next Higher Assembly], tblSubAssemblyInfo.Qty, tblSubAssemblyInfo.MyNotes
                      FROM (tblSubAssemblyInfo INNER JOIN tblPartMain ON tblSubAssemblyInfo.ParentAssemblyID = tblPartMain.ID) INNER JOIN tblPartMain AS tblPartMain_1 ON tblSubAssemblyInfo.PartID = tblPartMain_1.ID;

Open in new window

Query of typical data in tblSubAssemblyInfo
The relationships between the two tables are shown below.  Note that there are not two subassembly information tables, the relationship diagram just shows a self-join which defines a link between two fields in the same table.
Relationship diagram

Although not needed for expanding a parts list (and not included in this sample), a third table for storing assembly specific details might be useful in some situations.  Such details might include projects that funded the assemblies, approval information for assembly drawings, revision information and more.  Details like these do not really belong in a general parts table.


2. The Code

Data expansion algorithm

The code included in the sample loops through an assembly, extracting data associated with each part and exporting it to an output table that can then be used for reports or other purposes.  The key to extracting data for related sub-assemblies is recursive function calls.  This simply means that the function calls itself to repeat the same steps for parts that are determined to be sub-assemblies (i.e.,  looping through all of the parts in any sub-assembly, sub-sub-assembly, etc).  
 
The data extracted is similar to that stored in the Components Table, but the code also generates an indenture level to define a given part's place in the structure (for example, the bicycle would have an indenture level of 0, and a screw on the brake assembly may have an indenture level of 5).  Also, unlike the quantity shown in the component table, which is per subassembly, the quantity calculated for the exported data applies to the entire structure.  For example, a bicycle may have two identical wheel assemblies.  The part listing for a wheel may show 30 spokes.  The quantity calculated for the bicycle as a whole should be 30 spokes x 2 wheels per bike, or 60 spokes total.

The major steps in the code can be summarized as follows:

1. Select all parts from the component table having this assembly (for example, a wheel) as a parent.

2. Determine whether each selected part is a sub-assembly or simply a piece part.  If the PartID of a part appears in the ParentAssemblyID field for any component record, then that part is a sub-assembly.

3. Expand any sub-assemblies by recursively calling the function to repeat the same steps for each of its component parts.

4. Count the number of indenture levels by keeping track of recursive function calls.  This can be done using a Static variable, which holds its value between repeated function calls.  The indenture level needs to be incremented as the code steps down through subassemblies and sub-subassemblies within the same branch of the structure.  When all the parts in a given subassembly have been extracted, the indenture level needs to be returned to the indenture level of the next higher assembly as the code continues looping through the parts of the parent assembly.  Using this process, the top-level assembly will be assigned an indenture level of 1, and the subassemblies and parts underneath it will have indenture levels of 2 or greater, depending on their depth in the structure.

5. Track quantities of parts in assemblies, so that identical subassemblies that are used multiple times have their parts multiplied by the correct factor to determine the total quantity.  The process for this is similar to that in step 4.   A 'multiplier' needs to be maintained to keep the quantities accurate.  For example, as mentioned earlier, the quantity of any part on a wheel gets doubled because there are two identical wheels on the bicycle.  When iterating through parts on the wheel, a multiplier of 2 is used to calculate quantities.  When the code has finished working with the parts on the wheel, that multiplier needs to be decreased to 1 again (to ensure that parts on the handlebar stem, for example, don't get doubled).

6. Insert a record containing partID, ParentID, Quantity, Indenture Level and any other desired information into an output table.
Function UpdateAssemblyPartsList(strPN As String, intIndenture As Integer, PartID As Long, lngParentAssembly As Long, dblOriginalQty As Double, dblQty As Double, strNotes As String)
                          Dim strSQL As String
                          strSQL = "INSERT INTO tblAssemblyPartsList (PartID,IndentureLevel,ParentAssemblyID,MfrPN,OriginalQty, AdjustedQty, Notes) "
                          strSQL = strSQL & "VALUES (" & PartID & "," & intIndenture & ",'" & lngParentAssembly & "','" & strPN & "'," & dblOriginalQty & "," & dblQty & ",'" & strNotes & "')"
                          CurrentDb.Execute strSQL, dbFailOnError
                      End Function
                      

Open in new window

7. Prevent infinite loops by setting a reasonable maximum indenture level.   Circular references can occur if a user has mistakenly entered an assembly as its own parent.  As the code steps through parts in the assembly, it will see that same assembly in it's own list of parts.  When this happens, the function will call itself to expand the same assembly again (and again and again and again...).  This infinite recursion will eventually result in a stack overflow error.

The maximum indenture level is a sanity check that is used to gracefully stop the program execution when one can safely assume that a circular reference has occurred (for example, if the assembly in question is a pen, an indenture level of 50 would definitely indicate that there is a problem with a circular reference).  This is only one method of handling the potential for infinite recursion.  Another method is mentioned a little later, in the description of the treeview output.
Public Function GetAssemblyPartsListing(lngTopLevel As Long, intQTY As Integer, blRecursiveCall As Boolean)
                      
                          Dim rsParts As DAO.Recordset
                          Dim db As DAO.Database
                          Dim dblQty As Double
                          Dim I As Integer
                          
                          Static intIndentureLevel As Integer
                          Static intMult As Integer
                          
                          On Error GoTo EH
                          
                          Set db = CurrentDb
                          Set rsParts = db.OpenRecordset("Select *, MyNotes as strNotes FROM tblSubAssemblyInfo Inner Join tblPartMain On tblSubAssemblyInfo.PartID = tblPartMain.ID WHERE ParentAssemblyID = " & lngTopLevel)
                          'Initialize the multiplier if this is the first function call
                          If Not blRecursiveCall = True Then
                              intMult = intQTY
                          End If
                          Do Until rsParts.EOF
                              'If this part is an assembly (determined by looking for this part in the ParentAssemblyID field)...
                              If DCount("*", "tblSubAssemblyInfo", "ParentAssemblyID = " & lngTopLevel) > 0 Then
                                  'Calculate the multiplier for parts at the next level
                                  If IsNumeric(rsParts!Qty) Then
                                      ' Multiplies the quantity of this part as stored in the table by the quantity of the parent assembly for this part,
                                      ' and stores it in a static variable so that it's value persists the next time this function is called.
                                      intMult = intMult * Nz(IIf(IsNumeric(rsParts!Qty), rsParts!Qty, 1), 1)
                                      dblQty = intMult
                                  Else
                                      dblQty = -9999
                                  End If
                                  UpdateAssemblyPartsList rsParts!MfrPartNumber, intIndentureLevel + 1, rsParts!PartID, rsParts!ParentAssemblyID, rsParts!Qty, dblQty, rsParts!strNotes
                                  ' Increment IndentureLevel for subassembly parts
                                  intIndentureLevel = intIndentureLevel + 1
                                  If intIndentureLevel > 50 Then GoTo ExitFN   'Indentures > 50 indicate a recursion problem
                                  GetAssemblyPartsListing rsParts!PartID, Nz(intMult, 1), True
                                  ' Decrement the indenture level and reduce the multiplier, since we're not
                                  ' dealing with that subassembly any more
                                  intIndentureLevel = intIndentureLevel - 1
                                  ' Divide to return to the quantity needed for the parent assembly.
                                  If IsNumeric(rsParts!Qty) Then intMult = intMult / Nz(rsParts!Qty, 1)
                              Else
                                  ' Discrete part- not a subassembly... get the total quantity, leaving the multiplier as is.
                                  If IsNumeric(rsParts!Quantity) Then dblQty = intMult * Nz(rsParts!Qty, 1)
                                  UpdateAssemblyPartsList rsParts!MfrPartNumber, intIndentureLevel + 1, rsParts!PartID, rsParts!ParentAssemblyID, Nz(rsParts!Qty, 1), dblQty, rsParts!strNotes
                              End If
                              rsParts.MoveNext
                          Loop
                          
                      ExitFN:
                          rsParts.Close
                          Set rsParts = Nothing
                          Exit Function
                      EH:
                          MsgBox "Error " & Err.Number & ": " & Err.Description
                          GoTo ExitFN
                      End Function
                      

Open in new window


3. The Output

Displaying the results

The attached sample database shows several different ways of displaying the output data from expanding a hierarchical data structure, including a treeview and several datasheets showing lists that can readily be shown on Access reports or exported to other formats such as Excel or text files.

Since the data structure defines a “shape”, a treeview control is an ideal way of displaying the structure from the top down.  Each part in the structure is shown as a node in the treeview, and parts that are assemblies themselves can be expanded or collapsed with +/- signs to reveal or hide the parts that define them.  
TreeView control --  displays the part description, part number, indenture level and quantity of parts used to build an assembly
The code used to load the treeview is another example of a recursive function, which calls itself to fill in child nodes under each parent.   For comparison, this function uses a different sanity check than that previously explained to avoid circular references.  The code checks the values of the child and parent IDs prior with each call to the recursive function.  If a child ID is the same as its parent ID, a circular reference has occurred.  The code at that point is aborted.

The actual code has been significantly edited to show the relevant parts below:
Sub AddTreeData(list of parameters)
                          
                          On Error GoTo EH
                          
                             ' Test for a circular reference
                             If rs(strChildField) = rs(strParentField) Then GoTo EH_CircularReference
                          
                              ' Do lots of things ...
                      
                              ' call this function recursively for "children"
                              AddTreeData (List of parameters)
                              
                              ' Do more things...    
                      
                          Exit Sub
                          
                      EH_CircularReference:
                          MsgBox "Exiting because of a circular reference in which a child record was determined to be it's own parent."
                          Exit Sub
                        
                      EH:
                          MsgBox "Error " & Err.Number & ": " & Err.Description
                      End Sub
                          

Open in new window


The second report is a listing of parts to be ordered.  This datasheet is based on a query that excludes parts which appear as “parent” records in the output.  This query limits the output to parts which need to be ordered versus assemblies that need to be built, and provides a total quantity needed for each part -- summed up for the whole structure.  
Example listing parts to order
The indentured Parts List is a listing of all parts and subassemblies needed to build the final product.  The indenture level and order of the parts illustrates how pieces fit together to build the overall structure.  It contains all of the information that the treeview shows, but it is a “flat” display which can easily be exported to Excel or a text file.
Indentured Parts List

The Assembly Listing shows all assemblies that need to be built and pieced together to form the final product (this listing excludes any parts that are not assemblies themselves).  Like the indentured parts list, the assembly listing uses indenture level and the order of the records to show the level and location of assemblies relative to the top of the structure.
Indentured listing of assemblies

In Conclusion

The sample database is a very simple example of how to apply these concepts, but the possibilities are endless.  

The sample simply demonstrates how to expand a Bill of Materials and report the resulting data.  However, these concepts could be integrated into systems that include work orders for building such assemblies and purchasing processes for acquiring the parts needed.  

In a more sophisticated database which includes a purchasing system, data such as that in the “parts to be ordered” report can be combined in a query with previous known costs of these parts.  This query would be a powerful cost estimation tool for future projects.

Other Applications

While the main focus of this article has been on a Bill of Materials example, the same concepts apply to hierarchical data structures in any discipline.  

Some other examples include:
          Family trees:  Parents, children, grandchildren, great-grandchildren...
          Recipe collections:  Ingredients may be recipes containing ingredients of their own
          Organizational hierarchies: Individuals may supervise employees who are supervisors themselves

Acknowledgements

Thanks to the editors who have worked with this article, and to the many people who have taken the time to review this and offer suggestions prior to publication (you know who you are).
 
Special thanks also to JDettman, rockiroads and puppydogbuddy for initially helping me get my brain wrapped around this subject, and puppydogbuddy for later referring me to other EE Members who needed help with this.  Explaining the concepts to other people has helped reinforce my own understanding of the material, and improve on the sample code in the process.   This article and my various posts on the subject are in that sense passing along the benefits that I have received from the community – like modern day storytelling.
18
33,081 Views
mbizup
CERTIFIED EXPERT

Comments (12)

Jim,

Thanks for the quick reply! Hopefully I'm not being a bother.

Could you provide some clarification to your last comment based on the below?

The sample code handles two iterations of the same part (1/4" bolts) without a problem. From the article:

"Any part can appear on many different assemblies, and assemblies can contain multiple parts.   For this reason, neither the PartID field nor the ParentAssemblyID field can be unique.  The following query output illustrates this point, in that 1/4" Bolts appear on multiple assemblies, and the parent assemblies also appear as having multiple child records."

However it does not handle two iterations of the same sub-assembly (bearing kit). If you had two wheel assemblies (front and rear) you would need to indicate that each of these wheels included a bearing kit, while only the rear wheel included a chain sprocket. So your indentured parts list would look something like:

Bike (main assembly)
-Frame (part)
-Front Wheel (assembly)
--Tire (part)
--Tube (part)
--Bearing Kit (sub-assembly)
---Inner Bearing (part)
---Outer Bearing (part)
-Rear Wheel (assembly)
--Tire (part)
--Tube (part)
--Chain Sprocket (part - differentiates from front wheel)
--Bearing Kit (sub-assembly)
---Inner Bearing (part)
---Outer Bearing (part)

The bearing kit would only exist once in the main parts list table, but would exist twice (in two different assemblies) in the sub-assembly info table and indentured parts list.

This seems like a fairly typical BOM/indentured parts list use case. Is there another way of accounting for both bearing kit sub-assemblies in their appropriate hierarchical "location" within the sample code? The only way i can think of it would be to assign a unique key to each iteration of a sub-assembly, but this seems to unnecessarily complicate and limit the utility of the program.  

Once again, thanks so much for your time!

-Josiah
Kevin CrossChief Technology Officer
CERTIFIED EXPERT
Most Valuable Expert 2011

Commented:
Josiah,

As Jim assisted you with, the example code here does not illustrate routed BOM but the system I use regularly is one that supports that and the code can handle if your join also looks at the Route Sequence ID as well as the Parent and Component Item ID.

The front and rear wheel question is a matter of different part numbers.  If you have the same part number appearing more than once then the GROUP BY at the end will simplify the BOM.  To stop this you could track some unique characteristic of the line instead of Item ID or Part #.  If these are different part numbers, there should be no issue.

Kevin
Kevin,

Thanks for the reply.

A Routing Sequence ID may be just what I need. How does the Routing Sequence ID get married to the parts and assemblies to show the location of each in the indentured parts list?

On a separate note: I'm interested in your aforementioned system as it relates to the capabilities of Access for the larger database endeavor I'm embarking on. Do you think that Access can handle tracking inventory and work in process over multiple manufacturing and assembly steps for some 1500 different parts? Perhaps it would be best if I started a separate thread regarding the applicability of Access for my needs.


Best,

Josiah
Kevin CrossChief Technology Officer
CERTIFIED EXPERT
Most Valuable Expert 2011

Commented:
Josiah,

Here is a message that queues up while I was in the air:
“Please forgive my confusion as I replied from my phone.  I have many iterations of this solution with most being in SQL Server, so please ignore the GROUP BY comment but issue is the same.  It is a matter of handling each line separately vs assuming unique part numbers which most of the people I deal with what to see the in the end.  In other production uses, I either aggregate multiple rows of same part before recursion so the quantity per assembly reflects the total requirement (e.g 4 wheels) or I use a unique key.  For example, in my production BOM, each component of the bill has a unique ID as the key then still lists the parent item ID, component item ID and route sequence If applicable along with the QPA.

I’m sure if you give this a shot yourself, it will make sense.  If not, just post your exact scenario with sample database and we’ll help you ... on the Q&A side of the house.”

Regarding your recent post, the system I use is Intuitive ERP running on SQL Server.  You may be able to replicate the same structure in MS Access, though.
CERTIFIED EXPERT
Fellow
Most Valuable Expert 2017

Commented:
<<Could you provide some clarification to your last comment based on the below?>>

It boils down to this; if any part number will appear within a single level of a BOM, then you need a line number/routing sequence #.  Example:

Assembly "Z"

Line  / Part  / Qty Per Assy
1  Part A   1
2  Part B   1
3  Part A   1
4  Part B   1

Instead of:

Part   /   Qty Per Assy
Part A   2
Part B   2


The article is based on the second.

<< Do you think that Access can handle tracking inventory and work in process over multiple manufacturing and assembly steps for some 1500 different parts?>>

 If you entire parts list was 1,500 item (assy's and raw materials), then yes. If would not be a problem even with the default database engine.   But with that said, I would start off using SQL Server for the data store  as projects like that usually grow quickly.  

Jim.

View More

Have a question about something in this article? You can receive help directly from the article author. Sign up for a free trial to get started.