Link to home
Start Free TrialLog in
Avatar of gideonjones
gideonjones

asked on

Merging Excel Sheets: List of names with different variables in different sheets, list of names not perfect overlap

Hi,

I've looked through the other topics on merging excel sheets and none of the seem to quite be what I'm looking for. I have two spreadsheets, with a data set describing two slightly different attributes for the same set of names (although names do not overlap perfectly).

Each spreadsheet contains different data about each name.
Important points:
1) The list of names isn't an exact overlap, around 75% of the names are matched
2) Each sheet contains slightly different information about a name
3) I'd like it all in one sheet, with a comprehensive name list, and a non comprehensive variable list, where names don't exist on both sheets therefore the total data set isn't available.

I'd like the columns to all merge into onesheet... I've created a simple example of my situation and desired goal to clarify, my sheets contain around 7000 rows of data each though so automation is a must

Sheet 1:

 VqQK6.jpg

Sheet 2:

QGqz1.jpg

Desired Result:

 wreG9.jpg

Hope this makes it clear, all help much appreciated!
Avatar of SANTABABY
SANTABABY
Flag of United States of America image

just in case the value of a common attribute for a certain name is different in two sheets, which value should be considered in the result sheet?
Example:
Andrew's age is 15 in sheet1 but 16 in sheet2.

OR you are certain that they will be exactly identical in the sheets.
Hi,

here is a short sample with all that you desire, I didn't go all the way but have made all the samples you need to see the functions needed for the completion of your goal...

But have a remark, with 7000 people, names will probvablly duplicate, and therefore this is not the best solution... Need unique values...

Have a nice day and good luck...
Sample.xlsx
Avatar of gideonjones
gideonjones

ASKER

@SANTABABY - well, I hadn't considered this as a possibility but one of the sheets has more recent data so should be considered preferable, e.g. sheet 2 has newer data, I think it is unlikely they would disagree though  - it would have to be an error.

@GASPERK - thanks! this looks interesting, not sure I understand yet but I'll try and apply it to my situation and see how I get on... do I have to manually put in each of the headers I need? I.e. I create a skeleton third sheet with the name column and each of th variables that both sheets cover then construct these IFERROR matching formulas? I'm a bit out of my depth but I'll give it a go!
Yes, you create the base of the sheet but then you insert the formula in the first cell and just copy down...

Good luck
ASKER CERTIFIED SOLUTION
Avatar of Patrick Matthews
Patrick Matthews
Flag of United States of America image

Link to home
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
Start Free Trial
@matthewspatrick wow... ok! :) Thanks this looks ... extensive. I'm not experienced with VBA... other than the most basic level of it, could you give me an idea about how I would actually implement this? Does it need editing at all based on the fact that the real life version of my example Col isn't exactly "Name" but "Market Name" ?

Thanks so much! I will try and implement this ASAP and let you know how I get on...
The only line you should need to edit is:

    [a1].Resize(1, 1 + DicAttributes.Count).Value = Split("Name," & Join(Attributes, ","), ",")

to be:

    [a1].Resize(1, 1 + DicAttributes.Count).Value = Split("Market Name," & Join(Attributes, ","), ",")

That said, you might find it helpful to rename some of the variables, but it is not necessary :)
@matthewspatrick. Well holy cow. I just created this macro and ran it... tried to read through your code to try to understand how it knew to do the various specific things... couldn't figure it out... ran it... everything is right where it needs to be, the exact desired out put I required. I am blown away :D this is a level of wizardry far beyond my ken, you have my thanks! You have all the points!

@others - thanks also so much for your input, @GasperK your solution tended to lead to duplicates as you thought it might, a useful solution for similar situations though so I hope others in the knowledgebase find it and use it! Thanks again
Fantastic - can't believe how well this worked for me. 10/10!
Glad to help!  The magic is all created by the Dictionary; for more about that, please do read the article.  In the article, I refer to this approach as the "dictionary of dictionaries", because each item in the parent dictionary is itself a dictionary.

Here is a commented version of the code:

Sub CollateData()
    
    Dim ws As Worksheet
    Dim DicPeople As Object
    Dim DicPerson As Object
    Dim DicAttributes As Object
    Dim LastR As Long
    Dim r As Long
    Dim LastC As Long
    Dim c As Long
    Dim arr As Variant
    Dim PersonName As String
    Dim AttributeKey As String
    Dim AttributeValue As String
    Dim Attributes As Variant
    Dim People As Variant
    
    ' This dictionary holds all of the people
    Set DicPeople = CreateObject("Scripting.Dictionary")
    DicPeople.CompareMode = vbTextCompare

    ' This dictionary holds the various attributes stored on the different worksheets
    Set DicAttributes = CreateObject("Scripting.Dictionary")
    DicAttributes.CompareMode = vbTextCompare
    
    ' Loop through each worksheet
    For Each ws In ThisWorkbook.Worksheets
        With ws
            ' Determine last row and last column populated on worksheet
            LastR = .Cells(.Rows.Count, 1).End(xlUp).Row
            LastC = .Cells(1, .Columns.Count).End(xlToLeft).Column

            ' Array holds the data from the worksheet
            arr = .Range(.Cells(1, 1), .Cells(LastR, LastC)).Value

            ' Loop through the array and process each person in the worksheet
            For r = 2 To LastR
                PersonName = arr(r, 1)

                ' If we haven't seen this person yet, create a dictionary for him/her
                If Not DicPeople.Exists(PersonName) Then
                    Set DicPerson = CreateObject("Scripting.Dictionary")
                    DicPerson.CompareMode = vbTextCompare
                    DicPeople.Add PersonName, DicPerson
                End If
                Set DicPerson = DicPeople(PersonName)

                ' Loop through the columns to harvest the attributes
                For c = 2 To LastC
                    AttributeKey = arr(1, c)
                    AttributeValue = arr(r, c)
                    DicPerson.Item(AttributeKey) = AttributeValue
                    If r = 2 Then DicAttributes.Item(AttributeKey) = AttributeKey
                Next
            Next
        End With
    Next
    
    ' Create workbook for results
    Workbooks.Add
    
    ' Dump attributes and people into arrays
    Attributes = DicAttributes.Keys
    People = DicPeople.Keys

    ' Headers
    [a1].Resize(1, 1 + DicAttributes.Count).Value = Split("Name," & Join(Attributes, ","), ",")

    ' Loop through the people, and write one row per person
    For r = 0 To UBound(People)
        PersonName = People(r)
        Cells(r + 2, 1) = PersonName
        Set DicPerson = DicPeople.Item(PersonName)

        ' Loop through attributes, and if applicable write the values
        For c = 0 To UBound(Attributes)
            AttributeKey = Attributes(c)
            If DicPerson.Exists(AttributeKey) Then
                Cells(r + 2, c + 2) = DicPerson.Item(AttributeKey)
            End If
        Next
    Next
    Columns.AutoFit
    [a1].Sort Key1:=[a1], Order1:=xlAscending, Header:=xlYes
    
    Set DicPerson = Nothing
    Set DicAttributes = Nothing
    Set DicPeople = Nothing
    
    MsgBox "Done"
    
End Sub

Open in new window


If you like the article, please remember to click 'Yes' on the 'Was this helpful' voting :)