gideonjones
asked on
Merging Excel Sheets: List of names with different variables in different sheets, list of names not perfect overlap
Hi,
I've looked through the other topics on merging excel sheets and none of the seem to quite be what I'm looking for. I have two spreadsheets, with a data set describing two slightly different attributes for the same set of names (although names do not overlap perfectly).
Each spreadsheet contains different data about each name.
Important points:
1) The list of names isn't an exact overlap, around 75% of the names are matched
2) Each sheet contains slightly different information about a name
3) I'd like it all in one sheet, with a comprehensive name list, and a non comprehensive variable list, where names don't exist on both sheets therefore the total data set isn't available.
I'd like the columns to all merge into onesheet... I've created a simple example of my situation and desired goal to clarify, my sheets contain around 7000 rows of data each though so automation is a must
Sheet 1:
VqQK6.jpg
Sheet 2:
QGqz1.jpg
Desired Result:
wreG9.jpg
Hope this makes it clear, all help much appreciated!
I've looked through the other topics on merging excel sheets and none of the seem to quite be what I'm looking for. I have two spreadsheets, with a data set describing two slightly different attributes for the same set of names (although names do not overlap perfectly).
Each spreadsheet contains different data about each name.
Important points:
1) The list of names isn't an exact overlap, around 75% of the names are matched
2) Each sheet contains slightly different information about a name
3) I'd like it all in one sheet, with a comprehensive name list, and a non comprehensive variable list, where names don't exist on both sheets therefore the total data set isn't available.
I'd like the columns to all merge into onesheet... I've created a simple example of my situation and desired goal to clarify, my sheets contain around 7000 rows of data each though so automation is a must
Sheet 1:
VqQK6.jpg
Sheet 2:
QGqz1.jpg
Desired Result:
wreG9.jpg
Hope this makes it clear, all help much appreciated!
Hi,
here is a short sample with all that you desire, I didn't go all the way but have made all the samples you need to see the functions needed for the completion of your goal...
But have a remark, with 7000 people, names will probvablly duplicate, and therefore this is not the best solution... Need unique values...
Have a nice day and good luck...
Sample.xlsx
here is a short sample with all that you desire, I didn't go all the way but have made all the samples you need to see the functions needed for the completion of your goal...
But have a remark, with 7000 people, names will probvablly duplicate, and therefore this is not the best solution... Need unique values...
Have a nice day and good luck...
Sample.xlsx
ASKER
@SANTABABY - well, I hadn't considered this as a possibility but one of the sheets has more recent data so should be considered preferable, e.g. sheet 2 has newer data, I think it is unlikely they would disagree though - it would have to be an error.
@GASPERK - thanks! this looks interesting, not sure I understand yet but I'll try and apply it to my situation and see how I get on... do I have to manually put in each of the headers I need? I.e. I create a skeleton third sheet with the name column and each of th variables that both sheets cover then construct these IFERROR matching formulas? I'm a bit out of my depth but I'll give it a go!
@GASPERK - thanks! this looks interesting, not sure I understand yet but I'll try and apply it to my situation and see how I get on... do I have to manually put in each of the headers I need? I.e. I create a skeleton third sheet with the name column and each of th variables that both sheets cover then construct these IFERROR matching formulas? I'm a bit out of my depth but I'll give it a go!
Yes, you create the base of the sheet but then you insert the formula in the first cell and just copy down...
Good luck
Good luck
ASKER CERTIFIED SOLUTION
membership
This solution is only available to members.
To access this solution, you must be a member of Experts Exchange.
ASKER
@matthewspatrick wow... ok! :) Thanks this looks ... extensive. I'm not experienced with VBA... other than the most basic level of it, could you give me an idea about how I would actually implement this? Does it need editing at all based on the fact that the real life version of my example Col isn't exactly "Name" but "Market Name" ?
Thanks so much! I will try and implement this ASAP and let you know how I get on...
Thanks so much! I will try and implement this ASAP and let you know how I get on...
The only line you should need to edit is:
[a1].Resize(1, 1 + DicAttributes.Count).Value = Split("Name," & Join(Attributes, ","), ",")
to be:
[a1].Resize(1, 1 + DicAttributes.Count).Value = Split("Market Name," & Join(Attributes, ","), ",")
That said, you might find it helpful to rename some of the variables, but it is not necessary :)
[a1].Resize(1, 1 + DicAttributes.Count).Value
to be:
[a1].Resize(1, 1 + DicAttributes.Count).Value
That said, you might find it helpful to rename some of the variables, but it is not necessary :)
ASKER
@matthewspatrick. Well holy cow. I just created this macro and ran it... tried to read through your code to try to understand how it knew to do the various specific things... couldn't figure it out... ran it... everything is right where it needs to be, the exact desired out put I required. I am blown away :D this is a level of wizardry far beyond my ken, you have my thanks! You have all the points!
@others - thanks also so much for your input, @GasperK your solution tended to lead to duplicates as you thought it might, a useful solution for similar situations though so I hope others in the knowledgebase find it and use it! Thanks again
@others - thanks also so much for your input, @GasperK your solution tended to lead to duplicates as you thought it might, a useful solution for similar situations though so I hope others in the knowledgebase find it and use it! Thanks again
ASKER
Fantastic - can't believe how well this worked for me. 10/10!
Glad to help! The magic is all created by the Dictionary; for more about that, please do read the article. In the article, I refer to this approach as the "dictionary of dictionaries", because each item in the parent dictionary is itself a dictionary.
Here is a commented version of the code:
If you like the article, please remember to click 'Yes' on the 'Was this helpful' voting :)
Here is a commented version of the code:
Sub CollateData()
Dim ws As Worksheet
Dim DicPeople As Object
Dim DicPerson As Object
Dim DicAttributes As Object
Dim LastR As Long
Dim r As Long
Dim LastC As Long
Dim c As Long
Dim arr As Variant
Dim PersonName As String
Dim AttributeKey As String
Dim AttributeValue As String
Dim Attributes As Variant
Dim People As Variant
' This dictionary holds all of the people
Set DicPeople = CreateObject("Scripting.Dictionary")
DicPeople.CompareMode = vbTextCompare
' This dictionary holds the various attributes stored on the different worksheets
Set DicAttributes = CreateObject("Scripting.Dictionary")
DicAttributes.CompareMode = vbTextCompare
' Loop through each worksheet
For Each ws In ThisWorkbook.Worksheets
With ws
' Determine last row and last column populated on worksheet
LastR = .Cells(.Rows.Count, 1).End(xlUp).Row
LastC = .Cells(1, .Columns.Count).End(xlToLeft).Column
' Array holds the data from the worksheet
arr = .Range(.Cells(1, 1), .Cells(LastR, LastC)).Value
' Loop through the array and process each person in the worksheet
For r = 2 To LastR
PersonName = arr(r, 1)
' If we haven't seen this person yet, create a dictionary for him/her
If Not DicPeople.Exists(PersonName) Then
Set DicPerson = CreateObject("Scripting.Dictionary")
DicPerson.CompareMode = vbTextCompare
DicPeople.Add PersonName, DicPerson
End If
Set DicPerson = DicPeople(PersonName)
' Loop through the columns to harvest the attributes
For c = 2 To LastC
AttributeKey = arr(1, c)
AttributeValue = arr(r, c)
DicPerson.Item(AttributeKey) = AttributeValue
If r = 2 Then DicAttributes.Item(AttributeKey) = AttributeKey
Next
Next
End With
Next
' Create workbook for results
Workbooks.Add
' Dump attributes and people into arrays
Attributes = DicAttributes.Keys
People = DicPeople.Keys
' Headers
[a1].Resize(1, 1 + DicAttributes.Count).Value = Split("Name," & Join(Attributes, ","), ",")
' Loop through the people, and write one row per person
For r = 0 To UBound(People)
PersonName = People(r)
Cells(r + 2, 1) = PersonName
Set DicPerson = DicPeople.Item(PersonName)
' Loop through attributes, and if applicable write the values
For c = 0 To UBound(Attributes)
AttributeKey = Attributes(c)
If DicPerson.Exists(AttributeKey) Then
Cells(r + 2, c + 2) = DicPerson.Item(AttributeKey)
End If
Next
Next
Columns.AutoFit
[a1].Sort Key1:=[a1], Order1:=xlAscending, Header:=xlYes
Set DicPerson = Nothing
Set DicAttributes = Nothing
Set DicPeople = Nothing
MsgBox "Done"
End Sub
If you like the article, please remember to click 'Yes' on the 'Was this helpful' voting :)
Example:
Andrew's age is 15 in sheet1 but 16 in sheet2.
OR you are certain that they will be exactly identical in the sheets.