How do I combine two data-frames based on two columns?

Question:

I know I can use plyr and its friends to combine data frames, and merge as well, but so far I have not figured out how to merge two data frames, each with multiple columns, based on two of those columns.

Question comments:

1  
The provided answer (stackoverflow.com/q/1299871) is only joining based on one column (“CustomerId”), so I don’t think this is a duplicate. Can someone ‘Unduplicate’ this question?

Answers:

Answer 1:

See the documentation on ?merge, which states:

By default the data frames are merged on the columns with names they both have, 
 but separate specifications of the columns can be given by by.x and by.y.

This clearly implies that merge will merge data frames based on more than one column. From the final example given in the documentation:

x <- data.frame(k1=c(NA,NA,3,4,5), k2=c(1,NA,NA,4,5), data=1:5)
y <- data.frame(k1=c(NA,2,NA,4,5), k2=c(NA,NA,3,4,5), data=1:5)
merge(x, y, by=c("k1","k2")) # NA's match

This example was meant to demonstrate the use of incomparables, but it illustrates merging using multiple columns as well. You can also specify separate columns in each of x and y using by.x and by.y.
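As a rough sketch of what the documentation example returns, and of one way to keep rows with missing keys out of the result: the filtering step with complete.cases is an addition here, not part of the ?merge example, and the names keys, m, x_ok, y_ok and m_ok are made up for illustration.

```r
x <- data.frame(k1 = c(NA, NA, 3, 4, 5), k2 = c(1, NA, NA, 4, 5), data = 1:5)
y <- data.frame(k1 = c(NA, 2, NA, 4, 5), k2 = c(NA, NA, 3, 4, 5), data = 1:5)

keys <- c("k1", "k2")

# Merge on both key columns: a pair of rows matches only if k1 AND k2 agree.
# The non-key column "data" exists in both inputs, so it gets suffixes.
m <- merge(x, y, by = keys)
names(m)  # "k1" "k2" "data.x" "data.y"
nrow(m)   # 3: the (4, 4), (5, 5) and (NA, NA) key pairs

# Assumption: if NA keys should not match each other, drop rows with
# incomplete keys from both inputs before merging.
x_ok <- x[complete.cases(x[, keys]), ]
y_ok <- y[complete.cases(y[, keys]), ]
m_ok <- merge(x_ok, y_ok, by = keys)
nrow(m_ok)  # 2: only (4, 4) and (5, 5)
```

Filtering beforehand is used here because the documentation describes incomparables as intended for merging on a single column.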

Answer comments:

2  
@darkage This question deals with merging data frames. Looks like you have data.tables. Totally different. I would read the documentation for data.table.

Answer 2:

Hope this helps:

df1 <- data.frame(CustomerId = 1:10,
                  Hobby = c(rep("sing", 4), rep("pingpong", 3), rep("hiking", 3)),
                  Product = c(rep("Toaster", 3), rep("Phone", 2), rep("Radio", 3), rep("Stereo", 2)))

df2 <- data.frame(CustomerId = c(2, 4, 6, 8, 10),
                  State = c(rep("Alabama", 2), rep("Ohio", 1), rep("Cal", 2)),
                  like = c("sing", "hiking", "pingpong", "hiking", "sing"))

df3 <- merge(df1, df2, by.x = c("CustomerId", "Hobby"), by.y = c("CustomerId", "like"))

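This assumes that df1$Hobby and df2$like mean the same thing. A row appears in df3 only when both the CustomerId and the hobby agree between the two data frames.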
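To make the effect of matching on both columns concrete, here is a small sketch that rebuilds the same two data frames and checks what comes back. The all.x = TRUE variant and the name left are additions for illustration, not part of the answer above.

```r
df1 <- data.frame(
  CustomerId = 1:10,
  Hobby = c(rep("sing", 4), rep("pingpong", 3), rep("hiking", 3)),
  Product = c(rep("Toaster", 3), rep("Phone", 2), rep("Radio", 3), rep("Stereo", 2))
)
df2 <- data.frame(
  CustomerId = c(2, 4, 6, 8, 10),
  State = c(rep("Alabama", 2), "Ohio", rep("Cal", 2)),
  like = c("sing", "hiking", "pingpong", "hiking", "sing")
)

# Inner join on two column pairs: CustomerId with CustomerId, Hobby with like.
# The key columns in the result take their names from df1.
df3 <- merge(df1, df2,
             by.x = c("CustomerId", "Hobby"),
             by.y = c("CustomerId", "like"))
names(df3)  # "CustomerId" "Hobby" "Product" "State"
nrow(df3)   # 3: customers 2, 6 and 8

# Customers 4 and 10 exist in both data frames but are dropped, because
# their Hobby in df1 differs from their like in df2.

# Illustrative variant: keep every row of df1 (a left join); rows without
# a match in df2 get NA in State.
left <- merge(df1, df2,
              by.x = c("CustomerId", "Hobby"),
              by.y = c("CustomerId", "like"),
              all.x = TRUE)
nrow(left)              # 10
sum(is.na(left$State))  # 7
```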

Original URL:

https://stackoverflow.com/questions/47754186/vlookup-match-by-two-columns-in-r
