I have a data frame df1:

  X1 X2
  1  12345678
  2  400454
  3  12345214
  4  77753523
  5  77753827

And there is a second data frame, df2:

  X1 X2  X3
  1  123 1
  2  125 5
  3  400 2
  4  643 3
  5  423 4
  6  765 5
  7  213 6
  8  124 2
  9  777 9
  10 432 1

I want to add a third column, df1$X3, to df1. For each row it should hold the df2$X3 value from the df2 row whose df2$X2 value matches part of df1$X2. This is how df1 should look:

  X1 X2       X3
  1  12345678 1
  2  400454   2
  3  12345214 1
  4  77753523 9
  5  77753827 9

Please tell me how this can be implemented.

  • Should any part of the value match, or specifically the leading three digits/characters? - Yury Arrow
  • @YuryArrow Thanks for the reply. In the real task the "key" may not be at the beginning and may have a different length. So yes, I am interested in a match on any part of the value, not just the first 3 digits. - Vladridge

2 answers

In fact, as formulated, the task is not fully specified. What will you do if a string matches several keys at once?
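To make the ambiguity concrete, here is a minimal sketch in R that counts how many df2 keys occur anywhere inside each df1 value. The lowercase column names (x1, x2, x3), the `hits` matrix, and the value `12340078` are assumptions introduced for illustration; they are not part of the question.

```r
# Sample data from the question (lowercase column names are an assumption)
df1 <- data.frame(x1 = 1:5,
                  x2 = c(12345678, 400454, 12345214, 77753523, 77753827))
df2 <- data.frame(x1 = 1:10,
                  x2 = c(123, 125, 400, 643, 423, 765, 213, 124, 777, 432),
                  x3 = c(1, 5, 2, 3, 4, 5, 6, 2, 9, 1))

# Convert numbers to plain digit strings (no scientific notation, no padding)
vals <- format(df1$x2, scientific = FALSE, trim = TRUE)
keys <- format(df2$x2, scientific = FALSE, trim = TRUE)

# hits[i, j] is TRUE when key j occurs anywhere inside value i
hits <- matrix(FALSE, nrow = length(vals), ncol = length(keys))
for (j in seq_along(keys)) {
  hits[, j] <- grepl(keys[j], vals, fixed = TRUE)
}

# Number of matching keys per df1 row: 0 = no match, > 1 = ambiguous
n_matches <- rowSums(hits)

# Hypothetical value (not in the question) that contains two keys, 123 and 400
extra <- "12340078"
n_extra <- sum(vapply(keys, function(k) grepl(k, extra, fixed = TRUE), logical(1)))
```

Rows with `n_matches > 1` need an explicit rule (first key, longest key, or an error), and rows with `n_matches == 0` need a default such as NA. The sample data happens to give exactly one match per row, so the question itself does not say which rule is wanted.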

I can't think of a quick vectorized way to solve this. One option is to build a table of links by brute-force enumeration. The right solution depends heavily on the context in which it will be used. For a one-off report, this will do. If it were going into production, I would not use my solution... (The source data frames should be created as in the previous answer.)

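    # Start with an empty table of links
    df_links <- data.frame(valkey=NA, val=NA, spr_key=NA, spr_val=NA, spr_val_2=NA, links=NA)
    df_links <- df_links[-1,]

    # Compare every row of df1 with every row of df2
    for (i in 1:nrow(df1)) {
      for (j in 1:nrow(df2)) {
        df_links <- rbind(df_links, data.frame(
          valkey=df1[i,1],
          val=df1[i,2],
          spr_key=df2[j,1],
          spr_val=df2[j,2],
          spr_val_2=df2[j,3],
          # TRUE if the df2 key occurs anywhere inside the df1 value
          links=grepl(pattern=df2[j,2], x=df1[i,2], fixed=TRUE)
        ))
      }
    }

    # Keep only the matching pairs: df1 key, df1 value, df2 third column
    print(df_links[df_links$links==TRUE,c(1,2,5)])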
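The same enumeration can be written without growing a data frame inside the loop. The following is only a sketch under one stated assumption: when several keys match, the first matching key in df2 order wins, which is a rule the question does not specify. The lowercase column names and the helper vector `first_hit` are hypothetical.

```r
# Sample data from the question (lowercase column names are an assumption)
df1 <- data.frame(x1 = 1:5,
                  x2 = c(12345678, 400454, 12345214, 77753523, 77753827))
df2 <- data.frame(x1 = 1:10,
                  x2 = c(123, 125, 400, 643, 423, 765, 213, 124, 777, 432),
                  x3 = c(1, 5, 2, 3, 4, 5, 6, 2, 9, 1))

# Convert numbers to plain digit strings (no scientific notation, no padding)
vals <- format(df1$x2, scientific = FALSE, trim = TRUE)
keys <- format(df2$x2, scientific = FALSE, trim = TRUE)

# For each df1 value, index of the first df2 key found anywhere inside it
# (NA when no key matches). "First key wins" is an assumed tie-break rule.
first_hit <- vapply(vals, function(v) {
  idx <- which(vapply(keys, function(k) grepl(k, v, fixed = TRUE),
                      logical(1), USE.NAMES = FALSE))
  if (length(idx) == 0L) NA_integer_ else idx[1L]
}, integer(1), USE.NAMES = FALSE)

# Look up the third column of df2 for the matched key
df1$x3 <- df2$x3[first_hit]
print(df1)
```

This still performs nrow(df1) * nrow(df2) substring checks, so it is no faster asymptotically than the nested loop; it only avoids the repeated rbind calls and keeps the original row order of df1.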

    df1 <- data.frame(
      x1=1:5,
      x2=c(12345678,400454,12345214,77753523,77753827)
    )
    df2 <- data.frame(
      x1=1:10,
      x2=c(123,125,400,643,423,765,213,124,777,432),
      x3=c(1,5,2,3,4,5,6,2,9,1)
    )
    ###########################################################
    # Create the key: the first three characters of df1$x2
    df1$key <- substr(df1[,2],1,3)
    # Use the same type (character) for the key on both sides
    df2$key <- as.character(df2$x2)
    # Merge the tables on the key
    df1 <- merge(df1,df2[,c('key','x3')], by='key')
    # Drop the key
    df1 <- df1[ , !(names(df1) %in% "key")]
    df2 <- df2[ , !(names(df2) %in% "key")]
    print(df1)