Creating a new column when a part of a value from one column coincides with a value from another

Question

I have df1:

    X1       X2
     1 12345678
     2   400454
     3 12345214
     4 77753523
     5 77753827

And there is df2:

    X1  X2 X3
     1 123  1
     2 125  5
     3 400  2
     4 643  3
     5 423  4
     6 765  5
     7 213  6
     8 124  2
     9 777  9
    10 432  1

I want to add a third column to df1, df1$X3, filled with the df2$X3 value of the row whose df2$X2 value occurs as part of df1$X2. This is how df1 should look:

    X1       X2 X3
     1 12345678  1
     2   400454  2
     3 12345214  1
     4 77753523  9
     5 77753827  9

Please tell me how this can be implemented.

In the real task the "key" may not be at the beginning of the value and may have a different length.
That is, yes, I am interested in a match on any part of the value, not just the first 3 digits.

Yury arrow · Accepted Answer · 2016-06-23T08:28:40

In fact, as you have formulated it, the task is not fully specified. What will you do if a value matches several keys at once?
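Whether this ambiguity actually arises in a given data set can be checked by counting how many keys each value contains. The following is a minimal sketch, assuming the sample data from the other answer (columns x1, x2, x3) and keys that are plain digit strings; the names values, keys, hits and n_matches are introduced here purely for illustration.

```r
# Sample data, as in the other answer
df1 <- data.frame(x1 = 1:5,
                  x2 = c(12345678, 400454, 12345214, 77753523, 77753827))
df2 <- data.frame(x1 = 1:10,
                  x2 = c(123, 125, 400, 643, 423, 765, 213, 124, 777, 432),
                  x3 = c(1, 5, 2, 3, 4, 5, 6, 2, 9, 1))

values <- as.character(df1$x2)
keys   <- as.character(df2$x2)

# hits[i, j] is TRUE when key j occurs anywhere inside value i
hits <- vapply(keys,
               function(k) grepl(k, values, fixed = TRUE),
               logical(length(values)))

# Number of keys contained in each value; anything above 1 is ambiguous
n_matches <- rowSums(hits)
print(n_matches)
print(which(n_matches > 1))
```

For the sample data every value contains exactly one key, so the question of which key wins only comes up with other inputs.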

I can't think of a quick vectorized way to solve this. What can be done is to build a table of links by brute-force enumeration. The right solution depends heavily on the context in which it will be used. For a one-off report, this will do. For production, I would not use my solution... (The code that creates the source data should be taken from the earlier answer, dated 2016-06-22.)

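    # Empty data frame that will collect every (df1 row, df2 row) pair
    df_links <- data.frame(valkey=NA, val=NA, spr_key=NA, spr_val=NA, spr_val_2=NA, links=NA)
    df_links <- df_links[-1,]

    # Compare each row of df1 with each row of df2
    for (i in 1:nrow(df1)) {
      for (j in 1:nrow(df2)) {
        df_links <- rbind(df_links, data.frame(
          valkey    = df1[i,1],
          val       = df1[i,2],
          spr_key   = df2[j,1],
          spr_val   = df2[j,2],
          spr_val_2 = df2[j,3],
          # TRUE when the df2 key occurs anywhere inside the df1 value
          links     = grepl(pattern=df2[j,2], x=df1[i,2], fixed=TRUE)
        ))
      }
    }

    # Keep only the matching pairs: df1 id, df1 value, df2 x3
    print(df_links[df_links$links==TRUE, c(1,2,5)])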
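For comparison, here is one possible vectorized sketch. It is not the approach used in this answer: it joins all keys into a single regular-expression alternation, so it assumes the keys contain only digits (no regex metacharacters), and when a value contains several keys it keeps only the leftmost match. The names pattern, m and found are illustrative.

```r
# Sample data, as in the other answer
df1 <- data.frame(x1 = 1:5,
                  x2 = c(12345678, 400454, 12345214, 77753523, 77753827))
df2 <- data.frame(x1 = 1:10,
                  x2 = c(123, 125, 400, 643, 423, 765, 213, 124, 777, 432),
                  x3 = c(1, 5, 2, 3, 4, 5, 6, 2, 9, 1))

values <- as.character(df1$x2)
keys   <- as.character(df2$x2)

# One pattern that matches any of the keys, e.g. "123|125|400|..."
pattern <- paste(keys, collapse = "|")

# Position of the leftmost key found in each value (-1 when none is found)
m <- regexpr(pattern, values)

# Extract the matched key text; values without a match stay NA
found <- rep(NA_character_, length(values))
found[as.integer(m) > 0] <- regmatches(values, m)

# Look the matched key up in df2 and copy its x3
df1$x3 <- df2$x3[match(found, keys)]
print(df1)
```

Unlike the nested loop, this keeps one row per df1 value and silently resolves ambiguous cases, which may or may not be what the task needs.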

Yury arrow · Answer 2 · 2016-06-22T15:32:38

    df1 <- data.frame(
      x1=rep(1:5),
      x2=c(12345678,400454,12345214,77753523,77753827)
    )
    df2 <- data.frame(
      x1=rep(1:10),
      x2=c(123,125,400,643,423,765,213,124,777,432),
      x3=c(1,5,2,3,4,5,6,2,9,1)
    )
    ###########################################################
    # Create the key (first 3 characters of df1$x2)
    df1$key <- substr(df1[,2],1,3)
    df2$key <- df2$x2
    # Merge the tables
    df1 <- merge(df1,df2[,c('key','x3')], by='key')
    # Remove the key
    df1 <- df1[ , !(names(df1) %in% "key")]
    df2 <- df2[ , !(names(df2) %in% "key")]
    print(df1)
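Note that merge() sorts the result by the key by default and drops df1 rows that have no matching key. If the original row order matters, the same prefix lookup can be sketched with match() instead. This is only a variation on the code above under the same assumption (the key is the first 3 characters of x2); the name prefix is illustrative.

```r
df1 <- data.frame(x1 = 1:5,
                  x2 = c(12345678, 400454, 12345214, 77753523, 77753827))
df2 <- data.frame(x1 = 1:10,
                  x2 = c(123, 125, 400, 643, 423, 765, 213, 124, 777, 432),
                  x3 = c(1, 5, 2, 3, 4, 5, 6, 2, 9, 1))

# First 3 characters of every df1 value
prefix <- substr(as.character(df1$x2), 1, 3)

# Position of each prefix among the df2 keys (NA when there is no such key)
df1$x3 <- df2$x3[match(prefix, as.character(df2$x2))]
print(df1)
```

Rows without a matching key are kept with x3 set to NA rather than being removed.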
