※ Quoting angela79979 (mini):
: [Problem description]:
: I have two lists: a.list, b.list
: >head(a.list)
: $'1'
: [1] 3 4 5 8 15
: $'3'
: [1] 2 3 6 9 12 14
: ...
: >head(b.list)
: $'2'
: [1] 2 3 5 13 24
: $'1'
: [1] 2 3 5 6 7 8 9 12
: ...
: For each pair of sub-lists with the same name in a.list and b.list, I want to count how many elements they have in common
: For example:
: similarity<-sum(table(a.list$'1'[a.list$'1' %in% b.list$'1']))
: I want to run this comparison for every name in the lists,
: but with a loop I can't iterate over a.list$'i' or b.list$'i'
: Is there another way to do this?
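The loop in the question most likely fails because `$` takes whatever follows it as a literal name, so a.list$'i' looks for an element named "i" instead of using the value of i. Indexing with [[ ]] evaluates its argument, which is what both methods below rely on. A minimal sketch, using only the first entries printed in the question as toy data:

```r
# toy stand-ins built from the head() output shown in the question
a.list = list('1' = c(3, 4, 5, 8, 15), '3' = c(2, 3, 6, 9, 12, 14))
b.list = list('2' = c(2, 3, 5, 13, 24), '1' = c(2, 3, 5, 6, 7, 8, 9, 12))

i = '1'
# `$` does not evaluate i: it looks for an element literally named "i"
is.null(a.list$'i')
# `[[` evaluates i, so it works inside a loop over names
sum(a.list[[i]] %in% b.list[[i]])
```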
## data generation - assume no duplicated names or elements
library(magrittr)
list_1 = replicate(50, rbinom(1, 15, .5) %>% sample(1:20, .) %>%
                     sort, simplify = FALSE) %>%
  set_names(sample(1:50, 50) %>% as.character)
list_2 = replicate(50, rbinom(1, 15, .5) %>% sample(1:20, .) %>%
                     sort, simplify = FALSE) %>%
  set_names(sample(1:50, 50) %>% as.character)
## method 1 - using for
intersectNames = intersect(names(list_1), names(list_2)) %>% sort
similarity = vector('numeric', length(intersectNames)) %>%
  set_names(as.character(intersectNames))
for (i in seq_along(intersectNames)) {
  # index by name with [[ ]] so the loop variable is evaluated
  similarity[i] = sum(list_1[[intersectNames[i]]] %in%
                        list_2[[intersectNames[i]]])
}
print(head(similarity))
#  1 10 11 12 13 14
#  1  2  4  1  3  1
## method 2 - using map2 (or mapply)
library(purrr)
intersectNames = intersect(names(list_1), names(list_2)) %>% sort
similarity = map2(list_1[intersectNames], list_2[intersectNames],
                  ~ sum(.x %in% .y)) %>% do.call(c, .) %>% set_names(intersectNames)
print(head(similarity))
#  1 10 11 12 13 14
#  1  2  4  1  3  1
PS: map2 is really just mapply... the only difference is that
function(x,y) sum(x %in% y)
is written as
~ sum(.x %in% .y)
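
For anyone who would rather not load purrr, the same idea can be sketched with base R's mapply. The two small lists here are made-up toy data, not the randomly generated lists above:

```r
# toy lists (hypothetical data) sharing the names '1' and '3'
list_1 = list('1' = c(3, 4, 5, 8, 15), '3' = c(2, 3, 6, 9, 12, 14))
list_2 = list('3' = c(1, 2, 6, 14), '1' = c(2, 3, 5, 6, 7, 8, 9, 12))

intersectNames = sort(intersect(names(list_1), names(list_2)))
# mapply walks both lists in parallel and simplifies the result to a
# vector; the result names come from the first list argument
similarity = mapply(function(x, y) sum(x %in% y),
                    list_1[intersectNames], list_2[intersectNames])
print(similarity)
```

Subsetting both lists by intersectNames first is what lines the pairs up, since the two lists do not store their names in the same order.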