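※ Quoting henry48124 (= =):
: [Question type]:
:
: Programming help (I want to do something in R, but I don't know how to write it in R)
:
: [Software familiarity]:
: Beginner (I have written programs in other languages, but I am unfamiliar with the syntax)
: [Problem description]:
: Hello everyone, I have a dataset that looks like this: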
: head(df)
: id  place  count
: 1   A      1
: 1   B      1
: 2   B      1
: 2   C      3
: 3   D      2
: 4   A      1
: 4   C      2
: 4   D      5
: 5   B      1
: I would like to turn it into:
: id  count  top_place1  top_place2
: 1   2      A           B
: 2   4      C           B
: 3   2      D
: 4   8      D           C
: 5   1      B
: [Code example]:
: This is my current approach. It feels awkward, and it will not work if I later need the top 100.
: Thank you all Orz
: library(dplyr)
: answer <- NULL
: for(x in as.list(unique(df$id))) {
:   # rows for this id, largest count first
:   df_id <- df %>%
:     filter(id == x) %>%
:     arrange(-count)
:   # total for this id only (sum(df$count) would give the grand total)
:   count <- sum(df_id$count)
:   top_place1 <- NA
:   top_place2 <- NA
:   col <- c(x, count, top_place1, top_place2)
:   for(y in 1:nrow(df_id)) {
:     if(y <= 2) {
:       col[y+2] <- df_id[y,]$place
:     }
:   }
:   answer <- rbind(answer, col)
: }
: [Environment]:
: [Keywords]:
Method 1 is a brute-force approach; feel free to skip straight to method 2.
library(data.table)
library(stringr)
library(pipeR)
# Rebuild the example data from the question (3:1 expands to 3, 2, 1)
DT <- data.table(id = rep(1:5, c(2,2,1,3,1)),
                 place = c("A","B","B","C","D","A","C","D","B"),
                 count = c(1,1,1,3:1,2,5,1))
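## method 1:
# Sort by id, then largest count first; ties are broken alphabetically by
# place (ascending), which is what yields A before B for id 1 in the output
setorder(DT, id, -count, place)
numRank <- 3
# For each id, pad the sorted places to numRank entries with "" and join
# them into one comma-separated string (column V1)
DT[ , .(lapply(1:numRank, function(i){
      ifelse(length(place) >= i, place[i], "")
    }) %>>% transpose %>>% sapply(str_c, collapse = ",")), by = .(id)] %>>%
  # split V1 back into the columns top_place1 ... top_place<numRank>
  `[`(j = str_c("top_place", 1:numRank) := transpose(str_split(V1, ",")),
      by = .(id)) %>>%
  `[`(j = V1 := NULL) %>>%
  # attach the total count per id
  merge(DT[ , .(count = sum(count)), by = .(id)], by = "id")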
#    id top_place1 top_place2 top_place3 count
# 1:  1          A          B                2
# 2:  2          C          B                4
# 3:  3          D                           2
# 4:  4          D          C          A     8
# 5:  5          B                           1
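
The padding that method 1 does by hand can probably be shortened by indexing past the end of a vector, which R fills with NA. The following is only a sketch under a few assumptions: ties are broken alphabetically by place, missing ranks are left as NA instead of "", and the name `top` is made up for illustration. It needs only data.table.

```r
library(data.table)

# Same example data as in the question
DT <- data.table(id = rep(1:5, c(2, 2, 1, 3, 1)),
                 place = c("A", "B", "B", "C", "D", "A", "C", "D", "B"),
                 count = c(1, 1, 1, 3, 2, 1, 2, 5, 1))
numRank <- 3

# Largest count first within each id; ties broken alphabetically (assumption)
setorder(DT, id, -count, place)

# place[seq_len(numRank)] pads with NA when an id has fewer than numRank rows,
# so every group returns exactly numRank named columns plus the total count
top <- DT[, c(list(count = sum(count)),
              setNames(as.list(place[seq_len(numRank)]),
                       paste0("top_place", seq_len(numRank)))),
          by = id]
print(top)
```

Changing numRank to 100 should be the only edit needed to get a top 100, since the column names are generated from it.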
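## method 2:
setorder(DT, id, -count, -place)
numRank <- 3
# rr = descending rank of count within each id (1 = largest count).
# frank() ranks ascending, so the rank is flipped; this also reverses the
# order of ties, which is why the data is sorted by -place above
DT[ , rr := length(count) - frank(count, ties.method = "first")+1, by = .(id)]
# keep the top numRank rows per id, spread the ranks into columns,
# rename them, and attach the total count per id
DT[rr %in% 1:numRank] %>>%
  dcast(id ~ rr, value.var = "place") %>>%
  setnames(as.character(1:numRank), str_c("top_place", 1:numRank)) %>>%
  merge(DT[ , .(count = sum(count)), by = .(id)], by = "id")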
#    id top_place1 top_place2 top_place3 count
# 1:  1          A          B         NA     2
# 2:  2          C          B         NA     4
# 3:  3          D         NA         NA     2
# 4:  4          D          C          A     8
# 5:  5          B         NA         NA     1
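
Since the question started from dplyr, the same rank-then-reshape idea as method 2 can be sketched there as well. This is only an illustration, not part of the original answer: it assumes tidyr (1.0 or later, for `pivot_wider`) is installed, breaks ties alphabetically by place, and the name `answer` is simply reused from the question.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question
df <- data.frame(id = c(1, 1, 2, 2, 3, 4, 4, 4, 5),
                 place = c("A", "B", "B", "C", "D", "A", "C", "D", "B"),
                 count = c(1, 1, 1, 3, 2, 1, 2, 5, 1),
                 stringsAsFactors = FALSE)
numRank <- 3

answer <- df %>%
  # largest count first within each id; ties alphabetical (assumption)
  arrange(id, desc(count), place) %>%
  group_by(id) %>%
  # rr is the position within the id; count is replaced by the id's total
  mutate(rr = row_number(), count = sum(count)) %>%
  ungroup() %>%
  filter(rr <= numRank) %>%
  # one column per rank: top_place1, top_place2, ...
  pivot_wider(names_from = rr, values_from = place,
              names_prefix = "top_place")
print(answer)
```

One caveat of this sketch: a top_place column is only created for ranks that actually occur in the data, so if no id has numRank places, fewer columns come out.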