Re: [问题] 处理资料问题(pkg:dplyr)

楼主: celestialgod (天)   2015-08-06 14:11:06
※ 引述《psinqoo (零度空间)》之铭言:
: 延伸问题
: 正被搞晕中
: RAW DATA长这样
好读版:http://pastebin.com/tPH8i43p
library(data.table)
library(reshape2)
library(dplyr)
library(tidyr)
library(magrittr)
dat = data.table(ID = rep(LETTERS[1:2], 4:3),
location = c('TAI', 'JP', 'CH', 'KOE', 'JP', 'GOK', 'TA'),
year = c(2012, 2013, 2014, 2011, 2011, 2015, 2012),
sex = rep(c("F", "M"), 4:3))
: ID 地点 日期 性别
: A TAI 2012 F
: A JP 2013 F
: A CH 2014 F
: A KOE 2011 F
: B JP 2011 M
: B GOK 2015 M
: B TA 2012 M
: 变成下面这样
: 第一种 形式
: ID 地点一 地点二 地点三 地点四 性别
: A KOE TAI JP CH F
: B JP TA GOK M
dat %>% group_by(ID, sex) %>%
mutate(location_name = paste0("location_", 1:n())) %>%
select(-year) %>% spread(location_name, location, fill = "")
# ID sex location_1 location_2 location_3 location_4
# 1 A F TAI JP CH KOE
# 2 B M JP GOK TA
: 第二种
: ID 2011 2012 2013 2014 2015 性别
: A KOE TAI JP CH F
: B JP TA GOK M
dat %>% spread(year, location, fill="")
## dcast.data.table(dat, ID + sex ~ year, value.var = "location")
# ID sex 2011 2012 2013 2014 2015
# 1: A F KOE TAI JP CH
# 2: B M JP TA GOK
: 第三种
: ID 地点 性别
: A KO,TAI,JP,CH F
: B JP,TA,GOK M
## method 1
dat %>% spread(year, location, fill="") %>%
setnames(as.character(2011:2015), paste0("year_", 2011:2015)) %>%
mutate(location = paste(year_2011, year_2012,year_2013,year_2014,year_2015,
sep = ",")) %>%
mutate(location = gsub(',+', ',',gsub(',$', '', location))) %>%
select(ID, sex, location)
# ID sex location
# 1: A F KOE,TAI,JP,CH
# 2: B M JP,TA,GOK
## method 2 (little tricky)
dat3 = dat %>% mutate(ones = "c1") %>% select(-year) %>%
dcast.data.table(ID + sex ~ ones,
fun.aggregate = function(x){paste(x, collapse = ',')},
value.var = "location") %>% setnames("c1", "location")
dat3
# ID sex location
# 1: A F TAI,JP,CH,KOE
# 2: B M JP,GOK,TA
## 3 (method 2 with order of year)
# PS: output will be the same as method 1
dat3_2 = dat %>% mutate(ones = "c1") %>%
arrange(ID, year) %>% select(-year) %>%
dcast.data.table(ID + sex ~ ones,
fun.aggregate = function(x){paste(x, collapse = ',')},
value.var = "location") %>% setnames("c1", "location")
dat3_2
# ID sex location
# 1: A F KOE,TAI,JP,CH
# 2: B M JP,TA,GOK
作者: psinqoo (零度空間)   2015-08-06 15:59:00
好强好强~~~~五体投地麦安捏共~~造福版友

Links booklink

Contact Us: admin [ a t ] ucptt.com