Re: [Question] How to tidy count-plus-location data such as 1胃, 2肠 (胃 = stomach, 肠 = intestine)

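Original poster: celestialgod (天)   2015-07-10 15:21:28
※ Quoting helixc (@_2;):
: [Software proficiency]: novice / beginner
: [Problem description]:
: I have dissection data for a frog species and want to analyze its diet.
: The records look like this: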
: ID,Food A,Food B,Food C,Food E
: C146,,,,3肠
: B287,,,,10肠
: C140,,,,4肠
: C133,,,1肠,
: C132,1肠,,,
: B305,,,1肠,
: C112,,2肠,,1肠
: C120,,,,1肠
: C128,,,,1肠
: I would like to reshape it into this:
: ID, Food type, Amount, Location
: C146, E, 3, 肠
: B287, E, 10, 肠
: C140, E, 4, 肠
: C133, C, 1, 肠
library(data.table)
library(dplyr)
library(tidyr)
library(magrittr)
library(stringr)
tmp_dt = fread("ID,Food A,Food B,Food C,Food E
C146,,,,3肠
B287,,,,10肠
C140,,,,4肠
C133,,,1肠,
C132,1肠,,,
B305,,,1肠,
C112,,2肠,,1肠
C120,,,,1肠
C128,,,,1肠", colClasses = rep("character", 5))
## Method 1: extract the amount and the location separately
output_dt = tmp_dt %>% gather(foodType, tmpCol, -ID) %>%
  filter(tmpCol != "") %>%
  # leading digits are the amount; the last character is the location
  mutate(Amount = str_extract(tmpCol, "\\d+"),
         Location = str_sub(tmpCol, -1)) %>%
  select(-tmpCol) %>%
  # keep only the last character of "Food A" etc., i.e. the food type letter
  transform(foodType = str_sub(as.character(foodType), -1))
## Method 2: insert a comma between amount and location, then split on it
output_dt2 = tmp_dt %>% gather(foodType, tmpCol, -ID) %>%
  filter(tmpCol != "") %>%
  transform(foodType = as.character(foodType),
            tmpCol = sub("(\\d+)(.)", "\\1,\\2", tmpCol)) %>%
  separate(tmpCol, c("Amount", "Location"), sep = ",") %>%
  transform(foodType = str_sub(foodType, -1))
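## Method 3 (no sub() needed: separate()'s sep argument can also split by position)
## Note: sep = -1 splits off the last character in tidyr 0.8.2 and later;
## older versions, including the one current when this was posted, needed -2.
output_dt3 = tmp_dt %>% gather(foodType, tmpCol, -ID) %>%
  filter(tmpCol != "") %>%
  transform(foodType = as.character(foodType)) %>%
  separate(tmpCol, c("Amount", "Location"), sep = -1) %>%
  transform(foodType = str_sub(foodType, -1))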
Output (identical for all three methods):
#       ID foodType Amount Location
#  1: C132        A      1       肠
#  2: C112        B      2       肠
#  3: C133        C      1       肠
#  4: B305        C      1       肠
#  5: C146        E      3       肠
#  6: B287        E     10       肠
#  7: C140        E      4       肠
#  8: C112        E      1       肠
#  9: C120        E      1       肠
# 10: C128        E      1       肠
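
For readers on a recent tidyr, the same reshaping can probably be written more compactly with pivot_longer() and extract(). What follows is only a minimal sketch, assuming tidyr 1.0.0 or later; it is not one of the three methods above. The data frame `raw` is a hypothetical three-row subset of the sample data, and `tidy` is just an illustrative name. extract() is called as tidyr::extract() because magrittr also exports a function named extract().

```r
library(dplyr)
library(tidyr)

# Hypothetical three-row subset of the sample data from the question
raw <- data.frame(
  ID       = c("C146", "C133", "C112"),
  `Food A` = c("", "", ""),
  `Food B` = c("", "", "2肠"),
  `Food C` = c("", "1肠", ""),
  `Food E` = c("3肠", "", "1肠"),
  check.names = FALSE, stringsAsFactors = FALSE
)

tidy <- raw %>%
  # wide to long; names_prefix strips "Food " so only the letter remains
  pivot_longer(-ID, names_to = "foodType", names_prefix = "Food ",
               values_to = "tmpCol") %>%
  # drop the empty cells
  filter(tmpCol != "") %>%
  # leading digits go to Amount, everything after them to Location;
  # convert = TRUE turns Amount into an integer column
  tidyr::extract(tmpCol, into = c("Amount", "Location"),
                 regex = "^(\\d+)(.+)$", convert = TRUE)

print(tidy)
```

Unlike splitting off the last character, the regex here does not assume the location is exactly one character long, so it should also cope with longer location labels.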
Author: helixc (@_2;)   2015-07-10 20:16:00
So many new functions to learn, thanks.