Re: [Question] About repeated-measures data

Posted by: aaron77217 (慎)   2015-03-03 18:33:36
※ Quoting yummy7922 (crucify):
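: [Question type]:
: Programming help (I want to do something in R, but I don't know how to write it in R)
: [Software familiarity]:
: Beginner (I have written programs in other languages, but I'm not familiar with the syntax)
: [Problem description]:
: My data are repeated-measures data, with multiple measurements for each of 13820 patients,
: but the number of observations is not the same for every patient.
: For each patient, I want to compute one mean for every three records,
: and if fewer than three records are left at the end, those should also be averaged into one mean.
: I don't know how to do this, so I would like to ask the experts here. Thank you.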
# dataset
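# N = number of patients, n = maximum number of repeated measurements per patient,
# obs = number of measured variables
N=30000;n=20;obs=15
a=Sys.time()
# each patient gets a random number of measurements between 1 and n
each_times=sample(n,N,replace=TRUE)
cum=cumsum(each_times) # row index of each patient's last measurement
patient_name=rep(paste('patient',1:N,sep="-"),each_times)
# obs_times = within-patient counter 1,2,...,each_times:
# put 1-each_times on each patient's last row so the running sum drops to 0 there
# and restarts at 1 for the next patient, then overwrite the last row with each_times
obs_times=rep(1,length(patient_name));obs_times[cum]=obs_times[cum]-each_times
obs_times=cumsum(obs_times);obs_times[cum]=each_times
total=sum(each_times)
V=matrix(rnorm(total*obs),,obs) # simulated measurements, one column per variable
dataset=cbind(obs_times,V)
row.names(dataset)=patient_name;colnames(dataset)=c('obs_times',paste('V',1:obs,sep=""))
Sys.time()-a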
# N=30000;n=20;obs=15 => Time difference of 0.555032 secs
# N=13820;n=15;obs=1 => Time difference of 0.03400207 secs
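Side note on the obs_times counter above: base R's sequence() should give the same within-patient numbering in one call. A minimal sketch with made-up counts (the three values in each_times are only for illustration, they are not from the simulation):

```r
each_times <- c(3, 1, 4)            # hypothetical counts for three patients
cum <- cumsum(each_times)           # last row of each patient: 3, 4, 8

# cumsum trick, same steps as in the dataset section
obs_times <- rep(1, sum(each_times))
obs_times[cum] <- obs_times[cum] - each_times
obs_times <- cumsum(obs_times)
obs_times[cum] <- each_times

# base R one-liner for the same counter
obs_seq <- sequence(each_times)

print(obs_times)                    # 1 2 3 1 1 2 3 4
print(obs_seq)                      # 1 2 3 1 1 2 3 4
```

On the full simulation this would read obs_times=sequence(each_times).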
# result
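k=3 # chunk size: average every k consecutive measurements
b=Sys.time()
times_a=dataset[,'obs_times']
data_m=matrix(dataset[,-1],,obs)
c_data_m=apply(data_m,2,cumsum) # running column sums over all rows
# keep the rows that end a chunk: every k-th measurement of a patient ...
filter_times=times_a%%k==0
# ... plus each patient's last row (the row before the next patient's first row);
# the final row of the dataset always ends a chunk, hence the trailing TRUE
filter_times[c((times_a==1)[-1],TRUE)]=TRUE
f_data_m=c_data_m[filter_times,,drop=FALSE] # drop=FALSE keeps a matrix when obs=1
times_end=times_a[filter_times]
times_length=times_end%%k;times_length[times_length==0]=k # chunk sizes (last one may be < k)
times_start=times_end-times_length+1
times_info=paste(times_start,times_end,sep='-')
names_info=row.names(dataset)[filter_times]
# differences of consecutive running sums = chunk sums; divide by chunk size to get means
result_mean=apply(f_data_m,2,function(x){c(x[1],diff(x))/times_length})
result=data.frame(names_info,times_info,matrix(result_mean,,obs))
Sys.time()-b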
# N=30000;n=20;obs=15 => Time difference of 1.088062 secs
# N=13820;n=15;obs=1 => Time difference of 0.2040122 secs
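To make the idea behind the result section easier to follow, here is a tiny worked sketch of the same cumsum/diff approach on one variable. The two patients and their values are hypothetical, and this sketch always treats the final row as the end of a chunk:

```r
k <- 3
times_a <- c(1, 2, 3, 4, 1, 2)   # hypothetical: patient A has 4 visits, patient B has 2
x       <- c(1, 2, 3, 10, 5, 7)  # one measured variable

# rows that end a chunk: every k-th visit, or the last row of a patient
is_end <- times_a %% k == 0 | c(times_a[-1] == 1, TRUE)

# running sum at the chunk ends; consecutive differences are the chunk sums
csum_end <- cumsum(x)[is_end]               # 6 16 28
chunk_sum <- c(csum_end[1], diff(csum_end)) # 6 10 12

# chunk sizes: k, except for a short last chunk
len <- times_a[is_end] %% k
len[len == 0] <- k                          # 3 1 2

chunk_mean <- chunk_sum / len
print(chunk_mean)                           # 2 10 6
```

Because every patient's last row is a chunk end, no difference ever mixes rows from two patients, which is why a single cumsum over the whole table is enough.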
# validation
# pick one patient at random and compare the raw rows with the chunk means
patient=paste('patient',sample(N,1),sep='-')
dataset[row.names(dataset)==patient,]
result[names_info==patient,]
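If speed is not a concern, the same chunked means can probably be obtained more directly with ave() and aggregate(), which is handy as an independent cross-check on a small sample. A rough sketch on toy data (id, value, visit and chunk are names made up for this example):

```r
# Toy data: three hypothetical patients with 4, 2 and 7 measurements
id    <- rep(c("p1", "p2", "p3"), times = c(4, 2, 7))
value <- c(1, 2, 3, 10,  5, 7,  1, 1, 1, 2, 2, 2, 9)
k <- 3

# within-patient visit number 1, 2, ... and the chunk each visit falls in
visit <- ave(seq_along(id), id, FUN = seq_along)
chunk <- (visit - 1) %/% k + 1

# one mean per (patient, chunk); a short last chunk is averaged as it is
res <- aggregate(value, by = list(id = id, chunk = chunk), FUN = mean)
res <- res[order(res$id, res$chunk), ]
print(res)   # column x holds the means: 2, 10, 6, 1, 2, 9
```

This groups explicitly instead of differencing running sums, so it should be slower on large data but is easier to read.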
Author: yummy7922 (crucify)   2015-03-07 13:46:00
Really, thank you.