Re: about hw2

Author: ChihJen ( )   2005-03-17 19:54:35
I think you can check how long it takes on a subset of covtype.
For example, try the first 10000, 50000, and 100000 lines and see how the time
scales.
This will tell us whether the full 580k instances are feasible.
If the scaling is linear, 80x40 <= 1hr is good enough.
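
A rough sketch of that check in R, assuming the data is in the sparse "label index:value ..." format. The names parse_lines and make_line are made up for illustration, and the lines are synthetic so the snippet runs without the real file; swap in readLines("covtype") and the real sizes to measure the actual loader.

```r
# Hypothetical per-line parser for "label index:value ..." lines (dense output).
parse_lines <- function(lines, ncolumn) {
  m <- matrix(0, length(lines), ncolumn)
  for (i in seq_along(lines)) {
    tok <- as.numeric(strsplit(lines[i], " +|:", perl = TRUE)[[1]])
    m[i, 1] <- tok[1]                          # column 1 holds the label
    if (length(tok) >= 3) {
      idx <- tok[seq(2, length(tok), by = 2)]  # feature indices
      val <- tok[seq(3, length(tok), by = 2)]  # feature values
      m[i, 1 + idx] <- val                     # feature k goes to column k + 1
    }
  }
  m
}

# Synthetic stand-in for the real file: 10 features per line, labels 1..7.
set.seed(1)
make_line <- function() {
  paste(sample(1:7, 1),
        paste0(1:10, ":", round(runif(10), 3), collapse = " "))
}
lines <- replicate(2000, make_line())

# Time the parser on growing prefixes of the data.
sizes <- c(500, 1000, 2000)
secs <- sapply(sizes, function(n) {
  system.time(parse_lines(lines[seq_len(n)], 11))[["elapsed"]]
})
per_line <- secs / sizes

# If per_line stays roughly constant, the scaling is linear and the
# largest run can be extrapolated to the full 580k instances.
estimate_full <- per_line[length(per_line)] * 580000
print(data.frame(n = sizes, seconds = secs, per_line = per_line))
```

If per_line grows with n instead, the loader is worse than linear and the extrapolation will underestimate the full run.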
※ Quoting LCL2 (新年快热~):
: It turns out the code for loading the covtype data seems to take about
: 6 minutes on a 1.2G AMD CPU XD, while a Perl program takes only
: about 2 minutes.
: ※ Quoting LCL2 (新年快热~):
: : I modified the above code to support indices. It takes
: : 3x sec on a 1.2G AMD CPU to load usps. But I still don't think it's fast
: : enough for hw3 XD. Is there any other way? (If there is any problem with
: : the code, please tell me, thanks)
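: : readData <- function (filename, ncolumn){
: :   # Each line looks like "label index:value index:value ..."
: :   lines <- readLines(filename)
: :   linenum <- length(lines)
: :   # After splitting, odd positions hold the label and the values,
: :   # even positions hold the feature indices.
: :   valueIndex <- 2*(1:ncolumn)-1
: :   indexIndex <- 2*(1:(ncolumn-1))
: :   dataMatrix <- matrix(0, linenum, ncolumn)
: :   for(i in seq_len(linenum)){
: :     tmp <- as.numeric(strsplit(lines[i], " +|:", perl=TRUE)[[1]])
: :     dataTmp <- tmp[valueIndex]
: :     indexTmp <- tmp[indexIndex]
: :     # Sparse lines have fewer tokens, so the unused positions are NA.
: :     iUsedIndex <- !is.na(indexTmp)
: :     dUsedIndex <- !is.na(dataTmp)
: :     # Column 1 is the label; feature k goes to column k+1.
: :     dataMatrix[i, c(1, 1+indexTmp[iUsedIndex])] <- dataTmp[dUsedIndex]
: :   }
: :   return(dataMatrix)
: : }
: : tmp <- readData("usps", 257)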
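
On "is there any other way?": one thing that might be worth trying is to drop the per-line loop and let strsplit handle all lines at once, then fill the matrix with a single indexed assignment. This is only a sketch, not the code from the quoted post and not timed on usps or covtype. The name read_sparse is made up, and it assumes every line is well formed, with a label followed by at least one index:value pair.

```r
# Hypothetical vectorized reader for "label index:value ..." lines.
# Assumes every line has a label and at least one index:value pair.
read_sparse <- function(lines, ncolumn) {
  toks <- strsplit(lines, " +|:", perl = TRUE)      # one token vector per line
  labels <- as.numeric(vapply(toks, function(t) t[1], character(1)))
  rest <- lapply(toks, function(t) t[-1])           # index/value tokens only
  n_pairs <- lengths(rest) %/% 2                    # pairs per line
  flat <- as.numeric(unlist(rest))
  idx <- flat[c(TRUE, FALSE)]                       # odd positions: indices
  val <- flat[c(FALSE, TRUE)]                       # even positions: values
  m <- matrix(0, length(lines), ncolumn)
  m[, 1] <- labels                                  # column 1 holds the label
  rows <- rep(seq_along(lines), n_pairs)            # row number for each pair
  m[cbind(rows, 1 + idx)] <- val                    # feature k goes to column k + 1
  m
}

# Small usage example on two hand-written lines.
m <- read_sparse(c("2 1:0.5 3:1.5", "1 2:4"), 4)
print(m)
```

With a real file this would be called as read_sparse(readLines("usps"), 257). Whether it actually beats the loop depends on the data and the R version, so it is worth timing both on the same subset.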
