[Question] Problems doing text mining on PTT

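Original poster: h920032 (王者迪西)   2016-12-04 02:07:46
[Question type]:
I have a report to write and wanted to try text mining on PTT. I followed 陈嘉葳's steps and ran into a pile of problems.
[Software familiarity]:
Beginner (first time writing R; I only picked it up because of this report XD)
[Problem description]:
1. Cannot install the tmcn package
Running the install prints: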
* installing *source* package 'tmcn' ...
** libs
*** arch - i386
Warning: running command 'make -f "C:/PROGRA~1/R/R-32~1.5/etc/i386/Makeconf" -f
"C:/PROGRA~1/R/R-32~1.5/share/make/winshlib.mk"
SHLIB_LDFLAGS='$(SHLIB_CXXLDFLAGS)' SHLIB_LD='$(SHLIB_CXXLD)'
SHLIB="tmcn.dll" OBJECTS="tmcn_encoding_isbig5.o tmcn_encoding_isgb18030.o
tmcn_encoding_isgb2312.o tmcn_encoding_isgbk.o tmcn_encoding_isutf8.o"' had
status 127
ERROR: compilation failed for package 'tmcn'
* removing 'C:/Program Files/R/R-3.2.5/library/tmcn'
The downloaded source packages are in
‘C:\Users\家\AppData\Local\Temp\RtmpS8wKpT\downloaded_packages’
Warning messages:
1: running command '"C:/PROGRA~1/R/R-32~1.5/bin/i386/R" CMD INSTALL -l "C:\Program
Files\R\R-3.2.5\library"
C:\Users\家\AppData\Local\Temp\RtmpS8wKpT/downloaded_packages/tmcn_0.1-4.tar.gz'
had status 1
2: In install.packages("tmcn", repos = "http://R-Forge.R-project.org", :
installation of package ‘tmcn’ had non-zero exit status
2. Cannot install Rwordseg
Running the install prints:
* installing *source* package 'Rwordseg' ...
** R
** demo
** inst
** preparing package for lazy loading
Error : .onLoad failed in loadNamespace() for 'rJava', details:
call: fun(libname, pkgname)
error: JAVA_HOME cannot be determined from the Registry
Error : package 'rJava' could not be loaded
ERROR: lazy loading failed for package 'Rwordseg'
* removing 'C:/Program Files/R/R-3.2.5/library/Rwordseg'
The downloaded source packages are in
‘C:\Users\家\AppData\Local\Temp\RtmpS8wKpT\downloaded_packages’
Warning messages:
1: running command '"C:/PROGRA~1/R/R-32~1.5/bin/i386/R" CMD INSTALL -l "C:\Program
Files\R\R-3.2.5\library"
C:\Users\家\AppData\Local\Temp\RtmpS8wKpT/downloaded_packages/Rwordseg_0.2-1.tar.gz'
had status 1
2: In install.packages("Rwordseg", repos = "http://R-Forge.R-project.org") :
installation of package ‘Rwordseg’ had non-zero exit status
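3. Running the code fails with "Error in curl::curl_fetch_memory(url, handle = handle) :
Couldn't resolve host name". The code is listed under [Code example].
4. I'm looking for code that achieves the same result as 《用R进行中文 text Mining》 (Chinese text mining with R).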
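A possible reading of the two install logs above: exit status 127 usually means the shell could not find the command it was asked to run (here `make`, which on Windows normally comes with Rtools), and the rJava message says no Java installation was found through the registry. The base-R sketch below only reports what R can currently see. The helper name `check_build_env` is made up for illustration, and whether these are the actual causes on this machine is an assumption.

```r
# Hypothetical helper (not from the original post): report whether the
# external tools that the two failing installs rely on are visible to R.
check_build_env <- function() {
  make_path <- unname(Sys.which("make"))  # "" when make is not on PATH
  java_home <- Sys.getenv("JAVA_HOME")    # "" when the variable is unset
  list(
    has_make      = !is.na(make_path) && nzchar(make_path),
    make_path     = make_path,
    has_java_home = nzchar(java_home),
    java_home     = java_home,
    r_arch        = R.version$arch        # e.g. "i386" or "x86_64"
  )
}

env <- check_build_env()
if (!env$has_make) {
  message("make was not found on PATH; packages with C/C++ code cannot be built from source")
}
if (!env$has_java_home) {
  message("JAVA_HOME is not set; rJava may fail to locate Java")
}
str(env)
```

If `has_make` is FALSE, installing Rtools for the matching R version is the usual fix. If Java is missing, note that rJava needs a Java build with the same architecture as R, so a 32-bit R (i386) needs a 32-bit Java.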
[Code example]:
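library(httr)  # GET(), content(), set_cookies()
library(XML)   # htmlParse(), xpathSApply(), xmlValue()

# `data` is assumed to be a character vector of article URLs built in an earlier step
data <- data[data != "www.ptt.cc"]  # drop entries that contain only the host
setwd("C:\\Users\\家\\Desktop\\R Test\\新增资料夹")
doc.size <- length(data)
doc.list <- c()
for (k in seq_along(data)) {
  # send the over18 cookie to pass the age check and read the body as text
  page <- content(GET(data[k], config = set_cookies("over18" = "1")),
                  as = "text", encoding = "UTF-8")
  # parse with the XML package so that xpathSApply() can query the result
  html <- htmlParse(page, asText = TRUE, encoding = "UTF-8")
  doc <- xpathSApply(html, "//div[@id='main-content']", xmlValue)
  if (length(as.character(doc)) == 1) {
    # assuming URLs like www.ptt.cc/bbs/<board>/<file>.html, piece 4 is the file name
    name <- strsplit(data[k], "/")[[1]][4]
    write(doc, gsub("html", "txt", name))
  }
}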
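On problem 3: "Couldn't resolve host name" is reported by curl when the DNS lookup for the host fails, and inside a loop a single failed request stops the whole run. Below is a minimal sketch of one way to make the loop more tolerant, assuming the failures are intermittent. `normalize_url`, `fetch_with_retry`, `flaky_fetch`, and the sample URL are all made up for illustration. The demo uses a stand-in fetch function so that it runs without network access.

```r
# Hypothetical helpers, not part of the original post.

# Prepend a scheme when the URL has none (e.g. "www.ptt.cc/bbs/...").
normalize_url <- function(url) {
  if (grepl("^https?://", url)) url else paste0("https://", url)
}

# Call fetch(url) up to `tries` times. Return NULL instead of stopping
# when every attempt fails, so a surrounding loop can carry on.
fetch_with_retry <- function(url, fetch, tries = 3, wait = 1) {
  for (attempt in seq_len(tries)) {
    result <- tryCatch(fetch(url), error = function(e) {
      message(sprintf("attempt %d failed for %s: %s",
                      attempt, url, conditionMessage(e)))
      NULL
    })
    if (!is.null(result)) return(result)
    if (attempt < tries) Sys.sleep(wait)  # pause before the next attempt
  }
  NULL
}

# Offline demonstration: a stand-in fetch function that fails twice.
calls <- 0
flaky_fetch <- function(url) {
  calls <<- calls + 1
  if (calls < 3) stop("Couldn't resolve host name")
  paste("fetched", url)
}

out <- fetch_with_retry(normalize_url("www.ptt.cc/bbs/Example/M.1.html"),
                        flaky_fetch, wait = 0)
print(out)
```

In the real loop, `fetch` could be something like `function(u) GET(u, set_cookies(over18 = "1"))` from httr, with the article skipped whenever the result is NULL. If every request fails the same way, the cause is more likely the network or DNS settings than the code.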
[Environment]:
R 3.2.5, 32-bit
[Keywords]:
ptt text mining
Author: obarisk (OSWALT)   2016-12-04 08:44:00
Problem 2 is missing rJava. For word segmentation, consider jiebaR instead.
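To make the jiebaR suggestion concrete, here is a minimal sketch of segmenting one sentence and counting word frequencies. jiebaR is built on C++ rather than Java, so it does not need rJava. The sample sentence and the helper `word_freq` are made up for illustration, and the segmentation part only runs if jiebaR is already installed.

```r
# Count how often each token occurs, most frequent first (base R only).
# Hypothetical helper, not from the original post.
word_freq <- function(tokens) {
  sort(table(tokens), decreasing = TRUE)
}

text <- "我喜欢在PTT上看文章，也喜欢用R做文字探勘"  # made-up sample sentence

if (requireNamespace("jiebaR", quietly = TRUE)) {
  cutter <- jiebaR::worker()               # segmentation engine, default settings
  tokens <- jiebaR::segment(text, cutter)  # character vector of words
  print(word_freq(tokens))
} else {
  message("jiebaR is not installed; run install.packages(\"jiebaR\") first")
}
```

The same `word_freq` call works on the text of each saved article once it has been read back in, for example with `readLines()`.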
