Answering my own question: by default, CountVectorizer filters out single-character tokens, because its default token_pattern, (?u)\b\w\w+\b, only matches tokens of two or more word characters. Passing a custom token_pattern of (?u)\b\w+\b keeps single-character words:
from sklearn.feature_extraction.text import CountVectorizer

# Pre-tokenized documents; "|" and "," are non-word characters, so they act as separators.
text = ["我|,|爱你|白Z",
        "他|爱狗",
        "猫|爱鼠"]

# token_pattern '(?u)\\b\\w+\\b' matches runs of one or more word characters,
# so single-character words are kept in the vocabulary.
vectorizer = CountVectorizer(min_df=1, token_pattern='(?u)\\b\\w+\\b')
vectorizer.fit(text)
vector = vectorizer.transform(text)

print(vectorizer.vocabulary_)
print(vector.shape)
print(vector.toarray())
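
For contrast, here is a minimal sketch on the same data with the default settings, showing why the custom token_pattern is needed: with the default pattern, the single-character tokens 我, 他, and 猫 never make it into the vocabulary.

from sklearn.feature_extraction.text import CountVectorizer

text = ["我|,|爱你|白Z",
        "他|爱狗",
        "猫|爱鼠"]

# Default token_pattern is r"(?u)\b\w\w+\b" (two or more word characters),
# so single-character words are silently dropped.
default_vectorizer = CountVectorizer(min_df=1)
default_vectorizer.fit(text)
print(default_vectorizer.vocabulary_)  # expected: only multi-character tokens such as 爱你, 爱狗, 爱鼠, 白z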