Re: [Question] pandas question - lance5487 - PTT (批踢踢实业坊)

Re: [Question] pandas question

Original poster: lance5487 ( ) 2018-02-09 22:35:50

Sorry, I'd like to ask one more question QQ. It's hard to describe in the abstract, so let me use an example @@
USERID .... COLUMNA
A 10
A 20
A 30
A 40
A 80
B 20
B 30
B 40
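What I want is to set threshold values on COLUMNA and, for each USERID, compute the proportion of rows that are at or below each threshold.
Suppose the thresholds are an array {20, 40, 80}; the result should be a DataFrame like the one below.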
USERID  THRESHOLD<=20  THRESHOLD<=40  THRESHOLD<=80
A       2/5=0.4        4/5=0.8        5/5=1
B       1/3=0.33       3/3=1          3/3=1
.       .              .              .
.       .              .              .
.       .              .              .
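I know how to do it for a single threshold column, but for several columns all I can come up with is brute force: compute each one and keep joining the results. Is there a more concise way?
The single-column version is:
df.groupby('USERID').apply(lambda x: (x['COLUMNA'] <= 20).sum() / len(x))
If possible I'd rather not use for, since for loops tend to be slow, but a solution that uses for is fine too XD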
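One loop-free way to get all threshold columns at once is to compare COLUMNA against every threshold in a single NumPy broadcast and then take the per-user mean of the resulting boolean columns. This is only a sketch under assumptions: the sample DataFrame is rebuilt from the example in the post, and `threshold_ratios` is a hypothetical helper name, not something from pandas.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the example in the post (an assumption).
df = pd.DataFrame({
    "USERID": ["A"] * 5 + ["B"] * 3,
    "COLUMNA": [10, 20, 30, 40, 80, 20, 30, 40],
})


def threshold_ratios(df, thresholds, key="USERID", col="COLUMNA"):
    """Hypothetical helper: share of rows per `key` with `col` <= each threshold."""
    thresholds = np.asarray(thresholds)
    # Broadcasting: (n_rows, 1) <= (n_thresholds,) -> (n_rows, n_thresholds) booleans.
    hits = df[col].to_numpy()[:, None] <= thresholds
    flags = pd.DataFrame(
        hits,
        index=df.index,
        columns=[f"THRESHOLD<={t}" for t in thresholds],
    )
    # The mean of a boolean column is the fraction of True values.
    return flags.groupby(df[key]).mean()


result = threshold_ratios(df, [20, 40, 80])
print(result)
```

For the sample data this gives 0.4, 0.8, 1.0 for user A and 0.33, 1.0, 1.0 for user B, matching the table in the post. The only explicit loop is the list comprehension that builds the column labels; the comparison itself is vectorized.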

Author: painkiller (肚子饿~) 2018-02-09 23:34:00

Look up how to use pandas.cut
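A possible reading of the pandas.cut hint: bin COLUMNA into right-closed intervals ending at each threshold, count rows per user and bin, then take a cumulative sum across bins so each column means "at or below this threshold". The sample data and the column labels below are assumptions taken from the example in the question, not something the commenter wrote.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the example in the question (an assumption).
df = pd.DataFrame({
    "USERID": ["A"] * 5 + ["B"] * 3,
    "COLUMNA": [10, 20, 30, 40, 80, 20, 30, 40],
})
thresholds = [20, 40, 80]

# Bins (-inf, 20], (20, 40], (40, 80]; pd.cut closes bins on the right by default.
bins = [-np.inf] + thresholds
labels = [f"THRESHOLD<={t}" for t in thresholds]
binned = pd.cut(df["COLUMNA"], bins=bins, labels=labels)

# Rows per (user, bin); observed=False keeps bins with no rows in the result.
counts = (
    df.groupby(["USERID", binned], observed=False)
    .size()
    .unstack(fill_value=0)
)

# Cumulative counts across bins, divided by each user's total row count.
ratios = counts.cumsum(axis=1).div(df.groupby("USERID").size(), axis=0)
ratios.columns = list(ratios.columns)  # plain string labels instead of categories
print(ratios)
```

Values larger than the last threshold fall outside every bin and are left out of the counts, but they still count in the denominator, so the ratios stay relative to each user's full row count.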

Author: HenryLiKing (HenryLiKing) 2018-02-09 23:34:00

You're doing this for a competition, aren't you?

Author: goldflower (金色小黄花) 2018-02-10 04:14:00

((df['USERID'] == 'A') & (df['COLUMNA'] <= 20)).sum() / (df['USERID'] == 'A').sum() ? Then just loop with for uid in df['USERID'].unique() and that should do it, right?
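The loop idea in this reply could be spelled out roughly as follows. Note that the count has to come from summing a boolean mask, since len() of a mask returns the total number of rows rather than the number of matches. The sample data and the names `labels`, `records`, and `user_ids` are assumptions for illustration.

```python
import pandas as pd

# Sample data reconstructed from the example in the question (an assumption).
df = pd.DataFrame({
    "USERID": ["A"] * 5 + ["B"] * 3,
    "COLUMNA": [10, 20, 30, 40, 80, 20, 30, 40],
})
thresholds = [20, 40, 80]
labels = [f"THRESHOLD<={t}" for t in thresholds]

records = []
user_ids = df["USERID"].unique()
for user_id in user_ids:
    # All COLUMNA values belonging to this user.
    values = df.loc[df["USERID"] == user_id, "COLUMNA"]
    # Mean of a boolean mask = matching rows / total rows for this user.
    records.append([(values <= t).mean() for t in thresholds])

result = pd.DataFrame(
    records,
    index=pd.Index(user_ids, name="USERID"),
    columns=labels,
)
print(result)
```

This scans the frame once per user, so it is likely to be slower than a grouped, vectorized version when there are many users, but it is easy to read and check by hand.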

Author: b24333666 (比飞笨) 2018-02-12 10:16:00

The commenter above looks awfully familiar

Author: ar54971 2018-03-06 03:42:00

https://goo.gl/2WCUGr

Author: galeondx 2018-03-06 04:19:00

https://goo.gl/cybm9m https://goo.gl/MKaCK6 https://tinyurl.com/yadsk3lo
