Original poster:
acd51874 (Iwakuma)
2017-11-13 08:33:00 Original link: https://goo.gl/bcVKEA
The season’s complete, which means the numbers are official. This is convenient for a writer, because it means there shouldn’t be any issues anymore with comparing 2017 to another full season in the past. A full season is a full season. So how about a quick full-season review of the pitch-framing data? There’s something interesting going on. Something dramatic, something that shakes the foundation of the numbers themselves. I have the graphs to prove it.
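With the season over, a full season of official data is now available for writers everywhere to use (or to bluff with), and nobody can object to full-season comparisons anymore. Wonderful.
On pitch framing, the author runs a quick season-to-season comparison and finds that some long-held assumptions about framing are not as unshakable as they looked.
He then uses a few graphs (rather than a long, tedious proof) to explain what he found.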
The most advanced pitch-framing information is available at Baseball Prospectus. It’s long been the gold standard, and so many hundreds of hours have gone into generating the results that get published on the sortable leaderboards. There are two framing metrics of note, for catchers and for entire teams. One is just framing runs above average, which is self-explanatory. The other is CSAA, or called strikes above average. This is directly related to framing runs above average, but it’s expressed as a rate stat. I think that’s all you need to know. In this post, I’m going to use them both.
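The framing data in this piece comes from BP (Baseball Prospectus). BP's framing data can be viewed for a single catcher or for a whole team, and it comes in two metrics:
1. framing runs above average
2. called strikes above average (CSAA)
The author uses both in his analysis.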
There exist 10 years of detailed information, based on the pitch-tracking technology that’s been in existence. This is a plot of standard deviations, year to year, over the course of the decade. This is on the team level, using runs above average. This is just an examination of each year’s spread.
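Thanks to pitch-tracking technology, we have ten years of detailed data to play with. The chart below is at the team level and uses framing runs above average (RAA); it shows the standard deviation for each of the last ten years.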
https://i.imgur.com/v339xjK.jpg
There’s currently less spread than there was in the past. There hasn’t been any real change since 2015, but there’s still less spread than in, say, 2008. This is presumably related to a point I’ve made before: More teams than ever are aware of the value of good framing. So more teams are prioritizing it, which makes it harder to stand out. When you raise the floor, you narrow the distribution. There are still differences between the best teams and the worst, but the gaps are somewhat smaller. Neat stuff.
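There has been little change over the last three years, but the spread is considerably smaller than it was in 2008.
This may be related to a point I've made before, that teams have started to value framing: because everyone now pays attention to it, the gap to the former leaders has begun to shrink. A gap between good framing and poor framing still remains, though.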
But that’s not really what I want to show you here. Instead, I’d like to call your attention to something taking place with individual catchers. I gathered data for every catcher since 2008 who’s had at least 2,000 framing opportunities in consecutive seasons. Here’s how the numbers held up, in terms of CSAA, between 2013 – 2016.
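That, however, is not the main point of this piece; I'd like everyone to look at individual catchers instead.
I pulled the data for every catcher in the last ten years who had at least 2,000 framing opportunities in consecutive seasons. The blue dots in the chart below are those catchers' called strikes above average (CSAA) from 2013 to 2016.
(Unfortunately, we can't tell who each dot is or where any given catcher sits.)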
https://i.imgur.com/JqUrPCy.jpg
You see that pretty strong, linear relationship. You want that relationship to exist; that way you can have the confidence you’re measuring something real and sustainable. That plot comes with an R^2 value of 0.49. The slope is 0.69. A good framer in one year was likely to be a good framer in the next year, and the opposite was also true.
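The chart shows a fairly strong linear relationship (R^2 = 0.49, slope 0.69). This is a relationship people want to see, because it is a shot in the arm for anyone who believes in framing research, and it leads to the conclusion that "a good receiver is likely to remain a good receiver the following year, and vice versa."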
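The R^2 and slope quoted for these scatter plots can be reproduced from any list of season pairs. Below is a minimal sketch, assuming the fit is an ordinary least-squares line of next-season CSAA on current-season CSAA (the article does not state the exact method). The function name `fit_year_to_year` and the demo values are hypothetical placeholders, not Baseball Prospectus data.

```python
from math import fsum


def fit_year_to_year(pairs):
    """Least-squares fit of next-season CSAA on current-season CSAA.

    pairs: list of (csaa_this_year, csaa_next_year), one per catcher season pair.
    Returns (slope, intercept, r_squared).
    """
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two season pairs")
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mean_x = fsum(xs) / n
    mean_y = fsum(ys) / n
    # Sums of squares and cross-products around the means.
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    syy = fsum((y - mean_y) ** 2 for y in ys)
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in pairs)
    if sxx == 0 or syy == 0:
        raise ValueError("CSAA must vary in both seasons")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With a single predictor, R^2 is the squared Pearson correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Made-up placeholder pairs, only to show the call; NOT real framing data.
demo_pairs = [(-0.020, -0.012), (-0.010, -0.011), (0.000, 0.004),
              (0.010, 0.002), (0.020, 0.015)]
slope, intercept, r_squared = fit_year_to_year(demo_pairs)
print(f"slope={slope:.2f}  intercept={intercept:.4f}  R^2={r_squared:.2f}")
```

A slope below 1 means that next season's number is, on average, pulled back toward the league mean; an R^2 near 0 means this season says little about next season.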
Moving on now, here’s the same plot, except for only 2016 – 2017.
Applying the same method to 2016-2017:
https://i.imgur.com/p3uMxKx.jpg
Look, there’s still some relationship. This isn’t the picture of randomness. And yet, the R^2 value is 0.20. The slope is just 0.42, which is down 40% from the earlier plot. Something just happened. Or, something is continuing to happen. I think this last plot drives it home. Here are all the year-to-year R^2 values, in terms of CSAA.
A relationship is still visible, but it is much less pronounced.
The author thinks something has happened, or is still happening.
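A quick arithmetic check on the two fits quoted so far, assuming each is an ordinary least-squares line of next-season CSAA on current-season CSAA (the article does not spell out the method). Under that assumption R^2 is the squared correlation r, and the slope b equals r times the ratio of the two seasons' standard deviations, written here as s_x and s_y (my notation, not the article's):

```latex
\begin{align*}
\text{slope drop} &= \frac{0.69 - 0.42}{0.69} \approx 0.39
  \quad \text{(the ``down 40\%'')} \\
r_{2013\text{--}2016} &= \sqrt{0.49} = 0.70, \qquad
r_{2016\text{--}2017} = \sqrt{0.20} \approx 0.447 \\
b = r\,\frac{s_y}{s_x} \;&\Rightarrow\; \frac{s_y}{s_x} = \frac{b}{r}: \qquad
\frac{0.69}{0.70} \approx 0.99, \qquad \frac{0.42}{0.447} \approx 0.94
\end{align*}
```

If that reading is right, the slope fell mostly because the correlation itself fell, not because the second-season spread shrank much relative to the first.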
https://i.imgur.com/kWlpgAV.jpg
The sustainability is eroding. Which means the predictability is eroding. Sure, there was a little spike a year ago, but the last three years have the three weakest relationships in the sample. And, actually, the last four years have the four weakest relationships in the sample. When pitch-framing performance was first being measured, one of the things that made it so exciting was that the numbers held up so well, year to year. It was essentially proof of signal. The method of measurement hasn’t meaningfully changed ever since. It’s the same system. If anything, it’s more advanced now than ever. It’s had the benefit of time. But the year-to-year relationships are disintegrating. A good framer in 2016 was still likely to look like a good framer in 2017, but that couldn’t be said with very much confidence. The data is getting increasingly random.
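The chart shows that the sustainability of CSAA is eroding, and with it the predictability. There was a small rebound in the previous year-to-year comparison, but it still sits below the ten-year line (I'll see myself out...).
"A catcher who frames well is likely to frame well again the next year."
That statement still holds some truth, but whoever says it can no longer say it with as much confidence as before.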
Welington Castillo is currently a free agent. He just spent the year with the Orioles. When he became an Oriole, he had the record of being a below-average framer. Last year, he performed like an above-average framer. Ditto J.T. Realmuto. Ditto Stephen Vogt. Buster Posey, meanwhile, got a lot worse, and so did, say, Tony Wolters. I have a sample of 393 catcher season pairs. Of the 24 biggest year-to-year changes in CSAA, seven of them just happened between 2016 and 2017. I don’t even know how to explain what’s happened with Chris Iannetta.
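We all know that Welington Castillo has long had a poor framing reputation, yet this season his CSAA was unusually above average. The same odd thing happened with the Marlins' J.T. Realmuto and with Stephen Vogt, who moved to the Brewers.
The Giants' Buster Posey, rated an elite framer, was unusually poor this year, and so was Rockies catcher Tony Wolters.
I don't even know what kind of magic trick Chris Iannetta is pulling.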
Of those 393 catcher season pairs, Iannetta is responsible for nine of them. But of the seven largest year-to-year changes, whether for better or worse, Iannetta’s been responsible for three of them. All three are from the most recent seasons. From 2014 to 2015, Iannetta got dramatically better. From 2015 to 2016, he got dramatically worse. And from 2016 to 2017, he got dramatically better again. Iannetta might be the current face of pitch-framing uncertainty. Or maybe it’s Jonathan Lucroy, who just keeps on declining. I don’t know. Things are just weird.
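Of the 393 cross-season pairs, nine belong to Iannetta. Moreover, of the seven largest changes in either direction, Iannetta accounts for three, and they are the three most recent pairs, spanning the last four seasons. From 2014 to 2015 Chris Iannetta's receiving numbers improved sharply, from 2015 to 2016 they fell sharply, and from 2016 to 2017 they improved sharply again.
What Iannetta's numbers may represent is the problem framing data now faces: uncertainty.
Or perhaps that problem is better embodied by another catcher, Jonathan Lucroy, whose framing numbers have declined year after year.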
There are a few possible explanations. One, it’s all a blip. I don’t know. Maybe. Numbers do funny things sometimes. Two, there’s something wrong with the actual data, which might be related to the recent switch-over from PITCHf/x to Trackman/Statcast. That wouldn’t explain what already seemed like a trend before 2017. And all this information comes from Pitch Info, which takes deliberate care to make all necessary adjustments and corrections.
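There are several possible explanations for the above:
(1): All of this may just be a short-lived blip; numbers do funny things sometimes.
(2): There is a problem with the data source, because the collection system switched from PITCHf/x to Trackman/Statcast this year. But that cannot explain a trend that had already started before 2017, and all of this information comes from Pitch Info, which takes deliberate care to make all necessary adjustments and corrections.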
And three, get used to this. This could be the new normal, the consequence of more teams caring about how their catchers catch. Maybe framing is easier to teach and learn than we thought. Maybe more catchers than ever know what they’re supposed to do, and so the baseline for everyone is so high that random volatility plays a larger-than-ever role. Even if everyone were exactly the same, there would still be variation, because the baseball season isn’t infinitely long. If every team, for example, were a true-talent .500 ballclub, a season would still end up with 90-win teams and 90-loss teams. You could detect this volatility because, in subsequent years, you’d observe further randomness. Performances wouldn’t correlate so well year to year. That’s what we’re seeing with framers.
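(3): What this article describes may be the new normal.
More teams have started to value catcher framing, and framing may be easier to teach and learn than we thought. As catchers' technique improves and the overall level rises, randomness in the numbers becomes more visible.
Even if every catcher had exactly the same ability, their measured numbers would still not come out equal.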
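The third explanation can be illustrated with a toy model. The sketch below rests entirely on my own assumptions, not the article's method: games are treated as independent coin flips, and a catcher's observed number is a fixed true talent plus independent season-level noise, both normally distributed in arbitrary units. All function names and parameter values are hypothetical.

```python
import random
from math import comb, fsum


def prob_at_least(wins, games=162, p=0.5):
    """Exact binomial chance that a true-talent-p team wins >= `wins` games."""
    return sum(comb(games, k) * p**k * (1 - p) ** (games - k)
               for k in range(wins, games + 1))


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length samples."""
    n = len(xs)
    mx, my = fsum(xs) / n, fsum(ys) / n
    sxx = fsum((x - mx) ** 2 for x in xs)
    syy = fsum((y - my) ** 2 for y in ys)
    sxy = fsum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return (sxy * sxy) / (sxx * syy)


def simulated_r_squared(talent_sd, noise_sd, n_catchers=5000, seed=0):
    """Year-to-year R^2 when observed = true talent + independent noise."""
    rng = random.Random(seed)
    talents = [rng.gauss(0.0, talent_sd) for _ in range(n_catchers)]
    year1 = [t + rng.gauss(0.0, noise_sd) for t in talents]
    year2 = [t + rng.gauss(0.0, noise_sd) for t in talents]
    return r_squared(year1, year2)


# Chance that a single true .500 team reaches 90 wins by luck alone.
print(f"P(>= 90 wins | true .500) = {prob_at_least(90):.3f}")

# Same noise every time; only the spread of true talent changes.
for label, talent_sd in [("wide talent spread", 1.0),
                         ("narrow talent spread", 0.5),
                         ("everyone identical", 0.0)]:
    print(f"{label}: R^2 = {simulated_r_squared(talent_sd, noise_sd=0.5):.2f}")
```

In this model the expected year-to-year correlation is talent variance divided by total variance, so the expected R^2 for the three settings is about 0.64, 0.25 and 0. The measurement noise never changes; the relationship weakens only because the catchers become more alike, which is the mechanism the third explanation proposes.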
In a sense, you could see this all coming. It’s been possible to forecast, and I’ve written about this on multiple occasions. But it’s still pretty stark to look at that 2016 – 2017 R^2. A far weaker relationship than ever. Seemingly far more randomness than ever. Is Welington Castillo actually a good pitch-framer now? I never would’ve believed it, but the data says what it says. I don’t know what to think about Castillo, and I don’t know what to think about a lot of different guys. Pitch-framing has entered a strange new era. An era in which it still matters, but an era in which it’s not easy to tell who’ll actually stay good at it. It’s difficult to justify a heavy investment in something that now comes with such a high degree of uncertainty. But that same uncertainty is what the league has to reckon with.
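Player performance can still be predicted to some degree, but look at the 2016-2017 R^2 of 0.20: a weaker relationship than at any time before. The data shows more randomness than ever.
Is Castillo a good framer now? I never would have believed it, but that is what the data says. I don't know what to make of Castillo, and I don't know how to evaluate the other players whose numbers swing the way his do.
Framing has entered a new era, one in which the numbers alone make it hard to tell who will stay good at it.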