Re: [News] OpenAI: We have evidence that DeepSeek misappropriated our models

OP: Lushen (wind joker!!!)   2025-01-30 08:59:21
OpenAI's Chief Research Officer Mark Chen posted a series of tweets in the
early hours of 2025/01/29 commenting on the DeepSeek R1 paper:
https://i.imgur.com/A73X07x.png
https://i.imgur.com/rjDczVH.png
Congrats to DeepSeek on producing an o1-level reasoning model! Their research
paper demonstrates that they’ve independently found some of the core ideas
that we did on our way to o1.
However, I think the external response has been somewhat overblown,
especially in narratives around cost. One implication of having two paradigms
(pre-training and reasoning) is that we can optimize for a capability over
two axes instead of one, which leads to lower costs.
But it also means we have two axes along which we can scale, and we intend to
push compute aggressively into both!
As research in distillation matures, we're also seeing that pushing on cost
and pushing on capabilities are increasingly decoupled. The ability to serve
at lower cost (especially at higher latency) doesn't imply the ability to
produce better capabilities.
We will continue to improve our ability to serve models at lower cost, but we
remain optimistic in our research roadmap, and will remain focused in
executing on it. We're excited to ship better models to you this quarter and
over the year!
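
For context on the distillation point: distillation trains a small "student"
model to imitate a larger "teacher" by matching its output distribution, which
cuts serving cost without creating new capabilities. A minimal PyTorch sketch
of the standard distillation loss (the model names, temperature value, and
training-loop snippet are illustrative assumptions, not anything from OpenAI's
or DeepSeek's actual pipelines):

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, temperature=2.0):
        # Soften both output distributions with a temperature, then push the
        # student's distribution toward the teacher's via KL divergence.
        student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
        teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
        # The T^2 factor keeps gradient magnitudes comparable across
        # temperatures (standard practice since Hinton et al., 2015).
        return F.kl_div(student_log_probs, teacher_probs,
                        reduction="batchmean") * temperature**2

    # Hypothetical usage: only the student receives gradients.
    # with torch.no_grad():
    #     teacher_logits = teacher_model(batch)  # large, frozen teacher
    # loss = distillation_loss(student_model(batch), teacher_logits)
    # loss.backward()

Chen's argument maps onto this setup: the student is far cheaper to serve, but
its capability is bounded by what the teacher already knows, which is why
cheaper serving and stronger capabilities are decoupled.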
Author: tsubasawolfy (悠久の翼)   2025-01-30 09:06:00
The GPT mini series could be even cheaper, couldn't it?
Author: MyPetTankDie   2025-01-30 09:27:00
Efficiency gains are a good thing
Author: redbeanbread (寻找)   2025-01-30 09:28:00
They're rattled. 5090s are going to be in short supply, huh
Author: dongdong0405 (聿水)   2025-01-30 09:47:00
There are Chinese people at OpenAI too. China wins again
Author: herculus6502 (金麟岂是池中物)   2025-01-30 09:54:00
First, you have to be able to brew the wine in the first place
