wow
The global DeepSeek reappearance frenzy! The myth of Silicon Valley giants collapses, 30 knives witness the aha moment
就在这当口,全球复现DeepSeek的一波狂潮也来了。更令人兴奋的是,成本不到30美金,就可以亲眼见证「啊哈」时刻。7B模型复刻,结果令人惊讶港科大助理教授何俊贤的团队,只用了8K个样本,就在7B模型上复刻出了DeepSeek-R1-Zero和DeepSeek-R1的训练。与DeepSeek R1类似,研究者的强化学习方案极其简单,没有使用奖励模型或MCTS类技术。随后,生成长度开始再次增加,此时出现了自我反思机制。
Disclaimer: Investing carries risk. This is not financial advice. The above content should not be regarded as an offer, recommendation, or solicitation on acquiring or disposing of any financial products, any associated discussions, comments, or posts by author or other users should not be considered as such either. It is solely for general information purpose only, which does not consider your own investment objectives, financial situations or needs. TTM assumes no responsibility or warranty for the accuracy and completeness of the information, investors should do their own research and may seek professional advice before investing.
Like
Report
Login to post

No comments yet
