Ritwik Raha

Jun 8, 2025

Can we really scale RL?

Yes and No. LLM reasoning research is just a big pile of math. We stir the math every once in a while, and it starts doing crazy stuff. For months the community has argued that RL post-training just polishes ideas an LLM already had. ProRL politely s...

blog.ritwikraha.dev · 11 min read

#reinforcement-learning #machine-learning #llm #large-language-models #claudeai #gpt #rlhf
