Discussion on "Policy Gradients: REINFORCE from Scratch with NumPy" | Hashnode

Berkan Sesen

I think about AI. A lot.

Apr 8

Policy Gradients: REINFORCE from Scratch with NumPy

In the DQN post, we trained a neural network to estimate Q-values and then picked the best action with argmax. That works when the action space is discrete — push left or push right. But what if you n

sesenai.hashnode.dev | 20 min read

#reinforcement-learning #deep-learning #optimisation

Responses

No responses yet.