Berkan Sesen in sesenai.hashnode.dev
Solving CartPole Without Gradients: Simulated Annealing · Apr 23 · 17 min read · In the previous post, we solved CartPole using the Cross-Entropy Method: sample 200 candidate policies, keep the best 40, refit a Gaussian, repeat. It worked beautifully, reaching a perfect score of 5…
The Cross-Entropy Method: Solving RL Without Gradients · Apr 21 · 14 min read · Reinforcement learning has accumulated layers of complexity over the years: value functions, policy gradients, replay buffers, target networks. The Cross-Entropy Method predates all of it. Rubinstein…
AI Experts Are Dead. Long Live the AI Experts. · Apr 15 · 16 min read · Last month, my eight-year-old built a Flappy Bird clone from scratch. He can't really type yet. He certainly can't write Python. What he can do is talk to Claude while I whisper in his ear what to say…
Hyperparameter Optimization: Grid vs Random vs Bayesian · Apr 10 · 20 min read · You've trained a Random Forest and it works — 85% accuracy out of the box. But you used the default hyperparameters. What if n_estimators=500 with max_features=0.3 and min_samples_leaf=10 pushes that…
Policy Gradients: REINFORCE from Scratch with NumPy · Apr 8 · 20 min read · In the DQN post, we trained a neural network to estimate Q-values and then picked the best action with argmax. That works when the action space is discrete: push left or push right. But what if you n…
Deep Q-Networks: Experience Replay and Target Networks · Apr 6 · 22 min read · In the Q-learning post, we trained an agent to navigate a 4×4 frozen lake using a simple lookup table: 16 states × 4 actions = 64 numbers. But what happens when the state space isn't a grid? CartPole…
Q-Learning from Scratch: Navigating the Frozen Lake · Apr 4 · 13 min read · Imagine you're standing on a frozen lake. Your goal is on the far side, but there are holes in the ice: fall in and it's game over. Worse, the ice is slippery: when you try to go right, you might sli…
Genetic Algorithms: From Line Fitting to the Travelling Salesman · Apr 3 · 14 min read · Imagine you're planning a road trip through 25 cities. The number of possible routes is 25!/2, roughly 7.8 × 10²⁴, more than the number of stars in the observable universe. You can't try them all. An…
Backpropagation Demystified: Neural Nets from First Principles · Apr 2 · 15 min read · Every modern deep learning framework (PyTorch, TensorFlow, JAX) does one thing brilliantly: it computes gradients for you. Call loss.backward() and millions of parameters update simultaneously. But…
From K-Means to GMM: Hard vs Soft Clustering · Mar 30 · 14 min read · You have a pile of unlabelled data and you want to find groups in it. K-Means is the algorithm everyone reaches for first: it's fast, simple, and usually works. But it makes a bold assumption: every…