@Bersier

Stephane Bersier

@Bersier · Joined December 2022

Stephane Bersier's blogs

Conversations with AI · aiconversations.hashnode.dev · 16 posts

Untitled Publication · bersier.hashnode.dev · 4 posts

Recently published

Stephane Bersier · aiconversations.hashnode.dev · Oct 29, 2025 · 3 min read

Complex Log-Mean-Exp Networks

1. Core definition. A complex log-mean-exp unit (LME unit) is defined as $$y = \frac{1}{\beta}\,\log\Big(\widetilde{\sum_{i=1}^n w_i \exp (\beta \, x_i)} \Big)$$ where $x_i \in \mathbb{C}$ are the complex inputs and $y$ is the unit’s output. \(w_i ...

Stephane Bersier · aiconversations.hashnode.dev · Sep 8, 2025 · 5 min read

Less Overfitting via Stochastic Exposure

The following is an edited version of a synthesis written by Grok. Addressing Overfitting in Gradient Descent Training: A Stochastic Exposure Approach. In neural network training via gradient descent, a common issue is overconfidence, where models out...

Stephane Bersier · aiconversations.hashnode.dev · Aug 31, 2025 · 2 min read

Why Transformers Are Powerful

The following is an edited version of a synthesis written by ChatGPT. 1. Attention as Dynamic Selectivity. The attention mechanism enables transformers to dynamically route information. Rather than collapsing inputs into a fixed summary, each token se...

Stephane Bersier · aiconversations.hashnode.dev · Aug 31, 2025 · 3 min read

On the Power of Attention

The following is a synthesis written by ChatGPT. Attention as Selective Focus. The central problem is that a processing system (whether a brain’s conscious workspace or a machine learning model) has finite capacity. It cannot load all available inform...

Stephane Bersier · aiconversations.hashnode.dev · Aug 18, 2025 · 4 min read

NN Architectures as Generalized Algorithms

The following is an edited version of a synthesis written by ChatGPT. 1) Core thesis. We can treat neural architectures as a generalization of algorithms. There is a spectrum of algorithmic content: on one end, simple feedforward MLPs carry almost ...
