ppotiny-web
PPO-Snake training, end-to-end on WebGPU in your browser

Live environment

idle
env 0/256 length 1

Training metrics

avg500: 0 (last 500 episodes)
peak avg500: 0 (best window so far)
rollouts/sec: 0 (20-step rolling avg)
elapsed: 0s (0 rollouts · 0 episodes)
pol= — val= — ent= — gn= — kl= —

Average episode return (last 500)

Activity log

Watch the trained policy play

0 rollouts · trained 0
Watch how the policy behaves at the current point in training. Training is paused; weights are kept. Hit "Resume training" to continue.

Configure runs

idle
no runs yet