---
date: '2025-12-18'
description: on teaching how models should learn.
id: Reinforcement learning
modified: 2026-06-05 15:08:27 GMT-04:00
socials:
  lilog: https://lilianweng.github.io/posts/2018-02-19-rl-overview/#key-concepts
tags:
  - sapling
  - ml
  - scaling
title: Reinforcement learning
created: '2025-12-18'
published: '2025-12-18'
pageLayout: default
slug: thoughts/Reinforcement-learning
permalink: https://aarnphm.xyz/thoughts/Reinforcement-learning.md
generator:
  quartz: v4.6.0
  hostedProvider: Cloudflare
  baseUrl: aarnphm.xyz
full: https://aarnphm.xyz/llms-full.txt
---
![[thoughts/Policy gradient]]