C.3 Agent Foundations: TA Guide

Session overview

Total time: ~6.5 hours

Block | Duration | Format
Introductory lecture: bird’s eye view and motivation | 1 hour | Lecture
Readings and discussions | 2.5 hours | Structured reading + cross-topic discussion
Guest lecture: reflective oracles and nonrealizability | 1 hour | Lecture (Cole Wyeth)
Exercises | 1 hour | Individual / small group work
Guest lecture: decision theory, information engine, embedded agency and algorithmic thermodynamics | 1 hour | Lecture (Aram Ebtekar)

Lecture: Bird’s eye view and motivation

Key points to cover

Common questions / sticking points

Readings and discussion block

Format

Readings are split into fundamental readings and topic-specific readings. Each participant reads the fundamentals plus one topic track, then shares that track with the others during the cross-topic discussion.

Fundamental readings (everyone):

  • Embedded agency
  • Why agent foundations
  • General purpose search

Topic tracks (one per participant):

  1. Consequentialist foundations (coherence + complete class theorems)
  2. Löb’s theorem and tiling agents
  3. Logical induction
  4. Decision theory
  5. Optimization and thermodynamics
  6. Descriptive agent foundations

Facilitation notes

Per-topic notes

Guest lecture: Reflective oracles and nonrealizability

Context for TAs

Exercise block

Facilitation notes

Exercise solutions

Exercise 1: Gödel’s second incompleteness theorem

Exercise 2: Löb’s theorem

Exercise 3: The complete class theorem

Exercise 4: The do-divergence theorem

Exercise 5: Channel additivity

Guest lecture: Decision theory, information engine, and algorithmic thermodynamics

Context for TAs