Tutorials

Step-by-step guides for running interpretability experiments with Seer.

Getting Started

  1. Local Mode - Run locally without Modal (no GPU; for API-based investigations)
  2. Sandbox Intro - Spin up a GPU, load a model, let an agent explore it
  3. Scoped Sandbox - Expose specific functions instead of full notebook access

Example Investigations

  1. Hidden Preference - Discover hidden bias using activation analysis
  2. Introspection - Test whether models can detect injected concepts
  3. Checkpoint Diffing - Find behavioral changes between model versions
  4. Petri Harness - Auditor/target/judge pattern for probing behaviors