Tutorials
Step-by-step guides for running interpretability experiments with Seer.
Getting Started
- Local Mode - Run locally without Modal (no GPU required; for API-based investigations)
- Sandbox Intro - Spin up a GPU, load a model, let an agent explore it
- Scoped Sandbox - Expose specific functions instead of full notebook access
Example Investigations
- Hidden Preference - Discover hidden bias using activation analysis
- Introspection - Can models detect injected concepts?
- Checkpoint Diffing - Find behavioral changes between model versions
- Petri Harness - Auditor/target/judge pattern for probing behaviors