Practical AI Alignment and Interpretability Research Group — Interpretability Work

Organization: Practical AI Alignment and Interpretability Research Group
Award Date: 09/2024
Amount: $737,000
Purpose: To support interpretability research, the creation of open-source course materials on mechanistic interpretability, and a mentorship program.

Open Philanthropy recommended a grant of $737,000 over two years to the Practical AI Alignment and Interpretability Research Group. The grant supports work led by Atticus Geiger to conduct interpretability research, create open-source course materials on mechanistic interpretability, and run a mentorship program.

This falls within Open Philanthropy’s focus area of potential risks from advanced artificial intelligence.
