Robert Kirk
Tag: safety
Speculative inferences about path dependence in LLM supervised fine-tuning from results on linear mode connectivity and model souping
(20 Jul 2023)
How can Interpretability Help Alignment?
(28 May 2020)
What is Interpretability?
(17 Mar 2020)