Speculative inferences about path dependence in LLM supervised fine-tuning from results on linear mode connectivity and model souping
This is a link-post for a post that I published on the Alignment Forum.