"
Models inherit a vast range of humanlike motivations or “personas” from pre-training (when they are trained on a large volume of human work). Post-training is believed to select one or more of these personas more so than it focuses the model on a de novo goal, and can also teach the model how (via what process) it should carry out its tasks, rather than necessarily leaving it to derive means (i.e., power seeking) purely from ends
"