RL Excursions during Pretraining: How early is too early for On-policy Learning? https://rl-excursions.github.io/