---
date: "2025-03-04T18:44:46.000Z"
title: "2025-03-04"
tags: ["language_models"]
draft: false
---

Learned more about how LLMs are trained: the model first goes through a pre-training phase, followed by a post-training phase of fine-tuning.
In post-training, it is fine-tuned to take turns in a token stream with a human user, using special tokens to demarcate whether a message was written by the user or the assistant.

For example:

```text
<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
<|im_start|>user
Can I ask a question?<|im_end|>
```
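
To make the format concrete for myself, here is a minimal sketch of how a list of messages could be flattened into that token stream.
The `render_chat` function and its `add_generation_prompt` flag are names I made up for illustration, and real chat templates differ from model to model, so treat this as an approximation of the idea rather than any particular model's template.

```python
def render_chat(messages, add_generation_prompt=False):
    """Flatten {"role", "content"} dicts into one ChatML-style string."""
    parts = []
    for message in messages:
        # Each turn is wrapped in the start/end markers, with the role on the first line.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Leave an open assistant turn so the model's completion becomes the reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
    {"role": "user", "content": "Can I ask a question?"},
]

prompt = render_chat(messages, add_generation_prompt=True)
print(prompt)
```

Seen this way, a chat model is still completing text: the prompt just ends with an open assistant turn, and the model keeps generating until it emits `<|im_end|>`.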

Finally, labs have continued to improve model benchmark performance with further fine-tuning techniques like RLHF (reinforcement learning from human feedback), where humans pick the best of a set of responses from the model, and the model is further fine-tuned on this preference data.
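
As a rough sketch of what "this data" might look like, a single human pick can be turned into chosen/rejected pairs.
The `to_preference_pairs` function and the record layout are my own assumptions for illustration, not a description of how any specific lab stores its preference data.

```python
def to_preference_pairs(prompt, responses, best_index):
    """Turn one "best of n" human pick into (chosen, rejected) records."""
    chosen = responses[best_index]
    return [
        {"prompt": prompt, "chosen": chosen, "rejected": rejected}
        for i, rejected in enumerate(responses)
        if i != best_index  # every non-picked response counts as rejected
    ]


pairs = to_preference_pairs(
    prompt="Can I ask a question?",
    responses=["No.", "Of course, go ahead!", "Maybe later."],
    best_index=1,  # hypothetical human choice
)
for pair in pairs:
    print(pair)
```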

Progress is slow, but I feel like I am finally beginning to develop more of a mental model of what is happening in model training.
When I [trained my own language model](/til/fastai/lesson4-blog-post-imitator) on the posts from the blog, I understood that I was training a completion model but didn't fully appreciate the additional steps I would need to shape that "base" model into a chat model myself.
Now, I feel I have a better understanding of how that process works.

This brings me back to a question I have been asking for a while: what happened to the completion models?
Why do I have to use a model fine-tuned on `<|im_start|>` and `<|im_end|>` tokens?