LLM AI: like a Person

The magical thing about LLMs is that talking to an LLM is like talking to a person.

You can talk to it (usually just by typing rather than actually talking, although with suitably voice-enabled AIs speaking out loud is also an option), and it talks back to you.

You can ask it questions, and it can give you useful answers.

It’s like a person, and unlike most actual people, it has read the whole internet (at least up to a certain cutoff date), so its answers to your questions can be quite well informed.

The model providers have even worked out how to configure LLMs to do work for you. Agentic systems work by asking the LLM “how should I solve this problem, assuming I have such-and-such tools available to use?”, and the LLM’s answers are translated into actual tool invocations.

Like Claude Code.
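To make that concrete, here is a minimal sketch of an agentic loop in Python. Everything in it is invented for illustration: the ask_llm callable stands in for whatever model API you use, and the single read_file tool plus the TOOL:/DONE: text protocol are toy stand-ins for the structured tool-use interfaces that real systems like Claude Code rely on.

```python
# Minimal sketch of an agentic loop (illustrative only).
# `ask_llm` stands in for whatever LLM API you are using; the single
# `read_file` tool and the TOOL:/DONE: text protocol are invented here
# purely to show the shape of the loop.

def read_file(path: str) -> str:
    """The one 'tool' this toy agent can use."""
    with open(path) as f:
        return f.read()

TOOLS = {"read_file": read_file}

def run_agent(task: str, ask_llm) -> str:
    history = [
        "You can call tools by replying 'TOOL: <name> <argument>'.",
        "Available tools: read_file <path>.",
        "Reply 'DONE: <answer>' when finished.",
        f"Task: {task}",
    ]
    for _ in range(10):  # hard limit on how many steps the agent may take
        reply = ask_llm("\n".join(history))
        if reply.startswith("DONE:"):
            return reply[len("DONE:"):].strip()
        if reply.startswith("TOOL:"):
            _, name, arg = reply.split(maxsplit=2)
            result = TOOLS[name](arg)   # translate the LLM's answer into a real tool call
            history.append(f"Result of {name}: {result}")
        else:
            history.append(reply)
    return "Gave up after too many steps."
```

The point of the sketch is simply that the LLM only ever produces text; it is the surrounding loop that turns “I would use the read_file tool” into an actual call.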

“AI is like a person, but that person isn’t me”

The LLM is like a person, but what kind of person is it like?

It’s based on training data that includes, among other things, most of what has been written on the open internet.

So it’s a mish-mash of all the different people in the world.

Given all the weird, horrible and generally not-very-nice ways that many people act when they are on the internet, it’s not surprising that the responses of a raw LLM can be weird, horrible and not-very-nice.

For LLMs there are basically two things that the model provider can do to turn the raw LLM into something genuinely useful and helpful: post-training (such as reinforcement learning from human feedback) to shape how it responds, and a System Prompt that tells it how to behave.

So the LLM starts off as a person-like thing that is some kind of mixture of all the people in the world who have access to the internet, and then the model provider does things to it to make it into a nicer, more helpful and more useful person-like thing.

But you might be thinking: it would be really useful and helpful if the LLM were actually more like me.

Some of what AI is good for is doing the things that we could do for ourselves, but the AI can do it for us more quickly and easily.

One limitation is that the AI is doing the work that a person would do, but that person is not you. If you have special skills that you want to bring to the analysis or solution of a problem, then it would be really good if you could build those skills into the AI before sending it off into the world with instructions to do something.

In other words, you want to upload yourself, or some part of yourself, into the AI.

All the different ways to insert yourself into the LLM AI

There are basically three different places where human-derived information gets fed into a modern LLM AI: the raw training data, reinforcement learning (and other post-training), and the System Prompt.

So if you want to insert more of yourself into the AI, you have those three options.

Inserting yourself into the raw training data

If you have, in the last few years, or decades, posted content to the open internet, then all that content is already in the AI’s training data. In that case you are already in the AI.

(If you haven’t already posted content to the open internet, and you want to include yourself into the training data, then you better start writing and posting and commenting now.)

But even if you have already written a lot of open internet content, you are still just one voice among many. “You” are in there, but it’s not immediately clear how to make the AI give more weight to the you-based content.

You could start from scratch, and make a new LLM that is entirely based on just your own writing. Unfortunately the LLM you would make wouldn’t be very good, because the amount of training data you could provide just wouldn’t be big enough.

Reinforcement Learning

The next option is to take the raw LLM as trained on the raw training data (some of which may come from yourself), and then post-train that LLM to answer questions the way that you think you would answer them.

To actually do this you would have to use an open model. And you would have to buy or rent high-end hardware to do the training. And quite a lot of training might be required. Also “fine tuning” can end up damaging the thing that you are trying to improve.
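To give a feel for what this involves, here is a rough sketch of lightweight post-training on your own writing using the Hugging Face transformers, peft and datasets libraries with LoRA adapters (a cheaper stand-in for full reinforcement-learning-style post-training). The model id, the my_writing.txt file, the target module names and the hyperparameters are all placeholders, not recommendations.

```python
# Sketch of lightweight post-training on your own writing with LoRA adapters.
# Assumes an open model (the name below is just a placeholder) and a plain
# text file of your own writing; hyperparameters are illustrative, not tuned.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "some-open-model"                       # placeholder model id
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Wrap the base model with small trainable LoRA adapters instead of
# updating all of its weights; adjust target_modules to the model's
# actual attention layer names (an assumption here).
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],
))

# Your own writing, one document per line, tokenized for causal LM training.
data = load_dataset("text", data_files="my_writing.txt")["train"]
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                remove_columns=["text"])

Trainer(
    model=model,
    args=TrainingArguments(output_dir="me-lora", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()

model.save_pretrained("me-lora")               # only the small adapter weights are saved
```

Even a cut-down recipe like this needs a capable GPU and a fair amount of your own text, which is exactly the cost the paragraph above is warning about.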

System Prompt

The third option for uploading yourself into the AI is to add yourself to the System Prompt.

Depending on what you are starting with (an open model, or a closed proprietary model), you are either writing the whole System Prompt yourself (in the open case), or writing an appendix to the System Prompt set by the model provider (in the proprietary case).

In effect, this option is where you just “tell” the AI how to be you, in a manner such that the AI can understand your instructions, and follow them well enough to actually be you.
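Mechanically, with a hosted model this usually means sending your persona text in the system role of every API request. Here is a minimal sketch using the OpenAI Python client; the persona text and model name are placeholder examples, and other providers have equivalent parameters (Anthropic’s API takes a system field, for instance). The harder part is deciding what to put in the persona text, which brings us to the next question.

```python
# Sketch of "telling" a hosted model how to be you via the system role.
# The persona text and model name are placeholders; with a proprietary model
# this text is effectively appended after the provider's own System Prompt.
from openai import OpenAI

PERSONA = """You answer as a stand-in for me.
Differences from the average person that matter here:
- I am a database engineer; I reach for SQL before spreadsheets.
- I prefer blunt, short answers and I flag uncertainty explicitly.
"""

client = OpenAI()  # reads OPENAI_API_KEY from the environment

reply = client.chat.completions.create(
    model="gpt-4o-mini",                       # placeholder model name
    messages=[
        {"role": "system", "content": PERSONA},
        {"role": "user", "content": "How should I store these survey results?"},
    ],
)
print(reply.choices[0].message.content)
```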

How does one do this?

The short answer is: I don’t know, so you just have to sit down and try.

Whatever you do, you are subject to the limitations of context size. For example, a proprietary model might have a 100k-token context window that includes a 20k-token System Prompt, leaving only 80k tokens for your own additions to the System Prompt and the actual context required for answering specific questions, having discussions or performing agentic actions.
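Because every token of persona text eats into that budget, it is worth measuring what you write. A quick estimate can be made with the tiktoken library; the exact count depends on the model’s own tokenizer, and persona.txt is a hypothetical file holding your persona text.

```python
# Rough check of how much of the context budget a persona prompt consumes.
# Token counts vary by model; cl100k_base is just a common encoding to
# estimate with, and the 100_000 / 20_000 figures mirror the example above.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
persona = open("persona.txt").read()           # hypothetical file with your persona text

used = len(enc.encode(persona))
budget = 100_000 - 20_000                      # total context minus the provider's System Prompt
print(f"Persona uses {used} tokens, leaving about {budget - used} for the actual work.")
```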

The most important things you need to tell the AI are the things about you that are different from the “average” person. The AI already knows quite a lot about how to be an average person, so there’s no point wasting precious context on telling the AI general things about how to be a person.

Also it may be useful to look at known examples of System Prompts, to get an idea of how they are written in general, and how you might adapt some of those examples to more specifically follow your own ideas about how to think about questions and how to solve problems.