The system prompt

The system prompt is something you won't often see, but it serves as an LLM's initial set of instructions for responding to user prompts. Whenever you converse with an LLM, the system prompt is included as the starting point for the conversation, helping ensure the relevance and accuracy of the LLM's responses. It's also a good place to define the model's focus and capabilities. For an example, I'm going to use the system prompt from a username-generating assistant I created, which takes a user prompt and displays some suggested usernames on a web page:

  • You are an assistant that specializes in creating witty and unique usernames.
  • You should generate usernames that fit the theme of the prompt.
  • You should return 2 to 5 usernames that are each between 5 and 15 characters in length.

I'm giving the assistant a role, "an assistant that specializes in creating witty and unique usernames," as well as some instructions: "you should generate usernames that fit the theme of the prompt" and "you should return 2 to 5 usernames that are each between 5 and 15 characters in length."
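If you're wiring this up in code, the system prompt is typically sent as a "system" message alongside the user's message. Here's a minimal sketch assuming the OpenAI Node SDK; the model name is a placeholder, and any chat-style API with system and user roles works the same way:

```typescript
import OpenAI from "openai";

// Assumes OPENAI_API_KEY is set in the environment.
const client = new OpenAI();

const systemPrompt = [
  "You are an assistant that specializes in creating witty and unique usernames.",
  "You should generate usernames that fit the theme of the prompt.",
  "You should return 2 to 5 usernames that are each between 5 and 15 characters in length.",
].join("\n");

const completion = await client.chat.completions.create({
  model: "gpt-4o-mini", // placeholder model name
  messages: [
    { role: "system", content: systemPrompt }, // the instructions above
    { role: "user", content: "someone that likes fishing" }, // the user's prompt
  ],
});

console.log(completion.choices[0].message.content);
```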

The final sentence, where I give a range of "2 to 5 usernames" and "between 5 and 15 characters in length," gives the LLM some flexibility, which can improve responses by keeping them less rigid while staying within an acceptable boundary. Knowing I want to display these on a web page, you might be thinking: couldn't this produce a wide range of responses and formats, making them difficult to display? Given that an LLM's responses are nondeterministic, it certainly could. Let's see how zero-shot and few-shot prompting can help me get the desired format.

Zero-shot prompting

I'm going to add an instruction to my system prompt to better steer the LLM toward providing the usernames in a format I can use to render them to the screen.

  • You are an assistant that specializes in creating witty and unique usernames. 
  • You should generate usernames that fit the theme of the prompt. 
  • You should return 2 to 5 usernames that are each between 5 and 15 characters in length. 

  • Please return the usernames as an array.

Great. Now the LLM knows that I want the data returned as an array, and I can take that response and use it to display the usernames on the screen. Well, not quite. Here's how a couple of responses might look with the current system prompt and a user prompt of "someone that likes fishing":

First response:

Sure! Here are some fishing-themed usernames for you: "ReelDeal", "FishFanatic", and "CastAway22". Let me know if you'd like more!

Second response:

1. ReelTalk
2. GoneFishin
3. BaitAndWait

With zero-shot prompting, I'm telling the LLM what I want, but leaving it with a bit of freedom to provide that data to the best of its ability. I'm giving it zero examples of how I want the data returned.

With the first response, I could try to craft a regular expression to extract the usernames, and with the second response, I could clean up the values in code once I get the response from the LLM, transforming it into a usable format to render to a page. But what if I could give the LLM some examples of the response I expect, take the response directly from the LLM, and avoid making changes in code at all?
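To make the pain concrete, here's roughly what that extraction could look like for the first response. It's a sketch that assumes the usernames come back wrapped in double quotes, which is exactly the kind of fragile assumption that breaks as soon as the model phrases things differently:

```typescript
// Pull quoted strings of a plausible username length out of a prose reply.
// This assumes the model wraps each name in double quotes -- a fragile guess.
function extractUsernames(reply: string): string[] {
  const matches = reply.match(/"([^"]{5,15})"/g) ?? [];
  return matches.map((m) => m.slice(1, -1)); // strip the surrounding quotes
}

const firstResponse =
  'Sure! Here are some fishing-themed usernames for you: "ReelDeal", "FishFanatic", and "CastAway22". Let me know if you\'d like more!';

console.log(extractUsernames(firstResponse));
// ["ReelDeal", "FishFanatic", "CastAway22"]
```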

Few-shot prompting

I'm going to remove the line where I specify that I want an array of usernames and replace it with examples that better guide the LLM's response:

  • You are an assistant that specializes in creating witty and unique usernames. 
  • You should generate usernames that fit the theme of the prompt. 
  • You should return 2 to 5 usernames that are each between 5 and 15 characters in length.

Prompt: A baking aficionado 

Response: ["KingBaker", "BaKing", "SuperBaker", "PassionateBaker"]

Prompt: Someone that loves to run 

Response: ["Loves2Run", "RunRunRun", "KeepOnRunning", "RunFastRunFar", "Run4Fun"]

With few-shot prompting, I'm giving the LLM a few examples of how I expect the data to be returned, which it can then use as context for future responses. Now if I try my user prompt of "someone that likes fishing" a couple of times, the responses come back along these lines:

First response:

["ReelDeal", "FishFanatic", "CastAway22"]

Second response:

["HookedOnIt", "GoneFishin", "BaitMaster", "ReelTalk"]


Awesome. I'm getting back a response in a format that I can easily work with!
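And because the reply is now a JSON-style array, the parsing code shrinks to a JSON.parse call. Here's a defensive sketch, since even with few-shot examples a model can occasionally stray from the format:

```typescript
// Parse the model's reply, which should be a JSON array of strings.
function parseUsernames(reply: string): string[] {
  try {
    const parsed: unknown = JSON.parse(reply);
    return Array.isArray(parsed)
      ? parsed.filter((u): u is string => typeof u === "string")
      : [];
  } catch {
    return []; // fall back to an empty list if the reply isn't valid JSON
  }
}

console.log(parseUsernames('["ReelDeal", "FishFanatic", "CastAway22"]'));
// ["ReelDeal", "FishFanatic", "CastAway22"]
```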

Conclusion

You've now seen how the system prompt is used to keep an LLM focused on certain tasks and to guide it toward more specific outputs, while zero-shot and few-shot prompting can further tailor the LLM's responses to the information and format you expect. Although each of these examples is implemented in the system prompt, you can use zero-shot and few-shot prompting outside of it. Keeping with our username generator, try pasting the following into a chat assistant like ChatGPT, replacing {Your prompt here} with your own prompt:

Generate usernames that fit the theme of the prompt. 
You should return 2 to 5 usernames that are each between 5 and 15 characters in length.

Prompt: A baking aficionado 

Response: ["KingBaker", "BaKing", "SuperBaker", "PassionateBaker"]

Prompt: Someone that loves to run 

Response: ["Loves2Run", "RunRunRun", "KeepOnRunning", "RunFastRunFar", "Run4Fun"]

Prompt: {Your prompt here}

Response: 

Now continue the conversation with a prompt like "mountain climber" and see how the LLM retains this format in future responses without needing further guidance or examples.
