Creating a Chatbot

Founded in 2015, OpenAI has become one of the most influential and respected artificial intelligence (AI) research organizations in the world. Its work covers a broad range of AI topics, including deep learning, natural language processing and robotics. OpenAI also provides application programming interfaces (APIs) for incorporating AI and machine learning into applications and services.

To explore the OpenAI APIs, let's develop a command-line chatbot. This chatbot will act as an expert of the user's choosing and will answer questions. It should remember the context of the chat for the current session when answering questions, so the conversation feels natural and free-flowing.

The following are needed to write this chatbot:

OpenAI APIs: The questions asked of the chatbot are sent in the request payload, and the response contains the answer.
Authentication for API requests: Identifies the user making the call.
Libraries for making API calls: An HTTP client library that can make the API calls.

Putting things together

OpenAI has many models, and they are exposed through APIs. Each of them can perform different tasks. Pricing for each model also varies. Users can further customize models to their use case.

Models

Select a model that helps the use case of building a chatbot. A high-level overview of the models:

GPT-4: The most recent set of models which can understand and generate natural language or code. There is a waiting list as of now to access this model.
GPT-3.5: Set of models that is a predecessor to GPT-4 and can understand and generate natural language.
GPT-3: Set of models that is a predecessor to GPT-3.5 and can understand and generate natural language.
DALL·E: Generates and edits images from natural language prompts.
Whisper: Converts audio to text.
Embeddings: Set of models that measure the relatedness between two pieces of text.
Codex: Set of models that generate code.
Moderation: Can identify sensitive or unsafe usage.

See OpenAI's documentation for the other published models. Each GPT model is trained on data up to a cutoff date, so its responses only reflect facts up to that date.

For our chatbot, let's use gpt-3.5-turbo, one of the models in the GPT-3.5 set, which is optimized for chat and is cost-effective. OpenAI models can yield different outputs for the same input. To select the right model, use a comparison tool that runs different models side by side and compare their outputs.

Chat

The chatbot will use a chat model that takes a series of messages as input and returns a generated message as output. Use the POST https://api.openai.com/v1/chat/completions endpoint.

Sample request

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant"},
      {"role": "user", "content": "Who won the world series in 2020?"},
      {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
      {"role": "user", "content": "Where was it played?"}
    ]
  }'

The request has two headers: Content-Type and Authorization. In the request payload, specify the model to use and messages, which is an array of message objects. Each message object consists of a role and content. There are three roles here: system, user and assistant. The system message sets the behavior of the assistant (for example, which expert it should act as), the user messages contain the questions we ask, and the assistant messages contain the answers generated in response. Because the API does not remember earlier requests, including the conversation history in each request preserves the context. The conversation history must fit within the model's token limit.
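
As a rough illustration, the same request can be assembled in Python using only the standard library. This is a sketch rather than an official client: the helper name build_chat_request and the placeholder key are made up for this example, and the request is only constructed here, not sent.

```python
import json
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"


def build_chat_request(api_key, messages, model="gpt-3.5-turbo"):
    """Build (but do not send) a POST request for the chat endpoint.

    build_chat_request is a hypothetical helper name used for illustration.
    """
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# The conversation so far, oldest message first, as in the curl example.
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Who won the world series in 2020?"},
    {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
    {"role": "user", "content": "Where was it played?"},
]
request = build_chat_request("sk-placeholder", messages)
```

Passing the request object to urllib.request.urlopen would send it. Keeping construction separate from sending makes the payload easy to inspect before any tokens are spent.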

model(string, required): ID of the model to use.
messages(array, required): Messages to generate chat completions for.
temperature(number, optional, defaults to 1): Ranges from 0 to 2. Higher values make the output more random; lower values make it more focused and deterministic.
max_tokens(integer, optional, defaults to inf): Maximum number of tokens to generate. Tokens are discussed below.
n(integer, optional, defaults to 1): How many chat completion choices to generate for each input message.
stream(boolean, optional, defaults to false): If set, partial message deltas will be sent, like in ChatGPT.
presence_penalty(number, optional, defaults to 0): Ranges from -2.0 to 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
frequency_penalty(number, optional, defaults to 0): Ranges from -2.0 to 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
logit_bias(map, optional, defaults to null): Modify the likelihood of specified tokens appearing in the completion.
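
To make the optional attributes concrete, here is a minimal Python sketch that checks the documented ranges before adding options to a request body. The helper chat_options and the RANGES table are hypothetical names for this example; the API itself performs its own validation.

```python
# Allowed ranges for the numeric options described in the list above.
RANGES = {
    "temperature": (0.0, 2.0),
    "presence_penalty": (-2.0, 2.0),
    "frequency_penalty": (-2.0, 2.0),
}


def chat_options(**options):
    """Return only the options that were set, after checking their ranges.

    chat_options is a hypothetical helper. Options passed as None are left
    out so that the API applies its own defaults.
    """
    checked = {}
    for name, value in options.items():
        if value is None:
            continue
        if name in RANGES:
            low, high = RANGES[name]
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}")
        checked[name] = value
    return checked


# Required attributes first, then whichever optional ones were provided.
payload = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Where was it played?"}],
    **chat_options(temperature=0.2, max_tokens=256, n=None),
}
```

A low temperature such as 0.2 suits a chatbot that should give focused, repeatable answers.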

There are other optional attributes in the request body that we will not address at this time. A sample response will look like:

Sample response

{
  "id":"chatcmpl-6p9XYPYSTTRi0xEviKjjilqrWU2Ve",
  "object":"chat.completion",
  "created":1677858242,
  "model":"gpt-3.5-turbo-0301",
  "usage":{
   "prompt_tokens":56,
   "completion_tokens":31,
   "total_tokens":87
  },
  "choices":[
   {
     "message":{
       "role":"assistant",
       "content":"The 2020 World Series was played in Arlington, Texas at the Globe Life Field, which was the new home stadium for the Texas Rangers."
     },
     "finish_reason":"stop",
     "index":0
   }
  ]
}

The response to the user's question is in choices. Every choice has a finish_reason. The possible values are:

stop: The model output is complete.
length: The model output is incomplete because of the max_tokens parameter or the token limit.
content_filter: Content was omitted because of a flag from the content filter.
null: The API response is still in progress or incomplete.
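
One way to act on finish_reason is sketched below in Python. The helper extract_answer is a hypothetical name, and the shortened sample response is made up for this example; how a real chatbot should react to each value is a design choice.

```python
import json


def extract_answer(response):
    """Return (content, complete) for the first choice of a chat response.

    extract_answer is a hypothetical helper. `complete` is True only when
    finish_reason is "stop".
    """
    choice = response["choices"][0]
    reason = choice["finish_reason"]
    if reason == "content_filter":
        raise ValueError("content was omitted by the content filter")
    return choice["message"]["content"], reason == "stop"


# A shortened, made-up response whose output was cut off by the token limit.
sample = json.loads("""
{
  "choices": [
    {
      "message": {"role": "assistant", "content": "Arlington, Texas."},
      "finish_reason": "length",
      "index": 0
    }
  ]
}
""")
answer, complete = extract_answer(sample)
if not complete:
    print("Warning: the answer may be cut off.")
print(answer)
```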

Authentication

OpenAI uses API keys for authentication. You can generate an API key at https://platform.openai.com/account/api-keys. The API key is a secret that uniquely identifies the user making the API call, so do not share it. The API key is sent in the HTTP header as Authorization: Bearer OPENAI_API_KEY, with your key in place of OPENAI_API_KEY.
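
A common way to keep the key out of source code is to read it from an environment variable, as the curl example does with $OPENAI_API_KEY. The Python sketch below assumes that variable name; load_api_key and auth_headers are hypothetical helpers.

```python
import os


def load_api_key(env=os.environ):
    """Read the API key from the OPENAI_API_KEY environment variable.

    load_api_key is a hypothetical helper. `env` can be any mapping, which
    makes the function easy to try without touching the real environment.
    """
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key


def auth_headers(api_key):
    """Build the Authorization header expected by the API."""
    return {"Authorization": f"Bearer {api_key}"}


# Example with a placeholder key supplied through a plain dictionary.
headers = auth_headers(load_api_key({"OPENAI_API_KEY": "sk-placeholder"}))
```

Calling load_api_key() with no argument reads the real environment and fails early with a clear message when the key is missing.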

Tokens

Tokens are common sequences of characters found in text. GPT models process text as tokens. A token can be a single character or a whole word. The number of tokens in a single API call affects both the cost and the time required for a response. Both input and output tokens are counted and used for billing. In the response above, total_tokens for the API call is 87 (56 prompt tokens plus 31 completion tokens). Tokenizer is a tool by OpenAI that can be used to see how text is split into tokens.

Chat models have different token limits. For gpt-3.5-turbo, the maximum is 4,096 tokens, shared between the request and the response. It is the caller's responsibility to keep each request within the limit. There are many libraries that can help count the tokens in a request.
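
For a feel of the arithmetic, the Python sketch below estimates token usage without a tokenizer. It assumes roughly four characters per token, which is only a rule of thumb for English text, and it ignores the small per-message overhead the API adds. The names and the 500-token response budget are assumptions; exact counts need a tokenizer library.

```python
MAX_TOKEN_LIMIT = 4096      # context size of gpt-3.5-turbo, in tokens
MAX_RESPONSE_TOKENS = 500   # assumed budget reserved for the answer


def estimate_tokens(text):
    """Very rough estimate: about four characters per token (assumption)."""
    return max(1, len(text) // 4)


def fits_in_limit(messages):
    """Check whether the prompt plus the reserved answer budget fits."""
    prompt_tokens = sum(estimate_tokens(m["content"]) for m in messages)
    return prompt_tokens + MAX_RESPONSE_TOKENS <= MAX_TOKEN_LIMIT


short_chat = [{"role": "user", "content": "Where was it played?"}]
long_chat = [{"role": "user", "content": "x" * 16000}]
```

Here fits_in_limit(short_chat) is true, while long_chat is estimated at 4,000 prompt tokens and no longer leaves room for a 500-token answer.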

Pseudocode

Quick recap and assumptions:

Using gpt-3.5-turbo
POST https://api.openai.com/v1/chat/completions
Authentication using API key from https://platform.openai.com/account/api-keys
User will ask a question and will get the response
Chat history will be maintained
Chatbot will honor the token limit
Can use one of the many OpenAI libraries for making HTTP requests and counting tokens to honor the limit

  create a variable for API_KEY
  create a new instance of openai.Client
  create a new scanner to read from the standard input
  create a chatHistory object

  print "pick an expert: "
  read the input from the scanner and trim any leading or trailing white space
  if the input is empty, log a fatal error message and exit
  set the expert property of the chatHistory object to the input
  add a system chat message to the history indicating the selected expert

  loop forever
    print a prompt indicating the expert and ask for a question, terminate on "stop" command
    read the input from the scanner and trim any leading or trailing white space
    if the input is empty, continue to next iteration
    if the input is "stop", break out of the loop

    add the user's message to the chatHistory object
    ensure the token limit is maintained by the chatHistory object

    get a response from the OpenAI client using the chatHistory object as the message history
    if an error occurs, log an error message and continue to the next iteration

    print the expert's response to the user's input
    add the expert's response to the chatHistory object


chatHistory object
  messages: list of chat messages (role, content)

maintainTokenLimit()
  while the token count of the messages plus MAX_RESPONSE_TOKEN exceeds MAX_TOKEN_LIMIT
    remove the oldest non-system message from the chat history
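
The pseudocode above could be sketched in Python roughly as follows. This is one possible reading, not a reference implementation: the class and function names, the wording of the system message, the 500-token response budget and the default four-characters-per-token estimate are all assumptions, and ask stands for whatever function actually calls the API.

```python
MAX_TOKEN_LIMIT = 4096      # context size of gpt-3.5-turbo, in tokens
MAX_RESPONSE_TOKENS = 500   # assumed budget reserved for the answer


class ChatHistory:
    """Conversation history that starts with a system message."""

    def __init__(self, expert, count_tokens=lambda text: max(1, len(text) // 4)):
        # count_tokens defaults to a rough estimate (assumption); a real
        # tokenizer can be passed in instead.
        self.count_tokens = count_tokens
        self.messages = [
            {"role": "system", "content": f"You are an expert {expert}."}
        ]

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})

    def total_tokens(self):
        return sum(self.count_tokens(m["content"]) for m in self.messages)

    def maintain_token_limit(self):
        # Drop the oldest non-system message until the prompt plus the
        # reserved answer budget fits; always keep the newest message.
        while (len(self.messages) > 2
               and self.total_tokens() + MAX_RESPONSE_TOKENS > MAX_TOKEN_LIMIT):
            del self.messages[1]


def run_chat(history, ask, read=input, write=print):
    """Question loop; ask(messages) must return the answer text."""
    while True:
        question = read('Ask a question ("stop" to quit): ').strip()
        if not question:
            continue
        if question == "stop":
            break
        history.add("user", question)
        history.maintain_token_limit()
        try:
            answer = ask(history.messages)
        except Exception as err:
            write(f"error: {err}")
            continue
        write(answer)
        history.add("assistant", answer)
```

A session would start with something like run_chat(ChatHistory("chef"), ask=send_to_openai), where send_to_openai is a hypothetical function that posts the messages to the chat endpoint and returns the content of the first choice. Passing read and write as parameters keeps the loop easy to exercise without a terminal.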

Conclusion

Using OpenAI APIs, we can easily build intelligent and capable applications; this chatbot is a small example. Just like the chatbot, we can easily write applications to generate code, create and edit images, etc. These APIs can also be integrated into existing applications to add sophisticated capabilities without worrying about the complexity of algorithms. AI in general is already offered as a service by many companies. Developers can leverage OpenAI APIs for building cutting-edge applications.