Founded in 2015, OpenAI has become one of the most influential and respected AI research organizations in the world. Its research covers a broad range of topics, including deep learning, natural language processing and robotics. OpenAI also provides APIs for incorporating artificial intelligence (AI) and machine learning into applications and services.
To explore OpenAI application programming interfaces (APIs), let's develop a command-line interface chatbot. This chatbot will act as an expert of the user's choosing and will answer questions. The chatbot should be able to remember the context of the chat for the current session when answering questions, so the conversations feel natural and free-flowing.
The following are needed to write this chatbot:
- OpenAI APIs: The question(s) asked of the chatbot will be in the request payload and the response will be the answer to the question.
- Authenticate API requests: Identifies the user making the call.
- Libraries for making API calls: An HTTP client library that can help make the API calls.
Putting things together
OpenAI has many models, and they are exposed through APIs. Each of them can perform different tasks. Pricing for each model also varies. Users can further customize models to their use case.
Models
Select a model that helps the use case of building a chatbot. A high-level overview of the models:
- GPT-4: The most recent set of models, which can understand and generate natural language or code. At the time of writing, access requires joining a waiting list.
- GPT-3.5: A set of models that precedes GPT-4 and can understand and generate natural language.
- GPT-3: A set of models that precedes GPT-3.5 and can understand and generate natural language.
- DALL·E: Generates and edits images from natural language prompts.
- Whisper: Converts audio to text.
- Embeddings: A set of models that measure the relatedness between two pieces of text.
- Codex: A set of models that generate code.
- Moderation: Identifies sensitive or unsafe usage.
See other models published by OpenAI here. Each GPT model is trained on data up to a certain date, so its responses only reflect facts up to that date.
For our chatbot, let's use gpt-3.5-turbo, a model in the GPT-3.5 set that is optimized for chat and is cost effective. OpenAI models can yield different outputs for the same inputs. To select the right model, use a comparison tool that runs different models side-by-side to compare outputs.
Chat
The chatbot will use a chat model that takes a series of messages as input and returns a generated message as output. Requests are sent with POST to the https://api.openai.com/v1/chat/completions endpoint.
Sample request
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant"},
      {"role": "user", "content": "Who won the world series in 2020?"},
      {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
      {"role": "user", "content": "Where was it played?"}
    ]
  }'
The request has two headers: Content-Type and Authorization. In the request payload, specify the model to use and messages, an array of message objects. Each message object consists of a role and content. There are three roles here: system, user and assistant. The system message sets the behavior of the assistant, a user message carries a question we ask, and an assistant message holds a reply the model generated for an earlier user message. Including the conversation history preserves the context: in the sample above, the model can only resolve "it" in the last question because the earlier turns are sent along. The conversation history must fit within the model's token limit.
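To illustrate how the history carries context, here is a minimal Python sketch that builds the messages array one turn at a time. The helper name add_message and the role check are assumptions made for illustration; only the role and content fields come from the sample request.

```python
import json

# Roles used in the sample request.
VALID_ROLES = {"system", "user", "assistant"}


def add_message(history, role, content):
    """Append one message object to the history and return the history."""
    if role not in VALID_ROLES:
        raise ValueError(f"unknown role: {role!r}")
    history.append({"role": role, "content": content})
    return history


# Rebuild the conversation from the sample request, turn by turn.
history = []
add_message(history, "system", "You are a helpful assistant")
add_message(history, "user", "Who won the world series in 2020?")
add_message(history, "assistant",
            "The Los Angeles Dodgers won the World Series in 2020.")
add_message(history, "user", "Where was it played?")

# The whole list is sent on every call, which is how the model sees
# the earlier turns when it answers the latest question.
print(json.dumps(history, indent=2))
```

Because the API itself is stateless, the client is the one that keeps this list and resends it with each request.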
- model (string, required): ID of the model to use.
- messages (array, required): The messages to generate a chat completion for.
- temperature (number, optional, defaults to 1): Ranges from 0 to 2. Higher values make the output more random; lower values make it more focused and deterministic.
- max_tokens (integer, optional, defaults to inf): Maximum number of tokens to generate. Tokens are discussed below.
- n (integer, optional, defaults to 1): How many chat completion choices to generate for each input message.
- stream (boolean, optional, defaults to false): If set, partial message deltas are sent as they are generated, as in ChatGPT.
- presence_penalty (number, optional, defaults to 0): Ranges between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
- frequency_penalty (number, optional, defaults to 0): Ranges between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
- logit_bias (map, optional, defaults to null): Modifies the likelihood of specified tokens appearing in the completion.
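As a sketch of how these attributes fit together, the Python function below assembles a request body and checks the documented ranges before anything is sent. The function name build_payload is hypothetical, and the choice to omit unset options (so the API applies its own defaults) is an assumption of this example.

```python
import json


def build_payload(messages, model="gpt-3.5-turbo", temperature=None,
                  max_tokens=None, n=None, presence_penalty=None,
                  frequency_penalty=None):
    """Return the JSON body for a chat completion request.

    Optional parameters left as None are omitted from the body.
    """
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    for name, value in (("presence_penalty", presence_penalty),
                        ("frequency_penalty", frequency_penalty)):
        if value is not None and not -2.0 <= value <= 2.0:
            raise ValueError(f"{name} must be between -2.0 and 2.0")

    payload = {"model": model, "messages": messages}
    optional = {
        "temperature": temperature,
        "max_tokens": max_tokens,
        "n": n,
        "presence_penalty": presence_penalty,
        "frequency_penalty": frequency_penalty,
    }
    # Keep only the options the caller actually set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return json.dumps(payload)


# Example: a focused answer capped at 256 generated tokens.
body = build_payload(
    [{"role": "user", "content": "Where was it played?"}],
    temperature=0.2,
    max_tokens=256,
)
print(body)
```

Validating on the client side is optional, since the API rejects out-of-range values anyway, but it surfaces mistakes without spending a network round trip.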
There are other optional attributes in the request body that we will not address at this time. A sample response will look like:
Sample response
{
  "id": "chatcmpl-6p9XYPYSTTRi0xEviKjjilqrWU2Ve",
  "object": "chat.completion",
  "created": 1677858242,
  "model": "gpt-3.5-turbo-0301",
  "usage": {
    "prompt_tokens": 56,
    "completion_tokens": 31,
    "total_tokens": 87
  },
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "The 2020 World Series was played in Arlington, Texas at the Globe Life Field, which was the new home stadium for the Texas Rangers."
      },
      "finish_reason": "stop",
      "index": 0
    }
  ]
}
The answer to the user's question is in the message of the first entry in choices. Every choice has a finish_reason. The possible values are:
- stop: The model output is complete.
- length: The model output is incomplete because of the max_tokens parameter or the token limit.
- content_filter: Content was omitted because of a flag from the content filter.
- null: The API response is still in progress or incomplete.
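A short Python sketch of reading such a response follows. It parses a copy of the sample response, pulls out the answer and checks finish_reason. The helper names extract_answer and is_complete are hypothetical; a JSON null arrives in Python as None.

```python
import json

# The sample response from above, as the raw JSON text the API returns.
SAMPLE_RESPONSE = """
{
  "id": "chatcmpl-6p9XYPYSTTRi0xEviKjjilqrWU2Ve",
  "object": "chat.completion",
  "created": 1677858242,
  "model": "gpt-3.5-turbo-0301",
  "usage": {"prompt_tokens": 56, "completion_tokens": 31, "total_tokens": 87},
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "The 2020 World Series was played in Arlington, Texas at the Globe Life Field, which was the new home stadium for the Texas Rangers."
      },
      "finish_reason": "stop",
      "index": 0
    }
  ]
}
"""


def extract_answer(response):
    """Return (content, finish_reason) of the first choice."""
    choice = response["choices"][0]
    return choice["message"]["content"], choice["finish_reason"]


def is_complete(finish_reason):
    """Only "stop" means the model finished its answer on its own."""
    return finish_reason == "stop"


response = json.loads(SAMPLE_RESPONSE)
answer, reason = extract_answer(response)
print(answer)
if not is_complete(reason):
    print(f"warning: answer may be cut off (finish_reason={reason})")
print("tokens used:", response["usage"]["total_tokens"])
```

With n left at its default of 1 there is a single choice, so reading index 0 is enough for the chatbot.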
Authentication
OpenAI uses API keys for authentication. You can generate an API key here. The API key is a secret that uniquely identifies the user making each API call, so keep it private and do not share it. The key is sent in an HTTP header as Authorization: Bearer OPENAI_API_KEY, where OPENAI_API_KEY stands for your own key.
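One way to keep the key out of the source code is to read it from an environment variable, as the curl sample does with $OPENAI_API_KEY. The Python sketch below builds an authenticated request with the standard library only; it does not send it. The function name build_request is hypothetical, and reading the key from the environment is an assumption of this example rather than a requirement of the API.

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"


def build_request(payload, api_key=None):
    """Create (but do not send) an authenticated POST request."""
    # Fall back to the environment so the key never appears in code.
    api_key = api_key or os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("set the OPENAI_API_KEY environment variable")
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would perform the call; any other HTTP client library can be used in the same way with the same two headers.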
Tokens
Tokens are common sequences of characters found in text, and GPT models process text as tokens. A token can be a single character or a whole word. The number of tokens in an API call affects both its cost and the time required for a response; input and output tokens are both counted for billing. In the response above, the API call used 87 tokens in total (total_tokens). Tokenizer is a tool by OpenAI that shows how a piece of text is split into tokens.
Chat models have different token limits. For gpt-3.5-turbo, the limit is 4,096 tokens, shared between the request and the response. It is the caller's responsibility to keep each request within the limit. There are many libraries that can help count the tokens in a request.
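The budget check can be sketched in Python as follows. This example does not use a real tokenizer: it assumes roughly four characters per token, a rule of thumb for English text, so treat the counts as estimates only. It also assumes the 4,096-token context window documented for gpt-3.5-turbo, and the 500 tokens reserved for the answer is an arbitrary value picked for illustration. All function names are hypothetical.

```python
MAX_TOKEN_LIMIT = 4096      # context window assumed for gpt-3.5-turbo
MAX_RESPONSE_TOKENS = 500   # assumption: room reserved for the answer


def estimate_tokens(text):
    """Rough estimate: about one token per four characters.

    This is an approximation, not the model's tokenizer. Use a
    tokenizer library when exact counts matter.
    """
    return max(1, (len(text) + 3) // 4)


def estimate_request_tokens(messages):
    """Estimate the prompt size of a list of message objects."""
    return sum(estimate_tokens(m["content"]) for m in messages)


def fits_in_limit(messages):
    """True if the prompt plus the reserved answer stays within the limit."""
    needed = estimate_request_tokens(messages) + MAX_RESPONSE_TOKENS
    return needed <= MAX_TOKEN_LIMIT


messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Where was it played?"},
]
print(estimate_request_tokens(messages), fits_in_limit(messages))
```

The estimate ignores the few tokens of overhead each message adds, which is one more reason to leave generous headroom or switch to exact counting.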
Pseudocode
Quick recap and assumptions:
- Using gpt-3.5-turbo
- POST https://api.openai.com/v1/chat/completions
- Authentication using API key from https://platform.openai.com/account/api-keys
- User will ask a question and will get the response
- Chat history will be maintained
- Chatbot will honor the token limit
- Can use one of the many OpenAI libraries for making HTTP requests and for counting tokens to honor the limit
create a variable for API_KEY
create a new instance of openai.Client
create a new scanner to read from the standard input
create a chatHistory object
print "pick an expert: "
read the input from the scanner and trim any leading or trailing white space
if the input is empty, log a fatal error message and exit
set the expert property of the chatHistory object to the input
add a system chat message to the history indicating the selected expert
loop forever
    print a prompt indicating the expert and ask for a question, terminate on "stop" command
    read the input from the scanner and trim any leading or trailing white space
    if the input is empty, continue to next iteration
    if the input is "stop", break out of the loop
    add the user's message to the chatHistory object
    ensure the token limit is maintained by the chatHistory object
    get a response from the OpenAI client using the chatHistory object as the message history
    if an error occurs, log an error message and continue to the next iteration
    print the expert's response to the user's input
    add the expert's response to the chatHistory object

chatHistory object
    messages [ ]: array of chat messages
    maintainTokenLimit()
        while the token count of the messages plus MAX_RESPONSE_TOKEN exceeds MAX_TOKEN_LIMIT
            remove the oldest messages from the chat history
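The pseudocode can be turned into a short Python sketch. Several things here are assumptions made for illustration, not part of the original design: the OpenAI client is replaced by an injected callable, ask_model, which takes the message list and returns the answer text, so the loop can run without a network; the token counter is a rough four-characters-per-token estimate; the system prompt wording and the 500 tokens reserved for the answer are arbitrary. All names are hypothetical.

```python
MAX_TOKEN_LIMIT = 4096      # context window assumed for gpt-3.5-turbo
MAX_RESPONSE_TOKENS = 500   # assumption: room reserved for the answer


def rough_token_count(text):
    """Approximation only: about one token per four characters."""
    return max(1, (len(text) + 3) // 4)


class ChatHistory:
    def __init__(self, expert, count_tokens=rough_token_count):
        self.expert = expert
        self.count_tokens = count_tokens
        self.messages = [
            {"role": "system", "content": f"You are an expert {expert}."}
        ]

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})

    def maintain_token_limit(self):
        """Drop the oldest turns until the request fits the limit.

        The system message and the latest message are always kept.
        """
        while len(self.messages) > 2 and (
            sum(self.count_tokens(m["content"]) for m in self.messages)
            + MAX_RESPONSE_TOKENS > MAX_TOKEN_LIMIT
        ):
            del self.messages[1]


def run_chat(ask_model, read_line=input, write=print):
    """Run the chat loop; ask_model(messages) must return the answer text."""
    expert = read_line("Pick an expert: ").strip()
    if not expert:
        raise SystemExit("no expert selected")
    history = ChatHistory(expert)
    while True:
        question = read_line(f"[{expert}] Ask a question ('stop' to quit): ").strip()
        if not question:
            continue
        if question == "stop":
            break
        history.add("user", question)
        history.maintain_token_limit()
        try:
            answer = ask_model(history.messages)
        except Exception as err:
            write(f"error: {err}")
            continue
        write(answer)
        history.add("assistant", answer)
    return history
```

To run it against the real API, pass a function that posts history.messages to the chat completions endpoint and returns the content of the first choice. Injecting that function also makes the loop easy to exercise with a stub in place of the model.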
Conclusion
Using OpenAI APIs, we can easily build intelligent and capable applications; this chatbot is a small example. Just like the chatbot, we can easily write applications to generate code, create and edit images, etc. These APIs can also be integrated into existing applications to add sophisticated capabilities without worrying about the complexity of algorithms. AI in general is already offered as a service by many companies. Developers can leverage OpenAI APIs for building cutting-edge applications.