POST /generate
curl --request POST \
  --url https://api.contextual.ai/v1/generate \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "<string>",
  "messages": [
    {
      "content": "<string>",
      "role": "user"
    }
  ],
  "knowledge": [
    "<string>"
  ],
  "system_prompt": "<string>",
  "avoid_commentary": false,
  "temperature": 0,
  "top_p": 0.9,
  "max_new_tokens": 1024
}'
{
  "response": "<string>"
}
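
As a worked example, the request below answers a question from a single supplied knowledge snippet. The question, snippet, and field values are illustrative, not defaults; only the endpoint, headers, and field names come from this reference.

curl --request POST \
  --url https://api.contextual.ai/v1/generate \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "model": "v1",
  "messages": [
    {
      "content": "What is the capital of France?",
      "role": "user"
    }
  ],
  "knowledge": [
    "Paris is the capital and most populous city of France."
  ],
  "avoid_commentary": true,
  "max_new_tokens": 256
}'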

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Body

application/json

/generate input request object.

model
string
required

The version of Contextual's GLM to use. Currently, only "v1" is available.

messages
object[]
required

List of messages in the conversation so far. The last message must be from the user.

Message object for a message received in the /generate request.
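
For a multi-turn conversation, earlier turns are included in order and the last entry must have the user role. A minimal sketch, assuming the counterpart role is named "assistant" (only "user" appears in this reference) and with invented conversation content:

"messages": [
  {
    "content": "Which plan includes SSO?",
    "role": "user"
  },
  {
    "content": "SSO is included in the Enterprise plan.",
    "role": "assistant"
  },
  {
    "content": "How much does that plan cost?",
    "role": "user"
  }
]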

knowledge
string[]
required

The knowledge sources the model can use when generating a response.
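
Each knowledge entry is a plain-text string, such as a retrieved passage or document chunk. An illustrative sketch with invented snippets:

"knowledge": [
  "Acme's Enterprise plan costs $30 per seat per month.",
  "The Enterprise plan includes SSO and audit logs."
]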

system_prompt
string

Instructions that the model follows when generating responses. Note that we do not guarantee that the model follows these instructions exactly.

avoid_commentary
boolean
default:false

Flag to indicate whether the model should avoid providing additional commentary in responses. Commentary is conversational in nature and does not contain verifiable claims; therefore, commentary is not strictly grounded in available context. However, commentary may provide useful context which improves the helpfulness of responses.
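
For strictly grounded answers with no conversational filler, the flag can be set explicitly in the request body:

"avoid_commentary": true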

temperature
number
default:0

The sampling temperature, which affects the randomness in the response. Note that higher temperature values can reduce groundedness.

Required range: 0 <= x <= 1

top_p
number
default:0.9

A parameter for nucleus sampling, an alternative to temperature which also affects the randomness of the response. Note that higher top_p values can reduce groundedness.

Required range: 0 < x <= 1

max_new_tokens
integer
default:1024

The maximum number of tokens that the model can generate in the response.

Required range: 1 <= x <= 2048
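
Because higher temperature and top_p values can reduce groundedness, a conservative configuration keeps both at their defaults and only caps the response length. The values below are an illustration, not a recommendation from the API:

"temperature": 0,
"top_p": 0.9,
"max_new_tokens": 512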

Response

200
application/json
Successful Response

/generate result object.

response
string
required

The model's response to the last user message.
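
Continuing the worked example above, a successful call returns a body of this shape. The response text here is invented and the exact wording is not guaranteed:

{
  "response": "Paris is the capital and most populous city of France."
}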