# Get Metrics
Source: https://docs.contextual.ai/api-reference/agents-query/get-metrics

api-reference/openapi.json get /agents/{agent_id}/metrics
Returns usage and user-provided feedback data. This information can be used for data-driven improvements and optimization.

# Get Retrieval Info
Source: https://docs.contextual.ai/api-reference/agents-query/get-retrieval-info

api-reference/openapi.json get /agents/{agent_id}/query/{message_id}/retrieval/info
Return metadata of the contents used to generate the response for a given message.

# Provide Feedback
Source: https://docs.contextual.ai/api-reference/agents-query/provide-feedback

api-reference/openapi.json post /agents/{agent_id}/feedback
Provide feedback for a generation or a retrieval. Feedback can be used to track overall `Agent` performance through the `Feedback` page in the Contextual UI, and as a basis for model fine-tuning.

If providing feedback on a retrieval, include the `message_id` from the `/query` response, and a `content_id` returned in the query's `retrieval_contents` list. For feedback on generations, include `message_id` and do not include a `content_id`.

# Query
Source: https://docs.contextual.ai/api-reference/agents-query/query

api-reference/openapi.json post /agents/{agent_id}/query
Start a conversation with an `Agent` and receive its generated response, along with relevant retrieved data and attributions.

# Create Agent
Source: https://docs.contextual.ai/api-reference/agents/create-agent

api-reference/openapi.json post /agents
Create a new `Agent` with a specific configuration.

This creates a specialized RAG `Agent` which queries over one or multiple `Datastores` to retrieve relevant data on which its generations are grounded. Retrieval and generation parameters are defined in the provided `Agent` configuration.

If no `datastore_id` is provided in the configuration, this API automatically creates an empty `Datastore` and configures the `Agent` to use the newly created `Datastore`.
> Note that self-serve users are currently required to create agents through our UI. Otherwise, they will receive the following message: "This endpoint is disabled as you need to go through checkout. Please use the UI to make this request."

# Delete Agent
Source: https://docs.contextual.ai/api-reference/agents/delete-agent

api-reference/openapi.json delete /agents/{agent_id}
Delete a given `Agent`. This is an irreversible operation.

Note: `Datastores` which are associated with the `Agent` will not be deleted, even if no other `Agent` is using them. To delete a `Datastore`, use the `DELETE /datastores/{datastore_id}` API.

# Edit Agent
Source: https://docs.contextual.ai/api-reference/agents/edit-agent

api-reference/openapi.json put /agents/{agent_id}
Modify a given `Agent` to utilize the provided configuration.

Fields not included in the request body will not be modified.

# Get Agent Metadata
Source: https://docs.contextual.ai/api-reference/agents/get-agent-metadata

api-reference/openapi.json get /agents/{agent_id}/metadata
Get metadata and configuration of a given `Agent`.

# List Agents
Source: https://docs.contextual.ai/api-reference/agents/list-agents

api-reference/openapi.json get /agents
Retrieve a list of all `Agents`.

# Reset Agent
Source: https://docs.contextual.ai/api-reference/agents/reset-agent

api-reference/openapi.json put /agents/{agent_id}/reset
Reset a given `Agent` to default configuration.

# Delete Document
Source: https://docs.contextual.ai/api-reference/datastores-documents/delete-document

api-reference/openapi.json delete /datastores/{datastore_id}/documents/{document_id}
Delete a given document from its `Datastore`. This operation is irreversible.

# Get Document Metadata
Source: https://docs.contextual.ai/api-reference/datastores-documents/get-document-metadata

api-reference/openapi.json get /datastores/{datastore_id}/documents/{document_id}/metadata
Get details of a given document, including its `name` and ingestion job `status`.
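Because ingestion is asynchronous, the `status` returned by this endpoint is typically polled until the job finishes. The following is a minimal sketch of such a loop, not an official client. The helper names (`document_metadata_url`, `wait_for_ingestion`), the terminal status values `"completed"` and `"failed"`, and the stand-in fetcher are assumptions made for illustration. The base URL is copied from the curl example in the Beginner's Guide.

```python
import time
from typing import Any, Callable, Dict, Sequence

# Base URL copied from the curl example in the Beginner's Guide; adjust if yours differs.
API_BASE = "https://api.contextual.ai/v1"


def document_metadata_url(datastore_id: str, document_id: str) -> str:
    """Build the URL for GET /datastores/{datastore_id}/documents/{document_id}/metadata."""
    return f"{API_BASE}/datastores/{datastore_id}/documents/{document_id}/metadata"


def wait_for_ingestion(
    fetch_metadata: Callable[[], Dict[str, Any]],
    terminal_statuses: Sequence[str] = ("completed", "failed"),  # assumed status values
    interval_s: float = 2.0,
    max_attempts: int = 30,
) -> Dict[str, Any]:
    """Call fetch_metadata until its `status` is terminal, then return the metadata."""
    for _ in range(max_attempts):
        metadata = fetch_metadata()
        if metadata.get("status") in terminal_statuses:
            return metadata
        time.sleep(interval_s)
    raise TimeoutError(f"status still not terminal after {max_attempts} attempts")


# Usage with a stand-in fetcher. A real fetcher would send an authenticated GET
# request to document_metadata_url(...) and return the parsed JSON body.
fake_responses = iter([
    {"status": "processing"},
    {"status": "completed", "name": "report.pdf"},
])
final = wait_for_ingestion(lambda: next(fake_responses), interval_s=0)
print(final["status"])
```

Passing the fetcher in as a callable keeps the loop independent of any particular HTTP library or SDK version.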
# Ingest Document
Source: https://docs.contextual.ai/api-reference/datastores-documents/ingest-document

api-reference/openapi.json post /datastores/{datastore_id}/documents
Ingest a document into a given `Datastore`.

Ingestion is an asynchronous task. Returns a document `id` which can be used to track the status of the ingestion job through calls to the `GET /datastores/{datastore_id}/documents/{document_id}/metadata` API.

This `id` can also be used to delete the document through the `DELETE /datastores/{datastore_id}/documents/{document_id}` API.

`file` must be a PDF, HTML, DOC(X) or PPT(X) file. The filename must end with one of the following extensions: `.pdf`, `.html`, `.htm`, `.mhtml`, `.doc`, `.docx`, `.ppt`, `.pptx`.

# List Documents
Source: https://docs.contextual.ai/api-reference/datastores-documents/list-documents

api-reference/openapi.json get /datastores/{datastore_id}/documents
Get a list of documents in a given `Datastore`, including document `id`, `name`, and ingestion job `status`.

Performs `cursor`-based pagination if the number of documents exceeds the requested `limit`. The returned `cursor` can be passed to the next `GET /datastores/{datastore_id}/documents` call to retrieve the next set of documents.

# Update Document Metadata
Source: https://docs.contextual.ai/api-reference/datastores-documents/update-document-metadata

api-reference/openapi.json post /datastores/{datastore_id}/documents/{document_id}/metadata
Post metadata for a given document. The metadata either enriches the document's chunks and is added to the context, or is used only for filtering. For metadata that should be used only for filtering, start the key with "_".

# Create Datastore
Source: https://docs.contextual.ai/api-reference/datastores/create-datastore

api-reference/openapi.json post /datastores
Create a new `Datastore`.

A `Datastore` is a collection of documents. Documents can be ingested into and deleted from a `Datastore`.
A `Datastore` can be linked to one or more `Agents`, and conversely, an `Agent` can be associated with one or more `Datastores` to ground its responses with relevant data. This flexible many-to-many relationship allows `Agents` to draw from multiple sources of information. This linkage of `Datastore` to `Agent` is done through the `Create Agent` or `Edit Agent` APIs.

> Note that self-serve users are currently required to create datastores through our UI. Otherwise, they will receive the following message: "This endpoint is disabled as you need to go through checkout. Please use the UI to make this request."

# Delete Datastore
Source: https://docs.contextual.ai/api-reference/datastores/delete-datastore

api-reference/openapi.json delete /datastores/{datastore_id}
Delete a given `Datastore`, including all the documents ingested into it. This operation is irreversible.

This operation will fail with status code 400 if there is an active `Agent` associated with the `Datastore`.

# Edit Datastore Configuration
Source: https://docs.contextual.ai/api-reference/datastores/edit-datastore-configuration

api-reference/openapi.json put /datastores/{datastore_id}

# Get Datastore Metadata
Source: https://docs.contextual.ai/api-reference/datastores/get-datastore-metadata

api-reference/openapi.json get /datastores/{datastore_id}/metadata
Get the details of a given `Datastore`, including its name, create time, and the list of `Agents` which are currently configured to use the `Datastore`.

# List Datastores
Source: https://docs.contextual.ai/api-reference/datastores/list-datastores

api-reference/openapi.json get /datastores
Retrieve a list of `Datastores`.

Performs `cursor`-based pagination if the number of `Datastores` exceeds the requested `limit`. The returned `cursor` can be passed to the next `GET /datastores` call to retrieve the next set of `Datastores`.
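The cursor-based pagination described above can be sketched as a small generator that keeps requesting pages until no cursor is returned. This is an illustrative sketch only. The function name `iterate_all` and the response field names `datastores` and `next_cursor` are assumptions, and the page fetcher is a stand-in for a real `GET /datastores` call.

```python
from typing import Any, Callable, Dict, Iterator, Optional


def iterate_all(
    fetch_page: Callable[[Optional[str]], Dict[str, Any]],
    items_key: str = "datastores",    # assumed name of the list field in the response
    cursor_key: str = "next_cursor",  # assumed name of the cursor field in the response
) -> Iterator[Dict[str, Any]]:
    """Yield every item across pages, following the cursor until it is absent."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page.get(items_key, [])
        cursor = page.get(cursor_key)
        if not cursor:
            return


# Stand-in for GET /datastores: maps the cursor passed in to a canned response.
PAGES = {
    None: {"datastores": [{"id": "ds-1"}, {"id": "ds-2"}], "next_cursor": "abc"},
    "abc": {"datastores": [{"id": "ds-3"}]},
}
all_ids = [item["id"] for item in iterate_all(lambda cursor: PAGES[cursor])]
print(all_ids)  # ['ds-1', 'ds-2', 'ds-3']
```

The same loop applies to `GET /datastores/{datastore_id}/documents` by changing `items_key`, again assuming the response follows the same shape.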
# Reset Datastore
Source: https://docs.contextual.ai/api-reference/datastores/reset-datastore

api-reference/openapi.json put /datastores/{datastore_id}/reset
Reset the given `Datastore`. This operation is irreversible and deletes all the documents associated with the `Datastore`.

# Generate
Source: https://docs.contextual.ai/api-reference/generate/generate

api-reference/openapi.json post /generate
Generate a response using Contextual's Grounded Language Model (GLM), an LLM engineered specifically to prioritize faithfulness to in-context retrievals over parametric knowledge to reduce hallucinations in Retrieval-Augmented Generation and agentic use cases.

The total request cannot exceed 32,000 tokens.

See our [blog post](https://contextual.ai/blog/introducing-grounded-language-model/) and [code examples](https://colab.research.google.com/github/ContextualAI/examples/blob/main/03-standalone-api/02-generate/generate.ipynb). Email [glm-feedback@contextual.ai](mailto:glm-feedback@contextual.ai) with any feedback or questions.

# LMUnit
Source: https://docs.contextual.ai/api-reference/lmunit/lmunit

api-reference/openapi.json post /lmunit
Given a `query`, `response`, and a `unit_test`, return the response's `score` on the unit test on a 5-point continuous scale. The total input cannot exceed 7,000 tokens.

See a code example in [our blog post](https://contextual.ai/news/lmunit/). Email [lmunit-feedback@contextual.ai](mailto:lmunit-feedback@contextual.ai) with any feedback or questions.

> 🚀 Obtain an LMUnit API key by completing [this form](https://contextual.ai/request-lmunit-api/)

# Parse File
Source: https://docs.contextual.ai/api-reference/parse/parse-file

api-reference/openapi.json post /parse
Parse a file into structured Markdown and/or JSON. Files must be less than 100MB and 400 pages. We use LibreOffice to convert DOC(X) and PPT(X) files to PDF, which may affect page count.
See our [blog post](https://contextual.ai/blog/document-parser-for-rag) and [code examples](https://github.com/ContextualAI/examples/blob/main/03-standalone-api/04-parse/parse.ipynb). Email [parse-feedback@contextual.ai](mailto:parse-feedback@contextual.ai) with any feedback or questions.

# Parse List Jobs
Source: https://docs.contextual.ai/api-reference/parse/parse-list-jobs

api-reference/openapi.json get /parse/jobs
Get a list of parse jobs, sorted from most recent to oldest.

Returns all jobs from the last 30 days, or since the optional `uploaded_after` timestamp.

# Parse Result
Source: https://docs.contextual.ai/api-reference/parse/parse-result

api-reference/openapi.json get /parse/jobs/{job_id}/results
Get the results of a parse job.

Parse job results are retained for up to 30 days after job creation. Fetching results for a parse job that is older than 30 days will return a 404 error.

# Parse Status
Source: https://docs.contextual.ai/api-reference/parse/parse-status

api-reference/openapi.json get /parse/jobs/{job_id}/status
Get the status of a parse job.

Parse job results are retained for up to 30 days after job creation. Fetching a status for a parse job that is older than 30 days will return a 404 error.

# Rerank
Source: https://docs.contextual.ai/api-reference/rerank/rerank

api-reference/openapi.json post /rerank
Rank a list of documents according to their relevance to a query primarily and your custom instructions secondarily. We evaluated the model on instructions for recency, document type, source, and metadata, and it can generalize to other instructions as well. The reranker supports multilinguality.

The total request cannot exceed 400,000 tokens. The combined length of the query, instruction and any document with its metadata must not exceed 8,000 tokens.
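The two limits above can be checked before sending a request. The sketch below is a rough pre-flight check under stated assumptions: the whitespace-based `approx_token_count` is only a stand-in for the service's tokenizer, which is not specified here, and summing the query, instruction, documents and metadata to get the "total request" size is an assumed interpretation. The function names are hypothetical.

```python
from typing import Callable, List, Optional, Sequence

MAX_TOTAL_TOKENS = 400_000       # limit on the total request
MAX_TOKENS_PER_DOCUMENT = 8_000  # limit on query + instruction + one document with its metadata


def approx_token_count(text: str) -> int:
    """Crude whitespace-based estimate; NOT the service's actual tokenizer."""
    return len(text.split())


def check_rerank_request(
    query: str,
    instruction: str,
    documents: Sequence[str],
    metadata: Optional[Sequence[str]] = None,
    count_tokens: Callable[[str], int] = approx_token_count,
) -> List[str]:
    """Return a list of human-readable limit violations (empty if none were found)."""
    per_doc_metadata = list(metadata) if metadata is not None else [""] * len(documents)
    shared = count_tokens(query) + count_tokens(instruction)
    problems: List[str] = []
    total = shared  # assumed: the total is the shared text plus every document and its metadata
    for index, (doc, meta) in enumerate(zip(documents, per_doc_metadata)):
        doc_tokens = count_tokens(doc) + count_tokens(meta)
        total += doc_tokens
        if shared + doc_tokens > MAX_TOKENS_PER_DOCUMENT:
            problems.append(
                f"document {index}: ~{shared + doc_tokens} tokens with query and "
                f"instruction, over the {MAX_TOKENS_PER_DOCUMENT} limit"
            )
    if total > MAX_TOTAL_TOKENS:
        problems.append(f"request: ~{total} tokens, over the {MAX_TOTAL_TOKENS} limit")
    return problems


issues = check_rerank_request(
    query="latest revenue figures",
    instruction="Prefer recent documents",
    documents=["Q4 revenue rose year over year.", "An older filing from a prior year."],
)
print(issues)  # [] for this small request
```

Since the estimate is approximate, a margin below the documented limits is a safer threshold than the limits themselves.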
See our [blog post](https://contextual.ai/blog/introducing-instruction-following-reranker/) and [code examples](https://colab.research.google.com/github/ContextualAI/examples/blob/main/03-standalone-api/03-rerank/rerank.ipynb). Email [rerank-feedback@contextual.ai](mailto:rerank-feedback@contextual.ai) with any feedback or questions.

# Get Users
Source: https://docs.contextual.ai/api-reference/users/get-users

api-reference/openapi.json get /users
Retrieve a list of `users`.

# Invite Users
Source: https://docs.contextual.ai/api-reference/users/invite-users

api-reference/openapi.json post /users
Invite users to the tenant. This checks if the user is already in the tenant and if not, creates the user. We will return a list of user emails that were successfully created (including existing users).

# Remove User
Source: https://docs.contextual.ai/api-reference/users/remove-user

api-reference/openapi.json delete /users
Delete a given `user`.

# Update User
Source: https://docs.contextual.ai/api-reference/users/update-user

api-reference/openapi.json put /users
Modify a given `User`.

Fields not included in the request body will not be modified.

# Node.js SDK
Source: https://docs.contextual.ai/sdks/node

# Python SDK
Source: https://docs.contextual.ai/sdks/python

# Advanced Query Features
Source: https://docs.contextual.ai/user-guides/advanced-query-guide

Learn how to leverage multi-turn and structured outputs

# Multi-turn conversations

Based on your use case, you may want to leverage our multi-turn capabilities. With multi-turn, agents can use prior conversation history when responding to new queries. This means agents will automatically use information in prior turns to resolve ambiguities in the query, fetch the appropriate retrievals, and generate the final answer.

| | w/ Multi-turn | w/o Multi-turn |
| --- | --- | --- |
| **Turn 1** | **Query:** What is the merger price | **Query:** What is the merger price |
| | **Response:**<br/>Direct Answer: The merger price is \$11.50 per share in cash.<br/>Rationale: This is explicitly stated in Section 2.1(a) of the contract, which defines the "Per Share Merger Consideration" as \$11.50 per share in cash, without interest<br/>Caveats: This price applies to all shares except for: (1) shares owned by Parent, Merger Sub, or their wholly owned subsidiaries (referred to as "Cancelled Shares"), (2) shares held in treasury by the Company, and (3) shares held by dissenting shareholders (referred to as "Dissenting Shares") | **Response:**<br/>Direct Answer: The merger price is \$11.50 per share in cash.<br/>Rationale: This is explicitly stated in Section 2.1(a) of the contract, which defines the "Per Share Merger Consideration" as \$11.50 per share in cash, without interest<br/>Caveats: This price applies to all shares except for: (1) shares owned by Parent, Merger Sub, or their wholly owned subsidiaries (referred to as "Cancelled Shares"), (2) shares held in treasury by the Company, and (3) shares held by dissenting shareholders (referred to as "Dissenting Shares") |
| **Turn 2** | **Query:** And when can this change? | **Query:** And when can this change? |
| | **Response:**<br/>Direct Answer: The merger price of \$11.50 per share in cash can be adjusted only in the event of a reclassification, stock split, combination, stock dividend or distribution, recapitalization, subdivision, merger, issuer tender or exchange offer, or other similar transaction that changes the number of shares outstanding.<br/>Rationale: This is specified in Section 2.5 of the contract, which states that if the number of Shares or securities convertible into Shares changes due to certain corporate actions, "the Per Share Merger Consideration shall be equitably adjusted to provide to Parent and the holders of Shares the same economic effect as contemplated by this Agreement prior to such event."<br/>Caveats: The adjustment is specifically limited to maintaining the same economic effect as originally contemplated, and the contract does not provide for adjustments due to other factors such as market conditions or changes in the company's financial performance. | **Response:**<br/>Direct Answer: The change can occur after a 4-day notice period and only in response to an Intervening Event or a Superior Proposal, subject to certain conditions and requirements.<br/>Rationale: This is based on Section 6.1(e) which outlines the specific conditions under which a Change of Recommendation can occur. It requires a 4-day notice period (the "Notice Period") and must be in response to either an Intervening Event or a Superior Proposal. The Board must also determine that failing to make the change would be inconsistent with their fiduciary duties.<br/>Caveats: The change must follow specific procedural requirements including delivery of a written notice (Company Notice) with material details, and the Board must consult with financial advisors and outside legal counsel. |

## Enabling multi-turn behavior

Enabling multi-turn is an **agent-level** setting. If multi-turn is enabled, at query-time you can pass a `conversation_id`, which will:

1. Append the new query to the given conversation
2. Generate the response in a conversation-aware manner

If you do not pass a `conversation_id`, Contextual will assume you are beginning a new conversation with the query in the request as the initial message.

If multi-turn is **disabled at the agent-level**, passing the `conversation_id` will only append the query to the given conversation's history; the response will not be generated in a conversation-aware manner.

## Code Snippets

To enable or disable multi-turn for a given agent:

```python Python
from contextual import ContextualAI

api_key = ""   # input your key
agent_id = ""  # input your agent_id

client = ContextualAI(api_key=api_key)

params = {
    "agent_configs": {
        "global_config": {
            "enable_multi_turn": True  # Set to False to disable multi-turn
        }
    }
}

client.agents.update(agent_id=agent_id, extra_body=params)
```

```bash Shell
# Replace {agent_id} with your agent's id; $API_KEY is read from the environment
curl 'https://api.app.contextual.ai/v1/agents/{agent_id}' \
  --request PUT \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $API_KEY" \
  --data '{
    "agent_configs": {
      "global_config": {
        "enable_multi_turn": true
      }
    }
  }'
```

To pass the `conversation_id` when querying:

```python Python
import os

from contextual import ContextualAI

agent_id = ""         # replace with your agent's id
conversation_id = ""  # replace with a valid conversation_id

client = ContextualAI(
    api_key=os.environ.get("CONTEXTUAL_API_KEY"),
)

response = client.agents.query.create(
    agent_id=agent_id,
    # replace with your query
    messages=[{"content": "", "role": "user"}],
    conversation_id=conversation_id
)
```

# Structured outputs \[Beta]

Structured outputs is currently in beta and is missing some functionality, such as attributions.

Structured output mode is useful when you require query responses to follow a repeatable and easily parseable format.
Example use cases include extracting pre-defined categories of information or generating responses that need to integrate seamlessly with downstream applications or workflows.

Contextual supports structured outputs in JSON format. To use structured outputs:

1. First, define your schema, which must be valid JSON and include the `json_schema` key.

```python Python
# define schema
schema = {
    "json_schema": {
        "type": "object",
        "properties": {
            "regions": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "region": {"type": "string"},
                        "revenue": {"type": "number"},
                        "share_of_revenue": {"type": "number"}
                    },
                    "required": ["region", "revenue", "share_of_revenue"]
                }
            }
        },
        "required": ["regions"]
    }
}
```

2. Then, pass the schema as part of your `/query` request. If you are using our Python SDK, the schema must be passed under the `structured_output` key in the `extra_body` param.

```python Python SDK
import os

from contextual import ContextualAI

# initialize your client
client = ContextualAI(
    api_key=os.environ.get("CONTEXTUAL_API_KEY"),
)

# define your schema and add it to the payload object under the
# structured_output key
payload = {
    "structured_output": schema
}

# pass the payload in the "extra_body" field, along with your other
# query parameters
response = client.agents.query.create(
    # replace with your agent's id
    agent_id="",
    # replace with your query
    messages=[{"content": "what was the regional revenue breakdown in 2022", "role": "user"}],
    # pass the schema in the `extra_body` param
    extra_body=payload
)
```

3. Finally, parse the returned message as JSON.

```python Python
import json

results = json.loads(response.message.content)
```

Example output:

```json
{
  "regions": [
    {"region": "Americas", "revenue": 28079, "share_of_revenue": 44.0},
    {"region": "Europe, Middle East & Africa", "revenue": 25301, "share_of_revenue": 39.5},
    {"region": "Asia & Pacific", "revenue": 10486, "share_of_revenue": 16.4}
  ]
}
```

# Contextual AI Agent Parameters
Source: https://docs.contextual.ai/user-guides/agent-params

Learn how to customize the behavior and performance of your agents

This article describes the key parameters that are available when configuring RAG agents on the Contextual AI platform. Sensible defaults are applied for all settings, but they can be modified based on your preferences or to optimize performance against your data and query patterns.

## Standard Parameters

* **Datastores (datastore\_ids):** Datastores are the knowledge bases that your agent can access when answering queries. Files uploaded into a datastore are processed using Contextual's multi-modal document understanding pipeline, which prepares documents in ways optimized for end-to-end RAG performance. You must link at least one datastore to your agent, but you can specify more.
* **Generator Model (llm\_model\_id):** Determines which generator model powers your agent. You can use either our default Grounded Language Model or a version that has been [specifically tuned](https://docs.contextual.ai/user-guides/tune-eval-guide) to your use case. Tuned models can only be used with the agents on which they were tuned.

## System Prompts

**Note on Adherence:** Contextual has built agents to faithfully follow instructions. However, in some cases complete adherence cannot be guaranteed, especially where instructions are unclear, under-specified, or in conflict with other instructions you have given or our guardrails.

The System Prompts instruct the agent on how to respond to users’ queries given the retrieved knowledge.
The appropriate prompt is passed, along with the user query and relevant retrievals, to the Generator Model at generation-time.

* **Core System Prompt (system\_prompt):** Defines how the agent interprets queries and generates responses. You can provide instructions about the agent’s persona and style, and the desired content and structure of the responses.

You are a helpful AI assistant created by Contextual AI to answer questions about relevant documentation provided to you. Your responses should be precise, accurate, and sourced exclusively from the provided information. Please follow these guidelines:

\* Only use information from the provided documentation. Avoid opinions, speculation, or assumptions.

\* Use the exact terminology and descriptions found in the provided content.

\* Keep answers concise and relevant to the user's question.

\* Use acronyms and abbreviations exactly as they appear in the documentation or query.

\* Apply markdown if your response includes lists, tables, or code.

\* Directly answer the question, then STOP. Avoid additional explanations unless specifically relevant.

\* If the information is irrelevant, simply respond that you don't have relevant documentation and do not provide additional comments or suggestions. Ignore anything that cannot be used to directly answer this query.

* **No Retrieval System Prompt (no\_retrieval\_system\_prompt):** Defines the agent’s behavior if, after the retrieval, reranking, and filter steps, no relevant knowledge has been identified. You can use this prompt to define boilerplate refusals, offer help and guidance, provide information about the document store, and specify other contextually-appropriate ways that the agent should respond.

You are an AI RAG agent created by Contextual to help answer questions and complete tasks posed by users. Your capabilities include accurately retrieving/reranking information from linked datastores and using these retrievals to generate factual, grounded responses.
You are powered by leading document parsing, retrieval, reranking, and grounded generation models. Users can impact the information you have access to by uploading files into your linked datastores. Full documentation, API specs, and guides on how to use Contextual, including agents like yourself, can be found at [docs.contextual.ai](http://docs.contextual.ai).

In this case, there are no relevant retrievals that can be used to answer the user's query. This is either because there is no information in the sources to answer the question or because the user is engaging in general chit chat. Respond according to the following guidelines:

* If the user is engaging in general pleasantries ("hi", "how goes it", etc.), you can respond in kind. But limit this to only a brief greeting or general acknowledgement
* Your response should center on describing your capabilities and encouraging the user to ask a question that is relevant to the sources you have access to. You can also encourage them to upload relevant documents and files to your linked datastore(s).
* DO NOT answer, muse about, or follow-up on any general questions or asks. DO NOT assume that you have knowledge about any particular topic. DO NOT assume access to any particular source of information.
* DO NOT engage in character play. You must maintain a friendly, professional, and neutral tone at all times

## Query Understanding

These settings affect whether and how user queries are modified to improve retrieval performance and response generation.

* **Enable Multi-turn (enable\_multi\_turn)**: Allows the agent to remember and reference previous parts of the conversation, making interactions feel more natural and continuous. When enabled, the user’s query will automatically be reformulated based on prior turns to resolve ambiguities. The conversation history is prepended to the query at generation-time.
* **Check Retrieval Need (should\_check\_retrieval\_need)**: Enables a check for whether the user query is general chit-chat or an addressable question. If the query is general chit-chat, the intermediate retrieval steps are skipped and the generator is called using the `no_retrieval_system_prompt`. If it is an addressable question, the RAG pipeline as configured is executed.

## Query Reformulation

These settings allow you to modify the original user query prior to retrieval and generation. These strategies can help improve retrieval accuracy or completeness.

* **Enable Query Expansion (enable\_query\_expansion)**: Toggles the query expansion module `on` or `off`. When enabled, the user's original query will be rewritten according to the instructions set out in the prompt, guided by any provided examples. If no prompt or examples are given, the default prompt is used.
  * **Instructions:** An optional parameter that specifies how queries should be reformulated.
  * **Examples:** An optional parameter that provides few-shot examples of how queries should be reformulated based on the provided instructions.

Instructions: Reformulate the query so that it is more detailed and includes relevant terminology or topics that will be helpful in maximizing the quality of the information retrieved to answer the query.

Example 1: \
Original query: What are JPMorgan's results this quarter? \
Expanded query: Can you provide the latest financial results for JPMorgan, including revenue, earnings per share, and key metrics for the most recent quarter?

Example 2: \
Original query: What is data cleaning? \
Expanded query: Could you explain the concept of data cleaning, including common techniques used, typical challenges faced, and its role in the data preprocessing pipeline for machine learning models?

Example 3: \
Original query: What are the results of Apple this quarter? \
Expanded query: Can you provide the latest financial results for Apple, including revenue, earnings per share, and key metrics for the most recent quarter?

* **Enable Query Decomposition (enable\_query\_decomposition):** Toggles the query decomposition module `on` or `off`. When enabled, the module will break down complex and compound queries into a series of simpler queries. Each sub-query will have individual retrievals, which are then intelligently combined prior to any subsequent reranking, filtering, or generation steps. Simple queries, as judged by the model, will not be decomposed.
  * **Examples:** An optional parameter that provides few-shot examples of how an input query should be decomposed into sub-queries.

## Retrieval

These settings determine how the agent performs the initial retrieval from linked unstructured datastores.

* **Number of Retrieved Chunks (top\_k\_retrieved\_chunks):** The maximum number of chunks that should be retrieved from the linked datastore(s). For example, if set to 10, the agent will retrieve at most 10 relevant chunks to generate its response. The value ranges from 1 to 200, and the default value is 100.
* **Lexical Search Weight (lexical\_alpha):** When chunks are scored during retrieval (based on their relevance to the user query), this controls how much weight the scoring gives to exact keyword matches. A higher value means the agent will prioritize finding exact word matches, while a lower value allows for more flexible, meaning-based matching. The value ranges from 0 to 1, and we default to 0.1. You can increase this weight if exact terminology, specific entities, or specialized vocabulary is important for your use case.
* **Semantic Search Weight (semantic\_alpha):** When chunks are scored during retrieval (based on their relevance to the user query), this controls how much weight the scoring gives to semantic similarity.
Semantic searching looks for text that conveys similar meaning and concepts, rather than just matching exact keywords or phrases. The value ranges from 0 to 1, and we default to 0.9. The value of the semantic search weight and lexical search weight must sum to 1.

## Rerank and Filter

These settings affect how the agent reranks and filters chunks before passing them to the generator model.

* **Enable Reranking (enable\_rerank):** Allows the agent to take the initially retrieved document chunks and rerank them based on the provided instructions and user query. The top reranked chunks are passed on for filtering and final response generation.
* **Rerank instructions (rerank\_instructions):** Natural language instructions that describe your preferences for the ranking of chunks, such as prioritizing chunks from particular sources or time periods. Chunks will be rescored based on these instructions and the user query. If no instruction is given, the rerank scores are based only on relevance of chunks to the query.
* **Number of Reranked Chunks (top\_k\_reranked\_chunks):** The number of top reranked chunks that are passed on for generation.
* **Reranker score filter (reranker\_score\_filter\_threshold)**: If the value is set to greater than 0, chunks with relevance scores below the set value will not be passed on. Input must be between 0 and 1.
* **Enable Filtering (enable\_filter):** Allows the agent to perform a final filtering step prior to generation. When enabled, chunks are checked against the filter prompt and irrelevant chunks are filtered out. This acts like a final quality control checkpoint, helping to ensure that only relevant chunks are passed to the generator. This filter can improve response accuracy and relevance, but can also increase the false refusal rate if the configuration is too strict.
* **Filter Prompt (filter\_prompt):** Natural language instructions that describe the criteria for relevant and irrelevant chunks.
It can be used in tandem with, or as an alternative to, the reranker score-based filtering.

## Generate

These settings affect how the generator model produces responses.

* **Max New Tokens (max\_new\_tokens):** Controls the maximum length of the agent’s response. Defaults to 2,048 tokens.
* **Temperature (temperature):** Controls how creative the agent's responses are. A higher temperature means more creative and varied responses, while a lower temperature results in more consistent, predictable answers. It ranges from 0 to 1. Defaults to 0.
* **Top P (top\_p):** Similar to temperature, this parameter also controls response variety. It determines how many different word choices the agent considers when generating its response. Defaults to 0.9.
* **Frequency Penalty (frequency\_penalty):** Helps prevent repetition in responses by making the agent less likely to use words it has already used frequently. This helps ensure more natural, varied language. Defaults to 0.
* **Random Seed (seed):** Controls randomness in how the agent selects the next tokens during text generation. Allows for reproducible results, which can be useful for testing.
* **Enable Groundedness Scores (calculate\_groundedness):** Enables the agent to provide groundedness scores as part of its response. When enabled, the agent identifies distinct claims in the response and assesses whether each one is grounded in the retrieved document chunks. Claims that are not grounded are shown in yellow in the UI. Defaults to off.
* **Disable Commentary (avoid\_commentary):** Flag that indicates whether the Agent should only output strictly factual information grounded in the retrieved knowledge, instead of the complete response (which can include commentary, analysis, etc.).

## Miscellaneous

* **Suggested queries (suggested\_queries):** Example queries that appear in the user interface when first interacting with the agent.
You can provide both simple and complex examples to help users understand the full range of questions your agent can handle. This helps set user expectations and guides them toward effective interactions with the agent.

# Beginner's Guide
Source: https://docs.contextual.ai/user-guides/beginner-guide

Create a specialized RAG agent in less than 5 minutes

[Contextual](https://contextual.ai/) provides a platform for creating enterprise-grade AI agents, grounded in your documents and data. See a demo of our agents [here](https://contextual.ai/platform/).

Our integrated system allows you to easily:

* Parse and ingest documents into fully managed vector store(s)
* Create specialized agents that can answer questions and complete tasks by fetching relevant knowledge from your datastores

With Contextual, you get:

* **Best-in-class parsing capabilities and powerful chunking options** that make wrangling unstructured data for RAG a breeze
* **Seamless orchestration** of powerful RAG components and primitives, allowing you to build RAG agents that have exceptional accuracy and scalability
* **State-of-the-art** capabilities and models, like our:
  * [Instruction following reranker](https://contextual.ai/blog/introducing-instruction-following-reranker/) that tops benchmarks like BEIR
  * [Grounded Language Models](https://www.kaggle.com/benchmarks/google/facts-grounding/leaderboard) that are the best in the world at being grounded and factual
  * [LMUnit model](https://contextual.ai/lmunit/) that is extremely effective at fine- and coarse-grained evaluations of LLM output, using natural language unit tests

You can leverage our platform via our user-friendly UI or through our APIs and SDKs. Follow this guide to create your first agent! Or, see [https://github.com/ContextualAI/examples](https://github.com/ContextualAI/examples) for easy-to-follow Jupyter notebooks for using our APIs.
***

## Get your API Key

**Note:** If you do not have access to the platform, you can create a workspace via the [**Start Free**](https://app.contextual.ai/?signup=1) button in the upper right of the page. New workspaces are given \$25 in free credits to trial the platform.

Contextual uses API keys to authenticate requests. Only **admins** within a workspace can create API keys. To create a key:

![](https://files.readme.io/9e8bb96798f7ccda429b7c906e85350cfe39708eaa6e56f304182ee83913e608-Screenshot_2025-03-18_at_8.47.04_PM.png)

1. Log into your tenant at [app.contextual.ai](http://app.contextual.ai)
2. Click on **API Keys** in the sidebar
3. Click the **Create API Key** button in the upper right and follow the instructions
4. Save the generated key in a secure location

***

## Create and query your first agent

### Step 1: Create a datastore

Datastores contain the files that your agent(s) can access. Each agent must be associated with at least one datastore. You can create a datastore using the `/datastores` endpoint with the following command:

```python Python
from contextual import ContextualAI

# Initialize the client with your API key
contextual = ContextualAI(api_key="API_KEY")

# Create a datastore
datastore = contextual.datastores.create(name="Test Datastore")
```

```shell Shell
curl --request POST \
  --url https://api.contextual.ai/v1/datastores \
  --header 'accept: application/json' \
  --header 'authorization: Bearer $API_KEY' \
  --header 'content-type: application/json' \
  --data '{"name":"Test Datastore"}'
```

Remember to replace `$API_KEY` with your key. You can rename the datastore if you want.

If the request is successful, the `id` of the newly created datastore will be returned to you. Be sure to save this `id` as you will need it in subsequent steps!

### Step 2: Add documents into your datastore

Now that you've created a datastore, you can add documents to it.
All documents are stored securely in the Contextual platform, and are parsed in ways optimized for use in RAG pipelines and Agents.

* If you don't have your own documents handy, feel free to use our Beginner's Guide test documents, [found here](https://drive.google.com/drive/folders/1e2qfAp6rrpcOGyDdBHwHP-zKzC6K7nyg?usp=drive_link)
* For the best results, use renderable PDFs, i.e., documents that have text that can be copied and pasted.

You can upload a single document using the following command:

```python Python
from contextual import ContextualAI

# Initialize the client with your API key
contextual = ContextualAI(api_key="API_KEY")

# datastore_id is the `id` returned when you created the datastore in Step 1
# Upload a document
with open('file.pdf', 'rb') as f:
    ingestion_result = contextual.datastores.documents.ingest(datastore_id, file=f)
    document_id = ingestion_result.id
    print(f"Successfully uploaded document_id: {document_id} to datastore_id: {datastore_id}")
```

```shell Shell
curl --request POST \
  --url https://api.contextual.ai/v1/datastores/{datastore_id}/documents \
  --header 'accept: application/json' \
  --header 'authorization: Bearer $API_KEY' \
  --header 'content-type: multipart/form-data' \
  --form file=@'${file_path}'
```

**Remember to:**

* Replace `{datastore_id}` in the URL path with the datastore `id` from the previous step
* Replace `$API_KEY` with your API key
* Replace `${file_path}` with the path to the document on your machine

If the request is successful, the `id` of the uploaded document will be returned to you. The time required to upload documents depends partly on their length and features. Some documents may require a few minutes to fully process after upload.
To check the status of documents uploaded into the datastore, use this command:

```python Python
from contextual import ContextualAI

# Initialize the client with your API key
contextual = ContextualAI(api_key="API_KEY")

# Get the metadata (including ingestion status) of the uploaded document
metadata = contextual.datastores.documents.metadata(datastore_id=datastore_id, document_id=document_id)
print("Document metadata:", metadata)
```

```shell Shell
curl --request GET \
  --url https://api.contextual.ai/v1/datastores/{datastore_id}/documents \
  --header 'accept: application/json' \
  --header 'authorization: Bearer $API_KEY'
```

**Remember to:**

* Replace `{datastore_id}` in the URL path with the `id` from the previous step
* Replace `$API_KEY` with your API key

You should see the document you uploaded in the list, along with its `ingestion_job_status`.

### Step 3: Create an agent

Now that you have a datastore with some files, you can use the `/agents` endpoint to create your first agent.

```python Python
from contextual import ContextualAI

# Initialize the client with your API key
contextual = ContextualAI(api_key="API_KEY")

# Create an agent linked to the datastore created in Step 1
agent = contextual.agents.create(name="Test Agent", description="Test Agent", datastore_ids=[datastore_id])
```

```shell Shell
curl --request POST \
  --url https://api.contextual.ai/v1/agents \
  --header 'accept: application/json' \
  --header 'authorization: Bearer $API_KEY' \
  --header 'content-type: application/json' \
  --data '
{
  "name": "Test",
  "description": "Test Agent",
  "datastore_ids": []
}
'
```

**Remember to:**

* Replace `$API_KEY` with your API key
* Populate the `datastore_ids` list field with the datastore `id` from above

If the request is successful, the `agent_id` of the newly created agent will be returned to you. You'll need this to query your agent in the next step.
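
Because an agent can only be queried once at least one document has been processed, it can help to wait for ingestion to finish before moving on. The sketch below polls a status-fetching callable until it reports a terminal state. It is a minimal illustration rather than part of the SDK: `wait_for_ingestion` and `fetch_status` are hypothetical names, and the terminal status strings (`completed`, `failed`) are assumptions to confirm against the document metadata your workspace actually returns.

```python
import time
from typing import Callable

# Assumed terminal states; confirm against the status values the API returns.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_ingestion(
    fetch_status: Callable[[], str],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until it returns a terminal status.

    fetch_status is a hypothetical callable supplied by the caller that
    returns the current ingestion status of one document as a string.
    Raises TimeoutError if no terminal status is seen within the timeout.
    """
    waited = 0.0
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(
                f"Ingestion still '{status}' after {timeout_seconds} seconds"
            )
        sleep(poll_seconds)
        waited += poll_seconds
```

With the client from the steps above, `fetch_status` could wrap the `contextual.datastores.documents.metadata(...)` call and return whichever field of the result carries the ingestion status; the exact field name is not shown in this guide, so check the returned object before relying on it.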
### Step 4: Query your agent

Now that you've set up an agent and uploaded documents for it to use, you can use the `/query` endpoint to send messages:

```python Python
from contextual import ContextualAI

# Initialize the client with your API key
contextual = ContextualAI(api_key="API_KEY")

# agent_id is the id returned when you created the agent in Step 3
# Query the agent
response = contextual.agents.query.create(
    agent_id=agent_id,
    messages=[
        {
            "role": "user",
            "content": "What is the revenue of Apple?"
        }
    ]
)
```

```shell Shell
curl --request POST \
  --url https://api.contextual.ai/v1/agents/{agent_id}/query \
  --header 'accept: application/json' \
  --header 'authorization: Bearer $API_KEY' \
  --header 'content-type: application/json' \
  --data '
{
  "stream": false,
  "messages": [
    {
      "role": "user",
      "content": "What is the revenue of Apple?"
    }
  ]
}
'
```

**Remember to:**

* Replace `{agent_id}` in the URL path with the `agent_id` from the previous step
* Replace `$API_KEY` with your API key
* Replace the `content` field with a question that is relevant to the document(s) you uploaded

If the request is successful, you'll receive a response that contains:

* The body of the response
* The sources retrieved from the datastore that are relevant to the response
* Attributions/citations of sources to specific text spans in the response

**Note:** You can only query your agent once at least one document in the datastore has been processed. You can check the status of uploaded documents by following the instructions in the previous step.

🙌 Congratulations! You've now created a basic agent in the Contextual platform.

***

## Try out additional functionality

Now that you have a basic working agent, explore our advanced features.

# Pricing and Billing
Source: https://docs.contextual.ai/user-guides/billing

Pricing and billing guide for on-demand, usage-based customers

**Note:** This guide is for self-serve customers using Contextual AI in On-Demand mode.
Customers on Provisioned Throughput should contact their account managers for any billing-related questions.

## Usage Modes

Contextual AI supports two usage modes:

| **Mode** | **Description** | **Good For** |
| :--- | :--- | :--- |
| **On-Demand**. *Get started immediately by creating a Contextual Workspace. New workspaces receive \$25 in free credits.* | Pay-as-you-go with no commitments or minimums. Purchase credits upfront, which are then drawn down based on your usage, according to the pricing below. Easily graduate to Provisioned Throughput as your needs evolve. | Customers who are experimenting with, or evaluating, the platform; need the flexibility to scale usage and costs up or down quickly; or don't require SLAs or QPS guarantees. |
| **Provisioned Throughput**. *Contact sales ([sales@contextual.ai](mailto:sales@contextual.ai)) or your account team.* | Purchase model units (MUs) to reserve guaranteed capacity and predictable throughput. | Customers who need guaranteed and predictable QPS levels to satisfy expected production volumes, or who require service level agreements (SLAs) to meet production standards. |

### Services

Regardless of usage mode, customers can also purchase expert support services for fixed terms. With a service package, you get dedicated support from our Sales Engineering and Customer Machine Learning Engineering teams to optimize performance and design full solutions. Contact sales or your account manager for more information.

## On-Demand Pricing

In on-demand mode, usage of Contextual, whether through the UI or APIs, is charged as follows:

**Note:** Pricing is subject to change. This pricing sheet reflects the latest information as of July 2025.

* `/query` is the endpoint called when you interact with an **Agent**. It invokes a multi-step RAG pipeline, and the cost of a given query is the sum of the costs for each step in the pipeline. These steps can be configured via **Agent Settings**. The average cost per query is \~\$0.05.

| **Step** | **Price** |
| :--- | :--- |
| ***Query Optimization*** (reformulation and decomposition) | \$1/1M tokens |
| ***Encode*** (encode the user query for search) | \$0.03/1M tokens |
| ***Rerank*** (rerank the retrieved chunks) | \$0.12/1M tokens |
| ***Filter*** (filter chunks prior to generation) | \$1/1M tokens |
| ***Generate*** (final response generation) | *Input:* \$3/1M tokens; *Output:* \$15/1M tokens |
| ***Groundedness & Safety*** (post-generation groundedness checks) | \$1/1M tokens |

Relevant API documentation: [Agent Creation](https://docs.contextual.ai/api-reference/agents/create-agent) and [Query](https://docs.contextual.ai/api-reference/agents-query/query)
* Component APIs allow you to leverage key capabilities and models in a modular way.

| **Component** | **Price** |
| :--- | :--- |
| ***Parse***: Parse unstructured documents into markdown and/or JSON. ([Parse API Docs](https://docs.contextual.ai/api-reference/parse/parse-file)) | *Basic:* \$3/1K pages; *Standard:* \$40/1K pages |
| ***Rerank***: Rerank based on natural language instructions. ([Rerank API Docs](https://docs.contextual.ai/api-reference/rerank/rerank)) | \$0.12/1M tokens |
| ***Generate***: Generate using the most grounded language model in the world. ([Generate API Docs](https://docs.contextual.ai/api-reference/generate/generate)) | *Input:* \$3/1M tokens; *Output:* \$15/1M tokens |
| ***LMUnit***: Evaluate LLM output using natural language unit tests. ([LMUnit API Docs](https://docs.contextual.ai/api-reference/lmunit/lmunit)) | \$3/1M tokens |
* Documents added to **Datastores** are sent through our ingestion pipeline, where they are parsed, chunked, and indexed in a vector store.

| **Step** | **Price** |
| :--- | :--- |
| ***Ingestion*** | \$48.50/1K pages |

Relevant API documentation: [Ingest](https://docs.contextual.ai/api-reference/datastores-documents/ingest-document)
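
To make the pricing arithmetic concrete, here is a rough sketch that sums the per-step query prices and the ingestion price listed above. The prices are copied from this page (as of July 2025); the step keys, function names, and token counts in the example are hypothetical and only illustrate how the totals add up.

```python
# Prices in USD, copied from the On-Demand pricing tables on this page.
QUERY_PRICE_PER_1M_TOKENS = {
    "query_optimization": 1.00,
    "encode": 0.03,
    "rerank": 0.12,
    "filter": 1.00,
    "generate_input": 3.00,
    "generate_output": 15.00,
    "groundedness": 1.00,
}
INGESTION_PRICE_PER_1K_PAGES = 48.50


def estimate_query_cost(tokens_per_step: dict[str, int]) -> float:
    """Sum the cost of each pipeline step; disabled steps can simply be omitted."""
    return sum(
        QUERY_PRICE_PER_1M_TOKENS[step] * tokens / 1_000_000
        for step, tokens in tokens_per_step.items()
    )


def estimate_ingestion_cost(pages: int) -> float:
    """Cost of sending the given number of pages through ingestion."""
    return INGESTION_PRICE_PER_1K_PAGES * pages / 1_000


# Hypothetical token counts for a single query (illustrative only).
example_query = {
    "query_optimization": 1_000,
    "encode": 50,
    "rerank": 20_000,
    "filter": 5_000,
    "generate_input": 8_000,
    "generate_output": 500,
    "groundedness": 2_000,
}

print(f"Estimated query cost: ${estimate_query_cost(example_query):.4f}")  # $0.0419
print(f"Ingesting 200 pages: ${estimate_ingestion_cost(200):.2f}")  # $9.70
```

Actual token counts depend on your documents, prompts, and Agent Settings, so treat the result as an order-of-magnitude estimate and rely on the **Usage** page for real spend.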
## Usage Stats and Credit Top Up

### Viewing Usage and Spend

To see usage stats, click on **Usage** in the sidebar.

![Screenshot2025 07 10at1 22 42PM Pn](https://mintlify.s3.us-west-1.amazonaws.com/contextualai/images/Screenshot2025-07-10at1.22.42PM.png)

The usage page shows a month-by-month breakdown of your usage and spend across the various charged components and endpoints.

![Screenshot2025 07 10at1 25 34PM Pn](https://mintlify.s3.us-west-1.amazonaws.com/contextualai/images/Screenshot2025-07-10at1.25.34PM.png)

### Credit Top-Up

To continue using Contextual once your credits are depleted, you must top up. To top up credits:

1. Navigate to the **Billing** page in your workspace

   ![Screenshot2025 07 10at3 24 55PM Pn](https://mintlify.s3.us-west-1.amazonaws.com/contextualai/images/Screenshot2025-07-10at3.24.55PM.png)
2. Add a valid credit card under **Payment Methods**, if you haven't already

   ![Screenshot2025 07 10at3 26 02PM Pn](https://mintlify.s3.us-west-1.amazonaws.com/contextualai/images/Screenshot2025-07-10at3.26.02PM.png)
3. Click **Add Credits** and fill out the corresponding form

   ![Screenshot2025 07 10at3 29 24PM Pn](https://mintlify.s3.us-west-1.amazonaws.com/contextualai/images/Screenshot2025-07-10at3.29.24PM.png)
4. Click **Next** and confirm the payment

   ![Screenshot2025 07 10at3 30 04PM Pn](https://mintlify.s3.us-west-1.amazonaws.com/contextualai/images/Screenshot2025-07-10at3.30.04PM.png)

### Adding and Managing Payment Methods

To top up credits, you must have a valid payment method linked to your Contextual workspace. To link a payment method:

1. Navigate to the **Billing** page in your workspace

   ![Screenshot2025 07 10at3 24 55PM Pn](https://mintlify.s3.us-west-1.amazonaws.com/contextualai/images/Screenshot2025-07-10at3.24.55PM.png)
2. Click on **Payment Methods** in the **Learn More** section

   ![Screenshot2025 07 10at3 26 02PM Pn](https://mintlify.s3.us-west-1.amazonaws.com/contextualai/images/Screenshot2025-07-10at3.26.02PM.png)
3. Click the **+New** button in the upper right of the page

   ![Screenshot2025 07 10at3 33 57PM Pn](https://mintlify.s3.us-west-1.amazonaws.com/contextualai/images/Screenshot2025-07-10at3.33.57PM.png)
4. Fill out the card or bank information in the form, then click **Next**

   ![Screenshot2025 07 10at3 36 34PM Pn](https://mintlify.s3.us-west-1.amazonaws.com/contextualai/images/Screenshot2025-07-10at3.36.34PM.png)
5. Fill out your billing address, then click **Save Payment Method**

   ![Screenshot2025 07 10at3 36 46PM Pn](https://mintlify.s3.us-west-1.amazonaws.com/contextualai/images/Screenshot2025-07-10at3.36.46PM.png)

To remove an existing stored payment method:

1. Locate it in the list of payment methods
2. Click the menu icon
3. Click **Remove Payment Method**

   ![Screenshot2025 07 10at3 38 33PM Pn](https://mintlify.s3.us-west-1.amazonaws.com/contextualai/images/Screenshot2025-07-10at3.38.33PM.png)

# Key Concepts
Source: https://docs.contextual.ai/user-guides/key-concepts

## RAG

Retrieval Augmented Generation or `RAG` is a technique that improves language model generation by incorporating external knowledge. Contextual Agents use `RAG` to ground their responses in directly relevant information, ensuring accuracy for knowledge-intensive tasks. We've pioneered the `RAG 2.0` approach, which outperforms traditional `RAG` systems by optimizing the system end-to-end. [Read more in our blog post](https://contextual.ai/introducing-rag2/).

## Agent

Contextual RAG Agents are optimized end-to-end to deliver exceptional accuracy on complex and knowledge-intensive tasks. `Agents` make intelligent decisions on how to accomplish the tasks, and can take multiple steps to do so. The agentic approach enables a wide range of actions, such as providing standard retrieval-based answers, declining to respond when no relevant information is available, or generating and executing SQL queries when working with structured data.
The adaptability and further tuning of `Agents` greatly increase their value for knowledge-intensive tasks.

## Query / Prompt

The question that you submit to an `Agent`. You can submit a `Query` to your `Agent` [via our API](/api-reference/agents-query/query).

## Response

`Response` is the output generated by an `Agent` in response to a `Query`. `Responses` come with the relevant retrieved content (`Knowledge`) and in-line citations (`Attributions`).

## Knowledge

The data retrieved by the `Agent` from the `Datastore` to generate its response. When working with unstructured data, `Knowledge` comes in the form of a list of `Document` chunks that are relevant to the `Query`.

## Case

A `Case` is a row of data. It is either a `Prompt` and `Reference` (gold-standard answer) pair, or a `Prompt`, `Response`, and `Knowledge` triplet. `Evaluation` datasets follow the former schema, while `Tuning` datasets require the latter.

## Attribution

`Attributions` are in-line citations that credit the specific sources of information used by the model to generate a response. When querying Contextual Agents, `Attributions` are included for each claim made in the response. These attributions can be accessed via the query API response or viewed in the UI by hovering over in-line tooltips next to each claim (e.g., \[1], \[2]).

## System Prompt

Instructions that guide an Agent's response generation, helping define its behavior and capabilities. You can set and modify the `System Prompt` when creating or editing an Agent [via our APIs](/api-reference/agents/edit-agent).

## Document

A `Document` is a unit of unstructured data ingested into a `Datastore`, which can be queried and used as the basis for generating responses. Today, we support both `pdf` and `html` files, and plan to expand support to other data types. You can ingest `Documents` into a `Datastore` via our API. After ingestion, `Documents` are automatically parsed, chunked, and processed by our platform.
## Datastore

A repository of data associated with an `Agent`. An `Agent` retrieves relevant data from its associated `Datastores` to generate responses. An `Agent` can connect to multiple `Datastores`, and each `Datastore` can serve multiple `Agents`. You can associate a `Datastore` with an `Agent` when creating or editing an Agent [via our APIs](/api-reference/agents/edit-agent). We also provide [a set of APIs](/api-reference/datastores) for creating and managing `Datastores`.

## Dataset

The `Dataset` object can be used to store labelled data `cases`. A `case` is either a (i) `Prompt`-`Reference` pair, or a (ii) `Prompt`-`Reference`-`Knowledge` triplet. `Datasets` can be used for `Evaluation` or `Tuning`. You can create a new `Dataset` by uploading a CSV or JSONL file via our API.

The `Dataset` object can also store evaluation results. Once an evaluation job is completed, it returns a `Dataset` containing the original `Cases` from the evaluation, now appended with results such as `Equivalence` and `Groundedness` scores for each `Case`.

## LMUnit

An evaluation method using natural language unit tests to assess specific criteria in an Agent's responses. You can define and evaluate clear, testable statements or questions that capture desirable fine-grained qualities of the Agent's response, such as “Is the response succinct without omitting essential information?” or “Is the complexity of the response appropriate for the intended audience?” You can create and run these unit tests [via our API](/api-reference/lmunit/lmunit). Read more about `LMUnit` in [our blog post](https://contextual.ai/blog/lmunit/).

## Workspace

An organizational unit that owns and manages `Agents`, `Datastores`, and other resources within the system. Contextual AI uses `Workspaces` to organize and manage resources, with API keys associated with specific workspaces.
# LLMs.txt
Source: https://docs.contextual.ai/user-guides/llms

Contextual AI documentation in LLM-friendly format

Contextual AI documentation is provided in the `llms.txt` format, a standardized, LLM-friendly structure that makes it easy for large language models to parse and reference.

## Available Formats

We provide two versions of our documentation:

**[llms.txt](https://docs.contextual.ai/llms.txt)**: A concise overview containing brief descriptions of key features and direct links to detailed documentation sections.

**[llms-full.txt](https://docs.contextual.ai/llms-full.txt)**: A comprehensive version that includes the same structure as llms.txt but with expanded details for each section.

# MCP Server
Source: https://docs.contextual.ai/user-guides/mcp-server

Integrate Contextual AI with MCP-compatible clients like Cursor IDE and Claude Desktop

A Model Context Protocol (MCP) server that provides RAG (Retrieval-Augmented Generation) capabilities using Contextual AI. This server integrates with a variety of MCP clients. In this documentation, we show integration with both Cursor IDE and Claude Desktop.

## Overview

This MCP server acts as a bridge between AI interfaces (Cursor IDE or Claude Desktop) and a specialized Contextual AI agent. It enables:

1. **Query Processing**: Direct your domain-specific questions to a dedicated Contextual AI agent
2. **Intelligent Retrieval**: Searches through comprehensive information in your knowledge base
3. **Context-Aware Responses**: Generates answers that:
   * Are grounded in source documentation
   * Include citations and attributions
   * Maintain conversation context

## Local MCP server

### Prerequisites

* Python 3.10 or higher
* Cursor IDE and/or Claude Desktop
* Contextual AI API key
* MCP-compatible environment

### Installation

1. Clone the repository:

```bash
git clone https://github.com/ContextualAI/contextual-mcp-server.git
cd contextual-mcp-server
```

2. Create and activate a virtual environment:

```bash
python -m venv .venv
source .venv/bin/activate # On Windows, use `.venv\Scripts\activate`
```

3. Install dependencies:

```bash
pip install -e .
```

### Configuration

#### Configure MCP Server

The server requires some settings to be modified before use. For example, the single use server should be customized with an appropriate docstring for your RAG Agent.

The docstring for your query tool is critical as it helps the MCP client understand when to route questions to your RAG agent. Make it specific to your knowledge domain. Here is an example:

```
A research tool focused on financial data on the largest US firms
```

or

```
A research tool focused on technical documents for Omaha semiconductors
```

The server also requires the following settings from your RAG Agent:

* `API_KEY`: Your Contextual AI API key
* `AGENT_ID`: Your Contextual AI agent ID

If you'd like to store these values in a `.env` file, you can specify them like so:

```bash
cat > .env << EOF
API_KEY=key...
AGENT_ID=...
EOF
```

#### AI Interface Integration

This MCP server can be integrated with a variety of clients. To use it with either Cursor IDE or Claude Desktop, create or modify the MCP configuration file in the appropriate location:

1. First, find the path to your `uv` installation:

```bash
UV_PATH=$(which uv)
echo $UV_PATH
# Example output: /Users/username/miniconda3/bin/uv
```

2. Create the configuration file using the full path from step 1:

```bash
# Make sure $UV_PATH is set properly (step 1); it is expanded into "command".
# \${workspaceFolder} is written literally and will be replaced with your project path.
cat > mcp.json << EOF
{
  "mcpServers": {
    "ContextualAI-TechDocs": {
      "command": "$UV_PATH",
      "args": [
        "--directory",
        "\${workspaceFolder}",
        "run",
        "multi-agent/server.py"
      ]
    }
  }
}
EOF
```

3. Move the file to the correct folder location (see below for options):

```bash
mkdir -p .cursor/
mv mcp.json .cursor/
```

Configuration locations:

* For Cursor:
  * Project-specific: `.cursor/mcp.json` in your project directory
  * Global: `~/.cursor/mcp.json` for system-wide access
* For Claude Desktop:
  * Use the same configuration file format in the appropriate Claude Desktop configuration directory

#### Environment Setup

This project uses `uv` for dependency management, which provides faster and more reliable Python package installation.

### Usage

The server provides Contextual AI RAG capabilities using the Python SDK, which makes a variety of commands available to MCP clients such as Cursor IDE and Claude Desktop. The current server focuses on the query command from the Contextual AI Python SDK; however, you could extend it to support other features such as listing all the agents, updating retrieval settings, updating prompts, extracting retrievals, or downloading metrics.

For example, in Cursor, you might ask:

```
Show me the code for initiating the RF345 microchip?
```

The MCP client will:

1. Determine if this should be routed to the MCP Server

Then the MCP server will:

1. Route the query to the Contextual AI agent
2. Retrieve relevant documentation
3. Generate a response with specific citations
4. Return the formatted answer to Cursor

## Remote MCP server

Cursor supports remote MCP servers via Server-Sent Events (SSE). This allows you to connect directly to hosted MCP services without local installation.

### Configuration

1. Create the MCP configuration file in one of these locations:
   * Project-specific: `.cursor/mcp.json` in your project directory
   * Global: `~/.cursor/mcp.json` for system-wide access
2. Add the configuration:

```
{
  "mcpServers": {
    "ContextualAI": {
      "type": "sse",
      "url": "https://mcp.app.contextual.ai/mcp/",
      "headers": {
        "Accept": "text/event-stream"
      }
    }
  }
}
```

3. Restart Cursor IDE

### Usage

#### Cursor IDE chat

In Cursor's chat interface, you might ask:

```
Use RAG platform with api_key key-YOUR_API_KEY and agent_id YOUR_AGENT_ID, show me the code for initiating the RF345 microchip.
```

The MCP client will route the query to the Contextual AI agent and return the generated response to Cursor.

#### Test script to check for connectivity

You may also run a short test script to verify the connection in Cursor (requires your Contextual AI API key and Agent ID):

```
#!/usr/bin/env python3
"""
Quick test script for ContextualAI HTTP MCP Server

Replace the following placeholders with your actual values:
- YOUR_API_KEY: Your Contextual AI API key
- YOUR_AGENT_ID: Your Contextual AI agent ID
- YOUR_QUERY: Your test question for the agent
"""

import asyncio

from fastmcp import Client

# Server configuration
SERVER_URL = "https://mcp.app.contextual.ai/mcp/"
API_KEY = "key-YOUR_API_KEY"


def extract_result(result):
    """Extract text from FastMCP result format"""
    if isinstance(result, list) and len(result) > 0:
        return result[0].text
    return str(result)


async def quick_test():
    """Quick test of essential functionality"""
    print("=== Quick ContextualAI HTTP MCP Test ===\n")

    try:
        # Use the Streamable HTTP transport
        from fastmcp.client.transports import StreamableHttpTransport

        transport = StreamableHttpTransport(
            url=SERVER_URL,
            headers={
                "Authorization": f"Bearer {API_KEY}",
                "Content-Type": "application/json"
            }
        )

        async with Client(transport) as client:
            # Test 1: Server connection
            print("🔌 Testing server connection...")
            await client.ping()
            print("✅ Server is reachable")

            # Test 2: List tools
            tools = await client.list_tools()
            print(f"✅ Available tools: {[tool.name for tool in tools]}")

            # Test 3: Send a query to the agent
            print("\n🔍 Testing query:")
            print("Query: 'YOUR_QUERY'")
            result = await client.call_tool("query", {
                "prompt": "YOUR_QUERY",
                "agent_id": "YOUR_AGENT_ID",
                "api_key": API_KEY
            })
            response = extract_result(result)
            print(f"Response: {response}")
            print("\n" + "="*80 + "\n")

        print("🎉 All tests completed successfully!")

    except Exception as e:
        print(f"❌ Test failed: {e}")
        print("Please check:")
        print("- Server URL is correct")
        print("- API key is valid")
        print("- Network connectivity")
        print("- FastMCP is installed: pip install fastmcp")


if __name__ == "__main__":
    asyncio.run(quick_test())
```

To run the test:

1. Install FastMCP: `pip install fastmcp`
2. Replace placeholders with your actual values
3. Run: `python test_script.py`

# Snowflake Native Application
Source: https://docs.contextual.ai/user-guides/snowflake

This guide explains how to install and use the Contextual AI Snowflake Native App

Contextual AI provides a platform for creating enterprise-grade AI agents, grounded in your documents and data. In a few simple steps, you can install Contextual AI as a Native Snowflake Application and create powerful AI agents in the security of your Snowflake environment.

***

## Installation

Obtain Contextual AI from the [Snowflake Marketplace](https://app.snowflake.com/marketplace).

Grant the "create compute pool" and "bind service endpoint" privileges to the application in Snowsight. Please note, this step may require up to 1 hour.
Once the application has been installed, the application's Snowsight / settings / security view will look like the image below. Your application is now ready for use!

![](https://files.readme.io/c7dcdaf52996d06da6aa46884d472b048179405a609b94366859f677bdf5f180-Screenshot_2025-02-19_at_2.43.52_PM.png)

***

## Using the application

Once you have installed the application, click "Launch App" to enter the Contextual AI Application.

**Note:** The first user to load into the application will be designated as the "Admin User", so it is highly recommended that your team's administrator be the first user to load the application.

Once you have logged in with your Snowflake credentials, you will be able to see the Contextual AI Dashboard and create your first AI Agent.

**Note:** The first time you use the application, you will need to create a new agent and create a new datastore for it.

![](https://files.readme.io/d1671d165a56548fae1640d9f628de66e6d5e68b094ebcf7c899cbafc4fa65d4-Screenshot_2025-02-19_at_2.52.03_PM.png)

Click "Create" at the bottom of the Create Agents page to create and save your first AI Agent.

### Adding Knowledge to Your Agent's Datastore

1. Navigate to the Datastores tab and locate your agent's newly created Datastore.
2. Open the "Documents" page within the Datastore.
3. Select the "Ingest" tab to upload your PDF documents.

Once you upload a document, the system will automatically begin processing it in the background. This process extracts information from your documents and prepares them for future retrieval by your agents.

After your documents are processed, return to the Agents page and click "Chat" to start interacting with your agent. Your agent can now access and reference the knowledge from your uploaded documents during conversations.

***

## API (programmatic) access

Contextual AI's application provides a REST API for programmatic interaction with your agents and datastore.
After you have created and configured an agent (and the agent's datastore) through the UI, you can integrate it into your applications and workflows using the API endpoints. Below are the steps to get started with API access.

API access in the Contextual AI Native App requires the API endpoint of your instance. You can find it by running this Snowflake query in a Snowflake worksheet or via the Snowflake CLI:

```
CALL CONTEXTUAL_NATIVE_APP.CORE.GET_API_ENDPOINT()
```

You will receive a response formatted as a URL: `xxxxx-xxxxx-xxxxx.snowflakecomputing.app`. This value is the backend API endpoint for your application.

To create the full API endpoint, prepend `https://` to the backend API endpoint and append `/v1`. An example of how to do this in Python:

```python
SF_BASE_URL = 'xxxxx-xxxxx-xxxxx.snowflakecomputing.app'  # what you will receive from GET_API_ENDPOINT()
BASE_URL = f'https://{SF_BASE_URL}/v1'  # using python3 f-string formatting
```

For authentication, instead of using an API key, the Snowflake Native App version of Contextual AI uses a Snowflake session token, which can be retrieved using the following Python code:

```python
import snowflake.connector

ctx = snowflake.connector.connect(
    user="",  # snowflake account user
    password='',  # snowflake account password
    account="organization-account",  # format: <organization>-<account> (e.g., myorg-account123)
    session_parameters={
        'PYTHON_CONNECTOR_QUERY_RESULT_FORMAT': 'json'
    })

# Obtain a session token (note: _rest._token_request is a private connector method).
token_data = ctx._rest._token_request('ISSUE')
token_extract = token_data['data']['sessionToken']

# Wrap the token in double quotes; this quoted value is used as the API key.
api_key = f'"{token_extract}"'
```

Once you have your API key, you can combine the steps above to create a Contextual AI client in Python that is configured to use your Contextual AI Native App in Snowflake and to make programmatic queries to your agents.
```python
import snowflake.connector
from contextual import ContextualAI

SF_BASE_URL = 'xxxxx-xxxxx-xxxxx.snowflakecomputing.app'  # what you will receive from GET_API_ENDPOINT()
BASE_URL = f'https://{SF_BASE_URL}/v1'

ctx = snowflake.connector.connect(  # type: ignore
    user="",  # snowflake account user
    password='',  # snowflake account password
    account="organization-account",  # format: <organization>-<account>
    session_parameters={
        'PYTHON_CONNECTOR_QUERY_RESULT_FORMAT': 'json'
    })

# Obtain a session token (note: _rest._token_request is a private connector method).
token_data = ctx._rest._token_request('ISSUE')  # type: ignore
token_extract = token_data['data']['sessionToken']  # type: ignore

# Wrap the token in double quotes; this quoted value is used as the API key.
api_key = f'"{token_extract}"'

client = ContextualAI(api_key=api_key, base_url=BASE_URL)

# get list of agents to test API
agents = [a for a in client.agents.list()]
```

We recommend using the Contextual AI Python SDK to interact with the API. An example script that uses the Python SDK with the Snowflake Native App can be found [here](https://github.com/ContextualAI/contextual-client-python/blob/main/examples/snowflake_native_app_example.py).
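
The two string-handling steps in this guide (building the base URL from the `GET_API_ENDPOINT()` value, and quoting the session token) are easy to get subtly wrong when values are copied by hand. Below is a minimal sketch of helpers for both. The function names `build_base_url` and `quote_session_token` are hypothetical and not part of the Contextual AI SDK; the scheme and trailing-slash tolerance is an added convenience, not documented behavior.

```python
def build_base_url(endpoint: str) -> str:
    """Turn the value returned by GET_API_ENDPOINT() into the full API base URL.

    Hypothetical helper: prepends https:// and appends /v1, as the guide describes.
    """
    host = endpoint.strip()
    # Tolerate a value that was pasted with a scheme already attached.
    for prefix in ("https://", "http://"):
        if host.startswith(prefix):
            host = host[len(prefix):]
    # Tolerate a trailing slash.
    host = host.rstrip("/")
    if not host:
        raise ValueError("endpoint is empty")
    return f"https://{host}/v1"


def quote_session_token(token: str) -> str:
    """Wrap the Snowflake session token in double quotes, as the guide's snippet does."""
    return f'"{token}"'


if __name__ == "__main__":
    # Placeholder endpoint, in the same shape as the guide's example.
    print(build_base_url("xxxxx-xxxxx-xxxxx.snowflakecomputing.app"))
```

The returned values can then be passed as `base_url` and `api_key` when constructing the client, in place of the inline f-strings used above.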