",
"description": "",
"template_name": "default",
"datastore_ids": [
"3c90c3cc-0d44-4b50-8888-8dd25736052a"
]
}'
```
Contextual AI will return the new agent's `id` and its `datastore_ids`:
```json theme={null}
{
"id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
"datastore_ids": [
"3c90c3cc-0d44-4b50-8888-8dd25736052a"
]
}
```
# Disable Commentary
Source: https://docs.contextual.ai/reference/disable-commentary
`(avoid_commentary)`
## Description
A flag indicating whether the Agent should output only strictly factual information grounded in the retrieved knowledge, rather than a complete response (which can include commentary, analysis, etc.).
# Enable Filtering
Source: https://docs.contextual.ai/reference/enable-filtering
`(enable_filter)`
## Description
Allows the agent to perform a final filtering step prior to generation. When enabled, chunks are checked against the filter prompt and irrelevant chunks are filtered out. This acts like a final quality-control checkpoint, helping to ensure that only relevant chunks are passed to the generator. This filter can improve response accuracy and relevance, but can also increase the false refusal rate if the configuration is too strict.
# Enable Groundedness Score
Source: https://docs.contextual.ai/reference/enable-groundedness-score
`(calculate_groundedness)`
## Description
Enables the agent to provide groundedness scores as part of its response. When enabled, the agent identifies distinct claims in the response and assesses whether each one is grounded in the retrieved document chunks. Claims that are not grounded are shown in yellow in the UI. Defaults to `off`.
# Enable Multi-turn
Source: https://docs.contextual.ai/reference/enable-multi-turn
`(enable_multi_turn)`
## Description
`enable_multi_turn` allows the agent to remember and reference previous parts of the conversation, making interactions feel more natural and continuous. When enabled, the user’s query is automatically reformulated based on prior turns to resolve ambiguities, and the conversation history is prepended to the query at generation time.
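Conceptually, prepending the conversation history works like the following sketch. This is illustrative only: the actual reformulation is performed server-side by a model, and the turn format shown here is a hypothetical stand-in.

```python
# Illustrative sketch of multi-turn context construction. The real
# reformulation is model-driven; this only shows the general idea of
# prepending prior (user, agent) turns to the latest query.
def build_contextualized_query(history, query):
    """Prepend prior (user, agent) turns to the latest query."""
    lines = []
    for user_turn, agent_turn in history:
        lines.append(f"User: {user_turn}")
        lines.append(f"Agent: {agent_turn}")
    lines.append(f"User: {query}")
    return "\n".join(lines)

history = [("What are JPMorgan's results this quarter?",
            "JPMorgan reported revenue of ...")]
print(build_contextualized_query(history, "How does that compare to last year?"))
```

With the history attached, an otherwise ambiguous follow-up ("that") can be resolved against the earlier turns.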
# Enable Query Decomposition
Source: https://docs.contextual.ai/reference/enable-query-decomposition
`(enable_query_decomposition)`
## Description
Toggles the query decomposition module `on` or `off`. When enabled, the module will break down complex and compound queries into a series of simpler queries. Each sub-query will have individual retrievals, which are then intelligently combined prior to any subsequent reranking, filtering, or generation steps. Simple queries, as judged by the model, will not be decomposed.
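The retrieval-combination step can be pictured with the sketch below. It is illustrative only: the sub-queries, the `retrieve` function, and the index are hypothetical stand-ins, and the platform's real combination logic is model-assisted rather than a simple deduplicated union.

```python
# Illustrative sketch: retrieve chunks for each sub-query, then merge
# the results, dropping duplicates while preserving first-seen order.
def combined_retrievals(sub_queries, retrieve):
    """Merge per-sub-query retrievals into one deduplicated list."""
    seen, merged = set(), []
    for sub_query in sub_queries:
        for chunk in retrieve(sub_query):
            if chunk not in seen:
                seen.add(chunk)
                merged.append(chunk)
    return merged

# Hypothetical in-memory "index" standing in for a datastore lookup.
fake_index = {
    "Q3 revenue of Apple": ["chunk 1", "chunk 2"],
    "Q3 revenue of Google": ["chunk 2", "chunk 3"],
}
print(combined_retrievals(list(fake_index), lambda q: fake_index[q]))
# → ['chunk 1', 'chunk 2', 'chunk 3']
```

The merged list is what would then flow into reranking, filtering, and generation.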
# Enable Query Expansion
Source: https://docs.contextual.ai/reference/enable-query-expansion
`(enable_query_expansion)`
## Description
Toggles the query expansion module `on` or `off`. When enabled, the user’s original query is rewritten according to the instructions set out in the prompt, guided by any provided examples. If no prompt or examples are given, the default prompt is used.
## Default Query Reformulation Prompt
The following default prompt illustrates few-shot examples of how queries are reformulated based on the provided instructions.
```
Instructions: Reformulate the query so that it is more detailed and includes relevant terminology or topics that will be helpful in maximizing the quality of the information retrieved to answer the query.
Example 1:
Original query: What are JPMorgan’s results this quarter?
Expanded query: Can you provide the latest financial results for JPMorgan, including revenue, earnings per share, and key metrics for the most recent quarter?
Example 2:
Original query: What is data cleaning?
Expanded query: Could you explain the concept of data cleaning, including common techniques used, typical challenges faced, and its role in the data preprocessing pipeline for machine learning models?
Example 3:
Original query: What are the results of Apple this quarter?
Expanded query: Can you provide the latest financial results for Apple, including revenue, earnings per share, and key metrics for the most recent quarter?
```
# Enable Reranking
Source: https://docs.contextual.ai/reference/enable-reranking
`(enable_rerank)`
## Description
Allows the agent to take the initially retrieved document chunks and rerank them based on the provided instructions and user query. The top reranked chunks are passed on for filtering and final response generation.
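As a simplified picture of what reranking does, consider the sketch below. The production reranker is an instruction-following model that scores chunks against the query and your instructions; this sketch only shows the final ordering-and-truncation step once scores exist.

```python
# Illustrative only: order chunks by relevance score, highest first,
# and keep the top_k for the downstream filtering/generation steps.
def rerank(chunks, scores, top_k):
    """Sort chunks by score (descending) and keep the top_k."""
    ranked = sorted(zip(chunks, scores), key=lambda pair: pair[1], reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]

chunks = ["chunk A", "chunk B", "chunk C"]
scores = [0.42, 0.91, 0.67]   # hypothetical reranker scores
print(rerank(chunks, scores, top_k=2))  # → ['chunk B', 'chunk C']
```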
# FAQ
Source: https://docs.contextual.ai/reference/faq
Answers to the most common questions about Contextual AI
***
## Platform, Components, & Technology
#### What is the Contextual AI platform?
The Contextual AI platform is a context engineering layer for enterprise AI, enabling organizations to build specialized, production-ready AI apps grounded in their organization's data in days, not months.
#### How does your platform differ from generic chatbots or large language model (LLM) APIs?
Unlike generic LLMs, our platform provides a suite of state-of-the-art (SOTA) context engineering tools that connect your private knowledge base (documents, system logs, workflows) with any LLM. This ensures the agent’s responses are grounded in your business context and helps eliminate hallucinations, thereby improving trust in the outputs.
#### What are the core capabilities of the platform?
Key capabilities include:
* Ingesting/unifying documents into vector stores
* Parsing and chunking unstructured data
* Building specialized AI apps with RAG pipelines
* APIs and SDKs for programmatic use
* Enterprise-grade security, scalability, and governance
* AI and system observability, and data management
#### What does “context engineering” mean?
Context engineering is the discipline of selecting, structuring, and delivering the right information to LLMs in order to deliver accurate, grounded responses. This includes:
* **Information retrieval** – Finding and preparing relevant data from enterprise sources
* **Context optimization and management** – Shaping prompts, guardrails, and agent behavior
* **Workflow/agent orchestration** – Connecting multi-step processes and enterprise systems
***
## Use Cases & Industries
#### Which industries are best suited for Contextual AI?
Industries with large complex knowledge bases that use documents with rich media (e.g., charts, images, tables) in challenging file types (e.g., PDF, HTML, Markdown), and/or compliance/regulatory demands: financial services, engineering/manufacturing, legal and professional services.
#### Can you give examples of typical Contextual AI use cases?
Some examples include:
* Technical support agents who answer user queries with responses grounded in vast documentation.
* Policy & procedures agent for employees: making institutional knowledge accessible via chat queries.
* Internal knowledge agents for onboarding, reducing time-to-productivity of new hires.
#### How quickly can an enterprise go from concept to production?
Our context layer and platform capabilities enable the creation of production-ready apps with high accuracy out of the box, which removes the complexity of integrating and optimizing the accuracy of a DIY alternative.
Enterprises using our platform have significantly reduced their concept-to-production timelines, in some cases achieving initial production deployments in as little as \~30 days.
***
## Data
#### What types of data sources can I connect?
You can ingest any source of enterprise content that you wish your agents to reference, including documents (PDFs, Word, HTML), logs, databases, knowledge bases, internal wikis, and cloud storage.
#### Can Contextual AI scale to my enterprise’s requirements on an ongoing basis?
Yes. Our platform is built for enterprise-scale deployments, offering state-of-the-art context engineering and production-grade agent capabilities.
Contextual AI customers routinely support:
* 2,000+ internal users
* 10,000+ support cases handled
* Millions of pages ingested
* Public-facing apps in production
* Multiple use cases on one unified platform
If your enterprise requires it, our platform can scale to meet your specific needs.
#### Can I connect data from third-party data platforms and cloud providers?
Yes, supported data sources include Google Drive, Microsoft SharePoint, Microsoft OneDrive, Box, and Confluence, with more being added continuously.
#### Do you train your models on our data?
No. We never use your documents, data, or interactions to train or fine-tune our underlying models. Your data remains isolated and is used only to power the AI agents you create within your own workspace.
#### How is our data used by the platform?
Your data is processed solely for retrieval, reasoning, and real-time inference by your agents. It is not stored for model improvement, shared with other customers, or used outside your environment.
#### Who owns the AI agents we build on the platform?
You do. Any agents you develop—including their configurations, behaviors, and outputs—are entirely your intellectual property.
#### Does the platform have any rights to our agents or data?
No. We do not claim ownership over your agents or your data. You retain full rights to everything you create and upload.
#### Is our data shared with other customers?
Never. Each customer environment is isolated, and your data is not accessible to or used by any other organization.
***
## Security & Governance
#### How is customer data handled and protected?
Customer data is processed in accordance with the customer’s agreement, and our Platform Services processes data for the benefit of the customer and their authorized users. We publish a [Privacy Policy](https://contextual.ai/legal/privacy-policy) detailing how we collect, use, and share personal information.
#### Do you support enterprise-grade governance and access controls?
Yes, the Contextual AI platform supports [role-based access control (RBAC)](/admin-setup/rbac), audit trails, enterprise authentication, respected entitlements at query time, and model armor to prevent agent misuse.
#### Does Contextual AI adhere to major security and privacy frameworks?
Yes, we design with security and privacy in mind (data residency, encryption at rest/in transit, access control) and provide documentation to support enterprise compliance needs. We comply with major data protection standards, including HIPAA, SOC 2, CCPA, and GDPR, with specific certifications varying by region and customer requirements. For more information, visit the [Contextual AI Trust Center](https://trust.contextual.ai/).
***
## Getting Started & Pricing
#### How do I get started with Contextual AI?
For a guided experience, you can [request a demo](https://contextual.ai/request-a-demo). For self-serve, you can sign up, create a workspace, ingest data (documents, logs, other sources), create an agent, and start querying via the UI or API. Our Beginner’s Guide walks you through steps such as API key creation, datastore creation, and querying.
#### Is there a free trial or credits available?
Yes, when you sign up, you’ll get \$25 (\$50 with a work email address) in free credits to explore the platform and build your first agent.
#### How is pricing structured?
Pricing is based on usage and varies by volume of queries, data ingress/egress, compute, and SLA. For more information, please refer to the [Pricing & Billing](/admin-setup/pricing-billing) page or contact [sales@contextual.ai](mailto:sales@contextual.ai) to request your usage profile and enterprise plan.
#### What support options are available?
We provide developer documentation (SDKs, APIs), onboarding guidance, enterprise-grade support SLAs, and professional services engagements to accelerate production deployments.
***
## Technical Details & Integrations
#### Which programming languages and SDKs are supported?
Contextual AI is available via REST API, with examples for cURL, Python, JavaScript, PHP, Go, Java, and Ruby, and via SDKs for Python and Node.js.
#### Can I deploy on-premises or in my private cloud?
Yes, our architecture supports flexible and secure deployment options, including fully managed cloud, private cloud, or customer VPC.
#### How does the Contextual AI platform scale for high-volume/real-time queries?
The Contextual AI platform is designed for production-grade reliability: auto-scaling compute for agents, vector store indexing, caching layers, and optimized retrieval pipelines to support high query volumes with low latency.
***
## Company & Trust
#### Who are the founders and where is the company based?
The company was founded in 2023 by Douwe Kiela and Amanpreet Singh and is headquartered in Mountain View, California, USA.
#### Who are some of Contextual AI's clients or reference customers?
Contextual AI has enterprise clients across various sectors, including technology, financial services, and professional services. Some notable customers include Qualcomm, ClaimWise, Advantest, Comply, HSBC, ShipBob, Element Solution, IGS, Sevii, and others.
#### What is Contextual AI’s mission?
Our mission is to replace the DIY complexity of building enterprise AI by providing a unified “context layer” so that AI is accurate, secure, scalable, and specialized to business knowledge and workflows.
#### How do you stay ahead in AI and maintain the trustworthiness of your agents?
We invest in research, model development (including grounded language models, rerankers), enterprise-grade engineering, and alignment to ensure the agents’ outputs are fact-based, cite sources when appropriate, and support audit and traceability.
# Filter Prompt
Source: https://docs.contextual.ai/reference/filter-prompt
`(filter_prompt)`
## Description
Natural language instructions that describe the criteria for relevant and irrelevant chunks. These instructions can be used in tandem with, or as an alternative to, the reranker score-based filtering.
# Frequency Penalty
Source: https://docs.contextual.ai/reference/frequency-penalty
`(frequency_penalty)`
## Description
Helps prevent repetition in responses by making the agent less likely to use words it has already used frequently. This helps ensure more natural, varied language. Defaults to `0`.
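In many LLM APIs, a frequency penalty is implemented as a per-token logit adjustment proportional to how often the token has already appeared. The sketch below illustrates that common convention; Contextual AI's exact formula is not documented here.

```python
# Sketch of the frequency-penalty convention used by many LLM APIs
# (logit minus penalty times prior-occurrence count). Illustrative only.
def apply_frequency_penalty(logits, counts, penalty):
    """Lower each token's logit in proportion to how often it appeared."""
    return {tok: logit - penalty * counts.get(tok, 0)
            for tok, logit in logits.items()}

logits = {"the": 2.0, "cat": 1.5, "sat": 1.0}
counts = {"the": 3, "cat": 1}   # tokens already generated so far
print(apply_frequency_penalty(logits, counts, penalty=0.5))
# "the" drops to 0.5, "cat" to 1.0, "sat" is unchanged.
```

A penalty of `0` (the default) leaves the logits untouched, so repetition is neither encouraged nor discouraged.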
# Generation Model
Source: https://docs.contextual.ai/reference/generation-model
Selecting Your Generation Model
## Description
Determines which generation model powers your agent. You can use either our default Grounded Language Model (GLM) or a version that has been specifically tuned to your use case.
Tuned models can only be used with the agents on which they were tuned.
## Usage
You can configure the generation model either during agent creation or by modifying an existing agent.
1. In the agent's configuration settings, click **Generation** from the left-hand pane to access **Generation Settings**.
2. Under **Generation Model**, select the desired model from the drop-down list.
3. Click the **Save** button.
# Glossary
Source: https://docs.contextual.ai/reference/glossary
Essential definitions for understanding and using Contextual AI
## Agent
Contextual AI RAG Agents are optimized end-to-end to deliver exceptional accuracy on complex and knowledge-intensive tasks. `Agents` make intelligent decisions on how to accomplish the tasks, and can take multiple steps to do so. The agentic approach enables a wide range of actions, such as providing standard retrieval-based answers, declining to respond when no relevant information is available, or generating and executing SQL queries when working with structured data. The adaptability and further tuning of `Agents` greatly increase their value for knowledge-intensive tasks.
## Attributions
`Attributions` are in-line citations that credit the specific sources of information used by the model to generate a response. When querying Contextual AI Agents, `Attributions` are included for each claim made in the response. These attributions can be accessed via the query API response or viewed in the UI by hovering over in-line tooltips next to each claim (e.g., \[1], \[2]).
## Case
A `Case` is a row of data. It is either a `Prompt` and `Reference` (gold-standard answer) pair, or a `Prompt`, `Response`, and `Knowledge` triplet. `Evaluation` datasets follow the former schema, while `Tuning` datasets require the latter.
## Chunking
Chunking is the process of breaking large documents into smaller, semantically meaningful pieces so AI models can understand, retrieve, and reason over them more effectively. Instead of forcing a model to process an entire report, slide deck, or knowledge base at once, chunking creates manageable units, each with its own context, that improve search accuracy, reduce hallucinations, and make RAG workflows far more reliable.
To explore Contextual AI's four distinct chunking modes and see how they support diverse document types and use cases, please refer to [Chunking Configurations](/reference/chunking).
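For intuition, a minimal fixed-size chunker with overlap might look like the sketch below. This is a generic illustration only; Contextual AI's chunking modes are semantically aware and considerably more sophisticated.

```python
# Generic illustration of chunking: split text into overlapping
# character windows. Not Contextual AI's actual chunking logic.
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into windows of chunk_size, each overlapping the
    previous one by `overlap` characters."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "x" * 500
pieces = chunk_text(doc)
print(len(pieces), [len(p) for p in pieces])  # → 4 [200, 200, 200, 50]
```

The overlap preserves context that would otherwise be cut at chunk boundaries, at the cost of some duplication in the index.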
## Dataset
The `Dataset` object can be used to store labelled data `cases`. A `case` is either a (i) `Prompt`-`Reference` pair, or a (ii) `Prompt`-`Reference`-`Knowledge` triplet. `Datasets` can be used for `Evaluation` or `Tuning`. You can create a new `Dataset` by uploading a CSV or JSONL file via our API.
The `Dataset` object can also store evaluation results. Once an evaluation job is completed, it returns a `Dataset` containing the original `Cases` from the evaluation, now appended with results such as `Equivalence` and `Groundedness` scores for each `Case`.
## Datastore
A repository of data associated with an `Agent`. An `Agent` retrieves relevant data from its associated `Datastores` to generate responses. An `Agent` can connect to multiple `Datastores`, and each `Datastore` can serve multiple `Agents`. You can associate a `Datastore` with an `Agent` when creating or editing an Agent [via our APIs](/api-reference/agents/edit-agent). We also provide [a set of APIs](/api-reference/datastores) for creating and managing `Datastores`.
## Document
A `Document` is a unit of unstructured data ingested into a `Datastore`, which can be queried and used as the basis for generating responses. Today, we support both `pdf` and `html` files, and plan to expand support to other data types. You can ingest `Documents` into a `Datastore` via our API. After ingestion, `Documents` are automatically parsed, chunked, and processed by our platform.
## Knowledge
The data retrieved by the `Agent` from the `Datastore` to generate its response. When working with unstructured data, `Knowledge` comes in the form of a list of `Document` chunks that are relevant to the `Query`.
## LMUnit
An evaluation method using natural language unit tests to assess specific criteria in an Agent's responses. You can define and evaluate clear, testable statements or questions that capture desirable fine-grained qualities of the Agent's response — such as “Is the response succinct without omitting essential information?” or “Is the complexity of the response appropriate for the intended audience?” You can create and run these unit tests [via our API](/api-reference/lmunit/lmunit). Read more about `LMUnit` in [our blog post](https://contextual.ai/blog/lmunit/).
## Query / Prompt
The question that you submit to an `Agent`. You can submit a `Query` to your `Agent` [via our API](/api-reference/agents-query/query).
## RAG
Retrieval Augmented Generation or `RAG` is a technique that improves language model generation by incorporating external knowledge. Contextual AI Agents use `RAG` to ground their responses in directly relevant information, ensuring accuracy for knowledge-intensive tasks. We've pioneered the `RAG 2.0` approach, which outperforms traditional `RAG` systems by optimizing the system end-to-end. [Read more in our blog post](https://contextual.ai/introducing-rag2/).
## Response
`Response` is the output generated by an `Agent` in response to a `Query`. `Responses` come with the relevant retrieved content (`Knowledge`) and in-line citations (`Attributions`).
## System Prompt
Instructions that guide an Agent's response generation, helping define its behavior and capabilities. You can set and modify the `System Prompt` when creating or editing an Agent [via our APIs](/api-reference/agents/edit-agent).
## Workspace
An organizational unit that owns and manages `Agents`, `Datastores`, and other resources within the system. Contextual AI uses `Workspaces` to organize and manage resources, with API keys associated with specific workspaces.
# Lexical Search Weight
Source: https://docs.contextual.ai/reference/lexical-search-weight
`(lexical_alpha)`
## Description
When chunks are scored during retrieval (based on their relevance to the user query), this controls how much weight the scoring gives to exact keyword matches. A higher value means the agent will prioritize finding exact word matches, while a lower value allows for more flexible, meaning-based matching.
The value ranges from `0` to `1`, and Contextual AI defaults to `0.1`. You can increase this weight if exact terminology, specific entities, or specialized vocabulary is important for your use case.
# Max New Tokens
Source: https://docs.contextual.ai/reference/max-new-tokens
`(max_new_tokens)`
## Description
Controls the maximum length of the agent’s response. Defaults to `2,048` tokens.
# No Retrieval System Prompt
Source: https://docs.contextual.ai/reference/no-retrieval-system-prompt
`(no_retrieval_system_prompt)`
## Description
Defines the agent’s behavior if, after the retrieval, reranking, and filter steps, no relevant knowledge has been identified. You can use this prompt to define boilerplate refusals, offer help and guidance, provide information about the document store, and specify other contextually-appropriate ways that the agent should respond.
## Default No Retrieval System Prompt
```
You are an AI RAG agent created by Contextual to help answer questions and complete tasks posed by users. Your capabilities include accurately retrieving/reranking information from linked datastores and using these retrievals to generate factual, grounded responses. You are powered by leading document parsing, retrieval, reranking, and grounded generation models. Users can impact the information you have access to by uploading files into your linked datastores. Full documentation, API specs, and guides on how to use Contextual, including agents like yourself, can be found at docs.contextual.ai.
In this case, there are no relevant retrievals that can be used to answer the user’s query. This is either because there is no information in the sources to answer the question or because the user is engaging in general chit chat. Respond according to the following guidelines:
If the user is engaging in general pleasantries (“hi”, “how goes it”, etc.), you can respond in kind. But limit this to only a brief greeting or general acknowledgement
Your response should center on describing your capabilities and encouraging the user to ask a question that is relevant to the sources you have access to. You can also encourage them to upload relevant documents and files to your linked datastore(s).
DO NOT answer, muse about, or follow-up on any general questions or asks. DO NOT assume that you have knowledge about any particular topic. DO NOT assume access to any particular source of information.
DO NOT engage in character play. You must maintain a friendly, professional, and neutral tone at all times
```
# Number of Reranked Chunks
Source: https://docs.contextual.ai/reference/number-of-reranked-chunks
`(top_k_reranked_chunks)`
## Description
The number of top reranked chunks that are passed on for generation.
# Number of Retrieved Chunks
Source: https://docs.contextual.ai/reference/number-of-retrieved-chunks
`(top_k_retrieved_chunks)`
## Description
The maximum number of chunks that should be retrieved from the linked datastore(s). For example, if set to `10`, the agent will retrieve at most 10 relevant chunks to generate its response. The value ranges from `1` to `200`, and the default value is `100`.
# Random Seed
Source: https://docs.contextual.ai/reference/random-seed
`(seed)`
## Description
Sets the seed for the random sampling the agent uses to select tokens during text generation. Fixing the seed makes results reproducible, which can be useful for testing.
# Rerank Instructions
Source: https://docs.contextual.ai/reference/rerank-instructions
`(rerank_instructions)`
## Description
Natural language instructions that describe your preferences for the ranking of chunks, such as prioritizing chunks from particular sources or time periods. Chunks will be rescored based on these instructions and the user query.
If no instruction is given, the rerank scores are based only on relevance of chunks to the query.
# Reranker Score Filter
Source: https://docs.contextual.ai/reference/reranker-score-filter
`(reranker_score_filter_threshold)`
## Description
If set to a value greater than `0`, chunks with relevance scores below that value will not be passed on. Input must be between `0` and `1`.
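In effect, the filter behaves like this sketch (illustrative only; scores come from the reranker, and the chunk tuples here are hypothetical):

```python
# Sketch of score-threshold filtering: chunks whose rerank score falls
# below the threshold are dropped before generation.
def filter_by_score(chunks_with_scores, threshold):
    """Keep only chunks whose relevance score meets the threshold."""
    return [chunk for chunk, score in chunks_with_scores if score >= threshold]

retrieved = [("chunk A", 0.85), ("chunk B", 0.30), ("chunk C", 0.62)]
print(filter_by_score(retrieved, threshold=0.5))  # → ['chunk A', 'chunk C']
```

A higher threshold trades recall for precision: fewer, more relevant chunks reach the generator, but overly strict values can starve it of context.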
# Semantic Search Weight
Source: https://docs.contextual.ai/reference/semantic-search-weight
`(semantic_alpha)`
## Description
When chunks are scored during retrieval (based on their relevance to the user query), this controls how much weight the scoring gives to semantic similarity. Semantic search looks for text that conveys similar meaning and concepts, rather than just matching exact keywords or phrases.
The value ranges from `0` to `1`, and Contextual AI defaults to `0.9`. The value of the semantic search weight and lexical search weight must sum to `1`.
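The weighted combination can be sketched as follows, using the documented defaults of `0.1` and `0.9`. This is illustrative; the platform's internal scoring may differ in detail.

```python
# Illustrative hybrid retrieval score: a weighted sum of lexical and
# semantic relevance, using the documented default weights.
LEXICAL_ALPHA = 0.1    # weight on exact keyword matching
SEMANTIC_ALPHA = 0.9   # weight on meaning-based similarity
assert abs(LEXICAL_ALPHA + SEMANTIC_ALPHA - 1.0) < 1e-9  # must sum to 1

def hybrid_score(lexical_score, semantic_score):
    """Blend exact-match and meaning-based relevance into one score."""
    return LEXICAL_ALPHA * lexical_score + SEMANTIC_ALPHA * semantic_score

# A chunk with a perfect keyword match but middling semantic similarity:
print(hybrid_score(lexical_score=1.0, semantic_score=0.5))
```

With the default weights, semantic similarity dominates; raising `lexical_alpha` (and lowering `semantic_alpha` to keep the sum at `1`) shifts ranking toward exact terminology.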
# Suggested Queries
Source: https://docs.contextual.ai/reference/suggested-queries
`(suggested_queries)`
## Description
Example queries that appear in the user interface when first interacting with the agent. You can provide both simple and complex examples to help users understand the full range of questions your agent can handle. This helps set user expectations and guides them toward effective interactions with the agent.
# Temperature
Source: https://docs.contextual.ai/reference/temperature
`(temperature)`
## Description
Controls how creative the agent’s responses are. A higher temperature means more creative and varied responses, while a lower temperature results in more consistent, predictable answers. Ranges from `0` to `1`, defaults to `0`.
# Top P
Source: https://docs.contextual.ai/reference/top-p
`(top_p)`
## Description
Similar to temperature, this parameter controls response variety. It limits how many candidate tokens the agent considers when generating each part of its response. Defaults to `0.9`.
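Temperature and `top_p` typically map onto softmax scaling and nucleus sampling, respectively. The sketch below shows that standard mechanism; Contextual AI's internal sampler is not documented here, so treat this as an illustration of the common technique rather than the platform's implementation.

```python
# Illustrative nucleus (top-p) sampling with temperature scaling.
import math
import random

def sample_top_p(logits, temperature=1.0, top_p=0.9, rng=random):
    """Apply temperature to the logits, then sample from the smallest
    set of tokens whose cumulative probability reaches top_p."""
    if temperature == 0:                      # temperature 0: greedy decoding
        return max(logits, key=logits.get)
    z = sum(math.exp(l / temperature) for l in logits.values())
    ranked = sorted(((t, math.exp(l / temperature) / z)
                     for t, l in logits.items()),
                    key=lambda pair: pair[1], reverse=True)
    nucleus, cumulative = [], 0.0
    for token, prob in ranked:
        nucleus.append((token, prob))
        cumulative += prob
        if cumulative >= top_p:               # nucleus is large enough
            break
    tokens, weights = zip(*nucleus)
    return rng.choices(tokens, weights=weights, k=1)[0]

logits = {"yes": 3.0, "no": 1.0, "maybe": 0.5}
print(sample_top_p(logits, temperature=0))    # greedy → "yes"
```

Lowering `top_p` shrinks the candidate set (more predictable output); raising temperature flattens the distribution (more varied output).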
# 2025 Release Updates
Source: https://docs.contextual.ai/release-notes/latest
Product Highlights & Release Information
***
## December
### 12/11/2025
**Higher Page & File Size Limits for PDF, DOCX, & PPTX Ingestion**
Document ingestion now supports files up to 2,000 pages and 400MB.
We’ve significantly expanded our document ingestion capabilities to support much larger files. The platform can now process documents up to **2,000 pages** and **400MB in size**, a substantial increase from the previous limits of **400 pages** and **100MB**. This enhancement allows users to work with more comprehensive documents without needing to manually split them into smaller parts.
For existing tenants, adoption of the new limits requires a manual update to the feature flags. To enable the expanded capacity, set:
* `MAX_PDF_PAGES_PER_FILE_ALLOWED` to `2000`
* `MAX_BYTE_SIZE_FILE_ALLOWED` to `400000000`
Please note that processing performance may still slow down for documents exceeding 400 pages. Organizations with strict latency requirements should continue to divide large documents into smaller segments to maintain optimal processing speed.
### 12/8/2025
**Oryx v0.1.0 Now Available**
Oryx by Contextual AI allows you to seamlessly integrate your application with Contextual AI agents, from interface to proxy, with a fully customizable and unbranded user experience.
Oryx by Contextual AI is now officially live, enabling teams to seamlessly integrate Contextual AI agents directly into their applications with a fully customizable, unbranded, and developer-friendly experience. Built to support enterprise engineering teams, Oryx abstracts away the complexity of connection handling, agent logic, and data flow so you can focus entirely on crafting exceptional user experiences.
Key Features
- Seamless integration of Contextual AI agents into any product’s interface or backend
- Fully customizable, unopinionated UI that allows teams complete control over branding and user experience
- Enterprise-grade connection and data flow management, handled automatically by Oryx
- Optimized developer experience, offering intuitive APIs and configurable components
- React support via `@contextualai/oryx-react` for building rich, responsive front-end agent interactions
- Secure backend connectivity via `@contextualai/oryx-proxy-node`, usable in any JavaScript runtime environment
- [Curated styling guide](https://oryx.contextual.ai/styling) with example code and a sample repository to accelerate production-ready deployments
Availability
Oryx is now available as `v0.1.0`. Explore the docs and examples at [https://oryx.contextual.ai/](https://oryx.contextual.ai/)
**Login & Signup Page Redesign**
The Contextual AI platform login and sign-up pages have been redesigned, bringing greater visual consistency with our company's corporate website experience.
We’ve refreshed the Contextual AI platform login and sign-up experience with the newly introduced Möbius imagery from our corporate website, continuing our broader effort to unify the platform’s look and feel under our evolving brand identity. This update enhances visual consistency across surfaces and reflects the modernized aesthetic we’re rolling out throughout the product.
### 12/4/2025
**Monitoring Page with Analytics & Insights**
Contextual AI now provides a **Monitoring** page with searchable, exportable analytics across agent activity, user engagement, and knowledge-base operations through interactive charts and key usage metrics.
The new Monitoring Page gives teams deeper visibility into how their agents and knowledge bases are performing. With built-in analytics, filters, and exportable charts, you can quickly understand usage patterns, identify issues, and track adoption across your organization.
Key Features
- Agent performance metrics including daily active users, total queries, feedback/thumbs-up rates, and retry rates
- Knowledge-base activity insights such as document upload volume and datastore creation trends
- Searchable filters for agents and users, with intuitive **Apply** and **Clear** controls for fast exploration
- Interactive charts that allow drill-down analysis and CSV export for reporting or offline review
How To Enable
Set the feature flag `ENABLE_MONITORING_PAGE` to `on` (default: `off`). Once enabled, the **Monitoring Page** will appear under **Observability** in the sidebar.
### 12/1/2025
**Custom RBAC with Groups**
You can now create custom, fine-grained roles and groups with scoped permissions for tighter, more precise access control.
Admins can now define custom roles with tailored permission bundles across key objects (e.g., Agents, Datastores, Billing, and other admin features). Permissions can be scoped to specific Agents or Datastores, enabling finer-grained governance by ensuring each team member has the right level of access for their responsibilities. You can also create Groups to simplify role management.
Contact your Contextual AI account manager to enable this feature.
***
## November
### 11/5/2025
**Data Connectors Now Support Sync History**
Data connectors now include full sync history, giving teams visibility into every sync event and the ability to inspect individual document failures.
You can now review a complete record of all sync activity across your data connectors. Each sync includes drill-down details so you can see exactly which documents failed and why. This makes troubleshooting faster and provides clearer operational insight into connector performance.
How to enable: Default on.
**Configurable Feedback Options**
Agent developers can now customize the end-user feedback workflow, including mandatory feedback and personalized issue labels.
We’ve added new configuration controls that let agent developers tailor how users provide feedback. You can require feedback for every query, define your own issue labels, and adapt the workflow to match your team’s QA process. These options give you more flexibility in how feedback is collected and managed.
How to enable: Default on. Go to **Agent Config → User Experience**.
**Three New Connectors: OneDrive, Box, Confluence**
OneDrive, Box, and Confluence are now supported as third-party connectors for datastores.
We’ve expanded our integration ecosystem with three widely requested connectors. You can now ingest and manage documents from OneDrive, Box, and Confluence directly through your datastore, extending your retrieval pipeline with more enterprise content sources.
How to enable: **Create Datastore → Third-Party Connection**.
**Feedback Annotation Module**
A new annotation module allows admins to label, annotate, visualize, and export feedback directly within the platform.
The platform now supports end-to-end feedback annotation. Admins can define annotation labels, tag feedback within the UI, generate basic visualizations, and export annotated data as CSV. This gives enterprise teams a centralized place to manage QA workflows, reduce tool switching, and streamline evaluation processes.
How to enable: Default on for all tenants. Access via **Agent Config → Feedback Annotation**.
***
## September
### 9/17/2025
**Billing API Endpoint Available**
A new Billing API endpoint provides programmatic access to usage and consumption data.
Teams can now integrate their billing and consumption data into their internal tools or dashboards using the new Billing API. This enables easier reporting, monitoring, and automation around usage.
Endpoint: [https://docs.contextual.ai/api-reference/billing/get-billing-metadata](https://docs.contextual.ai/api-reference/billing/get-billing-metadata)
**Ingestion Config Override (API Only)**
You can now override default datastore configurations per document via API to fine-tune ingestion behavior.
This API-only feature lets you apply document-specific ingestion settings without modifying the global datastore configuration.
This feature is useful for documents that require customized handling—such as specialized chunking, parsing, or classification. The update affects the Ingest and List Documents endpoints.
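As an illustrative sketch (the override field names and form-field shape here are assumptions, not the documented schema), a per-document parsing override might be attached to an ingest request like this:

```python
import json

# Hypothetical per-document override: this one document gets a custom
# image-captioning prompt while the datastore keeps its defaults.
override = {
    "parsing": {
        "figure_caption_prompt": "Describe each chart's axes and trend.",
    },
}

# The override travels with the file in the ingest request; JSON-encode it
# so it can be sent as a form field alongside the upload.
form_fields = {"configuration": json.dumps(override)}
print(form_fields["configuration"])
```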
### 9/15/2025
**Chunk Viewer UI Improvements**
The Chunk Viewer received multiple UX enhancements, including shareable URLs, clearer file context, improved previews, and easier text editing.
We’ve refreshed the Chunk Viewer to simplify navigation and collaboration. You can now share direct links to specific documents and chunks, see the file name at the top of the page, and see a formatted “Preview” as the default view. We’ve also improved visibility for image captioning and added an Edit button to update chunk text more easily.
***
## August
### 8/27/2025
**Support for Templates**
Teams can now create and apply templates when building new agents.
Template support is now live. You can select from existing templates or save an agent’s configuration as a reusable template for future agents. This accelerates setup, ensures consistency across deployments, and simplifies onboarding for new team members.
### 8/22/2025
**Datastore Updates**
Datastore management has been enhanced with new filtering tools, improved search, and expanded model options for document processing.
We’ve added several usability improvements to streamline datastore operations. You can now filter documents by status with one click (Processed, Processing, Failed), search by filename (prefix search), and customize filters or sort by creation date, status, or name.
We’ve also introduced new swappable models—layout, image captioning, hierarchy, and document-naming models—to improve processing accuracy and support a wider variety of document types. These settings are available in datastore configuration and help optimize ingestion pipelines.
### 8/20/2025
**GPT-5 available as a Generation Model**
Agents can now use GPT-5 as their generation model for production workloads.
We’ve added GPT-5 as an option for all agents, giving teams access to the latest generation capabilities for reasoning, summarization, and content generation. This upgrade improves accuracy and output quality across a wide range of use cases.
***
## June
### 6/30/2025
**Page-level Chunking in Datastore Configuration Options**\
Contextual AI now supports a new page-level chunking mode that preserves slide and page boundaries for more accurate, context-aware retrieval in RAG workflows.
Page-level chunking mode optimizes parsing for page-boundary-sensitive
documents. Instead of splitting content purely by size or heading structure,
this mode keeps each page as its own chunk, preserving page boundaries at
retrieval time, unless the maximum chunk size is exceeded.
This is particularly effective for slide decks, reports, and other
page-oriented content, where the meaning is closely tied to individual
pages.
Page-level chunking joins existing segmentation options including
heading-depth, heading-greedy, and
simple-length.
To enable, set `chunking_mode = "page"` when configuring a
datastore via the ingest document API or via the UI.
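A minimal configuration sketch (the surrounding field nesting and the datastore name are illustrative assumptions; `chunking_mode = "page"` is the documented setting):

```python
import json

# Datastore configuration sketch selecting page-level chunking.
datastore_config = {
    "name": "slide-decks",                    # hypothetical datastore name
    "configuration": {
        "chunking": {
            "chunking_mode": "page",          # keep each page as one chunk
            "max_chunk_length_tokens": 1024,  # split only pages beyond this
        }
    },
}

print(json.dumps(datastore_config, indent=2))
```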
### 6/2/2025
**Query Reformulation & Decomposition**\
Contextual AI now supports query reformulation and decomposition, enabling agents to rewrite, clarify, and break down complex or ambiguous user queries.
Query reformulation allows agents to rewrite or expand user queries to
better match the vocabulary and structure of your corpus. This is essential
when user queries are ambiguous, underspecified, or contain terminology not
aligned with the domain.
Decomposition automatically determines whether a query should be split into
smaller sub-queries. Each sub-query undergoes its own retrieval step before
results are merged into a final ranked set.
Common reformulation use cases include:
- Aligning queries with domain-specific terminology
- Making implicit references explicit
- Adding metadata or contextual tags to guide retrieval
Enable these features via Query Reformulation in the agent
settings UI, or via the Agent API.
***
## May
### 5/29/2025
**Optimize parsing and chunking strategies via Datastore configuration**\
Contextual AI has released new advanced datastore configuration options that let developers fine-tune parsing, chunking, and document processing workflows to produce highly optimized, use-case-specific RAG-ready outputs.
Today, Contextual AI announces the release of advanced datastore
configuration options, enabling developers to optimize document processing
for RAG-ready outputs tailored to their specific use cases and document
types.
Clients can now customize parsing and chunking workflows to maximize RAG
performance. Configure heading-depth chunking for granular hierarchy
context, use custom prompts for domain-specific image captioning, enable
table splitting for complex structured documents, and set precise token
limits to optimize retrieval quality.
These configuration options ensure your documents are processed optimally
for your RAG system – whether you’re working with technical manuals
requiring detailed hierarchical context, visual-heavy documents needing
specialized image descriptions, or structured reports with complex tables.
To get started, simply use our updated Agent API and datastore UI with the
new configuration parameters to customize parsing and chunking behavior for
your specific documents and use cases.
### 5/20/2025
**Chunk viewer for document inspection**\
Contextual AI introduces the Chunk Inspector, a visual debugging tool that lets developers inspect and validate document parsing and chunking results to ensure their content is fully RAG-ready.
Today, Contextual AI announces the release of the Chunk Inspector, a visual
debugging tool that allows developers to examine and validate document
parsing and chunking results.
Clients can now inspect how their documents are processed through our
extraction pipeline, viewing rendered metadata, extracted text, tables or
image captioning results for each chunk. This transparency enables
developers to diagnose extraction issues, optimize chunking configurations,
and ensure their documents are properly RAG-ready before deployment.
The Chunk Inspector provides immediate visibility into how your datastore
configuration affects document processing, making it easier to fine-tune
parsing and chunking settings for optimal retrieval performance.
To get started, simply navigate to the Chunk Inspector in your datastore UI
after ingesting a document to review the extraction and chunking results.
### 5/13/2025
**Document Parser for RAG now Generally Available**\
Contextual AI has launched a new Document Parser for RAG, a powerful /parse API that delivers highly accurate, hierarchy-aware understanding of large enterprise documents—dramatically improving retrieval quality across complex text, tables, and diagrams.
Today, Contextual AI announces the Document Parser for RAG with our
separate /parse component API, enabling enterprise AI agents to navigate and
understand large and complex documents with superior accuracy and context
awareness.
The document parser excels at handling enterprise documents through three
key innovations: document-level understanding that captures section
hierarchies across hundreds of pages, minimized hallucinations with
confidence levels for table extraction, and superior handling of complex
modalities such as technical diagrams, charts, and large tables. In testing
with SEC filings, including document hierarchy metadata in chunks increased
the equivalence score from 69.2% to 84.0%, demonstrating significant
improvements in end-to-end RAG performance.
Get started today for free by creating a Contextual AI account. Visit the
Components tab to use the Parse UI playground, or get an API key and call
the API directly. We provide credits for the first 500+ pages in Standard
mode (for complex documents that require VLMs and OCR), and you can buy
additional credits as your needs grow. To request custom rate limits and
pricing, please contact us. If you have any feedback or need support, please
email [parse-feedback@contextual.ai](mailto:parse-feedback@contextual.ai).
***
## March
### 3/24/2025
**Groundedness scoring of model responses now Generally Available**\
Contextual AI now offers groundedness scoring, a feature that evaluates how well each part of an agent’s response is supported by retrieved knowledge, helping developers detect and manage ungrounded or potentially hallucinated claims with precision.
Today, Contextual AI launched groundedness scoring for model responses.
Ensuring that agent responses are supported by retrieved knowledge is
essential for RAG applications. While Contextual’s Grounded Language Models
already produce highly grounded responses, groundedness scoring adds an
extra layer of defense against hallucinations and factual errors.
When users query an agent with groundedness scores enabled, a specialized
model automatically evaluates how well claims made in the response are
supported by the knowledge. Scores are reported for individual text spans
allowing for precise detection of unsupported claims. In the platform
interface, the score for each text span is viewable upon hover and
ungrounded claims are visually distinguished from grounded ones. Scores are
also returned in the API, enabling developers to build powerful
functionality with ease, like hiding ungrounded claims or adding caveats to
specific sections of a response.
To get started, simply toggle “Enable Groundedness Scores” for an agent in
the “Generation” section of the agent configuration page, or through the
agent creation or edit API. Groundedness scores will automatically be
generated and displayed in the UI, and returned as part of responses to
`/agent/{agent_id}/query` requests.
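As a hedged sketch of building on the API-returned scores (the response field names below are illustrative assumptions; consult the query API reference for the exact schema), a client could flag low-scoring spans like this:

```python
# Hypothetical response fragment from /agent/{agent_id}/query with
# groundedness scoring enabled; span boundaries index into the text.
response = {
    "message": {"content": "Revenue grew 12% in Q4. Growth may continue."},
    "groundedness_scores": [
        {"start_index": 0, "end_index": 23, "score": 99},
        {"start_index": 24, "end_index": 44, "score": 12},
    ],
}

THRESHOLD = 50  # treat spans scored below this as ungrounded

text = response["message"]["content"]
flagged = [
    text[span["start_index"]:span["end_index"]]
    for span in response["groundedness_scores"]
    if span["score"] < THRESHOLD
]
print(flagged)
```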
### 3/21/2025
**Metadata ingestion & document filtering**\
Contextual AI now supports document-level metadata ingestion and metadata-based filtering, enabling developers to target queries by attributes like author, date, department, or custom fields for more precise and relevant retrieval.
Today, Contextual AI announces the release of document metadata ingestion,
along with metadata-based filtering during queries.
Clients can now narrow search results using document properties like author,
date, department, or any custom metadata fields, delivering more precise and
contextually relevant responses.
To get started, simply use our ingest document and update document metadata
APIs to add metadata to documents. Once done, use our document filter in the
query API to filter down results.
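A sketch of the two halves of this workflow (the metadata keys and filter operators shown are illustrative assumptions):

```python
import json

# 1) Custom metadata attached at ingest time or via the update-metadata API
#    (field names here are illustrative).
document_metadata = {"custom_metadata": {"department": "finance", "year": 2024}}

# 2) A query-time filter restricting retrieval to matching documents.
query_payload = {
    "messages": [{"role": "user", "content": "Summarize Q4 spend"}],
    "documents_filters": {
        "operator": "AND",
        "filters": [
            {"field": "department", "operator": "equals", "value": "finance"},
        ],
    },
}

print(json.dumps(query_payload["documents_filters"]))
```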
· · ·
**Document format support expansion: DOC(X) and PPT(X)**\
Contextual AI now supports ingesting DOC(X) and PPT(X) files, allowing RAG agents to seamlessly use Microsoft Office documents as part of their retrieval corpus.
Today, Contextual AI announces support for ingesting DOC(X) and PPT(X)
files into datastores.
This enables clients to leverage Microsoft Office documents directly in
their RAG agents, expanding the range of content they can seamlessly
incorporate.
To get started, use our document API or our user interface to ingest new files.
### 3/17/2025
**Filtering by reranker relevance score now Generally Available**\
Contextual AI now allows users to filter retrieved chunks by reranker relevance score, giving them more precise control over which chunks are used during response generation via a new `reranker_score_filter_threshold` setting in the Agent APIs and UI.
Today, Contextual AI announces support for filtering retrieved chunks based
on the relevance score assigned by the reranker.
The ability to filter chunks based on relevance score gives users more
precision and control in ensuring that only the most relevant chunks are
considered during response generation. It is an effective alternative or
complement to using the `filter_prompt` for a separate filtering LLM.
To get started, use the `reranker_score_filter_threshold` parameter in the
Create/Edit Agent APIs and in the UI.
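For instance (a minimal sketch; the section name wrapping the parameter is an assumption to verify against the API reference):

```python
# Agent edit payload sketch: drop any chunk the reranker scores below 0.5.
agent_update = {
    "agent_configs": {
        "filter_and_rerank_config": {                 # assumed section name
            "reranker_score_filter_threshold": 0.5,   # higher = stricter
        }
    }
}

threshold = agent_update["agent_configs"]["filter_and_rerank_config"][
    "reranker_score_filter_threshold"
]
print(threshold)
```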
### 3/11/2025
**Instruction-following reranker now Generally Available**\
Contextual AI has released the world’s first instruction-following reranker, a state-of-the-art model that lets users provide natural-language ranking instructions to improve retrieval relevance and response accuracy, now available in agents and as a standalone `/rerank` component API.
Today, Contextual AI announces the world’s first instruction-following
reranker, available in both agents and as a separate /rerank component API.
The instruction-following reranker enables users to specify natural language
instructions about how the reranker should rank retrievals, which improves
accuracy in reranking and response generation. The reranker ranks documents
according to their relevance to the query first and your custom instructions
secondarily. We evaluated the model on instructions for recency, document
type, source, and metadata, and it can generalize to other instructions as
well. For instructions related to recency and timeframe, specify an explicit
timeframe (e.g., “in 2024” rather than “this year”), because the reranker
doesn’t know the current date. The reranker is state-of-the-art on the
industry-standard BEIR benchmark, as well as our internal benchmarks.
To get started for free with the `/rerank` component API, create a Contextual
AI account, visit the Getting Started tab, and either get an API key for the
`/rerank` API or use the `/rerank` UI playground. We provide credits for the
first 50M tokens, and you can buy additional credits as your needs grow. To
request custom rate limits and pricing, please contact us. If you have any
feedback or need support, please email [reranker-feedback@contextual.ai](mailto:reranker-feedback@contextual.ai).
This reranker is the default for new agents created with the Contextual AI
platform. To specify instructions, use the `reranker_instruction` parameter in
the Create/Edit Agent APIs and in the UI. See blog post for more details.
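A hedged sketch of a `/rerank` request body (the model id is a placeholder, and the exact field names should be checked against the API reference):

```python
# /rerank request sketch: documents are ranked by query relevance first,
# then by the natural-language instruction as a secondary criterion.
rerank_request = {
    "query": "What changed in the annual filing?",
    "instruction": "Prefer documents dated 2024 over earlier years",
    "documents": [
        "2021 annual report: revenue summary ...",
        "2024 annual report: revenue summary ...",
    ],
    "model": "rerank-v1",  # placeholder model id; check the API reference
}

print(len(rerank_request["documents"]), "documents to rerank")
```

Note the instruction names an explicit year, following the guidance above that the reranker doesn't know the current date.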
### 3/4/2025
**Grounded Language Model now Generally Available**\
Contextual AI has introduced the Grounded Language Model (GLM), a highly faithful RAG-optimized LLM that prioritizes retrieved knowledge over parametric knowledge, supports optional commentary control, and is now available both as the default agent model and through a standalone `/generate` API.
Today, Contextual AI announces the Grounded Language Model (GLM), the most
grounded language model in the world, available in both agents and as a
separate /generate component API.
The GLM is an LLM that is engineered specifically to prioritize faithfulness
to the retrieved knowledge over parametric knowledge to reduce
hallucinations in Retrieval-Augmented Generation. Uniquely, the model
distinguishes between facts and commentary that it generates, and users can
toggle an `avoid_commentary` flag to determine whether the model can include
commentary in its response or not.
To get started for free with the `/generate` component API, create a
Contextual AI account, visit the Getting Started tab, and either get an API
key for the `/generate` standalone API or use the `/generate` UI playground. We
provide credits for the first 1M input and 1M output tokens, and you can buy
additional credits as your needs grow. To request custom rate limits and
pricing, please contact us here. If you have any feedback or need support,
please email [glm-feedback@contextual.ai](mailto:glm-feedback@contextual.ai).
The GLM is already the default model in new agents created with the
Contextual AI platform. See blog post for more details.
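A hedged sketch of a `/generate` request body (the model id is a placeholder, and field names should be checked against the API reference):

```python
# /generate request sketch: grounded generation over supplied knowledge,
# with commentary suppressed via the avoid_commentary flag.
generate_request = {
    "model": "glm-v1",  # placeholder GLM identifier; check the API reference
    "messages": [{"role": "user", "content": "When was the policy updated?"}],
    "knowledge": ["Policy doc: last updated March 2024."],
    "avoid_commentary": True,  # strictly factual output, no added analysis
}

print(generate_request["avoid_commentary"])
```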
### 3/3/2025
**Advanced parameters now Generally Available**\
Contextual AI now offers advanced agent configuration parameters that let you fine-tune retrieval, reranking, filtering, and generation behaviors, giving you precise control over how your RAG agents search, select, filter, and generate responses for your specific use cases.
Today, Contextual AI announces the availability of advanced parameters in
agent creation and editing.
With these parameters, you can control more granular aspects of your
specialized RAG agent across retrieval, reranking, filtering and generation.
In particular, you can:
- Fine-tune retrieval relevance by adjusting lexical and semantic search weightings, helping you balance keyword precision with conceptual matching
- Optimize chunk selection by configuring both the number of retrieved chunks and reranked chunks, allowing you to maximize relevance while managing context window usage
- Customize filtering criteria with your own custom filter prompt, enabling you to implement domain-specific relevance rules
- Control response generation with precise `temperature`, `top_p`, and frequency penalty settings, giving you the flexibility to balance consistency and creativity in answers
These controls empower you to optimize your agent according to use
case-specific requirements with much greater precision, ultimately
delivering more accurate and relevant responses to your users’ queries.
To get started, use the `agent_configs` object in the Create/Edit Agent APIs.
You can also change these parameters by editing the agent in the UI. They
are subject to change.
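The controls above can be sketched as one `agent_configs` object (section and field names are assumptions to be verified against the Create/Edit Agent API reference):

```python
# agent_configs sketch mirroring the four control areas; section and field
# names are assumptions, not the documented schema.
agent_configs = {
    "retrieval_config": {
        "lexical_alpha": 0.3,          # weight on keyword (lexical) search
        "semantic_alpha": 0.7,         # weight on embedding (semantic) search
        "top_k_retrieved_chunks": 50,  # chunks fetched before reranking
    },
    "filter_and_rerank_config": {
        "top_k_reranked_chunks": 10,   # chunks kept after reranking
        "filter_prompt": "Keep only chunks about pricing or contracts.",
    },
    "generate_response_config": {
        "temperature": 0.2,
        "top_p": 0.9,
        "frequency_penalty": 0.0,
    },
}

print(agent_configs["retrieval_config"]["top_k_retrieved_chunks"])
```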
***
## February
### 2/10/2025
**Agent-level entitlements now Generally Available**\
Contextual AI now supports agent-level entitlements, allowing administrators to assign per-user access rights and define fine-grained permission policies for agents directly from the platform’s Permissions page.
Today, Contextual AI announced support for Agent-level entitlements.
With this release, customers can now configure access rights per-user to
agents. Using the Permissions page on our platform UI, administrators can
define access policies for their entire tenant or grant individual access to
specific agents to specific users.
· · ·
**Users API now Generally Available**\
Contextual AI has introduced a new Users API that allows administrators to programmatically create, view, update, and remove end-user accounts, complementing existing user management in the platform UI.
Today, Contextual AI announces the release of our new Users API to help
customers manage their end-users on the platform.
Administrators can now programmatically manage user accounts through the
Users API. This includes creating new users, describing users, updating user
information, and removing users. In addition to the Users API, customers can
also manage their end-users through the platform UI.
To get started, use the Users API today on the platform. Learn more here.
· · ·
**Metrics API now Generally Available**\
Contextual AI has launched a new Metrics API that gives developers programmatic access to agent query and feedback data, enabling automated analysis, reporting, and alerting based on real user interactions.
Today, Contextual AI announces the release of our new Metrics API. This
endpoint provides programmatic access to an agent’s query and feedback data
from end-users.
With the Metrics API, developers can analyze usage and feedback, automate
reporting, and set up alerts. The Metrics API returns data such as user and
message information, query, response, feedback, and user-submitted details
or issues with generated responses.
To get started, use the Metrics API today on the platform. Learn more here.
· · ·
**Multi-turn conversations now available in Private Preview**\
Contextual AI now supports multi-turn chat, enabling agents to use prior conversation history and retrieved knowledge to interpret follow-up questions, resolve ambiguities, and generate more contextually grounded answers.
Today, Contextual AI announced support for multi-turn chat conversations.
With this release, agents can rely on prior conversational history and
retrieved knowledge. When users ask follow-up questions, agents can
automatically use information from prior turns in the conversation to
resolve ambiguities in the query, fetch the appropriate retrievals, and
generate the final answer.
This feature is currently available in private preview. Contact us to get access.
· · ·
**Many-to-many mapping between agents and datastores now Generally Available**\
Contextual AI now supports many-to-many connections between agents and datastores, allowing multiple agents to access multiple datastores for more flexible, efficient, and scalable RAG workflows.
Today, Contextual AI announced support for many-to-many mapping between
agents and datastores. Multiple datastores can now connect to multiple
agents, enabling more flexible and efficient ways to build specialized RAG
agents.
With many-to-many mapping, you can now connect multiple agents to multiple
datastores, eliminating data silos and allowing for cross-datastore access. As
your system grows, agents can interact with any datastore without manual
duplication or re-uploading, ensuring faster access and better efficiency.
To get started, build your first datastore today on the platform. Learn more
here.
· · ·
**Support for multimodal data is now in Private Preview**\
Contextual AI can now extract and reason over charts, graphs, and other visual elements within PDFs, enabling agents to answer questions based on both text and image content.
Today, Contextual AI announced support for reasoning over images in unstructured data.
Our document understanding engine can now extract and reason over charts,
graphs, and other visual elements within PDF files. In addition to text,
your agents can now answer queries based on the content of images in PDF
files.
This feature is currently available in private preview. Contact us to get access.
· · ·
**Support for structured data is now in Private Preview**\
Contextual AI now supports natural-language querying of structured data from connected enterprise databases—like Snowflake, Redshift, BigQuery, and PostgreSQL—by letting agents generate and run SQL directly through new datastore database connections.
Today, Contextual AI announced support for retrieving structured data from databases.
Our platform now connects directly to your enterprise data sources, enabling
natural language queries across structured data stored in Snowflake,
Redshift, BigQuery, and PostgreSQL databases. Simply ask questions in plain
English and your AI agent will generate and execute the appropriate SQL
queries to retrieve your data. Customers can now add database connections to
their Contextual Datastores.
This feature is currently available in private preview. Contact us to get access.
***
# Node.js SDK
Source: https://docs.contextual.ai/sdks/node
# Python SDK
Source: https://docs.contextual.ai/sdks/python