June
6/30/2025
Page-level Chunking in Datastore Configuration Options
Contextual AI now supports a new page-level chunking mode that preserves slide and page boundaries for more accurate, context-aware retrieval in RAG workflows.
Page-level chunking mode optimizes parsing for page-boundary-sensitive documents. Instead of splitting content purely by size or heading structure, this mode keeps each page as its own chunk unless the maximum chunk size is exceeded, preserving page boundaries for retrieval.
This is particularly effective for slide decks, reports, and other page-oriented content, where the meaning is closely tied to individual pages.
Page-level chunking joins existing segmentation options including heading-depth, heading-greedy, and simple-length.
To enable, set chunking_mode = "page" when configuring a datastore via the ingest document API or via the UI.
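A minimal sketch of enabling page-level chunking at ingestion time, using Python and requests. The base URL, endpoint path, and multipart field names are assumptions for illustration; chunking_mode = "page" is the documented setting.

```python
import os
import requests

API_KEY = os.environ["CONTEXTUAL_API_KEY"]
BASE_URL = "https://api.contextual.ai/v1"   # assumed base URL
DATASTORE_ID = "your-datastore-id"

# Ingest a slide deck with page-level chunking: each slide/page becomes its
# own chunk unless the maximum chunk size is exceeded.
with open("quarterly_review.pptx", "rb") as f:
    resp = requests.post(
        f"{BASE_URL}/datastores/{DATASTORE_ID}/documents",  # assumed path
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"file": f},
        data={"chunking_mode": "page"},
    )
resp.raise_for_status()
print(resp.json())
```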
6/2/2025
Query Reformulation & Decomposition
Contextual AI now supports query reformulation and decomposition, enabling agents to rewrite, clarify, and break down complex or ambiguous user queries.
Query reformulation allows agents to rewrite or expand user queries to better match the vocabulary and structure of your corpus. This is essential when user queries are ambiguous, underspecified, or contain terminology not aligned with the domain.
Decomposition automatically determines whether a query should be split into smaller sub-queries. Each sub-query undergoes its own retrieval step before results are merged into a final ranked set.
Common reformulation use cases include:
- Aligning queries with domain-specific terminology
- Making implicit references explicit
- Adding metadata or contextual tags to guide retrieval
Enable these features via Query Reformulation in the agent settings UI, or via the Agent API.
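A hedged sketch of turning these features on through the Agent API. The endpoint path and the flag names inside agent_configs below are illustrative assumptions, not the documented schema; consult the Agent API reference for the exact field names.

```python
import os
import requests

API_KEY = os.environ["CONTEXTUAL_API_KEY"]
BASE_URL = "https://api.contextual.ai/v1"   # assumed base URL
AGENT_ID = "your-agent-id"

# Hypothetical flags shown for illustration; the real parameter names may differ.
resp = requests.put(
    f"{BASE_URL}/agents/{AGENT_ID}",        # assumed edit-agent path
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "agent_configs": {
            "enable_query_reformulation": True,   # rewrite/expand ambiguous queries
            "enable_query_decomposition": True,   # split complex queries into sub-queries
        }
    },
)
resp.raise_for_status()
```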
May
5/29/2025
Optimize parsing and chunking strategies via Datastore configuration
Contextual AI has released new advanced datastore configuration options that let developers fine-tune parsing, chunking, and document processing workflows to produce highly optimized, use-case-specific RAG-ready outputs.
Today, Contextual AI announces the release of advanced datastore configuration options, enabling developers to optimize document processing for RAG-ready outputs tailored to their specific use cases and document types.
Clients can now customize parsing and chunking workflows to maximize RAG performance. Configure heading-depth chunking for granular hierarchy context, use custom prompts for domain-specific image captioning, enable table splitting for complex structured documents, and set precise token limits to optimize retrieval quality.
These configuration options ensure your documents are processed optimally for your RAG system – whether you’re working with technical manuals requiring detailed hierarchical context, visual-heavy documents needing specialized image descriptions, or structured reports with complex tables.
To get started, simply use our updated Agent API and datastore UI with the new configuration parameters to customize parsing and chunking behavior for your specific documents and use cases.
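A sketch of what such a configuration might look like when ingesting a document. The endpoint path and every field name below are illustrative assumptions standing in for the documented controls (heading-depth chunking, custom caption prompts, table splitting, token limits); check the API reference for the exact parameter names.

```python
import os
import requests

API_KEY = os.environ["CONTEXTUAL_API_KEY"]
BASE_URL = "https://api.contextual.ai/v1"   # assumed base URL
DATASTORE_ID = "your-datastore-id"

# Illustrative configuration; exact parameter names live in the API reference.
config = {
    "chunking_mode": "heading_depth",        # chunk along the heading hierarchy
    "max_chunk_length_tokens": "512",        # precise token limit per chunk
    "enable_table_splitting": "true",        # split large tables across chunks
    "figure_caption_prompt": "Describe this engineering diagram in detail.",
}

with open("technical_manual.pdf", "rb") as f:
    resp = requests.post(
        f"{BASE_URL}/datastores/{DATASTORE_ID}/documents",  # assumed path
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"file": f},
        data=config,
    )
resp.raise_for_status()
```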
5/20/2025
Chunk viewer for document inspection
Contextual AI introduces the Chunk Inspector, a visual debugging tool that lets developers inspect and validate document parsing and chunking results to ensure their content is fully RAG-ready.
Today, Contextual AI announces the release of the Chunk Inspector, a visual debugging tool that allows developers to examine and validate document parsing and chunking results.
Clients can now inspect how their documents are processed through our extraction pipeline, viewing rendered metadata, extracted text, tables or image captioning results for each chunk. This transparency enables developers to diagnose extraction issues, optimize chunking configurations, and ensure their documents are properly RAG-ready before deployment.
The Chunk Inspector provides immediate visibility into how your datastore configuration affects document processing, making it easier to fine-tune parsing and chunking settings for optimal retrieval performance.
To get started, simply navigate to the Chunk Inspector in your datastore UI after ingesting a document to review the extraction and chunking results.
5/13/2025
Document Parser for RAG now Generally Available
Contextual AI has launched a new Document Parser for RAG, a powerful /parse API that delivers highly accurate, hierarchy-aware understanding of large enterprise documents, dramatically improving retrieval quality across complex text, tables, and diagrams.
Today, Contextual AI announces the Document Parser for RAG with our separate /parse component API, enabling enterprise AI agents to navigate and understand large and complex documents with superior accuracy and context awareness.
The document parser excels at handling enterprise documents through three key innovations: document-level understanding that captures section hierarchies across hundreds of pages, minimized hallucinations with confidence levels for table extraction, and superior handling of complex modalities such as technical diagrams, charts, and large tables. In testing with SEC filings, including document hierarchy metadata in chunks increased the equivalence score from 69.2% to 84.0%, demonstrating significant improvements in end-to-end RAG performance.
Get started today for free by creating a Contextual AI account. Visit the Components tab to use the Parse UI playground, or get an API key and call the API directly. We provide credits for the first 500+ pages in Standard mode (for complex documents that require VLMs and OCR), and you can buy additional credits as your needs grow. To request custom rate limits and pricing, please contact us. If you have any feedback or need support, please email parse-feedback@contextual.ai.
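A minimal sketch of calling the /parse API directly with Python. The /parse endpoint and Standard mode come from the announcement; the base URL, multipart field names, and response shape are assumptions.

```python
import os
import requests

API_KEY = os.environ["CONTEXTUAL_API_KEY"]
BASE_URL = "https://api.contextual.ai/v1"   # assumed base URL

# Parse a complex filing in Standard mode (VLM + OCR); field names are assumed.
with open("10-K.pdf", "rb") as f:
    resp = requests.post(
        f"{BASE_URL}/parse",
        headers={"Authorization": f"Bearer {API_KEY}"},
        files={"raw_file": f},
        data={"parse_mode": "standard"},
    )
resp.raise_for_status()
result = resp.json()
# Depending on the API, this may be the parsed document itself or a job id to poll.
print(result)
```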
March
3/24/2025
Groundedness scoring of model responses now Generally Available
Contextual AI now offers groundedness scoring, a feature that evaluates how well each part of an agent’s response is supported by retrieved knowledge, helping developers detect and manage ungrounded or potentially hallucinated claims with precision.
Today, Contextual AI launched groundedness scoring for model responses.
Ensuring that agent responses are supported by retrieved knowledge is essential for RAG applications. While Contextual’s Grounded Language Models already produce highly grounded responses, groundedness scoring adds an extra layer of defense against hallucinations and factual errors.
When users query an agent with groundedness scores enabled, a specialized model automatically evaluates how well claims made in the response are supported by the knowledge. Scores are reported for individual text spans allowing for precise detection of unsupported claims. In the platform interface, the score for each text span is viewable upon hover and ungrounded claims are visually distinguished from grounded ones. Scores are also returned in the API, enabling developers to build powerful functionality with ease, like hiding ungrounded claims or adding caveats to specific sections of a response.
To get started, simply toggle “Enable Groundedness Scores” for an agent in the “Generation” section of the agent configuration page, or through the agent creation or edit API. Groundedness scores will automatically be generated and displayed in the UI, and returned as part of responses to /agent/{agent_id}/query requests.
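A sketch of reading the per-span scores from a query response. Only the /agent/{agent_id}/query path comes from the announcement; the request body shape and the response fields used below are assumptions for illustration.

```python
import os
import requests

API_KEY = os.environ["CONTEXTUAL_API_KEY"]
BASE_URL = "https://api.contextual.ai/v1"   # assumed base URL
AGENT_ID = "your-agent-id"

resp = requests.post(
    f"{BASE_URL}/agent/{AGENT_ID}/query",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"messages": [{"role": "user", "content": "What drove Q3 revenue growth?"}]},  # assumed body shape
)
resp.raise_for_status()
body = resp.json()

# Hypothetical response fields: each scored span covers a slice of the answer text.
for span in body.get("groundedness_scores", []):
    if span.get("score", 1.0) < 0.5:
        print("Possibly ungrounded:", span)
```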
3/21/2025
Metadata ingestion & document filtering
Contextual AI now supports document-level metadata ingestion and metadata-based filtering, enabling developers to target queries by attributes like author, date, department, or custom fields for more precise and relevant retrieval.
Today, Contextual AI announces the release of document metadata ingestion, along with metadata-based filtering during queries.
Clients can now narrow search results using document properties like author, date, department, or any custom metadata fields, delivering more precise and contextually relevant responses.
To get started, simply use our ingest document and update document metadata APIs to add metadata to documents. Once done, use our document filter in the query API to filter down results.
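A sketch of the two halves of that workflow: attaching metadata to a document, then filtering on it at query time. The endpoint paths, payload shapes, and filter syntax below are assumptions for illustration.

```python
import os
import requests

API_KEY = os.environ["CONTEXTUAL_API_KEY"]
BASE_URL = "https://api.contextual.ai/v1"   # assumed base URL
headers = {"Authorization": f"Bearer {API_KEY}"}
DATASTORE_ID, DOCUMENT_ID, AGENT_ID = "ds-id", "doc-id", "agent-id"

# 1) Attach custom metadata to an already-ingested document (assumed path and shape).
requests.post(
    f"{BASE_URL}/datastores/{DATASTORE_ID}/documents/{DOCUMENT_ID}/metadata",
    headers=headers,
    json={"custom_metadata": {"department": "finance", "year": 2024}},
).raise_for_status()

# 2) Restrict retrieval to matching documents at query time (assumed filter syntax).
resp = requests.post(
    f"{BASE_URL}/agent/{AGENT_ID}/query",
    headers=headers,
    json={
        "messages": [{"role": "user", "content": "Summarize the 2024 budget."}],
        "documents_filters": {"field": "department", "operator": "equals", "value": "finance"},
    },
)
resp.raise_for_status()
```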
Contextual AI now supports ingesting DOC(X) and PPT(X) files, allowing RAG agents to seamlessly use Microsoft Office documents as part of their retrieval corpus.
Today, Contextual AI announces support for ingesting DOC(X) and PPT(X) files into datastores.
This enables clients to leverage Microsoft Office documents directly in their RAG agents, expanding the range of content they can seamlessly incorporate.
To get started, use our document API or our user interface to ingest new files.
3/17/2025
Filtering by reranker relevance score now Generally Available
Contextual AI now allows users to filter retrieved chunks by reranker relevance score, giving them more precise control over which chunks are used during response generation via a new reranker_score_filter_threshold setting in the Agent APIs and UI.
Today, Contextual AI announces support for filtering retrieved chunks based on the relevance score assigned by the reranker.
The ability to filter chunks based on relevance score gives users more precision and control in ensuring that only the most relevant chunks are considered during response generation. It is an effective alternative or complement to using the filter_prompt for a separate filtering LLM.
To get started, use the reranker_score_filter_threshold parameter in the Create/Edit Agent APIs and in the UI.
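A sketch of setting the threshold through the Edit Agent API. Only reranker_score_filter_threshold is the documented name; the endpoint path and its nesting under agent_configs are assumptions.

```python
import os
import requests

API_KEY = os.environ["CONTEXTUAL_API_KEY"]
BASE_URL = "https://api.contextual.ai/v1"   # assumed base URL
AGENT_ID = "your-agent-id"

# Drop any retrieved chunk whose reranker relevance score falls below 0.4.
resp = requests.put(
    f"{BASE_URL}/agents/{AGENT_ID}",        # assumed edit-agent path
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"agent_configs": {"reranker_score_filter_threshold": 0.4}},  # nesting assumed
)
resp.raise_for_status()
```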
3/11/2025
Instruction-following reranker now Generally Available
Contextual AI has released the world’s first instruction-following reranker, a state-of-the-art model that lets users provide natural-language ranking instructions to improve retrieval relevance and response accuracy, now available in agents and as a separate /rerank component API.
Today, Contextual AI announces the world’s first instruction-following reranker, available in both agents and as a separate /rerank component API.
The instruction-following reranker enables users to specify natural language instructions about how the reranker should rank retrievals, which improves accuracy in reranking and response generation. The reranker ranks documents according to their relevance to the query first and your custom instructions second. We evaluated the model on instructions for recency, document type, source, and metadata, and it can generalize to other instructions as well. For instructions related to recency and timeframe, state the timeframe explicitly (e.g., give concrete dates rather than saying “this year”) because the reranker doesn’t know the current date. The reranker is state-of-the-art on the industry-standard BEIR benchmark, as well as on our internal benchmarks.
To get started for free with the /rerank component API, create a Contextual AI account, visit the Getting Started tab, and either get an API key for the /rerank API or use the /rerank UI playground. We provide credits for the first 50M tokens, and you can buy additional credits as your needs grow. To request custom rate limits and pricing, please contact us. If you have any feedback or need support, please email reranker-feedback@contextual.ai.
This reranker is the default for new agents created with the Contextual AI platform. To specify instructions, use the reranker_instruction parameter in the Create/Edit Agent APIs and in the UI. See blog post for more details.
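A sketch of a direct /rerank call with a ranking instruction. The /rerank endpoint and reranker_instruction (for agents) are documented above; the request field names below are assumptions about the component API's shape.

```python
import os
import requests

API_KEY = os.environ["CONTEXTUAL_API_KEY"]
BASE_URL = "https://api.contextual.ai/v1"   # assumed base URL

resp = requests.post(
    f"{BASE_URL}/rerank",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={  # field names assumed for illustration
        "query": "What were capital expenditures in fiscal year 2024?",
        "instruction": "Prioritize 10-K filings from 2024; deprioritize press releases.",
        "documents": [
            "Press release, Jan 2023: ...",
            "10-K filing, fiscal year 2024, Item 7: ...",
        ],
    },
)
resp.raise_for_status()
print(resp.json())  # expected: documents reordered by relevance first, instruction fit second
```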
3/4/2025
Grounded Language Model now Generally Available
Contextual AI has introduced the Grounded Language Model (GLM), a highly faithful RAG-optimized LLM that prioritizes retrieved knowledge over parametric knowledge, supports optional commentary control, and is now available both as the default agent model and through a standalone /generate API.
Today, Contextual AI announces the Grounded Language Model (GLM), the most grounded language model in the world, available in both agents and as a separate /generate component API.
The GLM is an LLM that is engineered specifically to prioritize faithfulness to the retrieved knowledge over parametric knowledge to reduce hallucinations in Retrieval-Augmented Generation. Uniquely, the model distinguishes between facts and commentary that it generates, and users can toggle an avoid_commentary flag to determine whether the model can include commentary in its response or not.
To get started for free with the /generate component API, create a Contextual AI account, visit the Getting Started tab, and either get an API key for the standalone /generate API or use the /generate UI playground. We provide credits for the first 1M input and 1M output tokens, and you can buy additional credits as your needs grow. To request custom rate limits and pricing, please contact us. If you have any feedback or need support, please email glm-feedback@contextual.ai.
The GLM is already the default model in new agents created with the Contextual AI platform. See blog post for more details.
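A sketch of a standalone /generate call. avoid_commentary is the documented flag; the remaining field names (knowledge, messages) are assumptions about the request shape.

```python
import os
import requests

API_KEY = os.environ["CONTEXTUAL_API_KEY"]
BASE_URL = "https://api.contextual.ai/v1"   # assumed base URL

resp = requests.post(
    f"{BASE_URL}/generate",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={  # request shape assumed for illustration
        "knowledge": [
            "Q3 2024 revenue was $412M, up 18% year over year.",
            "Operating margin improved to 14% in Q3 2024.",
        ],
        "messages": [{"role": "user", "content": "How did Q3 2024 go financially?"}],
        "avoid_commentary": True,  # restrict the answer to grounded facts, no editorializing
    },
)
resp.raise_for_status()
print(resp.json())
```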
3/3/2025
Advanced parameters now Generally Available
Contextual AI now offers advanced agent configuration parameters that let you fine-tune retrieval, reranking, filtering, and generation behaviors, giving you precise control over how your RAG agents search, select, filter, and generate responses for your specific use cases.
Today, Contextual AI announces the availability of advanced parameters in agent creation and editing.
With these parameters, you can control more granular aspects of your specialized RAG agent across retrieval, reranking, filtering and generation. In particular, you can:
- Fine-tune retrieval relevance by adjusting lexical and semantic search weightings, helping you balance keyword precision with conceptual matching
- Optimize chunk selection by configuring both the number of retrieved chunks and reranked chunks, allowing you to maximize relevance while managing context window usage
- Customize filtering criteria with your own custom filter prompt, enabling you to implement domain-specific relevance rules
- Control response generation with precise temperature, top_p, and frequency penalty settings, giving you the flexibility to balance consistency and creativity in answers
These controls empower you to optimize your agent according to use case-specific requirements with much greater precision, ultimately delivering more accurate and relevant responses to your users’ queries.
To get started, use the agent_configs object in the Create/Edit Agent APIs. You can also change these parameters by editing the agent in the UI. They are subject to change.
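A sketch of what an agent_configs payload might look like when creating an agent. Only the agent_configs object itself is documented here; the nested parameter names below are illustrative assumptions mapping to the controls listed above (search weighting, chunk counts, filter prompt, and generation settings).

```python
import os
import requests

API_KEY = os.environ["CONTEXTUAL_API_KEY"]
BASE_URL = "https://api.contextual.ai/v1"   # assumed base URL

agent = {
    "name": "support-kb-agent",
    "agent_configs": {  # nested names are illustrative, not the documented schema
        "retrieval": {"lexical_alpha": 0.3, "semantic_alpha": 0.7, "top_k_retrieved_chunks": 50},
        "reranking": {"top_k_reranked_chunks": 10},
        "filtering": {"filter_prompt": "Keep only chunks about the v2 hardware line."},
        "generation": {"temperature": 0.1, "top_p": 0.9, "frequency_penalty": 0.0},
    },
}

resp = requests.post(
    f"{BASE_URL}/agents",                   # assumed create-agent path
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=agent,
)
resp.raise_for_status()
```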
February
2/10/2025
Agent-level entitlements now Generally Available
Contextual AI now supports agent-level entitlements, allowing administrators to assign per-user access rights and define fine-grained permission policies for agents directly from the platform’s Permissions page.
Today, Contextual AI announced support for Agent-level entitlements.
With this release, customers can now configure per-user access rights to agents. Using the Permissions page on our platform UI, administrators can define access policies for their entire tenant or grant specific users access to specific agents.
Contextual AI has introduced a new Users API that allows administrators to programmatically create, view, update, and remove end-user accounts, complementing existing user management in the platform UI.
Today, Contextual AI announces the release of our new Users API to help customers manage their end-users on the platform.
Administrators can now programmatically manage user accounts through the Users API. This includes creating new users, describing users, updating user information, and removing users. In addition to the Users API, customers can also manage their end-users through the platform UI.
To get started, use the Users API today on the platform. Learn more here.
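A sketch of programmatic user management, assuming a /users resource; the path and payload fields below are illustrative, not the documented schema.

```python
import os
import requests

API_KEY = os.environ["CONTEXTUAL_API_KEY"]
BASE_URL = "https://api.contextual.ai/v1"   # assumed base URL
headers = {"Authorization": f"Bearer {API_KEY}"}

# Create a new end-user (assumed path and fields).
requests.post(
    f"{BASE_URL}/users",
    headers=headers,
    json={"email": "analyst@example.com", "roles": ["viewer"]},
).raise_for_status()

# List existing end-users.
resp = requests.get(f"{BASE_URL}/users", headers=headers)
resp.raise_for_status()
print(resp.json())
```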
Contextual AI has launched a new Metrics API that gives developers programmatic access to agent query and feedback data, enabling automated analysis, reporting, and alerting based on real user interactions.
Today, Contextual AI announces the release of our new Metrics API. This endpoint provides programmatic access to an agent’s query and feedback data from end-users.
With the Metrics API, developers can analyze usage and feedback, automate reporting, and set up alerts. The Metrics API returns data such as user and message information, query, response, feedback, and user-submitted details or issues with generated responses.
To get started, use the Metrics API today on the platform. Learn more here.
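A sketch of pulling query and feedback data for one agent, assuming a metrics endpoint scoped to the agent; the path, query parameter, and response fields are illustrative assumptions.

```python
import os
import requests

API_KEY = os.environ["CONTEXTUAL_API_KEY"]
BASE_URL = "https://api.contextual.ai/v1"   # assumed base URL
AGENT_ID = "your-agent-id"

resp = requests.get(
    f"{BASE_URL}/agents/{AGENT_ID}/metrics",          # assumed path
    headers={"Authorization": f"Bearer {API_KEY}"},
    params={"created_after": "2025-01-01T00:00:00Z"}, # assumed filter parameter
)
resp.raise_for_status()

# Expected per the announcement: user/message info, query, response, and feedback.
for record in resp.json().get("messages", []):
    print(record.get("query"), "->", record.get("feedback"))
```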
Contextual AI now supports multi-turn chat, enabling agents to use prior conversation history and retrieved knowledge to interpret follow-up questions, resolve ambiguities, and generate more contextually grounded answers.
Today, Contextual AI announced support for multi-turn chat conversations.
With this release, agents can rely on prior conversational history and retrieved knowledge. When users ask follow-up questions, agents can automatically use information from prior turns in the conversation to resolve ambiguities in the query, fetch the appropriate retrievals, and generate the final answer.
This feature is currently available in private preview. Contact us to get access.
Contextual AI now supports many-to-many connections between agents and datastores, allowing multiple agents to access multiple datastores for more flexible, efficient, and scalable RAG workflows.
Today, Contextual AI announced support for many-to-many mapping between agents and datastores. Multiple datastores can now connect to multiple agents, enabling more flexible and efficient ways to build specialized RAG agents.
With many-to-many mapping, you can now connect multiple agents to multiple datastores, eliminating data silos and allowing cross-datastore access. As your system grows, agents can interact with any datastore without manual duplication or re-uploading, ensuring faster access and better efficiency.
To get started, build your first datastore today on the platform. Learn more here.
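A sketch of attaching several datastores to one agent at creation time; the datastore_ids field and the create-agent path are assumptions about the Create Agent API.

```python
import os
import requests

API_KEY = os.environ["CONTEXTUAL_API_KEY"]
BASE_URL = "https://api.contextual.ai/v1"   # assumed base URL

# One agent reading from several datastores; other agents may reuse the same ones.
resp = requests.post(
    f"{BASE_URL}/agents",                   # assumed create-agent path
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "name": "company-wide-assistant",
        "datastore_ids": ["hr-policies-ds", "engineering-docs-ds", "finance-reports-ds"],  # field name assumed
    },
)
resp.raise_for_status()
```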
Contextual AI can now extract and reason over charts, graphs, and other visual elements within PDFs, enabling agents to answer questions based on both text and image content.
Today, Contextual AI announced support for reasoning over images in unstructured data.
Our document understanding engine can now extract and reason over charts, graphs, and other visual elements within PDF files. In addition to text, your agents can now answer queries based on the content of images in PDF files.
This feature is currently available in private preview. Contact us to get access.
Contextual AI now supports natural-language querying of structured data from connected enterprise databases—like Snowflake, Redshift, BigQuery, and PostgreSQL—by letting agents generate and run SQL directly through new datastore database connections.
Today, Contextual AI announced support for retrieving structured data from databases.
Our platform now connects directly to your enterprise data sources, enabling natural language queries across structured data stored in Snowflake, Redshift, BigQuery, and PostgreSQL databases. Simply ask questions in plain English and your AI agent will generate and execute the appropriate SQL queries to retrieve your data. Customers can now add database connections to their Contextual Datastores.
This feature is currently available in private preview. Contact us to get access.