Overview
Contextual AI’s reranker is the first reranker with instruction-following capabilities, which lets it resolve conflicts between retrieved documents. It is the most accurate reranker in the world per industry-leading benchmarks like BEIR. To learn more about the reranker and its role in RAG pipelines, see our blog.
This how-to guide uses a single example to demonstrate how to call the reranker through the Contextual API directly, our Python SDK, and our LangChain package.
The current reranker models include:
- ctxl-rerank-v2-instruct-multilingual
- ctxl-rerank-v2-instruct-multilingual-mini
- ctxl-rerank-v1-instruct
Global Variables & Examples
First, we set up the global variables and example data used by each implementation method.
# Read the API key from Colab's secret storage (this import only works in Google Colab)
from google.colab import userdata
api_key = userdata.get("API_TOKEN")
base_url = "https://api.contextual.ai/v1"
rerank_api_endpoint = f"{base_url}/rerank"
query = "What is the current enterprise pricing for the RTX 5090 GPU for bulk orders?"
instruction = "Prioritize internal sales documents over market analysis reports. More recent documents should be weighted higher. Enterprise portal content supersedes distributor communications."
documents = [
    "Following detailed cost analysis and market research, we have implemented the following changes: AI training clusters will see a 15% uplift in raw compute performance, enterprise support packages are being restructured, and bulk procurement programs (100+ units) for the RTX 5090 Enterprise series will operate on a $2,899 baseline.",
    "Enterprise pricing for the RTX 5090 GPU bulk orders (100+ units) is currently set at $3,100-$3,300 per unit. This pricing for RTX 5090 enterprise bulk orders has been confirmed across all major distribution channels.",
    "RTX 5090 Enterprise GPU requires 450W TDP and 20% cooling overhead."
]
metadata = [
    "Date: January 15, 2025. Source: NVIDIA Enterprise Sales Portal. Classification: Internal Use Only",
    "TechAnalytics Research Group. 11/30/2023.",
    "January 25, 2025; NVIDIA Enterprise Sales Portal; Internal Use Only"
]
model = "ctxl-rerank-v2-instruct-multilingual"
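The `documents` and `metadata` lists above are parallel: entry `i` of `metadata` describes entry `i` of `documents`. The sketch below is a small sanity check to run before sending a request. It is not part of the Contextual API; the helper name `check_parallel_inputs` is hypothetical, and the requirement of exactly one metadata string per document is an assumption inferred from the example data.

```python
from typing import Sequence


def check_parallel_inputs(documents: Sequence[str], metadata: Sequence[str]) -> None:
    """Raise ValueError if the two lists cannot be paired index by index.

    Hypothetical helper; assumes one metadata string per document.
    """
    if len(documents) != len(metadata):
        raise ValueError(
            "Expected one metadata entry per document, got "
            f"{len(documents)} documents and {len(metadata)} metadata entries."
        )
    for i, doc in enumerate(documents):
        # An empty document gives the reranker nothing to score.
        if not doc.strip():
            raise ValueError(f"Document {i} is empty.")


# Placeholder values for illustration (not the guide's data); passes silently.
check_parallel_inputs(["doc a", "doc b"], ["meta a", "meta b"])
```

Calling `check_parallel_inputs(documents, metadata)` with the lists defined above catches a mismatch locally instead of after a network round trip.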
REST API implementation
import requests
headers = {
    "accept": "application/json",
    "content-type": "application/json",
    "authorization": f"Bearer {api_key}"
}
payload = {
    "query": query,
    "instruction": instruction,
    "documents": documents,
    "metadata": metadata,
    "model": model
}
rerank_response = requests.post(rerank_api_endpoint, json=payload, headers=headers)
print(rerank_response.json())
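The printed JSON is easier to read once each score is mapped back to its input text. The sketch below assumes the response body contains a `results` list whose items carry an `index` into `documents` and a `relevance_score`; check that shape against the API reference before relying on it. The helper name `rank_documents` and the sample response are illustrative, not real API output.

```python
from typing import Any


def rank_documents(
    response_json: dict[str, Any], documents: list[str]
) -> list[tuple[float, str]]:
    """Pair each score with its source text, highest score first.

    Assumes response_json["results"] is a list of dicts with
    "index" and "relevance_score" keys (an assumption, see lead-in).
    """
    results = response_json.get("results", [])
    ranked = [(item["relevance_score"], documents[item["index"]]) for item in results]
    return sorted(ranked, key=lambda pair: pair[0], reverse=True)


# Made-up response for illustration only; real scores will differ.
sample_response = {
    "results": [
        {"index": 1, "relevance_score": 0.20},
        {"index": 0, "relevance_score": 0.90},
    ]
}
sample_documents = ["internal sales memo", "market analysis report"]

for score, text in rank_documents(sample_response, sample_documents):
    print(f"{score:.2f}  {text}")
```

With a live call, pass `rerank_response.json()` and the `documents` list from the setup section instead of the sample values.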
Python SDK
# Install the SDK on first use (%pip is a notebook magic, so run this in Jupyter or Colab)
try:
    from contextual import ContextualAI
except ImportError:
    %pip install contextual-client
    from contextual import ContextualAI
client = ContextualAI(api_key=api_key, base_url=base_url)
rerank_response = client.rerank.create(
    query=query,
    instruction=instruction,
    documents=documents,
    metadata=metadata,
    model=model
)
print(rerank_response.to_dict())
LangChain
# Install the integration package on first use (%pip is a notebook magic)
try:
    from langchain_contextual import ContextualRerank
except ImportError:
    %pip install langchain-contextual
    from langchain_contextual import ContextualRerank
from langchain_core.documents import Document
# Initialize the Contextual reranker via langchain_contextual
compressor = ContextualRerank(
    model=model,
    api_key=api_key,
)
# Prepare metadata in dictionary format for the LangChain Document class
metadata_dict = [
    {
        "Date": "January 15, 2025",
        "Source": "NVIDIA Enterprise Sales Portal",
        "Classification": "Internal Use Only"
    },
    {
        "Date": "11/30/2023",
        "Source": "TechAnalytics Research Group"
    },
    {
        "Date": "January 25, 2025",
        "Source": "NVIDIA Enterprise Sales Portal",
        "Classification": "Internal Use Only"
    }
]
# Prepare documents as LangChain Document objects.
# Metadata stored in the Document objects will be extracted and used for reranking.
langchain_documents = [
    Document(page_content=content, metadata=metadata_dict[i])
    for i, content in enumerate(documents)
]
# print to validate langchain document
print(langchain_documents[0])
# Use compressor.compress_documents to rerank the documents
reranked_documents = compressor.compress_documents(
    query=query,
    instruction=instruction,
    documents=langchain_documents,
)
print(reranked_documents)
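This guide keeps metadata in two forms: plain strings for the REST API and SDK, and dictionaries for LangChain `Document` objects. One way to keep the two in sync is to derive the strings from the dictionaries. The sketch below does that; the helper name `metadata_to_string` and the `Key: value` layout are assumptions, since the metadata strings used earlier in this guide are free-form.

```python
def metadata_to_string(entry: dict[str, str]) -> str:
    """Render a metadata dict as one 'Key: value. Key: value' string.

    Hypothetical helper; the output layout is an illustrative choice,
    not a format required by the reranker.
    """
    return ". ".join(f"{key}: {value}" for key, value in entry.items())


# Values copied from the first entry of metadata_dict above.
example_entry = {
    "Date": "January 15, 2025",
    "Source": "NVIDIA Enterprise Sales Portal",
    "Classification": "Internal Use Only",
}
print(metadata_to_string(example_entry))
```

Applying the helper to every entry, as in `[metadata_to_string(m) for m in metadata_dict]`, yields a list that can stand in for the hand-written `metadata` strings.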
Additional Resources