Rerank - Contextual AI Documentation

Overview

Contextual AI’s reranker is the first reranker with instruction-following capabilities, which it uses to resolve conflicts among retrieved documents. It is the most accurate reranker in the world on industry-leading benchmarks such as BEIR. To learn more about the reranker and its importance in RAG pipelines, please see our blog. This how-to guide uses a single example to show how to call the reranker through the Contextual API directly, our Python SDK, and our LangChain package. The current reranker models are:

ctxl-rerank-v2-instruct-multilingual
ctxl-rerank-v2-instruct-multilingual-mini
ctxl-rerank-v1-instruct

Global Variables & Examples

First, we set up the global variables and example data that each implementation method below uses.

# This guide runs in Google Colab, where the API key is stored as a Colab secret named "API_TOKEN"
from google.colab import userdata

api_key = userdata.get("API_TOKEN")
base_url = "https://api.contextual.ai/v1"
rerank_api_endpoint = f"{base_url}/rerank"

query = "What is the current enterprise pricing for the RTX 5090 GPU for bulk orders?"

instruction = "Prioritize internal sales documents over market analysis reports. More recent documents should be weighted higher. Enterprise portal content supersedes distributor communications."

documents = [
    "Following detailed cost analysis and market research, we have implemented the following changes: AI training clusters will see a 15% uplift in raw compute performance, enterprise support packages are being restructured, and bulk procurement programs (100+ units) for the RTX 5090 Enterprise series will operate on a $2,899 baseline.",
    "Enterprise pricing for the RTX 5090 GPU bulk orders (100+ units) is currently set at $3,100-$3,300 per unit. This pricing for RTX 5090 enterprise bulk orders has been confirmed across all major distribution channels.",
    "RTX 5090 Enterprise GPU requires 450W TDP and 20% cooling overhead."
]

metadata = [
    "Date: January 15, 2025. Source: NVIDIA Enterprise Sales Portal. Classification: Internal Use Only",
    "TechAnalytics Research Group. 11/30/2023.",
    "January 25, 2025; NVIDIA Enterprise Sales Portal; Internal Use Only"
]

model = "ctxl-rerank-v2-instruct-multilingual"
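The documents and metadata lists above are matched by position: the first metadata string describes the first document, and so on. The following is a minimal sketch of a pre-flight check, assuming that positional pairing is what the reranker expects. The helper name pair_with_metadata is hypothetical and not part of any Contextual AI package.

```python
def pair_with_metadata(documents: list[str], metadata: list[str]) -> list[tuple[str, str]]:
    """Return (document, metadata) pairs, matched by list position.

    Hypothetical helper: raises ValueError when the two lists differ in
    length, because a positional pairing would then be ambiguous.
    """
    if len(documents) != len(metadata):
        raise ValueError(
            f"Got {len(documents)} documents but {len(metadata)} metadata entries"
        )
    return list(zip(documents, metadata))


# Illustrative placeholder values, not the guide's real example data
sample_documents = ["pricing memo", "analyst report"]
sample_metadata = ["Source: sales portal", "Source: research group"]

for document, meta in pair_with_metadata(sample_documents, sample_metadata):
    print(f"{document!r} <- {meta!r}")
```

Running a check like this before sending a request makes a misaligned list fail locally rather than producing a ranking based on the wrong metadata.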

REST API implementation

import requests

headers = {
    "accept": "application/json",
    "content-type": "application/json",
    "authorization": f"Bearer {api_key}"
}

payload = {
    "query": query,
    "instruction": instruction,
    "documents": documents,
    "metadata": metadata,
    "model": model
}

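rerank_response = requests.post(rerank_api_endpoint, json=payload, headers=headers)

# Raise an exception on 4xx/5xx responses instead of printing an error body
rerank_response.raise_for_status()

print(rerank_response.json())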
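The printed JSON is easier to read once each score is mapped back to its document text. The sketch below assumes the response body has the shape {"results": [{"index": ..., "relevance_score": ...}, ...]}, where index points into the submitted documents list. That shape is an assumption here, so check it against the API Reference. The function name rank_documents and the sample values are hypothetical.

```python
from typing import Any


def rank_documents(response_body: dict[str, Any], documents: list[str]) -> list[tuple[float, str]]:
    """Pair each document with its score, highest score first.

    Assumes response_body looks like:
    {"results": [{"index": int, "relevance_score": float}, ...]}
    """
    results = response_body.get("results", [])
    ranked = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [(r["relevance_score"], documents[r["index"]]) for r in ranked]


# Illustrative values only, not real API output
sample_body = {
    "results": [
        {"index": 2, "relevance_score": 0.1},
        {"index": 0, "relevance_score": 0.9},
        {"index": 1, "relevance_score": 0.4},
    ]
}
sample_docs = ["doc A", "doc B", "doc C"]

for score, text in rank_documents(sample_body, sample_docs):
    print(f"{score:.2f}  {text}")
```

With a real response, the call would be rank_documents(rerank_response.json(), documents).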

Python SDK

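# %pip is a notebook magic; outside a notebook, run "pip install contextual-client" in a shell first
try:
    from contextual import ContextualAI
except ImportError:
    %pip install contextual-client
    from contextual import ContextualAI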

client = ContextualAI(api_key=api_key, base_url=base_url)

rerank_response = client.rerank.create(
    query=query,
    instruction=instruction,
    documents=documents,
    metadata=metadata,
    model=model,
)

print(rerank_response.to_dict())

LangChain

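# %pip is a notebook magic; outside a notebook, run "pip install langchain-contextual" in a shell first
try:
    from langchain_contextual import ContextualRerank
except ImportError:
    %pip install langchain-contextual
    from langchain_contextual import ContextualRerank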

from langchain_core.documents import Document

# Initialize the Contextual reranker via langchain_contextual
compressor = ContextualRerank(
    model=model,
    api_key=api_key,
)

# Prepare metadata in dictionary format for the LangChain Document class
metadata_dict = [
    {
        "Date": "January 15, 2025",
        "Source": "NVIDIA Enterprise Sales Portal",
        "Classification": "Internal Use Only"
    },
    {
        "Date": "11/30/2023",
        "Source": "TechAnalytics Research Group"
    },
    {
        "Date": "January 25, 2025",
        "Source": "NVIDIA Enterprise Sales Portal",
        "Classification": "Internal Use Only"
    }
]


# Prepare documents as LangChain Document objects
# Metadata stored in the Document objects is extracted and used for reranking
langchain_documents = [
    Document(page_content=content, metadata=metadata_dict[i])
    for i, content in enumerate(documents)
]

# print to validate langchain document
print(langchain_documents[0])

# use compressor.compress_documents to rerank the documents
reranked_documents = compressor.compress_documents(
    query=query,
    instruction=instruction,
    documents=langchain_documents,
)
print(reranked_documents)
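Printing the full list of Document objects can be hard to scan. The sketch below shows one way to format a ranked list as one line per document. It relies only on the page_content and metadata attributes that the Document objects above are built with. The StandInDocument class and the summarize function are hypothetical, introduced so the example runs without LangChain installed.

```python
from dataclasses import dataclass, field


@dataclass
class StandInDocument:
    """Hypothetical stand-in exposing the same two attributes as a LangChain Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def summarize(docs, width: int = 60) -> list[str]:
    """Return one line per document: rank, source, date, and a text snippet."""
    lines = []
    for rank, doc in enumerate(docs, start=1):
        source = doc.metadata.get("Source", "unknown source")
        date = doc.metadata.get("Date", "unknown date")
        snippet = doc.page_content[:width]
        lines.append(f"{rank}. [{source}, {date}] {snippet}")
    return lines


# Illustrative input; in the guide you would pass reranked_documents instead
sample = [
    StandInDocument(
        "RTX 5090 Enterprise GPU requires 450W TDP and 20% cooling overhead.",
        {"Date": "January 25, 2025", "Source": "NVIDIA Enterprise Sales Portal"},
    ),
    StandInDocument("No metadata here."),
]

for line in summarize(sample):
    print(line)
```

The function makes no assumption about which extra metadata keys the reranker adds to the returned documents; it reads only the Source and Date keys defined in metadata_dict.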

Additional Resources

Read the blog post to learn more about the Contextual AI Reranker
Access the related notebook, which demonstrates how to use the reranker with the Contextual API directly, our Python SDK, and our LangChain package
API Reference
Python SDK
LangChain Package
