Overview
This guide demonstrates how to use /parse with the Contextual AI API directly and via our Python SDK. It uses the same document — Attention Is All You Need — for both examples.
0. Fetch the Document & Set API Key
First, we’ll fetch the document that we’ll use throughout the notebook.
import requests

# Download the document
url = "https://arxiv.org/pdf/1706.03762"
file_path = "attention-is-all-you-need.pdf"
with open(file_path, "wb") as f:
    f.write(requests.get(url).content)
API Key
Grab your API key from the Contextual AI console and set it in the Secrets pane (on the left in Colab) as CONTEXTUAL_API_KEY.
# Read the API key from the Colab Secrets pane
from google.colab import userdata
api_key = userdata.get('CONTEXTUAL_API_KEY')
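If you're not running in Colab, a minimal sketch of a fallback is to read the key from an environment variable instead (the variable name here simply mirrors the Colab secret):

```python
import os

# Outside Colab, fall back to an environment variable (assumed name)
api_key = os.environ.get("CONTEXTUAL_API_KEY", "")
if not api_key:
    print("Warning: CONTEXTUAL_API_KEY is not set")
```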
1. REST API Implementation
You can use our API directly with the requests package. See docs.contextual.ai for details.
import requests
import json
base_url = "https://api.contextual.ai/v1"
headers = {
    "accept": "application/json",
    "authorization": f"Bearer {api_key}",
}
1.1 Submit Parse Job
Next, we’ll define the configuration for our parse job and submit it. This initiates an asynchronous parse job and returns a job_id we can use to monitor progress.
url = f"{base_url}/parse"

config = {
    "parse_mode": "standard",
    "figure_caption_mode": "concise",
    "enable_document_hierarchy": True,
    "page_range": "0-5",
}

with open(file_path, "rb") as fp:
    files = {"raw_file": fp}
    result = requests.post(url, headers=headers, data=config, files=files)

response = json.loads(result.text)
job_id = response["job_id"]
job_id
1.2 Monitor Job Status
Using the job_id from above, we can monitor the status of our parse job.
# Poll the parse job status until it completes
from time import sleep

url = f"{base_url}/parse/jobs/{job_id}/status"
while True:
    result = requests.get(url, headers=headers)
    parse_response = json.loads(result.text)["status"]
    print(f"Job is {parse_response}")
    if parse_response == "completed":
        break
    sleep(30)
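The loop above never exits if a job ends in a failed state. A small helper with a retry cap avoids polling forever; this is a sketch with hypothetical names, not part of the API:

```python
from time import sleep

def wait_for_parse_job(get_status, poll_interval=30, max_attempts=40):
    """Poll get_status() until it returns 'completed', hits a terminal state, or runs out of attempts."""
    for _ in range(max_attempts):
        status = get_status()
        if status == "completed":
            return status
        if status in ("failed", "cancelled"):
            raise RuntimeError(f"Parse job ended with status: {status}")
        sleep(poll_interval)
    raise TimeoutError(f"Parse job still not complete after {max_attempts} polls")
```

With the REST API you would pass something like lambda: json.loads(requests.get(url, headers=headers).text)["status"].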
1.3 List All Jobs
If we submit multiple jobs and want to see the status of each of them, we can use the List Jobs API:
url = f"{base_url}/parse/jobs"
result = requests.get(url, headers=headers)
parse_response = json.loads(result.text)
parse_response
1.4 Get Parse Results
url = f"{base_url}/parse/jobs/{job_id}/results"
output_types = ["markdown-per-page"]

result = requests.get(
    url,
    headers=headers,
    params={"output_types": ",".join(output_types)},
)
result = json.loads(result.text)
1.5 Display 1st Page
from IPython.display import display, Markdown
display(Markdown(result['pages'][0]['markdown']))
2. Contextual SDK
try:
    from contextual import ContextualAI
except ImportError:
    %pip install --upgrade --quiet contextual-client
    from contextual import ContextualAI

# Set up the Contextual Python SDK client
client = ContextualAI(api_key=api_key)
2.1 Submit Parse Job
with open(file_path, "rb") as fp:
    response = client.parse.create(
        raw_file=fp,
        parse_mode="standard",
        figure_caption_mode="concise",
        enable_document_hierarchy=True,
        page_range="0-5",
    )

job_id = response.job_id
job_id
2.2 Monitor Job Status
# Poll the parse job status until it completes
from time import sleep

while True:
    result = client.parse.job_status(job_id)
    parse_response = result.status
    print(f"Job is {parse_response}")
    if parse_response == "completed":
        break
    sleep(30)
2.3 List All Jobs
Job listing works the same way as in Section 1.3; the SDK wraps the same /parse/jobs endpoint.
2.4 Get Job Results
results = client.parse.job_results(job_id, output_types=['markdown-per-page'])
2.5 Display 1st Page
from IPython.display import display, Markdown
display(Markdown(results.pages[0].markdown))
3. Parse UI
To view job results interactively and submit new jobs, open the Parse UI via the link printed by the cell below.
Replace your-tenant-name with the name of your tenant.
tenant = "your-tenant-name"
print(f"https://app.contextual.ai/{tenant}/components/parse?job={job_id}")
4. Output Types
You can set the desired output format(s) of the parsed file using the output_types parameter. Valid values are:
markdown-document
markdown-per-page
blocks-per-page
Specify multiple values to receive multiple formats in the response.
Format descriptions:
markdown-document parses the entire document into one concatenated markdown output.
markdown-per-page provides markdown output per page.
blocks-per-page provides a structured JSON representation of content blocks on each page, sorted by reading order.
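Requesting multiple formats differs slightly between the two interfaces: the REST API expects a single comma-separated string, while the SDK accepts a list. The query-parameter construction can be sketched as:

```python
# Request two formats in one call: REST expects a comma-separated string
output_types = ["markdown-per-page", "blocks-per-page"]
params = {"output_types": ",".join(output_types)}
print(params["output_types"])  # the SDK accepts the list form directly
```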
4.1 Display Markdown-per-page
results = client.parse.job_results(job_id, output_types=["markdown-per-page"])

for page in results.pages:
    display(Markdown(page.markdown))
4.2 Blocks per page
results = client.parse.job_results(job_id, output_types=["blocks-per-page"])

for page in results.pages:
    for block in page.blocks:
        display(Markdown(block.markdown))
4.3 Markdown-document
This returns the full document text in a single field, markdown_document.
result = client.parse.job_results(job_id, output_types=['markdown-document'])
display(Markdown(result.markdown_document))
5. Document Hierarchy
5.1 Display hierarchy
To inspect the document hierarchy, rendered as a markdown table of contents, run:
from IPython.display import display, Markdown
display(Markdown(results.document_metadata.hierarchy.table_of_contents))
5.2 Add hierarchy context
LLMs work best when they’re given structured information about a document’s hierarchy and organization. That’s why the Parse API is context-aware: each block can carry metadata such as the section it belongs to.
To retrieve it, we’ll set output_types to blocks-per-page and use each block’s parent_ids to look up the corresponding section headings. The parent_ids are sorted from the root level down, in case of nested sections.
results = client.parse.job_results(job_id, output_types=["blocks-per-page"])

# Map block IDs to their markdown so we can look up parent headings
hash_table = {}
for page in results.pages:
    for block in page.blocks:
        hash_table[block.id] = block.markdown

page = results.pages[3]  # example page
for block in page.blocks:
    if block.parent_ids:
        parent_content = "\n".join(hash_table[parent_id] for parent_id in block.parent_ids)
        print(f"Metadata:\n------\n{parent_content}\n\nText\n------\n{block.markdown}\n\n")
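For downstream use (e.g., RAG ingestion), the same lookup can prepend each block's section headings to its text. A minimal sketch, using a hypothetical build_chunks helper over objects with parent_ids and markdown attributes:

```python
def build_chunks(blocks, heading_lookup):
    """Prefix each block's markdown with its section headings, root level first."""
    chunks = []
    for block in blocks:
        headings = [heading_lookup[pid] for pid in (block.parent_ids or [])]
        prefix = "\n".join(headings)
        chunks.append(f"{prefix}\n{block.markdown}".strip())
    return chunks
```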
6. Split Tables
When extracting large tables, we sometimes need to split them into chunks that fit an LLM’s context while preserving the table header in each chunk. To do that, we’ll use the enable_split_tables and max_split_table_cells parameters like so:
We’re using a document with a large table. You can take a look at the original doc here.
# Download the document
url = "https://raw.githubusercontent.com/ContextualAI/examples/refs/heads/main/03-standalone-api/04-parse/data/omnidocbench-text.pdf"
file_path = "omnidocbench-text_pdf.pdf"
with open(file_path, "wb") as f:
    f.write(requests.get(url).content)

with open(file_path, "rb") as fp:
    response = client.parse.create(
        raw_file=fp,
        parse_mode="standard",
        enable_split_tables=True,
        max_split_table_cells=100,
    )

job_id = response.job_id
job_id
# Poll the parse job status until it completes
while True:
    result = client.parse.job_status(job_id)
    parse_response = result.status
    print(f"Job is {parse_response}")
    if parse_response == "completed":
        break
    sleep(30)

result = client.parse.job_results(job_id, output_types=["markdown-per-page"])
for page in result.pages:
    display(Markdown(page.markdown))