
Overview

This guide demonstrates how to use /parse with the Contextual AI API directly and via our Python SDK. It uses the same document — Attention Is All You Need — for both examples.

0. Fetch the Document & Set API Key

First, we’ll fetch the document that we’ll use throughout the notebook.
url = "https://arxiv.org/pdf/1706.03762"

import requests

# Download doc
file_path = "attention-is-all-you-need.pdf"
with open(file_path, "wb") as f:
    f.write(requests.get(url).content)

API Key

Grab your API Key from the Contextual console and set it on the left in the Secrets pane as CONTEXTUAL_API_KEY.
# Read the API key from the Colab Secrets pane
from google.colab import userdata
api_key = userdata.get('CONTEXTUAL_API_KEY')

1. REST API implementation

You can use our API directly with the requests package. See docs.contextual.ai for details.
import requests
import json

base_url = "https://api.contextual.ai/v1"

headers = {
    "accept": "application/json",
    "authorization": f"Bearer {api_key}"
}

1.1 Submit Parse Job

Next, we’ll define the configuration for our parse job and submit it. This initiates an async parse job and returns a job_id we can use to monitor progress.
url = f"{base_url}/parse"

config = {
    "parse_mode": "standard",
    "figure_caption_mode": "concise",
    "enable_document_hierarchy": True,
    "page_range": "0-5",
}

with open(file_path, "rb") as fp:
    file = {"raw_file": fp}
    result = requests.post(url, headers=headers, data=config, files=file)
    response = json.loads(result.text)

job_id = response['job_id']
job_id

1.2 Monitor Job Status

Using the job_id from above, we can monitor the status of our parse job.
# Check on parse job status
from time import sleep

url = f"{base_url}/parse/jobs/{job_id}/status"

while True:
    result = requests.get(url, headers=headers)
    parse_response = json.loads(result.text)['status']
    print(f"Job is {parse_response}")
    if parse_response == "completed":
        break
    sleep(30)
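The loop above polls until the job reports completed, so it will spin forever if a job never reaches that state. A minimal sketch of a reusable variant with a timeout (the poll_status callable and wait_for_job itself are illustrative helpers, not part of the documented API):

```python
import time

def wait_for_job(poll_status, timeout_s=600, interval_s=30):
    """Poll `poll_status()` (a callable returning a status string)
    until it reports "completed", or raise after `timeout_s` seconds."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = poll_status()
        print(f"Job is {status}")
        if status == "completed":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"parse job still {status!r} after {timeout_s}s")
        time.sleep(interval_s)
```

With the REST setup above, it could be called as `wait_for_job(lambda: json.loads(requests.get(url, headers=headers).text)['status'])`.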

1.3 List all jobs

If we've submitted multiple jobs and want to see the status of each, we can use the list-jobs API:
url = f"{base_url}/parse/jobs"

result = requests.get(url, headers=headers)
parse_response = json.loads(result.text)
parse_response

1.4 Get Parse results

url = f"{base_url}/parse/jobs/{job_id}/results"

output_types = ["markdown-per-page"]

result = requests.get(
    url,
    headers=headers,
    params={"output_types": ",".join(output_types)},
)

result = json.loads(result.text)

1.5 Display 1st Page

from IPython.display import display, Markdown

display(Markdown(result['pages'][0]['markdown']))

2. Contextual SDK

try:
    from contextual import ContextualAI
except ImportError:
    %pip install --upgrade --quiet contextual-client
    from contextual import ContextualAI

# Setup Contextual Python SDK
client = ContextualAI(api_key=api_key)

2.1 Submit Parse Job

with open(file_path, "rb") as fp:
    response = client.parse.create(
        raw_file=fp,
        parse_mode="standard",
        figure_caption_mode="concise",
        enable_document_hierarchy=True,
        page_range="0-5",
    )

job_id = response.job_id
job_id

2.2 Monitor Job Status

# Check on parse job status
from time import sleep


while True:
    result = client.parse.job_status(job_id)
    parse_response = result.status
    print(f"Job is {parse_response}")
    if parse_response == "completed":
        break
    sleep(30)

2.3 List all jobs

client.parse.jobs()

2.4 Get Job Results

results = client.parse.job_results(job_id, output_types=['markdown-per-page'])

2.5 Display 1st Page

from IPython.display import display, Markdown

display(Markdown(results.pages[0].markdown))

3. Parse UI

To inspect job results interactively and submit new jobs, open the Parse UI via the link printed by the cell below.
Replace your-tenant-name with the name of your tenant.
tenant = "your-tenant-name"
print(f"https://app.contextual.ai/{tenant}/components/parse?job={job_id}")

4. Output Types

You can set the desired output format(s) of the parsed file using the output_types parameter. Valid values are:
  • markdown-document
  • markdown-per-page
  • blocks-per-page
Specify multiple values to receive multiple formats in the response. Format descriptions:
  • markdown-document parses the entire document into one concatenated markdown output.
  • markdown-per-page provides markdown output per page.
  • blocks-per-page provides a structured JSON representation of content blocks on each page, sorted by reading order.
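To illustrate how the three formats differ in shape (field names follow the REST and SDK responses used in this guide; collect_markdown is a hypothetical helper, not part of the API), here is a small function that flattens any of them into a single markdown string:

```python
def collect_markdown(result: dict, output_type: str) -> str:
    """Flatten a parse-results payload (as a plain dict) into one
    markdown string. Field names follow the response shapes shown
    in this guide; this helper is illustrative, not part of the API."""
    if output_type == "markdown-document":
        # Whole document in a single field
        return result["markdown_document"]
    if output_type == "markdown-per-page":
        # One markdown string per page
        return "\n\n".join(page["markdown"] for page in result["pages"])
    if output_type == "blocks-per-page":
        # Structured blocks per page, in reading order
        return "\n\n".join(
            block["markdown"]
            for page in result["pages"]
            for block in page["blocks"]
        )
    raise ValueError(f"unknown output type: {output_type!r}")
```

For example, `collect_markdown(result, "markdown-per-page")` joins the per-page markdown from section 1.4 into one string.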

4.1 Display Markdown-per-page

results = client.parse.job_results(job_id, output_types=['markdown-per-page'])

for page in results.pages:
    display(Markdown(page.markdown))

4.2 Blocks per page

results = client.parse.job_results(job_id, output_types=['blocks-per-page'])

for page in results.pages:
    for block in page.blocks:
        display(Markdown(block.markdown))

4.3 Markdown-document

This returns the document text into a single field markdown_document.
result = client.parse.job_results(job_id, output_types=['markdown-document'])

display(Markdown(result.markdown_document))

5. Hierarchy Metadata

5.1 Display hierarchy

To inspect the document hierarchy rendered as a markdown table of contents, run:
from IPython.display import display, Markdown

display(Markdown(results.document_metadata.hierarchy.table_of_contents))

5.2 Add hierarchy context

LLMs work best when they're given structured information about a document's hierarchy and organization. That's why the parse API is context aware: each block can carry metadata indicating which section its text comes from. To use this, we'll set output_types to blocks-per-page and read each block's parent_ids to recover the corresponding section headings. For nested sections, parent_ids are sorted from the root level down.
results = client.parse.job_results(job_id, output_types=['blocks-per-page'])

hash_table = {}

for page in results.pages:
  for block in page.blocks:
    hash_table[block.id] = block.markdown

page = results.pages[3]  # example
for block in page.blocks:
  if block.parent_ids:
    parent_content = "\n".join([hash_table[parent_id] for parent_id in block.parent_ids])
    print(f"Metadata:\n------\n{parent_content} \n\n  Text\n------\n {block.markdown}\n\n")

6. Table Extraction

When extracting large tables, we sometimes need to split them into chunks that fit an LLM's context while preserving the table header information in each chunk. To do that, we'll use the enable_split_tables and max_split_table_cells parameters. The example below uses a document with a large table; you can take a look at the original doc here.
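To make the idea concrete, here is a toy sketch of splitting a pipe-style markdown table into chunks of at most max_cells cells while repeating the header row in each chunk. This is an illustration of the concept only, not the parser's actual algorithm:

```python
def split_markdown_table(table: str, max_cells: int) -> list[str]:
    """Split a pipe-style markdown table into chunks holding at most
    `max_cells` data cells each, repeating the header row and separator
    in every chunk. A toy illustration, not the parser's algorithm."""
    lines = [ln for ln in table.strip().splitlines() if ln.strip()]
    header, separator, *rows = lines
    # Count columns from the header row
    cols = len(header.strip("|").split("|"))
    rows_per_chunk = max(1, max_cells // cols)
    return [
        "\n".join([header, separator, *rows[i:i + rows_per_chunk]])
        for i in range(0, len(rows), rows_per_chunk)
    ]
```

Each chunk then stands alone as a valid table, so header information survives chunking for downstream LLM use.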
url = 'https://raw.githubusercontent.com/ContextualAI/examples/refs/heads/main/03-standalone-api/04-parse/data/omnidocbench-text.pdf'

# Download doc
file_path = "omnidocbench-text_pdf.pdf"
with open(file_path, "wb") as f:
    f.write(requests.get(url).content)
with open(file_path, "rb") as fp:
    response = client.parse.create(
        raw_file=fp,
        parse_mode="standard",
        enable_split_tables=True,
        max_split_table_cells=100,
    )

job_id = response.job_id
job_id
# Check on parse job status
while True:
    result = client.parse.job_status(job_id)
    parse_response = result.status
    print(f"Job is {parse_response}")
    if parse_response == "completed":
        break
    sleep(30)
result = client.parse.job_results(job_id, output_types=['markdown-per-page'])

for page in result.pages:
  display(Markdown(page.markdown))