Create Schema
Create a new extraction schema.
Creates a JSON Schema that defines the structure of data to extract from documents. The schema must conform to our supported subset of JSON Schema 2020-12 features.
Supported Schema Features:
Basic Types:
- string: Text data with optional constraints (minLength, maxLength, pattern, enum)
- integer: Whole numbers with optional constraints (minimum, maximum, enum)
- number: Decimal numbers with optional constraints (minimum, maximum, enum)
- boolean: True/false values
- null: Null values (often used with anyOf for optional fields)
Complex Types:
- object: Key-value pairs with defined properties
- array: Lists of items with defined item schemas
- anyOf: Union types (e.g., string or null for optional fields)
String Formats:
Supported formats: date-time, time, date, duration, email, hostname, ipv4, ipv6, uuid, uri
Schema Structure:
- Root schema must be an object type
- Use $defs for reusable schema components
- Use $ref to reference definitions
- Arrays must have items schema defined
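The structural rules above can be checked locally before a schema is submitted. The following is a minimal sketch, not the service's own validator: the function name check_structure is hypothetical, and it covers only the root-type rule and the items rule.

```python
def check_structure(schema):
    """Return a list of structural problems found in a schema dict."""
    problems = []
    # Rule: the root schema must be an object type.
    if schema.get("type") != "object":
        problems.append("root schema must have type 'object'")

    def is_array(node):
        declared = node.get("type")
        return declared == "array" or (
            isinstance(declared, list) and "array" in declared
        )

    def walk(node, path):
        if isinstance(node, dict):
            # Rule: every array must define an items schema.
            if is_array(node) and "items" not in node:
                problems.append(f"{path}: array is missing 'items'")
            for key, value in node.items():
                walk(value, f"{path}/{key}")
        elif isinstance(node, list):
            for index, value in enumerate(node):
                walk(value, f"{path}/{index}")

    walk(schema, "#")
    return problems


if __name__ == "__main__":
    draft = {
        "type": "object",
        "properties": {"tags": {"type": "array"}},
    }
    for problem in check_structure(draft):
        print(problem)
```

An empty list means neither rule was violated; it does not guarantee that the API will accept the schema.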
Constraints:
- Maximum 10 leaf nodes per array (prevents overly complex schemas)
- No circular references in $ref definitions
- String formats must be from the supported list
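Two of these constraints lend themselves to a local pre-check. The sketch below is an illustration under assumptions: the helper names are hypothetical, only local references of the form #/$defs/<name> are followed, and the leaf-node limit is left out because the text above does not define how leaves are counted.

```python
SUPPORTED_FORMATS = {
    "date-time", "time", "date", "duration", "email",
    "hostname", "ipv4", "ipv6", "uuid", "uri",
}


def unsupported_formats(node):
    """Collect every 'format' value that is not in the supported list."""
    found = []
    if isinstance(node, dict):
        fmt = node.get("format")
        if isinstance(fmt, str) and fmt not in SUPPORTED_FORMATS:
            found.append(fmt)
        for value in node.values():
            found.extend(unsupported_formats(value))
    elif isinstance(node, list):
        for value in node:
            found.extend(unsupported_formats(value))
    return found


def _refs(node):
    """Yield the definition names referenced via '#/$defs/<name>'."""
    prefix = "#/$defs/"
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith(prefix):
            yield ref[len(prefix):]
        for value in node.values():
            yield from _refs(value)
    elif isinstance(node, list):
        for value in node:
            yield from _refs(value)


def has_circular_refs(schema):
    """Depth-first search over $defs; True if a definition reaches itself."""
    graph = {
        name: set(_refs(body))
        for name, body in schema.get("$defs", {}).items()
    }
    visiting, done = set(), set()

    def visit(name):
        if name in done:
            return False
        if name in visiting:
            return True  # back edge: the definition refers to itself
        visiting.add(name)
        cyclic = any(visit(dep) for dep in graph.get(name, ()))
        visiting.discard(name)
        done.add(name)
        return cyclic

    return any(visit(name) for name in graph)
```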
Example Schemas:
Simple Company Schema:
{
  "type": "object",
  "properties": {
    "company_name": {
      "type": "string",
      "description": "The name of the company exactly as it appears in the document"
    },
    "form_type": {
      "type": "string",
      "enum": ["10-K", "10-Q", "8-K", "S-1"],
      "description": "The type of SEC form"
    },
    "trading_symbol": {
      "type": "string",
      "description": "The trading symbol of the company"
    },
    "zip_code": {
      "type": "integer",
      "description": "The zip code of the company headquarters"
    }
  },
  "required": ["company_name", "form_type", "trading_symbol", "zip_code"]
}
Complex Resume Schema:
{
  "type": "object",
  "properties": {
    "personalInfo": {
      "type": "object",
      "properties": {
        "fullName": {"type": "string"},
        "contact": {
          "type": "object",
          "properties": {
            "emails": {
              "type": "array",
              "items": {"type": "string", "format": "email"}
            },
            "phones": {
              "type": "array",
              "items": {"type": "string"}
            }
          }
        }
      },
      "required": ["fullName"]
    },
    "workExperience": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "jobTitle": {"type": "string"},
          "company": {"type": "string"},
          "startDate": {"type": "string"},
          "endDate": {"type": ["string", "null"]},
          "isCurrent": {"type": "boolean"}
        },
        "required": ["jobTitle", "company", "startDate"]
      }
    }
  },
  "required": ["personalInfo", "workExperience"]
}
Schema with References:
{
  "type": "object",
  "properties": {
    "algorithms": {
      "type": "array",
      "items": {"$ref": "#/$defs/algorithm"}
    }
  },
  "$defs": {
    "algorithm": {
      "type": "object",
      "properties": {
        "name": {"type": "string"},
        "description": {"type": "string"}
      },
      "required": ["name"]
    }
  }
}
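To make the reference mechanism concrete, the sketch below resolves the $ref from the example above against its $defs section. The resolve_ref helper is hypothetical and deliberately simple: it handles only local references and ignores JSON Pointer escape sequences.

```python
def resolve_ref(schema, ref):
    """Look up a local reference such as '#/$defs/algorithm' in a schema dict."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local references are handled here: {ref}")
    node = schema
    # Walk the path segments after '#/' one key at a time.
    for part in ref[2:].split("/"):
        node = node[part]
    return node


schema = {
    "type": "object",
    "properties": {
        "algorithms": {
            "type": "array",
            "items": {"$ref": "#/$defs/algorithm"},
        }
    },
    "$defs": {
        "algorithm": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "description": {"type": "string"},
            },
            "required": ["name"],
        }
    },
}

item_schema = schema["properties"]["algorithms"]["items"]
resolved = resolve_ref(schema, item_schema["$ref"])
print(resolved["required"])  # prints ['name']
```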
Best Practices:
- Use descriptive field names that clearly indicate what data should be extracted
- Add detailed descriptions to help the AI understand what to extract
- Use enum for known values (e.g., form types, status values)
- Make fields optional by using anyOf with null or omitting from required
- Use arrays for lists of similar items (e.g., work experience, education)
- Keep schemas focused - avoid overly complex nested structures
- Test with sample documents to ensure the schema captures the expected data
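Several of these practices can be captured in small builder helpers when schemas are assembled in code. This is one possible sketch; the helper names optional and enum_string are hypothetical and not part of any SDK.

```python
import json


def optional(schema):
    """Wrap a schema with anyOf so the field may also be null."""
    return {"anyOf": [schema, {"type": "null"}]}


def enum_string(values, description):
    """Build a string field restricted to a known set of values."""
    return {"type": "string", "enum": list(values), "description": description}


schema = {
    "type": "object",
    "properties": {
        "form_type": enum_string(
            ["10-K", "10-Q", "8-K", "S-1"], "The type of SEC form"
        ),
        # Optional field: allowed to be null and left out of "required".
        "trading_symbol": optional(
            {
                "type": "string",
                "description": "The trading symbol of the company",
            }
        ),
    },
    "required": ["form_type"],
}

print(json.dumps(schema, indent=2))
```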
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
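As an illustration of the header format, the sketch below builds (but does not send) an HTTP request with Python's standard library. The URL is a placeholder, not the real endpoint, and the build_request helper and the JSON content type are assumptions.

```python
import urllib.request

# Placeholder only: substitute the endpoint given in the API reference.
PLACEHOLDER_URL = "https://api.example.com/schemas"


def build_request(url, token, body_bytes):
    """Prepare a POST request carrying a Bearer token; nothing is sent."""
    return urllib.request.Request(
        url,
        data=body_bytes,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumed content type
        },
    )


request = build_request(PLACEHOLDER_URL, "my-token", b"{}")
print(request.get_header("Authorization"))  # prints Bearer my-token
```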
Body
Request model for creating a new extraction schema.
The schema_definition must be a valid JSON Schema that defines the structure of data to extract from documents. The system supports a subset of JSON Schema 2020-12 features optimized for document extraction.
Name of the schema
JSON Schema definition. Must be a valid JSON Schema that defines the structure of data to extract from documents. See the comprehensive schema guide in the API documentation for detailed examples and supported features.
{
  "properties": {
    "company_name": {
      "description": "The name of the company exactly as it appears in the document",
      "type": "string"
    },
    "form_type": {
      "description": "The type of SEC form",
      "enum": ["10-K", "10-Q", "8-K", "S-1"],
      "type": "string"
    },
    "trading_symbol": {
      "description": "The trading symbol of the company",
      "type": "string"
    },
    "zip_code": {
      "description": "The zip code of the company headquarters",
      "type": "integer"
    }
  },
  "required": [
    "company_name",
    "form_type",
    "trading_symbol",
    "zip_code"
  ],
  "type": "object"
}
Description of the schema
Additional metadata for the schema
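A request body combining these fields might be assembled as in the sketch below. Only schema_definition is named in the text above; the keys name, description, and metadata are assumptions inferred from the field descriptions, and build_create_schema_body is a hypothetical helper.

```python
import json


def build_create_schema_body(name, schema_definition,
                             description=None, metadata=None):
    """Assemble the request body, omitting optional fields that are unset."""
    body = {
        "name": name,  # assumed key name
        "schema_definition": schema_definition,
    }
    if description is not None:
        body["description"] = description  # assumed key name
    if metadata is not None:
        body["metadata"] = metadata  # assumed key name
    return body


body = build_create_schema_body(
    name="sec-filing",
    schema_definition={
        "type": "object",
        "properties": {
            "company_name": {
                "type": "string",
                "description": "The name of the company exactly as it "
                               "appears in the document",
            }
        },
        "required": ["company_name"],
    },
    description="Fields extracted from SEC filings",
)

payload = json.dumps(body).encode("utf-8")
print(payload.decode("utf-8"))
```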
Response
Successful Response
Response model for schema information.
Unique ID of the schema
Name of the schema
JSON schema definition
Schema metadata
Timestamp when the schema was created
Timestamp when the schema was last updated
Description of the schema