# Create Schema

> Create a new extraction schema.

Creates a JSON Schema that defines the structure of data to extract from documents. The schema must conform to our supported subset of JSON Schema 2020-12 features.
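As a rough illustration of how a client might call this endpoint, the sketch below builds the `POST /extract/schemas` request with Python's standard library. The server URL, bearer authentication, and body fields (`name`, `schema_definition`, `description`, `metadata`) come from the OpenAPI definition for this endpoint; the helper name `build_create_schema_request` is hypothetical and not part of any official SDK.

```python
import json
import urllib.request

API_BASE = "https://api.contextual.ai/v1"  # server URL from the OpenAPI definition


def build_create_schema_request(api_key, name, schema_definition,
                                description=None, metadata=None):
    """Build (but do not send) a POST /extract/schemas request.

    Hypothetical helper for illustration only.
    """
    body = {"name": name, "schema_definition": schema_definition}
    # description and metadata are optional in SchemaCreateRequest,
    # so they are only included when provided.
    if description is not None:
        body["description"] = description
    if metadata is not None:
        body["metadata"] = metadata
    return urllib.request.Request(
        url=f"{API_BASE}/extract/schemas",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(request)` and decoding the body with `json.load` should yield a `SchemaResponse` object whose `schema_id` identifies the new schema.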

Supported Schema Features:

Basic Types:

1. `string`: Text data with optional constraints (`minLength`, `maxLength`, `pattern`, `enum`)
2. `integer`: Whole numbers with optional constraints (`minimum`, `maximum`, `enum`)
3. `number`: Decimal numbers with optional constraints (`minimum`, `maximum`, `enum`)
4. `boolean`: True/false values
5. `null`: Null values (often used with `anyOf` for optional fields)
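To make the constraint keywords above concrete, here is a small sketch that assembles a schema using each basic type. The field names and constraint values are invented for illustration and do not come from the API documentation.

```python
import json

# Illustrative fields only; names and limits are assumptions.
properties = {
    "ticker": {"type": "string", "minLength": 1, "maxLength": 5,
               "pattern": "^[A-Z]+$"},
    "fiscal_quarter": {"type": "integer", "minimum": 1, "maximum": 4},
    "revenue_musd": {"type": "number", "minimum": 0},
    "is_amended": {"type": "boolean"},
    # A nullable field: either a date string or null.
    "restated_on": {"anyOf": [{"type": "string", "format": "date"},
                              {"type": "null"}]},
}

schema = {
    "type": "object",
    "properties": properties,
    "required": ["ticker", "fiscal_quarter", "revenue_musd", "is_amended"],
}

print(json.dumps(schema, indent=2))
```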

Complex Types:

1. `object`: Key-value pairs with defined properties
2. `array`: Lists of items with defined item schemas
3. `anyOf`: Union types (e.g., string or null for optional fields)

String Formats:

Supported formats: `date-time`, `time`, `date`, `duration`, `email`, `hostname`, `ipv4`, `ipv6`, `uuid`, `uri`
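A client-side check for unsupported formats might look like the sketch below. It is a naive walk over the schema, not the service's own validation, and the function name `unsupported_formats` is hypothetical.

```python
SUPPORTED_FORMATS = {
    "date-time", "time", "date", "duration", "email",
    "hostname", "ipv4", "ipv6", "uuid", "uri",
}


def unsupported_formats(schema, path="#"):
    """Return (path, format) pairs for every string-valued `format`
    keyword that is not in the supported list."""
    found = []
    if isinstance(schema, dict):
        fmt = schema.get("format")
        # Only string values are treated as the `format` keyword; a property
        # that happens to be named "format" maps to a dict and is skipped here.
        if isinstance(fmt, str) and fmt not in SUPPORTED_FORMATS:
            found.append((path, fmt))
        for key, value in schema.items():
            found.extend(unsupported_formats(value, f"{path}/{key}"))
    elif isinstance(schema, list):
        for index, value in enumerate(schema):
            found.extend(unsupported_formats(value, f"{path}/{index}"))
    return found
```

Running this before submitting a schema can surface problems earlier than a round trip to the API, though the server's validation remains the authority.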

Schema Structure:

1. Root schema must be an `object` type
2. Use `$defs` for reusable schema components
3. Use `$ref` to reference definitions
4. Arrays must have `items` schema defined
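The structural rules above can be approximated with a short pre-flight check. This is only a sketch under stated assumptions: it treats every `$ref` as pointing into the root `$defs` (as in the examples on this page), it inspects only `properties`, `$defs`, `items`, and `anyOf`, and the name `structure_errors` is hypothetical.

```python
def structure_errors(schema):
    """Return human-readable problems; an empty list means none were found."""
    errors = []
    if schema.get("type") != "object":
        errors.append("root schema must have type 'object'")
    defs = schema.get("$defs", {})
    prefix = "#/$defs/"

    def walk(node, path):
        if not isinstance(node, dict):
            return
        if node.get("type") == "array" and "items" not in node:
            errors.append(f"{path}: array is missing 'items'")
        ref = node.get("$ref")
        if isinstance(ref, str):
            # Assumption: references always target an entry in root $defs.
            if not ref.startswith(prefix) or ref[len(prefix):] not in defs:
                errors.append(f"{path}: unresolved $ref {ref!r}")
        for keyword in ("properties", "$defs"):
            children = node.get(keyword, {})
            if isinstance(children, dict):
                for name, child in children.items():
                    walk(child, f"{path}/{keyword}/{name}")
        walk(node.get("items"), f"{path}/items")
        for index, child in enumerate(node.get("anyOf", [])):
            walk(child, f"{path}/anyOf/{index}")

    walk(schema, "#")
    return errors
```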

Constraints:

1. Maximum 10 leaf nodes per array (prevents overly complex schemas)
2. No circular references in `$ref` definitions
3. String formats must be from the supported list
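For the circular-reference constraint, one possible client-side check is a depth-first search over the `$defs` reference graph, sketched below. It assumes all references use the `#/$defs/<name>` form shown in the examples; the function names are hypothetical, and the service may detect cycles differently.

```python
def refs_in(node):
    """Collect the names of all '#/$defs/<name>' references under node."""
    prefix = "#/$defs/"
    names = set()
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith(prefix):
            names.add(ref[len(prefix):])
        for value in node.values():
            names |= refs_in(value)
    elif isinstance(node, list):
        for value in node:
            names |= refs_in(value)
    return names


def has_circular_refs(schema):
    """Return True if any definition in $defs reaches itself through $ref."""
    defs = schema.get("$defs", {})
    graph = {name: refs_in(body) for name, body in defs.items()}
    visiting, done = set(), set()

    def visit(name):
        if name in done:
            return False
        if name in visiting:
            return True  # back edge: this definition is already on the stack
        visiting.add(name)
        cyclic = any(visit(dep) for dep in graph.get(name, ()))
        visiting.discard(name)
        done.add(name)
        return cyclic

    return any(visit(name) for name in graph)
```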

Example Schemas:

Simple Company Schema:
```json
{
  "type": "object",
  "properties": {
    "company_name": {
      "type": "string",
      "description": "The name of the company exactly as it appears in the document"
    },
    "form_type": {
      "type": "string",
      "enum": ["10-K", "10-Q", "8-K", "S-1"],
      "description": "The type of SEC form"
    },
    "trading_symbol": {
      "type": "string",
      "description": "The trading symbol of the company"
    },
    "zip_code": {
      "type": "integer",
      "description": "The zip code of the company headquarters"
    }
  },
  "required": ["company_name", "form_type", "trading_symbol", "zip_code"]
}
```

Complex Resume Schema:
```json
{
  "type": "object",
  "properties": {
    "personalInfo": {
      "type": "object",
      "properties": {
        "fullName": {"type": "string"},
        "contact": {
          "type": "object",
          "properties": {
            "emails": {
              "type": "array",
              "items": {"type": "string", "format": "email"}
            },
            "phones": {
              "type": "array",
              "items": {"type": "string"}
            }
          }
        }
      },
      "required": ["fullName"]
    },
    "workExperience": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "jobTitle": {"type": "string"},
          "company": {"type": "string"},
          "startDate": {"type": "string"},
          "endDate": {"anyOf": [{"type": "string"}, {"type": "null"}]},
          "isCurrent": {"type": "boolean"}
        },
        "required": ["jobTitle", "company", "startDate"]
      }
    }
  },
  "required": ["personalInfo", "workExperience"]
}
```

Schema with References:
```json
{
  "type": "object",
  "properties": {
    "algorithms": {
      "type": "array",
      "items": {"$ref": "#/$defs/algorithm"}
    }
  },
  "$defs": {
    "algorithm": {
      "type": "object",
      "properties": {
        "name": {"type": "string"},
        "description": {"type": "string"}
      },
      "required": ["name"]
    }
  }
}
```

Best Practices:

1. Use descriptive field names that clearly indicate what data should be extracted
2. Add detailed descriptions to help the AI understand what to extract
3. Use `enum` for known values (e.g., form types, status values)
4. Make fields optional by using `anyOf` with `null` or by omitting them from `required`
5. Use arrays for lists of similar items (e.g., work experience, education)
6. Keep schemas focused: avoid overly complex nested structures
7. Test with sample documents to ensure the schema captures the expected data
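The two ways of marking a field optional can be combined in a small helper. The sketch below is one possible approach, not an official utility: `nullable` and `build_object_schema` are hypothetical names, and the helper applies both techniques at once (wrapping in `anyOf` with `null` and leaving the field out of `required`), although either alone may be sufficient.

```python
def nullable(field_schema):
    """Wrap a field schema in anyOf so that null is also accepted.

    The description is kept on the outer schema so it stays visible.
    """
    inner = {k: v for k, v in field_schema.items() if k != "description"}
    wrapped = {"anyOf": [inner, {"type": "null"}]}
    if "description" in field_schema:
        wrapped["description"] = field_schema["description"]
    return wrapped


def build_object_schema(fields, optional_names=()):
    """Build an object schema in which the named fields are optional."""
    properties = {}
    required = []
    for name, field_schema in fields.items():
        if name in optional_names:
            properties[name] = nullable(field_schema)
        else:
            properties[name] = field_schema
            required.append(name)
    return {"type": "object", "properties": properties, "required": required}
```

For example, calling `build_object_schema` with `optional_names={"endDate"}` leaves `endDate` out of `required` and lets it be either a string or `null`.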



## OpenAPI

````yaml api-reference/openapi.json post /extract/schemas
openapi: 3.1.0
info:
  title: Endpoints
  version: '1.0'
servers:
  - url: https://api.contextual.ai/v1
security:
  - BearerAuth: []
paths:
  /extract/schemas:
    post:
      tags:
        - /extract
      summary: Create Schema
      description: >-
        Create a new extraction schema.


        Creates a JSON Schema that defines the structure of data to extract from
        documents. The schema must conform to our supported subset of JSON
        Schema 2020-12 features.


        Supported Schema Features:


        Basic Types:


        1. `string`: Text data with optional constraints (`minLength`,
        `maxLength`, `pattern`, `enum`)

        2. `integer`: Whole numbers with optional constraints (`minimum`,
        `maximum`, `enum`)

        3. `number`: Decimal numbers with optional constraints (`minimum`,
        `maximum`, `enum`)

        4. `boolean`: True/false values

        5. `null`: Null values (often used with `anyOf` for optional fields)


        Complex Types:


        1. `object`: Key-value pairs with defined properties

        2. `array`: Lists of items with defined item schemas

        3. `anyOf`: Union types (e.g., string or null for optional fields)


        String Formats:


        Supported formats: `date-time`, `time`, `date`, `duration`, `email`,
        `hostname`, `ipv4`, `ipv6`, `uuid`, `uri`


        Schema Structure:


        1. Root schema must be an `object` type

        2. Use `$defs` for reusable schema components

        3. Use `$ref` to reference definitions

        4. Arrays must have `items` schema defined


        Constraints:


        1. Maximum 10 leaf nodes per array (prevents overly complex schemas)

        2. No circular references in `$ref` definitions

        3. String formats must be from the supported list


        Example Schemas:


        Simple Company Schema:

        ```json

        {
          "type": "object",
          "properties": {
            "company_name": {
              "type": "string",
              "description": "The name of the company exactly as it appears in the document"
            },
            "form_type": {
              "type": "string",
              "enum": ["10-K", "10-Q", "8-K", "S-1"],
              "description": "The type of SEC form"
            },
            "trading_symbol": {
              "type": "string",
              "description": "The trading symbol of the company"
            },
            "zip_code": {
              "type": "integer",
              "description": "The zip code of the company headquarters"
            }
          },
          "required": ["company_name", "form_type", "trading_symbol", "zip_code"]
        }

        ```


        Complex Resume Schema:

        ```json

        {
          "type": "object",
          "properties": {
            "personalInfo": {
              "type": "object",
              "properties": {
                "fullName": {"type": "string"},
                "contact": {
                  "type": "object",
                  "properties": {
                    "emails": {
                      "type": "array",
                      "items": {"type": "string", "format": "email"}
                    },
                    "phones": {
                      "type": "array",
                      "items": {"type": "string"}
                    }
                  }
                }
              },
              "required": ["fullName"]
            },
            "workExperience": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "jobTitle": {"type": "string"},
                  "company": {"type": "string"},
                  "startDate": {"type": "string"},
                  "endDate": {"anyOf": [{"type": "string"}, {"type": "null"}]},
                  "isCurrent": {"type": "boolean"}
                },
                "required": ["jobTitle", "company", "startDate"]
              }
            }
          },
          "required": ["personalInfo", "workExperience"]
        }

        ```


        Schema with References:

        ```json

        {
          "type": "object",
          "properties": {
            "algorithms": {
              "type": "array",
              "items": {"$ref": "#/$defs/algorithm"}
            }
          },
          "$defs": {
            "algorithm": {
              "type": "object",
              "properties": {
                "name": {"type": "string"},
                "description": {"type": "string"}
              },
              "required": ["name"]
            }
          }
        }

        ```


        Best Practices:


        1. Use descriptive field names that clearly indicate what data should be
        extracted

        2. Add detailed descriptions to help the AI understand what to extract

        3. Use `enum` for known values (e.g., form types, status values)

        4. Make fields optional by using `anyOf` with `null` or by omitting them
        from `required`

        5. Use arrays for lists of similar items (e.g., work experience,
        education)

        6. Keep schemas focused: avoid overly complex nested structures

        7. Test with sample documents to ensure the schema captures the expected
        data
      operationId: create_schema_extract_schemas_post
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/SchemaCreateRequest'
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/SchemaResponse'
        '422':
          description: Validation Error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/HTTPValidationError'
components:
  schemas:
    SchemaCreateRequest:
      properties:
        name:
          type: string
          title: Name
          description: Name of the schema
        description:
          anyOf:
            - type: string
            - type: 'null'
          title: Description
          description: Description of the schema
        schema_definition:
          additionalProperties: true
          type: object
          title: Schema Definition
          description: >-
            JSON Schema definition. Must be a valid JSON Schema that defines the
            structure of data to extract from documents. See the comprehensive
            schema guide in the API documentation for detailed examples and
            supported features.
          examples:
            - properties:
                company_name:
                  description: >-
                    The name of the company exactly as it appears in the
                    document
                  type: string
                form_type:
                  description: The type of SEC form
                  enum:
                    - 10-K
                    - 10-Q
                    - 8-K
                    - S-1
                  type: string
                trading_symbol:
                  description: The trading symbol of the company
                  type: string
                zip_code:
                  description: The zip code of the company headquarters
                  type: integer
              required:
                - company_name
                - form_type
                - trading_symbol
                - zip_code
              type: object
        metadata:
          anyOf:
            - additionalProperties: true
              type: object
            - type: 'null'
          title: Metadata
          description: Additional metadata for the schema
      additionalProperties: false
      type: object
      required:
        - name
        - schema_definition
      title: SchemaCreateRequest
      description: >-
        Request model for creating a new extraction schema.


        The schema_definition must be a valid JSON Schema that defines the
        structure of data to extract from documents. The system supports a
        subset of JSON Schema 2020-12 features optimized for document
        extraction.
    SchemaResponse:
      properties:
        schema_id:
          type: string
          title: Schema Id
          description: Unique ID of the schema
        name:
          type: string
          title: Name
          description: Name of the schema
        description:
          anyOf:
            - type: string
            - type: 'null'
          title: Description
          description: Description of the schema
        schema_definition:
          additionalProperties: true
          type: object
          title: Schema Definition
          description: JSON schema definition
        metadata:
          additionalProperties: true
          type: object
          title: Metadata
          description: Schema metadata
        created_at:
          type: string
          title: Created At
          description: Timestamp when the schema was created
        updated_at:
          type: string
          title: Updated At
          description: Timestamp when the schema was last updated
      type: object
      required:
        - schema_id
        - name
        - schema_definition
        - metadata
        - created_at
        - updated_at
      title: SchemaResponse
      description: Response model for schema information.
    HTTPValidationError:
      properties:
        detail:
          items:
            $ref: '#/components/schemas/ValidationError'
          type: array
          title: Detail
      type: object
      title: HTTPValidationError
    ValidationError:
      properties:
        loc:
          items:
            anyOf:
              - type: string
              - type: integer
          type: array
          title: Location
        msg:
          type: string
          title: Message
        type:
          type: string
          title: Error Type
      type: object
      required:
        - loc
        - msg
        - type
      title: ValidationError
  securitySchemes:
    BearerAuth:
      type: http
      scheme: bearer
      bearerFormat: API Key

````
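When the API rejects a request with `422`, the body follows the `HTTPValidationError` shape defined above: a `detail` array whose entries carry `loc`, `msg`, and `type`. The sketch below flattens such a payload into readable lines. The function name is hypothetical, and the sample values in the comment are illustrative rather than real API output.

```python
import json


def format_validation_errors(body):
    """Turn an HTTPValidationError payload into one-line messages.

    `body` may be a JSON string, raw bytes, or an already-parsed dict.
    """
    payload = json.loads(body) if isinstance(body, (str, bytes)) else body
    lines = []
    for err in payload.get("detail", []):
        # loc mixes strings and integers (e.g. field names and array indexes).
        location = ".".join(str(part) for part in err["loc"])
        lines.append(f"{location}: {err['msg']} ({err['type']})")
    return lines


# Illustrative payload shape (values are made up):
# {"detail": [{"loc": ["body", "name"], "msg": "...", "type": "..."}]}
```

With `urllib`, a `422` surfaces as `urllib.error.HTTPError`; passing `exc.read()` to this function would produce the formatted messages.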