Ingest a document into a given Datastore.
Ingestion is an asynchronous task. Returns a document id which can be used to track the status of the ingestion job through calls to the GET /datastores/{datastore_id}/documents/{document_id}/metadata API.
This id can also be used to delete the document through the DELETE /datastores/{datastore_id}/documents/{document_id} API.
file must be a PDF, HTML, DOC(X) or PPT(X) file. The filename must end with one of the following extensions: .pdf, .html, .htm, .mhtml, .doc, .docx, .ppt, .pptx.
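A client can check the filename before uploading. The following is a minimal sketch based only on the extension list above; the helper name `has_supported_extension` is hypothetical, and the check is deliberately case-sensitive because the documentation does not say whether uppercase extensions are accepted.

```python
# Extensions listed in the documentation above.
SUPPORTED_EXTENSIONS = (
    ".pdf", ".html", ".htm", ".mhtml",
    ".doc", ".docx", ".ppt", ".pptx",
)


def has_supported_extension(filename: str) -> bool:
    """Return True if the filename ends with one of the documented extensions.

    Assumption: the comparison is case-sensitive, which is the strictest
    reading of the documented rule. The API itself may be more lenient.
    """
    # str.endswith accepts a tuple and returns True if any suffix matches.
    return filename.endswith(SUPPORTED_EXTENSIONS)
```

Running this check locally avoids submitting an ingestion request that would be rejected for its filename alone.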
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
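As an illustration, the Bearer header can be combined with the status and delete endpoints mentioned above. This is a sketch under stated assumptions: `BASE_URL` is a placeholder rather than the real API host, and the helper names are hypothetical. The code only builds request objects and sends nothing.

```python
import urllib.request

# Hypothetical base URL; replace it with the real API host.
BASE_URL = "https://api.example.com/v1"


def build_request(method: str, path: str, token: str) -> urllib.request.Request:
    """Build an authenticated request object without sending it."""
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": f"Bearer {token}"},
        method=method,
    )


def status_request(datastore_id: str, document_id: str, token: str) -> urllib.request.Request:
    """Request for tracking the ingestion job via the document metadata endpoint."""
    path = f"/datastores/{datastore_id}/documents/{document_id}/metadata"
    return build_request("GET", path, token)


def delete_request(datastore_id: str, document_id: str, token: str) -> urllib.request.Request:
    """Request for deleting a previously ingested document."""
    path = f"/datastores/{datastore_id}/documents/{document_id}"
    return build_request("DELETE", path, token)
```

Passing either request object to `urllib.request.urlopen` would send it. The shape of the response body is not covered by this sketch.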
Datastore ID of the datastore in which to ingest the document
File to ingest.
Metadata request in stringified JSON format. custom_metadata is a flat dictionary containing one or more key-value pairs, where each value must be a primitive type (str, bool, float, or int). By default, at most 15 metadata fields can be used; contact [email protected] if you need more. The combined size of the metadata must not exceed 2 KB when encoded as JSON. Strings in date format must remain in date format; avoid date-like strings that are not valid dates. If provided, the custom_metadata.url or link field is automatically included in returned attributions at query time.
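The constraints above can be checked on the client before the request is sent. The sketch below is one possible way to do that; the function name `build_metadata_field` is hypothetical, "2 KB" is assumed to mean 2048 bytes, and the size limit is applied to the whole serialized string, which is the more conservative reading.

```python
import json

MAX_FIELDS = 15        # default field limit stated in the documentation
MAX_BYTES = 2 * 1024   # assumption: "2 KB" means 2048 bytes


def build_metadata_field(custom_metadata: dict) -> str:
    """Validate custom_metadata and return it as a stringified JSON payload."""
    if not custom_metadata:
        raise ValueError("custom_metadata needs at least one key-value pair")
    if len(custom_metadata) > MAX_FIELDS:
        raise ValueError(f"at most {MAX_FIELDS} metadata fields are allowed by default")
    for key, value in custom_metadata.items():
        # Flat dictionary: only primitive values, no nested dicts or lists.
        if not isinstance(value, (str, bool, float, int)):
            raise TypeError(f"value for {key!r} must be str, bool, float, or int")
    encoded = json.dumps({"custom_metadata": custom_metadata})
    # Assumption: the limit covers the full serialized payload.
    if len(encoded.encode("utf-8")) > MAX_BYTES:
        raise ValueError("metadata exceeds 2 KB when encoded as JSON")
    return encoded
```

The returned string is what would be passed as the metadata form field alongside the file.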
**Example Request Body (as returned by `json.dumps`):**
```json
"{
\"custom_metadata\": {
\"topic\": \"science\",
\"difficulty\": 3
}
}"
```
Whether to perform deep validations on the ingestion result after completion of the ingestion job. If this feature is disabled, basic validations will still be performed which confirm that all expected chunk IDs are present in the VectorDB. If enabled, full equality checks are performed between the expected chunks and the actual chunks present in the VectorDB.
These full equality checks are computationally expensive and should only be used during tests or debugging.
Overrides the datastore's default configuration for this specific document. This allows applying optimized settings tailored to the document's characteristics without changing the global datastore configuration.
Unique identifier for the file to be ingested.
"FILE_ID_123"
Successful Response
Response body from POST /data/documents
ID of the document being ingested