Ingest a document into a given Datastore.
Ingestion is an asynchronous task. The call returns a document ID, which can be used to track the status of the ingestion job through the GET /datastores/{datastore_id}/documents/{document_id}/metadata API.
The same ID can also be used to delete the document through the DELETE /datastores/{datastore_id}/documents/{document_id} API.
file must be a PDF, HTML, DOC(X), PPT(X), PNG, JPG, or JPEG file. The filename must end with one of the following extensions: .pdf, .html, .htm, .mhtml, .doc, .docx, .ppt, .pptx, .png, .jpg, .jpeg.
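A client can check the filename against the extension list above before uploading. The following is a minimal sketch; the helper name `has_supported_extension` is hypothetical, and because the text does not say whether uppercase extensions are accepted, the check is deliberately strict and case-sensitive.

```python
# Extensions copied from the list above.
SUPPORTED_EXTENSIONS = (
    ".pdf", ".html", ".htm", ".mhtml",
    ".doc", ".docx", ".ppt", ".pptx",
    ".png", ".jpg", ".jpeg",
)


def has_supported_extension(filename: str) -> bool:
    """Return True if filename ends with one of the documented extensions.

    Assumption: matching is case-sensitive, since case handling is not
    specified. Relax this only if the API is known to accept e.g. ".PDF".
    """
    # str.endswith accepts a tuple of candidate suffixes.
    return filename.endswith(SUPPORTED_EXTENSIONS)
```

For example, `has_supported_extension("report.pdf")` is true, while `has_supported_extension("notes.txt")` is false.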
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Datastore ID of the datastore in which to ingest the document
File to ingest.
Metadata request in stringified JSON format. custom_metadata is a flat dictionary containing one or more key-value pairs, where each value must be a primitive type (str, bool, float, or int). By default, at most 15 metadata fields can be used; contact [email protected] if you need more. The combined size of the metadata must not exceed 2 KB when encoded as JSON. String values in a date format must stay in that date format; avoid date-like strings that are not valid dates. If provided, the custom_metadata.url or link field is automatically included in the attributions returned at query time.
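The constraints on custom_metadata can be checked client-side before the request is sent. This is a sketch under stated assumptions, not the service's own validation: the function name `build_metadata_field` is hypothetical, "2 KB" is read as 2048 bytes, and the size is measured on the whole stringified payload.

```python
import json

MAX_FIELDS = 15        # documented default limit on metadata fields
MAX_BYTES = 2 * 1024   # assumption: "2 KB" is read as 2048 bytes


def build_metadata_field(custom_metadata: dict) -> str:
    """Validate custom_metadata and return it as stringified JSON."""
    if not 1 <= len(custom_metadata) <= MAX_FIELDS:
        raise ValueError(
            f"expected 1 to {MAX_FIELDS} fields, got {len(custom_metadata)}"
        )
    for key, value in custom_metadata.items():
        if not isinstance(key, str):
            raise ValueError(f"key {key!r} is not a string")
        # Flat dictionary only: nested dicts, lists, and None are rejected.
        if not isinstance(value, (str, bool, int, float)):
            raise ValueError(f"value for {key!r} is not a primitive type")
    # allow_nan=False makes json.dumps raise ValueError on NaN or Infinity,
    # which are not valid JSON.
    payload = json.dumps({"custom_metadata": custom_metadata}, allow_nan=False)
    # Assumption: the limit applies to the UTF-8 bytes of the whole payload.
    if len(payload.encode("utf-8")) > MAX_BYTES:
        raise ValueError("metadata exceeds 2 KB when encoded as JSON")
    return payload


print(build_metadata_field({"topic": "science", "difficulty": 3}))
# prints {"custom_metadata": {"topic": "science", "difficulty": 3}}
```

The returned string is what would be sent as the metadata value; date-format rules are not checked here because the accepted date formats are not specified.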
**Example Request Body (as returned by `json.dumps`):**
```json
"{\"custom_metadata\": {\"topic\": \"science\", \"difficulty\": 3}}"
```

{
  "custom_metadata": { "field1": "value1", "field2": "value2" }
}

Overrides the datastore's default configuration for this specific document. This allows applying optimized settings tailored to the document's characteristics without changing the global datastore configuration.
{
  "parsing": {
    "figure_caption_mode": "custom",
    "figure_captioning_prompt": "Generate a detailed caption for each figure."
  }
}

Successful Response
Response body from POST /data/documents
ID of the document being ingested
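Once the document ID is returned, it can be passed to the status and delete endpoints named at the top of this page. The sketch below uses only the Python standard library. `BASE_URL` is a placeholder, not the real host, and the function names are hypothetical; only the two paths and the Bearer header form come from this page.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.example.com/v1"  # assumption: replace with the real API host


def document_url(datastore_id: str, document_id: str, suffix: str = "") -> str:
    """Build /datastores/{datastore_id}/documents/{document_id}{suffix}."""
    # safe="" percent-encodes every reserved character, including "/".
    return (
        f"{BASE_URL}/datastores/{quote(datastore_id, safe='')}"
        f"/documents/{quote(document_id, safe='')}{suffix}"
    )


def get_document_metadata(datastore_id: str, document_id: str, token: str) -> dict:
    """Fetch the document metadata used to track the ingestion job."""
    request = urllib.request.Request(
        document_url(datastore_id, document_id, "/metadata"),
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
    with urllib.request.urlopen(request) as response:
        # The response schema is not described here, so the parsed JSON
        # is returned as-is.
        return json.load(response)


def delete_document(datastore_id: str, document_id: str, token: str) -> int:
    """Delete the document and return the HTTP status code."""
    request = urllib.request.Request(
        document_url(datastore_id, document_id),
        headers={"Authorization": f"Bearer {token}"},
        method="DELETE",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Because ingestion is asynchronous, a caller would typically call `get_document_metadata` repeatedly with a delay until the job reports completion; the field that carries the status is not specified on this page.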