Overview
Data sources define which folders and files Experio should scan and process from your connected cloud storage providers. Each data source is linked to a connector and specifies folder paths, scanning behavior, and filtering rules. Navigate to Admin > Data Sources > Data Sources.
Creating a Data Source
Click Add New Data Source to start a multi-step configuration wizard:
Choose Source Type
Select the type of data source:
- Box — Scan folders from a Box account
- Google Drive — Scan folders from Google Drive
- SharePoint — Scan folders from a SharePoint site
- File Upload — Upload files directly to Experio
Validate Configuration
Enter the connection details and validate that Experio can access the specified location. The system verifies credentials and folder access.
Configure Filters
Set up folder hierarchy and filtering rules:
- Folder paths — Specify which folders to scan
- Recursive scanning — Include subfolders
- Filter expressions — Include or exclude files based on patterns
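The filter expression syntax is not specified on this page. As a rough illustration of include/exclude filtering, the sketch below assumes glob-style patterns matched against the file name; the function name `should_scan` and the example patterns are hypothetical and not part of Experio.

```python
from fnmatch import fnmatchcase


def should_scan(path: str, include: list[str], exclude: list[str]) -> bool:
    """Return True if the file name passes the include/exclude patterns.

    Assumption: patterns are glob-style and apply to the file name only.
    Exclude patterns win over include patterns; an empty include list
    means "include everything".
    """
    name = path.rsplit("/", 1)[-1]
    if any(fnmatchcase(name, pattern) for pattern in exclude):
        return False
    return not include or any(fnmatchcase(name, pattern) for pattern in include)


# Hypothetical folder listing and patterns, for illustration only.
files = ["reports/q1.pdf", "reports/~$q1.docx", "reports/archive/old.tmp"]
kept = [
    f for f in files
    if should_scan(f, include=["*.pdf", "*.docx"], exclude=["~$*", "*.tmp"])
]
print(kept)
```

Giving exclusions priority keeps temporary or lock files out even when their extension matches an include pattern.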
Setup Source
Configure ingestion settings for the data source:
- Days to sync — How far back to scan for files
- Use OCR — Enable optical character recognition for scanned documents
- Classification max pages — Limit pages sent to the classifier
- Ingestion type — Choose Full ingestion (default) for the complete pipeline, or Parse only to stop after parsing (useful when a downstream system handles classification and embedding)
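To make two of these settings concrete, here is a minimal sketch of how Days to sync and Classification max pages could be applied. It assumes that "how far back" is measured against a file's last-modified time and that pages are a simple list; both are assumptions, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def in_sync_window(modified_at: datetime, days_to_sync: int, now: datetime) -> bool:
    """True if the file was modified within the last `days_to_sync` days.

    Assumption: the window is based on modification time.
    """
    return modified_at >= now - timedelta(days=days_to_sync)


def pages_for_classifier(pages: list[str], max_pages: int) -> list[str]:
    """Keep only the first `max_pages` pages for classification."""
    return pages[:max_pages]


# Fixed timestamps so the example is deterministic.
now = datetime(2024, 6, 30, tzinfo=timezone.utc)
recent = datetime(2024, 6, 20, tzinfo=timezone.utc)
stale = datetime(2024, 1, 1, tzinfo=timezone.utc)

print(in_sync_window(recent, days_to_sync=30, now=now))
print(in_sync_window(stale, days_to_sync=30, now=now))
print(pages_for_classifier(["p1", "p2", "p3", "p4"], max_pages=2))
```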
Data Source Properties
| Property | Description |
|---|---|
| Name | Display name for identifying the data source |
| Connector | The authorized connection to use |
| Folder Path | Root folder to scan |
| Recursive | Whether to scan subfolders |
| Filter Expression | Pattern to include/exclude files |
| Ingestion Type | Pipeline mode. Full ingestion (default) runs the complete pipeline (download → parse → classify → graph → embed). Parse only stops after parsing: files are downloaded and parsed, but not classified, added to the knowledge graph, or embedded. Parsed artifacts are stored in MinIO for downstream consumption. |
| Status | Active, paused, or error |
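The properties above can be pictured as a single record. The sketch below is one possible in-memory model, not Experio's actual schema: the class names, field names, and enum values are hypothetical, and only the Full ingestion default comes from this page.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class IngestionType(Enum):
    FULL = "full"              # hypothetical identifier for Full ingestion
    PARSE_ONLY = "parse_only"  # hypothetical identifier for Parse only


class Status(Enum):
    ACTIVE = "active"
    PAUSED = "paused"
    ERROR = "error"


@dataclass
class DataSource:
    name: str                                 # display name
    connector: str                            # authorized connection to use
    folder_path: str                          # root folder to scan
    recursive: bool                           # whether to scan subfolders
    filter_expression: Optional[str] = None   # include/exclude pattern
    ingestion_type: IngestionType = IngestionType.FULL  # documented default
    status: Status = Status.ACTIVE            # assumed initial status


# Hypothetical example values.
source = DataSource(
    name="Contracts",
    connector="box-legal",
    folder_path="/Legal/Contracts",
    recursive=True,
)
print(source.ingestion_type.value, source.status.value)
```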
Managing Data Sources
Editing
Click on any data source to open its configuration. Modify settings and save to apply changes. Changes take effect on the next scan cycle.
Monitoring
Each data source shows:
- Last scan time — When the source was last scanned
- Files found — Number of files discovered
- Files processed — Number of files successfully ingested
- Errors — Any files that failed processing
OAuth Callbacks
For Box and SharePoint data sources, OAuth callback handling is built in. If a token expires, you’ll be prompted to re-authorize through the connector.
Parse-Only Mode
When a data source has Ingestion Type set to Parse only, the ingestion pipeline stops after downloading and parsing files. Specifically:
- Files are downloaded from the cloud provider and parsed using the standard parser
- Parsed artifacts are stored in MinIO (under `parsed/{file_id}/...`) with the same retention policy as full ingestion
- No classification, graph ingestion, or embedding occurs
- Files reach a terminal status of `parsed_only` instead of `ingested`
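The difference between the two modes can be sketched as a stage list that is cut short after parsing. This is an illustration only: the mode strings `"full"` and `"parse_only"` and the function names are hypothetical, while the stage order and the `parsed_only` / `ingested` statuses come from this page.

```python
# Stage order as described for full ingestion.
FULL_PIPELINE = ["download", "parse", "classify", "graph", "embed"]


def stages_for(ingestion_type: str) -> list[str]:
    """Return the stages that run for a given ingestion type."""
    if ingestion_type == "full":
        return list(FULL_PIPELINE)
    if ingestion_type == "parse_only":
        # Keep everything up to and including the parse stage.
        return FULL_PIPELINE[: FULL_PIPELINE.index("parse") + 1]
    raise ValueError(f"unknown ingestion type: {ingestion_type!r}")


def terminal_status(ingestion_type: str) -> str:
    """Status a file ends in once its last stage has finished."""
    return "parsed_only" if ingestion_type == "parse_only" else "ingested"


print(stages_for("full"))
print(stages_for("parse_only"), terminal_status("parse_only"))
```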
Ingestion Type can only be changed when the data source has no files currently processing. If you try to switch modes while a scan is in flight, the update is rejected with a validation error. Wait for the current scan to complete (or stop it) before changing the mode. The new mode takes effect on the next scan.
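The rule above amounts to a guard on the update. The following sketch shows the idea with a plain dictionary; the `files_processing` counter, the `ValidationError` class, and the function name are hypothetical and do not describe Experio's API.

```python
class ValidationError(Exception):
    """Raised when an update is not allowed in the current state."""


def change_ingestion_type(source: dict, new_type: str) -> dict:
    """Return an updated copy of `source`, or raise if a scan is in flight.

    Assumption: `source["files_processing"]` counts files currently
    being processed for this data source.
    """
    if source["files_processing"] > 0:
        raise ValidationError(
            "Ingestion type cannot be changed while files are processing"
        )
    # The new mode would apply from the next scan onward.
    return {**source, "ingestion_type": new_type}


idle = {"name": "Contracts", "ingestion_type": "full", "files_processing": 0}
busy = {"name": "Invoices", "ingestion_type": "full", "files_processing": 3}

updated = change_ingestion_type(idle, "parse_only")
print(updated["ingestion_type"])

try:
    change_ingestion_type(busy, "parse_only")
except ValidationError as exc:
    print(f"rejected: {exc}")
```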
File Upload
The File Upload data source type allows direct file uploads:
- Drag and drop files onto the upload area
- Track upload progress with visual indicators
- Files are queued for processing automatically after upload