Motivation
AskMyDocs is self-hostable and MIT-licensed: your knowledge base, your embeddings, and your audit trail stay on your own infrastructure. There is no managed control plane and no phone-home; the only outbound calls are to the AI provider you configure. This page takes a bare Linux host (or container) to a running instance.
Requirements
| Component | Minimum | Notes |
|---|---|---|
| PHP | >= 8.3 | with the usual Laravel extensions (pdo_pgsql, mbstring, openssl, …) |
| Composer | 2.x | dependency manager |
| PostgreSQL | >= 15 | with the pgvector extension |
| Node.js | >= 20 | only to build the Vite SPA |
| npm | | bundled with Node |
pgvector is not optional — the vector similarity search
and the FTS GIN index are core to retrieval. SQLite is used only in the test
suite (where vector(N) columns swap to JSON text).
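The minimums in the table can be checked before installing anything. The sketch below is one way to do that, assuming GNU `sort -V` is available on the host. `version_ge`, `check_min`, and `preflight` are hypothetical helper names, not part of AskMyDocs, and the PostgreSQL check only sees the local `psql` client version, not the server.

```bash
# Hypothetical preflight helpers; not shipped with AskMyDocs.

# version_ge A B: succeeds when version A >= version B (relies on GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# check_min NAME FOUND MINIMUM: print one result line, fail if FOUND is too old.
check_min() {
  if version_ge "$2" "$3"; then
    echo "ok: $1 $2 (>= $3)"
  else
    echo "FAIL: $1 '$2' (< $3)"
    return 1
  fi
}

# preflight: compare the installed tools against the minimums in the table.
preflight() {
  rc=0
  check_min PHP "$(php -r 'echo PHP_VERSION;')" 8.3 || rc=1
  check_min Composer "$(composer --version | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n1)" 2.0 || rc=1
  # Client version only; the server version can differ.
  check_min PostgreSQL "$(psql --version | grep -oE '[0-9]+(\.[0-9]+)*' | head -n1)" 15 || rc=1
  check_min Node.js "$(node --version | sed 's/^v//')" 20 || rc=1
  return "$rc"
}
```

Calling `preflight` prints one line per tool and returns non-zero if any tool is missing or older than its minimum. It does not check for the pgvector extension or the PHP extensions.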
The runtime topology
A production instance is three long-lived process groups plus the database. The web tier answers requests; the queue worker runs the ingestion and canonical-indexing jobs (IngestDocumentJob, CanonicalIndexerJob); the scheduler runs nightly retention, graph rebuild, and insight computation.
Install — step by step
Environment file + app key
Create the environment file and generate the app key, then set APP_URL, the DB_* block, your AI_PROVIDER + *_API_KEY, and KB_EMBEDDINGS_DIMENSIONS to match your embeddings model. See Configuration.
Enable pgvector & migrate
Create the database, then enable the pgvector extension in it once (CREATE EXTENSION vector) before running the migrations. The migrations create every table, the pgvector columns, and the FTS GIN index. See the core concepts.
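The ordering constraint (extension first, migrations second) can be enforced with a small guard. This is a minimal sketch, assuming `psql` can reach the database through the usual `PG*` environment variables and that the connecting role is allowed to create extensions. The function names and the `askmydocs` database name in the usage note are hypothetical.

```bash
# Hypothetical helpers; connection flags (host, user) are expected to come
# from the standard PG* environment variables.

# enable_pgvector DB: create the extension; needs a sufficiently privileged role.
enable_pgvector() {
  psql -v ON_ERROR_STOP=1 -d "$1" -c 'CREATE EXTENSION IF NOT EXISTS vector;'
}

# pgvector_enabled DB: succeeds only if the extension is registered in DB.
pgvector_enabled() {
  [ "$(psql -tA -d "$1" -c "SELECT 1 FROM pg_extension WHERE extname = 'vector';")" = "1" ]
}

# migrate_if_ready DB: refuse to migrate until pgvector is present.
migrate_if_ready() {
  if ! pgvector_enabled "$1"; then
    echo "pgvector is not enabled in database '$1'; run enable_pgvector first" >&2
    return 1
  fi
  php artisan migrate --force
}
```

A typical sequence would be `createdb askmydocs && enable_pgvector askmydocs && migrate_if_ready askmydocs`, run from the application directory. The `--force` flag lets `migrate` run non-interactively when the app is in production mode.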
Build the SPA
npm run build compiles the React admin/chat SPA and the embeddable KITT widget bundle.
Run the worker & scheduler
Point a process manager (systemd, Supervisor) at the worker, and add the scheduler to cron. See Scheduler & maintenance.
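For orientation, the worker and scheduler wiring might look like the sketch below. It assumes AskMyDocs relies on the standard Laravel commands `queue:work` and `schedule:run`, which the page does not confirm. The helper names are hypothetical, and the `--tries` and `--max-time` values are illustrative, not recommendations from the project.

```bash
# Hypothetical helpers. The artisan commands are standard Laravel ones;
# the application path is whatever directory you deployed to.

# scheduler_cron_line APP_DIR: print a crontab entry that runs the Laravel
# scheduler every minute (the scheduler itself decides which jobs are due).
scheduler_cron_line() {
  printf '* * * * * cd %s && php artisan schedule:run >> /dev/null 2>&1\n' "$1"
}

# install_scheduler_cron APP_DIR: append the entry to the current user's
# crontab unless a schedule:run entry is already present.
install_scheduler_cron() {
  current="$(crontab -l 2>/dev/null || true)"
  case "$current" in
    *"artisan schedule:run"*)
      echo "scheduler entry already present"
      ;;
    *)
      { [ -n "$current" ] && printf '%s\n' "$current"; scheduler_cron_line "$1"; } | crontab -
      ;;
  esac
}

# run_worker: the command a process manager (systemd, Supervisor) should
# supervise and restart. Option values are illustrative.
run_worker() {
  php artisan queue:work --tries=3 --max-time=3600
}
```

Running `install_scheduler_cron /var/www/askmydocs` (a hypothetical path) as the application user adds the entry once; repeated calls leave the crontab unchanged. The process manager should run the `run_worker` command from the application directory and restart it when it exits.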
Storage: the KB disk
Canonical markdown is the source of truth; the database is a projection rebuildable from it. The KB disk is configured in config/filesystems.php and selected via env. To store the KB on S3, set KB_FILESYSTEM_DISK=s3 (with the standard AWS_* credentials and composer require league/flysystem-aws-s3-v3 "^3.0"); this routes ingestion through the fully-configured s3 disk in config/filesystems.php. Do not set KB_DISK_DRIVER=s3 on its own: the default kb disk entry carries no S3 credential keys. Every path is normalised through App\Support\KbPath so the ingest and delete flows resolve identically.
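The KB_DISK_DRIVER pitfall is easy to catch before starting the app. The guard below is a sketch that reads the variables from the current shell environment. `check_kb_disk` is a hypothetical name, and the four AWS_* variables are the standard Laravel s3 disk settings; whether a given deployment needs all of them (for example when credentials come from an instance role) is an assumption.

```bash
# Hypothetical guard; reads the variables from the current environment.
# The AWS_* names are the standard Laravel s3 disk variables. Requiring all
# four is an assumption that may not hold for role-based credentials.
check_kb_disk() {
  # The misconfiguration the text warns about: driver switched, disk not.
  if [ "${KB_DISK_DRIVER:-}" = "s3" ] && [ "${KB_FILESYSTEM_DISK:-}" != "s3" ]; then
    echo "KB_DISK_DRIVER=s3 is set without KB_FILESYSTEM_DISK=s3" >&2
    return 1
  fi
  # When the s3 disk is selected, its credentials must be present.
  if [ "${KB_FILESYSTEM_DISK:-}" = "s3" ]; then
    for key in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_DEFAULT_REGION AWS_BUCKET; do
      eval "value=\${$key:-}"
      if [ -z "$value" ]; then
        echo "KB_FILESYSTEM_DISK=s3 but $key is empty" >&2
        return 1
      fi
    done
  fi
  echo "KB disk settings look consistent"
}
```

With the `.env` values exported into the shell, `check_kb_disk` returns non-zero and names the offending variable when the settings contradict each other.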
Queue connection
Ingestion is asynchronous by default. For anything beyond a single-node demo, use a real queue such as database or redis with a supervised worker. With QUEUE_CONNECTION=sync, ingestion runs inline in the request: fine for a laptop, wrong for production (a 100-document batch would block the HTTP call).
Worked example: a minimal single-node deploy
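One way the steps on this page could fit together on a single host is sketched below. It is not the project's official install script: `deploy_single_node` is a hypothetical function, it expects a filled-in `.env` to exist already, and it assumes the standard Laravel and npm commands apply unchanged.

```bash
# A sketch, not an official script. Usage: deploy_single_node APP_DIR DB_NAME
# The body runs in a subshell, so the caller's working directory is untouched.
deploy_single_node() (
  cd "$1" || exit 1

  # The environment file must already hold APP_URL, DB_*, AI_PROVIDER,
  # *_API_KEY, KB_EMBEDDINGS_DIMENSIONS and a non-sync QUEUE_CONNECTION.
  if [ ! -f .env ]; then
    echo "no .env in $1; create and fill it in first" >&2
    exit 1
  fi

  # PHP dependencies, without dev packages.
  composer install --no-dev --optimize-autoloader || exit 1

  # Generate the app key only if none is set yet.
  grep -Eq '^APP_KEY=.+' .env || php artisan key:generate --force || exit 1

  # pgvector first, then the migrations.
  psql -v ON_ERROR_STOP=1 -d "$2" -c 'CREATE EXTENSION IF NOT EXISTS vector;' || exit 1
  php artisan migrate --force || exit 1

  # Build the SPA and the widget bundle.
  npm ci || exit 1
  npm run build || exit 1

  # Cache config and routes for production.
  php artisan config:cache || exit 1
  php artisan route:cache || exit 1
)
```

After `deploy_single_node /var/www/askmydocs askmydocs` (both names hypothetical) succeeds, the remaining work is to put the queue worker under a process manager and add the scheduler to cron.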
Gotchas & operations
- pgvector must exist before migrate. A missing extension fails the vector(N) column creation with a Postgres error.
- The dimension is a contract. KB_EMBEDDINGS_DIMENSIONS must equal your embeddings model’s output width at migrate time. Changing it later means a resize migration + cache flush + re-index (gotcha).
- sync queue is not for production. Use database or redis and a supervised worker.
- Cache config in production. Run php artisan config:cache and php artisan route:cache, and run config:clear after any .env change.
- Health-check the deploy. GET /healthz returns ok; the admin dashboard health panel surfaces DB / pgvector / queue / disk / provider status (troubleshooting).
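Several of these gotchas can be turned into post-deploy checks. The helpers below are a sketch with hypothetical names. They read a dotenv file directly with `sed`, which handles only simple `KEY=value` lines, and `check_health` assumes the `/healthz` response body is exactly `ok`, which the page implies but does not spell out.

```bash
# Hypothetical post-deploy checks; not part of AskMyDocs.

# env_get FILE KEY: print the value of KEY from a dotenv file (last entry
# wins), stripping optional surrounding double quotes.
env_get() {
  sed -n "s/^$2=//p" "$1" | tail -n1 | sed 's/^"\(.*\)"$/\1/'
}

# check_dimensions FILE EXPECTED: the configured width must match the model.
check_dimensions() {
  actual="$(env_get "$1" KB_EMBEDDINGS_DIMENSIONS)"
  if [ "$actual" != "$2" ]; then
    echo "KB_EMBEDDINGS_DIMENSIONS is '$actual', expected $2" >&2
    return 1
  fi
}

# check_queue FILE: reject an explicit sync queue connection.
check_queue() {
  if [ "$(env_get "$1" QUEUE_CONNECTION)" = "sync" ]; then
    echo "QUEUE_CONNECTION=sync runs ingestion inline; use database or redis" >&2
    return 1
  fi
}

# check_health BASE_URL: GET /healthz and require the body to be "ok"
# (assumed response format).
check_health() {
  [ "$(curl -fsS "$1/healthz")" = "ok" ]
}
```

For example, `check_dimensions .env 1536 && check_queue .env && check_health "$(env_get .env APP_URL)"` fails at the first violated condition; 1536 is only an example width and must be replaced with your model's actual output size.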
Configuration
Every knob, layered env → config → service.
Scheduler & maintenance
The nightly jobs that keep storage tidy.