Deployment and operations overview
This chapter targets operations, IT, and security teams. If you’re evaluating from the business side, skim “Quick planning” first; if you operate the stack, walk the sections in order.
Quick planning
| Profile | Recommendation | Why |
|---|---|---|
| Small teams / fast trial | SaaS cloud | Zero infra to run |
| Large domestic enterprises / sensitive data | On-premises | Code and data stay inside the perimeter |
| Multinational / strict data sovereignty | Hybrid | combo agent locally, shared services in cloud |
| Defense / government / classified | Domestic-stack on-prem | All-domestic CPUs, DBs, and LLMs |
Details: Deployment plans.
Chapter map
Deployment plans
SaaS vs private vs hybrid vs domestic stack — hardware envelope, scope of delivery, licensing
Installation
Single node quick path · multi-node · Kubernetes · air-gap · upgrade & rollback
Security & compliance
Data boundaries · credential handling · audit · classified / domestic compliance
Monitoring & ops
System metrics · LLM cost · logs · alerting · tuning
Architecture cheat sheet
Minimum hardware (reference)
| Footprint | Nodes | Spec / node | Notes |
|---|---|---|---|
| Single-node sandbox | 1 | 8 vCPU / 32 GB RAM / 200 GB SSD | POC / demo |
| Small team (under 50) | 3 | 16 / 64 / 1 TB SSD | Split data vs app vs storage |
| Mid-market (50–500) | 5–10 | 16 / 64 / 2 TB SSD + GPU if self-hosting LLMs | Horizontally scale app tier |
| Large (>500) | Custom | — | Capacity planning engagement |
Self-hosted LLM nodes depend on the model card—see Installation.
Upgrade / rollback
- SaaS: platform-managed rolling upgrades
- On-prem: quarterly stable trains + security hotfixes — supports in-place upgrade and one-click rollback
- Rollback window: revert to prior build within ~24 h post-upgrade
FAQ
Q: Bring our own LLM? A: Yes—configure arbitrary OpenAI-compatible, Anthropic, domestic models (DeepSeek / Qwen / GLM / MiniMax), or self-hosted vLLM.
Q: Air-gapped data? A: On-prem + self-hosted LLM keeps code, data, models, and inference inside the customer network.
Q: SSO? A: OIDC / SAML 2.0—Azure AD, Okta, Feishu, WeCom, LDAP, etc.