AI Solution Catalogue

Solution Description

OrchAI Platform is a private, scalable, on-premise AI model management platform purpose-built for government enterprise use. It centralises the full AI model lifecycle — deployment, serving, monitoring, and decommissioning — on government-owned GPU infrastructure, ensuring all data remains within the government data centre at all times.

The platform enforces Role-Based Access Control (RBAC) through a unified API Gateway: each application team or system is issued a scoped API key that controls which AI models it may access, with configurable rate limits (per minute / hour / day) and monthly token quotas enforced at the gateway level. This enables multiple departments and applications to securely share the same GPU infrastructure without cross-team data exposure.

OrchAI supports language models (LLMs via vLLM / Ollama), vision models (YOLO via NVIDIA Triton), and speech models (Whisper via vLLM) through a standardised OpenAI-compatible API, with real-time usage monitoring via Prometheus and Grafana.

OrchAI 平台是一套專為受監管企業及政府機構而設的私有、本地部署人工智能模型管理平台,直接應對企業採用人工智能時面臨的三大核心挑戰:

安全性——所有模型與數據均在機構自有基礎設施內運行,與公共互聯網及商業雲端供應商完全隔離,在不作任何妥協的前提下,滿足數據主權及保密資料處理的合規要求;

問責性——透過基於角色的存取控制(RBAC)、按團隊或系統發放的 API 金鑰授權範圍、實時 Token 用量追蹤及完整審計記錄,讓管理層對每位用戶使用哪個模型、消耗多少資源、用於何種用途一目了然,符合受監管環境所要求的管治標準;

可擴展性——統一的 API 閘道與 GPU 加速推理架構,支援多個部門、應用系統及人工智能模式(語言、視覺、語音)在同一基礎設施上安全共用資源,並可從單一部門逐步擴展至全機構部署,毋須重新設計架構。

Use Case

Department common LLM platform.

1. The team config the (Qwen3, Gemma3) via API key for chatbot and document processing use cases.

2. Teams deploying domain-specific fine-tuned models for healthcare workflows.

3. Object detection (YOLO) via Triton for image-based workflows.

4. Audio transcription (Whisper) for meeting and clinical note processing.

Presentation Videos

If any government department would like to obtain additional information about the AI solution, please contact Smart LAB.