AI Solution Catalogue

Solution Description

The Red Hat AI 3.0 platform enables users to move from AI experimentation to fully scalable, production-ready services on a unified open-source platform. Deployable entirely within private data centers, it provides full control over infrastructure, data, and operations while supporting a secure, high-performance Sovereign AI environment.

Built on open-source technologies, Red Hat AI 3.0 prevents vendor lock-in and ensures long-term flexibility and strategic independence. It also allows departments to leverage the best open-source tools and models, combining rapid community-driven innovation with enterprise-grade stability and support.

Key Elements

Accelerated Inference: Uses vLLM and llm-d for distributed, SLA-aware inference, which optimizes response times for high-traffic citizen portals and AI applications while using fewer GPUs.
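
vLLM's serving mode exposes an OpenAI-compatible HTTP API, so existing client code can point at a shared platform endpoint. Below is a minimal sketch of assembling such a request; the endpoint URL and model name are illustrative placeholders, not values from this catalogue.

```python
import json

# Hypothetical internal vLLM endpoint (placeholder address for illustration).
VLLM_URL = "http://vllm.internal:8000/v1/chat/completions"

def build_chat_request(model: str, user_message: str, max_tokens: int = 256) -> dict:
    """Assemble the JSON payload an OpenAI-compatible server expects."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("granite-chat", "Summarize today's permit queue.")
body = json.dumps(payload).encode("utf-8")
# A real client would POST `body` to VLLM_URL with a JSON Content-Type header
# (e.g. via urllib.request); the call is omitted so the sketch stays offline.
```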

GPU as a Service: Creates a heterogeneous pool of GPU resources (NVIDIA, AMD, Intel, MetaX 沐曦) across the enterprise, and allows multiple departments to share expensive hardware on-demand, maximizing ROI.
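
The on-demand sharing idea can be sketched as a single pool of heterogeneous devices that tenants borrow and return. This is a toy in-process model with invented device and tenant names; the actual platform enforces sharing through cluster scheduling and quotas.

```python
# Toy sketch of on-demand GPU sharing across departments.
class GpuPool:
    def __init__(self, gpus):
        self.free = list(gpus)   # devices currently available
        self.allocations = {}    # tenant -> list of borrowed devices

    def acquire(self, tenant, count=1):
        if count > len(self.free):
            raise RuntimeError("not enough free GPUs")
        granted = [self.free.pop() for _ in range(count)]
        self.allocations.setdefault(tenant, []).extend(granted)
        return granted

    def release(self, tenant):
        # Return all of a tenant's devices to the shared pool.
        self.free.extend(self.allocations.pop(tenant, []))

pool = GpuPool(["nvidia-0", "nvidia-1", "amd-0", "intel-0"])
pool.acquire("transport-dept", 2)
pool.acquire("health-dept", 1)
pool.release("transport-dept")   # 3 devices are now free again
```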

Model as a Service: A centralized control plane to deploy and reuse AI models across users via a unified API gateway. This simplifies cross-departmental use of standardized LLMs for tasks.
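
The gateway pattern described above can be sketched in a few lines: one logical model name maps to a shared backend deployment, and a per-model allowlist gates access. All model names, tenants, and URLs here are invented for illustration.

```python
# Toy sketch of a Model-as-a-Service gateway with simple access control.
MODEL_BACKENDS = {
    "chat-default": "http://vllm-chat.internal/v1",
    "embed-default": "http://vllm-embed.internal/v1",
}
MODEL_ACCESS = {
    "chat-default": {"transport-dept", "health-dept"},
    "embed-default": {"health-dept"},
}

def route(tenant: str, model: str) -> str:
    """Return the shared backend URL if the tenant may use the model."""
    if tenant not in MODEL_ACCESS.get(model, set()):
        raise PermissionError(f"{tenant} may not use {model}")
    return MODEL_BACKENDS[model]

backend = route("health-dept", "embed-default")
```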

Model Customization: Advanced model fine-tuning and RAG tools that enable departments to "shape" models using local laws and policy documents without leaking data.
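
The RAG half of this can be illustrated with a minimal retrieve-then-prompt loop: pick the policy passage that best matches the question and prepend it to the prompt. Real deployments use embedding search over a vector store; simple word overlap stands in here, and the documents are invented.

```python
# Minimal RAG sketch: retrieval by word overlap, then prompt assembly.
DOCUMENTS = [
    "Permit renewals must be filed 30 days before expiry.",
    "Noise complaints are handled by the environmental bureau.",
]

def retrieve(question: str) -> str:
    """Return the document sharing the most words with the question."""
    q = set(question.lower().split())
    return max(DOCUMENTS, key=lambda d: len(q & set(d.lower().split())))

def build_prompt(question: str) -> str:
    # Local documents never leave the environment; only the assembled
    # prompt is sent to the privately hosted model.
    return f"Context: {retrieve(question)}\nQuestion: {question}"

prompt = build_prompt("When must permit renewals be filed?")
```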

Agentic AI Development: Centrally manage and develop AI agents using the Model Context Protocol (MCP). Connects AI agents to legacy databases to automate complex workflows like permit approvals.
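
The agent/tool pattern that MCP standardizes can be sketched as a central registry of named tools that an agent invokes with structured arguments. The permit-lookup tool and its data below are invented; a real MCP server speaks the protocol over a transport rather than in-process calls.

```python
# Toy sketch of centrally registered tools, in the spirit of MCP.
TOOLS = {}

def tool(name):
    """Decorator that registers a function under a tool name."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("lookup_permit")
def lookup_permit(permit_id: str) -> dict:
    # Stand-in for a legacy database query.
    fake_db = {"P-1001": {"status": "approved"}}
    return fake_db.get(permit_id, {"status": "not found"})

def call_tool(name: str, **kwargs):
    """Dispatch an agent's tool call by name."""
    return TOOLS[name](**kwargs)

result = call_tool("lookup_permit", permit_id="P-1001")
```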

GenAIOps Tools: Provides enterprise-grade lifecycle management for generative AI applications, including prompt management, model monitoring, evaluation of LLM performance, and AI guardrails.
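
Of these, a guardrail is the easiest to sketch: screen model output against policy before it reaches the user. The blocklist below is illustrative; production guardrails use classifiers and policy engines rather than substring matching.

```python
# Toy output guardrail: withhold responses containing blocked terms.
BLOCKED_TERMS = ["password", "hkid"]

def apply_guardrail(text: str) -> str:
    """Return the text unchanged, or a withheld notice if it trips a rule."""
    lowered = text.lower()
    for term in BLOCKED_TERMS:
        if term in lowered:
            return "[response withheld by guardrail]"
    return text

safe = apply_guardrail("The permit is approved.")
blocked = apply_guardrail("Your password is 1234.")
```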

MLOps Capability: Provides a streamlined, end-to-end process to build, deploy, and manage AI models at scale with CI/CD pipelines, a model registry, data science JupyterLab environments, and feature stores.
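
A model registry, one of the building blocks named above, can be sketched as versioned metadata: each registration gets an immutable version number that deployments can pin to and roll back from. The model name and metric below are invented.

```python
# Toy model registry with 1-based, append-only version numbers.
class ModelRegistry:
    def __init__(self):
        self._models = {}  # name -> list of metadata dicts, one per version

    def register(self, name: str, metadata: dict) -> int:
        """Append a new version and return its version number."""
        versions = self._models.setdefault(name, [])
        versions.append(metadata)
        return len(versions)

    def get(self, name: str, version: int) -> dict:
        """Fetch the metadata recorded for a specific version."""
        return self._models[name][version - 1]

registry = ModelRegistry()
v1 = registry.register("fraud-detector", {"accuracy": 0.91})
v2 = registry.register("fraud-detector", {"accuracy": 0.94})
```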

Use Case

Reducing GPU hardware resources and investment: Red Hat AI 3.0 includes LLM model compression technologies that reduce the hardware resources an LLM consumes, lowering the need for extra GPUs and thus the hardware investment cost.
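
The arithmetic behind compression-driven savings is straightforward: weight-only quantization shrinks the bytes stored per parameter. A sketch with illustrative numbers (not benchmarks from this catalogue):

```python
# Back-of-envelope weight memory for a quantized LLM.
def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate GB needed just for model weights."""
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

fp16 = weight_memory_gb(70, 16)  # 70B parameters at 16-bit: 140 GB of weights
int4 = weight_memory_gb(70, 4)   # same model at 4-bit:       35 GB of weights
```

A 4x reduction in weight memory is why a compressed model can fit on a fraction of the GPUs; activation memory and KV cache add overhead on top of these figures.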

Boosting LLM inference and token generation performance: Red Hat AI 3.0 includes an LLM inference optimization engine (vLLM) that incorporates state-of-the-art inference techniques to boost token generation throughput and lower token generation latency, maximizing GPU hardware utilization and performance.

Sharing an LLM model privately with governance and observability: Red Hat AI 3.0 includes Model-as-a-Service, which lets a department act as an internal AI model provider, sharing a model with a large group of users without repeatedly deploying it for each use case. Governance controls determine who can access which models, and the usage of each model can be monitored across departments and users.

Incorporating private knowledge into AI models: Red Hat AI 3.0 provides model fine-tuning and RAG tools that let departments incorporate their own data into AI models, tailoring them to understand unique business context and terminology without leaking data to the public.

Sharing expensive GPU resources across users: Red Hat AI 3.0 provides GPU-as-a-Service, where heterogeneous GPUs (NVIDIA, AMD, Intel, MetaX 沐曦) are centrally managed as a single, shared cloud-native cluster. Multi-tenancy is supported through secure, isolated environments for each bureau that share the underlying physical infrastructure, maximizing the ROI of hardware.

Centralizing MCP and agentic AI development: Red Hat AI 3.0 centralizes MCP server management and provisioning, and provides tools to centralize AI application development API endpoints for easier management and maintainability. The platform also provides tools to deploy and integrate third-party agentic solutions such as LangChain and Dify.

JupyterLab-as-a-Service: Red Hat AI 3.0 allows users to dynamically create JupyterLab instances in a private cloud environment, and lets administrators centrally manage all JupyterLab resources for data scientists. With this, departments can centralize resource provisioning and management without individually installing a working environment for each data science user.

Operationalizing ML projects: Red Hat AI 3.0 streamlines the end-to-end ML development lifecycle, covering tools for model development, training, fine-tuning, evaluation, versioning, comparison, deployment, and monitoring.

Presentation Videos

If any government department would like to obtain additional information about the AI solution, please contact Smart LAB.