June 5, 2024
Cisco AI Toolkit Demonstration
In this technology demonstration, we will showcase the deployment of an AI workload using Cisco Intersight automation on a single UCS X-Series blade, leveraging a neighboring X440p (X-Fabric) with an NVIDIA A40 GPU. This demo will highlight the powerful capabilities of both Cisco Intersight and the Cisco AI Toolkit, which includes tools for AI model deployment, system monitoring, and resource management. By utilizing Miniconda for environment management, CUDA for GPU acceleration, and various large language models like Vicuna and Meta OPT, we will feed data into a retrieval-augmented generation (RAG) model.
This local, non-cloud setup ensures maximum data security for our customers to explore their own RAG initiatives. To learn more, visit the AI Proving Ground section on the WWT platform.
Intro: AI Toolkit
- 1:40 High Level Block Diagram of Software Stack
- 3:02 Intersight, UCSX Series and the X440P (X-Fabric)
- 7:33 Automated Deployment of Ubuntu via Intersight Cloud Orchestrator
- 10:50 AI Toolkit Deployment
- 12:25 RAG Demo
Software components included:
- Ubuntu Linux for the operating system.
- GCC Compiler for development with NVIDIA CUDA.
- NVIDIA GPU drivers and CUDA for GPU acceleration.
- Miniconda for package and environment management.
- AI Monitor for system resource monitoring.
- WebUI for LLM testing and fine-tuning.
- OpenAI compatible API.
- Various LLMs such as Vicuna and Meta OPT models.
- LangChain and Chroma for document inferencing.