NVIDIA NIM
NVIDIA Corporation
Accelerate the deployment of AI inference applications with secure, ready-to-use AI model endpoints.
NVIDIA NIM, offered by NVIDIA through Azure Marketplace and deployable to Azure AI Foundry, is a set of easy-to-use microservices designed for secure, reliable deployment of high-performance AI model inferencing. Supporting a wide range of AI models, including open-source community and NVIDIA AI Foundation models, NVIDIA NIM ensures seamless, scalable AI inferencing, on-premises or in the cloud, leveraging industry-standard APIs. Together, NVIDIA NIM and Azure AI Foundry provide an optimized, operationally efficient inference platform.
These prebuilt containers support a broad spectrum of AI models, from open-source community models and NVIDIA AI Foundation models to custom AI models. NIM microservices are deployed with a single command for easy integration into enterprise-grade AI applications using standard APIs and just a few lines of code. Built on robust foundations, including inference engines such as Triton Inference Server, TensorRT, TensorRT-LLM, and PyTorch, NIM is engineered to facilitate seamless AI inferencing at scale, so you can deploy AI applications on Azure AI Foundry with confidence.
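As a rough sketch of the single-command deployment and standard-API workflow described above, the commands below launch a NIM container and query its OpenAI-compatible endpoint. The container image name and tag, the port, and the `NGC_API_KEY` variable are illustrative assumptions; consult the NVIDIA NIM documentation for the exact image and runtime flags for your model.

```shell
# Assumption: you have Docker with GPU support and a valid NGC API key.
export NGC_API_KEY=<your-ngc-api-key>

# Deploy a NIM microservice with a single command
# (image name/tag shown here is illustrative).
docker run --rm --gpus all \
  -e NGC_API_KEY \
  -p 8000:8000 \
  nvcr.io/nim/meta/llama-3.1-8b-instruct:latest

# Call the microservice through its OpenAI-compatible API
# with just a few lines of code.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "meta/llama-3.1-8b-instruct",
        "messages": [{"role": "user", "content": "Hello"}]
      }'
```

Because the endpoint follows the industry-standard OpenAI API schema, existing client libraries and applications can point at the NIM service by changing only the base URL.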
Benefits of NVIDIA NIM include:
- Ease of use: Speed time to market with prebuilt, cloud-native microservices that are continuously maintained to deliver optimized inference on NVIDIA accelerated infrastructure.
- Enterprise security and manageability: Maintain security and control of generative AI applications and data by deploying the latest AI models on your choice of NVIDIA compute in Azure AI Foundry.
- Performance and scale: Improve TCO with low-latency, high-throughput AI inference that scales in the cloud; the Llama 3.1 8B NIM delivers up to 2.6x higher throughput than an off-the-shelf deployment on H100 systems.