Azure HPC PRO: 8-Week PoC

OAKWOOD SYSTEMS GROUP INC

Oakwood’s 8-week Azure HPC PRO PoC delivers a scalable, mid-tier HPC solution with AKS support, advanced scheduling, and high-speed connectivity for moderate workloads.

Oakwood’s Azure HPC PRO PoC is purpose-built for organizations with moderate compute demands that are ready to move beyond the constraints of on-premises infrastructure. This 8-week engagement delivers a cloud-based HPC environment that combines the scalability of Azure with high-performance storage such as Azure Premium SSD and Azure NetApp Files. Containerized workloads are supported via Azure Kubernetes Service (AKS), while Slurm schedules jobs across the compute nodes. With ExpressRoute providing private, high-speed connectivity, this PoC is suited to teams seeking a cost-effective, reliable, and production-relevant path to adopting Azure HPC for simulation, modeling, or AI/ML workloads.

Delivery Approach:

  •	Assessment: Identify core performance requirements for mid-range workloads.
  •	Design: Create a balanced environment with container support, mid-tier storage, and computing power.
  •	Deployment: Configure CycleCloud, Slurm, AKS, and premium storage options like NetApp Files.
  •	Testing: Execute moderate HPC tasks and analyze scalability.
  •	Review and Report: Deliver a report detailing workload management, performance metrics, and production readiness.
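
As an illustration of what "analyze scalability" in the Testing step can mean in practice, the sketch below computes strong-scaling speedup and parallel efficiency from wall-clock times measured at different node counts. It is a minimal example and not Oakwood's actual test method; the function name `strong_scaling`, the `ScalingPoint` type, and the sample timings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScalingPoint:
    nodes: int
    speedup: float      # baseline time / time at this node count
    efficiency: float   # speedup divided by the increase in node count


def strong_scaling(timings: dict[int, float]) -> list[ScalingPoint]:
    """Compute speedup and efficiency relative to the smallest run.

    `timings` maps node count to wall-clock seconds for the same
    fixed-size job (hypothetical input format).
    """
    if not timings:
        raise ValueError("timings must not be empty")
    base_nodes = min(timings)
    base_time = timings[base_nodes]
    points = []
    for nodes in sorted(timings):
        speedup = base_time / timings[nodes]
        efficiency = speedup / (nodes / base_nodes)
        points.append(ScalingPoint(nodes, speedup, efficiency))
    return points


if __name__ == "__main__":
    # Made-up timings for illustration only, not measured results.
    sample = {1: 1200.0, 2: 640.0, 4: 350.0, 8: 200.0}
    for p in strong_scaling(sample):
        print(f"{p.nodes:>3} nodes  speedup {p.speedup:5.2f}  "
              f"efficiency {p.efficiency:6.1%}")
```

Efficiency that stays close to 100% as nodes are added suggests the workload scales well; a steep drop points to a bottleneck such as storage throughput or inter-node communication.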

Deliverables:

  •	Customized CycleCloud deployment with mid-tier compute resources.
  •	Integration of Slurm for job scheduling and AKS for containerized workloads.
  •	Enhanced storage using Azure Premium SSD or NetApp Files.
  •	Moderate HPC workload testing to assess scalability and performance.

Notes:

  •	After the 8 weeks, the Oakwood team will spend 2 additional weeks working with the organization's users to further tune and manage the environment.
  •	The number of jobs the organization's users can run during this PoC is limited.
  •	At the end of the engagement, Oakwood and the organization will discuss whether to move forward with a full production environment or decommission the infrastructure stood up during the PoC.

Is Oakwood’s Azure HPC PRO PoC right for you?

  •	Targeted Use: Moderate computational demands and flexible workload management.
  •	Complexity & Level of Effort: Moderate complexity, with a balanced setup that includes containerized workload support via AKS, mid-tier storage, and advanced job scheduling with Slurm. Higher level of customization compared to the CORE PoC.
  •	Scalability: Offers moderate scalability, ideal for organizations seeking a reliable solution for moderately complex applications and workloads.

Recommendation Guide:

  •	Choose HPC PRO if: You need a mid-level HPC environment that balances cost and performance, with containerized workloads and scalable storage options.
  •	Consider HPC CORE if: Your workload is lightweight and doesn’t require containerization or high storage performance.
  •	Consider HPC MAX if: You have complex workloads with high-performance needs, such as AI/ML simulations, requiring advanced networking and storage.