Multi-Cloud Infrastructure Management for AI Applications
John Snow Labs is a healthcare AI company, accelerating progress in data science by providing state-of-the-art models, data and platforms.
Business Challenge
The client ran AI-driven applications across three clouds (Azure, AWS, and Oracle Cloud) to serve global customers. Managing this environment was difficult: deployments, access control, and release processes had to stay consistent across all three providers.
Key Features
Designed and managed scalable infrastructure across Azure, AWS, and Oracle Cloud for AI workloads.
Used Terraform to standardize deployments across clouds and enforce infrastructure-as-code best practices.
Implemented role-based access control (RBAC) and security policies to meet compliance requirements.
Configured GPU-enabled compute clusters for model training and inference.
Configured bare-metal GPU instances for high-performance AI workloads.
Built a CI/CD pipeline with Jenkins for infrastructure and application updates.
Used Helm to manage Kubernetes applications across clouds.
Configured Keycloak for centralized authentication and authorization.
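To illustrate the role-based access control idea mentioned above, here is a minimal Python sketch. It is not the client's implementation: the role names, permission strings, and the `is_allowed` helper are all hypothetical, and in a real deployment the roles would typically come from the identity provider (for example, from a Keycloak-issued token) rather than from a hard-coded table.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-permission table, for illustration only.
ROLE_PERMISSIONS = {
    "ml-engineer": {"training:submit", "inference:read"},
    "platform-admin": {"training:submit", "inference:read", "cluster:provision"},
    "auditor": {"inference:read"},
}


@dataclass(frozen=True)
class User:
    name: str
    roles: frozenset = field(default_factory=frozenset)


def is_allowed(user: User, permission: str) -> bool:
    """Return True if any of the user's roles grants the permission."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in user.roles
    )


if __name__ == "__main__":
    alice = User("alice", frozenset({"ml-engineer"}))
    print(is_allowed(alice, "training:submit"))    # True
    print(is_allowed(alice, "cluster:provision"))  # False
```

Unknown roles grant nothing, so the check fails closed: a user whose role is missing from the table is denied rather than allowed by default.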
Results
Unified deployments across Azure, AWS, and Oracle Cloud with Terraform.
Reduced provisioning time from days to under 30 minutes.
Cut model training times by up to 60% with GPU clusters and bare-metal instances.
Increased inference throughput to millions of predictions per day.
Strengthened compliance with role-based access and centralized identity management.
Eliminated security gaps by unifying authentication with Keycloak.
Enabled zero-downtime deployments with Jenkins-based CI/CD.
Reduced application rollout time by 70% using Helm for Kubernetes management.
Increased system reliability with proactive monitoring using Prometheus and Grafana.
Tech Stack
AWS
Azure
Oracle Cloud
Kubernetes
Helm
Terraform
Elasticsearch
Jenkins
Node.js
PostgreSQL
Docker
Python
C#
Java