AI is reshaping the way we build and run businesses across industries. One persistent challenge with AI models is scaling: traditional infrastructure is not built for the dynamic scaling these models require during training or inference.
Optimising cost and reducing operational overhead calls for a modern approach that combines serverless and microservices into a flexible, scalable, and efficient workload layer. Microsoft Azure enables these two mechanisms through Azure Functions and Azure Kubernetes Service (AKS).
Serverless is a natural fit for AI deployments, especially given their unpredictable demand. The ability to run a function triggered by an AI agent in response to an event, without the overhead of managing deployments, is crucial for multi-agent AI solutions. Serverless also suits real-time image recognition, language translation, and the dynamic execution of payloads triggered by APIs, data streams, and IoT devices.
Another advantage of a serverless approach is agility: AI models can be deployed with fewer infrastructure concerns, shortening iteration cycles and enabling developers and data scientists to ship their solutions more easily and quickly. If ten years ago deploying to a development environment every one or two days was normal, nowadays 20-30 deployments per day are part of the way of working on serverless infrastructure.
Azure Functions provides the agility, scaling, and flexibility required to build such a solution on top of the Microsoft ecosystem, combining serverless benefits with strong integration with other services and the security layer of Microsoft Azure.
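As a minimal sketch of how this looks in practice, the snippet below defines an HTTP-triggered function using the Azure Functions Python v2 programming model; run_translation_model is a hypothetical stand-in for a real model call:

```python
import json

import azure.functions as func

app = func.FunctionApp()


def run_translation_model(text: str) -> str:
    # Hypothetical stand-in: a real app would call a loaded model here.
    return text


@app.route(route="translate", auth_level=func.AuthLevel.FUNCTION)
def translate(req: func.HttpRequest) -> func.HttpResponse:
    # Runs per request; the platform scales instances with demand.
    try:
        text = req.get_json().get("text", "")
    except ValueError:
        return func.HttpResponse("Expected a JSON body", status_code=400)
    return func.HttpResponse(
        json.dumps({"translation": run_translation_model(text)}),
        mimetype="application/json",
    )
```

The same function body could equally sit behind a queue or Event Grid trigger for the data-stream and IoT scenarios mentioned above.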
Running AI models in a serverless approach can be challenging and has drawbacks. For instance, a 'cold start', a delay in the execution of a function due to its initialisation, can impact the AI system's responsiveness. Other potential challenges include resource limits on CPU and memory, stateful processing, limited or no GPU support, and cost optimisation at scale.
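One common mitigation for cold starts, sketched below assuming a Python function app, is to cache the model at module scope so that only the first (cold) invocation pays the initialisation cost; _load_model is a hypothetical loader:

```python
import time

_MODEL = None  # cached in the worker process across warm invocations


def _load_model():
    # Hypothetical loader standing in for expensive initialisation
    # (downloading weights, building a runtime session, and so on).
    time.sleep(2)
    return object()


def get_model():
    # Lazy, one-time load: warm invocations on the same worker
    # reuse the cached instance instead of re-initialising.
    global _MODEL
    if _MODEL is None:
        _MODEL = _load_model()
    return _MODEL
```

For latency-sensitive workloads, pre-warmed instances on the Azure Functions Premium plan are another lever.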
Kubernetes provides an orchestrated environment for container-based workloads, enabling AI models to run and scale reliably. AKS offers this as a managed service on Azure, providing high computational resources in a cost-efficient environment.
A chatbot with NLP capabilities deployed on AKS, for example, can scale across multiple nodes of the same cluster while maintaining high availability and low latency, as sketched below.
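The sketch below uses the official Kubernetes Python client to scale out a hypothetical chatbot Deployment and confirm that its replicas have spread across nodes; in practice a Horizontal Pod Autoscaler would usually drive the replica count:

```python
from kubernetes import client, config

config.load_kube_config()  # use load_incluster_config() inside the cluster

# Raise the replica count; the scheduler spreads the pods across nodes.
client.AppsV1Api().patch_namespaced_deployment_scale(
    name="chatbot",
    namespace="default",
    body={"spec": {"replicas": 5}},
)

# Confirm where the replicas landed.
core = client.CoreV1Api()
pods = core.list_namespaced_pod("default", label_selector="app=chatbot")
for pod in pods.items:
    print(pod.metadata.name, "->", pod.spec.node_name)
```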
Compared with a purely serverless approach, Kubernetes and AKS can manage and orchestrate both GPU and CPU workloads. With native support for GPU node pools, an AI model can be trained on GPUs without additional infrastructure complexity, and Kubernetes allows organisations to share those nodes across multiple models, reducing processing time while keeping GPU costs low.
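As an illustration, the sketch below submits a pod that requests a single GPU on a hypothetical AKS GPU node pool (the image name and pool label are assumptions); the NVIDIA device plugin exposes each GPU as the schedulable resource nvidia.com/gpu:

```python
from kubernetes import client, config

config.load_kube_config()

container = client.V1Container(
    name="trainer",
    image="myregistry.azurecr.io/train:latest",  # hypothetical image
    resources=client.V1ResourceRequirements(
        # Request exactly one GPU device for this container.
        limits={"nvidia.com/gpu": "1"},
    ),
)

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gpu-training-job"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[container],
        # Hypothetical pool name: pin the pod to the GPU node pool.
        node_selector={"agentpool": "gpunp"},
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```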
Azure Functions and AKS run natively in the cloud, but there are scenarios where multi-cloud and hybrid deployments are required. For additional flexibility, Azure Arc extends AKS capabilities, allowing organisations to run AI models in non-cloud environments or at edge locations. This makes it a strong fit for organisations requiring data sovereignty and localised AI models. Healthcare is a good example: processing patients' data might require AI models that run in on-premises environments to comply with local laws.
The combination of serverless and Kubernetes in a modern computation approach enables organisations to run their AI workloads in a cost-effective, event-driven, and scalable way. Azure Functions and AKS are two services that respond to AI deployment needs with advanced orchestration capabilities. With strong containerisation support across AKS and Azure Functions, Azure Arc brings cloud capabilities to locations where an Azure region, or the cloud itself, cannot be used.
Together, Azure Arc, AKS, and Azure Functions enable customers to build, run, and manage AI applications across multi-cloud and hybrid environments while keeping the technical stack and technical debt under control.