How to build hotdog vs. not hotdog on 5G Edge
Deploying your first Kubernetes application in a Wavelength Zone.
By Robert Belson, Corporate Strategy, Verizon, Mark Persiko & Bryan Kobler, Sr. DevOps Engineers at Verizon Location Technology
Ever find yourself missing that fateful moment in Season 4 of the critically-acclaimed TV series, Silicon Valley, where the promise of “Shazam for food” was swiftly spoiled by a much simpler, yet indubitably comical, hotdog vs. not hotdog classifier?
In this tutorial, you can relive your favorite moments of Silicon Valley, but this time at the 5G network edge. We’ll walk through how to automate your first managed Kubernetes cluster in an AWS Wavelength Zone on EKS. Next, we’ll create deployment manifests for our inference server, API server and web host and expose select services through a NodePort. Lastly, we’ll showcase the performance of the application using a Verizon 5G Ultra Wideband-enabled device connected to Verizon 5G Edge.
If you’ve made it this far, what are you waiting for? Let’s start building.
Part 1: AWS Wavelength Zone infrastructure
To simplify things, let’s use an existing CloudFormation template for the entire environment, as well as the handy Designer feature, to unpack what will be created.
VPC and subnets
● We’ll start with a single VPC with three public subnets: two subnets in the parent region and one subnet in a Wavelength Zone. As an example, if you select us-west-2 as your region, you could have public subnets in us-west-2a and us-west-2b in the parent region and us-west-2-wl1-sfo-wlz-1, the Bay Area Wavelength Zone, as your carrier subnet.
● Next, to facilitate internet access, we attach an Internet Gateway to our VPC for our two parent region subnets and a Carrier Gateway to our VPC for our carrier subnet.
● After configuring appropriate route tables for our Internet and Carrier gateways, we add VPC Endpoints so that our worker nodes can self-register with the control plane without associating Carrier IPs to the nodes themselves.
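To make the routing split concrete, here is a minimal Python sketch that prints a CloudFormation Resources fragment with one default route per gateway. It is not the template used in this tutorial: the logical names (Vpc, RegionRouteTable, CarrierRouteTable) are hypothetical and are assumed to be defined elsewhere in the same template.

```python
import json


def gateway_resources(vpc="Vpc", region_rt="RegionRouteTable", carrier_rt="CarrierRouteTable"):
    """Build a CloudFormation Resources fragment with one default route per gateway.

    The logical names passed in are hypothetical and must exist in the full template.
    """

    def default_route(route_table, target_key, target):
        # A 0.0.0.0/0 route whose target is either an Internet or a Carrier Gateway.
        return {
            "Type": "AWS::EC2::Route",
            "Properties": {
                "RouteTableId": {"Ref": route_table},
                "DestinationCidrBlock": "0.0.0.0/0",
                target_key: {"Ref": target},
            },
        }

    region_route = default_route(region_rt, "GatewayId", "InternetGateway")
    # The route has to wait until the Internet Gateway is attached to the VPC.
    region_route["DependsOn"] = "InternetGatewayAttachment"

    return {
        "InternetGateway": {"Type": "AWS::EC2::InternetGateway"},
        "InternetGatewayAttachment": {
            "Type": "AWS::EC2::VPCGatewayAttachment",
            "Properties": {
                "VpcId": {"Ref": vpc},
                "InternetGatewayId": {"Ref": "InternetGateway"},
            },
        },
        "CarrierGateway": {
            "Type": "AWS::EC2::CarrierGateway",
            "Properties": {"VpcId": {"Ref": vpc}},
        },
        # Parent-region subnets reach the internet through the Internet Gateway.
        "RegionDefaultRoute": region_route,
        # The Wavelength subnet reaches the carrier network through the Carrier Gateway.
        "CarrierDefaultRoute": default_route(carrier_rt, "CarrierGatewayId", "CarrierGateway"),
    }


print(json.dumps(gateway_resources(), indent=2))
```

The only difference between the two routes is the target property: GatewayId for the parent region and CarrierGatewayId for the Wavelength Zone.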
EKS cluster and node groups
● In the parent region, nearly all of the heavy lifting for EKS can be done in the AWS Console. For this exercise, after creating the EKS cluster, we need to launch a node group of self-managed nodes, which can be done with an existing AWS-managed CloudFormation template.
● Note: Through this template, we need to extract the node ID of the auto-scaling group from the template outputs in order to allocate and attach a Carrier IP to the node. To learn more about allocating Carrier IP addresses, check out this previous blog.
● Now, the most important piece: To ensure that your EKS cluster can make calls to other AWS services (i.e., to manage your resources), we create an IAM role for the cluster.
● We also need an IAM role for the node group with two key policies: AmazonEKSWorkerNodePolicy and AmazonEC2ContainerRegistryReadOnly. These allow the nodes to self-register with the control plane and pull images from ECR.
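The two roles can be sketched as a CloudFormation fragment, generated here with Python. The two node policies come from the text above. The logical names ClusterRole and NodeRole are hypothetical, and AmazonEKSClusterPolicy is our assumption for the cluster role, since the text does not name its policy.

```python
import json


def assume_role_policy(service):
    """Trust policy that lets the given AWS service principal assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": service},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def managed_policy_arn(name):
    """ARN of an AWS-managed IAM policy."""
    return f"arn:aws:iam::aws:policy/{name}"


def eks_iam_roles():
    """CloudFormation Resources fragment with a cluster role and a node role."""
    return {
        # Assumed by the EKS service so the control plane can manage resources.
        "ClusterRole": {
            "Type": "AWS::IAM::Role",
            "Properties": {
                "AssumeRolePolicyDocument": assume_role_policy("eks.amazonaws.com"),
                # Assumption: the text does not name the cluster policy.
                "ManagedPolicyArns": [managed_policy_arn("AmazonEKSClusterPolicy")],
            },
        },
        # Assumed by the EC2 worker nodes in the self-managed node group.
        "NodeRole": {
            "Type": "AWS::IAM::Role",
            "Properties": {
                "AssumeRolePolicyDocument": assume_role_policy("ec2.amazonaws.com"),
                "ManagedPolicyArns": [
                    managed_policy_arn("AmazonEKSWorkerNodePolicy"),
                    managed_policy_arn("AmazonEC2ContainerRegistryReadOnly"),
                ],
            },
        },
    }


print(json.dumps(eks_iam_roles(), indent=2))
```

Node roles are often given AmazonEKS_CNI_Policy as well. It is left out here because the text names only two policies.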
At this point, you should be ready to begin configuring your EKS cluster and deploying containers. To learn more about EKS, read about our experience deploying our first EKS cluster to Verizon 5G Edge with AWS Wavelength.
Deploying our first 5G application at the edge
Mark Persiko, Sr. DevOps Engineer at Verizon Location Technology
Breaking the monolith
While this application could be run as a monolith on EC2 without EKS, here’s what could go wrong:
● Limited infrastructure resilience: If the underlying EC2 instance fails, the whole application fails with it. Moreover, if a single component fails, the entire application workflow could be halted.
● Networking overhead: If you decide to decouple your application as a series of EC2 instances behind Auto Scaling groups (e.g., one ASG for API, one ASG for Inference, etc.), the underlying connectivity could become extremely burdensome. For example, in this environment, how does the API server know the new IP address of the latest inference server, and how is traffic load-balanced? Even if you address this with a series of load balancers, you incur an unnecessary premium for a managed service mesh, load balancer, or any incremental architecture.
With EKS, we can focus on our application and EKS can take care of self-healing, auto scaling, networking and — most importantly — cost efficiency.
Is it lunchtime yet?
Jokes aside, let’s start unpacking our application architecture. Now that we have an EKS cluster with a worker node in a Wavelength Zone, we need to decide how to deploy the application itself to EKS.
In this example, we’ve already done the heavy lifting to develop the application and publish our application through publicly-available container images on ECR.
So, let’s map out our deployment. We know that we want our application to classify images as a “hotdog” or “not hotdog,” but how should we break this down into individual microservices? Here’s one approach:
● Web: Web server with a user interface that allows visitors to upload images to classify as hotdog or not hotdog
● API: API server that accepts incoming requests consisting of images the user would like to classify
● Inference: Image classification engine itself (i.e., PyTorch + TorchServe default object_detector module)
Now that we’ve identified the individual components, let’s layer in a series of constraints:
● For the sake of simplicity, let’s assume that services needing internet access will be exposed via NodePort
● For cost-saving purposes, the Wavelength node group is of size 1. However, for optimal performance of our inference engine, consider running that node group on a g4dn.2xlarge (i.e., GPU-based) instance
Now we’re ready to create a deployment roadmap. If we assume that each microservice we delineated above (Web, Inference and API) has the appropriate Kubernetes objects, such as a Deployment and a Service, here’s how we could think about the workflow:
● Step 1: Client navigates to the app via the worker node’s IP address and the NodePort that exposes the web microservice
● Step 2: Uploaded image is sent to the API Deployment through the worker node IP and the NodePort associated with that Deployment
● Step 3: Image is passed from the API to the Inference service using the Kubernetes Service object’s DNS name
● Step 4: Classification is returned through the API to the web interface, where users can see whether the image is a hotdog or not a hotdog
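The addressing behind these four steps can be sketched in a few lines of Python. The ports are the ones chosen later in this tutorial (30007 for web, 30008 for the API, 8080 for inference). The Service name inference-service, the default namespace and the 192.0.2.10 address are placeholders we made up for illustration.

```python
def node_port_url(node_ip, node_port):
    """URL a client outside the cluster uses to reach a NodePort Service."""
    return f"http://{node_ip}:{node_port}"


def service_dns(service, namespace="default"):
    """Cluster-internal DNS name that Kubernetes assigns to a Service."""
    return f"{service}.{namespace}.svc.cluster.local"


def request_path(carrier_ip, inference_service="inference-service"):
    """List each hop of the workflow with the address it uses."""
    return [
        # Step 1: the browser loads the UI through the web NodePort.
        ("client -> web", node_port_url(carrier_ip, 30007)),
        # Step 2: the uploaded image goes to the API through its own NodePort.
        ("client -> api", node_port_url(carrier_ip, 30008)),
        # Step 3: the API reaches inference by Service DNS name, inside the cluster.
        ("api -> inference", f"http://{service_dns(inference_service)}:8080"),
        # Step 4: the classification travels back along the same path.
    ]


for hop, address in request_path("192.0.2.10"):
    print(f"{hop}: {address}")
```

Only the first two hops use the node’s IP address. The inference hop never leaves the cluster, which is why that Service does not need a NodePort.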
Part 2: Deploy Web Host
After authenticating to your cluster, it’s time to create a new file, webDeployment.yaml, for your first deployment manifest. To begin, we create a Deployment called web-deployment and open port 80 on the container itself. Next, we use a NodePort Service to expose the Deployment on each node’s IP at a static port (we chose 30007).
Note: Be sure that your node group security group allows access to port 30007 (i.e., the NodePort itself).
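As a rough sketch of what webDeployment.yaml could contain, the Python snippet below renders a Deployment and a NodePort Service from a template. Only the name web-deployment, container port 80 and NodePort 30007 come from the text. The web-service name, the app: web label, the single replica and the image placeholder are assumptions for illustration, so substitute the image URI published for this walkthrough.

```python
WEB_MANIFEST = """\
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment
spec:
  replicas: 1               # assumption: one replica for a single-node group
  selector:
    matchLabels:
      app: web              # hypothetical label
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: {image}
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: web-service         # hypothetical name
spec:
  type: NodePort
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 80
      nodePort: 30007
"""


def render_web_manifest(image):
    """Fill the container image into the manifest template."""
    return WEB_MANIFEST.format(image=image)


# Placeholder image: replace with the real ECR image URI before applying.
print(render_web_manifest("REPLACE_WITH_WEB_IMAGE_URI"))
```

Redirecting the output to webDeployment.yaml gives a file in the shape that kubectl expects once the image placeholder has been replaced.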
Lastly, be sure to apply the deployment manifest.
kubectl apply -f webDeployment.yaml
Part 3: Deploy API Proxy
Now that we have our web host, we need our API proxy to take in the requests (i.e., images) to forward onto the inference engine.
For your second deployment manifest, create a new file, apiDeployment.yaml. In this file, we create a Deployment for the API and open port 5000 on the container itself. Next, we use a NodePort Service to expose the Deployment on each node’s IP at a static port (we chose 30008).
Note: Just as with the webDeployment, make sure that your node group security group allows access to port 30008.
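If the node group’s security group is managed with CloudFormation, the two NodePort rules could be generated along these lines. This is a sketch under assumptions: the logical name NodeSecurityGroup is hypothetical, and the wide-open 0.0.0.0/0 source range is only a default for illustration that you will likely want to narrow.

```python
import json

# Default range from which Kubernetes allocates NodePorts.
NODE_PORT_RANGE = range(30000, 32768)


def node_port_ingress(group_ref, ports, cidr="0.0.0.0/0"):
    """Build one TCP ingress rule per NodePort for the given security group."""
    resources = {}
    for port in ports:
        if port not in NODE_PORT_RANGE:
            raise ValueError(f"{port} is outside the default NodePort range")
        resources[f"NodePortIngress{port}"] = {
            "Type": "AWS::EC2::SecurityGroupIngress",
            "Properties": {
                "GroupId": {"Ref": group_ref},
                "IpProtocol": "tcp",
                "FromPort": port,
                "ToPort": port,
                "CidrIp": cidr,
            },
        }
    return resources


# 30007 (web) and 30008 (API) are the NodePorts chosen in this tutorial.
print(json.dumps(node_port_ingress("NodeSecurityGroup", [30007, 30008]), indent=2))
```

The inference port (8080) is deliberately absent, because that Service is only reachable from inside the cluster.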
Next, be sure to apply the configuration with kubectl.
kubectl apply -f apiDeployment.yaml
Part 4: Deploy Inference Engine
Now that we have our web host and API proxy, it’s time to deploy the PyTorch inference engine. For this deployment manifest, create one last file, infDeployment.yaml. In this file, we create a Deployment for the inference engine and open port 8080 on the container itself, which is the port TorchServe uses by default to serve inference requests.
Because this container does not need internet access, we only expose the Service as ClusterIP (we chose port 8080). As with the other two manifests, be sure to apply it.
kubectl apply -f infDeployment.yaml
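For reference, the infDeployment.yaml described above could look roughly like the output of the Python sketch below. Port 8080 and the ClusterIP type come from the text. The names inference-deployment and inference-service, the app: inference label and the image placeholder are assumptions for illustration.

```python
INFERENCE_MANIFEST = """\
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference-deployment   # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: inference           # hypothetical label
  template:
    metadata:
      labels:
        app: inference
    spec:
      containers:
        - name: inference
          image: {image}
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: inference-service      # hypothetical name; the API would call it by this DNS name
spec:
  type: ClusterIP              # reachable only from inside the cluster
  selector:
    app: inference
  ports:
    - port: 8080
      targetPort: 8080
"""


def render_inference_manifest(image):
    """Fill the container image into the manifest template."""
    return INFERENCE_MANIFEST.format(image=image)


# Placeholder image: replace with the real ECR image URI before applying.
print(render_inference_manifest("REPLACE_WITH_INFERENCE_IMAGE_URI"))
```

Unlike the web and API Services, there is no nodePort field here, so nothing is opened on the worker node itself and no extra security group rule is needed.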
Congratulations! You have finished deploying the hotdog vs. not hotdog application. To view it in action, visit the Carrier IP of any node in your cluster on the web NodePort from Part 2 (30007 in this walkthrough).
You’ll see that the web application automatically recognizes the carrier IP of the underlying node. All you need to do is upload an image, click Process Image, and Verizon 5G Edge takes care of the rest.
Thanks to Mike Coleman and the AWS team for maintaining an awesome ML inference lab on AWS Wavelength. To check out the lab, join a Verizon 5G Edge and AWS Wavelength Immersion Day!