Carrier Grade DevOps – How Telcos Utilize GitOps for High Performance Software Engineering

As VentureBeat describes, enterprise DevOps teams are bypassing the plethora of management consoles required to manage their applications and are instead managing them from within the code itself, through ‘GitOps’ automation.

Enterprise organizations like Chick-fil-A utilize these practices.

Their digital properties are powered by a Digital Experience Engine (DXE), a cloud-based microservices architecture composed of about one hundred services, running in a Kubernetes-based application platform.

They utilize GitOps to manage the complexity of rolling out application updates across a distributed mobile and POS digital business system.

As the term suggests, GitOps is an approach based on using Git as the single source of truth for application development and deployment. Both your infrastructure and your application code are governed by that source of truth, allowing development teams to increase velocity and improve system reliability.

GitOps.tech offers this intro guide, where they state: “The fundamental idea of GitOps can be summarized as operations managed and performed in a declarative way with Git as the source-of-truth system.”

Your system configuration is defined and stored in a version control system, and software agents detect when it changes and automatically update the production environment to match.
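The agent behaviour described above is often called a reconciliation loop. The following is a minimal sketch of that idea, not the implementation of any particular GitOps tool: the service names, image tags, and the `plan` and `reconcile` helpers are all hypothetical, and the "environment" is just an in-memory dictionary.

```python
def plan(desired: dict, actual: dict) -> list:
    """Return the (action, name) steps needed to make `actual` match `desired`."""
    steps = []
    for name, spec in desired.items():
        if name not in actual:
            steps.append(("create", name))
        elif actual[name] != spec:
            steps.append(("update", name))
    for name in actual:
        if name not in desired:
            steps.append(("delete", name))
    return steps


def reconcile(desired: dict, actual: dict) -> list:
    """Apply the plan to `actual` in place and return the steps that were taken."""
    steps = plan(desired, actual)
    for action, name in steps:
        if action == "delete":
            del actual[name]
        else:
            actual[name] = dict(desired[name])
    return steps


# Hypothetical desired state, as it might be parsed from files in a Git repository.
desired = {
    "orders-api": {"image": "orders:1.4", "replicas": 3},
    "menu-api": {"image": "menu:2.0", "replicas": 2},
}
# Hypothetical live state reported by the running environment.
actual = {
    "orders-api": {"image": "orders:1.3", "replicas": 3},
    "legacy-api": {"image": "legacy:0.9", "replicas": 1},
}

steps = reconcile(desired, actual)
print(steps)
```

Running `reconcile` a second time returns an empty list, because the environment already matches the repository. A real agent would repeat this comparison on a schedule or whenever the repository changes.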

This approach brings many benefits:

  • Your apps can be easily deployed to Kubernetes and rolled back. More importantly, when disaster strikes, your cluster’s infrastructure can also be dependably and quickly reproduced. Rollbacks become trivial: you can use `git revert` to return to your previous application state.
  • When you use Git workflows to manage your cluster, you automatically gain a convenient audit log of all cluster changes outside of Kubernetes. An audit trail of who did what to your cluster, and when, can be used to meet SOC 2 compliance requirements and ensure stability.
  • Continuous deployment automation with an integrated feedback control loop shortens Mean Time to Deployment. Your team can ship 30-100 times more changes per day, increasing overall development output 2-3 times.
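The rollback and audit-trail benefits both follow from the same property: history is only ever appended to. The sketch below models that with a simplified in-memory history. It is not Git itself, and the `Commit` and `ConfigHistory` names, authors, and image tags are hypothetical. Reverting the most recent change adds a new entry that restores the earlier state, so the log still records who did what.

```python
import copy
from dataclasses import dataclass


@dataclass
class Commit:
    author: str
    message: str
    state: dict


class ConfigHistory:
    """Append-only history of configuration states (a toy stand-in for a Git branch)."""

    def __init__(self, initial_state: dict, author: str):
        self.commits = [Commit(author, "initial state", copy.deepcopy(initial_state))]

    @property
    def head(self) -> dict:
        """The current desired state, i.e. the state of the latest commit."""
        return copy.deepcopy(self.commits[-1].state)

    def commit(self, author: str, message: str, state: dict) -> None:
        self.commits.append(Commit(author, message, copy.deepcopy(state)))

    def revert_last(self, author: str) -> None:
        """Undo the latest commit by adding a new commit with the prior state."""
        if len(self.commits) < 2:
            raise ValueError("nothing to revert")
        undone, previous = self.commits[-1], self.commits[-2]
        self.commit(author, f'Revert "{undone.message}"', previous.state)

    def audit_log(self) -> list:
        return [f"{c.author}: {c.message}" for c in self.commits]


history = ConfigHistory({"image": "orders:1.3"}, author="alice")
history.commit("bob", "bump orders to 1.4", {"image": "orders:1.4"})
history.revert_last("carol")

print(history.head)
print(history.audit_log())
```

Because the revert is itself a recorded change, the agent that watches the repository rolls the environment back through the same path it uses for any other update.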

Codefresh provides this helpful guide to explain the relationship between GitOps and DevOps:

“GitOps enhances DevOps by incorporating Git throughout the software delivery process, making it easier to orchestrate projects and keep them in sync. The end goal is to achieve smoother, faster, and more reliable software development and delivery.

GitOps pipelines use Kubernetes concepts, so they are easy to adopt by teams who already work with Kubernetes. They build on traditional DevOps practices, so changes to existing workflows are minimal for teams that have invested time in automating their software delivery.”

Carrier Grade DevOps

As Mike Kress writes for the Container Journal, telcos can apply these same principles for the same reasons. GitOps can prove essential to the goal of building the Cloud Native Telco, as rolling out new capabilities like 5G requires managing a vast array of new devices.

This scale can prove too vast for enterprise DevOps practices that deal with a relatively small number of applications, and so Mike defines GitOps as ‘Carrier Grade DevOps’, where operators define the overall network and device configuration and check it into an auditable revision control system.

A software-based agent checks the repository, detects whether any device configurations differ from it, and adjusts them. The end result is the same as with a push-based script: a network of devices, containers, and software components ends up with a new configuration. However, with GitOps, there is a central, auditable changelog, and changes are pulled by the network rather than pushed by a script.

Because a versioned repository contains the configuration, it is easy to see what changed if there are problems. This methodology makes the system both more secure and diagnosable while simplifying the overall configuration process.
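As a rough illustration of "seeing what changed", the sketch below compares two revisions of a device configuration using Python's standard `difflib` module. The `what_changed` helper, the revision labels, and the configuration keys are hypothetical. In practice the two texts would come from two revisions in the repository.

```python
import difflib


def what_changed(old: str, new: str, old_label: str, new_label: str) -> list:
    """Return only the removed ('-') and added ('+') lines between two config texts."""
    diff = list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=old_label,
            tofile=new_label,
            lineterm="",
        )
    )
    # The first two lines are the '---' and '+++' file headers; skip them,
    # then drop hunk markers ('@@') and unchanged context lines.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# Hypothetical configuration for one device at two revisions.
rev_a = "hostname: cell-site-17\nmtu: 1500\nvlan: 210\n"
rev_b = "hostname: cell-site-17\nmtu: 9000\nvlan: 210\n"

changes = what_changed(rev_a, rev_b, "rev-a", "rev-b")
print(changes)
```

If a problem appears after a rollout, a comparison like this narrows the investigation to the specific settings that differ between the last known-good revision and the current one.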

This New Stack interview explores Deutsche Telekom’s Cloud Native journey and adoption of Kubernetes, notably the challenge of doing so given its legacy telco infrastructure and the fact that telcos aren’t necessarily technology leaders. It traces the evolution from the traditional process of simply installing vendor ‘boxes’ during the early 2G era, through a move to virtualized services running on VMs for 3G/4G, to Kubernetes-based microservices for 5G.

Speaking on a CNCF webinar, Vuk Gojnic explains in detail how Kubernetes is managed in a telco to operate 5G services with a relatively small team of ten SREs. He highlights how they created their own distribution, ‘Das Schiff’ (GitHub repo), for Cluster as a Service.

From 4m:05s he moves on to defining the role of GitOps, using Cluster API and Flux CD to achieve the self-management capability essential to managing the scale of telco infrastructure. He builds on this further in this Weave.works webinar, showcasing their ‘GitOps Loop’, the design model for the self-managing system. Deutsche Telekom is a keynote case study for Weave.works.

In another CNCF presentation, Michal Sewera and Samy Nitsche of DT provide a highly detailed analysis of the Cloud Native Telco model, again emphasizing it as an evolution from ‘box’ deployment to the new Kubernetes software-based world.

From 8m:20s they demonstrate how it is applied to a 5GC architecture; from 10m:30s they cover the challenges of introducing this new paradigm into the organization and the new software practices that are employed; from 16m:30s they decompose the Das Schiff system; and from 20m:08s they show how they utilize GitOps.

Wrapping up from 22m:25s, they provide valuable insights into the challenges they’ve met as part of this major transformation, most notably the new software practices and how they impact goals such as live release updates, configuration management, eBPF-based network functions and integrating software testing into this new model.

 
