GitOps for Telcos: Carrier Grade DevOps

How Deutsche Telekom is harnessing GitOps practices to achieve autonomous Kubernetes-based 5G infrastructure.

The sweet spot of the Cloud Native Telco opportunity is the potential to transform the DevOps model so that operators can deploy code far more frequently and, as a result, launch new digital services much faster.

Central to this is learning from the enterprise sector and its best practices, such as ‘GitOps’: an approach that uses Git as the central, single source of truth for application development and deployment.

Enterprise organizations like Chick-fil-A utilize these principles. Their digital properties are powered by a Digital Experience Engine (DXE), a cloud-based microservices architecture composed of about one hundred services, running in a Kubernetes-based application platform.

They utilize GitOps to manage the complexity of rolling out application updates across a distributed mobile and POS digital business system.

Carrier Grade DevOps

As Mike Kress writes for the Container Journal, Telcos can reuse these same principles for the same reasons. GitOps can prove essential to the goals of building the Cloud Native Telco, as rolling out new capabilities like 5G requires management of a vast array of new devices.

This scale can prove too vast for enterprise DevOps practices that deal with a relatively small number of applications, and so Mike defines GitOps as ‘Carrier Grade DevOps’, where operators can define the overall network and device configuration and check it into an auditable revision control system.

A software-based agent checks the repository, detects whether any device configurations differ from it, and adjusts them. The end result is the same as with a push-based script: a network of devices, containers, and software components ends up with a new configuration. However, with GitOps, there is a central, auditable changelog, and changes are pulled by the network rather than pushed by a script.
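
The pull-based loop described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the tooling used by any operator: the device names, settings, and the find_drift and reconcile helpers are hypothetical, and in-memory dictionaries stand in for the Git repository and the live network.

```python
from typing import Dict

Config = Dict[str, str]


def find_drift(desired: Dict[str, Config], actual: Dict[str, Config]) -> Dict[str, Config]:
    """Return, per device, the settings whose live value differs from the repository."""
    drift: Dict[str, Config] = {}
    for device, wanted in desired.items():
        live = actual.get(device, {})
        changed = {key: value for key, value in wanted.items() if live.get(key) != value}
        if changed:
            drift[device] = changed
    return drift


def reconcile(desired: Dict[str, Config], actual: Dict[str, Config]) -> Dict[str, Config]:
    """Apply only the drifted settings to the live state and return what was changed."""
    drift = find_drift(desired, actual)
    for device, changes in drift.items():
        actual.setdefault(device, {}).update(changes)
    return drift


# Hypothetical desired state, as it might be read from a Git checkout.
desired_state = {
    "cell-site-01": {"firmware": "2.4.1", "mtu": "9000"},
    "cell-site-02": {"firmware": "2.4.1", "mtu": "9000"},
}

# Hypothetical live state reported by the devices; cell-site-02 has drifted.
live_state = {
    "cell-site-01": {"firmware": "2.4.1", "mtu": "9000"},
    "cell-site-02": {"firmware": "2.3.0", "mtu": "9000"},
}

applied = reconcile(desired_state, live_state)
print(applied)  # {'cell-site-02': {'firmware': '2.4.1'}}
```

A real agent would run this comparison on a schedule or in response to repository changes; running reconcile a second time here returns an empty result because the live state already matches the repository.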

Because a versioned repository contains the configuration, it is easy to see what changed if there are problems. This methodology makes the system both more secure and diagnosable while simplifying the overall configuration process.
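
As a rough illustration of why a versioned configuration is easy to diagnose, the sketch below compares two revisions of a device configuration and lists only the lines that changed. The file contents and the changed_lines helper are hypothetical; in practice the two revisions would come from the Git history rather than from inline strings.

```python
import difflib
from typing import List

# Hypothetical contents of the same config file at two revisions.
previous_revision = """\
device: cell-site-02
firmware: 2.3.0
mtu: 9000
"""

current_revision = """\
device: cell-site-02
firmware: 2.4.1
mtu: 9000
"""


def changed_lines(old: str, new: str) -> List[str]:
    """Return the removed ('-') and added ('+') lines between two revisions."""
    diff = list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="previous",
            tofile="current",
            lineterm="",
        )
    )
    # Skip the two file header lines, then keep only additions and removals.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


for line in changed_lines(previous_revision, current_revision):
    print(line)
```

Here the only lines reported are the old and new firmware values, which is the kind of narrow answer an operator needs when tracing a problem back to a specific change.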

Deutsche Telekom: The Roadmap to the Fully Cloud Native Telco

A leading example of exactly this is Deutsche Telekom.

This New Stack interview explores their Cloud Native journey and adoption of Kubernetes, notably the challenge of doing so given their legacy telco infrastructure and the fact that telcos aren’t necessarily technology leaders. They trace the evolution from the traditional process of simply installing vendor ‘boxes’ in the early 2G era, through the move to virtualized services on VMs for 3G/4G, to the Kubernetes-based microservices they are now pursuing for 5G.

Speaking on a CNCF webinar, Vuk Gojnic explains in detail how they manage Kubernetes in a Telco to operate their 5G services with a relatively small team of ten SREs. He highlights how they created their own distribution, ‘Das Schiff’ (GitHub repo), for Cluster as a Service.

From 4m:05s he moves on to defining the role of GitOps, using Cluster API and Flux CD to achieve the self-management capability essential to managing the scale of Telco infrastructure. He builds on this further in this Weave.works webinar, showcasing their ‘GitOps Loop’, the design model for the self-managing system. Deutsche Telekom is a flagship case study for Weave.works.

In another CNCF presentation, Michal Sewera and Samy Nitsche of DT provide a highly detailed analysis of the Cloud Native Telco model, again emphasizing it as an evolution from ‘box’ deployment to the new Kubernetes software-based world.

From 8m:20s they demonstrate how it is applied to a 5GC architecture; from 10m:30s, the challenges of introducing this new paradigm into the organization and the new software practices that are employed; from 16m:30s, a decomposition of the Das Schiff system; and from 20m:08s, how they utilize GitOps.

Wrapping up from 22m:25s, they provide valuable insights into the challenges they’ve met as part of this major transformation, most notably the new software practices and how they affect goals such as live release updates, configuration management, eBPF-based network functions, and integrating software testing into this new model.
