OpenShift

Containers and OpenShift

Container Technology

Velocity Software has released support for performance management of all of the container technologies, including zCX on z/OS. The standard SNMP data collector provided by Velocity Software now includes additional metrics that identify the containers and pods within a container server.

The container technologies include:

Docker: The original research is from 2016, when customers started implementing Docker on Linux on z. There is a Docker analysis discussion produced at that time.
Red Hat OpenShift: Marketed by IBM and Red Hat, RHOS is based on Kubernetes and has started attracting customers. It is in production in some of the larger shops. Due to its large overhead, IBM provides 3 additional IFLs to lower the associated cost of operations.
Rancher (SUSE): Produced by SUSE, Rancher is a much lighter container management architecture. Rancher also uses Kubernetes.
zCX: Running on a zIIP managed by z/OS, this is still Linux and performance data is acquired using the same methods. The CPU numbers are likely not as accurate as when running on an IFL under z/VM, since accurate SMT CPU metrics are not provided in this environment.

Because all of these technologies use SMT, there is a separate discussion on SMT metrics.

Performance Management for Container Implementations

Velocity Software's objective is always full performance management. Full performance management includes:

Capacity Planning: Provide long-term metrics allowing for full capacity planning at all levels.
Performance Analysis: Provide metrics for analyzing performance problems.
Chargeback: Provide valid metrics allowing for chargeback for workloads to as granular a level as required.
Alert Management: Provide the ability to detect problems and pass on alerts to operations within any architecture chosen by the enterprise.

To understand container technology, note that there is no extra level of virtualization. There is just a container manager process that owns the resources and allows spawned processes (containers) to use those resources.
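Because containers are ordinary Linux processes, a process can be tied to its container by looking at its control group membership. The sketch below illustrates that idea only; it is not a description of how zVPS collects its data. It assumes a cgroup v2 path in the style commonly produced by the systemd cgroup driver with the CRI-O runtime, and the sample line, pod UID, container ID and the classify function are all made up for illustration.

```python
import re

# Hypothetical line in the style of /proc/<pid>/cgroup on a Kubernetes node.
# The pod UID and container ID below are invented for this example.
SAMPLE = (
    "0::/kubepods.slice/kubepods-burstable.slice/"
    "kubepods-burstable-pod3f1c2a7e_9b1d_4c55_8e0a_1234567890ab.slice/"
    "crio-0123456789abcdef.scope"
)

# Assumed naming pattern: "...pod<uid with underscores>.slice/crio-<id>.scope"
POD_RE = re.compile(r"pod([0-9a-f_]+)\.slice")
CTR_RE = re.compile(r"crio-([0-9a-f]+)\.scope")


def classify(cgroup_line):
    """Return ("container", pod_uid, container_id) or ("platform", None, None)."""
    pod = POD_RE.search(cgroup_line)
    ctr = CTR_RE.search(cgroup_line)
    if pod and ctr:
        # The cgroup name uses underscores where the pod UID has dashes.
        return ("container", pod.group(1).replace("_", "-"), ctr.group(1))
    # Anything outside a pod slice is treated as a platform process.
    return ("platform", None, None)


print(classify(SAMPLE))
print(classify("0::/system.slice/sshd.service"))
```

A process whose cgroup path has no pod and container component falls into the platform category, which mirrors the absence of any extra virtualization layer: the kernel sees one flat set of processes.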

To provide metrics suitable for chargeback, data is needed to understand the resources consumed by:

Platform processes: the overhead processing not associated with the container management or containers.
Container metrics: combining all processes associated with a container.
Pod metrics: combining all containers associated with a pod.

With Velocity Software's unique approach that collects all process data each minute, all needs of chargeback are available from zVPS, the Velocity Performance Suite. The container provided by Velocity Software to expose the metrics also collects container configuration information. This information identifies the pods, the pod names, the container names and associated processes.
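The roll-up from process data to container and pod totals can be sketched as follows. This is a minimal illustration, not the zVPS implementation: the data structures, the roll_up function and the systemd sample are assumptions made for the example, while the pod, container and process names and CPU values are borrowed from the ESAK8S2 report shown later in this article.

```python
from collections import defaultdict

# Hypothetical per-process samples for one interval: (pid, process name, CPU %).
PROCESS_SAMPLES = [
    (1817314, "promethe", 42.1),
    (1817435, "thanos", 0.15),
    (1817486, "oauth-pr", 0.57),
    (3207, "node_exp", 0.38),
    (1, "systemd", 0.20),  # invented platform process with no container mapping
]

# Hypothetical configuration data: pid -> (pod name, container name).
CONTAINER_CONFIG = {
    1817314: ("prometheus-k8s-1", "prometheus"),
    1817435: ("prometheus-k8s-1", "thanos-sidecar"),
    1817486: ("prometheus-k8s-1", "prometheus-proxy"),
    3207: ("node-exporter-chc49", "node-exporter"),
}


def roll_up(samples, config):
    """Sum CPU by container and by pod; unmapped processes count as platform."""
    by_container = defaultdict(float)
    by_pod = defaultdict(float)
    platform = 0.0
    for pid, _name, cpu in samples:
        mapping = config.get(pid)
        if mapping is None:
            platform += cpu
            continue
        pod, container = mapping
        by_container[(pod, container)] += cpu
        by_pod[pod] += cpu
    return dict(by_container), dict(by_pod), platform


by_container, by_pod, platform = roll_up(PROCESS_SAMPLES, CONTAINER_CONFIG)
for pod, cpu in sorted(by_pod.items()):
    print(f"{pod:24s} {cpu:6.2f}")
print(f"{'(platform)':24s} {platform:6.2f}")
```

Keeping the process samples and the configuration mapping separate means the same process data can be charged back at whichever level is required: process, container, pod or platform.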

RHOS servers are closed servers in that the platform cannot be modified to add instrumentation. Velocity Software provides a container that can be installed within the normal architecture and from which full metrics are available. This includes the container and pod configurations needed for full accounting.

RHOS provides several pods and containers that are part of managing the container workload. These account for the typical overhead of three IFLs.

Case Study of Production Environment

zVPS collects metrics to cover chargeback requirements at all levels possible. From a typical Linux environment, the ability to charge back for application resource consumption was provided in the early Linux on mainframe days. This technology now includes the metrics defining container and pod configuration indexes allowing resource consumption by pod and by container to be exposed.

From a Linux perspective, part of the research was to understand total resource consumption and the resources consumed by RHOS functions such as Prometheus and the kube functions. The following example shows one server on our test system with no workload, running only the standard operational pods. Note the pods and the containers within those pods; the list of both is very extensive for RHOS. Because the CPU and real storage (RAM) resource consumption is available, the data can be used for chargeback as well as capacity planning.

Report: ESAK8S2      Kubernetes Resource Utilization Report
Monitor initialized: 05/02/23 at 18:54:30 on 8562 serial 040F78
-------------------------------------------------------------------
 
NODE/                    <---Container-->  <--Container CPU------->
Time/ PodName            <--Process ID-->  <------CPU Percents---->
Date   ContainerName      ProcID ProcName   Tot  sys user syst usrt
------------------------  ------ --------  ---- ---- ---- ---- ----
rhoscp1
   console-operator-59d
    console-operator       17395 console   0.62 0.12 0.50    0    0
   openshift-controller
    openshift-controller   29430 cluster-  0.38 0.08 0.30    0    0
   kube-controller-mana
    kube-controller-mana   12030 cluster-  1.15 0.15 1.00    0    0
   oauth-openshift-64b4
    oauth-openshift        12987 oauth-se  0.50 0.05 0.45    0    0
   apiserver-d84c8f947-
    oauth-apiserver        28826 oauth-ap  3.24 0.17 3.07    0    0
   packageserver-5f99c6
    packageserver          18982 package-  0.35 0.08 0.27    0    0
   machine-config-opera
    machine-config-opera   12098 machine-  0.17 0.02 0.15    0    0
   insights-operator-7f
    insights-operator      14827 insights  0.30 0.05 0.25    0    0
   node-exporter-chc49
    node-exporter           3207 node_exp  0.38 0.13 0.25    0    0
   authentication-opera
    authentication-opera   14796 authenti  1.47 0.22 1.25    0    0
   kube-apiserver-rhosc
    kube-apiserver          3378 watch-te  11.1 1.04 10.1    0    0
    kube-apiserver-cert-    5683 cluster-  0.18 0.08 0.10    0    0
    kube-apiserver-check    6166 cluster-  0.47 0.08 0.38    0    0
   prometheus-k8s-1
    prometheus           1817314 promethe  42.1 0.95 41.1    0    0
    thanos-sidecar       1817435 thanos    0.15 0.02 0.13    0    0
    prometheus-proxy     1817486 oauth-pr  0.57 0.03 0.53    0    0
   prometheus-operator-
    prometheus-operator-   11347 promethe  0.17 0.02 0.15    0    0
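For chargeback by pod, the container rows of a listing like the one above can be summed per pod. The following is a rough sketch that assumes the layout seen in this excerpt: a node name with no indent, an indented pod name on its own line, and container rows with eight whitespace-separated fields. The pod_cpu_totals function is hypothetical and is not part of zVPS, and the embedded text is copied from two of the pods in the report.

```python
from collections import defaultdict

# Two pods copied from the ESAK8S2 excerpt above.
REPORT = """\
rhoscp1
   kube-apiserver-rhosc
    kube-apiserver          3378 watch-te  11.1 1.04 10.1    0    0
    kube-apiserver-cert-    5683 cluster-  0.18 0.08 0.10    0    0
    kube-apiserver-check    6166 cluster-  0.47 0.08 0.38    0    0
   prometheus-k8s-1
    prometheus           1817314 promethe  42.1 0.95 41.1    0    0
    thanos-sidecar       1817435 thanos    0.15 0.02 0.13    0    0
    prometheus-proxy     1817486 oauth-pr  0.57 0.03 0.53    0    0
"""


def pod_cpu_totals(report_text):
    """Sum the Tot CPU percent column per (node, pod)."""
    totals = defaultdict(float)
    node = pod = None
    for line in report_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if len(fields) == 1:
            # A single token is a node name (no indent) or a pod name (indented).
            if line[0] != " ":
                node = fields[0]
            else:
                pod = fields[0]
            continue
        # Container row: name, ProcID, ProcName, Tot, sys, user, syst, usrt.
        if len(fields) == 8 and pod is not None:
            totals[(node, pod)] += float(fields[3])
    return dict(totals)


totals = pod_cpu_totals(REPORT)
for (node_name, pod_name), cpu in totals.items():
    print(f"{node_name} {pod_name:24s} {cpu:6.2f}")
```

In this excerpt the prometheus-k8s-1 pod is by far the largest consumer, with nearly all of its CPU coming from the prometheus container itself.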