Red Hat OpenShift 4 on Exoscale

VSHN Supported Features and Configuration

While this page describes the product features, a more detailed technical insight into how we operate OpenShift 4 can be found at openshift.docs.vshn.ch.

Red Hat provides a document with OpenShift Container Platform 4.x Tested Integrations (for x86_64) which also applies to this product.

Supported by default

These features and configurations are available out of the box and are installed and configured by default.


Authentication

By default, authentication is via VSHN Account (LDAP). User management is done via the VSHN Portal (self-service).
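For reference, identity providers in OpenShift 4 are configured through the OAuth cluster resource. The following is a minimal sketch of an LDAP identity provider, not the actual VSHN configuration; all hostnames, DNs and secret names are hypothetical placeholders:

    apiVersion: config.openshift.io/v1
    kind: OAuth
    metadata:
      name: cluster
    spec:
      identityProviders:
      - name: vshn-ldap              # display name on the login page (placeholder)
        type: LDAP
        mappingMethod: claim
        ldap:
          url: "ldaps://ldap.example.com/ou=users,dc=example,dc=com?uid"  # placeholder
          bindDN: "cn=reader,dc=example,dc=com"                           # placeholder
          bindPassword:
            name: ldap-bind-password  # secret in the openshift-config namespace
          attributes:
            id: ["dn"]
            preferredUsername: ["uid"]
            name: ["cn"]
            email: ["mail"]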

Network Policy

Network policies are supported by default, and a set of default policies is applied: allow-from-same-namespace, allow-from-openshift-ingress and allow-from-openshift-monitoring.
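For illustration, these default policies are plain Kubernetes NetworkPolicy objects. The following sketch shows what allow-from-same-namespace looks like, based on the upstream OpenShift documentation examples; the exact manifests on the cluster may differ:

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-from-same-namespace
    spec:
      podSelector: {}       # applies to every pod in the namespace
      ingress:
      - from:
        - podSelector: {}   # only pods from the same namespace may connect

The allow-from-openshift-ingress policy is analogous, but selects the ingress namespace via a namespaceSelector on the network.openshift.io/policy-group: ingress label.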

Integrated registry

The integrated registry is installed and enabled by default. It uses the cloud provider's object storage to store images.
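To illustrate how the registry is backed by object storage, here is a minimal sketch of the image registry operator configuration with an S3-compatible backend; bucket, region and endpoint are hypothetical examples:

    apiVersion: imageregistry.operator.openshift.io/v1
    kind: Config
    metadata:
      name: cluster
    spec:
      managementState: Managed
      storage:
        s3:
          bucket: example-registry-bucket              # hypothetical bucket name
          region: ch-gva-2                             # example region
          regionEndpoint: https://sos-ch-gva-2.exo.io  # S3-compatible endpoint

The credentials for the bucket are provided separately via a secret read by the registry operator.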

Machine Config (Compute nodes)

A set of default machine configurations is available (see infrastructure specifics).
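Such node customizations are expressed as MachineConfig resources handled by the Machine Config Operator. A minimal sketch follows; the file path and contents are purely illustrative, and the Ignition version depends on the OpenShift release:

    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineConfig
    metadata:
      name: 99-worker-example                           # illustrative name
      labels:
        machineconfiguration.openshift.io/role: worker  # target the worker pool
    spec:
      config:
        ignition:
          version: 3.1.0                                # depends on the OpenShift release
        storage:
          files:
          - path: /etc/example.conf                     # illustrative file only
            mode: 0644
            contents:
              source: data:,example-setting             # inline file content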

Operator Hub

The Operator Hub is enabled by default; however, no support is given for any Operators installed via the Operator Hub.

Image Builds

Building images on the platform is supported and enabled by default. Please note that using the Docker build strategy isn’t secure as it exposes the host system to root privilege escalation.

Cluster Monitoring

Cluster monitoring is enabled and used to ensure and assess cluster stability. Alerts are sent to VSHN and handled accordingly. Alert rules are tweaked regularly by VSHN.

Cluster metrics are further used to monitor the resource usage and resource availability on the whole cluster.

Cluster limits

We adhere to the official numbers which are documented under Planning your environment according to object maximums. However, the following limits are set by VSHN:

Infrastructure Nodes

Each cluster has at least 3 nodes dedicated to OpenShift infrastructure components such as the router, registry, web console and monitoring components. No user workload is allowed on these nodes.

OpenShift Cluster Maintenance

OpenShift and node updates are applied continuously when they’re available. See also Version and Upgrade Policy.

OpenShift Cluster Backup

A full backup of the etcd database is made every 4 hours. This includes a dump of all objects in JSON format, so single objects can be restored on request. The backup data is encrypted before it is stored in an object storage backend, usually on the same cloud the cluster runs on. K8up is used as the backup operator, with Restic as the backup backend.

Persistent storage volumes are not automatically backed up; users of persistent volumes are responsible for backing them up themselves. K8up is available on the cluster to help with that task. We're also happy to help, just let us know.
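As a sketch of how a namespace could schedule its own persistent volume backups with K8up (the API version depends on the installed K8up release, and all bucket and secret names are illustrative):

    apiVersion: k8up.io/v1
    kind: Schedule
    metadata:
      name: pv-backup
    spec:
      backend:
        repoPasswordSecretRef:
          name: backup-repo              # secret with the Restic repository password
          key: password
        s3:
          endpoint: https://sos-ch-gva-2.exo.io
          bucket: example-backups        # hypothetical bucket
          accessKeyIDSecretRef:
            name: backup-credentials
            key: username
          secretAccessKeySecretRef:
            name: backup-credentials
            key: password
      backup:
        schedule: '0 3 * * *'            # daily at 03:00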

Supported on request

These features or configuration adjustments must be specifically requested, and some restrictions apply. Activating and configuring these features incurs additional engineering costs, and operating them can incur further engineering costs (although no fixed additional recurring costs apply).


Authentication

Authentication can be configured to use a custom provider in addition to the default VSHN Account. See Supported identity providers for a list of available providers.

Cluster-wide HTTP or HTTPS proxy

Configuring OpenShift to use a cluster-wide HTTP or HTTPS proxy is possible, but incurs additional individual engineering effort. The documentation states: The cluster-wide proxy is only supported if you used a user-provisioned infrastructure installation or provide your own networking, such as a virtual private cloud or virtual network, for a supported provider.
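For reference, the cluster-wide proxy is configured through the Proxy cluster resource; the proxy URLs below are placeholders:

    apiVersion: config.openshift.io/v1
    kind: Proxy
    metadata:
      name: cluster
    spec:
      httpProxy: http://proxy.example.com:3128   # placeholder
      httpsProxy: http://proxy.example.com:3128  # placeholder
      noProxy: example.com                       # extra hosts that bypass the proxy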

OpenShift Central Logging

The integrated central logging based on Kibana and Elasticsearch is installed and configured on request. Logging uses a significant amount of resources; for that reason, the infrastructure nodes need to be made bigger.

Cluster Admin

For private clusters, "Cluster Admin" can be granted. This implies "with great power comes great responsibility". A sign-off is needed.

Disabling of Red Hat remote health monitoring (Telemetry)

OpenShift 4, by default, continuously sends data to Red Hat, see about remote health monitoring for details. This is enabled by default, but can be disabled on request. The exact metrics sent to Red Hat are documented in data-collection.md.

Registry configuration

Exposing the registry via the ingress controller can be configured.
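Enabling the default route on the image registry operator is a small configuration change; a sketch:

    apiVersion: imageregistry.operator.openshift.io/v1
    kind: Config
    metadata:
      name: cluster
    spec:
      defaultRoute: true   # expose the registry via the ingress controller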

Custom Machine Sets

Custom MachineSets can be defined to customize compute node availability.
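The general shape of a MachineSet is sketched below; the provider-specific part is omitted because it depends on the underlying cloud (and requires Machine API support there), and all names are illustrative:

    apiVersion: machine.openshift.io/v1beta1
    kind: MachineSet
    metadata:
      name: app-nodes           # illustrative name
      namespace: openshift-machine-api
    spec:
      replicas: 3
      selector:
        matchLabels:
          machine.openshift.io/cluster-api-machineset: app-nodes
      template:
        metadata:
          labels:
            machine.openshift.io/cluster-api-machineset: app-nodes
        spec:
          providerSpec: {}    # cloud-specific machine settings go here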

Cluster Autoscaling

Autoscaling will only be enabled on request and will be configured according to the defined needs.
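On clouds with Machine API support, autoscaling is driven by a ClusterAutoscaler resource (plus MachineAutoscaler resources per MachineSet); a minimal sketch with illustrative limits:

    apiVersion: autoscaling.openshift.io/v1
    kind: ClusterAutoscaler
    metadata:
      name: default             # the resource must be named "default"
    spec:
      resourceLimits:
        maxNodesTotal: 10       # illustrative upper bound
      scaleDown:
        enabled: true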

OpenShift Pipelines

Pipelines are in Technology Preview and therefore only available on request. VSHN prefers running pipelines outside of the OpenShift cluster, for example with GitLab CI or GitHub Actions.

Egress IP

The egress IP feature depends on the possibilities of the underlying networking infrastructure and therefore is only supported where the infrastructure allows it.

Audit logging

While audit logging is enabled by default on each OpenShift 4 control plane node (see Viewing node audit logs), audit logs are not forwarded to or stored outside the cluster, and there is no availability guarantee by default. If audit logs need special treatment, this needs to be requested.

Unsupported

These features or configuration adjustments are not supported by VSHN. They can still be activated or changed, but they are neither monitored, backed up nor maintained. No guarantees are given; use them at your own risk.


Metering Operator

The OpenShift metering operator is not supported.

The metering component consists of some very complex services with a high resource demand. We do not have the required expertise to run these services.

Upgrade channels

We only support stable upgrade channels. Changing the channel isn’t supported or encouraged.

The stable upgrade channel offers the most tested upgrades, which we see as a cornerstone of a stable service offering. Other channels could be used on non-production clusters. Specifically, the fast channel is used for VSHN-internal lab clusters for our own update QA.

Network configuration

OVN-Kubernetes is the default Container Network Interface (CNI) network provider. OpenShift SDN in "NetworkPolicy" isolation mode is still supported for existing clusters. Changing to a different isolation mode isn’t supported.

Networking is a complex component, and OVN-Kubernetes brings full integration and support by Red Hat. The most common use cases are handled by this configuration.

Red Hat OpenShift Service Mesh

Support for Red Hat OpenShift Service Mesh is not available from VSHN (yet).

This is mainly due to a lack of experience running a service mesh.

Jaeger

Support for Jaeger is not available from VSHN (yet).

This is mainly due to a lack of experience running Jaeger.

Container-native virtualization

No support is available for container-native virtualization.

This is mainly due to a lack of experience running container-native virtualization, and it is currently in Technology Preview.

OpenShift Serverless

Support for OpenShift Serverless is not available from VSHN (yet).

This is mainly due to a lack of experience running OpenShift Serverless, and it is currently in Technology Preview.

Operator Lifecycle Manager (OLM)

The Operator Lifecycle Manager is installed and fully functional on the cluster, but we don't guarantee full functionality of Operators installed via OLM by end users.

There are many Operators available via OperatorHub and we are not able to provide support for any of them.

Airgapped (disconnected) environments

Installing and running OpenShift 4 in an airgapped environment, meaning that the cluster has no Internet access, is currently not supported by VSHN.

The cluster needs access to specific endpoints which are documented in the official OpenShift documentation and in the VSHN Knowledgebase. Supporting airgapped setups is on our long-term roadmap.

Bring-Your-Own-Subscription

OpenShift clusters managed by VSHN are bound to VSHN's CCSP subscription with Red Hat.

Attaching an OpenShift cluster to another subscription would bring a significant operational support burden.

Disk Encryption

Encryption of local disks is currently not supported. If encryption at rest is needed it’s up to the storage provider (CSI) to support that.

The needed infrastructure (e.g. Tang server) to provide this feature is not available yet.

Features marked as Technology Preview by Red Hat are unsupported by VSHN as well. A list of Technology Preview features is available in the release notes; for OpenShift 4.5, this list can be found in the OpenShift Container Platform release notes.

Still interested in one (or more) of these unsupported options? Get in contact with sales@vshn.ch and we'll figure out together what we can offer.

Version and Upgrade Policy

The official Red Hat OpenShift Container Platform Life Cycle Policy applies and has implications on the supported versions.

Only the latest available OpenShift 4 release is supported. Installations must be upgraded to the next minor release within three months after a new release becomes available, or at the latest when the next minor release is available.

Errata updates are installed as they are released and include updates to OpenShift itself as well as the Red Hat CoreOS nodes. By default the stable upgrade channel is used.

Support Data Sharing

To get support from Red Hat, we usually have to share status information with them. This is done using the oc adm must-gather command, which collects support information without sensitive data such as secrets. More information about this tool is documented under Gathering data about your cluster.

Cluster Resource Handling and Availability

By the nature of a clustered system like Kubernetes, some constraints apply to how resources are available to users of the platform and how to work with them:

  • To have enough room to handle failing nodes and to ease maintenance processes, it's important to adhere to at least n+1 node availability and to have at least three worker nodes in the cluster. For example, on a three-node cluster only the resources of two-thirds of the nodes may be used.

  • Some resources on each node and in the whole cluster are always reserved for system services.

    • Cluster level: there need to be enough resources available to run the control plane and other system services like the registry or monitoring components; that's why there are dedicated nodes in the cluster to run this workload.

    • Node level: an amount of resources is reserved on each node to allow operating system services to function properly.

Exoscale Specific


The official documentation from Red Hat applies: Installing a cluster on bare metal. As Exoscale is not an official Red Hat OpenShift installer supported provider, the so-called UPI (User-provisioned infrastructure) installation mode applies.

Each cluster is installed in its own Exoscale account. Billing is usually done via VSHN.

The default installation on Exoscale uses public IP addresses for all VMs and restricts access to the cluster with Exoscale’s Security Groups.

Please contact us to discuss your particular requirements if the network architecture outlined above does not fit your needs.

Default Configuration / Minimum Requirements

This table shows the default configuration which is applied when nothing else is specified and defines the minimum requirements.


Load Balancer

2 load balancer nodes, all in one zone, separated via anti-affinity rules.

  • Machine type: Medium

  • Disk: 20 GB SSD

Control Plane

3 control plane nodes, all in one zone, separated via anti-affinity rules.

  • Machine type: Extra-Large

  • Disk: 120 GB SSD

Infrastructure Nodes

3 nodes, all in one zone, separated via anti-affinity rules.

  • Machine type: Extra-Large

  • Disk: 120 GB SSD

When integrated logging is requested, these nodes are upgraded to at least Huge.

Compute Nodes

3 nodes, all in one zone, separated via anti-affinity rules.

  • Machine type: Extra-Large

  • Disk: 120 GB SSD

Storage Nodes

3 nodes, all in one zone, separated via anti-affinity rules.

  • Machine type: Extra-Large (CPU Optimized)

  • Disk: 300 GB SSD

These storage nodes are used for system storage (Logging and Metrics) and aren't available to user workloads. For storage to be consumed by user applications, see below.

The storage per VM is distributed as follows: 120 GB for the operating system and 180 GB as backing storage for the storage cluster. This gives roughly 175 GB of usable space in the storage cluster. Assuming that Prometheus and Alertmanager consume roughly 110 GB in total, usage stays comfortably below the 85% capacity threshold at which the cluster goes read-only, ensuring that we don't run into issues with Ceph.

When integrated logging is requested, 4 nodes, each with 800 GB storage, are needed.

For Metrics and Logging together, we estimate that roughly 750 GB of storage will be consumed in total. Adding 20% to that gives a requirement of 900 GB of usable storage for the storage cluster. With a replication factor of 3, that amount of usable storage requires 2.7 TB of backing disk. Since we're limited to at most 800 GB of disk per node on Exoscale, and 120 GB need to be set aside for the OS, we have at most 680 GB per node for the storage cluster. 2.7 TB / 680 GB = 3.97, so we need 4 nodes with 800 GB disk each for Metrics + Logging.

Persistent Storage

Storage is only available with APPUiO Managed Storage Cluster.

Cloud Region

Defaults to CH-GVA-2 (Geneva, Switzerland)

VSHN supports all available zones of Exoscale.

For a detailed description about the machine types, have a look at the official Exoscale documentation.

Limitations

The following limitations are known on this infrastructure:

  • No autoscaling for worker nodes since there’s no support for Exoscale in OpenShift itself.

  • No support for services of type LoadBalancer. This needs to be engineered case by case with the ExternalIP feature of OpenShift, as sketched below. Note that the Exoscale Cloud Controller Manager is neither supported nor tested on OpenShift.
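A service using the ExternalIP feature looks roughly like the following sketch; the IP address is a documentation placeholder and would have to be an address that is routed to the cluster and permitted by the cluster's ExternalIP policy:

    apiVersion: v1
    kind: Service
    metadata:
      name: example-app          # illustrative
    spec:
      selector:
        app: example-app
      ports:
      - port: 80
        targetPort: 8080
      externalIPs:
      - 203.0.113.10             # placeholder (TEST-NET-3 documentation range)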

Cloud Costs

If you want to calculate what the Exoscale resource costs look like, the minimal set of resources consists of:

  • 2 x Medium VMs

  • 9 x Extra-Large VMs

  • 1120 GB SSD storage

  • 2 x Elastic IP

  • S3 object storage for the OpenShift registry

A good price calculator can be found under Exoscale Pricing Calculator.