This is internal documentation. There is a good chance you’re looking for something else. See Disclaimer.
Inter-Platform Migration¶
This document describes how to move all our services from one Kubernetes platform to another. It is based on the documentation created for the move from OpenShift 3 to OpenShift 4, during which everything was recreated. Additional information concerning a move to a non-OpenShift platform has been incorporated.
Overview¶
This guide assumes that the new platform is located in the same datacenter and that the following services do not have to be moved:
Elasticsearch
Postgres servers
Solr servers
S3
The steps, as provided, allow a migration without downtime.
Non-OpenShift Platforms¶
Several OpenShift-specific features are in use that will hinder a migration to a non-OpenShift Kubernetes platform.
(Lists below are non-exhaustive.)
The following OpenShift-specific features are used:
oc new-project (via API) is used to create projects
DeploymentConfig
Image triggers (to trigger a deployment on docker push)
ImageStream
Ingress annotations:
haproxy.router.openshift.io/timeout - HTTP read timeout
haproxy.router.openshift.io/hsts_header - HSTS HTTP header
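For illustration, these annotations are set on a Route roughly like this (hostname, service name, and annotation values are placeholders, not our actual configuration):

```yaml
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: nice
  annotations:
    # HTTP read timeout (hypothetical value)
    haproxy.router.openshift.io/timeout: 5m
    # HSTS HTTP header (hypothetical value)
    haproxy.router.openshift.io/hsts_header: max-age=31536000;includeSubDomains
spec:
  host: example.tocco.ch
  to:
    kind: Service
    name: nice
```

On a non-OpenShift platform, equivalent settings have to be expressed via the annotations of whatever ingress controller is in use (e.g. ingress-nginx has its own `nginx.ingress.kubernetes.io/*` annotations for timeouts).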
The following services come with OpenShift and need to be replaced:
Docker registry
Logging (Kibana)
Logging via line-based JSON
Prometheus (cluster monitoring)
Platform Preparation¶
Service Account for Ansible¶
To allow a full recreation, Ansible needs to be granted access. That is, a service account needs to be created and granted the required permissions:
$ oc -n serviceaccounts get serviceaccount ansible
NAME SECRETS AGE
ansible 2 165d
See also Service Accounts.
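If the account does not exist yet on the new platform, it can be created with a minimal manifest along these lines (a sketch; name and namespace follow the listing above):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ansible
  namespace: serviceaccounts
```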
Grant admin access:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ansible-admin
  namespace: serviceaccounts
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin
subjects:
  - kind: ServiceAccount
    name: ansible
    namespace: serviceaccounts
Other Service Accounts¶
Other service accounts may need to be recreated as well:
oc -n serviceaccounts get serviceaccounts
NAME SECRETS AGE
ansible 2 165d
builder 2 165d
default 2 165d
deployer 2 165d
teamcity 2 165d
tocco-registry-backup 2 63d
tocco-registry-backup is managed by VSHN. default, builder, and deployer are used by OpenShift internally.
Currently, teamcity is the only other global service account. A GitLab account is likely to be created in the future.
Groups¶
Three groups exist that partition our users:
$ oc get groups
NAME USERS
tocco-admin …
tocco-dev …
tocco-viewer …
Those need to be recreated. Ansible will grant namespace-level access to these groups.
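On OpenShift, such a group can be recreated with a manifest along these lines (the user entry is a placeholder):

```yaml
apiVersion: user.openshift.io/v1
kind: Group
metadata:
  name: tocco-admin
users:
  - jane.doe@example.com
```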
A custom ClusterRoleBinding exists to grant access to cluster metrics:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tocco-cluster-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-reader
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: tocco-admin
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: tocco-viewer
Migration of Nice¶
Two approaches were used for the migration from OpenShift 3 to OpenShift 4:
Change DNS directly:
Start installation on OpenShift 4
Change DNS
Wait for DNS TTL to expire
Stop installation on OpenShift 3
Reverse proxy from old to new platform:
Start installation on OpenShift 4
Reverse proxy traffic from OpenShift 3 to OpenShift 4
Stop installation on OpenShift 3
Change DNS
The latter approach was used where we did not manage DNS ourselves. With this approach, DNS changes can be done afterward without coordination with the customer.
Set up Reverse Proxy (Approach A)¶
An Nginx image was used to reverse-proxy from OpenShift 3 to 4.
default.conf:
# WebSocket support
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      '';
}

server {
    listen 8080;
    server_name _;

    location / {
        # FIXME
        # adjust upstream
        proxy_pass https://proxy.apps.openshift.tocco.ch;

        # FIXME
        # Set that to (at least) whatever the upstream limit is.
        client_max_body_size 400M;

        # verify upstream TLS certificate
        proxy_ssl_verify on;
        proxy_ssl_trusted_certificate /etc/ssl/certs/ca-certificates.crt;

        # FIXME
        # Adjust according to upstream limit
        proxy_read_timeout 30m;

        proxy_set_header X-Forwarded-For $remote_addr;
        proxy_set_header Host $http_host;

        # WebSocket support
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
    }
}
Dockerfile:
FROM nginx:stable
COPY default.conf /etc/nginx/conf.d/default.conf
For a possible strategy to deploy it on OpenShift, see:
Migration for Installations where we Control DNS (Approach A)¶
Set location: cloudscale-os4 in config.yml
Create installation on OpenShift 4:
ansible-playbook playbook.yml --skip-tags skip_route_dns_verification,acme,monitoring,teamcity -l <installation>
Obtain tokens:
oc login -u <username> https://api.c-tocco-ocp4.tocco.ch:6443
os4_token=$(oc whoami -t)
oc login -u <username> https://console.appuio.ch
os3_token=$(oc whoami -t)
Copy TLS certificate from OpenShift 3 to OpenShift 4:
./copy_tls_cert --disable-acme-on-os3 --os3-token $os3_token --os4-token $os4_token <installation>
This will copy the TLS certificates from the routes on OpenShift 3 to the ingresses on OpenShift 4 and disable certificate renewal on OpenShift 3.
(copy_tls_cert can be found in 5322a5d49757689be695b3bfeef98c1a6079c431.)
Deploy installation
Alternatively, copy the existing image from OpenShift 3.
Pull from OpenShift 3:
docker login -u any --password-stdin registry.appuio.ch <<<$os3_token
docker pull registry.appuio.ch/toco-nice-<installation>/nice
Copy and push to OpenShift 4:
docker login -u any --password-stdin registry.apps.openshift.tocco.ch <<<$os4_token
docker tag registry.appuio.ch/toco-nice-<installation>/nice registry.apps.openshift.tocco.ch/nice-<installation>/nice
docker push registry.apps.openshift.tocco.ch/nice-<installation>/nice
Verify installation is running on OpenShift 4
Adjust DNS
Wait for TTL to expire, then stop installation on OpenShift 3
Enable ACME on OpenShift 4:
ansible-playbook playbook.yml -l <installation>
You can check whether a valid certificate is available on OpenShift 4 like so:
gnutls-cli os4.tocco.ch --sni-hostname <hostname> --verify-hostname <hostname> </dev/null
Or look at the Certificates:
oc edit certificate
Correct Docker pull URL on production:
ansible-playbook playbook.yml -t teamcity -l <prod_installation>
The Docker image needs to be pulled from OpenShift 4 when updating production; this step tells TeamCity about it.
Migration for Installations using Nginx Reverse Proxy (Approach B)¶
OpenShift 3: switch project:
oc project toco-nice-<installation>
OpenShift 3: start nginx reverse proxy:
oc scale --replicas 1 dc/nginx-reverse-proxy
OpenShift 3: Disable ACME certificate renewal:
for name in $(oc get route -o json | jq -r '.items[]|if (.spec|has("path")|not) and (.metadata.annotations["tocco.ansible-managed"] == "true") then .metadata.name else empty end'); do
    oc annotate "route/$name" kubernetes.io/tls-acme-
done
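The jq filter in this loop selects only routes that have no path set and carry the tocco.ansible-managed annotation. It can be sanity-checked locally against sample JSON (the route objects below are hypothetical, mimicking the shape of oc get route -o json output):

```shell
# Two sample routes: "web" matches (no path, managed); "api" does not (has a path).
routes='{"items":[
  {"metadata":{"name":"web","annotations":{"tocco.ansible-managed":"true"}},"spec":{}},
  {"metadata":{"name":"api","annotations":{"tocco.ansible-managed":"true"}},"spec":{"path":"/api"}}]}'

# Prints only "web"
jq -r '.items[]|if (.spec|has("path")|not) and (.metadata.annotations["tocco.ansible-managed"] == "true") then .metadata.name else empty end' <<<"$routes"
```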
OpenShift 3: Add route for letsencrypt:
ansible-playbook playbook.yml -t letsencrypt-migration-routes -l <customer>
OpenShift 4: add location in config.yml:
location: cloudscale-os4
OpenShift 4: create installation:
ansible-playbook playbook.yml --skip-tags skip_route_dns_verification,monitoring,teamcity -l <installation>
Obtain tokens:
oc login -u <username> https://api.c-tocco-ocp4.tocco.ch:6443
os4_token=$(oc whoami -t)
oc login -u <username> https://console.appuio.ch
os3_token=$(oc whoami -t)
Deploy installation
Alternatively, copy the existing image from OpenShift 3.
Pull from OpenShift 3:
docker login -u any --password-stdin registry.appuio.ch <<<$os3_token
docker pull registry.appuio.ch/toco-nice-<installation>/nice
Copy and push to OpenShift 4:
docker login -u any --password-stdin registry.apps.openshift.tocco.ch <<<$os4_token
docker tag registry.appuio.ch/toco-nice-<installation>/nice registry.apps.openshift.tocco.ch/nice-<installation>/nice
docker push registry.apps.openshift.tocco.ch/nice-<installation>/nice
Verify installation is running on OpenShift 4
OpenShift 3: comment out location temporarily:
# location: cloudscale-os4
OpenShift 3: route all traffic to OpenShift 4:
ansible-playbook playbook.yml -t full-migration-routes -l <customer>
Comment location back in
OpenShift 3: stop installation:
oc scale --replicas 0 dc/nice
Other Services¶
Other services (than Nice) are set up according to Set up Application/Service on OpenShift, and the corresponding Ansible plays need to be modified.
These services use OpenShift-specific features too. See Non-OpenShift Platforms.