This is internal documentation. There is a good chance you’re looking for something else. See the Disclaimer.
# Infrastructure Overview
Overview of our infrastructure operated by VSHN.
## OpenShift 4 Platform
The platform is based on OpenShift, which in turn is built around Kubernetes.
### Tocco
For every customer we operate an independent instance, and every instance lives in its own OpenShift project named `nice-${INSTALLATION_NAME}`.
#### Infrastructure Overview
#### Deployment

1. The application is built from source on TeamCity and the resulting image is pushed to the Docker registry.
2. A database backup is created.
3. Deployment is triggered automatically by OpenShift’s ImageChange trigger.
4. Wait for the application to be deployed.
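As a minimal sketch of this flow using the `oc` CLI; the image stream `nice:latest` and the container name `nice` are assumptions based on the naming above:

```
# Sketch only: assumes an image stream "nice" and a container "nice"
# inside dc/nice; adjust names per installation.

# Ensure the DeploymentConfig redeploys whenever a new image is pushed:
$ oc set triggers dc/nice --from-image=nice:latest -c nice

# TeamCity pushes the freshly built image, which fires the ImageChange
# trigger; then wait until the rollout has finished:
$ oc rollout status dc/nice
```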
#### Configuration Management
#### List of Components
On the Kubernetes level, our setup looks something like this:

| Name | Provided Service | Management |
|---|---|---|
| dc/nice | There are two containers: … | Ansible, except for: … |
| svc/nice<br>ingress/nice<br>ingress/nice-* | There is one service, svc/nice, that handles all traffic going to our application. There is always an ingress called nice using ${INSTALLATION_NAME}.tocco.ch as FQDN. Additional ingresses may exist that follow the naming convention nice-${FQDN}. All ingresses use ACME to issue and renew TLS certificates. All connections are upgraded to HTTPS by Nginx (in the nginx container). The connection timeout has been increased to 15 minutes; this is required for our old, legacy client. The default platform-wide connection limit for the ingress had to be raised too. The setting `haproxy.router.openshift.io/set-forwarded-headers: replace` is used to ensure the X-Forwarded-For header is not blindly trusted when coming from outside the OpenShift platform. This is configured at cluster level (see the sketch after this table); see also Route-specific annotations. Nice assumes X-Forwarded-For can be trusted. | Ansible. No manually created ingresses exist as of today. |
| Docker registry<br>is/nice | Docker image of our main application, Tocco. Built and then pushed from outside OpenShift by our CD tool TeamCity. Pushed images are deployed automatically using an ImageChange trigger. Images are backed up daily. | Ansible |
| is/nginx | There are two global nginx images in use: … Both images reside in the project shared-imagestreams. | Manually. Updating and promoting from staging to production is done manually. |
| monitoring | Currently, only a simple HTTP check is used to verify that our status page (…) is reachable. Solr cores are also monitored by checking their response times; a warning and a critical response time can be specified, as well as whether a mail should be sent. | Ansible. Ansible generates a definition in the Puppet Hiera format, as required by VSHN’s monitoring; the configuration is then committed to monitoring.yaml. |
| logging | Logs are written to stdout as JSON. Those logs are then collected and made available using Elasticsearch and Kibana. | |
| DNS | Domains managed by us are hosted at Nine. However, many domains are hosted by customers themselves or by third parties in the customer’s name. | Manually via web interface |
| PVC for LMS | Our e-learning solution stores files in PVCs. Deprecated: with Nice 3.0, these files have been moved into the DB; 4 systems remain with such volumes. | Manually |
| PVC for out-of-memory dumps | For debugging purposes, we use PVCs to extract memory dumps from Tocco. | Manually |
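The cluster-level forwarded-headers setting referenced in the table above could be applied roughly like this; treat it as a sketch, the ingress controller name `default` is an assumption:

```
# Sketch only: sets the HAProxy forwarded-header policy cluster-wide
# (the per-route equivalent is the set-forwarded-headers annotation).
$ oc -n openshift-ingress-operator patch ingresscontroller/default \
    --type=merge \
    -p '{"spec":{"httpHeaders":{"forwardedHeaderPolicy":"Replace"}}}'
```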
### Tocco Manual

The manual of Tocco, consisting of static HTML and hosted on OpenShift.

Deprecated: will be deprecated with Nice ~3.4.
| Name | Provided Service | Management |
|---|---|---|
| dc/documentation-${VERSION} | For every version of Tocco, a manual is released and deployed separately. | Manually via template |
| ingress/documentation-${VERSION} | | Manually via template |
| monitoring | | Puppet. Added to VSHN’s Puppet config manually. |
| logs | Default Nginx logs written to stdout. | |
| DNS | | Manually |
### Jira Commit Info Service

Integration of deployment, merge, and commit information into Jira. See also Commit-Info-Service.
| Name | Provided Service | Management |
|---|---|---|
| dc/commit-info<br>jira-addon | | Manually |
| pvc/repository | Clone of our main Git repository. Used to display commit and deployment information in Jira. | Manually |
| ingress/*<br>svc/* | | Manually |
| is/* | | Deployed via GitLab CI |
### Sonar

SonarQube code inspection tool.

An instance of SonarQube is running to analyze the source code of Tocco. Analyses are started from TeamCity for backend code and from GitLab CI for the client. See SonarQube for details.
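As a rough sketch, an analysis could be triggered from CI like this; host URL, project key, and token variable are assumptions, the real values live in the TeamCity / GitLab CI configuration:

```
# Sketch only: illustrative invocation of the SonarQube scanner.
$ sonar-scanner \
    -Dsonar.host.url=https://sonar.example.com \
    -Dsonar.projectKey=nice \
    -Dsonar.login="$SONAR_TOKEN"
```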
| Name | Provided Service | Management |
|---|---|---|
| dc/* | | Manually |
| is/* | | Deployed manually |
### Address Provider

External address provider service.

The service is deployed via GitLab CI and the service definition is managed via Ansible (playbook, role).

Deployment:

```
$ cd ${ANSIBLE_REPO}/services
$ ansible-playbook playbook.yml -t address-provider
```
| Name | Provided Service | Management |
|---|---|---|
| dc/* | | Manually |
| ingress/*<br>svc/* | | Manually |
| is/* | | Production: deployed via TeamCity. Test: deployed via GitLab CI. |
### Image service

We use a service called imaginary, running in its own pod. The OpenShift project containing the service is called image-service. All calls to the service require the API-Key header, containing the key as defined in image_service_api_key in secrets2.yml.

From the backend, we call the /crop endpoint of the service to generate thumbnails. Other endpoints may be used freely if the need ever arises; nothing is blocked.
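A hedged example of such a call; the service URL and the image sizes are assumptions, the key is the one from secrets2.yml:

```
# Sketch only: imaginary accepts the image as multipart field "file"
# and returns the cropped result.
$ curl -s \
    -H "API-Key: ${IMAGE_SERVICE_API_KEY}" \
    -F "file=@portrait.jpg" \
    "https://image-service.example.com/crop?width=200&height=200" \
    -o thumbnail.jpg
```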
The service is deployed via GitLab CI and the service definition is managed via Ansible (playbook, role).

Deployment:

```
$ cd ${ANSIBLE_REPO}/services
$ ansible-playbook playbook.yml -t image-service
```
| Name | Provided Service | Management |
|---|---|---|
| dc/* | | Manually |
| is/* | | Deployed manually |
## Managed Servers - VSHN
### Postgres

Postgres database server used for the primary database of Tocco.

|  |  |
|---|---|
| Version | Postgres 12 |
| Required extensions | Extensions are installed on the database via Ansible (CREATE EXTENSION). |
| Backups | 7 daily database dumps + 4 weekly |
| Users / databases | Databases and users are managed by Ansible. |
| Locale | Locale settings on Postgres impact ordering (ORDER BY) and must be en_US.UTF-8. See OPS-772. |
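As a sketch of what the extensions task effectively runs; the extension name, database naming, and host variable are illustrative, the real list lives in Ansible:

```
# Sketch only: install an extension on an installation's database.
$ psql -h "$DB_HOST" -U postgres -d "nice_${INSTALLATION_NAME}" \
    -c 'CREATE EXTENSION IF NOT EXISTS pg_trgm;'
```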
### Solr

Apache Solr is used to provide full-text search capabilities.

|  |  |
|---|---|
| Version | Solr 7.4 |
| Authentication | Via the Basic Authentication Plugin, providing HTTP auth support. |
| Transport security | HTTPS with a TLS cert signed by a globally trusted authority. |
| Backups | 7 daily + 4 weekly, implemented using LVM snapshots. |
| Cores (AKA indexes) | Created via Ansible |
| Monitoring | Every Solr core is monitored in Icinga for reachability and response time. |
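A reachability check like the one in Icinga could look roughly like this; host name, credentials, and core naming are assumptions:

```
# Sketch only: query the core admin STATUS API over HTTPS with basic auth.
$ curl -s -u "$SOLR_USER:$SOLR_PASSWORD" \
    "https://solr.example.com/solr/admin/cores?action=STATUS&core=nice_${INSTALLATION_NAME}"
```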
### Mail Relay

SMTP server used for outgoing mails.

The mail server accepts all incoming mails. Restricting sender domains/addresses is left up to Tocco.

|  |  |
|---|---|
| Transport security | STARTTLS with a TLS cert signed by a globally trusted authority. |
| DKIM | Mails are signed using DKIM. Generally, one and the same key is used for all mails. However, for a few domains we use another key to avoid name clashes. See also DNS Records for Outgoing Mails. |
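To verify that a DKIM key is published, the TXT record can be queried; the selector `tocco` and the domain are made-up examples, the real records are listed under DNS Records for Outgoing Mails:

```
# Sketch only: look up a DKIM public key in DNS.
$ dig +short TXT tocco._domainkey.example-customer.ch
```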
### S3

S3 storage is used for files uploaded to Tocco.

No data is currently being deleted; however, we are doing backups.

|  |  |
|---|---|
| Keys | Each installation has its own key. |
| Buckets | There is one bucket per installation. |
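A hedged sketch of accessing an installation’s bucket with its own key; the endpoint, bucket naming, and variable names are assumptions:

```
# Sketch only: list the bucket of one installation using its credentials.
$ AWS_ACCESS_KEY_ID="$INSTALLATION_KEY" \
  AWS_SECRET_ACCESS_KEY="$INSTALLATION_SECRET" \
  aws s3 ls "s3://nice-${INSTALLATION_NAME}" --endpoint-url "$S3_ENDPOINT"
```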
## Managed Servers - Nine

### DNS

The DNS servers are managed by Nine, and the login for the cockpit is stored in the Ansible vault.

|  |  |
|---|---|
| Records | We are using ALIAS and ANAME records. |
| Configuration | DNS management has to be done manually in the Nine cockpit. We’d love to have an API to manage the records through Ansible. |
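Since ALIAS/ANAME records are flattened by the provider, they resolve like ordinary A/AAAA records from the outside and can be spot-checked like this:

```
# Sketch only: verify what an ALIAS/ANAME record currently resolves to.
$ dig +short A ${INSTALLATION_NAME}.tocco.ch
```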