[2020-01-09 Thu 18:11] Back in 2017, I decided that our applications should be deployed in a Swarm cluster. Fast forward to 2020: we have more than 50 stacks running and the number is still growing.

I have no regrets about that decision: (1) The simplicity of the docker-compose file format allows us to deploy a new stack very quickly, skipping the ops part. Any developer can look at the stack file and say exactly what’s going on. (2) The declarative nature of the (minimal) config files and volumes allows us to back up and update an application with ease: basically create a new snapshot, change the image version and re-deploy the stack. With the help of a network drive on an advanced file system, like NFS on ZFS, we don’t have to worry about volume provisioning, and applications can be scaled quickly within the limits of the stateful part (DB and storage).

As of 2020, the war between Swarm and K8s seems to be settled. Swarm seems more appropriate for hobbyists (single-node clusters, RPi clusters, etc.). If I want the auto-scaling that cloud providers offer, to prepare for bigger applications, the only solution seems to be K8s.

Current structure

  • A Swarm cluster of a bunch of VPSes for memory and CPU
  • A pool of NFS servers serving as storage for the applications
  • Traefik as Ingress controller
  • Let’s Encrypt as TLS provider

A typical docker-compose file

version: '3.1'

services:
  redmine:
    image: redmine
    volumes:
      - /path/to/nfs/volume/tld.redmine.www/redmine/config/configuration.yml:/usr/src/redmine/config/configuration.yml
      - /path/to/nfs/volume/tld.redmine.www/redmine/files:/usr/src/redmine/files
    environment:
      REDMINE_DB_MYSQL: db
      REDMINE_DB_PASSWORD: example
    networks:
      - default
      - web
    deploy:
      placement:
        constraints:
          - node.hostname == prod2
      labels:
        - 'traefik.frontend.rule=Host:www.redmine.tld'
        - 'traefik.port=3000'

  db:
    image: mysql:5.7
    volumes:
      - /path/to/nfs/volume/tld.redmine.www/mysql:/var/lib/mysql
    environment:
      MYSQL_ROOT_PASSWORD: example
      MYSQL_DATABASE: redmine
    networks:
      - default
    deploy:
      placement:
        constraints:
          - node.hostname == prod2

networks:
  web:
    external: true

Idea

The idea of this post is to evaluate the possibility of targeting a K8s deployment, basically transforming the above docker-compose file into a bunch of Service, Deployment, Ingress objects and so on.

Imagine a new structure of:

  • A K8s cluster with
    • NFS provisioner (to be replaced by the cloud provider’s provisioner if necessary)
    • Traefik as Ingress controller
    • Let’s Encrypt as TLS provider

In order to migrate, each of the two services above should generate:

  • A static PV with a Retain reclaim policy, pointing to the correct location beneath the NFS mount point, with a claimRef pointing to the PVC below
  • A PVC, which should bind automatically to the PV above and serve as the volume for the Pod
  • An Ingress that exposes www.redmine.tld, taken from the labels above, via the port defined by the traefik.port label (not for the db service: use port-forward if we need to access the DB from outside)
  • A ClusterIP Service that exposes the port defined by the label to the cluster
  • A Deployment to deploy the container. It should be able to handle at least the replicas and placement properties of the yaml
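
As an illustration of the PV/PVC pair described in the list above, here is a minimal sketch for the redmine files volume. The object names, the namespace, the 1Gi size, the access mode and the NFS server address are all assumptions made up for this example; only the path comes from the stack file, and the real export path on the NFS server may differ from the host mount path.

```yaml
# Static PV, pre-bound to one specific PVC through claimRef.
# Hypothetical values: names, namespace, size, NFS server address.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: redmine-files
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain   # keep the data when the claim is deleted
  storageClassName: ""                    # empty class: no dynamic provisioning
  claimRef:                               # reserve this PV for the PVC below
    namespace: default
    name: redmine-files
  nfs:
    server: nfs.example.internal          # assumption: address of the NFS server
    path: /path/to/nfs/volume/tld.redmine.www/redmine/files
---
# PVC that binds to the PV above and is mounted by the Pod.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redmine-files
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  volumeName: redmine-files               # bind to the static PV by name
  resources:
    requests:
      storage: 1Gi
```

Setting both claimRef on the PV and volumeName on the PVC keeps the binding deterministic, so no other claim can grab the volume in between.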

Ideally this process should be done automatically from the above yaml file, allowing us to use more advanced features of K8s while keeping the simplicity of the docker-compose file format.
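
To give a feel for the target of such a translation, here is a hand-written and abridged sketch of what the redmine service might become. It is only one possible mapping, not the output of any tool. The `app: redmine` label, the object names, the PVC name `redmine-files` and the use of the current `networking.k8s.io/v1` Ingress API are assumptions; the image, environment, host and port come from the stack file.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redmine
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redmine                  # hypothetical label tying the objects together
  template:
    metadata:
      labels:
        app: redmine
    spec:
      containers:
        - name: redmine
          image: redmine
          env:
            - name: REDMINE_DB_MYSQL
              value: db             # resolves only if the db Service is named "db"
            - name: REDMINE_DB_PASSWORD
              value: example
          ports:
            - containerPort: 3000
          volumeMounts:
            - name: files
              mountPath: /usr/src/redmine/files
      volumes:
        - name: files
          persistentVolumeClaim:
            claimName: redmine-files   # hypothetical PVC name
---
apiVersion: v1
kind: Service
metadata:
  name: redmine
spec:
  type: ClusterIP
  selector:
    app: redmine
  ports:
    - port: 3000                    # from the traefik.port label
      targetPort: 3000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: redmine
spec:
  rules:
    - host: www.redmine.tld         # from the traefik.frontend.rule label
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: redmine
                port:
                  number: 3000
```

The single-file mount of configuration.yml is left out here; it would probably need a subPath mount or a ConfigMap, which is one more thing a generator has to decide.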

I can go ahead and list all of them here (2 services * 5 files = 10 files in total) but let’s skip them for the sake of brevity.

Research

Write it manually

Too much work: the result would be about 10 times longer than the stack file, roughly 400 lines for a simple service, with a lot of duplication. We will compare against the final result if any of the options below works…

Generator from stack.yaml

Sounds like… Helm

Helm

Two possibilities:

  • Use stack file directly as values.yaml
  • A big helper file that extracts data from each service in the yaml file and feeds the templates (10 files of only 1 or 2 lines each per service/deployment etc., built from these values, or all merged into one single file)
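
A rough sketch of the first possibility, where the stack file itself is passed as the values file: a single template loops over `.Values.services` and emits one ClusterIP Service for every service that carries a `traefik.port` label. The template file name and the `app` selector label are assumptions, and it assumes labels are written in list form as in the stack file above. It is not necessarily how any existing chart does it.

```yaml
# templates/service.yaml (hypothetical file name)
# Emits one Service per compose service that has a 'traefik.port=...' label.
{{- range $name, $svc := .Values.services }}
{{- with $svc.deploy }}
{{- range $label := .labels }}
{{- if hasPrefix "traefik.port=" $label }}
---
apiVersion: v1
kind: Service
metadata:
  name: {{ $name }}
spec:
  type: ClusterIP
  selector:
    app: {{ $name }}   # assumption: the Deployment template sets the same label
  ports:
    - port: {{ trimPrefix "traefik.port=" $label }}
      targetPort: {{ trimPrefix "traefik.port=" $label }}
{{- end }}
{{- end }}
{{- end }}
{{- end }}
```

With a chart laid out this way, something like `helm template my-stack ./chart -f stack.yml` (release and chart names are placeholders) should render the objects without touching the cluster, which makes it easy to diff the output against hand-written manifests.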

Looking at a Helm chart example, this seems daunting, but if we use only one chart for every stack, the trade-off is worth it. The helper file can be versioned and symlinked into the chart.

Resources

Kustomization

Heard about it, but how to extract variables from a docker-compose file?

Compose-on-kubernetes

Looks like the holy grail. But…

  • Ingress seems to be missing

DONE Evaluation

Let’s go with Helm first…

(Somehow) it works for the above docker-compose.yml. The final output for all the deployment objects is about 240 lines, only 6 times longer than the stack.yml.

The repo can be found here: linktohack/helm-stack: Deploy your `docker-compose` `stack` with Helm.

Next step

The next step is to evaluate other stack files. There are still some points that need work:

  • Handle more label properties (custom headers etc.)
  • Handle node constraints (nodeSelector)
  • Fewer assumptions about the stack file, i.e. normalize it
    • Separate volumes section
  • Network section
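
For the node constraint item, one plausible mapping is to turn the Swarm constraint `node.hostname == prod2` into a nodeSelector on the well-known `kubernetes.io/hostname` node label. The sketch below assumes the K8s node is registered under the same hostname as the Swarm node.

```yaml
# Fragment of a Deployment: pin the Pod to the node named prod2.
# Assumption: the K8s node carries the label kubernetes.io/hostname=prod2.
spec:
  template:
    spec:
      nodeSelector:
        kubernetes.io/hostname: prod2
```

A nodeSelector only expresses equality, so a `!=` constraint would need node affinity with a NotIn operator instead.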