Fully Automated K3S etcd High Availability Install

Setting up k3s is hard. That’s why we made it easy. Today we’ll set up a High Availability K3s cluster using etcd, MetalLB, kube-vip, and Ansible. We’ll automate the entire process, giving you an easy, repeatable way to create a k3s cluster that you can run in a few minutes.

A HUGE THANKS to our sponsor, Micro Center!

New Customers Exclusive – Get a Free 240gb SSD at Micro Center: https://micro.center/1043bc

📺 Watch Video

Prep

First, be sure you have Ansible installed on your machine and that it is version 2.11 or later. If you don’t, you can use the install Ansible post on how to install and update it.
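
A quick way to confirm the version requirement is to compare the installed version against 2.11 in the shell. This is a minimal sketch, not part of the repo: version_ge is a hypothetical helper, and it assumes a sort that supports -V (version sort), as in GNU coreutils.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 is greater than or equal to $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

required="2.11"

if command -v ansible >/dev/null 2>&1; then
  # The first line of `ansible --version` contains the version number.
  installed="$(ansible --version | head -n 1 | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)"
  if version_ge "$installed" "$required"; then
    echo "Ansible $installed is new enough"
  else
    echo "Ansible $installed is older than $required, please update"
  fi
else
  echo "Ansible is not installed"
fi
```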

Second, you’ll need to provision the VMs. Here’s an easy way to create perfect Proxmox templates with cloud image and cloud init and a video if you need.

Next, you’ll need to fork and clone the repo. While you’re at it, give it a ⭐ too :).

git clone https://github.com/techno-tim/k3s-ansible

Next, cd into the repo and create a local copy of ansible.example.cfg.

cp ansible.example.cfg ansible.cfg

You’ll want to adapt this to suit your needs; however, the defaults should work without issue. If you’re looking for the old defaults, you can see them in this PR that removed the file.

Next, you’ll need to install some requirements for Ansible:

ansible-galaxy install -r ./collections/requirements.yml

Next, from the repo root, copy the sample directory within the inventory directory.

(Be sure you’re using the latest template)

cp -R inventory/sample inventory/my-cluster

Installing k3s

Next, edit inventory/my-cluster/hosts.ini to match your systems. DNS names work here too.

[master]
192.168.30.38
192.168.30.39
192.168.30.40

[node]
192.168.30.41
192.168.30.42

[k3s_cluster:children]
master
node
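
Before provisioning, it can help to sanity-check the inventory. The sketch below lists the hosts in a group of a simple INI inventory like the one above and warns when the number of masters is even, since etcd needs a majority of members to keep quorum and an odd count is the usual recommendation. list_group_hosts is a hypothetical helper, not part of the repo, and it only reads the first field of each line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the hosts listed under [group] in an INI inventory.
# $1 = inventory file, $2 = group name. Only the first field of each line is printed.
list_group_hosts() {
  awk -v group="$2" '
    /^\[/ { in_group = ($0 == ("[" group "]")); next }   # entering a new section
    in_group && NF && $1 !~ /^[#;]/ { print $1 }         # skip blanks and comments
  ' "$1"
}

inventory="inventory/my-cluster/hosts.ini"

if [ -f "$inventory" ]; then
  masters="$(list_group_hosts "$inventory" master | wc -l | tr -d ' ')"
  echo "masters: $masters"
  if [ $((masters % 2)) -eq 0 ]; then
    echo "warning: an odd number of masters is recommended for etcd quorum"
  fi
else
  echo "inventory not found: $inventory"
fi
```

Once the list looks right, running ansible all -i inventory/my-cluster/hosts.ini -m ping is a common way to confirm that Ansible can reach every host.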

Edit inventory/my-cluster/group_vars/all.yml to your liking. See comments inline.

It’s best to start with the default values in the repo, shown below. You can optionally include Traefik if you want it installed with k3s; however, I would recommend installing it later with Helm.

# change these to your liking, the only required are: --disable servicelb, --tls-san 
extra_server_args: >-
  
  --node-taint node-role.kubernetes.io/master=true:NoSchedule
  --tls-san 
  --disable servicelb
  --disable traefik
extra_agent_args: >-
  

I would not change these values unless you know what you are doing. The values below will most likely not work for you, but I’m listing them for posterity.

Note: These are for an advanced use case. There isn’t a one-size-fits-all setting for everyone’s needs, so I would try using k3s with the above values before changing them. Changing them could have undesired effects like nodes going offline, pods jumping or being removed, etc. Using these args might come at the cost of stability. Also, these will not work anymore without some modifications.

extra_server_args: "--disable servicelb --disable traefik --write-kubeconfig-mode 644 --kube-apiserver-arg default-not-ready-toleration-seconds=30 --kube-apiserver-arg default-unreachable-toleration-seconds=30 --kube-controller-arg node-monitor-period=20s --kube-controller-arg node-monitor-grace-period=20s --kubelet-arg node-status-update-frequency=5s"
extra_agent_args: "--kubelet-arg node-status-update-frequency=5s"

Start provisioning of the cluster using the following command:

ansible-playbook ./site.yml -i ./inventory/my-cluster/hosts.ini

Note: add --ask-pass --ask-become-pass if you are using password SSH login.

After deployment, the control plane will be accessible via the virtual IP address, which is defined in inventory/my-cluster/group_vars/all.yml as apiserver_endpoint.

kube config

To get access to your Kubernetes cluster, copy your kube config locally by running:

scp ansibleuser@192.168.30.38:~/.kube/config ~/.kube/config
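
Note that the scp command above overwrites any existing ~/.kube/config. If you already manage other clusters from this machine, a small backup step first can save some pain. This is only a sketch: backup_file is a hypothetical helper and is not part of the repo.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a file to a timestamped backup next to it.
# Prints the backup path; does nothing if the file does not exist.
backup_file() {
  if [ -f "$1" ]; then
    local backup
    backup="$1.bak.$(date +%Y%m%d%H%M%S)"
    cp -p "$1" "$backup" && echo "$backup"
  fi
}

# Typical use before copying the new kube config:
#   mkdir -p ~/.kube
#   backup_file ~/.kube/config
#   scp ansibleuser@192.168.30.38:~/.kube/config ~/.kube/config
```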

Testing your cluster

Be sure you can ping your VIP, defined in inventory/my-cluster/group_vars/all.yml as apiserver_endpoint:

ping 192.168.30.222
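
Rather than typing the VIP by hand, you can read it from the group vars file. This is a minimal sketch that assumes apiserver_endpoint is written as a plain top-level key: value line; yaml_scalar is a hypothetical helper and not a real YAML parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of a simple top-level `key: value` line.
# $1 = YAML file, $2 = key. Strips trailing comments, whitespace, and quotes.
yaml_scalar() {
  sed -n "s/^$2:[[:space:]]*//p" "$1" \
    | head -n 1 \
    | sed -e 's/[[:space:]]*#.*$//' -e 's/[[:space:]]*$//' \
    | tr -d "\"'"
}

vars_file="inventory/my-cluster/group_vars/all.yml"

if [ -f "$vars_file" ]; then
  vip="$(yaml_scalar "$vars_file" apiserver_endpoint)"
  if [ -n "$vip" ]; then
    echo "VIP: $vip"
    # -c 3 sends three echo requests and then stops
    ping -c 3 "$vip"
  else
    echo "apiserver_endpoint not found in $vars_file"
  fi
else
  echo "vars file not found: $vars_file"
fi
```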

Getting nodes

kubectl get nodes
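
If you want to script this check, you can count the nodes that are not yet Ready. This sketch assumes the default column layout of kubectl get nodes --no-headers, where the second column is STATUS; count_not_ready is a hypothetical helper. Note that a cordoned node (Ready,SchedulingDisabled) is counted as not ready here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `kubectl get nodes --no-headers` output on stdin
# and print how many nodes have a STATUS other than exactly "Ready".
count_not_ready() {
  awk 'NF && $2 != "Ready" { n++ } END { print n + 0 }'
}

if command -v kubectl >/dev/null 2>&1; then
  not_ready="$(kubectl get nodes --no-headers | count_not_ready)"
  echo "nodes not Ready: $not_ready"
else
  echo "kubectl is not installed"
fi
```

As an alternative, kubectl wait --for=condition=Ready nodes --all --timeout=120s blocks until every node reports Ready or the timeout expires.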

Deploying a sample nginx workload

kubectl apply -f example/deployment.yml

Check to be sure it was deployed

kubectl describe deployment nginx

Deploying a sample nginx service with a LoadBalancer

kubectl apply -f example/service.yml

Check the service and be sure it has an IP from MetalLB, as defined in inventory/my-cluster/group_vars/all.yml

kubectl describe service nginx

Visit that URL in a browser or curl it

curl http://192.168.30.80

You should see the nginx welcome page.

You can clean this up by running

kubectl delete -f example/deployment.yml
kubectl delete -f example/service.yml

Resetting your cluster

This will remove k3s from all nodes. These nodes should be rebooted afterwards.

ansible-playbook ./reset.yml -i ./inventory/my-cluster/hosts.ini

What’s next?

See here to get the steps for installing Traefik + Let’s Encrypt

See here for steps to deploy Rancher

Troubleshooting

Be sure to see this post on how to troubleshoot common problems

⚙️ See all the hardware I recommend at https://l.technotim.live/gear

🚀 Don’t forget to check out the 🚀Launchpad repo with all of the quick start source files

This post is licensed under CC BY 4.0 by the author.