[Estimated Reading Time: 8 minutes]

The first significant practical step in establishing my Bare Metal Kubernetes cluster was to provision 4 VM’s. At this point it is worth mentioning that actually installing Kubernetes is almost trivial once you have the necessary hardware in place. Since my ‘hardware’ was going to be virtual, provisioning the VM’s was by far the biggest job, so after an initial manual install I set about automating as much of this process as I could.

The Task

Having settled on 4 VM’s to provide my 4 nodes, I had 4 VM’s to configure and 4 installations of Ubuntu Server 20.04 to perform.

I had already established a working Kubernetes cluster by manually creating and configuring each VM and installing Ubuntu Server on them all. Although not complicated, this was tedious and I foresaw much frustration if/when I ever needed to rebuild the cluster, so I looked for something to automate this repetitive and fiddly business.

The ‘Vagrant’ Mis-Step

Vagrant appeared to offer a perfect solution.

Though not related to Kubernetes, Vagrant employs a similar approach, using a declarative mechanism rather than imperative. That is, Vagrant takes a description of a VM (or VM’s) and compares it to the current “real world”, making whatever changes are required to bring the real world into line with the declared desired state.

Vagrant supports a number of virtual machine platforms with VirtualBox being the default. Crucially, it offered support for Hyper-V.

It initially appeared that all I needed was a VagrantFile describing the 4 VM’s for my nodes, something similar to:

Vagrant.configure("2") do |config|
  # This is a multi-machine VagrantFile.
  # These settings are applied to all VM's
  config.vm.box = "deltics/focal64"

  config.vm.network "public_network"

  config.vm.provider "hyperv" do |vm|
    vm.cpus = 4
    vm.memory = "8192"
    vm.vlan_id = 2
  end

  # The remaining settings are applied individually
  # to each separate machine that is defined

  config.vm.define "master" do |node|
    node.vm.hostname = "master"
    
    node.vm.provider "hyperv" do |vm|
      vm.vmname = "k8s.master"
      vm.mac = "ffffffffff00"
    end
  end 
  
  config.vm.define "worker1" do |node|
    node.vm.hostname = "worker1"
    
    node.vm.provider "hyperv" do |vm|
      vm.vmname = "k8s.worker1"
      vm.mac = "ffffffffff01"
    end
  end 
  
  config.vm.define "worker2" do |node|
    node.vm.hostname = "worker2"
    
    node.vm.provider "hyperv" do |vm|
      vm.vmname = "k8s.worker2"
      vm.mac = "ffffffffff02"
    end
  end 
  
  config.vm.define "worker3" do |node|
    node.vm.hostname = "worker3"
    
    node.vm.provider "hyperv" do |vm|
      vm.vmname = "k8s.worker3"
      vm.mac = "ffffffffff03"
    end
  end 
end

A VagrantFile is specified in a project (folder). It may describe a single VM or a number of them. In the context of that project (i.e. folder), when the command vagrant up is performed, Vagrant examines the VagrantFile and compares it against the state of the virtual machines currently established in that project, the VM’s themselves being created in a .vagrant subfolder.

The first time the command is run, for example, no actual VM’s exist at all, so Vagrant will apply the configuration described in the VagrantFile and create the specified VM’s in that .vagrant folder. If changes are made to the VagrantFile, those changes are applied to any existing VM’s the next time vagrant up is performed.
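The day-to-day workflow then consists of a handful of commands run from the project folder. A typical session (with machine names matching my VagrantFile) might look like:

vagrant up            # create or update ALL machines in the VagrantFile
vagrant up worker1    # or bring up just one named machine
vagrant status        # report the current state of each machine
vagrant destroy -f    # tear everything down again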

The “2” in the initial configure statement indicates the desired Vagrant API Version and is nothing to do with the number of VM’s defined by the file.

A quick note on VagrantFile syntax: although it looks like a declarative configuration file such as YAML or JSON, it is in fact a Ruby program; the Vagrant API is specifically designed to make it appear as declarative as possible.
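One practical consequence is that ordinary Ruby constructs are available in a VagrantFile. As a sketch (not what my actual file did), the three near-identical worker definitions above could have been generated with a loop inside the Vagrant.configure block:

  # Generates worker1..worker3, equivalent to the three explicit blocks
  (1..3).each do |i|
    config.vm.define "worker#{i}" do |node|
      node.vm.hostname = "worker#{i}"

      node.vm.provider "hyperv" do |vm|
        vm.vmname = "k8s.worker#{i}"
        vm.mac    = "ffffffffff0#{i}"
      end
    end
  end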

Almost everything needed for my VM’s could be specified in a VagrantFile, as illustrated above.

Almost everything.

As illustrated above, my project required a “Multi-Machine” file, with basic configuration that applied to all machines and further configuration specific to each individual machine (which in turn identifies those machines as needing to be created). In addition, whilst many properties of a VM are configurable through generic Vagrant properties, some are specific to the particular provider being used, in my case Hyper-V. Provider-specific properties are accessible using the ...vm.provider "hyperv" do |vm| configuration blocks.

Properties common to all VM’s in the project are specified at the “top level”. VM-specific settings are then provided within a nested configuration block that defines each VM. For example, this block defines and configures the VM for the master node, including the hostname (supported by the general Vagrant API) and the Hyper-V provider-specific configuration for the name of the VM (as identified in Hyper-V itself) and its MAC address:

  config.vm.define "master" do |node|
    node.vm.hostname = "master"
    
    node.vm.provider "hyperv" do |vm|
      vm.vmname = "k8s.master"
      vm.mac = "ffffffffff00"
    end
  end 

In my case, Hyper-V provider-specific properties are required both in the common section and in each VM-specific section.

We can see that the VagrantFile specifies the same amount of RAM for each VM, as well as the CPU count and VLAN ID, since these are consistent across all VM’s:

  config.vm.network "public_network"

  config.vm.provider "hyperv" do |vm|
    vm.cpus = 4
    vm.memory = "8192"
    vm.vlan_id = 2
  end

The network specification here seems to be more for Vagrant’s purposes than for configuring the VM’s themselves; as far as I can determine, it has no bearing on or relation to the Network Switch settings of the VM’s.

Then, as we’ve seen, a number of VM’s are defined with configuration specific to each, including:

  • The VM name (as identified in Hyper-V)
  • The Hostname (by which the VM is known on the network)
  • The MAC address

The MAC addresses in this file are not accurate but illustrate the use of a consecutive range.

Some key things are missing from this specification:

  • Secure Boot configuration (i.e. disabled)
  • Any form of HDD specification
  • The network switch to use (“Kubernetes Switch”)

The first two of these are actually provided by the “box” declaration at the very top of the VagrantFile:

  # This is a multi-machine VagrantFile.
  # These settings are applied to all VM's
  config.vm.box = "deltics/focal64"

What’s In A (Vagrant) Box?

A “box” is a pre-prepared image of a virtual machine that serves as a “template” for a VM specified in a VagrantFile. In a multi-machine VagrantFile each VM could be based on a different box. In my case, the VM’s all use the same box, hence this is specified in the global section of the file and identifies a custom box that I created.

This highlights the first, relatively minor, problem with Vagrant.

To use Vagrant to provision my nodes, I had to go through the process of configuring a VM and installing the OS on it, which would then provide the template for my VagrantFile-specified machines. This includes specifying a HDD (on which to install the OS).

i.e. the HDD of a VM provisioned using Vagrant is determined by the box used (though additional drives can be added using an experimental feature, sketched below). The same applies to BIOS/firmware configuration and additional devices such as a virtual DVD drive.
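For completeness, this is roughly what that experimental feature looks like. I did not use it myself; the size and name here are purely illustrative, the feature has to be explicitly enabled via the VAGRANT_EXPERIMENTAL environment variable, and provider support varies:

# Run with the feature flag enabled, e.g.:
#   VAGRANT_EXPERIMENTAL="disks" vagrant up
Vagrant.configure("2") do |config|
  config.vm.box = "deltics/focal64"

  # Attach an additional 50GB drive to the VM
  config.vm.disk :disk, size: "50GB", name: "extra_storage"
end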

Here Be Dragons…

I decided I needed to prepare a box, on which point the Vagrant documentation is not particularly clear. It all makes sense once you figure out how things are supposed to be organised, but IMHO it is not very clearly described.

For those planning to experiment with Hyper-V box creation, this is the process:

  1. Configure and prepare your VM (see below)
  2. Export your Hyper-V VM. I recommend ensuring that your exported VM has no checkpoints, so that the Snapshots folder created during export will be empty and can simply be removed as it won’t be required.
  3. Create a metadata.json file in the folder containing the exported VM. At a minimum this metadata file needs to include a provider property, set to hyperv (for a Hyper-V VM)
  4. Create a tar archive of the contents of the folder (including the metadata file, but not the folder itself), giving the archive a .box extension. Compression is a good idea if you intend to publish the box to a catalogue, but is unnecessarily time consuming if you only intend using it ‘locally’. A sketch of steps 3 and 4 follows below.
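Concretely, those last two steps might look something like this (the folder and box names here are illustrative, not necessarily what I used):

  cd exported-vm-folder

  # minimal metadata.json identifying the provider
  echo '{ "provider": "hyperv" }' > metadata.json

  # archive the *contents* of the folder, not the folder itself;
  # omit the 'z' flag to skip compression for local-only use
  tar cvzf ../focal64.box ./*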

So you should end up with a folder (and an archive) that contains two folders and your metadata file:
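  Virtual Hard Disks/
  Virtual Machines/
  metadata.json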

Note that the pluralised names on these folders (created by the Export process) are not significant. In the case of a VM with a single HDD, there is only one virtual machine and only one virtual hard disk in those folders.

Once packaged, your custom box is made available for (local) use by adding it to the local registry of boxes:

vagrant box add --name my-box /path/to/the/new.box --provider hyperv
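The name given here (my-box) is then the name by which the box is referenced in a VagrantFile’s config.vm.box setting.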

Stand On The Shoulders of Giants… er, Others

Preparing a custom box is considered “advanced” use of Vagrant, with the recommendation being to use a pre-prepared box from the Vagrant community where possible.

This is great advice, except that the catalogue of such boxes contains a lot of boxes with little to no documentation explaining what each box is, contains or is good for, beyond identifying the OS and the providers that the box supports.

Since a box is a template VM, the VM image format in a box also determines which provider it may be used with. Though a box may contain templates for multiple providers, most boxes in the public catalogues only seem to support the (default) VirtualBox provider; very few support Hyper-V. In particular, there was no official Ubuntu 20.04 (aka “focal”) box for Hyper-V.

Unable to vouch for the content of public boxes or be certain of how the template VM’s they contained were configured, I opted to create my own box.

Apart from anything else, this presented another learning opportunity! \o/

Preparing a VM to be Boxed

To enable Vagrant to work its magic, the OS in the template VM has to be prepared in a very precise manner.

Briefly:

First, a specific user account must be configured with a prescribed password and provided with password-less sudo (in the case of Linux VMs).
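As a sketch of what that means in practice: Vagrant’s base box conventions revolve around a vagrant user, and the password-less sudo requirement typically amounts to a sudoers drop-in along these lines:

  # /etc/sudoers.d/vagrant: allow the 'vagrant' user password-less sudo
  vagrant ALL=(ALL) NOPASSWD: ALL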

Secondly, additional networking components must be installed to enable Vagrant to inspect the internals of the machine and to determine that the network stack in the VM has been provisioned correctly. Again, the documentation on this point is incomplete and inaccurate when it comes to Ubuntu 20.04, and it took me some time trawling through duckduckgo search results to figure out the solution!

The solution I eventually found was in this answer on Stack Overflow. I followed the answer entirely, though a commenter had suggested that only installing the additional packages (linux-virtual, linux-cloud-tools-virtual and linux-tools-virtual) was required.
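For reference, installing just those packages (the commenter’s suggested minimum) comes down to:

  sudo apt-get update
  sudo apt-get install -y linux-virtual linux-cloud-tools-virtual linux-tools-virtual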

There are additional/alternate considerations when preparing boxes containing VM’s with an OS other than Linux (or different flavours of Linux).

Simply put, preparing a VM box involves rather more than just installing the OS you want to use in a configured VM template.

And every time a problem with provisioning a VM is determined to be due to an issue with the “box”, the template VM has to be reconfigured or fixed and the box re-exported and re-packaged. This is not necessarily difficult but was both time-consuming and frustrating.

Good, Not Great

Even when everything was working, Vagrant still did not present a fully automated solution.

Particularly annoying was the inability to specify the desired network switch in the VagrantFile, with Vagrant instead interactively prompting for a switch to be selected as each VM was created.

The VM creation process would also prompt – again, for each VM – for host OS credentials to mount the project folder into each guest VM. Since I didn’t need this, I could perhaps have disabled this behaviour, but I didn’t research that as I had to give up on Vagrant entirely for a different reason.
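For what it’s worth, the documented way to switch off that default folder sync is a one-liner in the VagrantFile; I note it here for completeness, not having verified it in this setup:

  # Disable the default sync of the project folder into the guest
  config.vm.synced_folder ".", "/vagrant", disabled: true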

The Vagrant Deal-Breaker

After a lot of trouble to get Vagrant “working” I then discovered that my preferred VLAN configuration was going to cause a problem.

With a couple of test VM’s and the host OS on the same VLAN, everything worked well. Vagrant would create and configure my VM’s (after I answered the prompts) and report once they were operating correctly.

But this does not appear to work when a VM is on a different VLAN than the host OS, as was the case with my actual Kubernetes node VMs! I simply could not get Vagrant to acknowledge (and therefore fully configure) the VM’s when they were placed on a separate VLAN.

I confirmed that the VLAN was the cause by moving the host system onto the same VLAN as the VM’s, at which point Vagrant was once again happy.

I don’t understand why this is the case, since the UniFi Security Gateway is configured to allow routing of traffic between VLAN’s and I don’t otherwise have any problem with machines communicating across VLAN’s. In particular, with my initial manual install of Kubernetes with the same VLAN separation, there was no issue with the nodes, Kubernetes networking or my ability to communicate with them or the host.

This left me with three issues:

  1. Being aware that networking is one of the more complicated aspects of Kubernetes, I was concerned that this issue might be symptomatic of a bigger networking problem-berg lurking beneath a Vagrant-based approach.
  2. Despite all the work involved in preparing a box, Vagrant was unable to completely configure my VM’s, at least not fully automatically.
  3. The degree of automation provided by Vagrant was significantly offset by the amount of up-front work required to prepare a box, not to mention the “fussiness” of that work.

So I decided to explore an alternative automation solution.

The Alternative: PowerShell

The alternative I switched to was simple PowerShell scripting, which I’ll cover in the next post, along with the actual creation of the Kubernetes cluster.
