This tutorial demonstrates how to create a high-availability (HA) Vault cluster with Consul as the storage backend. Vault is mainly used to store passwords, keys, tokens, and certificates. The cluster needs two Vault machines and three Consul machines; the diagram below shows the layout.
Vault HA Architecture
In the HA Vault architecture, one Vault node is active and the other is passive (standby). The active node handles all requests (reads and writes), and standby nodes redirect requests to the active node.
We will set up a cluster similar to the one in the diagram. For this, provision two Vault machines and three Consul machines. I have provisioned all my machines in the Azure cloud. To provision multiple virtual machines, you can use Ansible or Terraform. Please refer to my older posts about provisioning multiple VMs in the Azure cloud:
Provision using Ansible playbook
Provision using Terraform script
Server details
My 5 servers are up and running in the cloud. All servers are provisioned in the same VNET so they can communicate with each other using the private IP address or the VM name.
VM name | Private IP address |
vm-vault-1 | 50.1.0.5 |
vm-vault-2 | 50.1.0.6 |
vm-stage-consul-1 | 50.1.4.4 |
vm-stage-consul-2 | 50.1.4.5 |
vm-stage-consul-3 | 50.1.4.6 |
Vault and Consul installation steps
First, log in to the first Consul machine, download the Consul binary (version 1.9.1 here) from the official HashiCorp releases page, and move it into the /usr/local/bin/consul directory.
Step.1:
vm-stage-consul-1: sudo mkdir /usr/local/bin/consul -p
vm-stage-consul-1: wget https://releases.hashicorp.com/consul/1.9.1/consul_1.9.1_linux_amd64.zip
vm-stage-consul-1: unzip consul_1.9.1_linux_amd64.zip
vm-stage-consul-1: sudo mv consul /usr/local/bin/consul
vm-stage-consul-1:/usr/local/bin/consul$ ls
consul
Step.2:
Create a Consul configuration file at /usr/local/etc/consul/consul_s1.json as follows, and create a data directory (/var/consul/data) to store the data.
vm-stage-consul-1: sudo mkdir -p /var/consul/data
vm-stage-consul-1:/usr/local/etc/consul$ cat consul_s1.json
{
  "server": true,
  "node_name": "consul_s1",
  "datacenter": "dc1",
  "data_dir": "/var/consul/data",
  "bind_addr": "0.0.0.0",
  "client_addr": "0.0.0.0",
  "advertise_addr": "50.1.4.4",
  "bootstrap_expect": 3,
  "retry_join": ["50.1.4.4", "50.1.4.5", "50.1.4.6"],
  "ui": true,
  "log_level": "DEBUG",
  "enable_syslog": true,
  "acl_enforce_version_8": false
}
Notice that the server parameter is set to true to indicate that this instance will run in server mode.
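The bootstrap_expect value of 3 matches the three Consul servers provisioned for this cluster. As a rough illustration of why three servers are used, Consul servers elect a leader by majority, so a cluster of N servers needs floor(N/2) + 1 of them available. The function names below are illustrative helpers, not Consul commands.

```shell
# Majority (quorum) size for a cluster of N servers: floor(N/2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of servers that can fail while a majority is still available.
fault_tolerance() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

quorum 3            # prints 2
fault_tolerance 3   # prints 1
```

With three servers, the cluster keeps working if one server is lost.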
Step.3:
Next, create a consul systemd unit file under /etc/systemd/system/consul.service and a PID file under /var/run/consul/consul-server.pid
sudo mkdir /var/run/consul
sudo touch /var/run/consul/consul-server.pid
$ cat /etc/systemd/system/consul.service
### BEGIN INIT INFO
# Provides: consul
# Required-Start: $local_fs $remote_fs
# Required-Stop: $local_fs $remote_fs
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Consul agent
# Description: Consul service discovery framework
### END INIT INFO

[Unit]
Description=Consul server agent
Requires=network-online.target
After=network-online.target

[Service]
PIDFile=/var/run/consul/consul-server.pid
PermissionsStartOnly=true
ExecStart=/usr/local/bin/consul/consul agent \
  -config-file=/usr/local/etc/consul/consul_s1.json \
  -pid-file=/var/run/consul/consul-server.pid
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
KillSignal=SIGTERM
Restart=on-failure
RestartSec=42s

[Install]
WantedBy=multi-user.target
Perform Step.1 to Step.3 on the remaining Consul servers, changing the following parameters:
File name: consul_s2.json, consul_s3.json, and so on
node_name: consul_s2 (in consul_s2.json), consul_s3 (in consul_s3.json)
advertise_addr: the private IP address of that server
-config-file path in consul.service: the config file name used on that server
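Since only the node name and the advertised address differ between the server configs, the file can be generated rather than edited by hand. The following is a minimal sketch, assuming the same settings as consul_s1.json; render_consul_server_config is a hypothetical helper name, not a Consul command.

```shell
# Hypothetical helper: print a Consul server config for one node.
# $1 = node name (for example consul_s2), $2 = that server's private IP address.
render_consul_server_config() {
  cat <<EOF
{
  "server": true,
  "node_name": "$1",
  "datacenter": "dc1",
  "data_dir": "/var/consul/data",
  "bind_addr": "0.0.0.0",
  "client_addr": "0.0.0.0",
  "advertise_addr": "$2",
  "bootstrap_expect": 3,
  "retry_join": ["50.1.4.4", "50.1.4.5", "50.1.4.6"],
  "ui": true,
  "log_level": "DEBUG",
  "enable_syslog": true,
  "acl_enforce_version_8": false
}
EOF
}

# Example: preview the config for the second server.
render_consul_server_config consul_s2 50.1.4.5
```

To write the file, the output could be piped to sudo tee /usr/local/etc/consul/consul_s2.json on the second server.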
Once Consul is installed and configured on all nodes, execute the commands below on each machine to start the service. The status output shows the unit as disabled, so it will not start at boot unless you also run sudo systemctl enable consul.
vm-stage-consul-1:~$ sudo systemctl daemon-reload
vm-stage-consul-1:~$ sudo systemctl restart consul
vm-stage-consul-1:~$ sudo systemctl status consul
● consul.service - Consul server agent
   Loaded: loaded (/etc/systemd/system/consul.service; disabled; vendor preset: enabled)
   Active: active (running) since Thu 2021-06-03 15:43:46 UTC; 6s ago
 Main PID: 4984 (consul)
    Tasks: 7 (limit: 2263)
   CGroup: /system.slice/consul.service
           └─4984 /usr/local/bin/consul/consul agent -config-file=/usr/local/etc/consul/consul_s1.json -pid-file=/var/run/consul/consul-server.pid

Jun 03 15:43:47 vm-stage-consul-1 consul[4984]: 2021-06-03T15:43:47.831Z [DEBUG] agent.server.serf.wan: serf: messageJoinType: consul_s1.dc1
To confirm that all Consul servers are up and running, execute the command below on server 1.
vm-stage-consul-1:~$ /usr/local/bin/consul/consul members
Node       Address        Status  Type    Build  Protocol  DC   Segment
consul_s1  50.1.4.4:8301  alive   server  1.9.1  2         dc1  <all>
consul_s2  50.1.4.5:8301  alive   server  1.9.1  2         dc1  <all>
consul_s3  50.1.4.6:8301  alive   server  1.9.1  2         dc1  <all>
Here we can see that all three Consul servers are alive.
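The same check can be scripted. The following is a small sketch, assuming the column layout of the consul members output shown above (Status in the third column, Type in the fourth); count_alive_servers is a hypothetical helper name.

```shell
# Hypothetical helper: count alive servers in `consul members` output read from stdin.
# The header row is skipped; column 3 is Status and column 4 is Type.
count_alive_servers() {
  awk 'NR > 1 && $3 == "alive" && $4 == "server" { n++ } END { print n + 0 }'
}

# Sample input copied from the tutorial output. On a real node, pipe
# `/usr/local/bin/consul/consul members` into the function instead.
count_alive_servers <<'EOF'
Node       Address        Status  Type    Build  Protocol  DC   Segment
consul_s1  50.1.4.4:8301  alive   server  1.9.1  2         dc1  <all>
consul_s2  50.1.4.5:8301  alive   server  1.9.1  2         dc1  <all>
consul_s3  50.1.4.6:8301  alive   server  1.9.1  2         dc1  <all>
EOF
```

For this cluster the expected count is 3, matching bootstrap_expect.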
Set up consul client agents on vault nodes
Step.4:
To install the Consul client agent on the Vault nodes, log in to the first Vault server and download the binary:
vm-vault-1:~$ wget https://releases.hashicorp.com/consul/1.9.1/consul_1.9.1_linux_amd64.zip
vm-vault-1:~$ unzip consul_1.9.1_linux_amd64.zip
vm-vault-1:~$ sudo mkdir -p /usr/local/bin/consul
vm-vault-1:~$ sudo mv consul /usr/local/bin/consul/
Create the configuration file /usr/local/etc/consul/consul_c1.json as follows. Note that I used c1 in the name because this is a Consul client agent.
vm-vault-1:~$ sudo mkdir /usr/local/etc/consul
vm-vault-1:~$ sudo vim /usr/local/etc/consul/consul_c1.json
{
  "server": false,
  "datacenter": "dc1",
  "node_name": "consul_c1",
  "data_dir": "/var/consul/data",
  "bind_addr": "50.1.0.5",
  "client_addr": "127.0.0.1",
  "retry_join": ["50.1.4.4", "50.1.4.5", "50.1.4.6"],
  "log_level": "DEBUG",
  "enable_syslog": true,
  "acl_enforce_version_8": false
}
Here “bind_addr” is the private IP address of the Vault server and not the Consul server IP address. Add all Consul server IPs under the “retry_join” field.
Copy the systemd unit file from Step.3 to /etc/systemd/system/consul.service, change the -config-file path to /usr/local/etc/consul/consul_c1.json, and start the Consul service as before by following the Step.3 commands.
Log in to the second Vault server and perform the same steps. Don’t forget to change the properties below.
Filename: consul_c2.json
node_name: consul_c2
bind_addr: 50.1.0.6 (the second Vault server’s private IP address)
All the changes related to Consul have been completed. Next, configure the Vault server.
Configure the vault server
Step.5:
Download the Vault binary from the HashiCorp website and save it to /usr/local/bin/, then create the configuration file /etc/vault/vault_server.hcl.
vm-vault-1:~$ wget https://releases.hashicorp.com/vault/1.6.1/vault_1.6.1_linux_amd64.zip
unzip vault_1.6.1_linux_amd64.zip
sudo mv vault /usr/local/bin/
sudo mkdir /etc/vault/
sudo vim /etc/vault/vault_server.hcl
listener "tcp" {
  address = "0.0.0.0:8200"
  cluster_address = "50.1.0.5:8201"
  tls_disable = "true"
}

storage "consul" {
  address = "127.0.0.1:8500"
  path = "vault/"
}

api_addr = "http://50.1.0.5:8200"
cluster_addr = "https://50.1.0.5:8201"
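The only values in this file that are specific to a node are the three occurrences of the Vault server's private IP address. As a minimal sketch, the file could be generated per node as follows; render_vault_config is a hypothetical helper name, and the settings are copied from the config above.

```shell
# Hypothetical helper: print the Vault server config for one node.
# $1 = private IP address of the Vault node the file is meant for.
render_vault_config() {
  cat <<EOF
listener "tcp" {
  address = "0.0.0.0:8200"
  cluster_address = "$1:8201"
  tls_disable = "true"
}

storage "consul" {
  address = "127.0.0.1:8500"
  path = "vault/"
}

api_addr = "http://$1:8200"
cluster_addr = "https://$1:8201"
EOF
}

# Example: preview the config for the second Vault server.
render_vault_config 50.1.0.6
```

The storage address stays at 127.0.0.1:8500 on every Vault node because each one talks to its local Consul client agent.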
Once the configuration file is created, create a systemd unit file at /etc/systemd/system/vault.service as follows. Make sure that the PID file /var/run/vault/vault.pid is in place.
vm-vault-1:~$ cat /etc/systemd/system/vault.service
### BEGIN INIT INFO
# Provides: vault
# Required-Start: $local_fs $remote_fs
# Required-Stop: $local_fs $remote_fs
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Vault server
# Description: Vault secret management tool
### END INIT INFO

[Unit]
Description=Vault secret management tool
Requires=network-online.target
After=network-online.target

[Service]
PIDFile=/var/run/vault/vault.pid
ExecStart=/usr/local/bin/vault server -config=/etc/vault/vault_server.hcl -log-level=debug
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
KillSignal=SIGTERM
Restart=on-failure
RestartSec=42s
LimitMEMLOCK=infinity

[Install]
WantedBy=multi-user.target
Reload the service daemon and start the vault service.
sudo systemctl daemon-reload
sudo systemctl start vault
sudo systemctl status vault
● vault.service - Vault secret management tool
   Loaded: loaded (/etc/systemd/system/vault.service; disabled; vendor preset: enabled)
   Active: active (running) since Thu 2021-06-03 17:01:27 UTC; 6s ago
 Main PID: 19648 (vault)
    Tasks: 6 (limit: 4074)
   CGroup: /system.slice/vault.service
           └─19648 /usr/local/bin/vault server -config=/etc/vault/vault_server.hcl -log-level=debug

Jun 03 17:01:28 vm-vault-1 vault[19648]: 2021-06-03T17:01:28.023Z [DEBUG] storage.consul: config path set: path=vault/
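Besides systemctl status, Vault's /v1/sys/health HTTP endpoint reports the server state through its status code. The sketch below maps the default codes to a short description; describe_vault_health is a hypothetical helper name, and the curl call in the comment assumes the listener address from the config above.

```shell
# Hypothetical helper: translate the HTTP status code returned by Vault's
# /v1/sys/health endpoint (default settings) into a short description.
describe_vault_health() {
  case "$1" in
    200) echo "initialized, unsealed, active" ;;
    429) echo "unsealed, standby" ;;
    501) echo "not initialized" ;;
    503) echo "sealed" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# On a Vault node the code could be fetched with curl, for example:
#   code=$(curl -s -o /dev/null -w '%{http_code}' http://127.0.0.1:8200/v1/sys/health)
describe_vault_health 501   # prints "not initialized"
```

A freshly started server that has not been initialized yet is expected to answer with 501.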
Perform Step.5 on Vault server 2 as well, replacing 50.1.0.5 with 50.1.0.6 in vault_server.hcl. If the Vault service starts without any issue, the Vault and Consul installation is complete. Log in to Consul server 1 and execute the command below to make sure that all the nodes are up.
vm-stage-consul-1:~$ /usr/local/bin/consul/consul members
Node       Address        Status  Type    Build  Protocol  DC   Segment
consul_s1  50.1.4.4:8301  alive   server  1.9.1  2         dc1  <all>
consul_s2  50.1.4.5:8301  alive   server  1.9.1  2         dc1  <all>
consul_s3  50.1.4.6:8301  alive   server  1.9.1  2         dc1  <all>
consul_c1  50.1.0.5:8301  alive   client  1.9.1  2         dc1  <default>
consul_c2  50.1.0.6:8301  alive   client  1.9.1  2         dc1  <default>
Next, initialize Vault. Export the Vault address, then run the init command:
vm-vault-1:~$ export VAULT_ADDR='http://127.0.0.1:8200'
vm-vault-1:~$ vault operator init
Unseal Key 1: KyHzE+WPqgN759d7hXNiEK2DJUIlgW1H7KvpiSdGjfmF
Unseal Key 2: N839Ijnn7KvtbFC8NrBS4alwFmO6w5b1rXLPFR7c1fcg
Unseal Key 3: idzt+yxUuVofVWrENX3mlb64VPgIOoixqsk8QU3fr00w
Unseal Key 4: igeYuAaXE84F78cH0ZXqxMnR5qsjJ0DVpGBFlYfQuskk
Unseal Key 5: qes2c+iHWlVVJekdY6tCeUtU3T/fKUAYhlLDn04o9n6A

Initial Root Token: s.f4ywih3cFm6uRg7sqM5Kj9mg
The above command generates five unseal keys and an initial root token. Now that Vault is initialized, go ahead and unseal the first Vault server.
vm-vault-1:~$ vault operator unseal KyHzE+WPqgN759d7hXNiEK2DJUIlgW1H7KvpiSdGjfmF
Key                Value
---                -----
Seal Type          shamir
Initialized        true
Sealed             true
Total Shares       5
Threshold          3
Unseal Progress    1/3
Unseal Nonce       97605fdd-3942-9700-d0f7-0cb3b175bdc6
Version            1.6.1
Storage Type       consul
HA Enabled         true
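The Unseal Progress line in this output shows how many key shares have been accepted out of the threshold. As a small sketch, the number of keys still needed can be read from that line; remaining_unseal_keys is a hypothetical helper name, and it assumes the key/value layout printed by the vault CLI above.

```shell
# Hypothetical helper: read `vault operator unseal` or `vault status` output on
# stdin and print how many more key shares are needed. Prints 0 when there is
# no "Unseal Progress" line, which is the case once the server is unsealed.
remaining_unseal_keys() {
  awk '/^Unseal Progress/ { split($3, p, "/"); left = p[2] - p[1] } END { print left + 0 }'
}

# Sample input taken from the output above.
printf 'Sealed             true\nUnseal Progress    1/3\n' | remaining_unseal_keys   # prints 2
```

Running vault operator unseal without a key argument prompts for the key instead, which keeps it out of the shell history.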
Repeat the “vault operator unseal” command with two more keys to reach the threshold of three and unseal Vault. Once done, execute the “vault status” command to see the status of Vault.
vm-vault-1:~$ vault status
Key             Value
---             -----
Seal Type       shamir
Initialized     true
Sealed          false
Total Shares    5
Threshold       3
Version         1.6.1
Storage Type    consul
Cluster Name    vault-cluster-3f751a14
Cluster ID      6deb80f5-ec50-99a3-8a50-f9ab6c293920
HA Enabled      true
HA Cluster      https://50.1.0.5:8201
HA Mode         active
Here we can see that vm-vault-1 is active. Export VAULT_ADDR on the second Vault server, unseal it using the same unseal keys (any three of the five), and execute vault status. The second server will be in standby mode.
vm-vault-2:~$ vault status
Key                    Value
---                    -----
Seal Type              shamir
Initialized            true
Sealed                 false
Total Shares           5
Threshold              3
Version                1.6.1
Storage Type           consul
Cluster Name           vault-cluster-3f751a14
Cluster ID             6deb80f5-ec50-99a3-8a50-f9ab6c293920
HA Enabled             true
HA Cluster             https://50.1.0.5:8201
HA Mode                standby
Active Node Address    http://50.1.0.5:8200
Both Vault servers are now running in HA mode, one active and one standby.