# Setup on Ubuntu

```
apt install ansible ansible-mitogen
```

# Required collections

```
ansible-galaxy install -r roles/requirements.yml
```

# Privileged data

Privileged data is stored in Bitwarden. To use roles that fetch privileged data,
the following utilities must be available:

* [bw](https://bitwarden.com/help/cli/)

Once installed, log in and unlock the vault:

```
bw login   # or: bw unlock
export BW_SESSION=xxxx
bw sync -f
```
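
To confirm the session is usable before running any playbooks, a quick check like the following can help; the item name is only a placeholder for a real vault entry:

```
# Should report "unlocked" once BW_SESSION is exported
bw status
# Fetch a secret by item name (placeholder shown here)
bw get password "example-item"
```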

# Running playbooks

```
ansible-playbook -i hosts [-l SUBSET] site.yml
```

## Skip slow tasks

`ansible-playbook --skip-tags slow`
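
For example, to limit the run to a single host while skipping the slow tasks (the host name is hypothetical):

```
ansible-playbook -i hosts -l ci-host01.internal.efficios.com --skip-tags slow site.yml
```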

# Bootstrapping hosts

## CI host

### Debian

1. Boot host with PXE
2. Select the option `Debian Bookworm amd64 (CI-host)` or equivalent
3. Post-preseed verifications:
   * Check that start-stop-daemon is available in `$PATH`. If not: `touch /sbin/start-stop-daemon; chmod +x /sbin/start-stop-daemon; apt-get install --reinstall dpkg`
   * Verify that the ZFS pool `tank` exists on the target host. If not, create it, e.g. `zpool create -f tank mirror dev1 dev2`
4. Add the host to the ansible inventory in the hosts group and in the appropriate cluster group (see the inventory sketch after this list)
5. For LXD hosts, add the host to the `lxd` group
6. Follow the appropriate LXD or Incus cluster steps
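
As a rough sketch, the inventory addition might look like the following, assuming an INI-style `hosts` inventory; the host name and the cluster group are examples to adapt:

```
# hosts (inventory) -- names below are examples only
[hosts]
ci-host-example.internal.efficios.com

[lxd]
ci-host-example.internal.efficios.com
```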

### Windows

1. Configure either an SSH or WinRM connection: see https://docs.ansible.com/ansible/latest/os_guide/windows_setup.html (a minimal host_vars sketch follows this list)
2. For arm64 hosts:
   * Install the necessary optional features (e.g. OpenSSH, Hyper-V), since Windows RSAT isn't available on Arm64 yet
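
A minimal host_vars sketch for an SSH-managed Windows host, assuming the OpenSSH server feature is installed and PowerShell is set as the default shell; the file name and values are assumptions:

```
# host_vars/windows-host-example.yml
ansible_connection: ssh
ansible_shell_type: powershell
ansible_user: Administrator
```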

## CI 'rootnode'

1. Add the new ansible node to the `node_standalone` group in the inventory
2. Add an entry to the `vms` variable in the host vars for the libvirt host
   * See the defaults and details in `roles/libvirt/vars/main.yml` and `roles/libvirt/tasks/main.yml`
   * Make sure to set the `cdrom` key to the path of the ISO for the installer (see the sketch after this list)
3. Run the playbook, e.g. `ansible-playbook -i hosts -l cloud07.internal.efficios.com site.yml`
   * The VM should be created and started
4. Once the VM is installed, take a snapshot so that Jenkins may revert to the original state
   * `ansible-playbook playbooks/snapshot-rootnode.yml -e '{"revert_before": false}' -l new-rootnode`
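
As a rough sketch only: the exact schema of `vms` should be checked against `roles/libvirt/vars/main.yml`; apart from `cdrom`, the keys and values below are assumptions:

```
# host_vars entry for the libvirt host (hypothetical)
vms:
  - name: ci-rootnode-example
    cdrom: /path/to/installer.iso
```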

### Ubuntu auto-installer

1. Note your IP address
2. Switch to the directory with the user-data files: `cd roles/libvirt/files`
3. Write out the instance-specific metadata, e.g.

```
cat > meta-data <<EOF
instance-id: iid-XXX
hostname: XXX.internal.efficios.com
EOF
```
* The `instance-id` is used to determine whether re-installation is necessary.
4. Start a python web server: `python3 -m http.server 3003`
5. Connect to the VM using a remote viewer on the address given by `virsh --connect qemu+ssh://root@host/system domdisplay`
6. Edit the grub boot options for the installer and append the following as arguments for the kernel: `autoinstall 'ds=nocloud-net;s=http://IPADDRESS:3003/'` and boot the installer
   * Note that the trailing `/` and quoting are important
   * This will load the `user-data`, `meta-data`, and `vendor-data` files from the directory served by the python web server
7. After the installation is complete, the system will reboot and run cloud-init for the final portion of the initial setup. Once completed, ansible can be run against it using the ubuntu user and becoming root, e.g. `ansible-playbook -i hosts -u ubuntu -b ...`
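
For example, a complete first run against a freshly installed rootnode might look like this (the host name is hypothetical):

```
ansible-playbook -i hosts -u ubuntu -b -l ci-rootnode-example.internal.efficios.com site.yml
```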

# LXD Cluster

## Start a new cluster

1. For the initial member of the cluster, set the `lxd_cluster` variable in the host variables to something similar to:

```
lxd_cluster:
  server_name: cluster-member-name
  enabled: true
  member_config:
    - entity: storage-pool
      name: default
      key: source
      value: tank/lxd
```

2. Run the `site.yml` playbook on the node
3. Verify that the storage pool is configured:

```
$ lxc storage list
| name | driver | state |
| default | zfs | created |
```

* If not present, create it on necessary targets:

```
$ lxc storage create default zfs source=tank/lxd --target=cluster-member-name
# Repeat for any other members
# Then, on the member itself
$ lxc storage create default zfs
# The storage listed should not be in the 'pending' state
```

4. Create a metrics certificate pair for the cluster, or use an existing one

```
openssl req -x509 -newkey ec -pkeyopt ec_paramgen_curve:secp384r1 -sha384 -keyout metrics.key -nodes -out metrics.crt -days 3650 -subj "/CN=metrics.local"
lxc config trust add metrics.crt --type=metrics
```
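
Once trusted, the key pair can be used to scrape the LXD metrics endpoint, e.g. from Prometheus. A quick manual check might look like the following, assuming the members expose the default HTTPS listener on port 8443 (`--insecure` only skips verification of the server certificate):

```
curl --silent --insecure --cert metrics.crt --key metrics.key \
    https://cluster-member-name:8443/1.0/metrics | head
```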

## Adding a new host

1. On the existing host or cluster, generate a token for the new member: `lxc cluster add member-host-name`
2. In the member's host_vars file, set the following keys:
   * `lxd_cluster_ip`: The IP address on which the server will listen
   * `lxd_cluster`: In a fashion similar to the following entry
```
lxd_cluster:
  enabled: true
  # Same as the name from the token created above
  server_name: 'member-host-name'
  # This should match `lxd_cluster_ip`
  server_address: 172.18.0.192
  cluster_token: 'xxx'
  member_config:
    - entity: storage-pool
      name: default
      key: source
      value: tank/lxd
```
* The `cluster_token` does not need to be kept in git after the playbook's first run
3. Assuming the member is in the hosts group of the inventory, run the `site.yml` playbook.
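
To confirm that the new member joined, list the cluster members from any existing node; the new entry should eventually report an ONLINE state:

```
lxc cluster list
```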

## Managing instances

Local requirements:

* python3, python3-dnspython, python3-jenkins, samba-tool, kinit

To automatically provision instances, perform certain operations, and update DNS entries:

1. Update `vars/ci-instances.yml`
2. Open a kerberos ticket with `kinit`
3. Run the playbook, e.g. `ansible-playbook playbooks/ci-instances.yml`
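
A typical session might look like the following; the Kerberos principal is a placeholder to adapt to your realm:

```
kinit username@EXAMPLE.REALM
ansible-playbook playbooks/ci-instances.yml
```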

# Incus cluster

## Migration from LXD

1. Run the `site.yml` playbook on the hosts to install `incus` and `incus-tools`
2. On one cluster member, start the `lxd-to-incus` script, and follow the prompts
3. On each other cluster member, start `lxd-to-incus --cluster-member`
4. When prompted on each cluster member, uninstall `lxd`.
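
After the migration, the cluster state can be checked with the Incus client, for example:

```
# All members should be reported as ONLINE after the migration
incus cluster list
```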