soifai
This repository contains the configuration for setting up computers for participants, either for contests or for training. The installation is done with FAI (Fully Automatic Installation); read the FAI guide to learn about it.
The way it works is that you have a FAI server, which is just a computer running a lot of different servers. The computers to be installed are then network booted, and the installation happens automatically, which takes about 10 minutes. Finally, the computers can be rebooted into the installed operating system.
When the installed computers are running, you can edit the configuration and perform a soft update. This applies the entire configuration without reinstalling the machines, and it only takes a few seconds.
Here is what each server on the FAI server does:
- DHCP: This assigns IP addresses to machines, and also tells them the TFTP server and filename for network booting.
- TFTP: This serves the bootloader and kernel which runs during the installation. It also has a config file for each machine, which tells it what to do after network booting.
- NFS: Short for Network File System. It is read-only for clients, and contains two things:
  - The NFS root. This is the root file system of the OS which runs during the installation.
  - The config space. This is stored in this repository, and it defines how the machines are configured.
- SSH: When the installation is complete, the installer uploads log files via ssh, and also disables installation for the just installed machine.
- NTP: This serves the time, important for contests.
- apt-cacher-ng: This is a proxy for APT, which caches all downloaded packages. That way, the machines don't need to be connected to the internet for the installation, and you only download everything once.
- FAI monitor: This is a nice GUI which shows you the installation progress of all machines. It is optional.
Setting up the FAI server
This part is unfortunately not automatic, but here is how to do it. I don't think it's a good idea to install this on your personal laptop, so instead I made a fresh install of Ubuntu on an external SSD and installed FAI server there. You could consider using Debian instead of Ubuntu, since Ubuntu needs a few extra steps.
When installing the system, set the hostname to contestserver.
Don't use fai as your username, because that is reserved for the log upload.
Installing Ubuntu on an external disk
This works almost like a normal install: Download the Ubuntu installer ISO, flash it to a USB drive, and boot from it. However, there is one issue, which is that the EFI bootloader needs to be installed in a different way on external disks, and the Ubuntu installer does not support that. Here are some guides:
- https://wiki.ubuntuusers.de/EFI_Externer-Datentr%C3%A4ger/ (it seems this doesn't work, but it explains some things)
- https://www.58bits.com/blog/2020/02/28/how-create-truly-portable-ubuntu-installation-external-usb-hdd-or-ssd
You can do it like this: (TODO: test these exact steps)
- Start the installer from the terminal with bootloader installation disabled:
ubiquity -b &
- When partitioning, create an EFI partition and a root filesystem partition
- Boot the system via the installer drive:
  - boot from installer drive
  - enter grub command line
  - run configfile (hd1,gpt2)/boot/grub/grub.cfg (the hd1 part depends, it should be the external disk)
- Unmount /boot/efi, edit /etc/fstab and set the UUID for the EFI partition from the external disk, then mount it.
- Run grub-install -d /usr/lib/grub/x86_64-efi --efi-directory=/boot/efi/ --removable /dev/sdX (where sdX is the external disk)
- Alternatively you can do the chroot approach described in the guide
Setup
Connect the Ethernet port to a switch or another laptop, so that the link is up.
Set the IP address of the interface to static 10.0.0.9, netmask 255.255.255.0, no gateway.
Connect to the internet over a second network interface, e.g. Wifi or USB tethering with your phone. You need an internet connection while installing the machines, for apt-cacher-ng.
Wireshark (optional, useful for debugging network problems):
sudo apt install wireshark
sudo usermod -a -G wireshark $USER
You need to reboot before you can use wireshark.
Time server:
sudo apt install ntp
You should also install ntp on the grader. That way, the grader and the computers have exactly the same time. The NTP server on the FAI server is used as a fallback.
Parallel ssh:
sudo apt install pssh
Only on Ubuntu:
sudo apt install debian-archive-keyring
Install FAI:
sudo apt install perl-tk fai-quickstart
Set content of /etc/dhcp/dhcpd.conf:
authoritative;
#deny unknown-clients;
option dhcp-max-message-size 2048;
use-host-decl-names on;
#always-reply-rfc1048 on;
subnet 10.0.0.0 netmask 255.255.255.0 {
    range 10.0.0.100 10.0.0.150;
    option routers 10.0.0.9;
    option domain-name "contest";
    option domain-name-servers 10.0.0.9;
    option ntp-servers 10.0.0.9;
    server-name contestserver;
    next-server 10.0.0.9;
    if substring(option vendor-class-identifier, 0, 20) = "PXEClient:Arch:00000" {
        filename "fai/pxelinux.0";
    }
    if substring(option vendor-class-identifier, 0, 20) = "PXEClient:Arch:00007" {
        filename "fai/syslinux.efi";
    }
}
Reload: sudo systemctl restart isc-dhcp-server
Package download cache:
sudo apt install apt-cacher-ng
Set a password for actions in the web interface in /etc/apt-cacher-ng/security.conf.
Cache report at http://localhost:3142/acng-report.html
Only on Ubuntu: Ubuntu does not have the sysvinit-core package, so we can't use it for the NFS root. Use Debian instead:
- Replace content of /etc/fai/apt/sources.list with https://github.com/faiproject/fai/blob/master/conf/sources.list
- Replace content of /etc/fai/nfsroot.conf with https://github.com/faiproject/fai/blob/master/conf/nfsroot.conf
- Replace content of /etc/fai/NFSROOT with https://github.com/faiproject/fai/blob/master/conf/NFSROOT
Edit /etc/fai/nfsroot.conf and change these variables:
FAI_DEBOOTSTRAP_OPTS="--force-check-gpg"
FAI_CONFIGDIR=/srv/soifai/config
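If you would rather script this edit than do it by hand, the following is a minimal sketch. The helper name set_conf_var is hypothetical and not part of FAI. It assumes the file uses plain KEY=VALUE lines, replaces the first assignment of the variable (commented out or not), and appends the assignment if none exists.

```shell
# set_conf_var FILE KEY VALUE
# Hypothetical helper, not part of FAI: set KEY=VALUE in a shell-style
# config file. The first existing assignment (even if commented out) is
# replaced; if there is none, the assignment is appended at the end.
set_conf_var() {
    conf_file=$1
    tmp_file=$(mktemp) || return 1
    awk -v key="$2" -v value="$3" '
        !found && $0 ~ ("^#?[ \t]*" key "=") { print key "=" value; found = 1; next }
        { print }
        END { if (!found) print key "=" value }
    ' "$conf_file" > "$tmp_file" && cat "$tmp_file" > "$conf_file"
    status=$?
    rm -f "$tmp_file"
    return $status
}

# Example (run as root):
# set_conf_var /etc/fai/nfsroot.conf FAI_DEBOOTSTRAP_OPTS '"--force-check-gpg"'
# set_conf_var /etc/fai/nfsroot.conf FAI_CONFIGDIR /srv/soifai/config
```

Check the resulting file before running fai-setup, since the sketch does not try to handle multi-line values.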
Edit /etc/fai/fai.conf and uncomment LOGUSER=fai.
Clone this repository and move it to /srv (move with sudo).
Set up FAI:
sudo fai-setup -v
Now you need to download some files:
cd /srv/soifai/config
mkdir downloads
Download VS Code extensions and put them into the /srv/soifai/config/downloads folder.
- https://marketplace.visualstudio.com/items?itemName=swissolyinfo.soicode
- https://marketplace.visualstudio.com/items?itemName=ms-vscode.cpptools
- https://marketplace.visualstudio.com/items?itemName=ms-python.python
Note: Some extensions (e.g. cpptools and python) may have pre-release versions, which should be avoided. But the marketplace website does not show which versions are pre-release, and if you just click "Download Extension", you will get a pre-release if there is one. To see which version is the most recent non-prerelease, open the extension in the marketplace in VS Code and look for the version tag. Then download this version in the "Version History" tab on the marketplace website.
Download the SOI Code::Blocks template for Ubuntu and put it in /srv/soifai/config/downloads.
Download the SOI header:
wget -qO - https://git.soi.ch/SOI/soi-header/archive/master.tar.gz > downloads/soi-header.tar.gz
Create an ssh key and add the public key to /srv/soifai/config/simplefiles/CONTESTANT/root/.ssh/authorized_keys.
ssh-keygen -t ed25519
Invent a password for root on the machines. Create a password hash for it:
sudo apt install whois
mkpasswd
Take the hash and put it in the ROOTPW variable in class/FAIBASE.var.
Installing the machines
Add the required number of hosts to simplefiles/CONTESTANT/etc/hosts, and install that file locally:
sudo cp simplefiles/CONTESTANT/etc/hosts /etc/hosts
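The host entries can be generated instead of typed. The sketch below assumes a hypothetical numbering scheme in which contestantN gets the address 10.0.0.N. Neither the function name gen_hosts nor that scheme comes from this repository, so compare the output with the entries already in the file before using it.

```shell
# gen_hosts FIRST LAST
# Print one hosts-file line per machine.
# Assumption (not from this repository): contestantN maps to 10.0.0.N.
gen_hosts() {
    i=$1
    while [ "$i" -le "$2" ]; do
        printf '10.0.0.%d\tcontestant%d\n' "$i" "$i"
        i=$((i + 1))
    done
}

# Example: print entries for contestant10 through contestant25.
# gen_hosts 10 25
```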
Collect MAC addresses of clients (guide). Start this command:
tcpdump -qtel broadcast and port bootpc | tee /tmp/mac.list
Now boot all the machines, then press Ctrl+C. Get a list of all MAC addresses with:
perl -ane 'print "\U$F[0]\n"' /tmp/mac.list|sort|uniq
For each machine, run this with the correct hostname and MAC address. This assigns the hostname and enables installation:
sudo dhcp-edit contestant10 3c:97:0e:1a:09:05
sudo fai-chboot -IFv -u nfs://10.0.0.9/srv/soifai/config contestant10
dhcp-edit adds a line to /etc/dhcp/dhcpd.conf.
If you want to change the MAC address associated with a hostname later, you need to edit that file.
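With many machines, the two per-machine commands can be generated from a list. The following sketch assumes a hypothetical text file with one "hostname mac" pair per line. It only prints the commands, so you can review them before running anything.

```shell
# print_enroll_cmds FILE
# FILE (hypothetical format) has one "hostname mac" pair per line.
# Prints the dhcp-edit and fai-chboot commands without running them.
print_enroll_cmds() {
    while read -r host mac; do
        # Skip blank lines and comment lines.
        case $host in ''|'#'*) continue ;; esac
        printf 'sudo dhcp-edit %s %s\n' "$host" "$mac"
        printf 'sudo fai-chboot -IFv -u nfs://10.0.0.9/srv/soifai/config %s\n' "$host"
    done < "$1"
}

# Example: review the commands, then execute them.
# print_enroll_cmds machines.txt
# print_enroll_cmds machines.txt | sh
```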
Note: Network booting is disabled after installation completes. To reenable:
sudo fai-chboot -e contestant10
Run FAI monitor to monitor installations:
fai-monitor | fai-monitor-gui -
Now you are ready to install the machines. For this, you need to interrupt the boot process and select network booting as booting method. For network booting to work, Secure Boot needs to be disabled on the machines.
Check /etc/exports if NFS doesn't work. The IP address range must be correct.
To reload nfs config: sudo exportfs -ra.
Logs are stored in /var/log/fai/remote-logs.
Run this so you can read them:
sudo chmod o+rx /var/log/fai/remote-logs
In case network booting does not work, you can also boot with a USB stick (FAI-CD):
sudo fai-cd -A autodiscover.iso
sudo dd if=autodiscover.iso of=/dev/sdx bs=1M conv=fsync
For this to work, fai-monitor needs to be running.
The IP address in /var/log/fai/variables needs to be correct.
Administrating running machines
For administrating all machines at the same time, use parallel-ssh.
For this, you need a file containing a list of hosts; see tools/hostlist for an example.
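The host list can be derived from the hosts file so that the two stay in sync. This is a sketch under two assumptions: machines are named contestantN, and you log in as root (parallel-ssh accepts one [user@]host entry per line). The function name gen_hostlist is made up for illustration.

```shell
# gen_hostlist HOSTS_FILE
# Print "root@NAME" for every contestantN name found in a hosts file.
# Comment lines are ignored.
gen_hostlist() {
    awk '$1 !~ /^#/ {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^contestant[0-9]+$/) print "root@" $i
    }' "$1"
}

# Example:
# gen_hostlist /srv/soifai/config/simplefiles/CONTESTANT/etc/hosts > hostlist
```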
Add host keys to your known_hosts:
parallel-ssh -h hostlist -O StrictHostKeyChecking=accept-new true
Disable wifi:
parallel-ssh -h hostlist nmcli radio wifi off
(Comment: Wifi devices are turned on by default, and if you turn it off (through the UI, or with nmcli or rfkill), systemd-rfkill will create a file called e.g. /var/lib/systemd/rfkill/pci-0000\:02\:00.0-bcma-1\:wlan with content 1, to remember that this device should be turned off. I have not found a way to have all wifi devices turned off by default.)
Test time sync:
parallel-ssh -h hostlist -i date
Performing a soft update
Do this to apply configuration after you have changed it.
Open FAI monitor to see progress:
fai-monitor | fai-monitor-gui -
Then:
parallel-ssh -h hostlist -o output --timeout 0 fai -v softupdate
The output of each machine is stored in the folder output.
While working on the config and testing on a single machine, you can also run a single script, which is faster:
mount -t nfs 10.0.0.9:/srv/soifai/config /var/lib/fai/config
FAI=/var/lib/fai/config target=/ /var/lib/fai/config/scripts/CONTESTANT/10-config
Preparing client certificates
Prepare the usernames.csv.
Each line should contain a username and real name.
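Before generating certificates, it can help to check the file for malformed lines. The sketch below assumes the fields are separated by a comma, that there is no header row, and that real names contain no commas. These are guesses about the format, so check them against create-certs.sh. The function name check_usernames is hypothetical.

```shell
# check_usernames FILE
# Assumed format (verify against create-certs.sh): "username,real name".
# Prints every line that does not have exactly two non-empty fields and
# returns a non-zero status if any such line was found.
check_usernames() {
    awk -F, '
        NF != 2 || $1 == "" || $2 == "" {
            printf "line %d: expected \"username,real name\": %s\n", NR, $0
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

# Example:
# check_usernames usernames.csv && ./create-certs.sh
```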
Then run:
sudo apt install golang-cfssl
./create-certs.sh
You will need certs/ca.pem on the grader.
To test the client certificates, you can set up a test server:
sudo apt install nginx
Edit /etc/nginx/sites-enabled/default:
server {
    listen 443 ssl default_server;
    listen [::]:443 ssl default_server;
    include snippets/snakeoil.conf;
    ssl_client_certificate /srv/soifai/tools/certs/ca.pem;
    ssl_verify_client on;
    root /var/www/html;
    index index.html index.htm index.nginx-debian.html;
    server_name _;
}
Then: sudo systemctl restart nginx.service
With this configuration, you should now get an error if you access the server without a valid client certificate.
Starting the contest
You should stop apt-cacher-ng during contests: sudo systemctl stop apt-cacher-ng.service
Run these commands to prepare the contest
# Kill user session
parallel-ssh -h hostlist loginctl kill-session 1
parallel-ssh -h hostlist pkill -KILL -u contestant
# Delete user
parallel-ssh -h hostlist userdel -r contestant
# Recreate user
parallel-ssh -h hostlist adduser --disabled-password --gecos "\"SOI Finals\"" contestant
# Assign users to machines
./assign-user.sh contestant10 stofl
./assign-user.sh contestant11 binna1
...
# Adjust the contest lock screen configuration and copy it to all machines
parallel-scp -h hostlist ./contest-lock.json /etc/contest-lock.json
# Reboot to trigger autologin and clear `/tmp`
parallel-ssh -h hostlist reboot
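Keeping the user assignment in a file makes it easier to spot mistakes and to look up which user was on which machine later. The following sketch assumes a hypothetical file with one "hostname username" pair per line and checks that no machine and no user appears twice. The function name check_assignments is made up for illustration.

```shell
# check_assignments FILE
# FILE (hypothetical format) has one "hostname username" pair per line.
# Reports malformed lines and duplicate machines or users, and returns a
# non-zero status if any problem was found.
check_assignments() {
    awk '
        NF == 0 || $1 ~ /^#/ { next }
        NF != 2 { printf "line %d: expected \"hostname username\"\n", NR; bad = 1; next }
        seen_host[$1]++ { printf "line %d: machine %s assigned twice\n", NR, $1; bad = 1 }
        seen_user[$2]++ { printf "line %d: user %s assigned twice\n", NR, $2; bad = 1 }
        END { exit bad + 0 }
    ' "$1"
}

# Example: only assign users if the file is consistent.
# check_assignments assignments.txt &&
#     while read -r host user; do ./assign-user.sh "$host" "$user"; done < assignments.txt
```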
Start backups:
./backup-create.sh timer
If backup machine needed
Replace contestant25 with the backup machine name.
Replace contestantxx with the old machine (prepare by keeping the assignment of users to machines close by).
./assign-user.sh contestant25 username
rsync -av --chown contestant:contestant backups/contestantxx/xxxx/ root@contestant25:/home/contestant/
Replacement exam setup
We used the laptops owned by SOI for the replacement exams.
For this, we manually set a static IP address (same as the DHCP assigned one) in the settings on each laptop (it will prompt you for the root password). That way, it works without the FAI server.
All the commands in the section "Starting the contest" can then be run from the grader.
You just need to install: sudo apt install ntp pssh
Contest setup at Uni Bern
At Uni Bern, we don't set up the machines from scratch. Most of the software is already installed for us.
Copy this repository to the server, and run admin commands from there. We don't use FAI, instead we just run a script.
Test on one machine:
rsync --archive --chown root:root --delete --verbose config/ root@chagall.soi:/var/lib/bernconfig/
ssh root@chagall.soi /var/lib/bernconfig/setup-bern.sh
Deploy to all machines:
parallel-rsync -h hostlist -x "--archive --chown root:root --delete" config/ /var/lib/bernconfig/
parallel-ssh -h hostlist -o output /var/lib/bernconfig/setup-bern.sh
We need to add finals.soi.ch to the hosts file, so copy the file from any machine, add the entry and then deploy it:
parallel-rsync -h hostlist -x "--archive --chown root:root" hosts /etc/hosts
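Adding the entry to the copied hosts file can be done idempotently, so that running the step twice does not duplicate the line. This is a minimal sketch; the helper name ensure_line is hypothetical, and 192.0.2.10 is a placeholder address, not the real address of finals.soi.ch.

```shell
# ensure_line FILE LINE
# Append LINE to FILE unless an identical line is already present.
ensure_line() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Example (placeholder address, replace with the real one):
# ensure_line hosts '192.0.2.10 finals.soi.ch'
```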
Otherwise, contest administration works the same. Run the backup script on the server inside a tmux, so that it continues when your ssh connection breaks.
Training setup
We also use FAI to set up the SOI laptops for non-contest use.
In config/class/50-host-classes, replace CONTESTANT with TRAINING.
You need to set the password for the admin account.
You can find the password in the internal wiki.
If you reinstall all laptops, you can choose a new password and update it in the wiki.
Hash the password with mkpasswd, and put the hash in SUPER_USER_PW in the file config/class/TRAINING.var.
This repo is public, so don't commit the hash.
Then, install the laptops as in the section "Installing the machines".
Problems and solutions
This is a list of problems that we had and how we solved them.
DHCP server not running. This happens if the network cable was not plugged in when booting.
systemctl status isc-dhcp-server.service
systemctl restart isc-dhcp-server.service
Network booting fails. Fixed by disabling Secure Boot in the system settings.
Problems with RTL8153-based USB Ethernet adapters. I suspect that problems are caused by faulty firmware in the adapters. We had a strange problem where network booting failed when the laptops were connected to a Netgear GS316P switch, but worked when connected to a Netgear GS108 switch. The adapters have issues when connected over USB 3, but work fine over USB 2. You can force USB 2 by connecting adapters via a USB 2 extension cable. Half-inserting the USB connector also works in a pinch.
Installation of packages fails.
Check that the date is set correctly in the system settings.
If you get an HTTP 503 error, try restarting apt-cacher-ng.
Installed system does not boot. Fixed by changing boot mode from legacy/BIOS to UEFI.
The first time you boot the installed system, everything is fine, but after a reboot, the screen just shows a blinking cursor.
Linux is actually running and you can ssh into the machine, but for some unknown reason gdm failed to start.
We don't know why this happens yet, but we have a workaround: Just run systemctl start gdm via ssh.
parallel-ssh -h hostlist systemctl start gdm
User indicator does not appear.
Fixed by adding the gnome shell version from gnome-shell --version to the list of supported versions: shell-version in simplefiles/CONTESTANT/usr/share/gnome-shell/extensions/user-indicator@soi.ch/metadata.json.
The same applies for the contest-lock extension.
cpptools VS Code extension crashes. Fixed by using an older version of cpptools, which you can download in the "Version History" tab on the marketplace website. This happened because we were unknowingly using a pre-release.
Config space
The config space defines how the machines are set up. It is mounted with nfs on the target machine.
FAI has the concept of classes. Classes are defined by a list of strings. Order matters: classes are applied in the order in which they are defined.
We have defined these classes:
- PARTICIPANT: This sets up things that are used for both contest and training.
  - various code editors and other tools
  - VS Code extensions
  - soi header
  - Code::Blocks template
  - wallpaper
  - default favorite apps
  - default list of keyboard layouts
  - only allow ssh login as root, and disable password auth
- CONTESTANT:
  - firewall (configured in simplefiles/CONTESTANT/etc/nftables.conf)
  - disable bluetooth
  - disable sleep
  - disable lock on blank screen
  - disable software update notifications
  - disable some panels in gnome-control-center
  - add polkit rules which block changing network settings and mounting storage devices (it prompts for the root password)
  - configure NTP server
  - set authorized_keys for root
  - enable automatic login
  - set browser homepage and bookmarks to https://finals.soi.ch
  - add a gnome shell extension which displays the user name in the top bar
  - add contest lock gnome shell extension
  - add some management scripts to be run via ssh
  - add some packages
- TRAINING:
  - add admin user with sudo rights
  - remove APT proxy after packages are installed
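To make the "list of strings" idea concrete: in FAI, a class-definition script prints class names to standard output, and the order in which they are printed is the order in which they apply. The sketch below is not the content of config/class/50-host-classes. The host name pattern and the exact class order are assumptions chosen for illustration.

```shell
# classes_for_host NAME
# Print the class names for a host, in the order they should apply.
# The pattern and the class order are assumptions for illustration only.
classes_for_host() {
    case $1 in
        contestant*) echo "FAIBASE PARTICIPANT CONTESTANT" ;;
        *)           echo "FAIBASE" ;;
    esac
}

# In a class script, the host name would come from the environment
# (assumed here to be $HOSTNAME, as in the FAI example config):
# classes_for_host "$HOSTNAME"
```

Listing PARTICIPANT before CONTESTANT in this sketch means the shared setup is applied first and the contest-specific settings are layered on top.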
Here is an overview of the file structure:
- class: First, all scripts in this folder which start with two digits are run. Of these files, the ones which do not have a .sh suffix define the classes. FAI then goes through the list of classes and reads variables from the $class.var file (if it exists).
- scripts: This contains a folder of scripts for each class. In these scripts, we can use these variables:
  - $FAI: Path to the config space.
  - $target: The root of the system being installed. For soft update, this is /.
  - $ROOTCMD: chroot $target during install. For soft update, this is empty.
- simplefiles: This contains a bunch of files for each class that are copied over the root of the target. (The FAI example config only has files, but that is annoying because classes are at the end of the path.)
- package_config: This defines which packages are installed.
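As an illustration of how the script variables fit together, here is a minimal sketch of what the body of a class script could look like. The function name write_marker and the file /etc/soifai-marker are hypothetical and not part of this repository.

```shell
# write_marker
# Sketch of a class script body. Relies on $target and $ROOTCMD, which FAI
# provides to scripts. The marker file name is made up for illustration.
write_marker() {
    # Files are written below $target, which is / during a soft update.
    mkdir -p "$target/etc"
    printf 'configured by FAI\n' > "$target/etc/soifai-marker"
    # $ROOTCMD is "chroot $target" during installation and empty during a
    # soft update, so the path after it is relative to the target root.
    # It is left unquoted on purpose so that it can expand to nothing.
    $ROOTCMD chmod 644 /etc/soifai-marker
}
```

Writing paths as "$target/..." for direct file access and as plain absolute paths after $ROOTCMD keeps the same script usable for both a fresh installation and a soft update.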
During the warmup, some participants requested additional VS Code extensions.
Because we didn't want them to be enabled for everyone, we put the .vsix files into simplefiles/CONTESTANT/opt/ and did a soft update, so that they could manually install them from /opt.
Whenever you add a new script, don't forget to make it executable with chmod +x script.sh, otherwise it will silently not be executed.
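Since a missing execute bit fails silently, a quick check before an installation or soft update can save time. The sketch below lists regular files that lack the owner execute bit; the function name list_nonexec is hypothetical.

```shell
# list_nonexec DIR...
# Print regular files below the given directories that do not have the
# owner execute bit set.
list_nonexec() {
    find "$@" -type f ! -perm -100
}

# Example: every file listed here would be skipped silently.
# cd /srv/soifai/config && list_nonexec scripts
```

Run it on the scripts folder only. Other folders, such as class, also contain files like the .var files that are not meant to be executable.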
Third-party apt repos are already set up. This is how it was done:
wget -qO - https://packagecloud.io/AtomEditor/atom/gpgkey | gpg --dearmor > simplefiles/PARTICIPANT/usr/share/keyrings/atom-archive-keyring.gpg
wget -qO - https://download.sublimetext.com/sublimehq-pub.gpg | gpg --dearmor > simplefiles/PARTICIPANT/usr/share/keyrings/sublimehq-archive-keyring.gpg
wget -qO - https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > simplefiles/PARTICIPANT/usr/share/keyrings/microsoft-archive-keyring.gpg
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/atom-archive-keyring.gpg] http://packagecloud.io/AtomEditor/atom/any/ any main" > simplefiles/PARTICIPANT/etc/apt/sources.list.d/atom.list
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/sublimehq-archive-keyring.gpg] http://download.sublimetext.com/ apt/stable/" > simplefiles/PARTICIPANT/etc/apt/sources.list.d/sublime-text.list
echo "deb [arch=amd64,arm64,armhf signed-by=/usr/share/keyrings/microsoft-archive-keyring.gpg] http://packages.microsoft.com/repos/code stable main" > simplefiles/PARTICIPANT/etc/apt/sources.list.d/vscode.list
Contest lock screen
The contest lock screen is a gnome extension which can lock the screen and show a countdown until the contest starts.
The screen is unlocked when the contest starts.
The lock screen also displays the user name and a title.
It is configured in the file /etc/contest-lock.json.
It watches this file, and when it changes, the new configuration is instantly applied.
If there is an error in the config file, it will continue to use the old config and print a message. To see the logs, run this on a contestant machine:
journalctl -f -o cat /usr/bin/gnome-shell
An additional text can be shown with the message field. It can contain newlines (\n).
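Because a broken config file is ignored on the machines, it is worth checking the JSON syntax locally before copying it out. This sketch assumes python3 is installed on the server and only validates syntax, not the field names. The function name check_json is hypothetical.

```shell
# check_json FILE
# Return success if FILE is syntactically valid JSON.
# Assumes python3 is available; field names are not checked.
check_json() {
    python3 -m json.tool "$1" > /dev/null 2>&1
}

# Example: only distribute the file if it parses.
# check_json contest-lock.json &&
#     parallel-scp -h hostlist ./contest-lock.json /etc/contest-lock.json
```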
In case there is a problem with the contest lock screen and you can't fix it, the backup solution is to turn off AutomaticLoginEnable and set a password instead, which you announce when the contest starts:
parallel-ssh -h hostlist 'chpasswd <<< contestant:stofl'
Development notes
Links:
Regular lock screen (contest-lock is based on this):
- https://gitlab.gnome.org/GNOME/gnome-shell/-/blob/main/js/ui/screenShield.js
- https://gitlab.gnome.org/GNOME/gnome-shell/-/blob/main/js/ui/unlockDialog.js
Developer commands:
- Restart gnome-shell: Press Alt+F2, enter r. Only works if you log in with Xorg.
- Open the gnome-shell developer tools: Press Alt+F2, enter lg.
TODO
- It would be useful to have something like lineinfile from ansible. FAI has ainsl, but it's not powerful enough. We could just copy the ansible code and make a cli out of it: https://github.com/ansible/ansible/blob/devel/lib/ansible/modules/lineinfile.py
- Allow locale change. The problem is that this requires logging out and back in. Maybe this could be a feature of contest-lock. You may also want to install task-german, firefox-esr-l10n-de (and same for other languages).
License
This project is distributed under the terms of the GNU General Public License, version 2.