This repository has been archived on 2024-05-18. You can view files and clone it, but cannot push or open issues or pull requests.
# soifai
This repository contains the configuration for setting up computers for participants, either for contests or for training.
The installation is done with FAI (Fully Automatic Installation); read the [FAI guide] to learn about it.
[FAI guide]: https://fai-project.org/fai-guide/
The way it works is that you have a FAI server, which is just a computer running several network services.
The computers to be installed are then network booted, and the installation happens automatically, which takes about 10 minutes.
Finally, the computers can be rebooted into the installed operating system.
When the installed computers are running, you can edit the configuration and perform a soft update.
This applies the entire configuration without reinstalling the machines, and it only takes a few seconds.
Here is what each service on the FAI server does:
- DHCP: This assigns IP addresses to machines, and also tells them the TFTP server and filename for network booting.
- TFTP: This serves the bootloader and kernel which run during the installation.
  It also has a config file for each machine, which tells it what to do after network booting.
- NFS: Short for Network File System. It is read-only for clients, and contains two things:
  - The NFS root. This is the root file system of the OS which runs during the installation.
  - The config space. This is stored in this repository, and it defines how the machines are configured.
- SSH: When the installation is complete, the installer uploads log files via ssh, and also disables installation for the just installed machine.
- NTP: This serves the time, important for contests.
- apt-cacher-ng: This is a proxy for APT, which caches all downloaded packages. That way, the machines don't need to be connected to the internet for the installation, and you only download everything once.
- FAI monitor: This is a nice GUI which shows you the installation progress of all machines. It is optional.
## Setting up the FAI server
This part is unfortunately not automatic, but here is how to do it.
I don't think it's a good idea to install this on your personal laptop, so instead I made a fresh install of Ubuntu on an external SSD and installed FAI server there.
You could consider using Debian instead of Ubuntu, since Ubuntu needs a few extra steps.
When installing the system, set the hostname to `contestserver`.
Don't use `fai` as your username, because that is reserved for the log upload.
### Installing Ubuntu on an external disk
This works almost like a normal install: Download the Ubuntu installer ISO, flash it to a USB drive, and boot from it.
However, there is one issue, which is that the EFI bootloader needs to be installed in a different way on external disks, and the Ubuntu installer does not support that.
Here are some guides:
- https://wiki.ubuntuusers.de/EFI_Externer-Datentr%C3%A4ger/ (it seems this doesn't work, but it explains some things)
- https://www.58bits.com/blog/2020/02/28/how-create-truly-portable-ubuntu-installation-external-usb-hdd-or-ssd
You can do it like this: (TODO: test these exact steps)
- Start the installer from the terminal with bootloader installation disabled: `ubiquity -b &`
- When partitioning, create an EFI partition and a root filesystem partition
- Boot the system via the installer drive:
  - boot from installer drive
  - enter grub command line
  - run `configfile (hd1,gpt2)/boot/grub/grub.cfg` (the `hd1` part depends, it should be the external disk)
- unmount `/boot/efi`, edit `/etc/fstab` and set the UUID for the EFI partition from the external disk, then mount it.
- Run `grub-install -d /usr/lib/grub/x86_64-efi --efi-directory=/boot/efi/ --removable /dev/sdX` (where sdX is the external disk)
- Alternatively you can do the chroot approach described in the guide
### Setup
Connect the Ethernet port to a switch or another laptop, so that the link is up.
Set the IP address of the interface to static `10.0.0.9`, netmask `255.255.255.0`, no gateway.
Connect to the internet over a second network interface, e.g. Wifi or USB tethering with your phone.
You need an internet connection while installing the machines, for apt-cacher-ng.
Wireshark (optional, useful for debugging network problems):
```
sudo apt install wireshark
sudo usermod -a -G wireshark $USER
```
You need to reboot before you can use wireshark.
Time server:
```
sudo apt install ntp
```
You should also install ntp on the grader.
That way, the grader and the computers have exactly the same time.
The NTP server on the FAI server is used as a fallback.
Parallel ssh:
```
sudo apt install pssh
```
Only on Ubuntu:
```
sudo apt install debian-archive-keyring
```
Install FAI:
```
sudo apt install perl-tk fai-quickstart
```
Set content of `/etc/dhcp/dhcpd.conf`:
```
authoritative;
#deny unknown-clients;
option dhcp-max-message-size 2048;
use-host-decl-names on;
#always-reply-rfc1048 on;
subnet 10.0.0.0 netmask 255.255.255.0 {
  range 10.0.0.100 10.0.0.150;
  option routers 10.0.0.9;
  option domain-name "contest";
  option domain-name-servers 10.0.0.9;
  option ntp-servers 10.0.0.9;
  server-name contestserver;
  next-server 10.0.0.9;
  if substring(option vendor-class-identifier, 0, 20) = "PXEClient:Arch:00000" {
    filename "fai/pxelinux.0";
  }
  if substring(option vendor-class-identifier, 0, 20) = "PXEClient:Arch:00007" {
    filename "fai/syslinux.efi";
  }
}
```
Reload: `sudo systemctl restart isc-dhcp-server`
Package download cache:
```
sudo apt install apt-cacher-ng
```
Set a password for actions in the web interface in `/etc/apt-cacher-ng/security.conf`.
Cache report at http://localhost:3142/acng-report.html
Only on Ubuntu:
Ubuntu does not have the sysvinit-core package, so we can't use it for the NFS root. Use Debian instead:
- Replace content of `/etc/fai/apt/sources.list` with https://github.com/faiproject/fai/blob/master/conf/sources.list
- Replace content of `/etc/fai/nfsroot.conf` with https://github.com/faiproject/fai/blob/master/conf/nfsroot.conf
- Replace content of `/etc/fai/NFSROOT` with https://github.com/faiproject/fai/blob/master/conf/NFSROOT
Edit `/etc/fai/nfsroot.conf` and change these variables:
```
FAI_DEBOOTSTRAP_OPTS="--force-check-gpg"
FAI_CONFIGDIR=/srv/soifai/config
```
Edit `/etc/fai/fai.conf` and uncomment `LOGUSER=fai`.
Clone this repository and move it to `/srv` (move with sudo).
Set up FAI:
```
sudo fai-setup -v
```
Now you need to download some files:
```
cd /srv/soifai/config
mkdir downloads
```
Download VS code extensions and put them into the `/srv/soifai/config/downloads` folder.
- https://marketplace.visualstudio.com/items?itemName=swissolyinfo.soicode
- https://marketplace.visualstudio.com/items?itemName=ms-vscode.cpptools
- https://marketplace.visualstudio.com/items?itemName=ms-python.python
Note: Some extensions (e.g. cpptools and python) may have pre-release versions, which should be avoided. But the marketplace website does not show which versions are pre-release, and if you just click "Download Extension", you will get a pre-release if there is one. To see which version is the most recent non-prerelease, open the extension in the marketplace in VS Code and look for the version tag. Then download this version in the "Version History" tab on the marketplace website.
Download the [SOI Code::Blocks template] for Ubuntu and put it in `/srv/soifai/config/downloads`.
[SOI Code::Blocks template]: https://soi.ch/wiki/soi-codeblocks/#install-the-soi-project-template
Download the SOI header:
```
wget -qO - https://git.soi.ch/SOI/soi-header/archive/master.tar.gz > downloads/soi-header.tar.gz
```
Create an ssh key and add the public key to `/srv/soifai/config/simplefiles/CONTESTANT/root/.ssh/authorized_keys`.
```
ssh-keygen -t ed25519
```
Invent a password for root on the machines.
Create a password hash for it:
```
sudo apt install whois
mkpasswd
```
Take the hash and put it in the `ROOTPW` variable in `class/FAIBASE.var`.
## Installing the machines
Add the required number of hosts to `simplefiles/CONTESTANT/etc/hosts`, and install that file locally:
```
sudo cp simplefiles/CONTESTANT/etc/hosts /etc/hosts
```
Collect MAC addresses of clients ([guide](https://fai-project.org/fai-guide/#_a_id_mac_a_collecting_ethernet_addresses_for_multiple_hosts)).
Start this command:
```
tcpdump -qtel broadcast and port bootpc | tee /tmp/mac.list
```
Now boot all the machines, then press Ctrl+C.
Get a list of all MAC addresses with:
```
perl -ane 'print "\U$F[0]\n"' /tmp/mac.list|sort|uniq
```
For each machine, run this with the correct hostname and MAC address.
This assigns the hostname and enables installation:
```
sudo dhcp-edit contestant10 3c:97:0e:1a:09:05
sudo fai-chboot -IFv -u nfs://10.0.0.9/srv/soifai/config contestant10
```
`dhcp-edit` adds a line to `/etc/dhcp/dhcpd.conf`.
If you want to change the MAC address associated with a hostname later, you need to edit that file.
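With many machines, the two commands above can be driven from a list. This is only a sketch and not part of this repository: it assumes a hypothetical file `machines.txt` with one `hostname MAC` pair per line, and by default it only prints the commands so that you can review them first.

```shell
# Sketch: register every machine listed in a "hostname MAC" file.
# The file name and its format are assumptions made for this example.
# By default the commands are only printed; set RUN="" to execute them.
RUN="${RUN-echo}"

register_machines() (
    # Read the list on fd 3 so the commands cannot consume it via stdin.
    while read -r host mac <&3 || [ -n "$host" ]; do
        # Skip blank lines and comment lines.
        case "$host" in ''|'#'*) continue ;; esac
        $RUN sudo dhcp-edit "$host" "$mac"
        $RUN sudo fai-chboot -IFv -u nfs://10.0.0.9/srv/soifai/config "$host"
    done 3< "$1"
)

# Usage: register_machines machines.txt
```

Printing first is a deliberate choice here, because `dhcp-edit` modifies `/etc/dhcp/dhcpd.conf` and a typo in the list is easier to spot before that happens.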
Note: Network booting is disabled after installation completes. To reenable:
```
sudo fai-chboot -e contestant10
```
Run FAI monitor to monitor installations:
```
fai-monitor | fai-monitor-gui -
```
Now you are ready to install the machines.
For this, you need to interrupt the boot process and select network booting as booting method.
For network booting to work, Secure Boot needs to be disabled on the machines.
Check `/etc/exports` if NFS doesn't work. The IP address range must be correct.
To reload nfs config: `sudo exportfs -ra`.
Logs are stored in `/var/log/fai/remote-logs`.
Run this so you can read them:
```
sudo chmod o+rx /var/log/fai/remote-logs
```
In case network booting does not work, you can also boot with a USB stick (FAI-CD):
```
sudo fai-cd -A autodiscover.iso
sudo dd if=autodiscover.iso of=/dev/sdx bs=1M conv=fsync
```
For this to work, `fai-monitor` needs to be running.
The IP address in `/var/log/fai/variables` needs to be correct.
## Administrating running machines
For administrating all machines at the same time, use `parallel-ssh`.
For this, you need a file containing a list of hosts; see `tools/hostlist` for an example.
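If the machines are numbered consecutively, the host list can be generated instead of typed. A minimal sketch, assuming the `contestantNN` naming used in this README; the function name and the optional login-user argument are made up for this example, and `tools/hostlist` remains the reference for the actual format used here.

```shell
# Sketch: print one host per line for contestantFIRST..contestantLAST.
# An optional third argument is used as login user ("user@host").
make_hostlist() (
    first="$1" last="$2" user="${3:-}"
    i="$first"
    while [ "$i" -le "$last" ]; do
        printf '%s%s\n' "${user:+$user@}" "contestant$i"
        i=$((i + 1))
    done
)

# Usage: make_hostlist 10 34 root > hostlist
```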
Add host keys to your `known_hosts`:
```
parallel-ssh -h hostlist -O StrictHostKeyChecking=accept-new true
```
Disable wifi:
```
parallel-ssh -h hostlist nmcli radio wifi off
```
(Comment: Wifi devices are turned on by default, and if you turn it off (through the UI, or with nmcli or rfkill), systemd-rfkill will create a file called e.g. `/var/lib/systemd/rfkill/pci-0000\:02\:00.0-bcma-1\:wlan` with content 1, to remember that *this* device should be turned off. I have not found a way to have *all* wifi devices turned off by default.)
Test time sync:
```
parallel-ssh -h hostlist -i date
```
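Comparing the dates by eye gets tedious with many machines. As a rough sketch, you can ask for epoch seconds instead and compute the spread between the smallest and largest value. The function name is made up, and the filter assumes that the only purely numeric lines in the `parallel-ssh -i` output are the timestamps.

```shell
# Sketch: read lines on stdin, keep those that are plain integers
# (epoch seconds) and print the largest value minus the smallest.
clock_spread() {
    grep -E '^[0-9]+$' | sort -n |
        awk 'NR == 1 { min = $1 } { max = $1 } END { if (NR) print max - min }'
}

# Usage: parallel-ssh -h hostlist -i date +%s | clock_spread
```

The connections are not opened at exactly the same instant, so a spread of a second or two does not necessarily mean the clocks are off.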
### Performing a soft update
Do this to apply configuration after you have changed it.
Open FAI monitor to see progress:
```
fai-monitor | fai-monitor-gui -
```
Then:
```
parallel-ssh -h hostlist -o output --timeout 0 fai -v softupdate
```
The output of each machine is stored in the folder `output`.
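To spot machines where the soft update produced nothing, you can compare the host list with the output folder. This is a sketch under one assumption: according to the pssh manual, output files are named `[user@]host[:port]`, so the file name should equal the entry as written in the host list. The function name is made up.

```shell
# Sketch: print every host list entry that has no non-empty output file.
# Assumes the output file is named exactly like the host list entry.
missing_output() (
    hostlist="$1" outdir="$2"
    while read -r entry rest || [ -n "$entry" ]; do
        # Skip blank lines and comment lines.
        case "$entry" in ''|'#'*) continue ;; esac
        [ -s "$outdir/$entry" ] || printf '%s\n' "$entry"
    done < "$hostlist"
)

# Usage: missing_output hostlist output
```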
While working on the config and testing on a single machine, you can also run a single script, which is faster:
```
mount -t nfs 10.0.0.9:/srv/soifai/config /var/lib/fai/config
FAI=/var/lib/fai/config target=/ /var/lib/fai/config/scripts/CONTESTANT/10-config
```
### Preparing client certificates
Prepare the `usernames.csv`.
Each line should contain a username and real name.
Then run:
```
sudo apt install golang-cfssl
./create-certs.sh
```
You will need `certs/ca.pem` on the grader.
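Before running `create-certs.sh`, it can help to sanity-check `usernames.csv`. The sketch below assumes that the two fields are separated by a single comma and that real names contain no commas; check `create-certs.sh` for the format it actually expects. The function name is made up.

```shell
# Sketch: report malformed lines and duplicate usernames.
# Assumes "username,real name" with exactly one comma per line.
check_usernames() {
    awk -F',' '
        NF != 2 || $1 == "" || $2 == "" {
            printf "line %d: expected username,real name\n", NR; bad = 1
        }
        seen[$1]++ {
            printf "line %d: duplicate username %s\n", NR, $1; bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

# Usage: check_usernames usernames.csv && ./create-certs.sh
```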
To test the client certificates, you can set up a test server:
```
sudo apt install nginx
```
Edit `/etc/nginx/sites-enabled/default`:
```
server {
    listen 443 ssl default_server;
    listen [::]:443 ssl default_server;
    include snippets/snakeoil.conf;
    ssl_client_certificate /srv/soifai/tools/certs/ca.pem;
    ssl_verify_client on;
    root /var/www/html;
    index index.html index.htm index.nginx-debian.html;
    server_name _;
}
```
Then: `sudo systemctl restart nginx.service`
With this configuration, you should now get an error if you access the server without a valid client certificate.
### Starting the contest
You should stop apt-cacher-ng during contests: `sudo systemctl stop apt-cacher-ng.service`
Run these commands to prepare the contest:
```sh
# Kill user session
parallel-ssh -h hostlist loginctl kill-session 1
parallel-ssh -h hostlist pkill -KILL -u contestant
# Delete user
parallel-ssh -h hostlist userdel -r contestant
# Recreate user
parallel-ssh -h hostlist adduser --disabled-password --gecos "\"SOI Finals\"" contestant
# Assign users to machines
./assign-user.sh contestant10 stofl
./assign-user.sh contestant11 binna1
...
# Adjust the contest lock screen configuration and copy it to all machines
parallel-scp -h hostlist ./contest-lock.json /etc/contest-lock.json
# Reboot to trigger autologin and clear `/tmp`
parallel-ssh -h hostlist reboot
```
Start backups:
```
./backup-create.sh timer
```
### If backup machine needed
Replace `contestant25` with the backup machine name.
Replace `contestantxx` with the old machine (prepare by keeping the assignment of users to machines close by).
```
./assign-user.sh contestant25 username
rsync -av --chown contestant:contestant backups/contestantxx/xxxx/ root@contestant25:/home/contestant/
```
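One way to keep the assignment close by is a small text file. The sketch below assumes a hypothetical `assignments.txt` with one `machine username` pair per line (this file is not part of the repository) and looks up which machine a user was on, so you know what to put in place of `contestantxx`.

```shell
# Sketch: print the machine assigned to a user.
# The "machine username" file format is an assumption for this example.
# Exits with a non-zero status if the user is not listed.
machine_of() {
    awk -v user="$2" '$2 == user { print $1; found = 1 } END { exit !found }' "$1"
}

# Usage: machine_of assignments.txt stofl
```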
## Replacement exam setup
We used the laptops owned by SOI for the replacement exams.
For this, we manually set a static IP address (same as the DHCP assigned one) in the settings on each laptop (it will prompt you for the root password).
That way, it works without the FAI server.
All the commands in the section "Starting the contest" can then be run from the grader.
You just need to install: `sudo apt install ntp pssh`
## Contest setup at Uni Bern
At Uni Bern, we don't set up the machines from scratch.
Most of the software is already installed for us.
Copy this repository to the server, and run admin commands from there.
We don't use FAI, instead we just run a script.
Test on one machine:
```
rsync --archive --chown root:root --delete --verbose config/ root@chagall.soi:/var/lib/bernconfig/
ssh root@chagall.soi /var/lib/bernconfig/setup-bern.sh
```
Deploy to all machines:
```
parallel-rsync -h hostlist -x "--archive --chown root:root --delete" config/ /var/lib/bernconfig/
parallel-ssh -h hostlist -o output /var/lib/bernconfig/setup-bern.sh
```
We need to add finals.soi.ch to the hosts file, so copy the file from any machine, add the entry and then deploy it:
```
parallel-rsync -h hostlist -x "--archive --chown root:root" hosts /etc/hosts
```
Otherwise, contest administration works the same.
Run the backup script on the server inside a tmux, so that it continues when your ssh connection breaks.
## Training setup
We also use FAI to set up the SOI laptops for non-contest use.
In `config/class/50-host-classes`, replace `CONTESTANT` with `TRAINING`.
You need to set the password for the admin account.
You can find the password in the internal wiki.
If you reinstall all laptops, you can choose a new password and update it in the wiki.
Hash the password with `mkpasswd`, and put the hash in `SUPER_USER_PW` in the file `config/class/TRAINING.var`.
This repo is public, so don't commit the hash.
Then, install the laptops as in the section "Installing the machines".
## Problems and solutions
This is a list of problems that we had and how we solved them.
**DHCP server not running.**
This happens if the network cable was not plugged in when booting.
```
systemctl status isc-dhcp-server.service
systemctl restart isc-dhcp-server.service
```
**Network booting fails.**
Fixed by disabling Secure Boot in the system settings.
**Problems with RTL8153-based USB Ethernet adapters.**
I suspect that problems are caused by faulty firmware in the adapters.
We had a strange problem where network booting failed when the laptops were connected to a Netgear GS316P switch, but worked when connected to a Netgear GS108 switch.
The adapters have issues when connected over USB 3, but work fine over USB 2.
You can force USB 2 by connecting adapters via a USB 2 extension cable.
Half-inserting the USB connector also works in a pinch.
**Installation of packages fails.**
Check that the date is set correctly in the system settings.
If you get an HTTP 503 error, try restarting `apt-cacher-ng`.
**Installed system does not boot.**
Fixed by changing boot mode from legacy/BIOS to UEFI.
**The first time you boot the installed system, everything is fine, but after a reboot, the screen just shows a blinking cursor.**
Linux is actually running and you can ssh into the machine, but gdm failed to start.
We don't know why this happens yet, but we have a workaround: start gdm via ssh.
```
parallel-ssh -h hostlist systemctl start gdm
```
**User indicator does not appear.**
Fixed by adding the gnome shell version from `gnome-shell --version` to the list of supported versions: `shell-version` in `simplefiles/CONTESTANT/usr/share/gnome-shell/extensions/user-indicator@soi.ch/metadata.json`.
The same applies for the contest-lock extension.
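The value to add can be derived from the version output. A small sketch, relying on the convention that extensions list only the major version for GNOME 40 and later (e.g. `42`) and major.minor for the older 3.x releases (e.g. `3.36`); the function name is made up.

```shell
# Sketch: turn the output of `gnome-shell --version` (read from stdin)
# into the string to add to "shell-version" in metadata.json.
shell_version_entry() {
    awk '{
        split($NF, v, ".")
        if (v[1] + 0 >= 40) print v[1]      # GNOME 40+: major version only
        else print v[1] "." v[2]            # GNOME 3.x: major.minor
    }'
}

# Usage: gnome-shell --version | shell_version_entry
```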
**cpptools VS code extension crashes.**
Fixed by using an older version of cpptools, which you can download in the "Version History" tab on the marketplace website.
This happened because we were unknowingly using a pre-release.
## Config space
The config space defines how the machines are set up.
It is mounted with nfs on the target machine.
FAI has the concept of *classes*.
Classes are defined by a list of strings.
Order matters: classes are applied in the order in which they are defined.
We have defined these classes:
- `PARTICIPANT`: This sets up things that are used for both contest and training.
  - various code editors and other tools
  - VS Code extensions
  - soi header
  - Code::Blocks template
  - wallpaper
  - default favorite apps
  - default list of keyboard layouts
  - only allow ssh login as root, and disable password auth
- `CONTESTANT`:
  - firewall (configured in `simplefiles/CONTESTANT/etc/nftables.conf`)
  - disable bluetooth
  - disable sleep
  - disable lock on blank screen
  - disable software update notifications
  - disable some panels in gnome-control-center
  - add polkit rules which block changing network settings and mounting storage devices (it prompts for the root password)
  - configure NTP server
  - set `authorized_keys` for root
  - enable automatic login
  - set browser homepage and bookmarks to https://finals.soi.ch
  - add a gnome shell extension which displays the user name in the top bar
  - add contest lock gnome shell extension
  - add some management scripts to be run via ssh
  - add some packages
- `TRAINING`:
  - add admin user with sudo rights
  - remove APT proxy after packages are installed
Here is an overview of the file structure:
- `class`:
  First, all scripts in this folder which start with two digits are run.
  Of these files, the ones which do not have a `.sh` suffix define the classes.
  FAI then goes through the list of classes and reads variables from the `$class.var` file (if it exists).
- `scripts`:
  This contains a folder of scripts for each class.
  In these scripts, we can use these variables:
  - `$FAI`: Path to the config space.
  - `$target`: The root of the system being installed.
    For soft update, this is `/`.
  - `$ROOTCMD`: `chroot $target` during install.
    For soft update, this is empty.
- `simplefiles`:
  This contains a bunch of files for each class that are copied over the root of the target.
  (The FAI example config only has `files`, but that is annoying because classes are at the end of the path.)
- `package_config`:
  This defines which packages are installed.
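To illustrate how the three script variables fit together, here is a sketch of what a class script could look like. Everything specific in it is hypothetical: the function name, the source path `example/motd` in the config space, and the idea of installing an `/etc/motd` at all. It is not one of the scripts in this repository.

```shell
# Hypothetical example of a class script body.
# FAI sets $FAI, $target and $ROOTCMD before it runs the scripts.
install_motd() {
    # Files are written through $target, which is the mounted new system
    # during installation and / during a soft update.
    mkdir -p "$target/etc"
    cp "$FAI/example/motd" "$target/etc/motd"
    # Commands that must run inside the installed system go through
    # $ROOTCMD, so their paths are relative to the target's root.
    $ROOTCMD chmod 644 /etc/motd
}

# In a real script, the last line would call the function: install_motd
```

Note the asymmetry: paths used by the script itself need the `$target` prefix, while paths passed to `$ROOTCMD` commands do not.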
During the warmup, some participants requested additional VS code extensions.
Because we didn't want them to be enabled for everyone, we put the .vsix files into `simplefiles/CONTESTANT/opt/` and did a soft update, so that they could manually install it from `/opt`.
Whenever you add a new script, don't forget to make it executable with `chmod +x script.sh`, otherwise it will silently not be executed.
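Since the failure is silent, a quick check before a soft update can save some head-scratching. A minimal sketch (the function name is made up) that lists regular files without the owner execute bit:

```shell
# Sketch: list regular files below the given directories whose
# owner execute bit (octal 100) is not set.
non_executable() {
    find "$@" -type f ! -perm -100
}

# Usage (from the config directory): non_executable scripts
```

This is meant for the `scripts` folder only; `class` also contains `.var` files, which are not supposed to be executable.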
Third-party apt repos are already set up. This is how it was done:
```
wget -qO - https://packagecloud.io/AtomEditor/atom/gpgkey | gpg --dearmor > simplefiles/PARTICIPANT/usr/share/keyrings/atom-archive-keyring.gpg
wget -qO - https://download.sublimetext.com/sublimehq-pub.gpg | gpg --dearmor > simplefiles/PARTICIPANT/usr/share/keyrings/sublimehq-archive-keyring.gpg
wget -qO - https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > simplefiles/PARTICIPANT/usr/share/keyrings/microsoft-archive-keyring.gpg
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/atom-archive-keyring.gpg] http://packagecloud.io/AtomEditor/atom/any/ any main" > simplefiles/PARTICIPANT/etc/apt/sources.list.d/atom.list
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/sublimehq-archive-keyring.gpg] http://download.sublimetext.com/ apt/stable/" > simplefiles/PARTICIPANT/etc/apt/sources.list.d/sublime-text.list
echo "deb [arch=amd64,arm64,armhf signed-by=/usr/share/keyrings/microsoft-archive-keyring.gpg] http://packages.microsoft.com/repos/code stable main" > simplefiles/PARTICIPANT/etc/apt/sources.list.d/vscode.list
```
### Contest lock screen
The contest lock screen is a gnome extension which can lock the screen and show a countdown until the contest starts.
The screen is unlocked when the contest starts.
The lock screen also displays the user name and a title.
It is configured in the file `/etc/contest-lock.json`.
It watches this file, and when it changes the new configuration is instantly applied.
If there is an error in the config file, it will continue to use the old config and print a message.
To see the logs, run this on a contestant machine:
```
journalctl -f -o cat /usr/bin/gnome-shell
```
An additional text can be shown with the `message` field. It can contain newlines (`\n`).
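Because a broken config file is ignored on the machines, it is worth checking the JSON syntax before copying it. The sketch below only checks that the file parses as JSON, not that the fields are the ones the extension expects. It assumes `python3` is available on the machine you run it from; the function name is made up.

```shell
# Sketch: succeed only if the file is syntactically valid JSON.
# Uses the json.tool module from the Python standard library.
check_json() {
    python3 -m json.tool "$1" > /dev/null
}

# Usage:
#   check_json contest-lock.json &&
#       parallel-scp -h hostlist ./contest-lock.json /etc/contest-lock.json
```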
In case there is a problem with the contest lock screen and you can't fix it, the backup solution is to turn off `AutomaticLoginEnable` and set a password instead, that you announce when the contest starts:
```
parallel-ssh -h hostlist 'chpasswd <<< contestant:stofl'
```
**Development notes**
Links:
- https://www.codeproject.com/Articles/5271677/How-to-Create-A-GNOME-Extension
- https://gjs.guide/
Regular lock screen (contest-lock is based on this):
- https://gitlab.gnome.org/GNOME/gnome-shell/-/blob/main/js/ui/screenShield.js
- https://gitlab.gnome.org/GNOME/gnome-shell/-/blob/main/js/ui/unlockDialog.js
Developer commands:
- Restart gnome-shell: Press Alt+F2, enter `r`. Only works if you log in with Xorg.
- Open the gnome-shell developer tools: Press Alt+F2, enter `lg`.
## TODO
- It would be useful to have something like lineinfile from ansible.
  FAI has ainsl, but it's not powerful enough.
  We could just copy the ansible code and make a cli out of it:
  https://github.com/ansible/ansible/blob/devel/lib/ansible/modules/lineinfile.py
- Allow locale change.
  The problem is that this requires logging out and back in.
  Maybe this could be a feature of contest-lock.
  You may also want to install task-german, firefox-esr-l10n-de
  (and same for other languages).
## License
This project is distributed under the terms of the GNU General
Public License, version 2.