Move and rename files and add state machine

Robin Meier 2024-03-19 22:02:37 +01:00
parent 53cb304d7e
commit 699a2e16e8
19 changed files with 189 additions and 102 deletions

3
.gitignore vendored

@ -3,6 +3,9 @@
.last_changelog_read .last_changelog_read
storage/ storage/
log/
config/*
!config/*.EXAMPLE
# This file is unused atm # This file is unused atm
docker_health_check.sh docker_health_check.sh


@ -9,6 +9,8 @@ The covered tasks range from file change tracking via http/ssh monitoring to zfs
## Installation ## Installation
The scripts in this repo ***must*** be checked out into `/root/scripts`. The scripts in this repo ***must*** be checked out into `/root/scripts`.
This is subject to change.
The goal is for the scripts to be installable in any directory.
```bash ```bash
cd /root cd /root
@ -30,21 +32,25 @@ This will be helpful when updating the admin scripts later on.
### Config Files ### Config Files
For each script there is a `.[script_name]_env.EXAMPLE` file, which you must copy (remove `.EXAMPLE` part) and edit while providing your own information. Config files are located in the `config/` directory.
Each script has its own configuration file; this repo only contains the `config/[script_name].EXAMPLE` example configuration files.
For each script you want to use, you must copy the example and fill in your own data.
For example (`monitoring.sh`):
```bash ```bash
SCRIPT_NAME=zfs_health_check cd /root/scripts
cp /root/scripts/.${SCRIPT_NAME}_env.EXAMPLE /root/scripts/.${SCRIPT_NAME}_env cp config/monitoring.EXAMPLE config/monitoring
vim /root/scripts/.${SCRIPT_NAME}_env vim config/monitoring
``` ```
If you want to use the example configuration, you could symbolic link the files instead of just copying them.
This really only makes sense for `.system_health_check`. If you want to use the provided example configuration, you could symlink the files instead of copying them.
This really only makes sense for `system_health_check`.
The command for this is: The command for this is:
```bash ```bash
cd /root/scripts cd /root/scripts
ln -s .system_health_check.EXAMPLE .system_health_check ln -s config/system_health_check.EXAMPLE config/system_health_check
``` ```
### Shutdown Notification ### Shutdown Notification
@ -64,7 +70,7 @@ To install the [startup helper script](#using-startup-helper) into the regular u
```bash ```bash
USRNAME=radioelephant USRNAME=radioelephant
cp /root/scripts/post_startup.sh /home/$USRNAME/post_startup.sh cp /root/scripts/post_startup.sh /home/$USRNAME/post_startup.sh
cp /root/scripts/.post_startup_env.EXAMPLE /home/$USRNAME/.post_startup_env cp /root/scripts/config/post_startup.EXAMPLE /home/$USRNAME/.post_startup_env
chown $USRNAME:$USRNAME /home/$USRNAME/post_startup.sh chown $USRNAME:$USRNAME /home/$USRNAME/post_startup.sh
chown $USRNAME:$USRNAME /home/$USRNAME/.post_startup_env chown $USRNAME:$USRNAME /home/$USRNAME/.post_startup_env
vim /home/$USRNAME/.post_startup_env vim /home/$USRNAME/.post_startup_env
@ -83,10 +89,10 @@ cd /root/scripts
git pull git pull
``` ```
For most of the scripts you only need to check if the `.[script_name]_env.EXAMPLE` has changed and contains different keys than your copied `.[script_name]_env` file. For most of the scripts you only need to check if the example configuration file (e.g. `config/monitoring.EXAMPLE`) has changed and contains different keys than your copy (e.g. `config/monitoring`).
For your convenience, changes to environment variable files will be documented in the [CHANGELOG](CHANGELOG.md). For your convenience, changes to configuration files will be documented in the [CHANGELOG](CHANGELOG.md).
If you followed the instructions in this README, then you will find the last time you updated this repository in the `.last_changelog_read` file. If you followed the instructions in this README, then you will find the last time you pulled this repository and read the [CHANGELOG](CHANGELOG.md) in the `.last_changelog_read` file.
Read it with `cat /root/scripts/.last_changelog_read`. Get the value with `cat /root/scripts/.last_changelog_read`.
**Make sure to update the last reading time file after reading the CHANGELOG with `date > /root/scripts/.last_changelog_read`** **Make sure to update the last reading time file after reading the CHANGELOG with `date > /root/scripts/.last_changelog_read`**
@ -123,9 +129,11 @@ This will be noted in the CHANGELOG for your convenience.
The check and monitoring scripts in this repo can be run periodically, and if any problems are detected, they produce output. The check and monitoring scripts in this repo can be run periodically, and if any problems are detected, they produce output.
The output of these scripts can be redirected and used however you like. The output of these scripts can be redirected and used however you like.
Typically I redirect the output to the `telegram_notification.sh` script which notifies me of any noisy scripts. Typically I redirect the output to the `helpers/tg_notify.sh` script which notifies me of any noisy scripts.
In case of expected repeating failures, I first redirect the output to `helpers/state_machine.sh "keyword"` which silences repeated messages.
The "state machine" only saves a copy of the last message per keyword and compares it to the next message.
Regardless of any problems each script also logs its executions in `/root/logs`. Regardless of any problems, each script also logs its executions under `log/`.
Make sure you created this folder during [installation](#installation). Make sure you created this folder during [installation](#installation).
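
The idea behind the "state machine" can be sketched in a few lines. This is only an illustration, assuming one stored message per keyword; `STATE_DIR` and `dedupe` are hypothetical names, and the actual `helpers/state_machine.sh` additionally handles re-notification and "resolved" messages.

```bash
#!/bin/bash
# Minimal sketch of per-keyword message deduplication.
# STATE_DIR and dedupe() are hypothetical names, not part of this repo.
STATE_DIR=$(mktemp -d)

dedupe() {
    local key=$1 message=$2
    local key_file="${STATE_DIR}/${key}.txt"
    # Same message as the one stored for this keyword: stay silent
    if [[ -f "$key_file" && "$(cat "$key_file")" == "$message" ]]; then
        return 0
    fi
    # Nothing to report: forget any stored message and stay silent
    if [[ -z "$message" ]]; then
        rm -f "$key_file"
        return 0
    fi
    # New or changed message: store it and relay it
    printf '%s\n' "$message" > "$key_file"
    printf '%s\n' "$message"
}

dedupe "monitoring" "[SSH] [FAIL] host not reachable"  # relayed (new)
dedupe "monitoring" "[SSH] [FAIL] host not reachable"  # silenced (repeat)
dedupe "monitoring" "[WEB] [FAIL] status code is 500"  # relayed (changed)
```

Because only the last message per keyword is kept, a message that alternates between two values would be relayed every time.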
### Crontab Scheduling ### Crontab Scheduling
@ -137,20 +145,20 @@ If you are unsure about the cron schedule, use [Crontab Guru](https://crontab.gu
My current crontab looks like this: My current crontab looks like this:
```crontab ```crontab
* * * * * bash -c '/root/scripts/file_monitor.sh | /root/scripts/telegram_notification.sh' * * * * * bash -c 'cd /root/scripts && ./file_monitor.sh | ./helpers/tg_notify.sh'
*/2 * * * * bash -c '/root/scripts/monitoring.sh | /root/scripts/telegram_notification.sh' */2 * * * * bash -c 'cd /root/scripts && ./monitoring.sh | ./helpers/state_machine.sh "monitoring" | ./helpers/tg_notify.sh'
*/4 * * * * bash -c '/root/scripts/dyndns.sh | /root/scripts/telegram_notification.sh' */4 * * * * bash -c 'cd /root/scripts && ./dyndns.sh | ./helpers/tg_notify.sh'
*/3 * * * * bash -c '/root/scripts/system_health_check.sh | /root/scripts/telegram_notification.sh' */3 * * * * bash -c 'cd /root/scripts && ./system_health_check.sh | ./helpers/state_machine.sh "system" | ./helpers/tg_notify.sh'
15 * * * * bash -c '/root/scripts/docker_health_check.sh | /root/scripts/telegram_notification.sh' 15 * * * * bash -c 'cd /root/scripts && ./docker_health_check.sh | ./helpers/state_machine.sh "docker" | ./helpers/tg_notify.sh'
*/15 * * * * bash -c '/root/scripts/zfs_health_check.sh | /root/scripts/telegram_notification.sh' */15 * * * * bash -c 'cd /root/scripts && ./zfs_health_check.sh | ./helpers/state_machine.sh "zfs" | ./helpers/tg_notify.sh'
@reboot sleep 10 && /root/scripts/telegram_notification.sh '[STARTUP] System just booted' @reboot sleep 10 && /root/scripts/helpers/tg_notify.sh '[STARTUP] System just booted'
@reboot sleep 30 && bash -c '/root/scripts/zfs_health_check.sh | /root/scripts/telegram_notification.sh' @reboot sleep 30 && bash -c 'cd /root/scripts && ./zfs_health_check.sh | ./helpers/state_machine.sh "zfs" | ./helpers/tg_notify.sh'
``` ```
Adapt this to your needs, you might also implement other checks and only use the `telegram_notification.sh` script from this repo. Adapt this to your needs, you might also implement other checks and only use the `helpers/tg_notify.sh` script from this repo.
Or you might implement your own notification script to notify you via another service. Or you might implement your own notification script to notify you via another service.
The `telegram_notification.sh` can easily be adapted (just remove comment) to forward all notifications to `STDOUT` which typically makes cron send a mail. The `helpers/tg_notify.sh` can easily be adapted (just remove comment) to forward all notifications to `STDOUT` which typically makes cron send a mail.
### Using Startup Helper ### Using Startup Helper


@ -0,0 +1,2 @@
STORAGE_PATH=/root/scripts/storage/state_machine # NO trailing slash
RENOTIFY_AGE_SEC=7200 # In seconds (2h)
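
As a rough illustration of what `RENOTIFY_AGE_SEC` controls: a repeated message could be re-sent once the stored message file has not been touched for longer than that many seconds. The sketch below assumes GNU `stat` and `touch`; `needs_renotify` and `key_file` are hypothetical names used only for this example.

```bash
#!/bin/bash
# Sketch of an age check against RENOTIFY_AGE_SEC (assumes GNU stat/touch).
# needs_renotify() and key_file are hypothetical names for illustration.
RENOTIFY_AGE_SEC=7200
key_file=$(mktemp)

needs_renotify() {
    # Age of the file in seconds, based on its modification time
    local age=$(( $(date +%s) - $(stat -c %Y "$1") ))
    [ "$age" -gt "$RENOTIFY_AGE_SEC" ]
}

# Pretend the last notification happened three hours ago
touch -d '3 hours ago' "$key_file"
if needs_renotify "$key_file"; then
    echo "renotify"
    touch "$key_file"  # Reset the timer after re-sending
fi

# Directly after the reset the message stays silenced
if ! needs_renotify "$key_file"; then
    echo "still quiet"
fi
```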


@ -1,18 +1,15 @@
#!/bin/bash #!/bin/bash
logfile=/root/logs/dyndns.log # Load configuration
log_identifier="[DNS]"
log() {
echo -e $@ | ts "[%Y-%m-%d %H:%M:%S] $log_identifier" >> $logfile
}
log_echo() {
echo -e $@ | ts "[%Y-%m-%d %H:%M:%S] $log_identifier" | tee -a $logfile
}
set -o allexport set -o allexport
source /root/scripts/.dyndns_env source /root/scripts/config/dyndns
set +o allexport set +o allexport
# Import logging functionality
logfile=/root/scripts/log/dyndns.log
log_identifier="DNS"
source /root/scripts/functions/logging.sh
url="https://${USERNAME}:${PASSWORD}@infomaniak.com/nic/update?hostname=" url="https://${USERNAME}:${PASSWORD}@infomaniak.com/nic/update?hostname="
log "Updating DynDNS for ${MAIN_DOMAIN}" log "Updating DynDNS for ${MAIN_DOMAIN}"


@ -8,30 +8,28 @@
# Author: Robin Meier - robin@meier.si # Author: Robin Meier - robin@meier.si
################################################################################ ################################################################################
logfile=/root/logs/file_monitor.log # Load configuration
log_identifier="[FILE]"
log() {
echo -e $@ | ts "[%Y-%m-%d %H:%M:%S] $log_identifier" >> $logfile
}
log_echo() {
echo -e $@ | ts "[%Y-%m-%d %H:%M:%S] $log_identifier" | tee -a $logfile
}
set -o allexport set -o allexport
source /root/scripts/.file_monitor_env source /root/scripts/config/file_monitor
set +o allexport set +o allexport
# Import logging functionality
logfile=/root/scripts/log/file_monitor.log
log_identifier="FILE"
source /root/scripts/functions/logging.sh
# Make sure directory exists
mkdir -p /root/scripts/storage/file_monitor mkdir -p /root/scripts/storage/file_monitor
for file in $FILES for file in $FILES
do do
# Touch storage file if not existing # Touch storage file if not existing
if [ ! -f /root/scripts/storage/file_monitor/${file//\//_} ]; then if [ ! -f /root/scripts/storage/file_monitor/${file//\//_} ]; then
touch /root/scripts/storage/file_monitor/${file//\//_} ]; touch /root/scripts/storage/file_monitor/${file//\//_}
fi fi
if [ "$file" -nt "/root/scripts/storage/file_monitor/${file//\//_}" ]; then if [ "$file" -nt "/root/scripts/storage/file_monitor/${file//\//_}" ]; then
log_echo "[CHANGE] $file" log_echo "[CHANGE] $file"
touch /root/scripts/storage/file_monitor/${file//\//_} ]; touch /root/scripts/storage/file_monitor/${file//\//_}
fi fi
done done

19
functions/logging.sh Executable file

@ -0,0 +1,19 @@
#!/bin/bash
if [ -z "$logfile" ]; then
echo "logfile variable missing"
exit 1
fi
if [ -z "$log_identifier" ]; then
echo "log_identifier variable missing"
exit 1
fi
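# Append the message to the logfile, prefixed with a timestamp and the identifier
log() {
echo -e "$@" | ts "[%Y-%m-%d %H:%M:%S] [$log_identifier]" >> "$logfile"
}
# Log the message and also print it (without timestamp) to STDOUT
log_echo() {
log "$@"
echo -e "[$log_identifier] $*"
}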
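
A calling script is expected to set `logfile` and `log_identifier` before sourcing this file. The following sketch shows that call pattern in a self-contained form; to run without `ts` (from moreutils) it swaps in `date` for the timestamp, and the temporary logfile and the `DEMO` identifier are assumptions made for the example only.

```bash
#!/bin/bash
# Sketch of the logging call pattern with `date` swapped in for `ts`.
# The temporary logfile and the DEMO identifier are illustrative only.
logfile=$(mktemp)
log_identifier="DEMO"

# Write each line of the message to the logfile with timestamp and identifier
log() {
    echo -e "$@" | while IFS= read -r line; do
        printf '[%s] [%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$log_identifier" "$line"
    done >> "$logfile"
}

# Log the message and also print it (without timestamp) to STDOUT
log_echo() {
    log "$@"
    echo -e "[$log_identifier] $*"
}

log "only written to the log file"
log_echo "written to the log file and to STDOUT"
```

Only the `log_echo` output reaches STDOUT, so only those lines end up in a notification when the script is piped into another helper.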

78
helpers/state_machine.sh Executable file

@ -0,0 +1,78 @@
#!/bin/bash
################################################################################
# STATE_MACHINE.SH
# ----------------
# This script saves the last message for a certain key and compares the next
# message for the same key to not have any repeating notifications.
#
# Author: Robin Meier - robin@meier.si
################################################################################
set -o allexport
source /root/scripts/config/state_machine
set +o allexport
mkdir -p $STORAGE_PATH # Make sure STORAGE_PATH exists
# The first argument is the KEY; the message comes from standard input or the second argument
if [[ $# -eq 0 ]]; then
echo "[ERROR] Not enough arguments!"
exit 1
elif [[ $# -eq 1 ]]; then
MESSAGE=$(timeout 32 cat)
KEY=$1
elif [[ $# -eq 2 ]]; then
KEY=$1
MESSAGE=$2
else
echo "[ERROR] Too many arguments!"
exit 1
fi
# Check if KEY is empty
if [[ -z "${KEY}" ]]; then
echo "[ERROR] KEY argument is missing!"
exit 1
fi
KEY_FILE="${STORAGE_PATH}/${KEY}.txt"
if [[ -f $KEY_FILE && -z "${MESSAGE}" ]]; then
# Previous message present and empty message now
OLD_MESSAGE=$(cat $KEY_FILE)
echo "✅ Resolved"
echo "$OLD_MESSAGE"
rm $KEY_FILE
exit 0
elif [[ -f $KEY_FILE ]]; then
# Message and previous message present
OLD_MESSAGE=$(cat $KEY_FILE)
# Compare contents
if [[ "$OLD_MESSAGE" == "$MESSAGE" ]]; then
# Check last notification
if [ "$(( $(date +"%s") - $(stat -c "%Y" "$KEY_FILE") ))" -gt "$RENOTIFY_AGE_SEC" ]; then
touch $KEY_FILE
echo "‼Renotify"
else
exit 0
fi
else
echo "$MESSAGE" > $KEY_FILE
echo "⁉Changed"
fi
else
if [[ -z "${MESSAGE}" ]]; then
# No message present
exit 0
fi
# New message present, create KEY_FILE, continue to relaying
echo "$MESSAGE" > $KEY_FILE
echo "❗New"
fi
# Relay the message if we made it this far (the quotes matter: without them, the lines would be joined with spaces)
echo "$MESSAGE"
exit 0


@ -1,24 +1,23 @@
#!/bin/bash #!/bin/bash
################################################################################ ################################################################################
# TELEGRAM_NOTIFICATION.SH # TG_NOTIFY.SH
# ------------------------ # ------------
# This script takes input via stdin or parameters, removes timestamps from each # This script takes input via stdin or parameters, replaces newlines with
# line and replaces newlines with telegram compatible ones and then sends the # telegram compatible ones and then sends the message to a chat
# message to a chat
# #
# Author: Robin Meier - robin@meier.si # Author: Robin Meier - robin@meier.si
################################################################################ ################################################################################
set -o allexport set -o allexport
source /root/scripts/.telegram_notification_env source /root/scripts/config/tg_notify
set +o allexport set +o allexport
BOT_API_URL=https://api.telegram.org/bot${BOT_TOKEN} BOT_API_URL=https://api.telegram.org/bot${BOT_TOKEN}
# Get input from standard input or via first parameter # Get input from standard input or via first parameter
if [[ $# -eq 0 ]]; then if [[ $# -eq 0 ]]; then
MESSAGE=$(timeout 30 cat) MESSAGE=$(timeout 32 cat)
elif [[ $# -eq 1 ]]; then elif [[ $# -eq 1 ]]; then
MESSAGE=$1 MESSAGE=$1
elif [[ $# -eq 2 ]]; then elif [[ $# -eq 2 ]]; then
@ -33,11 +32,6 @@ if [[ -z "${MESSAGE}" ]]; then
exit 0 exit 0
fi fi
# Strip timestamps from message
if [ "${MESSAGE:0:12}" == "$(echo '' | ts "[%Y-%m-%d")" ]; then
MESSAGE=$(echo -e "$MESSAGE" | cut -c 23-)
fi
# Replace newlines in message for telegram # Replace newlines in message for telegram
TG_MESSAGE=${MESSAGE//$'\n'/\%0A} TG_MESSAGE=${MESSAGE//$'\n'/\%0A}


@ -1,18 +1,14 @@
#!/bin/bash #!/bin/bash
logfile=/root/logs/monitoring.log # Load configuration
log_identifier="[MON]"
log() {
echo -e $@ | ts "[%Y-%m-%d %H:%M:%S] $log_identifier" >> $logfile
}
log_echo() {
echo -e $@ | ts "[%Y-%m-%d %H:%M:%S] $log_identifier" | tee -a $logfile
}
set -o allexport set -o allexport
source /root/scripts/.monitoring_env source /root/scripts/config/monitoring
set +o allexport set +o allexport
# Import logging functionality
logfile=/root/scripts/log/monitoring.log
log_identifier="MON"
source /root/scripts/functions/logging.sh
problems=0 problems=0
@ -23,28 +19,24 @@ do
if [[ $(nc -w 2 ${ssh_host//:/ } <<< "\0" ) =~ "OpenSSH" ]] ; then if [[ $(nc -w 2 ${ssh_host//:/ } <<< "\0" ) =~ "OpenSSH" ]] ; then
log "[SSH] [OK] ${ssh_host} is reachable" log "[SSH] [OK] ${ssh_host} is reachable"
else else
# TODO: Rate limit fail messages, also add is back up message
log_echo "[SSH] [FAIL] ${ssh_host} not reachable" log_echo "[SSH] [FAIL] ${ssh_host} not reachable"
problems=1 problems=1
fi fi
done done
# TODO: HTTP Status Code 200 Monitoring
for http_host in $HTTP_MONITORING for http_host in $HTTP_MONITORING
do do
status_code=$(curl --write-out %{http_code} --silent --output /dev/null $http_host) status_code=$(curl --write-out %{http_code} --silent --output /dev/null $http_host)
if [[ "$status_code" -eq 200 ]] ; then if [[ "$status_code" -eq 200 ]] ; then
log "[WEB] [OK] ${http_host}" log "[WEB] [OK] ${http_host}"
else else
# TODO: Rate limit fail messages, also add is back up message
log_echo "[WEB] [FAIL] ${http_host} status code is ${status_code}" log_echo "[WEB] [FAIL] ${http_host} status code is ${status_code}"
problems=1 problems=1
fi fi
done done
if [[ "$problems" -eq "0" ]]; then if [[ "$problems" -eq "0" ]]; then
log "Monitoring Run Successful" log "Monitoring Run Successful"
else else
log_echo "Monitoring Run Failed" log "Monitoring Run Failed"
fi fi


@ -5,7 +5,7 @@ Before=shutdown.target
[Service] [Service]
Type=oneshot Type=oneshot
ExecStart=/root/scripts/telegram_notification.sh "[SHUTDOWN] System going down" ExecStart=/root/scripts/helpers/tg_notify.sh "[SHUTDOWN] System going down"
TimeoutStartSec=0 TimeoutStartSec=0
[Install] [Install]


@ -8,19 +8,15 @@
# Author: Robin Meier - robin@meier.si # Author: Robin Meier - robin@meier.si
################################################################################ ################################################################################
logfile=/root/logs/system_health_check.log # Load configuration
log_identifier="[SYS]"
log() {
echo -e $@ | ts "[%Y-%m-%d %H:%M:%S] $log_identifier" >> $logfile
}
log_echo() {
echo -e $@ | ts "[%Y-%m-%d %H:%M:%S] $log_identifier" | tee -a $logfile
}
set -o allexport set -o allexport
source /root/scripts/.system_health_check_env source /root/scripts/config/system_health_check
set +o allexport set +o allexport
# Import logging functionality
logfile=/root/scripts/log/system_health_check.log
log_identifier="SYS"
source /root/scripts/functions/logging.sh
problems=0 problems=0
@ -28,8 +24,9 @@ log "Starting System Health Check"
# RAM usage percentage # RAM usage percentage
ram=$(free | awk '/Mem/{printf("%.2f"), $3/$2*100}') ram=$(free | awk '/Mem/{printf("%.2f"), $3/$2*100}')
if [ $(echo "$ram > $RAM_LIMT" | bc -l) -eq 1 ]; then if [ $(echo "$ram > $RAM_LIMIT" | bc -l) -eq 1 ]; then
log_echo "[RAM] usage is ${ram}%! (Limit: $RAM_LIMIT)" log_echo "[RAM] usage is above limit of ${RAM_LIMIT}%!"
log "[RAM] usage is ${ram}%! (Limit: $RAM_LIMIT)"
problems=1 problems=1
else else
log "[RAM] usage is ${ram}%" log "[RAM] usage is ${ram}%"
@ -38,7 +35,8 @@ fi
# CPU usage percentage # CPU usage percentage
cpu=$(top -bn1 | grep "Cpu(s)" | awk '{print $2 + $4}') cpu=$(top -bn1 | grep "Cpu(s)" | awk '{print $2 + $4}')
if [ $(echo "$cpu > $CPU_LIMIT" | bc -l) -eq 1 ]; then if [ $(echo "$cpu > $CPU_LIMIT" | bc -l) -eq 1 ]; then
log_echo "[CPU] load is ${cpu}%! (Limit: $CPU_LIMIT)" log_echo "[CPU] load is above limit of ${CPU_LIMIT}%!"
log "[CPU] load is ${cpu}%! (Limit: $CPU_LIMIT)"
problems=1 problems=1
else else
log "[CPU] load is ${cpu}%" log "[CPU] load is ${cpu}%"
@ -50,7 +48,8 @@ fi
# Temperature # Temperature
avg_cpu_temp=$(sensors | awk '/^Core /{++r; gsub(/[^[:digit:]]+/, "", $3); s+=$3} END{print s/(10*r)}') avg_cpu_temp=$(sensors | awk '/^Core /{++r; gsub(/[^[:digit:]]+/, "", $3); s+=$3} END{print s/(10*r)}')
if [ $(echo "$avg_cpu_temp > $TEMP_LIMIT" | bc -l) -eq 1 ]; then if [ $(echo "$avg_cpu_temp > $TEMP_LIMIT" | bc -l) -eq 1 ]; then
log_echo "[TEMP] is ${avg_cpu_temp}°C! (Limit: $TEMP_LIMIT)" log_echo "[TEMP] is above limit of ${TEMP_LIMIT}°C!"
log "[TEMP] is ${avg_cpu_temp}°C! (Limit: $TEMP_LIMIT)"
problems=1 problems=1
else else
log "[TEMP] is ${avg_cpu_temp}°C" log "[TEMP] is ${avg_cpu_temp}°C"
@ -71,5 +70,5 @@ fi
if [ ${problems} -eq 0 ]; then if [ ${problems} -eq 0 ]; then
log "System Health Check Successful" log "System Health Check Successful"
else else
log_echo "System Health Check Found Problems" log "System Health Check Found Problems"
fi fi


@ -10,21 +10,18 @@
# Author: Robin Meier - robin@meier.si # Author: Robin Meier - robin@meier.si
################################################################################ ################################################################################
logfile=/root/logs/zfs_health_check.log # Load configuration
log_identifier="[ZFS]" set -o allexport
log() { source /root/scripts/config/zfs_health_check
echo -e $@ | ts "[%Y-%m-%d %H:%M:%S] $log_identifier" >> $logfile set +o allexport
}
log_echo() { # Import logging functionality
echo -e $@ | ts "[%Y-%m-%d %H:%M:%S] $log_identifier" | tee -a $logfile logfile=/root/scripts/log/zfs_health_check.log
} log_identifier="ZFS"
source /root/scripts/functions/logging.sh
problems=0 problems=0
set -o allexport
source /root/scripts/.zfs_health_check_env
set +o allexport
log "Starting ZFS Health Check" log "Starting ZFS Health Check"
# Pool Status # Pool Status
@ -123,5 +120,5 @@ done
if [ ${problems} -eq 0 ]; then if [ ${problems} -eq 0 ]; then
log "ZFS Health Check Successful" log "ZFS Health Check Successful"
else else
log_echo "ZFS Health Check Found Problems" log "ZFS Health Check Found Problems"
fi fi