Linux Watchdog for Read-Only Filesystems
Introduction
In this blog post, we will walk through creating a watchdog script that detects when a filesystem on a Linux system has gone read-only. If the filesystem is found to be read-only, the script reboots the system after a specified timeout. This lets an unattended machine recover on its own instead of staying degraded until someone notices.
Problem Statement
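One common issue on Linux systems is a filesystem going read-only. This usually happens when the kernel detects filesystem corruption or I/O errors, for example from a failing disk or other hardware fault, and remounts the filesystem read-only to prevent further damage (ext4 does this when mounted with the errors=remount-ro option). While a filesystem is read-only, nothing can be written to it, so logging, databases, and any service that needs to write will start to fail.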
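To make "read-only" concrete: the kernel lists each mount and its options in /proc/mounts, and a read-only mount carries the option ro. The sketch below is only an illustration of that idea; the helper name has_ro_option is hypothetical and not part of the script built later in this post.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if "ro" appears as a whole option in a
# comma-separated mount option string. "errors=remount-ro" does not count,
# because it is a different option that merely ends in "ro".
has_ro_option() {
    case ",$1," in
        *,ro,*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Report the state of / as listed in /proc/mounts.
# Fields per line: device, mount point, type, options, dump, pass.
if [ -r /proc/mounts ]; then
    while read -r _dev mnt _type opts _rest; do
        if [ "$mnt" = "/" ]; then
            if has_ro_option "$opts"; then
                echo "/ is read-only ($opts)"
            else
                echo "/ is writable ($opts)"
            fi
        fi
    done < /proc/mounts
fi
```

Matching ro as a whole option matters: a naive substring search for "ro" would also match healthy mounts that use errors=remount-ro.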
Solution
To address this problem, we will create a script that periodically checks the status of the filesystem and reboots the system if it is detected to be read-only. We will run the script as a systemd service so that it starts automatically at boot.
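Rebooting only after a timeout leaves room to back off if the problem clears in the meantime. One way to sketch that idea, assuming a hypothetical helper named persists_for (not something the original script defines), is to poll a check command during the grace period and give up as soon as it stops succeeding:

```shell
#!/bin/bash
# Hypothetical helper: run a check command every <step> seconds for up to
# <timeout> seconds. Returns 0 only if the check succeeded every time,
# including once more at the end; returns non-zero as soon as it fails.
persists_for() {
    local timeout="$1"
    local step="$2"
    local waited=0
    shift 2
    while [ "$waited" -lt "$timeout" ]; do
        "$@" || return 1
        sleep "$step"
        waited=$((waited + step))
    done
    "$@"
}

# Harmless demonstration: "test -d /" stands in for a real read-only check.
if persists_for 2 1 test -d /; then
    echo "condition persisted for the whole grace period"
fi
```

In a watchdog, the check command would be the read-only test, and the reboot would run only when persists_for returns success.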
Prerequisites
Before proceeding, ensure you have the following:
- A Linux system with root access.
- Basic knowledge of shell scripting and systemd.
Step-by-Step Guide
1. Create the Watchdog Script
First, create a new script file named fs_watchdog.sh:
sudo nano /usr/local/bin/fs_watchdog.sh
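Add the following content. The script checks the filesystem every CHECK_INTERVAL seconds and reboots only if the filesystem is still read-only after TIMEOUT seconds:
#!/bin/bash

# Mount point of the filesystem to monitor (e.g., root filesystem '/')
FILESYSTEM="/"

# Grace period in seconds between detection and reboot
TIMEOUT=600

# Seconds to wait between checks
CHECK_INTERVAL=60

# Print a timestamped log message
log() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $*"
}

# Succeed if the filesystem is mounted read-only. "ro" must be a whole
# mount option, so "errors=remount-ro" on a writable mount does not match.
is_readonly() {
    findmnt -n -o OPTIONS "$FILESYSTEM" | grep -qE '(^|,)ro(,|$)'
}

while true; do
    if is_readonly; then
        log "Filesystem $FILESYSTEM is read-only. Rebooting in $TIMEOUT seconds."
        sleep "$TIMEOUT"
        # Check again in case the filesystem was remounted read-write meanwhile
        if is_readonly; then
            log "Rebooting system now."
            systemctl reboot
        else
            log "Filesystem $FILESYSTEM is writable again. Reboot cancelled."
        fi
    fi
    sleep "$CHECK_INTERVAL"
done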
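The check in the script relies on the mount options the kernel reports. A complementary approach, shown here only as a sketch, is to try an actual write. The helper name can_write and the probe file name are hypothetical, and note that a write can also fail for other reasons, such as a full disk or missing permissions, so this probe on its own is not proof of a read-only mount.

```shell
#!/bin/bash
# Hypothetical helper: succeeds only if a temporary file can be created
# (and then removed) inside the given directory.
can_write() {
    local dir="$1"
    local probe
    probe=$(mktemp "$dir/.fs_watchdog_probe.XXXXXX" 2>/dev/null) || return 1
    rm -f "$probe"
}

# Example: probe the temporary directory.
target="${TMPDIR:-/tmp}"
if can_write "$target"; then
    echo "$target accepts writes"
else
    echo "$target rejected a write"
fi
```

If you experiment with this, combining it with the mount-option check is safer than replacing that check outright, since a failed write alone would also trigger on a full disk.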
2. Make the Script Executable
Make the script executable by running:
sudo chmod +x /usr/local/bin/fs_watchdog.sh
3. Create a Systemd Service
To run the script automatically, we will create a systemd service.
Create a new file named fs-watchdog.service in /etc/systemd/system/:
sudo nano /etc/systemd/system/fs-watchdog.service
Add the following content:
[Unit]
Description=Filesystem Watchdog Service

[Service]
Type=simple
ExecStart=/usr/local/bin/fs_watchdog.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
4. Enable and Start the Service
Reload systemd so it picks up the new unit file, enable the service so it starts at boot, and start it now:
sudo systemctl daemon-reload
sudo systemctl enable fs-watchdog.service
sudo systemctl start fs-watchdog.service
5. Verify the Service
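Check the status of the service to confirm that it started without errors:
sudo systemctl status fs-watchdog.service
The messages the script prints are captured in the systemd journal by default, so you can read them with journalctl -u fs-watchdog.service.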
Conclusion
In this blog post, we created a watchdog script that detects when a filesystem has gone read-only and reboots the system after a specified timeout, and we set it up to run as a systemd service. With this in place, a Linux system can recover from a read-only remount on its own instead of staying degraded until someone intervenes.
Feel free to modify the script to suit your specific needs, such as monitoring different filesystems or changing the timeout interval.
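As one possible starting point for monitoring several filesystems, the sketch below reports which mount points from a list are currently mounted read-only. The list contents, the WATCHED variable, and the readonly_mounts function are all hypothetical names chosen for illustration; the function reads a mounts table in /proc/mounts format.

```shell
#!/bin/bash
# Hypothetical space-separated list of mount points to watch; adjust as needed.
WATCHED="/ /var /home"

# Print each watched mount point that is mounted read-only according to a
# mounts table in /proc/mounts format (default: /proc/mounts itself).
readonly_mounts() {
    local table="${1:-/proc/mounts}"
    local _dev mnt _type opts _rest target
    [ -r "$table" ] || return 1
    while read -r _dev mnt _type opts _rest; do
        for target in $WATCHED; do
            [ "$mnt" = "$target" ] || continue
            case ",$opts," in
                *,ro,*) echo "$mnt" ;;
            esac
        done
    done < "$table"
}

# Example: warn about every watched filesystem that is read-only right now.
for mnt in $(readonly_mounts); do
    echo "WARNING: $mnt is mounted read-only"
done
```

A watchdog built on this could start its reboot countdown whenever the function prints anything, or treat some mount points as more critical than others.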
If you have any questions or feedback, please leave a comment below. Happy scripting!