A Linux file server uses software RAID 1 (/dev/md0) built from two 14 TB SAS disks, /dev/sdb and /dev/sdc. During a large backup job, users report that saving files to /data fails. The kernel log shows lines like:
blk_update_request: I/O error, dev sdc, sector 28727488 op 0x0:(READ)
md/raid1: md0: Disk failure on sdc, disabling device
mdadm --detail /dev/md0 now lists /dev/sdc as faulty, while /dev/sdb remains active. SMART data for /dev/sdc shows a rapidly increasing Reallocated_Sector_Ct value. Which action should you take FIRST to eliminate the read/write errors and restore full redundancy?
Run fsck with automatic repair on the /data filesystem, then remount it.
Replace the faulty disk (/dev/sdc) with a new drive of equal or greater capacity and rebuild the RAID 1 array.
Use mdadm --zero-superblock on /dev/sdc and immediately re-add it to the array to clear the faulty flag.
Tune /proc/sys/vm/dirty_writeback_centisecs to reduce write bursts and lower I/O pressure.
The kernel and mdadm messages show that the I/O errors originate from the physical drive /dev/sdc, which md has already marked faulty. SMART reports an abnormal and rising Reallocated_Sector_Ct, a classic sign of imminent disk failure. Once a drive shows this kind of predictive SMART warning, the recommended remediation is to replace it and rebuild the array. Clearing the fault flag by zeroing the superblock and re-adding the same disk puts failing hardware back into the mirror, and repairing the filesystem does nothing about the underlying hardware problem. Adjusting write-back parameters might reduce I/O pressure, but it leaves the array without redundancy and risks further data loss.
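The replace-and-rebuild step can be sketched as a short dry-run script. This is a minimal illustration, not a definitive procedure: it assumes the array members are whole disks as in the question, and the replacement device name /dev/sdd is hypothetical (after a hot swap the new disk may well reappear as /dev/sdc). The helper names `replacement_plan` and `is_degraded` are made up for this example. The script only prints the mdadm commands; it does not execute them.

```python
import re
import shlex


def replacement_plan(array, failed, replacement):
    """Return the mdadm invocations, in order, for swapping one RAID 1 member."""
    return [
        # Mark the member as failed (md has usually done this already).
        ["mdadm", "--manage", array, "--fail", failed],
        # Detach it from the array so the disk can be physically pulled.
        ["mdadm", "--manage", array, "--remove", failed],
        # After the physical swap: add the new disk, which starts the resync.
        ["mdadm", "--manage", array, "--add", replacement],
    ]


def is_degraded(mdstat_text, name):
    """True if the [total/active] counter for `name` shows active < total."""
    match = re.search(
        rf"^{re.escape(name)} :.*\n.*\[(\d+)/(\d+)\]", mdstat_text, re.M
    )
    if match is None:
        raise ValueError(f"{name} not found in mdstat text")
    total, active = map(int, match.groups())
    return active < total


if __name__ == "__main__":
    # /dev/sdd is an assumed name for the new drive.
    for cmd in replacement_plan("/dev/md0", "/dev/sdc", "/dev/sdd"):
        print(shlex.join(cmd))  # dry run: print only, execute nothing
```

Passing the contents of /proc/mdstat to `is_degraded` before and after the resync is one way to confirm that the mirror is back to two active members; `mdadm --detail /dev/md0` gives the same information in more detail.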