A systems administrator is investigating a rack-mounted server that crashes at random two or three times a week. The crashes do not correspond to any scheduled tasks or specific user activity. The server's OS event logs show only an 'unexpected shutdown' error, recorded immediately after each reboot. While physically inspecting the server after a recent crash, the administrator noticed that the chassis was warmer than usual and that a primary exhaust fan was not spinning. What is the MOST likely cause of the random crashes?
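The correct answer is CPU overheating. Two physical observations point to it: the chassis was warmer than usual, and a primary exhaust fan was not spinning. A failed exhaust fan reduces airflow through the chassis, so heat builds up and the CPU temperature rises, especially under load. Modern CPUs have built-in thermal protection that throttles the processor and, once a critical temperature is reached, shuts the system down to prevent permanent damage. Because this shutdown is forced by hardware, the OS cannot log the cause and records only an 'unexpected shutdown' on the next boot, which matches the logs. Because the temperature depends on load and ambient conditions rather than on a schedule, the crashes appear random.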
Intermittent memory module failure can cause random crashes, but it does not explain the thermal symptoms observed (warm chassis, stopped fan).
A corrupted boot sector would prevent the operating system from loading at startup, not cause crashes while the system is running.
A power supply unit (PSU) fault could cause random reboots, but the specific evidence of a failed fan and high heat points more directly to a thermal problem as the immediate cause of the shutdown.
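The thermal reasoning in the correct answer can be sketched as a small check over sensor readings. This is a minimal illustration only: the `Reading` type, the `thermal_findings` function, the fan labels, and both thresholds are hypothetical and do not come from any vendor tool. Real limits should be taken from the CPU and chassis documentation, and real readings from the server's management controller or monitoring agent.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical thresholds for illustration; actual limits vary by CPU and chassis.
CPU_WARN_C = 85.0   # assumed warning temperature in degrees Celsius
FAN_MIN_RPM = 500   # assumed speed below which a fan is treated as stalled


@dataclass
class Reading:
    cpu_temp_c: float
    fan_rpm: Dict[str, int]  # fan label -> measured RPM


def thermal_findings(reading: Reading) -> List[str]:
    """Return human-readable findings that support a thermal diagnosis."""
    findings = []
    # Report stalled fans first, in a stable (sorted) order.
    for name in sorted(reading.fan_rpm):
        if reading.fan_rpm[name] < FAN_MIN_RPM:
            findings.append(f"fan '{name}' stalled or below {FAN_MIN_RPM} RPM")
    # Then report an elevated CPU temperature, if any.
    if reading.cpu_temp_c >= CPU_WARN_C:
        findings.append(
            f"CPU at {reading.cpu_temp_c:.0f} C, at or above "
            f"{CPU_WARN_C:.0f} C warning level"
        )
    return findings


if __name__ == "__main__":
    # Example values chosen to mirror the scenario: one stopped exhaust fan, hot CPU.
    sample = Reading(cpu_temp_c=92.0, fan_rpm={"exhaust1": 0, "intake1": 4200})
    for line in thermal_findings(sample):
        print(line)
```

A stalled fan together with an elevated CPU temperature supports the thermal explanation. An empty result would shift suspicion toward the other candidates, such as memory or the PSU.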