Reliable Cron Across the Planet
This paper by two Google engineers is a non-technical explanation of issues found in large-scale distributed systems.
Starting with cron on a single machine, it nicely shows what changes as a job runner is scaled out to a distributed system…