At my last job, we were running about 80 crons (hourly, daily and weekly) across a cluster of 100 machines. We ended up using an external system, Jenkins, to schedule and execute the scripts. Each cron script is just a normal function or API call. On a given schedule, Jenkins either SSHes into a particular machine and executes the cron script, or simply hits an API (obviously with authentication). If the cron fails, Jenkins sends out an e-mail with the failure details.
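To make the "hits an API" variant concrete, here is a minimal sketch of a script a Jenkins job could run as its build step. It is an illustration, not our actual code: the function name `trigger_job`, the bearer-token header and the use of POST are all assumptions. The only thing that matters is the exit code, since Jenkins marks a build as failed when its step exits non-zero, and that is what triggers the failure e-mail.

```python
import sys
import urllib.error
import urllib.request


def trigger_job(url, token, timeout=300, opener=urllib.request.urlopen):
    """Call the (hypothetical) job endpoint; return 0 on success, 1 on failure."""
    request = urllib.request.Request(
        url,
        method="POST",
        # Assumed auth scheme: a bearer token. Swap in whatever your API uses.
        headers={"Authorization": f"Bearer {token}"},
    )
    try:
        with opener(request, timeout=timeout) as response:
            status = response.status
    except OSError as exc:
        # URLError, HTTPError (4xx/5xx) and socket timeouts are all OSError subclasses.
        print(f"cron job request failed: {exc}", file=sys.stderr)
        return 1
    if not 200 <= status < 300:
        print(f"cron job returned unexpected status {status}", file=sys.stderr)
        return 1
    return 0


# In the build step, something like:
#   sys.exit(trigger_job(os.environ["JOB_URL"], os.environ["JOB_TOKEN"]))
# where JOB_URL and JOB_TOKEN are hypothetical variables injected by Jenkins.
```

Anything the script prints to stderr ends up in the Jenkins console log for that build, so the failure details are available alongside the e-mail.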
I prefer this approach over messing around with a crontab because:
- The notification system is generalized. I configure it once on Jenkins and it stays the same for any cron that we write in the future. I hate Postfix and Sendmail with a vengeance. :D
- If you have a distributed cluster, running crons requires maintaining a distributed cron system. While Dkron exists, I'm not familiar with it, and we already use Jenkins as a build system. No harm in doubling it up as a cron scheduler.
- A ready web interface to look at the logs, history and statistics of past cron jobs. How long did it take to run? Is it taking more time today than it did a month ago? Do we need to optimize our code?
- A ready web interface to manage the cron system. Don't need a cron? Simply disable the build job. Need it on a different schedule? Just change the scheduler on the web. Takes the management pain away from me.
Hope this helps you design a better cron system.