One of the most popular use cases for cron is performing maintenance operations on a machine. This is manageable for a handful of machines, but as the number of systems increases, it becomes more of a chore, and questions arise such as:
- Are all systems configured?
- With what configurations?
- When will the operations be done?
- Are the operations timed/coordinated well?
- What to do when one or more machines need to be taken out of the maintenance loop?
We will present a setup to show how to manage this using hcron.
For each operation, we use the following structure:
- At least one template is created for each operation.
- The timegroup name encodes when the event should be launched.
- The template decodes the timegroup name from the event name (HCRON_EVENT_NAME).
- The template extracts the hostname from the event name.
For example:
- 1_0 is decoded to "when_hour=1" and "when_minute=0".
- 0,12_0 is decoded to "when_hour=0,12" and "when_minute=0".
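The decoding of a timegroup name can be sketched in a few lines of Python. This is only an illustration of the naming convention, not hcron's own template syntax; the function name decode_timegroup is hypothetical, and the sketch assumes the name has the form <when_hour>_<when_minute>.

```python
def decode_timegroup(timegroup: str) -> dict[str, str]:
    """Split '<when_hour>_<when_minute>' into its two scheduling fields.

    Hypothetical helper for illustration; hcron templates do this with
    their own substitution mechanism, not with Python.
    """
    when_hour, sep, when_minute = timegroup.partition("_")
    if not sep or not when_hour or not when_minute:
        raise ValueError(f"expected <when_hour>_<when_minute>, got {timegroup!r}")
    return {"when_hour": when_hour, "when_minute": when_minute}


print(decode_timegroup("1_0"))     # {'when_hour': '1', 'when_minute': '0'}
print(decode_timegroup("0,12_0"))  # {'when_hour': '0,12', 'when_minute': '0'}
```

Because the hour field is kept as a string, a list such as 0,12 passes through unchanged.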
To limit complexity and avoid creating an event file for each operation+host pair, templates are used, each of which decodes HCRON_EVENT_NAME to obtain the timegroup and hostname.
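As a rough illustration of what a template gets from the event name, the following Python sketch splits a name into its path components. It assumes that the components are separated by "/", that the hostname is the last component, and that the timegroup is the one before it (index -2). The function split_event_name and the sample event name are hypothetical.

```python
def split_event_name(event_name: str) -> tuple[str, str]:
    """Return (timegroup, hostname) from an event name.

    Assumes '/'-separated components with the hostname last (index -1)
    and the timegroup immediately before it (index -2).
    """
    parts = event_name.strip("/").split("/")
    if len(parts) < 2:
        raise ValueError(f"event name too short: {event_name!r}")
    return parts[-2], parts[-1]


# Hypothetical event name: <operation>/<timegroup>/<hostname>
timegroup, hostname = split_event_name("cleanup/1_0/host1")
print(timegroup)  # 1_0
print(hostname)   # host1
```

Indexing from the end means the template does not care how deep the operation sits in the event tree.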
As presented, the timegroup name is effectively <when_hour>_<when_minute>. Though clear for scheduling, the name does not communicate anything about the members. We can address this by adding a tag. Two ways to do this are:
- Incorporate the tag into the timegroup name.
- Add an additional level to the event tree.
Enhancing the Timegroup Name
Incorporating the tag into the timegroup name is easiest because the current setup requires almost no change. Simply prefix the timegroup name with a tag.
For example, if the timegroup names in the event tree are given a tag prefix, the template event does not need to change because the MN values are read from the end of the timegroup name, so that anything preceding is effectively ignored.
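To illustrate why reading from the end of the name makes a prefix harmless, here is a small Python sketch. It assumes the tag is joined to the timegroup with an underscore (for example red_1_0); that separator and the function name decode_tagged_timegroup are assumptions made for illustration only.

```python
def decode_tagged_timegroup(name: str) -> dict[str, str]:
    """Read when_hour and when_minute from the END of a timegroup name.

    Assumes '[<tag>_]<when_hour>_<when_minute>'. Everything before the
    last two underscore-separated fields is treated as the tag.
    """
    parts = name.rsplit("_", 2)  # split from the right, at most twice
    if len(parts) < 2:
        raise ValueError(f"expected [<tag>_]<when_hour>_<when_minute>, got {name!r}")
    tag = parts[0] if len(parts) == 3 else ""
    return {"tag": tag, "when_hour": parts[-2], "when_minute": parts[-1]}


print(decode_tagged_timegroup("1_0"))
# {'tag': '', 'when_hour': '1', 'when_minute': '0'}
print(decode_tagged_timegroup("red_1_0"))
# {'tag': 'red', 'when_hour': '1', 'when_minute': '0'}
```

Both the tagged and untagged names yield the same scheduling values, which is why the template can stay as it is.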
Adding a Level to the Event Tree
Alternatively, another level could be added to the event tree.
This will require that:
- The symlinks be updated to account for the additional level.
- The template be modified to extract the TIMEGROUP from index -3 not -2.
It is also possible to reverse the order of the timegroup and tag to get "red/1_0" instead of "1_0/red", at the cost of making the tag more important than the timegroup.
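The effect of the extra level on the indices can be sketched in Python as follows. The sketch assumes "/"-separated event names with the hostname as the last component; the function locate_fields and the sample names are hypothetical.

```python
def locate_fields(event_name: str, timegroup_index: int = -3,
                  tag_index: int = -2) -> dict[str, str]:
    """Pick timegroup, tag, and hostname out of an event name by index.

    Defaults match the '<timegroup>/<tag>/<hostname>' layout, where the
    timegroup moves from index -2 to index -3.
    """
    parts = event_name.strip("/").split("/")
    if len(parts) < 3:
        raise ValueError(f"event name too short: {event_name!r}")
    return {
        "timegroup": parts[timegroup_index],
        "tag": parts[tag_index],
        "hostname": parts[-1],
    }


# Timegroup first, then tag: "1_0/red"
print(locate_fields("cleanup/1_0/red/host1"))
# {'timegroup': '1_0', 'tag': 'red', 'hostname': 'host1'}

# Reversed order, "red/1_0": the timegroup stays at index -2
print(locate_fields("cleanup/red/1_0/host1", timegroup_index=-2, tag_index=-3))
# {'timegroup': '1_0', 'tag': 'red', 'hostname': 'host1'}
```

With the reversed order the timegroup keeps its original index, so only the directory layout changes, not the place the template looks for the timegroup.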
Both approaches achieve the goal, each with its own strengths. However, incorporating the tag into the timegroup name is the simplest.
Using hcron and the setup described above, all the questions raised in the introduction are easily answered. There is only one authoritative place to look for the information. And with much of the configuration immediately visible in the structure and event names, documentation of the setup comes for free (this is enhanced with the information fields of v1.5 and the hcron doc support). Logging helps to track what happened, when, and where. Updates are easy to do and unaffected by whether a machine is up or down, answering or not. The solution is scalable from 10 to 1000s of machines. The setup can be backed up and managed with a version control system. This approach makes administration of such maintenance-type operations much easier than it ever was with local crontab files.