Cron, A practical guide to

Posted by lec** on Saturday, May 03 2008 @ 21:42:21 GMT        

Today, as requested, I have decided to say a little about cron, what it is and how it works, and give an overview and several methods for making use of it on a website.

What is cron? Cron is a time-based scheduling service or, in less complex terms, a way to run a certain program at a time you designate. Cron is useful in the development of web applications because it allows you to execute a cleanup script, for example, every Thursday at 6 am, when your visitors are very few. By analysing the traffic to your website, you can find lulls when there are very few visitors on average, and in those periods you could use cron to run scripts that (for example) delete old comments that are no longer useful, create summaries of the most viewed content (this can take time, so you don't want to do it every time the page loads), save the visitor log to a file and then clear the current log from the database, and so forth. You could even create summary PDFs for yourself and your other admins at the end of the week using cron. Cron is very useful.

[Image: Cron is a great tool for web developers]

So how does it work? Cron is specific to the operating system. It's a service available on UNIX-like systems, which means it is available on most servers. Using cron, you can make the server itself run certain scripts at certain times. A program running on the server (called a daemon) wakes up once every minute, checks the file you have registered (as a user of the system), and reads its contents to see whether any of your programs should be run this minute. If so, those programs are executed, and the daemon goes back to sleep for a minute (that sounds like something from D&D). If you are a user of the system and have permission to use cron, just make your own cron file and register it for execution.

You do this by writing commands into a file (you could give it the .cron extension, for ease of recognition - e.g. pear.cron) and then using the crontab command on it from the UNIX terminal (~$ crontab pear.cron). Your cron file will now also be checked every minute.

Crontab file syntax is super-simple. It just specifies when you want your program to be run, and what the program is (its path on the machine). Here's the general syntax:

 (minute) (hour) (day_of_month) (month) (day_of_week) (program_to_be_run)
 

Just replace the names in brackets with numbers, and the (program_to_be_run) with the path and name of the program whose execution you are scheduling with cron. You can have multiple cron tasks listed in one crontab file. Also, lines beginning with a # sign are considered comments in cron files. For example:

 # run /usr/bin/pear every 15 minutes while the server is up
 0,15,30,45 * * * * /usr/bin/pear
 
 # exactly the same thing as above, shorter notation
 */15 * * * * /usr/bin/pear
 
 # run my plum program at 10 minutes past 3 am every day
 10 3 * * * /usr/bin/plum
 
 # run some pineapples 10 minutes past midnight on the first day of every month
 10 0 1 * * /usr/bin/pineapple
 
 # pretty much the same thing as above, except only on the first day of the year
 10 0 1 1 * /usr/bin/passionfruit
 
 # finally, roll watermelons down a hill every Monday at 4:16pm (machine local time)
 16 16 * * 1 /usr/bin/watermellons
 

That's pretty simple. If you were to put that in a file called fruit.cron and schedule it for execution using the crontab program, you would get these fruit-related programs executed at their specified times. Note that even if you specify every field, the task is not limited to running once: there is no year field, so it will come around again every year. For a true one-off job, the at command is usually a better fit than cron.
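To make the field rules concrete, here is a rough PHP sketch of how a schedule line could be checked against a moment in time. It is only an illustration, not how the cron daemon is actually implemented: the function names cron_field_matches and cron_is_due are made up for this example, it only understands the notations used above (*, plain numbers, comma lists and */n), and it works in UTC rather than machine local time.

```php
<?php
// Hypothetical helper: does one crontab field (e.g. "*/15" or "0,15,30,45")
// match the given value? Ranges such as "1-5" are deliberately not handled.
function cron_field_matches($field, $value)
{
    foreach (explode(',', $field) as $part) {
        if ($part === '*') {
            return true;
        }
        if (strpos($part, '*/') === 0) {
            // "*/n" matches every value that is a multiple of n
            $step = (int) substr($part, 2);
            if ($step > 0 && $value % $step === 0) {
                return true;
            }
            continue;
        }
        if (ctype_digit($part) && (int) $part === $value) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: is the five-field schedule due at $timestamp (UTC)?
function cron_is_due($schedule, $timestamp)
{
    $fields = preg_split('/\s+/', trim($schedule));
    if (count($fields) !== 5) {
        return false;
    }
    // minute, hour, day of month, month, day of week (0 = Sunday)
    $now = array_map('intval', explode(' ', gmdate('i G j n w', $timestamp)));
    foreach ($fields as $i => $field) {
        if (!cron_field_matches($field, $now[$i])) {
            return false;
        }
    }
    return true;
}

// Monday, May 05 2008 at 16:16 UTC
$monday = gmmktime(16, 16, 0, 5, 5, 2008);
var_dump(cron_is_due('16 16 * * 1', $monday));  // bool(true)
var_dump(cron_is_due('*/15 * * * *', $monday)); // bool(false), minute 16 is not a multiple of 15
```

Every field has to match for the line to be due, which is enough for the examples above because none of them restricts both the day of the month and the day of the week at once.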

If you make changes to the cron file and want to update it, just register it again with the same command as before (~$ crontab pear.cron); the newly registered file replaces the old one, so there is no need to unregister first. If you want to stop your scheduled tasks altogether, ~$ crontab -r will remove your registered cron file, and your scripts won't be periodically executed anymore. You can check what is currently registered with ~$ crontab -l. Also, cron emails the user any output a task produces when it runs. If you want to suppress this, just add >/dev/null 2>&1 to the end of the cron task line, which discards both normal output and error messages. For example:

 # suppress emails for avocados, that are executed every Tuesday night
 20 20 * * 2 /usr/bin/avocados > /dev/null 2>&1
 

Of course, that is the "true" cron. To make use of it, it is necessary to have access to a terminal on the server machine (through SSH, or by other means). Many users (especially those on free hosting, or because of technical differences) don't have access to the terminal, or otherwise can't make use of cron. In that case, a "cron emulation" method is used. It's quite simple: instead of letting the operating system execute the scripts, these users execute the scripts themselves. No, that doesn't mean getting up at 6am every Thursday to run a script on a server somewhere in Siberia. It's as simple as running a "control" script every time a page is loaded; the control script determines whether any cron tasks are due (scheduled in a database or local file instead of crontab files) and then runs the ones that need running. The big disadvantage is that a script that is supposed to be executed at 3am GMT every Friday won't be executed at that time if there are no visitors opening pages. It could end up being executed on Sunday evening, if no one requests a cron-enabled page from the server until then. However, if properly set up, the effect can be quite similar.
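As a rough illustration of what such a control script does on each page load, here is a small PHP sketch. The function name pick_due_task, the layout of the task array and crontasks/summary.php are all invented for this example; a real setup would read the tasks from a database or a local file.

```php
<?php
// Hypothetical helper: return the most overdue active task, or null if
// nothing is due yet. 'nextrun' is a UNIX timestamp.
function pick_due_task(array $tasks, $now)
{
    $chosen = null;
    foreach ($tasks as $task) {
        if (!$task['active'] || $task['nextrun'] > $now) {
            continue; // disabled, or not due yet
        }
        if ($chosen === null || $task['nextrun'] < $chosen['nextrun']) {
            $chosen = $task;
        }
    }
    return $chosen;
}

// Made-up task list for the example
$tasks = array(
    array('scriptname' => 'crontasks/postcleanup.php', 'nextrun' => 1000, 'active' => true),
    array('scriptname' => 'crontasks/summary.php', 'nextrun' => 5000, 'active' => true),
);

// A visitor arriving at time 1200 triggers the cleanup, 200 seconds late.
$task = pick_due_task($tasks, 1200);
echo $task['scriptname'], "\n"; // crontasks/postcleanup.php

// Before time 1000 there is nothing to do, however many pages are loaded.
var_dump(pick_due_task($tasks, 500)); // NULL
```

The sketch hands back at most one task per request, so a single page load never has to pay for several jobs at once.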

[Image: The Shinyshell Croncodile who helps webmasters set up cron]

An accepted method for cron emulation is to have a script that outputs a 1x1 pixel clear GIF image. Then you can link that image into your website with an img tag, and before the script outputs the image, it will run a check to see whether there are any cron tasks that need to be run at this time. If there are, it will run them before outputting the image.

Why is this good? If there are no cron tasks to run, it will output a small, transparent image that will be embedded into your page, and no one will notice it. If there is a huge cron task to run, the user will not have to wait for the page contents (which would be the case if you had decided to run the cron code as a part of the web page generation code itself), but the image will not load immediately, which is not a problem (depending on the task, it could take many minutes to run a complex update on a database with thousands of records). If you include this small image in other places (e.g. another of your websites) you can ensure that your cron tasks can get run even if there is no one browsing the website in question!

How to actually do this
I will now show you how to make a simple system for cron in PHP. Now pay attention, double-oh-seven.

<?php
// CRON.PHP (the script you'll include as an image)

error_reporting(E_ALL ^ E_NOTICE);
ignore_user_abort(true); // keep running after the browser has finished with the image
@set_time_limit(0);      // cron tasks may take a long time

// various definitions to use in this script...
define('SAPI_NAME', strtolower(php_sapi_name()));
define('TIME', time());
define('DIR', (($cwd = @getcwd()) ? $cwd : '.'));

// change this path to reflect where you put the cron functions script
require_once(DIR . '/path/to/cron_functions.php');

// this is the clear GIF image that we will output before executing the cron tasks
$filedata = "R0lGODlhAQABAIAAAP///wAAACH5BAEAAAAALAAAAAABAAEAAAICRAEAOw==";
$filesize = strlen(base64_decode($filedata));

header('Content-type: image/gif');

// the browser will stop loading the image if you specify a Content-Length
// regardless of how long the script continues to execute - so the image
// won't take long to load, unless this is running under IIS + CGI (epic fail)
if (!(strpos($_SERVER['SERVER_SOFTWARE'], 'Microsoft-IIS') !== false &&
    strpos(SAPI_NAME, 'cgi') !== false)
)
{
    header('Content-Length: ' . $filesize);
    header('Connection: Close');
}

// print the image and finish the loading off
echo base64_decode($filedata);
flush();

// connect to MySQL - alternatively use your own methods here, if you're using
// some method other than a MySQL database (e.g. textual files)
// note: the mysql_* functions were removed in PHP 7; on newer versions, port
// these calls (and those in cron_functions.php) to mysqli or PDO
mysql_connect('localhost', 'username', 'password');
mysql_select_db('database'); // placeholder: the database that holds the cron table

// fetch the single most overdue active task, if there is one
$result = mysql_query("
    SELECT * FROM cron
    WHERE nextrun <= " . TIME . " AND active = 1
    ORDER BY nextrun LIMIT 1
");
$nextcron = $result ? mysql_fetch_assoc($result) : false;

if ($nextcron)
{
    // reschedule the task first, then run its script
    $nextrun = cron_update($nextcron['cronid'], $nextcron);
    if ($nextrun)
    {
        include_once(DIR . '/' . $nextcron['scriptname']);
    }
}

?>

That was the script you will include into your web page using the image tag (more about this later). Just upload this functions script too, wherever you want and change the path to reflect it in the require_once() call in cron.php above. It's too big to cram into this article, so you can view the code here.

That functions script simply provides some functions that let cron calculate when to run what script. The whole system is set up so that one page load will run only one task, to balance the execution time as much as possible. You will notice that these scripts require a database to function. You'll need to either use a MySQL database, or recode the parts to do with database interaction for them to work with other storage methods (e.g. the MySQLi extension, text files in some format...).
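To give an idea of the kind of calculation involved, here is a simplified sketch of working out the next run time. This is not the code from the functions script: next_run_time is a hypothetical name, and the sketch assumes that -1 means "any weekday", that weekdays are numbered from 0 for Sunday, that the minute is a single value, and that everything happens in UTC. It also ignores the day-of-month field.

```php
<?php
// Hypothetical helper: find the next timestamp after $now that falls on the
// given hour and minute (UTC) and, unless $weekday is -1, on that weekday.
function next_run_time($weekday, $hour, $minute, $now)
{
    // start with today's occurrence of hour:minute
    $candidate = gmmktime(
        $hour, $minute, 0,
        (int) gmdate('n', $now), (int) gmdate('j', $now), (int) gmdate('Y', $now)
    );
    // step forward a day at a time until the moment is in the future
    // and lands on the requested weekday (if one was requested)
    while ($candidate <= $now
        || ($weekday != -1 && (int) gmdate('w', $candidate) != $weekday)) {
        $candidate += 86400;
    }
    return $candidate;
}

$now = gmmktime(12, 0, 0, 5, 3, 2008); // Saturday, May 03 2008, noon UTC

// every day at 03:10
echo gmdate('D Y-m-d H:i', next_run_time(-1, 3, 10, $now)), "\n"; // Sun 2008-05-04 03:10

// every Monday at 16:16
echo gmdate('D Y-m-d H:i', next_run_time(1, 16, 16, $now)), "\n"; // Mon 2008-05-05 16:16
```

Storing the result in the nextrun column is what lets the image script find due tasks with a single comparison against the current time.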

I suggest using MySQL. If you do, you'll need to create a new table called "cron". You can use this query:

CREATE TABLE cron (
    cronid INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
    nextrun INT(10) UNSIGNED NOT NULL DEFAULT '0',
    weekday SMALLINT(5) NOT NULL DEFAULT '0',
    day SMALLINT(5) NOT NULL DEFAULT '0',
    hour SMALLINT(5) NOT NULL DEFAULT '0',
    minute VARCHAR(200) NOT NULL DEFAULT '',
    scriptname CHAR(50) NOT NULL DEFAULT '',
    active SMALLINT NOT NULL DEFAULT '1',
    PRIMARY KEY (cronid),
    KEY nextrun (nextrun)
);

Now you should have enough scripts to set cron up. Now how do you use it? Put the cron.php file into your site root directory, and the functions file wherever you want. Once you've configured the script as much as necessary, you'll want to put an image tag somewhere at the bottom of all your pages that looks like this:

<img src="cron.php?nocache=$current_timestamp" alt=" " style="width:1px;height:1px;" /> 
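If your pages are generated by PHP, the timestamp can be filled in with time(). A minimal sketch, where cron_image_tag is a hypothetical helper name and not part of the scripts above:

```php
<?php
// Hypothetical helper: build the image tag with a cache-busting timestamp.
function cron_image_tag($timestamp)
{
    return '<img src="cron.php?nocache=' . (int) $timestamp
        . '" alt=" " style="width:1px;height:1px;" />';
}

// somewhere near the bottom of every page
echo cron_image_tag(time());
```

Because the query string changes on every request, the browser treats each image as a new URL and asks the server for it again.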

The $current_timestamp should be the current UNIX timestamp, so that browsers don't cache the image. Finally, you'll want to actually schedule cron jobs. You'll find it easier to do this if you make yourself an admin control panel for adding new tasks, because the minute field requires entering serialized data if you want a task to run multiple times an hour. But you can still add new tasks manually to the database:

 +--------+---------+---------+-----+------+--------+---------------------------+--------+
 | cronid | nextrun | weekday | day | hour | minute | scriptname                | active |
 +--------+---------+---------+-----+------+--------+---------------------------+--------+
 |      1 |       0 |      -1 |  -1 |    3 |     10 | crontasks/postcleanup.php |      1 |
 +--------+---------+---------+-----+------+--------+---------------------------+--------+
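For a task that should run several times an hour, the minute value has to be serialized first. The sketch below assumes the functions script expects the format produced by PHP's own serialize(), which this article does not confirm, so check it against the functions script before relying on it.

```php
<?php
// Assumption: the minute column holds the output of PHP's serialize().
$minutes = array(0, 15, 30, 45);

$stored = serialize($minutes);
echo $stored, "\n"; // a:4:{i:0;i:0;i:1;i:15;i:2;i:30;i:3;i:45;}

// reading the value back gives the original list of minutes
$decoded = unserialize($stored);
echo implode(',', $decoded), "\n"; // 0,15,30,45
```

The serialized string is 42 characters long, so it fits comfortably in the VARCHAR(200) minute column.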
 

The nextrun field can be left at 0, as the script will calculate it so you don't have to. The task in the table above registers the crontasks/postcleanup.php script to run every day at 3:10am. There you go. Experiment with it, and have fun!
SpaceMan

Aug 28 2009 @ 19:45:49
On Windows, there's a command called "at", I used it to auto-shutdown my computer.
Then there are some Administrative Tools on Windows.

Also, the performance and crash problems are caused by one of the default cron jobs