Issue
I have multiple crons set up in Django. In each CronJob I have set ALLOW_PARALLEL_RUNS = False. To run the crons I use the Linux crontab as follows:
*/1 * * * * /home/social/centralsystem/venv/bin/python3.6 /home/social/centralsystem/manage.py runcrons
After running for some time (for example, after 2 months), I see many copies of the same crons running at once, which puts a heavy load on the server. What causes this?
One example of my cron classes is:
from django_cron import CronJobBase, Schedule

# user_queuing and SOCIAL_USERS_MODEL are defined elsewhere in the project.


class UserTaskingCronJob(CronJobBase):
    ALLOW_PARALLEL_RUNS = False
    RUN_EVERY_MINS = 5
    schedule = Schedule(run_every_mins=RUN_EVERY_MINS)
    code = 'user_tasking'

    def do(self):
        # Per-network settings passed to user_queuing.
        args = {
            'telegram': {
                'need_recrawl_threshold': 60 * 2,
                'count': 100,
            },
            'newsAgency': {
                'need_recrawl_threshold': 10,
                'count': 100,
            },
            'twitter': {
                'need_recrawl_threshold': 60 * 4,
                'count': 500
            },
        }
        for social_network in ['telegram', 'newsAgency', 'twitter']:
            user_queuing(
                SOCIAL_USERS_MODEL[social_network],
                social_network,
                args[social_network]['need_recrawl_threshold'],
                args[social_network]['count'],
            )
Solution
You have to be careful with django-cron if you have lots of different tasks that run for different lengths of time. runcrons takes all your cron classes and runs them sequentially. It also only logs a cron (successful or not) to the database when it's done. I think django-cron could be improved by saving the cron log at the start (and checking whether a task is already running), but that would still not rule out overlaps when multiple jobs are run rather than one long one.
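The "log at the start and check for a running task" idea could look roughly like the sketch below. It is only an illustration: the `running` dict, `try_start`, `finish`, `run_guarded` and the `stale_after` timeout are hypothetical names, not part of django-cron, and the in-memory dict stands in for shared storage (a database row or lock file) that separate runcrons processes would actually need.

```python
import time

# Hypothetical in-memory stand-in for a shared "started" record;
# this is not django-cron's log table or API.
running = {}  # job code -> start timestamp


def try_start(code, now=None, stale_after=3600):
    """Mark a job as started unless a recent start is still marked as running."""
    now = time.time() if now is None else now
    started = running.get(code)
    if started is not None and now - started < stale_after:
        return False  # another run is presumably still in progress
    running[code] = now
    return True


def finish(code):
    """Clear the running marker once the job is done (or has failed)."""
    running.pop(code, None)


def run_guarded(code, job, now=None):
    """Run job() only if no other run with the same code is marked as running."""
    if not try_start(code, now=now):
        return "skipped"
    try:
        job()
        return "ran"
    finally:
        finish(code)
```

The `stale_after` timeout is there so that a crashed run does not block the job forever; the value used here is arbitrary.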
You are running runcrons every minute, so you'll run into trouble in these cases:
- If, during one of the runs, a single task that needs to run takes longer than 1 minute.
- If, during one of the runs, the total duration of all tasks that need to run exceeds 1 minute.
In both cases, some tasks will not be logged to the database in time, and while they are still running, the next runcrons command will start them again.
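To make the overlap concrete, here is a small simulation of the behaviour described above. It is a deliberately simplified model, not django-cron's actual scheduling code: it assumes a tick starts the job unless a finished run was logged within the last `run_every` minutes, and that a run in progress leaves no log entry. The function name `simulate` and its parameters are made up for this sketch.

```python
def simulate(duration, run_every=5, tick=1, horizon=60):
    """Return the peak number of simultaneous copies of one job.

    Simplified model: a tick starts the job unless a *finished* run was
    logged within the last `run_every` minutes. Runs still in progress
    leave no log entry, so they cannot block a new start.
    All times are in minutes.
    """
    finish_times = []  # finish time of every copy started so far
    peak = 0
    for t in range(0, horizon, tick):
        # Only runs that have already finished are visible in the log.
        logged = [f for f in finish_times if f <= t]
        due = not any(t - f < run_every for f in logged)
        if due:
            finish_times.append(t + duration)
        # Copies whose finish time lies in the future are still running.
        active = sum(1 for f in finish_times if f > t)
        peak = max(peak, active)
    return peak


print(simulate(duration=1))   # 1: the job finishes before the next tick
print(simulate(duration=3))   # 3: three ticks pass before the first log entry
print(simulate(duration=10))  # 10
```

Under this model, a job that finishes within one tick never overlaps with itself, while a job lasting N minutes is started about N times before the first log entry appears.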
To avoid this, do the following:
- Identify tasks that take longer than 1 minute to run and run them with a different schedule that ensures they have finished before the next run.
- In the crontab, run separate runcrons commands, each with its own list of cron classes, making sure that the total run time of each list stays under 1 minute, e.g.
*/1 * * * * ./bin/python3.6 manage.py runcrons "my_app.crons.FirstCron" "my_app.crons.SecondCron"
*/1 * * * * ./bin/python3.6 manage.py runcrons "my_app.crons.ThirdCron"
*/10 * * * * ./bin/python3.6 manage.py runcrons "my_app.crons.LongCron"
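One way to arrive at such lists is to measure how long each cron class takes and then pack the classes into groups that stay under the one-minute budget. The helper below is a minimal sketch of that idea using a greedy first-fit packing; `plan_groups` is a hypothetical name, and the durations in the usage example are invented for illustration, not measurements.

```python
def plan_groups(durations, budget=60.0):
    """Split jobs into groups whose summed runtime stays under budget.

    durations: dict mapping cron class path -> measured runtime in seconds.
    Returns (groups, long_jobs); each entry in long_jobs exceeds the
    budget on its own and needs a slower schedule.
    """
    groups, long_jobs = [], []
    # Place the longest jobs first (greedy first-fit decreasing).
    for name, secs in sorted(durations.items(), key=lambda kv: -kv[1]):
        if secs >= budget:
            long_jobs.append(name)
            continue
        for group in groups:
            if group["total"] + secs < budget:
                group["jobs"].append(name)
                group["total"] += secs
                break
        else:
            # No existing group has room, so open a new one.
            groups.append({"jobs": [name], "total": secs})
    return [g["jobs"] for g in groups], long_jobs


# Invented example durations in seconds.
measured = {
    "my_app.crons.FirstCron": 20,
    "my_app.crons.SecondCron": 25,
    "my_app.crons.ThirdCron": 50,
    "my_app.crons.LongCron": 400,
}
groups, long_jobs = plan_groups(measured)
print(groups)     # each inner list becomes one every-minute runcrons line
print(long_jobs)  # these need a longer interval, e.g. */10
```

In practice it is worth leaving some headroom below the budget, since run times tend to vary with server load.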
Answered By - dirkgroten