For something like this, I don't think setTimeout is the most practical solution, especially if you have to crawl multiple websites. If I needed to run something like this, I would use cron to trigger a Node script three times a day (every 8 hours). That script would fetch the list of sites to crawl from the database and hand the actual crawling off to a queue service like beanstalkd.
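To make that concrete, here is a rough sketch of what the cron-triggered script could look like. The script path, `loadSites`, `InMemoryQueue`, and the job payload shape are all made up for illustration. In a real setup `loadSites` would query your database, and the queue object would be a beanstalkd client (whose API will differ from this stand-in).

```javascript
// enqueue-crawls.js: meant to be started by cron rather than by setTimeout.
// Example crontab line (the path is hypothetical); it fires at 00:00, 08:00 and 16:00:
//   0 */8 * * * /usr/bin/node /path/to/enqueue-crawls.js

// Hypothetical stand-in for the database query that returns the sites to crawl.
async function loadSites() {
  return ['https://example.com', 'https://example.org'];
}

// Hypothetical in-memory stand-in for a queue client such as a beanstalkd client.
// It only records jobs; a real queue would hand them to separate crawl workers.
class InMemoryQueue {
  constructor() {
    this.jobs = [];
  }

  async put(payload) {
    this.jobs.push(payload);
    return this.jobs.length; // number of jobs queued so far
  }
}

// Reads the site list and puts one crawl job per site on the queue.
// The script does no crawling itself, so it finishes quickly on every cron run.
async function enqueueCrawls(queue, getSites = loadSites) {
  const sites = await getSites();
  for (const url of sites) {
    await queue.put(JSON.stringify({ type: 'crawl', url }));
  }
  return sites.length;
}

enqueueCrawls(new InMemoryQueue())
  .then((count) => console.log(`queued ${count} crawl jobs`))
  .catch((err) => {
    console.error(err);
    process.exitCode = 1;
  });
```

The point of splitting it this way is that the scheduled script only enqueues work. The workers that consume the queue can then be scaled, retried, or restarted independently of the schedule.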