The battle of the reboot

Patching has become routine, but patches don’t take without a reboot. That’s a problem when business units insist on zero downtime.

I’ve written a lot about patching ever since my first column for Computerworld back in 2008. Back then, I had to struggle with the IT department to get them to do any patching at all. The backlog was immense: neither the Windows operating systems running on our servers nor the software applications on them had ever been patched. My vulnerability scanner found literally hundreds of thousands of patchable vulnerabilities.

It took a couple of years of hard work to clear that backlog and get everything current. And then we had to start on workstations. After all that, we finally reached a stable baseline with few vulnerabilities, but the next challenge was to start patching on a regular basis and keep all our computers updated from month to month.

Fast forward to today, and everything is now on an even keel, as I mentioned in a recent column. Finally, patching has become a routine system administration practice no more difficult than log file management, end-user account administration and other relatively painless procedures. We even have special patching software that automates many of the system administration tasks required to deploy patches and software updates to all our computers.

There’s just one problem remaining: Windows computers must be rebooted to complete the patch installations. A reboot takes the computer out of service for a few minutes, which means downtime. And when other systems depend on that computer, or it depends on them, rebooting can set off a chain reaction that cripples critical software services. So in fact, the simple act of rebooting a computer to complete the patch installations is the hardest part of the job.
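To make the chain reaction concrete, here is a minimal sketch of how one might list every system that would feel a single reboot. It assumes you already keep a map of which systems depend on which; the host names and the `DEPENDENTS` map below are hypothetical and are not taken from my environment or from any patching product.

```python
from collections import deque


def affected_by_reboot(host, dependents):
    """Return every system that directly or indirectly depends on `host`."""
    seen = set()
    queue = deque([host])
    while queue:
        current = queue.popleft()
        # Walk outward: anything depending on `current` is affected too.
        for system in dependents.get(current, ()):
            if system not in seen and system != host:
                seen.add(system)
                queue.append(system)
    return seen


# Hypothetical map: each key lists the systems that depend on it.
DEPENDENTS = {
    "db01": ["app01", "app02"],
    "app01": ["web01"],
    "app02": ["web01", "batch01"],
}

print(sorted(affected_by_reboot("db01", DEPENDENTS)))
# → ['app01', 'app02', 'batch01', 'web01']
```

A list like this is only as good as the dependency map behind it, but it gives the business units something specific to react to when a reboot is proposed.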

This problem came to my attention in a recent vulnerability management meeting. I have these meetings every week with my company’s IT department, to go over the latest vulnerability scan results with them, plan next steps and make sure nothing gets missed. As we reviewed the scan results, it became apparent that several servers had fallen a couple of months behind on patches. This came as a surprise, because as I said, patching has become routine.

When I asked why those systems weren’t getting patched, one of the system administrators said the patches had in fact been installed, but the systems hadn’t been rebooted. From the system administrator’s point of view, his work was done. He had applied the patches and figured the last step, the reboot, wasn’t such a big deal. From my point of view, though, the vulnerabilities still existed, because without the reboot the patch installations hadn’t been completed.

But it wasn’t as simple as me asking him to reboot those computers. He hadn’t been given permission to do so. “The business won’t let me bring down the application right now,” he said. “They have a big deadline coming up, and they don’t want any downtime.”

“Are you kidding me?” was my response. “Surely they can tolerate a five-minute outage in the middle of the night when nobody is working.” But I found out later, when I called the business unit manager, that they were running overnight processes that would be corrupted by stopping the services.

So I tried to find a time when everybody could agree to do the reboots. Unfortunately, we haven’t come to that agreement yet. After the business unit’s deadline has passed, we should be able to resolve this. But my main concern is not with this particular situation; it’s with the general challenge of business units requiring 100% uptime on computers that need to be rebooted at least once a month. This is going to take some negotiation and planning.

For the moment, though, I’m going to have to live with some accumulated vulnerabilities. I could take a hard line and insist on rebooting the servers, but since that would compromise the business unit’s work, I’ve decided to be flexible. We need to solve the overall problem of regular system rebooting (and other system administration tasks) with a mutually agreed “maintenance window” in which IT can take over all the computers for a while every month.
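Once a window is agreed on, enforcing it can be simple. The following is a rough sketch, assuming a weekly window for brevity (a monthly one would need an extra rule for picking the week); the function name `in_window` and the example window are illustrative assumptions, not a feature of any particular patching tool.

```python
from datetime import datetime, time, timedelta


def in_window(now, weekday, start, duration):
    """True if `now` falls in a weekly window opening on `weekday` at `start`.

    `weekday` follows datetime.weekday(): Monday is 0, Sunday is 6.
    """
    # Find the most recent window opening at or before `now`.
    days_back = (now.weekday() - weekday) % 7
    opening = datetime.combine(now.date() - timedelta(days=days_back), start)
    if opening > now:
        opening -= timedelta(days=7)
    return opening <= now < opening + duration


# Hypothetical agreement: Saturdays from 23:00, lasting three hours.
WINDOW = dict(weekday=5, start=time(23, 0), duration=timedelta(hours=3))

print(in_window(datetime(2024, 1, 7, 1, 0), **WINDOW))   # Sunday 01:00
print(in_window(datetime(2024, 1, 7, 3, 0), **WINDOW))   # Sunday 03:00
```

A reboot job could call a check like this first and refuse to run outside the agreed hours, so nobody has to rely on memory or goodwill.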

But secretly, I’m hoping for a power outage.

This week's journal is written by a real security manager, "J.F. Rice," whose name and employer have been disguised for obvious reasons. Contact him at jf.rice@engineer.com.
