How do you do....it? Thread: Managing Downtime In A School Environment, in Technical
  1. #1
    Duke

    Managing Downtime In A School Environment

    Hi all,

    This is a bit of a random one, but I thought I'd throw it out there and see how EduGeekers handle it:

    What are your school's expectations for uptime, what maintenance windows are you given, and how do you manage planned and unplanned downtime, particularly on a school budget?

    For example, some of the issues we are facing:

    • In the past, pretty much all students would finish at 15:00, and after that we could do server maintenance that wouldn't affect staff. Most staff would be done by 16:30, or at least would understand downtime after then, so we could do maintenance at the end of the day. Now lessons go on until 16:00 or 16:30 and controlled assessment runs until 17:00 or 17:30, so we can't take anything down on the majority of school days.
    • We used to be able to get away with quick reboots of devices at breaktime or lunchtime, but clubs and activities now run through these.
    • School holidays used to be free for us to do as much maintenance as we needed, as long as we warned people, but now even a single day causes problems. We're currently in the Easter break and I spent yesterday from 7:30am to 6:30pm re-racking a server cabinet. Despite the fact we warned people everything would be down (the core switch was turned off), we still had a few complaints.
    • I need a few more hours to finish re-cabling the above-mentioned cabinet properly, so I'm going to have to work either a night or a weekend, and getting time off in lieu for that will be difficult.


    The school basically expects close to 100% uptime, yet I currently have no budget, only half the funding I needed for virtualisation and no money for redundant storage or a proper backup solution.

    Is this unusual, or are most schools facing this problem? Our school relies heavily on email and SIMS, so any downtime in the day is seen as a major issue. A few months ago the network was down for most of the day (a bad switching issue took out most of the network and was a nightmare to trace) and I'm still being reminded by management how big a deal that was and how it must never happen again.

    Any thoughts?

    Chris

  2. #2
    bodminman
    We were directed by SLT.

    Basically 0% downtime between 8am and 4.30pm unless something has failed and needs fixing.

    Any updates etc. have to be done outside these hours. During the holidays I can pull the strings so long as I give 24 hours' notice.

  3. Thanks to bodminman from:

    Duke (12th April 2011)

  4. #3

    Soulfish
    I do know of schools that claim 100% uptime, but they also have very large IT budgets and the backing to ensure that they have spares on hand for most things. I think the only realistic thing to do is to be honest: set out the true cost of a 100% (or 99.999%) uptime environment, and then the cost of environments that get close to that with a few fewer 0s at the end of the price. Make sure you show them why it's impossible to expect the network to be perfect with a less than perfect budget.
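
To put numbers on those 9s, here is a rough sketch of how much downtime each availability target actually allows in a year. The function name and the list of targets are picked for illustration only, not taken from the posts above, and it assumes a 365-day year of round-the-clock service.

```python
# Allowed downtime for a given availability target.
# Assumption: service is expected 24 hours a day, 365 days a year.
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def allowed_downtime_hours(availability_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the hours of downtime permitted at the given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return period_hours * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Print the yearly downtime budget for some commonly quoted targets.
    for target in (99.0, 99.9, 99.99, 99.999):
        hours = allowed_downtime_hours(target)
        print(f"{target}% -> {hours:.2f} hours/year ({hours * 60:.1f} minutes)")
```

If the school only cares about the working day, pass a smaller `period_hours`. For example, assuming 190 school days of 8.5 hours (1,615 hours, an illustrative figure), a 99.9% target leaves only about 1.6 hours of downtime for the whole year.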

    At my last school, IT had been heavily underfunded for many, many years, to the point where everything was being held together with gaffer tape. In the end I turned round to the head and the governors and showed them that they were expecting me to keep 1 - 2 million worth of IT equipment running without any money, when the majority of that equipment only had a 3-5 year life span and we were already past that with 80% of the kit. When asked what alternatives they had to investing money, I was honest and told them the alternative was to remove IT suites and interactive whiteboards until we hit a level of IT penetration that suited the budget they had in mind. That shocked them a little, but the next day I had the green light for massive IT investment over five years.

    I think many schools are still treating IT as something that can be strung along on a shoestring and, as you point out, there is an ever greater reliance on IT systems, to the point where it's just not feasible to underfund to the levels that are sometimes talked about.

    I hope my above ramblings helped a little.

  5. Thanks to Soulfish from:

    Duke (12th April 2011)

  6. #4

    localzuk
    The way I manage it here is this - I give notice of downtime, and do it. It's not possible to run the system otherwise.

    I try not to do any downtime during term, as this would unfairly damage T&L, but out of term time I have to do it. I don't get paid to work evenings or nights, so I'm certainly not going to do it then.

    The options as far as I'm concerned are as follows:

    1. Do downtime as and when needed, warning people in advance, and work with existing budgets.
    2. Demand a larger budget to allow you to cluster everything, and remove the effects of downtime.
    3. Don't do any maintenance and watch the systems crash and burn, causing massive disruption and unwanted downtime.

    The way to present it to SMT is this - you wouldn't let a small piece of rust on a car stay without treatment. You'd fix it as soon as you could, taking the hour or so required to fix it. If you didn't, that bit of rust soon turns into a massive problem and you end up with your car off the road for days.

    Remember, whatever you do, you'll always get people complaining. Just remember Scotty's rule - estimate the time to do the task as being twice what you really think, then finish in half that time, and you'll forever be seen as a miracle worker.
    Last edited by localzuk; 12th April 2011 at 09:47 AM.

  7. Thanks to localzuk from:

    webman (12th April 2011)

  8. #5

    Soulfish
    Quote Originally Posted by localzuk
    The way I manage it here is this - I give notice of downtime, and do it. It's not possible to run the system otherwise.

    I try not to do any downtime during term, as this would unfairly damage T&L, but out of term time I have to do it. I don't get paid to work evenings or nights, so I'm certainly not going to do it then.
    And to answer the original question properly - this is how I normally manage downtime. There are occasional bits that I may do out of hours during term, but I have quite a good relationship with my line manager so could reclaim TOIL during the holidays for this sort of thing.
    Quote Originally Posted by localzuk
    The way to present it to SMT is this - you wouldn't let a small piece of rust on a car stay without treatment. You'd fix it as soon as you could, taking the hour or so required to fix it. If you didn't, that bit of rust soon turns into a massive problem and you end up with your car off the road for days.

    Remember, whatever you do, you'll always get people complaining. Just remember Scotty's rule - estimate the time to do the task as being twice what you really think, then finish in half that time, and you'll forever be seen as a miracle worker.
    I'll have to remember that analogy for future reference. And I follow a similar estimation rule - much better to deliver early and be thanked than deliver on time or late and get complaints!

  9. #6

    jcollings
    We try not to do things (unless vital) between 8am and 4.30pm. If we plan to do something later on, I usually get someone to come in late and stay late, but that's only for vital stuff. Other than that, we have an agreement that on the 1st Monday of the holidays the system is "at risk" - I confirm two weeks before the impending holiday whether that means small areas down or a whole network outage while we re-arrange the server rack etc.

    People seem happy enough to work with that. I think as long as people are given plenty of notice it's fair enough to take things down occasionally.

    I do also end up remoting in at night to do some work too but the school are very flexible when it comes to time off so I can't complain.

  10. Thanks to jcollings from:

    Duke (12th April 2011)

  11. #7


    penfold
    I've worked in places which expected similar things of me and, to be honest, the ones who complained were generally the ones who hadn't listened when maintenance was announced and would then come in and complain that they couldn't use the system. You will always get people who complain and you have to ignore most of them, though in my case downtime was mostly kept to the holidays. If people are warned that there is going to be downtime, they should listen.

    However, as others have said, if you are being pressured into feeling that you need to provide 100% uptime of services, you need to break down the true cost of this. Without sufficient budget these things often aren't realistic, and highlighting what you can realistically get with your budget should help the SMT realise they are asking too much.

    In addition, I wouldn't work over for nothing either: the more you do that, the more they expect you to do it. But is it your SMT making these demands, or all staff? If it is your SMT and you have neither the budget nor the manpower, then they need to revise what they expect you to do. If they still want you to provide support 24/7, then they need to give you the resources to do this. If it is all staff who are complaining, tell them to speak to your SMT.

  12. Thanks to penfold from:

    Duke (12th April 2011)

  13. #8

    TechMonkey
    We don't do anything that could affect the system between 8:00 and 16:00. Thursdays are marked down as maintenance or patch day, with notice: we let everyone know via a traffic light system whether it will be SIMS or the whole network. In summer we usually publish a week where the system will be unreliable so staff don't come in with the expectation of working. This allows us to drop the servers or switches when we want, do a full reboot of stations, etc., without having to check everyone is OK.
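
A policy like the one above is simple enough to write down as code, which can help when agreeing it with staff. This is only a sketch: the 08:00-16:00 window and the Thursday patch day come from the post, but the function names and the choice to start the slot at 16:00 are assumptions, and it knows nothing about term dates or holidays.

```python
from datetime import datetime, time, timedelta

# Assumed policy, loosely based on the post above: no planned work on
# weekdays between 08:00 and 16:00, and Thursday is the regular patch day.
PROTECTED_START = time(8, 0)
PROTECTED_END = time(16, 0)
PATCH_WEEKDAY = 3  # Monday is 0, so 3 is Thursday


def is_protected(moment: datetime) -> bool:
    """True if planned downtime at this moment would hit protected hours."""
    is_weekday = moment.weekday() < 5
    return is_weekday and PROTECTED_START <= moment.time() < PROTECTED_END


def next_patch_window(after: datetime) -> datetime:
    """Return the next Thursday 16:00 slot that is not earlier than `after`."""
    days_ahead = (PATCH_WEEKDAY - after.weekday()) % 7
    candidate = datetime.combine(after.date() + timedelta(days=days_ahead),
                                 PROTECTED_END)
    if candidate < after:
        # This week's slot has already passed, so move to next week.
        candidate += timedelta(days=7)
    return candidate
```

Calling `is_protected` before scheduling a reboot gives a quick yes or no, and `next_patch_window` gives the date to put in the notice. A real version would also need the school calendar so that holiday "at risk" days are treated differently.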

    I know the feeling though: more and more we are being expected to keep the system up and carry out anything in our own time, but with no overtime, and TOIL is a joke. Same as you, holidays used to be our time with a few exceptions, but now that is being eroded. Add to that, we don't have that much time, as we have to take our leave during school holidays: summer is restricted because that is when we do most of our work, and we have to take Christmas off. It really is going to need a shift in expectations or school policy at some point.

  14. Thanks to TechMonkey from:

    Duke (12th April 2011)

  15. #9

    dhicks
    Quote Originally Posted by Duke
    The school basically expects close to 100% uptime, yet I currently have no budget, only half the funding I needed for virtualisation and no money for redundant storage or a proper backup solution.
    I had much the same issue, wanting as near to 100% up-time as possible on a limited budget. This is where I found open source virtualisation and storage come into their own - it turns out you don't actually need to spend any money at all on anything except hardware; all the software to do what you want is available for free.

  16. Thanks to dhicks from:

    Duke (12th April 2011)

  17. #10
    Duke
    Wow, thanks for all the quick responses everyone!

    Quote Originally Posted by bodminman
    Basically 0% downtime between 8am and 4.30pm unless something has failed and needs fixing.
    I can generally live with that and think no planned downtime in school hours (8:00-15:00) is perfectly reasonable. However, if I finish at 16:30 and can't do any maintenance between 15:00 and 16:30 then I'd have to either do a weekend (which I'm willing to do, but it inconveniences the Site Office having to take off alarms for me and I'd rather not make it a regular occurrence because it'll become expected of me) or work late (which I don't mind doing either, but again I can only work until 18:00 because of alarms and I don't want to make it a regular thing). If I don't do weekends or working late then it's 6 weeks between each holiday until I can do real maintenance.

    Quote Originally Posted by Soulfish
    I hope my above ramblings helped a little.
    Definitely, thank you. Management are generally pretty good at realising we need proper funding, but the last three years have been pretty tight and everyone is facing budget cuts this year and next year. I've just been promoted to head of department, so I'll be working out an IT strategy that'll lay out what budget we need - unfortunately, because we haven't had much budget the last few years, a lot of desktops really do need replacing while I've also got major infrastructure upgrades to do.

    Quote Originally Posted by localzuk
    The way I manage it here is this - I give notice of downtime, and do it. It's not possible to run the system otherwise.

    1. Do downtime as and when needed, warning people in advance, and work with existing budgets.
    2. Demand a larger budget to allow you to cluster everything, and remove the effects of downtime.
    3. Don't do any maintenance and watch the systems crash and burn, causing massive disruption and unwanted downtime.
    I think management just have this idealistic view that it never breaks and never needs maintenance - I wish! (Although I suppose that would put me out of a job.)
    1. = my preferred option, management have just made it very difficult to do.
    2. = not happening this year or next year due to budgets.
    3. = my conscience wouldn't let me.

    Quote Originally Posted by jcollings
    Other than that we have an agreement that the 1st Monday of the holidays the system is "at risk". [...] People seem happy enough to work with that. I think as long as people are given plenty of notice it's fair enough to take things down occasionally.
    I think I need to set up a day a month when maintenance is expected. Generally speaking people are fine with the downtime as long as it's properly planned, but yesterday's maintenance was calendared for a while ago, then cancelled because someone needed the network up, then it was moved again, cancelled for the same reason, and now I'm trying to fit it in when it should have been done a while ago.

    Quote Originally Posted by penfold
    However, as others have said, if you are being pressured into feeling that you need to provide 100% uptime of services, you need to break down the true cost of this. Without sufficient budget these things often aren't realistic, and highlighting what you can realistically get with your budget should help the SMT realise they are asking too much.
    Completely agree, the realistic costs to match expectations will go on my new plans...

    Quote Originally Posted by TechMonkey
    Thursdays are marked down as maintenance or patch day with notice. [...] I know the feeling though, more and more we are being expected to keep the system up & carry out anything in our own time, but with no overtime & TOIL is a joke.
    As I mentioned, I think I need to organise a maintenance day - e.g. server reboots will be happening between 15:30 - 16:30 on a certain day. Don't think management will like this though. Technically my job doesn't get me any overtime pay or TOIL (except for weekends) now I've hit a certain grade, however management are flexible on this. My bigger concern is that it would become expected of me and it would be assumed I'd be happy to work whatever hours are needed.

    Quote Originally Posted by dhicks
    I had much the same issue, wanting as near to 100% up-time as possible on a limited budget. This is where I found open source virtualisation and storage come into their own - it turns out you don't actually need to spend any money at all on anything except hardware; all the software to do what you want is available for free.
    I agree up to a point. I find if you want true redundancy and failover then VMware offers some very nice options, but at the end of the day the biggest costs for this type of thing have been hardware which is hard to avoid if you want reliable kit with a full warranty.

    Many thanks everyone,

    Chris

  18. #11
    JBoyd
    It's the old story. Schools expect to have the latest kit but cannot afford it and it doesn't help that the government keep pushing for new IT courses which need up to date resources; the students and staff then get upset when it doesn't work or is so slow it may as well not!! We only reboot in term time as a last resort and then never during lesson time but we have the expectation that holidays can be used for 'proactive' maintenance and that there will be disruption for those who cannot bear to be away or have no life outside school!!

    You could always be a bit sneaky and have a 'practice' network crash and blame it on not having enough time for maintenance ;-)

    Jim

    Computers are stupid machines which can do clever things. Programmers are clever people who can do stupid things - this combination can cause mayhem!!

  19. Thanks to JBoyd from:

    Duke (12th April 2011)

  20. #12

    glennda
    Quote Originally Posted by Duke
    I agree up to a point. I find if you want true redundancy and failover then VMware offers some very nice options, but at the end of the day the biggest costs for this type of thing have been hardware which is hard to avoid if you want reliable kit with a full warranty.
    If you pay out for the kit, then look at the free software: Linux KVM is an open source virtualisation platform which can do things such as live migration. The only thing lacking at the moment is automatic failover, but I'm sure it will appear soon.

    We currently run approx. 70% of the network on the platform and it's a lot quicker than ESXi IMHO (and according to a few articles on the net).

  21. Thanks to glennda from:

    Duke (12th April 2011)

  22. #13

    dhicks
    Quote Originally Posted by Duke
    reliable kit with a full warranty.
    In my experience, a warranty is just there to stop you fixing stuff yourself. Modern computers aren't dangerous mechanical devices, they're just a bunch of standard components shoved in a case - build your own servers and you don't have to worry about invalidating a guarantee, you can just swap the broken component out and carry on. Saying that, modern off-the-shelf components are also, generally, very reliable.

  23. #14

    FN-GM
    Quote Originally Posted by dhicks
    In my experience, a warranty is just there to stop you fixing stuff yourself. Modern computers aren't dangerous mechanical devices, they're just a bunch of standard components shoved in a case - build your own servers and you don't have to worry about invalidating a guarantee, you can just swap the broken component out and carry on. Saying that, modern off-the-shelf components are also, generally, very reliable.
    I disagree. For example, on our Dell servers we have a 4-hour warranty. If the motherboard dies in one, they will be here within 4 hours to fix it, any time of the day.

    If I were to fix it myself, I would first have to:
    • Make 100% sure that it is the board
    • Find a replacement - may take a few days
    • Replace the board myself - may take longer and may break something else.


    Particular parts like motherboards can be hard to find when the computer/server is getting older. With a warranty you know they will be available fairly quickly.

    Meanwhile the days and hours are ticking and this will have a knock-on effect on lots of children's education.

    You also have to take into account the time it takes you to do these jobs. You have to remember you're not free.

    The extra time that is spent resolving the issue could also be better used in improving the children’s education.
    Last edited by FN-GM; 12th April 2011 at 12:37 PM.

  24. Thanks to FN-GM from:

    Duke (12th April 2011)

  25. #15

    bossman
    @Duke:

    Downtime is critical to any business, and unless you have everything covered with total fail-over and redundancy, you are in fact going to incur downtime.

    How you manage that downtime is something that you and the SLT are going to have to sit down and work out.

    Then write up an SLA that you and the SLT both agree to, and there you have it.
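
One way to make such an SLA checkable is to log outages and compare unplanned downtime against the agreed target. A minimal sketch follows. The `Outage` fields, the `sla_report` name, and the idea that planned (pre-announced) downtime does not count against the target are all assumptions for illustration, not terms anyone in the thread proposed. It also assumes every logged outage falls inside the agreed service hours.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Outage:
    start: datetime
    end: datetime
    planned: bool  # True if notice was given in advance

    @property
    def hours(self) -> float:
        """Length of the outage in hours."""
        return (self.end - self.start).total_seconds() / 3600


def sla_report(outages, service_hours: float, target_percent: float) -> dict:
    """Compare unplanned downtime against an availability target.

    service_hours is the total number of hours the service was meant to
    be available in the reporting period (hypothetical input).
    """
    if service_hours <= 0:
        raise ValueError("service_hours must be positive")
    unplanned = sum(o.hours for o in outages if not o.planned)
    achieved = 100 * (1 - unplanned / service_hours)
    return {
        "unplanned_hours": unplanned,
        "achieved_percent": achieved,
        "met": achieved >= target_percent,
    }
```

Keeping planned and unplanned downtime separate in the log is a design choice: it lets the report show that announced maintenance is part of the agreement rather than a failure against it.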

    You as the NM are required to be as flexible as feasibly possible, and the school also has to be flexible in its thinking. Give and take should result in a very manageable process which gives both you and the SLT total faith in the IT infrastructure and the management of it.

    Work-life balance is something we should all be aware of, and getting it wrong can lead to some serious health issues for people, but if it is managed properly by SLT and yourself then it should be a progressive partnership.

    The whole school then benefits from this and you will be recognised for the efforts of you and your team.
