They Lost Their Cars to the Floods — But in Doing So, Saved Crucial Computer Servers
As flood waters from Tropical Storm Irene swamped the Waterbury state office complex, seven employees from the Vermont Agency of Human Services rushed inside to rescue computer servers that are critical for processing welfare checks and keeping track of paroled prisoners living around the state.
Two AHS employees, network administrator Andrew Matt and deputy chief information officer Darin Prail, parked their cars behind the AHS building at around 6 p.m. that Sunday, August 28, and rushed in to save the equipment. When they came back outside, giant trees were floating by and the entire parking lot was under water. So were their cars.
"We didn't know how much time we had," Matt said, "and our job was to save the servers."
The employees' quick thinking is being credited with saving the state's largest agency from disaster. AHS not only oversees the Department of Corrections but also runs programs that serve thousands of Vermont children, families, senior citizens and individuals with disabilities. Within days, AHS was up and running again, its servers installed at an alternate site. The AHS servers run nearly 300 distinct applications, from desktop software like Excel and Microsoft Word to more specialized software that tracks prison inmates and handles applications for food stamps and health care.
After last spring's record flooding, AHS developed a server rescue plan to save critical data systems in the event of future flooding. This was the first test of that plan, said Angela Rouelle, the agency's chief information officer.
"We realized that we didn't have a good enough plan in place to handle a real emergency, so as we watched the river we talked about what we would need and what we would do," said Rouelle.
As Irene was lashing Vermont, AHS staff drove by the complex repeatedly throughout the day to check on the rising water levels. Staff also kept in touch directly with the National Weather Service for up-to-the-minute information on how high the river might crest.
By 6 p.m., the waters were rising fast and the team launched a rapid response effort to shut down the agency servers, unplug them and move them to dry land.
In the days leading up to the storm, officials with the Vermont Department of Buildings and General Services and Vermont Emergency Management believed that the complex wouldn't face the kind of dramatic flooding that occurred. Rather, they anticipated the site would see flooding similar to what happened in May, when some of the fields behind the site were flooded. What occurred was a flash flood of epic proportions.
"I don't think anyone that I know of contemplated the severity of the event, especially at that campus," said BGS Commissioner Michael Obuchowski.
As a precaution, however, the state cleared out dozens of state-owned vehicles in the days before the flood, and several BGS custodians and security staff were at the building until nearly 11 p.m. the night before, powering down the entire complex and making sure everyone was evacuated, said Obuchowski.
For Rouelle, taking more proactive action saved the state, and her team, months of work and the cost of replacing tens of thousands of dollars in equipment and applications.
Rouelle had rented a truck and had it waiting at the loading dock so staff could load computers onto the back of the truck and spirit them away to dry land. But the water rose much faster than anticipated and Rouelle realized that by the time the truck was loaded, they would be unable to drive it out of the complex.
So, the AHS employees began to drag the servers from the basement to the second floor. Days after the flood waters subsided, Vermont National Guard trucks hauled the servers to Montpelier where they were reconnected to the state system.
"Had we listened to what others were saying — that this wasn't going to be a big deal and we wouldn't flood — we would have been sunk," said Rouelle.
Matt, who lost his car, recounted: "My first instinct was to park as close to the data center as I could because I had tools in my car that we might need as we pulled the servers, so I parked right near the data center in the alleyway. We started working and after Darin showed up and parked his car in front of mine, someone else said, 'You guys might want to move your cars,' but unfortunately at that point I didn't really have time to move my car."
Do they not have an off-site backup?!?!?! Any good IT person knows that's necessary for disaster recovery. You shouldn't need to rescue the hardware on-site to prevent loss of data. Did anyone ask them this? I'm hoping they were just trying to save the servers so the state wouldn't have to buy new equipment, and that the data was safe nonetheless.
Posted by: Drew | September 09, 2011 at 03:22 PM
If they didn't have an off-site backup of data, there's a much larger story here. Dig deeper Seven Days!
Posted by: Drew | September 09, 2011 at 03:24 PM
I am with Drew. Don't they have off-site backups? Time to look into some cloud solutions...
Posted by: Chris Lei | September 09, 2011 at 03:46 PM
I find it highly unlikely that they didn't have an offsite backup. Even with an offsite backup of all data, it would have still taken weeks to acquire new hardware, configure the new hardware, restore the backups, and test that everything is working correctly. Preserving all the hardware drastically shortens the amount of downtime—which was likely their goal. You could argue that they should have had a backup site (either cold, warm, or hot), but I doubt they had the budget to purchase two of everything, plus the cost of the second site. No one wants to pay for disaster preparedness but everyone is quick to complain in hindsight when measures weren't taken to be properly prepared. Personally, I applaud these folks for their quick thinking and selflessness. Based on their response, if there was any lack of preparedness it probably wasn't the fault of the people mentioned in this post.
Posted by: Bradley Holt | September 09, 2011 at 03:55 PM
I agree with you Bradley - saving the equipment makes it so much quicker (and cheaper) to get back up and running, and I certainly applaud the people who rushed in to save it. However the article says these "employees' quick thinking is being credited with saving the state's largest agency from disaster." This suggests that they don't have a good off-site backup because if they did, "disaster" would be a HUGE exaggeration. Shay should investigate further.
Posted by: Drew | September 09, 2011 at 04:08 PM
Thank you Matt and Dan! So sorry about your cars!
Posted by: Kris Benevento | September 09, 2011 at 06:44 PM
There was a team of about 10 talented folks including AHS and ANR staff that helped move equipment up the stairs.
Much of the equipment is very expensive including a fully populated HP SAN and dozens of server blades so it made sense to save it and avoid repurchase at taxpayer expense.
AHS does maintain offsite tape backups. As Bradley mentions it would take additional time and money to procure new equipment and restore everything to it.
Dedicated staff at DII worked around the clock for days to assist AHS with putting all the pieces back together again in the Montpelier data center.
Posted by: abc | September 09, 2011 at 09:18 PM
It is a shame to see the rest of the team left out of the article. The team from ANR even included the family of one of their IT folks.
Posted by: 123 | September 10, 2011 at 05:37 AM
Thanks for the explanation "ABC"...that equipment is surely expensive, so it's clear that this team did us all a huge favor. Thank you!
Posted by: Drew | September 12, 2011 at 05:47 PM