As some of you may know, I spent four years in the U.S. Marine Corps back during the Reagan administration. I started out as an 0311 Basic Rifleman (aka Grunt) for a couple of years, and then became an 1811 Armor Crewman, ending up as a Tank Commander of an M60A1 Rise/Passive main battle tank. I got out of the Marines as an E-5 Sergeant, and went to college at U.C. Irvine, and the rest is rather ancient history. You might be wondering what my military experience has to do with SQL Server and the importance of checking your main BIOS?
Well, here is the relationship (at least in my mind). It comes from attention to detail, which is a very important characteristic for a good DBA. One thing that the Marines liked to do periodically was have a formal personnel inspection of everyone in a small unit, such as a rifle platoon. Depending on who the inspecting officer was, this might cause anywhere from hours to weeks of preparation by the members of the unit, doing things like thoroughly cleaning your weapon, cleaning and properly marking all of the items of your uniform and web gear, clipping off loose threads on your uniform (known as “Irish pennants” or “Russian ropes”), and all manner of other little things to get completely “squared away” in the Marine Corps vernacular. After all of this extensive preparation, the formal inspection would finally occur, often taking several hours for a platoon of 42 people.
Everyone in the platoon would be standing at attention in formation, and the inspecting officer would go one by one, down the ranks of the formation doing his formal inspection. He would turn to face you, and you would have to do an Inspection Arms movement with your rifle for him. After you were done, he would slap the weapon out of your hands, and start looking at it in excruciating detail, while asking you questions like “What is the maximum effective range of this weapon?”, or “Who is your regimental commander?”. Woe to the Marine who did not know the answer to these questions! One thing that I saw a number of times during these types of inspections was when the inspecting officer found some obvious problem with the Marine that he was inspecting, something that was literally in plain view, but had been missed by everyone during all of the preparations and pre-inspections. For example, maybe one of the buttons on a pocket flap of your utilities was not buttoned, or maybe there was a long, loose thread on a seam of one of your chest pockets. If the inspecting officer found something obvious like this, he would sometimes literally start undressing the Marine, looking for other violations that were sure to be found under the surface! The inspecting officer’s zeal and attention to detail were always rewarded in a case like this, since if something so obvious was missed, there would be many other things that were wrong, hidden under the surface…
This is where I get back to SQL Server and the importance of doing the obvious things like checking the main BIOS version on your database servers. As I discussed recently, it is pretty easy and pretty important to periodically check the version of your main system BIOS for a machine, using tools like msinfo32.exe, CPU-Z, or management tools like Dell OpenManage Server Administrator (OMSA). The large system vendors like Dell, HP, IBM, etc. will release new versions of the main system BIOS to fix problems that are discovered with that model of server. As a DBA, I think it is very important to keep tabs on this, even if someone else (such as a systems administrator) is actually responsible for maintaining your database servers. One reason is that if you ever have a hardware problem and you call your system vendor for support, they are going to want you to run a utility that will check the versions of your main system BIOS, any other firmware, and your hardware drivers. If any of these are out of date, they will want you to update them. If there was a hardware problem that caused an outage, the fact that the hardware had old versions of the main BIOS or other firmware will also tend to focus some blame on whoever was supposed to maintain that hardware.
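If you would rather script this check than click through a GUI tool, here is a minimal sketch in Python. It assumes a Windows server with PowerShell 3.0 or later, and it reads the standard Win32_BIOS WMI class. The function names (parse_bios_info, read_local_bios_info, is_current) are my own hypothetical helpers rather than part of any vendor tool, and the "latest" version you compare against is something you would have to look up yourself on your vendor's support site.

```python
import json
import subprocess

# PowerShell pipeline that reads the Win32_BIOS WMI class and emits JSON.
# Only two string properties are selected to keep the output easy to parse.
POWERSHELL_QUERY = (
    "Get-CimInstance -ClassName Win32_BIOS | "
    "Select-Object Manufacturer, SMBIOSBIOSVersion | "
    "ConvertTo-Json"
)


def parse_bios_info(raw_json):
    """Turn the JSON text emitted by the query into a small dict."""
    data = json.loads(raw_json)
    return {
        "manufacturer": data["Manufacturer"].strip(),
        "version": data["SMBIOSBIOSVersion"].strip(),
    }


def read_local_bios_info():
    """Run the query on the local machine (Windows only, PowerShell 3.0+)."""
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command", POWERSHELL_QUERY],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_bios_info(result.stdout)


def is_current(bios_info, latest_version):
    """Compare against a version string you looked up from the vendor.

    BIOS version formats are vendor-specific, so this sketch only tests
    for equality instead of trying to decide which version is newer.
    """
    return bios_info["version"] == latest_version
```

On a database server, you would call read_local_bios_info() and pass the result to is_current() along with the latest version listed for your server model. The parsing is kept separate from the PowerShell call so that you can also feed it output collected from remote machines by some other means.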
The other reason why this is important is that it shows that someone in your organization (hopefully you) is paying attention to the obvious details. Almost invariably, when I have looked at someone’s system after they have asked me for help, and I find that they are running a main system BIOS that is multiple versions out of date, that means that I will find numerous other problems with their system and database configuration. It is really a pretty reliable predictor of trouble!
There are several reasons why people don’t maintain their database hardware properly. First, they may not know any better, since they may not know that you actually have to maintain this type of thing. Second, they may be afraid of breaking something. What happens if you update your system BIOS, and then the server refuses to POST or boot afterwards? Third, they may just be lazy. After all, the server seems to be running fine, so why should I stick my neck out and have to flash the BIOS at 11 PM on a Friday night? Fourth, they may not have a good, tested HA solution in place for that database server, so doing something like flashing the BIOS will cause a relatively long outage because of a reboot. Perhaps they are unsure whether their applications will continue to work after failing over their database(s) to a mirror instance.
I would argue that a good DBA will find a way around all of these objections and fears, and put a regular maintenance program into place for their database servers. This forces you to think about your HA solution and to actually test it on a fairly regular basis. If you have an HA technology in place, such as failover clustering, database mirroring, or SQL Server 2012 AlwaysOn, you can do rolling upgrades to minimize your downtime during maintenance. With some testing and planning, you can combine BIOS updates, firmware updates, Windows Updates, and SQL Server updates all in one maintenance window with minimal downtime. Planning this effort and then actually doing it on a regular basis forces you and your organization to exercise your HA solution while you keep your systems up to date (which I think will reduce the number of problems you run into in the future). This is far better than just avoiding the whole issue, and leaving your database servers running the original versions of everything on a permanent basis. Don’t be afraid!