A day in the life of Custom Systems’ Network Consultant, Chase Reitter.
On a bright and sunny September morning, I was on my way to my office when I received a phone call from one of my customers. For the sake of the story, we’ll call them Customer #1.
“SQL is down,” he said sadly. “No one can connect.”
“Alright,” I said as I made the next U-turn. “I’m only a few minutes from your office, and will be there as soon as I can.”
“Should I just reboot it?” he asked.
“Not yet,” I said. “I’d prefer to figure out what the problem is first, then reboot if we have to.”
Customer #1 depends heavily on a software application that in turn depends on SQL. No SQL = no work gets done. Also, rebooting SQL means having to reboot the application server as well, and I try to avoid that if I can. Reboots take a long time, and while they usually fix the problem in the short term, they solve nothing in the long term.
When I arrived at Customer #1’s office, I logged into the SQL server, and quickly found that too many connections were left open from the night before. This just means that no one logged off of their computer last night. I informed Customer #1 of the issue, reset the connections, and suggested that they encourage users to log off of their computers every night. Problem solved.
A few minutes after leaving Customer #1, I received another phone call. This time from Customer #2:
“Email is down!” he said, in a distraught tone.
“What do you mean by ‘email is down’?” I inquired.
“No one here can receive email from our customers,” he replied.
“Ok,” I said. “I’m about 30 minutes from your office, and will be there as soon as I can.”
Knowing that inbound email is the issue, and not the entire server, tells me a lot about what the problem isn’t, and what it could be. Plus I have 30 minutes to think about it, so I can carefully lay out my plan of attack.
Upon my arrival, Customer #2 is very upset. “No email means we’re not getting orders from our customers, and that means I’m losing money!”
I completely understand his urgency.
“Ok, I’ll check on it right away and let you know what’s going on as soon as I can.”
I log into their server. It is a Microsoft Hyper-V server that I built about two years ago, so I’m familiar with it. Their Exchange email server is one of four virtual servers running on their Hyper-V host.
“Wait a minute,” I say to myself out loud. Right away I notice FIVE virtual servers now running on their Hyper-V host.
I head down the hall to where Customer #2 is pacing a hole in the floor.
“Is it fixed yet?” he asked.
“Not yet,” I said. “But when did you add a new Virtual Server?”
“Yesterday,” he replied. “It’s to test a new quoting software we are thinking of buying.”
“Ok,” I said. “We need to shrink the virtual hard drive on your test server. It’s taking up too much disk space.”
Innocently enough, he had set up his new test server to use the rest of the drive space assigned to the virtual servers. With Microsoft Exchange, best practice is to keep at least 10 percent of the disk space free for database and log file growth. Once you have less than about 10 percent free, Exchange starts to choke, and email stops working.
Once we shrank the new virtual test server down to an acceptable size, and did some minor cleanup on the Exchange server, email was working again. Problem solved. I checked on Customer #2’s backups, and headed out the door.
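For readers who want to catch this kind of problem before the phone rings, the 10 percent free-space guideline above can be turned into a simple check. The following is only a minimal sketch: the function names, the threshold constant, and the example path are illustrative assumptions of mine, not part of Exchange or of any monitoring tool, and a real setup would point it at the volume that actually holds the databases and logs.

```python
import shutil

# Assumed threshold, taken from the 10 percent guideline in the story.
MIN_FREE_FRACTION = 0.10


def free_fraction(total_bytes: int, free_bytes: int) -> float:
    """Return the share of the volume that is still free (0.0 to 1.0)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes


def is_low_on_space(total_bytes: int, free_bytes: int,
                    min_free: float = MIN_FREE_FRACTION) -> bool:
    """True when free space has dropped below the chosen threshold."""
    return free_fraction(total_bytes, free_bytes) < min_free


def check_volume(path: str) -> bool:
    """Check a real volume; `path` is any directory on that volume."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return is_low_on_space(usage.total, usage.free)


if __name__ == "__main__":
    # Hypothetical path; substitute the volume that holds the databases.
    path = "."
    state = "LOW" if check_volume(path) else "ok"
    print(f"{path}: free space is {state}")
```

Run on a schedule, a check like this would have flagged the shrinking free space the day the test server was created, rather than the morning email stopped.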
Now, finally at my office, I set out to start the list of things I wanted to accomplish that day. Before I could finish, the phone rang. It’s Customer #3:
“One of our users can’t log in to their PC today.”
“Is there an error message?” I asked. Customer #3 is tech savvy. I knew she’d have it.
“Yes. It says, ‘The trust relationship between this workstation and the primary domain has failed.’”
“I guess you hurt its feelings,” I quip.
But she’s not in the mood for my puns today. “So what do I do?”
“There are a number of things we can try,” I said. “Let’s start with the easiest. Do you know the local administrator password for the PC?”
“Yes, of course,” she says.
“Great! And you know how to run Windows System Restore?”
“Yup,” she replies. “Done it a few times.”
Customer #3 is not only tech savvy, but also a Renaissance fair-goer who, on most days, has a great sense of humor.
Jestingly, I say, “Then I beseech ye to go forth to said workstation! Run thy System Restore to Friday last, and report thine findings back to me!”
“Um,” she pauses. “Ok…I’ll call you when it’s done.”
My jokes are lost on her today. But she calls back about 20 minutes later:
“System Restore fixed it! Thanks!”
After we hang up, I go back over the day’s events. I think about how often an apparently colossal problem is fixed by just taking a moment to look at the whole picture: Think about the symptoms, rule out what isn’t the problem, and start with the simplest solution. The simplest solution is almost always the correct one.
Chase Reitter
Network Consultant
Custom Systems Corporation
©Custom Systems Corporation 2013