Taking a methodical approach to finding problems in a computer system.
It happens to almost anyone who uses computer systems at some time or another: either a task that we’ve carried out many times before refuses to work, or a new piece of software flatly refuses to load.
This article goes through the troubleshooting process, working its way towards finding a solution. It’s a general, theoretical approach rather than an answer sheet, so there’s nothing mentioned that’s too specific.
I’ve attempted to keep this as generalised as I can so that it covers as wide a spectrum of potential troubleshooters as possible. Where an aspect has a more professional slant that may not apply to those of us who are just attempting to sort out our home computer, the text will appear like this.
1. Identify the Problem
Before we can do anything constructive towards solving the problem, we first need to identify what the problem is. Ask questions about what was happening when the worst happened.
- Were we able to complete the task earlier? If not, perhaps the answer is simply that the computer system’s hardware is not sufficient for the task.
- If the task was possible before, when did we notice the issue starting to develop? If we can work out what happened right before the problem appeared, it might be possible to identify the problem really quickly.
- What types of changes have there been since the last time that particular task was completed? If nothing immediately comes to mind, consider whether the computer system has been changed in any way since the last time the failing task was completed. Is there any new software? Have there been any updates to the OS? Has any new hardware been added? Again, this might well lead straight to the source of the issue.
- Were there any error messages displayed? If we know what the error messages were, it might be possible to search the manufacturer’s website, or the internet in general, for information.
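If we happen to keep a record of changes made to the system, the "what has changed?" question can be answered mechanically. The following is only a minimal Python sketch under that assumption: the `Change` record, the `changes_since` function and the sample entries are all hypothetical, invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Change:
    """One change made to the system (hypothetical record format)."""
    when: datetime
    kind: str          # e.g. "software", "os-update", "hardware"
    description: str


def changes_since(changes, last_success):
    """Return the changes made after the task last worked, newest first."""
    recent = [c for c in changes if c.when > last_success]
    return sorted(recent, key=lambda c: c.when, reverse=True)


# Illustrative data only; these entries are invented for the example.
history = [
    Change(datetime(2024, 3, 1), "software", "Installed photo editor"),
    Change(datetime(2024, 3, 10), "os-update", "Monthly OS update"),
    Change(datetime(2024, 3, 12), "hardware", "Added USB hub"),
]

# Assume the failing task last worked on 5 March.
suspects = changes_since(history, last_success=datetime(2024, 3, 5))
for change in suspects:
    print(f"{change.when:%Y-%m-%d}  {change.kind:<10} {change.description}")
```

The most recent change is listed first because, as noted above, whatever happened right before the problem appeared is often the quickest lead.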
2. Establish a Theory
Quite simply, this is where we check everything that may seem too easy or obvious, such as whether devices are actually plugged in and connected, whether power switches are turned on, and so on. It’s important to make no assumptions about these obvious things as, quite often, problems are the result of the simplest things.
Having checked all of the above:
- If it’s appropriate to do so, attempt to recreate the issue, paying close attention to what takes place and what the results are. If we first experienced the fault while focusing our attention on the task at hand, rather than on what was going on around that task, we may have missed something vital.
- If we’re assisting someone else, ask them to recreate the steps they took as exactly as possible. This way, it might be possible to identify an error in the way that they are using an application.
- At this point, having done all of the above, it’s important to have a theory about what might have occurred. If our own experience falls short here, it might be time to refer to online forums and support websites. Whatever the problem is, I’d hazard a guess that others have had the same experience at some time and many will have asked for help online.
3. Test the Theory
We’re now going to test our theory to see if we’re right: check and test related components, inspect connections, check any hardware or software configurations, and consult the forums and online support mentioned above.
If we manage to confirm our theory but the problem is not yet solved, it’s time to decide what the next steps will be. On the other hand, if we haven’t been able to confirm our theory about what has gone wrong, we either have to look again and see if there’s an alternative theory, or consider that the problem may be beyond our ability to fix without additional resources of some kind.
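The idea of working through theories until one is confirmed can be sketched in a few lines of Python. This is an illustration only: the `first_confirmed` function and the three stand-in checks are hypothetical, and in real use each check would inspect the actual system rather than return a fixed answer.

```python
def first_confirmed(theories):
    """Test each (name, check) pair in order; return the first confirmed name.

    Returns None when no theory is confirmed, which is the cue to form a
    new theory or look for outside help.
    """
    for name, check in theories:
        if check():
            return name
    return None


# Stand-in checks with fixed answers, purely for illustration.
def cable_unplugged():
    return False


def disk_full():
    return True


def driver_outdated():
    return True


theories = [
    ("Cable unplugged", cable_unplugged),
    ("Disk full", disk_full),
    ("Driver outdated", driver_outdated),
]

confirmed = first_confirmed(theories)
print(confirmed or "No theory confirmed; time for a new theory or outside help")
```

Listing the simplest, most obvious theories first keeps the order of testing in line with the advice in the previous step.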
4. Establish a Plan
Now it’s time for us to establish a plan of action about what we’re going to do to solve the problem. We may need to conduct further research, establish some new or alternative ideas and determine priorities.
We might even end up with more than one plan depending on what the potential causes of our problem are, so we’ll need to prioritise these and execute them one by one.
It’s important to ensure that system downtime is limited and that productivity doesn’t suffer. A half-day shutdown of a network, for example, when one machine has had a malware infection probably isn’t necessary and, in truth, will likely cause more trouble than it solves.
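How plans are prioritised is a judgement call. As one possible sketch in Python, we could try the most likely fix first and, between equally likely fixes, the quickest one. The plan names, the `likelihood` and `minutes` fields and the `prioritise` function are all assumptions made up for this example.

```python
# Hypothetical plans; likelihood is our own rough estimate from 0 to 1,
# and minutes is a rough guess at how long each plan takes to carry out.
plans = [
    {"name": "Reinstall the application", "likelihood": 0.3, "minutes": 30},
    {"name": "Roll back the latest update", "likelihood": 0.6, "minutes": 20},
    {"name": "Restart the print service", "likelihood": 0.6, "minutes": 2},
]


def prioritise(plans):
    """Order plans by likelihood (highest first), then by time (shortest first)."""
    return sorted(plans, key=lambda p: (-p["likelihood"], p["minutes"]))


ordered = prioritise(plans)
for position, plan in enumerate(ordered, start=1):
    print(f"{position}. {plan['name']}")
```

Other orderings are just as defensible, for example trying the least disruptive plan first when downtime matters most.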
Once we have resolved the issue, it’s important to ensure that the entire system is functioning as it should be and, if applicable, implement some preventative measures. Preventative measures might include such things as updating system software and firmware or installing antivirus software.
We need to ensure that our solution has actually worked and that it hasn’t caused issues with other applications or devices connected to the system.
This part of the process may also include consulting with customers, colleagues or vendors to communicate the discovered issue, any solutions and any suggested preventative measures. It might also be a good time to ensure that the customer/client is satisfied with the results.
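Verification differs from testing a theory in that every check should be run, not just the first one that turns something up. Here is a minimal Python sketch of that idea; the `verify` function and the stand-in checks are hypothetical, and real checks would exercise the original task and the neighbouring applications or devices.

```python
def verify(checks):
    """Run every check and return the names of the ones that failed.

    A check that raises an exception is treated as a failure rather than
    stopping the whole verification run.
    """
    failures = []
    for name, check in checks.items():
        try:
            passed = bool(check())
        except Exception:
            passed = False
        if not passed:
            failures.append(name)
    return failures


# Stand-in checks with fixed outcomes, purely for illustration.
def original_task_works():
    return True


def printer_still_responds():
    return False


def backup_job_runs():
    raise RuntimeError("backup service not reachable")


checks = {
    "Original task works": original_task_works,
    "Printer still responds": printer_still_responds,
    "Backup job runs": backup_job_runs,
}

failed = verify(checks)
print("All good" if not failed else "Still failing: " + ", ".join(failed))
```

An empty result means both halves of the question are answered: the fix worked and nothing else was broken by it.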
This part of the process very much depends on the nature of how we’re working with a computer system. Whatever the situation, it’s often important to document and share any knowledge gained from the work carried out.
For personal computer issues that we’ve sorted out at home, it may be worth a post in a forum or on a blog like this, if you have one. This is especially true if the cause was perhaps related to some form of malware or virus.
For computing professionals, this could mean following our company’s documentation plans as well as adding to our own reference materials. Often, it’s a good idea to keep notes at each step of the process above while we’re carrying out the tasks.
This enables us to capture each valuable step of the troubleshooting routine, as well as the all-important outcome, for future use should a similar problem arise again.
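Notes don’t need special software, but for those who like to keep them in a structured form, here is a small Python sketch of a note-taking helper. The `TroubleshootingLog` class, its field names and the sample notes are assumptions for illustration, not a standard format.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime


@dataclass
class TroubleshootingLog:
    """A running record of one troubleshooting session (hypothetical format)."""
    problem: str
    entries: list = field(default_factory=list)

    def note(self, step, text):
        """Add a timestamped note against one step of the process."""
        self.entries.append({
            "time": datetime.now().isoformat(timespec="seconds"),
            "step": step,
            "text": text,
        })

    def to_json(self):
        """Serialise the whole log so it can be saved or shared."""
        return json.dumps(asdict(self), indent=2)


# Sample notes, invented for the example.
log = TroubleshootingLog(problem="Photo editor refuses to load")
log.note("Identify the problem", "Error appeared after the monthly OS update")
log.note("Establish a theory", "Update replaced the graphics driver")
log.note("Test the theory", "Rolling back the driver let the editor load")

print(log.to_json())
```

Saving the output to a file at the end of each session builds up exactly the kind of reference material described above.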