·
Troubleshooting and Supporting the Network
·
Predicting the Impact of Modifying, Adding, or
Removing Network Services
·
Adding, Modifying, or Removing DHCP
·
Adding, Modifying, or Removing WINS
·
Adding, Modifying, or Removing DNS
·
Identify and Troubleshoot Errors with a
Particular Physical Topology
·
Star Topology
·
Ring Topology
·
Bus Network Errors
·
Mesh Network Errors
·
Infrastructure Troubleshooting
·
Troubleshooting Network Media
·
Troubleshooting Infrastructure Hardware
·
Troubleshooting a Wireless Infrastructure
·
Wireless Signal Quality
·
Wireless Channels
·
SSIDs
·
WEP Settings
·
Wireless AP Coverage
·
Troubleshooting Steps and Procedures
·
Identify the Symptoms and Potential Causes
·
Identifying the Affected Area
·
Establishing What Has Changed
·
Selecting the Most Probable Cause of the Problem
·
Implement an Action Plan and Solution Including
Potential Effects
·
Testing the Results
·
Identify the Results and Effects of the Solution
·
Documenting the Solution
Troubleshooting and Supporting the Network
Many duties and responsibilities fall under the umbrella of
network administration. Of all these, one of the most practiced is that of
troubleshooting. No matter how well a network is designed and how many
preventative maintenance schedules are in place, troubleshooting will always be
necessary. Because of this, network administrators have to develop those
troubleshooting skills.
This tutorial focuses on all areas of troubleshooting, including
troubleshooting best practices and some of the tools and utilities you'll use
to assist in the troubleshooting process. To start, we'll look at the impact of
modifying network services.
Predicting the Impact of Modifying, Adding, or Removing Network Services
All network services require a certain amount of network resources
in order to function. The amount of resources required depends on the exact
service being used. Before implementing or removing any service on a network,
it is very important to understand the impact that these services can have on
the entire network. To provide some idea of the demands various services place
on the network, this section outlines some of the most common network services
and the impact their addition, modification, or removal might have on the
network and clients.
Adding, Modifying, or Removing DHCP
DHCP automatically assigns TCP/IP addressing to computers when
they join the network and automatically renews the addresses before they
expire. The advantage of using DHCP is the reduced number of addressing errors,
which makes network maintenance much easier.
One of the biggest benefits of using DHCP is that the
reconfiguration of IP addressing can be performed from a central location, with
little or no effect on the clients. In fact, you can reconfigure an entire IP
addressing system without the users noticing. As always, a cost is associated
with everything good, and with DHCP, the cost is increased network traffic.
You know what the function of DHCP is and the service it provides
to the network, but what impact does the DHCP service have on the network
itself? Some network services can consume huge amounts of network bandwidth,
but DHCP is not one of them. The traffic generated between the DHCP server and
the DHCP client is minimal during normal usage periods.
The bulk of the network traffic generated by DHCP occurs during
two phases of the DHCP communication process: when the lease of the IP address
is initially granted to the client system and when that lease is renewed. The
entire DHCP communication process takes less than a second, but if there are a
very large number of client systems, the communication process can slow down
the network.
For most network environments, the traffic generated by the DHCP
service is negligible. For environments in which DHCP traffic is a concern, you
can reduce this traffic by increasing the lease duration for the client
systems, thereby reducing communication between the DHCP client and the server.
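To get a rough feel for how lease duration affects renewal traffic, the following Python sketch estimates the number of renewal messages generated per day. It assumes that each client attempts to renew at half the lease time (the default renewal timer described in RFC 2131) and that a successful renewal is one request plus one acknowledgment. The function name and the sample figures are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def renewal_messages_per_day(clients, lease_seconds, messages_per_renewal=2):
    """Estimate daily DHCP renewal messages for a group of clients.

    Assumption: each client renews at 50% of the lease time, and a renewal
    is one DHCPREQUEST plus one DHCPACK (2 messages).
    """
    if clients < 0 or lease_seconds <= 0:
        raise ValueError("clients must be >= 0 and lease_seconds must be > 0")
    renewal_interval = lease_seconds / 2
    renewals_per_client = SECONDS_PER_DAY / renewal_interval
    return clients * renewals_per_client * messages_per_renewal


if __name__ == "__main__":
    # Compare a few lease durations for a hypothetical 1,000-client network.
    for hours in (1, 8, 24, 192):
        total = renewal_messages_per_day(1000, hours * 3600)
        print(f"{hours:>3}-hour lease: about {total:,.0f} renewal messages per day")
```

Under these assumptions, doubling the lease duration halves the renewal traffic, which is the effect described above.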
If the DHCP service has to be removed, it can have a significant
impact on network users. All client systems require a valid IP address to get
onto the network. If DHCP is unavailable, each client system would need to be
configured with a static IP address. Because DHCP IP addressing is automatic
and does not assign duplicate IP addresses, as sometimes happens with manual
entries, DHCP is the preferred method of network IP assignment.
If DHCP is added to a network, all client systems will need to be
configured to use DHCP. In a Windows environment, this is as easy as selecting
a radio button to use DHCP. If client systems are not configured to use the
DHCP server, they will not be able to access the network.
Adding, Modifying, or Removing WINS
WINS is used on Microsoft networks to facilitate communications
between computers by resolving NetBIOS names to IP addresses. Each time a
computer starts, it registers itself with a WINS server by contacting that
server over the network. If that system then needs to contact another device on
the network, it can contact the WINS server to get the NetBIOS name resolved to
an IP address.
If you are thinking about not using WINS, you should know that
the alternative is for computers to identify themselves and resolve NetBIOS
names to IP addresses via broadcasts. Broadcasts are inefficient because all
data is transmitted to every device on the network segment. Broadcasts can be a
significant problem for large network segments. Also, because routers do not
typically forward broadcasts, this method of resolution fails across segments:
on a network with more than one segment, you cannot browse to remote segments.
Because WINS actually replaces the broadcast communication on a
network, it has a positive impact on network resources and bandwidth usage.
This does not mean that WINS does not generate any network traffic, just that
the traffic is more organized and efficient. The amount of network traffic
generated by WINS clients to a WINS server is minimal and should not have a
negative impact in most network environments.
WINS server information can be entered manually into the TCP/IP
configuration on a system, or it can be supplied via DHCP. If the WINS server
addresses change and the client configuration is being performed manually, each
system needs to be reconfigured with the new WINS server addresses. If you are
using DHCP, you need to update only the DHCP scope with the new information.
Removing WINS from a network increases the amount of broadcast
traffic and can potentially limit browsing to a single segment unless another
method of resolution (such as the use of the statically maintained LMHOSTS file) is in place.
Adding, Modifying, or Removing DNS
The function of DNS is to resolve hostnames to IP addresses.
Without such a service, network users would have to identify a remote system by
its IP address rather than by its easy-to-remember hostname.
Name resolution can be provided dynamically by a DNS server, or it
can be accomplished statically, using the HOSTS file on the client system. If you are using a DNS server, the IP
address of the DNS server is required. DNS server addresses can be entered
manually, or they can be supplied through a DHCP server.
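When troubleshooting name resolution, it can help to confirm what a hostname actually resolves to on the client. The following is a minimal Python sketch that asks the local resolver (which typically consults the HOSTS file and then the configured DNS servers) for the addresses of a name. The function name is an illustration, not part of any tool mentioned in this tutorial.

```python
import socket


def resolve_hostname(hostname):
    """Return the sorted, unique IP addresses the local resolver finds.

    An empty list means the name could not be resolved.
    """
    try:
        results = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the IP address as a string.
    return sorted({info[4][0] for info in results})


if __name__ == "__main__":
    for name in ("localhost",):
        addresses = resolve_hostname(name)
        print(name, "->", addresses if addresses else "could not be resolved")
```

If a name resolves by IP address but not by hostname, the DNS server address configured on the client (manually or through DHCP) is a reasonable place to look first.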
Identify and Troubleshoot Errors with a Particular Physical Topology
Each of the physical network topologies requires its own
troubleshooting strategies and methods. When troubleshooting a network, it is
important to know which topology is used as it can greatly impact the
procedures used to resolve any problems. This section lists each of the
respective physical network topologies and some common troubleshooting strategies.
Star Topology
The most common topology used today is the star topology. The star topology
uses a central connection point, such as a hub or switch, to which all devices
on the network connect. Each device on the network uses its own length of cable,
thus allowing devices to be added or removed from the network without disruption
to current network users. When troubleshooting a physical star network, consider
the following:
·
The central device, whether a hub or a switch, is a single point of
failure. When several users lose connectivity, the cause might be a faulty
hub. Try moving the cables to a known working hub to
confirm.
·
Hubs and switches have light-emitting diodes (LEDs) that
indicate the status of each port. For instance, by using the LEDs,
you can determine whether there is a jabbering network card, whether there is a
proper connection to the network device, and whether there are too many
collisions on the network.
·
Each device, such as a printer or computer, connects to the central device using
its own length of cable. When troubleshooting a connectivity error in a star
network, it might be necessary to verify that the cable works. This can be done
by swapping the cable with a known working one or using a cable tester.
·
Ensure that all cables, including patch cables, meet the correct
specifications.
Figure 1 shows how a single cable break would affect other client
systems on the network.
Figure 1 Identifying cable breaks in a star network.
Ring Topology
Although it is not as commonly used as it once was, you might find
yourself troubleshooting a ring network. Most ring networks
are logical rings, meaning that the computers are logically connected to one
another in a ring. A physical ring topology is a rare find, but Fiber Distributed
Data Interface (FDDI) is often configured in a physical ring topology. A logical
ring topology uses a central connecting device, as a star network does; this
device is called a multistation access unit (MSAU). When troubleshooting either
a logical or physical ring topology, consider the following:
- A physical ring topology uses a single length of cable interconnecting all
computers and forming a loop. If there is a break in the cable, all systems on
the network will be unable to access the network.
- The MSAU on a logical ring topology represents a single point of failure. If
all devices are unable to access the network, it might be that the MSAU is
faulty.
- Verify that the cabling and connectors have the correct specifications.
- All Network Interface Cards (NICs) on the ring network must operate at the
same speed.
- When connecting MSAUs in a ring network, ensure that the ring in and ring out
configuration is properly set.
Figure 2 shows how a single cable break would affect other client
systems on a physical ring network.
Figure 2 Identifying cable breaks in a physical ring network.
Bus Network Errors
Troubleshooting a bus network can be a difficult and frustrating
task. The following list contains a few hotspots to be aware of when
troubleshooting a bus network:
- A bus topology must be continuous. A break in the cable at any point will
render the entire segment unusable. If the location of the break in the cable is
not apparent, you can check each length of cable systematically from one end to
the other to identify the location of the break, or you can use a tool such as a
time domain reflectometer, which can be used to locate a break in a cable.
- The cable used on a bus network has two distinct physical endpoints. Each of
these cable ends requires a terminator. Terminators are used to absorb
electronic signals so that they are not reflected back on the media,
compromising data integrity. A failed or missing terminator will render the
entire network segment unusable.
- The addition, removal, or failure of a device on the network might prevent the
entire network from functioning. Also, the coaxial cable used in a bus network
can be damaged very easily. Moving cables in order to add or remove devices can
cause cable problems. The T connectors used on bus networks do allow devices to
be added and removed without necessarily affecting the network, but care must be
taken when doing this.
- One end of the bus network should be grounded. Intermittent problems or a high
occurrence of errors can indicate poor or insufficient grounding.
Figure 3 shows how a single cable break would affect other client
systems on a bus network.
Figure 3 Identifying cable breaks in a bus network.
Mesh Network Errors
A mesh topology offers high redundancy by providing several paths for
data to reach its destination. In a true mesh network, each device on the
network is connected to every other device, and if one cable fails, there is
another to provide an alternative data path. Although a mesh topology is
resilient to failure, the number of connections involved can make a mesh
network somewhat tricky to troubleshoot.
When troubleshooting a mesh network, consider the following
points:
·
A mesh topology interconnects all devices on the network, offering
the highest level of redundancy of all the topologies. In a pure mesh
environment, all devices are directly connected to all other devices. In a
hybrid mesh environment, some devices are connected only to certain others in
the topology.
·
Although a mesh topology can accommodate failed links, mechanisms
should still be in place so that failed links are detected and reported.
·
Design and implementation of a true mesh network can be complex
and often requires specialized hardware devices.
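To see why the number of connections makes a true mesh complex to build and troubleshoot, consider how quickly the link count grows. A full mesh of n devices needs n(n - 1)/2 point-to-point links, whereas a star needs only one cable per device. The following Python sketch compares the two; the function names are illustrative.

```python
def full_mesh_links(devices):
    """Links needed to connect every device directly to every other device."""
    if devices < 0:
        raise ValueError("devices must be non-negative")
    return devices * (devices - 1) // 2


def star_links(devices):
    """Links in a star: one cable from each device to the central hub or switch."""
    if devices < 0:
        raise ValueError("devices must be non-negative")
    return devices


if __name__ == "__main__":
    for count in (4, 10, 25, 50):
        print(f"{count:>2} devices: star = {star_links(count):>2} links, "
              f"full mesh = {full_mesh_links(count)} links")
```

Every one of those links is a potential failure that the mesh will route around silently, which is why failed links need to be detected and reported rather than discovered later.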
Infrastructure Troubleshooting
No doubt, you will find yourself troubleshooting wiring and
infrastructure problems less frequently than you'll troubleshoot client
connectivity problems, and thankfully so. Wiring- and infrastructure-related
problems can be very difficult to trace, and sometimes a very costly solution
is needed to remedy the situation. When troubleshooting these problems, a
methodical approach is likely to pay off.
A network infrastructure refers to the physical components that
are used to create the network. This includes the media used, switches,
routers, bridges, patch panels, hubs, and so on.
When troubleshooting the infrastructure, it is important to know
where these devices are on the network and what they are designed to do. In
this section, we explore two essential parts of the infrastructure: media and
hardware components.
Troubleshooting Network Media
The physical connections used to create the networks are sometimes
at the root of a network connectivity error. Troubleshooting wiring involves
knowing what wiring your network uses and where it is being used. When
troubleshooting network media, consider the following:
Media range (attenuation): All cables used in networking have
distance limitations. The network problems might be the result of using a
cable in an environment or a way for
which it was not designed. For example, you might find that a network is
connecting two workstations that are 130 meters apart with Category 5 UTP
cabling. Category 5 UTP is specified for distances up to 100 meters, so
exceeding the maximum cable length is a potential cause of the problem. The
first step in determining the allowable cable distance is to identify the type
of cable used. Determining the cable type is often as easy as reading the
cable. The cable should be stamped with its type, whether it is, for example,
UTP Category 5, RG-58, or something else.
EMI and crosstalk interference: Copper-based media is subject to
the effects of EMI and crosstalk interference. UTP cables are particularly
susceptible to EMI caused by sources such as power lines, electric motors,
fluorescent lighting, and so on. Consider using shielded cable or fiber-optic
media where cables run through areas in which EMI may occur, such as elevator
shafts and ceilings around lighting fixtures. (Plenum-rated cable, by contrast,
is a fire-safety rating for air-handling spaces such as heating ducts and does
not protect against EMI.) Crosstalk occurs when cables are run in close
proximity and the signals from one interfere with the signals on the other.
This can be hard to troubleshoot and isolate, so when designing a network,
ensure that crosstalk preventative measures are taken.
Throughput limitations: A problem with a particular medium may be
simply that it cannot accommodate the throughput required by the network. This
would create network-wide bottlenecks. It may be necessary to update the
network media to correct the problem, for instance, by upgrading the network
backbone to fiber-optic media.
Media connectors: Troubleshooting media requires verifying that the
connectors are correctly attached. In the case of UTP or coaxial, sometimes it
may be necessary to swap out a cable with a known working one to test. Fiber-optic
cabling uses several different types of connectors. Before
implementing a fiber solution, ensure that the switches and routers used match
the connectors used with the fiber-optic cable.
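The media range check described above can be sketched as a simple lookup. In the following Python example, the 100-meter limit for Category 5 UTP comes from the text; the 185-meter figure for RG-58 (as used in 10Base2 segments) is added here as an assumption, and the function name is illustrative.

```python
# Maximum segment lengths in meters.
# "UTP Category 5" is taken from the text; "RG-58" (185 m, 10Base2)
# is an assumption added for this sketch.
MAX_SEGMENT_METERS = {
    "UTP Category 5": 100,
    "RG-58": 185,
}


def check_cable_run(cable_type, length_meters):
    """Report whether a cable run is within the limit for its cable type."""
    limit = MAX_SEGMENT_METERS.get(cable_type)
    if limit is None:
        return f"{cable_type}: no limit on file, check the cable specification"
    if length_meters > limit:
        over = length_meters - limit
        return f"{cable_type}: {length_meters} m exceeds the {limit} m limit by {over} m"
    return f"{cable_type}: {length_meters} m is within the {limit} m limit"


if __name__ == "__main__":
    # The 130-meter run from the example above.
    print(check_cable_run("UTP Category 5", 130))
```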
Troubleshooting Infrastructure Hardware
If you are looking for a challenge, troubleshooting hardware
infrastructure problems is for you. It is often not an easy task and usually
involves many processes, including baselining and performance monitoring. One
of the keys to identifying the failure of a hardware network device is to know
what devices are used on a particular network and what each device is designed
to do. Some of the common hardware components used in a network infrastructure
are shown in Table 1.
Table 1 Common network hardware components, their function and troubleshooting strategies.
| Networking Device | Function | Troubleshooting and Failure Signs |
| --- | --- | --- |
| Hubs | Hubs are used with a star network topology and UTP cable to connect multiple systems to a centralized physical device. | Because hubs connect multiple network devices, if many devices are unable to access the network, the hub may have failed. When a hub fails, all devices connected to it will be unable to access the network. Additionally, hubs forward data to all connected ports, which increases network traffic. When network traffic is high and the network is operating slowly, it may be necessary to replace the hubs, for example with switches. |
| Switches | Like hubs, switches are used with a star topology to create a central connectivity device. | The inability of several network devices to access the network may indicate a failed switch. If the switch fails, all devices connected to the switch will be unable to access the network. Switches forward data only to the intended recipient allowing them to better manage data than hubs. |
| Routers | Routers are used to separate broadcast domains and to connect different networks. | If a router fails, network clients will be unable to access remote networks connected by the router. For example, if clients access a remote office through a network router and the router fails, the remote office would be unavailable. Testing router connectivity can be done using utilities such as ping and tracert. |
| Bridges | Bridges are commonly used to connect network segments within the same network. Bridges manage the flow of traffic between these network segments. | A failed bridge would prevent the flow of traffic between network segments. If communication between network segments has failed, it may be due to a failed bridge. |
| Wireless Access Points | Wireless access points provide the bridge between the wired and wireless network. | If wireless clients are unable to access the wired network, the WAP may have failed. However, there are many configuration settings to verify first. |
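Table 1 mentions testing router connectivity with ping. The following Python sketch wraps the operating system's ping command so the check can be scripted. It assumes that ping is available on the path, that Windows uses `-n` for the packet count, and that other systems use `-c`; the function names are illustrative.

```python
import platform
import subprocess


def build_ping_command(host, count=4, system=None):
    """Build the ping command line for the current (or a named) operating system."""
    system = system or platform.system()
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]


def host_responds(host, count=4):
    """Return True if ping exits with status 0, which normally means replies arrived."""
    try:
        completed = subprocess.run(
            build_ping_command(host, count),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        # ping is missing or did not finish in time; treat as no response.
        return False
    return completed.returncode == 0
```

For example, `host_responds("192.168.1.1")` (a hypothetical router address) could be checked first against the local router and then against a host on the remote network. If the local router answers but the remote host does not, the router or the link beyond it is a likely suspect.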
Troubleshooting a Wireless Infrastructure
Wireless networks do not require physical cable to connect
computers; rather, they use wireless media. The benefits of such a
configuration are clear: users have remote access to files and resources without
the need for physical connections. Wireless networking eliminates cable faults
and cable breaks. It does, however, introduce its own considerations, such as
signal interference and security.
Wireless Signal Quality
Because wireless signals travel through the atmosphere, they are
subject to environmental factors that can weaken data signals. Electrical
devices, storms, RF interference, and obstacles such as trees
can all weaken wireless data signals. Just how weakened the signal becomes depends
on many factors; however, all of these elements serve to decrease the power of
the wireless signal.
If you are troubleshooting a wireless connection that has a
particularly weak signal, there are a few infrastructure changes that can be
made to help increase the power of the signal.
·
Antenna: Perhaps the first and most obvious thing to check is that
the antenna on the wireless access point is positioned for best
reception; getting the placement right often takes a little trial and error.
Today's wireless access cards commonly ship with diagnostic software
that displays signal strength.
·
Device Placement: One of the factors that can degrade wireless
signals is RF interference. Because of this, it is important to keep
wireless devices away from appliances that output RF noise. This includes
devices such as microwaves, certain cordless devices using the same frequency,
and other electrical devices.
·
Network Location: Although the choice might be limited, it is important
to reduce, as much as possible, the number of obstructions that
the signal must pass through. Every obstacle strips a little more power from
the signal. The type of material a signal must pass through also can have a
significant impact on the signal integrity.
·
Boost Signal: If all else fails, it is possible to purchase devices
such as wireless repeaters that can amplify the wireless signal. The device
takes the signal and amplifies it so that the signal has greater strength. This
will also increase the distance that the client system can be placed from the
WAP.
In order to successfully manage the wireless signals, you will
need to know the wireless standard that you are using. The standards that are
used today specify range distances, RF ranges, and speeds. It might be that the
wireless standard is not capable of doing what you need. Table 2 highlights the
characteristics of common wireless standards.
Table 2 Comparing Wireless Standards
| Standard | Speed | Range | Frequency | Concerns |
| --- | --- | --- | --- | --- |
| 802.11a | Up to 54Mbps | 25 to 75 feet | 5GHz | Not compatible with 802.11g or 802.11b |
| 802.11b | Up to 11Mbps | Up to 150 feet | 2.4GHz | Might conflict with other devices using the 2.4GHz range |
| 802.11g | Up to 54Mbps | Up to 150 feet | 2.4GHz | Might conflict with other devices using the 2.4GHz range |
| Bluetooth | 720Kbps | 33 feet | 2.4GHz | Might conflict with other devices using the 2.4GHz range |
As you can see in Table 2, the speeds are listed with the "Up
to" disclaimer. This is because each standard will decrease the data rate
if there is interference. An 802.11b wireless link offers speeds up to 11Mbps, but
it will automatically back down from 11Mbps to 5.5, 2, and 1Mbps when the radio
signal is weak or when interference is detected. The 802.11g auto-sensing rates are
1, 2, 5.5, 6, 9, 11, 12, 18, 24, 36, 48, and 54Mbps. Finally, 802.11a provides
rates up to 54Mbps, but will automatically back down to rates of 48, 36, 24, 18,
12, 9, and 6Mbps.
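The rate fallback behavior can be pictured as stepping down an ordered list. The following Python sketch stores the rates for each standard from highest to lowest and returns the next step down. The 802.11g list includes 11Mbps, which 802.11g supports for compatibility with 802.11b. The strictly descending order is a simplification for illustration, because real rate adaptation is vendor-specific, and the function name is hypothetical.

```python
# Data rates in Mbps, highest first.
FALLBACK_RATES = {
    "802.11a": [54, 48, 36, 24, 18, 12, 9, 6],
    "802.11b": [11, 5.5, 2, 1],
    "802.11g": [54, 48, 36, 24, 18, 12, 11, 9, 6, 5.5, 2, 1],
}


def next_lower_rate(standard, current_rate):
    """Return the next rate below current_rate, or None if already at the lowest.

    Raises KeyError for an unknown standard and ValueError for a rate
    the standard does not list.
    """
    rates = FALLBACK_RATES[standard]
    index = rates.index(current_rate)
    return rates[index + 1] if index + 1 < len(rates) else None


if __name__ == "__main__":
    rate = 11
    while rate is not None:
        print(f"802.11b link running at {rate}Mbps")
        rate = next_lower_rate("802.11b", rate)
```

A client that reports a connection at one of the lower rates is therefore a hint that the signal is weak or that interference is present, not necessarily that the hardware is faulty.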
Wireless Channels
RF channels are important parts of wireless communications. A
channel is the frequency band used for the wireless communication. Each
standard specifies the channels that can be used. The 802.11a standard
specifies radio frequencies ranging between 5.15 and 5.875GHz. In contrast, the
802.11b and 802.11g standards operate in the 2.4 to 2.497GHz range. As far
as channels are concerned, 802.11a has a wider frequency band, allowing more
channels and therefore more data throughput. As a result of the wider band,
802.11a supports up to eight nonoverlapping channels. The 802.11b/g standards use
the smaller band and support only up to three nonoverlapping channels.
It is recommended that the nonoverlapping channels be used for
communication. In the United States, 802.11b/g uses 11 channels for data
communication; as mentioned, three of these (channels 1, 6, and 11) are
nonoverlapping channels. Most manufacturers set their default channel to one of
the nonoverlapping channels to avoid transmission conflicts. With wireless
devices, you have the option of selecting which channel your WLAN operates on in
order to avoid interference from other wireless devices that operate in the
2.4GHz frequency range.
When troubleshooting a wireless network, be aware that overlapping
channels can disrupt the wireless communications. For example, in many
environments, APs are inadvertently placed close together. Perhaps two access
points in separate offices are located next door to each other or between
floors. Signal disruption will result if there is channel overlap between the
access points. The solution here is to move the access point to avoid
the overlap or to change channels to one of the other nonoverlapping
channels. For example, switch from channel 6 to channel 11.
As far as troubleshooting is concerned, you would typically only
change the channel of a wireless device if there is a channel overlap with
another device. If a channel must be changed, it must be changed to another
nonoverlapping channel.
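To check whether two access points are on overlapping channels, the channel numbers can be converted to center frequencies. The following Python sketch assumes the usual 2.4GHz channel plan, in which channel 1 is centered at 2412MHz and each channel is 5MHz higher than the last, and it assumes a channel width of roughly 22MHz. Both figures and the function names are stated assumptions of this sketch, which covers the 11 channels used in the United States.

```python
CHANNEL_WIDTH_MHZ = 22  # approximate 802.11b channel width (assumption)


def center_frequency_mhz(channel):
    """Center frequency for 2.4GHz channels 1 to 11 (2412MHz plus 5MHz steps)."""
    if not 1 <= channel <= 11:
        raise ValueError("this sketch covers channels 1 to 11 only")
    return 2407 + 5 * channel


def channels_overlap(first, second):
    """Two channels overlap when their centers are closer than one channel width."""
    distance = abs(center_frequency_mhz(first) - center_frequency_mhz(second))
    return distance < CHANNEL_WIDTH_MHZ


if __name__ == "__main__":
    for pair in ((1, 6), (6, 11), (1, 3), (6, 9)):
        state = "overlap" if channels_overlap(*pair) else "do not overlap"
        print(f"Channels {pair[0]} and {pair[1]} {state}")
```

With these figures, channels must be at least five numbers apart to avoid overlap, which is why 1, 6, and 11 are the three nonoverlapping choices.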
SSIDs
The Service Set Identifier (SSID) is a configurable
identifier that allows clients to communicate with a particular base station.
In practice, a client can communicate only with base stations that are
configured with the same SSID as the client. The SSID provides a simple
password arrangement between base stations and clients.
As far as troubleshooting is concerned, if a client is not able to
access a base station, ensure that both are using the same SSID. Incompatible
SSIDs are sometimes found when clients move computers, such as laptops, between
different wireless networks. They obtain an SSID from one network and then if
the system is not rebooted, the old SSID won't allow communication to a
different base station.
WEP Settings
Wired Equivalent Privacy (WEP) is a security protocol for
wireless networks that encrypts transmitted data. WEP is easy to configure, with
only three possible security options: Off (no security), 64-bit (basic
security), and 128-bit (stronger security). WEP is not difficult to crack, and
using it reduces performance slightly.
If your network operates with WEP turned off, your system is very
open for someone to access your data. Depending on the sensitivity of your
data, you can choose between the 64-bit and 128-bit encryption. Although the
128-bit WEP encryption provides greater security, it does so at a performance
cost. 64-bit offers less impact on system performance and less security.
As far as troubleshooting is concerned, in order for wireless
communication to take place, wireless devices must all use the same WEP
setting. Most devices are set to Off by default; if changed, all clients must
use the same settings.
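Because both the SSID and the WEP setting must match between the access point and the client, a quick side-by-side comparison is often the fastest check. The following Python sketch compares two settings dictionaries and reports any differences. The dictionary keys, the sample network names, and the function name are hypothetical and not tied to any particular vendor's configuration format.

```python
def find_mismatches(access_point, client, keys=("ssid", "wep")):
    """Return {setting: (ap_value, client_value)} for every setting that differs."""
    return {
        key: (access_point.get(key), client.get(key))
        for key in keys
        if access_point.get(key) != client.get(key)
    }


if __name__ == "__main__":
    # Hypothetical settings for an office access point and a roaming laptop.
    ap = {"ssid": "OfficeNet", "wep": "128-bit"}
    laptop = {"ssid": "HomeNet", "wep": "Off"}
    for setting, (ap_value, client_value) in find_mismatches(ap, laptop).items():
        print(f"{setting}: access point uses {ap_value!r}, client uses {client_value!r}")
```

The comparison is case-sensitive, so an SSID that differs only in capitalization is reported as a mismatch.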
Wireless AP Coverage
Like any other network media, APs have a limited transmission
distance. This limitation is an important consideration when deciding where an
AP should be placed on the network. When troubleshooting a wireless network,
pay close attention to the distance that client systems are away from the AP.
When faced with a problem in which client systems cannot
consistently access the AP, you could try moving the AP to better cover the
area, but then you might disrupt access for users in other areas. So what can
be done to troubleshoot AP coverage?
Depending on the network environment, the quick solution might be
to throw money at the problem: purchase another access point, cabling, and
other hardware, and expand the transmission area through increased hardware.
However, there are a few things to try before installing another wireless
access point. The following list is ordered from the least expensive solution to
the most expensive.
·
Increase transmission power: Some access points have a setting to
adjust the transmission power output. By default, most of these settings will
be set to the maximum output; however, it is worth verifying just in case. As a
side note, the transmission power can be decreased if you are trying to reduce the
dispersion of radio waves beyond the immediate network. Increasing the power
would provide clients with stronger data signals and greater transmission distances.
·
Relocate the AP: When wireless client systems suffer from
connectivity problems, the solution might be as simple as relocating the WAP.
It might need to be moved a few feet, across the room,
or across the hall. Finding the right location will likely take a little trial
and error.
·
Adjust or replace antennas: If the access point distance is not
sufficient for some network clients, it might be necessary to replace the
default antennas used with both the AP and the client with higher-end antennas.
Upgrading an antenna can make a big difference in terms of transmission
range. Unfortunately, not all WAPs have replaceable antennas.
·
Signal amplification: RF amplifiers add significant distance to
wireless signals. An RF amplifier increases the strength and readability of the
data transmission. The amplifier improves both the received and
transmitted signals, resulting in an increase in wireless network performance.
·
Use a repeater: Before installing a new AP, you might want to first
think about a wireless repeater. When set to the same channel as the AP, the
repeater will take the transmission and repeat it. So, the WAP transmission
gets to the repeater, and then the repeater duplicates the signal and passes it
forward. It is an effective strategy to increase wireless transmission
distances.
Troubleshooting Steps and Procedures
Regardless of the problem, effective network troubleshooting
follows some specific troubleshooting steps. These steps provide a framework in
which to perform the troubleshooting process and, when followed, can reduce the
time it takes to isolate and fix a problem. The following sections discuss the
common troubleshooting steps and procedures.
1. Identify the symptoms and potential causes.
2. Identify the affected area.
3. Establish what has changed.
4. Select the most probable cause.
5. Implement an action plan and solution including potential effects.
6. Test the result.
7. Identify the results and effects of the solution.
8. Document the solution and process.
Identify the Symptoms and Potential Causes
The first step in the troubleshooting process is to establish
exactly what the symptoms of the problem are. This stage of the troubleshooting
process is all about information gathering. To get this information, we need a
knowledge of the operating system used, good communication skills, and a little
patience. It is very important to get as much information as possible about the
problem. You can glean information from three key sources: the computer (in the
form of logs and error messages), the computer user experiencing the problem,
and your own observation.
Once you have identified the symptoms, you can begin to formulate
some of the potential causes of those symptoms.
Identifying the Affected Area
Some computer problems are isolated to a single user in a single
location; others affect several thousand users spanning multiple locations.
Establishing the affected area is an important part of the troubleshooting
process, and it will often dictate the strategies you use in resolving the
problem.
Problems that affect many users are often connectivity issues that
disable access for many users. Such problems can often be isolated to wiring
closets, network devices, and server rooms. The troubleshooting process for
problems that are isolated to a single user will often begin and end at that
user's workstation. The trail might indeed lead you to the wiring closet or
server, but that is not likely where the troubleshooting process would begin.
Understanding who is affected by a problem can provide you with the first clues
about where the problem exists.
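One way to turn the list of affected users into a clue is to look for devices that all affected users share and that no working user depends on. The following Python sketch does this with set operations. The user names, device names, and the idea of recording a per-user device path are assumptions made for the example.

```python
def suspect_devices(affected, working, user_path):
    """Return devices used by every affected user and by no working user.

    user_path maps each user to the devices between that user and the server.
    """
    if not affected:
        return []
    shared = set(user_path[affected[0]])
    for user in affected[1:]:
        shared &= set(user_path[user])
    # A device that still carries traffic for a working user is unlikely to be at fault.
    for user in working:
        shared -= set(user_path[user])
    return sorted(shared)


if __name__ == "__main__":
    # Hypothetical network: two users on hub-2, one user on hub-3.
    paths = {
        "ann": ["hub-2", "switch-1", "router-1"],
        "bob": ["hub-2", "switch-1", "router-1"],
        "cy": ["hub-3", "switch-1", "router-1"],
    }
    print(suspect_devices(["ann", "bob"], ["cy"], paths))
```

In this example, the only device common to both affected users and not used by the working user is hub-2, which points the investigation toward the wiring closet instead of the individual workstations.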
Establishing What Has Changed
Whether there is a problem with a workstation's access to a
database or with an entire network, keep in mind that the system was working at
some point. Although many users claim that the "computer just stopped working,"
that is unlikely. Far more likely is that changes to the system
or the network caused the problem.
Look for newly installed applications, applied patches or updates,
new hardware, a physical move of the computer, or a new username and password.
Establishing any recent changes to a system will often lead you in the right
direction to isolate and troubleshoot a problem.
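If a record of the earlier state exists, comparing it with the current state is a quick way to list what has changed. The following Python sketch assumes both states are available as simple dictionaries; the keys and values shown are hypothetical.

```python
def find_changes(before, after):
    """Return {setting: (old_value, new_value)} for every setting that differs."""
    changes = {}
    for key in sorted(set(before) | set(after)):
        old, new = before.get(key), after.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes


if __name__ == "__main__":
    # Hypothetical snapshots of one workstation, taken before and after the problem.
    last_week = {"dns_server": "192.168.1.10", "patch_level": "SP1", "username": "jsmith"}
    today = {"dns_server": "192.168.1.10", "patch_level": "SP2", "username": "jsmith",
             "new_application": "backup agent"}
    for setting, (old, new) in find_changes(last_week, today).items():
        print(f"{setting}: {old!r} -> {new!r}")
```

A setting that appears in only one snapshot is reported with None on the other side, which covers newly installed applications and removed hardware alike.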
Selecting the Most Probable Cause of the Problem
There can be many different causes for a single problem on a
network, but with appropriate information gathering, it is possible to
eliminate many of them. When looking for a probable cause, it is often best to
start with the simplest explanation and work from there. Even in the most
complex of network designs, the simplest explanation is often the right one. For
instance, if a single user cannot log on to a network, it is best to confirm
network settings before replacing the NIC. Remember, though, that at this
point, you are only trying to determine the most probable cause, and your first
guess might, in fact, be incorrect. It might take a few tries to determine the
correct cause of the problem.
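Checking the simplest causes first can be sketched in Python as an ordered list of checks that stops at the first failure. The workstation data and the check names are hypothetical; the only outside fact used is that an address in the 169.254.x.x range is a self-assigned link-local address, which usually means the workstation did not obtain its settings from DHCP.

```python
def first_failing_check(checks):
    """Run checks in order, simplest first, and return the name of the first failure."""
    for name, check in checks:
        if not check():
            return name
    return None


# Hypothetical state of the workstation of a user who cannot log on.
workstation = {
    "cable_connected": True,
    "ip_address": "169.254.10.7",
    "nic_detected": True,
}


def has_usable_address(state):
    # 169.254.x.x is link-local: the workstation assigned itself an address.
    return not state["ip_address"].startswith("169.254.")


# Ordered from easiest to verify to most invasive to fix.
CHECKS = [
    ("cable", lambda: workstation["cable_connected"]),
    ("network settings", lambda: has_usable_address(workstation)),
    ("NIC", lambda: workstation["nic_detected"]),
]

if __name__ == "__main__":
    print("Most probable cause:", first_failing_check(CHECKS))
```

The ordering carries the reasoning here: the network settings are examined before the NIC is even considered for replacement.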
Implement an Action Plan and Solution Including Potential Effects
After identifying a cause, but before implementing a solution, you
should develop a plan for the solution. This is particularly important for
server systems, where taking the server offline is difficult and undesirable.
The plan must include details about when the server or network should be taken
offline and for how long, what support services are in place, and who will be
involved in correcting the problem.
Planning is a very important part of the whole troubleshooting
process and can involve formal or informal written procedures. Those who do not
have experience troubleshooting servers might be wondering about all the
formality, but this attention to detail ensures the least amount of network or
server downtime and the maximum data availability.
With the plan in place, you should be ready to implement a
solution; that is, apply the patch, replace the hardware, plug in a cable, or
implement some other solution. In an ideal world, your first solution would fix
the problem, although unfortunately this is not always the case. If your first
solution does not fix the problem, you will need to retrace your steps and
start again.
It is important that you attempt only one solution at a time.
Trying several solutions at once can make it very unclear which one actually
corrected the problem.
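The one-solution-at-a-time rule can be sketched in Python as a loop that applies a single fix, tests, and only then moves on. The fix names, the state dictionary, and the success test are hypothetical, and this sketch does not roll back a fix that had no effect.

```python
def apply_fixes_one_at_a_time(fixes, problem_is_resolved):
    """Apply fixes in order, testing after each, so the effective fix is known."""
    attempted = []
    for name, apply_fix in fixes:
        apply_fix()
        attempted.append(name)
        if problem_is_resolved():
            return name, attempted
    return None, attempted


if __name__ == "__main__":
    # Hypothetical system state: the real fault is a wrong DNS server setting.
    state = {"cable_seated": True, "dns_server": "10.0.0.99"}

    def reseat_cable():
        state["cable_seated"] = True

    def correct_dns_setting():
        state["dns_server"] = "192.168.1.10"

    def name_resolution_works():
        return state["dns_server"] == "192.168.1.10"

    fixes = [("reseat cable", reseat_cable),
             ("correct DNS server setting", correct_dns_setting)]
    effective, attempted = apply_fixes_one_at_a_time(fixes, name_resolution_works)
    print("Effective fix:", effective)
    print("Attempted in order:", attempted)
```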
Testing the Results
After the corrective change has been made to the server, network,
or workstation, it is necessary to test the results; never assume that the fix worked. This is when
you find out if you were right and the remedy you applied actually worked.
Don't forget that first impressions can be deceiving, and a fix that seems to
work on first inspection might not actually have corrected the problem.
The testing process is not always as easy as it sounds. If you are
testing a connectivity problem, it is not difficult to ascertain whether your
solution was successful. However, changes made to an application or to
databases you are unfamiliar with are much more difficult to test. It might be
necessary to have people who are familiar with the database or application run
the tests with you in attendance.
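For the simpler case of a connectivity fix, a test can be as small as the following Python sketch, which uses the standard socket module to try a TCP connection several times. The host name, the port, and the number of attempts are assumptions for illustration; repeating the test is one way to guard against a fix that only seems to work on first inspection.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def verify_fix(host, port, attempts=3, timeout=2.0):
    """Require every attempt to succeed so one lucky connection is not a pass."""
    return all(can_connect(host, port, timeout) for _ in range(attempts))


if __name__ == "__main__":
    # Hypothetical server name and port; substitute the service you repaired.
    print(verify_fix("fileserver.example.com", 445))
```

A successful connection shows only that the service is reachable; it says nothing about whether an application or database behind it behaves correctly.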
Identify the Results and Effects of the Solution
Sometimes, you will apply a fix that corrects one problem but
creates another. Such side effects are often hard to predict, but not
always. For instance, you might add a new network application, but the
application requires more bandwidth than your current network infrastructure
can support. The result would be that overall network performance would be
compromised.
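The bandwidth example lends itself to a quick estimate before the application is added. In the following Python sketch, the traffic figures and the 80 percent warning threshold are assumptions chosen for illustration, not recommended values.

```python
def projected_utilization(current_mbps, added_mbps, capacity_mbps):
    """Return the projected load as a fraction of link capacity."""
    if capacity_mbps <= 0:
        raise ValueError("capacity_mbps must be positive")
    return (current_mbps + added_mbps) / capacity_mbps


def exceeds_capacity(current_mbps, added_mbps, capacity_mbps, threshold=0.8):
    """Return True if projected load passes the threshold (0.8 is an assumption)."""
    return projected_utilization(current_mbps, added_mbps, capacity_mbps) > threshold


if __name__ == "__main__":
    # Hypothetical figures: a 100 Mbps link carrying 40 Mbps at peak,
    # and a new application expected to add 50 Mbps.
    if exceeds_capacity(40, 50, 100):
        print("The new application is likely to degrade network performance.")
```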
Everything done to a network can have a ripple effect and
negatively affect another area of the network. Actions such as adding clients,
replacing hubs, and adding applications can all have unforeseen results. It is
very difficult to always know how the changes you make to a network are going
to affect the network's functioning. The safest thing to do is assume that the
changes you make are going to affect the network in some way and realize that
you just have to figure out how. This is when you might need to think outside
the box and try to predict possible outcomes.
Documenting the Solution
Although it is often neglected in the troubleshooting process,
documentation is as important as any of the other troubleshooting procedures.
Documenting a solution involves keeping a record of all the steps taken during the
fix, not just the final solution.
For the documentation to be of use to other network administrators
in the future, it must include several key pieces of information. When
documenting a procedure, you should include the following information:
·
Date: When was the solution implemented? If problems occur after your
changes, knowing the date of your fix makes it easier to determine whether your
changes caused them.
·
Why: Although the reason for a fix is obvious while the problem is being
fixed, a few weeks later it might be less clear why that solution was needed.
Documenting why the fix was made is important because if the same problem
appears on another system, you can use this information to reduce the time
spent finding the solution.
·
What: The successful fix should be detailed, along with information
about any changes to the configuration of the system or network that were made
to achieve the fix. Additional information should include version numbers for
software patches or firmware, as appropriate.
·
Results: Many administrators choose to include information on both
successes and failures. The documentation of failures might prevent you from
going down the same road twice, and the documentation of successful solutions
can reduce the time it takes to get a system or network up and running.
·
Who: Information might be left out of the documentation, or someone might
simply want to ask a few questions about a solution. In both cases, if the name
of the person who made the fix is in the documentation, he or she can easily be
tracked down. Of course, this is more of a concern in environments that have a
number of IT staff or in which system repairs are performed by contractors
instead of company employees.
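The five items above map naturally onto a structured record. The following Python sketch is one possible layout, assuming the record is kept as JSON; the class name FixRecord, its field names, and the sample values are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date


@dataclass
class FixRecord:
    """One documented fix, covering date, why, what, results, and who."""
    fixed_on: date
    why: str
    what: str
    results: str
    who: str
    versions: dict = field(default_factory=dict)         # patch or firmware versions
    failed_attempts: list = field(default_factory=list)  # what was tried and did not work

    def to_json(self):
        """Serialize the record, storing the date in ISO 8601 form."""
        data = asdict(self)
        data["fixed_on"] = self.fixed_on.isoformat()
        return json.dumps(data, indent=2)


if __name__ == "__main__":
    # Hypothetical record for illustration only.
    record = FixRecord(
        fixed_on=date(2024, 3, 1),
        why="Users on floor 2 could not resolve server names.",
        what="Corrected the DNS server address in the DHCP scope options.",
        results="Name resolution restored after clients renewed their leases.",
        who="J. Smith",
        versions={"dhcp_server_patch": "SP2"},
        failed_attempts=["Reseated patch cables in the wiring closet"],
    )
    print(record.to_json())
```

Keeping failed attempts in the same record follows the point made under Results: a documented dead end saves the next administrator from repeating it.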



