Chapter 9. When Things Start to Go Down: Troubleshooting

Your network now has enough built-in redundancy to survive the most common causes of network outages: failing connections, failing hardware, and failing ISPs. It’s rare for two routers, connections, or ISPs to fail at the same time (but don’t think it never happens), so the most common source of problems in a multihomed network is the BGP configuration itself. This chapter tells you how to diagnose and correct BGP problems quickly. Read this chapter carefully, imagine the impact on your network of each of the problems mentioned, and take action to minimize the damage that would result. But first, here are a few paragraphs about handling the stress of troubleshooting network problems when an outage occurs.

Keeping a Clear Head

Remember the time when you weren’t multihomed? When the network went down, this created a fair amount of stress. In a multihomed situation, the network survives most of the problems that are fatal for a single-homed network, so outages are far less frequent. As a result, the amount of stress you experience when there is such an outage—and you know you’re the one who has to fix the problem—reaches unprecedented heights. Clammy hands and a sinking feeling in your stomach may not make you feel better, but your body’s stress response can actually be an advantage. It’s also perfectly natural. Even experienced performers get stage fright, so don’t worry about “staying cool.” If you do a good job, nobody will complain. ...
