
BGP: Routed or Routing Protocol? Or…NONE LIKE IT HOT!

I think we need to cover something very basic before we get into this discussion:

BGP sucks.

Let’s face it.  BGP is probably the dumbest routing protocol currently deployed.  Even RIPv2 is smarter than BGP.  The other protocols out there (IS-IS [my staff manager fondly calls this Protocol Zero, by the way], OSPF, RIP, EIGRP [this isn’t a real protocol]) do everything auto-magically.  Seriously, it’s like magic sometimes.  Yeah, you have to come up with fancy things like stub areas, NSSA, and all of that jazz from time to time, but really, these protocols largely do everything on their own.  You just tell them the type of area you want them to be in, or that you want the router to be a stub router, or whatever, and they know exactly what you mean.

BGP, on the other hand?  Yeah, good luck.  First, you have to explicitly tell it what devices to peer with.  Then you have to specify the local-address (update-source if you’re a Cisco weirdo).  Then type internal or type external.  And being even remotely intelligent about what advertisements it sends where?  Not happening.

GTFO!

So now that you’re probably thinking I’m uneducated and just don’t understand BGP, let me ease those concerns.

BGP’s inability to function even remotely well on its own is also its greatest strength.  It gives the administrator absolute and total control over precisely how traffic enters and exits his administrative domain.  I’m not going to get into ‘how’ or ‘why.’  Everything I’ve written so far has just been a prelude to what comes next.  And that is…

IS BGP A ROUTED PROTOCOL OR A ROUTING PROTOCOL?

As we’ve discussed, BGP is not capable of making intelligent decisions on its own.  It needs direction!  You can’t (realistically) use BGP as your IGP.  While technically possible, it would be a very, very bad idea.  OSPF and IS-IS (zero protocol!) have a full topology of the entire IGP (at least the backbone does).  On top of that, they have areas that subdivide the network to conserve hardware resources and reduce the size of routing tables.  BGP really can’t do that.  For BGP to even be used as an IGP, you would have to create iBGP peerings based on interfaces and loopbacks, redistribute connected networks, and tune timers because BGP convergence is terrible.  Even then you would run into problems.  An iBGP-learned route does not get advertised to other iBGP peers, right?  So you would have to be extra careful to make sure you are peering to everything in known existence in your network (that’s a little overkill, I know).
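To put a number on that "peer with everything" problem, here is a minimal Python sketch. It assumes plain iBGP with no route reflectors or confederations, and the router names and function names are made up for illustration. It shows the default rule (an iBGP-learned route is not passed on to another iBGP peer) and how many sessions a full mesh needs.

```python
from itertools import combinations

def full_mesh_sessions(routers):
    """Return every router pair; each pair needs its own iBGP session."""
    return list(combinations(routers, 2))

def may_advertise(learned_from, send_to):
    """Default rule: a route learned over iBGP is not re-advertised over iBGP."""
    return not (learned_from == "ibgp" and send_to == "ibgp")

# Hypothetical five-router network.
routers = ["r1", "r2", "r3", "r4", "r5"]
sessions = full_mesh_sessions(routers)

print(len(sessions))                  # n * (n - 1) / 2 sessions
print(may_advertise("ebgp", "ibgp"))  # eBGP-learned routes go to iBGP peers
print(may_advertise("ibgp", "ibgp"))  # iBGP-learned routes do not
```

With five routers that is already ten sessions to configure by hand, and the count grows with the square of the router count.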

This all makes BGP incredibly pathetic as an IGP.  Why?  Because it’s stupid.  Very unintelligent.

What does this have to do with anything?

iBGP relies on the underlying IGP to form its iBGP peerings–if they’re based on loopback IPs (which they always should be).  This is because BGP, all by itself, without excessive and hideous configuration, can’t route traffic.  Check out the NEXT_HOP attribute for more information on that.

To make things short and sweet, the next-hop is not changed by default and requires recursive lookups–which are going to depend on the IGP.  And if it’s an externally learned route whose next-hop the IGP knows nothing about, well, you’re screwed (without setting next-hop self).
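The recursive lookup is easier to see in a toy model.  The Python sketch below is only an illustration: the two tables, the addresses, and the interface name are invented, and a real router does far more than this.  It resolves a destination to a BGP NEXT_HOP first, then asks the IGP table how to reach that NEXT_HOP.

```python
import ipaddress

# Hypothetical tables; prefixes are from documentation ranges.
BGP_ROUTES = {
    "203.0.113.0/24": "192.0.2.1",    # NEXT_HOP is a loopback the IGP knows
    "198.51.100.0/24": "192.0.2.99",  # NEXT_HOP the IGP has never heard of
}
IGP_ROUTES = {
    "192.0.2.1/32": ("10.0.0.2", "ge-0/0/0"),  # (neighbor, outgoing interface)
}

def longest_match(table, address):
    """Return the value stored under the most specific matching prefix."""
    addr = ipaddress.ip_address(address)
    matches = [ipaddress.ip_network(p) for p in table
               if addr in ipaddress.ip_network(p)]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return table[str(best)]

def resolve(destination):
    """Look up the BGP NEXT_HOP, then recurse into the IGP to reach it."""
    next_hop = longest_match(BGP_ROUTES, destination)
    if next_hop is None:
        return None                  # no BGP route at all
    return longest_match(IGP_ROUTES, next_hop)  # None means unresolvable

print(resolve("203.0.113.9"))   # resolves through the IGP
print(resolve("198.51.100.1"))  # BGP route exists, but the next-hop is unreachable
```

The second lookup is the "you're screwed" case: the BGP route is there, but without an IGP path to its next-hop the router has nowhere to send the packet.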

So what happens when packets travel between autonomous systems?  BGP contains an AS_PATH attribute that details the autonomous system path.  But that isn’t really relevant except as a primitive ‘shortest path’ indicator and loop-prevention mechanism.  There’s another attribute, the NEXT_HOP attribute, that does the ‘real work.’  And it doesn’t really do much work.  The router performs recursive lookups and relies on the IGP to get the traffic to whatever it knows the NEXT_HOP is.  And then the process repeats.  The first router in the next autonomous system does the same thing.
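The two jobs of AS_PATH described above fit in a few lines of Python.  This is a rough sketch, not the real best-path algorithm: actual BGP compares several other attributes around AS_PATH length, and the AS numbers, addresses, and function names here are invented for illustration.

```python
LOCAL_AS = 64512  # hypothetical private AS number

def accept(as_path, local_as=LOCAL_AS):
    """Loop prevention: reject an update whose AS_PATH already contains our AS."""
    return local_as not in as_path

def best_by_as_path(candidates):
    """Among loop-free candidates, prefer the shortest AS_PATH."""
    usable = [c for c in candidates if accept(c["as_path"])]
    return min(usable, key=lambda c: len(c["as_path"]), default=None)

# Three hypothetical updates for the same prefix.
updates = [
    {"next_hop": "192.0.2.1", "as_path": [64496, 64497, 64498]},
    {"next_hop": "192.0.2.2", "as_path": [64499, 64498]},
    {"next_hop": "192.0.2.3", "as_path": [64500, 64512, 64498]},  # loops through us
]

best = best_by_as_path(updates)
print(best["next_hop"])  # the two-hop path wins; the looped one is never considered
```

Notice that the result is still just a NEXT_HOP.  The AS_PATH helped choose it, but getting traffic there is the IGP's problem.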

So BGP messages are actually routed.  They use the well-known TCP port 179.  This means IP connectivity must be established before messages can be sent and received–ultimately resulting in the establishment of a peering.  This is why, in the absence of an IGP, you would need to manually configure a full mesh of BGP peerings across every link in your network.  BGP has to have that IP connectivity.  These messages get routed across the IGP network to their destination.  TCP sessions are established by making use of the IGP.  In this manner, BGP is actually an application of TCP.
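To make the "application of TCP" point concrete, here is a small Python sketch that builds and parses a BGP KEEPALIVE, the smallest BGP message.  The header layout (16-byte all-ones marker, 2-byte length, 1-byte type) follows my reading of RFC 4271; the function names are my own, and the sketch never opens a socket.  In real life these bytes would ride a TCP session to port 179 on the peer.

```python
import struct

BGP_PORT = 179          # well-known TCP port for BGP
MARKER = b"\xff" * 16   # header marker, all ones
KEEPALIVE = 4           # message type code for KEEPALIVE
HEADER_LEN = 19         # marker (16) + length (2) + type (1)

def build_keepalive():
    """A KEEPALIVE is only the header, so its total length is 19 bytes."""
    return MARKER + struct.pack("!HB", HEADER_LEN, KEEPALIVE)

def parse_header(data):
    """Return (length, message type) from the first 19 bytes of a message."""
    marker, length, msg_type = struct.unpack("!16sHB", data[:HEADER_LEN])
    if marker != MARKER:
        raise ValueError("not a BGP message: bad marker")
    return length, msg_type

message = build_keepalive()
print(len(message))           # size of the message in bytes
print(parse_header(message))  # (length, type)
```

Nothing in those bytes knows how to reach the peer.  That is entirely up to whatever IP connectivity already exists underneath.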

BGP’s Intent

With the above being said, however, what is the intent of BGP?  Why does it exist?  To establish inter-AS communication.  And how does it accomplish this?  By providing update messages containing routes.  Yes, the information contained in these messages is primitive at best, but it does contain the reachability information necessary to traverse multiple autonomous systems.  And that information is a prefix and a NEXT_HOP.  Oh, and that NEXT_HOP thing?  It’s going to rely on an IGP for recursive lookups to that NEXT_HOP in a lot of scenarios.  Strictly speaking, BGP doesn’t need the AS_PATH to convey this information.  That information does not persist in a given IP packet from point A to point Z (unless that packet happens to be a BGP packet).  The AS_PATH attribute is there to give us finer control over inter-domain routing.  The AS_PATH attribute exists to prevent loops.  It exists to provide some measure of ‘shortest path’ indication.

Wrapping Up

BGP is an amazing protocol. Extremely flexible, but incredibly stupid. It relies on everything else in the network to be right in order to function properly–more than most things. You can leave out a neighbor configuration for your IGP, but if you leave a router out of the iBGP configuration, it could be detrimental. It is both a routed and a routing protocol.

I hope this has been informative. This was a concept that I truly did not grasp when I started my new job a few months ago. It has only been since working with BGP on a daily basis and reading a few books on the matter (check my bookshelf) that I began to grasp this extremely important–and simple–concept.

As a final note, please find me and smack me across the head if I am wrong on any of the above points. I am still learning (always will be), but I feel pretty confident about the information presented here.

And the ‘Or…NONE LIKE IT HOT!’ thing was just because I was watching Futurama. It has no relevance to this article.

For the Love of Networking or How I Learned to Stop Worrying and Love the Bomb

People usually tell you to do what you love. What they may not tell you is that you probably shouldn’t do something unless you love it.

There are obviously exceptions to this. If you need the work and can’t get anything else, you have to do what you have to do. However, with IT, the rule of “do what you love” seems particularly harsh.

I realize more and more that, with IT in general, if you don’t love what you do, you won’t get very far. You’ll probably work at a Tier I help desk for the rest of your life. While someone has to do it (and while it can be an art itself), I think most people aspire to more. Unfortunately, if you don’t love it, you won’t get any further.

As I study for my JNCIS, I have realized more and more that if I didn’t really want this, there’s no way I could pass it honestly. Sure, I could use a brain dump (read here for why not to) and pass, but that wouldn’t get me very far. I would either bomb every interview or get lucky, get hired, and then get fired within 30 days as my employer realizes I cheated on the test.

This stuff isn’t extremely simple. It’s not overly difficult, but you’re going to hate it if you don’t crave it. And if you hate it, how far do you realistically expect to get?

If you love it, don’t worry. It will all come with perseverance and dedication. Just study, ask questions, and delve deeper and deeper.

BGP, OSPF, Spans, and Bounces – Working for a Service Provider

Disclaimers

First, I would like to say that in general, I like my job. Second, it should be mentioned that I am in a group that is often looked down upon. Third, I am good at my job.

It should also be mentioned that anything contained in this post is my own personal opinion and does not in any way reflect the views of the company I work for, my department, my bosses, my peers, or any other organizational unit or entity within my company.

Now that the disclaimer is out of the way…

Background

Let’s get something straight. I consider my job to be vital to the continued operations and profitability of the company I work for. I do not resolve issues on my own, but instead escalate them to the support group responsible for a particular piece of equipment. This is usually DNOC IP or DNOC ATM, although sometimes it can include our server group or applications support.

I watch Netcool. For those of you not familiar, it basically collects SNMP traps from all of the network-facing equipment in our company. By network-facing, I mean network backbone and service equipment. Customer equipment and building equipment – such as a Cisco 2960 switch – is not included. Netcool tells me when something bad happens. This could be as simple as a single IMA T1 going down or as bad as an entire market losing connectivity to the rest of the network.

I’ll say it again: I don’t resolve issues. However, that being said, I can tell you why an iBGP peer dropped and then re-established. And I can do it pretty quickly. I can generally tell you why anything that I receive an alarm for happens. I’ve been doing this for less than a year now, but I feel very confident in my abilities to narrow something down. This is where the problem comes in.

The Problem

If a routing protocol bounces for a few seconds, the ticket is generally held for 24 hours. If the alarm stays clear, the ticket is closed.

This is simple. Why must I send the ticket to another group for something so simple? I think I’m capable of periodically checking on an interface to see if it has remained stable. If we monitored these ourselves and escalated if necessary, we could save time for the support groups. And saving the time of people that get paid more than we do means saving the company money.
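Just how simple?  The whole rule fits in a few lines of Python.  This is my own sketch, not anything our ticketing system actually runs: the function name is made up, and treating a repeat bounce as the trigger for escalation is my assumption about what "if necessary" would mean.

```python
from datetime import datetime, timedelta

HOLD = timedelta(hours=24)  # hold period from the process described above

def ticket_action(bounce_times, now, hold=HOLD):
    """Decide what to do with a bounce ticket.

    bounce_times: non-empty list of datetimes, oldest first, one per bounce.
    """
    if len(bounce_times) > 1:
        return "escalate"  # assumption: any repeat bounce goes to the support group
    if now - bounce_times[-1] >= hold:
        return "close"     # stayed clear for the full hold period
    return "hold"          # still inside the 24-hour window

first_bounce = datetime(2024, 1, 1, 12, 0)
print(ticket_action([first_bounce], first_bounce + timedelta(hours=1)))
print(ticket_action([first_bounce], first_bounce + timedelta(hours=24)))
```

The only judgment call in there is the escalation rule, and that is exactly the part we already know how to make.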

When we receive traps, I can easily find a root or common cause. Not always, but often.

I hold a NOC position and title. Why, then, does it feel like I am not a member of the NOC?

The Solution

We need to raise the standard for our group. We need to be more consistent as a group and provide better information as a group. We need to increase our knowledge as a group. Everyone needs to be on the same page.

If we can accomplish those things, perception of the group should become a little more positive. Over time, it should get to the point of trust. At that time, we should be allowed to be integrated more fully into the NOC and better fulfill our entry-level position expectations.

Summary

Working for a service provider is not all that it’s cracked up to be. If I did not have such an excellent boss, I would have jumped ship long ago in search of greener pastures. No place is perfect, but not troubleshooting is a nightmare. Do not apply for a service provider position expecting sunshine and rainbows. You may find that it is either more than you bargained for or nowhere near as challenging as what you expected. I’m not sure if there is a happy balance anywhere in there. Unless, maybe, you want to be customer-facing.