Clearing Up Some Misinformation RE: eBGP Multihop and TTL

Myth: You have to set ttl to 2 because it is decremented on the way to the loopback.

**This blog is a formatting cleanup and update of a blog I originally posted in 2013 on NetworkWorld.

Years and years ago I was trying to learn more about BGP and I was reading some book with a chapter on the topic.  Back then I pretty much believed that if it made it into a book it must be true and my knowledge had to be in error.  🙂  So to say I was confused back then would be an understatement.

Why? Well ya see… they basically said that the reason one must set the TTL to 2 for eBGP peers that are directly connected, but peering with their loopbacks, was because “the TTL gets decremented on the way to the loopback.”

When I try to help someone deprogram this brainwashing, I find pictures help.  So for those who’d like to get deprogrammed and learn the truth… Let’s go play in the lab!!!

bgp_ttl_0-100274774-orig

In the picture above we have 3 routers in 3 different BGP ASes.  We all probably know that if we peer R1 and R2 using the addresses on their directly connected subnet (10.1.2.0), the eBGP session (which has a default TTL of 1) will come up with no playing or tweaking of the TTL.  But let’s try to peer with the loopbacks.

Through the use of some quick static routes, the routers have full connectivity to each other’s loopback addresses.

  • R1 is in BGP AS #1
  • R2 is in BGP AS #2
  • R3 is in BGP AS #3

Actually… you know what?  🙂  Instead of peering R1:R2 loopback to loopback… let’s actually try to eBGP peer R1’s loopback to R3’s loopback.

R1 and R3 eBGP Peer with a TTL of 2

What if I told you that I can eBGP peer between R1 and R3 with a TTL of 2?

bgp_ttl_1-100274775-orig

Don’t take my word for it.  Let’s check R1 and see if it actually has an established BGP session and let’s look at those configs.

router_1_configs-100274776-orig

So as we can see, R1 and R3 can indeed eBGP peer loopback to loopback with a ttl of 2 and with R2 in the middle!

Taking a Step Back

Let’s take one giant step backwards and look at this not from a BGP perspective.  Let’s just say we had 3 routers and a PC connected to Router 1 as in the diagram below.

bgp_ttl_2-100274777-orig

Let’s have the PC ping all 3 of the loopbacks while setting the TTL on each ping.  Wanna take bets?  I’m betting:

  • TTL of 1 is sufficient to reach R1’s loopback address
  • TTL of 2 is sufficient to reach R2’s loopback address
  • TTL of 3 is sufficient to reach R3’s loopback address

Any takers?

As you can see below… all those pings were successful.

bgp_ttl_ping
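If you don’t have a lab handy, here is a minimal Python sketch of the rule those pings demonstrate: only a router that forwards a packet decrements its TTL, and the router that owns the destination address (loopback or physical) just accepts it. The function name `delivered` and the way a path is written as a list of node names are my own illustration, not anything from a router’s code.

```python
def delivered(path, ttl):
    """Return True if a packet sent with this TTL reaches the last node in path.

    path lists the sender first and the destination last. Every node in
    between is a router that has to forward the packet.
    """
    for _router in path[1:-1]:
        # A forwarding router decrements the TTL and drops the packet
        # if nothing is left.
        ttl -= 1
        if ttl <= 0:
            return False
    # The destination router owns the address (loopback or physical), so it
    # accepts the packet without decrementing the TTL again.
    return ttl >= 1


if __name__ == "__main__":
    bets = [
        (["PC", "R1"], 1),              # PC to R1's loopback
        (["PC", "R1", "R2"], 2),        # PC to R2's loopback
        (["PC", "R1", "R2", "R3"], 3),  # PC to R3's loopback
        (["R1", "R2", "R3"], 2),        # R1 to R3's loopback
        (["R1", "R2"], 1),              # R1 to R2's loopback
    ]
    for path, ttl in bets:
        print(" -> ".join(path), "with TTL", ttl, ":", delivered(path, ttl))
```

Notice there is no special case for loopbacks anywhere in that function. Nothing gets decremented “on the way to the loopback.”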

So Then Why?

Since R1 and R2 only need a TTL of 1 to get between their respective loopbacks,  why do we “need” to set eBGP multihop to 2 for R1 and R2 for eBGP to work?

The truth is we actually don’t “need” to.

Off to Read Documentation

Let’s do a quick google search for neighbor ebgp-multihop.

bgp_extra-100274779-orig

neigh_multihop_0-100274780-orig

So this command says that it will help connect eBGP peers “residing on networks that are not directly connected.”  So, is R1’s loopback directly connected to R2’s loopback?  Nope!

The above says that by default, without this command (ebgp-multihop), for eBGP “only directly connected neighbors are allowed.”  If the default behavior is that only directly connected neighbors are “allowed,” this would mean that some type of check happens that realizes that R1’s loopback and R2’s loopback are not directly connected to each other, and the attempted eBGP connection must then, by default, fail.

So if it isn’t TTL that “fails,” then what?

Well, the documentation above basically seems to indicate that the default behavior is to check to see if the neighbors are directly connected.  For our eBGP peering between R1 and R3 back in the beginning, we knew two things:

  1. A TTL of 2 would be needed to ping R3 from R1
  2. A TTL of 2 would be needed to eBGP peer between R1 and R3

So riddle me this.  If I need a TTL of 2 for successful eBGP peering between R1 and R3 whether via loopback or physical then can it even be in the realm of possibility that they are directly connected?  No.  If I literally need to change the TTL from the eBGP default of 1 to a TTL of 2 for the two IP addresses to even reach each other, then they must not be directly connected.

Therefore — when I configure ebgp-multihop to 2 —  the underlying code must disable the code that does the checking to see if they are directly connected.  Right?  Of course right!

We don’t “need” a TTL of 2 to eBGP peer between R1’s loopback and R2’s loopback, we just need to disable the directly connected little test that it does by default when the TTL is set to 1.

What if I could just do that?  Leave the TTL at 1 but disable that code that, by default, checks if the eBGP peers are directly connected?

“neighbor disable-connected-check”

This command has really been around for quite some time now.

bgp_disable_docs-100274781-orig
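To put the two knobs side by side, here is a toy Python model of the reasoning in this post. It is only a sketch of the behavior as I have described it, not the actual IOS logic, and the function and parameter names are hypothetical. It assumes two rules: the TTL has to survive every router that forwards between the peers, and the connected check only runs when the TTL is 1 and has not been disabled.

```python
def ebgp_session_comes_up(directly_connected, routers_in_between,
                          ttl=1, disable_connected_check=False):
    """Toy model: can these two eBGP peers establish a session?

    directly_connected: True if the two peering addresses share a subnet.
    routers_in_between: routers that must forward packets between the peers.
    ttl: 1 is the eBGP default; a higher value models ebgp-multihop.
    disable_connected_check: models neighbor disable-connected-check.
    """
    # Rule 1: the packet has to survive every router that forwards it.
    if ttl < routers_in_between + 1:
        return False
    # Rule 2: with a TTL of 1, the connected check runs unless it is disabled.
    if ttl == 1 and not disable_connected_check:
        return directly_connected
    return True


if __name__ == "__main__":
    # R1 and R2 on their shared subnet, all defaults.
    print(ebgp_session_comes_up(True, 0))
    # R1 and R2 loopback to loopback, all defaults.
    print(ebgp_session_comes_up(False, 0))
    # R1 and R2 loopback to loopback with a multihop TTL of 2.
    print(ebgp_session_comes_up(False, 0, ttl=2))
    # R1 and R2 loopback to loopback, TTL left at 1, connected check disabled.
    print(ebgp_session_comes_up(False, 0, disable_connected_check=True))
    # R1 and R3 loopback to loopback through R2 with a TTL of 2.
    print(ebgp_session_comes_up(False, 1, ttl=2))
```

In this model, raising the TTL to 2 between R1 and R2 only helps because it skips the connected check. The TTL itself was never the problem.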

Disable-connected-Check Into Action

And now we peer R1’s loopback with R2’s loopback with “disable-connected-check” on both routers and – VOILA!

bgp_disable-100274782-orig

Hope you had fun playing in the lab!  🙂

  • Hmm. I knew about the TTL issue already – most likely from reading your original post – but the documentation from Cisco does not help much here does it?

    >A BGP routing process will verify the connection of single-hop eBGP peering session (TTL=254)
    >to determine if the eBGP peer is directly connected to the same network segment by default.

    Why the reference to a TTL of 254 in the context of a single-hop eBGP peering session? That’s nonsensical; surely it should read TTL=1?

    >This command is required only when the neighbor ebgp-multihop command is configured
    >with a TTL value of 1.

    What this suggests is that when you configure ebgp-multihop 1 (which is a valid value per the documentation), IOS still performs the “connected check”, and thus to make it work you will need the neighbor disable-connected-check command in addition. This is backed up by the example configurations in the documentation. Your own example though does NOT have ebgp-multihop 1, yet the disable-connected-check command is apparently sufficient to bring up the BGP peer relationship.

    The documentation also does not explain why ebgp-multihop 1 would not automatically disable the connected check since it’s not much use without it. Wouldn’t that have been simpler than adding an entire new additional command that seems to overlap in purpose with the original?

    It’s no wonder nobody can make sense of the logic around eBGP multihop TTLs! And finally just some really ambiguous wording:

    >The neighbor update-source command must be configured to allow the BGP routing
    >process to use the loopback interface for the peering session.

    I think they’re trying to say that on at least one side you’ll need to define an update-source as a loopback; however, the statement could be interpreted to mean that you “must” configure update-source loopbackX on both sides in order to use the disable-connected-check command, which is only half true. *shudders* This is why technical documentation is so darned hard to write and edit.

    Looking forward to the inevitable corrections and explanations where I have misread; bring it on!

    Thanks for a great article.

    • Denise “Fish” Fishburne

      LOL. I just linked to those technical documents… I didn’t write them. 🙂
