Clearing Up Some Misinformation RE: eBGP Multihop and TTL

Myth: You have to set the TTL to 2 because it is decremented on the way to the loopback.

Years and years ago I was trying to learn more about BGP and I was reading some book with a chapter on the topic.  Back then I pretty much believed that if it made it into a book it must be true and my knowledge had to be in error.  🙂  So to say I was confused, when I read what I did, would be an understatement.

Why?  Well, because what they said was that the reason one must set the TTL to 2 for eBGP peers that are directly connected but peering with their loopbacks is that “the TTL gets decremented on the way to the loopback.”

This is not true.  Let’s go play in the lab and I’ll explain.  I find pictures and seeing the flow helps it all sink in.

bgp_ttl_0-100274774-orig

In the picture above we have 3 Routers in 3 different BGP Autonomous Systems.

R1 and R2 BGP Peering via Subnet 10.1.2.0/24

As you may already know, if we peer R1 and R2 together using the directly connected subnet (10.1.2.0) that connects them together…. the eBGP (which has a default TTL of 1) will come up with no playing or tweaking of the TTL.

For grins and giggles… let me show you something else.  Let’s BGP Peer between R1 and R3 and have them use their Loopbacks for the BGP session.

R1 and R3 BGP Peering via Loopbacks

Before we begin… yes… the routing connectivity between R1’s loopback and R3’s loopback is already set up via statics.  So let’s move to the next step.

The default TTL for an eBGP session is a TTL of 1.  A TTL of 1 isn’t going to make it between R1 and R3.  So what TTL to use then?

What if I told you that I can eBGP peer between R1 and R3 with a TTL of 2?

bgp_ttl_1-100274775-orig

Don’t take my word for it.  Let’s check R1 and see if it actually has an established BGP session and let’s look at those configs.

router_1_configs-100274776-orig

So as we can see, R1 and R3 can indeed eBGP peer loopback to loopback with a TTL of 2 and with R2 in the middle!

So if the TTL actually were getting additionally decremented going to the loopback… then how exactly can R1 and R3 peer with each other AND also peer THRU another router?

Taking a Step Back

Let’s take one giant step backwards and look at this not from a BGP perspective.  Let’s just say we had 3 routers and a PC connected to Router 1 as in the diagram below.

bgp_ttl_2-100274777-orig

Let’s have the PC ping all 3 of the Loopbacks while setting TTL.  Wanna take bets?  I’m betting

  • TTL of 1 is sufficient to reach R1’s loopback address
  • TTL of 2 is sufficient to reach R2’s loopback address
  • TTL of 3 is sufficient to reach R3’s loopback address

Any takers?

As you can see below… all those pings were successful.

bgp_ttl_ping
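If you would rather see that bet as arithmetic, here is a minimal Python sketch of the idea. It is a toy model, not router code: each router that has to forward the packet knocks the TTL down by one, and the router that owns the destination address (loopback included) does not. The hop counts assume the lab topology above (PC behind R1, then R2, then R3), and the names `survives` and `HOPS_FROM_PC` are made up for illustration.

```python
def survives(ttl: int, forwarding_routers: int) -> bool:
    """Return True if a packet sent with `ttl` is still alive after
    being forwarded by `forwarding_routers` routers."""
    for _ in range(forwarding_routers):
        ttl -= 1          # a forwarding router decrements the TTL...
        if ttl <= 0:
            return False  # ...and drops the packet if it hits zero
    return ttl >= 1       # the destination router itself does not decrement


# Routers that must FORWARD a packet from the PC to each loopback
# (assumed from the diagram: PC - R1 - R2 - R3).
HOPS_FROM_PC = {"R1 loopback": 0, "R2 loopback": 1, "R3 loopback": 2}

for target, hops in HOPS_FROM_PC.items():
    needed = next(t for t in range(1, 256) if survives(t, hops))
    print(f"{target}: minimum TTL {needed}")
```

Under this model the minimum TTLs come out as 1, 2 and 3, which is exactly what the pings showed. Nothing extra is spent "on the way to the loopback."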

Getting more confusing eh?

Since R1 and R2 only need a TTL of 1 to get between their respective loopbacks, why do we “need” to set eBGP multihop to 2 for R1 and R2 for eBGP to work?

The truth is we actually don’t “need” to.

Off to Read Documentation

Let’s do a quick google search for neighbor ebgp-multihop.

bgp_extra-100274779-orig

neigh_multihop_0-100274780-orig

So this command says that it will help connect eBGP peers “residing on networks that are not directly connected.”

Question: Is R1’s loopback directly connected to R2’s loopback?

Answer: No.

By Default, Only Directly Connected Neighbors are Allowed

The documentation says that, by default, without this command (ebgp-multihop), for eBGP “only directly connected neighbors are allowed.”

If the default behavior is that only directly connected neighbors are “allowed,” this would mean that some type of check happens that realizes that R1’s loopback and R2’s loopback are not directly connected to each other and the attempted eBGP connection must, then, by default fail.
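To make "some type of check" concrete, here is a rough Python sketch of what a directly-connected test could look like. This is a guess at the logic for illustration, not how IOS actually implements it. Only the 10.1.2.0/24 subnet comes from the lab; the host addresses (10.1.2.2 for R2's interface, 2.2.2.2 for R2's loopback) and the function name are assumptions.

```python
import ipaddress
from typing import Iterable


def is_directly_connected(peer_ip: str, connected_subnets: Iterable[str]) -> bool:
    """True if `peer_ip` falls inside one of our connected subnets."""
    peer = ipaddress.ip_address(peer_ip)
    return any(peer in ipaddress.ip_network(net) for net in connected_subnets)


# R1's connected subnet toward R2 (from the lab).
R1_CONNECTED = ["10.1.2.0/24"]

# Hypothetical addresses: R2's interface on the shared subnet, and R2's loopback.
print(is_directly_connected("10.1.2.2", R1_CONNECTED))
print(is_directly_connected("2.2.2.2", R1_CONNECTED))
```

The interface address passes the test and the loopback does not, even though both live on the very same router one cable away.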

So if it isn’t TTL that “fails,” then what?

Well, the documentation above basically seems to indicate that the default behavior is to check whether the neighbors are directly connected.  For our eBGP peering between R1 and R3 back in the beginning, we knew two things:

  1. A TTL of 2 would be needed to ping R3 from R1
  2. A TTL of 2 would be needed to eBGP peer between R1 and R3

If I NEED to Set TTL of 2 for R1 and R3 to BGP Peer or Even Ping, Can They Be Directly Connected?

So riddle me this.

Question: If I need a TTL of 2 for successful eBGP peering between R1 and R3 whether via loopback or physical then can it even be in the realm of possibility that they are directly connected?

Answer: No.

If I literally need to change the TTL from the eBGP default of 1 to a TTL of 2 for the two IP addresses to even reach each other, then they must not be directly connected.

Underlying Code MUST be Doing Something Additional When we Set TTL of 2

Therefore — when I configure ebgp-multihop to 2 —  the underlying code must disable the code that does the checking to see if they are directly connected.  Right?  Of course right!

We don’t “need” a TTL of 2 to eBGP peer between R1’s loopback and R2’s loopback, we just need to disable the directly connected little test that it does by default when the TTL is set to 1.

What if I could just do that?  Leave the TTL at 1 but disable the code that, by default, checks whether the eBGP peers are directly connected?

“neighbor disable-connected-check”

This command has really been around for quite some time now.

bgp_disable_docs-100274781-orig

Disable Connected Check Into Action

And now we peer R1’s loopback with R2’s loopback with “disable-connected-check” on both routers and – VOILA!

bgp_disable-100274782-orig
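Putting the whole lab together, here is a small Python sketch of the behavior we reasoned our way to. Treat it as a toy model under stated assumptions, not a description of the IOS code: the connected check is skipped when the multihop TTL is greater than 1 or when disable-connected-check is set, and the TTL still has to be big enough to survive every router that forwards the packet. It only looks at one side of the session. The loopback addresses 2.2.2.2 and 3.3.3.3, the 10.1.2.2 interface address, and all the names are hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Neighbor:
    peer_ip: str
    forwarding_routers: int               # routers that must forward to reach the peer
    ebgp_multihop: Optional[int] = None   # None = not configured (TTL stays 1)
    disable_connected_check: bool = False


def session_can_come_up(n: Neighbor, connected_subnets: Iterable[str]) -> bool:
    ttl = n.ebgp_multihop if n.ebgp_multihop is not None else 1
    # Assumed rule: the connected check only runs for single-hop (TTL 1)
    # sessions that have not disabled it.
    check_skipped = n.disable_connected_check or ttl > 1
    if not check_skipped:
        peer = ipaddress.ip_address(n.peer_ip)
        if not any(peer in ipaddress.ip_network(s) for s in connected_subnets):
            return False
    # Each forwarding router burns one TTL; at least 1 must be left on arrival.
    return ttl > n.forwarding_routers


R1_CONNECTED = ["10.1.2.0/24"]
cases = {
    "R2 physical, defaults": Neighbor("10.1.2.2", 0),
    "R2 loopback, defaults": Neighbor("2.2.2.2", 0),
    "R2 loopback, ebgp-multihop 2": Neighbor("2.2.2.2", 0, ebgp_multihop=2),
    "R2 loopback, disable-connected-check": Neighbor("2.2.2.2", 0, disable_connected_check=True),
    "R3 loopback, ebgp-multihop 2": Neighbor("3.3.3.3", 1, ebgp_multihop=2),
    "R3 loopback, disable-connected-check": Neighbor("3.3.3.3", 1, disable_connected_check=True),
}
for name, nbr in cases.items():
    print(f"{name}: {session_can_come_up(nbr, R1_CONNECTED)}")
```

The last case is the interesting one: disabling the check does not help R1 reach R3, because a TTL of 1 really cannot get past R2. For R1 and R2, though, the TTL was never the problem.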

Hope you had fun playing in the lab!  🙂



Categories: BGP, Routing


7 replies

  1. Hmm. I knew about the TTL issue already – most likely from reading your original post – but the documentation from Cisco does not help much here does it?

    >A BGP routing process will verify the connection of single-hop eBGP peering session (TTL=254)
    >to determine if the eBGP peer is directly connected to the same network segment by default.

    Why the reference to a TTL of 254 in the context of a single-hop eBGP peering session? That’s nonsensical; surely it should read TTL=1?

    >This command is required only when the neighbor ebgp-multihop command is configured
    >with a TTL value of 1.

    What this suggests is that when you configure ebgp-multihop 1 (which is a valid value per the documentation), IOS still performs the “connected check”, and thus to make it work you will need the neighbor disable-connected-check command in addition. This is backed up by the example configurations in the documentation. Your own example though does NOT have ebgp-multihop 1, yet the disable-connected-check command is apparently sufficient to bring up the BGP peer relationship.

    The documentation also does not explain why ebgp-multihop 1 would not automatically disable the connected check since it’s not much use without it. Wouldn’t that have been simpler than adding an entire new additional command that seems to overlap in purpose with the original?

    It’s no wonder nobody can make sense of the logic around eBGP multihop TTLs! And finally just some really ambiguous wording:

    >The neighbor update-source command must be configured to allow the BGP routing
    >process to use the loopback interface for the peering session.

    I think they’re trying to say that on at least one side you’ll need to define an update-source as a loopback; however, the statement could be interpreted to mean that you “must” configure update-source loopbackX on both sides in order to use the disable-connected-check command, which is only half true. *shudders* This is why technical documentation is so darned hard to write and edit.

    Looking forward to the inevitable corrections and explanations where I have misread; bring it on!

    Thanks for a great article.

  2. Thanks for the great article!
    However, I have a question in the following statement.

    “What if I could just do that? Leave the TTL to 2 but disable that code that, by default, checks if the eBGP peers are directly connected?”

    In this statement, wouldn’t the following be a tighter fit ‘Leave the TTL to 1’?

    I mean, based on the article – can’t we peer R1 and R2 using loopbacks with a TTL of 1, given that we disable the connected network check? Thanks.
