Today I came across an interesting case where two out of three tunnels on one spoke (out of 50 spokes in total) were down. The third (working) tunnel was connected to the same two hubs, but via MPLS. I started by checking the usual suspects:
- Tunnel configuration. It was identical to all the other 40+ spokes.
- Crypto ISAKMP and IPsec policies. Again, identical.
- State of crypto: both ISAKMP and IPsec were up, and packets were being encrypted and decrypted.
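For reference, the crypto state can be checked on the spoke with the standard IOS show commands. This is only a sketch of the kind of checks described above, not the exact commands from that session:

```
! Phase 1: a healthy ISAKMP SA normally sits in the QM_IDLE state
show crypto isakmp sa
! Phase 2: compare the "pkts encaps" and "pkts decaps" counters over time
show crypto ipsec sa
```

If both counters keep incrementing, encryption is working in both directions and the problem is more likely in GRE or NHRP.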
So now it was up to GRE and NHRP to do their job of establishing DMVPN.
When I checked the state of DMVPN on the hub, it showed UP; the spoke, however, was showing its state as NHRP. What does that mean? After running an NHRP debug, I knew that the spoke wasn’t getting the resolution reply for the two bad tunnels.
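A minimal sketch of how this can be looked at on the spoke, assuming a standard IOS DMVPN setup (the exact output varies by release):

```
! The State column shows how far each tunnel has progressed (for example NHRP or UP)
show dmvpn
! Watch the NHRP exchange with the hubs; turn it off when done
debug nhrp
undebug all
```

Debugs can be noisy on a busy router, so it is worth keeping the session short.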
Finally, I saw that there were two NAT entries for GRE traffic. That was strange, because the traffic was sourced by the same host on the LAN and the destination in both cases was the DMVPN hub.
I looked at the NAT translation statement: it was a NAT overload (PAT) to the outside (public) interface.
As GRE doesn’t cooperate well with PAT, I decided to clear the entries, especially since the source host was Incomplete in the ARP table.
After clearing those entries from the NAT translation table, the tunnels came up.
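Roughly, the inspection and cleanup look like this. The host address 192.168.1.50 is a made-up placeholder, and the wildcard clear is just the bluntest option, not necessarily the one to use in production:

```
! List the translations and pick out the GRE ones
show ip nat translations | include gre
! Check whether the inside host actually resolves (placeholder address)
show ip arp 192.168.1.50
! Remove ALL dynamic translations; existing NAT sessions will be rebuilt
clear ip nat translation *
```

Clearing everything briefly disturbs other users behind the same NAT, so on a busy site a more targeted clear of just the offending entries may be preferable.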
What does it teach us? We should add a deny statement for GRE traffic to our NAT ACLs, and match only TCP and UDP in the NAT ACL permit statements rather than IP.
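A sketch of what such a NAT ACL could look like. The ACL name, the 192.168.1.0/24 inside subnet and the GigabitEthernet0/0 outside interface are all assumptions for illustration:

```
ip access-list extended NAT-ACL
 ! Never translate GRE
 deny   gre any any
 ! Translate only TCP and UDP from the inside subnet (placeholder range)
 permit tcp 192.168.1.0 0.0.0.255 any
 permit udp 192.168.1.0 0.0.0.255 any
!
ip nat inside source list NAT-ACL interface GigabitEthernet0/0 overload
```

One trade-off: with only TCP and UDP permitted, ICMP from inside hosts is no longer translated, so an extra permit line is needed if pings to the Internet should keep working.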
I’m wondering if the same issue will crash my DMVPN lab or not.