It’s been an amazing week. With some invaluable help from DiGi engineers, I investigated and tested a working VRRP backup pair of Cisco and DiGi routers that routes to the DC over VPN (GRE over IPsec). I set up a VPN hub (Cisco) with a branch spoke (Cisco), then connected a DiGi router to a branch switch and set up a VRRP pair between the branch Cisco router and the branch DiGi router. Finally, I set up a backup VPN tunnel to the VPN hub in the DC. The only workaround is that, because DiGi doesn’t do EIGRP, I had to set up an OSPF neighborship over the tunnel to the DiGi, and then on the hub redistribute the OSPF routes from the spokes into the main EIGRP process.
What does this mean?
- you can save quite a lot of money because a second Cisco router would cost much more than a DiGi does.
- DiGi does LTE much better than Cisco (it has two SIM slots for redundancy!)
- VRRP, together with the redundant tunnel on the DiGi, keeps you running 24×7 if the main Cisco fails
- you can run Python scripts on DiGi to execute configurations conditionally
I still need to do more learning because DiGi CLI commands are a bit… well… non-intuitive to say the least, but at least I can get it up and running. In the next update I will include a working configuration on DiGi and Cisco.
If you have a 3G or 4G interface and you set up a GRE tunnel out of this 4G interface, make sure to lower your ip mtu to 1376 or below (your router will tell you the correct value if you set it too high). Failure to do so will leave the tunnel in the DOWN state, even though IPsec negotiation is fine. I spent 30 minutes today looking at the config until I made the right decision to delete the tunnel interface and configure it line by line. When I entered “ip mtu 1400”, IOS gave me a warning that the MTU should be no higher than 1376.
The reason is that there are some protocols in the 4G network core that need to be included in your MTU calculation. Normally we are able to send 1460 bytes of TCP payload (plus 20 bytes of IP header and 20 bytes of TCP header, 1500 in total). However, when you have IPsec and GRE, a safe value is 1400. Then come the various 4G tunnel encapsulations, so 1376 seems to be just fine.
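The arithmetic behind these numbers can be sketched in a few lines of Python. This is only an illustration: the 20-byte IP and TCP header sizes come from the text above, the function names are made up for this example, and the 24-byte gap between 1400 and 1376 is simply what my router reported, not something derived from a 4G specification.

```python
IP_HEADER = 20   # bytes, IPv4 header without options
TCP_HEADER = 20  # bytes, TCP header without options


def tcp_mss(ip_mtu: int) -> int:
    """Largest TCP payload that fits in one packet of the given IP MTU."""
    return ip_mtu - IP_HEADER - TCP_HEADER


def overhead(outer_mtu: int, inner_mtu: int) -> int:
    """Bytes lost to encapsulation between two MTU values."""
    return outer_mtu - inner_mtu


print(tcp_mss(1500))         # plain Ethernet: 1460
print(tcp_mss(1400))         # the "safe" GRE over IPsec value: 1360
print(tcp_mss(1376))         # the value the router asked for on 4G: 1336
print(overhead(1400, 1376))  # extra bytes the 4G path needed here: 24
```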
Here’s one example of why you should isolate your test infrastructure from your office network.
Person A, responsible for the server infrastructure, enables link aggregation on the server uplink ports without telling the network admin. This causes the switch to wonder which port MAC address 0000.1111.2222 is actually on, because the server sends traffic on all 4 ports.
Oct 13 15:00:58: %SW_MATM-4-MACFLAP_NOTIF: Host 0000.1111.2222 in vlan 11 is flapping between port Gi0/25 and port Gi0/34
After a few hours, the switch gets tired and refuses incoming AAA calls because of how some lazy programmer built this function.
AAA unable to create UID for incoming calls due to insufficient processor memory.
That’s it, we’re cut off from our switch because we didn’t enable any security mechanism (alarms, memory preservation features etc.).
This example is due to a bug. Some smart programmer decided that macflaps should be within the same function as AAA, therefore memory taken by macflaps goes into the Auth Manager bucket. Therefore, the more mac flaps, the less memory is left for AAA. Brilliant.
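One cheap alarm would be to watch the syslog for MACFLAP messages before the switch runs out of memory. Here is a minimal sketch in Python, assuming the logs are collected somewhere as plain text. The regular expression only follows the single message shown above, and the function names and the threshold are hypothetical.

```python
import re
from collections import Counter

# Pattern based on the %SW_MATM-4-MACFLAP_NOTIF line quoted above.
MACFLAP_RE = re.compile(
    r"%SW_MATM-4-MACFLAP_NOTIF: Host (?P<mac>\S+) in vlan (?P<vlan>\d+) "
    r"is flapping between port (?P<port_a>\S+) and port (?P<port_b>\S+)"
)


def count_mac_flaps(lines):
    """Return a Counter mapping (mac, vlan) to the number of flap messages."""
    flaps = Counter()
    for line in lines:
        match = MACFLAP_RE.search(line)
        if match:
            flaps[(match["mac"], int(match["vlan"]))] += 1
    return flaps


def hosts_over_threshold(flaps, threshold):
    """Hosts whose flap count reached the (hypothetical) alert threshold."""
    return sorted(key for key, count in flaps.items() if count >= threshold)


sample = [
    "Oct 13 15:00:58: %SW_MATM-4-MACFLAP_NOTIF: Host 0000.1111.2222 in vlan 11 "
    "is flapping between port Gi0/25 and port Gi0/34",
    "placeholder for any unrelated log line",
]
flaps = count_mac_flaps(sample)
print(hosts_over_threshold(flaps, threshold=1))
```

What to do with the result (email, SNMP trap, shutting a port) is a separate decision. The point is only that the flapping was visible in the logs hours before we got locked out.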
Once in a while I’m asked to join a job interview as a network engineer. Unfortunately, I usually want to run out of the room as soon as the candidate starts answering my questions. Here are a few Do’s and Don’ts for network engineer wannabes.
Do:
- know at least something. Most of the questions for low-level openings are quite standard. You will be asked about spanning tree, about basic differences between EIGRP and OSPF, why RIP is no longer used. You might be asked to calculate a subnet for a given host. How long can it take to prepare? A week? A month? And even if it’s a bit more, what is a month compared to a lifelong career in networks (not to mention a good salary)?
- tell the interviewer why you’d enjoy the job you’re applying for. At least appear honest.
- tell the interviewer about your plans related to the job you’re applying for. It could be a course that you’ve been wanting to take, it could be a lab that you’re setting up at the moment
Don’t:
- say that knowledge is in books and it’s enough to know the table of contents (true story!!!) and besides you can look it up on the internet
- say that you picked IT because what you’d really like to do in life is not well-paid or too difficult (another true story)
- start talking about your private life when you’re asked about your worst defeat. The question is more about your job experience… It’s a bit awkward when you start to talk about how your sister has been mean to you and you can’t deal with that
- say Cisco certificates are a stupid thing to have
- say you have practical experience with networks because you once called your telecom and they came and fixed your router
- say “I’ve been asked the same question when I applied for a job at XXX” if you just failed to answer this particular question. It just goes to show that you’re not improving as a network engineer.
And most of all, please don’t share your idiosyncratic philosophies such as: “when I go to a job interview, I prefer not to do any research because I think it brings bad luck”.
Why not enroll on a course at www.humanity.pl ? We will prepare you for job interviews, Cisco certificates, and any network-related tasks that you will be told to do at work!
Today I learnt that you can easily make a network loop on a Dell server with an embedded switch, causing a network outage for the whole office.
The embedded switch has a few physical external ports and a lot of logical internal backplane ports. If you use port mirroring on an external port but also send the mirrored frames out of the external port, the copied (mirrored) frames will multiply like crazy, because a frame that wants to go out does go out but is first copied. Its copy also wants to go out and is first copied, and so on. One frame quickly becomes a billion frames that go out to your LAN.
Another example of how you can cause a loop by using SPAN is by ”faking” RSPAN:
As you might remember, last month before the demo I had a weird problem where my Gembird power strip application was malfunctioning. In short, the power strip connects to the Internet, and on my cell phone I configured an application that lets me remotely switch the sockets on the power strip on and off. The problem was that for some reason I could only connect to the power strip when I was on the same WLAN. Otherwise, about 30 minutes after I left home, the application said that the device was offline. Until today I thought that it was a problem with the application or a faulty unit. I have 4 of those power strips, so I swapped the faulty one and configured another Gembird. I disconnected from my wifi and used normal Internet access on my phone. Sure enough, the app died after an hour. But I noticed a weird thing: the app started working again as soon as I connected to the same wifi that the power strip was on! So I figured it wasn’t a problem with the connection between the power strip and the cloud server, but rather between my phone and the cloud server! It appears that the app connects to TCP port 5000 of the Gembird cloud server, which I guess is blocked by the ISP. However, if I use the wifi at home to run the Gembird app, the status is OK.
I wouldn’t have been able to figure this out if it weren’t for the problem I had at work recently, where Meraki site-to-site VPN (their “punching UDP holes” VPN where nobody has a public IP address) would only work on the landline Internet, but would fail when operating in 4G mode.
Internet access offered by mobile operators is either not full (they only proxy some ports or simply block certain ports) or broken due to the way they do NAT. And I figure that IPv6 will solve all such problems.
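A quick way to confirm this kind of suspicion is to try the TCP connection yourself from both networks. Below is a minimal sketch using only the Python standard library. The function name is my own, and I am not giving the real server name, so the host is up to you to fill in.

```python
import socket


def tcp_port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection performs the full TCP handshake or raises OSError
        # (refused, unreachable, DNS failure, or timeout).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling it with the cloud server name and port 5000, once on the home wifi and once on mobile data, should show the difference if the port really is filtered. A False result only tells you that the handshake did not complete. It does not tell you who dropped it.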
- Get a cheap ICND1 Cisco course (Chris Bryant’s courses at udemy.com rock and they can be dirt cheap if there’s a good deal on). The current URL is https://www.udemy.com/ccna-on-demand-video-boot-camp/
- Get a book with Cisco labs, e.g. https://www.amazon.com/101-Labs-Cisco-CCNA-Exam/dp/0955781523
- Get 4 cheap routers (2801 or 1841) and RJ45 cables on eBay. A good price is $50 apiece
- Get 3 cheap switches (3550 or 3560) on eBay. A good price is $40 for a 3550 or $60-70 for a 3560
- Get a Cisco console cable and a DB9-to-USB adapter on eBay
- Start watching the ICND1 course. Try to practise everything you see there on your equipment.
- After you finish watching the course, do the labs from the book.
- Take the 100-105 Cisco exam (go to the Pearson VUE Cisco website and book the exam at your nearest exam center). This step is optional, but Cisco certificates can guarantee at least a job interview.
- Congrats! Now get an ICND2 book and repeat the process.
This will set you back around $700 ($150 for the vids and books, $400 for the gear and $150 for the exam), or even less if you get a good bargain on eBay/Allegro.
Alternatively, book an ICND1 course at www.humanity.pl. This ensures that:
- you get 80 hours of learning, this includes lectures and workshops. Classes are delivered by an experienced network engineer
- you get 60+ hours of access to our equipment when you go home after the classes
- it’s less expensive than buying the equipment and books
- you know exactly what to do at each step
- if you come across a difficult problem during or after the course, you can ask the trainer (we provide a free 3-month post-course email troubleshooting service)
- you are thoroughly prepared for the exam
What do I mean by “weird problems”? Some websites don’t load fully, you can access network shares but cannot actually open the files, your TeamViewer sessions are suddenly disconnected etc. Mind you, lowering the MTU on end hosts is not really a good solution because then you need to do it on all your end hosts. Typically, you will lower the MTU on your router, but what if you cannot access your router or the router doesn’t have the option to change the MTU size?
You typically have this problem on a PPPoE connection, if the MTU on the router is set to 1500 instead of 1492. However, it seems that UPC customers have this problem too, and lowering the MTU on the machines helps there as well, because it makes more space for any headers and administrative overhead that are necessary to make protocols work.
Now why does lowering the MTU size help? Because end hosts can fragment a large segment of data better than intermediary routers. Some routers will just do a bad job of fragmenting/assembling the packet. But if the end host sends packets that are small enough not to be fragmented anywhere on the path from host A to host B, nobody needs to do any fragmentation at all.
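To find out what “small enough” is on a given path, people usually ping with the don’t-fragment bit set and shrink the payload until it goes through. The numbers can be sketched like this in Python. The helper names are made up for this example, and the header sizes are the standard ones for IPv4 without options, ICMP echo and PPPoE.

```python
ETHERNET_MTU = 1500
PPPOE_OVERHEAD = 8  # 6-byte PPPoE header + 2-byte PPP protocol field
IP_HEADER = 20      # IPv4 header without options
ICMP_HEADER = 8     # ICMP echo header


def pppoe_mtu(link_mtu: int = ETHERNET_MTU) -> int:
    """MTU left for IP packets on a PPPoE link over the given Ethernet MTU."""
    return link_mtu - PPPOE_OVERHEAD


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def needs_fragmentation(packet_size: int, path_mtu: int) -> bool:
    """True if a packet of this size cannot cross the path in one piece."""
    return packet_size > path_mtu


print(pppoe_mtu())                      # 1492
print(max_ping_payload(1500))           # 1472
print(max_ping_payload(pppoe_mtu()))    # 1464
print(needs_fragmentation(1500, 1492))  # True
```

So on a PPPoE line a ping with a 1464-byte payload and the DF bit set (`ping -f -l 1464` on Windows, `ping -M do -s 1464` on Linux) should pass, while 1472 should not.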
Unbelievably, there is a bug on Nexus 7000 that can prevent you from applying any ACLs to the interface. The result is that the following command gives TCAM allocation failures.
ip access-group MYACL in
The solution is to use the following command:
hardware access-list resource feature bank-mapping
I was quite proud of the fact that I found the solution in 20 minutes.
I’ve found a number of times that if you have a crypto map applied to an interface, changes made to the transform set referenced by that crypto map do not take effect instantaneously. For example, say you have this crypto map:
crypto map MYMAP 10 ipsec-isakmp
 set peer 184.108.40.206
 set security-association lifetime kilobytes 4608000
 set security-association lifetime seconds 3600
 set transform-set MYSET
 set pfs group5
 match address MY_INTERESTING_TRAFFIC
Now if you change MYSET, your router may still send out the old MYSET. Possible solutions:
- be patient, wait 15 minutes
- clear the crypto map, apply it again
- shut and unshut the interface
So if you’re troubleshooting a broken S2S VPN, make one change, wait 15 minutes, and check. Only if it’s still not working THEN make another change.
Today I spent 80 minutes in a troubleshooting session with an engineer from the remote end, trying one thing after another, and nothing worked. We ended the session and set up another one for the next day. I reverted the config to the original settings and made 1 change that should be OK (but it still wasn’t OK!), went shopping, came back, and it just worked!