vMotioned VMs dropping off the network

When a server is vMotioned to another blade chassis the server can connect to other devices within the EPG but not outside the EPG.

This was occurring on both Linux and Windows servers.

The quick and easy fix is to bounce the network interface on the Linux servers.  On Windows servers this did not always fix the problem.

What is really happening is that the endpoint location is not being updated correctly in the COOP table on the spines.  And get this: it’s a known bug with no fix at the moment.  https://bst.cloudapps.cisco.com/bugsearch/bug/CSCva72341/?reffering_site=dumpcr

So how do you fix it inside the fabric?

On your border leaves, run the following command on both of them as close to the same time as possible.

leaf1# bash

leaf1# clear system internal epm endpoint key vrf YOURVRFHER:VRFNAME ip IPADDRESS

To verify that the VPC leaf is actually passing the traffic correctly use the following steps:

Run the following ELAM on the two leaves that the device is connected to, to see whether ARP packets are coming in and whether the “status” shows as triggered. You have to do it on both leaves at the same time because the device is connected through a vPC.

1. vsh_lc

2. debug platform internal ns elam asic 0

3. trigger reset

4. trigger init ingress in-select 3 out-select 0

5. set outer l2 dst_mac ffff.ffff.ffff src_mac YOUR DEVICE MAC ADDRESS HERE

6. start

7. status <-- shows whether it triggered or stays as Armed (Armed means no traffic has matched what was defined in step 5)

8. report | egrep "ce_|ar_"
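Since the same steps have to be pasted on both vPC leaves at nearly the same time, it can help to generate them ahead of time. Below is a minimal Python sketch that fills the source MAC into steps 1 to 8 above. The helper names (to_cisco_mac, elam_arp_commands) are my own for illustration and not part of any Cisco tool; the script only builds the text and does not connect to a switch.

```python
import re
from typing import List


def to_cisco_mac(mac: str) -> str:
    """Convert a MAC such as 00:50:56:d0:c3:a0 to dotted form 0050.56d0.c3a0."""
    # Drop separators (colons, dashes, dots) and keep only the hex digits.
    digits = re.sub(r"[^0-9a-fA-F]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))


def elam_arp_commands(src_mac: str) -> List[str]:
    """Return the eight ELAM steps from the post with the source MAC filled in."""
    return [
        "vsh_lc",
        "debug platform internal ns elam asic 0",
        "trigger reset",
        "trigger init ingress in-select 3 out-select 0",
        f"set outer l2 dst_mac ffff.ffff.ffff src_mac {to_cisco_mac(src_mac)}",
        "start",
        "status",
        'report | egrep "ce_|ar_"',
    ]


if __name__ == "__main__":
    # Example MAC only; substitute the MAC of the device you are chasing.
    for line in elam_arp_commands("00:50:56:d0:c3:a0"):
        print(line)
```

Paste the output into a session on each of the two leaves, then compare what "status" shows on both.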

EPG learning disabled

If you are getting the 1197 errors in your fabric, then the ACI fabric has disabled learning on one or more EPGs.

In my case it was caused by MAC flapping from VMware. With the DVS health check enabled (which it is by default), the DVS spams the fabric on each VLAN with the same MAC address. This causes the fabric to disable learning to protect itself.

The VMware KB on it is:
https://kb.vmware.com/selfservice/search.do?cmd=displayKC&docType=kc&docTypeID=DT_KB_1_1&externalId=2034795

In my case the trace had the following characteristics:

All of the non-broadcast protocol 0x8922 packets from Src: Vmware_d0:c3:a0 to Dst: Vmware_d0:07:f8 came in on encapsulating vlan-1510
The VMware broadcast 0x8922 packets were sent in untagged from Src: Vmware_d0:c3:a0 to Dst: Vmware_d0:07:f8
Then there were random VMware MAC addresses trying to reach 2 specific VMware MAC addresses, each paired with the address 8 higher (00:50:56:d0:44:40 with 00:50:56:d0:44:48, and 00:50:56:d0:83:f0 with 00:50:56:d0:83:f8), using protocol 0x8922
10.7.20.11 and 10.7.20.12 were multicasting to 224.0.0.222 [unassigned multicast] and were playing Distributed Interactive Simulation (DIS) which is an IEEE standard for conducting real-time platform-level wargaming across multiple host computers and is used worldwide, especially by military organizations but also by other agencies such as those involved in space exploration and medicine.

A bunch of 0x8922 packets were being broadcast from source Vmware_d0:27:e8 across VLANs 49, 55, 57, 59, 61, 62, 98, 107, 131, 132, 133, 138. This would cause MAC flapping across the VLANs.

The same source MAC address was also broadcast without VLAN tags.
There were a lot of VMs responding to that source MAC address using VLANs 450, 451, 1402, 1209, 1212, 1213, 1223, 1230, 1402, 1424
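The pattern in the trace (one source MAC showing up on many VLANs, plus untagged) is easy to check for once you have a list of source MAC and VLAN pairs out of a capture. Here is a rough Python sketch of that check. The function name and the sample data are made up for illustration, and getting the pairs out of your capture tool is left out.

```python
from collections import defaultdict


def macs_on_multiple_vlans(observations):
    """Return {mac: VLAN list} for source MACs seen on more than one VLAN.

    `observations` is an iterable of (source_mac, vlan_id) pairs.
    Use None as the VLAN for untagged frames.
    """
    seen = defaultdict(set)
    for mac, vlan in observations:
        seen[mac.lower()].add(vlan)
    return {
        # Sort untagged (None) first, then VLAN IDs in ascending order.
        mac: sorted(vlans, key=lambda v: (v is not None, v or 0))
        for mac, vlans in seen.items()
        if len(vlans) > 1
    }


if __name__ == "__main__":
    # Illustrative sample only, not taken from the real trace.
    sample = [
        ("00:50:56:d0:27:e8", 49),
        ("00:50:56:D0:27:E8", 55),
        ("00:50:56:d0:27:e8", None),
        ("00:50:56:aa:bb:cc", 49),
    ]
    for mac, vlans in macs_on_multiple_vlans(sample).items():
        print(mac, vlans)
```

Any MAC that comes back from this is a candidate for the flapping that makes the fabric disable learning.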

I picked one to see if it looped. The filter eth.addr == 00:50:56:d0:c3:a0 showed it was across the VLANs.

It looks like you used a specific source IP address instead of letting the switch use its node ID as the last octet of the address.
The ERSPAN source can be either a specific IP or a subnet prefix. If a specific source IP is configured, all leaf switches in the vPC will use the same IP address as the source IP address in the ERSPAN packet headers.
If a subnet prefix is configured, leaf switches will try to use their own node ID, if possible, as the last octet in the address. This lets you tell which leaf switch sent the packet to the destination IP address.
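To make the "node ID as the last octet" idea concrete, here is a small Python sketch of how the source address would come out for a given prefix and node ID. This is only my reading of the behavior described above: I assume "if possible" means the node ID fits in one octet and the resulting address falls inside the prefix. What the fabric actually picks when it does not fit is not modeled here. The function name, the prefix and the node IDs are all made up.

```python
import ipaddress
from typing import Optional


def erspan_source_ip(prefix: str, node_id: int) -> Optional[str]:
    """Guess the ERSPAN source IP for a leaf, given a source prefix.

    Returns None when the node ID cannot serve as the last octet
    (an assumption about what "if possible" means).
    """
    net = ipaddress.ip_network(prefix, strict=False)
    if net.version != 4 or not 1 <= node_id <= 254:
        return None
    # Zero the last octet of the network address, then drop the node ID in.
    base = int(net.network_address) & 0xFFFFFF00
    candidate = ipaddress.ip_address(base | node_id)
    return str(candidate) if candidate in net else None


if __name__ == "__main__":
    # Hypothetical prefix and node IDs for a vPC pair.
    for node in (101, 102):
        print(node, erspan_source_ip("192.0.2.0/24", node))
```

With a prefix, the two leaves in the vPC end up with different source addresses, which is what lets you tell them apart at the ERSPAN destination.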

The long and the short of it is: disable the VMware health checks in vCenter for the DVS that is causing the problems.

Update: 24-May-16
VMware released a document about this specific issue after we pointed it out to them.

When you have VC tunnel mode connecting into Cisco ACI, there are some scenarios you need to pay attention to in order to have the right connectivity.

We conducted some testing in DCA-Lab and this is some information to help you with understanding the nature of the issue.

https://hongjunma.wordpress.com/2016/05/19/cisco-aci-integration-with-virtual-connect-tunnel-mode/

The nature of this problem is very similar to this VC advisory about working with a layer 2 load balancer/bridging device.

http://h20564.www2.hpe.com/hpsc/doc/public/display?docId=emr_na-c02684783&sp4ts.oid=3794423

The same also applies to vCloud Director Network Isolation (vCDNI), which is MAC-in-MAC encapsulation.

http://www.wooditwork.com/2013/03/21/vcloud-director-network-isolation-vcdni-doesnt-work-with-hp-virtual-connect-in-tunnel-mode/

ACI hell part 1

Connecting access ports with static paths within an EPG that also has trunking is a pain.

Basically, if you have a static path binding that uses 802.1p and then try to add an access port as 802.1p Access Untagged, things may not work.

The reason is that the 802.1p Access Untagged setting sets the VLAN to 0 in the header, but there is still a VLAN tag in there.  Some access devices don’t accept it because they are not expecting a tag, period.  This matters especially with appliances.
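To show what "VLAN 0 but still tagged" means on the wire, here is a short Python sketch that packs the 4-byte 802.1Q tag by hand. The layout (a 16-bit TPID of 0x8100 followed by 3 bits of priority, 1 DEI bit and a 12-bit VLAN ID) is the standard one; the function name is just mine for illustration.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def dot1q_tag(vlan_id: int, priority: int = 0, dei: int = 0) -> bytes:
    """Pack the 4-byte 802.1Q tag: TPID, then PCP(3) | DEI(1) | VID(12)."""
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits")
    tci = (priority << 13) | ((dei & 1) << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


if __name__ == "__main__":
    # Priority-tagged frame: the tag is present, the VLAN ID is 0.
    print(dot1q_tag(vlan_id=0).hex())
    # Ordinary trunk tag for comparison (VLAN 1510 used as an example).
    print(dot1q_tag(vlan_id=1510).hex())
```

A truly untagged frame has none of these four bytes at all, which is why a device that expects no tag can choke on the VLAN 0 version.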

If you set your mode to 802.1p Access Untagged and use the same encapsulation VLAN as the trunked ports, it will not work.  ACI will give you an error saying that you can’t have tagged and untagged in the same EPG.  Yet if you change the encapsulation VLAN ID to a different number, it will work.

Remember that a VLAN in ACI is just bogus because ACI uses VXLAN, but endpoint devices care about that VLAN number.  Below is an example of 1 EPG with multiple endpoints in the same bridge domain with different VLAN encapsulations.

ACI8021P

 

How to see what traffic is hitting your CPU on a 6500

If you are having high CPU issues on your routers there is a way to see exactly what is causing it.

In my case it was causing EIGRP neighbors to drop and then come back online.

The problem is catching it fast enough to get the needed output, especially when the spikes last only a second or two.  I don’t know about you, but I can’t type that fast.

To solve this issue we’ll use our friendly EEM script.

event manager session cli username "XXX"
(Use this line only if you have AAA configured; the username must be one that already exists in AAA.)

event manager applet HIGH_CPU
event snmp oid 1.3.6.1.4.1.9.9.109.1.1.1.1.3.1 get-type exact entry-op ge entry-val 85 exit-op lt exit-val 75 poll-interval 7
action 1.01 syslog msg "------HIGH CPU DETECTED----, CPU:$_snmp_oid_val%"
action 1.02 cli command "enable"
action 1.03 cli command "term length 0"
action 1.04 cli command "debug netdr cap rx"
action 1.05 cli command "show netdr cap | append disk0:HIGH_CPU.txt"
action 1.06 cli command "show proc cpu sort | append disk0:HIGH_CPU.txt"
action 1.07 cli command "show users | append disk0:HIGH_CPU.txt"
action 1.08 cli command "show proc cpu history | append disk0:HIGH_CPU.txt"
action 1.09 cli command "show logging | append disk0:HIGH_CPU.txt"
action 1.10 cli command "show spanning-tree detail | append disk0:HIGH_CPU.txt"
action 1.11 cli command "show ip traffic | append disk0:HIGH_CPU.txt"
action 1.12 cli command "show clock | append disk0:HIGH_CPU.txt"
action 1.13 cli command "undebug all"
action 1.14 cli command "term length 24"
action 1.15 cli command "exit"

Depending on your platform you may need to change disk0: to flash: or something else.

It will trigger when there is 85% CPU or greater and write a file to the destination.
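The entry and exit values in the event line work as a pair: the applet fires when a polled value is 85 or higher and then stays quiet until a polled value drops below 75. Here is a small Python sketch of that behavior run against a list of CPU samples. It is only a model of the thresholds, not of EEM itself, and the function name and sample values are made up.

```python
def trigger_points(samples, entry_val=85, exit_val=75):
    """Return the indexes of the samples at which the applet would fire.

    Models `entry-op ge entry-val 85 exit-op lt exit-val 75`: fire when a
    sample is >= entry_val, then re-arm only after a sample is < exit_val.
    """
    armed = True
    fired_at = []
    for i, cpu in enumerate(samples):
        if armed and cpu >= entry_val:
            fired_at.append(i)
            armed = False  # stay quiet until the exit condition is met
        elif not armed and cpu < exit_val:
            armed = True
    return fired_at


if __name__ == "__main__":
    # One value per poll interval (7 seconds in the applet above).
    polled = [40, 90, 95, 80, 88, 70, 86]
    print(trigger_points(polled))
```

In this model a spike that never drops below 75 between polls produces a single capture rather than one per poll.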

You can put this output into a beta Cisco tool, https://cway.cisco.com/tools/netdr, which will decode it for you.

Here is what one of mine looked like:

netdr

Cisco AnyConnect and Smart Tunnels

A cool feature that is available for SSL/WebVPN users.  When a process is started (Windows) or an application in a certain directory path is launched (Mac), you can have “smart tunnels” established.

This works well on the Windows platform and is very easy to configure.

Edit your Clientless SSL VPN Access Group policy

ASA1

Select the Portal option on the left menu.

Go to the Smart Tunnel section and select your tunnel application.  I had named mine RDPclientless.

ASA2

Click add

ASA3

I added the Windows one and it worked perfectly.  I have also tried many versions of the Mac configuration, but I have not had any success.

ASA4

One thing to note: whenever you make changes to these profiles, the Auto start check box becomes unchecked.

ASA5

Cisco AnyConnect and Windows 8.1

What is wrong with Cisco?  They put out an installer file that requires a registry edit to work correctly!!!

Below is what I did to get the AnyConnect client to work on Windows 8.1.

VPN1

 

VPN2

UAC accept box

2nd UAC accept box

VPN3

VPN4

Find this registry key:

Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\vpnva

In the DisplayName value, delete everything to the left of “Cisco”.
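If you want to double check what the value should look like after the edit, the trimming is just "keep everything from the word Cisco onward". Here is a tiny Python sketch of that on a plain string. It does not touch the registry, the function name is made up, and the sample value is only an illustration of the kind of prefix involved, not copied from my machine.

```python
def trim_display_name(value: str, marker: str = "Cisco") -> str:
    """Drop everything to the left of `marker`; leave the value alone if absent."""
    idx = value.find(marker)
    if idx == -1:
        return value
    return value[idx:]


if __name__ == "__main__":
    # Illustrative value only; check what your own registry actually contains.
    before = "@oem8.inf,%vpnva_Desc%;Cisco AnyConnect VPN Virtual Miniport Adapter for Windows x64"
    print(trim_display_name(before))
```

The result is what you would type back into the value in regedit.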

VPN5

Reload

VPN6

I added the software that was missing on the Cisco ASAs, and I was then able to connect with the full client on my Windows 8.1 computer.