As part of NSX preparation for logical switching and routing, it is necessary to define at least one Transport Zone (from here on – “TZ”).
It is obvious from the UI that TZ configuration includes default VXLAN Control Plane mode and a list of ESXi clusters; but what does it actually do?
Let’s find out.
TZ and Logical Switches’ Control Plane mode
This one’s easy: the VXLAN Control Plane Mode selected when a TZ is created specifies the default Control Plane Mode for Logical Switches created in that TZ. In other words, when you create an LS via the UI or API, the TZ’s CP mode is selected for your new LS by default. However, you can override it per individual LS, and of course you can change it later.
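To make the override concrete, here is a minimal sketch of building the LS creation payload for the NSX-V API. The `virtualWireCreateSpec` body and the `controlPlaneMode` element (values `UNICAST_MODE`, `HYBRID_MODE`, `MULTICAST_MODE`) are from the NSX-V API; the switch names and tenant ID are placeholders for illustration:

```python
# Sketch: building the body for POST /api/2.0/vdn/scopes/<scope-id>/virtualwires.
# If controlPlaneMode is omitted, NSX Manager falls back to the TZ's default mode.
import xml.etree.ElementTree as ET

def build_virtualwire_spec(name, control_plane_mode=None):
    """Build a virtualWireCreateSpec payload for Logical Switch creation."""
    spec = ET.Element("virtualWireCreateSpec")
    ET.SubElement(spec, "name").text = name
    ET.SubElement(spec, "tenantId").text = "virtual wire tenant"
    if control_plane_mode is not None:
        # One of UNICAST_MODE, HYBRID_MODE, MULTICAST_MODE
        ET.SubElement(spec, "controlPlaneMode").text = control_plane_mode
    return ET.tostring(spec, encoding="unicode")

# Override the TZ default for this particular LS:
payload = build_virtualwire_spec("web-tier", "UNICAST_MODE")
```

POSTing this body against the TZ’s scope ID creates the LS; leaving `control_plane_mode` as `None` gives you the TZ default behaviour described above.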
It is perfectly fine to have Logical Switches in different CP modes within the same TZ; however, it may make recovery from some system failures, such as total loss of the Controller Cluster (e.g., due to a datastore failure or human error), more time-consuming.
Needless to say, Hybrid and Multicast CP modes require that you configure a pool of multicast IP addresses for NSX to draw from.
TZ, Logical Switches, and DVS
This one’s not very obvious, but makes sense if you think about it. Logical Switches are not explicitly “created” on ESXi hosts. When an LS is created, a TZ is specified for it to live in. NSX Manager looks up that TZ’s config to see which clusters are included in it, and builds the list of VXLAN-prepared DVS that correspond to those clusters. NSX Manager then creates a “special” dvPortgroup on each of these DVS, and tells the Controller Cluster about the new LS. That’s it as far as NSX Manager is concerned; the DVS component will take care of informing hosts of the new dvPg.
When a VM is connected to such special dvPg (or DLR is configured with a VXLAN LIF), ESXi host would request further information from the Controller Cluster, which will allow VXLAN kernel module on that host to do its job.
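The steps above can be sketched as a toy model. The cluster and DVS names are hypothetical (matching the diagram below), and the data structures are illustrative only, not NSX internals:

```python
# Toy model: how "create LS in TZ" resolves to dvPortgroups.
# NSX Manager maps the TZ's clusters to their DVS, then creates one
# backing dvPg per DVS -- not per cluster.

transport_zone = {"name": "TZ-1", "clusters": ["Comp B", "Mgmt/Edge"]}
cluster_to_dvs = {
    "Comp A": "Compute_DVS",
    "Comp B": "Compute_DVS",
    "Mgmt/Edge": "Mgmt_Edge_DVS",
}

def create_logical_switch(ls_name, tz):
    """Return the backing dvPg created on each DVS the TZ touches."""
    dvs_set = {cluster_to_dvs[c] for c in tz["clusters"]}
    return {dvs: f"vxw-dvs-{dvs}-{ls_name}" for dvs in sorted(dvs_set)}

dvpgs = create_logical_switch("LS-5001", transport_zone)
# The dvPg lands on the whole Compute_DVS, so all of that DVS's clusters
# can see it -- including clusters that are not in the TZ.
```

Note that the granularity here is the DVS, not the cluster, which is exactly what produces the side effect described next.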
This has an interesting side effect: if you didn’t add all clusters of a given DVS to the TZ, the clusters you left out will still have access to that Logical Switch. Let’s have a look at the following diagram:
We have three clusters here (Comp A, Comp B, and Mgmt / Edge), and two DVS (Compute_DVS and Mgmt_Edge_DVS). If we were to create a TZ that included say clusters Comp B and Mgmt / Edge, but didn’t include Comp A, any LS created against that TZ will still be “available” to VMs running on cluster Comp A.
Why? Because LS is really a DVS Portgroup, and the cluster Comp A is a member of the DVS that’s part of the TZ.
What will happen if we connect a VM running on cluster Comp A to the dvPg that corresponds to such an LS? VXLAN will work just fine – VMs on that LS will be able to communicate across all three clusters; however, trouble arrives when we connect such an LS to a DLR.
TZ and DLR
Unlike an LS, DLR instances are created by NSX Manager on each ESXi host explicitly. This procedure has no relationship or dependency on DVS, and it follows the TZ scope strictly.
This means that in our hypothetical case, if we were to create a DLR and connect to it the LS we created earlier, the DLR instance would be created on hosts in clusters Comp B and Mgmt / Edge, but not on hosts in cluster Comp A:
This would cause an “interesting” situation, where Comp A VMs will not be able to reach their DLR’s LIF, because DLR, along with its LIFs, simply wasn’t created on these hosts.
So in the diagram above, VMs web1, web2 and LB can talk to each other just fine; but VMs app1 and db1 will not be able to talk to anything. Additionally, VM web1 won’t be able to talk to anything other than what is connected to the same VXLAN as itself (5001). This will surely appear confusing.
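The mismatch between the two scopes can be captured in a few lines. Continuing the toy model (hypothetical cluster names, illustrative structures only): LS visibility follows the DVS, while DLR presence follows the TZ, and routing only works where both line up:

```python
# Toy model: LS reach is DVS-wide, DLR instance creation is TZ-scoped.

tz_clusters = {"Comp B", "Mgmt/Edge"}
cluster_to_dvs = {"Comp A": "Compute_DVS", "Comp B": "Compute_DVS",
                  "Mgmt/Edge": "Mgmt_Edge_DVS"}
tz_dvs = {cluster_to_dvs[c] for c in tz_clusters}

def ls_visible(cluster):
    # An LS is really a dvPg: visible wherever its backing DVS reaches.
    return cluster_to_dvs[cluster] in tz_dvs

def dlr_present(cluster):
    # DLR instances are pushed host-by-host, strictly within the TZ.
    return cluster in tz_clusters

def can_route_via_dlr(cluster):
    # A VM needs both the LS (dvPg) and a local DLR instance with the LIF.
    return ls_visible(cluster) and dlr_present(cluster)
```

Here `ls_visible("Comp A")` is true while `can_route_via_dlr("Comp A")` is false, which is precisely the “VXLAN works, routing doesn’t” symptom described above.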
When creating your Transport Zones, make sure to include all clusters of each DVS, i.e., align the TZ to the DVS boundary. Also, forewarned is forearmed. 😉
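If you want to verify this alignment programmatically, here is a sketch of the check, assuming you can pull a cluster-to-DVS inventory from vCenter (the inventory shown is hypothetical):

```python
# Sketch: a TZ is aligned when, for every DVS it touches, it includes
# *all* clusters attached to that DVS.

def misaligned_dvs(tz_clusters, cluster_to_dvs):
    """Return the DVS that the TZ touches but does not fully cover."""
    touched = {cluster_to_dvs[c] for c in tz_clusters}
    bad = set()
    for dvs in touched:
        members = {c for c, d in cluster_to_dvs.items() if d == dvs}
        if not members <= set(tz_clusters):
            bad.add(dvs)
    return bad

inventory = {"Comp A": "Compute_DVS", "Comp B": "Compute_DVS",
             "Mgmt/Edge": "Mgmt_Edge_DVS"}

# Leaving Comp A out flags Compute_DVS as only partially covered:
assert misaligned_dvs({"Comp B", "Mgmt/Edge"}, inventory) == {"Compute_DVS"}
# Including every cluster of every touched DVS passes the check:
assert misaligned_dvs({"Comp A", "Comp B", "Mgmt/Edge"}, inventory) == set()
```

Running such a check after any cluster or TZ change would catch the Comp A situation before a DLR ever gets involved.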