Do I need a hardware VTEP for my NSX for vSphere?

VMware NSX for vSphere has been shipping beta support for hardware VTEPs since version 6.2.0, with General Availability (GA) coming in the next few months. With this in mind, I thought it would be useful to provide an overview of HW VTEP’s use cases and considerations.

Where does HW VTEP fit with NSX?

Let’s recap why customers invest in NSX:

  • It lets you apply security functions directly at a VM’s vNIC, providing tighter security and easier compliance. This is referred to as “micro-segmentation”.
  • Virtualised L2/L3, FW, LB, NAT, and VPN services provided by NSX can be consumed without a lengthy network change process, since no changes to the physical network are needed. This enhances IT’s agility.
  • Those virtualised services can span multiple vCenter servers and potentially physical sites. This enables efficient resource pooling.

NSX L2 Bridging function

NSX bridging and HW VTEP integration relate to point #2 (agility). They give application owners the ability to bridge logical L2 networks (“Logical Switches”, or “LS”) with VLANs.

There are four use cases where bridging of logical switches with VLANs may be required:

  • VMs on an LS need direct access to physical appliances, such as storage arrays, firewalls, load balancers, or routers. To simplify routing configuration on VMs with multiple vNICs, physical appliances can be placed on the same IP subnet as a dedicated vNIC on the VMs. Also, the NSX Distributed Firewall should be used to prevent VMs from talking to each other via this segment (a hedged DFW rule sketch follows this list of use cases).
Use case 1: Access to physical appliances

  • Migration scenarios – P2V and V2V from VLAN-backed infrastructure, where changing the IP addresses of workloads may be problematic; or
  • Multi-tier applications that incorporate physical servers, e.g., a DB tier. Note that the bridged LS is also connected to the DLR for routing to other application tiers.
Use cases 2 & 3: Migration and physical tier

  • Multicast routing for VMs connected to Logical Switches. Multicast receivers are typically physical endpoints, and don’t need direct access to logical networks.
Use case 4: Multicast Routing

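To illustrate the DFW point from use case #1: one way to stop VMs from talking to each other over the bridged segment is a deny rule where both the source and the destination are the Logical Switch itself, applied to that same Logical Switch. Physical endpoints aren’t vNICs, so VM-to-physical traffic is unaffected. Below is a hedged Python sketch of pushing such a rule via the NSX DFW API; the section ID, object IDs, and credentials are placeholders, and the endpoint and XML element names are from memory rather than a verified reference.

```python
# Hypothetical sketch: add a DFW rule that blocks VM-to-VM traffic within the
# bridged Logical Switch, while leaving VM-to-physical traffic alone (the
# physical endpoints are not vNICs, so they never match the LS object).
# Endpoint and XML element names are from memory; verify them against the
# NSX for vSphere API guide before relying on this.
import requests

NSX_MGR = "https://nsx-manager.lab.local"   # assumption: NSX Manager address
AUTH = ("admin", "password")                # assumption: API credentials
SECTION_ID = "1007"                         # assumption: existing L3 DFW section
LS_ID = "virtualwire-10"                    # assumption: the bridged Logical Switch

# The DFW API uses optimistic locking: GET the section first to obtain the
# ETag, then send it back in an If-Match header with the change.
section = requests.get(
    f"{NSX_MGR}/api/4.0/firewall/globalroot-0/config/layer3sections/{SECTION_ID}",
    auth=AUTH, verify=False)
section.raise_for_status()
etag = section.headers["ETag"]

rule_xml = f"""
<rule disabled="false" logged="true">
  <name>Block intra-segment traffic on bridged LS</name>
  <action>deny</action>
  <sources excluded="false">
    <source><type>VirtualWire</type><value>{LS_ID}</value></source>
  </sources>
  <destinations excluded="false">
    <destination><type>VirtualWire</type><value>{LS_ID}</value></destination>
  </destinations>
  <appliedToList>
    <appliedTo><type>VirtualWire</type><value>{LS_ID}</value></appliedTo>
  </appliedToList>
</rule>"""

resp = requests.post(
    f"{NSX_MGR}/api/4.0/firewall/globalroot-0/config/layer3sections/{SECTION_ID}/rules",
    auth=AUTH, verify=False,
    headers={"If-Match": etag, "Content-Type": "application/xml"},
    data=rule_xml)
resp.raise_for_status()
print("Rule created:", resp.status_code)
```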

Bridging in NSX Software

NSX-v ships with native bridging capability in software as part of its Distributed Logical Router (DLR) function.

Software bridging is enabled in the DLR instance configuration, where an LS and a VLAN-backed dvPortgroup are linked to form a bridge.
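
If you prefer the API to the vSphere Web Client, the same configuration boils down to a single call against the DLR’s Edge instance. Here’s a minimal Python sketch of that call; the NSX Manager address, credentials, and object IDs are placeholders, and the XML element names are as I recall them from the NSX 6.2 API guide, so please double-check them before use.

```python
# Minimal sketch: link a Logical Switch (virtualwire) to a VLAN-backed
# dvPortgroup on an existing DLR instance via the NSX REST API.
# The object IDs and credentials are placeholders; the XML element names
# are as I recall them, so check the NSX API guide before use.
import requests

NSX_MGR = "https://nsx-manager.lab.local"   # assumption: NSX Manager address
AUTH = ("admin", "password")                # assumption: API credentials
DLR_ID = "edge-1"                           # assumption: the DLR Edge ID

bridge_config = """
<bridges>
  <bridge>
    <name>db-bridge</name>
    <virtualWire>virtualwire-10</virtualWire>
    <dvportGroup>dvportgroup-25</dvportGroup>
  </bridge>
</bridges>"""

resp = requests.put(
    f"{NSX_MGR}/api/4.0/edges/{DLR_ID}/bridging/config",
    auth=AUTH, verify=False,
    headers={"Content-Type": "application/xml"},
    data=bridge_config)
resp.raise_for_status()
print("Bridging config applied:", resp.status_code)
```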

Software bridging is performed in the kernel of an ESXi host. To avoid forwarding loops, only one ESXi host can actively bridge between a given LS and a VLAN-backed dvPortgroup. Service protection is provided by the DLR HA function, which can typically recover from an ESXi host failure in under 10 seconds.

Since bridging happens on an ESXi host, the maximum throughput is limited by that host’s network capacity. If more capacity is required, multiple DLR instances can be deployed for other LS : VLAN pairs, with different ESXi hosts performing the active bridging function. However, for a given LS : VLAN pair the capacity can’t exceed that of a single host.

So, why would I need a hardware VTEP?

There are cases where bridging capacity or failure recovery time provided by the NSX software bridging function is not sufficient. Hardware VTEP solutions can be used to address either or both of these.

Brocade HW VTEP Solution

The Brocade HW VTEP solution provides application owners and virtualised infrastructure admins with large bridging capacity and fast failure recovery of under 1 second.

To use Brocade HW VTEPs with NSX, between one and four VDX 6740 or 6940 switches are connected into a VCS fabric and configured to act as a single Hardware Bridge entity, called an “Overlay Gateway” in the Brocade NOS CLI. The Overlay Gateway runs a distributed OVSDB server that NSX for vSphere programs to manage the VLAN to VXLAN bridging configuration. To complete the “illusion”, Brocade’s Overlay Gateway also presents all member physical switches as one, with a single VTEP IP.
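
As an aside, if you’re curious about what NSX actually programs into the Overlay Gateway, the hardware_vtep OVSDB database can be inspected with standard Open vSwitch tooling. The sketch below is purely illustrative, not a Brocade-documented procedure: the management IP, the conventional OVSDB port, and the certificate handling are all assumptions.

```python
# Hypothetical sketch: dump a few hardware_vtep tables from the Overlay
# Gateway's OVSDB server to see the Logical Switches, physical ports and
# remote MACs that NSX has programmed. Assumes ovsdb-client (part of the
# Open vSwitch tools) is installed locally and that the gateway's OVSDB
# endpoint is reachable on the conventional port 6640. NSX secures this
# channel with certificates, so in practice you would also pass the
# --private-key / --certificate / --ca-cert options.
import subprocess

VTEP_OVSDB = "ssl:192.0.2.10:6640"   # assumption: Overlay Gateway OVSDB endpoint

for table in ("Physical_Switch", "Physical_Port",
              "Logical_Switch", "Ucast_Macs_Remote"):
    print(f"--- {table} ---")
    subprocess.run(
        ["ovsdb-client", "dump", VTEP_OVSDB, "hardware_vtep", table],
        check=True)
```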

While the Overlay Gateway needs all of its member switches to be part of the same VCS fabric, it is not necessary for every switch in that fabric to participate in the Overlay Gateway. In other words, the Overlay Gateway can be configured on a subset of switches in a given VCS fabric.

Once the Overlay Gateway is registered with NSX as a new “Hardware Bridge”, the operator can begin attaching “Hardware Ports” to Logical Switches.

There is a small but important point to be made here. The Brocade HW VTEP connects an NSX Logical Switch to an L2 VLAN inside the VCS fabric, rather than mapping it to an actual physical port with an optional VLAN ID. Any physical port on any switch in the VCS fabric configured as a member of that VLAN will be able to communicate with VMs connected to that Logical Switch. There is also no need for such a physical port to be on the switch that is performing the VLAN to VXLAN bridging. It is, for example, possible to set up the Overlay Gateway on the spine switches, while physical servers and appliances connect to leaf ToRs in the same VCS fabric.

This implementation allows for flexible connectivity options for physical servers and appliances. It also supports connecting physical devices using Brocade VLAG (think MC-LAG / MLAG / vPC), which provides rapid forwarding-plane recovery in case of a physical link or switch failure, or during maintenance.

Depending on the individual vendor implementation, when a Hardware Port is attached to a Logical Switch, the VLAN membership configuration of that port on the physical switch may be changed without the network administrator’s involvement. In the Brocade HW VTEP implementation, the “Hardware Port” is virtual inside the VCS fabric. The configuration of physical ports is not updated automatically, which provides more control and serves as an additional safety measure against unintended changes to the physical network.

Hardware VTEP considerations

There is one important factor to consider when using HW VTEP solutions, whether they are from Brocade or any other vendor.

In a typical NSX application topology, logical switches with VMs are connected to a DLR that provides an IP gateway function for these VMs.

With software bridging, a Logical Switch connected to a DLR can also be bridged to a VLAN-backed dvPortgroup. In this case, both VMs and physical endpoints can use the DLR as their IP gateway.

However, the current implementation of NSX HW VTEP bridging does not support such a configuration. If a Logical Switch is connected to a DLR, it will not be possible to attach a “Hardware Port” to it.
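
One practical consequence: before attaching a Hardware Port, it is worth confirming that the Logical Switch in question isn’t already connected to a DLR interface. Here’s a hedged Python sketch of such a pre-check; the DLR ID, the Logical Switch ID, and the credentials are placeholders, and the endpoint and field names are my recollection of the NSX API rather than a verified reference.

```python
# Hypothetical pre-check: is a given Logical Switch (virtualwire) already
# connected to a DLR interface? If it is, attaching a "Hardware Port" to it
# will not work. Endpoint and XML field names are from memory; the DLR ID,
# LS ID and credentials are placeholders.
import xml.etree.ElementTree as ET

import requests

NSX_MGR = "https://nsx-manager.lab.local"   # assumption: NSX Manager address
AUTH = ("admin", "password")                # assumption: API credentials
DLR_ID = "edge-1"                           # assumption: the DLR to inspect
LS_ID = "virtualwire-10"                    # assumption: candidate Logical Switch

resp = requests.get(f"{NSX_MGR}/api/4.0/edges/{DLR_ID}/interfaces",
                    auth=AUTH, verify=False)
resp.raise_for_status()

connected = {e.text for e in ET.fromstring(resp.content).iter("connectedToId")}
if LS_ID in connected:
    print(f"{LS_ID} is connected to {DLR_ID}: HW VTEP bridging is not supported")
else:
    print(f"{LS_ID} is not connected to {DLR_ID}: OK to attach a Hardware Port")
```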

This needs to be taken into consideration when planning the scenarios for deployment of HW VTEP solutions.

For example, we mentioned a scenario that includes a physical server in a multi-tier application. With a HW VTEP, this scenario will need a rethink, because the DB Logical Switch cannot be both connected to the DLR and bridged to the DB VLAN.

Looking back at the use cases depicted above, only #2 and #3 had a Logical Switch connected to a DLR. Therefore, we’ll need a workaround only for those two.

In these scenarios, traffic levels are usually not expected to be as high as in, for example, use case #1, where VMs access file shares or object storage provided by dedicated storage systems. This means it should be acceptable to provide connectivity for the bridged segments by tapping into the North-South path.

One way is to connect the bridged LS to the Edge Services Gateway (ESG) upstream of the DLR:

Work-around 1: via the ESG

Another way would be to use a physical router connected to the bridged VLAN, with traffic flowing out to the DC network and then south through the ESG and DLR:

Work-around 2: via a physical router

Summary

So, what we’ve talked about:

  • NSX provides clear benefits to application owners and virtualisation admins around security, agility, and better resource utilisation.
  • VXLAN to VLAN bridging is a part of the “Agility” benefit, enabling on-demand connectivity between virtualised and non-virtualised resources.
  • In cases where performance, scalability, or redundancy of NSX software bridging is insufficient, HW VTEPs can be used.
  • Reference topologies in NSX make extensive use of distributed routing, but using a HW VTEP to bridge an LS with a VLAN is not compatible with the DLR, and this needs special attention during the design phase.

Teaser

In a later post or posts, I’m planning to provide a deep dive into how the NSX-v HW VTEP integration and the Brocade HW VTEP implementation work, including some deployment considerations and the technical reasons behind them.

To make sure I keep the material relevant and useful, please tell me in the comments which aspects you’d like to learn about, and I’ll do my best. 🙂

Thanks for reading!

Next post in the series is here.

8 responses to “Do I need a hardware VTEP for my NSX for vSphere?”

  • Petar Smilajkov

    Thanks for this. It’s exactly the question I asked Simon H-W. today and he pointed me to your blog post 🙂

    I would like to leverage VDX’s VTEP capabilities to allow customer VMs (which are behind NSX Edge) to “directly” connect to backup servers (and shared NFS mounts) without having to go through attaching real dvPortGroups to them. Does that make sense?

    Looking forward to your future posts on this matter. If you could go into some detail on how to configure a Brocade VDX 6740 to do this, I’d be much obliged. Their own guides are a bit outdated and concentrate more on NSX Multi-Hypervisor (4.2, mind you) rather than NSX-V (6.2+).

    Thanks!

    Petar

    • Dmitri Kalintsev

      Hi Petar,

      Thanks for the comment.

      What you’re asking for makes perfect sense.

      This use case is perfect for VXLAN to VLAN bridging for several reasons:

      1) Since your VMs are on a VXLAN, you don’t need to provision the Backup/NFS VLAN to all ESXi hosts;

      2) If you use a dedicated Logical Switch for Backup/NFS access, you don’t need to connect it to DLR for routing, which makes it compatible with both NSX software bridging and HW VTEP bridging;

      3) You can start with NSX software bridging, and then migrate to HW VTEP when you need more capacity or need faster failure recovery;

      4) Since VMs and Backup/NFS servers are on the same subnet, you won’t need to manage routing on your VMs, which lowers operational overhead and the risk of errors;

      5) You’re not compromising on security, since you can use NSX DFW to ensure VMs can only talk to Backup/NFS servers, but not to each other.

      And yes, I will cover how to configure things in a future post.

  • Jordi Castro

    Hello Dmitri

    Thank you for your post, it’s a great explanation.
    Do you know if this HW VTEP works with an LS configured in Unicast mode?

    Thanks and regards,

    Jordi

  • canuck69

    In your use case #1, I could not get that working. I created a VM in a logical network called App. Its default gateway is the DLR. I added a logical network called physical (not connected to the DLR) to the VM (I did not give a default gateway). I created an L2 bridge on the DLR that connects the logical network called physical to a VLAN-backed distributed port group for my physical server.
    So, my VM can successfully get to other VMs on the same logical network called App, and it can route to other VXLAN segments via the DLR. This is as expected. It cannot get to the physical server though. It should though, as the VM and physical server are on the same L2 (via the bridge).
    I can get scenarios #2 and #3 working just fine. It’s just the first scenario, with 2 interfaces on a VM, that I can’t get to work for the L2 traffic to physical.

    • Dmitri Kalintsev

      Hello,

      From your description, things should work. I’m assuming that the IP address of your VM’s interface connected to the LS “Physical” is in the same subnet as the IP address of the physical server, and that this subnet is different from the one your VM has on the LS “App”. Is this the case?

      If so, it’s probably time to have a look at bridging tables on your ESG, and then potentially do some packet captures to figure out where things are breaking.
