When we consider deploying Application Centric Infrastructure (ACI) or Data Center Network Manager (DCNM) in the data center, we begin by evaluating the existing architecture to understand which technology could improve deployments, reliability and overall network functionality.

Traditional network architecture

Many data centers today use a traditional three-tier network architecture. These classic designs are composed of three main layers:

  • The CORE layer supplies high-throughput layer-3 services.
  • The DISTRIBUTION layer commonly supplies multi-chassis EtherChannel (MEC) layer-2 services and first-hop redundancy protocol (FHRP) layer-3 services.  The distribution layer can also provide layer-4 through layer-7 services when configured with load balancers and firewalls.
  • The ACCESS layer supplies only layer-2 services and can also utilize MEC.

This classic design works, as it has for many years, but it can create some challenges. 

  • Inconsistent device configuration. For example, adding a new layer-2 VLAN or layer-3 SVI could require configuring up to four switches depending on the MEC type. These configurations are typically completed from the CLI, so a typo can easily be introduced, and configuration changes to firewalls and/or load balancers may also be required (see the sketch after this list).
  • Lack of built-in device telemetry.  Most of these devices support some form of SNMP or NetFlow sampling, but an external collector typically needs to be deployed to fully analyze the data and take action.  Granular, full-flow data usually cannot be recorded at wire speed.
  • Difficult horizontal growth. With traditional architectures, the distribution layer can run out of physical ports as access switches are added for increased server connectivity. Adding another distribution pair can accommodate continued growth, but the core separates the new distribution layer at layer-3. On one hand this is beneficial because it reduces the failure domain; on the other hand, the domains lose layer-2 adjacency, so other technologies are needed to extend layer-2 across layer-3 boundaries when that functionality is required. It is also important to note that if network services such as firewalls or load balancers are used at this layer, additional devices must be purchased for the new distribution layer. Finally, all layer-3 routed east-west traffic between servers behind different distribution pairs has to traverse the core.
  • Longer timelines for network changes and growth. Because of the previously outlined challenges, many organizations that have deployed three-tier networks proceed with caution when changes are needed.  This cautiousness is for good reason as these networks are highly complex and change often involves navigating physical, logical and sometimes even political boundaries.
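
To make the first challenge above concrete, here is a minimal sketch of the repetitive, per-switch CLI work a single new VLAN and SVI can require. The hostnames, credentials and addressing are hypothetical, and it assumes the third-party netmiko library; a typo in any one loop iteration leaves the switches inconsistent.

```python
# Hypothetical sketch: pushing the same VLAN (and an SVI on the distribution pair)
# to each switch one at a time, as the manual CLI process would.
from netmiko import ConnectHandler

SWITCHES = ["dist-1", "dist-2", "access-1", "access-2"]   # hypothetical hostnames

VLAN_CONFIG = ["vlan 210", "  name app-tier-210"]
SVI_CONFIG = [
    "interface Vlan210",
    "  ip address 10.2.10.2 255.255.255.0",   # per-switch addressing/FHRP omitted
    "  no shutdown",
]

for host in SWITCHES:
    conn = ConnectHandler(
        device_type="cisco_nxos",
        host=host,
        username="admin",
        password="example-only",   # hypothetical credentials
    )
    # Distribution switches get the VLAN plus the SVI; access switches only the VLAN.
    commands = VLAN_CONFIG + (SVI_CONFIG if host.startswith("dist") else [])
    print(f"--- {host} ---")
    print(conn.send_config_set(commands))
    conn.disconnect()
```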

ACI

Some aspects of the ACI architecture may look like a collapsed version of the traditional three-tier network design; however, it is actually a spine-leaf architecture based on the "Clos design" invented in 1938 by Edson Erwin and later formalized by Charles Clos in 1952.  Clos networks behave in a manner similar to the crossbar switching designs utilized in modern modular switching platforms.  Beyond adopting the Clos architecture, ACI was one of the first reliable, homogeneous software-defined network virtualization platforms and was released to the general public on November 6, 2013.

From a hardware perspective, ACI has three mandatory components:

  • The SPINES behave like fabric modules in traditional chassis-based switches, such as a Nexus 7k.  The spine provides an aggregation point for leaf switches in an ACI fabric; typically, each spine has only one connection to each leaf.  Additional bandwidth is achieved on the switch fabric by adding spines. Larger ACI networks typically utilize six spines per pod, with up to 24 per fabric.
  • The LEAVES behave like line cards in traditional chassis-based switches.  The leaves provide connectivity to endpoint devices as well as connectivity back to the spines on dedicated uplink ports.  Adding leaf switches is an option to increase capacity, dependent on the available ports on the spines.
  • The APICs are Application Policy Infrastructure Controllers and they behave like supervisor modules in traditional chassis-based switches.  Typically deployed as a cluster of three, these servers handle automated configuration, prescriptive configuration, device telemetry, management, and troubleshooting.

ACI uses both an underlay network and an overlay network.  The underlay network directly connects all of the hardware pieces, such as leaves and spines, to provide reachability for the overlay network.  The overlay network virtualizes network objects to permit any virtual network to exist anywhere in the data center fabric.  This design behaves like a single logical switch and removes the physical constraints of the three-tier architecture to provide the flexibility to place any physical or virtual workload anywhere in the data center fabric.  ACI's virtual layout also enables devices like firewalls or load balancers to be placed almost anywhere in the data center fabric through Layer 3 service insertion configuration so they can be utilized by any endpoint.

ACI is an entirely different way of thinking about networking. It is genuinely software-defined, with an entirely new set of networking constructs and objects. For example, VLANs, widely used in traditional networks, are replaced in ACI by objects such as endpoint groups (EPGs) and bridge domains (BDs) that achieve similar functionality.
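
As a rough, hedged illustration of how these constructs nest (every name below is invented for the example), the following Python dictionary mirrors the JSON structure the APIC REST API accepts for a tenant containing a VRF, a bridge domain with a subnet, and an EPG. The bridge domain and its subnet take over the role the VLAN and SVI played in the three-tier design.

```python
# Hedged sketch of the ACI object hierarchy; fvTenant, fvCtx, fvBD, fvAp and fvAEPg
# are ACI object-model classes, while every name and address is made up.
tenant_payload = {
    "fvTenant": {
        "attributes": {"name": "ExampleTenant"},
        "children": [
            {"fvCtx": {"attributes": {"name": "Example-VRF"}}},          # VRF
            {"fvBD": {                                                   # bridge domain
                "attributes": {"name": "Web-BD"},
                "children": [
                    {"fvRsCtx": {"attributes": {"tnFvCtxName": "Example-VRF"}}},
                    {"fvSubnet": {"attributes": {"ip": "10.2.10.1/24"}}},  # gateway (SVI role)
                ],
            }},
            {"fvAp": {                                                   # application profile
                "attributes": {"name": "Example-App"},
                "children": [
                    {"fvAEPg": {                                         # endpoint group
                        "attributes": {"name": "Web-EPG"},
                        "children": [
                            {"fvRsBd": {"attributes": {"tnFvBDName": "Web-BD"}}},
                        ],
                    }},
                ],
            }},
        ],
    }
}
```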

Another important ACI concept is the default behavior of the data plane between endpoint groups: it is closer to that of a firewall, because all traffic is implicitly blocked until explicitly allowed ("white-listed"). A well-designed ACI network provides a level of security by default that is not available in a three-tier network.

ACI has no single point of failure. All fabric control is managed via the APICs; however, losing one or even all APICs will not stop the data plane of the network, as the APICs sit outside the data path. Multi-homed endpoints are not affected if a spine or leaf is lost, preventing downtime and instability.

ACI provides true multitenancy, allowing network managers to segment business units or grant customers complete control over their tenant. When using multitenancy, deleting a tenant also removes all objects under the tenant, simplifying clean up.

For those managing an ACI network, the APIC's fault reporting quickly provides visibility into errors.

There can be some challenges around the adoption of ACI for network professionals. Those with traditional network backgrounds may find that ACI's foundational changes to constructs and objects challenge the way they "know" networks to behave. From an educational perspective, the teaching community can assist in this area by focusing initially on overall ACI concepts rather than a technical deep dive into the inner workings of VXLAN, COOP, MP-BGP and IS-IS, as ACI abstracts these technologies from the network professional.

I believe ACI's most significant advantage is programmability.  Some have said the "A" in ACI should stand for "Automated" instead of "Application."  The ACI community has created a large body of documentation outlining how to issue procedural and API calls in the programming language of your choice.  With this, it is possible to script the automation of network builds and management tasks so they can be executed in minutes rather than hours, days or weeks.
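
As a small illustration of that programmability (the APIC address and credentials below are hypothetical, and the sketch assumes the Python requests library), the following script authenticates to the APIC REST API, creates a tenant and reads the tenant list back. The same pattern extends to building bridge domains, EPGs and the rest of a fabric in a single scripted run.

```python
# Minimal sketch of ACI automation via the APIC REST API.
import requests

APIC = "https://apic.example.com"   # hypothetical controller address

session = requests.Session()
session.verify = False  # lab-only; use a trusted certificate in production

# 1. Authenticate; the APIC returns a session token as a cookie that the
#    requests.Session object carries on subsequent calls.
login = {"aaaUser": {"attributes": {"name": "admin", "pwd": "example-only"}}}
resp = session.post(f"{APIC}/api/aaaLogin.json", json=login, timeout=10)
resp.raise_for_status()

# 2. Push a configuration object -- here, simply creating a tenant.
tenant = {"fvTenant": {"attributes": {"name": "ExampleTenant"}}}
resp = session.post(f"{APIC}/api/mo/uni.json", json=tenant, timeout=10)
resp.raise_for_status()

# 3. Read state back -- list all tenants known to the fabric.
resp = session.get(f"{APIC}/api/class/fvTenant.json", timeout=10)
for mo in resp.json().get("imdata", []):
    print(mo["fvTenant"]["attributes"]["name"])
```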

DCNM

Alternatively, Cisco has released a similar fabric automation offering, DCNM (11.3 and higher). DCNM has been gaining traction recently because of the ease of setting up an EVPN (Ethernet Virtual Private Network) fabric and because it uses familiar NX-OS switches and command sets. 

A DCNM basic EVPN fabric consists of three primary components, outwardly similar to ACI but vastly different under the covers.

  • The SPINES behave more like physical aggregation points for all of the leaves.
  • The LEAVES provide endpoint aggregation and link to the spine.
  • The DCNM controller(s) provide an automation point for the fabric.

The basic design is very similar to most EVPN designs, and to ACI for that matter.  In my opinion, when starting with a greenfield environment, DCNM is even easier to install than ACI because specific ports aren't required for uplink and downlink connections.  In fact, particular switch models aren't required for spines or leaves; most switches that support DCNM can fill either role.  A wide range of Nexus 9000 switches is also supported on specific versions of NX-OS.

The DCNM EVPN fabric has an underlay network that uses OSPF and an overlay network built on VXLAN EVPN, with MP-BGP as the control plane.  This design allows layer-2 and layer-3 virtualization throughout the data center fabric.

DCNM behaves more like traditional networks with regards to security.  Every endpoint-to-endpoint communication within a VRF is allowed implicitly unless expressly denied, typically with a device such as a firewall.

DCNM does not have the same strict single-logical-switch homogeneity as ACI; rather, it is a conglomeration of devices joined together by the controller.  This looseness lends itself to expanded design freedom, allowing a wider array of configurations.  Customizations should be carefully balanced against the intended design of the fabric to minimize increased support costs.

DCNM has no true multitenancy solution out of the box.

DCNM quickly delivers excellent telemetry and reports on errors, helping the operations team identify problems and possible solutions.

DCNM's automation capabilities are similar to ACI's, with a full RESTful API that allows data center engineers to use the programming language of their choice to effect configuration changes rapidly.
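
For comparison, here is a hedged sketch of that workflow against DCNM (hypothetical controller address and credentials, Python requests assumed). The /rest/logon and /rest/control/fabrics paths reflect the DCNM 11.x REST API as I understand it and should be verified against the API documentation bundled with your release.

```python
# Hedged sketch of DCNM automation over its REST API; endpoint paths are
# assumptions based on DCNM 11.x and should be checked against your release.
import requests

DCNM = "https://dcnm.example.com"   # hypothetical controller address

session = requests.Session()
session.verify = False  # lab-only; use a trusted certificate in production

# 1. Log on with basic auth and request a token valid for one hour (milliseconds).
resp = session.post(
    f"{DCNM}/rest/logon",
    auth=("admin", "example-only"),      # hypothetical credentials
    json={"expirationTime": 3600000},
    timeout=10,
)
resp.raise_for_status()
session.headers["Dcnm-Token"] = resp.json()["Dcnm-Token"]

# 2. List the fabrics the controller manages.
resp = session.get(f"{DCNM}/rest/control/fabrics", timeout=10)
resp.raise_for_status()
for fabric in resp.json():
    print(fabric.get("fabricName"))
```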

Network professionals will find DCNM bridges the skills gap between the familiar concepts of NX-OS and the newer technology of EVPN, allowing an EVPN fabric to be created in a matter of minutes, even with minimal training.

Conclusion

When evaluating ACI and DCNM as part of a data center upgrade, there are several factors to consider as they each have advantages but are better suited to certain scenarios.

ACI is the best option for medium to very large environments but can also be effective in smaller environments when a small remote data center is needed and a working ACI practice is already in place. Since ACI is designed from the ground up to scale easily and supports multitenancy, it is the logical choice for a larger or segmented organization. ACI's default behavior is secure, which appeals to all organizations but is particularly beneficial in larger networks. The automation capabilities are where the real benefit of ACI is realized, as large-scale change can be accomplished with minimal effort.

With ACI, the longer learning curve may delay adoption, and higher costs can exceed budgets for smaller installations. In these environments, DCNM is a better solution.

In a small to medium environment, DCNM is straightforward to set up and easier to manage with in-house IT staff. DCNM can also be a solution for a larger data center environment if multitenancy is not required and at least two network staff members have a deep understanding of EVPN, BGP, MP-BGP, VRFs, OSPF and multicast.  DCNM in these larger and more complex environments is typically used as a configuration control device with telemetry for troubleshooting.
