3 VCF POCs in the ATC, What We Learned...
WWT's Advanced Technology Center (ATC) has hosted many customers using our Proof of Concept (POC) testing to evaluate Software-Defined Data Center (SDDC) solutions that would fit their production environment needs. In this ATC Insight, we share findings from three different POCs we executed with our customers. All three POCs demonstrate the flexibility of VMware Cloud Foundation (VCF) and show how it offers a variety of ways for organizations to run their infrastructure on premises and in the cloud.
In This Insight
Quick Detail on VMware Cloud Foundation
Looking to take the next big step with your VMware infrastructure? You've come to the right place. This ATC Insight covers three different Proof of Concepts (POCs) out of WWT's Advanced Technology Center (ATC), detailing some successes as well as the potential pitfalls of VCF (VMware Cloud Foundation).
What is VCF? For those of you who are unfamiliar with VCF, here is a little background on the product…
You can think of VCF as an operating system for the datacenter: it provides the necessary compute, storage, and network (IO) resources from the underlying hardware. Unlike a regular OS, VCF pools resources from multiple sources into a cohesive environment to host the enterprise's applications. Additionally, VCF provides a framework for deploying infrastructure in a manner consistent with cloud providers, along with robust lifecycle management. These capabilities free your infrastructure teams' valuable time for services that add value to the business, instead of base-level support tasks.
Under the covers, VCF is, at a minimum, a combination of vSphere ESXi, vSAN, NSX-T, and SDDC Manager. You can optionally add the vRealize components to the package. While vSAN is always required for the management domain, it is optional for the workload domains. For further reading on VCF, take a look at this Primer Article.
Purpose of this ATC Insight
The purpose of this ATC Insight is to show the flexibility of VCF in both private cloud and hybrid cloud use cases. Here, we discuss three different on-premises infrastructures: HPE Synergy, Dell VxRail, and Dell vSAN Ready Nodes. Each of these solutions operates a little differently during setup, but the VCF setup process is nearly the same, with some additional steps on VxRail. (It is important to note that VCF on VxRail is not a customer-deployable solution, and some infrastructure interaction with VCF on VxRail is limited.)
It is also possible to deploy VCF with Fibre Channel, NFS, or VMware vSphere Virtual Volumes (vVols) as the principal storage for VCF workload domains. While we don't have any of those use cases in our current POCs, we expect they will become more common moving forward.
The Proof of Concepts
POC Scenario One: VCF on Synergy
The Ask/POC Description:
In this specific POC, one of our government agency customers wanted to evaluate integrating an existing HPE Synergy environment with VCF. Cloud Foundation provides a composability feature that can connect to both HPE Synergy and Dell MX composable environments. Once the Redfish translation layer is configured, you can compose and decompose servers from within SDDC Manager using the Synergy infrastructure's available resources. Creating a single pane of glass within VCF increases deployment speed and efficiency. This POC also demonstrates monitoring the OneView environment from VCF via vRealize Operations.
HLD - (High-Level Design)
The initial VCF management cluster was deployed via Cloud Builder onto four Synergy 480 Gen10s. Once the vCenter environment was stood up, a small Linux VM was deployed into the environment and the OneView Redfish toolkit installed. With minimal configuration, the OneView Connector for VCF server was then registered within SDDC Manager (under SDDC Manager --> Administration --> Composable Infrastructure). Once connected to the OneView appliance, the available resources populated in the SDDC: compute, storage, and network resources available for use, along with all the server profile templates, were now visible. SDDC Manager could now spin up and spin down ESXi servers without the need to log into OneView. While servers spin up in VCF, you can actually see the same host profile being created within OneView. Kinda cool!
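Under the hood, the OneView Connector speaks the DMTF Redfish API, which models hardware as a `Systems` collection. As a rough illustration only (this sample payload and its field names are simplified assumptions, not the exact OneView schema), enumerating which systems are free to compose looks something like this:

```python
import json

# Simplified sample of a Redfish "Systems" collection response.
# Real OneView/Redfish payloads carry many more fields; this is an
# illustrative sketch, not the exact schema returned by the appliance.
SAMPLE_SYSTEMS = json.loads("""
{
  "@odata.id": "/redfish/v1/Systems",
  "Members": [
    {"@odata.id": "/redfish/v1/Systems/1", "Name": "Synergy-480-Gen10-1", "PowerState": "Off"},
    {"@odata.id": "/redfish/v1/Systems/2", "Name": "Synergy-480-Gen10-2", "PowerState": "On"}
  ]
}
""")

def composable_candidates(systems: dict) -> list[str]:
    """Return names of systems that are powered off (free to compose)."""
    return [m["Name"] for m in systems["Members"] if m.get("PowerState") == "Off"]

print(composable_candidates(SAMPLE_SYSTEMS))  # ['Synergy-480-Gen10-1']
```

In a live environment, the same collection would be fetched over HTTPS from the Redfish toolkit VM rather than parsed from a static sample.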
Now that servers are built out through SDDC Manager, it's best to keep an eye on the hardware, both in use and available, within VCF. The solution is vRealize Operations! VCF can utilize the full vRealize Suite, but for our purposes we only needed Lifecycle Manager and vRealize Operations. After vROps was built out, we uploaded some nifty customized dashboards provided by HPE, which give great views of the OneView resources via color-coded heatmaps. Unfortunately, there were no pre-built reports available, but with a little work these can be built to order.
The outcome of the POC was an eight-node environment running VCF 4.1 that could spin additional nodes up and down within SDDC Manager using the composable infrastructure. Not only could all the available OneView resources be used, but the hardware could be monitored from VCF via vRealize Operations. The setup and configuration of the composable infrastructure was relatively simple, and the GUI was easy to use. While the profile configuration abilities are limited and not nearly as agile as building the solution within OneView, I look forward to future versions.
POC Scenario Two: VCF on VxRail
The Ask/POC Description:
In this particular POC, one of our customers in the timeshare and vacation destination business wanted to see how VCF and HCX could help condense some of their legacy data centers into a hybrid-cloud datacenter. As you can see in the diagram below, we built a legacy datacenter consisting of just a couple of hosts and some EMC Fibre Channel storage. Between the legacy datacenter and the new datacenter, we injected latency using an IXIA device (65 ms of latency, with the WAN link throttled to 3 Gbps) to simulate what it would look like to move a VM across the wire between data centers at different locations.
For this POC we also leveraged our direct connection to VMC on AWS to simulate what that would look like for the customer when moving workloads using HCX to the cloud and back. This was a huge win for the customer because we took the requirements of latency and throughput and were able to give them a more real-world look and feel of how HCX would perform in their own environment.
In addition to the HCX discovery, we showed the customer how to effectively set up VCF on VxRail, explained the role of SDDC Manager, and demonstrated how to maintain a healthy VCF environment. Attached to this ATC Insight is a video showing the setup of a VxRail node during the RASR process, the setup of VxRail, and the addition of a VxRail cluster into VCF.
HLD - (High-Level Design)
At the end of this POC, the customer was not only able to see how to set up VCF on VxRail, HCX, and VMC on AWS; more importantly, they gained a good understanding of how to start incorporating these new VMware technologies into their new data center design. The relationship we built between the customer and VMware teams was paramount in helping us drive innovation for this customer. The customer was then able to officially begin designing their data center of the future.
POC Scenario Three: VCF on vSAN Ready Nodes
The Ask/POC Description:
In the final POC, one of the global financial customers we work with wanted to test their apps and services on VCF 4.0 before deploying the solution into production. This included the full suite of VMware applications, including but not limited to vRealize Operations, Automation, Log Insight, and Network Insight. The lab utilized NSX-T for network virtualization with overlay and underlay segments (per NSX-T's requirements), using BGP as the routing protocol and NSX-T Edge Node services to communicate with the northbound physical network infrastructure. The domain consisted of two racks: the first contained the switching and compute hardware for the management cluster, and the second contained the switching and compute for multiple workload clusters. Multiple workload clusters proved to be a valuable asset: they were used to test performance, features, and resiliency services simultaneously by assigning unique workload clusters to each test group interacting with the lab.
HLD - (High-Level Design)
This POC is ongoing and has produced great results for the customer. Knowing their services perform well on the VCF 4 platform will provide their IT departments with peace of mind when moving forward with the deployment of the production domains within their data centers.
What We've Learned
Through extensive testing and trial, we've learned some important lessons about what to do and what not to do when considering VCF. Here are some of those insights:
- Some hardware is worse than others.
  - Pay special attention to hardware compatibility around vCenter Lifecycle Manager (vLCM) and vSAN. Lacking support for vLCM means that workload domain (WLD) lifecycle management can be very limited.
  - Ensure that the hardware enumerates the VMNICs as 0 and 1. Some hardware enumerates the first usable NIC as 1, which will cause issues with WLD deployment.
  - The Dell C6420 servers suffer from both of these pitfalls and should be avoided for customer deployments of VCF.
- Make sure your use case is valid.
  - VCF is a lot more than lifecycle management. If a customer doesn't want or need NSX and is only buying VCF for the lifecycle management aspects, a hardware solution like VxRail or Synergy may meet their needs more directly.
- If Cloud Builder fails, there's a good chance something is wrong in your deployment parameters sheet (DPS). If it fails late in the process, you may need to reinstall your ESXi hosts and start over.
  - Double-check your deployment parameters sheet.
  - Then, double-check it again.
  - Better yet, check it a third time.
  - So many things can go wrong in the DPS. Avoid copy/paste, check for extra spaces, and look for incorrect subnets, bad passwords, etc.
- Don't use AVNs (Application Virtual Networks) unless necessary.
  - This is contrary to VMware's advice, but we have found that AVNs can have unpredictable issues in deployment (such as being unable to share certificates across an L3 boundary unless the BGP peer has jumbo frames enabled--the rest of the internet doesn't work that way!).
  - Today, AVNs are a placeholder for something to come later. They are an unnecessary complication…for now.
- It is possible to get VCF into a state where the entire environment needs to be rebuilt.
  - Slow down and move cautiously -- even after deployment. Some (a LOT of) changes cannot be reversed, and VMware Support has limited ability to help you recover.
  - If you make some of the bigger mistakes, it may be faster to reinstall the entire SDDC than to work through the process with VMware Global Support and Product Engineering.
  - There are no guardrails in vCenter for VCF. It is possible to make unsupported changes in vCenter with no warning that they will break VCF.
  - vRealize Lifecycle Manager is very difficult to unwind from VCF if you push it via SDDC Manager (via an AVN) and run into difficulties (such as with the initial deployment of the vIDM).
  - There can be internal inconsistencies between VCF components and Cloud Builder/SDDC Manager.
    - A specific example: it's possible to pick a password for NSX-T that works for the bring-up process and is validated by Cloud Builder/SDDC Manager, yet that same password can be rejected by NSX-T later on and cause cascading failures around WLD deployment and other tasks within SDDC Manager.
    - Pick long, secure passwords that you are sure will pass the requirements of each individual component.
- VCF on VxRail
  - Currently, this is only deployed by professional services or partners (we expect this to be relaxed in the coming months to include customer deployments).
  - Currently, there is no mixing in of workload domains that are not VxRail; all of your domains, from management to workload, have to be VxRail.
  - All VxRail domains have to be set up like a traditional VxRail system and then imported into SDDC Manager. For the first (management) domain, you set up the VxRail system with an internal vCenter, transition that vCenter so it is treated as external, and then deploy the Cloud Builder VM to set up NSX-T and the rest of the VCF suite.
  - Compared to traditional setups with vSAN Ready Nodes, setting up VCF on VxRail is a much more streamlined process, with less manual work. As long as the nodes are at their defaults, it is pretty easy to stand up the management domain on VxRail. There are a few caveats, but overall it's an easy process.
- HCX (Hybrid Cloud Extension)
  - HCX not only allows you to migrate from datacenter to cloud; it also allows you to migrate from datacenter to datacenter.
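Several of the DPS pitfalls above (copy/paste whitespace, bad subnets, passwords that pass Cloud Builder but fail a component) can be caught before bring-up with a simple pre-flight script. Here is a minimal sketch; the field names and the password policy are illustrative assumptions, not VMware's official validation rules:

```python
import ipaddress
import re

# Illustrative pre-flight checks for deployment-parameter values.
# Field names and the password policy below are examples only, not
# VMware's official validation rules.

def check_whitespace(name: str, value: str) -> list[str]:
    """Flag whitespace problems that commonly sneak in via copy/paste."""
    issues = []
    if value != value.strip():
        issues.append(f"{name}: leading/trailing whitespace")
    if "  " in value:
        issues.append(f"{name}: doubled internal spaces")
    return issues

def check_subnet(name: str, cidr: str) -> list[str]:
    """Reject malformed CIDRs and host bits set in the network address."""
    try:
        ipaddress.ip_network(cidr, strict=True)
    except ValueError as exc:
        return [f"{name}: {exc}"]
    return []

def check_password(name: str, pw: str) -> list[str]:
    """A strict common-denominator policy: 12+ chars, upper, lower,
    digit, special. Individual components may be stricter still."""
    rules = [
        (len(pw) >= 12, "at least 12 characters"),
        (re.search(r"[A-Z]", pw), "an uppercase letter"),
        (re.search(r"[a-z]", pw), "a lowercase letter"),
        (re.search(r"\d", pw), "a digit"),
        (re.search(r"[^A-Za-z0-9]", pw), "a special character"),
    ]
    return [f"{name}: missing {msg}" for ok, msg in rules if not ok]

# Hypothetical DPS values, two of them deliberately broken.
params = {
    "mgmt_subnet": "172.16.10.0/24",
    "vsan_subnet": "172.16.13.5/24",   # host bits set: will be flagged
    "nsxt_admin_password": "short1!",  # too short, no uppercase: flagged
}
problems = (check_whitespace("vsan_subnet", params["vsan_subnet"])
            + check_subnet("mgmt_subnet", params["mgmt_subnet"])
            + check_subnet("vsan_subnet", params["vsan_subnet"])
            + check_password("nsxt_admin_password", params["nsxt_admin_password"]))
for p in problems:
    print(p)
```

A few minutes spent on checks like these is far cheaper than a Cloud Builder failure late in bring-up followed by an ESXi reinstall.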
Call To Action
Having read this, our hope is to get you thinking about how VCF can help solve your infrastructure problems. Yes, there are some caveats with VCF setup, and yes, there can be some challenges along the way, but isn't that the case with most infrastructures you have to set up? As long as you know what challenges lie ahead, you will be able to prepare yourself, and in some cases, fix the problem before it has a chance to happen.
If you would like to work with WWT privately by attending a VCF workshop or have us conduct a POC in our award-winning lab ecosystem environment, our Lab Services/GET teams are ready to help get you on your way with VCF!
Spirent Network Emulator - The Spirent Network Emulator provides industry-leading flexibility in building and modeling complex real-life systems, enabling you to simulate networks and emulate the real-world conditions under which applications and platforms need to perform.
VxRail VCF Summary Video Highlights:
- 1:07 - 11:19: VxRail RASR process
- 11:19 - 22:00: VxRail setup process
- 22:00 - end: VxRail workload addition to VCF cluster