Category: NSX

  • VCF 9.1 Makes VKS Harder to Ignore

    VKS on VCF 9.1: What Actually Changed & Why It Matters

    A Comic Book Story in Seven Chapters

    Issue #01 · May 2026 · The VCF 9.1 Saga

    ⚡ Cast of Characters ⚡

    Captain VKS
    vSphere Kubernetes Service 3.6
    The hero. Born from vSphere, forged in CNCF conformance. Now powered up with VCF 9.1 abilities.
    The Architect
    Platform Engineer
    Our protagonist. Runs multi-domain VCF estates. Needs Kubernetes at enterprise scale without the circus.
    Cluster Creep
    The Villain of Sprawl
    Feeds on operational toil, slow provisioning, and fragmented toolchains. Grows stronger with every manual step.
    The Oracle
    VCF Operations 9.1
    Sees all. Knows cost. Tracks every namespace. Speaks in metrics and FinOps.
    🥊 The Challengers 🥊
    The Cloud Twins
    The Hyperscaler Duo
    They move fast and always whisper: “Just move to our cloud.” They charge per hour and never let go.
    The Red Baron
    The Opinionated Platform
    Arrives in full armor. Brings his own runtime, registry, mesh, and opinions about everything. Enterprise prices included.
    The Wrangler
    The Multi-Cluster Cowboy
    Rides across any ranch — any cloud, any edge, any distro. Freedom is his creed. But who’s managing the cattle?
    Bare Knuckle
    The DIY Brawler
    No platform. No hand-holding. Bare metal, kubeadm, and grit. Cheap up front. Costs you in blood and 3 AM pages.
    Chapter 01 · The 37-Minute Nightmare
    The data center. 6:42 AM. The Architect stares at a provisioning timer that refuses to move. Cluster Creep watches from the shadows, feeding on frustration.
    The Architect 37 minutes to spin up a dev cluster. Thirty. Seven. Minutes. The hyperscaler team next door gets theirs in ten. The CTO is asking questions.
    Cluster Creep Yesss… and that’s just the deployment. Wait until you see the upgrade windows. I’ve got 45 minutes of downtime planned for each cluster. You have 200 clusters. Do the math. 😈
    That’s 150 hours of maintenance windows per upgrade cycle… across the fleet…
    VCF 9.1 DROPS MAY 5, 2026
    May 5, 2026. Broadcom releases VCF 9.1. And everything changes.
    Captain VKS Miss me? I brought Fast Deploy. Let me show you the new numbers.
    Metric | VCF 9.0 | VCF 9.1
    Cluster Deploy Time | 37 min | 11 min (↓69%)
    Cluster Upgrade Time | 45 min | 15 min (↓67%)
    Max Clusters / Supervisor | ~100 | 500
    Node Pool Placement | Manual | DRS Intelligent
    Chapter 02 · The Challengers Step Forward
    Word of VCF 9.1 spreads. Four challengers emerge from the fog, each claiming the throne of enterprise Kubernetes. The Architect has heard their pitches before.
    The Cloud Twins Adorable upgrade, Captain. But we’ve been doing sub-10-minute clusters for years. Managed control plane. Global regions. Auto-scaling node groups. Why fight gravity? Just come to the cloud.
    Captain VKS Sure — and your managed control plane costs how much per cluster per month? Multiply that by 500 clusters. Now add the egress fees. Now add the data sovereignty audit your CISO just mandated. I run on hardware you already own.
    The Red Baron How charming. You finally got CNI choice? I ship with my own SDN, my own service mesh, my own registry, my own CI/CD pipelines, and a full developer portal. I am the platform. You’re still assembling one.
    Captain VKS You are the platform. That’s the problem. Your opinions become my constraints. Your lifecycle becomes my upgrade treadmill. Your per-core subscription becomes my CFO’s nightmare. I give choice. You give mandates.
    The Wrangler Y’all are so cute with your single-vendor stacks. I run on any infrastructure. True multi-cluster freedom. No lock-in. Ever.
    Captain VKS Freedom is great until your team is maintaining six different infrastructure backends. I give you 500 clusters on one Supervisor with one operational model. You give them options and a prayer.
    Bare Knuckle I don’t need a platform. kubeadm, a Makefile, and raw skill. Zero licensing. Zero overhead. Pure Kubernetes.
    Captain VKS I respect the craft. But who patches your nodes at 2 AM? Who handles etcd backups? Who runs certificate rotation? Your “zero cost” platform costs three full-time engineers.
    The Architect I’ve evaluated all of you. Here’s my problem: I already run VCF. My VMs, NSX networking, vSAN storage, and security policies are all here. I need Kubernetes that joins my platform — not one that replaces it or ignores it.
    The best Kubernetes platform is the one that doesn’t make me build a second operations team…
    Chapter 03 · Fast Deploy: 11 Minutes or Bust
    Captain VKS explains what changed under the hood. Fast Deploy isn’t a marketing stunt — it’s an architectural rework of the provisioning pipeline.
    Captain VKS Here’s what actually happened. We parallelized the node bootstrapping sequence, pre-staged container images into a local content library, and eliminated redundant API round-trips during cluster init. 11 minutes, from API call to workload-ready.
    The Architect What about upgrades? That’s where we bleed. Every cluster upgrade is a maintenance window, and my team juggles 200+ clusters.
    Captain VKS 45 minutes down to 15. Pre-staged images, parallel node drain-and-replace, and Multiple Clusters per Zone means you keep workloads running on Zone A while upgrading Zone B.
    ⚡ IMPACT METER ⚡
    • Provisioning Speed Gain: 69%
    • Upgrade Speed Gain: 67%
    • Scale Ceiling Increase: 5× (500 clusters)
    The Cloud Twins 11 minutes… fine, that’s competitive. But can you match our global availability zones?
    Captain VKS I don’t need 60 regions. My Architect’s data stays in his sovereign data center, on his hardware, under his compliance umbrella. Your 60 regions are 60 places his CISO has to audit.
    Chapter 04 · DRS Strikes Back: Intelligent Node Pool Placement
    [Figure: the DRS scheduler placing AI/ML node pools on a GPU host, DB and cache node pools on an NVMe host, and web/API node pools on a compute host]
    VCF 9.1 introduces Intelligent Node Pool Placement. This isn’t basic affinity rules — it’s DRS-level scheduling applied to Kubernetes node pools.
    Captain VKS GPU pods → GPU hosts. NVMe workloads → NVMe nodes. DRS algorithm decides placement — not your YAML-wrestling platform team.
    The Red Baron I have Topology Manager, NUMA-aware scheduling, and a full operator ecosystem. Infrastructure-aware placement is table stakes for me.
    Captain VKS You schedule within the cluster. I schedule the cluster itself. DRS sees the whole estate. Your scheduler sees one cluster.
    The Oracle With VKS Cost Showback in VCF Operations 9.1, I can tell you exactly what each namespace, each cluster, each team is costing you. FinOps FOCUS-compliant.
    The Oracle I also expose an API for your RAG pipelines and MCP frameworks — your AIOps engine can query cost data directly.
    • Per-NS Cost Attribution
    • FOCUS FinOps Compliant
    • Real-Time Pricing Estimates
    • Showback + Chargeback Capability
    Chapter 05 · Container-as-a-Service & The CNI Revolution
    VCF 9.1 introduces a simplified Container Service — deploy containers without deep Kubernetes expertise. Meanwhile, VKS 3.6 opens up CNI choice for the first time.
    Captain VKS First: Container-as-a-Service. Your app teams get a self-service surface. Click, deploy, done. No Supervisor clusters or ClusterClass YAML.
    Captain VKS Second: CNI freedom. VKS 3.6 deprecated ClusterBootstrap. Pick your CNI through the Addon Framework using AddonConfig CRDs. Antrea default, but the door is open.
    The Wrangler Oh, you’re just now letting people choose their CNI? Welcome to 2022, Captain.
    Captain VKS You let them choose. I let them choose with validated blueprints, lifecycle support, and a single vendor to call at 3 AM. Choice without support is just risk with extra steps.
    The Architect And the Ingress story? The popular open-source Ingress controller is being retired…
    Captain VKS Avi Load Balancer — natively integrated. Centralized control plane, distributed data plane, full observability. Plus vDefend gives you zero-trust lateral security for every pod.
    Chapter 06 · The Arena: Where Platforms Are Measured
    The Architect pulls up the scoreboard. No hype. No marketing. Just the dimensions that matter when you’re running Kubernetes in a regulated enterprise with 500+ VMs already on VCF.
    ⚔️ HEAD TO HEAD ⚔️
    Dimension | 🛡️ Captain VKS | ☁️ Cloud Twins | 🎩 Red Baron | 🤠 Wrangler | 🥊 Bare Knuckle
    Data Sovereignty | Your DC | Their DC | Your DC | Depends | Your DC
    VM + K8s Unified Ops | Native | Separate | Separate | Separate | Separate
    Infra-Aware Scheduling | DRS-Level | Node Groups | Topology Mgr | Manual | DIY
    Cluster Scale Ceiling | 500 / Supervisor | Unlimited* | Per Infra | Per Infra | Per Team
    Integrated FinOps | FOCUS Native | Cost Explorer | 3rd Party | 3rd Party | Spreadsheet
    Network Security | vDefend + NSX | VPC / SG | Built-in SDN | BYO | BYO
    Licensing Model | Per-Core VCF | Per-Cluster/Hr | Per-Core Sub | Open Source | Free
    Day 2 Toil | Low | Low | Medium | Medium | High
    AI / GPU Conformance | CNCF AI Cert | GPU Pools | Operators | BYO | BYO
    The Cloud Twins We still win on global reach and elastic scale.
    The Red Baron And I still own the developer experience story. Integrated CI/CD, GitOps, developer portal, all out of the box.
    Captain VKS Fair. I’m not claiming I win everywhere. But for organizations already running VCF — I’m the only Kubernetes that doesn’t create a second operational island. VMs and containers. One platform. One team. One pane.
    The Architect That’s the point everyone misses. I don’t need the “best” Kubernetes in a vacuum. I need the best Kubernetes for my stack. And my stack is VCF.
    Chapter 07 · The Numbers Don’t Lie
    💥 THE FINAL SHOWDOWN 💥
    Broadcom surveyed 44 VCF 9 customers in March 2026. Here’s what they found — and why the challengers are looking over their shoulders.
    • 51% Less Infra Mgmt Time
    • 46% Less Monitoring Time
    • 47% Less Capacity Needed
    • 39% Faster MTTR/MTTI
    Cluster Creep No… NO! My sprawl… my complexity… my beautiful 37-minute deploy times… NOOOOO!
    ⚡ DEFEATED ⚡
    The challengers watch from the sidelines. They’re not defeated — but they know the game just changed.
    The Cloud Twins We’ll be back. Hybrid is where we’re heading too. See you at the edge…
    The Red Baron Impressive numbers. But developer experience is the next battlefield. Don’t get comfortable.
    The Wrangler Not every ranch runs on one brand of fence. I’ll see you at the multi-cloud rodeo.
    Bare Knuckle Some of us still prefer the raw fight. But… 11 minutes is hard to argue with.
    The Architect VCF 9.1 gives me 11-minute deploys, 15-minute upgrades, 500 clusters per Supervisor, intelligent DRS-based node placement, native FinOps cost tracking, self-service CaaS, open CNI choice, native Avi ingress, and zero-trust pod security. All on the same VCF stack I’m already running.
    Captain VKS And I’m CNCF Kubernetes AI Conformant. The challengers are strong — I respect each of them. But none of them can do what I do: run Kubernetes as a native citizen of your existing VMware estate.
    VCF 9.1 doesn’t just iterate on VKS — it redefines the operational ceiling. Fast Deploy eliminates the provisioning tax. DRS-based placement removes manual scheduling toil. FinOps cost showback closes the last visibility gap. And with 500 clusters per Supervisor, VKS is the platform-scale Kubernetes runtime that VCF architects have been waiting for.

    The challengers each bring real strengths — managed simplicity, opinionated platforms, multi-cloud freedom, zero-cost entry. This isn’t a story where the hero has no flaws. But for the Architect running a VCF estate with VMs, containers, and AI workloads under one roof — the calculus is clear.

    The question is no longer “can VKS compete?” — it’s “what’s your excuse for not running it?”
  • Planning a VMware Cloud Foundation 9.0 Upgrade? Start Here…

    vmtechie.blog · Infrastructure Tools

    I Built a VCF Upgrade Path Planner: Here’s Why

    Tool: VCF Upgrade Path Planner · Covers: 8 upgrade paths · Target: VCF 9.0 / 9.0.2

    Example — vSphere 7.0 → VCF 9.0 upgrade journey

    All 8 Upgrade Paths Covered
    §

    Why I Built This

    If you’ve ever had to plan a VMware Cloud Foundation upgrade from scratch, you know how scattered the information can be. KB articles here, TechDocs pages there, blog posts from different release cycles, and no single place that ties it all together into a clear, ordered sequence. That frustration is exactly what drove me to build the VCF Upgrade Path Planner. As someone who works with VCF environments day-to-day and runs vmtechie.blog to share practical infrastructure knowledge with the community, I wanted to create something that gives engineers a solid starting point before they walk into a maintenance window — a tool that reflects real-world upgrade sequencing, not just the high-level marketing overview.

    This planner covers eight upgrade paths spanning vSphere 7.0, 7.0 U2/U3, 8.0, and 8.0 U2/U3 converge routes to VCF 9.0, the VCF 5.0 and 5.1/5.2 in-place upgrade paths, the 9.0.0/9.0.1 to 9.0.2 maintenance path, and a current-state check for VCF 9.0.2 — all linked directly to official Broadcom Knowledge Base articles, TechDocs pages, and VMware blog posts so you can verify everything against authoritative source material. A significant amount of research, testing, iteration, and community review has gone into getting the sequencing, version gates, and critical warnings right. That said, VCF is a complex and fast-moving platform, and I’m one person — so if you spot a step that’s missing, a version gate that’s wrong, or guidance that doesn’t match your experience in the field, please reach out and let me know. Every piece of feedback makes this tool better for everyone in the community.

    🔗

    Everything is sourced

    Every step links directly to the relevant Broadcom KB, TechDocs page, or VMware blog post so you can verify each recommendation against authoritative source material before acting on it.

    ⚠️

    Critical gates are flagged

    Version gates, one-way doors, and ordering requirements — like the Aria Operations 8.18 gate, the NSX Edge OVF certificate expiry fix in 9.0.2, and the mandatory vLCM Baseline-to-Image transition — are surfaced prominently, not buried in footnotes.

    §

    How We Calculate Time, Risk & Effort

    The complexity numbers shown in each upgrade path — estimated duration, risk score, and effort score — are not pulled from a vendor SLA document. They are practical estimates built from field experience with VCF environments of varying sizes and community input from engineers who have executed these upgrades in production. Here is how each metric is derived.

    Duration: 4–8 weeks (estimated)
    Risk Score: 50 out of 100
    Effort Score: 65 out of 100

    Duration

    Estimated based on the number of sequential phases in the path, the number of components that require ordered upgrades (SDDC Manager → NSX → vCenter → ESXi is always serial, never parallel), and the realistic time each component upgrade takes in a mid-sized environment. Converge paths from vSphere carry additional time for pre-converge remediation, vLCM Baseline-to-Image transitions, and the VCF Installer workflow itself. Paths starting from VCF 5.0 carry extra time for the mandatory VCF 5.2 intermediate hop. These are conservative estimates — your actual duration will vary based on node count, hardware speed, precheck findings, change management windows, and whether you are running a lab or a production fleet.

    💡

    What is RDU (Reduced Downtime Upgrade)?

    Starting with VCF 9.0, major-version vCenter upgrades (e.g. 8.x → 9.x) exclusively use Reduced Downtime Upgrade (RDU). Instead of upgrading in place and taking the existing vCenter offline for the full duration, RDU deploys a brand-new temporary vCenter appliance alongside the existing one, migrates all configuration and inventory data across while the environment stays running, then decommissions the old appliance. The result is a much shorter management plane outage: typically just a few minutes for the final cutover rather than the extended downtime of a traditional in-place upgrade. In VCF 9.0.1+, the Installer automatically assigns a 169.254.x.x link-local IP address to the temporary appliance, so in most environments you no longer need to pre-stage a static IP on your management network. Maintenance updates within 9.x do not need RDU; they use a regular in-place upgrade with no temporary appliance.

    Risk Score

    A relative measure from 0 to 100 that reflects how many irreversible transitions the path contains, how many components must be upgraded in strict sequence, and how much room there is to safely roll back if something goes wrong. A vSphere 7.0 converge path scores higher risk not because converge is inherently dangerous, but because it involves more one-way doors — once the VCF Installer runs and creates the management domain, you cannot unconverge back to standalone vSphere. Maintenance paths like 9.0.0 to 9.0.2 score low risk because they involve fewer components, shorter windows, and well-understood rollback via snapshot.

    Effort Score

    Reflects the total planning and execution workload — number of discrete steps, number of decisions that require engineer judgment rather than automation, number of separate maintenance windows required, and the degree of documentation and preparation needed before you can safely begin. A vSphere 7.0 to VCF 9.0 path scores high effort not because any single step is especially hard, but because the cumulative preparation — HCL checks, Baseline-to-Image transitions, ELM removal, VCF Installer staging, Aria Suite pre-work, workload domain imports — adds up to a substantial project even before the first upgrade window opens.

    ⏱️
    Duration Factors
    • Sequential component count
    • Intermediate hops required
    • Pre-converge remediation
    • Workload domain count
    • Aria Suite pre-work
    🎯
    Risk Factors
    • One-way door transitions
    • Rollback constraints
    • NSX version direction rules
    • vCenter RDU complexity
    • ELM removal requirements
    🏗️
    Effort Factors
    • Total discrete steps
    • Judgment calls required
    • Separate change windows
    • Documentation prep
    • Depot configuration work

    All three scores scale relative to each other across the eight paths, so they are most useful as a comparison tool — if you are deciding between targeting VCF 9.0.0 or 9.0.1, or choosing whether to converge from vSphere 8.0 U3 versus waiting to patch to U3 first, the scores give you a quick read on the relative complexity trade-off. They are starting points for your own planning conversation, not guarantees — always validate your specific environment against official Broadcom documentation and run the SDDC Manager upgrade prechecks before committing to a maintenance window.

    §

    A Community Tool

    VCF is a complex and fast-moving platform, and I’m one person. A significant amount of hard work has gone into building and refining this planner: cross-referencing every step against official Broadcom documentation, KB articles, and VMware engineering blog posts, running it through multiple review cycles, and iterating on the content based on community feedback. But if you spot a step that’s missing, a version gate that’s wrong, or guidance that doesn’t match your experience in the field, please reach out and let me know. Drop a comment below or contact me directly; every piece of feedback makes this tool better for everyone in the community.

    🚀

    Try the VCF Upgrade Path Planner

    Open the tool directly on vmtechie.blog and generate your tailored upgrade plan in seconds.


  • How the VCF 9 Fleet Sizer Actually Works

    How the VCF 9 Fleet Sizer Actually Works

    A complete walkthrough of every calculation behind the tool — from raw NVMe capacity to ESA protection factors, NVMe memory tiering, and VCF licence entitlement. No black boxes.


    Table of Contents

    1. What the tool sizes
    2. Host specification inputs
    3. Management VM stack
    4. Compute sizing formula
    5. vSAN ESA storage pipeline
    6. Protection policies & PF table
    7. Final host count & limiter
    8. NVMe memory tiering
    9. External storage mode
    10. VCF licence entitlement
    11. Principal storage options (KB 416270)
    12. Assumptions & caveats

    1. What the tool sizes

    The VCF 9 Fleet Sizer calculates the minimum number of ESXi hosts required across a VMware Cloud Foundation deployment — one Management Domain and any number of VI Workload Domains. For each domain it independently determines whether CPU, memory, or storage is the binding constraint, and returns the host count driven by the most demanding dimension.

    The sizer is built specifically for VCF 9 with vSAN ESA — the Express Storage Architecture that requires NVMe-only drives and operates as a single storage tier without a separate cache/capacity split. It also models external storage mode (Fibre Channel, NFS) where hosts are sized on compute and memory only, and a disaggregated NVMe memory tiering model unique to VCF 9.

    ⚠️ Planning aid only — not an official Broadcom tool. All outputs are estimates based on the inputs you provide. Validate every design against official Broadcom documentation, the VMware HCL, and field engineering guidance before procurement or deployment. Real-world DRR and vSAN overheads vary significantly by workload.


    2. Host specification inputs

    Every domain (management and each WLD) has an independent host specification. The tool does not assume all hosts are identical across domains — a management cluster might run 2×16c hosts while a production WLD uses 2×32c AI-optimised nodes.

    Input | Default | Used in | Notes
    CPU Qty | 2 | Core count, licensing | Sockets per host
    Cores per CPU | 16 | Core count, licensing | Physical cores; no hyperthreading multiplier applied
    RAM (GB) | 1,024 | Memory sizing | Total usable host RAM
    NVMe Qty | 6 | Storage sizing | NVMe drives per host (vSAN ESA only)
    NVMe Size (TB) | 7.68 | Storage sizing | TB decimal; converted to GB via ×1,000
    CPU Oversubscription | (not stated) | Usable vCPU | vCPU:pCPU ratio; applies before reserve
    RAM Oversubscription | (not stated) | Usable RAM | 1× = no oversubscription. Rarely exceed 1× for RAM
    Compute Reserve % | 30% | Usable vCPU & RAM | Headroom withheld from placement (HA, overhead)

    Raw capacity per host formulas:

    Host Cores = CPU Qty × Cores per CPU
    Raw GB per Host = NVMe Qty × NVMe Size (TB) × 1,000

    ⚠️ No hyperthreading multiplier. The sizer deliberately does not multiply physical cores by 2 for hyperthreading. Logical thread counts are workload-specific and highly variable. Instead, the CPU oversubscription ratio gives you explicit control. A 2× ratio on a 32-core host models the same headroom as a 64-thread count at 1× — but you’re aware you’re making that choice.


    3. Management VM stack

    The Management Domain hosts a fixed stack of VCF infrastructure VMs. These are not user workloads — they are the control plane. Their combined vCPU, RAM, and disk demand is the entire sizing input for the management cluster. The tool carries an accurate per-component VM stack based on current VCF 9 T-shirt sizes from Broadcom documentation.

    Component | Sizes | vCPU range | RAM range | Disk range
    vCenter Server (Mgmt) | S / M / L / XL | 4 – 24 | 21 – 58 GB | 694 – 2,283 GB
    NSX Manager | M / L / XL | 6 – 24 | 24 – 96 GB | 300 – 400 GB
    NSX Edge | S / M / L / XL | 2 – 16 | 4 – 64 GB | 200 GB
    NSX Global Manager | S / M / L / XL | 4 – 24 | 16 – 96 GB | 300 – 400 GB
    Avi Load Balancer | S / M / L | 8 – 24 | 24 – 48 GB | 128 – 512 GB
    vCenter Server (WLD) | S / M / L / XL | 4 – 24 | 21 – 58 GB | 694 – 2,283 GB
    VCF Operations (SDDC Mgr) | S / M / L / XL | 4 – 24 | 16 – 128 GB | 274 GB
    VCF Operations Collector | S / M | 2 – 4 | 8 – 32 GB | 144 GB
    VCF Operations for Logs | S / M / L | 12 – 48 | 24 – 96 GB | 1,590 GB
    VCF Operations for Networks | L / XL / XXL | 12 – 48 | 24 – 96 GB | 1,590 GB
    VCF Net. Collector | M / L / XL / XXL | 4 – 16 | 12 – 48 GB | 200 – 300 GB
    Identity Manager | Embedded / HA | 0 – 32 | 0 – 64 GB | 0 – 400 GB

    Management sizing is deterministic: configure your component sizes, and the tool sums the total vCPU, RAM, and disk demand — no workload VM estimates needed.


    4. Compute sizing formula

    For Workload Domains, tenant demand is specified as VM count × per-VM averages for vCPU, RAM, and disk. Infrastructure VMs (NSX Edges, VKS Supervisor nodes) can optionally be included in the WLD demand totals. All demands are then sized against the host specification to determine the compute host floor.

    WLD demand totals:

    Demand vCPU = (VMs × vCPU/VM) + Infra vCPU
    Demand RAM = (VMs × RAM/VM) + Infra RAM
    Demand Disk = (VMs × Disk/VM) + Infra Disk

    Usable capacity per host:

    Usable vCPU/host = Host Cores × CPU Oversub × (1 − Reserve%)
    Usable RAM/host = Host RAM × RAM Oversub × (1 − Reserve%)

    Compute host floors (evaluated independently):

    CPU Hosts = ⌈ Demand vCPU / Usable vCPU per host ⌉
    RAM Hosts = ⌈ Demand RAM / Usable RAM per host ⌉

    Example: 200 VMs × 4 vCPU = 800 vCPU demand. Host: 2×16c = 32 physical cores × 2× oversub × 0.70 reserve factor = 44.8 usable vCPU/host. CPU Hosts = ⌈ 800 / 44.8 ⌉ = 18 hosts.
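
The compute floors above can be sketched in a few lines of Python. This is only an illustration of the formulas as written in this section, not the tool's actual code; the function names are hypothetical, and the reserve is passed as a fraction (0.30 for 30%).

```python
import math


def usable_per_host(raw: float, oversub: float, reserve: float) -> float:
    """Usable capacity per host = raw capacity x oversubscription x (1 - reserve)."""
    return raw * oversub * (1 - reserve)


def host_floor(demand: float, usable: float) -> int:
    """Smallest whole number of hosts whose usable capacity covers the demand."""
    return math.ceil(demand / usable)


# Worked example from the text: 200 VMs x 4 vCPU on 2x16c hosts,
# 2x CPU oversubscription, 30% compute reserve.
host_cores = 2 * 16
usable_vcpu = usable_per_host(host_cores, oversub=2.0, reserve=0.30)
cpu_hosts = host_floor(200 * 4, usable_vcpu)

print(f"usable vCPU/host = {usable_vcpu:.1f}, CPU hosts = {cpu_hosts}")
```

The same two functions cover the RAM floor: pass host RAM, the RAM oversubscription ratio, and the RAM demand instead.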


    5. vSAN ESA storage pipeline

    vSAN ESA storage sizing is a sequential pipeline of capacity transformations. Each stage adds overhead for a specific reason. Starting from raw VM disk demand, the pipeline applies data reduction, swap space, protection overhead, free space reserve, and growth buffer — in that order — to arrive at the total raw capacity required and therefore the storage host floor.

    Pipeline stages:

    Step 1 — VM Capacity GB = Demand Disk GB ÷ DRR
    (DRR = Dedup Ratio × Compression Ratio)
    Step 2 — Swap GB = Demand RAM GB × VM Swap%
    (100% for mgmt, configurable for WLD)
    Step 3 — Interim GB = VM Capacity GB + Swap GB
    Step 4 — Protected GB = Interim GB × Protection Factor (PF)
    Step 5 — With Free GB = Protected GB × (1 + vSAN Free%)
    Step 6 — Total Required = With Free GB × (1 + Growth%)

    Storage host floor:

    Effective Hosts = Total Hosts − Failures to Tolerate
    Per-Host Requirement = Total Required GB ÷ Effective Hosts
    Storage Hosts = ⌈ Total Required GB / Raw GB per Host ⌉ + Failures
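
As a rough sketch, the six pipeline stages and the storage host floor translate to Python as follows. The class and function names are hypothetical, percentages are expressed as fractions, and the sample free-space and growth values (25% and 10%) are illustrative assumptions rather than tool defaults. The host floor uses the last formula above (ceiling of total required over raw per host, plus failures to tolerate).

```python
import math
from dataclasses import dataclass


@dataclass
class StorageInputs:
    demand_disk_gb: float
    demand_ram_gb: float
    dedup_ratio: float = 1.0
    compression_ratio: float = 1.0
    swap_pct: float = 1.0           # fraction of demand RAM reserved as swap
    protection_factor: float = 1.5  # PF for the chosen policy
    vsan_free_pct: float = 0.25     # illustrative value, not a tool default
    growth_pct: float = 0.10        # illustrative value, not a tool default
    ftt: int = 1


def total_required_gb(s: StorageInputs) -> float:
    drr = s.dedup_ratio * s.compression_ratio
    vm_capacity = s.demand_disk_gb / drr           # Step 1
    swap = s.demand_ram_gb * s.swap_pct            # Step 2
    interim = vm_capacity + swap                   # Step 3
    protected = interim * s.protection_factor      # Step 4
    with_free = protected * (1 + s.vsan_free_pct)  # Step 5
    return with_free * (1 + s.growth_pct)          # Step 6


def storage_hosts(s: StorageInputs, raw_gb_per_host: float) -> int:
    return math.ceil(total_required_gb(s) / raw_gb_per_host) + s.ftt


inputs = StorageInputs(demand_disk_gb=100_000, demand_ram_gb=8_000)
raw_per_host = 6 * 7.68 * 1_000  # 6 x 7.68 TB NVMe, TB -> GB via x1,000
print(round(total_required_gb(inputs)), storage_hosts(inputs, raw_per_host))
```

With these sample inputs the pipeline needs roughly 222,750 GB raw, which five 46,080 GB hosts can hold; adding one host for FTT=1 gives a storage floor of six.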

    Data Reduction Ratio (DRR)

    The tool splits DRR into two separate inputs: Dedup Ratio and Compression Ratio. DRR = Dedup × Compression. Both default to 1.0 (no reduction) because real-world ratios depend entirely on data entropy — databases compress poorly, VDI golden images deduplicate extremely well. Using optimistic DRR values leads to undersized storage clusters.

    ⚠️ DRR above 2.0 is optimistic. Unless you have measured DRR from an equivalent workload in your environment, keep both ratios at 1.0. A DRR of 2.0 halves your storage host count. If the real-world ratio comes in at 1.2, you’ll need significantly more hosts than planned.

    TiB conversion

    The tool uses binary TiB throughout. NVMe drives are marketed in TB decimal (1 TB = 1,000 GB). Conversion: 1 TB = 1,000 GB = 0.9095 TiB. A 6× 7.68 TB host = approximately 41.9 TiB raw per host after conversion.
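
The conversion factor can be checked directly from the byte definitions (1 TB = 10^12 bytes, 1 TiB = 2^40 bytes). A minimal sketch, with a hypothetical helper name:

```python
TB_BYTES = 10**12   # decimal terabyte
TIB_BYTES = 2**40   # binary tebibyte


def tb_to_tib(tb: float) -> float:
    """Convert decimal TB to binary TiB."""
    return tb * TB_BYTES / TIB_BYTES


factor = tb_to_tib(1)
raw_tib_per_host = tb_to_tib(6 * 7.68)  # six 7.68 TB drives per host

print(f"1 TB = {factor:.4f} TiB")
print(f"6 x 7.68 TB = {raw_tib_per_host:.1f} TiB raw per host")
```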


    6. Protection policies & PF table

    The Protection Factor (PF) is the storage overhead multiplier applied to usable data to account for redundancy. It is determined by your chosen RAID type, FTT (Failures to Tolerate), and for RAID-5, the stripe width. The tool enforces the minimum host count per policy.

    Policy | PF | Min Hosts | FTT | Notes
    RAID-5 2+1 FTT=1 | 1.50x | 3 | 1 | Default; best balance of protection and efficiency
    RAID-5 4+1 FTT=1 | 1.25x | 6 | 1 | Lower overhead but needs 6+ hosts
    RAID-6 4+2 FTT=2 | 1.50x | 6 | 2 | Two simultaneous drive failures tolerated
    Mirror FTT=1 | 2.00x | 3 | 1 | Simple mirror; highest rebuild performance
    Mirror FTT=2 | 3.00x | 5 | 2 | Three copies of every object
    Mirror FTT=3 | 4.00x | 7 | 3 | Maximum redundancy; very high storage cost
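
The PF values in the table follow a simple pattern: for erasure coding, PF is (data + parity) ÷ data, and for mirroring it is the number of copies, FTT + 1. The sketch below reproduces the table from that reading. The pattern is an interpretation of the table rather than something taken from the tool's source, and the names are hypothetical; the minimum host counts are copied from the table as-is.

```python
def erasure_coding_pf(data: int, parity: int) -> float:
    """Raw-to-usable multiplier for a data+parity stripe."""
    return (data + parity) / data


def mirror_pf(ftt: int) -> float:
    """Mirroring keeps FTT + 1 full copies of every object."""
    return float(ftt + 1)


# policy name -> (protection factor, minimum hosts from the table)
POLICIES = {
    "RAID-5 2+1 FTT=1": (erasure_coding_pf(2, 1), 3),
    "RAID-5 4+1 FTT=1": (erasure_coding_pf(4, 1), 6),
    "RAID-6 4+2 FTT=2": (erasure_coding_pf(4, 2), 6),
    "Mirror FTT=1": (mirror_pf(1), 3),
    "Mirror FTT=2": (mirror_pf(2), 5),
    "Mirror FTT=3": (mirror_pf(3), 7),
}


def protected_gb(interim_gb: float, policy: str) -> float:
    """Apply the policy's protection factor to the pre-protection capacity."""
    pf, _min_hosts = POLICIES[policy]
    return interim_gb * pf


for name, (pf, min_hosts) in POLICIES.items():
    print(f"{name}: PF {pf:.2f}x, min {min_hosts} hosts")
```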

    7. Final host count & limiter

    The final host count is the maximum across four independent floors: CPU hosts, RAM hosts, storage hosts, and the policy minimum. The tool identifies which floor is binding and labels it the Limiter.

    Final Hosts = max( CPU Hosts, RAM Hosts, Storage Hosts, Policy Min )
    Limiter | Meaning | Common cause
    Compute | CPU is the binding constraint | High vCPU density, low oversub ratio
    Memory | RAM is the binding constraint | Memory-intensive workloads, RAM oversub at 1×
    Storage | vSAN ESA capacity drives the count | Large disk demand, high PF, low DRR, insufficient NVMe
    Policy | Protection policy min host count | Small cluster: compute is fine but the policy enforces a minimum of N hosts

    When storage is the limiter, your NVMe capacity per host is insufficient to hold the protected dataset within the compute-determined host count. Solutions: increase NVMe drive count or size, relax the vSAN free% reserve, or accept a higher host count.
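
Picking the final host count and its limiter is a max over the four floors. A minimal sketch with a hypothetical function name; how the tool breaks ties between equal floors is not described, so the order used here (Compute, Memory, Storage, Policy) is an assumption.

```python
def final_hosts(cpu_hosts: int, ram_hosts: int,
                storage_hosts: int, policy_min: int) -> tuple[int, str]:
    """Return (final host count, limiter label) for one domain."""
    floors = {
        "Compute": cpu_hosts,
        "Memory": ram_hosts,
        "Storage": storage_hosts,
        "Policy": policy_min,
    }
    # max() returns the first label with the largest value, so ties
    # resolve in the dictionary order above (an assumption).
    limiter = max(floors, key=floors.get)
    return floors[limiter], limiter


print(final_hosts(cpu_hosts=18, ram_hosts=9, storage_hosts=6, policy_min=3))
print(final_hosts(cpu_hosts=4, ram_hosts=4, storage_hosts=9, policy_min=3))
```

The first call reports Compute as the limiter at 18 hosts; the second reports Storage at 9 hosts, the case where adding NVMe capacity per host would lower the count.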


    8. NVMe memory tiering (VCF 9)

    VCF 9 introduces NVMe-backed memory tiering, where fast NVMe drives act as a memory extension. A partition of each NVMe drive is set aside as a memory tier — not storage — allowing effective RAM per host to exceed physical DRAM installed. This can reduce the host count when memory is the sizing constraint.

    Tiering formulas:

    Partition GB = min( Drive GB, DRAM × NVMe Ratio, 512 GB cap )
    NVMe Ratio Used = Partition GB ÷ Host DRAM GB
    Effective Host RAM = Host DRAM × (1 + NVMe Ratio Used)
    Tiered Demand RAM = ( Eligible Demand ÷ (1 + NVMe Ratio Used) ) + Ineligible Demand

    Key inputs: Eligibility % (what fraction of workload is not latency-sensitive), NVMe-to-DRAM ratio (GB of NVMe tier per GB of DRAM), and tier drive size (separate from vSAN data drives). The effective RAM and reduced demand figure feed back into the RAM host floor calculation.
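
The tiering formulas can be sketched as below. The names are hypothetical and the sample inputs are made up for illustration; the code only evaluates the four formulas as written and does not model how the tool feeds the results back into the RAM host floor.

```python
from typing import NamedTuple


class TieringResult(NamedTuple):
    partition_gb: float
    nvme_ratio_used: float
    effective_host_ram_gb: float
    tiered_demand_ram_gb: float


def memory_tiering(host_dram_gb: float, drive_gb: float, nvme_ratio: float,
                   eligible_demand_gb: float, ineligible_demand_gb: float,
                   cap_gb: float = 512) -> TieringResult:
    # The tier partition is limited by the drive, the requested ratio, and the cap.
    partition = min(drive_gb, host_dram_gb * nvme_ratio, cap_gb)
    ratio_used = partition / host_dram_gb
    effective_ram = host_dram_gb * (1 + ratio_used)
    # Only the eligible share of demand is discounted by the tier.
    tiered_demand = eligible_demand_gb / (1 + ratio_used) + ineligible_demand_gb
    return TieringResult(partition, ratio_used, effective_ram, tiered_demand)


# Illustrative inputs: 1,024 GB DRAM, 1,600 GB tier drive, 1:1 requested ratio,
# 6,000 GB of tiering-eligible demand and 4,000 GB that must stay in DRAM.
result = memory_tiering(1024, 1600, 1.0, 6000, 4000)
print(result)
```

In this example the 512 GB cap wins over the requested 1:1 ratio, so the ratio actually used is 0.5, effective RAM per host is 1,536 GB, and tiered demand drops from 10,000 GB to 8,000 GB.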

    ⚠️ Tiering caveats. NVMe tiering suits read-heavy workloads with temporal locality. It is not appropriate for latency-sensitive databases, real-time analytics, or anything where memory bandwidth consistency matters. The eligibility % input requires honest assessment of your workload mix.


    9. External storage mode

    Both the Management Domain and each WLD can be toggled to External Array mode — modelling Fibre Channel or NFS as principal storage. In this mode, the vSAN ESA storage pipeline is bypassed entirely. Host count is determined by compute only, and the user supplies an estimated array capacity for documentation.

    Final Hosts (ext) = max( CPU Hosts, RAM Hosts, Policy Min )
    — Storage floor is removed

    The Limiter can only be Compute, Memory, or Policy. No ESA capacity, PF, or per-host storage figures are calculated for external domains.

    Entitlement impact

    Every VCF core licence includes 1 TiB of vSAN raw storage entitlement. When a domain runs external storage, those cores are still licensed at the same cost but the bundled vSAN storage is unused.

    Forfeited TiB = Licensed Cores × 1 TiB/core

    For a 10-host domain with 2×32c hosts, that’s 640 TiB of vSAN entitlement forfeited — storage the customer is paying for but not using. The tool surfaces this inline, in the Fleet License Summary, and in the export report so the commercial impact is visible before procurement conversations begin.
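
    As a rough sketch, the external-mode host count and the forfeited entitlement reduce to two small functions. The function names and the tie-breaking order for the limiter are assumptions for illustration; the arithmetic follows the two formulas above.

```python
def final_hosts_external(cpu_hosts, ram_hosts, policy_min):
    """Host count for an external-storage domain: the storage floor is removed."""
    floors = {"Compute": cpu_hosts, "Memory": ram_hosts, "Policy": policy_min}
    # On a tie, max() returns the first key in dict order (an assumption, not
    # necessarily how the sizer breaks ties).
    limiter = max(floors, key=floors.get)
    return floors[limiter], limiter


def forfeited_tib(hosts, sockets_per_host, cores_per_socket, tib_per_core=1):
    """Forfeited TiB = Licensed Cores x 1 TiB/core."""
    return hosts * sockets_per_host * cores_per_socket * tib_per_core


hosts, limiter = final_hosts_external(cpu_hosts=6, ram_hosts=9, policy_min=3)
unused_entitlement = forfeited_tib(hosts=10, sockets_per_host=2, cores_per_socket=32)
```

    The second call reproduces the worked example: 10 hosts × 64 cores gives 640 licensed cores, and therefore 640 TiB of unused entitlement.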


    10. VCF licence entitlement calculation

    VCF 9 is licensed per core. The tool calculates total core count across the fleet and derives the vSAN storage entitlement bundled with those licences.

    Mgmt Cores = Mgmt Hosts × Host Cores
    WLD Cores = Σ( WLD Hosts × Host Cores )
    Entitlement (TiB) = ( Mgmt Cores + WLD Cores ) × 1 TiB/core
    Fleet vSAN Raw TiB = Σ( Hosts × NVMe Qty × NVMe TB × 0.9095 )
    Add-on Required = max( 0, Fleet Raw TiB − Entitlement TiB )

    If raw capacity exceeds entitlement, the difference is flagged as Add-on TiB Required — additional vSAN capacity licensing needed beyond what’s included in core licences. External storage domains exclude their array capacity from the fleet raw total.
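
    The fleet-level calculation can be sketched as follows. The `Domain` class, its field names, and the sample fleet are hypothetical; the constants and formulas are the ones listed above, with external-storage domains contributing cores but no vSAN raw capacity.

```python
from dataclasses import dataclass

TB_TO_TIB = 0.9095   # conversion factor used in the Fleet vSAN Raw TiB formula
TIB_PER_CORE = 1     # bundled vSAN entitlement per licensed core


@dataclass
class Domain:
    name: str
    hosts: int
    cores_per_host: int
    nvme_qty: int = 0
    nvme_tb: float = 0.0
    external_storage: bool = False

    @property
    def cores(self):
        return self.hosts * self.cores_per_host

    @property
    def vsan_raw_tib(self):
        # External-array domains are excluded from the fleet raw total.
        if self.external_storage:
            return 0.0
        return self.hosts * self.nvme_qty * self.nvme_tb * TB_TO_TIB


def fleet_licence_summary(domains):
    """Return cores, entitlement, raw capacity and add-on TiB for a fleet."""
    total_cores = sum(d.cores for d in domains)
    entitlement = total_cores * TIB_PER_CORE
    raw = sum(d.vsan_raw_tib for d in domains)
    return {
        "cores": total_cores,
        "entitlement_tib": entitlement,
        "raw_tib": raw,
        "addon_tib": max(0.0, raw - entitlement),
    }


fleet = [
    Domain("mgmt", hosts=4, cores_per_host=64, nvme_qty=8, nvme_tb=7.68),
    Domain("wld01", hosts=10, cores_per_host=64, nvme_qty=10, nvme_tb=15.36),
    Domain("wld02", hosts=6, cores_per_host=64, external_storage=True),
]
summary = fleet_licence_summary(fleet)
```

    In this sample fleet the external domain still adds 384 cores of entitlement, which offsets part of the raw capacity deployed in the two vSAN domains.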


    11. Principal storage options in VCF 9 (KB 416270)

    VCF 9 supports a broader set of principal storage options than previous versions. Some are available via standard greenfield workflows; others require the Converge workflow. This distinction matters — it affects automation, LCM, and Day 2 operations.

    Storage ModelMgmt DefaultMgmt AdditionalVI WLDMethod
    vSAN ESAPrincipalPrincipalPrincipal🟢 Greenfield
    vSAN OSAPrincipalPrincipalPrincipal🟢 Greenfield
    Storage Cluster (disagg. vSAN)PrincipalPrincipal🟢 Greenfield
    Compute-Only ClusterPrincipalPrincipal🟢 Greenfield
    Fibre Channel (FC)PrincipalPrincipal + SuppPrincipal + Supp🟢 Greenfield
    NFS v3PrincipalPrincipal + SuppPrincipal + Supp🟢 Greenfield
    iSCSIPrincipal*Principal*Principal*🔄 Converge
    NFS v4.1Principal*Principal*Principal*🔄 Converge
    FCoEPrincipal*Principal*Principal*🔄 Converge
    NVMe/FC · NVMe/TCP · NVMe/RDMAPrincipal*Principal*Principal*🔄 Converge

    * Via Converge workflow: deploy ESXi 9 → configure target datastore → deploy vCenter 9 → import into VCF 9 using Converge (management) or Import vCenter (WLD).

    ⚠️ Day 2 operations constraint: For non-LCM Day 2 operations (host commissioning, adding/removing hosts or clusters), perform the operation in vCenter first, then run Sync Inventory in VCF Operations. If this step is skipped, lifecycle management in VCF Operations will be blocked for those hosts and clusters.

    Source: Broadcom KB Article 416270


    12. Assumptions & caveats

    Assumption | Detail
    Single cluster per domain | Each WLD is modelled as one cluster. Multi-cluster WLDs are not supported.
    Homogeneous hosts | All hosts within a domain use the same spec. Mixed-node clusters are not modelled.
    vSAN ESA only | The storage pipeline models ESA only. vSAN OSA has different overhead characteristics.
    Growth is a flat buffer | Growth % is applied once, not compounded year-over-year. Add headroom manually for multi-year plans.
    VM Swap fixed at 100% for mgmt | The management domain’s swap requirement is not user-configurable.
    No stretched cluster modelling | Stretched clusters double host count and require witness nodes; they are not currently modelled.
    Flat DRR across all data | A single DRR applies to the entire disk demand. Mixed workloads with varying compressibility are not modelled per-VM.
    No explicit vSAN CPU/RAM overhead | vSAN ESA consumes a small amount of host CPU and memory. Include this in your Compute Reserve % input.

    🚫 Not an official Broadcom tool. This sizer is an independent planning aid built by vmtechie.blog. It is not endorsed by or affiliated with Broadcom. All figures are estimates. Validate every design against official Broadcom TechDocs, VMware HCL, and field engineering guidance before procurement or deployment.

  • VCF 9 Fleet Planning Sizer

    After several VCF design sessions—navigating management domains, ESA policies, and the new core-based licensing—one thing became clear: we have plenty of docs, but we need more interactive clarity. I built the VCF 9 Fleet Planning Sizer (ESA Only) to help architects model environments quickly.

    🔷 VCF 9 Fleet Planning Sizer (ESA Only)

    👉 Try it here: https://sizer.vmtechie.blog/

    This is an independent planning calculator designed to help architects model:

    • Infrastructure VM footprint (Supervisor, Edge, etc.)
    • Management Domain sizing
    • Multiple Workload Domains
    • ESA storage behavior
    • DRR (Dedup × Compression realism)
    • Failure domain modeling (0 / N+1 / N+2)
    • Core-based licensing visibility
    • vSAN entitlement vs raw consumption

    Why I Built This Tool

    Designing VCF 9 isn’t just about adding up VMs. It’s about navigating the “Triple Constraint”: Compute, ESA Storage, and Licensing. In real architecture discussions, we constantly ask:

    • What is actually limiting this cluster?
    • CPU, Memory, or Storage?
    • How many hosts do we really need?
    • What does FTT=2 + RAID-6 really do to capacity?
    • Are we oversizing?
    • Are we license constrained?
    • What happens if I add Supervisor HA?
    • What does N-2 failure tolerance mean in practice?

    Spreadsheets can answer parts of this, but they don’t show the dynamic interaction between policy, compute, and ESA. This tool tries to do that.

    Management Domain Sizing

    The calculator starts with:

    🔹 Hardware Profile

    • CPUs per host
    • Cores per CPU
    • RAM per host
    • NVMe quantity & size
    • Minimum host count

    🔹 Policy Inputs

    • CPU oversubscription
    • Memory oversubscription
    • Host reserve %
    • FTT & RAID policy
    • vSAN free space %
    • Dedup & compression
    • VM Swap Used %
    • Failure modeling

    How It Calculates Management Hosts

    1. Compute usable vCPU per host
    2. Compute usable RAM per host
    3. Apply reserve factor
    4. Compare demand from full Management VM stack
    5. Determine limiter (Compute / Memory / Storage)
    6. Calculate ESA protected storage requirement
    7. Apply failure domain logic
    8. Final host count = max(CPU, RAM, Storage, Minimum Hosts)
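
    The steps above can be sketched roughly as follows. The post does not spell out the usable-capacity formula, so this example assumes usable capacity = raw capacity × oversubscription × (1 − reserve %), and assumes the failure allowance is added as whole hosts to the CPU and RAM floors; treat both as illustrative assumptions rather than the sizer's exact logic. All function and parameter names are hypothetical.

```python
import math


def usable_per_host(raw_per_host, oversubscription, reserve_pct):
    """Steps 1-3: usable capacity per host after oversubscription and reserve (assumed form)."""
    return raw_per_host * oversubscription * (1 - reserve_pct / 100)


def management_hosts(cpu_demand_vcpu, ram_demand_gb, cores_per_host, ram_per_host_gb,
                     cpu_oversub, mem_oversub, reserve_pct,
                     storage_hosts, min_hosts, failures=0):
    """Steps 4-8: compare demand to capacity and take the largest floor.

    `storage_hosts` is expected to come from the ESA storage calculation and to
    already include its own failure allowance.
    """
    usable_vcpu = usable_per_host(cores_per_host, cpu_oversub, reserve_pct)
    usable_ram = usable_per_host(ram_per_host_gb, mem_oversub, reserve_pct)
    floors = {
        "Compute": math.ceil(cpu_demand_vcpu / usable_vcpu) + failures,
        "Memory": math.ceil(ram_demand_gb / usable_ram) + failures,
        "Storage": storage_hosts,
        "Policy": min_hosts,
    }
    limiter = max(floors, key=floors.get)
    return floors[limiter], limiter, floors


final, limiter, floors = management_hosts(
    cpu_demand_vcpu=500, ram_demand_gb=3000,
    cores_per_host=64, ram_per_host_gb=1024,
    cpu_oversub=4, mem_oversub=1, reserve_pct=10,
    storage_hosts=3, min_hosts=4, failures=1,
)
```

    Returning the individual floors alongside the final count makes it easy to see how far the other constraints sit below the limiter.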

    You immediately see:

    • Demand vs Capacity
    • Protection Factor
    • ESA storage breakdown
    • Core licenses required
    • Raw TiB consumed

    Full Management VM Stack Modeling

    The tool includes:

    • SDDC Manager
    • vCenter
    • NSX Manager
    • NSX Edge
    • AVI
    • VCF Operations
    • Log Insight
    • Network Insight
    • Identity
    • Custom VMs

    Each with T-shirt sizing.

    ESA Storage Model

    ESA math is often misunderstood. The calculator models:

    VM Capacity = (VM disks + infra disks) / DRR
    Swap = Provisioned RAM × Swap %
    Interim Total = VM Capacity + Swap
    Protected = Interim × Protection Factor
    + Free Space Reserve
    + Growth %
    Storage Hosts = ceil(total / per-host raw capacity + failures)

    Protection Factor examples:

    Policy | FTT | Protection Factor
    RAID-1 | 1 | 2.0
    RAID-1 | 2 | 3.0
    RAID-5 | 1 | 1.5
    RAID-5 | 2 | 1.75
    RAID-6 | 2 | 1.5
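
    Putting the storage model and the Protection Factor table together, a minimal sketch might look like this. The post does not say exactly how the free space reserve and growth are applied, so the example assumes both are multiplicative uplifts on the protected capacity and that failures are added as whole hosts; the function and parameter names are hypothetical.

```python
import math

# (policy, FTT) -> Protection Factor, copied from the table above
PROTECTION_FACTOR = {
    ("RAID-1", 1): 2.0,
    ("RAID-1", 2): 3.0,
    ("RAID-5", 1): 1.5,
    ("RAID-5", 2): 1.75,
    ("RAID-6", 2): 1.5,
}


def storage_hosts(vm_disk_tib, infra_disk_tib, drr, provisioned_ram_tib, swap_pct,
                  raid, ftt, free_space_pct, growth_pct, per_host_raw_tib, failures):
    """Return (host count, total raw TiB required) for one domain."""
    vm_capacity = (vm_disk_tib + infra_disk_tib) / drr
    swap = provisioned_ram_tib * swap_pct / 100
    interim = vm_capacity + swap
    protected = interim * PROTECTION_FACTOR[(raid, ftt)]
    # Assumption: reserve and growth are applied as simple percentage uplifts.
    total = protected * (1 + free_space_pct / 100) * (1 + growth_pct / 100)
    return math.ceil(total / per_host_raw_tib) + failures, total


hosts, total_tib = storage_hosts(
    vm_disk_tib=90, infra_disk_tib=10, drr=2.0,
    provisioned_ram_tib=20, swap_pct=50,
    raid="RAID-6", ftt=2, free_space_pct=25, growth_pct=20,
    per_host_raw_tib=30, failures=1,
)
```

    Here 100 TiB of disk demand shrinks to 50 TiB after DRR, grows back to roughly 135 TiB once swap, protection, reserve, and growth are layered on, and lands on five capacity hosts plus one for failure tolerance.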

    Workload Domains (Where It Gets Interesting)

    You can add multiple WLDs.

    Each WLD has:

    🔹 Tenant Demand

    • VM count
    • vCPU per VM
    • RAM per VM
    • Disk per VM
    • Growth %

    🔹 Policy + Planning

    • CPU/Mem oversub
    • FTT + RAID
    • Reserve %
    • Free space %
    • Dedup × Compression
    • VM Swap Used %
    • Failure Domain (0 / N+1 / N+2)

    Limiter Visualization + Health Model

    Each WLD shows:

    • Compute limiter
    • Memory limiter
    • Storage limiter
    • Utilization %
    • Health badge:
      • 🟢 Healthy
      • 🟡 Tight
      • 🔵 Oversized

    This gives immediate architectural intuition.
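
    One way to picture the limiter and health badge is a small classifier over the three utilization figures. The thresholds below (85% and 40%) are purely hypothetical placeholders chosen for the example; the post does not state which cut-offs the sizer uses.

```python
def classify_domain(cpu_util_pct, ram_util_pct, storage_util_pct,
                    tight_above=85.0, oversized_below=40.0):
    """Return (limiter, peak utilization %, badge) for one workload domain.

    `tight_above` and `oversized_below` are illustrative thresholds, not the
    sizer's real values.
    """
    utilization = {
        "Compute": cpu_util_pct,
        "Memory": ram_util_pct,
        "Storage": storage_util_pct,
    }
    # The limiter is the resource with the highest utilization.
    limiter = max(utilization, key=utilization.get)
    peak = utilization[limiter]
    if peak > tight_above:
        badge = "Tight"
    elif peak < oversized_below:
        badge = "Oversized"
    else:
        badge = "Healthy"
    return limiter, peak, badge


tight = classify_domain(55, 91, 60)
oversized = classify_domain(20, 30, 35)
healthy = classify_domain(50, 60, 70)
```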

    Licensing Visibility (Core-Based)

    The calculator also models:

    • Management core licenses
    • Workload core licenses
    • Total fleet cores
    • Entitlement (1 TiB per core)
    • Required add-on capacity

    What Makes This Different?

    This tool is:

    ✔ ESA-focused
    ✔ Policy-aware
    ✔ Failure-domain realistic
    ✔ Multi-domain capable
    ✔ Licensing visible
    ✔ Architecture-driven

    It’s not just math. It reflects real design conversations.

    ⚠️ Important Disclaimer

    This calculator is:

    • Independent
    • Not an official Broadcom / VMware tool
    • Not endorsed by my employer
    • Intended as a planning aid only

    Always validate against:

    • Official documentation
    • HCL
    • Field engineering guidance

    🧑‍💻 Who Is This For?

    • VCF Architects
    • Cloud Platform Leads
    • Infrastructure Engineers
    • Pre-sales Architects
    • Capacity planners
    • Anyone doing ESA-based VCF 9 designs

    🚀 Try It

    👉 Live here:

    https://sizer.vmtechie.blog

    If you test it, I’d love feedback.

    Final Thoughts

    Architecture clarity reduces risk. This tool is my contribution to making VCF 9 planning:

    More transparent.
    More realistic.
    More engineer-friendly.

  • VCF 9 – Updating the Supervisor Service

    Supervisor and VKS clusters are built using a common Kubernetes distribution core, but their Kubernetes versions are delivered differently. Starting with VCF 9, Supervisor Kubernetes releases are delivered independently of vCenter. You can update the Supervisor version by deploying a release from the Supervisor Content Library. In this blog post, we will walk through the Supervisor update process step by step. Let’s get started!

    Create and Configure a Subscribed Content Library for Supervisor Images

    For vSphere Supervisor, VMware publishes Supervisor images through a content delivery network (CDN). To enable or upgrade vSphere Supervisor, you can create a Subscribed Content Library that synchronizes with the Supervisor release images.

    You can configure the content library in either Immediate or On-Demand synchronization mode. Note that immediate synchronization from the public CDN may require more time and consume additional disk space.

    • Log in to vCenter as a vSphere administrator.
    • From the Home menu, select Content Libraries.
    • Click Create.
    • Provide a name for the library (for example, supervisor update library) and click Next.
    • On the Configure Content Library page, select Subscribed Content Library.
    • In the Download content section, select the synchronization mode of the content library and click Next.
    • When prompted, accept the SSL certificate thumbprint. The thumbprint remains stored on your system until the subscribed content library is removed from the inventory.
    • On the Apply Security Policy page, click Next.
    • On the Add storage page, select a datastore as a storage location for the content library contents and click Next.
    • Review the details and click Finish.

    Assign the content library to the vSphere Supervisor platform

    • In vCenter, from the Home menu, select Supervisor Management.
    • Select Content Distribution.
    • On the Supervisor Images Library card, click Assign.
    • Select the content library that you created above and click Assign.
    • The new content library begins synchronizing, which may take some time. After synchronization is complete, the new Supervisor Kubernetes versions included in the images appear under the Updates tab.

    Apply Updates

    • Select the Available Version you want to update to. For example: v1.30.10+vmware.1-fips-vsc9.0.0.0100. ⚠️ Updates must be applied incrementally. You cannot skip versions (e.g., upgrading directly from 1.28 to 1.30). The correct sequence is 1.28 → 1.29 → 1.30.
    • Select a Supervisor to update and click Apply Updates
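
    Because minor versions cannot be skipped, it can help to list the intermediate hops before starting. The helper below is a small planning sketch with hypothetical function names; it only parses the leading major.minor part of a release string and does not talk to vCenter.

```python
import re


def minor_version(release):
    """Extract (major, minor) from strings such as '1.28' or 'v1.30.10+vmware.1-...'."""
    match = re.match(r"v?(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"Unrecognized version string: {release!r}")
    return int(match.group(1)), int(match.group(2))


def upgrade_sequence(current, target):
    """Return every minor version to pass through, starting at the current one."""
    cur_major, cur_minor = minor_version(current)
    tgt_major, tgt_minor = minor_version(target)
    if cur_major != tgt_major or tgt_minor < cur_minor:
        raise ValueError("Target must be the same major version and not older than current")
    return [f"{cur_major}.{minor}" for minor in range(cur_minor, tgt_minor + 1)]


path = upgrade_sequence("1.28", "v1.30.10+vmware.1-fips-vsc9.0.0.0100")
```

    For the example in the list above, `path` contains 1.28, 1.29, and 1.30, which means two separate update runs.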

    The system runs a series of pre-checks to verify the compatibility of the different components against the Supervisor Kubernetes version to which you want to update.

    To learn which pre-checks run before a Supervisor update, and how to troubleshoot errors from failed pre-checks, see HERE.

    When the pre-checks are completed successfully, you can update the Supervisor.

    Upgrading the VMware vSphere Supervisor service is a critical step in maintaining a secure, stable, and feature-rich VMware Cloud Foundation environment. By following best practices—planning incremental updates, leveraging subscribed content libraries, and validating compatibility at every stage—administrators can ensure minimal downtime while keeping workloads and Kubernetes clusters up to date. Regular Supervisor upgrades not only enhance platform capabilities but also strengthen the foundation for running modern applications, containers, and cloud-native services efficiently and reliably.

  • VCF Automation – Tenant Management

    In today’s multi-tenant cloud environments, VMware Cloud Foundation Automation (VCFA) offers a robust layered architecture that seamlessly bridges enterprise-grade infrastructure management with developer-ready self-service capabilities.

    By clearly separating responsibilities—from VMware Cloud Service Providers who manage the physical and virtual infrastructure, to organization administrators who allocate resources, and finally to developers who consume them—VCFA enables efficient resource governance, operational consistency, and scalability. This structured approach not only supports multi-tenancy and workload isolation but also accelerates innovation by empowering end users to deploy applications and services quickly within well-defined boundaries.

    Why Tenant Management Matters

    Tenant management is more than just dividing resources—it’s about ensuring cost efficiency, security, scalability, and compliance in a shared infrastructure. In VCFA, these capabilities allow VMware Cloud Service Providers to maximize utilization without compromising performance or governance for individual tenants.

    Key concepts to understand from both the Provider and Tenant perspectives:

    Projects

    Projects control user access to namespaces and user ownership of provisioned resources. All organizations are created with a default project. The default project is empty and does not have any namespaces or users.

    Example: A VMware Cloud Service Provider might assign a dedicated project to each customer department for clearer billing and isolation.

    Regions

    The Regions page lists all the regions in which the organization has a quota. Organizations can have a quota in one or many regions. Your provider administrator assigns the regional quota to your organization. Quota in a region can come from one or many vSphere Zones within that region.

    Example: A global enterprise hosted by a VMware Cloud Service Provider might have quotas in Asia and Europe to ensure low-latency access for local teams.

    Namespace Class

    Namespace classes are templates for namespace provisioning. These templates can be used to standardize namespace attributes, like utilization limits, reservations, VM classes, storage classes, and content libraries. Organizations come preconfigured with three default namespace classes (small, medium, and large), which are meant to serve as example templates. The only attributes that differ among these built-in templates are the CPU and memory limits. Administrators can use these templates as-is or can modify them to suit their needs.

    Namespace

    Projects are the central construct for organizing and allocating infrastructure resources to tenants or teams. As the organization administrator, you manage and distribute infrastructure by assigning namespaces to projects. When configuring a project, you must add at least one namespace so that users within the project can begin provisioning workloads such as virtual machines, VMware Kubernetes Service (VKS) clusters, or other supported resources. Namespaces act as scoped resource pools, defining limits for CPU, memory, and storage to ensure fair allocation and performance consistency. Each namespace is tied to a Virtual Private Cloud (VPC) and a namespace class, which in turn is associated with at least one zone to determine placement and availability. This structure not only enforces resource governance but also enables automation workflows to deploy consistently within predefined boundaries. All organizations are created with a default project, which is initially empty and contains no namespaces or users, providing a baseline starting point for configuration.

    Example: A tenant of a VMware Cloud Service Provider might create separate namespaces for development and production to avoid accidental resource conflicts.

    Virtual Private Clouds (VPCs)

    A Virtual Private Cloud (VPC) in VMware Cloud Foundation Automation (VCFA) offers an isolated networking environment that can be associated with one or more namespaces. Organizations can create multiple VPCs and assign each to specific namespaces based on workload or isolation requirements.

    Each VPC is an independent network and supports three types of IP address spaces, each offering different levels of reachability:

    • Private CIDRs: These addresses are internal to the VPC and are not routable outside without NAT. They are managed by the VPC administrator and do not need to be globally unique, allowing reuse across multiple VPCs.
    • TGW Private IP Blocks: These IP blocks are scoped at the organization level and are advertised through the Transit Gateway (TGW) within the organization. Organization admins define these blocks, and project admins can allocate subnets from them for their VPCs. This enables direct communication between VPCs in the same organization using the TGW Private IP space.
    • External IP Blocks: Managed by the provider admin, these IPs enable outbound access through Source NAT. Organization admins can assign subnets from provider-defined external blocks, giving workloads external connectivity while still using internal addressing.

    You can choose to deploy a separate VPC per namespace for stricter isolation, or share a VPC across namespaces where network separation is not required.
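
    The difference in reachability between the address spaces shows up when you check for overlaps. The sketch below uses Python's standard `ipaddress` module; the VPC names and ranges are made up, and the rule that TGW subnets should not overlap across VPCs in the same organization is an inference from the routing behavior described above rather than a documented constraint.

```python
from ipaddress import ip_network
from itertools import combinations


def overlapping_pairs(cidrs):
    """Return every pair of CIDR strings in `cidrs` whose address ranges overlap."""
    networks = [ip_network(cidr) for cidr in cidrs]
    return [(str(a), str(b)) for a, b in combinations(networks, 2) if a.overlaps(b)]


# Hypothetical organization with two VPCs (all names and ranges are illustrative).
vpcs = {
    "vpc-dev": {"private": ["10.0.0.0/24"], "tgw": ["172.16.0.0/24"]},
    "vpc-prod": {"private": ["10.0.0.0/24"], "tgw": ["172.16.0.128/25"]},
}

# Private CIDRs are only compared inside a single VPC, so reuse across VPCs is fine.
private_conflicts = {name: overlapping_pairs(spaces["private"])
                     for name, spaces in vpcs.items()}

# TGW subnets are compared across the whole organization (assumed requirement).
tgw_conflicts = overlapping_pairs(
    [cidr for spaces in vpcs.values() for cidr in spaces["tgw"]]
)
```

    Both VPCs reuse 10.0.0.0/24 privately without any conflict, while the two TGW subnets are flagged because the /25 sits inside the /24.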

    Transit Gateways

    Each organization has a transit gateway which provides connectivity to the provider gateway within the organization. One or more VPCs are connected to the transit gateway, and that connection is defined by a VPC connectivity profile. Each VPC has connected workloads and a private subnet. SNAT rules translate addresses from this private subnet to a public address in the IP spaces block. This infrastructure enables the organization and its workloads to connect to external networks.

    You can view what transit gateways are available to your organization on the Manage & Govern > Networking > Transit Gateways page.

    IP Management

    Providers can use IP Spaces to manage their IP address allocation needs. IP Spaces provide a structured approach to allocating public IP addresses to different organizations, enabling connectivity to external networks.

    An IP space consists of a set of reserved CIDR blocks. These CIDRs must be dedicated to, and used by, organization administrators as they configure services. An IP space can only be IPv4.

    Organization administrators can create and manage the private IP blocks within their organization. There, tenants can view external IP address blocks assigned to the organization by a provider. You can also create and view private TGW IP address blocks for the entire organization to use. Finally, you can view private VPC IP address blocks that are applicable to specific VPCs.

    In essence, VMware Cloud Foundation Automation’s tenant management capabilities provide a structured, role-based framework for organizing projects, namespaces, VPCs, transit gateways, and IP resources. By aligning provider and tenant responsibilities, VMware Cloud Service Providers ensure secure isolation, consistent governance, and streamlined automation—empowering organizations to scale efficiently while maintaining full control over infrastructure and networking resources.

  • Navigating the Shift: From VMware Cloud Director to VCF Automation in VMware Cloud Foundation 9

    VMware Cloud Foundation 9 (VCF 9) has officially launched, introducing a next-generation Cloud Management Platform — VCF Automation (VCFA). This new platform supersedes both Aria Automation and VMware Cloud Director (VCD). This blog is specifically aimed at those familiar with VCD and looking to understand how VCFA compares — what remains familiar, what’s changed, and how to navigate the shift.

    It’s important to note that VCFA is not a simple rebranding of existing tools. It is a new solution built with purpose, though it incorporates core components from its predecessors. The provider-facing layer, known as Tenant Manager, is built on the VCD codebase, so the UI and APIs will feel familiar to seasoned VCD administrators. On the other hand, the tenant experience draws heavily from Aria Automation, introducing a modernized interface and capabilities that will appear significantly different — especially for users coming from a traditional VCD background.

    Why VCFA?

    Modern enterprises and service providers are navigating increasingly complex environments — hybrid, multi-cloud, containerized, and AI-driven workloads are the new normal. VMware has responded with VCFA: a cloud automation solution tightly integrated with VCF 9 that provides:

    • Unified multi-tenant management
    • Seamless integration across compute, storage, and networking
    • Robust self-service capabilities for both providers and tenants
    • Compliance-ready, policy-driven automation

    This is not just an incremental upgrade. VCFA is a next-generation platform, built to be extensible, resilient, and future-proof.

    How VCFA Differs from VCD and Aria Automation

    Let’s break it down into provider and tenant perspectives:

    Provider Experience – Tenant Manager

    The provider-facing component of VCFA is called Tenant Manager.

    • It leverages the codebase from VCD, meaning administrators familiar with VCD will find the UI and APIs instantly recognizable.
    • Tasks such as creating tenants, managing quotas, assigning resources, and configuring networks follow a somewhat similar structure to VCD.
    • However, Tenant Manager is fully integrated with VCF’s architecture, eliminating dependency on external orchestration layers.

    In essence, Tenant Manager modernizes VCD’s core capabilities while maintaining continuity for service providers.

    Tenant Experience – VCFA UI and APIs

    For tenants, the VCFA experience is heavily influenced by Aria Automation but redesigned for simplicity and control:

    • New self-service portal tailored for tenant-level resource provisioning
    • Integrated access to IaaS, network services, Kubernetes (via VKS), and more
    • Native support for day 2 operations, approvals, cost visibility, and policy governance
    • UI/UX reflects a cloud-native mindset, empowering developers and app teams

    If you’re a tenant used to the VCD interface, the VCFA UI may initially seem unfamiliar — but it brings greater power, flexibility, and visibility.

    Provider Management

    The VCF Automation Provider Management Portal is a dedicated interface for Provider Administrators. To access it, go to https://vcfa.example.com/provider. To log in for the first time, use the default administrator (admin) local account with the password that you set up during installation.

    You can use the Quick Start wizard in VCF Automation to quickly create an organization with predefined settings, streamlining the initial setup process. This is a convenient alternative to manually configuring each component and is especially useful for setting up a test or evaluation environment to explore the platform’s capabilities.

    NOTE – In VCF Automation 9.0, only active-standby mode is supported for NSX Tier-0 Gateways. In active-standby mode, an elected active member processes the traffic. If the active member fails, a new member becomes active.

    Alternatively, you can use the manual wizard in VCF Automation to set up each component individually—Region, Organization, IP Space, Provider Gateway, and Tenant Networking—giving you full control and customization over your environment. In this blog post, I’ll walk you through that step-by-step process to help you understand how to configure a tenant from the ground up.

    Region

    In VCFA, a region represents a logical grouping of compute, storage and networking resources, typically associated with one or more vCenter Server instances and a shared NSX instance.

    NSX Local Manager – provides software-defined networking for the region. Select the NSX Manager instance that integrates with the vCenter instances you want to use for the region.

    Note: A single NSX Manager instance must be integrated with all vCenter instances within a region.

    Supervisor(s) – A region contains one or more Supervisors, which provide the compute infrastructure for the region. The list shows all available Supervisors for the NSX Manager instance that you chose in the previous step.

    Storage Class(es) – shows all storage classes across the selected Supervisors.

    Organisations

    In VMware Cloud Foundation Automation (VCFA), Organizations are foundational constructs used to separate and manage tenants and providers in a multi-tenant private cloud environment. These organizations define the boundaries for resource allocation, identity management, policies, and service consumption.

    VCFA introduces two main types of organizations:

    Provider Consumption Organization

    A Provider Consumption Organization (PCO) is an organization that the provider can use to share blueprint catalogs and workflows with tenant organizations. To enable it, go to Administration > Feature flags and turn on the PCO Organization feature flag.

    Tenant Organization

    Each tenant/customer is onboarded into VCFA as a separate organization. Tenants get:

    • Isolated access to their own VMs, networks, storage, Kubernetes clusters, etc.
    • Self-service portal and/or API access
    • Resource limits defined by the provider
    • Option to integrate with their own identity providers (IdP) (e.g., SAML, LDAP)
    • Custom catalogs or services if published by the provider

    When onboarding a new customer in VCFA:

    • You (the provider) create a Tenant Organization.
    • Allocate region, supervisor and zones (resources – e.g., 10 GHz, 10 GB RAM).
    • Assign VM classes and storage classes
    • Configure access control (create local users)
    • Let the customer use VCFA UI or API to deploy/manage their workloads.

    VCFA Organizations are essential to enabling multi-tenancy, isolation, and governance in VCFA. They help service providers manage multiple customers securely and efficiently. Each org has its own identity, resource limits, users, services, and policies.

    IP Space

    IP spaces offer a structured approach for providers to allocate IP addresses to different organizations, enabling connectivity to external networks. You can use quotas to control usage. For internal organization communications, organizations can self-manage their own IP address blocks.

    Go to Networking > IP spaces to create a new IP Space and set quotas. IP Blocks are created in NSX. IP Blocks represent IPs used in this local datacenter, south of the Provider Gateway. IPs within this scope are used for configuring services and networks.

    External Reachability represents the IPs used outside the datacenter, north of the Provider Gateway.

    Provider Gateway

    A Provider Gateway in VCFA is the logical network boundary between the provider-managed infrastructure and external environments. It serves as the entry/exit point for all traffic coming in and going out of tenant environments.

    A provider gateway leverages VCF Networking T0s or T0 VRFs, and associates them with IP addresses from IP spaces that can be advertised from those gateways. A provider gateway can be assigned to one or more organizations.

    To add a provider gateway, first you must create an Active Standby tier-0 gateway in the NSX Manager associated with the region to back it. You can create the tier-0 gateway in the NSX Manager UI or by using the NSX Policy API.

    If you want to add a tier-0 gateway that is backed by a VRF gateway in NSX, you must also create a VRF gateway that is linked to the tier-0 gateway.

    • Enter a name and, optionally, a description for the new provider gateway.
    • From the drop-down menu, select the region of the tier-0 gateway, and click Next.
    • Select a tier-0 gateway from the list, and click Next.
    • Select one or more IP spaces to associate with the provider gateway, and click Next.
    • Review the network settings and click Create.
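
    For the tier-0 prerequisite mentioned above, the NSX Policy API route could look roughly like the sketch below. It only builds the HTTP request and does not send it. The hostname and gateway ID are placeholders, authentication and Edge cluster placement are deliberately left out, and the endpoint path and `ha_mode` field reflect my understanding of the NSX Policy API, so verify them against the API reference for your NSX version before use.

```python
import json
import urllib.request


def build_tier0_request(nsx_host, tier0_id, display_name):
    """Build (but do not send) a PATCH request for an active-standby tier-0 gateway.

    nsx_host and tier0_id are placeholders; credentials are intentionally omitted.
    """
    body = {
        "display_name": display_name,
        "ha_mode": "ACTIVE_STANDBY",  # the only mode supported by VCF Automation 9.0
    }
    url = f"https://{nsx_host}/policy/api/v1/infra/tier-0s/{tier0_id}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


request = build_tier0_request("nsx.example.com", "t0-provider", "t0-provider")
```

    Keeping request construction separate from sending makes it easy to review the payload, or to hand it to whatever HTTP client and credential handling your environment already uses.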

    Region Network Settings (Tenant Networking)

    When you configure networking for a Region in VCFA, you’re defining how tenant workloads in that region will connect—both internally and externally. This includes:

    Clicking START takes you to the Organization page. There, select the organization for which you want to configure networking and click CONFIGURE.

    • Select the Region – choose the appropriate region where this organization’s resources will be provisioned, then click Next.
    • Choose a Provider Gateway – select a provider gateway to connect the organization’s virtual network to external networks (e.g., internet or upstream services), then click Next.
    • Assign an Edge Cluster – Pick the Edge cluster where the VPC services for this organization will operate. (You may choose the same cluster associated with the Tier-0 provider gateway, or a different Edge cluster depending on your resource planning)
    • Review and Confirm – Review all configured network settings. Once validated, click Create to complete the network setup for the organization.

    This blog post provided a step-by-step walkthrough of how to manually onboard a tenant in VMware Cloud Foundation Automation (VCFA) by configuring Regions, Organizations, IP Spaces, Provider Gateways, and Tenant Networking. Compared to the Quick Start option, the manual path gives cloud providers and administrators deeper control and customization, enabling a flexible, scalable, and secure multi-tenant private cloud environment built on VCF 9.

  • From Virtualization to Cloud Service Delivery with VMware Cloud Foundation & VCSPs

    The IT landscape is undergoing a massive transformation. Traditional virtualization, which once revolutionized data centers, is now evolving into full-fledged cloud service delivery. Organizations are no longer just managing VMs; they are delivering scalable, secure, and AI-ready cloud platforms.

    The Shift from Virtualization to Cloud Services

    Virtualization has been the backbone of IT infrastructure for over a decade, enabling efficiency, consolidation, and improved resource utilization. However, as digital transformation accelerates, enterprises require more than just virtual machines. They need scalable, automated, and AI-powered cloud platforms that can meet the growing demands of modern workloads.

    This shift is being powered by VMware Cloud Foundation (VCF)—the cornerstone of modern cloud infrastructure. With VCF, enterprises and Cloud Service Providers (CSPs) can move beyond virtualization to build multi-cloud, hybrid, and sovereign cloud environments with automation, security, and AI-driven capabilities at their core.

    Key Benefits of VMware Cloud Foundation

    Unified Platform: Compute, storage, networking, and management are integrated into a single solution.
    Hybrid & Multi-Cloud Operations: Seamlessly run workloads across private, public, and hybrid cloud environments.
    Built-in Security & Compliance: Ensure data sovereignty and regulatory compliance with sovereign cloud initiatives.
    AI-Ready Infrastructure: GPU acceleration and private AI capabilities empower AI/ML workloads.
    Accelerated Cloud Service Delivery: Enable Cloud Providers & VMware Cloud Service Providers (VCSPs) to deliver next-gen cloud offerings.

    The Significance of VMware Cloud Service Providers (VCSPs)

    VMware Cloud Service Providers (VCSPs) play a pivotal role in enabling organizations to seamlessly transition from virtualization to cloud services. They extend the capabilities of VMware Cloud Foundation by offering:

    🔹 Managed Cloud Services: Helping enterprises offload infrastructure management with fully managed VMware-based cloud environments.
    🔹 Sovereign and In-Country Cloud Solutions: Ensuring compliance with regional data sovereignty laws while delivering cloud scalability.
    🔹 Multi-Tenant Cloud Platforms: Empowering service providers to offer flexible, cost-effective cloud solutions with secure tenant isolation.
    🔹 AI and GPU-Powered Cloud Services: Providing enterprises with AI-ready infrastructure to support next-gen workloads.
    🔹 Disaster Recovery & Business Continuity: Offering reliable DRaaS (Disaster Recovery as a Service) to ensure business resilience.

    Future of Cloud with VMware Cloud Foundation

    As enterprises and service providers embrace cloud-first and AI-driven strategies, VCF is enabling them to deliver next-generation cloud services with agility, resilience, and efficiency. This evolution is not just about technology; it’s about unlocking new business opportunities, enhancing innovation, and driving digital transformation at scale.

    With cloud-native applications, AI/ML workloads, and security-first cloud strategies becoming the new normal, the role of VMware Cloud Foundation is more critical than ever.

    VMware Cloud Foundation is transforming the way cloud services are delivered, from the traditional virtualization model to highly flexible, customer-tailored cloud services. With the support of VCSPs, businesses are empowered to adopt cutting-edge cloud solutions faster and more efficiently than ever before.

  • Enhancing Firewall Flexibility in VMware Cloud Director 10.6.1

    With VMware Cloud Director 10.6.1, service providers gain greater flexibility and control over firewall configurations, ensuring compliance with licensing entitlements while delivering scalable, high-value security services. This update aligns with VMware Cloud Foundation (VCF) networking licensing, enabling providers to selectively offer the VMware Advanced Networking & Security (ANS) Add-On to customers based on their needs and cost agreements.

    Impact of VMware NSX Licensing Changes

    Recent changes to VMware’s NSX licensing model have significantly altered how firewall features are provisioned. Under the new structure:

    • Stateless Firewall is included in VMware Cloud Foundation (VCF).
    • Stateful Firewall now requires an additional, separate license, documented here.

    This change impacts how service providers manage network security within VMware Cloud Director environments. To address these shifts, Cloud Director 10.6.1 introduces new controls that give providers flexibility in defining which firewall type—stateless or stateful—is available to their tenants. This ensures security policies align with business needs while optimizing costs associated with VMware licensing.

    VMware Cloud Director with NSX supports both stateful and stateless firewalls, each serving different security needs.

    What is a Stateless Firewall?

    A stateless firewall inspects traffic on a per-packet basis without maintaining the state of active connections. Unlike stateful firewalls, which track the context of traffic flow, stateless firewalls apply predefined rules to each packet independently.

    💡 Key Benefits:
    ✔ Faster packet processing for high-performance workloads.
    ✔ Ideal for perimeter protection and edge security use cases.
    ✔ Lower resource consumption compared to stateful firewalls.

    Stateful vs. Stateless Firewalls in Cloud Director

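    | Feature             | Stateful Firewall                     | Stateless Firewall                   |
    |---------------------|---------------------------------------|--------------------------------------|
    | Connection Tracking | ✅ Maintains connection state          | ❌ No connection awareness            |
    | Security Context    | ✅ Applies rules based on traffic flow | ❌ Evaluates each packet independently |
    | Performance         | Higher resource usage                 | Lightweight, optimized for speed     |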
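
    To make the comparison concrete, here is a minimal conceptual sketch in Python. It is not NSX or Cloud Director code: the `Packet` type, the single allow rule, and the addresses are all hypothetical, chosen only to show why a stateless filter needs an explicit rule for return traffic while a stateful one does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    src_port: int
    dst: str
    dst_port: int


def rule_allows(pkt: Packet) -> bool:
    # Hypothetical rule: internal 10.x clients may reach port 443.
    return pkt.src.startswith("10.") and pkt.dst_port == 443


def stateless_allow(pkt: Packet) -> bool:
    # Each packet is judged on its own headers; nothing is remembered.
    return rule_allows(pkt)


class StatefulFirewall:
    def __init__(self):
        # Connections that an allow rule has already admitted.
        self.connections = set()

    def allow(self, pkt: Packet) -> bool:
        # Reply traffic matches the reverse of a tracked connection.
        reverse = (pkt.dst, pkt.dst_port, pkt.src, pkt.src_port)
        if reverse in self.connections:
            return True
        if rule_allows(pkt):
            self.connections.add((pkt.src, pkt.src_port, pkt.dst, pkt.dst_port))
            return True
        return False


request = Packet(src="10.0.0.5", src_port=50000, dst="203.0.113.10", dst_port=443)
reply = Packet(src="203.0.113.10", src_port=443, dst="10.0.0.5", dst_port=50000)

stateful = StatefulFirewall()
print(stateless_allow(request), stateless_allow(reply))
print(stateful.allow(request), stateful.allow(reply))
```

    In this sketch the stateless filter admits the request but drops the reply, because no rule matches the reply's headers. The stateful filter admits both, since the reply belongs to a connection it already tracked. With a stateless firewall, return traffic therefore needs its own rule.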

    Configuring in Cloud Director

    This feature is designed to help cloud service providers who wish to control which tenants can access Stateless/Stateful Firewall services. The goal is to enforce better governance over the consumption of advanced network services, such as Stateful Firewall and Distributed Firewall.

    The license selection is made at the Edge Cluster level in VCD. The service provider determines which type of firewall can be applied to a specific Edge Cluster. Consequently, all Provider/Organization and vApp Edge Gateways utilizing that cluster will have firewall rules configured as either stateful or stateless, depending on the selection.

    This selection makes corresponding changes in NSX, while the firewall rule configuration remains the same in VCD. Below is the VMware Cloud Director (VCD) view of the Org VDC Edge Gateway firewall configuration deployed on an Edge Cluster designated with the stateless firewall option inside NSX Manager.

    NOTE: Changing an Edge Cluster from Stateful to Stateless or vice versa will not impact existing deployed Gateways.
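
    The inheritance behavior described above can be pictured with a small Python model. This is a conceptual sketch, not the Cloud Director API: the class names, the `deploy_gateway` method, and the gateway names are hypothetical. It assumes that "will not impact existing deployed Gateways" means a gateway keeps the mode its cluster had at deployment time.

```python
from dataclasses import dataclass, field
from enum import Enum


class FirewallMode(Enum):
    STATEFUL = "stateful"
    STATELESS = "stateless"


@dataclass
class Gateway:
    name: str
    firewall_mode: FirewallMode  # assumed fixed at deployment time


@dataclass
class EdgeCluster:
    name: str
    firewall_mode: FirewallMode
    gateways: list = field(default_factory=list)

    def deploy_gateway(self, name: str) -> Gateway:
        # A new gateway inherits the cluster's current firewall mode.
        gateway = Gateway(name, self.firewall_mode)
        self.gateways.append(gateway)
        return gateway


cluster = EdgeCluster("edge-cluster-01", FirewallMode.STATELESS)
first = cluster.deploy_gateway("org-a-gw")

# The provider later switches the cluster's designation.
cluster.firewall_mode = FirewallMode.STATEFUL
second = cluster.deploy_gateway("org-b-gw")

print(first.firewall_mode.value, second.firewall_mode.value)
```

    Under that assumption, the first gateway stays stateless and only the gateway deployed after the switch is stateful.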

    Gateway Firewall Enforcement Control in VCD

    One key use case is when a service provider or tenant is using an appliance-based third-party firewall instead of the NSX-integrated firewall in Cloud Director. In such cases, they may not require NSX-based firewall enforcement and prefer to manage security through their own solution. This feature allows them to disable the NSX firewall, ensuring flexibility in security architecture without unnecessary conflicts.

    With this release, both service providers and tenants can disable or enable the firewall at the Provider or Org Gateway level without removing existing firewall rules. A new “Active” switch has been introduced in the Firewall UI (top right corner), allowing users to toggle firewall enforcement as needed while preserving the configured rules.
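
    The key property of the “Active” switch is that enforcement and rule storage are independent. A rough Python sketch of that idea follows. It is an illustration only: the `GatewayFirewall` class and its port-based allow list are hypothetical, and it assumes that a deactivated gateway firewall passes traffic unfiltered and that an active one denies anything not explicitly allowed.

```python
from dataclasses import dataclass, field


@dataclass
class GatewayFirewall:
    # Hypothetical rule set: destination ports that are allowed.
    allowed_ports: list = field(default_factory=list)
    active: bool = True

    def permits(self, dst_port: int) -> bool:
        if not self.active:
            # Assumption: with enforcement off, nothing is filtered here.
            return True
        # Assumption: default deny while enforcement is on.
        return dst_port in self.allowed_ports


firewall = GatewayFirewall(allowed_ports=[443])
before = firewall.permits(22)

firewall.active = False  # flip the switch; the rules stay in place
during = firewall.permits(22)

firewall.active = True   # re-enable; the same rules apply again
after = firewall.permits(22)

print(before, during, after, firewall.allowed_ports)
```

    Toggling `active` never touches `allowed_ports`, so a provider using a third-party appliance could switch enforcement off and later restore the original policy without re-entering rules.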

    Conclusion

    The new firewall flexibility in Cloud Director 10.6.1 ensures that service providers can:

    Optimize licensing costs by choosing stateless or stateful firewall options.
    Align security offerings with customer needs.
    Enhance governance and compliance around advanced network security services.
    Seamlessly integrate third-party firewall solutions into their cloud environments.

    By leveraging these new capabilities, Cloud Director providers can deliver scalable, efficient, and cost-effective security solutions while adapting to the evolving VMware NSX licensing model.

    The Cloud Director 10.6.1 release notes are published here.

  • Why Customers Should Choose VMware Cloud Service Providers When Transitioning from Public to Private Cloud

    As businesses’ cloud strategies evolve, many are reconsidering their reliance on public cloud environments and exploring the benefits of private cloud solutions. Public clouds like AWS, Azure, and Google Cloud offer flexibility and scalability, but they also come with challenges such as unpredictable costs, security concerns, and limited control. This is where VMware Cloud Service Providers (CSPs), powered by VMware Cloud Foundation (VCF), present a compelling alternative for businesses looking to transition from public to private cloud. Here’s why customers should choose a VMware CSP when making this move:

    1. Predictable Costs and Better Financial Control

    Public Cloud Challenge:
    The pay-as-you-go model of public clouds is attractive at first but often leads to unpredictable and escalating costs. Usage spikes, data transfer fees, and networking costs can cause budget overruns, making it difficult for businesses to manage long-term financial planning.

    VMware CSP Advantage:
    With VMware Cloud Foundation hosted by a VMware CSP, costs become more predictable and fixed. Unlike public clouds, where charges can fluctuate based on consumption, VMware CSPs offer stable pricing tailored to the customer’s dedicated infrastructure needs. This leads to greater financial control and ensures that businesses can plan their budgets with confidence, avoiding unexpected bills and cost surges.


    2. Enhanced Security and Compliance

    Public Cloud Challenge:
    While public cloud providers maintain infrastructure security, customers are responsible for securing their data. This shared responsibility model introduces potential security gaps, especially in multi-tenant environments where data is more exposed. For industries with strict regulatory requirements, such as healthcare and finance, managing compliance in a public cloud can be challenging.

    VMware CSP Advantage:
    VMware Cloud Service Providers offer private, dedicated infrastructure, giving businesses full control over their security protocols. VMware Cloud Foundation includes built-in features like NSX micro-segmentation, end-to-end encryption, and automated compliance controls to ensure robust security. This infrastructure meets the stringent security needs of industries like government and financial services, making it easier for organizations to comply with regulations such as GDPR, HIPAA, and PCI-DSS.

    By choosing a VMware CSP, businesses can deploy their own security policies and governance measures, ensuring full compliance without the risks associated with public cloud environments.


    3. Consistent Performance and Infrastructure Customization

    Public Cloud Challenge:
    Public clouds are designed to serve a broad range of customers, leading to performance variability. The shared, multi-tenant nature of public clouds can cause resource contention, which negatively impacts performance for businesses with mission-critical workloads. Additionally, public cloud platforms offer limited options for customizing infrastructure to optimize specific workloads.

    VMware CSP Advantage:
    With a VMware CSP, businesses gain access to dedicated infrastructure that provides consistent, reliable performance. VMware Cloud Foundation allows companies to customize their private cloud environments, tuning resources to meet the exact demands of high-performance workloads like AI/ML, enterprise applications, or data-intensive tasks. This ensures optimal performance and avoids the unpredictable resource contention seen in public cloud environments.


    4. Full Control Over Data and Infrastructure

    Public Cloud Challenge:
    In a public cloud setup, businesses often lose a degree of control over their data and infrastructure, as public cloud providers manage the underlying systems. This can lead to vendor lock-in, where organizations are restricted to the cloud provider’s tools and architecture, making it difficult to adapt or migrate workloads.

    VMware CSP Advantage:
    VMware Cloud Foundation offers businesses full control over their infrastructure, ensuring flexibility and freedom. With a VMware CSP, organizations are not bound by the limitations of a public cloud vendor’s ecosystem. Instead, they can manage and operate their private cloud environment according to their own policies and tools, retaining ownership of their data and ensuring it is managed and stored in compliance with their internal standards.

    Moreover, VMware CSPs provide a vendor-neutral platform, reducing the risk of cloud lock-in and enabling smoother transitions to other cloud models if needed.


    5. Simplified Compliance and Data Residency

    Public Cloud Challenge:
    Many businesses must comply with strict regulations around data residency and sovereignty, requiring data to be stored and processed within specific regions. While public clouds offer region-based services, maintaining compliance can be complex due to the global nature of their infrastructure and multi-tenant environments.

    VMware CSP Advantage:
    VMware Cloud Service Providers offer private cloud environments where data residency is easily enforced, ensuring that sensitive information remains within required geographic boundaries. Organizations can select specific data center locations that comply with local laws and regulatory requirements, providing greater control over data governance. This is crucial for industries like finance, healthcare, and government, where compliance and data sovereignty are paramount.


    6. Hybrid and Multi-Cloud Flexibility

    Public Cloud Challenge:
    Public clouds are optimized for running workloads within their own ecosystem, making hybrid or multi-cloud strategies more complex. This often results in vendor lock-in, where businesses are limited to the services and infrastructure of a single cloud provider.

    VMware CSP Advantage:
    VMware Cloud Foundation is designed for hybrid and multi-cloud environments, offering businesses the flexibility to run workloads across private clouds, on-premises infrastructure, and public clouds (via VMware Cloud on AWS, Azure VMware Solution, or Google Cloud VMware Engine). This allows businesses to choose the best environment for each workload while maintaining a consistent management experience across clouds. VMware CSPs provide the best of both worlds, enabling seamless hybrid cloud operations without sacrificing control or flexibility.


    7. Long-Term Cost Efficiency and Lower Total Cost of Ownership (TCO)

    Public Cloud Challenge:
    Public clouds are ideal for elastic workloads but can become costly for steady-state or predictable workloads. Over time, businesses may find that public cloud environments become less efficient, with resources underutilized or costs outpacing usage.

    VMware CSP Advantage:
    Private clouds hosted by VMware CSPs offer a more cost-efficient solution for businesses with predictable workloads. By shifting to a fixed-cost private cloud model, organizations avoid the long-term costs of over-provisioning in the public cloud. VMware Cloud Foundation optimizes resource utilization, ensuring infrastructure is used efficiently, leading to a lower total cost of ownership (TCO) over time.

    For enterprises looking to stabilize their operational costs while maintaining cloud-level flexibility, VMware CSPs provide a long-term financial advantage compared to public cloud platforms.
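
    The steady-state argument comes down to simple break-even arithmetic, sketched below in Python. The hourly rate and the fixed monthly fee are invented for illustration and do not reflect any provider's pricing. Real bills also include items such as data transfer, which this sketch ignores.

```python
HOURS_PER_MONTH = 730  # common approximation: 8760 hours / 12 months


def pay_as_you_go_cost(hours_used, rate_per_hour):
    """Monthly bill when every running hour is metered."""
    return hours_used * rate_per_hour


def break_even_hours(fixed_monthly_fee, rate_per_hour):
    """Usage level at which metered and fixed pricing cost the same."""
    return fixed_monthly_fee / rate_per_hour


# Hypothetical prices, chosen only to make the arithmetic easy to follow.
RATE_PER_HOUR = 6.0
FIXED_MONTHLY_FEE = 3000.0

threshold = break_even_hours(FIXED_MONTHLY_FEE, RATE_PER_HOUR)
steady_state = pay_as_you_go_cost(HOURS_PER_MONTH, RATE_PER_HOUR)
bursty = pay_as_you_go_cost(200, RATE_PER_HOUR)

print(f"break-even at {threshold:.0f} hours per month")
print(f"always-on metered cost: {steady_state:.0f} vs fixed {FIXED_MONTHLY_FEE:.0f}")
print(f"200-hour metered cost: {bursty:.0f} vs fixed {FIXED_MONTHLY_FEE:.0f}")
```

    With these made-up numbers the break-even point is 500 hours per month. A workload that runs all 730 hours costs more on the metered model, while one that runs only 200 hours costs less, which matches the point that elastic workloads suit pay-as-you-go and steady ones suit fixed pricing.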


    Conclusion

    Transitioning from public cloud to a private cloud environment hosted by a VMware Cloud Service Provider offers businesses a powerful combination of predictable costs, enhanced security, control, and customized infrastructure. VMware CSPs allow organizations to regain control over their data and operations, ensure compliance with stringent regulatory requirements, and optimize performance for mission-critical applications.

    For enterprises seeking a strategic balance between cloud agility and operational control, VMware Cloud Service Providers are the ideal partners to support a seamless and effective move from public cloud to private cloud.