VCAP-CID Study Notes: Objective 3.3

This is Objective 3.3 in the VCAP-CID blueprint Guide 2.8. The rest of the sections/objectives can be found here.

Bold items have higher importance, and copied text is in italics.

Knowledge

  • Identify allocation model characteristics
  • Explain the impact of providing a dedicated cluster to a Provider VDC.
    • It’s better to use a dedicated cluster because then you don’t have to carve up the same DRS cluster into smaller resource pools, each with a different allocation model and a different way of guaranteeing resources to its workloads.
    • This is explained in detail in this post from Frank Denneman:

Skills and Abilities

  • Determine the impact of a chosen allocation model on cluster scalability.
    • Reservation Pool
      • When using Reservation Pools you can only scale to 32 ESXi hosts per cluster, since they can only use the resources available in a single provider vDC.
    • Allocation Pool
      • Here there are two options: either the organization vDC is elastic or it’s not.
        • An elastic organization vDC allows Allocation Pool workloads to scale across multiple clusters within the same provider vDC.
        • A non-elastic organization vDC only allows Allocation Pool workloads to consume resources from a single cluster.
    • Pay-as-you-Go
      • These are elastic, as they only work with per-VM reservations.
  • Given application performance requirements, determine an appropriate CPU and memory over-commitment configuration.
    • CPU Overcommitment
      • CPU overcommitment ratios are mostly expressed as vCPU:pCPU ratios. 6:1 vCPU to pCPU is considered a maximum in most cases, but this is just a recommendation from VMware. CPUs vary greatly in performance, so keep that in mind when calculating these ratios.
      • The number of vCPUs that can run on the available physical cores determines the number of VMs an environment can run. This ratio is determined in the design phase as a technical requirement, with a design decision similar to this one: “The risk associated with an increase in CPU overcommitment is mainly degraded overall performance that can result in higher than acceptable vCPU ready times.”
      • This ratio can be used as part of a performance service agreement to make sure certain tier-1 workloads will not have potential CPU contention.
    • Memory Overcommitment
      • Memory overcommitment means configuring the VMs running on a host with more memory in total than the host has to offer. This is common in VMware vSphere environments, as most workloads do not use all their memory at the same time. vSphere ESXi has various features to help with overcommitting memory:
      • TPS – Transparent page sharing
      • Memory ballooning
      • Memory Compression
      • Swapping to disk
    • Many of the allocation models have memory reservations built in, but unlike reserved CPU, reserved memory cannot be used by other VMs.
      • Reservation Pool creates a static memory pool where the workloads compete for the resources, but individual workloads can have their own reservations (and limits and shares).
      • Allocation Pool creates a resource pool with a % of memory reserved for running VMs. These VMs then compete for that pool of reserved resources, as well as for the remaining resources configured for the pool.
      • PAYG creates a per-VM reservation (and a resource pool as well), and the VMs can use the rest of the resources available in the cluster.
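As a rough illustration of the CPU and memory ratios discussed above, here is a minimal Python sketch. The cluster size (8 hosts with 16 cores each) and the VM memory sizes are made-up numbers, and the function names are hypothetical; they are not part of any VMware tool.

```python
import math


def max_vcpus(physical_cores: int, vcpu_per_pcpu: float) -> int:
    """Number of vCPUs that fit on the cores at a given vCPU:pCPU ratio."""
    return math.floor(physical_cores * vcpu_per_pcpu)


def cpu_overcommit_ratio(total_vcpus: int, physical_cores: int) -> float:
    """Actual vCPU:pCPU ratio for the vCPUs currently configured."""
    return total_vcpus / physical_cores


def memory_overcommit_ratio(vm_memory_gb: list, host_memory_gb: float) -> float:
    """Configured VM memory divided by host memory (above 1 means overcommitted)."""
    return sum(vm_memory_gb) / host_memory_gb


# Assumed example cluster: 8 hosts x 16 physical cores.
cores = 8 * 16
print(max_vcpus(cores, 6.0))                          # 768 vCPUs at the 6:1 maximum
print(cpu_overcommit_ratio(400, cores))               # 3.125
print(memory_overcommit_ratio([8, 16, 32, 16], 64))   # 1.125
```

A ratio like this only tells you how much is configured, not how the CPUs actually perform, so it is a starting point for the design decision rather than a guarantee.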
  • Given service level requirements, determine appropriate failover capacity.
    • Failover capacity is most likely based on disaster recovery scenarios.
    • SRM is not vCloud aware so storage replication is the only way to “protect” consumer workloads.
    • The steps required could be automated using the storage system’s API in addition to the automation features in vSphere (PowerCLI, Orchestrator).
    • Operationally, recovery point objectives support must be determined for consumer workloads and included in any consumer Service Level Agreements (SLAs). Along with the distance between the protected and recovery sites, this helps determine the type of storage replication to use for consumer workloads: synchronous or asynchronous.
    • For more information about vCloud management cluster disaster recovery, see http://www.vmware.com/files/pdf/techpaper/vcloud-director-infrastructure-resiliency.pdf.
    • Appendix C in the vCAT Architecting a VMware vCloud document is something that is worth reading.
    • To determine the appropriate failover capacity you first need to determine the SLA for the DR service and which service offering it maps to. A Gold provider vDC cluster might have DR as a feature.
    • That gives you the capacity required on the recovery site, as it is based on the resources used by the organizations using the Provider vDC performance tier that has DR features.
  • Given a Provider VDC design, determine the appropriate vCloud compute requirements.
    • A provider vDC design will include allocation models, along with availability SLAs and other related requirements (DR, DRS, HA).
    • This information is used to determine the appropriate host logical design, with the number of CPUs and cores, memory, HBAs, NICs, local storage and additional information if needed (boot from USB etc.).
    • A great example of the process is on pages 32-34 in the Private VMware vCloud Implementation Example document
  • Given workload requirements, determine appropriate vCloud compute requirements for a vApp
    • Compute requirements for a vApp involve deciding how many vCPUs will be used, how much memory is allocated and which allocation model would fit the application in question. A Tier-1 application should run on a Reservation Pool, Tier-2-3 on Allocation Pool and Dev/Test/Transient workloads on Pay-as-you-Go.
  • Given capacity, workload and failover requirements, create a private vCloud compute design.
    • Let’s use an example:
    • The Private vCloud compute design includes both Management and Resource clusters, but most Management clusters are similar for both Private and Public Clouds, so we will just design the Resource cluster (since the workloads will reside there).
    • Capacity requirements are 700 VMs and 500 vApps.
    • Workloads include Development, Pre-production, Demonstration, Training and Tier-2-3 IT Infrastructure applications and Tier-1 Database applications.
    • Failover requirements are to support failover of essential Tier-2-3 workloads and all Tier-1 workloads.
    • Create two tiers of Compute clusters, Gold and Silver.
      • Gold will include 8 hosts with an N+2 HA configuration at 26% resource reservation. DRS is set to Fully automated with the Moderate migration threshold.
      • Silver will include 8 hosts with an N+1 HA configuration at 13% resource reservation. DRS is set to Fully automated with the Moderate migration threshold.
    • Storage will include 3 separate tiers: Tier-1, Tier-2 and Tier-3.
      • Tier-1 is based on 10K SAS disks and SSD disks, with an Easy-tiering solution to move hot blocks to SSD automatically. Workloads are replicated to a second site for disaster recovery.
      • Tier-2 is based on 10K SAS disks.
      • Tier-3 is based on 7.2K NL-SAS disks and 10K disks, with hot blocks moved to the 10K disks automatically.
    • Allocation models used are:
      • Gold Compute cluster: Reservation Pool
        • Runs Tier-1-2-3 workloads.
      • Silver Compute cluster: Allocation Pool
        • Runs the rest of the workloads, with varying % reservations between workloads.
    • This is just an example of how you would create a logical compute layout. I added both storage and consumer constructs because I think you really need to have all the required information to get the idea.
    • This information was gathered from the Private VMware vCloud Implementation Example document. Even though it is based on vCloud 1.5, it’s a great document to use as a reference for future designs.
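The HA reservation percentages in the example can be reproduced with a small Python sketch. It assumes (my assumption, the source document does not spell it out) that the share of one host is rounded up to a whole percent and then multiplied by the number of host failures to tolerate. The function names are hypothetical.

```python
import math


def ha_reserved_percent(hosts: int, host_failures: int) -> int:
    """Percentage of cluster resources to reserve for N+host_failures."""
    per_host_percent = math.ceil(100 / hosts)  # 8 hosts -> 12.5 -> 13
    return per_host_percent * host_failures


def usable_capacity(total: float, reserved_percent: int) -> float:
    """Capacity left for workloads after the HA reservation."""
    return total * (100 - reserved_percent) / 100


print(ha_reserved_percent(8, 1))   # 13 (Silver, N+1)
print(ha_reserved_percent(8, 2))   # 26 (Gold, N+2)
print(usable_capacity(100, 26))    # 74.0
```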
  • Given capacity, workload and failover requirements, create a public vCloud compute design.
    • The same applies to Public vCloud compute designs, but instead of Reservation or Allocation Pool you use Pay-as-you-Go (if the requirements are to charge each VM separately etc.)
    • Capacity requirements in Public vCloud instances are really guesswork. You design around predicted capacity based on the number of VMs, their setup and the size of their storage, as seen in this picture from the Public VMware vCloud Implementation Example document.

VCAP CID 3-3-1

    • Most recommended public vCloud use cases are Test/Dev and other transient workloads, but this really depends on the requirements of the hosting company. It can be a mix of all the allocation models.
    • Failover requirements of Public vClouds are most likely part of a more expensive offering for higher tier workloads, but you can of course fail over any kind of workload, even the “not-so-important” ones. All workloads are someone’s production workloads 🙂
    • For creating a design for a public vCloud I recommend reading the Public VMware vCloud Implementation Example document as a great reference.

VCAP-CID Study Notes: Objective 3.2

This is Objective 3.2 in the VCAP-CID blueprint Guide 2.8. The rest of the sections/objectives can be found here.

Bold items have higher importance, and copied text is in italics.
Knowledge

  • Identify the storage placement algorithm for vApp creation.
    • As scary as this sounds this is documented very well in the vCAT Architecting a VMware vCloud document on pages 46-48.
    • This is the link to the online version.
  • Explain considerations related to vApp VM virtual disk placement.
    • The same pages.
  • Explain the impact of datastore threshold settings on shadow VM creation.
    • If a vApp is a linked clone, vCloud Director checks whether the datastore holding the Shadow VM is reaching its yellow or red threshold. If so, a new Shadow VM is created on a different datastore.
  • Explain the impact of multiple vCenter Servers on shadow VM creation.
    • I explained how Fast Provisioning works in objective 2.5
  • Explain the impact of VMFS and NFS on cluster size.
    • In vSphere 5.1 (which this blueprint is based on) the maximum size of a Datastore is 64TB. NFS shares can be as large as the manufacturer of the NAS box allows.
    • Resource clusters can have 32 ESXi hosts. Each host can have 256 Volumes, but only 64 hosts can access the same volume.
    • So at 64 ESXi hosts you are limited to showing them the same 256 Volumes.
    • Each Datastore Cluster can only hold 32 Datastores, so that’s 8 full Datastore clusters.
    • For NFS you can mount 256 mountpoints per host.
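The arithmetic behind those limits is simple enough to write down. This is only a sketch using the numbers quoted above for vSphere 5.1; the constant and function names are my own, and the capacity figure is a theoretical upper bound, not a sizing recommendation.

```python
# Limits as quoted in the notes above (vSphere 5.1).
MAX_VOLUMES_PER_HOST = 256
MAX_DATASTORES_PER_DATASTORE_CLUSTER = 32
MAX_VMFS_DATASTORE_TB = 64


def full_datastore_clusters(volumes: int = MAX_VOLUMES_PER_HOST,
                            per_cluster: int = MAX_DATASTORES_PER_DATASTORE_CLUSTER) -> int:
    """How many completely filled Datastore Clusters the volumes can form."""
    return volumes // per_cluster


def max_vmfs_capacity_tb(volumes: int = MAX_VOLUMES_PER_HOST,
                         size_tb: int = MAX_VMFS_DATASTORE_TB) -> int:
    """Theoretical VMFS capacity visible to one host."""
    return volumes * size_tb


print(full_datastore_clusters())   # 8
print(max_vmfs_capacity_tb())      # 16384
```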

Skills and Abilities

  • Given service level requirements, create a tiered vCloud storage design.
    • Most likely the SLAs around tiered storage are based on performance guarantees.
    • If the use cases and requirements call for faster storage for some high-IO VMs and slower storage for low-IO VMs, tiered storage helps.
    • Most likely a 3-tier storage solution is used, e.g. Gold, Silver and Bronze.
      • Gold would perhaps use 15K disks and SSD disks for caching, or 10K disks and SSD disks for automatic caching. RAID levels depend on the storage array manufacturer and write load.
      • Silver could be a pool of 10K disks. RAID levels depend on the storage array manufacturer and write load.
      • Bronze could be a pool of 7.2K NL-SAS disks used mainly for archiving purposes, idle workloads and perhaps test/dev (just remember that test/dev environments are production environments for developers :))
  • Given the I/O demands of various cloud workloads, determine an appropriate vCloud storage configuration.
    • This is also based on storage tiers, or how they are configured.
    • RAID levels, in legacy storage arrays, play a role in calculating IO performance for disk arrays.
      • R1 or R10: Mirrored or mirrored spanned drives. Write penalty is 2, so for each write issued from the servers, two need to be performed on the array. That’s because it’s a mirror.
      • R5: Uses parity to allow failure of one drive. Write penalty is 4. That’s because each change on the disk requires reading both the data and the parity, and then writing the data and the parity.
      • R6: Uses two sets of parity. Write penalty is 6. Three reads, three writes.
      • RAID-DP: Two sets of parity like RAID 6, but the write penalty is only 2 because of how WAFL writes data to disks.
    • How to calculate IOPS
      • First things first, a list of IOPS per disk speed:
        • 15K Disks = 175-210 IOPS
        • 10K Disks = 125-150 IOPS
        • 7.2K Disks = 75-100 IOPS
        • 5.4K Disks = 50 IOPS
      • Also it depends if it’s a SATA or SAS drive, and if it’s an enterprise grade disk or not (higher MTBF).
      • And then Duncan Epping will explain IOPS calculations for us in this post: http://www.yellow-bricks.com/2009/12/23/iops/
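To make the write penalties above concrete, here is a minimal Python sketch using the commonly used formula: backend IOPS = reads + (writes × RAID write penalty). The workload numbers (1000 IOPS, 70% reads) are made up for illustration, the per-disk figure of 150 IOPS is taken from the upper end of the 10K range listed above, and the function names are hypothetical.

```python
import math

# Write penalties as listed in the notes above.
RAID_WRITE_PENALTY = {"RAID1": 2, "RAID10": 2, "RAID5": 4, "RAID6": 6, "RAID-DP": 2}


def backend_iops(frontend_iops: int, read_percent: int, raid_level: str) -> float:
    """IOPS the disks must deliver for a given front-end workload."""
    penalty = RAID_WRITE_PENALTY[raid_level]
    write_percent = 100 - read_percent
    return frontend_iops * (read_percent + write_percent * penalty) / 100


def disks_needed(frontend_iops: int, read_percent: int,
                 raid_level: str, iops_per_disk: int) -> int:
    """Minimum number of disks, by performance only (capacity is ignored)."""
    return math.ceil(backend_iops(frontend_iops, read_percent, raid_level) / iops_per_disk)


print(backend_iops(1000, 70, "RAID5"))        # 1900.0
print(disks_needed(1000, 70, "RAID5", 150))   # 13
print(disks_needed(1000, 70, "RAID10", 150))  # 9
```

The same workload needs noticeably fewer disks on RAID 10 than on RAID 5, which is why write load matters when choosing a RAID level for a tier.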
  • Given service level, workload and performance requirements, create a private vCloud storage design.
    • Private clouds most likely have workloads that are known, or even planned in advance, so there is more information about the storage requirements of the workloads.
    • This information can be used with other information (gathered in a Current State Analysis) to create a storage design that fulfills all requirements.
    • Data that would be nice to have for a private cloud:
      • IO profile of the workload
        • Can be used to calculate IO requirements for the environment.
      • Hard disk size and number, and growth
        • To calculate initial capacity and projected growth
      • RTO if a LUN is lost
        • How long it will take to restore all the VM’s on the datastore, and that depends on the backup solution used
    • As an example you can then create storage tiers that have different sizes to differentiate on RTO, but the same RAID setup.
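A small Python sketch of how the capacity, growth and RTO data could be used. It assumes compound yearly growth and a constant restore throughput from the backup solution, which are simplifications of mine; all names and numbers are hypothetical.

```python
def projected_capacity_gb(initial_gb: float, annual_growth_percent: float, years: int) -> float:
    """Capacity after a number of years of compound growth."""
    return initial_gb * (1 + annual_growth_percent / 100) ** years


def restore_time_hours(datastore_gb: float, restore_gb_per_hour: float) -> float:
    """Time to restore a full datastore at a constant restore rate."""
    return datastore_gb / restore_gb_per_hour


def max_datastore_size_gb(rto_hours: float, restore_gb_per_hour: float) -> float:
    """Largest datastore that can still be restored within the RTO."""
    return rto_hours * restore_gb_per_hour


print(projected_capacity_gb(1000, 50, 2))   # 2250.0
print(restore_time_hours(2000, 500))        # 4.0
print(max_datastore_size_gb(4, 500))        # 2000
```

This is the reasoning behind differentiating tiers by datastore size: with the same backup solution, a smaller datastore gives a shorter RTO.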
  • Given service level, workload and performance requirements, create a public vCloud storage design.

 

VCAP-CID Study Notes: Objective 3.1

This is Objective 3.1 in the VCAP-CID blueprint Guide 2.8. The rest of the sections/objectives can be found here.

Bold items have higher importance, and copied text is in italics.

Knowledge

  • Identify network isolation technologies available for a vCloud design.
    • For internal networks
      • VXLAN (dynamically created vSphere port groups)
        • Virtual eXtensible LAN (VXLAN) network pools use a Layer 2 over Layer 3 MAC in UDP encapsulation to provide scalable, standards-based traffic isolation across Layer 3 boundaries (requires distributed switch).
      • vCloud Network Isolation-backed (dynamically created vSphere port groups)
        • vCloud Director Network Isolation-backed (VCD-NI) network pools are backed by vCloud isolated networks. A vCloud isolated network is an overlay network uniquely identified by a fence ID that is implemented through encapsulation techniques that span hosts and provides traffic isolation from other networks (requires distributed switch).
      • VLAN-backed (dynamically created vSphere port groups)
        • VLAN-backed network pools are backed by a range of preprovisioned VLAN IDs. For this arrangement, all specified VLANs are trunked into the vCloud environment (requires distributed switch).
      • vSphere port group-backed (manually created vSphere port groups)
        • vSphere port group-backed network pools are backed by preprovisioned port groups, distributed port groups, or third-party distributed switch port groups.

VCAP CID 3-1-1

 

Skills and Abilities

  • Based on a given logical design, determine appropriate network isolation technologies for a physical vCloud design
    • You will need to base that on the features of each isolation technology.
    • Based on the requirements (and constraints) you will get a logical design for the network, covering both internal and external connections. This needs to be translated into a choice of isolation method. You could also end up using all of them if the use cases require it.
  • Based on a given logical design, determine network service communication requirements (DNS, LDAP, IPv6 and NTP) for a physical vCloud design
  • Analyze communication requirements for a given application.
    • This is either based on internal, new vCloud applications (multi-tier), or on single applications that need external access from a routed network.
      • Multi-tier applications are workloads consisting of multiple separate virtual machines, each with a role within the application. An example is a web service with a web front end, an application server to work out requests from the web front end, and a database server to keep all the data processed by the application server.
        • Each of these servers has communication requirements that must be met so the application works and is secure.
      • Single virtual machines running in a vCloud can also have various communication requirements, like inbound HTTP access, LDAP access for authentication, file-level access to a file server etc. This list is really anything you can think of, as all applications need to communicate with other applications at some point.
    • This can also apply to workloads that will be migrated into a vCloud instance. In that case a dependency list of applications and their servers is needed. Unfortunately most organizations don’t have software for that, but VMware has several options for customers who need to map out their current workload dependencies.
      • VMware Application Dependency Planner
        • A tool for VMware Partners to use to map dependencies for both virtual and physical systems
          • It’s agentless and uses the port mirroring features available in vSphere vSwitches to create a dependency map.
      • vCenter Operations Manager Infrastructure Navigator
        • A tool to map out dependencies in virtual environments. It is part of the vCOps Advanced packages.
  • Given an application security profile, determine the required vShield edge services (static routing, IPSEC VPN, IP masquerading, NAT, DHCP, etc.).
    • The vShield Edge services are only available for routed networks, except DHCP, which is also available for internal organization networks.
      • Static Routing
        • You can configure an edge gateway to provide static routing services. After you enable static routing on an edge gateway, you can add static routes to allow traffic between vApp networks routed to organization vDC networks backed by the edge gateway.
      • IPSEC VPN
        • You can create VPN tunnels between organization vDC networks on the same organization, between organization vDC networks on different organizations, and between an organization vDC network and an external network
      • IP masquerading
        • You can configure certain vApp networks to provide IP masquerade services. Enable IP masquerading on a vApp network to hide the internal IP addresses of virtual machines from the organization vDC network.
        • When you enable IP masquerade, vCloud Director translates a virtual machine’s private, internal IP address to a public IP address for outbound traffic.
      • NAT (SNAT and DNAT)
        • A source NAT rule translates the source IP address of outgoing packets on an organization vDC that are being sent to another organization vDC network or an external network.
        • A destination NAT rule translates the IP address and port of packets received by an organization vDC network coming from another organization vDC network or an external network.
      • DHCP
      • Load Balancer
        • Edge gateways provide load balancing for TCP, HTTP, and HTTPS traffic. You map an external, or public, IP address to a set of internal servers for load balancing. The load balancer accepts TCP, HTTP, or HTTPS requests on the external IP address and decides which internal server to use. Port 809 is the default listening port for TCP, port 80 is the default port for HTTP, and port 443 is the default port for HTTPS.
      •  Firewall
        • Firewall rules are enforced in the order in which they appear in the firewall list. You can change the order of the rules in the list. When you add a new firewall rule to an edge gateway, it appears at the bottom of the firewall rule list. To enforce the new rule before an existing rule, reorder the rules.
    • Application security profiles can be very different, and it’s best to know what these services can do to be able to determine which ones you should configure.
  • Given security requirements, determine firewall configuration.
    • This involves configuring the Edge Firewall to fulfill the security requirements of an application.
    • Just keep in mind that the firewall rules are enforced in the order in which they appear in the firewall list, so make sure the order is correct 🙂 (not allow all first, then deny some)
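Why the order matters can be shown with a tiny rule evaluator in Python. This is not how the Edge Firewall is implemented; it is a sketch that assumes first-match-wins evaluation and a default of deny when nothing matches. The `Rule` class, the `evaluate` function and the addresses are all hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional


@dataclass
class Rule:
    name: str
    source: str           # source network in CIDR notation
    port: Optional[int]   # None matches any destination port
    action: str           # "allow" or "deny"


def evaluate(rules: List[Rule], source_ip: str, port: int, default: str = "deny") -> str:
    """Return the action of the first rule that matches (assumed first-match-wins)."""
    for rule in rules:
        source_matches = ip_address(source_ip) in ip_network(rule.source)
        port_matches = rule.port is None or rule.port == port
        if source_matches and port_matches:
            return rule.action
    return default


allow_all = Rule("allow-all", "0.0.0.0/0", None, "allow")
deny_ssh = Rule("deny-ssh-from-10", "10.0.0.0/8", 22, "deny")

# Allow-all first: the deny rule is never reached.
print(evaluate([allow_all, deny_ssh], "10.1.2.3", 22))   # allow
# Deny first: SSH from 10.0.0.0/8 is blocked, everything else is allowed.
print(evaluate([deny_ssh, allow_all], "10.1.2.3", 22))   # deny
print(evaluate([deny_ssh, allow_all], "10.1.2.3", 443))  # allow
```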
  • Given compliance, application and security requirements, create a vApp network design.
    • Instead of having a great time drawing all available configurations of vApp designs I’ll tell you to read from page 56 to 59 in the vCAT  Architecting a VMware vCloud document where most of the network design are explained (and with pictures).
  • Given compliance, application and security requirements, create a private vCloud network design.
    • This really can’t be explained any better than on pages 65-66 in the vCAT  Architecting a VMware vCloud document.
    • The thing is you really need to know how you can use the different networking features of vCloud (direct, routed, vApp networks) to be able to create a vCloud network design.
    • Also the Private VMware vCloud Implementation Example is a great reference document for vCloud designs.
    • As with most physical designs, it works as an expansion of the logical design, adding information on physical layout and attributes.
  • Given compliance, application and security requirements, create a public vCloud network design.
    • Same goes for this one, pages 64-65 in the vCAT  Architecting a VMware vCloud document.
    • Like in the previous bullet the Public VMware vCloud Implementation Example is a great reference.

 

VCAP-CID Study Notes: Objective 2.5

This is Objective 2.5 in the VCAP-CID blueprint Guide 2.8. The rest of the sections/objectives can be found here.

Bold items have higher importance, and copied text is in italics.

Knowledge

  • Compare and contrast vCloud allocation models.
    • I think this was covered in Objective 2.4.
  • Identify storage constraints for an Organization Virtual Datacenter (VDC)
    • When configuring storage for an Organization Virtual Datacenter you can set a Storage Limit in GB. This is the storage that is used by VMs and Catalog items in the organization vDC.
    • When using Fast Provisioning there are constraints regarding the usage of Shadow VM’s and their Linked Clones.
      • When a VM is created with Fast Provisioning a Shadow VM is created. The VM is then created as a linked clone from that Shadow VM.
      • This table (from the vCAT Architecting a VMware vCloud document) shows how placement of Linked Clones behaves:

VCAP CID  2-5-1

 

 

Skills and Abilities

  • Determine applicable resource pool, CPU and memory reservations/limits for a vCloud logical design.
    • The allocation pool types do most of that configuration automatically and should not be changed using the vSphere client.
    • But you could create sub resource pools for each type of an allocation pool.
  • Determine the impact of allocation model performance to a vCloud logical design.
    • This is based on the allocation model used:
      • Reservation Pool
      • Allocation Pool
        • Since Allocation Pool is based on resource pool reservation the same concepts apply here as to the Reservation Pools. The only difference is that the users can’t change the limit, reservations and shares on a VM level.
        • Here you don’t have to worry that much as the VM’s will use the CPU/Memory capacity configured, with % of it reserved when VM’s are powered on.
      • Pay-as-you-Go
        •  A default resource pool with no configuration is created, but the VM’s have limits based on the vCPU speed configured and reservation based on the % setting. So a VM with 2 vCPU would have double the limit of the vCPU speed.
        • CPU limits at a VM level can result in high CPU ready times. esxtop shows these as %RDY, while %MLMTD shows the percentage of time the VM was ready to run but was not scheduled because doing so would violate its CPU limit setting. %MLMTD should be 0%.
        • Memory reservation at a VM level is covered in this blog post from Frank Denneman: http://frankdenneman.nl/2009/12/08/impact-of-memory-reservation/
        • Here the performance penalty is mostly based on the vCPU speed configured at the creation of the Organization vDC.
          • Too small and you’ll have a lot of slow VMs. You could fix this by adding more vCPUs, thereby increasing the limit, but then you might end up with VMs with too many vCPUs (creating a vCPU scheduling war).
          • If the vCPU speed is set to the MHz number of the actual hosts, all the VMs will basically work as if they don’t have a limit. But that means you will need to give the Organization a huge quota to be able to power on all the VMs, or leave it at unlimited.
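The Pay-as-you-Go numbers described above can be sketched in a few lines of Python. The vCPU speeds (1000 MHz and 2600 MHz) and the 20% guarantee are made-up values, and the function names are hypothetical; the only logic taken from the notes is that the VM limit is the vCPU speed times the number of vCPUs and the reservation is a percentage of that.

```python
def payg_vm_cpu(vcpus: int, vcpu_speed_mhz: int, guarantee_percent: int):
    """Return (limit_mhz, reservation_mhz) for one Pay-as-you-Go VM."""
    limit_mhz = vcpus * vcpu_speed_mhz
    reservation_mhz = limit_mhz * guarantee_percent / 100
    return limit_mhz, reservation_mhz


def quota_needed_mhz(vm_vcpus: list, vcpu_speed_mhz: int) -> int:
    """CPU quota needed to power on all VMs at the given vCPU speed."""
    return sum(vcpus * vcpu_speed_mhz for vcpus in vm_vcpus)


print(payg_vm_cpu(1, 1000, 20))            # (1000, 200.0)
print(payg_vm_cpu(2, 1000, 20))            # (2000, 400.0), double the limit
print(quota_needed_mhz([2, 2, 4], 2600))   # 20800
```

The last line shows the trade-off mentioned above: setting the vCPU speed close to the physical clock speed removes the effective limit, but the quota needed grows quickly.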
  • Determine the impact to a given billing policy based on a selected allocation model.
    • There is no need for me to write about this as this subject has been covered in this excellent post from Eiad Al-Aqqad.
  • Given service level requirements, determine appropriate allocation model(s).
    • First we need to point out that service levels can be different based on what they should cover:
      • Availability – Are the systems running? Based on uptime of the systems in question.
      • Backups – RTO & RPO
      • Serviceability – Initial response – Initial resolution time
      • Performance SLA – Need a certain amount of performance – 15K disks, SSD disks etc.
      • Compliance – Logging,  ensure infrastructure is compliant to standards (PCI-DSS, etc)
      • Operations – Time when users can be added (if manual)
      • Billing – How long billing information is kept, depends on the local law.
    • Each allocation model has its caveats regarding service levels:
      • Reservation Pool
        • DRS Affinity rules cannot be set by users in the default vCloud UI, but would need to be “spliced” in, perhaps as part of a custom UI (Objective 2.4 has a link to a great example http://vniklas.djungeln.se/2012/06/21/vm-affinity-when-using-vcloud-director-and-vapps/)
        • If the service levels concern the amount of resources available, Reservation Pool has all of its resources reserved.
        • Availability SLA is the same for each cluster of ESXi hosts and has no impact on different allocation models.
      • Allocation Pool
        • If you are using Elastic mode your VMs might be running in two separate DRS clusters, so you will need to keep that in mind.
        • If the service levels concern the amount of resources available, Allocation Pool has a part of its resources reserved.
      • Pay-as-you-Go
        • If the service levels concern the amount of resources available, PAYG has a part of its resources reserved, but it is most likely to have performance problems regarding CPUs.
  • Given customer requirements, determine an appropriate storage provisioning model (thin/fast).
    • Thin Provisioning is just that: VM disks use less space on the datastores. So if space efficiency is a requirement, then great.
    • Fast Provisioning like I mentioned before uses the Linked Clone technology to create clones that read from a single Shadow VM. Each VM reads that master disk, but writes to their own delta disk. Great for creating multiple VM’s at once as they don’t need to clone the whole disk of the templates.
  • Given a desired customer performance level, determine a resource allocation configuration.
    • A resource allocation configuration is the choice of allocation model and how it is configured.
      • Reservation Pool
        • Customer gets resources reserved and can control how those resources are divided between workloads in that pool.
      • Allocation Pool
        • Customer gets a part of the resource reserved with a chance to burst to a certain amount.
      • Pay-as-you-Go
        • Customer gets a VM based reservation and limited vCPU power. Great for Test/Dev or dynamic workloads (meaning created and then deleted after a short period of time)
    • Performance levels can also mean using different performance Tiers (and perhaps with different level of service)
      • You can create different Provider vDC’s with different CPU speeds, HA configurations, and perhaps adding a SSD caching solution to create different Tiers.
      • Also you can offer storage Tiers, which could be based on different kinds of spinning disks, e.g. 10K, 15K and 7.2K. It all depends on the storage array and protocol used.
    • Here is an example:
      • 3 Tiers of Provider vDC’s
        • Gold: Reservation Pool – E7 Intel processors – High speed Memory – HA at N+2 – SSD caching enabled.
        • Silver: Allocation Pool – E5 Intel processors – High speed memory – HA at N+1
        • Bronze: PAYG – E5 Intel Processors – High speed memory – HA turned off
      • Storage
        • Gold: 15K HDD + SSD caching in the storage array
        • Silver: 10K HDD
        • Bronze: 7.2K HDD

VCAP-CID Study Notes: Objective 2.4

This is Objective 2.4 in the VCAP-CID blueprint Guide 2.8. The rest of the sections/objectives can be found here.

Bold items have higher importance, and copied text is in italics. Please note that this post is one of the larger ones in this series.

Knowledge

  • Identify constraints of vSphere cluster sizing
    • Each vSphere cluster can only have 32 hosts. But the clusters can be Elastic so Organizations can use multiple clusters.
    • This is from ESXi 5.1 Maximums:

VCAP CID 2-4-1

  • Identify constraints of Provider/Organization Virtual Datacenter relationships
    • You can only create 32 Resource Pools (Organization vDCs) for the same Organization. They can be a part of more than one Resource Pool for a single Provider vDC.
    • Elastic mode allows organizations to use multiple clusters (which is a resource pool)
    • This piece of advice is always good to know, but it might be too strict if you have a good estimate of the growth of the environment or use elastic Provider vDCs:
      • As the number of hosts backing a provider virtual data center approaches the halfway mark of cluster limits, implement controls to preserve headroom and avoid reaching the cluster limits. For example, restrict the creation of additional tenants for this virtual data center and add hosts to accommodate increased resource demand for the existing tenants.
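As a trivial illustration of that headroom rule, here is a Python sketch. Treating "approaches the halfway mark" as "has reached 50% of the 32-host cluster limit" is my own simplification, and the function name and threshold parameter are hypothetical.

```python
MAX_HOSTS_PER_CLUSTER = 32  # vSphere 5.1 cluster limit quoted above


def should_restrict_new_tenants(hosts_in_cluster: int,
                                max_hosts: int = MAX_HOSTS_PER_CLUSTER,
                                threshold: float = 0.5) -> bool:
    """True once the cluster has reached the headroom threshold."""
    return hosts_in_cluster >= max_hosts * threshold


print(should_restrict_new_tenants(12))   # False
print(should_restrict_new_tenants(16))   # True
```

If you have a solid growth estimate or use elastic Provider vDCs, raising the threshold is the knob to turn.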
  • Identify capabilities of allocation models
  • Explain vSphere/vCloud storage features, functionality, and constraints.
    • vSphere storage features include (among others):
      • Storage IO control
        • Supported in vCloud environment, as this is a feature of the vSphere environment and doesn’t have a constraint on a vCloud design.
        • I will not be explaining these vSphere features at length, as I assume people know how they work and how they might impact a design.
      • Storage DRS (Storage vMotion)
        • Supported in vCloud Director 5.1
      • Storage Clusters
        • Supported in vCloud Director 5.1. You can add a Storage Cluster in the vCloud director administrative page.
        • I recommend setting the same Storage Policy (Profile in 5.1) for each Storage Cluster
        • Each Cluster can only contain 32 Datastores, but a Storage Policy (Profile in 5.1)  can include multiple datastores from multiple Storage Clusters.
          • So VM’s for the same Organization could reside in two different Datastore Clusters.
      • vSphere Flash Read Cache (vSphere 5.5)
      • vSphere Profile Driven Storage
      • All of these features are supported in vCloud environments. The one exception is vFRC, which is not supported in 5.1 but is supported in vCloud Director 5.5.
      • Please note that only 64 ESXi hosts can access the same datastore at any given time, so a large environment might run into that constraint.
      • If you are not familiar with the vSphere storage features, please make sure to catch up on that subject.
    • vCloud storage features include:
      • The only real storage features used in vCloud Director are
        • Thin-provisioning
        • Fast-provisioning
        • Snapshots
          • A vSphere feature, but it’s capped in vCloud Director at one snapshot per VM.
          • Other capabilities include:
            • One snapshot per virtual machine is permitted.
            • NIC settings are marked read-only after a snapshot is taken.
            • Editing of NIC settings is disabled through the API after a snapshot is taken.
            • To take a snapshot, the user must have the vAPP_Clone user right.
            • Snapshot storage allocation is added to Chargeback.
            • vCloud Director performs a storage quota check.
            • REST API support is provided to perform snapshots.
            • Virtual machine memory can be included in the snapshot.
            • Full clone is forced on copy or move of the virtual machine, resulting in the deletion of the snapshot (shadow VMDK).
  • Explain the relationship between allocation models, vCloud workloads and resource groups.
    • First I want to recommend everyone read this document:
    • You should know VMware changed how CPU allocation works in Allocation Pools between 5.1 RTM (810718) and 5.1.2 (1068441).
      • In 810718 (5.1 RTM)
        • When configuring an organization, neither the Limit nor the Reservation of the RP was set at creation.
        • Example: 20 GHz Capacity and 50% guarantee, where 1 vCPU is 2 GHz. When 1 VM is powered on, the RP Limit is set to 2 GHz and the Reservation is set to 1 GHz. So you only have 10 vCPU’s available in your environment before you hit the 20 GHz cap and the VM’s can’t be turned on.
        • And when using a lower vCPU speed, let’s say 400 MHz, you get VM’s that are limited in available CPU at first, as the RP Limit is incremented in 400 MHz steps: the first VM has 400 MHz, 2 VM’s have 800 MHz, 3 VM’s have 1200 MHz, etc.
      • In 868405 (5.1.1)
        • When configuring an organization, the Limit of the RP was set at creation, so with, let’s say, 20 GHz Capacity and a 50% guarantee the resource pool would have a 20 GHz limit. No reservation was set on the RP; that was done when a VM was powered on.
        • 20 GHz Capacity and 50% guarantee. 1 vCPU is 2 GHz. When 1 VM is powered on, the RP Reservation is set to 1 GHz. You still only have 10 vCPU’s available.
        • But now if you used a lower vCPU speed you could create VM’s like you wanted. 400 MHz per vCPU would allow you to create 50 vCPU’s in a 20 GHz Capacity RP.
        • The first VM’s created now have the whole CPU Capacity to use so they are not so constrained.
        • But this means that if the Allocation Pool was Elastic (spanning multiple clusters), each RP in each cluster would have its limit set to the initial Capacity, allowing organizations to use more resources than were initially configured.
        • Massimo Re Ferre’ has a great post on what changed between 5.1 and 5.1.1: http://it20.info/2012/10/vcloud-director-5-1-1-changes-in-resource-entitlements/
      • In 1068441 (5.1.2)
    • Allocation Pool (as it works in 5.1.3)
      • What kind of resource pool does this pool create?
        • A Sub-resource pool under the Provider vDC resource pool. The pool is configured with the CPU Capacity configured as a limit, leaving CPU reservation unchanged. The Memory limit and reservation are also unchanged (with Expandable and Unlimited Selected as well)
      • What happens when a virtual machine is turned on?
        • When a VM is turned on, the sub-resource pool’s memory limit is left unchanged with Expandable Reservation and Unlimited selected, and its reservation is increased by the VM’s configured memory size times the percentage guarantee for that organization vDC.
          • Please note that even though the limit is not set on the resource pool, vCloud Director will not power on VM’s that would exceed the Memory Capacity configured for the pool.
        • The CPU reservation is increased by the number of vCPUs configured for the virtual machine times the vCPU speed specified at the organization vDC level times the percentage guarantee factor for CPU set at the organization vDC level. The virtual machine is reconfigured to set its memory and CPU reservation to zero and is then placed.
      • Does this allocation model have any special features?
        • Elasticity: Can span multiple Provider Resource pools.
    • Pay-as-you-Go (EDITed)
      • What kind of resource pool does this pool create?
        • A sub-resource pool is created with zero rate and unlimited limit.
      • What happens when a virtual machine is turned on?
        • When a VM is turned on, the VM’s memory limit is increased by the VM’s configured memory size, and its reservation is increased by the VM’s configured memory size times the percentage guarantee for that organization vDC. The resource pool reservation is also increased by the same amount plus the VM overhead.
        • The CPU limit on the VM is increased by the number of vCPUs the virtual machine is configured with times the vCPU frequency specified at the organization vDC level, and the CPU reservation is increased by the number of vCPUs configured for the virtual machine times the vCPU frequency specified at the organization vDC level times the percentage guarantee factor for CPU set at the organization vDC level. The resource pool reservation is also increased by the same amount.
      • Does this allocation model have any special features?
        • No resources are reserved ahead of time so they might fail to power on if there aren’t enough resources.
    • Reservation Pool
      • What kind of resource pool does this pool create?
        • A sub-resource pool is created with the limit and reservation set to the values configured at the organization vDC level.
      • What happens when a virtual machine is turned on?
        • Rate and limit are not modified. The organization can change these settings on a per VM level with this allocation model.
      • Does this allocation model have any special features?
        • Cannot be elastic across multiple Provider resource pools.
        • Will fail to create if the resources in the Provider resource pools are insufficient.
        • Users can set shares, limits and rates on Virtual machines.
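The arithmetic described above for Allocation Pool and Pay-as-you-Go can be sketched in a few lines of Python. This is only an illustration of the formulas in these notes, not how vCloud Director implements them. The `OrgVdc` class and all function names are made up for this example, values are kept in MHz and MB to avoid rounding surprises, and memory overhead is ignored.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class OrgVdc:
    """Hypothetical holder for the organization vDC settings used in the notes."""
    cpu_capacity_mhz: int   # CPU allocation (Allocation Pool) or quota (PAYG)
    vcpu_speed_mhz: int     # vCPU speed configured on the organization vDC
    cpu_guarantee: float    # e.g. 0.5 for a 50% guarantee
    mem_guarantee: float    # e.g. 0.2 for a 20% guarantee


def max_vcpus(vdc: OrgVdc) -> int:
    """Number of vCPUs that fit under the configured CPU capacity."""
    return vdc.cpu_capacity_mhz // vdc.vcpu_speed_mhz


def power_on_reservation(vdc: OrgVdc, vcpus: int, mem_mb: int) -> Tuple[float, float]:
    """Reservation increase (MHz, MB) caused by powering on one VM.

    CPU: vCPUs x vCPU speed x CPU guarantee.
    Memory: configured memory x memory guarantee (overhead not included).
    """
    cpu_mhz = vcpus * vdc.vcpu_speed_mhz * vdc.cpu_guarantee
    mem = mem_mb * vdc.mem_guarantee
    return cpu_mhz, mem


def payg_vm_cpu_limit_mhz(vdc: OrgVdc, vcpus: int) -> int:
    """Pay-as-you-Go: CPU limit placed on the VM itself."""
    return vcpus * vdc.vcpu_speed_mhz


def rtm_pool_limit_mhz(powered_on_vcpus: int, vcpu_speed_mhz: int) -> int:
    """5.1 RTM behaviour as described in the notes: the resource pool limit
    grows by one vCPU speed for every powered-on vCPU."""
    return powered_on_vcpus * vcpu_speed_mhz


# The 20 GHz / 50% / 2 GHz example from the notes.
vdc = OrgVdc(cpu_capacity_mhz=20_000, vcpu_speed_mhz=2_000,
             cpu_guarantee=0.5, mem_guarantee=0.2)
print(max_vcpus(vdc))                       # 10 vCPUs fit in 20 GHz
print(power_on_reservation(vdc, 1, 4096))   # CPU and memory reservation added
print([rtm_pool_limit_mhz(n, 400) for n in (1, 2, 3)])
```

With a 400 MHz vCPU speed the same `max_vcpus` call returns 50, which matches the 5.1.1 example above.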

 

Skills and Abilities

  • Given a set of vCloud workloads, determine appropriate DRS/HA configuration for resource clusters.
    • These are the available HA configurations:
      • Host failures tolerated
      • % of Cluster resources reserved
      • Specify Failover Host
    • This really depends on the allocation model that will be used with the cluster. Let’s say a whole cluster is using the same allocation model (for simplicity’s sake):
      • Reservation Pool
        • The Resource pools will have reservations and limits. The VM reservations and shares will be controlled by the users of that organization, so I recommend using % based HA mode. That will make HA take the reservation of each VM into account when calculating the Current Failover Capacity.
        • If you used Host failures tolerated at its default setting, it would use the largest CPU and memory reservations among the VM’s as the slot size. You would need to know how many resources the users are going to reserve to be able to set the slot sizes manually with advanced settings, so it’s not a very flexible option.
      • Allocation Pool
        • The resource pools will have reservations and limits based on the configuration of the Organization vDC. The VM’s will not be configured with a reservation, so the slot size will be the default of 32 MHz and 0 MB + overhead in memory.
        • Allocation pool VM’s can also be very different in size, and again you will need to have a good, no, a great idea of the sizes of the VM’s that will be running in there to be able to use the advanced settings for Host failures tolerated.
        • A % based HA is probably the better choice of the two.
      • Pay-as-you-Go
        • In PAYG the VM’s have a limit set to the configured size of the vCPU. But the memory limit depends on the % guaranteed setting at the creation of the organization vDC. So you will have a very predictable CPU limit (or slot size) but the memory slot size will depend on the size of the largest VM in the cluster.
        • So if you have large VM’s with perhaps 32 GB of memory and any percentage guaranteed, let’s say 20% (the default), the slot size will be 1 GHz and 6.4 GB. That will not be a good slot size, as most VM’s will be much smaller.
        • One way of making Host failures tolerated acceptable is to use no % guarantee, so every VM must fight for the memory in the resource pool.
        • But it is best to use % of Cluster resources reserved, since it is the de facto standard in most HA clusters because it delivers the most flexibility of the three options.
          • It takes all available resources in the cluster and adds them up. Then it subtracts the % configured in the HA settings. Then HA adds up all reserved resources that are in use (powered-on VM’s only), using a 32 MHz default for each VM without a CPU reservation. Memory is reservation + overhead. This value is then used to calculate the failover capacity.
          • It’s best to let the experts explain this as they made a book on it so I recommend reading the vSphere 5.1 Clustering Deepdive book
          • Or read their blog on HA. This blog post from Duncan Epping explains the % based HA very well: http://www.yellow-bricks.com/vmware-high-availability-deepdiv/
    • To be able to use affinity/anti-affinity rules you will need to use PowerCLI or other automation features to make sure certain vApps are deployed to different hosts or to the same host.
    • DRS configuration
      • First I must mention you HAVE to enable DRS on the clusters behind vCloud Provider resource pools, as that is the only way to be able to create resource pools.
      • And it’s best practice to use a different Provider vDC for each allocation model:
        • Pay-as-you-Go cluster, Allocation Pool cluster and Reservation Pool cluster
        • But let’s be realistic, that will not be the case at all for many installations. Not everybody has the budget to create 3 different clusters. You can create sub-resource pools for each model, but it complicates DRS scheduling immensely, as Frank Denneman explains in this blog post:
      • DRS moves VM’s around the ESXi hosts in a cluster based on the resources used in the cluster. That is a very simplified description of DRS, as it uses a lot of different metrics and algorithms to calculate if and when a VM should be moved.
      • I’m not going to explain in detail how DRS works as you can read up on that in various books and documentation.
      • As you might expect, using different resource pools can affect DRS calculations. You might have a Reservation Pool using half of the resources and Allocation Pool or PAYG for the rest.
      • When configuring DRS settings, it’s best in most cases to set the mode to automatic and just let DRS do its thing.
      • Read this to get a good idea on how DRS works, and of course if you really want to know more you should pick up Duncan’s and Frank’s book.
  • Given a set of vCloud workloads, determine appropriate host logical design.
    • So this means creating a logical design for the hosts that are a part of the Resource cluster.
    • Here you have the configuration of the ESXi hosts for the Resource clusters, and it depends on the amount of workloads projected, the usage of resources and the availability requirements.
      • Chapter 4.4 in the vCAT Architecting a VMware vCloud document describes this process.
    • This table from a VMware Partner SET document is a good overview of what you would like to have as a part of a host logical design:

VCAP CID 2-4-2
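To make the HA discussion above more concrete, here is a rough Python sketch comparing the slot-size view (Host failures tolerated) with the percentage-based view. It assumes the slot size is the largest CPU reservation (32 MHz when a VM has none) and the largest memory reservation plus overhead among powered-on VM’s, and that the failover capacity is the unreserved share of the cluster total. The function names, the tuple layout and the sample numbers are all invented for this example, and the list of VM’s is assumed to be non-empty.

```python
from typing import List, Tuple

# (cpu_reservation_mhz, mem_reservation_mb, mem_overhead_mb) per powered-on VM
Vm = Tuple[int, int, int]

DEFAULT_CPU_MHZ = 32  # value used for a VM without a CPU reservation


def cpu_requirement(vm: Vm) -> int:
    """CPU figure HA counts for one VM: its reservation, or the default."""
    return vm[0] if vm[0] > 0 else DEFAULT_CPU_MHZ


def slot_size(vms: List[Vm]) -> Tuple[int, int]:
    """Slot size (MHz, MB): the largest CPU and memory requirement of any VM."""
    cpu = max(cpu_requirement(vm) for vm in vms)
    mem = max(vm[1] + vm[2] for vm in vms)
    return cpu, mem


def slots_per_host(host_cpu_mhz: int, host_mem_mb: int, slot: Tuple[int, int]) -> int:
    """How many slots one host can hold; the scarcer resource decides."""
    return min(host_cpu_mhz // slot[0], host_mem_mb // slot[1])


def failover_capacity(total_cpu_mhz: int, total_mem_mb: int,
                      vms: List[Vm]) -> Tuple[float, float]:
    """Percentage-based view: unreserved fraction of cluster CPU and memory."""
    cpu_req = sum(cpu_requirement(vm) for vm in vms)
    mem_req = sum(vm[1] + vm[2] for vm in vms)
    return ((total_cpu_mhz - cpu_req) / total_cpu_mhz,
            (total_mem_mb - mem_req) / total_mem_mb)


# Two small VMs and one large VM with a 6554 MB memory reservation
# (roughly 20% of 32 GB), which drags the memory slot size up for everyone.
vms = [(0, 512, 100), (0, 1024, 120), (200, 6554, 300)]
slot = slot_size(vms)
print(slot)                                   # (200, 6854)
print(slots_per_host(20_000, 65_536, slot))   # 9
print(failover_capacity(60_000, 196_608, vms))
```

One large VM sets the slot size for the whole cluster, while the percentage-based calculation only counts what each VM actually reserves. That is the flexibility argument made above.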

  • Given a set of vCloud workloads, determine appropriate vCloud logical network design.
    • This section will require details on networking and networking security configuration for the ESXi hosts supporting the resource clusters.
      • This includes details on MTU sizes when using VCD-NI or VXLAN.
      • Increasing the number of ports on a vDS from 128 to 4096 to allow vCD to dynamically create port groups.
      • An overview of vSS and vDS usage on the ESXi hosts for each cluster (if different)
      • vSS configuration: # of ports, # network adapters, Security settings, Port Groups with VLANs and security settings.
      • vDS configuration: # of ports, # network adapters, Security settings, Port Groups with VLANs and security settings and Bindings
    • Chapter 4.4 in the vCAT Architecting a VMware vCloud document helps with sizing the environment to be able to create the logical design.
  • Given a set of vCloud workloads, determine appropriate vCloud logical storage design.
    • Chapter 4.4 in the vCAT Architecting a VMware vCloud document helps with sizing the environment to be able to create the logical design.
    • Here you need to state how much storage is needed by the projected workloads, and whether different tiers of storage will be offered. These tiers need to be explained.
    • If available, create a list of VM’s with their configured disk sizes, memory sizes, safety margins and average number and sizes of snapshots. I know this is almost impossible for public clouds, but it’s better to be able to make a decision on something tangible rather than say “oh cause I always used 2TB datastores”.
      • Let’s say the storage vendor that was used says it only wants 36 VM’s per datastore (I couldn’t imagine why).
      • You have created an estimate of the number of VM’s that will be deployed/migrated into the environment and their disk sizes.
      • From those numbers you could size the datastores.
    • If you will be using Datastore Clusters, they should be quite proficient in moving workloads around so you only need to make sure you have enough resources in the cluster. The 32 Datastore per Cluster restriction, and the Shadow VM for Fast provisioning might affect that design though. And plan ahead with projected growth as well.
  • Given a set of vCloud workloads, determine appropriate vCloud logical design.
    • vCloud logical design is an overview of the whole environment. vCAT Architecting a VMware vCloud document has a great picture that shows just that:

VCAP CID 2-4-3
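The datastore sizing exercise from the logical storage design bullets above (VM estimates, a 36 VM per datastore cap, a safety margin and the 32 datastores per datastore cluster limit) can be sketched like this. The per-VM footprint formula (disk + memory for the swap file + snapshots) and the 20% margin are assumptions made for this example; adjust them to your own environment.

```python
import math
from typing import List, Tuple

# (disk_gb, memory_gb, snapshot_gb) estimated per VM
VmEstimate = Tuple[float, float, float]


def size_datastores(vms: List[VmEstimate],
                    max_vms_per_datastore: int = 36,
                    safety_margin: float = 0.2,
                    max_datastores_per_cluster: int = 32) -> Tuple[int, float, int]:
    """Return (datastore count, size per datastore in GB, datastore clusters).

    Footprint per VM = disk + memory (worst-case swap file when nothing is
    reserved) + snapshot space. Each datastore is sized for a full load of
    average-sized VMs plus the safety margin.
    """
    footprints = [disk + mem + snap for disk, mem, snap in vms]
    avg_footprint = sum(footprints) / len(vms)
    datastores = math.ceil(len(vms) / max_vms_per_datastore)
    datastore_gb = avg_footprint * max_vms_per_datastore * (1 + safety_margin)
    clusters = math.ceil(datastores / max_datastores_per_cluster)
    return datastores, datastore_gb, clusters


# 100 VMs, each with a 40 GB disk, 4 GB memory and about 6 GB of snapshots.
estimate = [(40, 4, 6)] * 100
count, size_gb, clusters = size_datastores(estimate)
print(count, round(size_gb), clusters)
```

For the sample input this gives 3 datastores of roughly 2160 GB each in a single datastore cluster. Space for the Shadow VM’s created by Fast provisioning and projected growth is not included and would have to be added on top.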

VCAP-CID Study Notes: Objective 2.3

This is Objective 2.3 in the VCAP-CID blueprint Guide 2.8. The rest of the sections/objectives can be found here.

Knowledge

  • Identify management components required for a vCloud design
    • From the vCAT Architecting a VMware vCloud document:
      • Core management cluster components include the following:
      • vCenter Server or VMware vCenter Server Appliance™
      • vCenter Server database
      • vCloud Director cells
        • (NFS storage for Multicell environment as well)
      • vCloud Director database
      • vCloud Networking and Security Manager (one per resource group vCenter Server)
      • vCenter Chargeback Manager
      • vCenter Chargeback database
      • VMware vCenter Update Manager™
      • vCenter Orchestrator
      • VMware vCloud Networking and Security Edge gateway appliances deployed by vCloud Director through vCloud Networking and Security Manager as needed, residing in the resource groups, not in the management cluster.
    • And more from the same document:
      • The following management cluster components are optional:
      • VMware vCenter Server Heartbeat™
      • vCloud Automation Center
      • vCloud Connector
      • VMware vFabric RabbitMQ™
      • vFabric Application Director
      • VMware vFabric Hyperic® HQ
      • VMware vSphere Management Assistant
      • vCenter Operations Manager
      • vCenter Configuration Manager
      • vCenter Infrastructure Navigator
      • vCenter Site Recovery Manager
      • Databases for optional components
      • Optional components are not required by the service definition but are highly recommended to increase the operational efficiency of the solution.
        The management cluster can also include virtual machines or have access to servers that provide infrastructure services such as directory (LDAP), timekeeping (NTP), networking (DNS, DHCP), logging (syslog), and security.
  • Identify management component availability requirements.
    • Most of the vCloud components have High Availability features built in.

VCAP CID 2-3-1

      • vCenter Server or VMware vCenter Server Appliance™
        • vCenter Heartbeat – it’s EOA (End of Availability) but will be replaced with availability features in future vCenter releases.
      • vCenter Server database
        • SQL cluster (mirroring or AlwaysOn etc.) or Oracle RAC.
      • vCloud Director cells
        • You can cluster vCloud Director cells, but you will need NFS storage as a common place for vApp uploads.
      • NFS storage
        • Either a VM running NFS or even a hardware NAS.
      • vCloud Director database
        • SQL cluster (mirroring or AlwaysOn etc.) or Oracle RAC.
      • vCloud Networking and Security Manager (one per resource group vCenter Server)
        • One of the few components that can only be covered by vSphere HA.
      • vCenter Chargeback Manager
        • See the picture
      • vCenter Chargeback database
        • See the picture.
      • VMware vCenter Update Manager™
        • There is no need for availability here. Used when patching ESXi hosts.
      • vCenter Orchestrator
        • It has a cluster mode; this KB says it all: 2079967
      • VMware vCloud Networking and Security Edge gateway appliances deployed by vCloud Director through vCloud Networking and Security Manager as needed, residing in the resource groups, not in the management cluster.

Skills and Abilities

  • Design a management cluster given defined availability constraints.
    • I think I covered the constraint pretty well in the last bullet.
    • Please make sure to include all vSphere related features, HA and FT.
    • Other than those, these “best practices” are from the vCAT Architecting a VMware vCloud document:
      • NFS storage for vCloud Director cells should be at least 250 GB.
      • Use 3 ESXi hosts for a management cluster
      • Use HA with percentage based reservation. N+2 is also an option for even more high availability.
      • Network component and path redundancy.
      • Configure redundancy at the host (connector), switch, and storage array levels.
  • Size a management cluster based on required management components for a given vCloud design.
    • There is a nice table in the vCAT Architecting a VMware vCloud document:

VCAP CID 2-3-2

  • Design a management cluster that meets the needs of a given resource configuration.
    • Resource configuration is based on the size of the Resource cluster, and a number of VM’s (or at least I think they are referring to that).
    • So the vCloud maximums are something you need to be aware of:
      Knowledge Base Article 2036392.
    • But there is a slight chance they want you to take the sizing of the management components and use that to design a management cluster (how many hosts, CPU, memory, network, storage). Either way, it’s best to know about both 🙂

VCAP-CID Study Notes: Objective 2.2

This is Objective 2.2 in the VCAP-CID blueprint Guide 2.8. The rest of the sections/objectives can be found here.

Knowledge

  • Identify policy/quotas/limits available to a vCloud design.
    • The organizational policy/quota/limits are the following:
      • Allocation models
        • An Organization vDC needs to have a single allocation model, which will in some instances limit the resources available to the organization. This will be explained later in this Objective.
          • Pay as you go:
            • CPU Quota GHz – CPU guarantee % – vCPU speed
            • Memory Quota – Memory guarantee %
          • Allocation Pool:
            • CPU Allocation GHz – CPU guarantee %
            • Memory Allocation GB – Memory guarantee %
          • Reservation Pool:
            • CPU Allocation GHz
            • Memory Allocation GB
      • Storage Space
        • An organization vDC requires storage space for vApps and vApp templates. You can allocate storage from the space available on provider vDC datastores.
        • This storage quota either has a limit of certain GB/TB or unlimited.
        • Thin-provisioning and Fast-provisioning are also available options when selecting storage for organizations
      • Network Pools
        • An Organization needs network pools to create vApp networks or isolated internal networks.
        • When creating an Organization vDC you can limit how many such networks can be created.
      • Rate limits
        • You can configure inbound and outbound rate limits on Edge gateways for organizations.
        • Those are in gigabit per second
      • Max number of VM’s
        • When choosing allocation models you can limit how many VM’s can be created in that Organization vDC.
      • Catalog sharing/publishing
        • You can choose whether or not the organization can publish catalogs.
        • You can also choose whether it can share catalogs with other organizations.
      • Leases
        • You can specify for how long a VM can run
        • Also how long VM’s are available before they are “cleaned up”
        • What should be done when they are “cleaned up”
          • Move to Expired Items or Delete permanently
        • vApp template lease – how long they are kept before they are “cleaned up”
          • Move to Expired Items or Delete permanently
      • Default Quotas
        • The organization administrator can change how many VMs can be kept and how many can be powered on at any given time
      • Limits
        • There are 3 DoS (Denial of Service) limits, and they are best described by vCloud Director:
          • These limits provide a defense against Denial of Service attacks. Resource intensive operations, such as copy, move, Add to My Cloud, Add to Catalog, and so on, can be contained at a maximum number. Simultaneous connections to a VM through the VMRC console can also be limited, although this does not limit user-created connections through protocols such as VNC or RDP.
        • Number of resource intensive operations per user.
        • Number of resource intensive operations per organization
        • Number of simultaneous connections per VM.
  • Explain inter- and intra- organization vApp deployment and migration constraints.
    • I might be wrong but I think they mean Elastic vDCs.
      • Elastic vDCs are made up of several resource pools, spread across the different resource pools associated with their provider vDC.
        • 1 cluster of ESXi hosts is one resource pool, so you can have two clusters available to an organization through the same provider vDC
      • So when a virtual machine is powered on, a placement engine decides where it should run, and it can be moved to another resource pool (which is most likely a vSphere HA cluster)
    • Intra organization migration
      • They might as well be talking about the fact you can move a vApp to another vDC.
        • Turn the vApp off – Click My Cloud – Select vApps – Select Move to and select the vDC you want to move the vApp to.
        • You can also copy it the same way.
    • Inter-organization Deployment
      • The only way to deploy a vApp in another Organization is to publish it as a Public vApp Template. Then deploy it from there, or move it to an Organization catalog and deploy it.
    • Inter-organization migration
      • The same as deployment.
    • Intra organization deployment
      • This is just an ordinary deployment of a vApp and you can control who has access to deploy it.

Skills and Abilities

  • Given resource requirements and constraints, determine policy/quotas/limits for a logical design.
    • Since we are talking about organizations, I presume they mean organizational policy/quota/limits.
    • As these are resource based, this applies to allocation models and is best described in the vCloud Director Administration Guide (page 54-57) and vCAT Architecting a VMware vCloud (page 42-44, 5.4)
    • All the above mentioned bullets in Identify policy/quotas/limits available to a vCloud design can be used to control the creation of an organization and its vDC.
  • Determine how service dependencies impact components in a logical design.
    • The Datacenter Operational Excellence Through Automated Application Discovery & Dependency Mapping document in the Objective Tools is something everyone should read, but the highlights are:
      • Automated application discovery and dependency mapping can deliver tremendous value when applied to projects related to datacenter consolidation, capacity planning, business continuity planning, and application  migration. All of these projects have one common requirement: a deep, thorough understanding of how servers, applications, and infrastructure work together and relate to each other.
      • With automated application discovery and dependency mapping, you get that capability—whether its establishing a baseline asset inventory before a datacenter consolidation, ensuring a new application rollout doesn’t impact legacy architectures in unexpected ways, or understanding of how the servers, applications, and infrastructure work together so you can deliver application services right away when recovering from a disaster.  You know what applications are involved, how they are delivered, and what business services and users will be impacted by a proposed change or move.
      • Furthermore, by leveraging automation for application discovery and dependency mapping, you save tremendous amounts of money that would have been spent dedicating personnel hours—even worse, likely using highly skilled IT personnel—collecting and analyzing this data. And in the end, you’d still have risk with manual application discovery and dependency mapping because manual processes done by people are extremely prone to error.
      • In addition, by having automated application discovery and dependency mapping capabilities, these kinds of activities start to move away from being one-off projects, and become properly positioned to move toward being ongoing operational processes. Moving away from reactive, project-based approaches to proactive, process-based approaches for these kinds of activities, IT operations moves that much closer to achieving operational excellence—the data provided by automated application discovery and dependency mapping provides the key linchpin necessary to continuously improve these types of activities, rather than simply react when the need arises.
      • Automated application discovery and dependency mapping supports IT operations before and after a consolidation and virtualization effort by automatically discovering all of the critical dependencies that exist among your applications and your information infrastructure.
      • Before virtualization, automated application discovery and dependency mapping augments your virtualization plan by determining the key dependencies that exist between applications and their physical hosts. For example, automated application discovery and dependency mapping could identify a critical dependency between an application and database that could reduce overall network traffic. To maintain proper levels of performance after virtualization, IT operations could then ensure the application and database remain on the same server for greater efficiency.
      • By using automated application discovery and dependency mapping to identify critical dependencies between applications and servers before consolidation and virtualization occurs, you have a smoother transition to virtualization while ensuring the delivery and performance of IT services.
      • After consolidation and virtualization, changes occur rapidly in the new environment. Automated discovery and dependency-mapping information also helps IT operations by ensuring ongoing performance and improvement. This dependency information can help you identify potential application-related performance bottlenecks and reduce unnecessary network traffic. Automated application discovery and dependency mapping helps you deliver on the promise of virtualization by allowing you to optimize the performance of critical IT applications and services.
  • Given business requirements, determine the appropriate organizational structure for a logical design.
    • Business requirements are what needs to be done to provide value to the business. I had some examples in Objective 2.1 and I’ll use those as an example:
      • System must provide self-service capability
        • That is an integral part of vCloud Director and vCAC (vRealize now), so that won’t impact the organizational structure in most designs, unless you have a requirement to block self-service for certain organizations (a static organization with no ability to create more VM’s).
      • System must provide 99.9% availability
        • This is more of an infrastructure matter: hardware, networking, high availability of management services, HA etc. The resources should be available to the organizations according to SLA’s, and that is the only thing that should matter regarding availability (at least they should not need to worry about it).
      • System must provide optimal scalability and elasticity
        • vCloud Director and vSphere are highly scalable and elastic by design, but you need to design for it though (see what I did there? 🙂). vSphere clusters can scale to 32 ESXi hosts, and even so you can add multiple clusters to one Provider vDC. And elasticity is baked in as well, as you can move your workloads between clouds, private or public, fairly easily with vCloud Connector.
      • System must provide multitenancy
        • This is also an integral part of vCloud director as it supports multitenancy.
      • System must provide metering capabilities for cost reporting
        • VMware vCenter Chargeback Manager is the service that helps with that, please note that Chargeback is EOA for non-service provider customers June 10, 2014.
      • System must support vApp use cases defined
        • This depends on the use-cases that were defined. Please check out Objective 1.1 for some Use-cases.
      • System must leverage shared infrastructure and resources pooling
        • That’s a part of vSphere.
      • System must support a catalog of templates that end users can create
        • Yes also a part of vCloud Director.
      • System must provide differentiated offerings based on cost.

VCAP-CID Study Notes: Objective 2.1

This is Objective 2.1 in the VCAP-CID blueprint Guide 2.8. The rest of the sections/objectives can be found here.

Knowledge

  • Identify what can be included in a published catalog.
    • A vCloud Catalog can include several things, most of them vApps of various sizes and ISO files.
      • vApp with one VM.
      • vApp with several VM’s with various level of connection between them.
      • Media files: ISO files for the installation of a new VM’s OS inside a vApp. Floppy disks are also supported. Please note that users will not see the media files unless an Organization admin moves them to their Organization Catalog and shares them with the users.
    • A Published Catalog or a Global Catalog is a catalog that a vCloud admin has made public to all organizations in the cloud.
  • Identify what can be included in a private catalog.
    • The same things you can include in a published catalog.
    • But you will need to have access to the catalog to be able to create VMs from it.
  • Identify permission controls for catalogs.
    • This picture is snapped from the vCloud Director Users Guide and says it all:
    • VCAP CID 2-1-1
  • Explain the functionality of a catalog.
    • The best way to explain what a catalog is would be to quote the VCAT Consuming a VMware vCloud documentation:
    • Organizations can offer the following types of service catalogs to their users or customers:
      • A vCloud service catalog – Includes predefined vApps, virtual machines, and images (operating systems and applications) that users can deploy within an organization
      • An operational service catalog – Includes operational features such as development of a vCloud service catalog, backup and recovery services, archival services, managed services, and migration services.
    • And then a quote from the vCloud Director User’s Guide:
      • A catalog is a container for vApp templates and media files in an organization. Organization administrators and catalog authors can create catalogs in an organization. Catalog contents can be shared with other users in the organization and can also be published to all organizations in the vCloud Director installation.
      • There are two types of catalogs in vCloud Director; organization catalogs and public catalogs. Organization catalogs include vApp templates and media files that you can share with other users in the organization. If a system administrator enables catalog publishing for your organization, you can publish an organization catalog to create a public catalog. Organization administrators from any organization in the vCloud Director installation can view the vApp templates and media files in a public catalog and copy those files to a catalog in their organization for use by their members.
      • There are two ways to add vApp templates to a catalog. You can upload an OVF package directly to a catalog or save a vApp as a vApp template.
      • Members of an organization can access vApp templates and media files that they own or that are shared to them. Organization administrators and system administrators can share a catalog with everyone in an organization, or with specific users and groups in an organization.
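To make the sharing and publishing rules quoted above a bit more concrete, here is a minimal sketch of them as a small data model. This is not the vCloud Director API: the `Catalog` class, its fields and the `can_access` function are hypothetical names I made up for illustration, and the rules are simplified to what the quotes state (owner or shared access inside the organization, published catalogs visible to organization administrators of other organizations).

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Hypothetical model of a catalog; not the vCloud Director API."""
    name: str
    organization: str
    owner: str
    items: list = field(default_factory=list)       # vApp templates and media files
    shared_with: set = field(default_factory=set)   # users the catalog is shared with
    shared_with_everyone: bool = False               # shared with the whole organization
    published: bool = False                          # visible outside the organization

    def publish(self, publishing_enabled_for_org: bool) -> None:
        # A system administrator has to enable catalog publishing for the organization first.
        if not publishing_enabled_for_org:
            raise PermissionError("catalog publishing is not enabled for this organization")
        self.published = True


def can_access(catalog: Catalog, user: str, user_org: str, is_org_admin: bool = False) -> bool:
    """Simplified access check based on the quoted rules (an assumption, not product logic)."""
    if user_org == catalog.organization:
        return (
            user == catalog.owner
            or catalog.shared_with_everyone
            or user in catalog.shared_with
        )
    # Other organizations: only their administrators can view a published catalog.
    return catalog.published and is_org_admin
```

For example, a catalog owned by `alice` in `OrgA` and shared with `bob` is visible to both of them, but an administrator in `OrgB` only sees it after `publish(True)` has been called.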

Skills and Abilities

  • Based on application requirements, determine appropriate vApp configuration.
    • Let’s start with a short introduction to what a vApp is:
      • A vCloud vApp differs from a vSphere vApp in the way it is instantiated and consumed in the vCloud. A vApp is a container for a distributed software solution and is the standard unit of deployment in vCloud Director. It allows power-on and -off operations to be defined and ordered, consists of one or more virtual machines, and can be imported or exported as an OVF package. A vCloud vApp can have additional vCloud-specific constructs such as networks and security definitions.
    • A vApp can contain a single VM or multiple VM’s that can have various interconnections between them.
    • If using single-VM vApps, the usual VM design considerations apply:
      • Single vCPU unless needed
      • Newest VMware tools
      • Don’t use Reservations and limits unless the design requires it.
      • Use VMXNET3 where supported
      • Secure VM’s like you would physical machines
      • Use naming conventions.
    • Let’s say you have a database VM that needs more than 32 vCPUs. In that case you should upgrade the virtual hardware version to at least 9.
  • Determine appropriate storage configuration for a given vApp.
    • This most likely is related to where a VM in a vApp should reside.
    • Let’s say a vCloud installation has 3 tiers of storage, Gold, Silver and Bronze. You can select which VM should use each tier. So a database VM should likely go to a Gold tier, while a Development machine would use the Bronze tier.
  • Given customer requirements, determine appropriate catalog design.
    • A catalog can include vApps and media files. Access to those catalogs can be controlled.
  • Determine the impact of given security requirements, on a catalog structure.
    • The only security feature related to catalog structure is permissions. You can set them so that only a subset of people or departments can access certain catalogs.
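The vApp and storage points above can be turned into a small review helper. This is only a sketch under my own assumptions: the role names, the `Silver` default for unlisted roles and the function names are hypothetical, and the only rules encoded are the ones from these notes (single vCPU unless needed, more than 32 vCPUs requires hardware version 9 or later, databases on Gold, development machines on Bronze).

```python
# Hypothetical mapping of workload roles to the storage tiers from the example above.
STORAGE_TIER_BY_ROLE = {"database": "Gold", "development": "Bronze"}
DEFAULT_TIER = "Silver"  # assumption: anything not listed lands on the middle tier


def storage_tier(role: str) -> str:
    """Pick a storage tier for a VM based on its role."""
    return STORAGE_TIER_BY_ROLE.get(role.lower(), DEFAULT_TIER)


def needs_hardware_version_9(vcpus: int) -> bool:
    """More than 32 vCPUs requires virtual hardware version 9 or later."""
    return vcpus > 32


def review_vm(name: str, role: str, vcpus: int) -> list:
    """Return a list of design notes for one VM in a vApp."""
    notes = [f"{name}: place on {storage_tier(role)} storage"]
    if vcpus > 1:
        notes.append(f"{name}: confirm that {vcpus} vCPUs are really needed")
    if needs_hardware_version_9(vcpus):
        notes.append(f"{name}: upgrade virtual hardware to version 9 or later")
    return notes
```

Calling `review_vm("db01", "database", 40)` returns three notes: Gold storage, a vCPU sanity check and the hardware version upgrade.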

VCAP-CID Study Notes: Objective 1.3

This is Objective 1.3 in the VCAP-CID blueprint Guide 2.8. The rest of the sections/objectives can be found here.

Bold items have higher importance, and copied text is in italics.

Skills and Abilities

  • Determine how storage and network topologies affect capacity requirements for a vCloud conceptual design
    • I’m not sure if they want you to really deep dive into the actual topologies, or if they want you to know the limitations of choosing a Fibre Channel solution versus an iSCSI/NAS solution. But let’s say they really want us to go over topologies:
      • You have different storage topologies (borrowed from a whitepaper from Brocade):
        • Flat SAN
          • Simple redundant pathing, each server is connected to at least two switches, and those switches are connected to all storage devices
        • Mesh SAN
          • Each server is connected to at least two switches that are interconnected. Any-to-any connections between devices.
        • Core Edge SAN
          • Here you have a core switch, a distribution layer and an edge layer. Similar to many LAN setups.
      • As for network topologies, these could have an effect when using VSAN or NFS/iSCSI, and on how networks will be interconnected.
    • The different topologies can affect the design and must be taken into account when creating it.
      • Example:
        • Customer has a flat Fibre Channel SAN, and that will affect how many storage arrays you can connect (depending on the number of ports available and the bandwidth of the SAN switches).
        • Customer has a Core Edge SAN using NFS, with Edge switches connecting to local storage and Core switches connecting directly to main storage as well (figure 4 in the whitepaper). This will affect the storage layout of the design, where tier 1 and tier 2 storage will be located, etc.
  • Describe VMware vCloud Director and VMware vSphere functionality and limitations related to capacity.
    • This involves knowing the Configuration Maximums. Most of these maximums are so high that you will never reach them unless you have a very large environment.
    • Here you can read the Configuration Maximum for vSphere 5.1
      VCAP CID 1-3-2
    • This also includes knowing all the features of vCloud and vSphere, as they will limit what you can do with regard to capacity. Just to name one: in vSphere 5.1 you can only put 32 datastores in a datastore cluster, and that might limit the size of a tiered storage pool (and the datastore size is in turn limited by RTO/RPO availability requirements and storage protocol limitations).
      • Please note that vSphere 5.5 allows 64 Datastores for each Datastore Cluster.
  • Given current and future customer capacity requirements, determine impact to the conceptual design.
    • Current and future capacity requirements include CPU, memory, storage, and network bandwidth/port capacity. If you have the numbers (e.g. APC cooling/PSUs + SW), include cooling and power as well.
    • The requirements are based on documentation from the customer or a Capacity Planner project for the existing environment.
    • The future capacity is the projected growth of the environment in all these areas, most likely based on numbers from the customer from previous years.
    • This impacts the conceptual design when, for example, you cannot replicate the amount of data you want because of bandwidth limits, you hit a vSphere configuration maximum, you have too few network ports, or there is not enough power.
  • Given a customer datacenter topology, determine impact to the conceptual design.
    • If the customer has multiple datacenters, that topology will impact the conceptual design if they will all be used. Bandwidth, connections, network topology, cooling, power, storage, and the use of stretched clusters are some of the things you need to consider.
      • Internally to one datacenter this is based on rack space, and cabinets, cooling layout and power grid setups.
  • Given cloud capacity needs, constraints, and future growth potential, create an appropriate high-level topology.
    • Let’s use an example:
      • Capacity need is a 3-tiered storage layout with 20TB in each tier. Tier 1 is replicated to a recovery site.
      • Constraints include a 50 Mbps link to a recovery site for replication.
      • Future growth need is 20% each year (4TB each year).
        • As you might suspect, you will need to know the replication bandwidth of the Tier 1 workloads to be able to mitigate the risk that the link between the sites will not be enough. If the bandwidth is enough now but will not be enough after 2 years, you could either increase the link speed up front, or do it once a certain percentage of the link is in use.
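The example above can be checked with some quick arithmetic. The sketch below assumes that replication only has to ship the daily changed data of Tier 1 within 24 hours, and that the daily change rate is known. The 2% change rate used in the usage note is a made-up number, not something from the customer, and the function names are my own. Tier 1 starts at 20TB and grows by 4TB per year as in the example.

```python
SECONDS_PER_DAY = 86_400
MBIT_PER_TB = 8_000_000  # decimal units: 1 TB = 10**12 bytes = 8 * 10**6 megabits


def required_mbps(tier1_tb: float, daily_change_rate: float) -> float:
    """Average bandwidth needed to replicate one day of changed data within 24 hours."""
    changed_mbit = tier1_tb * daily_change_rate * MBIT_PER_TB
    return changed_mbit / SECONDS_PER_DAY


def tier1_size_tb(year: int, initial_tb: float = 20.0, growth_tb_per_year: float = 4.0) -> float:
    """Tier 1 size after a number of years, growing by a fixed 4TB (20% of 20TB) per year."""
    return initial_tb + growth_tb_per_year * year


def first_year_link_exceeded(link_mbps: float, daily_change_rate: float, horizon_years: int = 10):
    """Return the first year in which the link is too small, or None within the horizon."""
    for year in range(horizon_years + 1):
        if required_mbps(tier1_size_tb(year), daily_change_rate) > link_mbps:
            return year
    return None
```

With a hypothetical 2% daily change rate, `required_mbps(20, 0.02)` is about 37 Mbps today, and `first_year_link_exceeded(50, 0.02)` returns 2, so the 50 Mbps link would be outgrown after two years. That is the point where the link speed decision from the example comes into play.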

VCAP CID 1-3-1

VCAP-CID Study Notes: Objective 1.2

This is Objective 1.2 in the VCAP-CID blueprint Guide 2.8. The rest of the sections/objectives can be found here.

Bold items have higher importance, and copied text is in italics.

Knowledge

  • Identify discovery questions for a conceptual design (number of users, number of VMs, capacity, etc.)
    • First we need to know what a conceptual design is.
      • A conceptual design is a high-level overview of a design. It shows what the design will eventually look like and the concepts it will cover. In a vCloud environment it might include different tiers of resource clusters, a separate management cluster, and information on the various clusters (replication, storage stacks, networking, security).
      • Conceptual vCloud
    • When we have an idea what a conceptual design should include, we need to ask the right groups within the organization to fill in the blanks, and these groups include various roles:
      • C-level IT people – who are more aligned with the business side of IT
      • Server Administrators – those who manage the hardware resources, CPU and Memory.
      • Storage Administrators – those who manage storage, both hardware and creating logical storage for utilization (the manual way)
      • Backup Administrators
      • Desktop Administrators
      • Network Administrators
      • Security Administrators
      • Virtualization Administrator
      • Application Power users and/or administrators – great in use-case creation
      • Help desk – they often know more than most Level 3 support staff about what the real issues in an environment are.
      • End users – in use-case creation, and to find out the pain points of the current operational model.
    • The questions can cover both the current and the future environment, as well as the projected usage of the environment:
      • How many users will be accessing each service? (used for scalability and capacity considerations)
      • How many VM’s are you currently using? (used for current capacity considerations)
      • What is the projected growth rate of VM’s? (used for future capacity considerations)
      • Are there any security requirements? (Compliance, isolation (data, network etc), mobility)
      • What is the current layout of performance tiers in the environment? (used for current performance considerations)
      • Are there plans to offer different performance tiers based on workloads? (used for future performance considerations and SLA requirements)
      • Are there requirements for DR/BC? If so would they vary between performance tiers?
  • Identify the effect of product architecture, capabilities, and constraints on a conceptual design.
    • This is a very vague point, but I guess they mean how the different products in the vCloud stack, with their abilities and constraints, affect the creation of the conceptual design.
    • I guess you would use your knowledge of vCloud environment architecture to create the conceptual design using the business requirements you have been given.
    • The product, VMware vCloud, puts constraints on what the conceptual design can be. Even if the business requirements are somewhat different, they can’t go beyond what the actual product can do.
    • It is this work of gathering the requirements from the business and translating them into a working conceptual design that is part of the process.

Skills and Abilities

  • Relate business and technical requirements to a conceptual design.
    • Business requirements are all about value, or how the design should provide value to the business. A few that could be used are:
      • Self-service capability
      • 99.9 % availability
      • Scalability
      • Multitenancy
      • Metering Capabilities
    • Technical requirements are just how you will use the technology in question to fulfill those business requirements
    • VCAP CID Obj 1.2 - 2
    • The technical requirements also include items that are not easily traced to a certain business requirement and are more about the logical layout of the design:
      • Storage requirement: Different Tiers of storage must be available to the customer (T1,T2,T3)
      • Storage requirement: NFS datastore for the vCloud cell
      • Security requirement: AD must be used to authenticate users to the vCloud environment
  • Gather customer inventory data.
    • This can be done in multiple ways and that really depends on the project itself.
    • When the plan is to import existing workloads into vCloud you will need some capacity information. You can use Capacity Planner, which is a tool VMware Partners get access to.
    • If you need to see the financial benefit of moving to the vCloud Suite, you can ask a VMware Partner to use the VMware Infrastructure Planner tool. It can be found here: https://vip.vmware.com/
    • Also, if the customer already has inventory documents and capacity and performance information, those can be used as well.
  • Determine customer business goals.
    • As I stated before, a business requirement can be described as what needs to be achieved for the system to provide value. Here are some examples of vCloud-specific business requirements:
      • System must provide self-service capability
      • System must provide 99.9% availability
      • System must provide optimal scalability and elasticity
      • System must provide multitenancy
      • System must provide metering capabilities for cost reporting
      • System must support vApp use cases defined
      • System must leverage shared infrastructure and resources pooling
      • System must support a catalog of templates that end users can create
      • System must provide differentiated offerings based on cost.
      • The last 4 are taken from the Cloud Infrastructure Use Case document. They are somewhat vSphere related, but they are a great example.
        • Business agility and flexibility should be increased; the cost of doing business should be decreased.
        • Minimal workload deployment time.
        • The environment should be scalable to enable future expansion (minimum one year, 20 percent estimated).
        • Resources should be guaranteed to groups of workloads as part of internal SLAs
      • The business requirements are not handed to the architect as a list of things the solution must fulfill; they are a list of requirements that are agreed upon during the design phase interviews. The architect must help the customer define the business requirements based on the capabilities and constraints of the product itself, or you could end up with conflicting or even out-of-scope requirements.
  • Identify requirements, constraints, risks, and assumptions.
    • The majority of the next lines are straight from the vCloud Service documents available to VMware Partners.
    • Requirements are documented statements that depict the requisite attributes, characteristics or qualities of the system.
      • See business and technical requirements above for examples.
    • Constraints are requirements that restrict the amount of freedom in developing the design.
      • Existing hardware must be used
      • Distance between Datacenters
      • Cost
      • Network bandwidth is 1Gbps
      • Total storage available is 10TB
      • Etc…
    • Risks are potential issues that may negatively impact the reliability of the design
      • Lack of redundancy of specific hardware component
      • Support staff has not had any training
      • If hardware is not installed and configured by expected date, the project timeline will be affected
    • Assumptions are “educated guesses” that are made during the design process regarding the expected usage and implementation of a system. You should try to mitigate the assumptions by explaining them or even changing them to a requirement by asking the customer about the assumption.
      • Redundant hardware components are used
      • Training is provided for staff
      • Sufficient bandwidth is available for the projected number of VMs
      • All licences are ready before the implementation phase
    • Let’s use training as an example assumption. You as an architect assume the customer will provide training. You could leave it at that, or ask the customer to verify that training will be performed and change it to a requirement (a management requirement). An unresolved (or at least unexplained) assumption will always have risks associated with it; in this case, if the customer does not train their staff, there is a risk the project could fail.
  • Given customer requirements and product capabilities, determine the impact to a conceptual design.
    • I think this has already been covered in this post. The requirements of the customer must be related to the capabilities of the product, and that will impact the conceptual design.
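One way to see how a requirement impacts the design is to turn it into a number. As a small sketch, the 99.9% availability requirement used as an example in this objective can be converted into a downtime budget. The function name and the 365-day year are my own assumptions, and an actual SLA may define the measurement period differently.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year as the measurement period


def allowed_downtime_hours(availability_percent: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Hours of downtime allowed in the period for a given availability percentage."""
    return period_hours * (1 - availability_percent / 100)
```

`allowed_downtime_hours(99.9)` comes out at roughly 8.76 hours per year, which is the budget that maintenance windows and unplanned outages of the vCloud environment have to fit into.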