vCenter & Distributed vSwitch on two ESXi hosts with a single NIC

I was doing some lab work the other day with two IBM Flex nodes that only had a single 10Gb NIC.

The vCenter for the environment was located on the aforementioned ESXi hosts and my plan was to use the Distributed vSwitch, rather than the Standard vSwitch.

If you have ever moved an ESXi host that runs the vCenter VM to a Distributed vSwitch, you know it’s easy when you have more than one NIC: just move one of the NICs to the Distributed vSwitch, and then change the network configuration for the vCenter.

But when you are trying to move an ESXi host with a single NIC (whitebox, demo equipment, etc.) things get a little bit more complicated.

When you attempt to move the vCenter and the ESXi host to a new Distributed Portgroup, the vCenter loses its connection and the process is rolled back. You are still stuck with the NIC on the Standard vSwitch. Status quo…

The best way to make this work is to:

  1. Move the ESXi host that doesn’t run the vCenter VM to the Distributed vSwitch. Create VM traffic portgroups.
  2. Clone the vCenter VM and place it on the ESXi host that doesn’t run the vCenter VM.
  3. Connect the newly cloned vCenter VM to a Distributed Portgroup on that ESXi host (the one that was connected to the DVS in step 1).
  4. Turn off the original vCenter.
  5. Turn on the cloned vCenter and configure the network settings (accept the error about a previous network adapter using the IP if it runs on Windows Server).
  6. Move the remaining host, the one that ran the original vCenter, to the Distributed vSwitch.

Now you have a working vCenter on single-NIC hosts with a Distributed vSwitch.
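The order of the steps above is the whole trick, so here it is written down as a small planning helper. This is only a sketch: the `Host` record, the `migration_plan` function and the host names are made up for illustration, and nothing here talks to vCenter. It just prints what to do, and in which order, for one host that runs the vCenter VM and one that doesn't.

```python
from collections import namedtuple

# Hypothetical record: a host name and whether the vCenter VM lives on it.
Host = namedtuple("Host", "name runs_vcenter")


def migration_plan(hosts):
    """Return the ordered steps for moving single-NIC hosts to the DVS."""
    sources = [h for h in hosts if h.runs_vcenter]
    targets = [h for h in hosts if not h.runs_vcenter]
    if len(sources) != 1 or not targets:
        raise ValueError("need exactly one vCenter host and one other host")
    src, dst = sources[0], targets[0]
    return [
        f"Move {dst.name} to the Distributed vSwitch and create VM portgroups",
        f"Clone the vCenter VM onto {dst.name}",
        f"Connect the clone to a Distributed Portgroup on {dst.name}",
        f"Power off the original vCenter on {src.name}",
        "Power on the clone and configure its network settings",
        f"Move {src.name} to the Distributed vSwitch",
    ]


# Example host names are placeholders.
plan = migration_plan([Host("esx01", True), Host("esx02", False)])
for number, step in enumerate(plan, start=1):
    print(f"{number}. {step}")
```

The host that runs vCenter always goes last, because by then the clone is already reachable through the Distributed vSwitch on the other host.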

Custom alarms for events in vCenter 5.x

Some customers have been asking if I know why some machines fail to consolidate the snapshot at the end of the backup job. The job seems to finish, but the snapshot deletion fails, sometimes leaving behind a large snapshot, or even some “ghost” snapshots. Sometimes the event isn’t noticed until days later or, even worse, when the datastore fills up.

When this happens, an event is logged for the virtual machine, stating that the VM’s disk consolidation failed:

Virtual machine {vm.name} disks consolidation failed on {host.name} in cluster {computeResource.name} in {datacenter.name}.

This is a perfect case for a custom alarm so the administrator can be informed when the consolidation failed.

  1. First you need a way to create custom alarms in vCenter. My main source of information is this handy document from the VMware communities (author hmundt): More fun with vSphere Alarms
  2. Second you need a list of events for the vSphere API. Veeam has been kind enough to publish a list of API events for vSphere 5.0, which they make available to users of their great product Veeam One (and if anyone from Veeam reads this, an updated list for vSphere 5.1 would be much appreciated).
  3. Next you create a new alarm at the vCenter level, choose Virtual Machine and Event, and for the Event trigger you just paste the vSphere API event text. In this case it’s:

com.vmware.vc.VmDiskFailedToConsolidateEvent

Next time a consolidation job fails, an alarm will light up on that VM and bother all the people you added to the email notification list.

Of course this list can be used to watch for EVERY event known in the vSphere API and is very handy when you need to watch for a specific event in one of those troubleshooting sessions.
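To show how the event type ID and the message template from above fit together, here is a small Python sketch. It does not connect to vCenter; the event records are plain dictionaries with made-up sample values, and the `eventTypeId` / `args` layout is an assumption for illustration only. The one real trick is that Python's `str.format` understands the `{vm.name}` style placeholders used in the event text.

```python
from types import SimpleNamespace

# Event type ID and message template quoted in the post.
CONSOLIDATE_FAILED = "com.vmware.vc.VmDiskFailedToConsolidateEvent"
TEMPLATE = (
    "Virtual machine {vm.name} disks consolidation failed on {host.name} "
    "in cluster {computeResource.name} in {datacenter.name}."
)


def failed_consolidations(events):
    """Yield rendered messages for events matching the alarm trigger."""
    for event in events:
        if event["eventTypeId"] == CONSOLIDATE_FAILED:
            yield TEMPLATE.format(**event["args"])


def _named(name):
    # Tiny stand-in for an inventory object that only has a name.
    return SimpleNamespace(name=name)


# Hypothetical sample data; a real event would come from vCenter.
sample = [
    {"eventTypeId": "some.other.Event", "args": {}},
    {
        "eventTypeId": CONSOLIDATE_FAILED,
        "args": {
            "vm": _named("web01"),
            "host": _named("esx01"),
            "computeResource": _named("Lab"),
            "datacenter": _named("DC1"),
        },
    },
]

for message in failed_consolidations(sample):
    print(message)
```

Swapping `CONSOLIDATE_FAILED` for any other ID from the event list gives the same kind of filter for a different event.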

Upgrading a vCenter SQL Express database

The other day I got my hands on a full vCenter SQL 2005 SP2 Express database. The vCenter database filled up the 4GB allowed for SQL 2005 Express DBs.

So as the shop I was in had no full SQL Server to work with, it was decided to upgrade to SQL 2008 R2 SP2 Express, which has a 10GB limit per database.

The environment was running on vSphere 5.0, and I had upgraded it recently from 4.1 to 5.0. There’s quite an increase in the number of tables between 4.1 and 5.0, so this will happen to most environments sooner or later.
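To put the two limits side by side, a trivial bit of arithmetic in Python. The limits are the ones mentioned above; the database size is a made-up example value.

```python
# Per-database size limits in GB, as stated in the post.
LIMITS_GB = {"SQL 2005 Express": 4, "SQL 2008 R2 Express": 10}


def headroom_gb(edition, size_gb):
    """Return how many GB are left before the edition's limit is hit."""
    return LIMITS_GB[edition] - size_gb


def fits(edition, size_gb):
    """True if a database of size_gb is strictly under the limit."""
    return headroom_gb(edition, size_gb) > 0


size = 4.0  # hypothetical: a database that has just hit the 2005 limit
for edition in LIMITS_GB:
    print(edition, "headroom:", headroom_gb(edition, size), "GB")
```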

Note: this procedure only works if you keep using the same vCenter server as in the beginning. It is not meant for whole vCenter relocations.

So the way to do this is quite easy, and you don’t need to be a SQL admin. 🙂

You will need to break this procedure into 3 parts: 1) Preparation 2) Upgrade 3) Test

1) Preparation

  • ODBC connections: Check whether the ODBC connection is configured for Integrated Windows or SQL authentication.
  • Services: Check which user runs the VirtualCenter Server service. Most likely System or a domain/local admin.
  • Name of the Database: I recommend not changing the name of the database. Most likely the name will end at SQL*\SQLEXP_VIM.
  • Get the installation files for SQL 2008 R2 Express and also for SQL Server Management Studio Express.
  • Open up the SQL instance using SQL Management Studio, and note who the DBOwner is for each database that will be moved. If it is a SQL user, note that down as well.
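Since the upgrade part disables the vCenter services and later sets them back, it also helps to write down their startup types during preparation. A minimal sketch of that bookkeeping in Python follows. The startup values in `before` are made-up examples, and the functions only work on a plain dictionary; they do not touch Windows services.

```python
def disable_all(startup_types):
    """Return (saved, disabled): a copy of the original settings and an
    all-'Disabled' version to apply before the SQL upgrade."""
    saved = dict(startup_types)
    disabled = {name: "Disabled" for name in startup_types}
    return saved, disabled


def restore_plan(saved, current):
    """List (service, startup type) pairs that differ from the saved state."""
    return [
        (name, kind) for name, kind in saved.items()
        if current.get(name) != kind
    ]


# Hypothetical startup types; check the real ones in services.msc.
before = {
    "vCenter Inventory Service": "Automatic",
    "VMwareVCMSDS": "Automatic",
    "vSphere Web Client": "Manual",
}

saved, current = disable_all(before)
for name, kind in restore_plan(saved, current):
    print(f"set {name!r} back to {kind}")
```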

2) Upgrade

    1. Stop all vCenter related services
      • vSphere Web Client
      • VMware VirtualCenter Server (startup type: Delayed)
      • VMware VirtualCenter Management Webservices (startup type: Delayed)
      • VMware vSphere Update Manager Service
      • VMware vSphere Profile-Driven Storage
      • vCenter Inventory Service
      • VMwareVCMSDS
    2. Set all stopped services to Disabled.
      • This is done because you will need to restart the server after the SQL upgrade and you don’t want the services to start when you do.
    3. Open up the old SQL 2005 Express database using the SQL Management Studio.
    4. Back up each database (e.g. if you have both vCenter and Update Manager databases).
      • Right-click the database, go to Tasks and select Back Up. Back up to a known location.
    5. Go to the DATA folder for the SQL instance; on 32-bit systems it’s in c:/Program Files/Microsoft SQL Server//…, and on 64-bit systems in c:/Program Files (x86)/….
      • There you will find all the database and log files for the vCenter server.
      • Names are most likely VIM_VCDB.ldf for logs, and VIM_VCDB.mdf for the database itself.
    6. Detach the database. Make sure you stopped the vCenter services.
      • Right click the database, go to Tasks and select Detach.
      • Move the database and log file to another location.
    7. Though you can upgrade 2005 Express to 2008 Express, I find it much “cleaner” to just uninstall 2005 and install a new SQL 2008 R2 Express instance.
      • Remove the SQL 2005 Express instance. (You will need to stop the SQL service first.)
    8. Restart
    9. Install a new SQL 2008 R2 Express instance.
      • When installing the new instance, make sure you write down the sa account password and/or give a domain/computer account sysadmin privileges to the instance.
      • Make sure you name the instance as SQLEXP_VIM. Otherwise you will need to change a registry setting for the VirtualCenter service to start (pointing it to the new name).
    10. Just to make sure, restart again.
    11. Move the database and log file to the new folder for the 2008 R2 Express instance.
    12. Log in to the instance using SQL Server Management Studio.
    13. Right-click Databases and select Tasks->Attach. Attach the database. You don’t need to attach another log file when the pop-up appears; there’s only one log file and it is already associated with the database.
    14. Go to properties of the vCenter database and make sure the DBO (database owner) is the same one as on the 2005 instance.
      • You might need to add the user in the Login section of the instance.
    15. Create a new empty file using Notepad and save it as connections.udl (the name must end in .udl). Double-click it to open the Data Link Properties dialog and go to the Connection tab. There you can try out the SQL connection. This is a handy tool for testing SQL connections and will be used in the next steps.
    16. Go to SQL Server Configuration Manager (should be available in the Start menu).
      • Select SQL server network configuration and enable both Named pipes and TCP/IP.
      • Go to Properties on TCP/IP. Select IP Addresses and go to the bottom where you see a section called IPAll. Put in 1433 in TCP port. Push OK.
    17. Go to both ODBC managers (32bit and 64 bit: C:\Windows\SysWOW64 for 32bit and C:\Windows\system32 for 64bit, yes they have conflicting names…).
      • Make sure you have a connection to the database. 32 bit is for Update Manager.
      • The user that connects to the database needs to be a user that has access to the database through SQL Server Management Studio. Best practice is a domain system account that is DBO on the vCenter database and also starts the vCenter service.
    18. Open SQL Server Management Studio and open up the vCenter database.
    19. Set all the services back to their former startup type.
    20. Restart the server, or go through restarting the services. I find it easier just to restart it.
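The backup, detach and attach steps above are done through the Management Studio menus, but each has a T-SQL counterpart that can be pasted into a query window instead. The Python sketch below only builds those statements as text, so you can see what they look like. The database name `VIM_VCDB` is taken from the file names mentioned above, and all the paths are made-up placeholders, so adjust both to your own instance.

```python
def quote_name(name):
    """Bracket-quote a SQL Server identifier, escaping closing brackets."""
    return "[" + name.replace("]", "]]") + "]"


def quote_text(value):
    """Quote a string literal, doubling any single quotes."""
    return "N'" + value.replace("'", "''") + "'"


def backup_sql(db, backup_file):
    """Statement for the backup step (full backup to a file)."""
    return f"BACKUP DATABASE {quote_name(db)} TO DISK = {quote_text(backup_file)};"


def detach_sql(db):
    """Statement for the detach step on the old instance."""
    return f"EXEC sp_detach_db @dbname = {quote_text(db)};"


def attach_sql(db, mdf, ldf):
    """Statement for the attach step on the new instance."""
    return (
        f"CREATE DATABASE {quote_name(db)} ON "
        f"(FILENAME = {quote_text(mdf)}), (FILENAME = {quote_text(ldf)}) "
        "FOR ATTACH;"
    )


db = "VIM_VCDB"  # assumed name, based on the file names in the post
# All paths below are placeholders, not the real SQL Express folders.
print(backup_sql(db, r"D:\backup\VIM_VCDB.bak"))
print(detach_sql(db))
print(attach_sql(db, r"D:\sqldata\VIM_VCDB.mdf", r"D:\sqldata\VIM_VCDB.ldf"))
```

As with the GUI route, the vCenter services must be stopped before the detach, and the attach runs against the new 2008 R2 instance.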

3) Test

    1. After restarting make sure the vCenter server service starts and all your performance data is showing.
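One extra sanity check that fits here, assuming the instance was set to listen on TCP port 1433 as in the upgrade steps: see whether anything answers on that port at all. This is a generic Python sketch, not something the vCenter installer provides, and `localhost` is just a placeholder for the server name. It only proves the port is open, not that the login works; the .udl file is still the better test for that.

```python
import socket


def port_open(host, port=1433, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures.
        return False


# Placeholder host name; run this on or against the vCenter/SQL server.
print("SQL port reachable:", port_open("localhost"))
```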

Notes (stuff you should know about vCenter SQL Express databases):

  • Rollup jobs (the jobs that move performance data between week->month->year) are not running as a separate job, so you should not need to fix those. They are being run by the VirtualCenter service and are a part of the database (located in vCenter DB > Programmability > Stored Procedures). This is only the case for SQL Express instances.
  • I always recommend putting vCenter Databases on real SQL servers. But I’ve seen small environments of at least 100 machines run for years on an Express database (NOT SUPPORTED).
  • Most misconfigurations on SQL Express DBs are user related. Double-check the user that runs the VirtualCenter service, who the DBO is, and the ODBC connections.


vMotion failing at 9%: Error 0xbad010d

A short post, even though it’s been ages since my last post. But I promise I will fix that next week. 🙂 This one is more of a personal reminder because the VMware KB had nothing on this, only this which didn’t work at all.

I was trying to vMotion a VM today (actually Redeploy VMs in vCloud, but still) and I got this error:

A general system error occured: Failed to start migration pre-copy Error 0xbad010d. The Esx host failed connect over the VMotion network.

All hosts were using vDS and different VLANs for the Management and vMotion VMkernel ports.

First thing to check is vmkping against the vMotion IPs on the hosts; everything checked out fine.
Then I glanced over the event log on the host and saw that the vMotion VMkernel port was trying to contact the management IP on another host, not its vMotion VMkernel port (which is on a different VLAN, as it should be).

Just before this, I had had to remove the receiving host from the vDS by unassigning all pNICs from its vDS, restore the vSS and re-add the host to the cluster, which apparently left vMotion enabled on its management VMkernel port.
So the easy way to fix that is to deselect vMotion on the management VMkernel port on the receiving host.
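The check that would have saved me the time boils down to "is vMotion ticked on any management interface?". Here it is as a small Python sketch over a hand-written inventory. The `Vmk` record, the host and device names and the VLAN numbers are all made up for illustration; in real life you would read these settings off each host's network configuration.

```python
from collections import namedtuple

# Simplified view of a host's kernel interface; fields are illustrative only.
Vmk = namedtuple("Vmk", "host device role vlan vmotion_enabled")


def stray_vmotion(vmks):
    """Return management interfaces that also have vMotion enabled."""
    return [v for v in vmks if v.role == "management" and v.vmotion_enabled]


# Hypothetical inventory for two hosts.
vmks = [
    Vmk("esx01", "vmk0", "management", 10, False),
    Vmk("esx01", "vmk1", "vmotion", 20, True),
    Vmk("esx02", "vmk0", "management", 10, True),  # the culprit
    Vmk("esx02", "vmk1", "vmotion", 20, True),
]

for v in stray_vmotion(vmks):
    print(f"{v.host}: deselect vMotion on {v.device} (VLAN {v.vlan})")
```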

Took me a good 20 minutes at least to finally check for that small line in the event log 🙂 I hope this helps someone in the future so they can use the 20 minutes to get more coffee and bacon.