Archive

Posts Tagged ‘vCenter’

vCenter disk performance – SCSI LUN ID display

September 26th, 2010

Something has been bugging me for some time now about vCenter disk performance statistics. vCenter shows each SCSI LUN with a unique ID, as per the following screenshot, and when viewed through the disk performance view it’s impossible to tell what is what unless, of course, you know the NAA IDs off by heart!

I was working on a project this weekend putting a Tier 1 SQL server onto our vSphere 4.0 infrastructure, so insight into disk performance statistics was key. I decided I needed to sort this out and set about identifying each datastore and amending the SCSI LUN ID name. Here is how I did it.

Identify the LUN

First of all, navigate to the datastore view from the home screen within vCenter.


Click on the datastore you want to identify and then select the Configuration tab.


Click on the datastore Properties and then select Manage Paths.

Note down the LUN ID (in this case 2) and also note down the capacity.


Change the SCSI LUN ID

Now navigate to the home screen and select Hosts and Clusters.


Select a host, change to the Configuration tab and then select the connecting HBA.


At the bottom, identify the LUN using the ID and capacity noted earlier and rename the start of the ID. I chose to leave the unique identifier in there in case it is needed in the future.


Now when you look at the vCenter disk performance charts you will see the updated SCSI LUN ID, making it much more meaningful and usable.
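The naming scheme I describe above can be sketched as a simple helper. The format here (datastore name plus LUN number, with the NAA identifier kept in brackets) is just an illustrative choice, not anything vCenter mandates; you type whatever string you like into the SCSI LUN ID field.

```python
def friendly_lun_name(datastore: str, lun_id: int, naa_id: str) -> str:
    """Compose a readable display name that still keeps the NAA identifier.

    All three inputs are hypothetical examples; in vCenter you would read
    them off the Manage Paths dialog and type the result in by hand.
    """
    return f"{datastore}-LUN{lun_id} ({naa_id})"


print(friendly_lun_name("SQL-Data", 2, "naa.60060160a6422200"))
# SQL-Data-LUN2 (naa.60060160a6422200)
```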


Raw Device Mappings

If you have Raw Device Mappings (RDMs) attached to your virtual machines then these too can show up in the vCenter disk performance stats. It’s the same process to change the name of the SCSI LUN ID, however identifying them is slightly different. To do so, carry out the following.

Edit the settings of the VM, select the RDM file, select Manage Paths and then note down the LUN ID for the RDM.  Use this to identify the LUN under the Storage Adapter configuration and change it accordingly.


Having made these changes I can now utilise the vCenter disk counters to complement ESXTOP and my SAN monitoring tools. I now have a full end-to-end view of exactly what is happening on the storage front, which is invaluable when virtualising Tier 1 applications like SQL 2008.

There is a plethora of metrics you can look at within vCenter; if you would like to understand what they all mean then check out the following VMware documentation.

VMware vCenter SDK reference guide – Disk Counters

Storage, vCenter, VMware

VMware Snapshot Alerting and Reporting

June 21st, 2010

I spotted an issue in my vSphere infrastructure this weekend just past. I noticed that one of the main development boxes was showing the dreaded question: redo log out of space, retry or abort?


As it turned out, VMware Data Recovery Manager had taken a snapshot as part of its backup routine and had failed when trying to remove it. This, coupled with a scheduled SQL maintenance plan, caused the delta files for the snapshot to grow to over 250GB in little over 12 hours.

I eventually overcame the issue by adding an extent to the out-of-space VMFS datastore, which gave me an extra 160GB with which to play the logs back in. I then used the very handy SnapVMX utility to tell me how much space was required to replay the delta files. Luckily for me it only required 20GB, as sometimes it can require as much as the size of the original disk. After the snapshot was merged I did a bit of Storage vMotion and reworked the datastore to get rid of the extent (I’m not a fan of using them).

This particular incident was unfortunately unavoidable: it happened at a weekend, was due to VMDR’s failure to remove a snapshot it had created, and clashed with a disk-intensive operation. It did get me thinking though. Although I am careful with snapshots and their usage, who else in the organisation is not? How do we mitigate this potential risk?

Snapshots are a handy feature; I generally only use them for short periods of time, usually to provide a rollback when patching or changing configurations. Misuse or mismanagement of snapshots can quite quickly lead to problems, something that a recent blog article from VMware Support deals with quite effectively. Entitled ESX Anatomy 101, it’s a must-read for anyone trying to gain a good basic understanding of how VMware snapshots work.

I myself have taken a two-pronged approach to preventing snapshots causing problems. The first approach is to schedule a very basic PowerShell script that I found on the blog site of Axiom Dynamics. This simple little script queries your vCenter server for all current snapshots and then sends an email detailing them. A simple but effective means of keeping an eye on snapshots across the virtual infrastructure.

The second, more proactive, approach is to use a vCenter alarm at the datacentre level to alert when a VM is running from a snapshot. This alarm simply involves emailing a warning when any snapshot is larger than 2GB. This handy video taken from VMware Knowledgebase article 1018029 describes in detail how to set this up, and the KB itself also provides step-by-step instructions.
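The logic behind both the script and the alarm boils down to filtering snapshots by size and age and reporting the offenders. The sketch below shows just that filtering step; the snapshot records, field names and thresholds are made up for illustration, as the real script is PowerShell and pulls this data live from vCenter.

```python
from datetime import datetime

# Hypothetical snapshot records as (vm, name, created, size_gb) tuples --
# in practice these would come from a PowerCLI Get-Snapshot query or the
# vSphere API, not hard-coded data.
SNAPSHOTS = [
    ("DEV-SQL01", "pre-patch", datetime(2010, 6, 1), 250.0),
    ("WEB02",     "rollback",  datetime(2010, 6, 20), 0.5),
]


def flag_snapshots(snapshots, now, max_size_gb=2.0, max_age_days=14):
    """Return report lines for snapshots over the size or age thresholds."""
    flagged = []
    for vm, name, created, size_gb in snapshots:
        age = (now - created).days
        if size_gb > max_size_gb or age > max_age_days:
            flagged.append(f"{vm}: snapshot '{name}' is {size_gb:g} GB, {age} days old")
    return flagged


for line in flag_snapshots(SNAPSHOTS, now=datetime(2010, 6, 21)):
    print(line)
# DEV-SQL01: snapshot 'pre-patch' is 250 GB, 20 days old
```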


Alternatives
 

There are a number of alternatives available for reporting on snapshots.

Alan Renouf’s Snapshot Reminder – A PowerShell script that integrates with AD to send the creator of a snapshot a little reminder when the snapshot is over two weeks old.

Alan Renouf’s vCheck Daily Report – Another PowerShell script that reports on a large number of areas within the virtual infrastructure, one of which is snapshots.

RVTools – A very handy .NET application by Rob de Veij that can be used to query your virtual infrastructure for just about everything. You will notice in the screenshot below the vSnapshot tab, which should help you identify those rogue snapshots.

[Screenshot: RVTools vSnapshot tab]

In summary, everyone who works with snapshots should have an understanding of their usage and limitations. Obviously you can’t always rely on people to do things right; we are only human after all. As a safeguard, ensure you have some level of reporting and alerting in place to help you prevent those annoying and time-consuming out-of-space issues occurring.

VI Toolkit / Powershell, VMware

Virtual Storage Integrator 3.0.0 for vSphere and EMC Storage

May 11th, 2010

I have been trying desperately this week to keep up to date with the latest announcements coming out of EMC World 2010. The problem is, they appear to be making them and blogging about them faster than I can read and assimilate them.

One blog post that did catch my attention was a post by EMC’s Chad Sakac. Chad constantly amazes me; he generates a massive amount of super high quality technical content for the EMC and VMware community. His blog post was entitled “EMC’s next generation vCenter Plugins” and details the latest and greatest of EMC’s free vCenter plugins.

The Virtual Storage Integrator (VSI) 3.0.0 is a renaming of the existing EMC Storage Viewer 2.1 plugin that has been available for a while. Why the rename? Well, EMC are introducing tighter integration by enabling storage provisioning from within vCenter; it has now surpassed being just a storage viewer. The storage provisioning integration works with the CLARiiON across all protocols (FC, iSCSI, FCoE) and it also works with NFS on the Celerra. It also adds a greater degree of simplicity and reduces risk by automating all the tasks involved in provisioning and presenting storage to your vSphere cluster.

Chad explains it in much more detail and much better than I ever could in the following video.

I personally feel that the benefits of EMC’s ownership of, and tight working relationship with, VMware are beginning to shine through. Such tight levels of integration are now being delivered, and future development doesn’t look likely to slow down either. The quote from Chad below shows how aggressively his team are working to constantly bring new features to the table and, best of all, they’re completely free!

EMC Virtual Storage Integrator (or VSI) is the main EMC vCenter plug-in.  Over time, more and more functions that currently require installing one or more additional plugins will fold into the EMC Virtual Storage Integrator.  We’re beefing up the team behind it, and they have a very aggressive roadmap (wait till you see what’s next!!!)

Click the link below to find out more about what vCenter plugins are available, what they’re designed to do and where you can download them from in EMC Powerlink.


Storage, VMware, vSphere

Storage I/O control – SIOC – VMware DRS for Storage

May 10th, 2010

Following VMworld 2009, a number of articles were written about a tech preview session on IO DRS – providing performance isolation to VMs in shared storage environments. I personally thought that this particular technology was a long way off, potentially something we would see in ESX 4.5. However, I recently read a couple of articles that indicate it might not be as far away as first thought.

I initially came across an article by VMware’s Scott Drummond in my RSS feeds. For those that don’t follow Scott, he has his own blog called the Pivot Point, which I have found to be an invaluable source of VMware performance-related content. The next clue was an ESX 4.1 feature leak article; I’m sure you can probably guess what the very first feature listed was? It was indeed Storage I/O Control.

Most people will be aware of VMware DRS and its use in measuring and reacting to CPU and memory contention. In essence SIOC is the same feature but for I/O, utilising I/O latency as the measure and device queue management as the contention control. In the same way as the current DRS feature for memory and CPU, I/O resource allocation will be controlled through the use of share values assigned to the VM.

[Screenshot: virtual machine disk shares settings]

I hadn’t realised this until now, but you can already control share values for VM disk I/O within the settings of a virtual machine (shown above). The main problem with this is that it is server-centric, as you can see from the statement below from the VI3 documentation.

Shares is a value that represents the relative metric for controlling disk bandwidth to all virtual machines. The values Low, Normal, High, and Custom are compared to the sum of all shares of all virtual machines on the server and the service console.

Two main problems exist with this current server-centric approach.

A) In a cluster, five hosts could be accessing VMs on a single VMFS volume; there may be no contention at the host level but lots of contention at the VMFS level. This contention would not be controlled by the share values assigned to the VMs.

B) There isn’t a single-pane-of-glass view of how disk shares have been allocated across a host; it appears to only be manageable on a per-VM basis. This makes things a little trickier to manage.

Storage I/O Control deals with the server-centric issue by introducing I/O latency monitoring at the VMFS volume level. SIOC reacts when a VMFS volume’s latency crosses a pre-defined level; at this point access to the host queue is throttled based on the share values assigned to the VMs. This prevents a single VM getting an unfair share of queue resources at the volume level, as shown in the before and after diagrams Scott posted in his article.

[Diagrams: device queues before and after SIOC]
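In a simplified form, that share-based throttling works out something like the following sketch. The queue depth, latency threshold and share values here are illustrative assumptions on my part, not documented SIOC internals.

```python
def allocate_queue_slots(shares, total_slots, latency_ms, threshold_ms=30):
    """Split a device queue between VMs in proportion to their share values.

    A simplified illustration of the SIOC idea: below the latency threshold
    every VM can use the full queue; above it, slots are divided by shares.
    The 30 ms default is an assumption, not a documented SIOC value.
    """
    if latency_ms <= threshold_ms:
        return {vm: total_slots for vm in shares}
    total_shares = sum(shares.values())
    return {vm: total_slots * s // total_shares for vm, s in shares.items()}


# A production VM with 2000 shares and a test VM with 1000: under
# contention the production VM is granted twice the queue depth.
print(allocate_queue_slots({"prod": 2000, "test": 1000}, 32, latency_ms=45))
# {'prod': 21, 'test': 10}
```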

The solution to the single-pane-of-glass issue is pure speculation on my part. I’d personally be hoping that VMware add a disk tab within the resource allocation views you find on clusters and resource groups. This would allow you to easily set I/O shares for tiered resource groups, i.e. Production, Test, Development. It would also allow you to further control I/O within the resource groups at a virtual machine level.

Obviously none of the above is a silver bullet! You still need a storage system with a fit-for-purpose design at the backend to service your workloads. It’s also worth remembering that shares introduce another level of complexity into your environment. If share values are not assigned properly you could, of course, end up with performance problems caused by the very thing meant to prevent them.

Storage I/O Control looks like a powerful tool for VMware administrators. I know in my own instance, I have a cluster that is a mix of production and testing workloads. I have them ring-fenced with resource groups for memory and CPU but always have this nagging doubt about HBA queue contention. This is one of the reasons I wanted to get EMC PowerPath/VE implemented, i.e. use both HBAs and all available paths to increase the total bandwidth. Implementing SIOC when it arrives will give me peace of mind that production workloads will always win out when I/O contention occurs. I look forward to the possible debut of SIOC in ESX 4.1 when it’s released.

**UPDATE**

Duncan Epping over at Yellow Bricks has located a demo video of SIOC in action. Although a very basic demonstration, it gives you an idea of the additional control SIOC will bring.

Gestalt-IT, New Products, VMware

vSphere vMotion Processor Compatibility and EVC Clusters

January 25th, 2010

In today’s economic climate it’s the done thing to sweat existing assets for as long as you possibly can. At the moment I am working on a vSphere deployment and we are recycling some of our existing ESX 3.5 U4 hosts as part of the project. So over the weekend I was testing out vMotion between a new host with the Intel Xeon X7460 processor and an old host with the Xeon 7350 processor. I was getting the following error message displayed, which pointed to a feature mismatch relating to the SSE4.1 instruction set. Thankfully the error pointed me to VMware KB article 1993.

[Screenshot: vMotion CPU compatibility error]

Within this KB article it immediately refers you to using Enhanced vMotion Compatibility (EVC) to overcome CPU compatibility issues. I had never used EVC in anger and wanted to read up on it a bit more before making any further changes. A quick read of page 190 of the vSphere basic configuration guide gives a very good brief overview for those new to EVC.

So I was referred to VMware KB article 1003212, which is the main reference for EVC processor support. Quite quickly I was able to see that EVC was supported for the Intel Xeon 7350 and 7460 using the Intel® Xeon® Core™2 EVC baseline. In essence, as far as vMotion is concerned, all processors in the cluster would be equal to an Intel® Xeon® Core™2 (Merom) processor and its feature set. This masks the SSE4.1 instruction set on the Intel Xeon 7460 that was causing me the problem.
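The effect of an EVC baseline can be pictured as a simple set intersection: each host only advertises the features that are also in the baseline. The feature sets below are heavily simplified illustrations, not the real CPUID flag lists.

```python
# Hypothetical, heavily simplified feature sets for illustration only.
XEON_7350 = {"sse2", "sse3", "ssse3"}
XEON_7460 = {"sse2", "sse3", "ssse3", "sse4_1"}
MEROM_BASELINE = {"sse2", "sse3", "ssse3"}


def evc_features(host_features, baseline):
    """Features a host exposes to VMs once the EVC baseline mask is applied."""
    return host_features & baseline


# With the Merom baseline applied, the X7460 no longer advertises SSE4.1,
# so vMotion to and from the 7350 host succeeds.
print(sorted(evc_features(XEON_7460, MEROM_BASELINE)))
# ['sse2', 'sse3', 'ssse3']
```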

So I set about enabling my current cluster for EVC; however, when I went to apply the appropriate baseline I was getting the following error displayed. The error related to the host that was currently running three Windows 2008 R2 x64 servers. These servers were obviously using the advanced features of the Intel Xeon 7460, and as such that host could not be activated for EVC.

[Screenshot: EVC validation error]

The vSphere basic configuration guide (page 190) makes the following recommendation for rectifying this issue; the example matched my situation exactly.

All virtual machines in the cluster that are running on hosts with a feature set greater than the EVC mode you intend to enable must be powered off or migrated out of the cluster before EVC is enabled. (For example, consider a cluster containing an Intel Xeon Core 2 host and an Intel Xeon 45nm Core 2 host, on which you intend to enable the Intel Xeon Core 2 baseline. The virtual machines on the Intel Xeon Core 2 host can remain powered on, but the virtual machines on the Intel Xeon 45nm Core 2 host must be powered off or migrated out of the cluster.)

Now here is the catch-22: my new vCenter server is virtual and sits on the ESX host giving me the EVC error message. I had to power it off to configure EVC, but I can’t configure the EVC setting on the cluster without vCenter. How was I going to get round this? Luckily VMware have another KB article dealing with exactly this situation. The aptly titled “Enabling EVC on a cluster when vCenter is running in a virtual machine” was exactly what I was looking for. Although it involved creating a whole new HA/DRS cluster, complete with new resource groups, etc., it was a lot cheaper than buying a large number of expensive Intel processors. It worked perfectly, rectifying my issue and allowing me to use all servers as intended.

Moral of the story: check out VMware KB article 1003212 for processor compatibility before buying servers, and always configure your EVC settings on the cluster before adding any hosts to it. If it’s too late and you have VMs created already, well, just follow the steps above and you should be fine.

vCenter, VMware, vSphere

Virtual Distributed Switch and vCenter Server failure

December 20th, 2009

I’m currently working with my colleagues on an upgrade of our VI 3.5 infrastructure to vSphere Enterprise Plus. We have recently been mulling over some of the design elements we will have to consider, and one of the ones that came up was virtual Distributed Switches (vDS). We like the look of it; it saves us having to configure multiple hosts with standard vSwitches, and it also has some nice benefits such as enhanced network vMotion support, inbound and outbound traffic shaping and private VLANs.

One of the questions that struck me was: what happens if your vCenter server fails? What happens to your networking configuration? Surely your vCenter server couldn’t be a single point of failure for your virtual networking, could it?

Well, I did a bit of digging about, chatted to a few people on Twitter, and the answer is no, it would not result in a loss of virtual networking. In vSphere the vDS is split into two distinct elements: the control plane and the data plane. Previously both elements were host-based and configured as such through connection to the host, either directly using the VI client or through vCenter. In vSphere, because the control plane and data plane have been separated, the control plane is now managed using vCenter only and the data plane remains host-based. Hence, when your vCenter server fails the data plane is still active, as it’s host-based, whereas the control plane is unavailable, as it’s vCenter-based.

One thing I was not aware of was where all this vDS information is stored. Mike Laverick over at RTFM informed me that the central config for a vDS is stored on shared VMFS within a folder called .dvsData. I’ve since learnt that this location is chosen automatically by vCenter, and you can use the net-dvs command to determine that location. It will generally be on shared storage that all ESX hosts participating in the vDS have access to. As a backup to this .dvsData folder, a local database copy is located in /etc/vmware/dvsData.db, which I imagine only comes into play if your vCenter server goes down or if your ESX host loses connectivity to the shared VMFS with the .dvsData folder. You can read more about this over at RTFM.

Interesting links if you’re considering VMware Distributed Switches:

VMware’s demo video of vDS in action, for those who want to learn more about vDS

Mike Laverick’s great reasoning on whether you should use vDS or not

Eric Sloof’s vDS caveats and best practices article

VMware’s vSphere Networking white paper explaining new vDS features

VMware’s vSphere Networking white paper on vDS network migrations

Jason Boche’s very interesting article on a specific vCenter + vDS issue

ESX, vCenter

Virtu-Al’s PowerShell VMware Daily Report

July 16th, 2009

For those of you that have heard of Alan Renouf, you will undoubtedly know of his talents in the dark art of VMware CLI/PowerShell. For those of you who don’t know him, I suggest you check out his web site to sample some of the many great articles and scripts he’s already produced.

His latest PowerShell creation has received a lot of attention in the last couple of days, and with good reason. The Daily Report is a configurable script where you can set thresholds and variables such as snapshot age, datastore free space thresholds or the number of days to look back for vCenter warnings and errors. The script, when run, examines your virtual infrastructure based on these variables and then emails you a nice HTML report on the following items.

• VMs created in the last x days and who created them.

• VMs deleted in the last x days and who deleted them.

• Datastores which have less than x% of free space remaining.

• VMs that have CD-ROM or floppy drives connected.

• VMs with no VMware Tools installed.

• Snapshots that are older than x days.

• Current state of vCenter services.

• vCenter events that have been logged in the last x days.

• Windows events on the vCenter server that relate to VMware.

• Hosts in maintenance mode or a disconnected state.
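Each of the checks above is essentially a threshold comparison. The datastore free-space check, for example, can be sketched in a few lines; the threshold values and datastore names here are made up for illustration, since the real script is PowerShell and queries vCenter directly.

```python
# Hypothetical threshold settings mirroring the script's configurable values.
THRESHOLDS = {
    "snapshot_age_days": 7,
    "datastore_free_pct": 15,
    "event_window_days": 1,
}


def check_datastores(datastores, free_pct_threshold):
    """Return the datastores whose free space is below the threshold."""
    return [name for name, free_pct in datastores.items()
            if free_pct < free_pct_threshold]


# Example inputs: datastore name -> percentage of free space remaining.
datastores = {"VMFS-PROD-01": 9.5, "VMFS-DEV-01": 42.0}
print(check_datastores(datastores, THRESHOLDS["datastore_free_pct"]))
# ['VMFS-PROD-01']
```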

Get yourself over to Alan’s site, download a copy of the script and give it a try. I did today, and the results were enough for me to go ahead and implement it as a scheduled task. If you’d like to see more features in Alan’s Daily Report script then give him some feedback; there are a few good suggestions on the blog post already, and I’m sure the next version isn’t far away. Great work Alan, keep it up!

ESX, vCenter, VI Toolkit / Powershell, VMware

vCenter tasks time-out or ESX host disconnects

February 5th, 2009

Duncan Epping over at Yellow Bricks has posted a most interesting article, which I read tonight when reviewing my RSS feeds. It instantly struck a chord with me because recently we have been having issues with an ESX host at a site remote from our recently upgraded vCenter 2.5 U3 server.

I just received an email from a fellow consultant about a customer which had vCenter tasks time-out every once in a while. At times also ESX hosts got disconnected for no apparent reason at all. He discovered the following article by Richard Blythe aka VMware Wolf: ESX disconnects randomly or when doing VI client tasks from VC, task randomly timeout after a long idle time. Richard created a list of issues/errors that might be related to this issue:

  • ESX disconnects randomly from VirtualCenter
  • ESX disconnects when performing VI Client tasks from VirtualCenter.
  • Tasks randomly timeout after a long idle time
  • “An error occurred communicating to the remote host” pops up.

The article refers to an issue with vCenter Update 3 in combination with firewalls using stateful inspection. The problem occurs because of SOAP timeouts; this behaviour did not exist in VC 2.0.x or 2.5 GA, as they used a different mechanism to communicate with ESX. The official KB article hasn’t been released yet, but a temporary workaround has been published by Richard. If you run into any of the aforementioned issues, head over to Richard’s website and try out the workaround until the fix or official KB article is released.

When conducting operations with this particular host using the VI client attached to the vCenter server, we get the “An error occurred communicating to the remote host” pop-up more often than not. I have been looking through the vCenter logs for this host, and it appears that as well as manual tasks, our overnight PlateSpin protection replication jobs are also getting this message when executing. This might explain some of the issues we’ve been having with some of our newer replication jobs not completing.

I’ve had a quick look at VMwareWolf’s workaround and have asked Richard whether you need to create a dummy VM on each host or just the hosts that experience the problem, as it’s not completely clear. If I get a response I’ll let you all know what it is; meantime we look forward to an official KB hot fix release from VMware.

*UPDATE – 09/02/09*
Richard at VMwareWolf came back to me and informed me that the dummy VM only needs to be set up on the affected host. I set the workaround up yesterday and it appears to have resolved our issue for a job that was consistently reporting the error. A permanent fix is still outstanding from VMware.

vCenter, VMware

VMware logs – locations and what’s in them

December 19th, 2008

I was looking into an issue following an upgrade to vCenter Server 2.5 last weekend. So I set about searching through the file system for the log files on the server, with very little luck to be honest.

I then found two excellent posts from Rick Blythe, a.k.a. the VMwareWolf, detailing the locations of the logs and what each one means. I’m going to keep them handy for all those strange little issues where insight into the logs might give a clue to the problem.

Virtual Center Logs
http://www.vmwarewolf.com/which-virtual-center-log-file/

ESX Server Logs
http://www.vmwarewolf.com/which-esx-log-file/

ESX, ESXi, vCenter, VMware

Virtual Center Upgrade – 2.0.2 to 2.5 Update 3

December 14th, 2008

Well this weekend I had the job of updating our Virtual Center deployment from 2.0.2 to 2.5 update 3. The primary reason for this was to prepare for the introduction of ESX 3.5 hosts into the Virtual infrastructure.

Due to the importance of our virtual infrastructure I decided to do as much reading and preparation as possible to ensure it all went smoothly. It’s easier to convince our change management team to let us make changes if we get them right first time, so this one was important to facilitate an easier path for future change.

So what did I read to ensure I’d covered everything? Here are a few links to get you started:

Mike Laverick’s Upgrade Experience PDF
http://www.rtfm-ed.co.uk/?p=482

vCenter Server 2.5 Update 3 – Release Notes
http://www.vmware.com/support/vi3/doc/vi3_vc25u3_rel_notes.html

ESX 3.5 and vCenter Server 2.5 Upgrade Guide
http://www.vmware.com/pdf/vi3_35/esx_3/r35u2/vi3_35_25_u2_upgrade_guide.pdf

The above PDF is a brilliant guide to the process you should follow, including rollback. You should read it thoroughly so you understand all the prerequisites and can avoid those silly problems that could cause your upgrade to fail.

Of course, vCenter Server 2.5 introduces Update Manager. Although we can’t use it, as we have ESX 3.0.1 and 3.0.2 hosts (it supports 3.0.3, 3.5 and 3.5i only), I decided to install it anyway so it’s there for the future.

Here are some of the links I used to plan out my update manager deployment.

Update Manager Administration Guide
http://www.vmware.com/pdf/vi3_vum_10_admin_guide.pdf

Update Manager Performance and Best Practice Guide
http://www.vmware.com/pdf/vum_1.0_performance.pdf

Update Manager Size Estimator
http://www.vmware.com/support/vi3/doc/vi3_vum_10_sizing_estimator.xls

One of the main mistakes that people tend to make is not giving the SQL accounts the correct permissions on the new Update Manager database and the MSDB database. Make sure you cover this one or your upgrade will fail.

It all went quite smoothly. I initially had a couple of issues which appeared to be related to me attempting to do a custom install. I wanted to ensure I could go through all settings and customise as required; the install however failed with various MSI error messages. I started the install again and didn’t choose the custom setup this time. This however created an issue whereby the Update Manager database appeared to install as SQL Server 2005 Express. I wanted to put it on the same SQL 2000 server as our Virtual Center database, but I never got the option as far as I can remember. I have today uninstalled Update Manager and re-installed it using the “use an existing database” option and the SQL 2000 database. It worked fine the second time around.

There were no immediate problems following the upgrade. I had read some horror stories about issues with the Virtual Center agent on the host not updating; luckily for me it wasn’t an issue. Good luck with your upgrade!

vCenter, VMware