Archive

Archive for the ‘Gestalt-IT’ Category

Gestalt IT Tech Field Day Seattle – NEC HYDRAstor

July 16th, 2010

Following my return from my first Tech Field Day I have been reading through my notes and reflecting on the vendors I saw when I was in Seattle.  Of the vendors I saw, the one that surprised me most was NEC. Everyone has heard of them, but not everyone actually knows what they do or what products they make.  As we found out during our visit, NEC have a broad technology portfolio and quite an interesting offering in the storage space.

Here are some basic facts about NEC that you may or may not know:

- Founded in 1899
- Fortune 200 company with over 143,000 staff
- Revenues of $43 Billion in 2009
- $3 Billion spent in R&D each year across 12 R&D global labs
- 48,000 patents worldwide.
- Have been in storage since 1950 

So with that little history lesson over, the main focus of our visit was NEC’s HYDRAstor. This is their modular grid storage offering for customers with backup and archive storage in mind. It’s marketed as “Grid storage for the next 100 years”, which may sound a little far-fetched, but data growth and data retention periods are ever increasing.   From what I saw and heard the HYDRAstor could very well live up to this bold claim.

There was a lot of content delivered on the day and the session went on for 4 hours, so I’ve tried to wrap up some of the key features below. I have expanded on the key elements of the HYDRAstor that really caught my attention as I think they are worth exploring in more detail.

Key Features

- 2 tier architecture based entirely on best of breed Intel Xeon 5500 based servers

- 2 tier architecture consists of front end accelerator nodes and back end storage nodes

- Shipped as a turnkey solution, though entry level can be bought for self racking.

- Supports a Maximum of 165 Nodes, 55 accelerator nodes and 110 storage nodes

- All interconnects based on 1Gb Ethernet networking (NEC network switches included)

- Supports old and new node modules in the same Grid for easy node upgrade and retirement.

- Supports volume presentation with NFS and CIFS (SMB Version 1)

- Non-disruptive auto reallocation of data across any additional grid capacity - DynamicStor

- Higher levels of resilience than RAID with a reduced capacity overhead (see DRD below)

- WAN optimised grid to grid replication minimises network bandwidth requirements – RepliGrid

- WORM Support for secure retention / compliance governance - HYDRAlock

- Efficient drive rebuilds, only rebuild the actual data not the whole drive.

- Global inline de-duplication across the entire grid – DataRedux™

- Tight backup vendor integration – strips out backup metadata to improve de-dupe ratios

- Mini HYDRAstor appliance available for remote offices or offsite DR replication.

Data Protection - Distributed Resilient Data™ (DRD)  

The resilience provided by HYDRAstor really caught my eye, primarily because it was so different from anything I had ever seen before.  Distributed Resilient Data (DRD) uses something known as erasure coding to provide extremely high levels of resilience. Now you may think that this would come with a considerable storage and performance overhead, but you’d be wrong.

The HYDRAstor provides 6 levels of protection (1 – 6), each with a different degree of resilience and capacity overhead. With the default level 3 selected, NEC’s implementation of erasure coding splits each data chunk into 12 parts, 9 data and 3 parity. The use of erasure coding means that any 9 of the 12 parts are enough to rebuild the complete data chunk. So if that data chunk is spread over 12 disks in a single storage node, it can withstand 3 disk failures. If those 12 parts are spread over 12 storage nodes then you can withstand 3 complete node failures.

This default level 3 protection requires a 25% capacity overhead, much like RAID 5 in a 3+1 configuration.  However, by tolerating 3 disk failures it withstands three times as many failures as RAID 5 (one disk) and one and a half times as many as RAID 6 (two disks).  If you want to go to the highest level of protection (level 6) then there is a 50% capacity overhead, as with RAID 1, but you can withstand the failure of 6 disks or 6 nodes.
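As a rough check on those figures, the short Python sketch below works out the capacity overhead and failure tolerance of a 12-part scheme. It assumes that protection level N simply means N parity parts out of 12, which matches the level 3 and level 6 numbers quoted here but is my own guess for the other levels, and the function name is made up for illustration.

```python
TOTAL_PARTS = 12  # each chunk is split into 12 parts (from the level 3 description)

def drd_profile(parity_parts, total_parts=TOTAL_PARTS):
    """Return (data_parts, overhead_fraction, failures_tolerated).

    Assumption: a protection level of N means N parity parts out of 12.
    """
    if not 0 < parity_parts < total_parts:
        raise ValueError("parity parts must be between 1 and total_parts - 1")
    data_parts = total_parts - parity_parts
    overhead = parity_parts / total_parts  # share of raw capacity spent on parity
    # Any `data_parts` of the parts can rebuild the chunk, so up to
    # `parity_parts` parts (disks or nodes) can be lost.
    return data_parts, overhead, parity_parts

for level in (3, 6):
    data, overhead, tolerated = drd_profile(level)
    print(f"level {level}: {data} data + {level} parity, "
          f"{overhead:.0%} overhead, survives {tolerated} failures")
```

Running it for levels 3 and 6 gives the 25% and 50% overheads mentioned above.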

The following video describes Distributed Resilient Data™ (DRD) at the default level 3

 

High Performing

The demonstration NEC gave us was based on their lab setup of 20 accelerator nodes and 40 storage nodes.  This was a 4 rack setup, which as you can see from the photo below is not a small setup. What it is though, is a very high performing storage solution.

image

NEC demonstrated a data copy that utilised a full 10GB per second throughput, which worked out at about 540MB throughput per front end accelerator node.  The screenshot from the management GUI below shows the  total throughput achieved.

The maximum HYDRAstor configuration consists of 11 racks and is capable of 25GB per second or 90TB per hour. This works out at roughly 2PB in a 24 hour period, which is an astounding amount of data throughput.  Surely that is a level of throughput to deal with even the most demanding backup or archiving use case.
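For anyone who wants to check the sums, here is a small Python sketch of the conversion. It assumes decimal units (1TB = 1,000GB) and a rate sustained for the full 24 hours, and the function name is just for illustration.

```python
GB_PER_TB = 1000  # decimal units assumed
TB_PER_PB = 1000

def throughput_summary(gb_per_second):
    """Convert a sustained GB/s rate into TB per hour and PB per day."""
    tb_per_hour = gb_per_second * 3600 / GB_PER_TB
    pb_per_day = tb_per_hour * 24 / TB_PER_PB
    return tb_per_hour, pb_per_day

tb_hour, pb_day = throughput_summary(25)
print(f"{tb_hour:.0f} TB/hour, {pb_day:.2f} PB/day")
```

At 25GB per second this comes out at 90TB per hour and about 2.16PB per day.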
 

image

There were a few negative aspects that I picked up on during our visit, thankfully all ones I feel can be addressed by NEC over time.

User Interface

I felt the user interface was a little dated (see screenshot above). It served its basic purpose but wasn’t going to win any awards. It was a stark contrast when compared with the very nice and easy to use GUIs we saw from Nimble Storage and Compellent.  That said, if the HYDRAstor is only being used as backup and archive storage and not primary storage, does it actually need to have the world’s best GUI? Possibly not.

Solution Size

The HYDRAstor came across as a large solution, though I’m not sure why. When I think about it, any storage solution that provides 10GB/sec throughput and 480TB of raw storage is likely to take up 4 racks, in some instances probably a lot more.  Maybe it was the sheer number of network interconnects; perhaps some consolidation with 10Gb Ethernet could assist in making the solution appear smaller.  NEC could also look at shrinking down the server sizes, though that is probably only possible with the accelerator nodes, as the storage nodes need 12 x 1TB disks so there is not a lot of scope for size reduction there.

Marketing

The general question among delegates was why NEC marketing have not been pushing this harder, and why so many of us in the room had not heard about it. I suppose that was one of the reasons we were there: to hear about it, discuss it and ultimately blog about it as I’m doing now. There are some specific target markets that NEC maybe need to look at for this product, possibly using world wide data retention regulations as a means of identifying potential markets and clients.  More noise needs to be made by NEC about their efficient de-dupe integration with enterprise backup products such as CommVault Simpana, Symantec NetBackup, TSM and EMC Networker.  More comments such as the one below wouldn’t hurt.

with the application aware de-duplication for CommVault we’ve optimized storage efficiency with a four times improvement in space reduction.
Pete Chiccino, Chief Information Officer, Bancorp Bank

EMEA availability

NEC told us that this product is not being actively pushed in the EMEA region.  Currently the product is only available for purchase in North America and Japan.  One of the points I made to NEC was that the HYDRAstor appeared to me to be a product that would have a lot of applications in the European market place, possibly more so in the UK.  I made specific reference to FSA regulation changes where financial companies are now required to keep all electronic communications for up to 7 years.  NEC’s HYDRAstor, with its high tolerance for failure, global de-duplication across all nodes and grid like extensibility, is perfect for storing this kind of key, critical compliance data.  That is a very specific example; others are insurance companies who have longer retention requirements and museums digitising historical documents / books which have a “keep forever” retention requirement.

NEC contacted me via Twitter after the event to say that, although the HYDRAstor is not on sale in EMEA, a company with a presence in the US will be able to explore purchasing it through NEC America.

Summary

I had no idea what to expect when we arrived at NEC’s offices. Sure, I knew who they were, but I had no idea what they were doing in the storage space. Gideon Senderov at NEC certainly saw to it that we had all the information needed to form an opinion, and his knowledge of his product was simply outstanding.

NEC HYDRAstor is a product that is quite unique. It’s easy to scale up and scale out, has high levels of redundancy without the normal capacity penalty and of course exceptional levels of performance. It strikes me as a product that any IT professional responsible for backup, archiving and long term data retention would be very, very interested in.

Note : Tech Field Day is a sponsored event. I receive no direct compensation and take personal leave to attend, however all event expenses are paid by the sponsors via Gestalt IT Media LLC. The views and content expressed here are my own and are in no way influenced by the sponsors of this event.

Events, Gestalt-IT, Storage

Gestalt IT Seattle Tech Field Day – Day 2 Summary

July 16th, 2010

It’s now been a couple of days since the second day of the Gestalt IT Tech Field Day, and I’m actually taking the opportunity to write this on the plane on the way back from Seattle. So once again I thought I would do a summary post until I get the chance to write up a detailed post on each vendor.

 image image

Compellent were one of the main sponsors for the Seattle Tech Field Day and were responsible for us getting access to the Microsoft Campus. So a big thank you to Compellent for their support of Tech Field Day.

Compellent are a company I have had dealings with before, I looked at buying one of their storage devices back in 2008 and was very impressed by the product they had on offer at the time.  This was a great chance for me to revisit Compellent two years on and see how things had changed.

Compellent in general still appears to be much the same product that I liked so much back in 2008.  Their pooled storage model, software controlled RAID write down, space efficient snapshots and WAN optimised thin replication are all superb features. Their main differentiator back in 2008 was their ability to do automated storage tiering (Data Progression™), something that others in the industry are starting to catch up to (EMC FAST). Compellent’s Data Progression technology is one that many customers actively use with good results. I was slightly disappointed though to learn that their data movement engine only executes once every 24 hours and cannot be made more frequent.  I’m not sure how that compares to EMC FAST but it is something I’ll include in a more expansive post.

A feature I had heard of but didn’t quite understand previously was Compellent’s Live Volume.  It’s another unique feature for Compellent and one of my fellow delegates even described it as “EMC vPlex that you could actually afford”. Compellent implement the Live Volume feature at software level as opposed to a hardware based implementation like EMC vPlex. Compellent are able to present the same volume, with the same identity in two different locations, they do this using the underlying WAN optimised asynchronous replication. One point of note was that this is not an active / active DR like setup,  this is a setup for use in a controlled maintenance scenario, such as SAN fabric maintenance or a DC Power down test.

Compellent also took the opportunity to share some roadmap information. Highlights included the release of the 64 bit Series 40 Controller based on the Intel Nehalem, an encrypted USB device for seeding replication, a move to smaller 2.5” drives and 256 bit full disk encryption, among others.

image 
Although we were situated on Microsoft’s Campus for a large part of Tech Field Day we were never presented to by Microsoft, which was a shame.  We did however get the chance to visit the Microsoft store which is for employees only.  It gave us all a chance to buy some discounted Microsoft software and souvenirs of our visit to Redmond, which we all took advantage of.

photo

Tech Field Day delegates Kevin Houston, Stephen Foskett and Jason Boche using their iPhones and iPads in the heart of the Microsoft campus. Note Jason Boche using an iPad and wearing his VMware VCDX shirt, brilliant!

image

Our afternoon session was spent a short bus ride away from Microsoft at NEC America’s Seattle office.  We were here to hear about NEC’s storage offering (I had no idea they even did storage) and more specifically the NEC HYDRAstor range. We had a very in depth session on this fascinating product with Gideon Senderov, Director of Product Management for the HYDRAstor range.

NEC have taken an innovative approach with this product, one I was not expecting. They utilise full blown NEC servers to provide a two tier architecture made up of front end accelerator nodes and back end storage nodes.  On top of this they don’t use the traditional RAID model, instead using something known as erasure coding to provide improved data protection. I will deep-dive this particular data protection method in another article but it was a very interesting and different approach to what I’m used to.

The HYDRAstor grid is marketed as “Storage for the next 100 years” and with its grid architecture it’s reasonably easy to see how that statement could be realised.  You can add additional nodes into the grid and it will automatically redistribute itself to take advantage of the capacity.  You can also mark nodes for removal, with the system evacuating the data to enable nodes to be removed from the grid.  This, combined with the ability for old and new HYDRAstor nodes to co-exist, shows why it’s a good storage location for data with a very long term retention requirement.

It appeared to me that HYDRAstor was designed specifically as a location for the output of archive or backup data and not as a primary data storage solution. The reason I say this is that when we discussed in-line de-duplication the product was already integrated with major backup vendors (Symantec NetBackup, CommVault Simpana, Tivoli Storage Manager and EMC Networker). NEC were getting very clever by stripping out metadata from these backup vendors to improve the level of de-dupe that could be achieved with the product when storing backup data.

I will revisit the HYDRAstor. Once I have had a chance to go over my notes I fully intend to dedicate a full article to it, as I was very impressed.

image           Capture

Rodney Haywood and Gideon Senderov white boarding the configuration of the NEC HYDRAstor

Note : Tech Field Day is a sponsored event. I receive no direct compensation and take personal leave to attend, however all event expenses are paid by the sponsors via Gestalt IT Media LLC. The views and content expressed here are my own and are in no way influenced by the sponsors of this event.

Events, Gestalt-IT, Storage

Gestalt IT Seattle Tech Field Day – Day 1 Summary

July 15th, 2010

So that is Day 1 of the Seattle Tech Field Day out of the way and what a day it has been.  We’ve been out to Microsoft Redmond HQ, or “the temple” as John Obeto calls it.  We saw some new products from Veeam and were privileged enough to be the first port of call for a new and very exciting storage start-up, Nimble Storage.

There has been a lot of information flowing about today, an awful lot. My plan is to spend some time assimilating all the information and doing more detailed posts on everyone we’ve seen, so for now I think a summary will suffice.

image

Veeam are a company that needs very little introduction.  They’ve not been around long (3 years to be exact) but they are a well known and well respected brand in the virtualisation space.  Today Veeam were announcing a new product / concept that they have at the development stage, one that got delegates quite excited.

Veeam were introducing vPower, a new product made up of 3 components: SureBackup, Instant Restore and CDP (a much debated point).  What stood out most for Tech Field Day delegates was some of the Instant Restore functionality; the ability to run your VM direct from the backup image was well received.  My personal thought at the time was, who wouldn’t want to have a mechanism available to test that your backups actually work?  The added bonus was that Veeam also provide network isolation and an almost Lab Manager-like ability to create groups of machines that should be recovered together. The idea of verifying your backups by running them from the backup storage was one thing, but Veeam had also written their own NFS implementation in order to do this.  This means that technically, in the event of an outage, you can run your machine directly from the Veeam backup server NFS datastore.  It isn’t going to be fast but it’s running, which is the main thing you should be concerned about.  It was all good stuff and the general consensus was that it was a step in the right direction and quite a shift in the VM backup space.

image

Our surprise for the day was a new tech start-up who were launching themselves and their product for the very first time.  Nimble Storage is a new start-up company made up of a number of high pedigree employees with a proven track record at companies such as NetApp and Data Domain.  This is further backed up by an experienced board of directors, top venture capital investment and, last but not least, a pretty good product at a good price point.

Without going into too much detail, Nimble Storage have produced a new array that probably reshapes the way people think about primary and backup storage as well as the use of flash storage within an array. Right at the outset they stated that their aim was to introduce flash storage to the mid size enterprise while also utilising a lot of the features being pioneered by other vendors.  Nimble’s approach is different in that it provides a converged appliance, one that does primary and secondary storage within the same device while also introducing flash caching to provide high performance.  Through the use of inline compression, flash cache, sequential write down to disk, efficient snapshots and replication as well as zero space cloning, Nimble is packing a lot into their product. At the top end you are paying a list price of $99,000 + $6,000 annual maintenance.  For this you are looking at 18TB of primary storage (not including flash cache) + 15,000 IOPS from a SATA / Flash mix. They were also looking at 216TB of backup capacity within that same device, driven primarily by their use of space efficient snapshots.  I have a lot of notes on this particular presentation and will be expanding upon this in the coming weeks.

image

Now F5 was a company I was really interested to see, primarily because I wasn’t entirely sure what they offered.  Sure I knew they were into networking but even then what did they do in the networking space, I had no idea.  We were treated to 4 different presentations that covered the following.

  • WAN optimised geographical vMotion
  • Coding of iRules and iControl for the BIG-IP appliances
  • Intelligent client VPN connectivity via BIG-IP’s Edge Gateway module.
  • Data Management and Routing using F5’s ARX appliance, file system virtualisation.

 

All were very impressive and I will definitely be looking to dig a little deeper and examine in full some of the technology presented and discussed.  I was particularly impressed with F5’s vision for data management / file level virtualisation, as they seem to be one of the only companies in this space that I am aware of.  This vision was demonstrated to us as a mix of onsite primary tier 1 storage and off site cloud storage.  The ARX appliance would sit as a director presenting a unified view of the storage to the end user, while internally keeping a routing table of up to a billion files.  This will allow IT departments to place files across multiple types of storage, whether that be differing internal storage devices or storage in the cloud. The concept sits well with the current cloud strategies being developed by most major IT companies; what’s surprising is that nobody else is doing it.  There is a lot more to be said about F5, and I plan to delve a little deeper and write some more.

Summary

It’s been a very busy day,  one however that has been exceptionally rewarding. Tech Field Day has been everything I expected it to be so far,  there has been a wealth of information shared and a lot of feedback given. The biggest win for me though is getting the time to learn more about vendors and their product offerings, that and hearing the comments of my fellow delegates.  There is a good mix of intelligent people from varied backgrounds and that has only added to the experience so far.

We ended the night with a tour of the Boeing Museum of Flight and a couple of drinks with dinner.  It’s now midnight and, after just 6 hours sleep last night and with a busy schedule ahead for tomorrow, I am going to call it a night there.

Note : Tech Field Day is a sponsored event. I receive no direct compensation and take personal leave to attend, however all event expenses are paid by the sponsors via Gestalt IT Media LLC. The views and content expressed here are my own and are in no way influenced by the sponsors of this event.

Events, General, Gestalt-IT, Tech Field Day

Windows Virtual Desktop Access Licensing - What is it?

June 24th, 2010

I try to avoid licensing at all costs. It’s a horrible subject and one that strikes fear into many.  When you add virtualisation into the mix it tends to get a little more complicated and you often find that the rules change on a reasonably regular basis. I was involved in a discussion today about Citrix XenDesktop and an interesting point came up when discussing licensing virtual PCs.  Someone mentioned something called the Microsoft VDA. I hadn’t a clue what they were talking about so I did a little digging around to find out more.

In summary this is what I found, and it’s not pretty reading. As of the 1st of July 2010 Microsoft is changing the way it licenses the Windows OS in VDI environments.  The following changes will take place:

Windows® Virtual Enterprise Centralized Desktop (Windows VECD) and Windows VECD for Software Assurance (SA) will no longer appear on the price list.

Virtual desktop access rights will become a Windows Client Software Assurance benefit. Customers who intend on using PCs covered under SA will now be able to access their Virtual Desktop Infrastructure (VDI) desktops at no additional charge.

Customers who want to use devices such as thin clients that do not qualify for Windows Client SA would need to license those devices with a new license called Windows Virtual Desktop Access (Windows VDA) to be able to access a Windows VDI desktop. Windows VDA is also applicable to third party devices, such as contractor or employee-owned PCs.

What does it all mean?

In its simplest terms you don’t licence the Windows virtual machine itself, you instead licence the end point it’s being accessed from. To further break this down there are two distinct endpoint categories to consider.

1. The end point is a Windows device covered by Software Assurance (SA)

2. The end point is a non Windows device or is a Windows device without SA

In the first category you are covered to access a Windows virtual machine as Virtual Desktop Access (VDA) is included as a Software Assurance benefit.  In the second category however you need to purchase a VDA subscription for each end point device.  Unfortunately this is not a one off purchase either; it is a $100 per year per device subscription cost.

As an example, say you have a sales person who uses a company laptop and a company smart phone to access their VDI virtual machine.  You would need to have the laptop installed with a software assured copy of Windows and buy a VDA subscription for the smart phone.  Alternatively if you have a non SA copy of Windows on the laptop you need 2 VDA subscription licences to cover both devices.  This latter example would obviously be the same if the laptop was Mac OS or Linux based.
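To make the two categories a bit more concrete, here is a minimal Python sketch that counts the VDA subscriptions for a set of end points. It is a simplified model of the rule as I have described it (SA covered Windows devices are free, everything else needs a subscription at the $100 per device per year figure); the class and function names are made up for illustration and it deliberately ignores everything else in the licence terms.

```python
from dataclasses import dataclass

VDA_COST_PER_DEVICE_PER_YEAR = 100  # USD, the figure quoted in this post

@dataclass
class Endpoint:
    name: str
    runs_windows: bool
    has_software_assurance: bool

def needs_vda(endpoint):
    """An end point needs VDA unless it is a Windows device covered by SA."""
    return not (endpoint.runs_windows and endpoint.has_software_assurance)

def annual_vda_cost(endpoints):
    """Return (number of VDA subscriptions, yearly cost in USD)."""
    count = sum(1 for e in endpoints if needs_vda(e))
    return count, count * VDA_COST_PER_DEVICE_PER_YEAR

sales_person = [
    Endpoint("company laptop", runs_windows=True, has_software_assurance=True),
    Endpoint("company smart phone", runs_windows=False, has_software_assurance=False),
]
print(annual_vda_cost(sales_person))
```

Flipping `has_software_assurance` to `False` on the laptop takes the count from one subscription to two, which is the second scenario in the example.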

There is some good news though, in that Microsoft have something called extended roaming rights with the Windows VDA licence.  In short the primary user of a VDA licensed device can access their VDI desktop from any device that is not owned by the user’s company.  Examples would be a user’s home PC, an airport kiosk or a hotel business centre.

There is a lot to take in with licensing, especially in the VDI space. I suggest everyone running or planning to deploy VDI takes a look at the recent changes and considers how they affect existing or planned deployments.  Some people will see this as Microsoft stifling the growth of Virtual Desktop Infrastructure, others will argue that it may actually act as an enabler.  In truth I’m just not sure. I’m still digesting what it all means and playing through the various scenarios and combinations of VDI access.  On the surface I can see it hindering as opposed to helping this growing virtualisation sector.

For additional information I’d recommend checking out the following Microsoft FAQ article and for those of you who are Gartner customers the linked article below breaks it down quite nicely into simple terms.

Microsoft VDI suites & Windows VDA Frequently Asked Questions PDF

Gartner – Q&A for understanding Microsoft Licensing Requirements before deploying HVDs

General, Gestalt-IT, Microsoft

SNAPVMX – View your Snapshots at VMFS/virtual disk level

June 9th, 2010

Following a recent implementation of VMware Data Recovery (VDR) we ran into a few issues.  We eventually had to kill the virtual appliances due to the issues we were having and as a result we had a couple of virtual machines with outstanding snapshots.  These snapshots were taken by VDR and as a result could not be viewed or deleted using the snapshot manager.

We raised a call with VMware support and they started a WebEx session to look at the issue.  I always love watching VMware support personnel operating at the service console level, I always pick up a command or two that I didn’t know before.  On this occasion the support engineer was using something called SnapVMX to view the hierarchy of snapshots at the virtual disk level.

At first I thought this was an inbuilt VMware command but it turns out it’s not. It was actually a little piece of code that was written by Ruben Garcia.  What does it do?  Well, the following extract from the download page explains it pretty well.

  • Displays snapshots structure and size of snapshots for every disk on that VM
  • Calculates free space needed to commit snapshots for the worst case scenario
  • Checks the CID chain of the analysed files and displays a warning if broken.

I’ve included a little demo screenshot to show what it can do. On the left hand side is  a screenshot from Snapshot Manager within vCenter.  On the right hand side is the same VM being viewed with SnapVMX in the service console.  Put the two together and you get a better idea of the snapshot disk hierarchy and the size of each snapshot.

SnapVMX_1SnapVMX

The other interesting feature is that it tells you what space is required to commit the snapshots.  So for example, say you had taken 5 snapshots of a machine as it was being built and configured.  Say that the overall effect of those 5 snapshots is to fill up your VMFS datastore completely. Chances are that you’re not going to be able to commit the snapshots within the current VMFS datastore.  SnapVMX will be able to tell you the worst case scenario for how much space would be required to commit the snapshots.  Armed with this information you could cold migrate to another datastore that has at least that amount of free space in order to allow you to commit the snapshots.  The screenshot below isn’t the best, but it was the best I could do due to the length of the statement.

SnapVMX_2

For the download and full documentation on how to use this piece of code head over to the following web site. Worth a look if you’re a big user of snapshots.

http://geosub.es/vmutils/SnapVMX.Documentation/SnapVMX.Documentation.html

While searching for a link to Ruben Garcia to put on this article I found that he has a blog site and within that I found a link to a superb troubleshooting VM snapshot problems article which I will definitely be keeping a link to and suggest you check out.  Truly excellent stuff Ruben!

General, Gestalt-IT, VMware

Storage I/O control – SIOC - VMware DRS for Storage

May 10th, 2010

Following VMworld in 2009 a number of articles were written about a tech preview session on IO DRS – Providing performance Isolation to VMs in Shared Storage Environments. I personally thought that this particular technology was a long way off, potentially something we would see in ESX 4.5. However I recently read a couple of articles that indicate it might not be as far away as first thought.

I initially came across an article by VMware’s Scott Drummond in my RSS feeds.  For those that don’t follow Scott, he has his own blog called the Pivot Point which I have found to be an invaluable source of VMware performance related content. The next clue was an article about an ESX 4.1 feature leak. I’m sure you can probably guess what the very first feature listed was? It was indeed Storage I/O Control.

Most people will be aware of VMware DRS and its usage in measuring and reacting to CPU and memory contention. In essence SIOC is the same feature but for I/O, utilising I/O latency as the measure and device queue management as the contention control. In the same way as the current DRS feature for memory and CPU, I/O resource allocation will be controlled through the use of share values assigned to the VM.

VM_Disk_Shares

I hadn’t realised this until now but you can already control share values for VM disk I/O within the setting of a virtual machine (shown above).  The main problem with this is that it is server centric as you can see from the statement below from the VI3 documentation.

Shares is a value that represents the relative metric for controlling disk bandwidth to all virtual machines. The values Low, Normal, High, and Custom are compared to the sum of all shares of all virtual machines on the server and the service console.

Two main problems exist with this current server centric approach.

A) In a cluster, 5 hosts could be accessing VM’s on a single VMFS volume, there may be no contention at host level but lots of contention at VMFS level. This contention would not be controlled by the VM assigned share values.

B) There isn’t a single pane of glass view of how disk shares have been allocated across a host, it appears to only be manageable on a per VM basis.  This makes things a little trickier to manage.

Storage I/O Control (SIOC) deals with the server centric issue by introducing I/O latency monitoring at a VMFS volume level. SIOC reacts when a VMFS volume’s latency crosses a pre-defined level. At this point access to the host queue is throttled based on the share value assigned to the VM.  This prevents a single VM getting an unfair share of queue resources at volume level, as shown in the before and after diagrams Scott posted in his article.
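To illustrate the general idea of share based throttling, here is a small Python sketch. It is not VMware’s algorithm, just my own simplified model: if the observed volume latency is above a threshold, the available queue slots are divided between VMs in proportion to their shares. The VM names, the 32 slot queue and the 30ms threshold are all made up for the example.

```python
def allocate_queue_slots(vm_shares, total_slots, latency_ms, threshold_ms):
    """Split device queue slots between VMs in proportion to their shares.

    Below the latency threshold no throttling is applied and None is returned.
    Integer division is used, so a few slots may be left unassigned.
    """
    if latency_ms <= threshold_ms:
        return None  # no contention detected, leave the queue alone
    total_shares = sum(vm_shares.values())
    return {vm: total_slots * shares // total_shares
            for vm, shares in vm_shares.items()}

# Hypothetical mix of production and test VMs on one volume
shares = {"prod-db": 2000, "prod-web": 1000, "test-01": 500, "test-02": 500}
print(allocate_queue_slots(shares, total_slots=32, latency_ms=45, threshold_ms=30))
```

With these numbers the production database ends up with half of the queue while each test VM gets an eighth, and nothing is throttled at all while latency stays under the threshold.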

   queues_before_sioc              queues_after_sioc

The solution to the single pane of glass issue is pure speculation on my part. I’d personally be hoping that VMware add a disk tab within the resource allocation views you find on clusters and resource groups.  This would allow you to easily set I/O shares for tiered resource groups, i.e. Production, Test, Development. It would also allow you to further control I/O within the resource groups at a virtual machine level.

Obviously none of the above is a silver bullet! You still need to have a storage system with a fit for purpose design at the backend to service your workloads. It’s also worth remembering that shares introduce another level of complexity into your environment.  If share values are not assigned properly you could of course end up with performance problems caused by the very thing meant to prevent them.

Storage I/O Control (SIOC) looks like a powerful tool for VMware administrators. I know in my own instance, I have a cluster that is a mix of production and testing workloads. I have them ring fenced with resource groups for memory and CPU but always have this nagging doubt about HBA queue contention. This is one of the reasons I wanted to get EMC PowerPath/VE implemented, i.e. use both HBAs and all available paths to increase the total bandwidth. Implementing SIOC when it arrives will give me peace of mind that production workloads will always win out when I/O contention occurs. I look forward to the possible debut of SIOC in ESX 4.1 when it’s released.

**UPDATE**

Duncan Epping over at Yellow Bricks has located a demo video of SIOC in action. Although a very basic demonstration, it gives you an idea of the additional control SIOC will bring.

Gestalt-IT, New Products, VMware

VMware PVSCSI Adapter performance and low I/O Workloads

February 21st, 2010

I’ve recently been implementing a vSphere deployment and have been looking at the new features introduced as part of Virtual Machine Hardware 7.  Obviously one of the major new components is the new Para Virtualised SCSI (PVSCSI) adapter which I wrote about way back in May 2009.  When it first came out there were a number of posts regarding the much improved I/O Performance and latency reduction this new adapter delivered, such as Chad Sakac’s I/O vSphere performance test post.

So the other day I stumbled across a tweet from Scott Drummond, who works in the VMware Performance Engineering team. Following a little reading and a bit of digging around it appears that the use of PVSCSI comes with a small caveat. It would appear that if you use the PVSCSI adapter with low I/O workloads you can actually get higher latency than you get with the LSI Logic SCSI adapter (see the quote below).

The test results show that PVSCSI is better than LSI Logic, except under one condition–the virtual machine is performing less than 2,000 IOPS and issuing greater than 4 outstanding I/Os.

This particular caveat has come to light following some more in-depth testing of the PVSCSI adapter performance.  The full whitepaper can be found at the following link.

PVSCSI whitepaper - http://www.vmware.com/pdf/vsp_4_pvscsi_perf.pdf

For those who don’t want to read the technical whitepaper, a summary of the issue can be found in the following VMware KB article.

VMware KB 1017652 - http://kb.vmware.com/selfservice/1017652

So basically, rather than just using the PVSCSI adapter by default with VMs running version 7 of the virtual hardware, have a think about each VM’s I/O profile and whether the PVSCSI or LSI Logic adapter would be best.
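If you wanted to apply that rule of thumb across a list of VMs, something like the Python sketch below would do it. The two thresholds come straight from the quote above; the function name and the idea of feeding it averaged IOPS and outstanding I/O figures are my own assumptions, and how you collect those numbers is up to you.

```python
# Thresholds taken from the quoted test results.
LOW_IOPS_THRESHOLD = 2000        # "less than 2,000 IOPS"
OUTSTANDING_IO_THRESHOLD = 4     # "greater than 4 outstanding I/Os"


def recommend_scsi_adapter(iops, outstanding_ios):
    """Suggest an adapter for a VM based on its measured I/O profile.

    PVSCSI is the default recommendation; LSI Logic is only suggested
    for the one condition called out in the quote.
    """
    if iops < LOW_IOPS_THRESHOLD and outstanding_ios > OUTSTANDING_IO_THRESHOLD:
        return "LSI Logic"
    return "PVSCSI"


# Hypothetical VM profiles: (average IOPS, outstanding I/Os)
profiles = {"file-server": (500, 8), "database": (5000, 16), "dns": (50, 1)}
for name, (iops, oio) in profiles.items():
    print(name, "->", recommend_scsi_adapter(iops, oio))
```

Treat the output as a prompt to go and look at the VM rather than a final answer; a workload that sits right on the boundary probably won't notice the difference either way.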

Gestalt-IT, VMware, vSphere

VMware VMSafe – Are there any actual products yet?

November 29th, 2009

I was doing some work out of hours the other night on my employer’s Virtual Infrastructure when bang on time the little red triangles started popping up against certain ESX hosts in vCenter. Why, you ask? Well, it’s AV scanning time on our VMs of course, or the Sophos summit as we affectionately call it due to its uncanny resemblance to a mountain range when you look at the CPU performance stats in vCenter.

It got me thinking, has any one vendor actually got a product out there utilising the VMSafe API that could help me rid our virtual infrastructure of this problem?

My first stop was of course the main VMSafe page where I did find a large list of official partners who are working on developing products to utilise the VMSafe API. The pleasing thing to see was that there are plenty of mainstream security vendors taking part.  However I’ve still to see any of them releasing a product to market that actually utilises VMSafe.

Earlier this year in Glasgow I heard McAfee talk about VMSafe as part of the VMware vSphere launch road show. They talked about building a vApp that could sit in your Virtual Infrastructure and take care of AV scanning with the aim of reducing the CPU overhead that AV scanning introduces. I did a little trawl of the web and couldn’t find anything official. I did however find the following forum post (quoted below) which is definitely the unofficial line.

Virus Scan for Offline images is available, which uses VMSafe APIs to scan offline disks accessed via ESX

Nothing is currently road mapped for on-access scanning - no AV vendor has this technology available (or even road mapped as far as I’m aware) yet.

I did a bit more digging on this “scan offline disks” comment and found a recent article by VMware’s Richard Garsthagen.  This article reveals that a piece of software called the VMware Virtual Disk Development Kit (VDDK) can be used to conduct an offline scan of disks attached to powered on or off virtual machines (quoted below). 

VMware VDDK (also being seen as part of the VMsafe initiative, but has been available for longer). The VDDK is an disk API, that allows other programs to access a virtual machine’s hard disk like the VMware Consolidated Backup solution does. It does not matter is the VM is powered on of off, but a disk can just be ‘extra’ mounted to another virtual machine that for instance runs a virus scanner. The clear downside of VDDK is that nothing is real time.

Surely this would rid me of my daily scheduled Sophos summit, wouldn’t it? Think of a hypothetical scenario where you have a VDI setup with 1000 Windows XP VMs, and imagine the strain put on your ESX clusters by 1000 machines kicking off a scheduled daily AV scan. Would an appliance that could offline scan disks reduce the strain? Well thinking about it, possibly not. It would still have to conduct a scan of 1000+ virtual disks, only this time it wouldn’t have nearly as many CPU cycles available to churn through the work. All it would have is the resources assigned to the vApp, which are likely to be completely inadequate for such a large task. With this in mind it would probably take a large amount of time to complete. It could even take longer than a day, which wouldn’t be much use for a daily AV scan. I’m sure some companies would rather suffer the ESX CPU resource pain point as opposed to sacrificing security through ineffective or untimely AV scanning.
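To put some very rough numbers on that worry, here is a back of the envelope calculation in Python. Every figure in it is an assumption I've picked purely for illustration (10 GB of data per disk, a single scanning appliance managing 50 MB/s); I have no real throughput figures for any VDDK based scanner.

```python
def offline_scan_hours(disk_count, avg_used_gb, scan_mb_per_sec, parallel_scans=1):
    """Estimate the hours needed to scan every virtual disk once.

    All inputs are guesses supplied by the caller; the model simply
    divides total data by aggregate scan throughput.
    """
    total_mb = disk_count * avg_used_gb * 1024
    seconds = total_mb / (scan_mb_per_sec * parallel_scans)
    return seconds / 3600


# Assumed scenario: 1000 XP desktops, 10 GB each, one appliance at 50 MB/s.
single = offline_scan_hours(1000, 10, 50)
print(f"one appliance:   {single:.1f} hours")   # roughly 56.9 hours

# Same scenario with four appliances scanning in parallel.
four = offline_scan_hours(1000, 10, 50, parallel_scans=4)
print(f"four appliances: {four:.1f} hours")     # roughly 14.2 hours
```

On those made-up numbers a single appliance would need more than two days to get through what is supposed to be a daily scan, so you would have to scale out the scanning appliances, and at that point you are spending the CPU cycles anyway.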

Richard’s article along with the solutions tab on the VMSafe webpage did however reveal that a couple of products that use VMSafe have made it to market. One is called vTrust from Reflex Systems, which appears to be a multi-faceted application that, according to their site, provides dynamic policy enforcement and management, virtual segmentation, virtual quarantine and virtual networking policies. The other application is a hypervisor based firewall appliance from Altor that supports virtual segmentation and claims to provide better throughput by using the Fast Path element of the VMSafe API.

So it would appear on the surface that progress has been slow. To only find two VMware certified appliances in the market place was, I have to admit, quite a surprise! It looks like it’s going to be a while before we see VMSafe being fully utilised by vendors, and even then we will have those wary individuals who will never quite be convinced.

Neil Macdonald of Gartner makes a good point about the potential for VMSafe appliances to introduce possible security vulnerabilities at a lower level in the infrastructure.

If I’m responsible for VM security, I’ll consider it after the APIs ship, after the vendors finally ship their VMSafe-enabled solutions, after I’ve got a level of comfort that these VMSafe-enabled security solutions don’t in of themselves introduce new security vulnerabilities

Edward L Haletky who is very much focused on virtualisation security also makes a good point about low level vulnerabilities and the interaction of multiple VMSafe appliances. 

I fully expect VMware to not only ensure the VMSafe fastpath drivers do nothing harmful to the virtual environment, but also address interaction issues between multiple VMSafe fastpath drivers. In addition, I would like such reports made available to satisfy auditing requirements.

So was VMSafe simply something to bolster the vSphere marketing launch, an announcement made before it should have been? Usually VMware are quite good at keeping these kinds of things under wraps and releasing them when they are a little more mature and ready for use in real world scenarios. Now I don’t know what work was done with partners in advance but I would have liked to have seen a couple of the major security vendors releasing appliances at the same time as VMSafe was announced. For me that certainly would have instilled a little more confidence in VMSafe than writing this article has.

If anyone out there is writing appliances utilising the VMSafe API and wants to comment, please do.  I would love to hear some news from the front line as to what is being developed, where it will be applied and when we can expect to see it.

General, Gestalt-IT, vSphere

IT Vendor engagement of the customer community

November 22nd, 2009

Over the last month or so I’ve had two invites to participate in vendor events abroad. The first was an invite to the Gestalt IT tech day in San Francisco, the second was an invite to the EMC EMEA Customer Council event in Prague. Now as much as I would love to go to everything I get invited to, I have a day job which pays the bills, so in this instance I had to choose the one most relevant to my employer and that was the EMC EMEA Customer Council.

Having never been invited to an EMC Customer Council event before I wasn’t entirely sure what to expect. The basic structure of the event involved EMC sharing product roadmap and strategy, deep diving a few key technologies / strategies and then listening to customer feedback. The sessions I attended were very interactive round table discussions, with a lot of enterprise customers who were not backward in coming forward with their feelings and opinions. As the sessions went on I started to see why EMC run these events. It would be hard to gain this kind of candid and honest feedback through any other medium, and this kind of information is invaluable to a vendor. From my perspective as a customer I got a lot of good insight into roadmap, allowing me to more accurately propose a long term EMC storage strategy for my employer. I also got to meet and chat to a lot of interesting people and best of all, I got to hear about the experiences of other customers. It was reassuring to hear that whether you are an SMB IT operation or an enterprise level one, you tend to have very similar issues. The only difference sometimes is the scale of the infrastructure involved.

Now unfortunately unlike the Gestalt IT Tech Field day, the EMC Customer Council is governed by a non-disclosure agreement which means I cannot blog about any of the content discussed. However it’s a small price to pay when you get invited to an extremely well organised, well attended event where all parties involved get something out of it.

It’s easy to see why companies are starting to catch on to the benefits of engaging the customer community directly. In some instances the community becomes a self help group of sorts as well as an alternative marketing channel for a vendor. I often see “a community” leading the way with product information awareness, problem resolution, best practice and procurement advice. The VMware community stands as one of the best examples of this. There is a wealth of information out there and it’s not hard to find if you ever need to go looking. In fact if you use Twitter or subscribe to an RSS feed like PlanetV12n, more often than not the information lands in your lap without you needing to ever look for it.

I wanted to briefly cover off the Gestalt IT tech day. Stephen Foskett, the organiser and chief, recently set out on a mission to organise a technical field day that vendors would sponsor without the usual NDAs being in place, thus allowing the attending bloggers to write about what they saw until they couldn’t possibly write anymore. He did an exceptional job and I believe the experience didn’t put him off, as he’s already looking at organising Gestalt IT Tech Day 2.

Well the attending bloggers wrote post after post and there was lots of good stuff coming out from the vendor visits they participated in. This event is another good example of vendors engaging successfully with the community and everyone getting something out of it. The vendors get a chance to spread the word about their products and services and the bloggers get lots of technical content to put out there for their readers.  Everyone is a winner and that is exactly what a vendor event should be all about.

To read more about the Gestalt IT Tech day and sample some of the many articles written, click the link. What a Tech Field Day!

General, Gestalt-IT, Storage

Citrix Branch Repeater - WAN Acceleration / Branch office in a box

August 8th, 2009

I’ve been meaning to write about the Citrix Branch Repeater product for some time now, so a timely reminder to actually do this was the release of Citrix Branch Repeater V5.5. Earlier this year I attended a branch office infrastructure event run by Microsoft and Citrix in Edinburgh. This was the first time I had heard about this product, and I luckily had the chance to follow up my interest at the recent Citrix iForum in Edinburgh.

citrixbranchrepeater
Branch Repeater is the rebranding of the old WANScaler product which, in its simplest form, was a WAN acceleration product. The new Branch Repeater is still a WAN accelerator at heart; however Citrix have added some clever branch office features as well as some new features for XenApp customers. From a topology perspective, you basically place a larger repeater appliance in your data centre and additional smaller repeater appliances in your branch offices. I was actually surprised to learn that this is not the only option available; there is also a repeater software plug-in for use by remote users. The diagram below shows the basic topology overview.

screenhunter_01-aug-07-2210

Branch Office Operations 

One of the most interesting aspects of the new Branch Repeater product is the branch-in-a-box concept. You can purchase your Citrix Branch Repeater with Windows 2008 or Windows 2003 R2 built in. This allows you to use your appliance to deliver DHCP, DNS, WINS, AD, DFS as well as file and print services through the onboard hard drive. Support for Microsoft’s read only domain controller configuration adds to the package, allowing you to actively consider consolidating an entire branch office infrastructure into one appliance. Now it sounds like an appliance failure could have devastating consequences for your branch office and you’d probably be right. It was one of the questions I had for the Citrix consultants at the iForum, and they informed me that you can cluster two appliances together for HA resilience. That increases cost of course, but what price do you put on availability?

Citrix XenApp features

Citrix have added some nice features to encourage those of us who already use XenApp as a branch office delivery mechanism. ICA is already a very efficient protocol and Citrix have attempted to build on that with HDX IntelliCache and HDX Broadcast technologies. HDX IntelliCache allows local caching and de-duplication of ICA traffic across multiple ICA sessions. It also allows for the local staging of XenApp streamed applications if that’s a technology you utilise. HDX Broadcast on the other hand is the technology which optimises and gives granular control over the network elements of ICA. The list of individual features is quite extensive so I won’t reproduce it; you can check it out over at Citrix’s website by clicking the links above. The benefits of the Branch Repeater when used with XenApp probably depend on the number of XenApp users in a branch or your current use of the technology. A branch with a small number of users may not see a benefit that justifies the cost, however I can see immediate benefit if a branch office was to require expansion. Use these appliances and you probably wouldn’t need to change your WAN links. That has to work on the cost front!

Repeater Plug-in for Citrix Receiver

I mentioned the Repeater software plug-in earlier as this was one of the features that caught my eye, primarily because we have a lot of travelling Citrix users and home based users. This part of the product set claims to “overcome bandwidth and latency limitations on WiFi, broadband and 3G Connections” while also delivering that high definition experience (HDX). This in itself interests me enough to explore further, but then I find it also allows you to provide central administration of end devices covering software distribution and configuration settings. It works seamlessly with the Citrix Access Gateway product and other leading VPNs to optimise traffic within secure tunnelled network connections. All in all it sounds brilliant and potentially allows you to deliver improvements for users who work outside the branch office, something that is becoming more common every day.

Conclusion

I mentioned before that this is a WAN Accelerator product at heart, with nice new shiny add-ons to meet a number of customer requirements. I’m genuinely excited by this product as I think it has a place in companies’ global infrastructures, especially with remote data centres and Citrix based branch offices becoming more commonplace. I myself am going to find this hard to sell to my current employer, mainly due to some nasty issues we once had with another WAN Accelerator called Riverbed. However that was a long time ago and maybe the industry has moved on since then, so maybe it’s time to take a fresh look. Cost is the one thing I’m not 100% sure about at this point in time. There are a number of different models and it would appear that costs range from $5,000 for the branch side appliances to $11,500 for the data centre side appliances.

If anyone is using the Citrix Branch Repeater appliance, we’d love to hear about your experience of it and possibly you could clarify the cost element for us all.

Citrix, Gestalt-IT