Tuesday, April 26, 2011

The sales experience

The basic process for purchasing a NetApp SAN or NAS device is a familiar one. You purchase their hardware from a VAR through a quoting/bidding process. Once you have a quote entered with NetApp, no other VAR can get you lower pricing. Sort of.

When you purchase a NetApp device, you need to know that their prices are simply made up. You can purchase an $80,000 SAN configuration for $35,000 almost any time during the fiscal year. What's more, almost anybody can get this discount, even though NetApp makes it seem as though they are bending over backwards for you. When purchasing a real SAN from a real manufacturer, you expect a discount of less than 50%. The only time a competing manufacturer would even dream of giving you a SAN for half of its list price or less is if that SAN is end-of-life and coming to end-of-support soon.
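To put a number on that example, here is a quick sketch of the arithmetic in Python. The function name is my own for illustration; the two prices are the ones quoted above.

```python
def discount_pct(list_price: float, paid: float) -> float:
    """Return the discount off list price, as a percentage."""
    return (list_price - paid) / list_price * 100


# The example from the post: an $80,000 configuration bought for $35,000.
print(f"{discount_pct(80_000, 35_000):.2f}% off list")
```

That works out to a little over 56% off list, which is past the 50% mark I would expect only for end-of-life gear from anyone else.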

Speaking of prices, NetApp doesn't seem to actually publish any. Most hardware vendors keep their numbers quiet, but there are at least general figures you can find. Not for NetApp. The figures they give to their VARs appear to mean nothing, since they discount so deeply, and nobody appears to be publishing the original figures online. This isn't necessarily a bad business practice, but it does feel a bit shady.

NetApp's entire product line is built on the notion that their software adds value to their SAN hardware. To me, the value of a SAN is in the SAN itself. If the hardware can't stand alone and compete with other manufacturers' products, then it is not worth a second look. NetApp refuses to publish performance data about their SAN hardware. We are told this is because their hardware is best compared feature-to-feature and not IOPS-to-IOPS. They note that their WAFL technology and the onboard NVRAM that backs it mean their write operations are faster than the competition's, and their read operations are fast enough. I've actually found that in simple benchmarking scenarios, the NetApp falls flat on its face. More on the WAFL technology and why it's pointless in a later post.

Finally, their pre-sales help is pitiful. The expertise their pre-sales engineers have is minimal compared to the expertise required to operate a SAN in any environment. They only know what the sales brochures and online videos tell them, and nothing more. This particular situation bit me when a feature that was supposed to work one way actually performed a completely different function. That, to me, was a huge problem. And to make matters worse, I had asked the same question in their "forums", which resulted in the same misinformation. If you're going to say that a product provides a certain technical feature and you aren't a technical person, I will have much more respect for you if you tell me you'll have to check with someone with technical knowledge than if you just feed me a line of marketing B.S.

Thursday, April 21, 2011

Introduction Time

I am an IT team member at a small software company. I have been working in the computer field in one manner or another for approximately 18 years, with experience from small mom-and-pop shops to international corporations. Pertinent to this particular blog, I have had experience with SANs for around 5 years total. Most of these have been Fibre Channel SANs, simply because of Fibre Channel's historical bandwidth advantage over competing Ethernet.

So, that's my background. This is my environment. On the production side of the house, I have a blade center with six Windows 2008 R2 servers. Two of them are domain controllers in a sub-domain of my HQ environment, and the remaining four servers run SQL Server 2008 Standard. These four servers are grouped into two Windows clusters using the MS Clustering services built into Windows 2008 R2 and SQL Server 2008 Standard. Cluster one contains four SQL instances: Instance 1, Instance 2, Instance 5, and Instance 6. These instances run two-up on the two physical servers, which act as an active-active failover pair. The second cluster runs three instances: Instance 3, Instance 4, and a general purpose instance. One physical server in cluster two runs two instances and the other runs one, and these servers also act as active-active failover partners.
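For anyone who prefers to see the layout written down, here is a rough sketch of the two clusters as a Python data structure. The node names are hypothetical, and exactly which instances share a node is my assumption for illustration; only the instance counts per cluster and per server come from the description above.

```python
# Hypothetical node names; the instance-to-node pairing is assumed.
clusters = {
    "cluster1": {
        "node_a": ["Instance 1", "Instance 2"],
        "node_b": ["Instance 5", "Instance 6"],
    },
    "cluster2": {
        "node_c": ["Instance 3", "Instance 4"],
        "node_d": ["General purpose"],
    },
}


def load_after_failover(cluster, failed_node):
    """Return (surviving node, instances it must host) for a two-node cluster."""
    if failed_node not in cluster:
        raise KeyError(f"unknown node: {failed_node}")
    survivors = [node for node in cluster if node != failed_node]
    if len(survivors) != 1:
        raise ValueError("this sketch only models two-node clusters")
    # In an active-active pair, the survivor picks up every instance.
    hosted = sorted(inst for insts in cluster.values() for inst in insts)
    return survivors[0], hosted


for name, cluster in clusters.items():
    for node in cluster:
        survivor, hosted = load_after_failover(cluster, node)
        print(f"{name}: {node} fails -> {survivor} hosts {len(hosted)} instances")
```

The point of the sketch is simply that each server has to be sized to carry its whole cluster's instances by itself when its partner is down.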

My individual SQL instances run anywhere from six to 100+ databases, one for each of our customers. We group these databases by customer size and activity, so our six largest customers are grouped together, and our 100 smallest customers are grouped together. But that doesn't matter much in this context.

Our SAN connects to these physical machines through a Dell blade center interconnect using a pair of Brocade FC switches. Each server has two FC ports for redundancy, one connected to each of the switches. This gives us path resiliency to our SAN from the server.

Finally, there is the SAN itself. In our production environment we have a NetApp FAS 2050 with two filer heads, 20 SAS disks, and two FC ports per head. The SAN's performance is difficult to comment on, because NetApp does not publish IOPS numbers. They claim that this is because of their improved technology using the WAFL filesystem, which effectively makes all writes sequential. From what I've found during internal benchmarking, it's extremely easy to overwhelm the NVRAM backing the WAFL operations, and even easier to overwhelm the RAID-DP protection scheme it uses. When compared to a RAID-10 SAN, RAID-DP falls flat on its face performance-wise, but it does offer a cost benefit.
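The cost benefit comes down to how many disks are left for data. As a back-of-the-envelope sketch in Python: RAID-10 mirrors everything, so half the disks hold data, while RAID-DP gives up two parity disks per RAID group. The function names and the group counts are my assumptions, and the sketch ignores hot spares, root volumes, and formatting overhead, so real usable capacity will be lower.

```python
def raid10_data_disks(total_disks: int) -> int:
    """RAID-10 mirrors every disk, so half the disks hold data."""
    return total_disks // 2


def raid_dp_data_disks(total_disks: int, raid_groups: int = 1) -> int:
    """RAID-DP uses two parity disks per RAID group (group count assumed)."""
    data = total_disks - 2 * raid_groups
    if data <= 0:
        raise ValueError("not enough disks for that many RAID groups")
    return data


disks = 20  # the disk count in the FAS 2050 described above
print("RAID-10 data disks:", raid10_data_disks(disks))
# Assumption: one RAID group in total, or one per filer head.
for groups in (1, 2):
    print(f"RAID-DP data disks ({groups} group(s)):",
          raid_dp_data_disks(disks, groups))
```

Under those assumptions, the same 20 disks yield noticeably more data disks with RAID-DP than with RAID-10, which is where the cost argument comes from.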

In future posts on this blog, I will break down the technology behind the NetApp SANs and describe my horrible experience with NetApp as a company. I will keep things honest and as unbiased as I possibly can. But I will not be gentle. NetApp is, in my opinion, the WORST option for SAN technology on the market.