Big Data Analytics – Is metered consumption right for you?

What kind of analytics are we talking about?

I’ve watched with fascination as the computer industry has searched for ways to put a simple spin on data analytics. A long time ago some very smart marketing gurus decided they just couldn’t stick with anything that used the S-word (statistics). So we started seeing new terms like data warehousing, data mining, business intelligence, business analytics, and most recently Big Data. Although some of these terms focus more on the data and some on the tools, it has always been implicit that you need both to get any interesting answers. It is also interesting how rarely you see the word statistics in any modern discussion of analytics. It worked.

Most of the database management systems that have gained significant market share in the data warehousing space include query languages that can accomplish what I call basic counts and sums. How many customers placed orders this quarter? What was the value of bookings, billings and backlog on July 1st? Microsoft SQL Server also has functions for variance and standard deviation, but I rarely see them used or included in any reporting output. They mostly scare people. This type of numerical reporting is often all the business intelligence that an organization needs or wants to invest in. But what about businesses that are excited about the type of insight that comes from higher investments in analytics like machine learning and data mining?
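To make the “counts and sums” point concrete, here is a minimal T-SQL sketch against a hypothetical dbo.Orders table. The table and column names are illustrative only, but COUNT, SUM, VAR and STDEV are all built-in aggregates.

```sql
-- Basic counts and sums for a quarter, plus the rarely used
-- variance and standard deviation aggregates mentioned above.
-- dbo.Orders and its columns are hypothetical.
SELECT  COUNT(DISTINCT CustomerID) AS CustomersWithOrders,
        SUM(OrderTotal)            AS TotalBookings,
        VAR(OrderTotal)            AS OrderTotalVariance,
        STDEV(OrderTotal)          AS OrderTotalStdDev
FROM    dbo.Orders
WHERE   OrderDate >= '20150701'
  AND   OrderDate <  '20151001';
```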

The image to the right is clipped from the Wikipedia article on Machine Learning. That is just a list of the problem categories covered in the article. Each of those problem areas typically has anywhere from 3 to 10 different modeling or analysis approaches that are relevant, popular and useful depending on what you are trying to discover. Common clustering techniques include BIRCH, hierarchical clustering, k-means, and expectation-maximization (EM), to name only some. If your organization is interested in the demographic make-up of groups of customers with similar buying patterns, you could find value in trying, and potentially regularly using, several forms of clustering. Need to look at fraud detection? Then that will be another group of classification or anomaly detection techniques. I’m not trying to talk anyone out of the importance of big data analytics, but I want to see if I can help you proceed with a balanced and realistic approach.

What has to happen before the analytics begin?

This is the piece of the analytic process where the most expensive lessons get started.

People

I once saw an email signature from a SQL consultant that said:

“If you think hiring a professional is expensive, wait until you see what an amateur costs!”

If you have not identified the people who can understand your analytic needs and the appropriate class of techniques that can be used to gain insight, you shouldn’t start making decisions about analytic platforms. People wrongly assume that by now most analytic platforms are roughly on par for features. They are not. Start with the problem and knowledgeable people, and then evaluate platforms.

Data

From my experience, no one has data that just happens to be clean, grouped and in the right shape to immediately start doing analytics. It is still maddeningly complex to get data from multiple sources into a common, clean and appropriate structure before you can apply a data mining technique like classification, regression or a learning algorithm. Anyone who tells you it is simple is working with small, structured datasets of predominantly clean data. Before you make any big investments in software or even people, try to get a handle on what it would take to get data from all of your key systems into that shape. If you don’t have the skills in-house, contract with someone or some group to help identify what a good clean data set would look like and then try to assemble a sample that meets the requirements.
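As a small taste of what that preparation work looks like, here is a hedged T-SQL sketch that standardizes and de-duplicates customer records pulled from two hypothetical staging tables. Every table and column name here is an assumption made for illustration.

```sql
-- Standardize and de-duplicate customer records from two hypothetical
-- source systems; keep the most recently loaded record per email address.
;WITH combined AS (
    SELECT  LTRIM(RTRIM(UPPER(Email))) AS Email,
            CustomerName, SourceSystem, LoadDate
    FROM    staging.CRM_Customers
    UNION ALL
    SELECT  LTRIM(RTRIM(UPPER(Email))),
            CustomerName, SourceSystem, LoadDate
    FROM    staging.Web_Customers
),
ranked AS (
    SELECT  *,
            ROW_NUMBER() OVER (PARTITION BY Email ORDER BY LoadDate DESC) AS rn
    FROM    combined
)
SELECT  Email, CustomerName, SourceSystem
FROM    ranked
WHERE   rn = 1;   -- one clean row per customer
```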

Statistics Software

I used the S-word in a bolded title. What a relief. Even if you’ve never looked up a Student’s t score from a statistical table, you should check out the Wikipedia page on machine learning referenced above. The reason is, I want to emphasize that no software package or cloud service is going to have a complete set of all the tools and techniques referenced in that article. The good news is that most organizations will find a few techniques really useful and therefore don’t need to worry about the complete set. The bad news is that you may have already bought software or contracted for an online service that lacks something that would be really valuable to you.

The economics of metered consumption

Analytics is very much an iterative learning journey. What is the best environment to allow this iterative learning process to develop into something useful? How do contract cloud services help or hinder companies that want to move forward with big data analytics? I saw an interesting quote in the Economist recently:

“The shift to the cloud is the biggest upheaval in the IT industry since smaller, networked machines dethroned mainframe computers in the early 1990s.” full article here

One of the primary reasons for the shift away from the mainframe to smaller networked computers was the economics of metered consumption. I worked at the Electric Power Research Institute in Palo Alto from 1990-1994 managing projects that moved analytics and forecasting software from FORTRAN on mainframes to new PC-based C++ applications. Economists and statisticians working to evaluate the potential impact of investments in energy efficiency were severely impacted by mainframe charge-backs for their usage during model development and iterative enhancement.

Today’s big data dwarfs what we were working with in the early 90’s. Plus, the variety and sophistication of approaches that have proven effective in multiple problem domains means that freedom to explore is even more important. It is not uncommon for a data scientist to try 6 different models using various variable transformations and/or different techniques in order to test the stability and relative robustness of the results. This is not considered unscientific given today’s capabilities and experience, assuming the practitioner can reliably interpret the results. Data scientists are still very much aware that using the wrong approach for the data and the questions being asked can have serious financial implications.

The other trend that is likely to expose the limits of metered analytic services is the increasing connectedness of data. In a report from the SAS Institute and the MIT Sloan School of Management, the authors identified a group of Analytical Innovators that are 3 times more likely than other groups to use most or all of their data for analytics. Metered storage and networking are also potential impediments to expanding the breadth and depth of analytics usage when using contract off-premise services.

Don’t get me wrong, I’m all for having full economic signals about the total cost of ownership for IT investments. The models for comparing IT ownership vs contract services are way more complicated than you might imagine. The interesting comparison is that with a metered service the cost of a minute of consumption is typically constant, while with IT ownership the cost per minute of usage goes down until you reach full utilization. If you plan to be an analytic innovator, how does the cost of IT ownership at full utilization compare to what the same level of usage costs in a metered service? How does the effect on monthly expenditure change the willingness of business analysts to run a bigger sample of customers or try an innovative approach to a particularly challenging question? I saw what happened when mainframes ruled the realm of analytics, and the promise of escaping high marginal costs of usage gave way to a new type of IT ownership that was empowering for a fixed budget.

Delegating Backup Operations for SQL Server

I had to really think about the title for this article because database backup and recovery is so important to every DBA. If you’ve read this far you are either intrigued or furious about the first word being derived from:

DELEGATION – the assignment of responsibility or authority to another person to carry out specific activities.

I understand that this can be an emotional topic of discussion. DBAs are continually exposed to the message that they are ultimately responsible when data loss occurs – full stop. DBAs lose their jobs over these things. Any suggestion that the responsibility for data protection can be shared with others outside the DBA team can create anything from a vigorous discussion to a complete lack of response. I was asked to attend a meeting with a large EMC account a couple of months ago to discuss expanding the use of a backup product the customer is using to include Oracle and SQL Server databases. There were several Oracle DBAs there. They asked very few questions. The SQL DBAs didn’t attend. That was unfortunate. The meeting organizers were really frustrated. We missed an opportunity to share information and get any concerns out in the open.

Let me break down the title and explain why I think the wording is useful to expanding the discussion of how to handle SQL Server data protection.

Delegating

I chose this because I want to highlight the need for having team responsibility for data protection. We have to change the mindset that it is ONLY the DBAs that are responsible for SQL data protection. Here is an example I found from a recent blog article on this topic:

“When you’re in charge of things like DR, RTO, RPO, etc., it’s a bitter pill to swallow to cede control of the database backups to someone else and/or lose control of the backup processes.”

Delegation implemented correctly does not have to result in ceding or loss of control. Having clearly identified responsibilities is a good thing. In my experience, having multiple teams working together through delegation is not inconsistent with this goal. The key is restricting the delegation to well defined activities. I’ll talk more about that below under the Operations heading.

Backup

I love the saying “Don’t have a backup strategy, only plan for recovery”. I’ve heard a couple different forms of this theme but I’m not sure where it originated. So why did I choose to only highlight backup in the title? It is the best candidate for delegation and automation. Recovery is a much more situationally complicated operation. Many organizations have complete run books for recovering some or all of their IT services, especially when it involves a disaster recovery plan. I recommend having as much recovery automation developed, documented and tested as you can afford based on competing needs for IT staff. It is a hard investment to get a good cost/benefit estimate for.

The choice of tools used for SQL Server backup needs to be vetted by the DBA staff, since you can count on sooner or later encountering a critical event that warrants the DBA team taking a non-scheduled backup or performing a manual restore that can’t wait for operations staff to get involved.
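As a concrete (and hedged) illustration of the kind of ad-hoc work DBAs need to retain the ability to do, here is a minimal T-SQL sketch of an out-of-band backup and a side-by-side restore. The database name, logical file names and paths are all assumptions.

```sql
-- Ad-hoc full backup that does not disturb the scheduled backup chain.
-- SalesDB, its logical file names, and the paths are examples only.
BACKUP DATABASE SalesDB
TO DISK = N'\\backupshare\adhoc\SalesDB_copyonly.bak'
WITH COPY_ONLY, COMPRESSION, CHECKSUM, STATS = 10;

-- Manual restore to a side-by-side copy for investigation or object recovery.
RESTORE DATABASE SalesDB_restore
FROM DISK = N'\\backupshare\adhoc\SalesDB_copyonly.bak'
WITH MOVE N'SalesDB'     TO N'D:\Data\SalesDB_restore.mdf',
     MOVE N'SalesDB_log' TO N'L:\Log\SalesDB_restore.ldf',
     STATS = 10;
```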

Operations

SQL Server has a fixed set of operations related to backup and restore. Microsoft also provides APIs for automating all the functionality available through SQL Server Management Studio, plus other tools such as VSS assisted backup. How these functions are executed is irrelevant so long as they are done reliably, in the correct order and with the correct frequency. Setting RPO, RTO, and the definition of what events you are going to be able to react to (server outage, data center failure, etc.) is still the responsibility of the DBA team, application owners, and business users. I am not suggesting that there be any transfer of these activities to the operations team. Delegation of operations is restricted to the activities defined to achieve those goals. Operations is the regular execution of the tools or scripts to attain the required reliability. Shared responsibility for the outcome, individual responsibility for the definition and execution.

SQL Server

SQL Server has a complete set of native tools for backing up and restoring databases. It has TSQL and PowerShell scripting interfaces. It does not have a central management server for scheduling and tracking backup and restore settings, status and/or events. In the last 15 years I have seen numerous organizations spend significant resources on custom development to cover the gap between what was needed and what is native to SQL Server. I have also seen very large organizations invest nothing in tools but instead have a large staff of people managing data protection on a largely individual basis (you cover those 10 servers and I’ll get these 10), and almost everything in-between.
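To show what “tracked per instance but not per enterprise” looks like, this is a small T-SQL sketch of the kind of query the custom tooling described above has to run, and then aggregate, across every instance. It uses only the built-in msdb history tables.

```sql
-- Most recent full backup per database on this instance, from msdb history.
SELECT  d.name                    AS database_name,
        MAX(b.backup_finish_date) AS last_full_backup
FROM    sys.databases AS d
LEFT JOIN msdb.dbo.backupset AS b
       ON b.database_name = d.name
      AND b.type = 'D'            -- D = full database backup
WHERE   d.database_id <> 2        -- tempdb is never backed up
GROUP BY d.name
ORDER BY last_full_backup;
```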

The other thing I’ve seen frequently is the use of 3rd party tools that manage protection only for SQL Server or a small number of RDBMS products. These tools typically include the central management server resource lacking natively in SQL Server and reduce or eliminate the need for large amounts of TSQL, PowerShell or .NET development. However, this still leaves the challenge that when each group in the enterprise wants to use a domain specific tool for data protection, it is extremely difficult to execute a coordinated recovery when something goes wrong. When the DBA team says “I don’t need any 3rd party tools to manage SQL Server backup and restore,” that is technically correct but oftentimes institutionally short sighted. Someone else in the organization is responsible for restoring systems that rely on SQL Server. Having to coordinate many teams with specialty tools to put the pieces of a puzzle back together is time consuming and error prone.

This brings me back to the story I told at the top of the article. The organization I went to meet with was trying to consolidate as many of their backup needs into a single product as possible. They have Windows, Linux, VMWare, HyperV, SQL, Oracle, and SAP, just to name the big needs. They had started with server OS and desktops and learned a lot about the tool. Now it was time to see what else could be brought into a common framework. I was there to talk about the topics that I put in this article. I had a PowerPoint and decided to fill in the talk track in this article. Almost there.

Summary

  • Large organizations cannot afford to rely on 5-10 or more domain specific backup and restore tools and still deliver a manageable restore service level.
  • DBAs need to keep responsibility for RPO, RTO and acceptable data loss criteria when moving to a delegated data protection model.
  • DBAs cannot impede the move to centralized operations out of an unwillingness to delegate driven by fear of losing control. There needs to be a team structure with appropriate responsibility and delegation.
  • DBAs need to have access to scripting tools and GUIs, preferably integrated with SSMS if possible. They cannot insist that it be identical to the tools they use for SQL Server only data protection.
  • Domain specific users like DBAs need to have manual override processes for on demand backup and restore when emergencies arise.
  • Backup should run largely with automation and little need for user intervention. Restore is highly situationally dependent and will often times involve data collection, team review and decision cycles.

Next Steps

  • With the SQL PASS Summit 2015 right around the corner, I thought this was a good time to tackle a really emotional topic like this and see if I can get some productive conversation started. I’ll be in the EMC booth Wednesday through Friday. Come by.
  • If you are a DBA, think about which things you absolutely feel you can’t delegate and which things you can. If your organization already uses a backup tool with SQL Server integration, investigate what the scripting capabilities are like and what kind of integration with SSMS there is.
  • If you are responsible for enterprise data protection talk to the DBA teams about what tools you have or are planning to acquire. Get someone from the team to evaluate the integration with SQL Server (scripting and GUI) and give you feedback. If the tool looks good, have a DBA demo it to the rest of the team, especially paying attention to how to do on demand backups and restores.
  • Please reply to this article and let me know how it goes. That way we can start to share some lessons.

Thanks for reading!

Phil Hummel – @GotDisk

Microsoft SQL as-a-Service gets real with FEHC 3.1!

This is a guest post from my colleague at EMC Eyal Sharon (@eyalglobal). The views and information expressed here are those of the author. They have not been reviewed or approved by Microsoft or EMC.

Agile, Simple, Secure and Automated: this is what the Federation Enterprise Hybrid Cloud (FEHC) marketing blurb promises. The EMC Enterprise Applications Solutions Group has worked very hard to achieve what I’m sharing with you here. I am so proud of the engineering teams that made it real. In this post I would like to take you through the latest capabilities we are going to introduce very soon with the release of Microsoft Applications as-a-Service in EMC Federation Enterprise Hybrid Cloud 3.1.

So what’s in there?

Microsoft applications package for FEHC 3.1 provides:

  • On-demand deployment of Microsoft Exchange, SQL Server and SharePoint instances
  • Blueprints that you can import into vRealize Automation Application Services
  • Monitoring capabilities for Microsoft applications via Hyperic with vRealize Operations
  • DBaaS service offerings for SQL Server 2008 R2, 2012 and 2014
  • Automated backup of Exchange and SharePoint resources upon provisioning
  • Automated backup and restore for Microsoft SQL Server, including an option for native SQL backup to Data Domain
  • Support for on-demand backup of Microsoft applications
  • Continuous availability and disaster recovery enablement
  • Support for several Windows OS and application server releases

Let me explain a little further and walk you through the architecture. FEHC has a number of moving parts, which are all well-orchestrated. I will be focusing on the Microsoft applications services we have built and on SQL Server services in particular. All of the components that can be integrated in the solution are shown in the block diagram below. Obviously, each implementation will only use a subset of these with EMC and VMWare software providing the abstraction for storage, compute, and networking. This gives IT organizations flexibility in choosing the infrastructure components that best suit their needs while still enjoying all the benefits I am describing.


Let’s go through a few workflows and screenshots to demonstrate some of the capabilities introduced earlier:

Provisioning

Application provisioning in this solution consists of four phases. First the application architect creates an application blueprint and then publishes this blueprint into vRealize Automation (vRA). The entitlements and approvals are added, and the application gets added to a self-service catalog. Then a user with the appropriate permissions requests the application from the self-service catalog.

The approver approves the request and the deployment process begins. Finally, the application admin verifies the completion of the deployment process and has full control of the application.


The service catalog is quite rich and the user can choose from several blueprints, for example SQL Server Standard, Enterprise or even Express edition on a SATA or Flash storage tier. We even support SQL Server 2008 R2 along with 2012 and 2014.


Once the desired catalog item is chosen, there are several SQL Server attributes that can be typed in beyond your standard CPU, RAM and hostname, like max memory per query, max worker threads, etc.


You can even automate the deployment of an AlwaysOn Availability Group. The automation creates a Windows Failover Cluster, creates the desired primary and secondary databases and configures the SQL AlwaysOn availability group. Once the deployment has completed you can connect to one of the virtual machines and view the primary and secondary database replicas.
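Under the covers, the automation is doing roughly what a DBA would script by hand. The following is a hedged T-SQL sketch of that last step, creating the availability group across two replicas; the group name, server names and endpoint URLs are hypothetical, and the Windows failover cluster, database mirroring endpoints and a NORECOVERY restore of the database on the secondary are assumed to already exist.

```sql
-- Run on the intended primary replica. All names are examples only.
CREATE AVAILABILITY GROUP AG_Sales
FOR DATABASE SalesDB
REPLICA ON
    N'SQLNODE1' WITH (
        ENDPOINT_URL      = N'TCP://SQLNODE1.demo.local:5022',
        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
        FAILOVER_MODE     = AUTOMATIC),
    N'SQLNODE2' WITH (
        ENDPOINT_URL      = N'TCP://SQLNODE2.demo.local:5022',
        AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
        FAILOVER_MODE     = AUTOMATIC);

-- Then, on the secondary replica (SalesDB already restored WITH NORECOVERY):
-- ALTER AVAILABILITY GROUP AG_Sales JOIN;
-- ALTER DATABASE SalesDB SET HADR AVAILABILITY GROUP = AG_Sales;
```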


Operations Granularity

The Microsoft SQL Server operations enabled in FEHC application services are very granular. They cover a lot more than just standing up a VM from a SQL template. Many of the regular operations any DBA would be happy to delegate to the application owner are also included. Using the delegated services requires that the user has been granted sufficient permissions to make changes on these VMs, instances or databases.

A user can trigger workflows that would (see the sketch after this list):

  • Create/Delete SQL Server instances
  • Create/Delete SQL Server databases
  • Add/Remove databases to/from an existing AlwaysOn Availability Group
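Behind those catalog items, the workflows end up issuing statements along these lines. The object names are hypothetical, and this is only a sketch of the T-SQL the automation would run with the delegated permissions.

```sql
-- Hypothetical names; run with the permissions granted to the delegated user.
CREATE DATABASE SalesDB2;                                    -- create a new database

-- Add it to an existing availability group (on the primary; assumes the
-- database is in full recovery and has an initial full backup).
ALTER AVAILABILITY GROUP AG_Sales ADD DATABASE SalesDB2;

ALTER AVAILABILITY GROUP AG_Sales REMOVE DATABASE SalesDB2;  -- later, remove it from the AG
DROP DATABASE SalesDB2;                                      -- and delete the database
```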


Monitoring and Alerting

vRealize Hyperic is used in this solution to monitor Microsoft applications on the Federation Enterprise Hybrid Cloud, and vRealize Operations provides the monitoring UI. Performance and system health can be monitored using the many counters supported in this architecture. For example, you can monitor:

SharePoint Responded Page Request Rate, SharePoint Executing Time/Page Request, SQL Server user connections, SQL Database Buffer Cache hit ratio, SQL Lock Waits. There are many more supported metrics as documented in the VMware vRealize Hyperic resource guide.
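For SQL Server, the same kinds of metrics the Hyperic agent collects are also exposed inside the engine itself. A quick sketch using the built-in performance counter DMV (note that ratio counters such as buffer cache hit ratio need their matching “base” counter to compute a percentage):

```sql
-- A few of the SQL Server counters mentioned above, straight from the engine.
SELECT  object_name, counter_name, cntr_value
FROM    sys.dm_os_performance_counters
WHERE   counter_name IN (N'User Connections',
                         N'Buffer cache hit ratio',
                         N'Buffer cache hit ratio base',
                         N'Lock Waits/sec');
```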


Backup and Recovery

Backup-as-a-Service for Microsoft applications in FEHC 3.1 automates the installation of EMC Avamar Agents/Plugins during provisioning and provides an on-demand backup and restore of Microsoft resources. It even empowers the application owner to create and manage native backups done through SQL management directly to an EMC Data Domain target.


AlwaysOn Availability group protected databases are also supported in the backup solution if desired.  In addition to on-demand backup, there’s also a very powerful option for on-demand RESTORE of SQL Server services (surprised?)! Care must be exercised when performing on-demand restore to avoid unintentional and devastating results. I truly believe that the leading use case for this feature in the SQL services space would be related to DevOps rather than business critical LOB applications. This process is shown below. There are several useful options including out-of-place restores.


Continuous Availability and Disaster Recovery

The main component that provides continuous availability in this architecture is based on EMC VPLEX Metro. I won’t get into details of that product but let me say that all storage components in FEHC can be protected by VPLEX distributed volumes, including Microsoft application VMs. This means that application data is available on two sites simultaneously at all times. This architecture allows a storage-consistent restart capability with minimal or no data loss if a failover occurs. The virtual machines will also retain their network configuration, so there is no need to change any networking or name resolution in that scenario. This ensures full automation for an unplanned failover. Note that SQL AAG in this solution is designed primarily to provide local site availability. Multi-site protection is transparently provided by EMC VPLEX. Disaster Recovery protection in FEHC is orchestrated by VMware vCenter Site Recovery Manager (SRM) and its integration with EMC RecoverPoint.

As with all IT products, there’s still a lot of room to automate and simplify application related operations in FEHC to provide a better user experience and the truly agile IT that businesses are searching for. While all the related documentation, demos and packages are still being finalized, we are already working on the next release, which will include even more exciting features for Microsoft SQL, SharePoint and Exchange that I can’t share because they are still being discussed.

More information about the EMC Federation Enterprise Hybrid Cloud Solution can be found at EMC Federation – Solutions to Help Win in the Digital World

Stay tuned and please feel free to ask questions and provide feedback.

Eyal Sharon

EMC Solutions Group – Databases Solutions

Why are you taking snapshots of databases?

Introduction

I was searching online recently and saw this headline clipped from the most recent Gartner Magic Quadrant for Enterprise Backup/Recovery Software (June 2015):

Gartner states that by 2016, 20% of organizations, up from 12% today, will employ only snapshot and replication techniques, abandoning traditional backup/recovery for the majority of their data…

I realize that structured data is maybe only 25% or less of what the typical enterprise is trying to protect, however, the fact that this many organizations are committed to a “snapshot/replication first” strategy is a big deal for anyone that manages large databases or large numbers of databases. Technologies that are applicable across multiple system domains are an important strategy for managing IT ownership and operational costs.

I heavily relied on intelligent storage array snapshots for data management associated with SQL Server proof-of-concept projects I conducted with customers during my 10 years at the Microsoft Technology Center in Silicon Valley. It was a big advantage to customers that were trying to minimize time in the lab and maximize results from their collaboration with Microsoft. Even after the SQL Server product team implemented database snapshots in the 2005 release, there were aspects of array based snapshots including read/write access and the ability to efficiently manage many snapshots per database that kept us using them in many projects.

I have heard a number of very knowledgeable SQL DBAs and developers refer to storage array snapshots as “SAN voodoo”. This is the first in a series of articles I am writing to reduce or eliminate the knowledge gap between how SQL Server database snapshots are implemented vs snapshots created on storage arrays. Not all storage arrays implement snapshots in the same way; however, there are some common patterns that you’ll see used. In this article I use some examples from EMC’s VNX and VMAX products. I currently work for EMC and would not want to accidentally misrepresent the implementations of snapshots on another vendor’s product. The views and information expressed here are my own. They have not been reviewed or approved by Microsoft or EMC. See the disclaimer on the website https://gotdisk.wordpress.com.

Snapshots of databases and other oddities

My wife is a photographer. When I told her about what I was working on she asked “snapshots of databases – what?” I was hoping she would help me come up with a clever title for this topic but she had to get to school – she is also a teacher. I did a quick Bing search (yes, I said Bing search) for database snapshots and for the heck of it clicked on the Images tab. I found this one and thought it would help me make a couple of points.

  • The term database snapshot has many meanings.
  • Taking pictures of white boards can be useful to show that there was a meeting.

In this article I want to compare and contrast the implementations of:

  • SQL Server Database Snapshots, and
  • Intelligent storage array snapshots

Since research shows as many as 80% of database teams and storage teams don’t collaborate, I’m hoping articles like these will be interesting to both communities and start some dialog.

There are a lot more similarities than most storage or database professionals might expect between SQL database snapshots and intelligent storage array snapshots. Please share this content with other teams in your organization and start a conversation. Challenge your organization to choose the best combination of SQL Server and intelligent storage array features to reduce IT cost and complexity.

I’m planning on extending coverage of this topic in future articles that will go into more depth on storage array snapshots starting with a deep dive on application consistent versus crash consistent (OS consistent) snapshots and some common myths about intelligent storage array snapshots. Please leave comments at the bottom and let me know what you would like to hear about.

In this article, I’ll first go into more details on the SQL Server database snapshot implementation and then compare and contrast that with a generic discussion of how storage array snapshots work. I’ll describe some specifics that I feel are relevant to many storage platforms. Remember, the intent of this discussion is to help storage and database administrators reach some common understanding of how the various features of applications and storage products can better solve business problems together.


SQL Server Database Snapshots

SQL Server introduced database snapshots in the 2005 release. There have been no significant enhancements since then, including in the current previews of SQL Server 2016. Some of the advantages identified when the feature was released included:

  • Protecting your system from user or administrator error
  • Offloading reporting
  • Maintaining historical data
  • System upgrades
  • Materializing data on standby servers
  • Recovering data

These use cases apply universally to all database “copy” technologies. Since snapshots are not full copies like a backup or a storage clone, they are generally very low overhead to create, but they don’t have all the protection features of full copies. This article should give you a good understanding of what makes snapshots unique among data copy options.

At the heart of all snapshot technologies is the unit of replication or chunk size (snapshot granularity) that will be utilized for managing changes. In the case of SQL Server database snapshots that unit is a single 8K page. The second implementation decision that is common to all snapshot architectures is allocating storage space for the additional data chunks that get generated as changes occur. SQL Server uses an NTFS file structure called a sparse file for storing pages generated during change tracking for database snapshots. An NTFS sparse file can be created with a very small footprint and then expand as new data is written without explicitly setting a file size. Sparse files are also very efficient at handling data with large regions of zeros. I found this background article on Sparse Files if you are looking for a starting place for a deeper understanding.
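Creating a database snapshot is a single statement, and the sparse file location is whatever path you give it. A minimal sketch, assuming a source database named SalesDB whose data file has the logical name 'SalesDB':

```sql
-- The .ss file below is the NTFS sparse file that will hold the
-- pre-change copies of modified pages. Names and paths are examples.
CREATE DATABASE SalesDB_snap_20151001
ON ( NAME = N'SalesDB',                                -- logical name of the source data file
     FILENAME = N'D:\Snapshots\SalesDB_snap_20151001.ss' )
AS SNAPSHOT OF SalesDB;

-- Read-only queries can then reference the snapshot like any database:
-- SELECT COUNT(*) FROM SalesDB_snap_20151001.dbo.Orders;
```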

SQL Server allocates new space in the sparse file in 64K chunks to match the size of a data file extent (8 pages x 8K per page). If only one page in an extent is modified, the SQL engine creates a 64K allocation in the sparse file and puts the page in the same slot that it occupies in the source data file extent. The other page locations for the extent are written with zeros. As more pages in that source extent are changed, they are copied to the sparse file in the same slot order as the original extent. This should provide for more sequential I/O if multiple pages are copied to the sparse file and a read only query needs to retrieve multiple pages from that extent. I have not seen any discussion on how much this approach impacts performance, but the algorithm seems to be a good balance between creating unused space in the sparse file and giving some potential to cut down on purely random read access to the sparse file. There are a lot more implementation details of how SQL Server uses NTFS sparse files to manage efficient usage of disk space for database snapshots. Since databases that use SQL snapshots have a latency dependency on the sparse file, you need to factor the performance of that device into the design of any application that will include the use of snapshots.

In the picture to the right you can see the relationship between the source database and the sparse file. There is only one database page that has been modified since the snapshot was created. Application users often times refer to a SQL snapshot as if it were a separate copy of the data and use terms like “the snapshot copy”. As this picture shows, there is no “snapshot copy”. There is only the current database and a separate storage area (the sparse file) that holds the original copies of the database pages that have changed since the snapshot was created. SQL Server database snapshots are implemented with read-only access support. You cannot write to a SQL Server snapshot. Users can read from and write to the source database. Writes to the source database will update the pages (shown on the left of the picture). When a page is written to in the source database, SQL Server:

  1. Puts a latch on the page.
  2. Copies the unmodified page to the sparse file.
  3. Modifies the source page.
  4. Releases the latch.

Since the duration of the latch includes the time it takes to make a copy of the page in the sparse file, the latency of the “copy on write” operation adds to overall transaction latency. The time that SQL Server holds a latch on the page may also result in increased wait time for any other operation that wants to access that page.

The picture shown to the left represents two points in time. The right side is after 30% of the pages have been modified in the original database and the one on the left shows when 80% of the original pages have been modified. The sparse file size is approaching the size of the original database over time.

Notice the blue line down the center of each drawing with the ‘Read operation on the snapshot”. Read activity on the snapshot is moving from the original database file locations to the sparse file. The performance of queries on the snapshot is going to be impacted by the performance of the storage supporting the sparse file.

SQL Server supports multiple snapshots on a single database. Each snapshot will need to be allocated a unique sparse file at the time of creation. In the copy on write process, a separate read only copy of each original version of changed pages will be written to each sparse file for all the active snapshots. The impacts of maintaining parallel updates for multiple snapshots should be evaluated carefully before implementation on any database.
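If you want to see how large those sparse files are actually getting, the engine exposes the on-disk size directly. A hedged T-SQL sketch, assuming a source database named SalesDB:

```sql
-- Snapshots of SalesDB and the actual space used by each sparse data file.
SELECT  s.name                            AS snapshot_name,
        mf.physical_name                  AS sparse_file,
        fs.size_on_disk_bytes / 1048576.0 AS sparse_file_mb
FROM    sys.databases AS s
JOIN    sys.master_files AS mf
        ON mf.database_id = s.database_id
       AND mf.type_desc = N'ROWS'
CROSS APPLY sys.dm_io_virtual_file_stats(s.database_id, mf.file_id) AS fs
WHERE   s.source_database_id = DB_ID(N'SalesDB');
```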

The existence of dedicated sparse files for each snapshot permits matching the storage performance to the use of the snapshot. If DB01 needs a low latency snapshot, we could place that sparse file on local flash drives. If the performance impact from using snapshots with another database is less critical, you could place that NTFS sparse file on less expensive storage with lower performance.

SQL Server snapshots cannot be refreshed. To obtain a more recent snapshot, delete the old snapshot and create a new one. SQL Server database snapshot names must be unique, but you can reuse a name after the older copy is deleted.

SQL Server database snapshots are given a database ID like any user database. Therefore the total number of databases plus snapshots cannot exceed the limit of databases per instance (32,767 in recent versions).

Intelligent Storage Array Snapshots

In the first part of this article I discussed how SQL Server implemented three important aspects of data snapshot management.

  1. Chunk size for managing change data, also known as snapshot granularity
  2. Space allocation for managing the data change differences.
  3. Read/Write access

Snapshot Granularity

There is no standard chunk size across the storage arrays that I am familiar with. For the latest version of the EMC VNX line, the snapshot granularity is also 8K. Snapshot granularity on the EMC VMAX 3 is 64K. Setting a size for snapshot granularity involves making a tradeoff between the amount of space overhead consumed (small chunks are better) vs the amount of resources needed to manage the metadata for the total number of snapshots supported by the array (big chunks are better). Remember, for SQL Server database snapshots, space is allocated in the sparse file in 64K chunks (1 extent) while data is written to the sparse file in 8K (1 page) chunks using copy on write.

Snapshot Space Allocations

Space allocations for snapshot management on most modern storage arrays are made from the same pool of storage that is used for the source for the snapshot (see VNX Snapshot graphic below). This wasn’t always the case. You should check with the array vendor to determine how it is implemented on the equipment you are using or evaluating. When I was first learning about array based snapshots in the early 2000s you had to create a dedicated set of disks for managing snapshot change data. This pool of storage is directly comparable to the role that NTFS sparse files play in support of database snapshots for SQL Server.

The picture on the right shows a comparison of two different approaches to managing snapshots. These examples are options as implemented on the EMC VNX series of intelligent storage arrays. Other arrays may use different approaches.

The left side of the image labeled SnapView Snapshots is an older method that uses a dedicated storage pool for change block management. The blue box represents the Reserved LUN Pool that would be created using physical disks on the array typically dedicated to snapshot management. The RLP serves the same purpose as the NTFS sparse file for SQL Server snapshots, however, there are some significant differences which I will discuss next.

A Reserved LUN Pool (RLP) on VNX is used for all snapshots created on the array, compared to one NTFS sparse file per snapshot with SQL Server. The size and resources for the RLP need to match the expected needs of reading and writing to snapshots. Each array that implements snapshots will have guidance on sizing and managing space and performance. Array snapshots that use an RLP typically use the same COW process that SQL Server does. You can follow the steps for the COW process for VNX in the red numbered boxes in the SnapView Snapshot graphic on the right. Therefore, just as with SQL Server snapshots, the latency of writing to a LUN that is using snapshots will be impacted by the performance of the storage location in this type of implementation.

The right side of the image above, labeled VNX Snapshot, shows an example of how snapshots are implemented on most current storage arrays, including the EMC VNX and VMAX product lines. The storage for managing changes for LUNs using snapshots is the same pool that the source LUN is allocated to. The process for handling changes is called Redirect on Write (ROW) and is shown in the graphic by the two boxes with red numbers. Straight away you can see there are fewer steps, which is often a good thing when writing data!

In advanced snapshot implementations, both the source LUN and any snapshot copies are implemented as a set of pointers to a group of storage locations that hold the unique data for that copy. The simple example diagrammed in the image on the right shows a source LUN and two snapshots. The source LUN has pointers to 4 allocations, so each snapshot needs the same number of pointers. At the time the first snapshot is created and before any changes are made, the source LUN and the snapshot have pointers to the same 4 allocations (A,B,C,D). Sometime after the purple snapshot copy is created, the A allocation is written to on the source LUN. The storage processor marks the modified (A) data as requiring a new allocation (A’) and the pointers for the source LUN are updated to (B,C,D,A’). The data does not have to be physically written to disk until there is cache memory pressure. This process replaces the COW process used for RLP snapshots and SQL Server snapshots, where copies of the original data are written to disk in the RLP or sparse file before the source LUN can be updated.

Also shown in the picture to the right is the implementation of multiple snapshots on one source LUN. The orange snapshot (Snap2) was created after the purple snapshot (Snap) and after changes to the source LUN created the (A’) allocation. The data stored in the (D) allocation was changed on the source LUN after Snap2 was created. The source LUN now points to the allocation set (B,C,A’,D’). The allocation set for Snap is unchanged.

Notice that the (D) allocation was modified after both snapshots were created. The source LUN needed the (D’) allocation to keep the current copy of that data but both snapshots share access to a single snap copy of the (D) allocation. In the case of SQL Server snapshots two copies of the (D) allocation would be made, one for each of the sparse files associated with the two snapshots.

Read/Write Access

Another important distinction between most storage array based snapshots and SQL Server database snapshots is the ability to have simultaneous read/write access to multiple snapshots. Both VNX and VMAX support 256 snapshots for each LUN that are fully writeable. Based on the change management discussion above, the same process can be used to make writeable snapshots. If a user accessing the first snapshot (Snap) wants to write to the (D) allocation, the storage controller creates a new allocation in memory (D”) and writes the changes there. Then the allocations for the source and 2 snapshots would be as follows:

Copy         Allocations
Source LUN   B,C,A’,D’
Snap         A,B,C,D”
Snap2        B,C,D,A’

Summary

  • SQL Server database snapshots provide read only copies of databases from local storage that supports NTFS sparse files.
  • Creating multiple snapshots on a single database requires multiple copies of snap copy pages.
  • Array snapshots are dependent on the specific vendor implementation.
  • Most array based snapshots are writeable.
  • The snapshot granularity is comparable between some array based snapshots and SQL Server database snapshots.

Next Steps

  • I will be continuing this series with more articles on snapshot technologies for SQL Server data management, check back or follow this site.
  • If you are a DBA, contact your storage team and learn more about how array based snapshots are used in your organization
  • If you are part of a storage management team, reach out to the SQL DBA team and see if they are using SQL Server database snapshots and learn more about the business needs they are supporting.

Thanks for reading

Phil Hummel - @GotDisk

San Jose, CA

IT Spending Forecast – Hyper Cloudy with a Chance of Clearing

Cloud Computing may one day be recognized as the most successful marketing concept ever created. It can mean both everything and nothing at the same time. This produces confusion in the uninitiated and so creates space for products and consultants to step in and solve problems that people may not have known they had.

I was intrigued by the similarity of three conversations that I have been involved in over the last couple of months. All three were focused solely on data protection for NEW data centers that are being built and operated by enterprises solely for their own use. That doesn’t mean these organizations are not actively pursuing other strategies involving cloud services, but it reminded me of how little media attention is being given to “green IT” or “the data center of the future” concepts. Just 5 years ago these were commonly heard terms, but they have been largely crowded out of the news by anything with the term cloud appended to it.

Like many good marketing concepts, cloud computing was invented to replace something that was both familiar and had significant baggage – outsourcing. What is brilliant about this particular invention is that what started out to include most of the traditional IT outsourcing options like software-as-a-service, platform-as-a-service, etc, has expanded to also mean a new way to run private data centers as a private cloud. Any attempt now to suggest that cloud computing is a retreaded outsourcing play can be countered with “but it also includes private cloud which is not outsourcing”. Hence one more reason for the earlier statement that cloud means everything and nothing at the same time.

Since IT outsourcing has a long history, how much of an impact is the shift to “cloud computing” marketing having on IT spending? The ambiguity of the cloud computing amalgam is never more apparent than when trying to answer the question “how much is being spent on cloud offerings?” I have been following the Amazon Web Services vs Azure competition for a while now by trying to compare data on relative revenue. Comparing profit is just impossible. Both companies have been guarded in the way they talk about their cloud offering revenue. I wanted to see whether I could find data that would indicate whether the high growth rates in reported cloud spending are having an impact on plans for data center construction and upgrades.

The numbers shown in this article are a sample from all the publicly available sources. I have included links with each table and graph back to the sources. The Gartner Group publishes estimates of worldwide information technology spending that are readily distributed and quoted by a number of reputable publications. The total global spending estimate for 2015 is $3.828 trillion ($3,828 billion), broken down as follows:

Spending Category       2015 (Billions)   % Growth from 2014
Devices                 732               5.1
Data Center Systems     143               1.8
Enterprise Software     335               5.5
IT Services             981               2.5
Telecom Services        1,638             0.7
Overall IT              3,828             2.4

Source Gartner Newsroom http://www.gartner.com/newsroom/id/2959717

While all 5 components of the total are growing, two of the five categories – devices and enterprise software – are up more than 5% from the 2014 numbers. Since the Enterprise Software category includes both cloud (outsourced) software and on-premise software, we can’t tell how the mix is shifting from these data. We also can’t tell from the category labels above where the other components of cloud computing like PaaS and IaaS are contributing. The chart below gives us details about the public cloud market that we can use for comparison to the table of global IT spending above.

Forbes published the results from a survey of global cloud computing forecasts and market estimates for 2015. Their work shows that the estimated SaaS spending for 2015 is $78.43 billion. That is a surprising 23.4% of the global Enterprise Software category above. Forrester estimates total enterprise software sales for 2015 at $620 billion and SaaS spending at $105 billion, or 17% of total enterprise sales. Since I only had access to the publicly available summaries, I cannot comment on what might be the differences in accounting categories or methodology. This comparison between global enterprise and SaaS software sales also assumes that the majority of SaaS sales are made to enterprises. The Forbes estimates for the PaaS and IaaS components are an extremely low percentage of global IT spend, so I won’t try to figure out which Gartner category they contribute to for comparison.

Using these data as a good estimate of global IT and SaaS sales, let me compare them to the two players getting the majority of the press coverage for cloud, AWS and Azure. In January 2015, Microsoft stated that commercial cloud revenue growth had been in the triple digits for the sixth consecutive quarter, reaching an annualized revenue run rate of $5.5 billion. Analysts who compared AWS revenue to Microsoft’s reporting noted that most of the revenue in the Microsoft cloud accounting comes from SaaS and only a small percentage comes from IaaS. Amazon looks like it is projecting AWS revenue of $6 billion for 2015, primarily from IaaS. Despite rapid year over year growth for the IaaS market, it is hard to see when the total spending will be a significant share of global IT spending.

There is a lot of competition in the global SaaS market, both across different categories and with lots of players in some of the major categories such as analytics-as-a-service, ERP and CRM. Analysts are also predicting that the downward pressure on revenue per seat for SaaS offerings is going to continue and/or accelerate, leading to the need for even greater market share to maintain revenue. This is good for consumers, but we should expect price-per-seat declines to slow, or even turn into increases, when some competitors get forced out by the price wars.

Surprising to me was the projected 1.8% increase in Data Center Systems spending over last year. While just a short 5 years ago “green data center technology” and discussions of “the data center of the future” were common in IT industry publications, that coverage has been almost completely overshadowed by cloud computing coverage, despite the fact that more is projected to be spent this year on Data Center Systems than on all components of public cloud computing combined.

Conclusions

  1. The focus on “following the money” is becoming more visible in discussions of cloud computing especially for publicly traded companies.
  2. The changing composition of the SaaS market is going to be fascinating to watch over the next five years. If the growth projections are anywhere close to accurate the best and brightest are going to be slugging it out for innovation and share.
  3. Anyone expanding or going into the use of SaaS should look at the historical experiences and understand the terms of agreements and the growing practice of using 3rd party data escrow services for added protection from unexpected changes in the market.
  4. The relatively flat and/or declining projections for IaaS and PaaS would indicate that there may not be enough new business to capture that could offset R&D expenditures for innovation. The exception would be for the big SaaS players that also offer PaaS and IaaS as they pass the innovation from their own operations on to customers looking for more foundational outsourced offerings.
  5. Data Center Systems spending is increasing at a moderate rate and may not be getting adequate attention. The level of investment would indicate a need for new research, innovation and consulting. The US banking system is one example where technology spending is increasing significantly this year and it is unlikely to be going to cloud based offerings.
  • There needs to be more effort put into categorizing and tracking financial data on IT spending, especially in the areas of outsourcing and cloud based services. There appears to be confusion related to the growth of “shadow IT”. The numbers being cited vary wildly and are even more difficult to compare based on a lack of clear definitions.

Understanding EMC ScaleIO Scaling Basics

The Enterprise Strategy Group recently validated I/O performance of EMC ScaleIO clusters with 3, 32, 64, and 128 nodes and showed true linear scale. The purpose of this paper is to provide some background on quantifying and evaluating the scale of computer systems and compare that to some of the design principles of ScaleIO. The goal is that the information contained in this article, together with the ESG white paper results, will motivate IT professionals to take a closer look into potential applications of ScaleIO in their organizations.

What is Scalability?

Any product that has the word scale as an integral part of its name should be of interest to those of us who like to build systems that are easily expanded in order to deal with growth, and if you check, that is close to the Wikipedia definition of scalability. The purpose of this section is to help anyone interested in deploying distributed computing and storage platforms (hyper converged) understand some of the basics of system architecture that relate to scaling. I want to first talk a little bit more about what scalability is and then talk about how the architects of ScaleIO built these principles into a very interesting product.

I want to briefly introduce three scale concepts:

  1. Defining scale
  2. Scale up vs scale out
  3. Scaling and consistency

Defining Scale

Think of scale in computer systems like a virtue. You can often recognize it but it is hard to define. The study of scalability in computer science has a rich history that started long before there was an Internet or even computer networks. Searching and sorting algorithms are great examples of computer science research that has been focused on scaling. Check out Wikipedia for articles on Bubble Sorting or Binary Search Trees for background on the topic of algorithm scaling.

The same concepts of scaling analysis can also be applied to entire distributed computer systems like peer-to-peer networks or distributed computing and storage (ScaleIO falls in the latter class). For example, some implementations of P2P software use broadcast messages to look for a particular item on the network. As the number of servers (nodes) in the P2P network increases, so does the number of broadcast messages. The network traffic will depend on the number of nodes sending messages (N) times the number of nodes that they are broadcasting to (N), or N². This design may scale fairly well up to a certain number of nodes but will reach a maximum as soon as the network between nodes gets saturated. Adding additional nodes beyond that point will not result in any additional output and may in fact perform worse than a smaller configuration.

Even for distributed systems that can function with hundreds or thousands of nodes, we need to analyze the amount of additional work (web pages served or data written) you can expect when you add 1 additional node. If a web application can serve 1,000 pages per second with one server and 2,000 pages/sec with two servers, we say the system scales linearly (a rough way to quantify this is sketched after the list below). In most systems the range of linear scalability will eventually diminish beyond some point. Therefore, it is important to understand

  1. Can a system scale linearly,
  2. over what range do we observe linear scale, and
  3. what happens as you scale beyond that point?
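One simple way to put numbers on the first two questions is a scaling efficiency ratio. This is only illustrative notation, not terminology from the ESG paper; T(N) is just shorthand for the measured throughput of an N-node cluster.

```latex
% Throughput-based scaling check (illustrative notation)
S(N) = \frac{T(N)}{T(1)}, \qquad E(N) = \frac{S(N)}{N}
% Linear scale over a range of N means S(N) \approx N, i.e. E(N) \approx 1.
% Web example above: T(1) = 1000, T(2) = 2000 pages/sec, so S(2) = 2 and E(2) = 1.
```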

Scale up vs scale out

The term “Internet scale” has become synonymous with great engineering and management execution. Many of the most recognized brands on the Internet, e.g., Amazon, Ebay, Google, Netflix, XBOX Live, Yahoo, etc., have gone through numerous architectural iterations to meet the increasing demands of their subscribers and customers. One of the most important design principles to have emerged in the development of these Internet scale services is that 1) scale out is king and 2) scale up is a bottleneck waiting to rise up and bite you.

Nowhere did this become more critical than in really large scale ecommerce. Initially most Internet architects tried to use a scale up approach with a single relational database system (cash register) for processing order transactions. They were counting on the rapid innovation in servers and storage to outpace the growing demands of online shoppers. In the end scale up would not work for the largest online retailers. Those companies that wanted to achieve “Internet scale” had to devise ways to have multiple order processing engines running in parallel. It meant more complexity for operations and reporting but if you wanted to build really big systems you had to eliminate all the components that require a scale up approach as soon as possible and go scale out.

Scaling and consistency

When you post a picture to social media your friends may have to wait for some time before it is available for them to see. There is no obligation or expectation that someone on the other side of the world will see the exact same content on your site as someone in the same city as you. That image might be replicated to many servers over time before anyone visiting your site sees it. This is an example of eventually consistent storage. It helps the architects of social media services scale. In other contexts, like banking and online retailing, eventual consistency is not acceptable. In these cases we expect that there will be strong consistency, with zero chance of having conflicting data residing on different nodes of the cluster.

What is ScaleIO?

ScaleIO is a distributed system typically configured in a “hyper converged” implementation.  Hyper converged systems use each node in the cluster to run applications such as VMWare or HyperV server virtualization as well as manage the storage available to the cluster nodes. Given the above discussion, I want to dig a little deeper into how ScaleIO was designed for scalability. ScaleIO is a scale out clustered application so that is promising for achieving scale. I now want to talk about data consistency and proportional scaling, e.g., do we see linear scalability and over what cluster sizes.

ScaleIO Architecture Basics

The part of ScaleIO that will be least understood by anyone evaluating hyper converged computing is the storage management, so I want to focus on that in this article. ScaleIO uses a distributed architecture of servers (nodes) working together. When the software is deployed in a single layer of computing and memory resources, each server will use its compute resources both to run applications and to read and write data from the disk resources that are hosted and shared across all the nodes in a cluster. Let me explain data reading a little bit further and then show a picture.

Imagine you had a single node “cluster” running Windows and the HyperV role. I am going to also limit the server storage to a single local physical disk drive for this introduction. In this example, any virtual disks that you create for VMs are going to be placed on that one physical disk. The total amount of IO you can expect for all your VMs is what that one disk is capable of.

Now imagine there are two HyperV hosts configured as a ScaleIO cluster. Each host still only has one physical disk. What if each host could read and write data that was striped across the two disks available to the cluster (one on each host)? You could provide more IO to each node than what would be available from just the local disk. What if you had 1,000 nodes, each with 10 disks, and each node could read and write from any or all of those disks in parallel? Each node in the cluster now has access to many more disk resources than what is available locally. Since all the nodes in the cluster share all the disks, we expect that at any moment some servers will need heavy disk access while others are idle or using only local network or CPU resources, so every node gets better disk performance when it needs it.
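
A rough back-of-the-envelope calculation shows why sharing every disk with every node is attractive. The IOPS-per-disk figure below is an assumption for illustration, not a ScaleIO specification.

# Illustrative numbers only: assume roughly 150 IOPS per spinning disk.
$iopsPerDisk  = 150
$disksPerNode = 10
$nodes        = 1000

$localOnly  = $iopsPerDisk * $disksPerNode    # one host limited to its own disks
$sharedPool = $localOnly * $nodes             # pool shared across the whole cluster

"Local disks only : {0:N0} IOPS per host" -f $localOnly
"Shared pool      : {0:N0} IOPS across the cluster" -f $sharedPool

No single host will ever consume the whole pool, but a host with a bursty workload can borrow idle disks elsewhere in the cluster instead of queuing on its own ten spindles.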

The picture below shows 2 HyperV nodes with 8 disks on each node. The ScaleIO software component that takes data access requests from the host application (HyperV for example) is the ScaleIO Data Client (SDC), and the component that retrieves the data from the disk is the ScaleIO Data Server (SDS). When both the SDC and SDS reside on the same physical server, the configuration uses a single layer of computing and memory for both application processing and storage management. This is also called a hyper converged design. The SDC and SDS can also be deployed on separate layers. See the ScaleIO deployment documents for more details about the differences between these two approaches.

In order for ScaleIO to scale, we have to make sure that the overhead of reading and writing to disks distributed across the servers does not grow in proportion to the size of the cluster, the problem we noted in the P2P sharing design that I talked about earlier.

The picture below collapses all the server details from above and shows more clearly how the nodes of a 10 server cluster can share the cluster resources to read and write data. There is a system component called the Meta Data Manager (MDM) that stores the information about the cluster resources and the relevant mappings between each SDC and all the SDSs in the system. The MDM is populated with data when the cluster is configured and gets updated any time the cluster changes by adding nodes, removing nodes, or when a node fails. The MDM plays a central role in managing the available storage for the cluster.

The storage available to each host is divided into a number of chunks for management. The size of the chunks is determined by the ScaleIO architecture. The MDM keeps track of how many chunks of data can be managed by each SDS (one per server) in the system. When a node needs storage, the administrator creates a new storage device for that server and the MDM computes the map of which SDS will handle the storage for every chunk of data the SDC will need to access during read and write operations. When the client asks for a certain logical block address range on a disk device, the SDC can look up which SDS manages those chunks. This way, each SDC can find everything it needs to know about the storage it wants to read or write from the locally available map, based on the application request. The MDM does not become a single point of contention that has to be accessed by all the nodes during normal read and write operations.
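
Here is a minimal sketch of the kind of lookup the SDC performs. The chunk size, map contents and function name are invented for illustration; the real ScaleIO data structures are internal to the product.

# Illustrative chunk size; ScaleIO defines its own internal chunk size.
$chunkSizeBytes = 1MB

# A per-volume map pushed down from the MDM when the volume is created:
# chunk index -> the SDS responsible for that chunk.
$chunkMap = @{ 0 = 'SDS-03'; 1 = 'SDS-07'; 2 = 'SDS-01'; 3 = 'SDS-09' }

function Get-OwningSds {
    param([long]$ByteOffset)
    $chunkIndex = [math]::Floor($ByteOffset / $chunkSizeBytes)   # logical block address -> chunk
    return $chunkMap[[int]$chunkIndex]                           # chunk -> SDS, a purely local lookup
}

Get-OwningSds -ByteOffset 3276800    # an offset a little past 3MB falls in chunk 3 -> 'SDS-09'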


It is straightforward to see how this architecture can scale. No matter how wide the cluster becomes, it only requires a single lookup by the SDC in its local map to find which SDS is responsible for storing or retrieving the data, and a single request to that SDS to read or write it. Each data client (SDC) only needs to keep a relatively small map of logical block addresses to SDS identifiers for the virtual disks that have been assigned to that host. The cluster can be expanded to hundreds or more nodes, and the amount of metadata managed by an SDC depends only on what appears to that host as local storage. Since the MDM does the majority of the mapping work only when the cluster is built or reconfigured, or when a storage device is created, it is largely off the critical path for normal read and write operations. In the event that a data chunk moves to a new SDS due to reconfiguration or a disk/node failure, the SDC request to the original SDS will fail, the SDC will query the MDM for the new location, and it will then update the local map so it has the correct location locally again.
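
Continuing the illustrative map from the sketch above, here is roughly what that failure path looks like: a request to a stale SDS fails, the SDC asks the MDM once for the new owner, patches its local map, and retries. The Send-ReadRequest and Ask-MdmForOwner functions are stubs invented for this sketch, not ScaleIO APIs.

# Stubs standing in for the real network calls (purely illustrative).
function Send-ReadRequest { param($Sds, $Chunk) $Sds -ne 'SDS-03' }   # pretend SDS-03 just failed
function Ask-MdmForOwner  { param($Chunk) 'SDS-05' }                  # the MDM returns the new owner

function Read-Chunk {
    param([int]$ChunkIndex)
    $sds = $chunkMap[$ChunkIndex]
    if (-not (Send-ReadRequest -Sds $sds -Chunk $ChunkIndex)) {
        # The chunk moved (rebalance or node failure): ask the MDM once, cache the answer, retry.
        $sds = Ask-MdmForOwner -Chunk $ChunkIndex
        $chunkMap[$ChunkIndex] = $sds
        $null = Send-ReadRequest -Sds $sds -Chunk $ChunkIndex
    }
    "chunk $ChunkIndex served by $sds"
}

Read-Chunk -ChunkIndex 0    # SDS-03 'fails', the map is refreshed to SDS-05 and the read is retried there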

Wrapping up

 

So how does ScaleIO relate to the concepts of scale we talked about earlier? It is a scale out architecture specifically designed to eliminate dependencies on any single shared resource during typical operation. Data stored on ScaleIO is always consistent; ScaleIO does not use eventual consistency to achieve linear scale. The algorithms for reading and writing data do not rely on any parameters that grow non-linearly with the number of cluster nodes. Here is a link to a detailed white paper with empirical results from a lab exercise showing that ScaleIO I/O operations scale linearly in a 128 node cluster.

ESG Lab Spotlight: EMC ScaleIO: Proven Performance and Scalability

The goal of this article was to relate some concepts of computer science scalability to the ScaleIO architecture and the ESG report noted above. The ScaleIO platform has many more interesting features including:

  • Snapshots – create multiple fully rewritable, redirect-on-write snapshots.
  • Thin and thick provisioning control
  • Fault sets – ScaleIO mirroring ensures high data availability even if an SDS goes offline.
  • RAM read cache – this feature sets aside server RAM for caching reads.
  • QOS – ScaleIO smoothly limits an application’s IOPS or bandwidth so that no single workload starves the others.
  • Encryption

Watch this space for more articles on the ScaleIO platform and in the meantime check these great articles online.

Virtual Geek Blog Post – EMC Day 3: ScaleIO – Unleashed for the world!

HyperV Guy’s Blog Post – Install ScaleIO + ViPR on VMware workstation in minutes

EMC’s ScaleIO Site – ScaleIO JAW-DROPPING SOFTWARE-DEFINED STORAGE PERFORMANCE AND SCALABILITY.

 

 

Installing EMC ViPR on Windows HyperV

Background

I was contacted recently by Raul Sondhi, one of our enterprise sales engineers in the Bay Area, about three customers that are interested in storage virtualization. EMC is making exciting investments in storage virtualization (both hardware and software) packaged with enhanced automation and management tools. ViPR is one of the central software components of EMC's drive to help reduce the complexity of IT ownership.

There are several products in the ViPR family, but these customers are going to be most interested in the ViPR Controller, which discovers existing arrays and networks (file and block), catalogs their capabilities (capacity, performance, features), and enables the construction of “virtual storage arrays” – entities that are abstracted across multiple physical arrays and existing data services.

ViPR was announced at EMC World 2013. This year EMC made news at EMC World again when it said it would open source the code behind its ViPR Controller. There are a lot of articles from both EMC and others discussing this announcement; a simple web search should keep you busy for quite a while. Here is one example from ZDNET on the ViPR Controller open source announcement.

Another interesting development coming out of the ViPR team is the availability of virtual hard disks and scripts that allow for ViPR Controller clusters to be hosted and supported in Microsoft HyperV environments. The remainder of this article is focused on installing a ViPR Controller cluster on HyperV hosts managed by System Center Virtual Machine Manager.

Installing

In order to have something concrete to show our customers, Raul and I are installing the ViPR Controller at the Microsoft Technology Center in Mountain View, CA. We have a reasonable investment in EMC gear there, including VMAX, VNX, VPLEX, and RecoverPoint. ViPR Controller version 2.2.0.1 and higher can now be installed in HyperV environments, so that is how we are going to deploy it. There is a great product documentation index available on the EMC website that will help with downloads, licensing, install guides, new feature lists and much, much more. Check it out before you start your first installation.

The ViPR Controller is installed as a clustered service in either a 3 node (2+1) or 5 node (3+2) configuration. You will need a separate VM for each node. We are going to do a 2+1 configuration. The ViPR install download is a zip file. When you extract the files you should see the following contents.

The .MF file contains configuration information. There are also four (4) virtual disk images and two (2) PowerShell scripts. This is all you will need until you are ready to configure the cluster and enter a license file. See the documentation index page discussed above to get instructions for obtaining a license file. One of the PowerShell (PS) script files copies the vhdx images to a library server configured in System Center Virtual Machine Manager. The other PS script file creates one (1) controller node VM each time it is executed. We will be running that script three (3) times to create the nodes for our ViPR Controller cluster before any configuration of the cluster is started.

I’m going to follow the step by step instructions for installing the controller in a HyperV environment from the documentation index above. I’ll include some comments and screen shots of the process in the remainder of this article so you have a good idea of the entire process and can check your results compared to mine along the way.

Copy files to the SCVMM server.

We need to copy the vhdx files into a library server that is part of the VMM fabric configuration. The library server in VMM holds and manages resources primarily used for VM deployments, like ISO images, virtual disk files, VM templates and more. VMM can use more than one library server, so determine which one is most appropriate if you have several. The VMM instance that we are using in the MTC POC domain only has one library, so I will edit the example PS command that calls the upload script from the documentation.

The example from the documentation:

.\vipr-2.2.1.0.100-uploadVirtualDisks.ps1 -librarypath \\myVMMserver\MSSCVMMLibrary

And edited for my environment:

.\vipr-2.2.1.0.1106-uploadVirtualDisks.ps1 -librarypath \\LibraryServer\Library

I had to edit the version number to match the files in the current distribution, and I had to change the server name and library name for my environment (real server names not listed). You can then execute the script to upload the virtual disks to the library server. I copied the files to my SCVMM server and uploaded from there to the library server (hosted on another server). I used the PowerShell ISE tool and ran it as Administrator. There is some UI that shows a status bar for each file being uploaded, which I found useful when running the script for the first time. It takes some minutes to upload the four disks, so it was nice to have confirmation that something was actually happening while I was waiting for the script to end.

Creating Virtual Machines

The next step for installation is to create enough virtual machines for the cluster size you are deploying (3 or 5). There is a sample PS command for calling the createVirtualMachine PS script in the installation guide. It takes several parameters, only some of which are optional. Check the documentation for the details.

.\vipr-2.2.1.0.1106-createVirtualMachine.ps1 -vmhostname hyperv-server -vmpath vm-destination-folder -vmname viprn -VirtualSwitchName vswitch-name -VmNetworkName vm-network-name -vlanid id -disktype [dynamic|fixed]

I’ve highlighted all the parameters above that you need to replace. Here is a brief description of how I chose values for them.

-vmhostname I installed into a HyperV cluster but I used the name of one node that had enough resources to create the 3 VMs that I needed. I will move 2 of the VMs to other nodes in the cluster later for better HA protection.
-vmpath Typically when I create VMs on a cluster using VMM it uses the “placement” path setting in VMM to determine where to locate the files for the VM. I tried to run the create virtual machine script without specifying a path in my VMM environment but PS prompted me for a path during execution and wouldn’t continue without one. I checked the Shared Volumes configuration for the cluster to determine which CSV(s) were owned by the host that I was creating the VMs on.
-vmname I try to follow a set of naming conventions that we have for the POC lab since it is a shared environment to help reduce confusion.
-VirtualSwitchName VMM has a lot of capability to support virtual networking. The next 3 parameters are related to that. Virtual switches are tied to physical network adapters to support network segmentation. A virtual switch in VMM helps facilitate commonality across servers in HyperV clusters to ensure that VMs can migrate from one host to another and “just work”.
-VmNetworkName Virtual networking in VMM allows you to define multiple logical networks in software that can be hosted on a single physical network. You will have to get this information, along with the virtual switch name and possibly a vlanid, from the VMM/HyperV administrator(s).
-vlanid The logical network that I used in VMM supports more than one VLAN so I specified one explicitly.
-cpucount I used the minimum value of 2 (the default).
-memory Memory in MB per virtual machine. I used the minimum value of 8192 (the default).
-disktype Type of virtual hard disk: dynamic or fixed. Use fixed for deployment in a production environment.

After you edit the required and optional parameters you need for your environment, run the command. PS will display a confirmation message after the script completes if everything is specified correctly.

I ran the PS script two more times with a different parameter for -vmname. After the PowerShell script has been run three times with different server name parameters, we can check in SCVMM for the results and see that the VMs are created and in a Stopped state.
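
If you would rather not edit and re-run the command by hand three times, a small wrapper loop works just as well. The host, path, switch, network and VLAN values below are placeholders for my environment, and the script name must match the version you downloaded.

# Placeholder values -- substitute the ones you gathered for your environment.
$common = @{
    vmhostname        = 'HV-HOST-01'
    vmpath            = 'C:\ClusterStorage\Volume2'
    VirtualSwitchName = 'vSwitch-Prod'
    VmNetworkName     = 'MTC-Mgmt'
    vlanid            = 100
    disktype          = 'fixed'
}

# Create the three controller node VMs, one per script run.
foreach ($name in 'vipr1','vipr2','vipr3') {
    .\vipr-2.2.1.0.1106-createVirtualMachine.ps1 -vmname $name @common
}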

Configuring the cluster

Continuing to follow the instructions from the documentation index, the next step is to power on the VMs and configure the networking. We need four (4) static IP addresses, one for each of the nodes and one for the cluster (the virtual IP). I signed out four static IPs in the management network of the MTC and powered on the first node. After the boot process completed I was taken directly into the cluster setup screens.

This will be a 3 server configuration, so I accept the default selection. To make selections in the cluster setup wizard, use TAB to navigate between input fields and buttons, then push a button by hitting ENTER. On this screen the cursor is on the Next button, so I just hit ENTER.

Here is where I will enter the 4 static IP addresses that I’ve allocated for the cluster. Our management network is a /22, so I need to change the Netmask accordingly and enter the gateway. Remember to TAB to the field you want to edit or select. For text entry you tab to the end of the field, then backspace and type the new value. Finally, TAB to the Next button and hit ENTER. You will then see a confirmation screen repeating the data you just entered, with a Confirm button selected. Check your entries and hit ENTER to configure the FIRST NODE in the cluster. At the end you will get a confirmation message.
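
As an aside, if you are not sure which dotted netmask corresponds to a prefix like /22, a few lines of PowerShell will tell you; this is generic subnet math, nothing ViPR-specific.

# Convert a CIDR prefix length to a dotted-decimal netmask (for example /22 -> 255.255.252.0).
$prefix = 22
$octets = for ($i = 0; $i -lt 4; $i++) {
    $bitsInOctet = [math]::Min(8, [math]::Max(0, $prefix - ($i * 8)))
    256 - [math]::Pow(2, 8 - $bitsInOctet)
}
$octets -join '.'    # 255.255.252.0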

The installation instructions are very clear about this step – DO NOT REBOOT this server. You have to configure all the nodes in the cluster before you reboot any of them. Leave this server in the state shown above, power on the next server and connect to the console. After the second server boots you will see the following screen (I have blacked out part of the network address of the cluster). Setup is aware that you may want to add this node to an existing cluster setup or create a new cluster configuration.

Keep the default selection to add this server to the cluster with the VIP shown (redacted). The next screen allows you to select which of the two remaining servers in the 3 node cluster you are adding. You already entered the IP addresses of all the servers, so the IP address you entered for server 2 will be assigned to this machine if you keep the default selection.

Click Next and wait until the Local Configuration Done message is displayed. Then repeat for the remaining servers in the cluster. When the configuration is complete on all servers, you can REBOOT all of them.

Then you can take a short break. It will take some time for all three servers to reboot and finish the configuration of the cluster. No user input is required during this time. You will be able to ping the VIP fairly soon but it will be some minutes before you can get the management application to load in a browser by specifying the VIP address. When the cluster configuration is complete you will be able to use a supported browser to access the login page by entering the address of the virtual IP (VIP). You will need the default user name and password to complete the initial login. Check the install documentation for how to do this.

You will need to upload a license file to finish setting up the ViPR Controller. See the installation documents for links to where you can request the license.

You need to change some passwords on first login. Check the install documentation for how to do this.

That’s it. Now we can begin to connect to all our supported storage arrays and configure virtualized storage assets.

There will be more articles coming in the next few weeks as this POC and demonstrations get more fully developed. Please watch this space.

Phil Hummel, @GotDisk

San Jose, CA

Managing Fibre Channel in VMM with SMI-S or How I Got in the Zone

Greetings from the Microsoft Technology Center in Silicon Valley (MTCSV) in Mountain View, CA. I have been putting in a lot of time lately on the new System Center 2012 R2 Virtual Machine Manager infrastructure that is hosting all the operational compute and storage for the MTC. There are numerous blade chassis and rack mount servers from various vendors as well as multiple storage devices including 2 EMC VMAX arrays and a new 2nd generation VNX 5400. We have been using the SMI-S provider from EMC to provision storage to Windows hosts for a while now. There is a lot of material available on the EMC SMI-S provider and VMM so I am not going to write about that today. I want to focus on something new in the 2012 R2 release of VMM – integration with SMI-S for fibre channel fabrics.

There are many advantages to provisioning storage to Windows hosts and virtual machines over fibre channel networks or fabrics. Most enterprise customers have expressed interest in continuing to utilize their existing investments in fibre channel and would like to see better tools and integration for management. Microsoft has been supporting integration with many types of hardware devices through VMM and other System Center tools to enable centralized data center management. The Storage Management Initiative Specification (SMI-S) has been a tremendously useful architecture for bringing together devices from different vendors into a unified management framework. This article is focused on SMI-S providers for fibre channel fabrics.

If you right click on the Fibre Channel Fabrics item under Storage in Fabric view and select the Add Storage devices option you will bring up a wizard.

The first screen of the wizard shows the new option for 2012 R2 highlighted below.

We are using the Brocade SMI-S provider for Fibre Channel fabrics. The provider is shipped with the Brocade Network Advisor (BNA) fabric management tools. We are using BNA version 12.0.3 in the MTCSV environment. The wizard will ask you for the FQDN or IP of the SMI-S provider that you wish to connect to. It will also ask for credentials. We are doing a non-SSL implementation, and we left the provider listening on the default port of 5988. That is all there is to the discovery wizard. The VMM server will bring back the current configuration data for the fibre channel fabric(s) that the SMI-S provider knows about. In our case we have fully redundant A/B networks with 4 switches per fabric. Here is what the VMM UI shows after discovery is complete.
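
If you prefer PowerShell to the wizard, VMM also exposes an Add-SCStorageProvider cmdlet for registering SMI-S providers. The sketch below shows the general SMI-S registration pattern with our non-SSL port; the Run As account name and provider address are placeholders, and I have not verified the exact parameter set VMM 2012 R2 uses for the new fibre channel fabric option, so check Get-Help Add-SCStorageProvider on your VMM server before relying on it.

# General pattern for registering an SMI-S provider in VMM (parameters for the fibre
# channel fabric case should be verified with Get-Help Add-SCStorageProvider -Detailed).
$runAs = Get-SCRunAsAccount -Name 'BNA-SMIS-Account'     # placeholder Run As account

$provider = @{
    Name              = 'Brocade BNA SMI-S'
    NetworkDeviceName = 'http://bna.mtcsv.local'         # placeholder provider address (non-SSL)
    TCPPort           = 5988
    RunAsAccount      = $runAs
}
Add-SCStorageProvider @provider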

 

Once we have discovered the fabrics, we can go to the properties of a server that has FC adapters connected to one or more of our managed switches. The first highlight below shows that VMM now knows which fabric each adapter is connected to. This allows VMM to intelligently select which storage devices and ports can be accessed by this server adapter when creating new zones. That’s right; with VMM 2012 R2 and the appropriate SMI-S providers for your storage and FC fabric you can do zoning right from within the VMM environment. This is huge.

The second highlight above shows the HyperV virtual SAN that we created in VMM for each of the adapters. The virtual SAN feature was released with Windows Server 2012 HyperV. It is the technology that allows direct access to fibre channel LUNs from a virtual machine, and it can replace pass-through disks in most cases. That is also a really big topic, so I’m going to write about it more in the context of VMM and fibre channel fabrics in a later article. For today I want to focus on the use of VMM for provisioning fibre channel storage to HyperV parent clusters. Now let’s take a look at the zoning operations in VMM.

The next figure shows the Storage properties for a server that is part of a 5 node cluster. The properties show which storage arrays are available through fibre channel zoning. You can also see the zones (active and inactive) that map this server to storage arrays.

Lastly, I want to show you how to zone this server to another storage array. The place to start is in the storage properties window shown above. Click the Add | Add storage array icons to get to this screen.

As you can see from the window title this is the correct place to create a new zone. This is the same regardless of whether this is the first or third array (as in this case) you are zoning to the selected server. I highlighted the Show aliases check box that I selected while making the above selections. In order for the friendly name zoning aliases to be available they must be created in the BNA zoning UI after the server has been connected to one of the switches in this fabric. You can also see the zone name that I entered that will be important when I move to the final steps to complete this example.

Now that the zone has been created let’s take a look at the Fibre channel fabrics details.

I’ve highlighted the total zones defined in the Inactive and Active sets for the A fabric. This shows that new zones have been created but have not yet been moved into the Active zone set. If you open the properties of the Inactive zone set and sort the Zone Name column you can see the zone that we created 2 steps above.

In order to activate this zone use the Activate Zoneset button on the ribbon. One important detail is that you can only activate all the staged zones or none of them. There are two zones that are in the Inactive zoneset that will be activated if you push the button. Be sure to coordinate the staging and activation of zones in the event the tool is being shared with multiple teams or users.

 

The world of private cloud management is changing rapidly. VMM and the other System Center products have made huge advancements in the last two releases. The investments that Microsoft and storage product vendors have been making in SMI-S integration tools are starting to bring real value to private cloud management. Take a look, I think you’ll be surprised.

SQL I/O and EMC XtremIO

Every once in a while a seemingly simple question pops up on an email distribution list or Yammer thread that ends up spawning a lively and productive conversation. We had one recently that started out with the question:

For SQL Server running on XtremIO, is there any current guidance we should be giving our customers to best take advantage of the 4K element size on XtremIO?

One of the thoughts behind the question was motivated by the existence of a 4K disk allocation unit (DAU) option when formatting a disk with NTFS. Since a 4K allocation size significantly increases the size of the allocation structures managed by NTFS and the frequency with which allocations are made, there had better be a good reason for using it. With the relatively large file sizes typical of most SQL Server applications, we aren’t worried about wasted space even when using the largest available allocation size of 64K, so there is really no good reason to use anything but 64K allocation units for SQL Server. This just opened the flood gates to many more SQL I/O questions that went beyond NTFS allocation unit size and XtremIO. I gathered together a summary of the questions and best responses in the table below.
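
As a practical footnote, here is how you would format a SQL Server data or log volume with 64K allocation units from PowerShell and then verify the setting afterwards; the drive letter and label are placeholders.

# Format a volume with 64K NTFS allocation units (drive letter and label are placeholders).
Format-Volume -DriveLetter S -FileSystem NTFS -AllocationUnitSize 65536 -NewFileSystemLabel 'SQLData'

# Verify: look for "Bytes Per Cluster : 65536" in the output.
fsutil fsinfo ntfsinfo S: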

 

Q: Are SQL Server write sizes independent of DAU size?
A: Yes.

Q: How much does the additional overhead of a 4K DAU impact performance? Will SQL Server still maintain its write sizes?
A: Most of the impact of the DAU is felt as you create new space in a data or log file. If you let SQL Server zero-initialize the data file, then all the overhead for allocation occurs at once. Log files are always zero-initialized upon creation. If you use instant file initialization with a data file, then you get a small added overhead every time Windows allocates new space to the file. When I have tested creating a new data file with 4K vs. 64K DAU, the difference in creation time is only a couple of percent. It is almost the same if you run 8K or 64K random reads and writes with SQLIO against a disk with 4K vs. 64K allocation units: a couple of percent difference, but it always favors the 64K DAU. (A sample SQLIO invocation appears after this table.)

Q: Does SQL typically write to the log in 4KB I/Os and to the databases in 32-64KB I/Os?
A: Log writes can be as small as 512 bytes (one sector) and as large as 60K (the maximum SQL Server will allow before forcing a flush to disk). See the Kendra Little article referenced below for lots more detail on this topic. DB writes can be 8K (one page) up to 256K (the maximum size the gather-write operation will bundle before flushing to disk).

Q: SQL Server uses a lazy writer and will try to write whole extents (if possible)?
A: The lazy writer is constantly scanning the buffer pool for pages that are not being referenced often. When the lazy writer flushes a page, the buffer manager will try to do a gather write exactly as if the page were being written by a checkpoint, so that can be 8-256K.

Q: If an extent in RAM is only partially dirty (various pages dirty), will only the dirty pages get written? That would result in a smaller write size.
A: The gather-write operation looks for adjacent pages from a single data file, up to 32 pages. An extent is 8 pages. There is no consideration for “wanting to” or “having to” write entire extents from the pool. The buffer manager will, however, always read an entire extent from disk even if only one page from that extent is required for the operation.

Q: Typically SQL write performance is governed by TLOG write speed and not by DB write speed, unless doing a bulk load?
A: You could have a nice wide RAID 10 set for the logs with blistering sequential write speed and a RAID 6 SATA set for the data files. It is hard to make general statements like this, but if all you are doing is writing data (no reads), then often the log file target of 5 ms per write will be exceeded before the 20 ms per write to the data file(s) that usually indicates I/O wait times are starting to impact performance.
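
For anyone who wants to repeat the 4K vs. 64K DAU comparison mentioned above, here is roughly how I would drive it with the (now retired) SQLIO tool. The drive letter, file name and switch combination are my own choices, so double-check them against the usage text that ships with your copy of sqlio, and size or pre-create the test file (for example via a parameter file) before running.

# 8K random reads and 64K random writes: 4 threads, 8 outstanding I/Os per thread,
# 120 seconds, buffering disabled (-BN), latency summary (-LS). Run the same pair of
# commands against a 4K-formatted volume and a 64K-formatted volume and compare.
sqlio -kR -t4 -s120 -o8 -b8 -frandom -LS -BN E:\sqliotest.dat
sqlio -kW -t4 -s120 -o8 -b64 -frandom -LS -BN E:\sqliotest.dat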

 

Another great outcome from this discussion was the sharing of favorite white papers and blog posts. I have collected those links and added them here.

SQL Server Best Practices 2012 http://msdn.microsoft.com/en-us/sqlserver/bb671430 and,

SQL Server Best Practices Article in 2007 http://technet.microsoft.com/en-us/library/cc966412.aspx

There is also an interesting blog post here from Kendra Little:


http://www.brentozar.com/archive/2012/05/how-big-your-log-writes-spying-on-sql-server-transaction-log/ …read the comments also

The best IO size information I’ve seen came from the SQL CAT team a few years back. I think it’s still relatively accurate; check out the appendix of the presentation at the link below.

http://blogs.msdn.com/cfs-file.ashx/__key/CommunityServer-Components-PostAttachments/00-09-45-27-65/Mike_5F00_Ruthruff_5F00_SQLServer_5F00_on_5F00_SAN_5F00_SQLCAT.zip

 

For the 10 years I worked at Microsoft I found the SQL Server technical community a constant source of knowledge and challenge that made me a better SQL Server professional. The community at EMC is certainly smaller but no less dedicated to sharing information and promoting technical excellence. I hear the same thing from anyone involved with PASS and local users groups. Let us know how you participate in a SQL Server community and what it means to you. Thanks for reading.