HP Scalable NAS for Static Media

Introduction
Today, with the explosion of digital content, the Internet is seen as the ultimate content repository. Consumers regularly use it to upload and share picture files with friends and family. From digital cameras to cameras on mobile phones, the number of devices available to consumers for capturing images is growing tremendously, and more digital data is being created than ever before. Numerous studies have documented and projected this growth.
In 2008 alone, several hundred billion images were captured by digital cameras, accounting for a whopping exabyte (1 billion gigabytes) of data. Not only has the number of devices increased, but growing image resolution on digital cameras over the past two years alone has doubled or even tripled the size of captured images, enlarging today’s digital footprint.
Consumer cameras are not the only things generating large amounts of digital content. Enterprises are dealing with back office applications that generate data such as text files, emails, spreadsheets, word processing documents, and PDF files. Medical imaging systems produce high resolution images ranging from several hundred megabytes to gigabytes in size. All of these contribute to the growth of digital data.
This type of content is known as static media: data that does not change over time. The volume of static media is exploding, creating a pressing need for newer technologies that handle the data growth efficiently. This whitepaper outlines the challenges that come with the growth of static media and offers an alternative approach to hosting and serving static images using the HP Scalable NAS solution.
Static Media—Challenges
Static Media presents a unique challenge to storage providers because of the sheer number of objects that are stored, the varying size of files, and the inconsistent data access pattern from the underlying storage.
The concept of using the Web as a transport medium for accessing data anytime has tremendously raised what is expected of online service providers.
The growing number of social networking sites, and the blurring lines between pure video sites and social networking sites offering similar services, have increased the volume of user generated content, which is no longer confined to text blogs and shared images. As the Web medium becomes more and more powerful, expectations rise for easier data access and collaborative data sharing. Applications called “Mashups” (for more information see Appendix), which combine data from more than one source to provide a unique set of solutions, are becoming popular and are extending services through various Web services, with HTTP as the key content access protocol.
Several kinds of applications produce and consume static data: applications that provide access to static data over the Web, and applications that are traditional to enterprises.
It is important to note that not all applications that provide access to data over the Web are Web 2.0 applications.
For example, an online photo sharing site that allows consumers to upload, edit, and share their pictures with friends and family is a classic Web 2.0 application. On the other hand, a journalist working on articles for a news portal might access content publishing data on servers over the Web and perform edits before the content is moved to the online news Web servers. Even though this operation might use Web-based data access protocols such as HTTP for content access, it is very different from a typical Web 2.0 application.
Storage challenges
Purchasing storage hardware that is less powerful but adequate for today’s needs puts stress on the business whenever access to data is seasonal or unpredictable. Such access is very common in Internet-based portals, where performance requirements can far exceed the capabilities of the deployed system.
In order to provide an enhanced quality of service and to reduce the churn rate, it is imperative to have an infrastructure that is resilient, powerful, reliable, and scalable to grow with the business.
In the next few sections we will examine the traditional architecture for Static Media that exists today and its shortcomings. We will discuss how the next generation scalable storage architectures overcome the issues and provide an elegant solution.
Business example
When the business model is primarily based on user generated content, there is no control over how much data flows into the environment. When the data access pattern and the content flow are highly volatile, it makes it extremely difficult to design a system to cater to the varying load conditions.
For example, at an online photo sharing site with many millions of picture uploads and views per day, the load on the system is very unpredictable and volatile. At any given time, the number of user image uploads could go from a few thousand images to many millions, and this activity appears to be very seasonal in nature. Consumers upload photographs at varying picture resolutions, and these are ultimately stored at the service provider’s location. As far as the consumer is concerned, the image is stored somewhere in the cloud and is accessible when requested. From the service provider’s perspective, this is a bit more challenging. For every image that is uploaded, several other images are synthetically created and stored in the storage infrastructure. In the case of an online photo sharing company, thumbnail images and low resolution pictures are often created and stored along with the high resolution images uploaded by the user. For every million images uploaded, the system stores two million more images in the form of thumbnail and viewing quality images.
The same is true for a service provider offering a music download service where a thumbnail or a low resolution music album image is typically stored on the storage along with the audio music file.
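The multiplier in the photo sharing example can be turned into a rough capacity estimate. The Python sketch below counts stored objects and bytes for a batch of uploads. The function names and the file sizes used (3 MB originals, 200 KB viewing copies, 10 KB thumbnails) are illustrative assumptions, not figures from any particular service.

```python
def stored_objects(uploads: int, derived_per_upload: int = 2) -> int:
    """Objects on disk: every upload plus its derived images.

    derived_per_upload=2 mirrors the example in the text
    (one thumbnail and one viewing quality image per upload).
    """
    return uploads * (1 + derived_per_upload)


def stored_bytes(uploads: int, original_size: int, derived_sizes: tuple) -> int:
    """Bytes on disk, given an average size for each kind of file."""
    return uploads * (original_size + sum(derived_sizes))


# Hypothetical average sizes in bytes (assumptions for illustration only).
ORIGINAL = 3_000_000
VIEWING = 200_000
THUMBNAIL = 10_000

print(stored_objects(1_000_000))  # 3000000 objects for one million uploads
print(stored_bytes(1_000_000, ORIGINAL, (VIEWING, THUMBNAIL)))
```

Under these assumed sizes the derived images triple the object count while adding only a small fraction to the byte count, which is one reason the number of objects and the raw capacity have to be planned for separately.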
Long Tail problem
One of the classical challenges of Web 2.0-based models is dealing with the “Long Tail Content”. That is, the images may or may not be frequently accessed, but they need to be stored on reliable storage and served fast when requested. None of the data can sit on an offline device; all of it needs to be on online disk for rapid retrieval and serving. But storing all of the data on expensive online storage leads to a very expensive and therefore inefficient business model. Where economies of scale are the primary criterion, technology such as HP Scalable NAS is at the core of the solution. The benefits of Scalable NAS are addressed later in the paper.

Depending on the business model and the type of service offered by the online service providers, the content popularity distribution curve varies. For a typical social networking site or a collaborative data sharing site, some of the data could be more popular than the rest. Popular data is accessed more often than the remaining content, which is termed “Long Tail Content”. This creates a non-uniform data access pattern that places a very different stress on the system.
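To get a feel for how uneven such an access pattern can be, the sketch below models content popularity with a Zipf-like curve, in which the item at popularity rank r is requested in proportion to 1/r^s. The Zipf model, the exponent, and the function name are assumptions chosen for illustration; the real popularity curve depends on the service, as noted above.

```python
def head_access_share(num_items: int, head_fraction: float, exponent: float = 1.0) -> float:
    """Share of all requests that go to the most popular `head_fraction` of items.

    Assumes request frequency for the item at rank r is proportional
    to 1 / r**exponent (a Zipf-like model, not measured data).
    """
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_items + 1)]
    head_count = max(1, int(num_items * head_fraction))
    return sum(weights[:head_count]) / sum(weights)


# With 100,000 items and exponent 1.0, the top 1% of items draws
# well over half of the requests under this model.
share = head_access_share(100_000, 0.01)
print(f"top 1% of items receive {share:.0%} of requests")
```

The remaining 99% of items still have to stay online, which is the cost pressure that the long tail creates.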
Key challenges for an infrastructure provider
Ultimately, the challenges facing today’s system administrator at a service provider of online data, or at a company that processes a large amount of static data content, are:
Scalability: Implement an infrastructure that can dynamically meet changing requirements and provision capacity for ever growing data.
Availability: Deliver the famous “five nines” of reliability while ensuring the systems keep functioning through multiple simultaneous component failures and remain running while the infrastructure is being upgraded.
Affordability: Build an infrastructure that is robust and resilient yet requires a low investment.
Manageability: Manage Petabytes of data with a limited number of IT Staff.
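The “five nines” target in the availability requirement translates directly into a downtime budget. The following sketch does the arithmetic; the function name is hypothetical, and the calculation assumes a 365-day year.

```python
def allowed_downtime_minutes(availability: float, period_days: float = 365.0) -> float:
    """Minutes of downtime permitted over the period at a given availability."""
    return (1.0 - availability) * period_days * 24 * 60


for label, availability in [("three nines", 0.999),
                            ("four nines", 0.9999),
                            ("five nines", 0.99999)]:
    minutes = allowed_downtime_minutes(availability)
    print(f"{label}: {minutes:.2f} minutes per year")
```

Five nines leaves about 5.26 minutes of downtime per year, which is why upgrades and component failures have to be absorbed without taking the system offline.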
Traditional architecture
With digital media assets ballooning in size and number, the only way to accommodate growth with traditional NAS systems is to add more storage space to the system. At first this might seem like a reasonable approach, but conventional single headed NAS systems suffer from the side effect of extra capacity taxing their processors. The same set of processors on the NAS head must now drive the extra capacity, which ultimately degrades performance.
Hence single headed monolithic NAS systems come with a certain capacity limit. In this case, the reliable ways to expand are either replacing the smaller NAS systems with bigger systems that have larger capacity and compute power, which means data migration with a forklift upgrade, or adding more of the same NAS systems, which creates islands of storage that need to be managed separately. On top of this, the NAS filers need to be paired for High Availability, which results in high priced, poorly utilized, and highly complex systems.
It is very evident that conventional NAS systems were simply not designed for environments that deal with exponential growth of unstructured data.
There are several drawbacks to this traditional NAS solution:
Scalability limits: The complexity of dealing with the NAS systems arises from the fact that each NAS filer is its own island of storage and file systems. This results in unbalanced and underutilized systems.
Namespace overhead: Multiple file systems result in multiple namespaces, which are extremely complex to manage. Again from the picture above, the application server(s) using the underlying NAS system must mount a new file system every time a new NAS system is added.
Management complexity: Multiple, disconnected namespaces complicate the management of file systems and shares. With the NFS/CIFS mounts to manage and the resulting mount storms (if applications are designed to dynamically mount the file shares), diagnosing and fixing performance related issues can be a very painful exercise for the system administrator. This problem is magnified if such a solution is deployed, for example, at online photo sharing sites where end users constantly upload and view images. For a classical Web 2.0 solution deployment it is very inefficient if the system underutilizes server processing power, capacity, or both. Also, managing data spread over several islands of storage is a formidable task.
High-cost solution: One of the traits of a non-scalable system is the “scalability through copy” mechanism. Multiple copies of the same information are made in order to provide the performance needed to meet the Service Level Agreements (SLAs). More storage is needed to store the same amount of information, resulting in a very high cost solution.
In addition, scaling out the NAS filer introduces new manageability and availability issues. By adding more NAS filers, an administrator can reduce the performance bottleneck. However, the administrator must partition and redistribute the data among the NAS filers. If this data is growing and changing regularly, as is the case for many websites and Web server log files, the administrator must continually repartition the data and ensure that there is ample space on each NAS filer and that no one NAS filer bears an overwhelming share of the load. Managing these data partitioning and load issues can be complex and cumbersome.
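The cost of repartitioning can be illustrated with a small experiment. The sketch below assumes one simple placement scheme, hashing each file path and taking the result modulo the number of filers, and measures how many files change filer when a fifth filer is added to four. The scheme, the path layout, and the function names are assumptions made for this example; real deployments may partition by directory, user, or date instead.

```python
import hashlib


def filer_for(path: str, num_filers: int) -> int:
    """Assign a path to a filer by hashing it (hypothetical placement scheme)."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_filers


def fraction_relocated(paths, old_count: int, new_count: int) -> float:
    """Fraction of paths whose filer changes when the filer count changes."""
    moved = sum(1 for p in paths
                if filer_for(p, old_count) != filer_for(p, new_count))
    return moved / len(paths)


# Made-up file paths, used only to exercise the placement function.
paths = [f"/photos/user{n}/img.jpg" for n in range(10_000)]
print(f"{fraction_relocated(paths, 4, 5):.0%} of files must move")
```

Under this scheme most of the data has to be copied to a different filer each time capacity is added. A single shared pool of storage avoids the move altogether.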
Moreover, as NAS filers are added, the overall availability of the system decreases. In fact, since data is being partitioned and not distributed among the NAS filers, a failure of one NAS device can bring the entire system down. Thus, the probability that at least one filer fails increases as more NAS filers are added to the system. Hence, the Mean Time Between Failures (MTBF) of the NAS deployment as a whole becomes analogous to that of striping without mirroring.
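The effect can be quantified with a simple model. The sketch below assumes that filers fail independently, that each has the same failure probability over some period, and that failure rates are constant, so the rates of the individual filers add up. These are simplifying assumptions, and the numbers are hypothetical.

```python
def prob_any_filer_fails(per_filer_prob: float, num_filers: int) -> float:
    """Probability that at least one of n independent filers fails."""
    return 1.0 - (1.0 - per_filer_prob) ** num_filers


def combined_mtbf_hours(per_filer_mtbf_hours: float, num_filers: int) -> float:
    """MTBF of a set of filers in which any single failure counts as a failure.

    With constant, independent failure rates the rates add,
    so the combined MTBF is the per-filer MTBF divided by n.
    """
    return per_filer_mtbf_hours / num_filers


for n in (1, 2, 10, 50):
    p = prob_any_filer_fails(0.01, n)          # 1% per filer, hypothetical
    mtbf = combined_mtbf_hours(100_000.0, n)   # 100,000 h per filer, hypothetical
    print(f"{n:>2} filers: P(any failure) = {p:.3f}, combined MTBF = {mtbf:,.0f} h")
```

With ten such filers the chance that some filer fails in the period rises from 1% to roughly 9.6%, and the combined MTBF drops to a tenth of the single-filer value.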
The bottom line is that, in order to manage multiple Petabytes of information cost effectively, a new breed of systems that departs dramatically from the conventional architecture is needed. This is the idea behind the HP Scalable NAS architecture, which effectively solves the performance and capacity scaling issues that seem insurmountable for conventional NAS systems.
Scale-Out Architecture for Static Media
The HP Scalable NAS solution, with its multi-headed symmetrical data access architecture, provides a viable solution to the common symptoms found in environments with enormous data growth. Difficulty scaling on both the capacity and performance fronts is one of the most common symptoms in these environments, hence the need for a scalable architecture that provides a zero-downtime platform enabling the business to grow.
With a Scale-Out and Clustered file system approach there is a single pool of storage which is accessed in parallel by various nodes in a cluster. The nodes work together to form a cohesive unit to provide concurrent access to data. Because all of the resources are stored in a single repository, no one node is taxed while a particular JPEG image file or an audio file is accessed from the system.
Each of these systems is equipped with cache that is coherent and consistent across the cluster nodes.
The benefits of this architecture are:
Shared storage: The HP Scalable NAS solution can be attached to storage with different levels of performance to minimize the overall cost of the solution. For instance, at an online photo sharing site, the high resolution images uploaded by the user are used only for value-added services such as image printing, calendar services, and others, whereas the low resolution images and thumbnails are used whenever a user requests images for viewing. It is therefore important to store the low resolution images and thumbnails on faster disk storage, which is critical in serving online customer requests. Hence, it makes sense to store the high resolution images on a capacity optimized system, such as an HP StorageWorks Modular Smart Array (MSA) fronted by an HP StorageWorks EFS Clustered Gateway or an HP StorageWorks 9100 Extreme Data Storage System (ExDS9100), and the low resolution and thumbnail images on a faster, performance optimized system, such as an HP StorageWorks Enterprise Virtual Array (EVA) paired with the HP StorageWorks EFS Clustered Gateway. HP Scalable NAS solutions, such as those presented above, allow customers to build a cluster made of heterogeneous storage with different classes of service.
Global namespace: The file system storing the static data is mounted on every node in a cluster. The data access is symmetric and parallel. The application server or Web server can issue a store or retrieve request to any of the cluster nodes, as they are all peer nodes. Management is greatly simplified because the nodes are managed as one unit through a single dashboard, as opposed to individual units.
Load balanced data access: An external load balancer such as DNS round robin or a hardware load balancer switch can be deployed to balance the Application Server connections to the cluster nodes to ensure that no one node is overloaded.
A new node can be added to improve the overall performance of the cluster. This is a critical aspect of the architecture given the spiky data access patterns that are typical of online content repository Web models.
Data access protocols: Most Web applications work on a whole file instead of manipulating portions of the file while other applications such as geo-mapping applications might need to work on certain portions within a file.
In most online Web applications, the end user uploads a file, which could be an image, audio, or video file, and the file remains unaltered throughout its lifecycle. It is accessed (read) many times but is never changed. For such data access patterns, the use of standard NAS protocols such as NFS and CIFS is overkill. HTTP, which is optimized for whole file access and provides a simple interface for data access, is therefore the dominant protocol.
The HP Scalable NAS platform supports multiple protocols, namely NFS, CIFS, and HTTP, for applications that need to manipulate static data.
Self healing and self managing: There is No Single Point of Failure (NSPOF) within the Scalable NAS cluster. The system is highly available and resilient and can sustain several component failures, including nodes, network connectivity, software stack, and disks, to name a few. The built-in monitors watch the components and initiate a failover whenever there is a failure. This means that if a Web application server issues an HTTP GET request for a user image file to a particular node in a cluster, and that node fails at the time of the request, the cluster automatically serves the request from a different, designated backup node in the cluster.
Co-hosting applications: One of the unique features of the HP Scalable NAS solution is the ability to host applications directly on the cluster nodes. For instance, in order to provide highly available Web serving, the Web server can be hosted directly on the cluster, which gives it block access to shared data, simplifies the overall management, and reduces the cost of the infrastructure.
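The load balancing and failover behaviour described in the list above can be approximated from the application server’s side. The sketch below rotates requests across peer nodes and, if a node is down, retries the same request on the next node. It is a client-side illustration only: the failover described in this paper is performed by the cluster itself, and the class, the exception, and the node names here are all hypothetical.

```python
class NodeDown(Exception):
    """Raised by the fetch function when a node cannot serve a request."""


class RoundRobinClient:
    """Spread requests over peer nodes and skip nodes that are down."""

    def __init__(self, nodes, fetch):
        self._nodes = list(nodes)
        self._fetch = fetch      # callable(node, path) -> content
        self._next = 0           # index of the node that gets the next request

    def get(self, path):
        count = len(self._nodes)
        start = self._next
        self._next = (self._next + 1) % count
        last_error = None
        # Try every node once, beginning with the round-robin choice.
        for offset in range(count):
            node = self._nodes[(start + offset) % count]
            try:
                return self._fetch(node, path)
            except NodeDown as err:
                last_error = err
        raise RuntimeError(f"no node could serve {path}") from last_error


# Simulated cluster: any peer can serve any file because storage is shared.
down_nodes = {"node-b"}

def fake_fetch(node, path):
    if node in down_nodes:
        raise NodeDown(node)
    return f"{node}:{path}"

client = RoundRobinClient(["node-a", "node-b", "node-c"], fake_fetch)
print(client.get("/img/1.jpg"))  # served by node-a
print(client.get("/img/2.jpg"))  # node-b is down, so node-c serves it
```

The retry works only because every node sees the same file system. With partitioned filers, a request for data held on a failed filer would have nowhere else to go.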
Conclusion
The high availability architecture of the HP Scalable NAS solutions ensures reliable data access to applications at all times. The added benefit of an open application platform offers a unique ability to co-host applications on the Scalable NAS servers, thereby eliminating the need for a separate application tier. With every single cluster node accessing the same shared storage content, a high degree of scalability is achieved with a single copy of content. Also, the clustered file system architecture eliminates the need to replicate content in order to keep up with demand, and it does so without affecting the quality of service. This scalable clustered architecture provides a simplified way to avoid over provisioning resources to keep up with the spiky data access pattern and the increasing growth in the digital static media arena.