
TISIS v3.0 – Real-time Storage Optimization

TISIS brings rock-solid, proven management and extensive analytics to HPC environments and the cloud

Overview

The Terascala Integrated Storage Information System (TISIS™) v3.0 builds upon generations 1 and 2 of Terascala’s proven Lustre parallel file system appliance technology to bring rock-solid, proven management and extensive analytics to HPC environments and the cloud. With TISIS, administrators can go beyond a simple ‘red light / green light’ system view and gain insight into how best to tune their environment based on workflow. TISIS v3.0 is designed to provide real-time optimization of Fast Data storage, which accelerates the delivery of data to large pools of servers so that organizations can process and use it as quickly as possible. TISIS v3.0 is a browser-based solution that is available on a range of Terascala-based appliances.

The TISIS architecture is based on a unique approach to Fast Data storage optimization, which combines a number of typically separate data sources. With the ability to monitor parallel file system operation, underlying storage platform performance, and client access, TISIS is able to monitor, learn, and predict the characteristics of Fast Data compute jobs to optimize execution and resource allocation.

With its data collection capabilities, TISIS combines different views, enabling administrators to track many of the metrics required for a Service Level Agreement (SLA) for Fast Data storage services, including bandwidth usage, capacity consumption, client node access, and specific job resource usage. With this data, TISIS helps organizations understand and optimize their storage infrastructure and perform data-driven capacity, throughput, and utilization analysis, scheduling optimization, and complete system management.

TISIS incorporates four key elements that together deliver the visibility and control needed to optimize a complex parallel storage appliance:

  • TISIS Analytics: enables administrators to optimize systems over time through a set of pre-defined views that provide both total and relative utilization metrics on OSS nodes, OST arrays, network traffic, metadata operations, client access and more.
  • TISIS Console: enables users to manage complex, multifaceted parallel storage solutions through a single pane of glass.
  • TISIS Server: connects and processes all the component data for use by Terascala’s notification system, enabling prompt, relevant notices about any potential system issues.
  • TISIS Agents: collect real-time system data from across the solution.

All of these elements combine to provide users with complete, easy-to-understand, real-time system management, enabling both quick response to potential system issues and information for optimizing system use over time.

Features

Ease of management: Single pane of glass management for complex parallel file system appliances

Depth of control: Navigate from top level views to specific component details and individual LUN usage within three clicks of the mouse

Responsiveness: Automated notifications assist support staff while enhancing response time

Simplicity: Enables system administrators without extensive parallel file system experience to easily manage and maintain their environment

Optimization: Leverages real-time data collection to enable application-driven tuning

Planning: Leverages data collected over time to enable accurate planning, understand usage by user, and optimize overall system performance

 

Management Tool Details

Administrators using TISIS interact with the system through four key interfaces:

  • The Dashboard home page provides an overview of the entire system, giving a single “red light/green light” status update and real-time charts for performance, utilization and balance. From the dashboard, users can drill down to any of the three other interfaces.
  • The Alert Management Console provides simple alert review and notification setup for the storage solution. Within the console, users can review and react to any system alerts that are generated. Administrators can also tune alert notifications to ensure that issues are properly reported. Examples include adjusting temperature thresholds to match “real” environment conditions and minimize false reports, or adjusting parameters to minimize flapping so that repeated errors are not reported as separate events. Administrators can also review event history to identify patterns or trends.
  • The Operations Analytics Console provides the tools and views for analyzing and understanding system operations over time. With this console, an administrator can analyze system performance to look for overloaded individual OSS servers, poor data striping, client access patterns or unbalanced application codes. This console enables administrators to detect patterns that can improve file system performance, to understand and justify the need for additional capacity or performance, and to help users understand unproductive data access patterns.
  • The Terascala Management Console delivers a single view of the underlying storage platform. This console handles all the “maintenance” tasks associated with the file system, including disk failures and replacements, software and firmware upgrades, file system errors, and any other issues or parameter updates that need to be addressed.
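To make the idea of minimizing flapping concrete, here is a minimal Python sketch of one common way to collapse repeated errors into a single reported event. It is not TISIS code: the Alert record, the suppress_flapping function, and the 300-second quiet window are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    timestamp: float  # seconds since an arbitrary epoch
    source: str       # hypothetical component name, e.g. "oss03/temp"
    message: str


def suppress_flapping(alerts, window_seconds=300.0):
    """Report an alert only if the same (source, message) pair has been
    quiet for longer than window_seconds; otherwise treat it as a repeat."""
    last_seen = {}
    reported = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        key = (alert.source, alert.message)
        previous = last_seen.get(key)
        if previous is None or alert.timestamp - previous > window_seconds:
            reported.append(alert)
        # Track every occurrence so a continuously flapping alert
        # stays suppressed until it has actually gone quiet.
        last_seen[key] = alert.timestamp
    return reported


raw = [
    Alert(0, "oss03/temp", "over threshold"),
    Alert(30, "oss01/disk7", "failed"),
    Alert(60, "oss03/temp", "over threshold"),
    Alert(120, "oss03/temp", "over threshold"),
    Alert(1000, "oss03/temp", "over threshold"),
]
for alert in suppress_flapping(raw):
    print(alert.timestamp, alert.source, alert.message)
```

Widening the window reports fewer duplicates but delays notice of a problem that recurs after a short quiet period, which is the kind of trade-off an administrator would tune for a given environment.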

Use cases with the TISIS system include:

Simple management of a disk failure:
With the Terascala solution, when there is a disk (or other component) failure, administrators receive an email that highlights the issue. Administrators can then go to the console, see which nodes are highlighted in red, drill down to the specific disk array, and open the array to find the actual drive. In only three steps, administrators can determine exactly which drive was affected.

Determine the impact of a large job started on a cluster running a number of small jobs:
With TISIS, administrators are able to see the impact that a large job has on the performance of the smaller jobs through pre-defined views that show I/O performance per OSS, network performance, client interaction and total system throughput.

Understanding capacity usage to plan expansion:
With TISIS’s pre-defined views and long-range data collection, administrators can graph space utilization over an extended period of time. With this knowledge, administrators can understand whether capacity usage is cyclical, dependent on particular users, growing consistently over time, or accelerating suddenly. Understanding capacity patterns can help teams properly plan how resources are spent.
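As an illustration of the planning arithmetic for the "growing consistently over time" case, the following Python sketch fits a straight line to capacity samples and estimates how long until the file system fills. It is only a sketch under stated assumptions: the function names, the sample data, and the 200 TB capacity are hypothetical, and a linear fit would be misleading for cyclical or suddenly accelerating usage.

```python
def fit_growth(days, used_tb):
    """Ordinary least-squares line through (day, used capacity) samples.
    Returns (slope in TB per day, intercept in TB)."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(used_tb) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, used_tb))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def days_until_full(days, used_tb, capacity_tb):
    """Days after the last sample at which the fitted line reaches capacity,
    or None if usage is flat or shrinking."""
    slope, intercept = fit_growth(days, used_tb)
    if slope <= 0:
        return None
    return (capacity_tb - intercept) / slope - days[-1]


# Hypothetical samples: one reading every 10 days, growing 1 TB per day.
sample_days = [0, 10, 20, 30]
sample_used = [100.0, 110.0, 120.0, 130.0]
print(days_until_full(sample_days, sample_used, capacity_tb=200.0))
```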

Understanding throughput performance over time relative to application performance:
With TISIS’s ability to track and report on bandwidth utilization over time, administrators can understand where bandwidth is being consumed and whether additional bandwidth would assist with overall application performance.

Hotspot monitoring:
By leveraging all the different data views of the appliance, TISIS can assist in understanding where and why hot spots are occurring within the file system. For example, if an Object Storage Server (OSS) is showing heavy utilization, the administrator can determine whether multiple clients are accessing a file on a single OSS (in which case striping the file across more servers may help) or whether a single client is overloading the OSS, which may imply an application issue.
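The distinction drawn above can be expressed as a simple rule over per-client traffic on the busy OSS. The Python sketch below is illustrative only: classify_hotspot, the per-client byte counts, and the 80% dominance threshold are assumptions, not part of TISIS.

```python
def classify_hotspot(bytes_by_client, dominance=0.8):
    """Classify load on one heavily used OSS from per-client I/O volume.

    Returns ("single-client", name) when one client accounts for at least
    the `dominance` share of traffic, ("multi-client", None) when load is
    spread across clients, and ("idle", None) when there is no traffic.
    """
    total = sum(bytes_by_client.values())
    if total == 0:
        return ("idle", None)
    top_client, top_bytes = max(bytes_by_client.items(), key=lambda kv: kv[1])
    if top_bytes / total >= dominance:
        # One client dominates: look at that application's access pattern.
        return ("single-client", top_client)
    # Many clients share the load: wider striping may spread it out.
    return ("multi-client", None)


print(classify_hotspot({"node01": 900, "node02": 50, "node03": 50}))
print(classify_hotspot({"node01": 300, "node02": 350, "node03": 350}))
```

The threshold is a judgment call; a lower value flags more cases as application issues, while a higher value attributes more of them to file layout.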