There is a vast difference between running controlled benchmarks in a lab and running a diverse set of real-world applications in a business-critical setting. Without in-depth analytics, storage administrators and application users are vulnerable to lost productivity caused by component failure, sub-optimal workflow decisions, and improper system configuration.
Analytics are typically used to discover and resolve hardware failures, application issues such as resource contention, and file system tuning opportunities. TeraOS collects key performance indicators for all major hardware and software components within the appliance, including:
- MDS/OSS servers
- storage arrays
- operating system
- Lustre file system
TeraOS also monitors client and application access patterns to provide a 360-degree view of the health and performance of the storage appliance. A high-level dashboard provides an easy-to-understand summary of health and performance, with targeted drill-down capability for detailed analysis and problem resolution. Individual sensor values are correlated and combined to detect abnormal conditions, which are communicated to the user in the form of problem-level alerts. The base scanning interval for sensors is five seconds; however, this interval is configurable, so it can be adjusted as the system scales.
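The idea of combining individual sensor values into a single problem-level alert can be illustrated with a minimal Python sketch. This is not TeraOS code: the sensor names, thresholds, `Reading` type, and `correlate` function are all hypothetical, and the rule used here (raise one alert per component when at least two of its sensors are out of range in the same scan) is only one simple assumption about how such correlation might work.

```python
from dataclasses import dataclass

# Base scan interval stated in the text; a caller would run one scan per interval.
SCAN_INTERVAL_SECONDS = 5

@dataclass
class Reading:
    component: str   # hypothetical component name, e.g. "oss01"
    sensor: str      # hypothetical sensor name, e.g. "disk_latency_ms"
    value: float

# Hypothetical upper limits per sensor; real limits would come from the appliance.
THRESHOLDS = {
    "disk_latency_ms": 50.0,
    "cpu_util_pct": 90.0,
    "io_queue_depth": 32.0,
}

def correlate(readings, min_sensors=2):
    """Return one problem-level alert per component that has at least
    `min_sensors` sensors above their thresholds in this scan."""
    abnormal = {}
    for r in readings:
        limit = THRESHOLDS.get(r.sensor)
        if limit is not None and r.value > limit:
            abnormal.setdefault(r.component, []).append(r.sensor)
    return [
        {"component": component, "sensors": sorted(sensors)}
        for component, sensors in sorted(abnormal.items())
        if len(sensors) >= min_sensors
    ]

# Example scan: only oss01 has two abnormal sensors, so only it raises an alert.
scan = [
    Reading("oss01", "disk_latency_ms", 80.0),
    Reading("oss01", "io_queue_depth", 40.0),
    Reading("mds01", "cpu_util_pct", 95.0),
    Reading("oss02", "disk_latency_ms", 10.0),
]
alerts = correlate(scan)
```

In this sketch a single out-of-range value (as on `mds01`) does not raise an alert by itself; requiring agreement between several sensors is one way to surface problems rather than raw sensor noise.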
Parallel file systems such as Lustre are inherently complex. Identify and correct problems before productivity is affected.