PBA Zone : IT Environment : Instrumentation, Monitoring and Troubleshooting Systems Area
 
IT data centers are ecosystems of many solutions that must be monitored, analyzed, fixed, updated and adapted on a day-to-day basis. This is not an easy task, and many organizations work diligently to optimize their management environment for the various solutions they must support. When a new solution requires a data center environment to adopt completely different management systems and techniques, it adds operational cost and complexity for the organization and can decrease the company's competitive capability. However, there may be times when such duplication is warranted (the payoff is bigger than the capital cost, operating cost and complexity of the duplicated management system).

Instrumentation, Monitoring processes & systems:
Instrumentation and monitoring systems are crucial to any good data center environment. These systems enable many solutions to be measured and instrumented consistently. Keeping the number of these systems as small as possible promotes consistency and predictability, which reduces complexity in a data center (of course, there are exceptions to this statement). If a solution requires significant modification to the monitoring and instrumentation systems (or worse, requires an entirely different instrumentation and monitoring system), it could negatively impact the organization’s instrumentation and monitoring capabilities.

Troubleshooting & Root Cause Analysis processes & systems:
Most enterprise organizations have systems, processes, techniques and staff to troubleshoot, conduct some form of root cause analysis and repair a solution’s system components when needed. Reducing MTTR (mean time to repair) is critical for a company to handle normal and abnormal exceptions in its day-to-day activities. The solution must align well with the troubleshooting and repair model the organization relies on to stay successful.
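
As a rough illustration of the MTTR measure mentioned above, the sketch below averages the time between detection and repair over a set of incidents. The incident log and the function name `mean_time_to_repair` are hypothetical, made up for this example; a real organization would pull these timestamps from whatever ticketing or monitoring system it already uses.

```python
from datetime import datetime, timedelta

def mean_time_to_repair(incidents):
    """Average the (detected -> repaired) duration over a list of incidents.

    Each incident is a (detected, repaired) pair of datetimes.
    """
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum(
        (repaired - detected for detected, repaired in incidents),
        timedelta(),  # start value so sum() adds timedeltas, not ints
    )
    return total / len(incidents)

# Hypothetical incident log (illustrative data only).
incidents = [
    (datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 10, 30)),   # 90 minutes
    (datetime(2024, 1, 9, 14, 0), datetime(2024, 1, 9, 14, 30)),  # 30 minutes
    (datetime(2024, 2, 1, 22, 0), datetime(2024, 2, 2, 0, 0)),    # 120 minutes
]

mttr = mean_time_to_repair(incidents)
print(mttr)  # 1:20:00
```

Tracking this number before and after a new solution is introduced is one simple way to see whether the solution fits the existing troubleshooting and repair model.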

Perspective Capture Questions: Management System Environment Area:

a) What systems and processes exist for instrumenting and monitoring in the IT organization?

b) How are these instrumentation and monitoring systems utilized in the IT organization?

c) What specific solutions are managed by these instrumentation and monitoring systems?

d) What systems and processes exist for troubleshooting and root cause analysis in the IT organization?

e) How are these troubleshooting and root cause analysis systems and/or processes utilized in the IT organization?

f) What specific solutions are managed by these troubleshooting and root cause analysis systems and processes?

Alternative Impact Questions: Management System Environment Area:

a) What was (or could be) the impact of current or alternative architecture decisions on the existing instrumentation and monitoring systems in the IT organization?

b) What was (or could be) the impact of current or alternative architecture decisions on the existing troubleshooting and root cause analysis systems in the IT organization?

Proposed Impact Questions: Management System Environment Area:

a) What would be the impact of the proposed architecture decisions on the existing instrumentation and monitoring systems in the IT organization?

b) What would be the impact of the proposed architecture decisions on the existing troubleshooting and root cause analysis systems in the IT organization?