Monday, 30 January 2012

Service Operation processes

Error messaging is important for all components
(hardware, software, networks, etc.). It is particularly
important that all software applications are designed to
support Event Management. This might include the
provision of meaningful error messages and/or codes that
clearly indicate the specific point of failure and the most
likely cause. In such cases the testing of new applications
should include testing of accurate event generation.
Newer technologies such as Java Management Extensions
(JMX) or HawkNL™ provide the tools for building
distributed, web-based, modular and dynamic solutions for
managing and monitoring devices, applications and
service-driven networks. These can be used to reduce or
eliminate the need for programmers to include error
messaging within the code – allowing a valuable level of
normalization and code-independence.
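The idea of a meaningful error message that names the point of failure and the most likely cause can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the Event class, its field names, the code "STOR-0042" and the severity labels are hypothetical and are not taken from any particular monitoring tool.

```python
from dataclasses import dataclass, asdict
import json


@dataclass(frozen=True)
class Event:
    """A structured event record (all field names are illustrative)."""
    code: str          # stable, machine-readable error code
    component: str     # the specific point of failure
    likely_cause: str  # the most likely cause, to help diagnosis
    severity: str      # assumed labels: informational, warning, exception

    def to_message(self) -> str:
        # Serialise to JSON so a monitoring tool can parse the event
        # without knowing the application's internals.
        return json.dumps(asdict(self), sort_keys=True)


def make_disk_full_event(mount_point: str) -> Event:
    # Hypothetical example of an application raising a meaningful event.
    return Event(
        code="STOR-0042",
        component=f"filesystem:{mount_point}",
        likely_cause="log rotation not running",
        severity="exception",
    )


event = make_disk_full_event("/var")
print(event.to_message())
```

Because every event carries the same fields, the testing of a new application can check that each failure path emits a well-formed event rather than free text.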
Good Event Management design will also include the
design and population of the tools used to filter, correlate
and escalate Events.
The Correlation Engine specifically will need to be
populated with the rules and criteria that will determine
the significance and subsequent action for each type
of event.
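One possible way to picture the rules and criteria held by a Correlation Engine is a small ordered rule table, sketched below. The Rule class, the event dictionary keys, the percentages and the action names are all hypothetical; a real engine would load its rules from configuration and support far richer criteria.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Rule:
    name: str
    matches: Callable[[dict], bool]  # the criteria for this type of event
    significance: str                # assumed labels, as in the text above
    action: str                      # hypothetical action identifier


# Rules are evaluated in order, so the most severe case comes first.
RULES: List[Rule] = [
    Rule("disk nearly full",
         lambda e: e.get("type") == "disk" and e.get("used_pct", 0) >= 90,
         "exception", "open_incident"),
    Rule("disk filling",
         lambda e: e.get("type") == "disk" and e.get("used_pct", 0) >= 75,
         "warning", "alert_operator"),
    Rule("user login",
         lambda e: e.get("type") == "login",
         "informational", "log_only"),
]


def correlate(event: dict) -> Tuple[str, str]:
    """Return (significance, action) for the first rule that matches."""
    for rule in RULES:
        if rule.matches(event):
            return rule.significance, rule.action
    # Events that match no rule are simply logged in this sketch.
    return "informational", "log_only"


print(correlate({"type": "disk", "used_pct": 95}))
```

Keeping the rules as data rather than as code scattered through the applications is what allows them to be tuned without changing the applications themselves.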
Thorough design of the event detection and alert
mechanisms requires the following:
■ Business knowledge in relationship to any business
processes being managed via Event Management
■ Detailed knowledge of the Service Level Requirements
of the service being supported by each CI
■ Knowledge of who is going to be supporting the CI
■ Knowledge of what constitutes normal and abnormal
operation of the CI
■ Knowledge of the significance of multiple similar
events (on the same CI or various similar CIs)
■ An understanding of what the support staff need to
know to support the CI effectively
■ Information that can help in the diagnosis of problems
with the CI
■ Familiarity with incident prioritization and
categorization codes so that if it is necessary to create
an Incident Record, these codes can be provided
■ Knowledge of other CIs that may be dependent on
the affected CI, or those CIs on which it depends
■ Availability of Known Error information from vendors
or from previous experience.
4.1.10.4 Identification of thresholds
Thresholds themselves are not set and managed through
Event Management. However, unless these are properly
designed and communicated during the instrumentation
process, it will be difficult to determine which level of
performance is appropriate for each CI.
Also, most thresholds are not constant. They typically
consist of a number of related variables. For example, the
maximum number of concurrent users before response
time slows will vary depending on what other jobs are
active on the server. This knowledge is often only gained
by experience, which means that Correlation Engines have
to be continually tuned and updated through the process
of Continual Service Improvement.
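The concurrent-users example can be sketched as a threshold that is computed from a related variable rather than fixed. The linear model and every number below (a base capacity of 500 users, a cost of 50 users per active job) are assumptions chosen purely for illustration; in practice the relationship would be learned from experience and retuned over time.

```python
def max_concurrent_users(base_capacity: int, active_jobs: int,
                         users_per_job: int = 50) -> int:
    """Threshold for concurrent users given the current job load.

    Assumes, for illustration only, that each active job reduces the
    usable capacity by a fixed number of users.
    """
    return max(0, base_capacity - active_jobs * users_per_job)


def breaches_threshold(current_users: int, active_jobs: int,
                       base_capacity: int = 500) -> bool:
    # The same user count may be normal on an idle server and
    # abnormal on a busy one.
    return current_users > max_concurrent_users(base_capacity, active_jobs)


print(breaches_threshold(350, active_jobs=0))
print(breaches_threshold(350, active_jobs=4))
```

With these assumed figures, 350 users is within the threshold on an idle server but exceeds it once four jobs are running, which is why a single static limit would generate either false alarms or missed events.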
4.2 INCIDENT MANAGEMENT
4.2.1 Purpose/goal/objective
The primary goal of the Incident Management process is
to restore normal service operation as quickly as possible
and minimize the adverse impact on business operations,
thus ensuring that the best possible levels of service
quality and availability are maintained. ‘Normal service
operation’ is defined here as service operation within
SLA limits.
The value of Incident Management includes:
■ The ability to detect and resolve incidents, which
results in lower downtime to the business, which in
turn means higher availability of the service. This
means that the business is able to exploit the
functionality of the service as designed.
■ The ability to align IT activity to real-time business
priorities. This is because Incident Management
includes the capability to identify business priorities
and dynamically allocate resources as necessary.
■ The ability to identify potential improvements to
services. This happens as a result of understanding
what constitutes an incident and also from being in
contact with the activities of business operational staff.
■ The Service Desk can, during its handling of incidents,
identify additional service or training requirements
found in IT or the business.
Incident Management is highly visible to the business, and
it is therefore easier to demonstrate its value than that of
most other areas in Service Operation. For this reason, Incident
Management is often one of the first processes to be
implemented in Service Management projects. The added
benefit of doing this is that Incident Management can be
used to highlight other areas that need attention –
thereby providing a justification for expenditure on
implementing other processes.
