Verizon Diagnostics for Managed WAN/LAN Services Instruction Manual
- June 4, 2024
- Verizon
Table of Contents
- Introduction
- Generic Incident Management Process
- System Architecture
- Automation
- Triage
- Integrated Test System (ITS)
- Short Hits Automation
- Remote Power Verification
- Power Automation
- DSL – Short Duration Outages
- LAN Automated Testing
- Site Isolation Indication
- Monitor Service
- Planned Maintenance
Diagnostics for Managed
WAN/LAN Services
September 2021
Document ID: VZK047895
Introduction
The purpose of this presentation is to provide an overview of some of the
automated diagnostics used in Verizon’s Incident Management process for
Managed Services.
It is a generic overview and therefore exceptions as well as custom
arrangements are not included.
Verizon seeks to drive a globally consistent “best practice” incident
resolution process on which Incident Management automation is based.
Examples of the benefits that automated diagnostics can provide:
- Ability to locate and fix network faults automatically
- Detailed test results included in incident tickets which are used by NOC Engineers to facilitate repair
- Automatically routing incidents to internal specialist teams and external 3rd-party suppliers
- Publishing incident resolution progress in near real time and in a consistent way on the customer portal and via email notifications
NOTE: Automated diagnostics are tailored to the type of Product or Service, network access technology, and geographic location.
Generic Incident Management Process
- Upon ticket creation, automation emails the customer contacts to request confirmation that the issue is not caused by a power outage or maintenance, as well as to confirm the local contact name, phone number and access hours. It is important to provide this information as early as possible. A quick link is provided to facilitate this.
- Meanwhile, automated testing will begin. If the cause can’t be identified by the system or the ticket can’t be progressed automatically, the ticket will be assigned to a repair engineer.
- The engineer assigned to the ticket will follow standardized departmental operating procedures to diagnose and resolve the issue. Next steps could include opening a ticket with a LEC/PTT, referring the ticket to another group for further troubleshooting (local NOC, Field Operations) and contacting the customer for power and equipment checks.
System Architecture
Automation
Automation is used not only to create alarms and tickets, but also throughout the ticket lifetime: to improve troubleshooting efficiency through automated data gathering, and to increase productivity through controlled work assignment and automated ticket handling.
Triage
A newly created proactive ticket is placed in the “Triage” phase.
Various automatic diagnostics programs are run during this phase:
- The type of diagnostics is technology dependent (e.g. Ethernet versus DSL)
- The output (i.e. diagnostics data) may be highly technical
- The output is published on the customer portal
If the automation determines that further troubleshooting is required, the following happens:
- The ticket will be automatically transferred from Triage to the responsible NOC
- A NOC technician will be automatically assigned to perform further diagnostics
- Additional diagnostics run after the triage phase (Monitor Service, Power Automation, etc.)
Integrated Test System (ITS)
The Integrated Test System performs automated diagnostics immediately after proactive ticket creation and publishes the results on the customer portal during the triage phase.
Short Hits Automation
Alarm clears during automated ITS testing:
This automation handles cases where, during the automated testing phase of a
proactive ticket (the first 20-30 minutes of ticket life), the system
determines that the initiating alarm has cleared. Depending on the automated
test result, one of the following happens:
- The ticket is automatically resolved and closed.
- The ticket is set to Monitor Service for 72 hours to verify service stability before automatic ticket closure.
Remote Power Verification
- Device unreachable -> Proactive ticket is created
- Command Factory tries to access the CPE via an Out of Band (OOB) modem or via an alternate path
- If Command Factory is able to access the CPE, it is assumed that both the CPE and the access circuit modem have power. In this case automation publishes the positive power verification on the customer portal.
Power Automation
Power Automation has been developed to detect a device's loss of power.
- Device unreachable -> Proactive ticket is created for the device being down
- Device is reachable again and automation has been able to verify that the device uptime is less than 30 minutes (for EMEA managed) or less than 99 minutes (for US managed)
- Ticket is automatically resolved and closed as power-related
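The uptime check above can be sketched as follows. This is a minimal illustration, not Verizon's implementation: the function name and the region labels are hypothetical, and only the 30-minute and 99-minute thresholds come from the text.

```python
from datetime import timedelta

# Thresholds are the ones quoted above; the region labels are illustrative.
UPTIME_THRESHOLDS = {
    "EMEA": timedelta(minutes=30),
    "US": timedelta(minutes=99),
}


def is_power_related(managing_region: str, device_uptime: timedelta) -> bool:
    """Return True when a recovered device's uptime is below the regional threshold.

    A short uptime means the device restarted recently, which is read here
    as a sign that it lost power.
    """
    return device_uptime < UPTIME_THRESHOLDS[managing_region]
```

For example, a device managed from EMEA that reports 12 minutes of uptime after becoming reachable again would be treated as power-related, while one reporting 45 minutes would not.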
DSL – Short Duration Outages
DSL automation aims to optimize resources by auto-resolving trouble tickets
created for DSL, Cellular, Wireless and Broadband circuit issues. Because these
types of circuits are prone to instability, proactively created trouble tickets
for them are released to a technician only if the alarm is still active 45
minutes after ticket creation.
If all alarms have cleared within 45 minutes of ticket creation, the ticket is
closed automatically. No in-depth root cause analysis is provided, as most
providers will not research root cause if the service recovers quickly.
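The 45-minute rule can be expressed as a small decision function. The sketch below is illustrative only: the function name, the outcome labels, and the treatment of an alarm that clears at exactly 45 minutes are assumptions, not part of the documented process.

```python
from datetime import timedelta
from typing import Optional

HOLD_PERIOD = timedelta(minutes=45)  # hold period quoted in the text


def dsl_ticket_outcome(cleared_after: Optional[timedelta]) -> str:
    """Decide what happens to a proactive ticket at the end of the hold period.

    cleared_after is the time between ticket creation and the moment all
    alarms cleared, or None if an alarm is still active.
    """
    # Assumption: clearing at exactly 45 minutes still counts as "within".
    if cleared_after is not None and cleared_after <= HOLD_PERIOD:
        return "auto_close"
    return "release_to_technician"
```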
LAN Automated Testing
When automated testing determines that the alarm originates from a
customer-owned LAN device connected to the Verizon managed service, a
notification is sent to the customer recommending a check of the power, cabling
and equipment of their LAN device(s).
The proactive incident ticket is placed on hold until Verizon receives customer
feedback regarding these checks. No engineer is assigned during this 12-hour
hold period, and if the alarm clears during the hold period the ticket is
closed automatically.
Site Isolation Indication
Our automated diagnostic tools detect potential site isolations by correlating
entity alarms. The automated diagnostic tools verify if there is an alarm for
the HSRP/VRRP peer entity or for the alternate access circuit path.
The site isolation status is published on the customer portal during the
triage phase.
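As a rough illustration of alarm correlation, the sketch below flags a potential site isolation when an alarming entity's redundancy peer or alternate path is alarming as well. The function name, the entity identifiers, and the choice to flag on any alarming peer (following the "or" in the text) are assumptions; a site with no known peer or alternate path is simply not flagged here.

```python
from typing import Iterable, Set


def is_potential_site_isolation(
    active_alarms: Set[str],
    entity: str,
    redundant_entities: Iterable[str],
) -> bool:
    """Flag a potential site isolation by correlating entity alarms.

    redundant_entities lists the HSRP/VRRP peer and/or the alternate
    access circuit path associated with the alarming entity.
    """
    if entity not in active_alarms:
        return False
    # Assumption: one alarming peer or alternate path is enough to flag.
    return any(peer in active_alarms for peer in redundant_entities)
```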
Monitor Service
This automation runs whenever a ticket is unassigned, or is assigned but the
NOC technician has not yet started working on it, and all alarms are clear.
Once the alarm clears the Monitor Service automation starts:
- The ticket is placed in the deferred state and enters Monitor Service for 6 hours while IMPACT monitors for service stability.
- In case the alarm(s) reoccur, the ticket will be automatically released to the queue to be assigned to the next available NOC technician.
- Upon a successful monitoring phase, either automation or a NOC technician will resolve the ticket; the ticket is then closed automatically after 72 hours.
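The Monitor Service rules can be summarized in two small functions. This is a sketch under stated assumptions: the function names, parameters and outcome labels are hypothetical, and only the entry condition, the 6-hour monitoring period and the 72-hour auto-close delay are taken from the text.

```python
from datetime import timedelta
from typing import List

MONITOR_PERIOD = timedelta(hours=6)     # monitoring duration from the text
AUTO_CLOSE_DELAY = timedelta(hours=72)  # delay between resolution and closure


def eligible_for_monitor_service(
    assigned: bool, work_started: bool, all_alarms_clear: bool
) -> bool:
    """Ticket is unassigned, or assigned but untouched, and all alarms are clear."""
    return all_alarms_clear and (not assigned or not work_started)


def monitor_service_outcome(alarm_recurrences: List[timedelta]) -> str:
    """Return the outcome of the monitoring phase.

    alarm_recurrences holds the offsets, measured from the start of
    monitoring, at which an alarm reoccurred.
    """
    if any(offset < MONITOR_PERIOD for offset in alarm_recurrences):
        return "released_to_queue"
    return "resolved"  # auto-closes AUTO_CLOSE_DELAY after resolution
```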
Planned Maintenance
The diagnostic systems check if there is known Planned Maintenance in progress
against the alarming device. If planned maintenance is in progress, the ticket
is opened as priority 4 and no automatic testing is performed. The ticket
stays in the “Triage” phase.
As soon as the maintenance finishes, the ticket is automatically closed. If the
alarm has not cleared when the maintenance window ends, a new priority 1 ticket
is automatically created.
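The planned-maintenance handling can be sketched as below. The class, function names, field names and action labels are hypothetical and chosen for illustration; the priorities and the Triage phase are the ones stated above. Tickets raised outside a maintenance window follow the normal process, which this sketch does not model.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class MaintenanceWindow:
    start: datetime
    end: datetime

    def in_progress(self, at: datetime) -> bool:
        return self.start <= at < self.end


def maintenance_ticket_fields(
    window: Optional[MaintenanceWindow], created_at: datetime
) -> Optional[dict]:
    """Return ticket fields if maintenance is in progress, else None."""
    if window is not None and window.in_progress(created_at):
        return {"priority": 4, "phase": "Triage", "automatic_testing": False}
    return None  # normal ticket handling, outside this sketch


def actions_at_window_end(alarm_cleared: bool) -> List[str]:
    """List the actions taken when the maintenance window ends."""
    actions = ["close_maintenance_ticket"]
    if not alarm_cleared:
        actions.append("create_priority_1_ticket")
    return actions
```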
Verizon Public