“As new technologies draw from larger and larger datasets, incident management will continue to evolve to proactively identify incident root causes. While the industry is undergoing this evolution, it’s quickly advancing to help existing teams successfully manage their rapidly growing portfolio of digital services,” says Troy McAlpin, CEO of xMatters, in an exclusive interview with EnterpriseTalk.
ET Bureau: What do you think are the greatest challenges for today’s enterprises in resolving their IT incidents?
Troy McAlpin: Providing an excellent customer experience for any digital service is the goal of all DevOps, SRE and operations teams. An excellent customer experience means that the service works as it is supposed to and is feature-rich. Building new features continuously while maintaining a reliable, performant experience can be opposing forces: faster, faster, faster versus better, better, better. The only way to meet this demand is through thoughtful, continuous automation.
In the wake of COVID-19 and sweeping shelter-in-place orders, companies rushed to meet heightened demand for digital experiences by spending more time and money to push applications, improvements and new services to market at a relentless pace. Consequently, these applications are developing issues faster than they can be fixed. Every change, even a small one, is a new opportunity for an issue, interruption or incident to disrupt that excellent customer experience.
SRE and DevOps teams that are responsible for maintaining digital services must juggle internal expectations not only to innovate continually but also to address and quickly resolve any service degradations before they become serious, customer-impacting issues. A recent survey conducted by xMatters found that over 70% of technology teams spend half of their time managing incidents and fixing issues instead of releasing new products to market.
This fact brings to light an interesting paradox: enterprises have accelerated the deployment of digital services, but spending significant time fixing these services leaves them less time for innovation. To address this disconnect, enterprises must evolve their current approach to resolving issues.
ET Bureau: With incidents being as dynamic as they are today, what should be the first thing enterprises can do to stay on top of their game?
Troy McAlpin: Enterprises can look to implement increased automation across each stage of the incident management lifecycle—from diagnosis and collaboration to resolution and post-incident learning. Traditional approaches to issue resolution do not provide operations teams with the adaptability, agility, or information required to identify trends and potential areas for improvement.
As more organizations adopt agile, DevOps, and AIOps, the rising number of change-related incidents and the faster pace of new software releases demand that more automation be applied to accelerate actions, reduce downtime and promote continuous learning when resolving issues.
Enterprises need to develop a better way to assess, prevent, respond to, resolve and learn from technical issues and interruptions so they can focus on delivering and maintaining a reliable customer experience. Automation that is easy to build and use is a prerequisite step toward an automated approach to digital service reliability.
Continuously improving traditional ITSM or ITIL practices toward more data-driven and automated approaches is much more effective. If companies want to deliver an “always-on” customer experience, they first have to deliver automation with continuous learning to avoid, prevent and resolve digital service interruptions.
ET Bureau: How can enterprises leverage automation technologies to enhance their incident management operation?
Troy McAlpin: Digital transformation will remain a primary business focus in 2021, but increased competition among digital service providers means the stakes will be higher for delivering impeccable user experiences. To remain competitive in this environment, enterprises must place greater importance on automating the systems and processes their technical teams use to detect and resolve service degradations.
By applying automation to the software development cycle, teams can reduce the impact of service degradations and reduce the points of friction in quality, security, code integration, development, deployment and production.
Additionally, by employing workflow automation, teams can minimize slow and error-prone manual tasks, close visibility gaps that lead to service reliability issues and increase development velocity without overburdening engineers. Workflow automation can also proactively address potential issues associated with new releases, technology deployments and team scheduling, which supports key customer metrics like adoption, retention and lifetime value.
To move fast, enterprises need a service reliability platform that supports automated workflows that are easily adaptable with little to no coding needed—this will help ensure the ongoing reliability of their systems and the sanity of their product teams.
ET Bureau: What trends do you see emerging in incident management?
Troy McAlpin: As new technologies draw from larger and larger datasets, incident management will continue to evolve to proactively identify incident root causes. While the industry is undergoing this evolution, it’s quickly advancing to help existing teams successfully manage their rapidly growing portfolio of digital services.
Additionally, the accelerated digital transformation initiatives enterprises witnessed in 2020 point to an increased convergence of features and capabilities across AIOps and modern incident management. Looking forward, I expect this trend to continue as vendors seek to progress from insight to action.
The acceleration of digital transformation witnessed in 2020 served as a precursor to this consolidation, acting as a market driver for companies to invest in these capabilities. The industry will see continued acceleration in feature development, either in-house or through acquisitions, in the AIOps and incident management space as vendors aggressively look to buy, build or partner in order to advance and accelerate their product roadmaps.
Troy McAlpin brings more than 20 years of experience to his leadership role at xMatters, with expertise in process automation, strategic initiatives and corporate strategy. Under Troy’s direction, xMatters has empowered over 650,000 paid IT power users and 2.6 million total users on the xMatters platform worldwide to prevent IT issues from impacting the customer experience. Troy’s domain experience includes IT strategy and vertical market expertise including technology, banking, consumer and retail industries. Prior to founding xMatters, formerly AlarmPoint Systems, he managed marketing, sales, development, M&A and financial aspects at two successful start-up companies and also worked at AT&T Solutions and Accenture.