OpsRamp ITSM Platform
2018 • Product Designer
OpsRamp is an IT Operations management SaaS platform that provides visibility and control of cloud and on-premises infrastructure and point tools through discovery, monitoring, alert management, artificial intelligence, and automation.
I worked on the OpsRamp IT Operations platform and worked with product management to prototype high-level concepts. OpsRamp was acquired by Hewlett Packard Enterprise (HPE) in 2023.
Overview
I worked on the OpsRamp IT Operations platform, partnering with the VP of product management to prototype high-level concepts. Our projects included reimagining the L1/L2 triage workflow, an app store, analytics, and more. OpsRamp customers were frequently large enterprises, so I always designed with extensibility in mind.
My Role
I worked as a product designer at OpsRamp on many projects, from ideation and exploring new concepts to delivering design specs. Often, I worked directly with the VP of product management to iterate on new concepts weekly. From there, my designs served as mental models for the future or were handed to engineering as specs.
Problem
In an IT environment, issues constantly arise across various resources. To mitigate the damage, the ITSM world relies on the practice of problem management, which digs deeper into the causes behind IT incidents.
The idea is to understand why systems malfunction and how to prevent future occurrences.
This means engineers need to identify immediate triggers, such as corrupted files, and then explore contributing factors and underlying conditions.
For me as a designer, this required a deep understanding of the personas using the platform: what DevOps or L1/L2 engineers do day to day, and how that work could be improved.
Opportunity
The opportunity was to learn about customer problems and continue to advance the value of the platform by introducing new features and modules. There is a continuously growing need for better triage tools, and I explored some of these concepts.
Process
I researched how ITSM platforms manage resources across all environments and how OpsRamp customers use the platform. This required speaking with OpsRamp engineers who triage issues. I needed to understand every aspect of how they do their job.
I learned what an engineer looks for when investigating an issue, and how systems of rules are set up to automate what they can before a human intervenes.
I dove deep into the platform to understand its full capabilities. OpsRamp had an existing enterprise customer base and already offered many capabilities, such as intelligent automation and ITSM and APM management.
It was important for me to learn the ITSM ticketing lifecycle, from how an issue arises to how a human intervenes to fix it. By understanding the pain points along the way, I could understand the user's mindset at each touchpoint in the UI.
My Work
Here is the work I did with OpsRamp. I worked on multiple projects:
Designed a way to query resources and correlate their incidents, alerts, service maps, and more.
Reimagined how L1/L2 engineers can triage issues with resources.
Designed an external-facing store where companies could install apps on the IT platform. This was for both users and developers.
Designed the first set of analytics apps. These were used as reference in the OpsRamp developer SDK.
As part of our discovery projects, we explored new UX concepts for making the lives of L1/L2 engineers easier. When things go wrong, how can we help them figure out what the problem is and fix it? In this demo I designed, a user could query a set of resources, and the system would determine all correlated alerts, incidents, and so on.
The goal was to eliminate the pain point of investigation, and greatly reduce the time it takes to find the root cause of errors.
The model we developed was that by clicking on each object, a drawer would slide into view with the most pertinent information that an engineer needs to know. They could read it, pin it if they want, then close it. The idea was to optimize for the continuous scanning of resources and data that an engineer needs to do when they are conducting an investigation.
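As a rough illustration of the query-and-correlate idea in this demo, here is a minimal sketch. It is not OpsRamp's implementation: the `Resource` and `Signal` types, the tag-based query, and the sample data are all hypothetical, and real correlation would be far richer than matching on a resource ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    # Hypothetical shape of a monitored resource.
    id: str
    name: str
    tags: frozenset = frozenset()


@dataclass(frozen=True)
class Signal:
    # Hypothetical record for anything attached to a resource.
    kind: str          # e.g. "alert" or "incident"
    id: str
    resource_id: str
    summary: str


def query_resources(resources, tag):
    """Return the resources carrying the given tag (a stand-in for a real query)."""
    return [r for r in resources if tag in r.tags]


def correlate(resources, signals):
    """Group signals by the resource they reference, then by kind."""
    wanted = {r.id for r in resources}
    grouped = {rid: {} for rid in wanted}
    for s in signals:
        if s.resource_id in wanted:
            grouped[s.resource_id].setdefault(s.kind, []).append(s)
    return grouped


# Made-up sample data for illustration only.
RESOURCES = [
    Resource("r1", "db-primary", frozenset({"prod", "database"})),
    Resource("r2", "web-01", frozenset({"prod", "web"})),
    Resource("r3", "web-staging", frozenset({"staging", "web"})),
]
SIGNALS = [
    Signal("alert", "a1", "r1", "High disk latency"),
    Signal("incident", "i1", "r1", "Checkout errors"),
    Signal("alert", "a2", "r3", "CPU spike"),
]

matched = query_resources(RESOURCES, "prod")
related = correlate(matched, SIGNALS)
```

In the drawer model, each entry in `related` would correspond to what an engineer sees after clicking a resource: its alerts and incidents gathered in one place.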
When we spoke to DevOps engineers, we wanted to understand their process for resolving issues with resources. As larger services get impacted, it becomes increasingly difficult to determine the issue. I designed a demo (when the UI was a little different) to show more data in an incident. This meant showing a service map with the ability to see details and metrics per node.
The mental model was that action icons lived in the upper-right corner of the screen, either at the overall ticket level or in the service map area. During research, I needed to understand every piece of information necessary to build a service map, then condense it while keeping it readable.
It was important to see which specific resources were impacted, and when. Here I designed a card/table view of these resources. One distinction is that each impacted resource can be filtered by status or time frame individually, rather than filtering across the whole list. This gave engineers a more granular view.
I designed the OpsRamp Store, a place where OpsRamp users can install various apps in their IT environment. The idea was to design both the consumer and developer sides.
This design followed a more conventional approach to enterprise app stores. Users see a list of apps, with the ability to search or filter. I wanted it to feel as clean as possible while using the full space of each block. For this reason, I right-aligned the number of downloads to create a fuller page.
The detail page followed what one would see on app store pages: a title, tabbed views, and a large CTA. In the right-hand panel, I displayed attributes that were common across all apps (e.g., version, category, and others).
For the developer side, we explored the app approval lifecycle. Developers would create the app in their IDE using the OpsRamp SDK, and push a tag of that instance to the store (referencing the app ID). From there, they could test it in their OpsRamp IT platform test environment, and then submit the app for approval and publishing.
Here is the location where developers could manage their apps for the OpsRamp store. Apps were developed in the IDE, pushed to the store, and then taken through the approval lifecycle.
I kept the UI consistent by keeping each version in a block, just as the apps on the consumer side were in blocks. Additionally, in working with product management, the idea was to show all the stages as headers, so the user could simply scroll the screen to see the status of all versions.
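The approval lifecycle and the stage-header layout can be sketched roughly as follows. This is only an illustration under assumptions: the stage names, the strictly linear transitions, and the version tags are hypothetical, and a real lifecycle would likely also need a rejection path.

```python
from enum import Enum


class Stage(Enum):
    # Hypothetical stage names, loosely following the steps described above.
    PUSHED = "Pushed"
    TESTING = "Testing"
    SUBMITTED = "Submitted for approval"
    PUBLISHED = "Published"


# Assumed linear flow: each stage can only move to the next one.
ALLOWED = {
    Stage.PUSHED: {Stage.TESTING},
    Stage.TESTING: {Stage.SUBMITTED},
    Stage.SUBMITTED: {Stage.PUBLISHED},
    Stage.PUBLISHED: set(),
}


def advance(current, target):
    """Return the new stage, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


def group_by_stage(versions):
    """Map each stage (in lifecycle order) to the version tags currently in it."""
    board = {stage: [] for stage in Stage}
    for tag, stage in versions.items():
        board[stage].append(tag)
    return board


# Made-up version tags for a single app.
versions = {
    "v1.0.0": Stage.PUBLISHED,
    "v1.1.0": Stage.SUBMITTED,
    "v1.2.0": Stage.PUSHED,
}
board = group_by_stage(versions)
```

Each key in `board` maps to one stage header in the UI, with the version blocks listed beneath it, so scrolling the page shows where every version stands.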
I redesigned many object list/detail pages such as agents, gateway profiles, management profiles, and more. This required knowing the relationship between all objects and how actions affected each other.
I designed the analytics reporting area. Product management's idea was to have the analytics apps built by developers (as opposed to a dedicated area like Google Analytics). Additionally, they wanted each report to look like an actual PDF that could be shared.
I designed four different analytics apps (all using this style). This became part of the appearance style for the developer SDK.
The idea was to have the content of the analytics app on a PDF-like background. This way, when shared or exported, it felt more like how one would expect to hand off a report to upper management.
Results
OpsRamp continued to grow in size and attract large customers such as Dell and HPE. The engineering team was fantastic, and Hewlett Packard Enterprise, one of its customers, acquired the company.
This was a great experience working on a platform in the ITSM and APM space. I would often research key players in the space like ServiceNow, Splunk, Datadog, or PagerDuty when doing UX research. The concepts were often very technical, so I always had to understand the system architecture when designing.