SevOne Data Platform Use-Cases

SevOne Documentation

All documentation is available from the IBM SevOne Support customer portal.

All right, title, and interest in and to the software and documentation are and shall remain the exclusive property of IBM and its respective licensors. No part of this document may be reproduced by any means nor modified, decompiled, disassembled, published or distributed, in whole or in part, or translated to any electronic medium or other means without the written consent of IBM.

IN NO EVENT SHALL IBM, ITS SUPPLIERS, NOR ITS LICENSORS BE LIABLE FOR ANY DAMAGES, WHETHER ARISING IN TORT, CONTRACT OR ANY OTHER LEGAL THEORY EVEN IF IBM HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES, AND IBM DISCLAIMS ALL WARRANTIES, CONDITIONS OR OTHER TERMS, EXPRESS OR IMPLIED, STATUTORY OR OTHERWISE, ON SOFTWARE AND DOCUMENTATION FURNISHED HEREUNDER INCLUDING WITHOUT LIMITATION THE WARRANTIES OF DESIGN, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, AND NONINFRINGEMENT.

IBM, the IBM logo, and SevOne are trademarks or registered trademarks of International Business Machines Corporation, in the United States and/or other countries. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on ibm.com/trademark.

About

This manual is intended to provide various use-cases for the valuable features available from SevOne NMS and SevOne Data Insight.

Use-Cases

Alerts

Baseline Standard Deviation
	Problem	Description	Solution	Value
1.	Static thresholds generate false alerts.	Server backups are performed daily at 8:00 pm. As a result, server CPU usage becomes very high at that time.	Enable the alert based on baseline + standard deviation (two standard deviations are recommended). Please refer to https://en.wikipedia.org/wiki/68%E2%80%9395%E2%80%9399.7_rule for additional details.	Alerting now takes into consideration the normal value of the indicator at 8:00 pm. If the indicator (for instance, CPU) is normally between 80% and 95% at this time, no alert is triggered when the CPU is at 95% at that time. As a result, false alerts are not triggered and only meaningful alerts are generated.
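
The baseline-plus-standard-deviation idea can be sketched as follows. This is an illustration only, not SevOne's implementation: it assumes the baseline is the mean of past readings taken at the same time of day, and the function name and sample readings are hypothetical.

```python
from statistics import mean, pstdev

def exceeds_baseline(history, current, num_deviations=2.0):
    """Return True when `current` is above baseline + N standard deviations.

    `history` holds past readings for the same time slot (for example,
    8:00 pm on previous days). Assumption: the baseline is their mean.
    """
    baseline = mean(history)
    spread = pstdev(history)
    return current > baseline + num_deviations * spread

# Hypothetical CPU % readings at 8:00 pm on previous days (backup window).
history_8pm = [85.0, 90.0, 80.0, 95.0, 90.0]

print(exceeds_baseline(history_8pm, 95.0))  # high, but normal for 8:00 pm
print(exceeds_baseline(history_8pm, 99.0))  # above baseline + 2 deviations
```

With these sample readings the baseline is 88% and the alert threshold is roughly 98%, so a 95% reading during the backup window does not raise an alert.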

Baseline Percentage
	Problem	Description	Solution	Value
1.	Alert on a metric that deviates significantly from its normal value.	Static thresholds are not an option and the standard deviations are quite large. Due to this, the alerts created are spread out significantly and are difficult to capture.	Baseline percentage must be used to define a specific percentage threshold (for example, 50%) of the value of the baseline.	Baseline must be used to define the alert threshold.
2.	Identify changes in the number of routing peers (EIGRP, OSPF, BGP, IS-IS, RIP).	Static thresholds cannot be used for different EIGRP routers with a different number of peers on each.	Baseline percentage must be used to alert when there is a change in the number of peers on a device (for example, a 5% difference).	Any change in the number of peers per device is picked up dynamically, i.e., the threshold for each device does not have to be defined manually. Otherwise, the user would have to manually create an alert for each device and update the policy every time there is a change on the network, which results in a lot of maintenance overhead.
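
A minimal sketch of a baseline percentage check, assuming the baseline for each device is already known. The device names, peer counts, and function name are hypothetical and only illustrate why one percentage rule can cover devices with different peer counts.

```python
def deviates_by_percentage(baseline, current, percent):
    """True when `current` differs from `baseline` by more than `percent` percent."""
    if baseline == 0:
        # No meaningful percentage exists; treat any non-zero value as a change.
        return current != 0
    return abs(current - baseline) / abs(baseline) * 100.0 > percent

# Hypothetical peer counts per device: (baseline, current).
peers = {"router-a": (20, 20), "router-b": (40, 37)}

alerts = [name for name, (base, now) in peers.items()
          if deviates_by_percentage(base, now, 5.0)]
print(alerts)
```

Here router-b has lost 3 of 40 peers (7.5%), so it is the only device flagged by the 5% rule.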

Baseline Delta
	Problem	Description	Solution	Value
1.	Alert when two or more peers are down.	The edge routers have a different number of BGP peers with different ISPs (some with 10, some with 6, etc.). An alert must be sent when at least two of the peers are down. This is a problem because each device has a different number of peers, i.e., a different baseline. Two peers down may be a 50% difference on one device and a 10% difference on another. Due to this, baseline percentage is not an option.	Each device has a different number of BGP peers. Baseline delta must be used to alert when the number of BGP peers that are up drops by 2 or more (BGP peers) compared with the baseline.	An alert is sent only when the resiliency of the BGP network has been affected.
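
A baseline delta compares absolute differences rather than percentages. The sketch below assumes the baseline number of peers that are up is known per device; the function name and values are hypothetical.

```python
def peers_down_alert(baseline_up, current_up, min_delta=2):
    """True when at least `min_delta` peers fewer than the baseline are up."""
    return baseline_up - current_up >= min_delta

# The same rule works for routers with 10 peers and routers with 6 peers.
print(peers_down_alert(10, 8))  # two peers down on a 10-peer router
print(peers_down_alert(6, 5))   # only one peer down
print(peers_down_alert(6, 4))   # two peers down on a 6-peer router
```

The absolute delta of 2 triggers on both routers, although the same loss is 20% of the baseline on one and about 33% on the other.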

Slope Variance Deviation From Average (DFA)
	Problem	Description	Solution	Value
1.	Significant drop of clients connected to the WiFi network.	Scenario: A wireless provider has a Service-Level Agreement (SLA) with the airport. If the airport is not notified that there is a problem with the WiFi (even when the problem is not with the WiFi infrastructure itself but with the wired network, and it has impacted the WiFi), the wireless provider is penalized. In order to monitor the overall status of the WiFi network, the WiFi provider monitors the number of clients connected. For alerting, the WiFi provider cannot use static thresholds (at night, there are fewer people at the airport than during the day). Baseline alerting cannot be used either (bank holidays, sporting events, etc.).	Slope alerting must be used to monitor a sudden decrease in the number of clients connected.	Slope alerting is used to identify a sudden decrease in the number of clients connected. The last 6 data points allow the WiFi provider to identify the issue and, as a result, avoid being penalized by the customer (the airport, in this example).
2.	Interface speed change.	On some types of interfaces (such as port channels, DSL connections, etc.), the interface speed may vary when problems occur on the interface (for example, one member of the port channel does not work, noise increases on the DSL interface, etc.).	Slope alerting must be used when the interface speed changes.	Slope alerting sends a notification when there is a problem on the interface that cannot be detected in any other way. The notification is sent before the issue escalates, so that action can be taken to prevent more interfaces of the port channel from going down.
3.	Rapid increase of disk utilization.	Scenario: A customer has an issue with some disks where space utilization is quite low, but it suddenly grows rapidly and the disks fill up before anyone realizes it.	Slope alerting must be used to notify when a sudden increase of disk utilization occurs.	The customer is proactively notified that the disk is filling up before any threshold is hit, which allows the customer to take the necessary action before the disk utilization reaches 100%.
4.	Rapid increase in temperature.	Scenario: When the A/C shuts down at one of the data centers, the temperature increases rapidly and, by the time an alert is received, the temperature has already exceeded the required setting.	Slope alerting must be used when there is a significant increase in temperature.	The customer is proactively notified before the temperature hits the threshold. It also avoids the false alerts that are generated if the threshold is lowered. In most situations, when the temperature rises rapidly, it is an A/C problem.
5.	UPS (Uninterruptible Power Supply) battery capacity drops.	Some old UPS devices last only a few minutes when running on battery. Alerting on a fixed threshold is not a solution as it generates lots of false alerts and, sometimes, the alert may come too late.	Slope alerting must be used when there is a significant decrease in the battery capacity.	The customer is proactively notified before the battery capacity is too low. This avoids the false alerts generated by static thresholds that are too high, and the customer does not need to guess which static threshold is optimal in each situation.
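
The exact formula SevOne uses for Deviation From Average is not given here. As a rough sketch, the example below assumes the newest of the last 6 data points is compared with the mean of the 5 points before it; the function name and client counts are hypothetical.

```python
from statistics import mean

def deviation_from_average(points):
    """Percent difference between the newest point and the mean of the earlier ones.

    Assumption: `points` is the window of recent polls, oldest first.
    """
    *earlier, newest = points
    avg = mean(earlier)
    if avg == 0:
        return 0.0
    return (newest - avg) / avg * 100.0

# Hypothetical number of connected WiFi clients over the last 6 polls.
clients = [500, 510, 490, 505, 495, 250]

change = deviation_from_average(clients)
print(change)         # negative value means a drop
print(change <= -30)  # alert on a sudden drop of 30% or more
```

Because the comparison is against the most recent polls only, the same rule works at night and during the day without a static threshold or a long-term baseline.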

Slope Variance Relative Standard Deviation (RSD)
This method compares the standard deviation (https://www.mathsisfun.com/data/standard-deviation-formulas.html) of the last 6 polls with the average value during these polls. This type of alert is triggered when the indicator value spreads out a lot, i.e., the 6 polled values are quite far away from the average value.
	Problem	Description	Solution	Value
1.	Flapping interface.	An internet circuit is fluctuating, going up and down, but the protocol status keeps indicating that it is up. The only way to check whether the circuit is up or down is to check the traffic utilization.	RSD alerting must be used to monitor the standard deviation vs. the average value of the traffic (HCOctets) of the interface.	By using RSD alerting, the customer knows when the circuit is down even though the status indicates that it is up.
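
Relative standard deviation is conventionally the standard deviation divided by the mean, expressed as a percentage. The sketch below applies that textbook definition to a window of 6 polls; whether SevOne computes it in exactly this way is an assumption, and the traffic values are hypothetical.

```python
from statistics import mean, pstdev

def relative_std_dev(points):
    """Standard deviation of `points` as a percentage of their mean."""
    avg = mean(points)
    if avg == 0:
        return 0.0
    return pstdev(points) / avg * 100.0

# Hypothetical traffic readings over the last 6 polls.
steady = [100, 100, 100, 100, 100, 100]
flapping = [100, 0, 100, 0, 100, 0]

print(relative_std_dev(steady))    # no spread around the average
print(relative_std_dev(flapping))  # values far from the average
```

A flapping circuit produces a high RSD even while the protocol status reports the interface as up, which is what makes the traffic pattern a usable signal.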

Time to Newest Data Point
	Problem	Description	Solution	Value
1.	n/a	n/a	n/a	n/a

Count over Threshold
	Problem	Description	Solution	Value
1.	Alert when internet circuit has hit 90% three times in the last 30 minutes.	Scenario: As a financial services company, latency and packet drops are very important. The internet circuits must not be saturated - 95% utilization for a long period of time is a big problem. An alert must be sent before this happens and one cannot rely on baselines. It is important that the internet circuit does not hit 90-95%.	An alert must be sent when the interface traffic is greater than 90% and the Count over Threshold is 3 in the last 30 minutes.	The customer is notified on time so that they can proactively upgrade the internet circuits before they are saturated.
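
Counting threshold crossings in a time window can be sketched as below. The sample format, a list of (minute, utilization) pairs, and the function name are hypothetical.

```python
def count_over_threshold(samples, threshold, window_minutes, now_minute):
    """Count samples above `threshold` within the last `window_minutes`."""
    start = now_minute - window_minutes
    return sum(1 for minute, value in samples
               if start < minute <= now_minute and value > threshold)

# Hypothetical 5-minute polls of circuit utilization (%).
samples = [(5, 70), (10, 92), (15, 85), (20, 91), (25, 60), (30, 95)]

hits = count_over_threshold(samples, threshold=90, window_minutes=30, now_minute=30)
print(hits)
print(hits >= 3)  # alert when the circuit exceeded 90% three times
```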

Time over Threshold
	Problem	Description	Solution	Value
1.	Alert when internet circuit has hit 90% for 10 minutes in the last hour.	Scenario: As a financial services company, latency and packet drops are very important. The internet circuits must not be saturated - 95% utilization for a long period of time is a big problem. An alert must be sent before this happens and one cannot rely on baselines. It is important that the internet circuit does not hit 90-95%. Count over Threshold cannot be used as all circuits have different polling intervals and some have High Frequency Polling (HFP) enabled.	An alert must be sent when the interface traffic is greater than 90% for at least 10 minutes in the last hour.	The customer is notified on time so that they can proactively upgrade the internet circuits before they are saturated. It does not matter which polling frequency is assigned to the object.
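
Measuring time rather than counting samples makes the rule independent of the polling interval. The sketch below assumes each sample value holds until the next sample arrives; that assumption, the sample format, and the function name are hypothetical.

```python
def minutes_over_threshold(samples, threshold, window_start, window_end):
    """Total minutes within the window during which the value exceeded `threshold`.

    `samples` is a list of (minute, value) pairs sorted by time. Each value
    is assumed to hold until the next sample (or the end of the window).
    """
    total = 0
    for i, (minute, value) in enumerate(samples):
        next_minute = samples[i + 1][0] if i + 1 < len(samples) else window_end
        start = max(minute, window_start)
        end = min(next_minute, window_end)
        if value > threshold and end > start:
            total += end - start
    return total

# Hypothetical polls: 5-minute polling at first, then 1-minute polling.
samples = [(0, 50), (5, 93), (10, 95), (15, 70), (40, 91), (41, 92), (42, 60)]

over = minutes_over_threshold(samples, threshold=90, window_start=0, window_end=60)
print(over)
print(over >= 10)  # alert when above 90% for at least 10 minutes in the hour
```

Two 5-minute polls and two 1-minute polls above 90% add up to 12 minutes, whereas a plain count would treat all four samples as equal.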

Data Collection

Device Certification
	Problem	Description	Solution	Value
1.	Unable to obtain server, pool, pool members, etc. information from F5 devices.	F5 devices are monitored, but not all the relevant information is picked up by SevOne.	Open a case for device certification and install the .spk file provided to the customer.	After using 10-days of device certification service, the customer could monitor all the desired indicators plus more that they were not aware of.

High Frequency Polling
	Problem	Description	Solution	Value
1.	Limited granularity on devices with issues.	For day-to-day monitoring, 5-minute polling is sufficient. However, when issues are identified on a device, the granularity must be increased to better understand the behavior of the device/indicator of concern.	Enable High Frequency Polling on device(s) with issues. This can be done manually or automatically using REST API.	By enabling High Frequency Polling only on the device(s) with issues, the customer gets granular data to troubleshoot the problem(s) more effectively, without the performance impact that continuous high frequency polling on the device(s) would generate.

Cross Object Calculation / Group Poller / Synthetic Indicators
	Problem	Description	Solution	Value
1.	Identify when both links at their redundant sites were saturated.	The links were being used at both sites at all times, so the saturation was not due to failover.	SevOne CoC allows calculating the average (AVG) of the total bandwidth of both links and alerting when it exceeds 80% utilization.	An alert means that either one link is fully saturated and the other link must be monitored very closely, or both links are nearing 90% utilization, which is too high.
2.	Managed Service Provider (MSP) with several sites may not want to see every interface at every site.	The Network Operations Center (NOC) team does not want to report every interface at every site.	Group Poller can be used to group all interfaces at a site using RegEx. Then, the utilization can be summed up across all links at each site.	By grouping all interfaces, the customer can create one report with stacked lines to easily identify whether a particular site is experiencing an issue that the other sites are not.
3.	Large number of Virtual Machines (VMs) with limited visibility into the CPU load on all systems.	Need to be alerted on high CPU load on systems with 40+ CPUs. A single CPU under high load on a system with 40 CPUs is not worth alerting on. However, the customer does not have a single indicator to represent this.	Using REST API, the query can gather all HOST-MIB CPUs and automatically create the CoC object/calculation.	All CPUs are now aggregated, which produces more meaningful alerts. This particular scenario had over 19K CPUs aggregated.
4.	Show percentage instead of value between 0 and 1.	Customers want to see a metric as a percentage. However, the polled SNMP data shows values between 0 and 1 only.	A synthetic indicator multiplies the SNMP value by 100.	This results in simpler, clearer, and more consistent reports throughout the NOC team.
5.	Time in the Red Zone.	Customer would like to see how much time is spent above 80% utilization for a link.	A synthetic indicator can be created and flagged as, say: 0 for < 50% utilization; 1 for > 50% and < 80%; 2 for > 80%. By doing this, the customer has visibility both when the value exceeds 50% and when it exceeds 80%, rather than only when it exceeds 80%.	Allows customers to graph and report on the amount of time spent, or the number of occurrences, in each of the zones. This helps in obtaining more accurate information, for instance for capacity planning.
6.	Brocade temperature OID shows (Celsius * 2). i.e., if the temperature is 25 degrees Celsius, the OID returns 50 degrees.	Brocade has a MIB file that contains an OID that shows the temperature of the chassis. However, this temperature is not in Celsius or Fahrenheit, but in (Celsius * 2). This is a problem because when the NOC team checks the temperature, they have to remember that the actual temperature is not the value displayed, but that value divided by two.	Create a synthetic indicator that divides the value of the temperature OID by 2.	The synthetic indicator created (which divides the value of the temperature OID by 2) no longer creates false alarms.
7.	Metric is required to show the overall health of a device.	In order to reduce noise, the NOC team needs a metric to provide them with the overall health of a device. This metric must consider indicators such as Availability, CPU Usage, Memory, Errors, etc.	Create CoC to generate a new indicator made up of the following metrics: Availability; CPU (if over 80%); Memory (if over 80%); Interface traffic (if over 80%); Error discards; Temperature; Available fans; Available power supplies.	The new indicator removed the noise generated by all the individual indicators and alerts, allowing the team to focus on a single metric.
8.	See overall number of VPN connections regardless of the VPN concentrator.	Scenario: A company has several VPN concentrators and they are alerting when the connections on an individual concentrator go under a specific threshold. Due to this, lots of false alerts are generated indicating that there may be an issue with the VPN when in fact, it may be that one VPN concentrator is getting more connections than the other.	Create a Group Poller that sums up all the VPN connections into a single metric.	This allows the customer to create an alert only when the total number of VPN connections is below a threshold, not depending on a specific device.
9.	Getting lots of alerts for high CPU on only one of the CPU cores.	There is one single CPU core that from time to time goes very high and generates alerts. However, the load on the rest of the CPUs is quite low. An alert must be created to trigger only when the average of the CPUs is high.	Create a CoC with the average of all the CPU cores of the device.	CPU alerts are received only when the average across all the CPU cores is high. This avoids the alert noise generated by a single CPU core running high.
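
Several of the calculations in this table are simple arithmetic on polled values. The sketch below illustrates four of them in plain Python; it does not use SevOne's CoC or synthetic indicator syntax, and the function names and sample values are hypothetical. The table leaves the zone for exactly 50% and 80% undefined, so assigning those values to the higher zone is an assumption.

```python
from statistics import mean

def utilization_zone(percent):
    """Flag utilization as zone 0 (< 50%), 1 (50% to < 80%), or 2 (80% and above)."""
    if percent < 50:
        return 0
    if percent < 80:
        return 1
    return 2

def ratio_to_percent(ratio):
    """Convert a polled value between 0 and 1 into a percentage."""
    return ratio * 100

def brocade_celsius(raw_value):
    """Convert a reading reported as (Celsius * 2) into degrees Celsius."""
    return raw_value / 2

def average_cpu(core_loads):
    """Average load across all CPU cores of a device."""
    return mean(core_loads)

print(utilization_zone(65))           # zone 1
print(ratio_to_percent(0.5))          # 50.0
print(brocade_celsius(50))            # 25.0
print(average_cpu([95, 10, 15, 20]))  # one busy core, low average
```

In the last call a single core at 95% yields an average of 35%, which is why alerting on the average suppresses noise from one busy core.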

Universal Collector
	Problem	Description	Solution	Value
1.	Obtain disk size of a specific folder.	To measure the performance of an application, size of a specific folder on a server must be monitored. One way to do this can be to execute a CLI command on the server to obtain the results. For example, df <enter path to folder>	The Universal Collector script can be used to execute the desired command and format the result(s) as needed.	The number of false alerts has been reduced. Automating this task has reduced the amount of time dedicated to monitoring the performance of the application.
2.	Read performance data from a JSON file.	If a certain type of device is not compatible with SNMP, the only way to collect performance data from that device is to read the JSON file for details.	Use Universal Collector to parse the JSON file and collect the performance data.	Performance of the device can be monitored automatically.
3.	Monitor thousands of devices that are not compatible with SNMP.	There is a need to monitor performance metrics from 5G core devices that are not compatible with SNMP. The information can be exported in CSV format.	Use the generic xStats adapter to monitor millions of indicators per hour.	Provides visibility on unsupported devices for SevOne features such as speed at scale, baselining or advanced alerting.
4.	Cloud provider monitoring such as, Amazon Web Services (AWS).	Cloud providers rely on APIs to share performance data.	Use productized xStats adapter for AWS.	Provides visibility on the cloud provider for SevOne features such as speed at scale, baselining or advanced alerting.
5.	Arista SNMP does not return all the required data. However, the data can be obtained using the Arista API.	Not all the required data is shared using SNMP. Sometimes, one needs to rely on CLI commands. Arista has developed an API that allows the customer to obtain the data from CLI commands through the API interface.	Use deferred data to query the Arista API and obtain the data desired.	The collection process has been automated to obtain data that is not available from SNMP.
6.	When polling a device for a specific metric, the data returned is not clean - it is mixed with other irrelevant data, making it difficult to extract the desired information.	Sometimes when data is polled from a device, using SNMP or any other type of protocol, the result is not in the desired format. The data needs to be converted/extracted before it is stored on the platform.	Use deferred data with custom scripts that extract and save only the desired data (for instance, use regular expressions).	Any type of data can now be monitored that could not be done before as the format was either incorrect or it also contained irrelevant data.
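
The parsing step of such a collection script can be sketched as below. The JSON layout, the noisy command output, and the function names are hypothetical, and the sketch does not show how a script is registered with the Universal Collector or how deferred data is submitted.

```python
import json
import re

def metrics_from_json(text):
    """Read metrics from a hypothetical {"device": ..., "metrics": {...}} document."""
    doc = json.loads(text)
    return {name: float(value) for name, value in doc["metrics"].items()}

def extract_value(raw, pattern):
    """Return the first captured number in noisy output, or None if absent."""
    match = re.search(pattern, raw)
    return float(match.group(1)) if match else None

# Hypothetical inputs.
sample_json = '{"device": "probe-1", "metrics": {"cpu": 12.5, "sessions": 340}}'
noisy_output = "status=OK; temp: 41.5 C; fan=auto"

print(metrics_from_json(sample_json))
print(extract_value(noisy_output, r"temp:\s*([\d.]+)"))
```

Returning None when the pattern is not found lets the caller skip the data point instead of storing a misleading value.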

Telemetry
	Problem	Description	Solution	Value
1.	Need to collect and receive data from the customer's network devices at a higher frequency.	Customer needs to monitor some indicators at a higher frequency in order to keep a closer eye on the performance of certain metrics. Using SNMP is not an option because the device is not capable of coping with the high number of SNMP requests.	Use telemetry to monitor some key indicators at a higher frequency.	With the use of telemetry, the customer can now obtain the information at the frequency required without impacting the performance of the network devices.
2.	Need the ability to monitor data at different frequencies depending on flexible needs.	Each indicator has a different requirement to be monitored at different frequencies. For example, an indicator for serial numbers does not need to be monitored frequently, whereas some interface errors need to be monitored more frequently than the normal 5-minute polling.	Use telemetry to adapt the frequency at which data is streamed to the telemetry collector based on the current situation of the network.	With the use of telemetry, metrics can be obtained at the desired frequency. Storage for the indicators is not wasted. Indicators that need to be monitored more frequently get more visibility.

Integration

Webhooks
	Problem	Description	Solution	Value
1.	Email is not checked for Alerts.	The Support Team receives too many emails with alerts from several monitoring tools they use. Due to this, emails are ignored resulting in missing critical alerts that require immediate attention.	Alerts must be integrated with Slack/Teams.	When the Support Team receives an alert on Slack, it is not ignored and it allows them to tag individuals who can fix the issue. This makes it a more collaborative method of working between teams.
2.	Difficult to notify all the relevant individuals at the same time regarding the resolution to an issue.	In order to avoid being flooded with alerts, the customer decides to send all alerts to a centralized location, i.e., the NOC team. Some alerts require escalation/immediate attention. However, because all alerts go to a centralized location, there were several delays, and alerts that required immediate attention or assignment to the correct individual(s) were missed.	Alerts must be integrated with notification systems such as PagerDuty.	When there is a critical alert, an integration with PagerDuty is triggered. Based on the business rules defined (time of day, device(s) involved, event(s), etc.), a specific notification method is triggered, such as an email, a message on Slack, or a conference call to engage all relevant individuals.
3.	NOC team spends too much time fixing small issues such as deleting files from full disks.	NOC teams spend too much time performing manual tasks that can easily be automated.	Alerts must be integrated with automation tools such as Ansible.	When an alert is triggered and it can be resolved by an automated playbook, SevOne sends the data to the automation tool to remediate the issue automatically.
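
On the receiving side, a Slack incoming webhook accepts a JSON body with a `text` field. The sketch below shows how an alert could be turned into such a message and posted. The alert fields (`severity`, `device`, `message`) are hypothetical and do not represent SevOne's webhook payload, and the webhook URL must be supplied by the reader.

```python
import json
import urllib.request

def build_slack_payload(alert):
    """Build a Slack message body from a hypothetical alert dictionary."""
    text = f"[{alert['severity']}] {alert['device']}: {alert['message']}"
    return {"text": text}

def post_webhook(url, payload):
    """POST a JSON payload to a webhook URL and return the HTTP status code."""
    data = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status

# Hypothetical alert; post_webhook() is not called here because it needs a real URL.
alert = {"severity": "Critical", "device": "core-sw-01", "message": "Disk 95% full"}
print(build_slack_payload(alert))
```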

SevOne Data Publisher (SDP)
	Problem	Description	Solution	Value
1.	Apply network changes in real-time.	In order to apply rule changes on the VPN concentrators more effectively, real-time data is needed for the number of connections per VPN concentrator.	Integrate SDP with Ansible to stream data in real-time and automate network changes.	Once SDP is integrated with Ansible (through a Kafka bus), changes to the policies of the VPN concentrators are made in real-time, avoiding congestion problems and performance issues.
2.	Other teams use Grafana or something similar for analytics.	Teams are not using SURF or SevOne Data Insight to consume SevOne data, as they prefer to use their own tools. When API is used to collect the data from SevOne, the performance is unacceptable.	Use SDP to stream data in real-time to Grafana.	The other tools can pick and choose which data they want to get (using topics and filters) and can consume in real-time all the data needed without performance issues.
3.	Store data for a long period.	Some markets, by law, require customers to keep data for a long period of time (5-6 years). This data does not need to be accessed very frequently, and keeping it in-house results in high costs for the customers.	Configure SDP to integrate with a Cloud data warehouse that supports the Kafka bus (for example, Cloudera).	Customer can be compliant with the sector laws while minimizing the cost of keeping data for a long time.

SevOne API
	Problem	Description	Solution	Value
1.	Need to manually copy Metadata from Configuration Management Database (CMDB) tool (ServiceNow) to SevOne.	The customer is taking very long to copy the metadata attributes (such as Region, Country, City, Priority, Team, etc.) from one tool to another.	Use API to copy values from the CMDB tool to SevOne and from SevOne to the CMDB tool every day by using a script.	The CMDB tool now has all the required Metadata which is synchronized with SevOne.
2.	Manually raise incidents in ServiceNow from SevOne alerts.	Customer manually generates incidents in ServiceNow from SevOne alerts. This causes a problem because the engineers sometimes enter incorrect data. Also, it can take some time between the alert being triggered in SevOne and the incident being created in ServiceNow.	Use API to raise ServiceNow incidents directly.	When the API is used, the customer has all the incidents in ServiceNow created in real-time and with no typos.
3.	Maintain CMDB data with accurate information.	It is challenging to keep CMDB up-to-date due to the natural way networks operate where change is constant. Changes on modules/cards on network devices, CPU or RAM assigned to Virtual Machines, serial numbers, topology, etc. are common but very difficult to track and to keep updated.	Use API to read data from SevOne and update it in the CMDB tool.	When API is used, the automated process populates some of the asset details. This saves time and avoids human errors.
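
The comparison step of a daily synchronization script can be sketched as below. It only computes which metadata attributes need to be written to the other tool; the API calls that read and write the values in SevOne and the CMDB are intentionally left out, and the attribute values and function name are hypothetical.

```python
def metadata_changes(source, target):
    """Return the attributes in `source` that are missing or different in `target`."""
    return {key: value for key, value in source.items()
            if target.get(key) != value}

# Hypothetical metadata for the same device in each tool.
cmdb = {"Region": "EMEA", "City": "Madrid", "Team": "NOC"}
sevone = {"Region": "EMEA", "City": "Barcelona"}

print(metadata_changes(cmdb, sevone))  # attributes to update in SevOne
```

Writing only the differences keeps the number of API calls small when the script runs every day.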

WDK
	Problem	Description	Solution	Value
1.	Correlate ServiceNow incidents with SevOne performance data.	The Support Team struggles to correlate incidents with performance data to find the root cause of the problems in the network.	Use WDK to display performance data from SevOne and ServiceNow incidents in a single widget.	When WDK is used, the Support Team can correlate incidents and performance data, allowing them to more quickly identify the performance problems that generate incidents.
2.	Identify performance impact generated from network changes.	The Support Team struggles to correlate network changes and the impact of these on the network performance.	Use WDK to display performance data from SevOne and network changes from ServiceNow Change Requests in a single widget.	When WDK is used, the Support Team can now identify the impact of network changes on the performance of the network.

Reports

Calendar Widget
	Problem	Description	Solution	Value
1.	Performance issue occurring at the same time.	Sometimes performance issues occur at a specific frequency. However, it is not always easy to spot this recurrence.	Use the Calendar widget.	The Calendar widget makes patterns easy to visualize, allowing the user to identify performance issues rapidly.

Chart Settings
	Problem	Description	Solution	Value
1.	No clear visibility of Rx and Tx traffic.	When a performance metric chart gets busy, it is difficult to distinguish between the Rx and Tx traffic.	Use negative y-axis for Tx traffic.	Showing Rx traffic in positive (what gets in) and Tx traffic in negative (what gets out) makes it much clearer and easier to understand the traffic utilization of that interface.

Baseline
	Problem	Description	Solution	Value
1.	Identify changes in the behavior of the devices.	When planning the deployment of configuration changes on the network, a very common step is to get a baseline of the network in order to understand what has changed on the network after the change is deployed.	Enable baseline on all indicators available on the device.	The baselines generated by SevOne give the user the information needed to understand the normal value of an indicator at different points in time. With this feature, the user can compare the current value of the indicator with the baseline to understand the impact of the network changes applied.
2.	Troubleshooting time spent on normal behavior of the metric.	The Network Operations Center (NOC) team spends a lot of time troubleshooting spikes in utilization of some metrics (for instance, CPU usage or interface traffic) that end up being the normal behavior of the metric.	Enable baseline on reports.	With the baseline enabled, the NOC team can easily visualize the normal behavior of the metric compared with the current values, allowing them to troubleshoot issues faster.

Standard Deviation
	Problem	Description	Solution	Value
1.	n/a	n/a	n/a	n/a

Time-over-Time
	Problem	Description	Solution	Value
1.	Check if an indicator has a normal value after a recent change.	Customer wants to ascertain whether the current values are normal. Baselines are a good way of doing this, but they can sometimes lag behind. As a result, baselines sometimes cannot be used if the change has been made recently (in the last few days/weeks).	Use time-over-time (average last week) to compare the current results with the average values from last week.	User can compare the current values with the normal/expected values since the change was made.
2.	Keep a close eye on the changes to an indicator.	Customer is monitoring a metric very closely. To identify important changes, customer needs to consider time of the day (for example, there are more connections in the afternoon than in the morning).	Use time-over-time (average last day) to compare the current results with values from the day before.	User can compare the current value with the behavior from the previous day. This allows the user to understand the changes in the indicator.

Projections
	Problem	Description	Solution	Value
1.	No visibility on when a WAN circuit will be used at capacity.	Customer wants to know when the WAN circuits reach the capacity in order to upgrade the circuit before it happens.	Use TopN time-to-capacity (or 90%, 80%, etc.) for that indicator.	When TopN time-to-capacity is used, the customer knows how much time they have before the circuit reaches the capacity, allowing them to plan the upgrades accordingly.
2.	Virtual machine capacity planning.	It is unclear when the customer needs to increase the resources of the virtual infrastructure.	Enable projected trend on charts.	With projected trends enabled on charts, it is easy to visualize the future utilization of the existing resources, putting the user in a better position to forecast future needs.
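
How SevOne computes its projections is not described here. As an illustration of the time-to-capacity idea, the sketch below fits a straight line to past utilization with ordinary least squares and extrapolates it; the linear model, function name, and sample data are assumptions.

```python
def time_to_capacity(samples, capacity):
    """Days after the last sample until a fitted linear trend reaches `capacity`.

    `samples` is a list of (day, utilization) pairs. Returns None when the
    trend is flat or falling, because capacity would never be reached.
    """
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    denominator = sum((x - mean_x) ** 2 for x, _ in samples)
    if denominator == 0:
        return None
    slope = sum((x - mean_x) * (y - mean_y) for x, y in samples) / denominator
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    last_day = samples[-1][0]
    return (capacity - intercept) / slope - last_day

# Hypothetical daily utilization (%) of a WAN circuit, growing 2 points per day.
usage = [(0, 50), (1, 52), (2, 54), (3, 56), (4, 58)]

print(time_to_capacity(usage, capacity=90))  # days left until 90% utilization
```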

Percentiles
	Problem	Description	Solution	Value
1.	As an MSP, I want to charge my customer not based on the bandwidth assigned to them, but using the 95th percentile.	It is quite common for ISPs/MSPs to give customers more bandwidth than they signed up for so as not to limit them when a spike of traffic comes in. Therefore, in order to charge customers not for the spikes, but for the normal maximum bandwidth used, ISPs/MSPs use the 95th percentile.	Enable 95th percentile on reports.	The ISP/MSP now gets the 95th percentile value of the bandwidth consumed by each client, making it easy to know how much to charge them.
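
The 95th percentile calculation can be sketched as follows. The sketch assumes the nearest-rank method that is commonly used for burstable billing: the samples are sorted, the top 5% are discarded, and the highest remaining sample is the billable value. The function name and data are hypothetical, and the exact percentile method used by the product may differ.

```python
def percentile_95(samples_mbps):
    """Return the 95th percentile of the samples using the nearest-rank method."""
    if not samples_mbps:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples_mbps)
    # 1-based rank = ceil(0.95 * n), computed with integers to avoid
    # floating-point rounding surprises.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Hypothetical bandwidth samples (Mbps): steady usage with one short spike.
samples = [10] * 19 + [1000]
billable = percentile_95(samples)
```

In this example the single 1000 Mbps spike falls within the discarded top 5%, so the billable value reflects the steady 10 Mbps usage.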

Report Linking
	Problem	Description	Solution	Value
1.	Inconsistent troubleshooting steps.	The Support Team manager has realized that the Support Team members get confused because in some reports they can drill down to the device/object/indicator for a more granular view, and in other reports this option is unavailable.	Create a global report link for the device/object/indicator.	The Support Team now takes less time to troubleshoot problems because they are familiar with the drill-down workflows available in the tool.

Report Chaining
	Problem	Description	Solution	Value
1.	Consistent delays during specific times of the day.	Scenario: A customer site is experiencing consistent delays from 12:00 noon to 3:00 pm, and it is difficult to drill down to the specific issue that causes the delays.	Create a report link from the most utilized interfaces on that site (using dynamic filtering) to flow conversations.	Identify the specific IPs that generate the traffic causing problems on the site.
2.	Difficulty identifying spikes on cluttered reports.	Performance Metrics charts can get very cluttered, making it very difficult to spot issues or spikes.	Use report chaining from TopN to Performance Metrics chart.	Chaining widgets together makes it easier for the user to highlight one device and review its performance on a Performance Metrics chart.

Variables
	Problem	Description	Solution	Value
1.	Customer has a geographically distributed team, and each location wants a report of its own.	Scenario: There are 100 different sites, and the team located at each site wants a dedicated report. This means that the customer has to create 100 different reports with the same data but with a filter for the specific location. This is very problematic because every time they need to add a new indicator or KPI to the report, they need to update 100 reports. Furthermore, the team manager and the regional team leaders would like to have a report for their own region, meaning that even more reports are required.	Create one device group per site and use the device groups variable on the report.	When one device group per site is created, a single report meets the requirements of all the teams, consolidating the 100+ reports into one and drastically reducing the maintenance time per report type.
