Ability to turn off alerts not able to be closed in SCOM 2019
We need to be able to disable the feature preventing alerts from being closed if the monitor is in an unhealthy state.
As it currently stands, integration with a service desk such as Netcool is problematic: if the alert is closed in the service desk, the chain breaks when the writeback to SCOM happens, because the alert cannot always be closed in SCOM.
If you close an alert generated by a monitor (from the Operations Console “Active alerts” view) which is in an unhealthy state, then the following message will be displayed and the alert will not be closed:
“Alert(s) in the current selection cannot be closed as the monitor(s) which generated these alerts are still unhealthy. For more details on the alerts which could not be closed, view the “Alert Closure Failure” dashboard in the Operations Manager Web Console”
We are reviewing this change and would like to know the kinds of problems customers are facing. Please reach out to the email address below with more details: email@example.com
Appreciate your feedback.
Erling B. Kjeldsen commented
This feature change is the worst change ever! And most of my colleagues are running away from SCOM as a platform for monitoring their servers and systems - PLEASE FIX THIS NOW
Gerald Versluis commented
It gets worse.
I sent my feedback to Microsoft, as described/requested above...
I got an NDR.
The mailbox accepts mail only from internal senders.
Gerald Versluis commented
This change was all I needed from the release notes: with this, there is no way I'd consider upgrading. No improvement whatsoever could outweigh the rubbish idea to implement this forced block.
There are several ways to properly address this, such as:
* When closing an alert, automatically reset the monitor; it will turn yellow or red again if the fault condition still exists
* When closing an alert, rerun the monitoring workflow and close only if the condition is now healthy
* The flow that Samuel described
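The second option in the list above (rerun the check, close only if healthy) can be illustrated with a small toy model. This is only a sketch: the `Monitor` and `Alert` classes, the `probe` callback, and `close_after_recheck` are hypothetical names invented for illustration and are not part of SCOM or its SDK.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Monitor:
    """Toy stand-in for a monitor; not a SCOM object."""
    name: str
    probe: Callable[[], bool]  # hypothetical: returns True while the fault exists
    healthy: bool = False

    def recheck(self) -> bool:
        """Rerun the monitoring workflow on demand and update the state."""
        self.healthy = not self.probe()
        return self.healthy


@dataclass
class Alert:
    monitor: Monitor
    state: str = "New"


def close_after_recheck(alert: Alert) -> bool:
    """Close the alert only if a fresh check reports the monitor healthy."""
    if alert.monitor.recheck():
        alert.state = "Closed"
        return True
    return False


fixed = Alert(Monitor("disk-space", probe=lambda: False))   # fault is gone
broken = Alert(Monitor("heartbeat", probe=lambda: True))    # fault persists
print(close_after_recheck(fixed), fixed.state)
print(close_after_recheck(broken), broken.state)
```

In this model the operator never has to wait for the next scheduled interval: the close request itself triggers the check, and the alert stays open only when the fault is demonstrably still there.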
And I see an even worse issue at hand here, which IMO has been festering for years: Program Management doesn't seem to give a darn about what we think. Why is this still under review??? The idea was posted TWO YEARS AGO, and it has taken Program Management almost a year to even show us - the SCOM community - the common courtesy of looking at it. Over a year later, it's still "under review". Especially considering that well-known experts like Kevin and Bob have been clear on why this forced block is a bad idea, the lack of progress is downright disgraceful.
Samuel Tegenfeldt commented
Just want to add to this, now in late 2020.
Last week I helped a new customer install and configure a new SCOM 2019 installation and the spontaneous laughter at "how many clicks is that!?" when trying to close the alert using the new web console is really telling. They posed the obvious question from a UX-design standpoint: why is there even the option to close a monitor alert if it is not allowed?
I see the point being made, but it is a disruptive workflow, especially when you're in a high-volume environment. And with many monitors taking quite a long time to automatically resolve even after a fix, this closure block becomes an actual problem for service providers with tight SLAs.
Our work-around is a rule that checks for "Resolved" alerts and resets the monitor automatically. This does cause state changes and repeated alerts on a few occasions, but these guys have proper processes, and the effect of the current behavior is worse. An alert simply isn't "Resolved" until a tech has done something about it. In SCOM 2019, they must significantly slow down their workflow to account for a feature that should be optional.
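The logic of such a work-around rule can be sketched roughly as follows. This is a toy model, not the actual rule: `Monitor`, `Alert`, `resolution_state`, and `sweep_resolved` are hypothetical names, and the assumption that the alert ends up closed once its monitor is reset is labeled as such in the code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Monitor:
    """Toy stand-in for a monitor; not a SCOM object."""
    name: str
    healthy: bool = False
    resets: int = 0

    def reset(self) -> None:
        """Force the monitor back to healthy and count the reset."""
        self.healthy = True
        self.resets += 1


@dataclass
class Alert:
    monitor: Monitor
    resolution_state: str = "New"


def sweep_resolved(alerts: List[Alert]) -> List[Alert]:
    """Reset the monitor behind every 'Resolved' alert.

    Assumption: once the monitor is reset, the alert can be closed.
    Alerts in any other state are left untouched.
    """
    closed = []
    for alert in alerts:
        if alert.resolution_state != "Resolved":
            continue
        if not alert.monitor.healthy:
            alert.monitor.reset()
        alert.resolution_state = "Closed"
        closed.append(alert)
    return closed


alerts = [
    Alert(Monitor("cpu"), resolution_state="Resolved"),
    Alert(Monitor("disk"), resolution_state="New"),
]
print(len(sweep_resolved(alerts)), [a.resolution_state for a in alerts])
```

A sweep like this would run on a schedule. If the fault is still present, the monitor simply turns unhealthy again on its next check, which is the source of the occasional extra state changes mentioned above.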
Make the block enabled by default for the masses, but with a setting for the administrators to uncheck. Also, add the option (emphasis on "option") to automatically reset the monitor in case the alert is closed. This will help many customers that have Incidents opened and closed automatically with attached SLAs. Give us and our customers the ability to choose our own workflows.
This single feature is the sole reason three of my customers haven't yet upgraded, because it will break their ServiceNow integrations unless we implement a work-around like the one I mentioned.
While I appreciate the warning, there really does need to be a way to override this. Sometimes I know the monitor is healthy before the next health check. I can go in and reset the monitor status, but that is a LOT of clicks for a lot of alerts, or I have to use some PowerShell, but either way it's highly inconvenient.
The ideal workflow would be:
- Warn in the GUI that you are closing an alert on an unhealthy monitor, but allow an override.
- Allow closing the alert via PowerShell (show warning message, do not prompt, allow).
- Reset health status of monitor automatically when alert is closed so it will "re-fire" on the next health check interval.
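The workflow in the list above can be sketched as a small simulation. Everything here is hypothetical (the `Monitor` and `Alert` classes, the `probe` callback, and the `allow_override` flag are invented for illustration); it shows the intended behaviour, not how SCOM implements anything.

```python
import warnings
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass(eq=False)
class Alert:
    monitor: "Monitor"
    state: str = "New"


@dataclass(eq=False)
class Monitor:
    """Toy stand-in for a monitor; not a SCOM object."""
    name: str
    probe: Callable[[], bool]  # hypothetical: returns True while the fault exists
    healthy: bool = True
    alerts: List[Alert] = field(default_factory=list, repr=False)

    def run_health_check(self) -> None:
        """Simulate one scheduled check; raise an alert on healthy -> unhealthy."""
        fault = self.probe()
        if fault and self.healthy:
            self.healthy = False
            self.alerts.append(Alert(self))
        elif not fault:
            self.healthy = True


def close_alert(alert: Alert, allow_override: bool = True) -> None:
    """Close an alert; warn (never prompt) if the monitor is still unhealthy."""
    if not alert.monitor.healthy:
        if not allow_override:
            raise RuntimeError("monitor is still unhealthy")
        warnings.warn(f"closing alert on unhealthy monitor {alert.monitor.name}")
        alert.monitor.healthy = True  # reset so the next check can re-fire
    alert.state = "Closed"


mon = Monitor("heartbeat", probe=lambda: True)  # fault never goes away
mon.run_health_check()         # first alert is raised
close_alert(mon.alerts[0])     # closes with a warning and resets health
mon.run_health_check()         # fault persists, so a new alert is raised
print([a.state for a in mon.alerts])
```

Because the close resets the monitor, an alert that was closed too early comes back on the next interval instead of leaving the monitor red with no alert attached.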
I've been using a PowerShell script since SCOM 2012 to reset monitors and it worked great. Admins would accidentally close monitor alerts and they'd reopen the next time the monitor checked its health.
This would be a great feature. I would be happy for a technician to close a ticket, which in turn closes the alert, resets the health, creates a new alert, creates a new ticket, technician decides to fix underlying issue.
Educate the technicians to fix the issue in the first place? Not when there is a machine perfectly capable of having that conversation with them...
Patrick Seidl (s2) commented
This is a nightmare. When will you finally fix it?
This definitely needs to be remedied.
Kevin Holman commented
I just had more customer feedback. They are impacted by this issue, when they place servers into Maintenance Mode. Their operations process is to close the alerts once the server is in MM. However, we only allow alert closure when the alert source is "healthy" and SCOM 2019 UR1 with Hotfix is blocking their ability to close alerts manually for agents that are placed into Maintenance Mode. This is just another reason why we need to be able to disable this new feature in SCOM 2019 for customers whose operations processes don't easily leverage the design change.
Completely agree with you here. It is a pain with the integration for Service Desks and in our case, we use an Orchestrator solution which fails to update the alert due to the restriction.
But...and I keep pondering on this, whilst I can easily add a runbook, or an action within the existing runbook to reset the monitor, at the end of the day this is only really going to be useful if the issue has definitely been fixed and it is something like a manual reset monitor.
If people are just closing incidents to clear their view, not only is it just going to come back, but it will cause many state changes.
The resolution surely has to be to fix the issue. Of course as mentioned, this does leave the manual reset monitors.
I've tried educating people till I am blue in the face, but people don't like seeing incidents in their queue and in some cases are driven by closures :-(
I agree it needs looking into, but I'm still not sure myself what the best approach would be so that it is not a blocker but also does not negatively impact SCOM.
José Costa commented
Nearly one year later and no add-on or fix to make this optional has been delivered. Might as well forget this, or in case you've got integrations hurt by this, might as well forget SCOM altogether...
Bob Cornelissen commented
I do love this feature, as somebody who works with dashboarding a lot. And people close all kinds of alerts without the issue being solved and all dashboards stay red. And now they are red without an alert that belongs to it.
However I do also see how connected tools do not have a function to reset health (which should close the alert). Those third party tools should get a way of doing that, so they can change their product connectors to solve this issue (which should work for SCOM 2012R2/2016 etc as well if possible).
It would be helpful if dashboarding tools, including SCOM itself in the console and web console, had a task available on an alert coming from a monitor to immediately reset health.
Of course there is always the common case of people closing alerts and resetting the health of monitors if needed even if the issue is not solved yet, but that is a matter of training.
Tomas Thönell commented
Maybe give us the choice to reset the monitor when closing instead of refusing it altogether?
We monitor hundreds of Citrix servers which are constantly being recycled/restaged, so they pop in and out of SCOM regularly. Now the console is full of heartbeat failure alerts which I am no longer able to get rid of!
Ralf Kermes commented
I agree with Mr. Holman.
We must have the choice to manually close the active alerts generated by a monitor.
This matters in particular for the collaboration between SCOM and our ticket system.
Patrick Seidl (s2) commented
YES, the new feature is great in theory but not practical.
It should be possible to disable it and it would be great to finally reset a monitor on alert closure.
I'm not saying this should go; it should be improved.
@MSFT: read the comments after "fixing" this: https://systemcenterom.uservoice.com/forums/293064-general-operations-manager-feedback/suggestions/9356712-a-way-to-reset-monitors-automatically-when-alert-i
Mark Slootweg commented
I agree with Michel van der Zijden. The ticketing system should be an administrative tool only. Tickets should be closed when the underlying problem (the monitoring alert in SCOM) is solved.
Michel van der Zijden commented
I'm strongly against removing or changing this great feature. An alert triggered by a monitor should never be closed while the monitor is still in an unhealthy state. You should fix the problem (or tune the monitor), after which the alert closes. I think this functionality, of not being able to manually close monitor alerts, is one of the great additions of SCOM 2019.
We are working with a ServiceNow integration and have found the health states to be an issue as well.
We have chosen not to allow the event management tool to close monitor alerts, but to either reset health or mark them as resolved.
Alerts in a resolved state for an extended amount of time should be reviewed.
I think that instead of SCOM erroring out when an alert triggered by a monitor is closed, it should perform the health reset.
Pedro Tedim commented
I can't understand why some people here say they would downvote this request, or that it doesn't make sense for them, etc...
The request is for "Ability to turn off alerts not able to be closed in SCOM 2019", meaning it would be something the SCOM admin can turn on or off...
Please implement this, as it is the biggest pain in our day-to-day work with SCOM. We lose dozens of hours a week to this alone.