ASM Health Checker Found 1 New Failures May 2026
Before troubleshooting the error, it helps to understand the tool. The ASM Health Checker is an automatic, background diagnostic framework introduced in Oracle Grid Infrastructure 11gR2 (enhanced in 12c and 19c). It runs periodically (typically every hour or on specific triggers) to validate the state of the ASM storage layer, such as disk availability, disk group redundancy, and metadata consistency.
When any test returns a "FAIL" status, the health checker increments its failure count. The message "ASM Health Checker found 1 new failures" means exactly that: since the last run, the checker has identified one more problem than before.
Based on standard ASM operational patterns, the failure is likely attributed to one of the following scenarios:
1. Database Connection Throttling (High Probability): If the ASM manages audit trails or session data, the database writer may have hit a connection limit. The health checker attempted to write a "heartbeat" record and timed out.
2. Stale Session or Token Expiry: If the ASM manages authentication sessions, a background process responsible for token revocation or cleanup may have stalled.
3. Resource Starvation (Memory/CPU): The host machine running the ASM instance may be experiencing resource contention, causing the health check script to lag or fail execution.
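After fixing the issue, the ASM health checker will automatically re-evaluate within 1 hour (by default). Note: the object names used in this section (SYS.ASM_HEALTH_CHECK_JOB, asm_health_check(), v$asm_health_check, SYS.ASM_HEALTH_CHECK_PURGE) are not ones we could confirm in Oracle's ASM reference documentation, so verify that they exist in your environment before running anything. To force an immediate recheck, connect to the ASM instance from the shell and run the job:
sqlplus / as sysasm
-- Run the health check job immediately:
EXEC DBMS_SCHEDULER.RUN_JOB('SYS.ASM_HEALTH_CHECK_JOB');
-- Or manually:
SELECT * FROM table(asm_health_check());
If the fix is correct, v$asm_health_check will show status='RESOLVED' for that failure ID.
To acknowledge and purge the failure from the persistent repository (without waiting for auto-clear):
DECLARE
  v_fid NUMBER;
BEGIN
  -- Pick one outstanding failure; raises NO_DATA_FOUND if none remain.
  SELECT failure_id INTO v_fid FROM v$asm_health_check WHERE status = 'FAIL' AND rownum = 1;
  -- Record on the job that the failure was cleared manually.
  DBMS_SCHEDULER.SET_ATTRIBUTE('SYS.ASM_HEALTH_CHECK_JOB', 'COMMENTS', 'Manually cleared');
  -- Pass the failure ID as a bind variable rather than concatenating it into the statement.
  EXECUTE IMMEDIATE 'BEGIN SYS.ASM_HEALTH_CHECK_PURGE(:fid); END;' USING v_fid;
EXCEPTION
  WHEN NO_DATA_FOUND THEN
    NULL; -- no failure left to purge
END;
/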
| Check ID | Component | Failure Description | Severity | First Detected |
|--------------|---------------|------------------------|--------------|--------------------|
| ASM-042 | Disk Group Mount Consistency | Disk group DATA – one offline disk not yet force-mounted after node reboot | Warning | [Date/Time of scan] |
Note: Replace the above with actual failure description from your ASM health checker output.
The "1 new failure" could represent dozens of distinct underlying issues. Based on real-world Oracle support cases, here are the top triggers:
The new ASM health check failure is isolated and classified as warning-level. Immediate intervention is not critical, but prompt remediation will restore full redundancy and prevent potential escalation.
Please assign a team member to validate disk status and perform the recommended actions by [date + 2 days].
Prepared by: [Your Name / Automated Monitoring System]
Attachments: Full ASM health check log (if available)
The alert "ASM Health Checker found 1 new failures" is a critical notification typically found in Oracle Automatic Storage Management (ASM) alert logs. It indicates that the GMON (Group Monitor) process has detected an issue, often a disk failure or a forced dismount, that requires immediate attention.
What This Alert Means
This message usually appears alongside other ORA- errors and signals that ASM has identified a problem with the storage layer. Common triggers include:
Disk Failures: A physical disk or a storage path (LUN) has become inaccessible.
Forced Dismounts: The diskgroup has been forced offline because it can no longer maintain its required redundancy (e.g., a disk failure in an EXTERNAL REDUNDANCY diskgroup).
Metadata Corruption: Corruption in the ASM metadata blocks, which can happen during intensive operations like rebalancing.
Configuration Issues: Problems during the addition of new disks or voting file refreshes.
Immediate Troubleshooting Steps
Check the ASM Alert Log: Locate the alert log for your ASM instance (often in /u01/app/oracle/diag/asm/.../trace/alert_+ASM.log). Look for the ORA- errors immediately preceding the "1 new failures" message to identify the specific disk or group affected.
Verify Disk Status: Run the following query in your ASM instance to check for offline or missing disks:
SELECT name, group_number, path, state, header_status FROM v$asm_disk;
Investigate the Incident: Oracle's Fault Diagnosability Infrastructure often generates an incident report when this occurs. Use the adrci tool to view the incident details: show incident, then show tracefile (for the specific process, like +ASM_rbal_xxxx.trc).
Monitor Rebalance/Repair: If a disk is just offline and you have redundancy, check the REPAIR_TIME to see how long you have to fix the issue before ASM automatically drops the disk.
When to Take Urgent Action
External Redundancy: If your diskgroup uses external redundancy and a disk fails, the group will likely dismount immediately, potentially crashing your database.
Intermediate States: If your Clusterware (Grid Infrastructure) resources show an INTERMEDIATE state after this alert, the diskgroup may be partially available but failing to fully mount.
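To judge which of the urgent cases above applies, it can help to look at each disk group's redundancy type and mount state. The query below is a minimal sketch, assuming you are connected to the ASM instance with SYSASM or SYSDBA privileges; it reads only the standard V$ASM_DISKGROUP view.

```sql
-- Redundancy type (for example EXTERN, NORMAL, or HIGH), mount state,
-- and number of offline disks for every disk group this instance knows about.
SELECT group_number,
       name,
       type,
       state,
       offline_disks,
       total_mb,
       free_mb
FROM   v$asm_diskgroup
ORDER  BY name;
```

A row whose state is DISMOUNTED, or whose offline_disks value is non-zero, is the one to investigate first.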
When the ASM Health Checker reports "found 1 new failures," it usually indicates a critical disruption to the storage layer, often leading to a forced dismount of a disk group to prevent data corruption. This message is a summary alert that appears in the ASM Alert Log after a specific storage-related error has already occurred.
Common Causes
Missing or Inaccessible Disks: The most frequent cause is that one or more disks in a group are no longer reachable due to hardware failure, storage connectivity issues, or OS-level changes.
Metadata Corruption: If ASM detects invalid block headers or internal inconsistencies in the metadata, it may trigger a failure and dismount the group.
Insufficient Quorum: In diskgroups with redundancy (Normal or High), if too many disks, or a disk holding a required copy of the PST (Partnership and Status Table), become unavailable, the group cannot maintain a read quorum and will fail.
I/O Errors: Significant write failures or heartbeat timeouts to the PST will prompt the health checker to record a new failure.
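To narrow down the first cause above, the following sketch lists only the disks that look unhealthy, rather than every disk. It assumes a connection to the ASM instance and uses the standard V$ASM_DISK and V$ASM_DISKGROUP views; the filter conditions are one reasonable choice, not an official definition of a "failed" disk.

```sql
-- Disks that are offline, missing, or that belong to a disk group
-- this instance does not currently have mounted.
SELECT d.group_number,
       g.name AS diskgroup_name,
       d.name AS disk_name,
       d.path,
       d.mount_status,
       d.mode_status,
       d.header_status,
       d.state,
       d.repair_timer   -- seconds left before ASM drops an offline disk (0 if not failed)
FROM   v$asm_disk d
       LEFT JOIN v$asm_diskgroup g
              ON g.group_number = d.group_number
WHERE  d.mode_status = 'OFFLINE'
   OR  d.mount_status = 'MISSING'
   OR  (d.mount_status = 'CLOSED' AND d.header_status = 'MEMBER')
ORDER  BY d.group_number, d.name;
```

The last condition catches disks whose header says they are disk group members but which are not currently open, which is typical after a forced dismount.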
ASM Health Checker Found 1 New Failure: What It Means and How to Resolve It
The Automatic Storage Management (ASM) health checker is a crucial tool in Oracle databases that monitors the health and integrity of the storage infrastructure. When the ASM health checker reports a new failure, it's essential to understand the implications and take corrective actions to prevent data loss or system downtime. In this blog post, we'll discuss what an ASM health checker failure means, how to investigate the issue, and steps to resolve it.
What does an ASM health checker failure mean?
When the ASM health checker detects a problem, it logs a message indicating that a failure has been detected. The message may look like this:
"ASM Health Checker found 1 new failures"
This message indicates that the ASM health checker has detected a single new failure in the storage system. The failure could be related to various issues, such as disk failures, forced dismounts, metadata corruption, or configuration problems.
Investigating the ASM health checker failure
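To investigate the failure, check the ASM alert log for the ORA- errors that precede the message, verify disk status in v$asm_disk, and review any incident that was generated.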
Resolving the ASM health checker failure
Once you've identified the root cause of the failure, take corrective actions to resolve the issue:
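As one illustration of a corrective action, the sketch below covers a single case: a disk in a Normal or High redundancy disk group went offline because of a transient path problem that has since been fixed. The disk group name DATA is borrowed from the example report table and stands in for your own. This does not apply to failed hardware or metadata corruption, and it is not the only possible resolution.

```sql
-- Bring every offline disk in the (example) DATA disk group back online.
ALTER DISKGROUP data ONLINE ALL;

-- Watch the resulting resync or rebalance; no rows means no operation is running.
SELECT group_number,
       operation,
       state,
       power,
       sofar,
       est_work,
       est_minutes
FROM   v$asm_operation;
```

Once v$asm_operation returns no rows for the group, rerun your disk status check to confirm that every disk reports ONLINE again.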
Best practices to prevent ASM health checker failures
To minimize the likelihood of ASM health checker failures:
By understanding the causes of ASM health checker failures and taking proactive steps to prevent them, you can ensure the reliability and performance of your Oracle database storage infrastructure.