Database-level health detection in SQL Server Always On Availability Groups

This is the 8th article in the series for SQL Server Always On Availability Groups.


SQL Server Always On Availability Groups create a group of databases that fails over together in case the primary replica is unavailable. These availability group databases failover manually or automatically depending upon your AG
configuration. By default, SQL Server fails over all databases from the primary replica to the secondary replica if the primary replica is down or SQL Services stopped in the primary instance.

You can create multiple SQL Server Always On Availability Group in SQL Server configuration. Usually, we group the application dependent databases in an availability group. These databases must be failed over together for smooth working on the application.

Suppose you have two databases in an AG group. Your primary instance is online, but one of the AG databases becomes unhealthy on the primary replica. Your application reports issue because it could not access a database in the AG
group. SQL Server does not perform a failover of the AG group to the secondary replica. It requires manual intervention to fix the issues at the primary replica or perform a manual failover to the secondary replica.
Sometimes we want that SQL Server should check the database health as well in an AG group and perform automatic
failover if any of the databases is not healthy.

Do we have the option to enable the database-level health detection in SQL Server Always ON? Let’s explore it in
this article.


You should configure SQL Server Always On Availability Groups in synchronized mode. You can refer to this article,
Add a new node into existing SQL Server Always On Availability Groups for further insights.

In the earlier articles, we have created an availability group with the following configuration.

  • Current primary replica: SQLNode1
  • Secondary replica in Synchronous mode: SQLNode2
  • Secondary replica in Asynchronous mode: SQLNode3
  • Two availability group databases [MyNewDB] and [SQLShackDemo]

SQL Server Always on Availability Group setup

An Overview of the database-level health detection for SQL Server Always On Availability Groups

SQL Server 2016 introduced Enhanced database failover to monitor the database health and trigger
the failover for that availability group. This feature is known as Database Level Health Detection
in an availability group.

We can enable this feature at the availability group; therefore, it monitors all databases corresponding to that
availability group. It cannot be enabled for an individual database in an AG group. Your database should be in
synchronized data commit with automatic failover mode so that SQL Server can failover it in case of any issues.

SQL Server performs the following checks for the database-level health detection. If the database status is online
and SQL Server could not write in the transaction log of an AG database, it initiates AG failover. It also checks
for the following error ids as part of the database health check.

Suppose you have the following AG architecture with one secondary replica. Your AG database stores data files in D drive and log file in the L drive.

disk configuration

Now, on your primary replica, log drive L fails. SQL Server is not able to write in the AG database transaction log.
It fails a successful database health detection. SQL Server availability group fails over from primary to the
secondary replica.

disk failure and failover

In another scenario, your data drive fails on the primary replica, but your log drive is intact. In this case, no AG
failover occurs because SQL Server can write in the transaction log.

data disk failure and failover

Database-level health detection does not monitor the idle database or disk uptime. It does not directly monitor the
database files. In the above image, we saw that disk failure causes the availability group automatic failover.
Suppose your AG database is idle and does not have any active transactions. In the idle state, SQL Server does not
perform any physical activity to the disk. Now, suppose, you have other databases as well in the disk and disk gets
full due to those database transactions. Due to disk space issues, SQL Server might not be able to perform any
transactions for the databases lying in that disk. Later. You perform a transaction and SQL Server not able to write
database checkpoints and write the data into the disk. At that time, SQL Server finds unhealthy database status and
perform a database failover as a result of the database health change.

Similarly, if your database can satisfy the data page from the memory, SQL Server does not require to perform a
physical read from the disk. It gets the data from the buffer pool memory. It might also not cause your database to
failover for SQL Server always-on availability group.

Enhanced database-level failover and SQL Server event ids

SQL Server monitors the following event ids for the enhanced database-level failover.

Event ID



Database or page corruption


Database checkpoint failure


Disk corruption issues


Hardware or memory-related issues


Disk space issues in the filegroup. In this case, SQL Server not able to allocate a new page in the


Missing filegroup ID requests


Wrong file ID requests


SQL Filestream container issues


Filestream log corruption


Filestream data container corruption


Transaction log corruption

SQL Server logs get error id 41653 once availability group failovers due to database-level health detection.

Configure enhanced database failover in SQL Server Always On Availability Groups

You can configure the enhanced database failover using both the GUI and T-SQL. It is an optional feature. If you
have an existing availability group, you can use the dynamic management view sys.availability_groups to
check the enhanced database failover is enabled or not.

Execute the below database management view (DMV) in the primary replica instance to check database-level health
detection status for your existing availability group. The DMV sys.availability_groups has a column [db_failover] to check
the configuration. If this DMV returns value 1 for the [db_failover] column, it means database-level health
detection is enabled else zero value shows the feature is not enabled for all databases in the availability group.

As shown below, currently, we are not using this feature.

Verify enhanced database failover

We can enable this feature while configuring a new availability group or in an existing availability group.

Enable enhanced database failover using GUI in an existing availability group

To enable the enhanced database failover, connect to the primary replica using SSMS. In the Always On Availability
Group, right-click on the availability group and open its properties.

In the AG property, you get an option as Database-level health detection.

Enable Database-level health detection

Put a tick for the Database-level health detection and generate the T-SQL script.

Put a tick on Database-level health detection

It uses the Alter Availability Group script with the DB_FAILOVER argument.

Create a new SQL Server Always On Availability Group with enhanced database failover

To enable the database-level health detection while creating a new availability group, enable it in the Specify option page, as shown below.

Option to enable in the new availability group wizard

Extended event to track the database health detection

SQL Server 2016 onwards we have a new extended event availability_replica_database_fault_reporting to track the
replica restart in case of the unhealthy database. You can add this event to the extended event session on the
primary replica.

Extended event to track the database

Disable enhanced database failover for SQL Server Always On Availability Groups

You can disable the enhanced database failover without restarting SQL Services. To disable the feature, either
remove the tick from the check from the database-level health detection or use the following query.

Availability group properties


In this article, we explored the useful enhanced database-level failover feature in an availability group. You
should enable it for the production databases. You should also group the database appropriately in an AG group so
that it gets failover in case any database reports fails the database-level health detection. You can create
multiple AG groups and group databases as per their availability and application requirements.

Table of contents

Rajendra Gupta
Latest posts by Rajendra Gupta (see all)

About: admin