In this 33rd article of SQL Server Always On Availability Group series, we will use extended events to
monitor the availability group.
Database professionals’ primary role is to do proactive monitoring for ensuring system availability. DBA must
investigate the problem root cause analysis in case any critical event or downtime occurs. You can use various
things for investigation purposes such as SQL Server logs, Windows & Cluster logs, and dynamic management views
output, profiler and extended events.
SQL Server Always On Availability Group provide robust high availability and disaster recovery solution. It is
equally essential and beneficial to monitor the AG group ownership, their synchronization states, failover, critical
parameters. Let’s go ahead and see how extended events are used in monitoring the availability groups.
You can use earlier articles in this series (see TOC at the bottom) and configure two-node SQL Server Always On
Availability group replicas as shown below. The availability group databases are in the synchronized commit mode.
- Primary Replica: SQLNode1INST1
- Secondary Replica: SQLNode2INST1
- Database: [MyNewDB]
SQL Server Always On Monitoring using the extended events
SQL Server Extended events are a lightweight monitoring tool that captures useful SQL instance and database
parameters. We can collect the required database and use it for monitoring and troubleshooting purpose. For example,
in an article, Monitoring SQL Server deadlocks using the system_health extended event, we use it to get deadlock XML and graph using
the default system_health session. We can use SSMS GUI or t-SQL to create, configure and retrieve data from extended
Once we configure the availability group, the create Availability Group Wizard automatically creates the
AlwaysOn_health extended event session on all replicas participating in the AG configuration. You
can view it in SSMS navigating to Management -> Extended Events -> AlwaysOn_health.
If you configure the availability group using t-SQL or new availability group, it does not start the alwayson_health
event session. Therefore, you should verify that the extended event session is running and collecting the required
To understand the AlwaysOn_health event session, it is a good idea to script it out. Right-click on the event
session-> Script Session as -> Create To-> New Query editor window.
As shown above, it has several extended events along with the error numbers to capture. We use the sys.dm_xe_objects to retrieve a brief
description of these extended events.
SELECT name, description
WHERE NAME IN (
This event occurs when we execute a data definition language (DDL) statement such as CREATE, ALTER, DROP on an
availability group database. It does not capture detailed information about DDL. For example, you cannot retrieve an
execution plan. Its main purpose is to indicate you with the DDL followed by planned or unexpected failover.
If the Windows failover cluster and availability group have a connectivity issue, it might cause the lease expired
messages in the event logs. You might face automatic failover for the synchronous replica’s in case it occurs on the
For an automatic failover, SQL Server validates that the primary target replica is synchronized or not. It provides
an AG failover point for failovers. DBA’s can use the information to investigate the cause of an automatic failover
using this extended event.
This event occurs if the availability replica manager states changes. It is also a useful event session to
investigate the reason for a different state.
We can use this event to monitor the state of the availability group replica. It is also helpful to understand the
internal of an AG failover.
It tracks various error numbers to find out the connectivity errors for availability group endpoints. You can query
the sys.messages table to check the description for these different error id’s.
SELECT message_id, [text] as [Description]
FROM sys.messages AS m
WHERE m.language_id = SERVERPROPERTY(‘LCID’)
Order by message_id
In the query output, you get the error id and its corresponding description.
To summaries these error ids you can refer to the following table for errors:
Consistency issues Corruption
Logical consistency error, disk corruption, database consistency issues reported by DBCC CHECKDB.
It gives you a message once the AG replica changes its role, such as secondary to primary.
If SQL Server could not listen for an IP address or port, it logs the message in the extended event. For
You get these errors in case of connection handshake failure. For example, if your AG owner or service
In case your availability group replicas get any connection timeouts, it logs the information in the
Windows failover informational messages
These are the informational messages in different states of replicas during failover. It is useful to
AG failover issues
Suppose your SQL Server Always On Availability group replica tries for automatic failover, but the new
This event is to track the HADR partner sync state changes.
This event is useful to track the redo thread blocks that might cause latency in the data synchronizations.
Data Storage for Alwayson_health extended event
Open the Alwayson_health extended event properties and navigate to Data Storage. By default, it stores the collected
data in an event file. This event file gets created in the instance log directory.
- Maximum file size: 5 MB
- Maximum number of files: 4
Enable file rollover: It rollover the files once it extended event collected data fills the 4 files with 5 MB
each. It again starts overwriting the data from the first event file
Browse to your Log directory for SQL instance, and here you see the event files.
Monitor SQL Server Always On Availability Groups using the Extended event
We can extract data from the AlwaysOn_health extended event using GUI and T-SQL script.
You can expand the AlwaysOn_health in SSMS, right-click on the package0.event_data file and select View targeted
data. If you want to view the live data, click on the Watch live data on the root Alwayson_health folder in SSMS.
It opens the event viewer to show the collected information. You can filter, aggregate, and choose columns in this
window for your required data.
Sometimes, it is not convenient to get information using the graphical method. You can use t-SQL as well to get data
from the extended event files and choose the required data.
To demonstrate the data collection for SQL Server Always On Availability Group, I suspend the data movement from the
secondary replica. You can refer to series articles for understanding the suspend and resume data movement.
Expand Availability Group Database -> Suspend Data Movement.
Once the data movement suspends, wait for some time and again resume the data movement using the Resume Data
Use the following query to filter the extended event collected data for error 35264(suspend),35265 (resume).
In the below query, we use sys.fn_xe_file_target_read_file to read the event file and filter the data.
DECLARE @FileName NVARCHAR(4000)
SELECT @FileName = target_data.value(‘(EventFileTarget/File/@name)’,
CAST(target_data AS XML) target_data
FROM sys.dm_xe_sessions s
JOIN sys.dm_xe_session_targets t
ON s.address = t.event_session_address
WHERE s.name = N‘AlwaysOn_health’
XEData.value(‘(event/@timestamp)’,‘datetime2(3)’) AS event_timestamp,
SELECT CAST(event_data AS XML) XEData, *
FROM sys.fn_xe_file_target_read_file(@FileName, NULL, NULL, NULL)
WHERE object_name = ‘error_reported’
WHERE XEData.value(‘(event/data[@name=”error_number”]/value)’, ‘int’) IN ( 35264, 35265)
ORDER BY event_timestamp DESC;
In the output, you can see the suspend and resume data movement for the availability group database. In the
suspended message, it specifies that the database [MyNewDB] is suspended due to user activity.
Similarly, you can filter the extended event session for error for troubleshooting SQL Server Always On Availability Group.
Apart from the default availability group, you can configure additional event sessions as per your requirement. To
check the extended events for HADR, you can filter events from the sys.dm_xe_objects as shown below.
SELECT Name, object_type, description FROM sys.dm_xe_objects WHERE name
It gives you a list of 178 events. Here, you can see a snippet of all corresponding extended events.
In this article, we learned the extended events for monitor SQL Server Always On Availability group related events
and errors. You can add additional monitoring events to collect the relevant data as per your requirement.