A distributed availability group (AG) is a special type of availability group that spans two separate availability groups. Distributed availability groups are available starting with SQL Server 2016.This article describes the distributed availability group feature. To configure a distributed availability group, see Configure distributed availability groups. Overview
Overview
A
distributed availability group is a special type of availability group that
spans two separate availability groups. The availability groups that
participate in a distributed availability group don't need to be in the same
location. They can be physical, virtual, on-premises, in the public cloud, or
anywhere that supports an availability group deployment. This includes
cross-domain and even cross-platform - such as between an availability group
hosted on Linux and one hosted on Windows. As long as two availability groups
can communicate, you can configure a distributed availability group with them.
A
traditional availability group has resources configured in a Windows Server
Failover Cluster (WSFC) or if on Linux, Pacemaker. A distributed availability
group doesn't configure anything in the underlying cluster (WSFC or Pacemaker).
Everything about it is maintained within SQL Server. To learn how to view
information for a distributed availability group, see Viewing distributed availability group information.
A
distributed availability group requires that the underlying availability groups
have a listener. Rather than provide the underlying server name for a
standalone instance (or in the case of a SQL Server failover cluster instance
[FCI], the value associated with the network name resource) as you would with a
traditional availability group, you specify the configured listener for the
distributed availability group with the parameter ENDPOINT_URL when you create
it. Although each underlying availability group of the distributed availability
group has a listener, a distributed availability group has no listener.
The
following figure shows a high-level view of a distributed availability group
that spans two availability groups (AG 1 and AG 2), each configured on its own
WSFC. The distributed availability group has a total of four replicas, with two
in each availability group. Each availability group can support up to the
maximum number of replicas, so a distributed availability group can have up to
18 total replicas.
You can configure the data movement in
distributed availability groups as synchronous or asynchronous. However, data
movement is slightly different within distributed availability groups compared
to a traditional availability group. Although each availability group has a
primary replica, there is only one copy of the databases participating in a
distributed availability group that can accept inserts, updates, and deletions.
As shown in the following figure, AG 1 is the primary availability group. Its
primary replica sends transactions to both the secondary replicas of AG 1 and
the primary replica of AG 2. The primary replica of AG 2 is also known as
a forwarder. A forwarder is a primary replica in a secondary
availability group in a distributed availability group. The forwarder receives
transactions from the primary replica in the primary availability group and
forwards them to the secondary replicas in its own availability group. The
forwarder then keeps the secondary replicas of AG 2 updated.
The only way to make AG 2's primary
replica accept inserts, updates, and deletions, is to manually fail over the
distributed availability group from AG 1. In the preceding figure, because AG 1
contains the writeable copy of the database, issuing a failover makes AG 2 the
availability group that can handle inserts, updates, and deletions. For
information about how to fail over one distributed availability group to
another, see Failover to a
secondary availability group.
SQL Server 2025
(17.x) introduces the following changes:
SQL
Server 2025 (17.x) introduces a change to the internal synchronization
mechanism for distributed availability groups to improve synchronization
performance by reducing network saturation when the forwarder replica is in
asynchronous commit mode. This change is enabled by default and doesn't require
any configuration.
SQL Server 2025 (17.x) introduces
support for a distributed
contained availability group. If you intend to use a contained AG as
the forwarder in a distributed availability group, you must create the
contained AG by using the AUTOSEEDING_SYSTEM_DATABASES clause for
the WITH | CONTAINED option of the CREATE
AVAILABILITY GROUP command.
Distributed availability groups in SQL
Server 2017 or later can mix major versions of SQL Server in the same
distributed availability group. The AG containing read/write primary can be the
same version or lower than the other AGs participating in the distributed AG.
The other AGs can be the same version or higher. This scenario is targeted to
upgrade and migration scenarios. For example, if the AG containing the
read/write primary replica is SQL Server 2016, but you want to upgrade/migrate
to SQL Server 2017 or later, the other AG participating in the distributed AG
can be configured with SQL Server 2017.
Because the distributed availability
groups feature didn't exist in SQL Server 2012 or 2014, availability groups
that were created with these versions can't participate in distributed
availability groups.
Because there are two separate
availability groups, the process of installing a service pack or cumulative
update on a replica that's participating in a distributed availability group is
slightly different from that of a traditional availability group:
A distributed availability group spans
multiple availability groups, each on its own underlying WSFC, and a
distributed availability group is a SQL Server-only construct. This means the
WSFCs that house the individual availability groups can have different major
versions of Windows Server. The major versions of SQL Server must be the same,
as discussed in the previous section. Much like the initial figure, the
following figure shows AG 1 and AG 2 participating in a distributed
availability group, but each of the WSFCs is a different version of Windows
Server.
The individual WSFCs and their
corresponding availability groups follow traditional rules. That is, they can
be joined to a domain or not joined to a domain (Windows Server 2016 or later).
When two different availability groups are combined in a single distributed
availability group, there are four scenarios:
When both WSFCs are joined to the same
domain (not trusted domains), you don't need to do anything special when you
create the distributed availability group. For availability groups and WSFCs
that aren't joined to the same domain, use certificates to make the distributed
availability group work, much in the way that you might create an availability
group for a domain-independent availability group. To see how to configure
certificates for a distributed availability group, follow steps 3-13 under Create a domain-independent availability group.
With a distributed availability group,
the primary replicas in each underlying availability group must have each
other's certificates. If you already have endpoints that aren't using
certificates, reconfigure those endpoints by using ALTER
ENDPOINT to reflect the use of certificates.
Here are the three main usage scenarios
for a distributed availability group:
A traditional availability group
requires that all servers be part of the same WSFC, which can make spanning
multiple data centers challenging. The following figure shows what a
traditional multi-site availability group architecture looks like, including
the data flow. There is one primary replica that sends transactions to all
secondary replicas. This configuration is less in some ways than a distributed
availability group. For example, you must implement such things as Active
Directory (if applicable) and the witness for a quorum in the WSFC. You might
also need to take into account other aspects of a WSFC, such as altering node
votes.
Distributed availability groups offer a
more flexible deployment scenario for availability groups that span multiple
data centers. You can even use distributed availability groups where features
such as log shipping were
used in the past for scenarios such as disaster recovery. However, unlike log
shipping, distributed availability groups can't have delayed application of
transactions. This means that availability groups or distributed availability
groups can't help in the event of human error in which data is incorrectly
updated or deleted.
Distributed availability groups are
loosely coupled, which in this case means that they don't require a single WSFC
and they're maintained by SQL Server. Because the WSFCs are maintained
individually and the synchronization is primarily asynchronous between the two
availability groups, it's easier to configure disaster recovery at another
site. The primary replicas in each availability group synchronize their own
secondary replicas.
Because distributed availability groups
support two completely different availability group configurations, they enable
not only easier disaster-recovery and multi-site scenarios, but also migration
scenarios. Whether you're migrating to new hardware or virtual machines
(on-premises or IaaS in the public cloud), configuring a distributed
availability group allows a migration to occur where, in the past, you might
have used backup, copy, and restore, or log shipping.
The ability to migrate is especially
useful in scenarios where you're changing or upgrading the underlying OS while
you keep the same SQL Server version. Although Windows Server 2016 does allow a
rolling upgrade from Windows Server 2012 R2 on the same hardware, most users
choose to deploy new hardware or virtual machines.
To complete the migration to the new
configuration, at the end of the process, stop all data traffic to the original
availability group, and change the distributed availability group to
synchronous data movement. This action ensures that the primary replica of the
second availability group is fully synchronized, so there would be no data
loss. After you've verified the synchronization, fail over the distributed
availability group to the secondary availability group. For more information,
see Fail over to
a secondary availability group.
Post-migration, where the second
availability group is now the new primary availability group, you might need to
do either of the following steps:
During a migration scenario, while it's
possible to configure a distributed AG to migrate your databases to a SQL
Server target that is a higher version than the source, there are a few
limitations.
When you configure the distributed AG
with a SQL Server migration target that is a higher version than the source,
auto seeding isn't supported so the seeding mode must be set to MANUAL. If
you don't disable AUTO-SEEDING, your migration will fail and you'll see error
946 "Cannot open database 'Distribution AG' version xxx. Upgrade the
database to the latest version" in the error log. You must set seeding
mode to MANUAL and manually perform a full and transaction log backup of the
source database from the primary AG. Then manually restore it, along with the
transaction log, to the secondary AG. To learn more, review the manual
seeding steps to configure your distributed AG, and scripts to
back up and restore your database from the primary AG to the secondary AG.
Assuming the secondary AG (AG2) is the
migration target and is a higher version than the primary AG (AG1), consider
the following limitations:
A single distributed availability group
can have up to 16 secondary replicas, as needed. So it can have up 18 copies
for reading, including the two primary replicas of the different availability
groups. This approach means that more than one site can have near-real-time
access for reporting to various applications.
Distributed availability groups can help
you scale out a read-only farm more than you can with just a single
availability group. A distributed availability group can scale out readable
replicas in two ways:
In other words, a primary replica can
participate in different distributed availability groups. The following figure
shows AG 1 and AG 2 both participating in Distributed AG 1, while AG 2 and AG 3
are participating in Distributed AG 2. The primary replica (or forwarder) of AG
2 is both a secondary replica for Distributed AG 1 and a primary replica of
Distributed AG 2.
The following figure shows AG 1 as the
primary replica for two different distributed availability groups: Distributed
AG 1 (composed of AG 1 and AG2) and Distributed AG 2 (composed of AG 1 and AG
3).
In both preceding examples, there can be
up to 27 total replicas across the three availability groups, all of which can
be used for read-only queries.
Read-only
routing doesn't completely work with Distributed Availability
Groups. More specifically,
Distributed availability groups were
designed with automatic
seeding to be the main method used to initialize the primary
replica on the second availability group. A full database restore on the
primary replica of the second availability group is possible if you do the
following:
When you add the second availability
group's primary replica to the distributed availability group, the replica is
checked against the first availability group's primary databases, and automatic
seeding catches the database up to the source. There are a few caveats:
Monitor health
All detailed information about a
distributed availability group is in SQL Server, specifically in the
availability-group dynamic management views. Currently, the only information
shown in SQL Server Management Studio for a distributed availability group is
on the primary replica for the availability groups. As shown in the following
figure, under the Availability Groups folder, SQL Server Management Studio
shows that there is a distributed availability group. The figure shows AG1 as a
primary replica for an individual availability group that's local to that
instance, not for a distributed availability group.
However, if you right-click the
distributed availability group, no options are available (see the following
figure), and the expanded Availability Databases, Availability Group Listeners,
and Availability Replicas folders are all empty. SQL Server Management Studio
16 displays this result, but it might change in a future version of SQL Server
Management Studio.
As shown in the following figure,
secondary replicas show nothing in SQL Server Management Studio related to the
distributed availability group. These availability group names map to the roles
shown in the previous CLUSTER_A WSFC image.
The same concepts hold true when you use
the dynamic management views. By using the following query, you can see all the
availability groups (regular and distributed) and the nodes participating in
them. This result is displayed only if you query the primary replica in one of
the WSFCs that are participating in the distributed availability group. There
is a new column in the dynamic management view sys. Availability groups named is
distributed, which is 1 when the availability group is a distributed
availability group. To see this column:
An example of output from the second
WSFC that's participating in a distributed availability group is shown in the
following figure. SPAG1 is composed of two replicas: DENNIS and JY. However,
the distributed availability group named SPD ist AG has the names of the two
participating availability groups (SPAG1 and SPAG2) rather than the names of
the instances, as with a traditional availability group.
In SQL Server Management Studio, any
status shown on the Dashboard and other areas, are for local synchronization
only within that availability group. To display the health of a distributed
availability group, query the dynamic management views. The following example
query extends and refines the previous query:
To further extend the previous query,
you can also see the underlying performance via the dynamic management views by
adding in sys.dm hadr database replicas states. The dynamic management
view currently stores information about the second availability group only. The
following example query, run on the primary availability group, produces the
sample output shown below:
The below query displays performance
counters associated with the specific distributed availability group.
The below query displays a wealth of
information about the health of both the availability group, and the
distributed availability group. (Reproduced with permission from Tracy
Boggiano.)
The below queries will display
information about endpoint URLs used by the availability groups, including the
distributed availability group. (Reproduced with permission from David
Barbarin.)
The below query displays information
about the current state of seeding. This is useful for troubleshooting
synchronization errors between replicas. (Reproduced with permission
from David
Barbarin.)
Years of Experience
Gratified Students
Training Batches
Training Hours
Please subscribe our technical blog to get recent updates.