Database Mirroring in SQL Server 2005 - My research for a customer
One of my customers was working on a geographically dispersed HA solution for 2006, and they wondered how SQL Server 2005 Database Mirroring would play into their decision. In order to give an intelligent analysis, I began working with Database Mirroring. I spent a week at Platform Labs in Columbus, Ohio setting up and testing Mirroring. This is what I found.
The Test Environment
SQL Server 2005 Beta 2 August release.
Windows 2003 with some hotfixes
Server1
• 2 Procs
• 4GB RAM
• 500GB Direct Attach Storage
• Mirror Principal and Repl Publisher
Server2
• 2 Procs
• 4GB RAM
• 500GB DA Storage
• Mirror and Repl Subscriber
Server3
• 1 Proc
• 2GB RAM
• 18GB Storage
• Witness and Repl Distributor
The network was a 100Mb LAN connected through a Cisco switch.
Setup
4 databases were restored from a SQL Server 2000 server. The upgrade worked great (Thanks MSFT!). I then tried to restore the SQL2k backups on the mirror and the setup mirroring. Of course, this did not work. Don't try this.
Next I backed up the SQL2k5 DBs and restored them to the mirror partner WITH NORECOVERY. Thanks to Skip Shaw for helping me through this part and for giving me some scripts he'd used in his Beta 1 review!
Microsoft has created a new feature called Endpoints, which are the underpinning of server to server communications for Mirroring and Service Broker I believe, too. We created endpoints on all 3 servers with similar script to the following.
CREATE ENDPOINT endpoint_mirroring
STATE = STARTED
AS TCP ( LISTENER_PORT = 5022 ) --different ports for each server
FOR DATABASE_MIRRORING (ROLE=PARTNER) --change to role = witness for that server
The next step was to create the mirroring session. This is where we started to run into problems. The scripts from BOL did not work as advertised. We received error messages stating that the mirror was not ready or database could not be found. Reluctantly, we went to the new Beta 2 GUI, something a good DBA should never do, right? It was a life saver. Microsoft did a pretty good job here. Using the GUI, the first database mirroring session was started by just following the wizard. It recreated the endpoints and setup all the necessary partnerships.
Being the curious person that I am, I wondered what the GUI was doing that I wasn't with the scripts, so I fired up Profiler and captured while I set it up again. After some head scratching, I couldn't see any differences. The GUI has the Mojo I guess!?!
Since there were 4 databases to work with, I started setting up mirroring for the next one in line using the magical GUI. It created the mirroring session, but it disrupted the original database mirroring session (Hmm. Strange). Profiler was still running so I peeked to see what was going on. An ALTER ENDPOINT command was issued, and this seemed to stop and restart the previous session. Very undesirable! Fortunately, the databases came into a synchronized state and we moved on in the same fashion with the other databases. Finally, we had all 4 databases running in high availability mirroring sessions with a witness. To see the state of data mirrors look in sysdatabases specifically at the columns with mirror in the name.
Unfortunately, the good state of the mirrors didn't last long. Without any provocation, a database failed over to the mirror. The rest of the databases disconnected and reconnected sporadically. In a few minutes there were 2 databases on one partner, 1 database on the original principal, and 1 database that was in disconnected state. Chaos! What if this was production and the application required all the databases to be together? EEEK!
I was able to recover from this state by manually failing over some databases and restarting the services on the mirror. Again, very undesirable!
Wanting to eliminate variables, we decided to scale back the test and remove the witness from the mix. This is still synchronous and the 2 servers work it out between themselves. Simpler, right? After setting up mirroring again on all 4 databases, the results were much of the same. The databases were unstable and failed over without any intervention. At one point, we actually could not regain control of one database. Both nodes thought the other owned it. We had to drop the database and completely start from scratch with it. EEK again!
Continuing once again to eliminate variables, I decided to go down to Asynchronous Mirroring, which is not much more than good ol' log shipping, but with granularity to the transaction instead of the entire log. Success! This configuration seemed to be rock solid!!! We were able to manually fail over the databases and back again. We were able to run transactions and get timings for latency. Life was very good!
Replication
Trying to recreate a similar environment to that of the customer's, I setup transactional replication from the Server1 to Server2 with Server3 as the distributor. This too was fairly easy. MSFT has done good things with repl in SQL2k5, but the GUI could use some work (more in another post later). The goal of this was to see how Repl would recover to a failed over mirror. As you might guess, the log reader failed when trying to connect to a mirrored partner in a mirroring session. Not a bad thing, but it would be great to have it fail-over too. This might be asking too much for version 1.0, though.
Performance
With databases on Server1 mirrored to Server2 and replication still intact, we began to take timings of some simple transactions. SELECT statements were not affected in any way by mirroring. An UPDATE of 40k rows took about 1 second. We were rather impressed that performance was so good.
In order to compare Asynchronous Mirroring to Synchronous, we again tried to setup 1 database where we had done the timings in a 2 server Synchronous Mirror, which was successful. Again SELECT statements were not affected, but the 40k row UPDATE took 28 seconds, much longer than the Asynchronous mode. Being so surprised by this, we tried other transactions, receiving significant performance degradation with Asynchronous mode. While we expected some degradation, we did not expect it to be this large. The more disturbing part was that it was on a high speed LAN. What would happen on a WAN?
Conclusion
Database mirroring has some extremely great applications for organizations that need low cost high availability. It has really great implications for rolling system upgrades when automatic application failover is in place. However, it looks like Microsoft still has some work to do to make Synchronous Mirroring as robust as Clustering is today. The Asynchronous mode seems to work very well, and I will be recommending it to the customer. The good news is that this is still Beta 2, and there is much time for Microsoft to get this corrected. I look forward to future releases to see how they are progressing.
The Test Environment
SQL Server 2005 Beta 2 August release.
Windows 2003 with some hotfixes
Server1
• 2 Procs
• 4GB RAM
• 500GB Direct Attach Storage
• Mirror Principal and Repl Publisher
Server2
• 2 Procs
• 4GB RAM
• 500GB DA Storage
• Mirror and Repl Subscriber
Server3
• 1 Proc
• 2GB RAM
• 18GB Storage
• Witness and Repl Distributor
The network was a 100Mb LAN connected through a Cisco switch.
Setup
4 databases were restored from a SQL Server 2000 server. The upgrade worked great (Thanks MSFT!). I then tried to restore the SQL2k backups on the mirror and the setup mirroring. Of course, this did not work. Don't try this.
Next I backed up the SQL2k5 DBs and restored them to the mirror partner WITH NORECOVERY. Thanks to Skip Shaw for helping me through this part and for giving me some scripts he'd used in his Beta 1 review!
Microsoft has created a new feature called Endpoints, which are the underpinning of server to server communications for Mirroring and Service Broker I believe, too. We created endpoints on all 3 servers with similar script to the following.
CREATE ENDPOINT endpoint_mirroring
STATE = STARTED
AS TCP ( LISTENER_PORT = 5022 ) --different ports for each server
FOR DATABASE_MIRRORING (ROLE=PARTNER) --change to role = witness for that server
The next step was to create the mirroring session. This is where we started to run into problems. The scripts from BOL did not work as advertised. We received error messages stating that the mirror was not ready or database could not be found. Reluctantly, we went to the new Beta 2 GUI, something a good DBA should never do, right? It was a life saver. Microsoft did a pretty good job here. Using the GUI, the first database mirroring session was started by just following the wizard. It recreated the endpoints and setup all the necessary partnerships.
Being the curious person that I am, I wondered what the GUI was doing that I wasn't with the scripts, so I fired up Profiler and captured while I set it up again. After some head scratching, I couldn't see any differences. The GUI has the Mojo I guess!?!
Since there were 4 databases to work with, I started setting up mirroring for the next one in line using the magical GUI. It created the mirroring session, but it disrupted the original database mirroring session (Hmm. Strange). Profiler was still running so I peeked to see what was going on. An ALTER ENDPOINT command was issued, and this seemed to stop and restart the previous session. Very undesirable! Fortunately, the databases came into a synchronized state and we moved on in the same fashion with the other databases. Finally, we had all 4 databases running in high availability mirroring sessions with a witness. To see the state of data mirrors look in sysdatabases specifically at the columns with mirror in the name.
Unfortunately, the good state of the mirrors didn't last long. Without any provocation, a database failed over to the mirror. The rest of the databases disconnected and reconnected sporadically. In a few minutes there were 2 databases on one partner, 1 database on the original principal, and 1 database that was in disconnected state. Chaos! What if this was production and the application required all the databases to be together? EEEK!
I was able to recover from this state by manually failing over some databases and restarting the services on the mirror. Again, very undesirable!
Wanting to eliminate variables, we decided to scale back the test and remove the witness from the mix. This is still synchronous and the 2 servers work it out between themselves. Simpler, right? After setting up mirroring again on all 4 databases, the results were much of the same. The databases were unstable and failed over without any intervention. At one point, we actually could not regain control of one database. Both nodes thought the other owned it. We had to drop the database and completely start from scratch with it. EEK again!
Continuing once again to eliminate variables, I decided to go down to Asynchronous Mirroring, which is not much more than good ol' log shipping, but with granularity to the transaction instead of the entire log. Success! This configuration seemed to be rock solid!!! We were able to manually fail over the databases and back again. We were able to run transactions and get timings for latency. Life was very good!
Replication
Trying to recreate a similar environment to that of the customer's, I setup transactional replication from the Server1 to Server2 with Server3 as the distributor. This too was fairly easy. MSFT has done good things with repl in SQL2k5, but the GUI could use some work (more in another post later). The goal of this was to see how Repl would recover to a failed over mirror. As you might guess, the log reader failed when trying to connect to a mirrored partner in a mirroring session. Not a bad thing, but it would be great to have it fail-over too. This might be asking too much for version 1.0, though.
Performance
With databases on Server1 mirrored to Server2 and replication still intact, we began to take timings of some simple transactions. SELECT statements were not affected in any way by mirroring. An UPDATE of 40k rows took about 1 second. We were rather impressed that performance was so good.
In order to compare Asynchronous Mirroring to Synchronous, we again tried to setup 1 database where we had done the timings in a 2 server Synchronous Mirror, which was successful. Again SELECT statements were not affected, but the 40k row UPDATE took 28 seconds, much longer than the Asynchronous mode. Being so surprised by this, we tried other transactions, receiving significant performance degradation with Asynchronous mode. While we expected some degradation, we did not expect it to be this large. The more disturbing part was that it was on a high speed LAN. What would happen on a WAN?
Conclusion
Database mirroring has some extremely great applications for organizations that need low cost high availability. It has really great implications for rolling system upgrades when automatic application failover is in place. However, it looks like Microsoft still has some work to do to make Synchronous Mirroring as robust as Clustering is today. The Asynchronous mode seems to work very well, and I will be recommending it to the customer. The good news is that this is still Beta 2, and there is much time for Microsoft to get this corrected. I look forward to future releases to see how they are progressing.
