Tuesday, January 11, 2005

Database Mirroring in SQL Server 2005 - My research for a customer

One of my customers was working on a geographically dispersed HA solution for 2006, and they wondered how SQL Server 2005 Database Mirroring would play into their decision. In order to give an intelligent analysis, I began working with Database Mirroring. I spent a week at Platform Labs in Columbus, Ohio setting up and testing Mirroring. This is what I found.

The Test Environment
SQL Server 2005 Beta 2 August release.
Windows 2003 with some hotfixes

Server1
• 2 Procs
• 4GB RAM
• 500GB Direct Attach Storage
• Mirror Principal and Repl Publisher

Server2
• 2 Procs
• 4GB RAM
• 500GB DA Storage
• Mirror and Repl Subscriber

Server3
• 1 Proc
• 2GB RAM
• 18GB Storage
• Witness and Repl Distributor

The network was a 100Mb LAN connected through a Cisco switch.

Setup
4 databases were restored from a SQL Server 2000 server. The upgrade worked great (Thanks MSFT!). I then tried to restore the SQL2k backups on the mirror and the setup mirroring. Of course, this did not work. Don't try this.

Next I backed up the SQL2k5 DBs and restored them to the mirror partner WITH NORECOVERY. Thanks to Skip Shaw for helping me through this part and for giving me some scripts he'd used in his Beta 1 review!

Microsoft has created a new feature called Endpoints, which are the underpinning of server to server communications for Mirroring and Service Broker I believe, too. We created endpoints on all 3 servers with similar script to the following.

CREATE ENDPOINT endpoint_mirroring
STATE = STARTED
AS TCP ( LISTENER_PORT = 5022 ) --different ports for each server
FOR DATABASE_MIRRORING (ROLE=PARTNER) --change to role = witness for that server


The next step was to create the mirroring session. This is where we started to run into problems. The scripts from BOL did not work as advertised. We received error messages stating that the mirror was not ready or database could not be found. Reluctantly, we went to the new Beta 2 GUI, something a good DBA should never do, right? It was a life saver. Microsoft did a pretty good job here. Using the GUI, the first database mirroring session was started by just following the wizard. It recreated the endpoints and setup all the necessary partnerships.

Being the curious person that I am, I wondered what the GUI was doing that I wasn't with the scripts, so I fired up Profiler and captured while I set it up again. After some head scratching, I couldn't see any differences. The GUI has the Mojo I guess!?!

Since there were 4 databases to work with, I started setting up mirroring for the next one in line using the magical GUI. It created the mirroring session, but it disrupted the original database mirroring session (Hmm. Strange). Profiler was still running so I peeked to see what was going on. An ALTER ENDPOINT command was issued, and this seemed to stop and restart the previous session. Very undesirable! Fortunately, the databases came into a synchronized state and we moved on in the same fashion with the other databases. Finally, we had all 4 databases running in high availability mirroring sessions with a witness. To see the state of data mirrors look in sysdatabases specifically at the columns with mirror in the name.

Unfortunately, the good state of the mirrors didn't last long. Without any provocation, a database failed over to the mirror. The rest of the databases disconnected and reconnected sporadically. In a few minutes there were 2 databases on one partner, 1 database on the original principal, and 1 database that was in disconnected state. Chaos! What if this was production and the application required all the databases to be together? EEEK!

I was able to recover from this state by manually failing over some databases and restarting the services on the mirror. Again, very undesirable!

Wanting to eliminate variables, we decided to scale back the test and remove the witness from the mix. This is still synchronous and the 2 servers work it out between themselves. Simpler, right? After setting up mirroring again on all 4 databases, the results were much of the same. The databases were unstable and failed over without any intervention. At one point, we actually could not regain control of one database. Both nodes thought the other owned it. We had to drop the database and completely start from scratch with it. EEK again!

Continuing once again to eliminate variables, I decided to go down to Asynchronous Mirroring, which is not much more than good ol' log shipping, but with granularity to the transaction instead of the entire log. Success! This configuration seemed to be rock solid!!! We were able to manually fail over the databases and back again. We were able to run transactions and get timings for latency. Life was very good!

Replication
Trying to recreate a similar environment to that of the customer's, I setup transactional replication from the Server1 to Server2 with Server3 as the distributor. This too was fairly easy. MSFT has done good things with repl in SQL2k5, but the GUI could use some work (more in another post later). The goal of this was to see how Repl would recover to a failed over mirror. As you might guess, the log reader failed when trying to connect to a mirrored partner in a mirroring session. Not a bad thing, but it would be great to have it fail-over too. This might be asking too much for version 1.0, though.

Performance
With databases on Server1 mirrored to Server2 and replication still intact, we began to take timings of some simple transactions. SELECT statements were not affected in any way by mirroring. An UPDATE of 40k rows took about 1 second. We were rather impressed that performance was so good.

In order to compare Asynchronous Mirroring to Synchronous, we again tried to setup 1 database where we had done the timings in a 2 server Synchronous Mirror, which was successful. Again SELECT statements were not affected, but the 40k row UPDATE took 28 seconds, much longer than the Asynchronous mode. Being so surprised by this, we tried other transactions, receiving significant performance degradation with Asynchronous mode. While we expected some degradation, we did not expect it to be this large. The more disturbing part was that it was on a high speed LAN. What would happen on a WAN?

Conclusion
Database mirroring has some extremely great applications for organizations that need low cost high availability. It has really great implications for rolling system upgrades when automatic application failover is in place. However, it looks like Microsoft still has some work to do to make Synchronous Mirroring as robust as Clustering is today. The Asynchronous mode seems to work very well, and I will be recommending it to the customer. The good news is that this is still Beta 2, and there is much time for Microsoft to get this corrected. I look forward to future releases to see how they are progressing.

Wednesday, January 05, 2005

SQL Server 2005 Customer Application Smoke Test Success!

The smoke test of in-house applications at one of my customers is complete, and it was a huge success! Kudos to Microsoft for making SQL Server 2005 so backward compatible. There were 2 minor issus (see previous Blog entries), but otherwise there were no changes made to the applications in order to run on SQL Server 2005. This was much better than any of us expected, or even hoped for.

SQL Server 2005 Deprecated Table Hint

While working with SQL Server 2005 CTP December 2004, I ran into a deprecated feature last week. It appears that Microsoft is no longer supporting the WITH (INDEX = "index_name") syntax. They have been saying since 7.0 to use WITH (INDEX("index_name"), but it appears that they really mean it this time. :)


Wednesday, December 29, 2004

System Table Security in SQL Server 2005

I ran into an interesting issue today while working with SQL Server 2005 Beta 2 CTP December 2004. It would appear that Microsoft has greatly restricted permissions, even SELECTing from master system tables. Here's the situation we had.

TableA in user DatabaseZ had an insert/update trigger to insert into TableB in DatabaseY, but only if the context_info from master..sysprocesses was not = 1, which was set to 1 if the application set the context_info when it logged in. (Yeah, I know this should really be added to http://thedailywtf.com/). When the trigger tried to SELECT from sysprocesses the user calling did not have permission to read the table, thus causing an error.

To correct this we (thanks to Bruce Nation and Peter Watson) changed the trigger to use the new context_info() function instead of selecting from sysprocesses.

We still have not determined if this is just a broken permissions chain or if MSFT just won't allow this. We were not able to GRANT to the system tables in master, and we weren't able to wrap them in views, SPs, or functions either. It just appears that MSFT doesn't want us using the system tables for ANYTHING anymore.

If anyone has additional insight into this problem, posts would be appreciated.

Monday, December 27, 2004

Error Renaming Server with SQL Server 2005 CTP December 2004

I have just recovered from an interesting error with SQL Server 2005 CTP December 2004. Part of the project at a customer is to "Smoke Test" the in-house applications against SQL Server 2005 as-is. To do this, we built a SQL Server 2005 server to be identical to an existing non-production SQL2000 server, swapped servers(rename), and re-pointed the applications for testing.

While renaming servers I ran into a problem where SQL Server 2005 would start, would not allow user connections of any kind, and quickly stopped. The only way I found to recover from it was to move the databases out of the install path and reinstall.

The error reported by SQL Server was TDSSNICLIENT INITIALIZATION FAILURE. Error: 17182, Severity: 16, State: 1. There were subsequent errors, but this was the first in the chain.

I could find no documentation on the error on the net or the documentation with the Beta. To recover, I reinstalled and moved the databases (master and all) into the install path and restarted SQL Server. I then performed the usual sp_dropserver/sp_addserver steps of renaming a server.

All is well, but I thought I would post this just in case anyone else runs across it.


Jon Baker

Thursday, December 16, 2004

Why did I start a blog?

It is probably bad etiquette to start a blog with a rant, but here goes

<rant>

Why are there so few blogs about SQL Server?!?!? Other than Microsoft employees and a few other well-connected people, I could find very little blogging about SQL Server on the big wide internet.

I actually went looking for a new source of SQL Server knowledge in the form of blogs, a good RSS reader, and the informal community of SQL Server developers/DBAs. I would never have been so pretentious as to think, "Gee, I should start a blog on SQL Server!" But with so little published work out there, I was compelled to create some work in this space.

If I have missed something and/or I AM being pretentious, someone please tell me!
</rant>

Future posts will include work that I'm doing at customer sites, research into SQL Server 2005, my presentations from the local SQL Server SIG, and anything else I think might be interesting to the SQL Server community. Please feel free to give me feedback on the things I write. I am a firm believer that debate is healthy, and I love to learn from others.