I support a two-node cluster running Windows Server 2003 (x86) and hosting 8 SQL Server 2005 instances. I upgraded a new instance to SP4 today, but the installation froze. The logs showed that the installation had completed successfully, so after 2 hours I killed the installer and attempted to bring the engine service online through cluster manager. The engine service wouldn't start, and it shows in cluster manager as being in a failed state.
When I try to bring the engine service online through cluster manager, the service never shows as online and, after a few moments, appears as failed again. If I start the service manually using the following, it starts fine:
net start "Machine\Foo"
But, when started like this, it doesn't show as online in cluster manager.
It appears that when started through cluster manager, the instance comes online in single-user mode. Once it has started, some process connects to the instance, and as a result cluster manager's "alive" pollers can't connect. That causes a failover, which fails (that's another story), and the instance stays in a failed state.
I've verified in the registry that the startup parameters for the instance don't include "-m". Why is the instance starting in single-user mode when started from cluster manager?
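For anyone who wants to double-check the same thing from the log rather than the registry, here is a rough Python sketch. It assumes the ERRORLOG text has already been read into a string and that the parameters appear on the lines directly after "Registry startup parameters:", as they do in the log below. The function names `startup_params` and `has_single_user_flag` are my own, not part of any SQL Server tooling.

```python
def startup_params(errorlog_text):
    """Collect the lines listed under 'Registry startup parameters:'.

    Assumes each parameter sits on its own line starting with '-',
    and that the block ends at the first line that does not.
    """
    params = []
    in_block = False
    for line in errorlog_text.splitlines():
        stripped = line.strip()
        if "Registry startup parameters:" in stripped:
            in_block = True
            continue
        if in_block:
            if stripped.startswith("-"):
                params.append(stripped)
            else:
                break  # first non-parameter line closes the block
    return params


def has_single_user_flag(params):
    """Return True if any parameter starts with the single-user flag -m."""
    return any(p.startswith("-m") for p in params)
```

This only reports what the engine logged at startup; it says nothing about parameters a resource DLL or another caller might pass in some other way.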
ERRORLOG
2013-01-25 18:47:35.92 Server Microsoft SQL Server 2005 - 9.00.5000.00 (Intel X86)
Dec 10 2010 10:56:29
Copyright (c) 1988-2005 Microsoft Corporation
Enterprise Edition on Windows NT 5.2 (Build 3790: Service Pack 2)
(c) 2005 Microsoft Corporation.
All rights reserved.
Server process ID is 6140.
Authentication mode is MIXED.
Logging SQL Server messages in file 'L:\Microsoft SQL Server\MSSQL.8\MSSQL\LOG\ERRORLOG'.
This instance of SQL Server last reported using a process ID of 3564 at 1/25/2013 6:39:43 PM (local) 1/25/2013 11:39:43 PM (UTC). This is an informational message only; no user action is required.
Registry startup parameters:
-d L:\Microsoft SQL Server\MSSQL.8\MSSQL\DATA\master.mdf
-e L:\Microsoft SQL Server\MSSQL.8\MSSQL\LOG\ERRORLOG
-l L:\Microsoft SQL Server\MSSQL.8\MSSQL\DATA\mastlog.ldf
SQL Server is starting at normal priority base (=7). This is an informational message only. No user action is required.
Detected 16 CPUs. This is an informational message; no user action is required.
Set AWE Enabled to 1 in the configuration parameters to allow use of more memory.
Using dynamic lock allocation. Initial allocation of 2500 Lock blocks and 5000 Lock Owner blocks per node. This is an informational message only. No user action is required.
Lock partitioning is enabled. This is an informational message only. No user action is required.
Attempting to initialize Microsoft Distributed Transaction Coordinator (MS DTC). This is an informational message only. No user action is required.
Attempting to recover in-doubt distributed transactions involving Microsoft Distributed Transaction Coordinator (MS DTC). This is an informational message only. No user action is required.
Database mirroring has been enabled on this instance of SQL Server.
Starting up database 'master'.
Recovery is writing a checkpoint in database 'master' (1). This is an informational message only. No user action is required.
SQL Trace ID 1 was started by login "sa".
Starting up database 'mssqlsystemresource'.
The resource database build version is 9.00.5000. This is an informational message only. No user action is required.
Server name is 'Machine\Foo'. This is an informational message only. No user action is required.
Starting up database 'model'.
The NETBIOS name of the local node that is running the server is 'NodeName'. This is an informational message only. No user action is required.
Clearing tempdb database.
A self-generated certificate was successfully loaded for encryption.
Server is listening on [ 192.168.0.1 <ipv4> 1860].
Server local connection provider is ready to accept connection on [ \\.\pipe\SQLLocal\Foo ].
Server local connection provider is ready to accept connection on [ \\.\pipe\$$\Machine\MSSQL$Foo\sql\query ].
Starting up database 'tempdb'.
The SQL Network Interface library could not register the Service Principal Name (SPN) for the SQL Server service. Error: 0x2098, state: 15. Failure to register an SPN may cause integrated authentication to fall back to NTLM instead of Kerberos. This is an informational message. Further action is only required if Kerberos authentication is required by authentication policies.
SQL Server is now ready for client connections. This is an informational message; no user action is required.
Starting up database 'DB3'.
Starting up database 'DB1'.
Starting up database 'DB2'.
Starting up database 'msdb'.
The Service Broker protocol transport is disabled or not configured.
The Database Mirroring protocol transport is disabled or not configured.
Error: 18456, Severity: 14, State: 11.
Login failed for user 'domain\srvaccount'. [CLIENT: 192.168.0.1]
...
Error: 18456, Severity: 14, State: 11.
Login failed for user 'domain\srvaccount'. [CLIENT: 192.168.0.1]
Analysis of database 'DB3' (7) is 100% complete (approximately 0 seconds remain). This is an informational message only. No user action is required.
CHECKDB for database 'DB3' finished without errors on 2012-12-13 18:01:36.530 (local time). This is an informational message only; no user action is required.
CHECKDB for database 'DB1' finished without errors on 2012-12-13 18:01:36.890 (local time). This is an informational message only; no user action is required.
Error: 18456, Severity: 14, State: 11.
Login failed for user 'domain\srvaccount'. [CLIENT: 192.168.0.1]
Error: 18456, Severity: 14, State: 11.
Login failed for user 'domain\srvaccount'. [CLIENT: 192.168.0.1]
Error: 18456, Severity: 14, State: 11.
Login failed for user 'domain\srvaccount'. [CLIENT: 192.168.0.1]
CHECKDB for database 'DB2' finished without errors on 2012-12-13 18:01:38.013 (local time). This is an informational message only; no user action is required.
Recovery of any in-doubt distributed transactions involving Microsoft Distributed Transaction Coordinator (MS DTC) has completed. This is an informational message only. No user action is required.
Recovery is complete. This is an informational message only. No user action is required.
Error: 18456, Severity: 14, State: 11.
Login failed for user 'domain\srvaccount'. [CLIENT: 192.168.0.1]
....
Error: 18456, Severity: 14, State: 11.
Login failed for user 'domain\srvaccount'. [CLIENT: 192.168.0.1]
Service Broker manager has shut down.
SQL Server is terminating in response to a 'stop' request from Service Control Manager. This is an informational message only. No user action is required.
SQL Trace was stopped due to server shutdown. Trace ID = '1'. This is an informational message only; no user action is required.
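Since I trimmed the repeated login failures above, here is a rough Python sketch of how they could be tallied from the full log. It assumes the log text is already in a string and that each "Error: 18456" line is followed by its "Login failed for user" line, as in the excerpt. The name `tally_login_failures` is just something I made up for illustration.

```python
import re
from collections import Counter

# Patterns match the two-line shape seen in the log excerpt above.
ERROR_RE = re.compile(r"Error:\s*(\d+),\s*Severity:\s*(\d+),\s*State:\s*(\d+)")
LOGIN_RE = re.compile(r"Login failed for user '([^']*)'")


def tally_login_failures(errorlog_text):
    """Count 18456 login failures, keyed by (user, state)."""
    counts = Counter()
    pending_state = None  # state from the most recent 18456 error line
    for line in errorlog_text.splitlines():
        err = ERROR_RE.search(line)
        if err:
            # Remember the state only for login-failure errors.
            pending_state = int(err.group(3)) if err.group(1) == "18456" else None
            continue
        login = LOGIN_RE.search(line)
        if login and pending_state is not None:
            counts[(login.group(1), pending_state)] += 1
            pending_state = None
    return counts
```

Calling `tally_login_failures(text).most_common()` lists the (user, state) pairs by frequency, which makes it easy to see whether every failure comes from the same account and state.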
Adam