Tuesday, March 27, 2012

Error 17883 The Scheduler 0 appears to be hung

I get this message every time I try to take a backup, both manually and via
the scheduler. What is the problem?
--
Best wishes

See if these help:
http://support.microsoft.com/kb/815056
http://support.microsoft.com/kb/810885
-oj
"Mats" <Mats@.discussions.microsoft.com> wrote in message
news:289BEB45-7BDF-47DD-937F-0425DB689549@.microsoft.com...
>I get this message every time i try to take a backup. Both manually and via
> the Scheduler. What is the problem?
> --
> Best wishes

Well, I do not know. The funny thing is that I get the same message if I
reboot the system and take a manual backup. The system gets stuck and uses
almost all of the processor power. I then have to kill Enterprise Manager.
The backup process stays behind and I cannot kill it. It
disappears only with a reboot. Rebooting at that point is also extremely
difficult: you have to shut off with the power button, as the whole machine is
stuck. I then see the message in the error log.
If I take the backup via the scheduler, nothing happens; it only adds
processes until the next reboot, and then I see the message.
The system ran fine for over a month and then this suddenly happened. The
machine, by the way, is a two-processor Dell.
"oj" wrote:

> See if these help:
> http://support.microsoft.com/kb/815056
> http://support.microsoft.com/kb/810885
>
> --
> -oj
>
> "Mats" <Mats@.discussions.microsoft.com> wrote in message
> news:289BEB45-7BDF-47DD-937F-0425DB689549@.microsoft.com...
>
>

Backups should not spike your CPUs. My guess is that you have another
problem on this server and it is hanging SQL Server. You should engage MS support
and have them review the 17883 minidumps. You need to be on build 818 (MS03-031
patch) or higher to generate good minidumps.
Adrian
"Mats" <Mats@.discussions.microsoft.com> wrote in message
news:3EB13F27-EDF3-47E6-9CBB-94A8960CAC20@.microsoft.com...
>

I have just installed 818. How do I generate a minidump then?
"Adrian Zajkeskovic" wrote:

> Backups should not spike your CPUs. My guess is that you have another
> problem on this server and it's hanging SQL. You should engage MS support
> and have them review 17883 minidumps. You need to be on build 818 (MS03-031
> patch) or higher to generate good minidumps.
> Adrian
>
> "Mats" <Mats@.discussions.microsoft.com> wrote in message
> news:3EB13F27-EDF3-47E6-9CBB-94A8960CAC20@.microsoft.com...
>
>

Backups should not cause this but could trigger behavior that you don't
expect.
Perhaps this will help.
The 17883 is a generic message (often referred to as the service engine
light) indicating that the UMS scheduling mechanism has detected a
worker that is not properly yielding to other workers within SQL Server.
The message was added in SQL 2000 SP3, but a later build (8.00.818
security patch) is required to get extended diagnostics. When the error is
first encountered, a mini-dump is generated capturing the state of the
non-yielding thread. SQL support can use the mini-dump to
determine where in the code SQL Server is not yielding.
Since SP3, Microsoft has corrected or protected the engine in approximately 25
places in the SQL Server code base that may not yield to other workers
properly, diminishing concurrency of the database engine. That is why
you find so many references to upgrading to a later QFE. If you have
encountered the 17883 on SP3 (8.00.760), where the mini-dump is not
generated and the cause cannot be determined, I generally recommend
going to 8.00.997 to obtain the majority of these corrections.
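As a rough illustration of the build thresholds mentioned in this thread, the Python sketch below classifies a SQL Server 2000 build string (for example, the value returned by SELECT SERVERPROPERTY('ProductVersion')). The function name and the returned labels are hypothetical; only the build numbers 760, 818, and 997 come from the posts here.

```python
def classify_build(version: str) -> str:
    """Classify a SQL Server 2000 build string such as '8.00.818'.

    The thresholds are the ones quoted in this thread; the labels are
    illustrative and not official Microsoft terminology.
    """
    # Use only major.minor.build; ignore any trailing revision part.
    major, minor, build = (int(part) for part in version.split(".")[:3])
    if (major, minor) != (8, 0):
        return "not SQL Server 2000"
    if build < 760:
        return "pre-SP3: no 17883 message"
    if build < 818:
        return "17883 reported, no mini-dump"
    if build < 997:
        return "17883 reported with mini-dump"
    return "mini-dump plus most non-yield fixes"


if __name__ == "__main__":
    for v in ("8.00.760", "8.00.818", "8.00.997"):
        print(v, "->", classify_build(v))
```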
Future versions of SQL Server are working to make these mini-dumps
Watson enabled so they can be directly submitted to the Watson site and
automated responses generated.
The problems have ranged from SQL Server bugs to getting stuck in API
calls that should return quickly but from time to time do not. For
example if we make a call to a security API and the PDC/BDC does not
return a response for 60 seconds (unusual and unexpected) the worker
ties up the specific UMS scheduler.
Do not be misled: the worker is still an NT thread or fiber and must
honor NT scheduling. The design of SQL Server (UMS scheduling) is such
that only a single worker on a given scheduler can be scheduled by the
OS at any point in time. As each completes major phases of work
(natural yield points), the UMS scheduler coordinates the scheduling of
another worker on the same scheduler. It is rather simple: each worker
does a WaitForSingleObject with an INFINITE timeout so NT does not see
it as a viable worker to schedule. When the active worker finishes
its unit of work, it signals the next worker and waits on its own event, and
the cycle continues. Without going into more detail in this thread, it
is a very simplistic non-preemptive system set up inside SQL Server
to maximize resource usage and increase scalability. There is no need
for a lock waiter to do anything until the lock owner signals it to
execute, so the CPU is used for something more meaningful.
Bob Dorr
Microsoft SQL Server Escalation Support - Senior EE
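
The hand-off described above can be sketched in Python. This is only an illustration of the cooperative pattern, not SQL Server's actual UMS code: threading.Event stands in for the Win32 event each worker waits on, and run_cooperative and its parameters are hypothetical names chosen for this example.

```python
import threading
from typing import List


def run_cooperative(num_workers: int, units_per_worker: int) -> List[int]:
    """Run workers so that only one is runnable at a time.

    Returns the order in which workers executed their units of work.
    """
    # One event per worker; a worker blocked on its event is not runnable.
    events = [threading.Event() for _ in range(num_workers)]
    order: List[int] = []

    def worker(idx: int) -> None:
        for _ in range(units_per_worker):
            events[idx].wait()   # block with no timeout until signalled
            events[idx].clear()
            order.append(idx)    # the "unit of work"
            # Natural yield point: signal the next worker on this scheduler.
            events[(idx + 1) % num_workers].set()

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in threads:
        t.start()
    events[0].set()              # hand the scheduler to the first worker
    for t in threads:
        t.join()
    return order


if __name__ == "__main__":
    print(run_cooperative(3, 2))
```

In this sketch, a worker that never reaches the set() call would leave every other worker on the same scheduler blocked, which is the kind of non-yielding condition the 17883 message reports.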