How to get and decode callstacks in in extended events for recent versions of SQL Server
- Tipstricks , Troubleshooting
- February 17, 2018
Table of Contents
The other day I was reading about spinlocks and troubleshooting and it mentioned to get the callstack for a certain xevent and to decode it using the windows debugging symbols, however, that material was related to SQL Server 2012 and the steps weren’t basically replicabile for a modern version like SQL 2016/2017 (or even 2014 I think, but haven’tn checked personally), I searched around and didn’t really find something straight forward for just the process as it is now: So, here we are, small guide on how to get and decode callstacks in recent versions of SQL Server.
Xevent SetUp - Get the callstack
In my case, I wanted to read the callstack (i.e. the last 16 frames inside the SQLServer process) when a SOS_SCHEDULER_YIELD wait event occurred, there was no practical reason, just to see how things work, as callstacks are needed only in very edge cases scenarios for serious troubleshooting. I created the following event:
CREATE EVENT SESSION [WaitCalllstack] ON SERVER
ADD EVENT sqlos.wait_info(
ACTION(package0.callstack)
WHERE ([wait_type]=(123)))
ADD TARGET package0.ring_buffer
WITH (MAX_MEMORY=4096 KB,EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS,MAX_DISPATCH_LATENCY=30 SECONDS,MAX_EVENT_SIZE=0 KB,MEMORY_PARTITION_MODE=NONE,TRACK_CAUSALITY=OFF,STARTUP_STATE=OFF)
GO
ALTER EVENT SESSION [WaitCallstack] ON SERVER
STATE = START;
GO
Of course I used the GUI because I’m lazy, but the code isn’t too bad either this time as we’re only acting on a sqlos.wait_info filtered by wait type = 123 (which is SOS_SCHEDULER_YIELD), getting the callstack and saving it to the ring buffer. Launch the event, wait a couple of seconds and you’ll start to see the following in the ring buffer:
<event name="wait_info" package="sqlos" timestamp="2018-02-16T23:08:35.919Z">
<data name="wait_type">
<type name="wait_types" package="sqlos"></type>
<value>123</value>
<text><![CDATA[SOS_SCHEDULER_YIELD]]></text>
</data>
<data name="opcode">
<type name="event_opcode" package="sqlos"></type>
<value>1</value>
<text><![CDATA[End]]></text>
</data>
<data name="duration">
<type name="uint64" package="package0"></type>
<value>0</value>
</data>
<data name="signal_duration">
<type name="uint64" package="package0"></type>
<value>0</value>
</data>
<data name="wait_resource">
<type name="ptr" package="package0"></type>
<value>0x0000000000000000</value>
</data>
<action name="callstack" package="package0">
<type name="callstack" package="package0"></type>
<value><![CDATA[sqldk.dll+0x0000000000105685
sqldk.dll+0x0000000000002211
sqldk.dll+0x0000000000001F3F
sqldk.dll+0x0000000000011047
sqldk.dll+0x0000000000010F3F
sqldk.dll+0x000000000000AA59
sqldk.dll+0x00000000000025BF
sqldk.dll+0x0000000000007F64
sqldk.dll+0x00000000000B36CF
sqldk.dll+0x00000000000B087A
sqldk.dll+0x00000000000B0054
sqlservr.exe+0x0000000000012F5D
sqldk.dll+0x0000000000004E7D
sqldk.dll+0x0000000000005378
sqldk.dll+0x0000000000004FED
sqldk.dll+0x0000000000030C38
sqldk.dll+0x0000000000030D30
sqldk.dll+0x0000000000030857
sqldk.dll+0x0000000000031049
KERNEL32.DLL+0x0000000000011FE4
ntdll.dll+0x000000000006EFC1]]>
</value>
</action>
</event>
Of course we’re interested in the callstack section of the XML output, which at the moment is pretty much useless as far as understanding what’s going on goes; So, next step, decode it.
Getting the right symbols
Decoding the callstack has actually gotten easier than the old days.
Get WinDbg via the Windows SDK tools: To do so, go to this page and download the latest SDK for Windows, you don’t need to install the whole SDK, only the debugging tools can be installed by selecting only “Debugging Tools for Windows” as in this screenshot:
Once installed you’ll be able to open the memory dump files(*.mdmp) created by sqldumper like a real pro Microsoft employee, but that’s not our scope right now. Remember where you just installed the debugging tools, since you’ll need one of them right now, specifically symchk.exe, which in my case was installed in “E:\Program Files (x86)\Windows Kits\10\Debuggers\x64\symchk.exe” Next, identify the folder where SQL Server is installed and get to the \Binn directory (where the executable and dll are), in my case it was “E:\Program Files\SQL Server\MSSQL14.SQL2017\MSSQL\Binn” Now, you need to use symchk to scan all the DLL and the sqlservr.exe file and download the appropriate debugging symbols from Microsoft, it’s pretty simple, using the command prompt:
cd "E:\Program Files\SQL Server\MSSQL14.SQL2017\MSSQL\Binn"
“E:\Program Files (x86)\Windows Kits\10\Debuggers\x64\symchk.exe” sqlservr.exe /s SRV*c:\symbols*http://msdl.microsoft.com/download/symbols
“E:\Program Files (x86)\Windows Kits\10\Debuggers\x64\symchk.exe” *.dll /s SRV*c:\symbols*http://msdl.microsoft.com/download/symbols
FYI, this syntax works in the command prompt but not in Powershell, you’ll have to fix it accordingly, but it’s pretty easy that we’re using the tool to donwload the symbols from microsoft for the sqlservr executable and all the DLL included in the Binn folder.
The Twist
Perfect! now let’s get back to SSMS, restart the event session, get the new results and - nothing changed.
WHY?
Because SQL Server doesn’t want to lookup debugging symbols all day long, so you have to explicitly activate a trace flag to do so, of course this trace flag may hit your server pretty hard, so, use it only sparingly and never in production.
ALTER EVENT SESSION [WaitCallstack] ON SERVER
STATE = STOP;
GO
DBCC TRACEON (3656,-1)
GO
ALTER EVENT SESSION [WaitCallstack] ON SERVER
STATE = START;
GO
Now, you finally get the decoded frames:
<event name="wait_info" package="sqlos" timestamp="2018-02-16T23:54:44.612Z">
<data name="wait_type">
<type name="wait_types" package="sqlos"></type>
<value>123</value>
<text><![CDATA[SOS_SCHEDULER_YIELD]]></text>
</data>
<data name="opcode">
<type name="event_opcode" package="sqlos"></type>
<value>0</value>
<text><![CDATA[Begin]]></text>
</data>
<data name="duration">
<type name="uint64" package="package0"></type>
<value>0</value>
</data>
<data name="signal_duration">
<type name="uint64" package="package0"></type>
<value>0</value>
</data>
<data name="wait_resource">
<type name="ptr" package="package0"></type>
<value>0x0000000000000000</value>
</data>
<action name="callstack" package="package0">
<type name="callstack" package="package0"></type>
<value><![CDATA[sqldk.dll!SOS_DispatcherBase::GetTrack+0x7d05
sqldk.dll!SOS_Task::PreWait+0x136
sqldk.dll!Worker::OSYieldNoAbort+0x103
sqldk.dll!MemoryNode::GetPagesAllocated+0x5cd
sqldk.dll!HRESULT_FROM_SOS_RESULT_Uncommon+0x181b
sqldk.dll!HRESULT_FROM_SOS_RESULT_Uncommon+0x1209
sqldk.dll!ResourceMonitor::ResourceMonitorTask+0x202
sqlservr.exe!_____SQL______Process______Available+0x80dd
sqldk.dll!SOS_Task::SetDeadlockPriority+0xfd
sqldk.dll!SOS_Task::SetDeadlockPriority+0x5f8
sqldk.dll!SOS_Task::SetDeadlockPriority+0x26d
sqldk.dll!SystemThread::MakeMiniSOSThread+0xd38
sqldk.dll!SystemThread::MakeMiniSOSThread+0xe30
sqldk.dll!SystemThread::MakeMiniSOSThread+0x957
sqldk.dll!SystemThread::MakeMiniSOSThread+0x1149
KERNEL32.DLL!BaseThreadInitThunk+0x14
ntdll.dll!RtlUserThreadStart+0x21]]>
</value>
</action>
</event>
Don’t forget to stop the session and disable the traceflag when you’ve finished!
ALTER EVENT SESSION [WaitCallstack] ON SERVER
STATE = STOP;
GO
DBCC TRACEOFF (3656,-1)
GO