Friday, September 10, 2010

"!CLRStack -p" Isn't Always Reliable

One of the most commonly used commands in SOS is !CLRStack. When combined with the -p switch, SOS will attempt to display the values of the parameters that were passed to the functions in our managed call-stack. It's important to emphasis that SOS will only attempt to display the correct values, since sometimes its just going to get it all wrong.

In the case where SOS comes to the conclusion that it cannot retrieve the value of a parameter, the string will be displayed. This happens at situations where SOS honestly can't track the value of the parameter by just looking at a specific stack frame. For example, if we're using a calling convention such as fast call, the first two parameters (starting from the left) that can be contained in a register, will be passed in the ECX and EDX registers instead of being pushed onto the stack. For member functions, the value of the this pointer is usually passed in the ECX register.
This kind of behavior can lead into situations where the values of some of the function parameters may be missing since the registers we're already overridden by other functions that we're called down the stack.
As opposed to situations where SOS is able to come to the conclusion that it isn't sure what is the correct value of a parameter, every once in a while it's just going to get things wrong, and display incorrect values. This obviously could terribly mislead the person who attempts to debug the application.
The thing that might be the most concerning about this phenomena, is that it's not very hard to reproduce. Let's take the following application as an example:

    class Program
    {
        static void Main(string[] args)
        {
            Foo a = new Foo();
            a.Work(1, 2, 3, 4, 5);
        }
    }
 
    class Foo
    {
        [MethodImpl(MethodImplOptions.NoInlining)]
        public void Work(int x, int y, int z, int k, int p)
        {
            // break with debugger here
        }
    }
Now we'll run WinDbg, load SOS and see what the !CLRStack command has to say about the parameters that we're passed to the Work method.
(Note: The output might slightly vary, depending on your version of SOS. The following debugging session was perform on version 4.0.30319.1).

0:000> !clrstack -p
OS Thread Id: 0xfbc (0)
Child SP IP       Call Site
0012f3fc 030300b3 ConsoleApplication1.Foo.Work(Int32, Int32, Int32, Int32, Int32) [Program.cs @ 24]
    PARAMETERS:
        this () = 0x00b3c358
        x () = 0x00000001
        y (0x0012f40c) = 0x00000003
        z (0x0012f408) = 0x00000004
        k (0x0012f404) = 0x00000005
        p (0x0012f400) = 0x03030092
0012f414 03030092 ConsoleApplication1.Program.Main(System.String[]) [Program.cs @ 16]
    PARAMETERS:
        args =
0012f648 791421db [GCFrame: 0012f648]

0:000> r ecx  // holds "this" pointer
ecx=00b3c358
0:000> r edx  // holds "x" parameter
edx=00000001
If so, it isn't difficult to see that besides the this pointer and the x parameter (that we're passed as registers), SOS got all of the other parameters wrong. In fact, one may notice in a "shift" that was performed on the displayed values (y got the value of z, while z got the value of k, and so on...).
In order to better understand what went on here, we'll print the memory between the relevant stack pointers (in previous versions, this value was represented by the ESP column instead of the Child SP column).

0:000> dp /c 1 0012f3fc  0012f414
0012f3fc  0012f414 // EBP
0012f400  03030092 // Return Address
0012f404  00000005 // p 
0012f408  00000004 // k
0012f40c  00000003 // z
0012f410  00000002 // y
0012f414  0012f424 // EBP

Now, when we compare the stack-addresses of the parameters that SOS gave us, against the real addresses that we see here, we can confirm that a small address shift was performed to the memory addresses of the parameters that we're passed on the stack. So every time SOS attempts to display the value of a parameter, it actually displays the value of the parameter that was passed next to it.
This scenario is a classic example to the bit "buggy" nature of SOS. It doesn't mean that we have to immediatly stop using the !CLRStack command, but it should to remind us not to take SOS's output as the "absolute truth" when debugging, and just keep ourselves alert for all kind of "weird" behaviors such as this one.

No comments:

Post a Comment