codelldb RFC/WIP: Support DAP disassemble request

Opening this up as a 'request for comments'. I had a quick go at implementing the DAP disassemble request in CodeLLDB, as I wanted something reliable to test Vimspector's disassemble view with.

Clearly this is a prototype. Would you be interested in a proper patch to support the DAP disassemble request?

CodeLLDB currently supports a custom disassembly view and provides disassembly as "source" when debugging into objects with no source line info.

DAP now also has a disassemble request which, given a memory reference from the stack trace, produces a set number of instructions from that address.

This is simple to implement based on the existing DisassembledRange.

WIP.

- we don't return the exact number of instructions
- we don't populate a lot of the optional fields
- my-first-rust(TM)
- no tests yet

Jan 25 '22 08:01 puremourning

Thanks! Yes, I'd like to implement native DAP disassembly support at some point, and gave it a try a while back, in fact. However, I was not able to satisfactorily resolve the question of how to handle disassembling backwards, and then I got busy, so that stuff is on hold for now. If you'd like to think about it, here's my branch.

Jan 25 '22 20:01 vadimcn

Thanks, I'll take a look.

Jan 25 '22 20:01 puremourning

not able to satisfactorily resolve the question of how to handle disassembling backwards

I'm not sure I fully followed this. Are you referring to something like a negative instructionOffset in the disassemble request? I can see how that can be tricky, especially on Intel/CISC systems with variable-length instructions.

One idea springs to mind:

- map the current address to a source line
- disassemble from the previous source line's load_addr up to the current one, count the instructions
- repeat until we have enough, or the source's symbol (function) changes... or something
- if we get to the end and more were requested, pad with NOPs

This seems like it might be possible in theory. I need to look at the API to see whether it works in practice, though. I think it might be possible by using the SBCompileUnit directly. WDYT?

That aside, for now, I took your branch and added source line info to the disassembled instructions, and that seems to work with my (extremely limited) client implementation. I'll try and dig through the LLDB API to see if there's anything we can do about negative instruction offsets, but would just bailing out and not supporting that be an option?

Feb 03 '22 22:02 puremourning

Are you referring to something like a negative instructionOffset in the disassemble request?

Yes, that.

but would just bailing out and not supporting that be an option?

Don't think so. First and foremost, this is a VSCode extension, and VSCode's implementation of disassembly view uses negative offsets extensively.

disassemble from the previous source line's load_addr up to the current one, count the instructions

This will likely break in release builds: the optimizer may rearrange instructions such that they are no longer in line order. Also, disassembling must be able to function without any debug info whatsoever.

I can think of two methods:

1. If the binary has debug info, find which function the current PC address is in, then disassemble starting from the beginning of that function until PC is reached. Not sure what corner cases are there... one that comes to mind is that functions don't have to occupy a contiguous range. For example, profile-guided optimization may split a function into hot/cold parts and put these in different code sections.
2. If the current PC does not belong to any function, one can simply start disassembling at PC-x bytes, then check whether any invalid instructions were encountered and whether PC ended up at the beginning of an instruction. If either check fails, try again at PC-x-1, and so on.
   But it's probably possible to start in the middle of an instruction and get a bogus, but valid-looking, instruction stream between PC-x and PC, though the likelihood of that goes down the larger x is.
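
A minimal sketch of the second method, assuming a made-up toy ISA rather than a real disassembler: the opcode table, `insn_len`, `decode_until` and `find_start` are all hypothetical names for illustration and are not LLDB or CodeLLDB APIs. A candidate start is accepted only if decoding from it hits no invalid opcode and an instruction boundary falls exactly on PC.

```rust
/// Toy variable-length ISA (hypothetical): the opcode byte alone decides the length.
fn insn_len(opcode: u8) -> Option<usize> {
    match opcode {
        0x01 => Some(1), // one-byte instruction
        0x02 => Some(2), // opcode + 1 operand byte
        0x03 => Some(3), // opcode + 2 operand bytes
        _ => None,       // anything else is an invalid opcode
    }
}

/// Decode linearly from `start`. Returns the instruction start offsets if no
/// invalid opcode is met and an instruction boundary falls exactly on `pc`.
fn decode_until(mem: &[u8], start: usize, pc: usize) -> Option<Vec<usize>> {
    let mut starts = Vec::new();
    let mut pos = start;
    while pos < pc {
        let len = insn_len(*mem.get(pos)?)?;
        starts.push(pos);
        pos += len;
    }
    if pos == pc {
        Some(starts)
    } else {
        None // the last instruction straddled `pc`
    }
}

/// Try PC-x, then PC-x-1, and so on, for at most `max_tries` candidates.
fn find_start(mem: &[u8], pc: usize, x: usize, max_tries: usize) -> Option<(usize, Vec<usize>)> {
    for extra in 0..max_tries {
        let start = pc.checked_sub(x + extra)?; // give up if we would go below 0
        if let Some(starts) = decode_until(mem, start, pc) {
            return Some((start, starts));
        }
    }
    None
}
```

With the bytes `[0x03, 0xFF, 0xFF, 0x02, 0xFF, 0x01, 0x01]` and PC at offset 6, starting at PC-2 lands on an operand byte that is not a valid opcode, so the search backs up one more byte and accepts offset 3. As noted above, a real ISA can still produce a valid-looking but wrong stream, which this check cannot detect.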

I expect that a robust implementation will require quite a bit of research and experimentation.

...I bet there is a blog post or a mailing list discussion somewhere on the internet which has all the tips and tricks, because the problem is definitely not new. However so far I've been unsuccessful in locating it :man_shrugging:

Feb 04 '22 03:02 vadimcn

This branch has conflicts that must be resolved

Jul 12 '22 00:07 micwoj92

I still have this on my TODO list by the way. I notice that vscode-cpptools seems to support a negative offset so it might be possible to reverse engineer what they do and pick it up again. Just need that "free" time people keep talking about :)

Oct 05 '22 11:10 puremourning

OK, so this is what MIEngine does:

        private async Task<DisasmInstruction[]> VerifyDisassembly(DisasmInstruction[] instructions, ulong startAddress, ulong endAddress, ulong targetAddress)
        {
            if (startAddress > targetAddress || targetAddress > endAddress)
            {
                return instructions;
            }
            var originalInstructions = instructions;
            int count = 0;
            while (instructions != null && (instructions.Length == 0 || Array.Find(instructions, (i)=>i.Addr == targetAddress) == null) && count < _process.MaxInstructionSize)
            {
                count++;
                startAddress--;         // back up one byte
                instructions = await Disassemble(_process, startAddress, endAddress); // try again
            }
            return instructions == null ? originalInstructions : instructions;
        }

So basically:

try to disassemble MaxSizeOfOneInstruction * instructionCount+1 bytes, starting at address - MaxSizeOfOneInstruction * -instructionOffset
if the result does not include an instruction starting at address (presumably because the disassembly is invalid), then:
- add 1 to the number of bytes in the range
- decrement the start address by 1 byte
- and repeat.

I don't love it, but I also don't hate it. What do you think?

FWIW this is what they do to calculate the "MaxSizeOfOneInstruction", which was my next question :)

        public void SetTargetArch(TargetArchitecture arch)
        {
            switch (arch)
            {
                case TargetArchitecture.ARM:
                    MaxInstructionSize = 4;
                    Is64BitArch = false;
                    break;

                case TargetArchitecture.ARM64:
                    MaxInstructionSize = 8;
                    Is64BitArch = true;
                    break;

                case TargetArchitecture.X86:
                    MaxInstructionSize = 20;
                    Is64BitArch = false;
                    break;

                case TargetArchitecture.X64:
                    MaxInstructionSize = 26;
                    Is64BitArch = true;
                    break;

                case TargetArchitecture.Mips:
                    MaxInstructionSize = 4;
                    Is64BitArch = false;
                    break;

                default:
                    throw new ArgumentOutOfRangeException(nameof(arch));
            }
        }

Oct 06 '22 14:10 puremourning

Well, believe it or not, it works.

I'll tidy it up a bit and push a new PR.

Oct 06 '22 16:10 puremourning

What happens if startAddress lands in the middle of an instruction, such that the trailing bytes just happen to encode a valid instruction?

Oct 07 '22 00:10 vadimcn

If the start address happens to be mid-instruction and that resolves to a valid instruction then one of a few things might happen:

After reading the first "bogus" instruction, the stream is no longer interpretable and the likelihood of a valid instruction appearing in the stream at the exact requested base address is very low, so we would reject it and move back a byte.
After reading the first "bogus" instruction, the "new" interpretation happens to end on the same byte location as the next valid instruction in the stream. We would then return one bogus instruction followed by N valid (correct) instructions. In all likelihood, this bogus instruction would then be chopped off the front. The reason for this is that we must return the exact number of requested instructions and, because we seek backwards by M * the max instruction size, we always overshoot and have to re-centre the result.
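
Both outcomes can be illustrated with a small sketch over a made-up toy ISA (the opcode table and the `insn_len` and `sweep` helpers are hypothetical, chosen only so that operand bytes can look like opcodes; this is not how LLDB decodes anything):

```rust
/// Toy variable-length ISA (hypothetical): the opcode byte alone decides the length.
fn insn_len(opcode: u8) -> Option<usize> {
    match opcode {
        0x01 => Some(1),
        0x02 => Some(2),
        0x03 => Some(3),
        _ => None,
    }
}

/// Linear sweep from `start`: collect instruction start offsets until an
/// invalid opcode is met or an instruction would run past the end of `mem`.
fn sweep(mem: &[u8], start: usize) -> Vec<usize> {
    let mut starts = Vec::new();
    let mut pos = start;
    while let Some(len) = mem.get(pos).copied().and_then(insn_len) {
        if pos + len > mem.len() {
            break;
        }
        starts.push(pos);
        pos += len;
    }
    starts
}
```

Take the bytes `[0x02, 0x03, 0x01, 0x01, 0x01]`. Decoding from offset 0 gives the true boundaries 0, 2, 3, 4. Decoding from offset 1 (the operand byte of the first instruction) reads a bogus 3-byte instruction and then rejoins the true stream at offset 4. If the requested base address is 3, the bogus decoding skips over it, so the attempt is rejected and we back up a byte (the first case). If the base address is 4, the bogus decoding contains it, preceded by one bogus instruction that re-centring would normally trim (the second case).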

I need to craft some careful test cases around this. Sorry if the above explanation is not very clear. My WIP commit message is below and the change is here - it's still WIP and the code is terrible, but hopefully you get the idea:

Disassembly for negative instruction offsets

For a negative instruction offset, we have a challenge: what _byte_
position should we start disassembling at? For ARM this seems
fairly simple (all instructions are 4 bytes), but is complicated by
thumb which uses a mix of 2 and 4 byte instructions. X64 on the other
hand has technically unlimited instruction size (though in practice 15
bytes is the maximum).

We therefore can't just assume that we can offset the base address by
some fixed number of bytes and get the exact number of instructions we
want. Instead, we have to attempt to find a valid address, then
re-center the resulting instruction list around the requested base
address.

The way this works is as follows:

* If the instruction offset is positive or zero, LLDB gives us a
  specific call to read a set number of instructions, so we use that,
  padding with invalids if we underflow.
* Otherwise, for a negative instruction offset:
  1. Guess a start address as base_address - |instruction_offset| * 16
  2. Disassemble from there for instruction_count * 16 bytes
  3. Check to see if the resulting set of instructions contains an
     instruction whose address matches our base_address. If not, move 1
     byte further back and try again. Do this up to 16 times and we
     should find an address which is the start of an instruction
     (assuming we're actually still in a code segment...)
  4. Pad or truncate the start of the instruction list so that the
     base_address instruction is at the expected location in the list.
* Slice and pad the disassembled instructions so that we have exactly
  instruction_count entries, as required by the protocol.
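
The final pad/truncate step described in the commit message might look roughly like the sketch below. It is an illustration under assumptions, not the code from the change: `Slot` and `recentre` are hypothetical names, and the input is assumed to be the sorted list of instruction addresses produced by an earlier disassembly pass that includes the base address.

```rust
/// One entry of the response: a real instruction address or an "invalid" filler.
#[derive(Debug, Clone, PartialEq)]
enum Slot {
    Insn(u64),
    Invalid,
}

/// Return exactly `count` slots such that the instruction at `base` sits
/// `-offset` entries from the start of the result (offset may be negative,
/// zero or positive). Returns None if no instruction starts at `base`.
fn recentre(addrs: &[u64], base: u64, offset: i64, count: usize) -> Option<Vec<Slot>> {
    let base_idx = addrs.iter().position(|&a| a == base)? as i64;
    // Index into `addrs` of the first requested instruction; may be negative
    // if we did not disassemble far enough back, in which case we pad.
    let first = base_idx + offset;
    let out: Vec<Slot> = (0..count as i64)
        .map(|k| {
            let idx = first + k;
            if idx >= 0 && (idx as usize) < addrs.len() {
                Slot::Insn(addrs[idx as usize])
            } else {
                Slot::Invalid // pad before the first or after the last instruction
            }
        })
        .collect();
    Some(out)
}
```

For example, with instruction addresses 0x100, 0x103, 0x104, 0x106, 0x108, a base of 0x104, an offset of -3 and a count of 4, the result is one invalid filler followed by 0x100, 0x103 and 0x104. Anything decoded before the requested window, such as a bogus leading instruction, is dropped by the same slicing.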

Oct 07 '22 09:10 puremourning

Hello, I was taking a look at these changes to enable disassemble requests. I was wondering, what's the difference between read_instructions() (used for positive offsets) and read_memory() + get_instructions() (used for negative offsets)? Couldn't we use read_instructions() in both cases? Sorry for the dumb question. Thanks

Dec 04 '22 18:12 eloparco

The LLDB API doesn't provide a way to do a negative offset read. This is also more complex due to the variable length of instructions on x86 (hence the read-memory gymnastics).

See the explanation here: https://github.com/vadimcn/vscode-lldb/pull/627#issuecomment-1271324798

Dec 04 '22 19:12 puremourning

I'm writing a custom extension and your implementation has been good guidance!

Still, I'm having problems when VS Code asks for a large negative offset (e.g. -200) for my small program and disassemble_byte_range() attempts to read memory outside the current stack frame. When that happens, a few initial instructions (outside the current stack frame) are read, but then it exits early (https://github.com/vadimcn/vscode-lldb/blob/master/adapter/src/disassembly.rs#L256) without retrieving the instructions from the current stack frame.

Is there anything I'm missing?

Dec 05 '22 17:12 eloparco

Could be a bug. Please can you raise an issue with steps to repro using codelldb and I can take a look.

Dec 05 '22 18:12 puremourning

Actually, in codelldb it works fine. That's why I was wondering where that case is handled in codelldb code.

I'm trying to do something similar, but using VS Code's built-in Open Disassembly View. I implemented similar logic to what you've done, but I'm running into problems: after applying the negative offset requested by VS Code, I end up outside the current stack frame (i.e. disassemble_byte_range() returns some instructions outside the current stack frame).

I need to add a check on the start address but I didn't find any easy way to retrieve the stack frame start address from lldb.

Dec 06 '22 00:12 eloparco

The code for handling the disassemble request is here: https://github.com/vadimcn/vscode-lldb/blob/master/adapter/src/debug_session.rs#L1134. I don't know anything about VS Code.

Dec 06 '22 08:12 puremourning

Yes, that's what I was already looking at and using as a reference. Maybe in your case you receive valid offsets, so my issue (i.e. an offset resulting in reading memory outside the current stack frame) doesn't show up. I'll dig into it.

Dec 06 '22 10:12 eloparco

I'm really struggling to understand what you're asking for. If you think the above code doesn't work in some scenario, I'm happy to look into that. I can assure you that I tested negative offsets that go outside the definition of the current "function". Even outside the binary image. I'm not sure what "stack frame" per se has to do with it. Disassembly is just taking a chunk of memory and trying to interpret that byte stream as instructions. Often the memory isn't actually instructions and you get various forms of invalid (or NOP) instruction instead. The idea of the above code is that it tries to determine a valid start address by heuristically disassembling various bytes (up to one instruction width back from the calculated start address) and looking to see if it "looks" valid. The stack is really not involved unless the code location happens to be very close to where the stack is in memory.

Dec 06 '22 10:12 puremourning

Your implementation works perfectly, I was having a problem on my side. Thanks for the reply!

Dec 08 '22 01:12 eloparco
