rr: lightweight recording & deterministic debugging

Start by using rr to record your application:

$ rr record /your/application --args
...
FAIL: oh no!

The entire execution, including the failure, was saved to disk. That recording can now be debugged.

$ rr replay
GNU gdb (GDB) ...
...
0x4cee2050 in _start () from /lib/ld-linux.so.2
(gdb)

Remember, you're debugging the recorded trace deterministically, not a live, nondeterministic execution. The replayed execution's address spaces, register contents, syscall data, etc. are exactly the same in every run.

Most of the common gdb commands can be used.

(gdb) break mozilla::dom::HTMLMediaElement::HTMLMediaElement
...
(gdb) continue
Continuing.
...
Breakpoint 1, mozilla::dom::HTMLMediaElement::HTMLMediaElement (this=0x61362f70, aNodeInfo=...)
...

If you need to restart the debugging session, for example because you missed breaking on some critical execution point, no problem. Just use gdb's run command to restart replay.

(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
...
Breakpoint 1, mozilla::dom::HTMLMediaElement::HTMLMediaElement (this=0x61362f70, aNodeInfo=...)
...
(gdb)

The run command started another replay of your recording from the beginning. After the session restarted, the same execution was replayed again, and all your debugging state was preserved across the restart.

Note that the this pointer of the dynamically-allocated object was the same in both replay sessions. Memory allocations are exactly the same in each replay, meaning you can hard-code addresses you want to watch.

Even more powerful is reverse execution. Suppose we're debugging Firefox layout:

Breakpoint 1, nsCanvasFrame::BuildDisplayList (this=0x2aaadd7dbeb0, aBuilder=0x7fffffffaaa0, aDirtyRect=..., aLists=...)
    at /home/roc/mozilla-inbound/layout/generic/nsCanvasFrame.cpp:460
460   if (GetPrevInFlow()) {
(gdb) p mRect.width
12000

We happen to know that that value is wrong. We want to find out where it was set. rr makes that quick and easy.

(gdb) watch -l mRect.width
(gdb) reverse-cont
Continuing.
Hardware watchpoint 2: -location mRect.width
Old value = 12000
New value = 11220
0x00002aaab100c0fd in nsIFrame::SetRect (this=0x2aaadd7dbeb0, aRect=...)
    at /home/roc/mozilla-inbound/layout/base/../generic/nsIFrame.h:718
718       mRect = aRect;

This combination of hardware data watchpoints with reverse execution is extremely powerful!

getting started

Build from source

Follow these instructions. Recommended if the packages don't work for you --- kernel changes and OS updates sometimes require rr changes.

Or in Fedora:

cd /tmp
wget https://github.com/rr-debugger/rr/releases/download/5.7.0/rr-5.7.0-Linux-$(uname -m).rpm
sudo dnf install rr-5.7.0-Linux-$(uname -m).rpm

Or in Ubuntu:

cd /tmp
wget https://github.com/rr-debugger/rr/releases/download/5.7.0/rr-5.7.0-Linux-$(uname -m).deb
sudo dpkg -i rr-5.7.0-Linux-$(uname -m).deb

Running rr

Follow the usage instructions to learn how to use rr.

If you're using rr to debug Firefox, you may find these setup instructions helpful. They cover how to use rr to record Firefox test suites.

background and motivation

rr's original motivation was to make debugging of intermittent failures easier. These failures are hard to debug because any given program run may not show the failure. We wanted to create a tool that would record program executions with low overhead, so you can record test executions until you see a failure, and then replay the failing execution repeatedly under a debugger until it has been completely understood.

We also hoped that deterministic replay would make debugging of any kind of bug easier. With normal debuggers, information you learn during the debugging session (e.g. the addresses of objects of interest, and the ordering of important events) often becomes obsolete when you have to rerun the testcase. With deterministic replay, that never needs to happen: your knowledge of what happens during the failing run increases monotonically.

Furthermore, since debugging is the process of tracing effects to their causes, it's much easier if your debugger can execute backwards in time. It's well-known that given a record/replay system which provides restartable checkpoints during replay, you can simulate reverse execution to a particular point in time by restoring the previous checkpoint and executing forwards to the desired point. So we hoped that if we built a low-overhead record-and-replay system that works well on the applications we care about (Firefox), we could build a really usable backend for gdb's reverse execution commands.
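To make the checkpoint idea concrete, here is a minimal Python sketch. It is not rr's implementation: it assumes a toy deterministic "program" that is just a list of steps mutating a dictionary, and the names Replay, CHECKPOINT_INTERVAL, run_to and reverse_continue are hypothetical. Going backwards is simulated by restoring an earlier checkpoint and executing forwards while watching one value, one checkpoint interval at a time.

```python
import copy

CHECKPOINT_INTERVAL = 4  # hypothetical spacing, chosen only for this example


class Replay:
    """Toy deterministic replay: step i applies program[i] to the state."""

    def __init__(self, program, initial_state):
        self.program = program
        self.state = copy.deepcopy(initial_state)
        self.pos = 0  # number of steps executed so far
        self.checkpoints = {0: copy.deepcopy(initial_state)}

    def step(self):
        self.program[self.pos](self.state)
        self.pos += 1
        if self.pos % CHECKPOINT_INTERVAL == 0:
            self.checkpoints[self.pos] = copy.deepcopy(self.state)

    def run_to(self, target):
        """Move to `target`; going backwards restores a checkpoint first."""
        if target < self.pos:
            start = max(p for p in self.checkpoints if p <= target)
            self.state = copy.deepcopy(self.checkpoints[start])
            self.pos = start
        while self.pos < target:
            self.step()

    def reverse_continue(self, key):
        """Stop just before the most recent earlier step that changed state[key].

        Returns (step_index, value_before, value_after), or None if no
        earlier step changed the value.
        """
        end = self.pos
        for start in sorted((p for p in self.checkpoints if p < end), reverse=True):
            self.run_to(start)  # restore the checkpoint at `start`
            last_write = None
            while self.pos < end:  # execute forwards, watching state[key]
                index, before = self.pos, self.state.get(key)
                self.step()
                after = self.state.get(key)
                if after != before:
                    last_write = (index, before, after)
            if last_write is not None:
                self.run_to(last_write[0])
                return last_write
            end = start  # nothing in this interval; try the previous one
        self.run_to(0)
        return None


def set_width(value):
    def op(state):
        state["width"] = value
    return op


def bump_height(state):
    state["height"] = state.get("height", 0) + 1


program = [bump_height, set_width(11220), bump_height, bump_height,
           bump_height, set_width(12000), bump_height, bump_height,
           bump_height, bump_height]

replay = Replay(program, {"width": 0, "height": 0})
replay.run_to(len(program))
print(replay.state["width"])             # 12000
print(replay.reverse_continue("width"))  # (5, 11220, 12000)
print(replay.reverse_continue("width"))  # (1, 0, 11220)
```

Because every step is deterministic, re-executing from a checkpoint always reproduces the same states, which is what makes the backwards search reliable.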

These goals have all been met. rr is not only a working tool, but it's being used regularly by developers on many large and small projects.

rr records a group of Linux user-space processes and captures all inputs to those processes from the kernel, plus any nondeterministic CPU effects performed by those processes (of which there are very few). rr replay guarantees that execution preserves instruction-level control flow and memory and register contents. The memory layout is always the same, the addresses of objects don't change, register values are identical, syscalls return the same data, etc.
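The record-and-replay principle can be illustrated with a small Python sketch. This is a conceptual toy under stated assumptions, not how rr works internally: rr intercepts kernel inputs for real processes, whereas here a hypothetical Recorder and Replayer wrap two explicit input sources, and program stands in for the application.

```python
import os
import time


class Recorder:
    """Runs the program live and logs every nondeterministic input it receives."""

    def __init__(self):
        self.trace = []

    def read_input(self, source):
        value = source()          # live, nondeterministic call
        self.trace.append(value)  # saved so a replay can return it again
        return value


class Replayer:
    """Feeds the logged inputs back in order instead of asking the real source."""

    def __init__(self, trace):
        self._inputs = iter(trace)

    def read_input(self, source):
        return next(self._inputs)  # `source` is never called during replay


def program(env):
    """A toy 'application' whose behaviour depends on outside inputs."""
    seed = env.read_input(lambda: os.urandom(4).hex())
    started = env.read_input(time.time_ns)
    return f"{seed}:{started % 1000}"


recorder = Recorder()
recorded_result = program(recorder)

# Every replay of the same trace produces the same result as the recorded run.
replays = [program(Replayer(recorder.trace)) for _ in range(3)]
print(all(r == recorded_result for r in replays))  # True
```

The program's own logic runs again on every replay; only its inputs come from the trace, so nothing about the program has to be stored beyond what it could not compute for itself.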

Tools like fuzzers and randomized fault injectors become even more powerful when used with rr. Those tools are very good at triggering some intermittent failure, but it's often hard to reproduce that same failure again to debug it. With rr, the randomized execution can simply be recorded. If the execution failed, then the saved recording can be used to deterministically debug the problem.
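The "record until it fails" workflow can be scripted. The sketch below is one possible wrapper, not part of rr. The function name record_until_failure and its parameters are hypothetical, and it assumes that rr record exits with the recorded program's exit status and that a nonzero status signals the failure you are looking for.

```python
import subprocess


def record_until_failure(command, max_runs=100, recorder=("rr", "record")):
    """Run `command` under the recorder until one run exits with a nonzero status.

    Returns the 1-based number of the failing run, or None if every run passed.
    Assumes the recorder passes the recorded program's exit status through.
    """
    for attempt in range(1, max_runs + 1):
        result = subprocess.run([*recorder, *command])
        if result.returncode != 0:
            return attempt
    return None
```

For example, record_until_failure(["/your/application", "--args"]) stops at the first failing run, whose recording can then be debugged with rr replay.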

rr lowers the cost of fixing bugs. rr helps produce higher-quality software for the same cost. rr also makes debugging more fun.

rr in context

Record-and-replay debugging is an old idea; many systems preceded rr. What makes rr different are the design goals:

Initial focus on Firefox. Many record-and-replay techniques require specific programming languages or don't scale well and thus can't handle Firefox --- or were just experimental and were never fleshed out. Firefox is a complex application, so if rr is useful for debugging Firefox, it is likely to be generally useful.
Deployability. rr runs on stock Linux kernels, on commodity hardware, and requires no system configuration changes. Many record and replay techniques require kernel changes. Many rely on running the OS in a virtual machine.
Low run-time overhead. We want rr to replace gdb in your workflow. That means you need to start getting results with rr about as quickly as you would if you were using gdb. Low overhead also means less perturbation of tests.
Simplicity of design. We didn't have a lot of resources to develop rr, so we avoided approaches that rely on complex techniques such as dynamic binary instrumentation. This simplicity has also made rr more robust and lower overhead.

The overhead of rr depends on your application's workload. On Firefox test suites, rr's recording performance is quite usable: in the best cases we see slowdowns of ≤ 1.2x. A 1.2x slowdown means that if the suite takes 10 minutes to run by itself, it will take around 12 minutes to be recorded by rr. However, overhead can vary dramatically depending on the workload. For mostly-single-threaded programs, rr has much lower overhead than any competing record-and-replay system we know of.

limitations

rr …

emulates a single-core machine. So, parallel programs incur the slowdown of running on a single core. This is an inherent feature of the design.
cannot record processes that share memory with processes outside the recording tree. This is an inherent feature of the design. rr automatically disables features such as X shared memory for recorded processes to avoid this problem.
requires a reasonably modern x86 CPU or certain ARM CPUs (Apple M1+).
requires knowledge of every system call executed by the recorded processes. It already supports a wide range of syscalls (those needed by Firefox and other applications people have tackled with rr), but support isn't complete, so running rr on your application may uncover a syscall that needs to be implemented. Please file GitHub issues for unsupported system calls.
sometimes needs to be updated in response to kernel changes, updates to system libraries, or new CPU families. If rr isn't working for you (and the above caveats do not apply), please file an issue.

rr

what rr does

rr aspires to be your primary C/C++ debugging tool for Linux, replacing — well, enhancing — gdb. You record a failure once, then debug the recording, deterministically, as many times as you want. The same execution is replayed every time.

rr also provides efficient reverse execution under gdb. Set breakpoints and data watchpoints and quickly reverse-execute to where they were hit.

rr works on real applications and is used by many developers to fix real bugs. It makes debugging hard bugs much easier, but also speeds up debugging of easy bugs.

the rr debugging experience