WELCOME TO RR

Last updated Fri Oct 18 17:19

rr

rr records nondeterministic executions and debugs them deterministically

rr aspires to be your primary debugging tool, replacing — well, enhancing — gdb. You record a failure once, then debug the recording, deterministically, as many times as you want. Every time the same execution is replayed.

the rr debugging experience

Start by using rr to record your application:

$ rr record /your/application --args
...
FAIL: oh no!
      

The entire execution, including the failure, was saved to disk. That recording can now be debugged.

$ rr replay
GNU gdb (GDB) ...
...
0x4cee2050 in _start () from /lib/ld-linux.so.2
(gdb)
      

Remember, you're debugging the recorded trace deterministically, not a live, nondeterministic execution. The replayed execution's address spaces, register contents, syscall data, etc. are exactly the same in every run.

Most of the common gdb commands can be used.

(gdb) break mozilla::dom::HTMLMediaElement::HTMLMediaElement
...
(gdb) continue
Continuing.
...
Breakpoint 1, mozilla::dom::HTMLMediaElement::HTMLMediaElement (this=0x61362f70, aNodeInfo=...)
...
      

If you need to restart the debugging session, for example because you missed breaking on some critical execution point, no problem. Just use gdb's run command to restart replay.

(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
...
Breakpoint 1, mozilla::dom::HTMLMediaElement::HTMLMediaElement (this=0x61362f70, aNodeInfo=...)
...
(gdb) 
      

The run command started another replay of your recording from the beginning. After the session restarted, the same execution was replayed again, and all your debugging state (such as the breakpoint set earlier) was preserved across the restart.

Note that the this pointer of the dynamically-allocated object was the same in both replay sessions. Memory allocations are exactly the same in each replay, meaning you can hard-code addresses you want to watch.

This video shows a quick demo of rr recording and replaying Firefox.

This video demonstrates rr's basic capabilities in a bit more detail.

getting started

fedora

cd /tmp
wget http://rr-project.org/releases/rr-2.0.0-Linux-$(uname -m).rpm
sudo rpm -i rr-2.0.0-Linux-$(uname -m).rpm
      

ubuntu

cd /tmp
wget http://rr-project.org/releases/rr-2.0.0-Linux-$(uname -m).deb
sudo dpkg -i rr-2.0.0-Linux-$(uname -m).deb
      

or build from source

Follow these instructions.

run rr

Follow the usage instructions to set up your machine (if necessary) and learn how to use rr.

If you're using rr to debug Firefox, you may find these special setup instructions helpful. They cover how to build a 32-bit Firefox on a 64-bit OS, and how to use rr to record Firefox test suites.

background and motivation

Everyone who's worked on a nontrivial application (like Firefox) has gone through the pain of debugging an intermittently-reproducible bug. Since nontrivial applications are nondeterministic, each execution is different, and you may require 5, 10, or even 100 runs just to see the bug manifest.

It's hard to debug these bugs with traditional techniques because single-stepping, setting breakpoints, inspecting program state, etc. are all a waste of time if the execution you're debugging doesn't even exhibit the bug. Even when you can reproduce the bug consistently, important information such as the addresses of suspect objects is unpredictable from run to run. Given that software developers spend a lot of time finding and fixing bugs, nondeterminism has a major impact on their work.

And there are intermittent bugs that are so hard to reproduce that they're literally not worth the time to fix with traditional techniques. However, for big projects like Firefox with its half-billion users, a bug that only reproduces in 1 out of 10,000 test runs can still have a negative impact on users.

rr solves these problems by splitting debugging into two phases: first, recording, in which the application's execution history is saved; then, deterministic debugging of the saved trace, using gdb to control replay of the trace as many times as you want.

The saved execution history captures all nondeterminism in the program's execution. By replaying that trace in the right way, rr guarantees each debugging session is entirely deterministic. The memory layout is always the same, the addresses of objects don't change, register values are identical, syscalls return the same data, etc.

The benefit to developers is obvious: an intermittent bug can be recorded by a script over lunchtime, say, and then debugged at leisure in the afternoon. Multiple cores can be used in parallel to record failures. If you accidentally set a breakpoint in the wrong place and miss gathering critical information, your precious intermittent failure isn't lost. Just fix your breakpoint and then tell gdb to run the recording from the beginning again. Even for easily reproducible bugs, a repeatable, deterministic debugging session is a powerful tool on top of traditional debugging.
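A "record it over lunchtime" script could look roughly like the sketch below. It is only an illustration, not part of rr: record_until_failure is a hypothetical helper name, and the sketch assumes that the recording command exits with a nonzero status when the recorded program fails. If that doesn't hold for your rr version, check the program's output for the failure message instead.

```shell
#!/bin/sh
# Sketch: re-run a recording command until one run fails, then stop.
# ASSUMPTION: the command passed in exits nonzero when the recorded
# program fails. record_until_failure is a hypothetical helper name.
record_until_failure() {
    max_runs=$1
    shift
    run=1
    while [ "$run" -le "$max_runs" ]; do
        # "$@" is the full recording command,
        # e.g. rr record /your/application --args
        if ! "$@"; then
            echo "failure captured on run $run"
            return 0
        fi
        run=$((run + 1))
    done
    echo "no failure in $max_runs runs"
    return 1
}

# Example invocation (not executed here):
#   record_until_failure 100 rr record /your/application --args
```

Once the helper reports a captured failure, rr replay can be used on that recording as in the session shown earlier. Every run is recorded, so the passing runs also take up disk space until they are cleaned up.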

And for projects like Firefox which run literally millions of tests a day on a vast build and test infrastructure, intermittent failures in those test runs can be recorded on the infrastructure itself and then deterministically debugged at some later time, offline.

Tools like fuzzers and randomized fault injectors become even more powerful when used with rr. Those tools are very good at triggering some intermittent failure, but it's often hard to reproduce that same failure again to debug it. With rr, the randomized execution can simply be recorded. If the execution failed, then the saved recording can be used to deterministically debug the problem.

So rr lowers the cost of fixing intermittent bugs. This allows a new class of bugs to be fixed with the same amount of engineering time and money, which in turn produces higher-quality software for the same cost.

Deterministic debugging is an old idea; many systems have preceded rr. What makes rr different, in our opinion, are the design goals:

The overhead of rr depends on your application's workload. On Firefox test suites, rr's recording performance is quite usable: in the best cases we see slowdowns of 1.2x or less. A 1.2x slowdown means that if the suite takes 10 minutes to run by itself, it will take around 12 minutes to be recorded by rr. However, different test suites have different performance characteristics, so they have different overheads as well.

limitations

Some of rr's limitations are inherent, and some will be removed in future releases.

rr …

further reference

This presentation provides an overview of the rr implementation and is meant for potential rr developers. There are some bonus slides intended to introduce rr to record/replay researchers.

The rr wiki contains pages that cover technical topics related to rr.

More information about rr will be posted in the future.