Welcome to Tor codebase’s documentation!

Overview

This document describes the general structure of the Tor codebase, how it fits together, what functionality is available for extending Tor, and gives some notes on how Tor got that way.

Tor remains a work in progress: We’ve been working on it for more than a decade, and we’ve learned a lot about good coding since we first started. This means, however, that some of the older pieces of Tor will have some “code smell” in them that could sure stand a brisk refactoring. So when I describe a piece of code, I’ll sometimes give a note on how it got that way, and whether I still think that’s a good idea.

The first drafts of this document were written in the Summer and Fall of 2015, when Tor 0.2.6 was the most recent stable version, and Tor 0.2.7 was under development. If you’re reading this far in the future, some things may have changed. Caveat haxxor!

This document is not an overview of the Tor protocol. For that, see the design paper and the specifications at https://spec.torproject.org/ .

For more information about Tor’s coding standards and some helpful development tools, see doc/HACKING in the Tor repository.

For more information about writing tests, see doc/HACKING/WritingTests.txt in the Tor repository.

The very high level

Ultimately, Tor runs as an event-driven network daemon: it responds to network events, signals, and timers by sending and receiving things over the network. Clients, relays, and directory authorities all use the same codebase: the Tor process will run as a client, relay, or authority depending on its configuration.

Tor has a few major dependencies, including Libevent (used to tell which sockets are readable and writable), OpenSSL (used for many encryption functions, and to implement the TLS protocol), and zlib (used to compress and uncompress directory information).

Most of Tor’s work today is done in a single event-driven main thread. Tor also spawns one or more worker threads to handle CPU-intensive tasks. (Right now, this only includes circuit encryption.)

On startup, Tor initializes its libraries, reads and responds to its configuration files, and launches a main event loop. At first, the only events that Tor listens for are a few signals (like TERM and HUP), and one or more listener sockets (for different kinds of incoming connections). Tor also configures a timer function to run once per second to handle periodic events. As Tor runs over time, other events will open, and new events will be scheduled.

The codebase is divided into a few main subdirectories:

src/common – utility functions, not necessarily tor-specific.

src/or – implements the Tor protocols.

src/test – unit and regression tests

src/ext – Code maintained elsewhere that we include in the Tor source distribution.

src/trunnel – automatically generated code (from the Trunnel) tool: used to parse and encode binary formats.

Some key high-level abstractions

The most important abstractions at Tor’s high-level are Connections, Channels, Circuits, and Nodes.

A ‘Connection’ represents a stream-based information flow. Most connections are TCP connections to remote Tor servers and clients. (But as a shortcut, a relay will sometimes make a connection to itself without actually using a TCP connection. More details later on.) Connections exist in different varieties, depending on what functionality they provide. The principle types of connection are “edge” (eg a socks connection or a connection from an exit relay to a destination), “OR” (a TLS stream connecting to a relay), “Directory” (an HTTP connection to learn about the network), and “Control” (a connection from a controller).

A ‘Circuit’ is persistent tunnel through the Tor network, established with public-key cryptography, and used to send cells one or more hops. Clients keep track of multi-hop circuits, and the cryptography associated with each hop. Relays, on the other hand, keep track only of their hop of each circuit.

A ‘Channel’ is an abstract view of sending cells to and from a Tor relay. Currently, all channels are implemented using OR connections. If we switch to other strategies in the future, we’ll have more connection types.

A ‘Node’ is a view of a Tor instance’s current knowledge and opinions about a Tor relay orbridge.

The rest of this document.

Note: This section describes the eventual organization of this document, which is not yet complete.

We’ll begin with an overview of the various utility functions available in Tor’s ‘common’ directory. Knowing about these is key to writing portable, simple code in Tor.

Then we’ll go on and talk about the main data-flow of the Tor network: how Tor generates and responds to network traffic. This will occupy a chapter for the main overview, with other chapters for special topics.

After that, we’ll mention the main modules in Tor, and describe the function of each.

We’ll cover the directory subsystem next: how Tor learns about other relays, and how relays advertise themselves.

Then we’ll cover a few specialized modules, such as hidden services, sandboxing, hibernation, accounting, statistics, guards, path generation, pluggable transports, and how they integrate with the rest of Tor.

We’ll close with a meandering overview of important pending issues in the Tor codebase, and how they affect the future of the Tor software.

Utility code in Tor

Most of Tor’s utility code is in modules in the src/common subdirectory.

These are divided, broadly, into compatibility functions, utility functions, containers, and cryptography. (Someday in the future, it would be great to split these modules into separate directories. Also, some functions are probably put in the wrong modules)

Compatibility code

These functions live in src/common/compat*.c; some corresponding macros live in src/common/compat*.h. They serve as wrappers around platform-specific or compiler-specific logic functionality.

In general, the rest of the Tor code should not be calling platform-specific or otherwise non-portable functions. Instead, they should call wrappers from compat.c, which implement a common cross-platform API. (If you don’t know whether a function is portable, it’s usually good enough to see whether it exists on OSX, Linux, and Windows.)

Other compatibility modules include backtrace.c, which generates stack traces for crash reporting; sandbox.c, which implements the Linux seccomp2 sandbox; and procmon.c, which handles monitoring a child process.

Parts of address.c are compatibility code for handling network addressing issues; other parts are in util.c.

Notable compatibility areas are:

  • mmap support for mapping files into the address space (read-only)
  • Code to work around the intricacies
  • Workaround code for Windows’s horrible winsock incompatibilities and Linux’s intricate socket extensions.
  • Helpful string functions like memmem, memstr, asprintf, strlcpy, and strlcat that not all platforms have.
  • Locale-ignoring variants of the ctypes functions.
  • Time-manipulation functions
  • File locking function
  • IPv6 functions for platforms that don’t have enough IPv6 support
  • Endianness functions
  • OS functions
  • Threading and locking functions.

=== Utility functions

General-purpose utilities are in util.c; they include higher-level wrappers around many of the compatibility functions to provide things like file-at-once access, memory management functions, math, string manipulation, time manipulation, filesystem manipulation, etc.

(Some functionality, like daemon-launching, would be better off in a compatibility module.)

In util_format.c, we have code to implement stuff like base-32 and base-64 encoding.

The address.c module interfaces with the system resolver and implements address parsing and formatting functions. It converts sockaddrs to and from a more compact tor_addr_t type.

The di_ops.c module provides constant-time comparison and associative-array operations, for side-channel avoidance.

The logging subsystem in log.c supports logging to files, to controllers, to stdout/stderr, or to the system log.

The abstraction in memarea.c is used in cases when a large amount of temporary objects need to be allocated, and they can all be freed at the same time.

The torgzip.c module wraps the zlib library to implement compression.

Workqueue.c provides a simple multithreaded work-queue implementation.

Containers

The container.c module defines these container types, used throughout the Tor codebase.

There is a dynamic array called smartlist, used as our general resizeable array type. It supports sorting, searching, common set operations, and so on. It has specialized functions for smartlists of strings, and for heap-based priority queues.

There’s a bit-array type.

A set of mapping types to map strings, 160-bit digests, and 256-bit digests to void *. These are what we generally use when we want O(1) lookup.

Additionally, for containers, we use the ht.h and tor_queue.h headers, in src/ext. These provide intrusive hashtable and linked-list macros.

Cryptography

Once, we tried to keep our cryptography code in a single “crypto.c” file, with an “aes.c” module containing an AES implementation for use with older OpenSSLs.

Now, our practice has become to introduce crypto_*.c modules when adding new cryptography backend code. We have modules for Ed25519, Curve25519, secret-to-key algorithms, and password-based boxed encryption.

Our various TLS compatibility code, wrappers, and hacks are kept in tortls.c, which is probably too full of Tor-specific kludges. I’m hoping we can eliminate most of those kludges when we finally remove support for older versions of our TLS handshake.

Memory management

Heap-allocation functions

Tor imposes a few light wrappers over C’s native malloc and free functions, to improve convenience, and to allow wholescale replacement of malloc and free as needed.

You should never use ‘malloc’, ‘calloc’, ‘realloc, or ‘free’ on their own; always use the variants prefixed with ‘tor_’. They are the same as the standard C functions, with the following exceptions:

  • tor_free(NULL) is a no-op.
  • tor_free() is a macro that takes an lvalue as an argument and sets it to NULL after freeing it. To avoid this behavior, you can use tor_free_() instead.
  • tor_malloc() and friends fail with an assertion if they are asked to allocate a value so large that it is probably an underflow.
  • It is always safe to tor_malloc(0), regardless of whether your libc allows it.
  • tor_malloc(), tor_realloc(), and friends are never allowed to fail. Instead, Tor will die with an assertion. This means that you never need to check their return values. See the next subsection for information on why we think this is a good idea.

We define additional general-purpose memory allocation functions as well:

  • tor_malloc_zero(x) behaves as calloc(1, x), except the it makes clear the intent to allocate a single zeroed-out value.
  • tor_reallocarray(x,y) behaves as the OpenBSD reallocarray function. Use it for cases when you need to realloc() in a multiplication-safe way.

And specific-purpose functions as well:

  • tor_strdup() and tor_strndup() behaves as the underlying libc functions, but use tor_malloc() instead of the underlying function.
  • tor_memdup() copies a chunk of memory of a given size.
  • tor_memdup_nulterm() copies a chunk of memory of a given size, then NUL-terminates it just to be safe.

Why assert on failure?

Why don’t we allow tor_malloc() and its allies to return NULL?

First, it’s error-prone. Many programmers forget to check for NULL return values, and testing for malloc() failures is a major pain.

Second, it’s not necessarily a great way to handle OOM conditions. It’s probably better (we think) to have a memory target where we dynamically free things ahead of time in order to stay under the target. Trying to respond to an OOM at the point of tor_malloc() failure, on the other hand, would involve a rare operation invoked from deep in the call stack. (Again, that’s error-prone and hard to debug.)

Third, thanks to the rise of Linux and other operating systems that allow memory to be overcommitted, you can’t actually ever rely on getting a NULL from malloc() when you’re out of memory; instead you have to use an approach closer to tracking the total memory usage.

Conventions for your own allocation functions.

Whenever you create a new type, the convention is to give it a pair of x_new() and x_free() functions, named after the type.

Calling x_free(NULL) should always be a no-op.

Grow-only memory allocation: memarea.c

It’s often handy to allocate a large number of tiny objects, all of which need to disappear at the same time. You can do this in tor using the memarea.c abstraction, which uses a set of grow-only buffers for allocation, and only supports a single “free” operation at the end.

Using memareas also helps you avoid memory fragmentation. You see, some libc malloc implementations perform badly on the case where a large number of small temporary objects are allocated at the same time as a few long-lived objects of similar size. But if you use tor_malloc() for the long-lived ones and a memarea for the temporary object, the malloc implementation is likelier to do better.

To create a new memarea, use memarea_new(). To drop all the storage from a memarea, and invalidate its pointers, use memarea_drop_all().

The allocation functions memarea_alloc(), memarea_alloc_zero(), memarea_memdup(), memarea_strdup(), and memarea_strndup() are analogous to the similarly-named malloc() functions. There is intentionally no memarea_free() or memarea_realloc().

Collections in tor

Smartlists: Neither lists, nor especially smart.

For historical reasons, we call our dynamic-allocated array type “smartlist_t”. It can grow or shrink as elements are added and removed.

All smartlists hold an array of void *. Whenever you expose a smartlist in an API you must document which types its pointers actually hold.

Smartlists are created empty with smartlist_new() and freed with smartlist_free(). See the containers.h module documentation for more information; there are many convenience functions for commonly needed operations.

Digest maps, string maps, and more.

Tor makes frequent use of maps from 160-bit digests, 256-bit digests, or nul-terminated strings to void *. These types are digestmap_t, digest256map_t, and strmap_t respectively. See the containers.h module documentation for more information.

Intrusive lists and hashtables

For performance-sensitive cases, we sometimes want to use “intrusive” collections: ones where the bookkeeping pointers are stuck inside the structures that belong to the collection. If you’ve used the BSD-style sys/queue.h macros, you’ll be familiar with these.

Unfortunately, the sys/queue.h macros vary significantly between the platforms that have them, so we provide our own variants in src/ext/tor_queue.h .

We also provide an intrusive hashtable implementation in src/ext/ht.h . When you’re using it, you’ll need to define your own hash functions. If attacker-induced collisions are a worry here, use the cryptographic siphash24g function to extract hashes.

Time in tor

What time is it?

We have several notions of the current time in Tor.

The wallclock time is available from time(NULL) with second-granularity and tor_gettimeofday() with microsecond granularity. It corresponds most closely to “the current time and date”.

The monotonic time is available with the set of monotime_* functions declared in compat_time.h. Unlike the wallclock time, it can only move forward. It does not necessarily correspond to a real world time, and it is not portable between systems.

The coarse monotonic time is available from the set of monotime_coarse_* functions in compat_time.h. It is the same as monotime_* on some platforms. On others, it gives a monotonic timer with less precision, but which it’s more efficient to access.

Cached views of time.

On some systems (like Linux), many time functions use a VDSO to avoid the overhead of a system call. But on other systems, gettimeofday() and time() can be costly enough that you wouldn’t want to call them tens of thousands of times. To get a recent, but not especially accurate, view of the current time, see approx_time() and tor_gettimeofday_cached().

Parsing and encoding time values

Tor has functions to parse and format time in these formats:

  • RFC1123 format. (“Fri, 29 Sep 2006 15:54:20 GMT”). For this, use format_rfc1123_time() and parse_rfc1123_time.
  • ISO8601 format. (“2006-10-29 10:57:20”) For this, use format_local_iso_time and format_iso_time. We also support the variant format “2006-10-29T10:57:20” with format_iso_time_nospace, and “2006-10-29T10:57:20.123456” with format_iso_time_nospace_usec.
  • HTTP format collections (preferably “Mon, 25 Jul 2016 04:01:11 GMT” or possibly “Wed Jun 30 21:49:08 1993” or even “25-Jul-16 04:01:11 GMT”). For this, use parse_http_time. Don’t generate anything but the first format.

Some of these functions use struct tm. You can use the standard tor_localtime_r and tor_gmtime_r() to wrap these in a safe way. We also have a tor_timegm() function.

Scheduling events

The main way to schedule a not-too-frequent periodic event with respect to the Tor mainloop is via the mechanism in periodic.c. There’s a big table of periodic_events in main.c, each of which gets invoked on its own schedule. You should not expect more than about one second of accuracy with these timers.

You can create an independent timer using libevent directly, or using the periodic_timer_new() function. But you should avoid doing this for per-connection or per-circuit timers: Libevent’s internal timer implementation uses a min-heap, and those tend to start scaling poorly once you have a few thousand entries.

If you need to create a large number of fine-grained timers for some purpose, you should consider the mechanism in src/common/timers.c, which is optimized for the case where you have a large number of timers with not-too-long duration, many of which will be deleted before they actually expire. These timers should be reasonably accurate within a handful of milliseconds – possibly better on some platforms. (The timers.c module uses William Ahern’s timeout.c implementation as its backend, which is based on a hierarchical timing wheel algorithm. It’s cool stuff; check it out.)

Lower-level cryptography functionality in Tor

Generally speaking, Tor code shouldn’t be calling OpenSSL (or any other crypto library) directly. Instead, we should indirect through one of the functions in src/common/crypto*.c or src/common/tortls.c.

Cryptography functionality that’s available is described below.

RNG facilities

The most basic RNG capability in Tor is the crypto_rand() family of functions. These currently use OpenSSL’s RAND_() backend, but may use something faster in the future.

In addition to crypto_rand(), which fills in a buffer with random bytes, we also have functions to produce random integers in certain ranges; to produce random hostnames; to produce random doubles, etc.

When you’re creating a long-term cryptographic secret, you might want to use crypto_strongest_rand() instead of crypto_rand(). It takes the operating system’s entropy source and combines it with output from crypto_rand(). This is a pure paranoia measure, but it might help us someday.

You can use smartlist_choose() to pick a random element from a smartlist and smartlist_shuffle() to randomize the order of a smartlist. Both are potentially a bit slow.

Stream ciphers

You can create instances of a stream cipher using crypto_cipher_new(). These are stateful objects of type crypto_cipher_t. Note that these objects only support AES-128 right now; a future version should add support for AES-128 and/or ChaCha20.

You can encrypt/decrypt with crypto_cipher_encrypt or crypto_cipher_decrypt. The crypto_cipher_crypt_inplace function performs an encryption without a copy.

Note that sensible people should not use raw stream ciphers; they should probably be using some kind of AEAD. Sorry.

Public key functionality

We support four public key algorithms: DH1024, RSA, Curve25519, and Ed25519.

We support DH1024 over two prime groups. You access these via the crypto_dh_*() family of functions.

We support RSA in many bit sizes for signing and encryption. You access it via the crypto_pk_() family of functions. Note that a crypto_pk_t may or may not include a private key. See the crypto_pk_ functions in crypto.c for a full list of functions here.

For Curve25519 functionality, see the functions and types in crypto_curve25519.c. Curve25519 is generally suitable for when you need a secure fast elliptic-curve diffie hellman implementation. When designing new protocols, prefer it over DH in Z_p.

For Ed25519 functionality, see the functions and types in crypto_ed25519.c. Ed25519 is a generally suitable as a secure fast elliptic curve signature method. For new protocols, prefer it over RSA signatures.

Metaformats for storage

When OpenSSL manages the storage of some object, we use whatever format OpenSSL provides – typically, some kind of PEM-wrapped base 64 encoding that starts with “—– BEGIN CRYPTOGRAPHIC OBJECT —-“.

When we manage the storage of some cryptographic object, we prefix the object with 32-byte NUL-padded prefix in order to avoid accidental object confusion; see the crypto_read_tagged_contents_from_file() and crypto_write_tagged_contents_to_file() functions for manipulating these. The prefix is “== type: tag ==”, where type describes the object and its encoding, and tag indicates which one it is.

Boxed-file storage

When managing keys, you frequently want to have some way to write a secret object to disk, encrypted with a passphrase. The crypto_pwbox and crypto_unpwbox functions do so in a way that’s likely to be readable by future versions of Tor.

Certificates

We have, alas, several certificate types in Tor.

The tor_x509_cert_t type represents an X.509 certificate. This document won’t explain X.509 to you – possibly, no document can. (OTOH, Peter Gutmann’s “x.509 style guide”, though severely dated, does a good job of explaining how awful x.509 can be.) Do not introduce any new usages of X.509. Right now we only use it in places where TLS forces us to do so.

The authority_cert_t type is used only for directory authority keys. It has a medium-term signing key (which the authorities actually keep online) signed by a long-term identity key (which the authority operator had really better be keeping offline). Don’t use it for any new kind of certificate.

For new places where you need a certificate, consider tor_cert_t: it represents a typed and dated something signed by an Ed25519 key. The format is described in tor-spec. Unlike x.509, you can write it on a napkin.

(Additionally, the Tor directory design uses a fairly wide variety of documents that include keys and which are signed by keys. You can consider these documents to be an additional kind of certificate if you want.)

TLS

Tor’s TLS implementation is more tightly coupled to OpenSSL than we’d prefer. You can read most of it in tortls.c.

Unfortunately, TLS’s state machine and our requirement for nonblocking IO support means that using TLS in practice is a bit hairy, since logical writes can block on a physical reads, and vice versa.

If you are lucky, you will never have to look at the code here.

OS compatibility functions

We’ve got a bunch of functions to wrap differences between various operating systems where we run.

The filesystem

We wrap the most important filesystem functions with load-file, save-file, and map-file abstractions declared in util.c or compat.c. If you’re messing about with file descriptors yourself, you might be doing things wrong. Most of the time, write_str_to_file() and read_str_from_file() are all you need.

Use the check_private_directory() function to create or verify the presence of directories, and tor_listdir() to list the files in a directory.

Those modules also have functions for manipulating paths a bit.

Networking

Nearly all the world is on a Berkeley sockets API, except for windows, whose version of the Berkeley API was corrupted by late-90s insistence on backward compatibility with the sort-of-berkeley-sort-of-not add-on thing that was WinSocks.

What’s more, everybody who implemented sockets realized that select() wasn’t a very good way to do nonblocking IO… and then the various implementations all decided to so something different.

You can forget about most of these differences, fortunately: We use libevent to hide most of the differences between the various networking backends, and we add a few of our own functions to hide the differences that Libevent doesn’t.

To create a network connection, the right level of abstraction to look at is probably the connection_t system in connection.c. Most of the lower level work has already been done for you. If you need to instantiate something that doesn’t fit well with connection_t, you should see whether you can instantiate it with connection_t anyway – or you might need to refactor connection.c a little.

Whenever possible, represent network addresses as tor_addr_t.

Process launch and monitoring

Launching and/or monitoring a process is tricky business. You can use the mechanisms in procmon.c and tor_spawn_background(), but they’re both a bit wonky. A refactoring would not be out of order.

Threads in Tor

Tor is based around a single main thread and one or more worker threads. We aim (with middling success) to use worker threads for CPU-intensive activities and the main thread for our networking. Fortunately (?) we have enough cryptography that moving what we can of the cryptographic processes to the workers should achieve good parallelism under most loads. Unfortunately, we only have a small fraction of our cryptography done in our worker threads right now.

Our threads-and-workers abstraction is defined in workqueue.c, which combines a work queue with a thread pool, and integrates the signalling with libevent. Tor main instance of a work queue is instantiated in cpuworker.c. It will probably need some refactoring as more types of work are added.

On a lower level, we provide locks with tor_mutex_t, conditions with tor_cond_t, and thread-local storage with tor_threadlocal_t, all of which are specified in compat_threads.h and implemented in an OS- specific compat_*threads.h module.

Try to minimize sharing between threads: it is usually best to simply make the worker “own” all the data it needs while the work is in progress, and to give up ownership when it’s complete.

String processing in Tor

Since you’re reading about a C program, you probably expected this section: it’s full of functions for manipulating the (notoriously dubious) C string abstraction. I’ll describe some often-missed highlights here.

Comparing strings and memory chunks

We provide strcmpstart() and strcmpend() to perform a strcmp with the start or end of a string.

tor_assert(!strcmpstart("Hello world","Hello"));
tor_assert(!strcmpend("Hello world","world"));

tor_assert(!strcasecmpstart("HELLO WORLD","Hello"));
tor_assert(!strcasecmpend("HELLO WORLD","world"));

To compare two string pointers, either of which might be NULL, use strcmp_opt().

To search for a string or a chunk of memory within a non-null terminated memory block, use tor_memstr or tor_memmem respectively.

We avoid using memcmp() directly, since it tends to be used in cases when having a constant-time operation would be better. Instead, we recommend tor_memeq() and tor_memneq() for when you need a constant-time operation. In cases when you need a fast comparison, and timing leaks are not a danger, you can use fast_memeq() and fast_memneq().

It’s a common pattern to take a string representing one or more lines of text, and search within it for some other string, at the start of a line. You could search for “\ntarget”, but that would miss the first line. Instead, use find_str_at_start_of_line.

Parsing text

Over the years, we have accumulated lots of ways to parse text – probably too many. Refactoring them to be safer and saner could be a good project! The one that seems most error-resistant is tokenizing text with smartlist_split_strings(). This function takes a smartlist, a string, and a separator, and splits the string along occurrences of the separator, adding new strings for the sub-elements to the given smartlist.

To handle time, you can use one of the functions mentioned above in “Parsing and encoding time values”.

For numbers in general, use the tor_parse_{long,ulong,double,uint64} family of functions. Each of these can be called in a few ways. The most general is as follows:

  const int BASE = 10;
  const int MINVAL = 10, MAXVAL = 10000;
  const char *next;
  int ok;
  long lng = tor_parse_long("100", BASE, MINVAL, MAXVAL, &ok, &next);

The return value should be ignored if “ok” is set to false. The input string needs to contain an entire number, or it’s considered invalid… unless the “next” pointer is available, in which case extra characters at the end are allowed, and “next” is set to point to the first such character.

Generating blocks of text

For not-too-large blocks of text, we provide tor_asprintf(), which behaves like other members of the sprintf() family, except that it always allocates enough memory on the heap for its output.

For larger blocks: Rather than using strlcat and strlcpy to build text, or keeping pointers to the interior of a memory block, we recommend that you use the smartlist_* functions to build a smartlist full of substrings in order. Then you can concatenate them into a single string with smartlist_join_strings(), which also takes optional separator and terminator arguments.

As a convenience, we provide smartlist_add_asprintf(), which combines the two methods above together. Many of the cryptographic digest functions also accept a not-yet-concatenated smartlist of strings.

Logging helpers

Often we’d like to log a value that comes from an untrusted source. To do this, use escaped() to escape the nonprintable characters and other confusing elements in a string, and surround it by quotes. (Use esc_for_log() if you need to allocate a new string.)

It’s also handy to put memory chunks into hexadecimal before logging; you can use hex_str(memory, length) for that.

The escaped() and hex_str() functions both provide outputs that are only valid till they are next invoked; they are not threadsafe.

Data flow in the Tor process

We read bytes from the network, we write bytes to the network. For the most part, the bytes we write correspond roughly to bytes we have read, with bits of cryptography added in.

The rest is a matter of details.

Diagram of main data flows in TorDiagram of main data flows in Tor

Connections and buffers: reading, writing, and interpreting.

At a low level, Tor’s networking code is based on “connections”. Each connection represents an object that can send or receive network-like events. For the most part, each connection has a single underlying TCP stream (I’ll discuss counterexamples below).

A connection that behaves like a TCP stream has an input buffer and an output buffer. Incoming data is written into the input buffer (“inbuf”); data to be written to the network is queued on an output buffer (“outbuf”).

Buffers are implemented in buffers.c. Each of these buffers is implemented as a linked queue of memory extents, in the style of classic BSD mbufs, or Linux skbufs.

A connection’s reading and writing can be enabled or disabled. Under the hood, this functionality is implemented using libevent events: one for reading, one for writing. These events are turned on/off in main.c, in the functions connection_{start,stop}_{reading,writing}.

When a read or write event is turned on, the main libevent loop polls the kernel, asking which sockets are ready to read or write. (This polling happens in the event_base_loop() call in run_main_loop_once() in main.c.) When libevent finds a socket that’s ready to read or write, it invokes conn_{read,write}_callback(), also in main.c

These callback functions delegate to connection_handle_read() and connection_handle_write() in connection.c, which read or write on the network as appropriate, possibly delegating to openssl.

After data is read or written, or other event occurs, these connection_handle_read_write() functions call logic functions whose job is to respond to the information. Some examples included:

  • connection_flushed_some() – called after a connection writes any amount of data from its outbuf.
  • connection_finished_flushing() – called when a connection has emptied its outbuf.
  • connection_finished_connecting() – called when an in-process connection finishes making a remote connection.
  • connection_reached_eof() – called after receiving a FIN from the remote server.
  • connection_process_inbuf() – called when more data arrives on the inbuf.

These functions then call into specific implementations depending on the type of the connection. For example, if the connection is an edge_connection_t, connection_reached_eof() will call connection_edge_reached_eof().

Note: “Also there are bufferevents!” We have vestigial code for an alternative low-level networking implementation, based on Libevent’s evbuffer and bufferevent code. These two object types take on (most of) the roles of buffers and connections respectively. It isn’t working in today’s Tor, due to code rot and possible lingering libevent bugs. More work is needed; it would be good to get this working efficiently again, to have IOCP support on Windows.

Controlling connections

A connection can have reading or writing enabled or disabled for a wide variety of reasons, including:

  • Writing is disabled when there is no more data to write
  • For some connection types, reading is disabled when the inbuf is too full.
  • Reading/writing is temporarily disabled on connections that have recently read/written enough data up to their bandwidth
  • Reading is disabled on connections when reading more data from them would require that data to be buffered somewhere else that is already full.

Currently, these conditions are checked in a diffuse set of increasingly complex conditional expressions. In the future, it could be helpful to transition to a unified model for handling temporary read/write suspensions.

Kinds of connections

Today Tor has the following connection and pseudoconnection types. For the most part, each type of channel has an associated C module that implements its underlying logic.

Edge connections receive data from and deliver data to points outside the onion routing network. See connection_edge.c. They fall into two types:

Entry connections are a type of edge connection. They receive data from the user running a Tor client, and deliver data to that user. They are used to implement SOCKSPort, TransPort, NATDPort, and so on. Sometimes they are called “AP” connections for historical reasons (it used to stand for “Application Proxy”).

Exit connections are a type of edge connection. They exist at an exit node, and transmit traffic to and from the network.

(Entry connections and exit connections are also used as placeholders when performing a remote DNS request; they are not decoupled from the notion of “stream” in the Tor protocol. This is implemented partially in connection_edge.c, and partially in dnsserv.c and dns.c.)

OR connections send and receive Tor cells over TLS, using some version of the Tor link protocol. Their implementation is spread across connection_or.c, with a bit of logic in command.c, relay.c, and channeltls.c.

Extended OR connections are a type of OR connection for use on bridges using pluggable transports, so that the PT can tell the bridge some information about the incoming connection before passing on its data. They are implemented in ext_orport.c.

Directory connections are server-side or client-side connections that implement Tor’s HTTP-based directory protocol. These are instantiated using a socket when Tor is making an unencrypted HTTP connection. When Tor is tunneling a directory request over a Tor circuit, directory connections are implemented using a linked connection pair (see below). Directory connections are implemented in directory.c; some of the server-side logic is implemented in dirserver.c.

Controller connections are local connections to a controller process implementing the controller protocol from control-spec.txt. These are in control.c.

Listener connections are not stream oriented! Rather, they wrap a listening socket in order to detect new incoming connections. They bypass most of stream logic. They don’t have associated buffers. They are implemented in connection.c.

structure hierarchy for connection typesstructure hierarchy for connection types

Note: “History Time!” You might occasionally find reference to a couple types of connections which no longer exist in modern Tor. A CPUWorker connection connected the main Tor process to a thread or process used for computation. (Nowadays we use in-process communication.) Even more anciently, a DNSWorker connection connected the main tor process to a separate thread or process used for running gethostbyname() or getaddrinfo(). (Nowadays we use Libevent’s evdns facility to perform DNS requests asynchronously.)

Linked connections

Sometimes two channels are joined together, such that data which the Tor process sends on one should immediately be received by the same Tor process on the other. (For example, when Tor makes a tunneled directory connection, this is implemented on the client side as a directory connection whose output goes, not to the network, but to a local entry connection. And when a directory receives a tunnelled directory connection, this is implemented as an exit connection whose output goes, not to the network, but to a local directory connection.)

The earliest versions of Tor to support linked connections used socketpairs for the purpose. But using socketpairs forced us to copy data through kernelspace, and wasted limited file descriptors. So instead, a pair of connections can be linked in-process. Each linked connection has a pointer to the other, such that data written on one is immediately readable on the other, and vice versa.

From connections to channels

There’s an abstraction layer above OR connections (the ones that handle cells) and below cells called Channels. A channel’s purpose is to transmit authenticated cells from one Tor instance (relay or client) to another.

Currently, only one implementation exists: Channel_tls, which sends and receiveds cells over a TLS-based OR connection.

Cells are sent on a channel using channel_write_{,packed_,var_}cell(). Incoming cells arrive on a channel from its backend using channel_queue*_cell(), and are immediately processed using channel_process_cells().

Some cell types are handled below the channel layer, such as those that affect handshaking only. And some others are passed up to the generic cross-channel code in command.c: cells like DESTROY and CREATED are all trivial to handle. But relay cells require special handling…

From channels through circuits

When a relay cell arrives on an existing circuit, it is handled in circuit_receive_relay_cell() – one of the innermost functions in Tor. This function encrypts or decrypts the relay cell as appropriate, and decides whether the cell is intended for the current hop of the circuit.

If the cell is intended for the current hop, we pass it to connection_edge_process_relay_cell() in relay.c, which acts on it based on its relay command, and (possibly) queues its data on an edge_connection_t.

If the cell is not intended for the current hop, we queue it for the next channel in sequence with append cell_to_circuit_queue(). This places the cell on a per-circuit queue for cells headed out on that particular channel.

Sending cells on circuits: the complicated bit.

Relay cells are queued onto circuits from one of two (main) sources: reading data from edge connections, and receiving a cell to be relayed on a circuit. Both of these sources place their cells on cell queue: each circuit has one cell queue for each direction that it travels.

A naive implementation would skip using cell queues, and instead write each outgoing relay cell. (Tor did this in its earlier versions.) But such an approach tends to give poor performance, because it allows high-volume circuits to clog channels, and it forces the Tor server to send data queued on a circuit even after that circuit has been closed.

So by using queues on each circuit, we can add cells to each channel on a just-in-time basis, choosing the cell at each moment based on a performance-aware algorithm.

This logic is implemented in two main modules: scheduler.c and circuitmux*.c. The scheduler code is responsible for determining globally, across all channels that could write cells, which one should next receive queued cells. The circuitmux code determines, for all of the circuits with queued cells for a channel, which one should queue the next cell.

(This logic applies to outgoing relay cells only; incoming relay cells are processed as they arrive.)

Tor’s modules

Generic modules

buffers.c : Implements the buf_t buffered data type for connections, and several low-level data handling functions to handle network protocols on it.

channel.c : Generic channel implementation. Channels handle sending and receiving cells among tor nodes.

channeltls.c : Channel implementation for TLS-based OR connections. Uses connection_or.c.

circuitbuild.c : Code for constructing circuits and choosing their paths. (Note: this module could plausibly be split into handling the client side, the server side, and the path generation aspects of circuit building.)

circuitlist.c : Code for maintaining and navigating the global list of circuits.

circuitmux.c : Generic circuitmux implementation. A circuitmux handles deciding, for a particular channel, which circuit should write next.

circuitmux_ewma.c : A circuitmux implementation based on the EWMA (exponentially weighted moving average) algorithm.

circuituse.c : Code to actually send and receive data on circuits.

command.c : Handles incoming cells on channels.

config.c : Parses options from torrc, and uses them to configure the rest of Tor.

confparse.c : Generic torrc-style parser. Used to parse torrc and state files.

connection.c : Generic and common connection tools, and implementation for the simpler connection types.

connection_edge.c : Implementation for entry and exit connections.

connection_or.c : Implementation for OR connections (the ones that send cells over TLS).

main.c : Principal entry point, main loops, scheduled events, and network management for Tor.

ntmain.c : Implements Tor as a Windows service. (Not very well.)

onion.c : Generic code for generating and responding to CREATE and CREATED cells, and performing the appropriate onion handshakes. Also contains code to manage the server-side onion queue.

onion_fast.c : Implements the old SHA1-based CREATE_FAST/CREATED_FAST circuit creation handshake. (Now deprecated.)

onion_ntor.c : Implements the Curve25519-based NTOR circuit creation handshake.

onion_tap.c : Implements the old RSA1024/DH1024-based TAP circuit creation handshake. (Now deprecated.)

relay.c : Handles particular types of relay cells, and provides code to receive, encrypt, route, and interpret relay cells.

scheduler.c : Decides which channel/circuit pair is ready to receive the next cell.

statefile.c : Handles loading and storing Tor’s state file.

tor_main.c : Contains the actual main() function. (This is placed in a separate file so that the unit tests can have their own main().)

Node-status modules

directory.c : Implements the HTTP-based directory protocol, including sending, receiving, and handling most request types. (Note: The client parts of this, and the generic-HTTP parts of this, could plausibly be split off.)

microdesc.c : Implements the compact “microdescriptor” format for keeping track of what we know about a router.

networkstatus.c : Code for fetching, storing, and interpreting consensus vote documents.

nodelist.c : Higher-level view of our knowledge of which Tor servers exist. Each node_t corresponds to a router we know about.

routerlist.c : Code for storing and retrieving router descriptors and extrainfo documents.

routerparse.c : Generic and specific code for parsing all Tor directory information types.

routerset.c : Parses and interprets a specification for a set of routers (by IP range, fingerprint, nickname (deprecated), or country).

Client modules

addressmap.c : Handles client-side associations between one address and another. These are used to implement client-side DNS caching (NOT RECOMMENDED), MapAddress directives, Automapping, and more.

circpathbias.c : Path bias attack detection for circuits: tracks whether connections made through a particular guard have an unusually high failure rate.

circuitstats.c : Code to track circuit performance statistics in order to adapt our behavior. Notably includes an algorithm to track circuit build times.

dnsserv.c : Implements DNSPort for clients. (Note that in spite of the word “server” in this module’s name, it is used for Tor clients. It implements a DNS server, not DNS for servers.)

entrynodes.c : Chooses, monitors, and remembers guard nodes. Also contains some bridge-related code.

torcert.c : Code to interpret and generate Ed25519-based certificates.

Server modules

dns.c : Server-side DNS code. Handles sending and receiving DNS requests on exit nodes, and implements the server-side DNS cache.

dirserv.c : Implements part of directory caches that handles responding to client requests.

ext_orport.c : Implements the extended ORPort protocol for communication between server-side pluggable transports and Tor servers.

hibernate.c : Performs bandwidth accounting, and puts Tor relays into hibernation when their bandwidth is exhausted.

router.c : Management code for running a Tor server. In charge of RSA key maintenance, descriptor generation and uploading.

routerkeys.c : Key handling code for a Tor server. (Currently handles only the Ed25519 keys, but the RSA keys could be moved here too.)

Onion service modules

rendcache.c : Stores onion service descriptors.

rendclient.c : Client-side implementation of the onion service protocol.

rendcommon.c : Parts of the onion service protocol that are shared by clients, services, and/or Tor servers.

rendmid.c : Tor-server-side implementation of the onion service protocol. (Handles acting as an introduction point or a rendezvous point.)

rendservice.c : Service-side implementation of the onion service protocol.

replaycache.c : Backend to check introduce2 requests for replay attempts.

Authority modules

dircollate.c : Helper for dirvote.c: Given a set of votes, each containing a list of Tor nodes, determines which entries across all the votes correspond to the same nodes, and yields them in a useful order.

dirvote.c : Implements the directory voting algorithms that authorities use.

keypin.c : Implements a persistent key-pinning mechanism to tie RSA1024 identities to ed25519 identities.

Miscellaneous modules

control.c : Implements the Tor controller protocol.

cpuworker.c : Implements the inner work queue function. We use this to move the work of circuit creation (on server-side) to other CPUs.

fp_pair.c : Types for handling 2-tuples of 20-byte fingerprints.

geoip.c : Parses geoip files (which map IP addresses to country codes), and performs lookups on the internal geoip table. Also stores some geoip-related statistics.

policies.c : Parses and implements Tor exit policies.

reasons.c : Maps internal reason-codes to human-readable strings.

rephist.c : Tracks Tor servers’ performance over time.

status.c : Writes periodic “heartbeat” status messages about the state of the Tor process.

transports.c : Implements management for the pluggable transports subsystem.

This not that

Don’t use memcmp. Use {tor,fast}_{memeq,memneq,memcmp}.

Don’t use assert. Use tor_assert or tor_assert_nonfatal or BUG. Prefer nonfatal assertions or BUG()s.

Don’t use sprintf or snprintf. Use tor_asprintf or tor_snprintf.

Don’t write hand-written binary parsers. Use trunnel.

Don’t use malloc, realloc, calloc, free, strdup, etc. Use tor_malloc, tor_realloc, tor_calloc, tor_free, tor_strdup, etc.

Don’t use tor_realloc(x, y*z). Use tor_reallocarray(x, y, z);

Don’t say “if (x) foo_free(x)”. Just foo_free(x) and make sure that foo_free(NULL) is a no-op.

Don’t use toupper or tolower; use TOR_TOUPPER and TOR_TOLOWER.

Don’t use isalpha, isalnum, etc. Instead use TOR_ISALPHA, TOR_ISALNUM, etc.

Don’t use strcat, strcpy, strncat, or strncpy. Use strlcat and strlcpy instead.

Don’t use tor_asprintf then smartlist_add; use smartlist_add_asprintf.

Don’t use any of these functions: they aren’t portable. Use the version prefixed with tor_ instead: strtok_r, memmem, memstr, asprintf, localtime_r, gmtime_r, inet_aton, inet_ntop, inet_pton, getpass, ntohll, htonll, strdup, (This list is incomplete.)

Don’t create or close sockets directly. Instead use the wrappers in compat.h.

When creating new APIs, only use ‘char *’ to represent ‘pointer to a nul-terminated string’. Represent ‘pointer to a chunk of memory’ as ‘uint8_t *’. (Many older Tor APIs ignore this rule.)

Don’t encode/decode u32, u64, or u16 to byte arrays by casting pointers. That can crash if the pointers aren’t aligned, and can cause endianness problems. Instead say something more like set_uint32(ptr, htonl(foo)) to encode, and ntohl(get_uint32(ptr)) to decode.

Don’t declare a 0-argument function with “void foo()”. That’s C++ syntax. In C you say “void foo(void)”.

When creating new APIs, use const everywhere you reasonably can.

Sockets should have type tor_socket_t, not int.