Signal Handler for multithreaded C++

I recently wrote a system daemon that had to handle signals gracefully, that is, initiate a clean shutdown of all threads upon receiving a signal that would otherwise terminate the application immediately. For example, SIGTERM and SIGINT (usually sent by a furious developer banging on [ctrl]+[c]) most likely must be handled by every long-running process.

~tldr: Signal handlers bad, sigwait great.

Unix Signals are a bit of a pain:

  • They seem deceptively simple, but aren’t.
  • They are asynchronous. Upon arrival of a signal, the process is suspended and a signal handler is executed if one was installed by the application (through sigaction or its older version signal). Because the process is suspended out of the blue, none of the usual assumptions can be made about the state of the application.
  • Signals are delivered to any one thread that has not blocked the signal, including (and as a most likely candidate) the main thread.
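The last point also means that the set of candidate threads can be narrowed with per-thread signal masks. As a rough sketch (the function names are made up for illustration, and it assumes that std::async with std::launch::async starts a regular POSIX thread, as it does on Linux), the following checks that a thread spawned after blocking a signal starts out with that signal blocked too:

```cpp
#include <future>

#include <signal.h>

// Hypothetical helper: returns true if `signum` is blocked in the
// calling thread.
bool is_blocked_here(int signum)
{
  sigset_t current;
  sigemptyset(&current);
  // with a null `set` argument, pthread_sigmask only reports the
  // current mask and changes nothing
  if( pthread_sigmask(SIG_BLOCK, nullptr, &current) != 0 )
    return false;
  return sigismember(&current, signum) == 1;
}

// Hypothetical helper: blocks `signum` in the calling thread, then
// reports whether a thread spawned afterwards has it blocked as well.
bool spawned_thread_inherits_block(int signum)
{
  sigset_t set;
  sigemptyset(&set);
  sigaddset(&set, signum);
  if( pthread_sigmask(SIG_BLOCK, &set, nullptr) != 0 )
    return false;

  // a new thread starts with a copy of its creator's signal mask
  auto ft = std::async(std::launch::async, is_blocked_here, signum);
  return ft.get();
}
```

Calling spawned_thread_inherits_block(SIGUSR1) early in main should return true, and every thread created afterwards is no longer a candidate for SIGUSR1.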

The asynchronous nature of signals limits what a signal handler may do so severely that there is a man page about it: man 7 signal-safety. Notice how the whitelist of safe function calls misses anything that is fun.

Using atomics is allowed. sig_atomic_t was made for this purpose and std::atomic is signal safe if it is lock-free.

The following example uses atomics to safely tell the main loop to exit:

#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>

#include <signal.h>
#include <unistd.h>


namespace {
  // In the GNU C Library, sig_atomic_t is a typedef for int,
  // which is atomic on all systems that are supported by the
  // GNU C Library
  volatile sig_atomic_t do_shutdown = 0;

  // std::atomic is safe, as long as it is lock-free
  std::atomic<bool> shutdown_requested = false;
  static_assert( std::atomic<bool>::is_always_lock_free );
  // or, at runtime: assert( shutdown_requested.is_lock_free() );
}

void my_signal_handler(int /*signum*/)
{
  // ok, lock-free atomics
  do_shutdown = 1;
  shutdown_requested = true;

  const char str[] = "received signal\n";
  // ok, write is signal-safe
  write(STDERR_FILENO, str, sizeof(str) - 1);

  // UB, unsafe, internal buffers: std::cout << "received signal\n";
  // UB, unsafe, allocates: std::vector<T>(20);
  // unsafe, internal buffers: printf("received signal\n");
}

int main()
{
  // setup signal handler
  {
    struct sigaction action;
    action.sa_handler = my_signal_handler;
    sigemptyset(&action.sa_mask);
    action.sa_flags = 0;
    sigaction(SIGINT, &action, NULL);
  }

  // main loop
  while( !do_shutdown && !shutdown_requested.load() )
  {
    std::cout << "doing work...\n";
    std::this_thread::sleep_for(std::chrono::seconds(1));
  }

  std::cout << "shutting down\n";

  /* do cleanup ... */

  return 0;
}

This also works well with threads that have a main loop: Each thread can access these atomics to check if a shutdown was requested.

But what if your threads sleep most of the time? For example, in my application threads only do some work sporadically, with long sleeps in between. If a signal for termination arrives, the sleep must of course be interrupted immediately. Further, if init sends SIGTERM to the application, and the application does not terminate within the timeout, the application will be killed.

The solution to this is a conditional wait: Let the threads sleep until a certain timespan has passed, or a condition was satisfied, whichever comes first.

The idiomatic way to do this in C++ is to use a std::condition_variable: By calling std::condition_variable::notify_{one,all} threads can be woken up from their sleep. Unfortunately, notify_{one,all} is not signal safe, and therefore cannot be used within a signal handler.

So signal handlers are out; there’s no safe way to make them work in this case.

Enter sigwait:

The sigwait() function suspends execution of the calling thread until one of the signals specified in the signal set set becomes pending. The function accepts the signal (removes it from the pending list of signals), and returns the signal number in sig.
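A small single-threaded sketch may help to see this in isolation. The function name is made up for illustration; it blocks a signal, raises it in the calling thread, and then accepts it with sigwait. Because the signal is blocked, raising it neither runs a handler nor terminates the process; the signal just stays pending until sigwait picks it up.

```cpp
#include <signal.h>

// Hypothetical helper: returns the signal number reported by sigwait,
// or -1 if any of the calls fails.
int block_raise_and_wait(int signum)
{
  sigset_t set;
  sigemptyset(&set);
  sigaddset(&set, signum);

  // the signal has to be blocked first, otherwise its default action
  // (or an installed handler) would run instead
  if( pthread_sigmask(SIG_BLOCK, &set, nullptr) != 0 )
    return -1;

  // the signal is blocked, so it is not delivered but stays pending
  if( raise(signum) != 0 )
    return -1;

  int received = 0;
  // returns right away, since the signal is already pending
  if( sigwait(&set, &received) != 0 )
    return -1;

  return received;
}
```

Calling block_raise_and_wait(SIGTERM) should return SIGTERM, and the process keeps running even though SIGTERM would normally terminate it.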

By using sigwait or sigwaitinfo, a multithreaded application can block all signals at startup and have one dedicated thread wait for signals. This allows us to use all synchronization primitives at our disposal.

#include <atomic>
#include <chrono>
#include <condition_variable>
#include <cstdlib>
#include <future>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

#include <signal.h>


int main()
{
  // block signals in this thread and subsequently
  // spawned threads
  sigset_t sigset;
  sigemptyset(&sigset);
  sigaddset(&sigset, SIGINT);
  sigaddset(&sigset, SIGTERM);
  pthread_sigmask(SIG_BLOCK, &sigset, nullptr);

  std::atomic<bool> shutdown_requested(false);
  std::mutex cv_mutex;
  std::condition_variable cv;

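  auto signal_handler = [&shutdown_requested, &cv_mutex, &cv, &sigset]() {
    int signum = 0;
    // wait until a signal is delivered:
    sigwait(&sigset, &signum);
    {
      // set the flag while holding the mutex, so that a worker cannot
      // miss the notification between checking its predicate and
      // going to sleep:
      std::lock_guard<std::mutex> lock(cv_mutex);
      shutdown_requested.store(true);
    }
    // notify all waiting workers to check their predicate:
    cv.notify_all();
    return signum;
  };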

  auto ft_signal_handler = std::async(std::launch::async, signal_handler);

  auto worker = [&shutdown_requested, &cv_mutex, &cv]() {
    while( shutdown_requested.load() == false )
    {
      std::unique_lock lock(cv_mutex);
      cv.wait_for(
          lock,
          // wait for up to an hour
          std::chrono::hours(1),
          // when the condition variable is woken up and this predicate
          // returns true, the wait is stopped:
          [&shutdown_requested]() { return shutdown_requested.load(); });
    }

    return shutdown_requested.load();
  };

  // spawn a bunch of workers
  std::vector<std::future<bool>> workers;
  for( int i = 0; i < 10; ++i )
    workers.push_back(std::async(std::launch::async, worker));

  std::cout << "waiting for SIGTERM or SIGINT ([CTRL]+[c])...\n";

  // wait for signal handler to complete
  int signal = ft_signal_handler.get();
  std::cout << "received signal " << signal << "\n";

  // wait for workers
  for( auto& future : workers )
    std::cout << "worker observed shutdown request: "
              << std::boolalpha
              << future.get()
              << "\n";

  std::cout << "clean shutdown\n";

  return EXIT_SUCCESS;
}

I’ve bundled this up and released it on GitHub.

In “The Design of the Unix Operating System” from 1986, Maurice J. Bach cites Dennis Ritchie:

According to Ritchie (private communication), signals were designed as events that are fatal or ignored, not necessarily handled

In A Research UNIX Reader: Annotated Excerpts from the Programmer’s Manual, 1971-1986:

A simple unconditional kill was available to terminate rogue programs in the background (v2). In v5 kill was generalized to send arbitrary signals. Never, however, was the basically unstructured signal-kill mechanism regarded as a significant means of interprocess communication.

Alrighty, then! :)
