Signal Handlers for Multithreaded C++

I recently wrote a system daemon that had to handle signals gracefully, that is, initiate a clean shutdown of all threads upon receiving a signal that would otherwise terminate the application immediately. SIGTERM and SIGINT, for example, most likely need to be handled by every long-running process.

Jump to github.com/thomastrapp/signal-wrangler for the nitty gritty.

Silly Signal Handlers

Unix signals are a bit of a pain:

  • They seem deceptively simple, but aren’t.
  • They are asynchronous. Upon arrival of a signal, the process is suspended and a signal handler is executed if one was installed by the application (through sigaction or its older counterpart signal). Because the process is suspended out of the blue, none of the usual assumptions can be made about the state of the application.
  • A signal sent to the process is delivered to any one thread that has not blocked the signal, including (and as the most likely candidate) the main thread.

Atomics to the Rescue

The asynchronous nature of signals restricts what a signal handler may do so severely that there is a man page about it: man 7 signal-safety. Notice how the whitelist of safe function calls omits anything that is fun.

But using atomics is allowed: volatile sig_atomic_t was made for this purpose, and std::atomic is signal-safe if it is lock-free.

The following example uses atomics to safely tell the main loop to exit:

#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>

#include <signal.h>
#include <unistd.h>


namespace {
  // In the GNU C Library, sig_atomic_t is a typedef for int,
  // which is atomic on all systems that are supported by the
  // GNU C Library
  volatile sig_atomic_t do_shutdown = 0;

  // std::atomic is safe, as long as it is lock-free
  std::atomic<bool> shutdown_requested = false;
  static_assert( std::atomic<bool>::is_always_lock_free );
  // or, at runtime: assert( shutdown_requested.is_lock_free() );
}

void my_signal_handler(int /*signum*/)
{
  // ok, lock-free atomics
  do_shutdown = 1;
  shutdown_requested = true;

  const char str[] = "received signal\n";
  // ok, write is signal-safe
  write(STDERR_FILENO, str, sizeof(str) - 1);

  // UB, unsafe, internal buffers: std::cout << "received signal\n";
  // UB, unsafe, allocates: std::vector<T>(20);
  // unsafe, internal buffers: printf("received signal\n");
}

int main()
{
  // setup signal handler
  {
    struct sigaction action;
    action.sa_handler = my_signal_handler;
    sigemptyset(&action.sa_mask);
    action.sa_flags = 0;
    sigaction(SIGINT, &action, NULL);
  }

  // main loop
  while( !do_shutdown && !shutdown_requested.load() )
  {
    std::cout << "doing work...\n";
    std::this_thread::sleep_for(std::chrono::seconds(1));
  }

  std::cout << "shutting down\n";

  /* do cleanup ... */

  return 0;
}

This also works well with threads that have a main loop: Each thread can access these atomics to check if a shutdown was requested.
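
As a rough sketch of that idea, the snippet below lets a few worker threads poll a shared atomic flag that a signal handler sets. The names stop_flag, on_signal and run_polling_workers are made up for this illustration, and SIGUSR1 is raised from within the process only so that the example terminates on its own.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

#include <signal.h>

namespace {
  // Written by the signal handler, polled by every worker.
  std::atomic<bool> stop_flag{false};

  // Only touches a lock-free atomic, which is signal-safe.
  void on_signal(int /*signum*/) { stop_flag.store(true); }
}

// Hypothetical helper: spawns `count` polling workers, raises SIGUSR1 and
// returns the number of workers that observed the flag and left their loop.
int run_polling_workers(int count)
{
  assert( stop_flag.is_lock_free() );
  stop_flag.store(false);

  struct sigaction action{};
  action.sa_handler = on_signal;
  sigemptyset(&action.sa_mask);
  sigaction(SIGUSR1, &action, nullptr);

  std::atomic<int> stopped{0};
  std::vector<std::thread> workers;
  for( int i = 0; i < count; ++i )
    workers.emplace_back([&stopped]() {
      // main loop of a worker: check the flag between units of work
      while( !stop_flag.load() )
        std::this_thread::sleep_for(std::chrono::milliseconds(5));
      ++stopped;
    });

  // stand-in for an external signal; the handler runs in the calling thread
  raise(SIGUSR1);

  for( auto& worker : workers )
    worker.join();

  return stopped.load();
}
```

Each worker notices the request at most one polling interval after the handler ran, which is fine for busy loops but wasteful for threads that are mostly idle.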

Sleepy Threads and Condition Variables

But what if your threads sleep most of the time? For example, in my application threads only do some work sporadically, with long sleeps in between. If a signal for termination arrives, the sleep must of course be interrupted immediately. Further, if init sends SIGTERM to the application, and the application does not terminate within the timeout, the application will be killed ruthlessly.

The solution to this is a conditional wait: Let the threads sleep until a certain timespan has passed, or a condition was satisfied, whichever comes first.

The idiomatic way to do this in C++ is to use a std::condition_variable: By calling std::condition_variable::notify_{one,all} threads can be woken up from their sleep. Unfortunately, notify_{one,all} is not signal safe, and therefore cannot be used within a signal handler.
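
Outside of a signal handler, such an interruptible sleep might look like the following sketch. ShutdownFlag and woken_before_timeout are hypothetical names for this illustration. Note that the flag is modified while holding the mutex, so a waiter cannot miss the notification between checking its predicate and going to sleep.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: a flag that sleeping threads can wait on.
class ShutdownFlag
{
public:
  // Sets the flag and wakes up all waiting threads.
  // Must be called from an ordinary thread, never from a signal handler.
  void request()
  {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      requested_ = true;
    }
    cv_.notify_all();
  }

  bool requested() const
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return requested_;
  }

  // Sleeps until the timeout expires or the flag is set, whichever comes
  // first. Returns true if the flag is set.
  template<class Rep, class Period>
  bool wait_for(std::chrono::duration<Rep, Period> timeout)
  {
    std::unique_lock<std::mutex> lock(mutex_);
    return cv_.wait_for(lock, timeout, [this]() { return requested_; });
  }

private:
  mutable std::mutex mutex_;
  std::condition_variable cv_;
  bool requested_ = false;
};

// Usage sketch: a second thread requests the shutdown, which ends the
// one-hour sleep of the calling thread early.
bool woken_before_timeout()
{
  ShutdownFlag flag;
  assert( !flag.requested() );

  std::thread setter([&flag]() { flag.request(); });
  const bool woken = flag.wait_for(std::chrono::hours(1));
  setter.join();

  return woken;
}
```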

So signal handlers are out; there’s no safe way to make them work in this case. Great – What now?

No need to call me, I’ll sigwait for you

Enter sigwait:

The sigwait() function suspends execution of the calling thread until one of the signals specified in the signal set set becomes pending. The function accepts the signal (removes it from the pending list of signals), and returns the signal number in sig.

By using sigwait or sigwaitinfo, a multithreaded application can block all signals at startup and have one dedicated thread wait for signals. This allows us to use all synchronization primitives at our disposal.

#include <atomic>
#include <chrono>
#include <condition_variable>
#include <cstdlib>
#include <future>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

#include <signal.h>


int main()
{
  // block signals in this thread and subsequently
  // spawned threads
  sigset_t sigset;
  sigemptyset(&sigset);
  sigaddset(&sigset, SIGINT);
  sigaddset(&sigset, SIGTERM);
  pthread_sigmask(SIG_BLOCK, &sigset, nullptr);

  std::atomic<bool> shutdown_requested(false);
  std::mutex cv_mutex;
  std::condition_variable cv;

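  auto signal_handler = [&shutdown_requested, &cv_mutex, &cv, &sigset]() {
    int signum = 0;
    // wait until a signal is delivered:
    sigwait(&sigset, &signum);
    {
      // set the flag while holding the mutex, so that a worker cannot
      // miss the notification between checking its predicate and blocking:
      std::lock_guard<std::mutex> lock(cv_mutex);
      shutdown_requested.store(true);
    }
    // notify all waiting workers to check their predicate:
    cv.notify_all();
    return signum;
  };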

  auto ft_signal_handler = std::async(std::launch::async, signal_handler);

  auto worker = [&shutdown_requested, &cv_mutex, &cv]() {
    while( shutdown_requested.load() == false )
    {
      std::unique_lock lock(cv_mutex);
      cv.wait_for(
          lock,
          // wait for up to an hour
          std::chrono::hours(1),
          // when the condition variable is woken up and this predicate
          // returns true, the wait is stopped:
          [&shutdown_requested]() { return shutdown_requested.load(); });
    }

    return shutdown_requested.load();
  };

  // spawn a bunch of workers
  std::vector<std::future<bool>> workers;
  for( int i = 0; i < 10; ++i )
    workers.push_back(std::async(std::launch::async, worker));

  std::cout << "waiting for SIGTERM or SIGINT ([CTRL]+[c])...\n";

  // wait for signal handler to complete
  int signal = ft_signal_handler.get();
  std::cout << "received signal " << signal << "\n";

  // wait for workers
  for( auto& future : workers )
    std::cout << "worker observed shutdown request: "
              << std::boolalpha
              << future.get()
              << "\n";

  std::cout << "clean shutdown\n";

  return EXIT_SUCCESS;
}
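
sigwaitinfo works like sigwait but also reports details about the accepted signal, such as the PID of the sender. The following is a minimal sketch for Linux; accept_own_signal is a hypothetical helper, and the process sends the signal to itself only to keep the example self-contained.

```cpp
#include <cassert>

#include <signal.h>
#include <unistd.h>

// Hypothetical helper: blocks `signum`, sends it to this process and then
// accepts it synchronously. Returns the PID of the sender, or -1 on failure.
// Call this before spawning threads, so that no other thread has the
// signal unblocked.
pid_t accept_own_signal(int signum)
{
  sigset_t sigset;
  sigemptyset(&sigset);
  sigaddset(&sigset, signum);
  if( pthread_sigmask(SIG_BLOCK, &sigset, nullptr) != 0 )
    return -1;

  // the signal is blocked, so it stays pending instead of being delivered
  if( kill(getpid(), signum) != 0 )
    return -1;

  // accept the pending signal; no signal handler is involved
  siginfo_t info{};
  const int received = sigwaitinfo(&sigset, &info);
  if( received != signum )
    return -1;

  assert( info.si_signo == received );
  return info.si_pid;
}
```

Because the signal is accepted in normal thread context, the code after sigwaitinfo may log, allocate or lock like any other code.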

Signal Wrangler – A header-only Library

Signal Wrangler is a small header-only library with convenient helpers to manage signals more expressively. See sgnl/AtomicCondition.h and sgnl/SignalHandler.h for documentation.

#include <sgnl/AtomicCondition.h>
#include <sgnl/SignalHandler.h>

#include <chrono>
#include <cstdlib>
#include <functional>
#include <future>
#include <iostream>
#include <thread>
#include <vector>

#include <signal.h>

void Worker(const sgnl::AtomicCondition<bool>& exit_condition)
{
  auto predicate = [&exit_condition]() {
    return exit_condition.get();
  };
  while( true )
  {
    exit_condition.wait_for(std::chrono::minutes(1), predicate);
    if( exit_condition.get() )
      return;
    /* ... do work ... */
  }
}

int main()
{
  sgnl::AtomicCondition<bool> exit_condition(false);

  auto handler = [&exit_condition](int signum) {
    std::cout << "received signal " << signum << "\n";
    if( signum == SIGTERM || signum == SIGINT )
    {
      exit_condition.set(true);
      // wakeup all waiting threads
      exit_condition.notify_all();
      // stop polling for signals
      return true;
    }

    // continue waiting for signals
    return false;
  };

  // Block signals in this thread.
  // Threads spawned later will inherit the signal mask.
  sgnl::SignalHandler signal_handler({SIGINT, SIGTERM, SIGUSR1});

  std::future<int> ft_sig_handler =
    std::async(
        std::launch::async,
        &sgnl::SignalHandler::sigwait_handler,
        &signal_handler,
        std::ref(handler));

  std::vector<std::future<void>> futures;
  for(int i = 0; i < 10; ++i)
    futures.push_back(
        std::async(
          std::launch::async,
          Worker,
          std::ref(exit_condition)));

  // SIGUSR1
  std::this_thread::sleep_for(std::chrono::milliseconds(100));
  kill(0, SIGUSR1);

  // SIGTERM
  kill(0, SIGTERM);
  std::this_thread::sleep_for(std::chrono::milliseconds(100));

  for(auto& future : futures)
    future.wait();

  int last_signal = ft_sig_handler.get();
  std::cout << "exiting (received signal " << last_signal << ")\n";

  return EXIT_SUCCESS;
}

If you have any thoughts or suggestions, feel free to hit me up by email, or create an issue/pull-request at github.com/thomastrapp/signal-wrangler.

Alternative: Polling for Signals

signalfd returns a file descriptor that allows polling for signals. If there’s already a main loop monitoring a set of file descriptors (e.g. through epoll) it might be prudent to just add another descriptor and block all signals for the process.
Signals can then be handled in the main loop, when the application is in a well defined state.
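
A minimal, Linux-only sketch of this approach follows. It uses poll instead of epoll to stay short, read_signal_via_fd is a hypothetical helper, and the process again signals itself so that the example finishes on its own.

```cpp
#include <cassert>

#include <poll.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

// Hypothetical helper: blocks `signum`, routes it through a signalfd and
// reads it back like any other I/O event. Returns the signal number that
// was read, or -1 on failure or timeout.
int read_signal_via_fd(int signum)
{
  assert( signum > 0 );

  // signals must be blocked, otherwise they are handled the usual way
  sigset_t sigset;
  sigemptyset(&sigset);
  sigaddset(&sigset, signum);
  if( pthread_sigmask(SIG_BLOCK, &sigset, nullptr) != 0 )
    return -1;

  const int fd = signalfd(-1, &sigset, SFD_CLOEXEC);
  if( fd == -1 )
    return -1;

  // stand-in for an external signal
  kill(getpid(), signum);

  int result = -1;
  pollfd pfd{};
  pfd.fd = fd;
  pfd.events = POLLIN;
  // a real main loop would monitor this descriptor alongside its others
  if( poll(&pfd, 1, 1000) == 1 && (pfd.revents & POLLIN) )
  {
    signalfd_siginfo info{};
    if( read(fd, &info, sizeof(info)) == static_cast<ssize_t>(sizeof(info)) )
      result = static_cast<int>(info.ssi_signo);
  }

  close(fd);
  return result;
}
```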

In “The Design of the Unix Operating System” from 1986, Maurice J. Bach cites Dennis Ritchie:

According to Ritchie (private communication), signals were designed as events that are fatal or ignored, not necessarily handled

In A Research UNIX Reader: Annotated Excerpts from the Programmer’s Manual, 1971-1986:

A simple unconditional kill was available to terminate rogue programs in the background (v2). In v5 kill was generalized to send arbitrary signals. Never, however, was the basically unstructured signal-kill mechanism regarded as a significant means of interprocess communication.

Alrighty, then! :)