Realtime Configuration

Introduction

MOTORCORTEX is designed for running Tasks with hard-realtime deadlines. It does this without special hypervisors that execute code within their own environment and scheduler. Instead, MOTORCORTEX uses the realtime capabilities of the latest standard Linux kernels, which make it possible to run userspace applications with very stringent hard-realtime deadlines. For optimal performance, Vectioneer provides specially tuned Linux distributions that have been tested on a wide range of hardware, such as the CX range of industrial embedded computers from Beckhoff, Compulab’s Fitlet2, the Aaeon UP², Siemens Simatic IPC127E IOT gateways and even systems-on-chip (SoCs) like the Raspberry Pi or NanoPi.

Because MOTORCORTEX applications are user-space applications, they can also be compiled to run in non-realtime mode on any platform that supports Linux. This may be sufficient for many applications that only need soft-realtime.

This chapter shows how to configure a MOTORCORTEX application for hard-realtime, and gives some pointers on how to get the best possible performance and what to avoid.

The do’s and don’ts of hard-realtime

Hard-realtime is very difficult to achieve in a general-purpose operating system, especially on modern CPUs or systems with many features that may interrupt the realtime process. Hardware interrupts in particular, for example from USB devices or video cards, have always been a problem for realtime. Modern systems are generally tuned for speed (or responsiveness) rather than for realtime (deterministic) behavior; for instance, a snappy response to user input is prioritized over calculations done in the background.

Luckily, Linux kernel developers have taken the realtime (low-latency) requirement very seriously, and a lot of development has gone into making Linux a general-purpose operating system with uncompromised realtime capability. This work originated with the PREEMPT_RT patch, originally published by Thomas Gleixner and Ingo Molnar, which has now matured enough to be included in the standard (vanilla) Linux kernel.

With a tuned Linux kernel, the same performance can now be achieved as with dedicated realtime operating systems or hypervisors.

In general, to make a realtime Task run so that it never misses a beat, follow these rules:

  • Allocate and deallocate memory outside the realtime phase. Reduce, or better, eliminate dynamic memory allocations during an iterate task (e.g. avoid using new in C++).
  • Do not use blocking calls, for example for IO.
  • Be careful with external libraries in your code that are not designed to be realtime-capable.
  • Reduce the number of page faults as much as possible.
  • Do not use the CPU’s C-States. C-States put the system into (partial) sleep or reduce clock speeds when the system load allows it. When this happens, realtime performance will probably be badly affected. Disable C-States in the BIOS. The drawback is that the system will consume more power and generate more heat than usual, so check that the system is sufficiently cooled.
  • Some CPUs throttle their throughput when the temperature rises above a certain threshold. Make sure there is adequate cooling for the task, and test realtime performance under full load in a worst-case environment.
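The memory-related rules above can be illustrated with a minimal sketch in plain Linux C++. It is not MOTORCORTEX code: `lockProcessMemory` and `SampleBuffer` are hypothetical names introduced here. The sketch assumes a Linux system where `mlockall` is available, and it only shows the general idea of allocating everything up front and keeping pages resident.

```cpp
#include <sys/mman.h>  // mlockall (Linux/POSIX)

#include <cstddef>
#include <vector>

// Lock all current and future pages of the process into RAM so they cannot
// be swapped out. Returns false if the call fails, for example because the
// process lacks the required privileges or memlock limit.
inline bool lockProcessMemory() {
  return mlockall(MCL_CURRENT | MCL_FUTURE) == 0;
}

// Hypothetical fixed-capacity buffer: all memory is allocated (and written
// once, which touches the pages) in the constructor, so push() never
// allocates inside the realtime loop.
class SampleBuffer {
 public:
  explicit SampleBuffer(std::size_t capacity) : data_(capacity, 0.0), size_(0) {}

  // Stores a value if there is room; returns false instead of growing.
  bool push(double value) {
    if (size_ == data_.size()) {
      return false;
    }
    data_[size_++] = value;
    return true;
  }

  std::size_t size() const { return size_; }
  std::size_t capacity() const { return data_.size(); }

 private:
  std::vector<double> data_;
  std::size_t size_;
};
```

In this sketch the buffer would be constructed and `lockProcessMemory()` called during initialization, before the realtime loop starts; inside the loop only `push()` is used.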

The good news is that MOTORCORTEX already takes care of a lot of tuning and configuration for you.

Assigning Tasks to CPUs

To run tasks in realtime, the CPUs that are going to be used for realtime must be isolated, so other processes cannot use these CPUs.

// start low latency, isolate CPU 0 and 1
utils::startRealTime({0, 1});

As soon as these CPU cores are isolated, other tasks are pushed off them onto the remaining CPUs. Do not isolate all CPUs: Linux itself needs at least one CPU to run.
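A simple guard against isolating every CPU could look like the sketch below. The helper `isolationIsValid` is hypothetical and not part of MOTORCORTEX; it only checks that each requested CPU index exists and that at least one core is left for Linux.

```cpp
#include <algorithm>
#include <set>
#include <vector>

// Hypothetical helper, not part of MOTORCORTEX: returns true if every CPU
// index in `isolated` exists on a machine with `total_cpus` cores and at
// least one core remains non-isolated for Linux itself.
inline bool isolationIsValid(const std::vector<unsigned int>& isolated,
                             unsigned int total_cpus) {
  // Remove duplicates so {0, 0} counts as one isolated core.
  const std::set<unsigned int> unique(isolated.begin(), isolated.end());
  const bool in_range =
      std::all_of(unique.begin(), unique.end(),
                  [total_cpus](unsigned int cpu) { return cpu < total_cpus; });
  return in_range && unique.size() < total_cpus;
}
```

Such a check could be run before the isolation call, for example with `std::thread::hardware_concurrency()` as the core count. Note that `hardware_concurrency()` may return 0 when the count is unknown, in which case this helper rejects every list.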

Tasks can be configured to run with different schedulers. Currently the following scheduler policies are supported:

enum class TaskSched {
 NORMAL = SCHED_OTHER, // Default non-realtime policy.
 REALTIME = SCHED_FIFO, // Default realtime policy.
 REALTIME_FIFO = SCHED_FIFO, // Realtime FIFO policy.
 REALTIME_RR = SCHED_RR // Realtime Round-Robin policy.
};

When a task is started, it can be assigned a scheduler, a set of CPUs and a priority. In general, the NORMAL scheduler is used for non-realtime tasks and the REALTIME scheduler for realtime tasks:

// start logger and communication with non-realtime scheduler
logger_task.start(rt_dt_micro_s, container::TaskSched::NORMAL);
comm_task.start(rt_dt_micro_s, container::TaskSched::NORMAL);
// start control and ethercat with realtime scheduler,
// attach to isolated CPUs 0 and 1, set priorities to 80
controls_task.start(rt_dt_micro_s, container::TaskSched::REALTIME, {0}, 80);
ethercat_task.start(rt_dt_micro_s, container::TaskSched::REALTIME, {1}, 80);
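For background, the policies in TaskSched map onto the standard POSIX scheduling policies. The sketch below shows, in plain POSIX C++, what requesting SCHED_FIFO with a given priority looks like for the calling thread. It is an illustration only and makes no claim about how MOTORCORTEX applies these settings internally; `clampPriority` and `makeCurrentThreadFifo` are hypothetical names.

```cpp
#include <pthread.h>
#include <sched.h>

// Clamp a requested priority into the valid range of a scheduling policy,
// as reported by the operating system.
inline int clampPriority(int policy, int requested) {
  const int lo = sched_get_priority_min(policy);
  const int hi = sched_get_priority_max(policy);
  if (requested < lo) {
    return lo;
  }
  if (requested > hi) {
    return hi;
  }
  return requested;
}

// Try to switch the calling thread to SCHED_FIFO with the given priority.
// Returns 0 on success or an error number (typically EPERM when the process
// is not allowed to use realtime scheduling).
inline int makeCurrentThreadFifo(int priority) {
  sched_param param{};
  param.sched_priority = clampPriority(SCHED_FIFO, priority);
  return pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
}
```

On Linux the valid SCHED_FIFO range is 1 to 99, so a priority of 80 as used above leaves headroom for higher-priority threads.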

Adjusting timing with secaligned

Each task has a secaligned parameter that can be modified through the MOTORCORTEX Parameter Tree. The secaligned parameter aligns the Task’s execution cycle with respect to the system monotonic timer. Its value can be set in the normalized range [0..1], where 1 corresponds to one full cycle time of the task.

In general, changing secaligned from its default value is not required.
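As a rough illustration of the normalized range, the sketch below converts a secaligned value into a time offset within one cycle. It assumes that the value acts as a simple phase shift proportional to the cycle time; this interpretation and the helper `phaseOffset` are assumptions made for illustration, not a description of the MOTORCORTEX implementation.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical illustration: convert a normalized secaligned value in [0..1]
// into a phase offset within one task cycle. Values outside the range are
// clamped.
inline std::chrono::microseconds phaseOffset(std::chrono::microseconds cycle,
                                             double secaligned) {
  if (secaligned < 0.0) {
    secaligned = 0.0;
  }
  if (secaligned > 1.0) {
    secaligned = 1.0;
  }
  return std::chrono::microseconds(
      static_cast<std::int64_t>(cycle.count() * secaligned));
}
```

Under this assumption, a task with a 1000 µs cycle and secaligned set to 0.25 would be shifted by 250 µs relative to the cycle boundary.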

TBD: Diagram & further explanation

Parallelizing execution; splitting the load into Workers

MOTORCORTEX has a special active container, the Worker, to run callable targets asynchronously. This can be used to parallelize heavy computations inside the control blocks or to perform blocking calls asynchronously (without blocking the control loop). The advantage of the MOTORCORTEX Worker is that it can be set up to run on a specific CPU with a specific priority and scheduler. By default, the Worker inherits these properties from the calling task. It is good practice to create the Worker in the startOp phase, when the calling task has all its realtime-related properties set.

bool MainControlLoop::startOp_() {
  // creates the Worker with the same priority and CPU affinity as the calling task
  worker1.create(std::string("worker1"));
  // creates the Worker with the Normal priority on the non-isolated CPUs
  worker2.create("worker2", mcx::container::TaskSched::NORMAL, mcx::utils::getAvailableCpu());
  return true;
}
bool MainControlLoop::iterateOp_(const container::TaskTime& system_time, container::UserTime* user_time) {
  // moving half of the control loops to the worker
  unsigned int split_the_work = number_of_control_loops / 2;
  for (unsigned int cnt = 0; cnt < split_the_work; cnt++) {
    worker1.start([&, cnt]() {
      punchControl_[cnt].iterate(system_time, user_time);
    });
  }
  // computing the other half in the task
  for (unsigned int cnt = split_the_work; cnt < number_of_control_loops; cnt++) {
    punchControl_[cnt].iterate(system_time, user_time);
  }
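  // Important: wait for the worker to finish before returning, because the
  // lambdas capture system_time and user_time by reference
  worker1.join(0);
  return true;
}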
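
The split-and-join pattern above can also be shown with the C++ standard library alone. The sketch below uses std::async as a stand-in for the Worker; `ControlLoop` and `iterateSplit` are hypothetical names. Unlike a Worker created once in startOp, std::async may create a new thread on every call, so this version illustrates the pattern only and is not suitable for a hard-realtime loop.

```cpp
#include <cstddef>
#include <future>
#include <vector>

// Stand-in for one control loop; hypothetical, for illustration only.
struct ControlLoop {
  int iterations = 0;
  void iterate() { ++iterations; }
};

// Run the first half of the loops on a helper thread and the second half on
// the calling thread, then wait for the helper before returning.
inline void iterateSplit(std::vector<ControlLoop>& loops) {
  const std::size_t half = loops.size() / 2;

  // The helper only touches elements [0, half), the caller only touches
  // [half, size), so the two threads never access the same element.
  std::future<void> helper = std::async(std::launch::async, [&loops, half]() {
    for (std::size_t i = 0; i < half; ++i) {
      loops[i].iterate();
    }
  });

  for (std::size_t i = half; i < loops.size(); ++i) {
    loops[i].iterate();
  }

  // Important: wait for the helper, which holds a reference to `loops`.
  helper.get();
}
```

As in the Worker example, the join at the end is essential: the asynchronous part refers to data owned by the caller and must finish before the function returns.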
Last modified March 23, 2021: Restructured GRID (44d0658)