Last modification: 03-Apr-2019
Author: R. Koucha

isys: a way to make system() more efficient

Foreword

This article is extracted from a larger study of ways to optimize the C library's system() service.

1. Introduction

In embedded environments, hardware cost is an important consideration, so memory is often very limited. Memory and CPU time are critical resources that must be used carefully and efficiently, not only for response time and robustness but also to keep hardware costs down. Many applications need to call shell commands to trigger tasks that would be tedious to implement in a language like C. For this purpose, the C library provides the system() service, which takes the command line to run as its parameter:

int system(const char *command);

The "command" parameter may be a simple executable name or a more complex shell command line using output redirections and pipes.

system() hides a call to "/bin/sh -c" to run the command line passed as parameter.
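
As an illustration, the following sketch runs a command line through system() and reduces the returned wait status to a single integer. The helper name run_shell_command() and its -1/-2 return conventions are assumptions made for this example; they are not part of the C library.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: run a shell command line with system().
 * Returns the exit code of the command line (0..255),
 * -1 if system() itself failed (for example when fork() fails),
 * -2 if the command was terminated by a signal.
 */
int run_shell_command(const char *command)
{
    int status = system(command);

    if (status == -1) {
        perror("system");
        return -1;
    }
    if (WIFEXITED(status)) {
        /* Exit code of the last command of the command line */
        return WEXITSTATUS(status);
    }
    return -2;
}
```

The command line may contain pipes and redirections, for example run_shell_command("echo one two three | wc -w > /dev/null"). An exit code of 127 conventionally means that the shell could not find the command.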

From the Linux system's point of view, in the simplest case, system() triggers at least two fork()/exec() pairs: one for "sh -c" and another for the command line itself, as depicted in Figure 1.

Figure 1: system() internals
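
The first fork()/exec() pair can be sketched as follows. This is a deliberately simplified model, not the C library's implementation: the real system() also manages the SIGINT, SIGQUIT and SIGCHLD signals while the command runs. The name simple_system() is made up for this example.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Simplified model of system(): returns a wait status, or -1 on error */
int simple_system(const char *command)
{
    int status;
    pid_t pid = fork();   /* duplicates resources of the calling process */

    if (pid < 0) {
        return -1;        /* for example: not enough memory to fork */
    }
    if (pid == 0) {
        /* Child: replace the process image with the shell, which then
           runs the command line passed after "-c" */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);       /* only reached if the shell cannot be executed */
    }

    /* Parent: wait for the shell to terminate, retrying on signals */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            return -1;
        }
    }
    return status;
}
```

Every call pays for the fork() of the caller and the startup and termination of a shell, which is the cost the rest of this article tries to avoid.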

Moreover, fork() duplicates some resources (memory, file descriptors...) of the calling process (the parent) so that the forked process (the child) inherits them. If the calling process has a large memory footprint, or if overall memory usage is high, the system() call may fail for lack of free memory. Even though Linux has gained multiple enhancements such as Copy On Write (COW) to make fork() cheaper, this may lead to memory overconsumption that triggers Linux defense mechanisms such as the Out Of Memory (OOM) killer.

This paper addresses the problem of system() overuse with an alternative solution called isys, designed to enhance existing applications safely, that is to say with minimal impact on the existing source code and its behaviour.

2. Isys

Some applications call system() frequently, which means that "sh -c" is run very often. Moreover, the repeated startup and termination of shells by several concurrent applications consumes CPU time and memory. A possible solution is to start a shell once and keep it ready for any application that needs to run commands.

The idea is to start one or more background shells at application startup. The "-c" option, which runs one command line and then makes the shell exit, is not used. The shell must stay alive in the background for the lifetime of the application, even after a command completes. Each time the application needs to run a command, it submits it to the background shell. This saves the CPU time and memory needed to start and stop a shell for every command. Figure 2 depicts the principle.

Figure 2: Background shell

Without the "-c" option, the shell runs in interactive mode: it expects to be connected to a terminal. Linux provides the pseudo-terminal (PTY) concept for this kind of need. The PTY is set up between the application process (master side) and the background shell process (slave side). The shell believes that it is interacting with an operator through a real terminal, whereas the operator is actually the application process (cf. Figure 3).

Figure 3: Pseudo-terminal
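
The PTY setup can be sketched with the standard posix_openpt(), grantpt(), unlockpt() and ptsname() services. This is only an illustration of the mechanism under Linux, not the code used by PDIP; the function names spawn_shell_on_pty() and stop_shell_on_pty() are assumptions of this example.

```c
#define _XOPEN_SOURCE 600
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: start /bin/sh on the slave side of a new PTY.
 * Returns the pid of the shell and stores the master side in *master_fd,
 * or returns -1 on error.
 */
pid_t spawn_shell_on_pty(int *master_fd)
{
    char slave_name[128];
    const char *name;
    pid_t pid;
    int master = posix_openpt(O_RDWR | O_NOCTTY);

    if (master < 0) return -1;
    if (grantpt(master) < 0 || unlockpt(master) < 0 ||
        (name = ptsname(master)) == NULL ||
        strlen(name) >= sizeof(slave_name)) {
        close(master);
        return -1;
    }
    strcpy(slave_name, name);

    pid = fork();
    if (pid < 0) {
        close(master);
        return -1;
    }
    if (pid == 0) {
        int slave;

        close(master);
        setsid();   /* new session: on Linux, the slave opened next
                       becomes the controlling terminal */
        slave = open(slave_name, O_RDWR);
        if (slave < 0) _exit(127);
        dup2(slave, STDIN_FILENO);
        dup2(slave, STDOUT_FILENO);
        dup2(slave, STDERR_FILENO);
        if (slave > STDERR_FILENO) close(slave);
        execl("/bin/sh", "sh", (char *)NULL);   /* interactive: stdin is a terminal */
        _exit(127);
    }
    *master_fd = master;
    return pid;
}

/* Type "exit" on the terminal, reap the shell and close the master side.
   Returns the wait status of the shell, or -1 on error. */
int stop_shell_on_pty(pid_t pid, int master_fd)
{
    static const char exit_cmd[] = "exit\n";
    int status;

    if (write(master_fd, exit_cmd, sizeof(exit_cmd) - 1) < 0 ||
        waitpid(pid, &status, 0) < 0) {
        status = -1;
    }
    close(master_fd);
    return status;
}
```

Whatever the application writes on the master side is seen by the shell as keyboard input, and whatever the shell displays (prompt included) can be read back from the master side.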

As the shell is in interactive mode, it displays a prompt and waits for a command. It reads the command, executes it and displays a new prompt when the command completes. At first sight, the application process would need to do some tricky parsing of the shell output to tell the command's output apart from the prompt displayed at the end of the command. Moreover, the application must also get the result of the command (i.e. the exit status). To keep this simple, it is possible to use PDIP (Programmed Dialogs with Interactive Programs), an open source package that is fully documented with online manuals, HTML pages and examples. It is an expect-like tool, but much simpler to use than its ancestor, and it provides the ability to drive interactive programs. It comes in two flavors: a command named pdip, used to control interactive programs from a shell script, and a C language API, offered by a shared library called libpdip.so, to control interactive programs from a C/C++ program. The latter is the one of interest for the current solution.

In the source tree of the PDIP package, the isys sub-directory contains a variant of system() based on the above principle (cf. isys.c, built into a shared library called libisys.so). § 4 presents some details about this library. With libisys.so, the application process calls an API named isystem(), which behaves like system() but hides the PTY and the running background shell described above (cf. Figure 4). The name of this service, isys, stands for "Interactive SYStem()" because it relies on shells running in interactive mode.

Figure 4: Use of PDIP library
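
Since isystem() is meant to behave like system(), an existing application can be migrated with very few changes. One possible approach, sketched below, is to route every command through a single wrapper and select the implementation at build time. The run_command() name and the USE_ISYS macro are assumptions of this example; they are not defined by the ISYS package.

```c
#include <stdlib.h>

#ifdef USE_ISYS
#include <isys.h>   /* header installed by the ISYS package */
#endif

/*
 * Hypothetical wrapper: the rest of the application calls run_command()
 * and does not need to know which service runs the command.
 * Returns a wait status, as system() does.
 */
int run_command(const char *command)
{
#ifdef USE_ISYS
    /* isystem() takes a printf-like format: pass the command as data */
    return isystem("%s", command);
#else
    return system(command);
#endif
}
```

Without -DUSE_ISYS the wrapper falls back on system(). With -DUSE_ISYS, the program must also be linked with the libraries listed in the build command of § 4.4 (-lisys -lpdip -lpthread).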

The solution described in this chapter saves the fork()/exec() of "sh -c" by keeping at least one background shell running per application process. Depending on the application's behaviour, keeping a shell running may pay off, but it is wasteful from a memory point of view if the application's calls to isystem() are rare. This implementation can be enhanced to reduce the number of background shells by sharing them among all the running applications, as proposed by the rsys solution.

3. Performance

In this article, a small test program is used to compare the performance of system() and isystem():

system():

$ tests/system_it 2000 tests/scrip.sh
Running command 'tests/scrip.sh' 2000 times...
Elapsed time: 5 s - 612918826 ns

isystem():

$ tests/isystem_it 2000 tests/scrip.sh
Running command 'tests/scrip.sh' 2000 times...
Elapsed time: 4 s - 209876090 ns

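In this test, isystem() is faster than system(): the 2000 runs take about 4.21 s instead of 5.61 s, i.e. roughly 25% less elapsed time (about 2.1 ms per call instead of 2.8 ms). For applications that call system() heavily, it is therefore a good alternative.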
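
The sources of the test programs are not reproduced in this article. As a rough idea of how such a measurement can be made, the following sketch times a loop of calls with clock_gettime(). It is an assumption about the method, not the author's test code, and the name time_command_loop() is invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>   /* system(), a possible function under test */
#include <time.h>

/*
 * Hypothetical benchmark helper: run `command` `count` times through
 * `run` (any function with the signature of system()) and store the
 * elapsed wall-clock time in *elapsed.
 * Returns 0 on success, -1 if the clock or one of the calls fails.
 */
int time_command_loop(int (*run)(const char *), const char *command,
                      unsigned int count, struct timespec *elapsed)
{
    struct timespec start, end;
    unsigned int i;

    if (clock_gettime(CLOCK_MONOTONIC, &start) < 0) return -1;
    for (i = 0; i < count; i++) {
        if (run(command) == -1) return -1;
    }
    if (clock_gettime(CLOCK_MONOTONIC, &end) < 0) return -1;

    /* Subtract the two timestamps, borrowing one second if needed */
    elapsed->tv_sec = end.tv_sec - start.tv_sec;
    elapsed->tv_nsec = end.tv_nsec - start.tv_nsec;
    if (elapsed->tv_nsec < 0) {
        elapsed->tv_sec -= 1;
        elapsed->tv_nsec += 1000000000L;
    }
    return 0;
}
```

Passing system as the first argument measures the standard service. To measure isystem(), a one-argument adapter that calls isystem("%s", command) can be passed instead.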

4. Download, build and installation

4.1. Build from the sources

Unpack the source code package:

$ tar xvfz pdip-xxx.tgz

Go into the top level directory of the sources and trigger the build of the DEB packages:

$ cd pdip-xxx
$ ./pdip_install -P DEB

4.2. Installation from the packages

ISYS depends on PDIP, so PDIP must be installed before ISYS. Otherwise, the installation fails with the following error:

$ sudo dpkg -i isys_xxx_amd64.deb

Selecting previously unselected package isys.
(Reading database ... 218983 files and directories currently installed.)
Preparing to unpack isys_xxx_amd64.deb ...
Unpacking isys (xxx) ...
dpkg: dependency problems prevent configuration of isys:
isys depends on pdip (>= 2.0.4); however:
Package pdip is not installed.

dpkg: error processing package isys (--install):
dependency problems - leaving unconfigured
Errors were encountered while processing:
isys

Install the PDIP package first:

$ sudo dpkg -i pdip_xxx_amd64.deb
Selecting previously unselected package pdip.
(Reading database ... 218988 files and directories currently installed.)
Preparing to unpack pdip_xxx_amd64.deb ...
Unpacking pdip (xxx) ...
Setting up pdip (xxx) ...
Processing triggers for man-db (2.7.5-1)...

Then install the ISYS package:

$ sudo dpkg -i isys_xxx_amd64.deb

(Reading database ... 219040 files and directories currently installed.)
Preparing to unpack isys_xxx_amd64.deb ...
Unpacking isys (xxx) over (xxx) ...
Setting up isys (xxx)

Installing from the packages is the preferred way, as the software and everything it installed can then be removed cleanly with:

$ sudo dpkg -r isys
(Reading database ... 219043 files and directories currently installed.)
Removing isys (xxx)

To display the list of files installed by the package:

$ dpkg -L isys
/.
/usr
/usr/local
/usr/local/include
/usr/local/include/isys.h
/usr/local/lib
/usr/local/lib/libisys.so
/usr/local/share
/usr/local/share/man
/usr/local/share/man/man3
/usr/local/share/man/man3/isys.3.gz
/usr/local/share/man/man3/isystem.3.gz

4.3. Installation from cmake

It is also possible to trigger the installation from cmake:

$ tar xvfz pdip-xxx.tgz
$ cd pdip-xxx
$ cmake .
-- The C compiler identification is GNU 6.2.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Building PDIP version xxx
The user id is 1000

-- Configuring done
-- Generating done
-- Build files have been written to: ...

$ sudo make install
Scanning dependencies of target man
Building pdip_en.1.gz
Building pdip_fr.1.gz
Building pdip_configure.
-- Installing: /usr/local/lib/librsys.so
-- Installing: /usr/local/sbin/rsystemd
-- Set runtime path of "/usr/local/sbin/rsystemd" to ""

4.4. Manual

When the ISYS package is installed, online manuals are available in section 3 (API):

$ man 3 isystem

NAME

isys - Interactive system() service

SYNOPSIS

#include "isys.h"

int isystem(const char *fmt, ...);

int isys_lib_initialize(void);

DESCRIPTION

The ISYS API provides a system(3)-like service based on a remanent background shell to save memory and CPU time in applications where system(3) is heavily used.

isystem() executes the shell command line formatted with fmt. The behaviour of the format is compliant with printf(3). Internally, the command is run by a remanent shell created by the libisys.so library in a child of the current process.

isys_lib_initialize() is to be called in child processes using the ISYS API. By default, ISYS API is deactivated upon fork(2).

ENVIRONMENT VARIABLE

The ISYS_TIMEOUT environment variable specifies the maximum time in seconds to wait for data from the shell (by default, it is 10 seconds).

RETURN VALUE

isystem() returns the status of the executed command line (i.e. the last executed command). The returned value is a "wait status" that can be examined using the macros described in waitpid(2) (i.e. WIFEXITED(), WEXITSTATUS(), and so on).

isys_lib_initialize() returns 0 when there is no error or -1 upon error (errno is set).

MUTUAL EXCLUSION

The service does not support concurrent calls to isystem() by multiple threads. If this behaviour is needed, the application is responsible for managing the mutual exclusion on its side.

EXAMPLE

The following program receives a shell command as argument and executes it via a call to isystem().

#include <stdio.h>
#include <assert.h>
#include <stdlib.h>
#include <libgen.h>
#include <string.h>
#include <isys.h>

int main(int ac, char *av[])
{
  int status;
  int i;
  char *cmdline;
  size_t len;
  size_t offset;

  if (ac < 2)
  {
    fprintf(stderr, "Usage: %s cmd params...\n", basename(av[0]));
    return 1;
  }

  // Build the command line
  cmdline = (char *)0;
  len = 1; // Terminating NUL
  offset = 0;
  for (i = 1; i < ac; i ++)
  {
    len += strlen(av[i]) + 1; // word + space
    cmdline = (char *)realloc(cmdline, len);
    assert(cmdline);
    offset += sprintf(cmdline + offset, "%s ", av[i]);
  } // End for

  printf("Running '%s'...\n", cmdline);

  // isystem() takes a printf-like format: pass the command line as data
  status = isystem("%s", cmdline);
  if (status != 0)
  {
    printf("Error from program (0x%x)!\n", status);
    free(cmdline);
    return 1;
  } // End if

  free(cmdline);
  return 0;
} // main

Build the program:

$ gcc tisys.c -o tisys -lisys -lpdip -lpthread

Then, run something like the following:

$ ./tisys echo example
Running 'echo example '...
example

AUTHOR

Rachid Koucha

SEE ALSO

system(3).
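
The MUTUAL EXCLUSION section of the manual leaves the locking to the application. A minimal way to provide it, assuming POSIX threads, is to take a mutex around every call. In the sketch below, serialized_run() and command_lock are names invented for this example, and `run` stands for any function with the signature of system(), for instance a one-line adapter around isystem("%s", command).

```c
#include <pthread.h>
#include <stdlib.h>   /* system(), usable as a stand-in for `run` */

/* One lock shared by all the threads of the application */
static pthread_mutex_t command_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Hypothetical wrapper: only one thread at a time executes `run`.
 * Returns whatever `run` returns (a wait status for system()).
 */
int serialized_run(int (*run)(const char *), const char *command)
{
    int status;

    pthread_mutex_lock(&command_lock);
    status = run(command);
    pthread_mutex_unlock(&command_lock);
    return status;
}
```

The lock is held for the whole duration of the command, so a long-running command delays the other threads that want to run one. This is the price of sharing a single background shell inside the process.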

4.5. Build facilities

To help build systems locate the ISYS libraries and include files automatically, the ISYS package installs a configuration file named isys.pc for the pkg-config tool.
Moreover, for cmake-based packages, a FindIsys.cmake file is provided at the top level of the isys sub-tree to facilitate auto-configuration.

5. About the author

The author is a computer science engineer based in France. He can be contacted here.