How setting the TZ environment variable avoids thousands of system calls
The purpose of the vDSO system call method was to create a way for certain, very frequently used system calls (like
clock_gettime, etc) to avoid needing to actually enter the kernel and cause a context switch from user land to kernel land. The result of this method is that certain system calls, like those listed, can be used by programs at much, much lower cost.
What would happen if every time you called one of these fast vDSO system calls (like
time) you also called a normal system call like, say
stat, which does not pass through the vDSO?
If you did that, you’d essentially be negating some of the performance improvement you were meant to gain by the vDSO optimization in the first place; you’d be making a slow system call very often.
It turns out that this situation happens rather often with a pair of functions that are commonly used together:
- time: A vDSO-enabled system call used to obtain the number of seconds since the epoch, and
- localtime: A glibc-provided function which converts the output of time to a local time in the user’s timezone. This is not a system call, but internally localtime can make a system call in some cases.
The time vDSO-enabled system call and the
localtime function from glibc are often used together in applications either directly by the programmer, or at a lower level unbeknownst to the programmer for formatting dates and times for everything from log messages to SQL queries. This pair is commonly used in Ruby on Rails.
It turns out that the
localtime function in glibc will check if the
TZ environment variable is set. If it is not set (the two Ubuntus I’ve tested do not set it), then glibc will use the
stat system call every time
localtime is called.
In other words: your system supports calling the
time system call via the Linux kernel’s vDSO to avoid the cost of switching to the kernel. But, as soon as your program calls
time, it calls
localtime immediately after, which invokes a system call anyway. So, you’ve eliminated one system call with the vDSO, but replaced it with another.
Let’s see a sample program that shows this behavior, how to use
strace to detect this, and finally how to prevent it by setting the
TZ environment variable.
Sample program showing the issue
Let’s start by creating a simple test program that reproduces this issue:
You can compile this program by simply running
gcc -o test test.c. As you can see, this program simply calls
localtime in a loop 10 times.
Verifying this with strace
Every single call to the
localtime glibc function will generate a system call to
stat. Don’t believe me? Let’s use
strace to prove it using the program shown above:
In the strace output for this program we see a few things:
- The /etc/localtime file is opened.
- fstat is called twice, followed by two reads to the file to pull in the timezone data.
- Next, 9 calls to stat are made passing /etc/localtime over and over.
Notice that the
strace output does not show the call to
time. This is expected. System calls made via the vDSO do not appear in
strace output. To see them, you’d need to use a tool that traces library calls instead, such as ltrace.
What’s going on here is that the first call to
localtime in glibc opens and reads the contents of
/etc/localtime. All subsequent calls to
localtime internally call
stat, but they do this to ensure that the timezone file has not changed.
On many systems
/etc/localtime is a symlink to a timezone file. It is conceivable that a program might be running when the
/etc/localtime symlink is updated. If this were to happen, glibc would notice this when
localtime is called and re-read the file before doing any time conversions.
Preventing extraneous system calls
Many production systems use the UTC timezone and don’t ever need (or want) to change that. For this use case, there’s no reason to stat the
/etc/localtime file or symlink over and over and over when
localtime is called. The timezone is never going to change from UTC (or if it does, the application can just be restarted).
The easiest way to prevent these
stat calls is to set the
TZ environment variable. When the
TZ environment variable is set glibc will:
- Notice you’ve told it explicitly what timezone file to use for your program.
- Read the file and cache it internally.
- Never read that file path again, as long as the value of the TZ environment variable is left unchanged.
Let’s set the
TZ variable and check with strace:
As you can see, simply setting
TZ causes glibc to read
/etc/localtime a single time and never again. The same program runs with 9 fewer system calls all because of a single environment variable.
Effect on production systems
The effect of setting
TZ on a production system will depend mostly on how often
localtime is called. This is application specific and can vary with request load. All that said, if you aren’t changing your timezone often (or ever), it may be worth simply eliminating these unnecessary calls even if your system isn’t making many of them.
On my test system with a real life app (not the example shown above):
- Without TZ set, normal operation yields approximately 14,925 calls to stat over a 30 second period (or roughly 497 stats per second).
- With TZ set, the same 30 second period results in 8 calls to stat.
So, I’ve eliminated nearly 15,000 extra system calls every 30 seconds (and their associated context switches) without changing anything other than an environment variable. Pretty cool.
Using strace and questioning why patterns of system calls emerge from your infrastructure can help you better understand exactly what your systems are doing and why. You may even be able to remove unnecessary system calls and save some system resources, too.