Duplicate Environment Variables


A common problem is convincing programmers that duplicate environment variables are possible on unix; most programmers interact with the environment through an interface that gives the impression that environment variables are unique in their Platonic splendor.


    $ env FOO=bar perl -E 'say $ENV{FOO}'
    bar
    $ env FOO=bar cfu 'printf("%s\n", getenv("FOO"))'
    bar
    $ env FOO=bar perl -E '$ENV{FOO} = "baz"; say $ENV{FOO}'
    baz
    $ env FOO=bar cfu 'setenv("FOO", "baz",1);printf("%s\n", getenv("FOO"))'
    baz

See? FOO only has one value that changes when you update it. Therefore, environment variables are unique. Q.E.D.


Wrong!


Abstractions Are, Like, You Know, Everywhere


%ENV (a hash or associative array or dictionary, typical in those filthy scripting languages) and getenv(3) (the C library function interface) are abstractions built on top of something. What is that something? A computer! That models a PDP-11! No, closer to home. Various approaches work here, such as delving into the code for getenv, or reading documentation such as environ(7), which may mention something along the lines of


    NAME
         environ - user environment

    SYNOPSIS
         extern char **environ;

    DESCRIPTION
         An array of strings called the "environment" is made available
         by execve(2) when a process begins. By convention these strings
         have the form name=value.

That's from OpenBSD; other unixlikes may vary with the documentation. But the gist is that **environ is an array of strings, which if you know anything about C might look a lot like *argv[] or the equivalent **argv for the arguments given to a program.


    // argarg - print two args
    #include <stdio.h>
    int main(int argc, char **argv) {
        if (argc > 2) {
            printf("%s %s\n", argv[1], argv[2]);
        }
        return 0;
    }

Thus we can have duplicate entries in **argv, a point that few will dispute:


    $ make argarg
    cc -O2 -pipe    -o argarg argarg.c
    $ ./argarg foo foo
    foo foo

Given this, what do you think **environ might allow by way of duplicates? This takes a bit more work to set up, but luckily someone wrote a small program that helpfully creates duplicate environment variables, somewhere under the glorious mess that is


https://thrig.me/src/scripts.git


With dupenv, we can wrap env, here merely to report what environment variables are set, and see if two FOO exist.


    $ dupenv FOO=bar FOO=baz env | grep FOO
    FOO=bar
    FOO=baz

Nope, FOO is not Platonic. More like contingent arising... I digress.


    $ dupenv SHELL=/bin/sh SHELL=/bin/ed env | grep \^SHELL
    SHELL=/bin/ksh
    SHELL=/bin/sh
    SHELL=/bin/ed

Whose shell is it, anyways?


The duplication has been known and published since at least the 1990s, though it is not widely known, even among unix users. Various band-aids have been put in place, because if you pick the wrong environment variable or otherwise fail to clean up the list, you get security vulnerabilities that were there for something like 35 years,


https://www.sudo.ws/repos/sudo/rev/d4dfb05db5d7


whoops. A complicating factor is that different languages have put the band-aid on in different ways, or not at all: some will pick the first of any duplicated environment variables (C, Go, Perl, sudo, zsh, ...) while others will pick the last (bash, ksh, ...). Languages also vary as to whether they de-duplicate environment variables, whether a de-duplicated list or the original list is passed to child processes, etc.


    $ dupenv FOO=aaa FOO=ZZZ cfu 'printf("%s\n", getenv("FOO"))'
    aaa
    $ dupenv FOO=aaa FOO=ZZZ ksh -c 'echo $FOO'
    ZZZ
    $ dupenv FOO=aaa FOO=ZZZ zsh -c 'echo $FOO'
    aaa
    $ dupenv FOO=aaa FOO=ZZZ expect -c 'puts "$env(FOO) [exec sh -c {echo $FOO}]"'
    aaa ZZZ
    $ dupenv FOO=aaa FOO=ZZZ python3 -uc 'import os;print(os.environ["FOO"]);os.execvp("env",["env"])' | egrep 'aaa|ZZZ'
    aaa
    FOO=aaa
    FOO=ZZZ
    $ dupenv FOO=aaa FOO=ZZZ perl -E 'say $ENV{FOO};exec qw(sh -c), q{echo $FOO}'
    aaa
    aaa

Buyer beware?


What Am I Bewaring?


Good question! One may note that some of the above languages pass duplicate environment variables to programs they run--garbage in, garbage out--and that some tools use the last of the duplicate values instead of the first. This is wiggle room for an attacker, and perhaps enough wiggle room to embiggen the CVE list. What would happen if, say, hypothetically, you have some Python code that runs some bash scripts, and the bash scripts see completely different values for PATH or LD_PRELOAD or who knows what other environment variables? What could an attacker do with that difference? Could there be an information leak, or an escalation of privileges?


Recap


Are the programmers in error? At a certain level of abstraction, no. In C, using only the getenv and setenv interface, the environment will not appear to contain duplicates, and this will in most cases not cause a problem. In Go or Perl, where the environment list is de-duplicated, it is even more true that environment variables are unique, though Go does let one create a new []string with duplicates for the syscall.Exec call.


**environ is a List; higher abstraction layers attempt to convert this List into a Set, or at least paper the List over with a Set-like interface. The Set conversion can be done in several incompatible ways. If you talk to programmers who have only interacted with a Set interface, they will likely not believe in the List that their Set came from.


There are security ramifications.


tags #perl #c #go #unix

-- Page fetched on Wed May 22 01:05:08 2024