In C, why is the pointer returned by getenv automatically reclaimed?

I use the following code to test whether the pointer returned by getenv causes a memory leak if it is not freed.

#include <stdio.h>
#include <stdlib.h>

void demo() {
    const char *home = getenv("HOME");
    printf("%s\n", home);
}

int main() {
    demo();
    return 0;
}

I use Valgrind to detect memory leaks:

$ gcc main.c -o main
$ valgrind --tool=memcheck --leak-check=full --show-leak-kinds=all ./main

The result is as follows:

==134679== Memcheck, a memory error detector
==134679== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==134679== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==134679== Command: ./demo
==134679== 
/home/aszswaz
==134679== 
==134679== HEAP SUMMARY:
==134679==     in use at exit: 0 bytes in 0 blocks
==134679==   total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==134679== 
==134679== All heap blocks were freed -- no leaks are possible
==134679== 
==134679== For lists of detected and suppressed errors, rerun with: -s
==134679== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

It shows that one block of memory was allocated and one block was freed. Why is the pointer obtained from getenv reclaimed automatically?

If I add free((void *)home) to the code, will it affect the global environment variable?

2 answers

  • answered 2022-01-21 10:33 Example person

    https://man7.org/linux/man-pages/man3/getenv.3.html does not say anywhere that getenv()'s return value is allocated on the heap. Hence, think of it like argv.

    Although it is rare, and I have never seen it happen myself, some implementations of getenv() may allocate memory on the heap and return a pointer to it. Therefore, it's best to review your operating system's manual pages.

    Only memory allocated on the heap must be free()d.

    And a pointer to memory allocated on the heap is usually only returned when you call malloc(), calloc(), realloc(), strdup(), etc.

    Anyways:

    If you do call free() on the pointer returned by getenv(), it is undefined behavior if that pointer does not refer to memory allocated on the heap.

    And whoa, I almost didn't see this!:
    I think this is what is confusing you:

    usage: 1 allocs, 1 frees
    

    printf() may allocate memory on the heap and free it before returning.

    Edit:

    Using gdb, we can find that:

    #0  __GI___libc_malloc (1024) // and indeed it calls with 1024 bytes
    #1  __GI__IO_file_doallocate ()
    #2  __GI__IO_doallocbuf ()
    #4  _IO_new_file_xsputn ()
    #5  _IO_new_file_xsputn ()
    #6  __GI__IO_puts
    #7  demo ()
    #8  main ()
    

    printf() seems to be replaced by puts() (a compiler can make this substitution when the format string is just "%s\n"), and puts() calls malloc() through a long chain of function calls.

    But it doesn't seem to call free(). I think some other function frees the memory. I'm still researching this.

  • answered 2022-01-24 19:25 Luis Colorado

    The environment is initialized on the initial process stack, just above the parameters argc and argv to main().

    Normally, since the environment and the parameters live as long as the whole program, the common implementation consists of pushing the environment strings, the environment array, the main() parameter strings, and the command-line array onto the stack, just before pushing the three parameters of main(). Historically, main() had a third parameter, environ, and for legacy reasons this environment pointer is still passed to main().

    Just try the following program:

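    #include <stdio.h>
    
    /* The third parameter is a common extension on POSIX-like systems;
     * the C standard only requires the argc/argv form of main(). */
    int main(int argc, char **argv, char **environ)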
    {
        printf("argc = %d\n", argc);
        printf("args:");
        for (int i = 0; i < argc; i++) {
            printf(" [%s]", argv[i]);
        }
        printf("\n\nEnvironment:\n");
        for (char **p = environ; *p; p++) {
            printf("  %s\n", *p);
        }
        return 0;
    }
    

    What getenv() returns is neither statically allocated data nor data dynamically allocated on the heap. It is stored on the stack, just above main()'s parameters (almost all POSIX-like operating systems solve this problem the same way), at least as long as the program has not changed the environment with setenv() or putenv(). You can check exactly where the strings are located with a debugger, as I have done.

    So your problem is not related to getenv(), which doesn't use malloc() to return dynamically allocated strings to you. You'll need to look elsewhere for the memory Valgrind identifies as dynamic (and Valgrind doesn't report any problem with it).

    Keep in mind that Valgrind reports memory you have allocated with malloc() (or any of its friends) and not free()d, and the report says that at exit there is no memory still allocated. So getenv() doesn't allocate memory to give you the environment contents.

    If you read about how the kernel initializes a process's initial stack, you will find very interesting things (listed from the deepest point in the stack to the top of it):

    • There's some fixed machine code to properly return from kernel mode when there's a pending interrupt (interrupts are handled only when the kernel switches from kernel mode to user mode, because the handler code has to execute in user mode and never in kernel mode, for obvious reasons).
    • There's a legacy structure that saved the command line parameters and let the kernel access them so that the ps(1) command could work. This is no longer how it works, as putting kernel info in user space is a security hole, but the structure is still there for legacy code to use. The parameters for a command are now in kernel space, and can only be modified by a special system call.
    • There are all the strings associated with the received environment of the process.
    • The array of pointers to the environment strings is also stored in the stack. This includes a final NULL pointer to be able to identify the end of the array.
    • All the strings of the command line parameters are also stored there.
    • The array of pointers (including the last NULL pointer, which is not counted in argc) of the command line parameters.
    • The parameter environ with a pointer to the array of environment strings.
    • The parameter argv with a pointer to the array of command line parameters.
    • The parameter argc with the number of command line parameters (this includes the program name, but excludes the last NULL pointer)
