Is the memory address for string literals in different translation units the same?

Suppose we have the following snippet of code.

#include <iostream>

int main(){
    const char* p1="hello";
    const char* p2="hello";
    std::cout<<(p1==p2);
}

Output

1

As far as I know, p1 and p2 point to the same memory address (correct me if I am wrong).

Now, suppose we have these pointers defined in different translation units:

//A.cpp
#include <iostream>

const char* pA="hello";

int main(){
    //whatever.
}

//B.cpp
#include <iostream>

const char* pB="hello"; //Same string literal

int whatever(){
    //whatever.
}

My question is: will pA and pB still point to the same memory address, and if so, in which cases might the addresses differ (for example, because of particular keywords or compiler settings)?

1 answer

  • answered 2021-07-22 12:10 Adrian Mole

    As mentioned in the comments, the C++ Standard does not specify whether identical string literals that appear more than once are merged into a single object:

    5.13.5 String literals        [lex.string]


    16    Evaluating a string-literal results in a string literal object with static storage duration, initialized from the given characters as specified above. Whether all string literals are distinct (that is, are stored in nonoverlapping objects) and whether successive evaluations of a string-literal yield the same or a different object is unspecified.

    Compilers (or linkers) frequently offer a command-line switch that controls whether identical strings are merged. For example, the MSVC compiler has the "Enable string pooling" option: /GF to merge, or /GF- to keep them separate.

    Using the following code units:

    #include <iostream>
    extern void other();
    
    int main()
    {
        const char* inmain = "hello";
        std::cout << (void*)(inmain) << std::endl;
        other();
        return 0;
    }
    

    and (in a separate source file):

    #include <iostream>
    
    void other()
    {
        const char* inother = "hello";
        std::cout << (void*)(inother) << std::endl;
    
    }
    

    Building with the /GF switch yields output like the following (the strings have the same addresses):

    00007FF778952230
    00007FF778952230
    

    However, using /GF- produces:

    00007FF7D5662238
    00007FF7D5662230
    

    In fact, even your first code snippet (with minor modifications, shown below), where both literals are in the same translation unit (and even in the same scope), generates two different objects when built with the /GF- option:

    #include <iostream>
    #include <ios>
    
    int main()
    {
        const char* p1 = "hello";
        const char* p2 = "hello";
        std::cout << std::boolalpha << (p1 == p2) << std::endl;
        // Output: "false" using /GF- or "true" using /GF
        return 0;
    }
    
