a large enough problemSize in C++ takes 0 run time

this piece of code used to demonstrate algorithm complexity and Measuring the Run Time of an Algorithm comes from a book FUNDAMENTALS OF PYTHON: FROM FIRST PROGRAMS THROUGH DATA STRUCTURES

"""
File: timing1.py
Prints the running times for problem sizes that double,
using a single loop.
"""

import time

problemSize = 10000000
print "%12s%16s" % ("Problem Size", "Seconds")
for count in xrange(5):

    start = time.time()
    # The start of the algorithm
    work = 1
    for x in xrange(problemSize):
        work += 1
        work -= 1
    # The end of the algorithm
    elapsed = time.time() - start

    print "%12d%16.3f" % (problemSize, elapsed)
    problemSize *= 2

this piece of code works well and I am trying to do the similar trial with C++. here is the code (snippet_2)

#include <iostream>
#include <chrono>

using namespace std;
using namespace std::chrono;

void functiona()
{
    long long number = 0;
    long long problemSize = 100000000000;

    for( long long i = 0; i < problemSize; ++i )
    {
       for(long long j = 0; j < problemSize; j++)
       {
           for(long long k = 0; k < problemSize; k++)
           {
               for(long long l = 0; l < problemSize; l++)
               {
                   for(long long l = 0; l < problemSize; l++)
                   {
                       number++;
                       number--;
                   }
               }
            }
        }
    }
}

int main()
{
    high_resolution_clock::time_point t1 = high_resolution_clock::now();
    functiona();
    high_resolution_clock::time_point t2 = high_resolution_clock::now();

    auto duration = duration_cast<microseconds>( t2 - t1 ).count();

    cout << duration;
    return 0;
}

I guess the problemSize is large enough though, snippet_2 outputs 0.

what am I missing?

4 answers

  • answered 2019-06-11 23:13 Deepstop

    This reminds me of a benchmark I ran for a potential customer. On the competitor's system it took about 60 seconds. On our system it took a small fraction of a second.

    The difference was not that our CPU was so much faster, but that the compiler's optimizer realized it could take a shortcut because the calculation could be easily optimized without the loop.

    In your case you a incrementing and decrementing the variable number, but the compiler probably knows that number won't change as a result, and so optimizes the loops out of the code.

  • answered 2019-06-11 23:28 Jay

    a little bit modification to your code so that the compiler won't optimize your benchmark out.

    void function()
    {
        long long number = 0;
        long long problemSize = 100;
        long long res = 0;
    
        for( long long i = 0; i < problemSize; ++i )
        {
            number++;
            res += number;
            number--;
        }
        cout << res << endl;
    }
    

  • answered 2019-06-12 00:16 Kosta

    The function is getting optimized away. You can easily fix that by forcing the compiler to assume there are side effects every time a variable is read or written to. This can be done by declaring the variable volatile:

    /* long long number = 0; --> */ volatile long long number = 0;
    

  • answered 2019-06-12 00:58 Lightness Races in Orbit

    This is what happens when people forget what C++ is. Your source code is not a sequence of instructions for the computer chip to execute one at a time. It is a description of a program. It is an abstraction.

    People like to talk about optimisation this and release mode that. People like to treat the optimiser as if it were an afterthought, tacked on at the end of your build process to what would otherwise be a line-by-line match to the statements you typed out. It's not. It's a fundamental part of what it means to take an abstract program and create something real from it (read: the process of compilation). Optimisation levels would be better named "effort levels": how much effort the compiler goes to by venturing further and further from your words in search of a real program that'll execute nice and quickly on a computer chip.

    Your code, albeit in a very round-about way, describes a program that achieves nothing. So you should very well expect a decent compiler to produce an executable that does no meaningful work.

    Sure, you can trick it with side-effects and blah blah blah and other stuff that you didn't actually need to measure in the first place, but that just misses the point. Instead, benchmark actual, useful work. Anything else is just wasting your own time!

    If you need a delay in your program, use std::this_thread::sleep_for.

    It is different in Python because that is an interpreted scripting language, which does not [typically] go through any translation before being executed. That's why the book you're reading talks about Python, not C++. It is usually folly to take concepts from Python and try to apply them to C++.