Topic : Optimizing Your Code
Author : McMillan


int constructor, assignment_op, copy, destr; //global counters
class C
{
public:
  C();
  C& operator = (const C&);
  C(const C&);
  ~C();
};
C::C()
{
  ++constructor;
}
C& C::operator = (const C& other)
{
  ++assignment_op;
  return *this;
}
C::C(const C& other)
{
  ++copy;
}
C::~C()
{
  ++destr;
}


As in the previous example, two different versions of the same function are compared; the first uses assignment and the second uses object initialization:

void assign(const C& c1)
{
  C c2;
  c2 = c1;
}
void initialize(const C& c1)
{
  C c2 = c1;
}


Calling assign() causes three member function invocations: one for the constructor, one for the assignment operator, and one for the destructor. initialize() causes only two member function invocations: the copy constructor and the destructor. Initialization saves one function call. For a nonsensical class such as C, the additional runtime penalty that results from a superfluous constructor call might not be crucial. However, bear in mind that constructors of real-world objects also invoke constructors of their base classes and embedded objects. When there is a choice between initialization and assignment, therefore, initialization is always preferable.
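
The call counts quoted above can be checked with a small sketch. It mirrors class C under the hypothetical name Counted; the Counts struct and the measure() helper are not part of the original listing and are introduced only for illustration.

```cpp
#include <cassert>

// Hypothetical tally of special member function calls (illustration only).
struct Counts {
    int ctor = 0, copy = 0, assign = 0, dtor = 0;
    int total() const { return ctor + copy + assign + dtor; }
};

Counts g_counts; // global tally, reset before each measurement

// Same shape as class C in the text, under a different name.
class Counted {
public:
    Counted() { ++g_counts.ctor; }
    Counted(const Counted&) { ++g_counts.copy; }
    Counted& operator=(const Counted&) { ++g_counts.assign; return *this; }
    ~Counted() { ++g_counts.dtor; }
};

// Default construction followed by assignment.
void assign(const Counted& c1) { Counted c2; c2 = c1; }

// Direct initialization through the copy constructor.
void initialize(const Counted& c1) { Counted c2 = c1; }

// Calls f once and returns only the member function calls made inside f.
Counts measure(void (*f)(const Counted&)) {
    assert(f != nullptr);
    Counted arg;          // built before the tally is reset
    g_counts = Counts{};  // discard the construction of arg
    f(arg);
    return g_counts;      // copied out before arg is destroyed
}
```

Under these assumptions, measure(assign).total() yields 3 and measure(initialize).total() yields 2, which matches the counts given in the text.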

Relocating Declarations
Preferring initialization of objects over assignment is one aspect of localizing declarations. On some occasions, the performance boost that can result from moving declarations is even more appreciable. Consider the following example:

bool is_C_Needed();
void use()
{
  C c1;
  if (is_C_Needed() == false)
  {
    return; //c1 was not needed
  }  
  //use c1 here
  return;
}


The local object c1 is unconditionally constructed and destroyed in use(), even if it is not used at all. The compiler transforms the body of use() into something that looks like this:

void use()
{
  C c1;
  c1.C::C(); //1. compiler-added constructor call
  if (is_C_Needed() == false)
  {
    c1.C::~C(); //2. compiler-added destructor call
    return; //c1 was not needed but was constructed and destroyed still
  }  
  //use c1 here
  c1.C::~C(); //3. compiler-added destructor call
  return;
}


As you can see, when is_C_Needed() returns false, c1 is still constructed and destroyed needlessly. Can a clever compiler optimize away the unnecessary construction and destruction in this case? The Standard allows the compiler to suppress the creation (and consequently, the destruction) of an object if it is not needed, and if neither its constructor nor its destructor has any side effects. In this example, however, the compiler cannot perform this feat for two reasons. First, both the constructor and the destructor of c1 have side effects -- they increment counters. Second, the result of is_C_Needed() is unknown at compile time; therefore, there is no guarantee that c1 is actually unnecessary at runtime. Nevertheless, with a little help from the programmer, the unnecessary construction and destruction can be eliminated. All that is required is the relocation of the declaration of c1 to the point where it is actually used:

void use()
{
  if (is_C_Needed() == false)
  {
    return; //c1 was not needed
  }  
  C c1; //moved from the block's beginning
  //use c1 here
  return;
}

Consequently, the object c1 is constructed only when it is really needed -- that is, when is_C_Needed() returns true. On the other hand, if is_C_Needed() returns false, c1 is neither constructed nor destroyed. Thus, simply by moving the declaration of c1, you managed to eliminate two unnecessary member function calls! How does it work? The compiler transforms the body of use() into something such as the following:

void use()
{
  if (is_C_Needed() == false)
  {
    return; //c1 was not needed
  }  
  C c1; //moved from the block's beginning
  c1.C::C(); //1 compiler-added constructor call
  //use c1 here
  c1.C::~C(); //2 compiler-added destructor call
  return;
}
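
The effect of relocating the declaration can also be observed by counting calls directly. The following is a minimal sketch, not the author's test code: the names Tracked, use_early, use_late, and wasted_calls are hypothetical, and the "is it needed" decision is passed in as a parameter so that the result is deterministic.

```cpp
#include <cassert>

int g_constructed = 0; // hypothetical global counters
int g_destroyed = 0;

struct Tracked {
    Tracked() { ++g_constructed; }
    ~Tracked() { ++g_destroyed; }
};

// Declaration at the top of the block: the object is built on every call.
void use_early(bool needed) {
    Tracked t;
    if (!needed) return; // t was constructed and is destroyed here anyway
    // use t here
}

// Declaration moved to the point of use: the object is built only when needed.
void use_late(bool needed) {
    if (!needed) return; // nothing has been constructed yet
    Tracked t;
    // use t here
}

// Returns the constructions plus destructions caused by n calls
// in which the object is not needed.
int wasted_calls(void (*use)(bool), int n) {
    assert(use != nullptr && n >= 0);
    g_constructed = g_destroyed = 0;
    for (int i = 0; i < n; ++i) use(false);
    return g_constructed + g_destroyed;
}
```

With these definitions, wasted_calls(use_early, 1000) reports 2000 superfluous member function calls, while wasted_calls(use_late, 1000) reports none.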


To appreciate the effect of this optimization, change the body of use(). Instead of constructing a single object, you now use an array of 1000 C objects:

void use()
{
  if (is_C_Needed() == false)
  {
    return; //c1 was not needed
  }  
  C c1[1000];
  //use c1 here
  return;
}


In addition, you define is_C_Needed() to return false:

bool is_C_Needed()
{
  return false;
}


Finally, the main() driver looks similar to the following:

int main()
{
  for (int j = 0; j<100000; j++)
    use();
  return 0;  
}
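
The text does not say how the loop was timed. One possible way to time such a loop, assuming a C++11 compiler, is a small helper built on std::chrono::steady_clock. The name time_loop is hypothetical, and the measured figures depend heavily on the machine, the compiler, and the optimization level.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: calls work() the given number of times and
// returns the elapsed wall-clock time in seconds.
template <typename F>
double time_loop(F work, long iterations) {
    assert(iterations >= 0);
    using Clock = std::chrono::steady_clock; // monotonic, suited to intervals
    const auto start = Clock::now();
    for (long j = 0; j < iterations; ++j) {
        work();
    }
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    return elapsed.count();
}
```

A call such as time_loop(use, 100000) would time the loop shown in main(). Repeating the measurement several times and averaging, as the text does, reduces the influence of unrelated system activity.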

The two versions of use() differ dramatically in their performance. They were compared on a Pentium II, 233MHz machine, and the test was repeated five times to corroborate the results. With the optimized version, the for loop in main() took less than 0.02 seconds, on average. With the original, nonoptimized version of use(), the same loop took 16 seconds. The dramatic variation in these results isn't too surprising; after all, the nonoptimized version incurs 100,000,000 constructor calls (100,000 iterations times 1000 array elements) as well as 100,000,000 destructor calls, whereas the optimized version incurs none. These results might also hint at the performance gain that can be achieved simply by preallocating sufficient storage for container objects, rather than allowing them to reallocate repeatedly (see also Chapter 10, "STL and Generic Programming").
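
The remark about preallocating container storage can be illustrated with std::vector::reserve(). This sketch counts how often the capacity changes while elements are appended; the function name count_reallocations is hypothetical, and the exact number of reallocations without reserve() depends on the library's growth policy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many times the vector's capacity changes while
// n elements are appended, with or without preallocation.
std::size_t count_reallocations(std::size_t n, bool preallocate) {
    std::vector<int> v;
    if (preallocate) {
        v.reserve(n); // one allocation up front
        assert(v.capacity() >= n);
    }
    std::size_t reallocations = 0;
    std::size_t capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != capacity) { // storage was reallocated
            ++reallocations;
            capacity = v.capacity();
        }
    }
    return reallocations;
}
```

With preallocation the count is zero. Without it, every reallocation copies or moves the existing elements into new storage, which for a class such as C means additional constructor and destructor calls.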

Member-Initialization Lists
As you read in Chapter 4, "Special Member Functions: Default Constructor, Copy Constructor, Destructor, and Assignment Operator," a member initialization list is needed for the initialization of const and reference data members, and for passing arguments to a constructor of a base or embedded subobject. Otherwise, data members can either be assigned inside the constructor body or initialized in a member initialization list. For example:

class Date //mem-initialization version
{
private:
  int day;
  int month;
  int year;
  //constructor and destructor
public:
  Date(int d = 0, int m = 0, int y = 0) : day(d), month(m), year(y) {}
};


Alternatively, you can define the constructor as follows:

Date::Date(int d, int m, int y) //assignment within the constructor body
{
  day   = d;
  month = m;
  year  = y;
}


Is there a difference in terms of performance between the two constructors? Not in this example. All the data members in Date are of a fundamental type. Therefore, initializing them by a mem-initialization list is identical in terms of performance to assignment within the constructor body. However, with user-defined types, the difference between the two forms is significant. To demonstrate that, return to the member function counting class, C, and define another class that contains two instances thereof:

class Person
{
private:
  C c_1;
  C c_2;
public:
  Person(const C& c1, const C& c2 ): c_1(c1), c_2(c2) {}
};


An alternative version of Person's constructor looks similar to the following:

Person::Person(const C& c1, const C& c2)
{
  c_1 = c1;
  c_2 = c2;
}


Finally, the main() driver is defined as follows:

int main()
{
  C c; //created only once, used as dummy arguments in Person's constructor
  for (int j = 0; j<30000000; j++)
  {
    Person p(c, c);
  }
  return 0;  
}

The two versions were compared on a Pentium II, 233MHz machine, and the test was repeated five times to corroborate the results. When a member initialization list was used, the for loop in main() took 12 seconds, on average. The nonoptimized version took 15 seconds, on average. In other words, assignment inside the constructor body is about 25% slower than the member-initialized constructor. The member function counters can give you a clue as to the reasons for the difference. Table 12.1 presents the number of member function calls of class C for the member-initialized constructor and for the assignment inside the constructor's body.

Table 12.1 Comparison Between Member Initialization and Assignment Within the Constructor's Body for Class Person

Initialization Method          Default Constructor Calls  Assignment Operator Calls  Copy Constructor Calls  Destructor Calls
Member initialization list     0                          0                          60,000,000              60,000,000
Assignment within constructor  60,000,000                 60,000,000                 0                       60,000,000
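
The proportions in Table 12.1 can be reproduced at a much smaller scale. The following sketch is an illustration under stated assumptions rather than the author's test program: Member stands in for class C, PersonInit and PersonAssign are hypothetical names for the two constructor styles, and run() is a hypothetical driver.

```cpp
#include <cassert>

// Hypothetical tally of Member's special member function calls.
struct Tally { long ctor = 0, assign = 0, copy = 0, dtor = 0; };
Tally g_tally;

class Member { // stands in for class C
public:
    Member() { ++g_tally.ctor; }
    Member(const Member&) { ++g_tally.copy; }
    Member& operator=(const Member&) { ++g_tally.assign; return *this; }
    ~Member() { ++g_tally.dtor; }
};

class PersonInit { // member initialization list
    Member m1, m2;
public:
    PersonInit(const Member& a, const Member& b) : m1(a), m2(b) {}
};

class PersonAssign { // assignment within the constructor body
    Member m1, m2;
public:
    PersonAssign(const Member& a, const Member& b) { m1 = a; m2 = b; }
};

// Builds and destroys `iterations` objects of type P and returns the
// calls made on their embedded members only.
template <typename P>
Tally run(long iterations) {
    assert(iterations >= 0);
    Member arg;         // dummy argument, created once
    g_tally = Tally{};  // discard the construction of arg
    for (long j = 0; j < iterations; ++j) {
        P p(arg, arg);
    }
    return g_tally;     // copied out before arg is destroyed
}
```

For five iterations, run<PersonInit>(5) records 10 copy constructor calls and 10 destructor calls, whereas run<PersonAssign>(5) records 10 default constructor calls, 10 assignment operator calls, and 10 destructor calls. These are the same proportions as in the table.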


When a member initialization list is used, only the copy constructor and the destructor of the embedded object are called (note that Person has two embedded members), whereas the assignment within the constructor body also adds a default constructor call per embedded object. In Chapter 4, you learned how the compiler inserts additional code into the constructor's body before any user-written code. The additional code invokes the constructors of the base classes and embedded objects of the class. In the case of polymorphic classes, this code also initializes the vptr. The assigning constructor of class Person is transformed into something such as the following:

Person::Person(const C& c1, const C& c2) //assignment within constructor body
{
//pseudo C++ code inserted by the compiler before user-written code
  c_1.C::C(); //invoke default constructor of embedded object c_1
  c_2.C::C(); //invoke default constructor of embedded object c_2
//user-written code comes here:
  c_1 = c1;
  c_2 = c2;
}


The default construction of the embedded objects is unnecessary because they are reassigned new values immediately
