Topic : Optimizing Your Code
Author : McMillan


  PersonalDetails(const char *nm);       //key of type char * used
  PersonalDetails(long id) : ID(id) {}   //numeric key used
};


Memory is wasted here because only one of the keys can be used at a time. An anonymous union can be used in this case to minimize memory usage. For example:

class PersonalDetails
{
private:
  union  //anonymous
  {
    char * name;
    long ID;
  };
public:
  PersonalDetails(const char *nm);
  PersonalDetails(long id) : ID(id) {}  // direct access to a union member
  //...
};


By using a union, the size of class PersonalDetails is halved. Again, saving four bytes of memory is not worth the trouble unless this class serves as a mold for millions of database records or if the records are transmitted on slow communication lines. Note that unions do not incur any runtime overhead, so there is no speed tradeoff in this case. The advantage of an anonymous union over a named one is that its members can be accessed directly.
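The saving can be checked with sizeof. The following sketch (class names are illustrative, and the exact sizes are implementation-dependent) contrasts the two layouts:

```cpp
#include <cstddef>

// Hypothetical illustration; member names match the example above.
class TwoKeys            // both keys stored side by side
{
public:
  char * name;
  long ID;
};

class OneKey             // anonymous union: members share one slot
{
public:
  union
  {
    char * name;
    long ID;
  };
};

// On a typical platform, sizeof(TwoKeys) is the sum of its members
// (plus any padding), whereas sizeof(OneKey) is only the size of the
// larger member.
```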

Speed Optimizations
In time-critical applications, every CPU cycle counts. This section presents a few simple guidelines for speed optimization. Some of them have been around since the early days of C; others are C++ specific.

Using a Class To Pack a Long Argument List
The overhead of a function call is increased when the function has a long list of arguments. The runtime system has to initialize the stack with the values of the arguments; naturally, this operation takes longer when there are more arguments. For example, executing the following function 100,000,000 times takes 8.5 seconds on average on my machine:

void retrieve(const string& title, //5 arguments
              const string& author,
              int ISBN,  
              int year,
              bool&  inStore)
{}


Packing the argument list into a single class and passing it by reference as the only argument reduces the result to five seconds, on average. Of course, for functions that take a long time to execute, the stack initialization overhead is negligible. However, for short and fast functions that are called very often, packing a long parameter list within a single object and passing it by reference can improve performance.
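A sketch of the packed version (the struct and its member names are illustrative, not from the original measurement):

```cpp
#include <string>

// The five parameters packed into a single object.
struct BookQuery
{
  std::string title;
  std::string author;
  int ISBN;
  int year;
  bool inStore;
};

// Only one reference is pushed onto the stack instead of five arguments.
void retrieve(BookQuery& query)
{
  query.inStore = true;  // placeholder body, as in the timed example
}
```

A caller fills one BookQuery object and passes it by reference; the per-call stack setup no longer grows with the number of logical parameters.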

Register Variables
The storage specifier register can be used as a hint to the compiler that an object will be heavily used in the program. For example

void f()
{
  int *p = new int[3000000];
  register int *p2 = p; //store the address in a register
  for (register int j = 0; j<3000000; j++)
  {
    *p2++ = 0;
  }
  //...use p
  delete [] p;
}


Loop counters are good candidates for being declared as register variables. When they are not stored in a register, a substantial amount of the loop's execution time is wasted in fetching the variable from memory, assigning a new value to it, and storing it back in memory repeatedly. Storing it in a machine register reduces this overhead. Note, however, that register is only a recommendation to the compiler. As with function inlining, the compiler can refuse to store the object in a machine register. Furthermore, modern compilers optimize loop counters and move them to the machine's registers anyway. The register storage specification is not confined to fundamental types. Rather, it can be used for any type of object. If the object is too large to fit into a register, the compiler can still store the object in a faster memory region, such as the cache memory (cache memory is about ten times faster than the main memory).




NOTE: Some compilers ignore the register specification altogether and automatically store the program's variables according to a set of built-in optimization rules. Please consult your vendor's specifications for more details on the compiler's handling of register declarations.



Declaring function parameters with the register storage specifier is a recommendation to pass the arguments on the machine's registers rather than passing them on the stack. For example

void f(register int j, register Date d);

Declaring Constant Objects as const
In addition to the other boons of declaring constant objects as const, an optimizing compiler can take advantage of this declaration, too, and store such an object in a machine register instead of in ordinary memory. Note that the same optimization can be applied to function parameters that are declared const. On the other hand, the volatile qualifier disables such an optimization (see Appendix A, "Manual of Programming Style"), so use it only when it is unavoidable.
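For instance, a compiler is free to keep a genuine compile-time constant in a register, or to substitute its value directly into the generated code, without ever fetching it from memory (a minimal sketch; the identifiers are illustrative):

```cpp
// A true compile-time constant: the compiler may keep it in a register
// or fold its value into the code, never touching ordinary memory.
const int upperLimit = 1024;

long sumBelowLimit()
{
  long sum = 0;
  for (int i = 0; i < upperLimit; i++)  // upperLimit need not be re-fetched
    sum += i;
  return sum;
}
```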

Runtime Overhead of Virtual Functions
When a virtual function is called through a pointer or a reference of an object, the call doesn't necessarily impose additional runtime penalties. If the compiler can resolve the call statically, no extra overhead is incurred. Furthermore, a very short virtual function can be inlined in this case. In the following example, a clever compiler can resolve the calls of the virtual member functions statically:

#include <iostream>
using namespace std;
class V
{
public:  
  virtual void show() const { cout<<"I'm V"<<endl; }
};
class W : public V
{
public:
  void show() const { cout<<"I'm W"<<endl; }
};
void f(V & v, V *pV)
{
  v.show();  
  pV->show();  
}
void g()
{
  V v;
  f(v, &v);
}
int main()
{
  g();
  return 0;
}


If the entire program appears in a single translation unit, the compiler can perform an inline substitution of the call of the function g() in main(). The invocation of f() within g() can also be inlined, and because the dynamic type of the arguments that are passed to f() is known at compile time, the compiler can resolve the virtual function calls inside f() statically. There is no guarantee that every compiler actually inlines all the function calls; however, some compilers certainly take advantage of the fact that the dynamic type of the arguments of f() can be determined at compile time, and avoid the overhead of dynamic binding in this case.

Function Objects Versus Function Pointers
The benefits of using function objects instead of function pointers (function objects are discussed in Chapter 10 and in Chapter 3, "Operator Overloading") are not limited to genericity and easier maintenance. Furthermore, compilers can inline the call of a function object, thereby enhancing performance even further (inlining a function pointer call is rarely possible).
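A minimal sketch contrasting the two (the identifiers are illustrative): with a function object, the type itself, rather than a runtime address, selects the code to run, so the compiler can inline the call to operator().

```cpp
// Function pointer: the target is a runtime value, so the call
// usually cannot be inlined.
int doubler(int n) { return 2 * n; }

// Function object: the call is resolved by type at compile time,
// making operator() a natural inlining candidate.
struct Doubler
{
  int operator()(int n) const { return 2 * n; }
};

// A generic caller; instantiated once per callable type.
template <typename Callable>
int apply(Callable op, int n) { return op(n); }

// apply(doubler, 21) and apply(Doubler(), 21) compute the same result,
// but only the functor version gives the compiler a concrete type
// to inline through.
```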

A Last Resort
The optimization techniques that have been presented thus far do not dictate design compromises or less readable code. In fact, some of them improve the software's robustness and the ease of maintenance. Packing a long argument list within a class object, const declarations, and using function objects rather than function pointers provide additional benefits on top of the performance boost. Under strict time and memory constraints, however, these techniques might not suffice; additional tweaks are sometimes required, which affect the portability and extensibility of the software. The techniques that are presented in this section are to be used only as a last resort, and only after all the other optimizations have been applied.

Disabling RTTI and Exception Handling Support
When you port pure C code to a C++ compiler, you might discover a slight performance degradation. This is not a fault in the programming language or the compiler, but a matter of compiler tuning. All you have to do to gain the same (or better) performance that you might get from a C compiler is switch off the compiler's RTTI and exception handling support. Why is this? In order to support RTTI or exception handling, a C++ compiler inserts additional "scaffolding" code to the original source file. This increases the executable size a little, and imposes slight runtime overhead (the overhead of exception handling and RTTI are discussed in Chapter 6, "Exception Handling," and Chapter 7, "Runtime Type Identification", respectively). When pure C is used, this additional code is unnecessary. Please note, however, that you should not attempt to apply this tweak with C++ code or C code that uses any C++ constructs such as operator new and virtual functions.
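As an illustration, GCC and Clang expose switches for exactly this (flag names vary by vendor, so consult your compiler's documentation; the file name here is a placeholder):

```shell
# GCC/Clang: compile C-style code without RTTI and exception scaffolding
g++ -fno-rtti -fno-exceptions -O2 -c legacy_code.cpp
```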

Inline Assembly
Time-critical sections of C++ code can be rewritten in native assembly code. The result can be a significant increase in speed. Note, however, that this measure is not to be taken lightly because it makes future modifications much more difficult. Programmers who maintain the code might not be familiar with the particular assembly language that is used, or they might have no prior experience in assembly language at all. Furthermore, porting the software to other platforms requires rewriting of the assembly code parts (in some instances, upgrading the processor can also necessitate rewriting). In addition, developing and testing assembly code is an arduous task that can take much more time than developing and testing code that is written in a high-level language.

Generally,
