ATL Under the Hood Part 1

In this series of tutorials I am going to discuss some of the inner workings
of ATL and the techniques that ATL uses. Let’s start with the memory layout of
a program. Let’s write a simple class that has no data members and take a look
at its memory structure.


Program 1.


#include <iostream>
using namespace std;

class Class {
};

int main() {
Class objClass;
cout << "Size of object is = " << sizeof(objClass) << endl;
cout << "Address of object is = " << &objClass << endl;
return 0;
}

The output of this program is

Size of object is = 1
Address of object is = 0012FF7C

If we add data members, the size of the class becomes the sum of the storage
of the individual member variables, plus any padding the compiler inserts for
alignment. The same is true for templates. Take a look at a template class for
a point.

Program 2.


#include <iostream>
using namespace std;

template <typename T>
class CPoint {
public:
T m_x;
T m_y;
};

int main() {
CPoint<int> objPoint;
cout << "Size of object is = " << sizeof(objPoint) << endl;
cout << "Address of object is = " << &objPoint << endl;
return 0;
}

Now the output of the program is

Size of object is = 8
Address of object is = 0012FF78

Now add inheritance to the program. We are going to derive class CPoint3D from
class CPoint and look at the memory structure of this program.

Program 3.


#include <iostream>
using namespace std;

template <typename T>
class CPoint {
public:
T m_x;
T m_y;
};

template <typename T>
class CPoint3D : public CPoint<T> {
public:
T m_z;
};

int main() {
CPoint<int> objPoint;
cout << "Size of object Point is = "
<< sizeof(objPoint) << endl;
cout << "Address of object Point is = "
<< &objPoint << endl;

// CPoint3D is a template, so the type argument is required
CPoint3D<int> objPoint3D;
cout << "Size of object Point3D is = "
<< sizeof(objPoint3D) << endl;
cout << "Address of object Point3D is = "
<< &objPoint3D << endl;

return 0;
}

The output of this program is

Size of object Point is = 8
Address of object Point is = 0012FF78
Size of object Point3D is = 12
Address of object Point3D is = 0012FF6C

This program shows the memory structure of the derived class. The memory
occupied by the object is the sum of its own data members plus the data
members of its base class.

Things become interesting when virtual functions join the party. Take a look
at the following program.

Program 4.


#include <iostream>
using namespace std;

class Class {
public:
virtual void fun() { cout << "Class::fun" << endl; }
};

int main() {
Class objClass;
cout << "Size of Class = " << sizeof(objClass) << endl;
cout << "Address of Class = " << &objClass << endl;
return 0;
}

The output of the program is


Size of Class = 4
Address of Class = 0012FF7C

The situation becomes more interesting when we add more than one virtual
function.

Program 5.


#include <iostream>
using namespace std;

class Class {
public:
virtual void fun1() { cout << "Class::fun1" << endl; }
virtual void fun2() { cout << "Class::fun2" << endl; }
virtual void fun3() { cout << "Class::fun3" << endl; }
};

int main() {
Class objClass;
cout << "Size of Class = " << sizeof(objClass) << endl;
cout << "Address of Class = " << &objClass << endl;
return 0;
}

The output of this program is the same as that of the previous program. Let’s
do one more experiment to understand it better.

Program 6.


#include <iostream>
using namespace std;

class CPoint {
public:
int m_ix;
int m_iy;
virtual ~CPoint() { };
};

int main() {
CPoint objPoint;
cout << "Size of Class = " << sizeof(objPoint) << endl;
cout << "Address of Class = " << &objPoint << endl;
return 0;
}

The output of the program is

Size of Class = 12
Address of Class = 0012FF68

The output of these programs shows that adding any virtual function to a class
increases its size by the size of one pointer, i.e. 4 bytes in 32-bit Visual
C++, no matter how many virtual functions there are. So this class has three
4-byte slots: one for x, one for y, and one to handle virtual functions, called
the virtual pointer (vptr). First, let’s find out whether the new slot, i.e.
the virtual pointer, sits at the start of the object or at its end.

To find out, we are going to directly access the memory occupied by the
object: store the address of the object in an int pointer and use the magic of
pointer arithmetic. This relies on a 32-bit build, where an int and a pointer
have the same size.

Program 7.


#include <iostream>
using namespace std;

class CPoint {
public:
int m_ix;
int m_iy;
CPoint(const int p_ix = 0, const int p_iy = 0) :
m_ix(p_ix), m_iy(p_iy) {
}
int getX() const {
return m_ix;
}
int getY() const {
return m_iy;
}
virtual ~CPoint() { };
};

int main() {
CPoint objPoint(5, 10);

int* pInt = (int*)&objPoint;
*(pInt+0) = 100; // want to change the value of x
*(pInt+1) = 200; // want to change the value of y

cout << "X = " << objPoint.getX() << endl;
cout << "Y = " << objPoint.getY() << endl;

return 0;
}

The important part of this program is

  int* pInt = (int*)&objPoint;
  *(pInt+0) = 100; // want to change the value of x
  *(pInt+1) = 200; // want to change the value of y

Here we treat the object as an array of integers after storing its address in
an integer pointer. The output of this program is

X = 200
Y = 10

Of course this is not the result we wanted. It shows that 200 was stored in
the location where the m_ix data member resides, while 100 overwrote whatever
sits in front of it. This means m_ix, the first member variable, starts at the
second position in memory, not the first. In other words, the first slot is the
virtual pointer and the rest are the data members of the object. Just change
the following two lines

  int* pInt = (int*)&objPoint;
  *(pInt+1) = 100; // want to change the value of x
  *(pInt+2) = 200; // want to change the value of y

and we get the required result. Here is the complete program

Program 8.


#include <iostream>
using namespace std;

class CPoint {
public:
int m_ix;
int m_iy;
CPoint(const int p_ix = 0, const int p_iy = 0) :
m_ix(p_ix), m_iy(p_iy) {
}
int getX() const {
return m_ix;
}
int getY() const {
return m_iy;
}
virtual ~CPoint() { };
};

int main() {
CPoint objPoint(5, 10);

int* pInt = (int*)&objPoint;
*(pInt+1) = 100; // want to change the value of x
*(pInt+2) = 200; // want to change the value of y

cout << "X = " << objPoint.getX() << endl;
cout << "Y = " << objPoint.getY() << endl;

return 0;
}

And the output of the program is


X = 100
Y = 200

This clearly shows that whenever we add a virtual function to a class, the
virtual pointer is added at the first location of the memory structure.

Now the question arises: what is stored in the virtual pointer? Take a look at
the following program to get an idea.

Program 9.

#include <iostream>
using namespace std;

class Class {
virtual void fun() {
cout << "Class::fun" << endl;
}
};

int main() {
Class objClass;

cout << "Address of virtual pointer "
<< (int*)(&objClass+0) << endl;
cout << "Value at virtual pointer "
<< (int*)*(int*)(&objClass+0) << endl;
return 0;
}

The output of this program is


Address of virtual pointer 0012FF7C
Value at virtual pointer 0046C060

The virtual pointer stores the address of a table called the virtual table
(vtable). The virtual table stores the addresses of all the virtual functions
of that class. In other words, the virtual table is an array of virtual
function addresses. Let’s take a look at the following program to get an idea
of it.

Program 10.


#include <iostream>
using namespace std;

class Class {
virtual void fun() { cout << "Class::fun" << endl; }
};

typedef void (*Fun)(void);

int main() {
Class objClass;

cout << "Address of virtual pointer "
<< (int*)(&objClass+0) << endl;
cout << "Value at virtual pointer i.e. Address of virtual table "
<< (int*)*(int*)(&objClass+0) << endl;
cout << "Value at first entry of virtual table "
<< (int*)*(int*)*(int*)(&objClass+0) << endl;

cout << endl << "Executing virtual function" << endl << endl;
Fun pFun = (Fun)*(int*)*(int*)(&objClass+0);
pFun();
return 0;
}

This program uses some unusual indirection with typecasts. The most important
line of this program is

  Fun pFun = (Fun)*(int*)*(int*)(&objClass+0);

Here Fun is a typedef for a function pointer.

  typedef void (*Fun)(void);

Let’s dissect this lengthy, unusual chain of indirection. (int*)(&objClass+0)
gives the address of the virtual pointer, which is the first entry in the
object, typecast to int*. To get the value at this address, use the
indirection operator (i.e. *) and then typecast again to int*, i.e.
(int*)*(int*)(&objClass+0). This gives the address of the first entry of the
virtual table. To get the value at this location, i.e. the address of the first
virtual function of the class, use the indirection operator again and now
typecast to the appropriate function pointer type. So

  Fun pFun = (Fun)*(int*)*(int*)(&objClass+0);

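means: get the value from the first entry of the virtual table and store it in
pFun after typecasting it to the Fun type. Calling a member function through a
plain function pointer like this is not standard C++; it works here because
fun() never touches the this pointer.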

What happens when one more virtual function is added to the class? Now we want
to access the second entry of the virtual table. Take a look at the following
program to see the values in the virtual table.

Program 11.

#include <iostream>
using namespace std;

class Class {
virtual void f() { cout << "Class::f" << endl; }
virtual void g() { cout << "Class::g" << endl; }
};

int main() {
Class objClass;

cout << "Address of virtual pointer " << (int*)(&objClass+0) << endl;
cout << "Value at virtual pointer i.e. Address of virtual table "
<< (int*)*(int*)(&objClass+0) << endl;

cout << endl << "Information about VTable"
<< endl << endl;
cout << "Value at 1st entry of VTable "
<< (int*)*((int*)*(int*)(&objClass+0)+0) << endl;
cout << "Value at 2nd entry of VTable "
<< (int*)*((int*)*(int*)(&objClass+0)+1) << endl;

return 0;
}

The output of this program is


Address of virtual pointer 0012FF7C
Value at virtual pointer i.e. Address of virtual table 0046C0EC

Information about VTable

Value at 1st entry of VTable 0040100A
Value at 2nd entry of VTable 0040129E

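Now one question naturally comes to mind: how do we know the length of the
vtable? The compiler itself knows the number of entries at compile time. When
inspecting memory by hand, notice that in this Visual C++ build the slot after
the last function entry is NULL. Change the program a little bit to get an idea
of this.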

Program 12.


#include <iostream>
using namespace std;

class Class {
virtual void f() { cout << "Class::f" << endl; }
virtual void g() { cout << "Class::g" << endl; }
};

int main() {
Class objClass;

cout << "Address of virtual pointer " << (int*)(&objClass+0) << endl;
cout << "Value at virtual pointer i.e. Address of virtual table "
<< (int*)*(int*)(&objClass+0) << endl;

cout << endl << "Information about VTable"
<< endl << endl;
cout << "Value at 1st entry of VTable "
<< (int*)*((int*)*(int*)(&objClass+0)+0) << endl;
cout << "Value at 2nd entry of VTable "
<< (int*)*((int*)*(int*)(&objClass+0)+1) << endl;
cout << "Value at 3rd entry of VTable "
<< (int*)*((int*)*(int*)(&objClass+0)+2) << endl;
cout << "Value at 4th entry of VTable "
<< (int*)*((int*)*(int*)(&objClass+0)+3) << endl;

return 0;
}

The output of this program is

Address of virtual pointer 0012FF7C
Value at virtual pointer i.e. Address of virtual table 0046C134

Information about VTable

Value at 1st entry of VTable 0040100A
Value at 2nd entry of VTable 0040129E
Value at 3rd entry of VTable 00000000
Value at 4th entry of VTable 73616C43

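The output of this program shows that, in this build, the entry after the last
function in the vtable is NULL. This is a detail of the compiler, not something
the language guarantees. Let’s call the virtual functions using the knowledge
we have.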

Program 13.


#include <iostream>
using namespace std;

class Class {
virtual void f() { cout << "Class::f" << endl; }
virtual void g() { cout << "Class::g" << endl; }
};

typedef void(*Fun)(void);

int main() {
Class objClass;

Fun pFun = NULL;

// calling 1st virtual function
pFun = (Fun)*((int*)*(int*)(&objClass+0)+0);
pFun();

// calling 2nd virtual function
pFun = (Fun)*((int*)*(int*)(&objClass+0)+1);
pFun();

return 0;
}

The output of this program is

Class::f
Class::g

Now look at the case of multiple inheritance. Let’s start with a simple
example.

Program 14.


#include <iostream>
using namespace std;

class Base1 {
public:
virtual void f() { }
};

class Base2 {
public:
virtual void f() { }
};

class Base3 {
public:
virtual void f() { }
};

class Drive : public Base1, public Base2, public Base3 {
};

int main() {
Drive objDrive;
cout << "Size is = " << sizeof(objDrive) << endl;
return 0;
}

The output of this program is

Size is = 12

This program shows that when you derive a class from more than one base class,
the derived class has a virtual pointer for each of its base classes.

And what happens when the derived class also has virtual functions? Let’s look
at this program to better understand virtual functions with multiple
inheritance.

Program 15.


#include <iostream>
using namespace std;

class Base1 {
virtual void f() { cout << "Base1::f" << endl; }
virtual void g() { cout << "Base1::g" << endl; }
};

class Base2 {
virtual void f() { cout << "Base2::f" << endl; }
virtual void g() { cout << "Base2::g" << endl; }
};

class Base3 {
virtual void f() { cout << "Base3::f" << endl; }
virtual void g() { cout << "Base3::g" << endl; }
};

class Drive : public Base1, public Base2, public Base3 {
public:
virtual void fd() { cout << "Drive::fd" << endl; }
virtual void gd() { cout << "Drive::gd" << endl; }
};

typedef void(*Fun)(void);

int main() {
Drive objDrive;

Fun pFun = NULL;

// calling 1st virtual function of Base1
pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+0);
pFun();

// calling 2nd virtual function of Base1
pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+1);
pFun();

// calling 1st virtual function of Base2
pFun = (Fun)*((int*)*(int*)((int*)&objDrive+1)+0);
pFun();

// calling 2nd virtual function of Base2
pFun = (Fun)*((int*)*(int*)((int*)&objDrive+1)+1);
pFun();

// calling 1st virtual function of Base3
pFun = (Fun)*((int*)*(int*)((int*)&objDrive+2)+0);
pFun();

// calling 2nd virtual function of Base3
pFun = (Fun)*((int*)*(int*)((int*)&objDrive+2)+1);
pFun();

// calling 1st virtual function of Drive
pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+2);
pFun();

// calling 2nd virtual function of Drive
pFun = (Fun)*((int*)*(int*)((int*)&objDrive+0)+3);
pFun();

return 0;
}

The output of this program is


Base1::f
Base1::g
Base2::f
Base2::g
Base3::f
Base3::g
Drive::fd
Drive::gd

This program shows that the virtual functions of the derived class are stored
in the vtable of the first vptr.

We can get the offset of each base class vptr within a Drive object with the
help of static_cast. Let’s take a look at the following program to better
understand it.

Program 16.


#include <windows.h> // for DWORD
#include <iostream>
using namespace std;

class Base1 {
public:
virtual void f() { }
};

class Base2 {
public:
virtual void f() { }
};

class Base3 {
public:
virtual void f() { }
};

class Drive : public Base1, public Base2, public Base3 {
};

// any non-zero value: static_cast converts a null pointer to a null pointer,
// so a zero address would always report an offset of 0
#define SOME_VALUE 1

int main() {
cout << (DWORD)static_cast<Base1*>((Drive*)SOME_VALUE)-SOME_VALUE << endl;
cout << (DWORD)static_cast<Base2*>((Drive*)SOME_VALUE)-SOME_VALUE << endl;
cout << (DWORD)static_cast<Base3*>((Drive*)SOME_VALUE)-SOME_VALUE << endl;
return 0;
}

ATL uses a macro named offsetofclass, defined in ATLDEF.H, to compute this
offset. The macro is defined as

#define offsetofclass(base, derived) \
((DWORD)(static_cast<base*>((derived*)_ATL_PACKING))-_ATL_PACKING)

This macro returns the offset of the base class vptr in the object model of
the derived class. Let’s see an example to get an idea of this.

Program 17.


#include <windows.h>
#include <iostream>
using namespace std;

class Base1 {
public:
virtual void f() { }
};

class Base2 {
public:
virtual void f() { }
};

class Base3 {
public:
virtual void f() { }
};

class Drive : public Base1, public Base2, public Base3 {
};

#define _ATL_PACKING 8

#define offsetofclass(base, derived) \
((DWORD)(static_cast<base*>((derived*)_ATL_PACKING))-_ATL_PACKING)

int main() {
cout << offsetofclass(Base1, Drive) << endl;
cout << offsetofclass(Base2, Drive) << endl;
cout << offsetofclass(Base3, Drive) << endl;
return 0;
}

In the memory layout of the derived class, the vptrs of Base1, Base2 and Base3
follow one another in declaration order.

The output of this program is


0
4
8

The output shows that this macro returns the offset of the vptr of the
requested base class. In Essential COM, Don Box uses a similar macro. Change
the program a little bit and replace the ATL macro with Box’s macro.

Program 18.


#include <windows.h>
#include <iostream>
using namespace std;

class Base1 {
public:
virtual void f() { }
};

class Base2 {
public:
virtual void f() { }
};

class Base3 {
public:
virtual void f() { }
};

class Drive : public Base1, public Base2, public Base3 {
};

#define BASE_OFFSET(ClassName, BaseName) \
(DWORD(static_cast<BaseName*>(reinterpret_cast<ClassName*>\
(0x10000000))) - 0x10000000)

int main() {
cout << BASE_OFFSET(Drive, Base1) << endl;
cout << BASE_OFFSET(Drive, Base2) << endl;
cout << BASE_OFFSET(Drive, Base3) << endl;
return 0;
}

The output and purpose of this program are the same as those of the previous
program.

Let’s do something practical and use this macro in our program. In fact, we
can call the virtual function of any base class by getting the offset of that
base class’s vptr in the derived object’s memory structure.

Program 19.


#include <windows.h>
#include <iostream>
using namespace std;

class Base1 {
public:
virtual void f() { cout << "Base1::f()" << endl; }
};

class Base2 {
public:
virtual void f() { cout << "Base2::f()" << endl; }
};

class Base3 {
public:
virtual void f() { cout << "Base3::f()" << endl; }
};

class Drive : public Base1, public Base2, public Base3 {
};

#define _ATL_PACKING 8

#define offsetofclass(base, derived) \
((DWORD)(static_cast<base*>((derived*)_ATL_PACKING))-_ATL_PACKING)

int main() {
Drive d;

void* pVoid = NULL;

// call function of Base1
pVoid = (char*)&d + offsetofclass(Base1, Drive);
((Base1*)(pVoid))->f();

// call function of Base2
pVoid = (char*)&d + offsetofclass(Base2, Drive);
((Base2*)(pVoid))->f();

// call function of Base3
pVoid = (char*)&d + offsetofclass(Base3, Drive);
((Base3*)(pVoid))->f();

return 0;
}

The output of the program is


Base1::f()
Base2::f()
Base3::f()

In this tutorial I have tried to explain how ATL’s offsetofclass macro works.
I hope to explore other mysteries of ATL in the next article.
