C makes it easy to shoot yourself in the foot; C++ makes it harder, but when you do it blows your whole leg off. -- Bjarne Stroustrup (The father of C++)

04 Aug 2010

I have been a Java engineer for over a decade, and have recently found myself thrust back into the world of C++. I haven't really done C++ since college, so as you can imagine I'm a bit rusty. As I've been coming up to speed I thought it might be nice to have a place to jot down little things that I find interesting (or awful) about the language - from the viewpoint of a hardcore Java developer.
  1. Unicode
  2. Interfaces
  3. struct
  4. Stack vs Heap
  5. The many uses of Static
  6. The double pointer for OUT params
  7. Virtual
  8. Number to string and back again
Unicode-- 15 Jul 2008

Something that's just built into the Java language and you don't even have to think about is Unicode support. All chars and Strings in Java are Unicode (UTF-16 code units). If you want to specify a Unicode character, you just escape it like so:

char c = '\u0645';
and voilà - you have a Unicode character. You can even type non-ASCII characters directly into a string - or even a variable name. It's just part of the language.

Not so in C++. A plain char is just a byte, so you must specify something as "wide". wchar_t is a good example of a "wide character type" (its size depends on the platform). If you just have an ASCII string literal that you need to make "wide", you can do this:

L"my string"
L for wide? The L is usually explained as standing for "long" in this case, but it amounts to the same thing.
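
Here's a minimal sketch of wide strings in use, assuming nothing beyond the standard library. The names countWideChars and wideStringDemo are made up for illustration; std::wstring is the standard wide counterpart of std::string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: number of wchar_t units in a wide string.
std::size_t countWideChars(const std::wstring &s)
{
	return s.size();
}

void wideStringDemo()
{
	const wchar_t *raw = L"my string";   // wide string literal
	std::wstring wide(raw);              // wide counterpart of std::string
	assert(countWideChars(wide) == 9);

	wchar_t meem = L'\u0645';            // same escape as the Java example, but in a wide literal
	wide += meem;
	assert(countWideChars(wide) == 10);
}
```

Note that the \u escape works in C++ too, as long as it sits inside a wide (L-prefixed) literal.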
Interfaces-- 02 Oct 2008

Believe it or not you actually can do interfaces in C++ (well, almost). The syntax is a little funny, but once you get past that it behaves very much like a Java interface. For those in the C++ know, it's defining a class with all pure virtual functions, and then any number of subclasses that implement each of the methods.

Step 1: create the interface header. By convention (at our company at least) you name it like so: IMyClass.h

#ifndef IMYCLASS_H_
#define IMYCLASS_H_

class IMyClass
{ 
public:
	virtual ~IMyClass() {};
	virtual bool getValue() const=0;
	virtual void setValue(bool)=0;
};

#endif /*IMYCLASS_H_*/
Some funky things to note about the syntax. It's the =0 at the end of each method declaration that tells the compiler that there is no implementation for this method at this time (i.e. it's abstract [pure virtual]).

An aside: the const keyword after the method declaration means that the method is not allowed to modify the object's member variables, which also makes it callable on a const object or through a const reference.
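
To make the const aside concrete, here's a small sketch. The Counter class and the readOnly function are hypothetical and not part of the interface example.

```cpp
#include <cassert>

// Hypothetical class used only to illustrate const methods.
class Counter
{
public:
	Counter() : m_count(0) {}
	int count() const { return m_count; }   // const: may only read members
	void increment() { ++m_count; }         // non-const: may modify members
private:
	int m_count;
};

// A const reference only allows calls to const methods.
int readOnly(const Counter &c)
{
	// c.increment();   // would not compile: increment() is not const
	return c.count();
}

void constDemo()
{
	Counter c;
	c.increment();
	c.increment();
	assert(readOnly(c) == 2);
}
```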

Step 2: create an implementing header file. By convention it is: MyClass.h

#ifndef MYCLASS_H_
#define MYCLASS_H_

#include "IMyClass.h"

class MyClass : public IMyClass
{
public:
	bool getValue() const;
	void setValue(bool);
protected:
	bool m_value;
};

#endif /*MYCLASS_H_*/
Here we're extending the interface with a concrete class definition (MyClass). We've left the virtual keyword off the two methods, but that's only cosmetic: a method that overrides a virtual method is itself virtual whether you repeat the keyword or not. (Also, if you don't implement each pure virtual method from above, this class will also be abstract and you won't be able to instantiate it.)

Aside #2: You can have multiple public/private/protected sections, and the order doesn't matter. ICK. My convention is to have one section of each, with public on top, protected next, and private on the bottom. Another company convention is to use m_ for member variables. The typical Java equivalent is just an underscore.

Step 3: implement the class. Create the (you guessed it) MyClass.cpp file.

#include "MyClass.h"

bool MyClass::getValue() const
{
	return m_value;
}

void MyClass::setValue(bool val)
{
	m_value = val;
}
Aside #3: There was much debate as to whether you should also #include the "IMyClass.h" file. You don't need to, but some thought you should. My personal opinion was that it's not necessary since MyClass.h is implementing IMyClass.h and it's not going to suddenly stop #including IMyClass.h itself since it's part of the contract.

What good is an interface if you only have one implementation? Let's create a second implementation. You use the same IMyClass.h file, but in this case we'll create MyOtherClass.h/cpp files. The only difference will be that getValue will return the inverse of the actual value.

MyOtherClass.h

#ifndef MYOTHERCLASS_H_
#define MYOTHERCLASS_H_

#include "IMyClass.h"

class MyOtherClass : public IMyClass
{
public:
	bool getValue() const;
	void setValue(bool);
protected:
	bool m_value;
};

#endif /*MYOTHERCLASS_H_*/
MyOtherClass.cpp
#include "MyOtherClass.h"

bool MyOtherClass::getValue() const
{
	return !m_value;
}

void MyOtherClass::setValue(bool val)
{
	m_value = val;
}
Finally, let's show an example of actually using our two implementations. I've created a small main.cpp file:
#include <iostream>
#include "MyClass.h"
#include "MyOtherClass.h"

using namespace std;

int main()
{
	MyClass mc;
	MyOtherClass mc2;

	mc.setValue(false);
	mc2.setValue(false);
	cout << "mc value is: " << mc.getValue() << endl;
	cout << "mc2 value is: " << mc2.getValue() << endl;
	mc.setValue(true);
	mc2.setValue(true);
	cout << "mc value is: " << mc.getValue() << endl;
	cout << "mc2 value is: " << mc2.getValue() << endl;

	return 0;
}
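Aside #4: Notice that to instantiate our classes, I just declared the variables? That would never fly in Java, but in C++ the declaration actually creates the object and calls the default constructor (not the copy constructor). Since we didn't write a constructor, the compiler generates one, and it leaves m_value uninitialized, which is why the example calls setValue before it ever calls getValue.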

When running the program, you get the following output:

mc value is: 0
mc2 value is: 1
mc value is: 1
mc2 value is: 0
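
The main above calls each concrete class directly, so it never actually goes through the interface. Here's a self-contained sketch of code that only knows about IMyClass. The classes are repeated inline so the block compiles on its own; the setAndGet helper and the constructors that initialize m_value are additions for illustration, not part of the files above.

```cpp
#include <cassert>

class IMyClass
{
public:
	virtual ~IMyClass() {}
	virtual bool getValue() const = 0;
	virtual void setValue(bool) = 0;
};

class MyClass : public IMyClass
{
public:
	MyClass() : m_value(false) {}   // added: start from a known value
	bool getValue() const { return m_value; }
	void setValue(bool val) { m_value = val; }
protected:
	bool m_value;
};

class MyOtherClass : public IMyClass
{
public:
	MyOtherClass() : m_value(false) {}   // added: start from a known value
	bool getValue() const { return !m_value; }
	void setValue(bool val) { m_value = val; }
protected:
	bool m_value;
};

// Hypothetical helper: sees only the interface, never the concrete type.
bool setAndGet(IMyClass &obj, bool val)
{
	obj.setValue(val);
	return obj.getValue();
}

void interfaceDemo()
{
	MyClass a;
	MyOtherClass b;
	assert(setAndGet(a, true) == true);    // MyClass returns the stored value
	assert(setAndGet(b, true) == false);   // MyOtherClass returns the inverse
}
```

This is the C++ equivalent of a Java method that takes the interface type as its parameter: the same call behaves differently depending on which implementation is passed in.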
struct-- 05 Jun 2008

A struct is the same as a class ... almost. It can have members, methods, public/private sections, and even inheritance. The main difference is that members (and base classes) of a struct are public by default, whereas in a class they are private by default. (Since I always explicitly say public: or private: at the top this wouldn't ever matter for me). So what gives - why do we have structs and classes if they're the same thing?

A struct is a C leftover. In C++ it's generally considered "bad form" to use a struct for anything other than a complex datatype; i.e. if you're just using it to aggregate a number of other datatypes to create a complex datatype, it's perfectly acceptable. If you need to have methods to act on the data, use a class.
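
A quick sketch of the default-access difference. PointS, PointC and structDemo are hypothetical names used only for this illustration.

```cpp
#include <cassert>

struct PointS
{
	int x;   // public by default in a struct
	int y;
};

class PointC
{
	int x;   // private by default in a class
	int y;
public:
	PointC(int px, int py) : x(px), y(py) {}
	int sum() const { return x + y; }
};

int structDemo()
{
	PointS s = { 1, 2 };   // plain aggregate: members are directly accessible
	s.x = 10;
	PointC c(3, 4);
	// c.x = 10;           // would not compile: x is private
	return s.x + s.y + c.sum();   // 10 + 2 + 7
}
```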

Stack vs Heap-- 11 Jun 2008

An important thing to note in C++ is where your memory is allocated. You can allocate something on the stack or you can allocate it on the heap. It makes a big difference and you have to think about the implications every time you allocate memory. If you allocate something on the stack, it is cleaned up as soon as the enclosing scope ends (i.e. whatever block you're in, such as the body of a for loop or a method call, finishes). Thus, if you've created a class Foo on the stack it will only be usable in that scope, and the system will clean it up for you. Very nice for local variables.

However, if you want your variable to stick around for a while, you need to use the heap. To use the heap, you create your variable using the "new" keyword. (malloc is also available, but it only hands back raw memory and does not call constructors, so new is what you want for objects.) The caveat with this is that you are now responsible for cleaning up this memory. You must call delete at some point, otherwise you have a memory leak. And of course, if delete gets called and some piece of code tries to reference the variable after that, you get undefined behavior: it might work, it might give you garbage, it might crash your program - depending on what's happened to the memory at the location on the heap since delete was called.

Let's illustrate with a code example. Assume a file named main.cpp as follows:

#include <iostream>

class Foo
{
public:
	int getX()
	{
		return 5;
	};
};

int main(int argc, char **argv)
{
	//initialize foo on the stack
	Foo fStack;
	std::cout << "x: " << fStack.getX() << std::endl;

	//initialize foo on the heap
	Foo *fpHeap = new Foo();
	std::cout << "x: " << fpHeap->getX() << std::endl;
	delete fpHeap;

	return 0;
}
Ok, let's walk through this piece of code. You'll notice that "Foo fStack;" is not only declaring the variable, but it's also calling the default constructor. Because we didn't use the new keyword, this variable is on the stack and will go out of scope when the method main completes. The system will free the memory for us and all is well.

You'll also notice that when Foo was initialized on the heap, it was done as a pointer. This was not a stylistic choice. I had to use a pointer, because new returns a pointer: that is how you get at heap memory. Also notice that since I used the new keyword I used the delete keyword as well to clean up the memory once I was done with fpHeap.

Aside #1: Non-pointer variable access to class members is via the dot operator: fStack.getX(). Pointer variable access to the class members is via the arrow operator: fpHeap->getX(). The arrow operator is just a convenience operator that is equivalent to the following: (*fpHeap).getX()
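
To see exactly when each kind of object is cleaned up, here's a sketch using a class that records its own construction and destruction. The Tracked class and lifetimeDemo function are hypothetical, written only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical class that logs its own construction and destruction.
class Tracked
{
public:
	Tracked(const std::string &name, std::vector<std::string> &log)
		: m_name(name), m_log(log) { m_log.push_back("ctor " + m_name); }
	~Tracked() { m_log.push_back("dtor " + m_name); }
private:
	std::string m_name;
	std::vector<std::string> &m_log;
};

std::vector<std::string> lifetimeDemo()
{
	std::vector<std::string> log;
	Tracked *heap = new Tracked("heap", log);   // lives until delete
	{
		Tracked stack("stack", log);            // lives until the closing brace
	}                                           // "dtor stack" is logged here
	log.push_back("after block");
	delete heap;                                // "dtor heap" is logged here
	return log;
}
```

The returned log reads: ctor heap, ctor stack, dtor stack, after block, dtor heap. The stack object is gone the moment its block closes, while the heap object survives until the explicit delete.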

The many uses of Static-- 15 Jul 2008

There are three distinct types of usage for the static keyword in C++.

If you place the static keyword in front of a member or method name inside of a class definition, it behaves exactly like you'd expect from Java: that member or function belongs to the class rather than to any instance, and is normally called through the class name. For example: MyClass::MyStaticMethod(). (Calling it through an instance also compiles, but it reads misleadingly.)

You can use the static keyword on a free function (or global variable) in a .cpp file, and it acts a bit like the Java private keyword. In other words, the function is now only visible to other code within the same source file (strictly speaking, the same translation unit).

Finally, you can declare a variable within a function to be static. It is initialized only once, the first time execution reaches the declaration, and it keeps its value from then on, much like a static class member that is visible only inside that function. This is handy if you don't have a class definition and just want to keep a static piece of information. Here's an example:

#include <iostream>

int main() 
{
  for (int x=0;x<3;x++)
    for (int y=0;y<5;y++)
    {
      static int count=0; // initialized only once, not on every iteration
      count++;
      std::cout << "count: " << count << std::endl; //will print 1..15
    } 

  return 0;
}
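
The example above covers only the third use. Here's a sketch that touches all three in one file. Widget and nextId are hypothetical names made up for this illustration.

```cpp
#include <cassert>

class Widget
{
public:
	Widget() { ++s_instances; }
	~Widget() { --s_instances; }
	static int instances() { return s_instances; }   // use 1: callable without an object
private:
	static int s_instances;   // use 1: one copy shared by all Widgets
};

// Unlike Java, a static data member needs a definition outside the class.
int Widget::s_instances = 0;

// Use 2: file-scope static, visible only inside this translation unit.
static int nextId()
{
	static int id = 0;   // use 3: initialized once, keeps its value between calls
	return ++id;
}

void staticDemo()
{
	assert(Widget::instances() == 0);
	{
		Widget a, b;
		assert(Widget::instances() == 2);
	}
	assert(Widget::instances() == 0);   // both Widgets were destroyed at the brace

	int first = nextId();
	int second = nextId();
	assert(second == first + 1);        // the counter survives between calls
}
```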
The double pointer for OUT params-- 02 Oct 2008

A double pointer (a pointer to a pointer) allows code to do an "out" parameter for a C function. Typically when an object is passed in with a pointer, it's an "in" parameter. The object already exists and will (potentially) have its state changed. With an "out" parameter, the object doesn't yet exist. The function will create it and the caller will have access to it afterward.

The following code example illustrates this point.
DoublePointer.cpp

#include <iostream>

using namespace std;

/**
 * Simple class to demonstrate using the double pointer
 */
class Foo
{
public:
	Foo(int x): _x(x) { }
	int getX() { return _x; }
	void setX(int x) { _x = x; }
private:
	int _x;
};

/**
 * @param f1 [in]   f1 can be accessed and modified by this function.
 * @param f2 [out]  f2 will be created (and modified) by this function and will
 *                  be available to the caller as if the caller had created it.
 */
void fx(Foo *f1, Foo **f2)
{
	f1->setX(7);
	*f2 = new Foo(20);
}

int main()
{
	Foo f1(3); //allocate f1 on the stack
	Foo *f2;   //declare a pointer only; no Foo exists yet, fx will allocate it on the heap

	cout << "f1 (pre) [3]: " << f1.getX() << endl;

	fx(&f1, &f2); //perform some operation on f1 and f2

	cout << "f1 (post) [7]: " << f1.getX() << endl;
	cout << "f2 (post) [20]: " << f2->getX() << endl;

	delete f2; //clean up the heap memory for f2
}
You should see the following output:
f1 (pre) [3]: 3
f1 (post) [7]: 7
f2 (post) [20]: 20
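
In C++ (as opposed to plain C) the same "out" effect can also be had with a reference to a pointer, which spares the caller the extra & and the function the extra *. The sketch below shows both side by side; the two function names are hypothetical, and Foo is a trimmed copy of the class above so the block compiles on its own.

```cpp
#include <cassert>

class Foo
{
public:
	Foo(int x) : _x(x) {}
	int getX() const { return _x; }
private:
	int _x;
};

// C style: pointer to pointer. The caller passes the address of its pointer.
void createWithDoublePointer(Foo **out)
{
	*out = new Foo(20);
}

// C++ alternative: reference to pointer. The caller passes the pointer itself.
void createWithReference(Foo *&out)
{
	out = new Foo(30);
}

void outParamDemo()
{
	Foo *a = 0;
	Foo *b = 0;
	createWithDoublePointer(&a);
	createWithReference(b);
	assert(a->getX() == 20);
	assert(b->getX() == 30);
	delete a;   // the caller owns both objects and must free them
	delete b;
}
```

Either way the caller ends up owning the new object, so the delete is still the caller's job.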
Virtual-- 03 Jun 2009

This was brushed over lightly when talking about interfaces, but virtual is an important keyword. In Java, every instance method that isn't private or final is the equivalent of a C++ virtual method. In a nutshell it means that when a method is defined as virtual, the "vtable" will be consulted at runtime to decide which implementation of the method to call.

For example, if you have a class A and a class B which extends A, and both A and B define the foo method, which one gets called depends on a number of factors. If the virtual keyword is not used, A's foo will get called if the variable is of type A. B's foo will get called if the variable is of type B. If the virtual keyword is used, then which one gets called depends on which type of object was instantiated (A or B), not on what the variable type has been cast to.

Let's write some code to illustrate. Call it "vrtl.cpp"

#include <stdio.h>

class A {
    public:
        void doIt() { printf("A::doIt()\n"); }
};

class B : public A {
    public:
        void doIt() { printf("B::doIt()\n"); }
};

class C {
    public:
        virtual void doIt() { printf("C::doIt()\n"); };
        virtual ~C() {};
};

class D : public C {
    public:
        virtual void doIt() { printf("D::doIt()\n"); }
};

int main(int argc, char **argv) {
    A *v1 = new B();
    v1->doIt();

    B *v2 = new B();
    v2->doIt();

    C *v3 = new D();
    v3->doIt();

    C *v4 = new C();
    v4->doIt();

    delete static_cast<B *>(v1); // A has no virtual destructor, so delete via the real type
    delete v2;
    delete v3;
    delete v4;

    return 0;
}
Running it will yield the following output:
A::doIt()
B::doIt()
D::doIt()
C::doIt()
The variable v1 points to a new B, and yet it printed out A's doIt. Because doIt is not virtual in A, the call is resolved statically at compile time, and the compiler only sees that the variable is of type A.

The variable v2 also points to a new B, and it printed out B's doIt because at compile time the call was statically bound to the variable's type, B.

The variable v3 points to a D and printed out D's doIt, even though the variable is of type C. This is because C's doIt is virtual, so the call isn't decided by the variable type at compile time but by the actual type of the object at run time. The same rule applies for v4, where the object really is a C.
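
Class C above also declares a virtual destructor without comment. The sketch below shows what that buys you: when an object is deleted through a base class pointer, the derived destructor runs first and then the base one. Base, Derived and destroyThroughBase are hypothetical names for this illustration only.

```cpp
#include <cassert>
#include <string>

class Base
{
public:
	Base(std::string &log) : m_log(log) {}
	virtual ~Base() { m_log += "~Base "; }
	virtual std::string name() const { return "Base"; }
protected:
	std::string &m_log;
};

class Derived : public Base
{
public:
	Derived(std::string &log) : Base(log) {}
	virtual ~Derived() { m_log += "~Derived "; }
	virtual std::string name() const { return "Derived"; }
};

std::string destroyThroughBase()
{
	std::string log;
	Base *p = new Derived(log);
	log += p->name() + " ";   // virtual dispatch picks Derived::name()
	delete p;                 // virtual destructor: ~Derived runs, then ~Base
	return log;
}
```

Calling destroyThroughBase() returns "Derived ~Derived ~Base ". Without the virtual destructor, deleting a Derived through a Base pointer is undefined behavior, so any class meant to be subclassed and deleted this way should have one.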

Number to string and back again-- 04 Aug 2010

I've run into this several times recently. I've had either a number or a char * and needed to convert into the other. Simple as can be in Java, but not quite so obvious in C++. There are actually lots of solutions, but some make assumptions about data types and lengths. Others are "not official, but typically supported by compilers". I wanted something that would work in almost all circumstances. Here's what I came up with.

Make a file called "conversion.cpp".

#include <iostream>
#include <sstream>
#include <string>
#include <stdint.h>
#include <stdio.h>

int main() {
	uint64_t myReallyLongNumber = 987654321012ULL;
	std::cout << "  orig: " << myReallyLongNumber << std::endl;
	
	//convert from number to char *
	//a uint64_t has at most 20 decimal digits, plus 1 for the terminating '\0'
	//(a fixed size avoids a variable-length array, which is not standard C++)
	char buf[21];
	//cast so the argument matches %llu on every platform
	snprintf(buf, sizeof(buf), "%llu", (unsigned long long) myReallyLongNumber);
	std::cout << "string: " << buf << std::endl;
	
	//convert from char * to number
	std::string sNum(buf);
	std::stringstream ss(sNum);
	uint64_t reconstitutedNumber = 0;
	ss >> reconstitutedNumber;
	std::cout << "number: " << reconstitutedNumber << std::endl;
	
	return 0;
}
Running will yield the following output:
  orig: 987654321012
string: 987654321012
number: 987654321012
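
If you need this in more than one place, the stringstream half of the trick works in both directions and for any type that supports << and >>. Here's a sketch of a pair of reusable helpers; toString and fromString are hypothetical names, not standard library functions.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: format any streamable value as a std::string.
template <typename T>
std::string toString(const T &value)
{
	std::ostringstream out;
	out << value;
	return out.str();
}

// Hypothetical helper: parse text into result.
// Returns false if no value could be extracted from the text.
// Caveat: trailing junk after a valid number (e.g. "12abc") is not detected.
template <typename T>
bool fromString(const std::string &text, T &result)
{
	std::istringstream in(text);
	in >> result;
	return !in.fail();
}

void conversionDemo()
{
	unsigned long long n = 987654321012ULL;
	std::string s = toString(n);
	assert(s == "987654321012");

	unsigned long long back = 0;
	assert(fromString(s, back));
	assert(back == n);
}
```

The bool return value is a design choice for the sketch: it lets the caller tell "parsed a zero" apart from "could not parse at all", which the bare ss >> value form in the listing above does not.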