Cluebat-man to the rescue

A weblog dedicated to Visual C++, interoperability and other stuff.

References in C++ are not necessarily safe

People who are new to C++ sometimes have the mistaken idea that using references instead of pointers makes your code safe. People who have been programming a bit longer know this is anything but the case. References are just pointers coated in syntactic sugar.

I'll explain in more detail with a couple of examples. They use this pretty simple class A.

#include <iostream>
using namespace std;

struct A
{
    int &I;

    A(int &i) : I(i)
    {
    }

    void print(void)
    {
        cout << I << endl;
    }
};

When A is instantiated, the constructor takes a reference to an integer and uses it to initialize an internal integer reference. The print method simply prints the value of that integer.
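
To make the aliasing concrete, here is a minimal sketch showing that the stored reference and the caller's integer are one and the same object for as long as that integer is alive. The helper name read_through_reference is made up for illustration and is not part of the original class.

```cpp
#include <cassert>
#include <iostream>

struct A
{
    int &I;
    A(int &i) : I(i) {}
    void print() { std::cout << I << std::endl; }
};

// Hypothetical helper, not part of the original class.
int read_through_reference()
{
    int x = 32;
    A a(x);              // a.I is now another name for x
    assert(&a.I == &x);  // same address: no copy was made
    x = 43;              // change the original...
    return a.I;          // ...and the change is visible through a.I
}
```

Everything here is well defined only because x outlives a. The cases that follow break exactly that assumption.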

Case 1

int *i = new int(32);
A a(*i);
*i = 43;
a.print();  // prints 43
delete i;
a.print();  // undefined behavior: a.I now refers to freed memory

I create a new integer value on the heap and I initialize a pointer with its address. I then create a new instance of A, and pass the newly created integer by reference. a initializes its own internal reference with the address of that integer. And indeed, if I assign a value to that integer and then print a's internal reference, the values match.

But what happens if that integer is deleted from the heap? It depends. The behavior is undefined. The next read from a.I could result in a bad value or an application crash. But in any case, it is not safe.

You might argue that I am purposely using pointers to cause problems, but in a large application, you won't know how and where your objects will be created. This could happen.
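
For contrast, here is one way to keep Case 1 well defined. It is only a sketch, under the assumption that the owner of the integer and the instance of A can live in the same scope. std::unique_ptr and the function name are my choices for illustration, not something the original code uses. Because locals are destroyed in reverse order of declaration, a goes away before the integer it refers to.

```cpp
#include <cassert>
#include <memory>

struct A
{
    int &I;
    explicit A(int &i) : I(i) {}
    int value() const { return I; }
};

// Hypothetical function name, for illustration only.
int read_while_owner_is_alive()
{
    std::unique_ptr<int> owner(new int(32)); // owns the heap integer
    A a(*owner);                             // declared after owner, so destroyed before it
    *owner = 43;
    return a.value();                        // owner is still alive: this read is defined
}
```

This does not make the reference itself any safer. It only arranges the lifetimes so that the dangling state never occurs, and that arrangement is exactly what is hard to guarantee in a large application.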

Case 2

A second case looks much more innocuous. It uses no visible pointers.

vector<int> v;
v.push_back(42);
A b(v[0]);
v[0] = 43;
b.print();
v.pop_back(); //remove that element from the container
b.print();
v.push_back(2); //put something else in the same location
b.print();

Instead of dynamically allocating memory myself, I put an integer in a vector and then feed a reference to that element into an instance of A. This new instance (b) is now using the location of an element in a vector as an internal reference.

The thing with vectors is that they normally have more capacity than elements. This is because you don't want the vector to allocate more space for every element that gets inserted. Likewise, you are not going to do a deallocation with each element that gets removed.

Because of this, when that element is removed, the physical space is usually still there. A read from that location is not immediately going to trigger a crash. Instead, the program will keep on running, but with bad data. Or maybe not: the reference to the removed element is invalid from that point on, so the behavior is formally undefined. With bugs like this, you never know what will happen.
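
The gap between size and capacity is easy to observe. The sketch below records both around a pop_back; the Snapshot struct and the function name are hypothetical, introduced only for this illustration. On mainstream standard library implementations the capacity stays the same, which is why the stale read in Case 2 tends to appear to work. That is an observation about implementations, not permission to read through the old reference.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record of what the vector reported before and after pop_back.
struct Snapshot
{
    std::size_t size_before;
    std::size_t cap_before;
    std::size_t size_after;
    std::size_t cap_after;
};

// Hypothetical helper, for illustration only.
Snapshot pop_and_measure()
{
    std::vector<int> v;
    v.push_back(42);

    Snapshot s;
    s.size_before = v.size();
    s.cap_before = v.capacity();

    v.pop_back();                 // the element is gone, the storage is not

    s.size_after = v.size();
    s.cap_after = v.capacity();
    return s;
}
```

The size drops to zero while the capacity is left alone, so the memory that b.I points at still belongs to the vector even though no element lives there anymore.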

Case 3

There is an even more interesting option. I am not going to write a repro case for it, but consider the following: I create multiple instances of A, giving them each a reference to an integer that is located on the stack, and I pass those instances of A as pointers to newly created threads, to be used there.

Meanwhile, the stack on which the integers are located is unwound. What happens next?

Again, it depends. Probably the stack gets stomped. Or perhaps only the data will be corrupted. What happens is unknown. I am not saying that this design pattern is a good one. It isn't. But given certain constraints, it might just be an approach for solving a specific problem (though perhaps not the best one).

Conclusion

By now, anyone should be convinced that using references does not protect your application against design mistakes.

The only time your references are really safe is if:
a) your application is single threaded, and there is no asynchronous code executed anywhere, and
b) your classes don't store references anywhere.

Outside of those constraints, references don't guarantee anything more than pointers would. The only guarantee you have with references is that at some point in time, there was an object of the correct type at the specified location, and it was not NULL. Probably.

If someone is actively trying to subvert your code, you can't even be sure of that. But since there is nothing you can do about that, you might as well use references because they make your life easier, and are better than raw pointers when it comes to preventing honest mistakes.

Sep 17 2008, 07:46 AM by vanDooren

Comments

 

Alun Jones said:

References are shorthand for pointers.

There's no other way to get it in a programmer's head what references really are. Sure, technically, they are slightly different, and it's possible that a new compiler technology may one day come along in which references and pointers are separated - but it's still going to behave in the ways that you've outlined.

Of course, as an ex-Fortran programmer (everything is passed by reference, even constants!), it's not too hard for me to deal with.

October 2, 2008 2:17 PM


