Pointing the Way

While doing a code review the other day I came across something like this
(names have been changed to protect the proprietary)

class Foo
{
private:
   Bar m_bar;

public:
   std::shared_ptr<Bar> GetBar() const
   {
      return std::make_shared<Bar>(m_bar);
   }
};          

Needless to say the code in question did not pass review. Between the unescessary copying of m_bar and the inflexiblility of the return type of GetBar there's a lot of confusion about object lifetimes and ownership going on here.

Given m_bar being defined the way it is, GetBar should be returning a const Bar&, end of story. If the caller wants a shared_ptr to a copy she can make it herself from the reference, and if she doesn't want one then the reference doesn’t make her pay for one (unlike the original code). The only catch is that we have to make sure that the Foo object sticks around as long as we're using the the reference so that the reference doesn't go bad on us.

In a multithreaded program there are also synchronization issues to worry about: either Bar has to handle its own synchronization or we need to guarantee that it can't change while the reference is in use. Not being able to make either of these guarantees is one of the few cases where returning a copy of an object makes sense – we just have to be sure that we synchronize access to the object that we’re copying while we’re copying it.

In the actual code that I was reviewing it turned out that m_bar might not exist, so the type of m_bar was also chosen badly. For a nullable value (i.e. one that “might not exist”) your options are a pointer type or boost::optional. The latter only makes sense if Bar is small or you really want to avoid dynamic memory allocation (boost::optional avoids dynamic memory allocation at the cost of always being as big as what it can contain, even when empty). Bar is not “small”, nor is avoiding allocation a concern, so a pointer type is called for here. If we had a good reason to keep returning a shared_ptr from GetBar then m_bar would have to be a shared_ptr – otherwise we couldn't share it. But a shared_ptr return means that we're sharing ownership of the returned object, and this type of ownership does not come without overhead.

There is no need to share ownership in this case, it is perfectly fine for the lifetime of m_bar to be tied to lifetime of the Foo object that contains it. So the natural choice is unique_ptr, even with the required addition of a move constructor and inability to use the default copy constructor (unique_ptr can only be moved, no copying, so we can’t use the default copy constructor that simply copies all the member variables; instead we have to define our own copy constructor that copies the object pointed to and stores it in the new unique_ptr in the new Foo object). With this choice of type for m_bar the return type of GetBar should be a bare pointer, probably const. Yep, bare pointers still have valid uses in this age of smart pointers, returning a possibly null object being one of the big ones. Just think of them as nullable references that imply nothing about ownership.

There's no way to return a unique_ptr without transferring ownership, unique_ptr is just built that way. So even if we wanted to, we couldn't return a unique_ptr from GetBar since we don't want to transfer ownership. We could return a reference to the unique_ptr, but that's just needlessly exposing an implementation detail of Foo – there's no reason for anyone to know that m_bar is a unique_ptr unless ownership is at stake. Thus our only real choice for the return type is a bare pointer.

In the real code I was looking at we didn’t need to worry about thread synchronization issues in GetBar. The Foo object in use was always const and guaranteed to be around as long as we needed it, so m_bar couldn’t change or go away. Had the immutability of m_bar not been guaranteed then we would have needed to return a copy. In the real code Bar is just a bag of doubles so it’s not cheap to copy, and move semantics can’t help since there’s no way to implement the move machinery for this class other than just copying the object. We could rely on the return-value-optimization or modify the function to take an output argument (i.e. manual RVO), but we’d still be copying m_bar. If you need to avoid this copy you’re best off making m_bar a shared_ptr to a const Bar and returning a copy of the shared_ptr from GetBar (being sure to synchronize the copying of m_bar). There’s overhead involved in copying the shared_ptr, but its not as bad as copying the Bar object itself. And most of the overhead of copying the shared_ptr is thread synchronization on the reference count. Were we copying the object we'd be doing synchronization there too. So performance-wise things are a wash on the thread synchronization front and we only need to worry about the cost of the object copy.

Even in this simple example there’s a lot to think about in terms of member variable types and return types – we didn’t even get into argument types. We could throw stack based variables in there to round things out, but there’s not a lot to say about the stack. A stack variable is either the result of calling another function, something that is going to be passed to another function, something that is going to be returned from the current function, or an intermediate result. In all but the last case the type is fixed by how you’re going to use it. In the last case you just want to use a plain old object and be done with it. So the stack isn’t really all the interesting compared with the other context.

There's also quite a few ways to refer to objects: the objects themselves (i.e. value semantics), references, bare pointers, unique ptr, shared_ptr, and boost::optional were mentioned above. Throw in weak_ptr and we’ve got seven ways to refer to objects – and that's ignoring more esoteric types like boost::variant. Over the next seven or so posts I’m going to lay out my philosophy on when to use each reference type in each context in much more detail than I’ve gone into above (and more detail than people will probably want to read, perhaps I should have prefaced detail with excruciating since there's a lot of detail to be gone into). Hopefully it won’t take me years to get through all the types, though given my track record with posting frequency that might be hoping for too much.

Update: I’ve now started the series threatened above.

Advertisements

4 comments

  1. […] a while back I posted about some issues that I found with my co-workers code while doig a review. At the end of the post I said that I’m going to go into excruciating detail about when I think […]

  2. […] wants to go into about which reference type to use when, we now arrive at part 2: references (see motivation and part […]

  3. […] series going into too much detail about what type of reference to use when is bare pointers (see motivation and parts 1 and 2). Even though we now have an abundance of smart pointers to choose from – about […]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: