C++ programming language: Smart pointers

RAII
Unique pointer
Shared pointer
Weak pointer
Intrusive pointer

¶RAII

Before diving into the various pointers, their code and examples below, we need to consider a crucial concept in C++: the RAII idiom. Understanding this idiom answers the question of why we need smart pointers at all.

The abbreviation RAII stands for Resource Acquisition Is Initialization. What does it mean?

The idea is that a resource not only has to be initialized before you start working with it, but also has to be properly released afterwards.

Typical examples of such resources are

Connections to databases (open / close)
Network sockets (open / close)
Files (open / close)
Mutexes (lock / unlock)
Memory (allocate / deallocate)

Why do we need to release a resource explicitly? It is a good question. Sometimes it depends on the resource itself and how it is implemented. For the sake of example, consider writing to a file. Imagine you opened a file, started writing to it, and then forgot to close it. What can go wrong? You might expect the file on disk to contain everything you have written, but its contents can be incomplete. Why? Because data is usually not flushed to disk on every write but only when some predefined buffer size is reached, and the last flush is done when the resource is closed.

The simpler answer is that explicitly closing a resource not only performs such postponed work but also releases memory and other resources.

How can you forget to release a resource? Easily: you can have a big function which initializes the resource at the beginning and releases it at the end. Some time later you add an additional check with an early return in between and forget that the resource must be released there too.

Example:

void foo(int b)
{
    int * val = new int(5);

    // some logic here ...

    if (b > 5) return; // or throw std::runtime_error("b > 5")

    // some logic here ...

    delete val; // never reached if b > 5, so the memory leaks
}

Also we can accidentally call delete several times, which is undefined behavior and typically ends in a segmentation fault:

void foo(int b)
{
    int * val = new int(5);

    // some logic here ...

    delete val; // first deletion, memory is freed

    // some logic here ...

    delete val; // undefined behavior (often a segmentation fault), second deletion
}

Or, even worse, we can try to work with the object after it was deleted:

void foo(int b)
{
    int * val = new int(5);

    // some logic here ...

    delete val; // memory is freed here

    // no standard way to check if memory was already freed
    *val = 3; // undefined behavior (often a segmentation fault), access of freed memory
}

Plus, a raw pointer does not tell whether it points to a single object or to an array of objects, so it is easy to mix up delete and delete[].

To avoid all such situations, RAII is used. For a resource you need to work with, you create a wrapper class which initializes the resource in its constructor, releases it in its destructor and, of course, provides additional methods to work with the resource. So instead of working with the resource directly, you use this wrapper class. When an instance of the wrapper goes out of scope, its destructor is called and the resource is released automatically.
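
To make the wrapper idea concrete, here is a minimal sketch for the file example mentioned earlier. The names FileWriter, writeGreeting and readAll are hypothetical and chosen only for illustration; the sketch assumes the C stdio API is an acceptable stand-in for "a file resource" and is not meant as a complete file abstraction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical RAII wrapper: the file is opened in the constructor
// and closed in the destructor.
class FileWriter
{
public:
    explicit FileWriter(const std::string & path)
        : file_(std::fopen(path.c_str(), "w"))
    {
        if (!file_) throw std::runtime_error("cannot open " + path);
    }

    ~FileWriter()
    { std::fclose(file_); } // flushes buffered data and closes the file

    // The wrapper owns the handle exclusively, so copying is disabled
    FileWriter(const FileWriter &) = delete;
    FileWriter & operator =(const FileWriter &) = delete;

    void write(const std::string & text)
    {
        assert(file_ != nullptr);
        std::fputs(text.c_str(), file_);
    }

private:
    std::FILE * file_;
};

// Even with the early return, the destructor runs and closes the file
void writeGreeting(const std::string & path, bool stopEarly)
{
    FileWriter out(path);
    out.write("hello");

    if (stopEarly) return;

    out.write(" world");
}

// Helper used only to inspect what ended up on disk
std::string readAll(const std::string & path)
{
    std::string content;
    if (std::FILE * in = std::fopen(path.c_str(), "r")) {
        char buf[256];
        std::size_t n = 0;
        while ((n = std::fread(buf, 1, sizeof buf, in)) > 0) content.append(buf, n);
        std::fclose(in);
    }
    return content;
}
```

Calling writeGreeting(path, true) leaves a file containing "hello": the early return skips the second write, but not the cleanup, because the cleanup lives in the destructor.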

The first attempt to create a generalized wrapper (smart pointer) in the C++ standard library was auto_ptr, introduced in C++98. It is now legacy: it was deprecated in C++11 and removed from the standard in C++17.

Simple possible implementation:

template<typename T>
class auto_ptr
{
public:
    auto_ptr(T * ptr)
        : ptr_(ptr)
    {}

    ~auto_ptr()
    { delete ptr_; } // deleting nullptr is a no-op

    // "Copying" takes a non-const reference because it steals ownership
    auto_ptr(auto_ptr & other)
        : ptr_(other.ptr_)
    { other.ptr_ = nullptr; }

    auto_ptr & operator =(auto_ptr & other)
    {
        if (this != &other) {
            delete ptr_;
            ptr_ = other.ptr_;
            other.ptr_ = nullptr;
        }

        return *this;
    }

public:
    T & operator *() const
    { return *ptr_; }

private:
    T * ptr_;
};

It works well for the cases considered above, but if we decide to copy such a pointer we face probably unexpected behavior:

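#include <iostream>
#include <memory>

// Compiles only with C++11 or C++14: std::auto_ptr was removed in C++17
void foo()
{
    auto ptr = std::auto_ptr<int>(new int(5));
    auto ptr2 = ptr;

    std::cout << *ptr2 << "\n"; // works ok, ownership went to ptr2
    std::cout << *ptr << "\n"; // undefined behavior (often a segmentation fault), original pointer nullified
}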

Thus the C++ standardization committee decided to make such behavior explicit for programmers and deprecated auto_ptr in favor of unique_ptr in C++11, when move semantics were introduced. To underline it: before C++11 there were no move semantics, so it was impossible to implement auto_ptr with the explicit unique ownership we consider below.

¶Unique pointer

std::unique_ptr is a smart pointer used to manage exclusive ownership of a resource. It makes sure that only one such pointer owns the resource at any given time. This ownership model prevents multiple pointers from accidentally deleting the same resource.

Simple possible implementation:

template<typename T>
class unique_ptr
{
public:
    // Ctrs
    explicit unique_ptr(T * ptr = nullptr)
        : ptr_(ptr)
    {}

    ~unique_ptr()
    { delete ptr_; } // deleting nullptr is a no-op

    // Move constructor takes over the resource and empties the source
    unique_ptr(unique_ptr && src) noexcept
        : ptr_(src.ptr_)
    { src.ptr_ = nullptr; }

    unique_ptr & operator=(unique_ptr && src) noexcept
    {
        if (this != &src) {
            delete ptr_;

            ptr_ = src.ptr_;
            src.ptr_ = nullptr;
        }

        return *this;
    }

    // Copy constructor and copy assignment are forbidden
    unique_ptr(const unique_ptr &) = delete;
    unique_ptr & operator =(const unique_ptr &) = delete;

public:
    T & operator *() const
    { return *ptr_; }

    T * operator ->() const
    { return ptr_; }

    T * get() const
    { return ptr_; }

private:
    T * ptr_ = nullptr;
};

If we now try to use unique_ptr from the standard library for the case we considered earlier, we get the following:

#include <iostream>
#include <memory>

void foo()
{
    auto ptr = std::unique_ptr<int>(new int(5));
    auto ptr2 = ptr; // compilation error, copy constructor is deleted

    std::cout << *ptr2 << "\n";
}

The proper way here is to use std::move:

#include <iostream>
#include <memory>
#include <utility>

void foo()
{
    auto ptr = std::unique_ptr<int>(new int(5));
    auto ptr2 = std::move(ptr); // no error, ptr is moved to ptr2 and ptr is nullified

    std::cout << *ptr2 << "\n";
    // still undefined behavior (typically a segmentation fault): ptr is null now,
    // so check its state before using it; language servers in this case
    // can produce a WARNING saying that you use a variable after it was moved
    std::cout << *ptr << "\n";
}
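
The state check mentioned in the comment above can be sketched as follows. The helper names valueOr and demoMovedFrom are hypothetical; the sketch relies only on the documented facts that a moved-from std::unique_ptr is empty and that unique_ptr converts to bool.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Returns the pointed-to value, or the fallback when the pointer is empty
int valueOr(const std::unique_ptr<int> & ptr, int fallback)
{
    // unique_ptr converts to bool: false means it owns nothing
    return ptr ? *ptr : fallback;
}

int demoMovedFrom()
{
    auto ptr = std::make_unique<int>(5);
    auto ptr2 = std::move(ptr); // ptr is guaranteed to be empty afterwards

    assert(ptr == nullptr);
    assert(ptr2 != nullptr);

    // Safe: the empty pointer is never dereferenced
    return valueOr(ptr, -1) + valueOr(ptr2, -1); // -1 + 5
}
```

Here demoMovedFrom() returns 4, because the moved-from pointer yields the fallback instead of being dereferenced.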

Just to conclude:

unique_ptr must be used in situations where only one pointer may point to some object in memory, i.e. it has exclusive ownership over it.
The size of unique_ptr with the default deleter coincides with the size of a raw C pointer in mainstream implementations.
With compiler optimizations enabled, operations on unique_ptr are not slower than on raw pointers, because the compiler can inline the calls to unique_ptr methods and produce assembly equal to the code using a raw pointer.
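
The size claim can be checked with a small sketch. All names here (rawPointerSize, uniquePointerSize, CountingDeleter and so on) are hypothetical. The equality of sizes is an observation about mainstream standard library implementations with the default deleter, not a guarantee of the standard; a deleter that carries state has to be stored in the pointer and makes it bigger.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

std::size_t rawPointerSize()
{ return sizeof(int *); }

std::size_t uniquePointerSize()
{ return sizeof(std::unique_ptr<int>); }

// A deleter with state: it must live inside the unique_ptr object
struct CountingDeleter
{
    int * counter;

    void operator ()(int * p) const
    {
        assert(counter != nullptr);
        ++(*counter);
        delete p;
    }
};

std::size_t statefulDeleterPointerSize()
{ return sizeof(std::unique_ptr<int, CountingDeleter>); }

// Shows that the custom deleter runs exactly once at scope exit
int deleterCalls()
{
    int calls = 0;
    {
        std::unique_ptr<int, CountingDeleter> p(new int(1), CountingDeleter{&calls});
    }
    return calls;
}
```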

A typical example where unique_ptr can be used is the Factory function pattern:

#include <memory>

enum class OpType
{
    AsIs = 0,
    Squared
};

class BaseOp
{
public:
    virtual ~BaseOp() = default;
    virtual int get() = 0;
};

class AsIs : public BaseOp
{
public:
    AsIs(int val)
        : val_(val)
    {}

    virtual int get() override
    { return val_; }

private:
    int val_;
};

class Squared : public BaseOp
{
public:
    Squared(int val)
        : val_(val)
    {}

    virtual int get() override
    { return val_ * val_ ; }

private:
    int val_;
};

std::unique_ptr<BaseOp> makeOperation(OpType type, int val)
{
    std::unique_ptr<BaseOp> op;

    switch (type) {
    case OpType::AsIs:
        op = std::make_unique<AsIs>(val);
        break;
    case OpType::Squared:
        op = std::make_unique<Squared>(val);
        break;
    default:
        break;
    }

    return op;
}

Here is what happens in this case: we have a Factory function which allocates memory for the object to construct, selected by its first argument type and initialized with its second argument val, and returns a pointer to the calling function. Ownership of the object moves to the calling function together with the unique_ptr, so the object is deleted automatically when that unique_ptr goes out of scope in the caller.
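
The caller's side can be sketched like this. The block restates the factory in a condensed form so that it compiles on its own; the function evaluate is a hypothetical caller added for illustration.

```cpp
#include <cassert>
#include <memory>

enum class OpType { AsIs, Squared };

class BaseOp
{
public:
    virtual ~BaseOp() = default;
    virtual int get() = 0;
};

class AsIs : public BaseOp
{
public:
    explicit AsIs(int val) : val_(val) {}
    int get() override { return val_; }
private:
    int val_;
};

class Squared : public BaseOp
{
public:
    explicit Squared(int val) : val_(val) {}
    int get() override { return val_ * val_; }
private:
    int val_;
};

// Condensed version of the factory from the text
std::unique_ptr<BaseOp> makeOperation(OpType type, int val)
{
    switch (type) {
    case OpType::AsIs:    return std::make_unique<AsIs>(val);
    case OpType::Squared: return std::make_unique<Squared>(val);
    }
    return nullptr; // unknown type: empty pointer
}

// Hypothetical caller: it receives ownership and never calls delete itself
int evaluate(OpType type, int val)
{
    std::unique_ptr<BaseOp> op = makeOperation(type, val);
    assert(op != nullptr);
    return op->get();
} // op goes out of scope here and destroys the object
```

The virtual destructor in BaseOp matters here: the object is destroyed through a pointer to the base class.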

¶Shared pointer

std::shared_ptr is a smart pointer used to manage shared ownership of a resource. Such a pointer can be copied as many times as needed and passed around your codebase. All copies point to the same object in memory. The resource is deleted only when all shared pointers owning it are destroyed, or when the last one alive is reassigned to another resource.
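
Before looking at an implementation, this behavior can be observed with the standard std::shared_ptr. The function name observeUseCount is hypothetical; it simply records use_count() at several points.

```cpp
#include <cassert>
#include <memory>
#include <vector>

std::vector<long> observeUseCount()
{
    std::vector<long> counts;

    auto first = std::make_shared<int>(10);
    counts.push_back(first.use_count());      // one owner

    {
        std::shared_ptr<int> second = first;  // the copy shares ownership
        assert(second.get() == first.get());  // both point to the same object
        counts.push_back(first.use_count());  // two owners
    } // second is destroyed, the object stays alive

    counts.push_back(first.use_count());      // back to one owner

    first.reset(new int(11));                 // old object is freed, a new one is owned
    counts.push_back(first.use_count());

    return counts;
}
```

The recorded counts are 1, 2, 1 and 1.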

Simple possible implementation:

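#include <cstddef>
#include <iostream>

template<typename T>
class shared_ptr
{
public:
    explicit shared_ptr(T * ptr)
        : ptr_(ptr)
        , ref_count_(new std::size_t(1))
    {}

    // Copying shares the counter with the source and increments it
    shared_ptr(const shared_ptr<T> & other)
        : ptr_(other.ptr_)
        , ref_count_(other.ref_count_)
    { if (ref_count_) ++(*ref_count_); }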

    ~shared_ptr()
    {
        if (ptr_ && --(*ref_count_) == 0) {
            delete ptr_;
            delete ref_count_;
            std::cout << "Destructor called only once when the last alive pointer is deleted" << "\n";
        }
    }

    T & operator *()
    { return *ptr_; }

    std::size_t use_count()
    { return (ref_count_)? *ref_count_ : 0; }

private:
    T * ptr_;
    std::size_t * ref_count_;
};

int main() {
    shared_ptr<int> ptr1{new int(10)};
    std::cout << "ptr1: " << *ptr1 << ", use count = " << ptr1.use_count() << "\n";

    shared_ptr<int> ptr2 = ptr1;
    std::cout << "ptr1: " << *ptr1 << ", use count = " << ptr1.use_count() << "\n";
    std::cout << "ptr2: " << *ptr2 << ", use count = " << ptr2.use_count() << "\n";
    return 0;
}

ptr1: 10, use count = 1
ptr1: 10, use count = 2
ptr2: 10, use count = 2
Destructor called only once when the last alive pointer is deleted

Since different shared pointers should be able to point to different resources, even of the same type, the reference counter is implemented as a pointer to a heap-allocated counter instead of a static member of the template class. A static member would be shared by all shared_ptr<T> instances with the same T, which breaks the counting logic, and we would get the following:

#include <cstddef>
#include <iostream>

template<typename T>
class shared_ptr
{
public:
    explicit shared_ptr(T * ptr)
        : ptr_(ptr)
    {
        ref_count_ = 1;
    }

    shared_ptr(const shared_ptr<T> & other)
        : ptr_(other.ptr_)
    {
        ++ref_count_;
    }

    ~shared_ptr()
    {
        if (ptr_ && --ref_count_ == 0) {
            delete ptr_;
        }
    }

    T & operator *()
    { return *ptr_; }

    std::size_t use_count()
    { return ref_count_; }

private:
    T * ptr_;
    inline static std::size_t ref_count_ = 0;
};

int main() {
    shared_ptr<int> ptr1{new int(10)};
    shared_ptr<int> ptr2 = ptr1;
    shared_ptr<int> ptr3{new int(11)};
    std::cout << "ptr1: " << *ptr1 << ", use count = " << ptr1.use_count() << "\n";
    std::cout << "ptr2: " << *ptr2 << ", use count = " << ptr2.use_count() << "\n";
    std::cout << "ptr3: " << *ptr3 << ", use count = " << ptr3.use_count() << "\n";
    return 0;
}

ptr1: 10, use count = 1
ptr2: 10, use count = 1
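ptr3: 11, use count = 1

As you can see, the reference counter is incorrect: a single counter is shared by two different resources, the first one used by ptr1 and ptr2 and the second one used by ptr3. On destruction, ptr3 brings the counter to 0 and frees its object, after which the unsigned counter wraps around for ptr2 and ptr1, so the first object is never freed.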

¶Assignment operator

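#include <cstddef>
#include <iostream>

template<typename T>
class shared_ptr
{
public:
    explicit shared_ptr(T * ptr)
        : ptr_(ptr)
        , ref_count_(new std::size_t(1))
    {}

    // Copying shares the counter with the source and increments it
    shared_ptr(const shared_ptr<T> & other)
        : ptr_(other.ptr_)
        , ref_count_(other.ref_count_)
    { if (ref_count_) ++(*ref_count_); }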

    ~shared_ptr()
    {
        if (ptr_ && --(*ref_count_) == 0) {
            delete ptr_;
            delete ref_count_;
            std::cout << "Destructor called only once when the last alive pointer is deleted" << "\n";
        }
    }

    /**
     * ASSIGNMENT OPERATOR checks if current object must be deleted
     * and new one is copied over with increasing its shared reference counter
     */
    shared_ptr & operator =(const shared_ptr<T> & other)
    {
        if (this == &other) return *this;

        if (ptr_ && --(*ref_count_) == 0) {
            delete ptr_;
            delete ref_count_;
        }

        ptr_ = other.ptr_;
        ref_count_ = other.ref_count_;

        if (ptr_) ++(*ref_count_);

        return *this;
    }

    T & operator *()
    { return *ptr_; }

    std::size_t use_count()
    { return (ref_count_)? *ref_count_ : 0; }

private:
    T * ptr_;
    std::size_t * ref_count_;
};

int main() {
    shared_ptr<int> ptr1{new int(10)};
    shared_ptr<int> ptr2{new int(11)};

    std::cout << "ptr1: " << *ptr1 << ", use count = " << ptr1.use_count() << "\n";

    ptr1 = ptr2;

    std::cout << "ptr1: " << *ptr1 << ", use count = " << ptr1.use_count() << "\n";
    std::cout << "ptr2: " << *ptr2 << ", use count = " << ptr2.use_count() << "\n";
    return 0;
}

ptr1: 10, use count = 1
ptr1: 11, use count = 2
ptr2: 11, use count = 2
Destructor called only once when the last alive pointer is deleted

For move construction the logic is much simpler and does not touch the reference counter:

shared_ptr(shared_ptr<T> && other)
    : ptr_(other.ptr_)
    , ref_count_(other.ref_count_)
{
    other.ptr_ = nullptr;
    other.ref_count_ = nullptr;
}
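
The same effect can be observed with the standard std::shared_ptr, which is used here instead of the simplified class above. The names MoveResult and moveSharedPointer are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct MoveResult
{
    long countBefore;
    long countAfter;
    bool sourceEmpty;
};

MoveResult moveSharedPointer()
{
    auto source = std::make_shared<int>(42);
    long before = source.use_count();

    // Ownership is transferred: the number of owners does not change
    std::shared_ptr<int> target = std::move(source);

    assert(*target == 42);
    return {before, target.use_count(), source == nullptr};
}
```

The use count is 1 both before and after the move, and the moved-from pointer is empty.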

¶Real implementation

What was shown above is just an example of how it could look in its simplest version. Of course, the real std::shared_ptr differs in some aspects:

When constructed from a raw pointer, it involves two allocations and contains two pointers, one to the object of type T and another to a structure called the control block (std::make_shared typically combines both allocations into one).
The control block typically holds the following members:
1. Raw pointer to the managed object of type T.
2. use_count: counter for the number of existing shared_ptrs pointing to the same object.
3. weak_use_count: counter for the number of existing weak_ptrs pointing to the same object (we will consider it below).
4. Deleter: the custom deleter, if one was provided when the shared_ptr was constructed (its type is erased).
5. Allocator: the custom allocator, if one was provided when the shared_ptr was constructed (its type is erased).
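
What "its type is erased" means for the deleter can be sketched with the standard library. The function name countCustomDeletions is hypothetical. The point is that the deleter is not part of the shared_ptr type, unlike with unique_ptr.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

int countCustomDeletions()
{
    int deletions = 0;

    {
        // The custom deleter is passed to the constructor and kept
        // alongside the counters, not in the pointer's type
        std::shared_ptr<int> withDeleter(new int(1), [&deletions](int * p) {
            ++deletions;
            delete p;
        });

        std::shared_ptr<int> plain = std::make_shared<int>(2);

        // Both variables have exactly the same static type
        static_assert(std::is_same<decltype(withDeleter), decltype(plain)>::value,
                      "the deleter is not part of the shared_ptr type");

        plain = withDeleter; // allowed because the types match; int(2) is freed here
        assert(plain.use_count() == 2);
    } // the last owner is gone: the custom deleter runs once

    return deletions;
}
```

countCustomDeletions() returns 1: the object created with new int(1) is released through the lambda exactly once, no matter which of the two pointers was destroyed last.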

TO BE CONTINUED…

¶Weak pointer

TO BE CONTINUED…

¶Intrusive pointer

The intrusive shared pointer is not a part of the C++ standard since it can be mimicked with std::shared_ptr, but it is still useful to get acquainted with this type of smart pointer.

TO BE CONTINUED…