Blog Archive for / 2016 / 02 /

Core C++ - lvalues and rvalues

Saturday, 27 February 2016

One of the most misunderstood aspect of C++ is the use of the terms lvalue and rvalue, and what they mean for how code is interpreted. Though lvalue and rvalue are inherited from C, with C++11, this taxonomy was extended and clarified, and 3 more terms were added: glvalue, xvalue and prvalue. In this article I'm going to examine these terms, and explain what they mean in practice.

Before we discuss the importance of whether something is an lvalue or rvalue, let's take a look at what makes an expression have each characteristic.

The Taxonomy

The result of every C++ expression is either an lvalue, or an rvalue. These terms come from C, but the C++ definitions have evolved quite a lot since then, due to the greater expressiveness of the C++ language.

rvalues can be split into two subcategories: xvalues, and prvalues, depending on the details of the expression. These subcategories have slightly different properties, described below.

One of the differences is that xvalues can sometimes be treated the same as lvalues. To cover those cases we have the term glvalue — if something applies to both lvalues and xvalues then it is described as applying to glvalues.

Now for the definitions.

glvalues

A glvalue is a Generalized lvalue. It is used to refer to something that could be either an lvalue or an xvalue.

rvalues

The term rvalue is inherited from C, where rvalues are things that can be on the Right side of an assignment. The term rvalue can refer to things that are either xvalues or prvalues.

lvalues

The term lvalue is inherited from C, where lvalues are things that can be on the Left side of an assignment.

The simplest form of lvalue expression is the name of a variable. Given a variable declaration:

A v1;

The expression v1 is an lvalue of type A.

Any expression that results in an lvalue reference (declared with &) is also an lvalue, so the result of dereferencing a pointer, or calling a function that returns an lvalue reference is also an lvalue. Given the following declarations:

A* p1;

A& r1=v1;

A& f1();

The expression *p1 is an lvalue of type A, as is the expression f1() and the expression r1.

Accessing a member of an object where the object expression is an lvalue is also an lvalue. Thus, accessing members of variables and members of objects accessed through pointers or references yields lvalues. Given

struct B{
    A a;
    A b;
};

B& f2();
B* p2;
B v2;

then f2().a, p2->b and v2.a are all lvalues of type A.

String literals are lvalues, so "hello" is an lvalue of type array of 6 const chars (including the null terminator). This is distinct from other literals, which are prvalues.

Finally, a named object declared with an rvalue reference (declared with &&) is also an lvalue. This is probably the most confusing of the rules, if for no other reason than that it is called an rvalue reference. The name is just there to indicate that it can bind to an rvalue (see later); once you've declared a variable and given it a name it's an lvalue. This is most commonly encountered in function parameters. For example:

void foo(A&& a){
}

Within foo, a is an lvalue (of type A), but it will only bind to rvalues.

xvalues

An xvalue is an eXpiring value: an unnamed objects that is soon to be destroyed. xvalues may be either treated as glvalues or as rvalues depending on context.

xvalues are slightly unusual in that they usually only arise through explicit casts and function calls. If an expression is cast to an rvalue reference to some type T then the result is an xvalue of type T. e.g. static_cast<A&&>(v1) yields an xvalue of type A.

Similarly, if the return type of a function is an rvalue reference to some type T then the result is an xvalue of type T. This is the case with std::move(), which is declared as:

template <typename T>
constexpr remove_reference_t<T>&&
move(T&& t) noexcept;

Thus std::move(v1) is an xvalue of type A — in this case, the type deduction rules deduce T to be A& since v1 is an lvalue, so the return type is declared to be A&& as remove_reference_t<A&> is just A.

The only other way to get an xvalue is by accessing a member of an rvalue. Thus expressions that access members of temporary objects yield xvalues, and the expression B().a is an xvalue of type A, since the temporary object B() is a prvalue. Similarly, std::move(v2).a is an xvalue, because std::move(v2) is an xvalue, and thus an rvalue.

prvalues

A prvalue is a Pure rvalue; an rvalue that is not an xvalue.

Literals other than string literals (which are lvalues) are prvalues. So 42 is a prvalue of type int, and 3.141f is a prvalue of type float.

Temporaries are also prvalues. Thus given the definition of A above, the expression A() is a prvalue of type A. This applies to all temporaries: any temporaries created as a result of implicit conversions are thus also prvalues. You can therefore write the following:

int consume_string(std::string&& s);
int i=consume_string("hello");

as the string literal "hello" will implicitly convert to a temporary of type std:string, which can then bind to the rvalue reference used for the function parameter, because the temporary is a prvalue.

Reference binding

Probably the biggest difference between lvalues and rvalues is in how they bind to references, though the differences in the type deduction rules can have a big impact too.

There are two types of references in C++: lvalue references, which are declared with a single ampersand, e.g. T&, and rvalue references which are declared with a double ampersand, e.g. T&&.

lvalue references

A non-const lvalue reference will only bind to non-const lvalues of the same type, or a class derived from the referenced type.

struct C:A{};

int i=42;
A a;
B b;
C c;
const A ca{};

A& r1=a;
A& r2=c;
//A& r3=b; // error, wrong type
int& r4=i;
// int& r5=42; // error, cannot bind rvalue
//A& r6=ca; // error, cannot bind const object to non const ref
A& r7=r1;
// A& r8=A(); // error, cannot bind rvalue
// A& r9=B().a; // error, cannot bind rvalue
// A& r10=C(); // error, cannot bind rvalue

A const lvalue reference on the other hand will also bind to rvalues, though again the object bound to the reference must have the same type as the referenced type, or a class derived from the referenced type. You can bind both const and non-const values to a const lvalue reference.

const A& cr1=a;
const A& cr2=c;
//const A& cr3=b; // error, wrong type
const int& cr4=i;
const int& cr5=42; // rvalue can bind OK
const A& cr6=ca; // OK, can bind const object to const ref
const A& cr7=cr1;
const A& cr8=A(); // OK, can bind rvalue
const A& cr9=B().a; // OK, can bind rvalue
const A& cr10=C(); // OK, can bind rvalue
const A& cr11=r1;

If you bind a temporary object (which is a prvalue) to a const lvalue reference, then the lifetime of that temporary is extended to the lifetime of the reference. This means that it is OK to use r8, r9 and r10 later in the code, without running the undefined behaviour that would otherwise accompany an access to a destroyed object.

This lifetime extension does not extend to references initialized from the first reference, so if a function parameter is a const lvalue reference, and gets bound to a temporary passed to the function call, then the temporary is destroyed when the function returns, even if the reference was stored in a longer-lived variable, such as a member of a newly constructed object. You therefore need to take care when dealing with const lvalue references to ensure that you cannot end up with a dangling reference to a destroyed temporary.

volatile and const volatile lvalue references are much less interesting, as volatile is a rarely-used qualifier. However, they essentially behave as expected: volatile T& will bind to a volatile or non-volatile, non-const lvalue of type T or a class derived from T, and volatile const T& will bind to any lvalue of type T or a class derived from T. Note that volatile const lvalue references do not bind to rvalues.

rvalue references

An rvalue reference will only bind to rvalues of the same type, or a class derived from the referenced type. As for lvalue references, the reference must be const in order to bind to a const object, though const rvalue references are much rarer than const lvalue references.

const A make_const_A();

// A&& rr1=a; // error, cannot bind lvalue to rvalue reference
A&& rr2=A();
//A&& rr3=B(); // error, wrong type
//int&& rr4=i; // error, cannot bind lvalue
int&& rr5=42;
//A&& rr6=make_const_A(); // error, cannot bind const object to non const ref
const A&& rr7=A();
const A&& rr8=make_const_A();
A&& rr9=B().a;
A&& rr10=C();
A&& rr11=std::move(a); // std::move returns an rvalue
// A&& rr12=rr11; // error rvalue references are lvalues

rvalue references extend the lifetime of temporary objects in the same way that const lvalue references do, so the temporaries associated with rr2, rr5, rr7, rr8, rr9, and rr10 will remain alive until the corresponding references are destroyed.

Implicit conversions

Just because a reference won't bind directly to the value of an expression, doesn't mean you can't initialize it with that expression, if there is an implicit conversion between the types involved.

For example, if you try and initialize a reference-to-A with a D object, and D has an implicit conversion operator that returns an A& then all is well, even though the D object itself cannot bind to the reference: the reference is bound to the result of the conversion operator.

struct D{
    A a;
    operator A&() {
        return a;
    }
};

D d;
A& r=d; // reference bound to result of d.operator A&()

Similarly, a const lvalue-reference-to-E will bind to an A object if A is implicitly convertible to E, such as with a conversion constructor. In this case, the reference is bound to the temporary E object that results from the conversion (which therefore has its lifetime extended to match that of the reference).

struct E{
    E(A){}
};

const E& r=A(); // reference bound to temporary constructed with E(A())

This allows you to pass string literals to functions taking std::string by const reference:

void foo(std::string const&);

foo("hello"); // ok, reference is bound to temporary std::string object

Other Properties

Whether or not an expression is an lvalue or rvalue can affect a few other aspects of your program. These are briefly summarised here, but the details are out of scope of this article.

In general, rvalues cannot be modified, nor can they have their address taken. This means that simple expressions like A()=something or &A() or &42 are ill-formed. However, you can call member functions on rvalues of class type, so X().do_something() is valid (assuming class X has a member function do_something).

Class member functions can be tagged with ref qualifiers to indicate that they can be applied to only rvalues or only lvalues. ref qualifiers can be combined with const, and can be used to distinguish overloads, so you can define different implementations of a function depending whether the object is an lvalue or rvalue.

When the type of a variable or function template parameter is deduce from its initializer, whether the initializer is an lvalue or rvalue can affect the deduced type.

End Note

rvalues and lvalues are a core part of C++, so understanding them is essential. Hopefully, this article has given you a greater understanding of their properties, and how they relate to the rest of C++.

If you liked this article, please share with the buttons below. If you have any questions or comments, please submit them using the comment form.

Posted by Anthony Williams
[/ cplusplus /] permanent link
Tags: cplusplus, lvalues, rvalues
Stumble It! | Submit to Reddit | Submit to DZone

Comment on this post

If you liked this post, why not subscribe to the RSS feed or Follow me on Twitter? You can also subscribe to this blog by email using the form on the left.

C++ Concurrency in Action now available in Chinese!

Wednesday, 03 February 2016

Last week there was considerable excitement in my house as I received my copies of the Chinese translation of my book. The code looks the same, and they spelled my name correctly, but that's all I can tell. I can't read a word of Chinese, so I hope the content has translated OK, and doesn't read like it's been run through automatic translation software.

It's a great feeling to know that my book is going to reach a wider audience, joining the ranks of the C++ books available in Chinese. As Bjarne commented when I posted on Facebook, "we are getting there".

Posted by Anthony Williams
[/ news /] permanent link
Tags: C++, concurrency, multithreading, book
Stumble It! | Submit to Reddit | Submit to DZone