Standardizing Variant: Difficult Decisions
Wednesday, 17 June 2015
One of the papers proposed for the next version of the C++ Standard is N4542: Variant: a type safe union (v4). As you might guess from the (v4) in the title, this paper has been discussed several times by the committee, and revised in the light of discussions.
Boost has had a variant type for a long time, so it only seems natural to standardize it. However, there are a couple of design decisions made for boost::variant which members of the committee were uncomfortable with, so the current paper has a couple of differences from boost::variant. The most notable of these is that boost::variant has a "never empty" guarantee, whereas N4542 proposes a variant that can be empty.
Why do we need empty variants?
Let's assume for a second that our variant is never empty, as per boost::variant, and consider the following code with two classes A and B:
variant<A,B> v1{A()};
variant<A,B> v2{B()};
v1=v2;
Before the assignment, v1 holds a value of type A. After the assignment v1=v2, v1 has to hold a value of type B, which is a copy of the value held in v2. The assignment therefore has to destroy the old value of type A and copy-construct a new value of type B into the internal storage of v1.
If the copy-construction of B does not throw, then all is well. However, if the copy construction of B does throw then we have a problem: we just destroyed our old value (of type A), so we're in a bit of a predicament: the variant isn't allowed to be empty, but we don't have a value!
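To make the predicament concrete, here is a minimal sketch of a hand-rolled two-alternative variant whose assignment destroys first and copies second. All of the names (naive_variant, copy_should_throw, assignment_leaves_variant_empty) are hypothetical and invented for this illustration; this is not how boost::variant or the N4542 variant is implemented.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>

bool copy_should_throw = false;   // hypothetical switch that simulates a failing copy

struct A {};
struct B {
    B() {}
    B(B const&) { if (copy_should_throw) throw std::runtime_error("copy failed"); }
};

// Hypothetical two-alternative variant that destroys first and copies second.
class naive_variant {
    union { A a; B b; };   // shared storage; lifetimes are managed by hand
    int which;             // 0 => A, 1 => B, -1 => no value (the forbidden state)
    void destroy() {
        if (which == 0) a.~A();
        else if (which == 1) b.~B();
        which = -1;
    }
public:
    naive_variant(A const& v) : which(-1) { new (&a) A(v); which = 0; }
    naive_variant(B const& v) : which(-1) { new (&b) B(v); which = 1; }
    ~naive_variant() { destroy(); }
    naive_variant& operator=(naive_variant const& other) {
        if (this == &other) return *this;
        destroy();                                        // the old value is gone
        if (other.which == 0) new (&a) A(other.a);
        else if (other.which == 1) new (&b) B(other.b);   // may throw
        which = other.which;                              // skipped if the copy threw
        return *this;
    }
    bool has_value() const { return which != -1; }
};

// Returns true if the failed assignment left v1 without any value.
bool assignment_leaves_variant_empty() {
    naive_variant v1{A()};
    naive_variant v2{B()};
    assert(v1.has_value() && v2.has_value());
    copy_should_throw = true;
    try { v1 = v2; } catch (std::runtime_error const&) {}
    copy_should_throw = false;
    return !v1.has_value();
}
```

In this sketch the "no value" state is exactly what a never-empty variant promises cannot happen, which is why the destroy-then-copy order is not enough on its own.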
Can we fix it? Double buffering
In 2003 I wrote an article about this, proposing a solution involving double-buffering: the variant type could contain a buffer big enough to hold both A and B. Then, the assignment operator could copy-construct the new value into the empty space, and only destroy the old value if this succeeded. If an exception was thrown then the old value is still there, so we avoid the previous predicament.
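A rough sketch of the double-buffering idea follows. It is an illustration under simplifying assumptions (exactly two alternatives, two full-size slots), and every name in it (double_buffered_variant, slot, copy_should_throw, index_after_failed_assignment) is hypothetical rather than taken from the 2003 article or from Boost.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>

bool copy_should_throw = false;   // hypothetical switch that simulates a failing copy

struct A {};
struct B {
    B() {}
    B(B const&) { if (copy_should_throw) throw std::runtime_error("copy failed"); }
};

// Hypothetical variant with two storage slots: one live, one spare.
class double_buffered_variant {
    union slot { A a; B b; slot() {} ~slot() {} };   // raw storage for one value
    slot slots[2];
    int active;   // which slot currently holds the value: 0 or 1
    int which;    // type of the held value: 0 => A, 1 => B
    static void construct(slot& s, int type, slot const& from) {
        if (type == 0) new (&s.a) A(from.a);
        else new (&s.b) B(from.b);                   // may throw; s stays raw storage
    }
    static void destroy(slot& s, int type) {
        if (type == 0) s.a.~A(); else s.b.~B();
    }
public:
    double_buffered_variant(A const& v) : active(0), which(0) { new (&slots[0].a) A(v); }
    double_buffered_variant(B const& v) : active(0), which(1) { new (&slots[0].b) B(v); }
    ~double_buffered_variant() { destroy(slots[active], which); }
    double_buffered_variant& operator=(double_buffered_variant const& other) {
        if (this == &other) return *this;
        int spare = 1 - active;
        construct(slots[spare], other.which, other.slots[other.active]); // may throw
        destroy(slots[active], which);   // only reached if the copy succeeded
        active = spare;
        which = other.which;
        return *this;
    }
    int index() const { return which; }
};

// Returns the index held by v1 after an assignment whose copy throws.
int index_after_failed_assignment() {
    double_buffered_variant v1{A()};
    double_buffered_variant v2{B()};
    assert(v1.index() == 0 && v2.index() == 1);
    copy_should_throw = true;
    try { v1 = v2; } catch (std::runtime_error const&) {}
    copy_should_throw = false;
    return v1.index();
}
```

Note that the old value is destroyed only after the new one exists, so the two objects briefly coexist.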
This technique isn't without downsides though. Firstly, this can double the size of the variant, as we need enough storage for the two largest types in the variant. Secondly, it changes the order of operations: the old value is not destroyed until after the new one has been constructed, which can have surprising consequences if you are not expecting it.
The current implementation of boost::variant avoids the first problem by constructing the secondary buffer on the fly. This means that assignment of variants now involves dynamic memory allocation, but does avoid the double-space requirement. However, there is no solution for the second problem: if we want to maintain the never-empty guarantee in the face of throwing copy constructors, the old value cannot be destroyed until after the new one has been constructed.
Can we fix it? Require no-throw copy construction
Given that the problem only arises due to throwing copy constructors, we could easily avoid the problem by requiring that all types in the variant have a no-throw copy constructor. The assignment is then perfectly safe, as we can destroy the old value, and copy-construct the new one, without fear of an exception throwing a spanner in the works.
Unfortunately, this has a big downside: lots of useful types that people want to put in variants, like std::string or std::vector, have throwing copy constructors, since they must allocate memory, and people would now be unable to store them directly. Instead, people would have to use std::shared_ptr<std::string> or create a wrapper that stored the exception in the case that the copy constructor threw an exception.
#include <exception>
#include <optional>

// Wraps a T so that copying never throws: a failed copy is captured
// as an exception and rethrown later from get().
template<typename T>
class value_or_exception{
private:
    std::optional<T> value;
    std::exception_ptr exception;
public:
    value_or_exception(T const& v) noexcept{
        try{
            value=v;
        } catch(...) {
            exception=std::current_exception();
        }
    }
    // Copy constructor: constructors do not return a value.
    value_or_exception(value_or_exception const& v) noexcept{
        try{
            value=v.value;
            exception=v.exception;
        } catch(...) {
            exception=std::current_exception();
        }
    }
    value_or_exception& operator=(T const& v) noexcept{
        try{
            value=v;
            exception=std::exception_ptr();
        } catch(...) {
            exception=std::current_exception();
        }
        return *this;
    }
    // more constructors and assignment operators
    T& get(){
        if(exception){
            std::rethrow_exception(exception);
        }
        return *value;
    }
};
Given such a template you could have variant<int,value_or_exception<std::string>>, since the copy constructor would not throw. However, this would make using the std::string value that little bit harder due to the wrapper: access to it would require calling get() on the value, in addition to the code required to retrieve it from the variant.
variant<int,value_or_exception<std::string>> v=get_variant_from_somewhere();
std::string& s=std::get<value_or_exception<std::string>>(v).get();
The code that retrieves the value then also needs to handle the case that the variant might be holding an exception, so get() might throw.
Can we fix it? Tag types
One proposed solution is to add a special case if one of the variant types is a special tag type like empty_variant_t, e.g. variant<int,std::string,empty_variant_t>. In this case, if the copy constructor throws then the special empty_variant_t type is stored in the variant instead of what used to be there, or what we tried to assign. This allows people who are OK with the variant being empty to use this special tag type as a marker: the variant is never strictly "empty", it just holds an instance of the special type in the case of an exception, which avoids the problems with out-of-order construction and additional storage. However, it leaves the problems for those that don't want to use the special tag type, and feels like a bit of a kludge.
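As a sketch of how the tag-type fallback might behave, the hypothetical tag_fallback_variant below stores an empty_variant_t whenever the copy during assignment throws, then lets the exception propagate. Apart from the empty_variant_t name, everything here (the class, copy_should_throw, index_after_failed_assignment) is an assumption made for illustration and is not taken from any proposal text.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>

bool copy_should_throw = false;   // hypothetical switch that simulates a failing copy

struct A {};
struct B {
    B() {}
    B(B const&) { if (copy_should_throw) throw std::runtime_error("copy failed"); }
};
struct empty_variant_t {};        // the special tag type; trivially constructible

// Hypothetical variant<A, B, empty_variant_t> with the tag as a fallback.
class tag_fallback_variant {
    union { A a; B b; empty_variant_t e; };
    int which;                    // 0 => A, 1 => B, 2 => empty_variant_t
    void destroy() noexcept {
        if (which == 0) a.~A();
        else if (which == 1) b.~B();
        else e.~empty_variant_t();
    }
public:
    tag_fallback_variant(A const& v) : which(0) { new (&a) A(v); }
    tag_fallback_variant(B const& v) : which(1) { new (&b) B(v); }
    ~tag_fallback_variant() { destroy(); }
    tag_fallback_variant& operator=(tag_fallback_variant const& other) {
        if (this == &other) return *this;
        destroy();                                   // old value goes first, as expected
        try {
            if (other.which == 0) new (&a) A(other.a);
            else if (other.which == 1) new (&b) B(other.b);   // may throw
            else new (&e) empty_variant_t();
            which = other.which;
        } catch (...) {
            new (&e) empty_variant_t();              // cannot throw: store the tag
            which = 2;
            throw;                                   // the caller still sees the failure
        }
        return *this;
    }
    int index() const { return which; }
};

// Returns the index held by v1 after an assignment whose copy throws.
int index_after_failed_assignment() {
    tag_fallback_variant v1{A()};
    tag_fallback_variant v2{B()};
    assert(v1.index() == 0 && v2.index() == 1);
    copy_should_throw = true;
    try { v1 = v2; } catch (std::runtime_error const&) {}
    copy_should_throw = false;
    return v1.index();
}
```

The variant always holds one of its listed types, but after a failure that type is the tag, which client code then has to treat much like an empty state.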
Do we need to fix it?
Given the downsides, we have to ask: is any of this really any better than allowing an empty state?
If we allow our variant to be empty then the code is simpler: we just write for the happy path in the main code. If the assignment throws then we will get an exception at that point, which we can handle, and potentially store a new value in the variant there. Also, when we try to retrieve the value we might get an exception there if the variant is empty. However, if the expected scenario is that the exception will never actually get thrown, and if it does then we have a catastrophic failure anyway, then this can greatly simplify the code.
For example, in the case of variant<int,std::string>, the only reason you'd get an exception from the std::string copy constructor is insufficient memory. In many applications, running out of dynamic memory is exceedingly unlikely (the OS will just allocate swap space), and indicates an unrecoverable scenario, so we can get away with assuming it won't happen. If our application isn't one of these, we probably know it, and will already be writing code to carefully handle out-of-memory conditions.
Other exceptions might not be so easily ignorable, but in those cases you probably also have code designed to handle the scenario gracefully.
A variant with an "empty" state is a bit like a pointer in the sense that you have to check for NULL before you use it, whereas a variant without an empty state is more like a reference in that you can rely on it having a value. I can see that any code that handles variants will therefore get filled with asserts and preconditions to check the non-emptiness of the variant.
Given the existence of an empty variant, I would rather that the various accessors such as get<T>() and get<Index>() threw an exception on the empty state, rather than just being ill-formed.
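The behaviour I have in mind can be sketched with a deliberately naive stand-in for variant<int,std::string>. The class int_or_string, the exception type empty_variant_access, and the accessor names are all hypothetical; a real variant would use shared storage and get<T>(), but the access rules are the point here.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical exception type thrown when an empty variant is accessed.
struct empty_variant_access : std::logic_error {
    empty_variant_access() : std::logic_error("variant is empty") {}
};

// Naive stand-in for variant<int,std::string>: both members always exist,
// and `which` records which one (if any) is meaningful.
class int_or_string {
    int which;        // -1 => empty, 0 => int, 1 => std::string
    int i;
    std::string s;
public:
    int_or_string() : which(-1), i(0) {}
    explicit int_or_string(int v) : which(0), i(v) {}
    explicit int_or_string(std::string const& v) : which(1), i(0), s(v) {}
    bool empty() const { return which == -1; }
    int& get_int() {
        if (empty()) throw empty_variant_access();   // defined behaviour, not UB
        if (which != 0) throw std::logic_error("variant does not hold an int");
        return i;
    }
    std::string& get_string() {
        if (empty()) throw empty_variant_access();
        if (which != 1) throw std::logic_error("variant does not hold a string");
        return s;
    }
};

// Returns true if accessing an empty variant reports the problem by throwing.
bool empty_access_throws() {
    int_or_string v;                 // nobody has stored a value yet
    assert(v.empty());
    try { v.get_int(); }
    catch (empty_variant_access const&) { return true; }
    return false;
}
```

With this shape, code that never encounters an empty variant pays nothing extra in checks, and code that does gets a catchable error at the point of access.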
Default Construction
Another potentially contentious area is that of default construction: should a variant type be default constructible? The current proposal has variant<A,B> being default-constructible if and only if A (the first listed type) is default-constructible, in which case the default constructor default-constructs an instance of A in the variant. This mimics the behaviour of the core language facility union.
This means that variant<A,B> and variant<B,A> behave differently with respect to default construction. For starters, the default-constructed type is different, but also one may be default-constructible while the other is not. For some people this is a surprising result, and undesirable.
One alternative option is that default construction picks the first default-constructible type from the list, if there are any, but this still has the problem of different orderings behaving differently.
Given that variants can be empty, another alternative is to have the default constructed variant be empty. This avoids the problem of different orderings behaving differently, and will pick up many instances of people forgetting to initialize their variants, since they will now be empty rather than holding a default-constructed value.
My preference is for the third option: default constructed variants are empty.
Duplicate types
Should we allow variant<T,T>? The current proposal allows it, and makes the values distinct. However, it comes with a price: you cannot simply construct a variant<T,T> from a T: instead you must use the special constructors that take an emplaced_index_t<I> as the first parameter, to indicate which entry you wish to construct. Similarly, you can now no longer retrieve the value merely by specifying the type to retrieve: you must specify the index, as this is now significant.
I think this is unnecessary overhead for a seriously niche feature. If people want to have two entries of the same type, but with different meanings, in their variant then they should use the type system to make them different. It's trivial to write a tagged_type template, so you can have tagged_type<T,struct SomeTag> and tagged_type<T,struct OtherTag> which are distinct types, and thus easily discriminated in the variant. Many people would argue that even this is not going far enough: you should embrace the Whole Value Idiom, and write a proper class for each distinct meaning.
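For illustration, one possible shape for such a tagged_type template is sketched below. The member names and the example tags (width_tag, height_tag) are hypothetical choices; only the tagged_type<T,Tag> form comes from the text above.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Wraps a T and uses an otherwise unused Tag parameter to make
// each meaning a distinct type.
template<typename T, typename Tag>
class tagged_type {
    T value;
public:
    explicit tagged_type(T v) : value(std::move(v)) {}
    T& get() { return value; }
    T const& get() const { return value; }
};

struct width_tag;    // hypothetical tags; they never need a definition
struct height_tag;
using width = tagged_type<int, width_tag>;
using height = tagged_type<int, height_tag>;

static_assert(!std::is_same<width, height>::value,
              "different tags give different types");

// The two ints can no longer be passed in the wrong order by accident.
int area(width w, height h) {
    assert(w.get() >= 0 && h.get() >= 0);
    return w.get() * h.get();
}
```

A variant of width and height then has two distinct alternatives that can be selected by type, with no need for index-based construction or access.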
Given that, I think it thus makes sense for variant<T,T> to be ill-formed. I'm tempted to make it valid, and the same as variant<T>, but too much of the interface depends on the index of the type in the type list. If I have variant<T,U,T,T,U,int>, what is the type index of the int, or the T for that matter? I'd rather not have to answer such questions, so it seems better to make it ill-formed.
What do you think?
What do you think about the proposed variant template? Do you agree with the design decisions? Do you have a strong opinion on the issues above, or some other aspect of the design?
Have your say in the comments below.
Posted by Anthony Williams
Tags: cplusplus, standards, variant
31 Comments
IMO it would be pretty horrible to always allow empty variants. As you say, the difference between the two approaches is akin to the one between pointers and references and we all know that it's preferable to use the latter for the values that are not optional. Yet, if any variant could be empty, we'd be always working with pointer-like objects and not even have a choice of using a reference.
OTOH, having a default ctor in variant<> makes more sense to me and, if we imposed a requirement that the first of the variant types were default constructible, it would allow to ensure that the variant can be default constructed as well *and* could also provide the solution to the first problem as we could then just reset the variant to the default state if the copy ctor throws. This would avoid the need for either pre-reserving extra memory or allocating it dynamically and would also do things in the expected order, i.e. destroy A first and then [try to] construct B which is, IMHO, rather important. And in the very rare case of failure while doing the latter the variant would be reset to A{} (i.e. the same value as a default-constructed variant).
Finally, allowing duplicate types in the variant<> types list is just completely unnecessary extravagance and definitely shouldn't be supported.
I'd vote for disallowing duplicate types, by analogy with overloading and templates. If you allowed duplicates, then not only would construction be more complex, so would visitors in the style proposed by N4542.
@VZ: Making the first type default-constructible doesn't solve the empty state problem. It has to be *nothrow* default constructible to do that, because otherwise you might be unable to default construct the object when trying to do so for the failed assignment.
Since the variant_with_empty can be trivially and efficiently implemented* with a template alias on top of the current design, but the reverse is not true, I strongly prefer the current design.
*: template<typename... Types> using variant_with_empty = variant<empty_t, Types...>;
Almost the same is true for duplicate types, you do not incur the syntactic and semantic difficulties unless you do use duplicated types. If you really want to make sure that duplicated types are not possible you can create another template alias that refuses to compile in that case by using concepts/enable_if.
@Fabio: Your variant_with_empty is not "implementable on top of the current design": the current (N4542) design *is* a variant with empty :-)
The current Boost design provides the never-empty guarantee with dynamic allocation and strong exception safety. Adding empty_t to the list doesn't make it more efficient: it will never be used unless you explicitly store an empty_t.
In order to make your variant_with_empty an efficient overlay on a variant without an empty state, the underlying variant would have to default-construct the first type if copy construction failed and the first type was nothrow-default-constructible. That still leaves the question of what to do if copy construction fails and the first type is NOT nothrow-default-constructible.
Clients having to check for the empty state everywhere is just passing the problem over to the client. It's cumbersome. I think the Boost solution of using dynamic allocation is the lesser evil.
Somebody posted on reddit that one can:
1) copy construct the new type into a local variable (may throw)
2) destroy the old type (never throws: noexcept destructor required)
3) move construct the new type from the local variable (never throws iff noexcept move constructor)
Since noexcept move constructors are the rule, I think this should work.
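The three steps above can be sketched as follows. This is an illustration only: temp_copy_variant, copy_should_throw and index_after_failed_assignment are hypothetical names, and the sketch assumes every alternative has a noexcept move constructor, which the static_assert checks for B.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>
#include <type_traits>
#include <utility>

bool copy_should_throw = false;   // hypothetical switch that simulates a failing copy

struct A {};
struct B {
    B() {}
    B(B const&) { if (copy_should_throw) throw std::runtime_error("copy failed"); }
    B(B&&) noexcept {}
};
static_assert(std::is_nothrow_move_constructible<B>::value, "step 3 relies on this");

// Hypothetical variant whose assignment copies to a local, destroys, then moves.
class temp_copy_variant {
    union { A a; B b; };
    int which;                    // 0 => A, 1 => B
    void destroy() noexcept { if (which == 0) a.~A(); else b.~B(); }
public:
    temp_copy_variant(A const& v) : which(0) { new (&a) A(v); }
    temp_copy_variant(B const& v) : which(1) { new (&b) B(v); }
    ~temp_copy_variant() { destroy(); }
    temp_copy_variant& operator=(temp_copy_variant const& other) {
        if (this == &other) return *this;
        if (other.which == 0) {
            A tmp(other.a);               // 1) copy into a local (may throw)
            destroy();                    // 2) destroy the old value (noexcept)
            new (&a) A(std::move(tmp));   // 3) move into place (noexcept)
        } else {
            B tmp(other.b);               // 1) a throw here leaves *this untouched
            destroy();                    // 2)
            new (&b) B(std::move(tmp));   // 3)
        }
        which = other.which;
        return *this;
    }
    int index() const { return which; }
};

// Returns the index held by v1 after an assignment whose copy throws.
int index_after_failed_assignment() {
    temp_copy_variant v1{A()};
    temp_copy_variant v2{B()};
    assert(v1.index() == 0 && v2.index() == 1);
    copy_should_throw = true;
    try { v1 = v2; } catch (std::runtime_error const&) {}
    copy_should_throw = false;
    return v1.index();
}
```

The cost is an extra move per assignment and temporary stack space for one value, and the approach offers nothing for a type whose move constructor can throw.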
Having said that, variants should be Regular types, and thus default constructible. Default constructing a variant should have zero cost, so they should have an empty state. Making the empty state a tag is too complicated for too little gain. Accessing a default constructed variant should either be undefined behavior or throw an exception.
It is better to get a loud exception than code that silently fails because it uses some default constructed value from a variant. Rules like "the first type that is default constructible" are just too complicated and too error-prone.
Variants with duplicate elements are too error-prone as well; the user should employ the type system to make those types unique and be able to access them by type. That way the type of the variant says what each element means.
I like the idea of allowing variant to be empty; I never actually understood why Boost has taken the opposite route (added complexity and inefficiency of implementation, for apparent simplicity of use, but then the order of operations during assignment is prone to quietly breaking programs). A potentially-empty variant captures the underlying concept better than the Boost one, exactly for the reasons you spell out above.
(copied from my reddit comment) I was at the committee meeting where this was discussed, and I argued strongly in favor of the current design. The problem with default-constructing to an "empty" state is that your programs are now replete with empty variants. It is promoting the empty state to a valid state. This complicates code every time a variant is used. You must now sprinkle your code with ASSERTs or conditionals. That's really undesirable.
By default-constructing to the first element in the variant (like built-in unions do), the only source of "empty" variants will be throwing moves. Not only are these rare, but since it's an exception, you have already been notified that your variant is now invalid. Yes, if you swallow the exception and proceed as if nothing has gone wrong, then you can end up accessing an invalid variant. The answer is simple: DON'T SWALLOW EXCEPTIONS. You shouldn't be doing that, anyway.
The design the committee is currently looking at makes it possible to code as if a variant<int, string> can only have an int or a string, without paying for the never-empty guarantee.
Notice the changed terminology. The current design doesn't have an "empty" state. It has an invalid state. I don't consider that mere sophistry. If you don't swallow exceptions, you can have your cake and eat it.
I think the current std variant design is about as close to ideal as we can get with C++. I was also at the meeting where this was discussed and was strongly in favour of the design.
std::tuple is an attempt to mimic the mathematical cross product. In other words, given types A and B, I want a std::tuple<A,B> to have a value of type A and a value of type B. No one would argue that std::tuple<A,B> should also have another state called "empty" where it contains neither a value of type A nor a value of type B. The same reasoning works with std::variant when we see what it is attempting to model.
std::variant is an attempt to mimic the mathematical discriminated union. In other words, given types A and B, I want a std::variant<A,B> to have a value of type A or a value of type B. Again, a state called "empty" makes little sense given what we are trying to model.
std::variant has an invalid state because of the strange C++-ism of throwing copy-constructors. An invalid state certainly isn't desirable, but it beats all the alternatives. The case where it arises is very rare and, because an exception is thrown, the programmer is already informed that there is a problem and can deal with it. The end result is that we get a std::variant type that we can safely assume is valid when passed as a parameter to a function. We make the same validity assumption when we skip null checks for references passed to functions.
I agree with Eric in that variant should not be empty. It would be a terrible decision to clutter all your code. I usually avoid empty types in my designs also.
After all, for me, conceptually, a variant holds either of the types contained, not either plus a special blank type.
BTW, would be nice to see a variant taking advantage of a proposal like operator dot from Lenexa meeting... :)
Boost.Variant mentions a policy based implementation as a future direction. Has that ever been attempted? I could live with policies, but I don't want to check every access to the variant.
I agree with Eric Niebler: empty is a possible error state of variant and you will be informed about such errors by exceptions. If a user ignores that information, that's what the user asked for. If one never checks for such errors, it should be possible to static_assert that the assignment of the variant is nothrow (when all variant members have nothrow assignment).
Thank you Eric and David for posting your comments.
The problem with the "invalid" state is that all operations on variants with that state are undefined, except those that initialize it with a new value. e.g. in N4542, copy construction has the precondition that "w.valid() is true". This makes invalid variants truly toxic, and requires that code must be littered with checks of the validity.
On the other hand, if the variants were just "empty", then we could copy them freely.
Alternatively, copying an invalid variant could throw an exception. This would not be as bad as the undefined behaviour of a precondition violation, but is still pretty bad.
Consider for example vector<variant<A,B>>, where A and B have throwing move constructors. Suppose we have an instance of this type, and we insert in the middle. Then suppose that one of the move operations required to make room throws. We now have an invalid variant somewhere in the container. What happens now? If the implementation does anything other than assign *to* that invalid variant then we have a precondition violation and undefined behaviour. If the implementation tries to undo, it may cause another move to throw. If it leaves things alone, the vector now has an invalid variant as a trap for the unwary.
As far as I can see, libstdc++ leaves things alone, so the vector now has an invalid variant somewhere, but you don't know where. Ouch.
I can't think of another library type where a failed move-assignment leaves the target in an untouchable state. Yes, you might not know the contents of a container or a string after a failed move assignment, but it's not an invalid object that cannot even be copied.
I think we need to make the "empty" state a valid state for the variant if it's going to exist at all. Yes, that means that code might need to check for it, but that's the case with "invalid" variants, and nowhere near as dangerous. In the vast majority of cases, you won't have empty variants, and if you do get an empty one, then trying to use the value with get() or visit() can throw an exception. I think this will require less checking than if you allow toxic invalid variants which give you undefined behaviour.
I still think that a default-constructed variant should be empty, in order to encourage people to actually think about what value they want. If get() and visit() throw then uninitialized variants will soon show up in testing.
Maybe I missed something, but if an empty variant throws when accessed with get() or visit() doesn't this make the empty state an invalid state? Also how can I use an empty variant with your design? I mean, if for my use case empty is a valid state, I've no way to use it, because visit() throws (get() exceptions can be avoided by checking for emptiness first). Your vector example has the same issues if you allow an empty state that throws when used, because I've to check every element in the vector for emptiness before actually using it; if you change emptiness with validity, it's exactly the same thing.
@Alessandro: the difference is subtle, but important. An "empty" state is a valid state of the variant, which can be copied around without problem, and which has defined behaviour when accessed (e.g. throws an exception). An "invalid" state as proposed in N4542 is toxic, as any use of an invalid variant (including copying) is a precondition violation and undefined behaviour.
Visiting an empty variant must throw because visit(f,v) is equivalent to f(get<Index>(v)) where Index is the current index of the variant, and there isn't a current index for an empty variant.
I completely agree with what Eric and David said above and I think they covered everything there is to say about copy construction.
Some additional thoughts on default construction:
There is no intuitive semantics for default construction. Constructing to an empty type to me seems just as arbitrary as constructing to the first type. In either case I'd probably have to consult a reference manual when writing code that attempts to default construct a variant.
I agree with the pain points mentioned by Anthony, but since there is no obvious right semantics, I'd rather stick with the behavior that is consistent with union. Even if you could argue that union was broken in this regard, I am not sure that variant was a good place for attempting to fix it.
C++ is already plagued by inconsistencies and I don't think the benefits are high enough here to justify introducing another one.
Duplicate types shall definitely be allowed.
typedef does not create a new type, merely an alias. That means that the user is not, in general, aware of which types are the same and which are not. For example,
variant<size_t, unsigned int>
may contain duplicate types or not depending on the platform, without the user knowing it. If duplicate types are not allowed, this variant will compile or not depending on something which is not only uncontrolled by the user, but also invisible to the user. This would also make code less future-proof. Imagine we have
using T1 = vector<int>;
using T2 = list<int>;
// and somewhere else
variant<T1, T2> x;
and then 3 years later someone realizes that the list is slow and changes it to the vector: using T2 = vector<int>, and suddenly in some distant part of the code we get an invalid variant.
In conclusion, the standard should avoid non-local effects, and thus types in the variant should not interfere with each other, meaning that duplicate types should be allowed.
Another example:
Imagine we want to return from a function either a result of type result_t, defined in some library, or an error code of type error_t, also defined in a library. What these types are is an implementation detail of that library; they might both be typedefs for int, for example. We would like to use
variant<result_t, error_t> // Attention! result_t and error_t might be the same, but we cannot know
But this code would compile or not depending on implementation detail of these types, which are not under our control, and even not visible to us.
I strongly agree with Eric that variants should not introduce empty states that will need checking everywhere. However, I favour avoiding dynamic allocation and prefer that all element types require support for no-throw move construction. This way copy-assignment to variant could copy construct to temp, destroy old value, and then move construct from temp. Requiring no-throw move construction encourages users to provide no-throw move constructors, which is a good thing, and classes that don't support it can still be wrapped e.g. with a smart pointer.
"However, it leaves the problems for those that don't want to use the special tag type, and feels like a bit of a kludge."
I don't see problem with empty tag (or any other way of providing "option" here) - it is the user who should decide what he needs.
I don't want to pay price for having empty variant in code which does not need it. For instance, with empty variant we have to pay price of possible throw on every visitation (even if none of overloads throw) or there should be asserts within variant::unchecked_visit - which basically means that we can't rely on type system to prove non-emptiness in cases when it is really not empty.
I would prefer to have two kinds of variants:
1. One which is never empty. It will have copy assignment enabled only if types have noexcept copy or noexcept move, otherwise invoking copy assignment will be compile-time error. This would work even for std::vector or std::string - take them by value and then move-construct.
2. One with empty / default-constructor semantics. It could be variant<empty_t, Types...>, or optional_variant<Types...>, or optional<variant<Types...>> (with some internal cooperation between optional <-> variant).
I have a question for opponents of the "empty" variant; perhaps there is a sane alternative we can reach. Existing boost::variant design aside (really, let's not consider it here), do you agree that it would be counterintuitive to prohibit std::variant<int, void>? If so, what is this "void" if not an empty type? Furthermore, what do you think default construction of std::variant<void, int> would do?
When move semantics were introduced, lots of people argued that a moved-from object should be singular (like invalidated iterators). In the end, the consensus was that the result should be a "valid but unspecified state" for the object in question. e.g. for std::string you know it is a valid string, but it could have any contents, and you need to check before doing anything.
I feel that the target variant of a failed copy-assignment is similar. I would be happy to say that it's in a valid but unspecified state, but if none of the types have a nothrow default constructor, then that is hard to achieve unless you allow the variant to be empty. An empty variant that is singular is a real pain.
Note: I am not arguing about the terminology. You can call the empty variant "invalid" if you like. The problem I have is with the undefined behaviour of operations on such a variant. I want get() and visit() to throw, and copying or moving such a variant to succeed (making the target empty/invalid in turn).
My comments:
1) Allow null variants? Absolutely not. I agree that this will litter the code with lots of error-checking.
2) Allow variant<T,T>? Absolutely. I use a home-grown variant to represent (part of) ASN.1 defined types. ASN.1 types are (arcane but really nifty in some places) technology to transfer messages between systems, e.g. a ground station and an aircraft. You typically only send one type of message, and that message is modeled like a variant (the ASN.1 name is choice). So a message could be set up like this (ASN.1):
DownlinkMessage:= CHOICE {
    [0] WILCO NULL
    [1] UNABLE NULL
    ...
    [37] REQUEST_FLIGHT_LEVEL FlightLevel
    ...
};
Which could be modeled like this:
using downlink_message = choice<null,null,....,flight_level,....>;
(where null is an empty element).
3) I would not mind having double-buffering in a variant. In my approach I require no-throw move-constructible elements, just because all elements happen to be so, and move the elements in place. Should I need to have an element that can't be safely moved, I would probably favor the double buffer.
/Peter
Anthony Williams writes: > I want get() and visit() to throw, and copying or moving such a variant to succeed (making the target empty/invalid in turn).
I don't understand the people arguing for this. How is a "valid but unspecified" state any better? You can't read from such an object in a correct program. All you're going to do is spread "unspecified" (i.e. corrupted) data around your program. By making it UB, you permit an implementation to assert. That's actually useful behavior. Reading "unspecified" state is a bug. Always. I want to be notified if there is a bug in my program.
boost::variant also has this feature that if any of the potentially stored types has a noexcept default constructor, this constructor is used to initialize the state of the variant: no double storage is necessary then. It has the surprising effect that upon assigning type B to a variant holding type A, you may end up with the variant holding some type C. Did the Committee consider this option?
100% agree with what is proposed. I don't see the issue with empty variants and it makes everything natural.
C++ programmers should already know the risks of uninitialized variables and handle it correctly. Throwing scenarios correspond to what I would expect. That's what exceptions are meant to do.
How about variant<A,B,std::exception_ptr> as a standard way of dealing with exceptions?
variant<T, T> shouldn't be allowed, empty variants should be allowed and a variant should default construct empty. Anything else is unintuitive.
I very much dislike default construction to the first type in the variant. It means treating one variant type differently from the rest, whereas I think ideally the variant should treat its variadic type list as a set: any type is either in the variant, or not, and that's all. It means that code that doesn't break on one code branch will break in another, based on what type is being used. Things like unordered_map::operator[] will work well for the first type but not others; this can result in errors showing up later in development.
I would prefer no default constructor by default, but maybe there can be a special tagged type which if passed in as the first type in the variant, enables default construction to the first type.
variant<int, double> x; // no default constructor
variant<default_tag_type, int, double> y; // default constructs to int
The latter can easily be handled via specialization and enable_if.
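A rough sketch of how that could look, under several assumptions: tagged_variant is a hypothetical wrapper name, default_tag_type is the hypothetical tag from the example above, and C++17's std::variant is used as the underlying storage purely for brevity. It is not taken from N4542 or from boost::variant.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>
#include <variant>

// Hypothetical tag type; not part of any proposal.
struct default_tag_type {};

// Primary template: no tag, so no default constructor.
template <typename... Ts>
class tagged_variant {
public:
    tagged_variant() = delete;

    // Accept only a value whose type is exactly one of the alternatives.
    template <typename T,
              typename = std::enable_if_t<
                  (std::is_same_v<std::decay_t<T>, Ts> || ...)>>
    constexpr tagged_variant(T&& value)
        : storage_(std::in_place_type<std::decay_t<T>>,
                   std::forward<T>(value)) {}

    constexpr std::size_t index() const noexcept { return storage_.index(); }

private:
    std::variant<Ts...> storage_;
};

// Partial specialization: the tag comes first, so default construction
// is enabled and produces a value-initialized First.
template <typename First, typename... Rest>
class tagged_variant<default_tag_type, First, Rest...> {
public:
    tagged_variant() = default;

    template <typename T,
              typename = std::enable_if_t<
                  std::is_same_v<std::decay_t<T>, First> ||
                  (std::is_same_v<std::decay_t<T>, Rest> || ...)>>
    constexpr tagged_variant(T&& value)
        : storage_(std::in_place_type<std::decay_t<T>>,
                   std::forward<T>(value)) {}

    constexpr std::size_t index() const noexcept { return storage_.index(); }

private:
    std::variant<First, Rest...> storage_;  // the tag itself is not stored
};
```

With this sketch, tagged_variant<int, double> x; is rejected at compile time, while tagged_variant<default_tag_type, int, double> y; compiles and holds an int.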
Also note that I think that repeated types should be allowed. Each variant however, should provide a typedef that is a mathematical set of all the types, ordered in some deterministic way. So variant<int, double>::types would be a TypeList<int, double> and same for variant<double, int>. Maybe this is a bit over the top but it seems neat, and order really shouldn't matter in these things.
Hello there,
About "invalid but not empty": As far as I understand it by now (haven't read the proposal, but just your blog), "empty" for the proposed variant is more like an "emergency state" rather than what "empty" means for shared_ptr<>, vector<> or optional<>.
I have heard, as an argument against that, that you need to sprinkle "if(...)" around your code to guard against that invalid state before you access the variant. But I don't think that is the case:
1) You are not allowed to copy/move the invalid variant (please correct me if that's wrong).
2) The variant can only become invalid if the assignment operation throws.
If 1) were changed, so that the target of a move/copy from an invalid variant became invalid too, you could easily "infect" your whole codebase with invalid variants, and then it would in fact become hard to verify that all your variants are valid. So I strongly disagree with breaking 1).
Just my two n00bie cents.
I saw that many people prefer a not-null std::experimental::variant, or at least want to make it as not-null as possible (e.g. construct the first element in the default ctor).
First, not-null makes perfect sense for an IMMUTABLE (not const) object, since it is never going to change, and it probably makes sense for it to always be initialized with something informative. If someone really needs a nullable immutable object, an immutable_optional<T> wrapper would be my suggestion, just like what they do in all functional programming languages (I also suggest that immutable objects should usually be copyable but not moveable, because you can always share the same piece of data by storing a reference/shared_ptr in an immutable object to make copying cheap).
However, this is C++, not yet another functional language. Today C++ doesn't explicitly support immutable semantics. By convention, most standard library types are MUTABLE, for example std::function, std::shared_ptr, std::unique_ptr, all sorts of containers, etc.
It makes sense to make mutable objects nullable, because of move semantics: for any movable object, programmers have to define a "valid but not specified" state to ensure that objects are destructed/reassigned correctly after moving. It's simply inevitable. For mutable objects, it also makes sense for them to always be default constructible, since the user may want to define an object first and mutate it later (don't forget, we are using an imperative language: if/for/while statements don't return values, and people hate the ternary operator). It then turns out that using the "valid but not specified" state as the default constructed state makes perfect sense.
If you are still unhappy with those evil mutations, the recent GSL proposes a not_null template, which makes all mutable objects never-nullable as well. Then the question becomes "which should be the implicit default, nullable or not-nullable?" I suggest nullable for mutable objects, and not-nullable for immutable objects.
Chandler (https://www.youtube.com/watch?v=7P536lci5po, 52:12) also mentioned that, particularly for argument list evaluation order, code that relies on a specific evaluation order is harder to understand, and one order isn't more reasonable than the other. In such a case he'd rather not endorse either. I make the same argument here against default constructing the first alternative in variant. Once it's standardized, people will start to rely on such behavior, which doesn't make any sense and adds extra complexity to the library (monostate).