
Cast operators in C++

Alright, I'm done bitching about Java. The maelstrom from my last post was exhausting (on both sides of the issue), and I don't have enough energy to deal with that every single week. Plus, I'm basically out of content on Java (more or less). Originally, that meant that I would spend this week on Python. However, I saw that as kicking another anthill altogether, so I'm pushing that one back at least a week.

Instead, let's talk about C++.

As you may have noticed, the title of this article is somewhat narrow in scope. That's mostly because I couldn't compress my C++ critique to a single article, and slightly to abuse the good faith of my current readerbase in a desperate bid to draw out whatever lifespan this blog has to the last possible second. It's also because I work with C++ fairly rarely, so writing these articles requires drawing out stale vitriol from the dank recesses of my mind, which takes time to cultivate into proper rage. Plus, my relationship with C++ is so much less hateful: I have on multiple occasions actively chosen to use C++ for tasks. So this will be the first of (let's say) 3 articles. (This number will not update with the number of articles I write. It will just become wrong.)

Brief History

I'm assuming my audience is mostly made up of classmates or colleagues, but it doesn't hurt to go over how C++ came to be. Around the early 70s, Dennis Ritchie was building Unix at Bell Labs, when he decided he needed a language that was more portable, more readable, and less error-prone than assembly code. He built C instead, and the programmers of the 70s took to it like the other young adults of the 70s took to recreational drug use. By the 80s, Bjarne Stroustrup saw the drug addiction-like damage C was doing to people, and decided that the fix for the problem would be to add more features, which is eerily similar to heroin's early history. He spearheaded efforts to build a new language, which he dubbed, "C with Classes, Exceptions, Metaprogramming, Generics, Algorithms, and everything else I could think of." This was shortened to "C with Classes," which was later shortened to "C++" because languages with long names are rarely successful. [citation needed]

(Side note: because of post-increment semantics, Bjarne made a language that was the same as what C used to be, and changed C in the process. The C++-influenced changes in C since the 80s attest to this.)

Cast Operators

Suppose you have two types, T and V, and an instance of each, t and v respectively. In C, converting a T to a V is easy: just write v = (V)t and it'll probably do something that makes sense. (To elaborate: it will make a logical conversion if T and V are both numerical types; otherwise it will just reinterpret the value at that point in memory without changing it. Or it will crash in some undefined way if you're being an idiot, and you deserve that.)

C++ decided that C's handling of the situation was, like the rest of the language, insufficient, and went about fixing it in an extremely questionable way, just like they did with the rest of the language. They noted a few problems:

  • Casts are too easy to hide in code. If you're looking for places where Ts are cast to Vs, what string are you going to search the code for? (V)? Good luck with that.
  • Casts don't show any intent. What if you thought that Ts have a logical interpretation as Vs, but in fact your code just ends up manipulating parts of Vs that you really shouldn't be touching? To the C compiler, you could want to access parts of V like that, because that's also something C-programmers do, and it has no reason to assume you're a sensible human being.
  • Casts aren't user-definable. Well, they kind of are, if you assume that your structs are always packed in the same way across different architectures and operating systems. (See above for why the compiler doesn't assume you aren't doing this.)

To fix these perceived problems, C++ did a bunch of things. First and foremost, they created four cast operators:

  • static_cast - you know that t can be thought of as a V, or you're willing to pay some pretty hefty consequences if you're wrong.
  • dynamic_cast - T and V are pointer/reference-types, and T is a (polymorphic) base type of V. You're pretty sure t is actually a V, but you don't want your program to explode if you're wrong, so you're okay if an exception (specifically std::bad_cast) is thrown when your assumption turns out to be false. (If everything is pointers, NULL/nullptr is returned instead.)
  • reinterpret_cast - you know that t isn't a V, but you think that the binary data in a T makes sense interpreted as a V, so you want the compiler to just look the other way for a second or two.
  • const_cast - you know that Ts say they shouldn't be modified, but you have reason to believe that it's okay to do so (assuming that T is a const V). You can also cast away volatility, but I don't really know why you would.
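
As a rough sketch of what each of the four looks like in practice (Base, Derived and every helper name here are made up for illustration, not taken from any real codebase):

```cpp
#include <cassert>
#include <cstdint>
#include <typeinfo>  // std::bad_cast

// Hypothetical polymorphic hierarchy, for illustration only.
struct Base { virtual ~Base() = default; };
struct Derived : Base { int payload = 42; };

// static_cast: a conversion the compiler can check at compile time.
int to_int(double d) { return static_cast<int>(d); }

// dynamic_cast on pointers: checked at run time, nullptr on failure.
bool is_derived(Base* b) { return dynamic_cast<Derived*>(b) != nullptr; }

// dynamic_cast on references: throws std::bad_cast on failure.
int payload_of(Base& b) { return dynamic_cast<Derived&>(b).payload; }

// reinterpret_cast: look at an object's raw bytes. Reading them through
// unsigned char is one of the few uses that is well defined.
unsigned char first_byte(const std::uint32_t& v) {
    return *reinterpret_cast<const unsigned char*>(&v);
}

// const_cast: strip const. Writing through the result is only defined if
// the referenced object was not declared const in the first place.
void bump(const int& r) { const_cast<int&>(r) += 1; }

// Exercises each helper once.
void run_cast_examples() {
    Derived d;
    Base b;
    assert(to_int(3.9) == 3);                   // fractional part dropped
    assert(is_derived(&d) && !is_derived(&b));  // pointer form: null on failure
    assert(payload_of(d) == 42);
    bool threw = false;
    try { payload_of(b); } catch (const std::bad_cast&) { threw = true; }
    assert(threw);                              // reference form: throws
    assert(first_byte(0x01010101u) == 1);       // every byte is 0x01
    int x = 1;
    bump(x);                                    // fine: x itself is not const
    assert(x == 2);
}
```

Note that dynamic_cast only compiles here because Base has a virtual function; without one there is no run-time type information to check against.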

(Boost also provides lexical_cast, a library function template dressed up to look like one of the group, which acts like [tT]oString in Java/C# but in a bidirectional way, which is kind of neat.)

If you didn't catch my tone, most of these casts are almost always wrong. Let's go through them again:

  • static is fine, generally speaking. Nice and greppable, clearly shows intent, and is comparable in speed to a C-cast.
  • dynamic shows too much doubt on the part of the programmer: why do you only think t is a V? Where is this object coming from if you can't accurately describe its type but you still need to know what it is? Most importantly, why are you ever downcasting? (It's also really slow compared to the others.)
  • Do I really need to explain why reinterpret is bad? If static won't compile, then you should step back, take a deep breath, and go back to writing assembly. A std::pair<int, int> is not and will never be a double.
  • const is justifiable when you're dealing with stupid APIs that give you const variables that you know you can actually write to. If you're const_casting in code entirely written by you or someone you know, you should really reconsider.
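
For the const case, here is a minimal sketch of the kind of API I mean. find_first_negative is a hypothetical stand-in for a library function that hands back a const pointer into a buffer you own and are perfectly entitled to write to:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical "stupid API": returns a const pointer into the caller's buffer,
// or nullptr if no element is negative.
const int* find_first_negative(const int* data, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        if (data[i] < 0) return data + i;
    }
    return nullptr;
}

// Sets the first negative element to zero; returns true if one was found.
// The const_cast is defined behaviour only because `data` points at
// non-const ints to begin with.
bool clamp_first_negative(int* data, std::size_t count) {
    assert(data != nullptr || count == 0);  // precondition
    const int* hit = find_first_negative(data, count);
    if (hit == nullptr) return false;
    *const_cast<int*>(hit) = 0;
    return true;
}
```

If the buffer had actually been declared const, the same write would be undefined behaviour, which is why this belongs at the boundary with someone else's API and nowhere else.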

For backwards-compatibility, C++ includes the C-style cast operator. Its behaviour is basically to try const_cast, then static_cast, then reinterpret_cast (with a const_cast thrown in on top of the latter two if needed), and use the first one that compiles. It never tries dynamic_cast. This is, of course, almost never what you want, since it solves none of the problems the operator had in C.
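
To make that concrete, here is a small sketch (all function names are mine) where the same C-style syntax quietly means two very different things:

```cpp
#include <cassert>

// Here the C-style cast behaves like static_cast<int>(d).
int as_int(double d) { return (int)d; }

// Here the identical syntax behaves like a reinterpret_cast with a const_cast
// on top: unrelated pointer types, and const silently dropped.
long* as_long_ptr(const double* p) { return (long*)p; }

// The same conversion spelled with named casts. Both steps are now visible
// (and greppable).
long* as_long_ptr_named(const double* p) {
    return const_cast<long*>(reinterpret_cast<const long*>(p));
}

// The pointers are only compared, never dereferenced: reading a double
// through a long* would be undefined behaviour.
void run_c_cast_examples() {
    double d = 2.7;
    assert(as_int(d) == 2);
    assert(as_long_ptr(&d) == as_long_ptr_named(&d));
}
```

Nothing in the C-style version tells the reader (or the compiler's warnings) that the second function is doing something far more dangerous than the first.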

Oh, and there are two other cast operator-like options provided in C++:

  • you can define an operator V function inside T's definition. It has to be a member of T: unlike most operators, conversion operators can't be overloaded as free functions.
  • you can define a constructor in V that takes a single argument of type T (or of a type that a T is implicitly convertible to).

Of course, I'm leaving something out here. See, both of these are implicit, meaning you can just write v = t; and the compiler will quietly use whichever conversion applies. Sure, you can declare either as explicit, but only in C++11-onwards: before that, explicit only worked for single-argument constructors, and conversion operators were stuck being implicit for at least 15 years. This led to ridiculous efforts to get around the language's limitation including (but not limited to) the safe-bool idiom for pointer-like truth-value detection. (This is one of my favourite instances of an extremely complex solution to a simple problem.) Thankfully, this kind of innovation is no longer necessary, but it's still a "gotcha" to watch for, since so few other languages mark conversions as implicit by default.
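
A minimal sketch of both conversion routes, with and without explicit (Meters and Handle are hypothetical types I made up for the occasion, and this assumes C++11 or later):

```cpp
#include <cassert>

// Implicit in both directions: the default, and the "gotcha".
struct Meters {
    double value;
    Meters(double v) : value(v) {}             // converting constructor: double -> Meters
    operator double() const { return value; }  // conversion function: Meters -> double
};

// Explicit in both directions.
struct Handle {
    int fd;
    explicit Handle(int f) : fd(f) {}                   // no silent int -> Handle
    explicit operator bool() const { return fd >= 0; }  // C++11: no silent Handle -> bool
};

double twice(Meters m) { return m.value * 2; }

// `if (h)` still works with an explicit operator bool (a "contextual"
// conversion), which is what made the safe-bool idiom unnecessary.
bool is_open(const Handle& h) {
    if (h) return true;
    return false;
}

void run_conversion_examples() {
    assert(twice(1.5) == 3.0);  // 1.5 silently becomes a Meters
    Meters m = 2.0;             // implicit constructor
    double back = m;            // implicit conversion function
    assert(back == 2.0);

    Handle h(3);                // must be spelled out: `Handle h = 3;` won't compile
    assert(is_open(h));
    assert(static_cast<bool>(h));  // `bool b = h;` won't compile either
    assert(!is_open(Handle(-1)));
}
```

The twice(1.5) call is the part to watch for: nothing at the call site says a Meters is being constructed.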

(I'm not even going to begin on how complicated this gets with templates and determining function prototypes, because this article is long enough and I want to do templates as a separate article.)

Does C++ handle this all wrong? I'm on the fence with that. Sure, languages like C# have a more structured, logical approach to casts and conversions (e.g. limiting the scope, forcing an explicit/implicit declaration, etc.), but C++'s design has always been about giving the programmer as many tools as possible without restricting their freedom. Really, it should be common sense not to abuse reinterpret_cast, but if you're crazy enough to want to, well, C++ will hand you the keyword. It's not going to stop you from pointing the gun at your foot, but it'll help you dial 911. (Unfortunately, it won't stop you from tearing out the phone lines.) The implicit-cast-by-default thing is strange, sure, but once you're aware of it you can probably handle it. All in all, C++ lets you do what you want quickly and (if you choose to) clearly, and that's all that really matters.

To sum up:

  • C++ adds 4 more cast operators. 3 are usually a sign of bad programming somewhere along the line.
  • C++ lets you define conversions, but you might be surprised when it decides to apply them.
  • The design of casting in C++ makes sense for C++, as long as you're okay with that.

C++ is huge. I just wrote more on one tiny part of it than I wrote on all of Java. And I'm probably going to write more, given the right feedback. If that sounds like a good thing, let me know. 'Til then, I'll just continue waiting for my Java-based IDE to unfreeze.

(Seriously. Not even kidding)

J
