Hacker Timesnew | past | comments | ask | show | jobs | submitlogin

It's indistinguishable from unspecified behavior, not from undefined behavior. Unspecified behavior has to pick from a finite list of allowed behaviors. Undefined behavior can do anything.


A program with corrupted state can essentially do anything. Yes it's still a question of run-time checks the runtime has to protect against it. But the compiler is probably deriving a lot of assumptions from the assumption that there wasn't overflow.


“Undefined behavior” is a term of art in programming languages that means something more specific than “the program may do something odd.”

The compiler is not allowed to derive any assumptions from it. It only could if it were UB.


But did the rust compiler assume that the integer would not overflow? It did so in Debug mode where runtime checks were added. If it's not the case in Release mode, does that mean semantics are different between Debug and Release?


The semantics are well-defined in both modes. You can predict exactly what will happen in either case. In C, the semantics are not defined at all, you can't predict what will happen and it's allowed to change between compilations of the same source.

It will probably get omitted, since Undefined Behavior isn't allowed by the C abstract machine, but sadly compilers are allowed to emit code for UB in the source (partly because some UB is only detectable at runtime). Sometimes disabling optimizations will incorrectly allow codegen to run for source lines which have UB, tricking people into thinking that optimizations are breaking their program. Compilers are allowed to do this, since behaviors other than "omit the offending statement" are unfortunately allowed by the standard, so it's not a compiler bug.


UB is a runtime property. As far as you can statically verify some code parts, you can see UB at compile time, but the point of UB is exactly that it is about stuff you can't predict, or that is hard to predict as a compiler.

Now why you can cook up trivial artificial examples where a compiler will remove some code sections based on statically detected UB, instead of printing an error, you have to ask the compiler authors.

> The semantics are well-defined in both modes.

So they're not the same? So the behaviour is not uniquely defined by the source code alone, but is actually _very_ different based on compile mode? Between two modes whose point was never to have different semantics, but to have the _same_ semantics while being debuggable vs being fast?

> You can predict exactly what will happen in either case. In C, the semantics are not defined at all, you can't predict what will happen and it's allowed to change between compilations of the same source.

You can make the same "predictability" argument for C, you can easily write a compiler that has semantics exactly laid out. Case in point: -fwrapv. Case in point: UBSAN.


You can write a C compiler with exactly laid out well-defined semantics. You can't assume those semantics hold for C-the-language, because it doesn't define those semantics. UB is a property of the language, not just of a given compiler. The Rust reference defines the semantics of the safe subset of Rust without any UB, so any compliant Rust compiler won't have UB in that subset. The reference also defines the guarantees which the programmer must uphold within `unsafe` blocks to avoid UB, as long as those are upheld there's no UB at all.

I understand that. It makes no practical difference. 99,99% of my additions don't rely on signed overflow for example, and if I'd ever need it there are ways to get just it.

Or tell me how you write a Rust program differently given that signed overflow is apparently defined? I bet you write it exactly the same way, and you get pretty much the same behaviour in practice. And we're even only debating actual overflow situations, meaning there is a bug whatever the compiled behaviour is.

C the language doesn't even guarantee that the machine has native integers with 8, 16, 32, 64 bits etc, that a cacheline is 64 bits, that a page is 4K, and here I am, writing programs for exactly that.


> But did the rust compiler assume that the integer would not overflow?

It did not.

> It did so in Debug mode where runtime checks were added.

It didn't assume in that case either. It did a well defined thing: add checks.

> If it's not the case in Release mode, does that mean semantics are different between Debug and Release?

Strictly speaking, the language doesn't know about "release mode", as that's a Cargo thing. But yes, in practice, the semantics are different based on various things: it could be debug vs release, it could also be flags that control the behavior. But that's still distinct from "undefined behavior" as a concept. The behavior is well defined, with multiple possible options for behaviors.


So in Rust, you are actually specificing TWO programs with a single source? Those Rust users are surely too clever for my liking!

You can tune a C compiler as well to have a very specific defined behaviour for integer overflow. You can add -fwrapv or you can add UBSAN.

The user never intended overflow to happen, because if they did, they could have used something like __builtin_mul_overflow() or whatever. Or they are an emotionally unstable user with destructive tendencies. The user also never intended the program to abort with a (nicely formatted) error message, unless they are a very very sad depressed nihilistic user who also never runs their program in Release mode.

To say that overflow would be defined in Rust is at least half a lie. We could agree that cargo has a choice of diagnostic policy though, a policy how to handle what is essentially a state with no defined or useful path forward, or in other words, UB.

Throwing errors might be a wanted property to detect oversights. C ecosystem has UBSAN too! But essentially the same is still true: Basic arithmetic operations are not closed over the numbers 0..2^N. Rust doesn't have a (unique and useful) definition for those operations for a subset of numbers. Even if you claim the operations are defined (say wrapping arithmetic in Release mode), it's not what the programmer wants. Probably the majority of algorithms work over natural numbers or integer numbers. These algorithms don't work when the arithmetic on integers modulo 2^N.

So the user has to constrain the set of valid inputs, and do manual sanitization, just like in C.


> You can tune a C compiler as well to have a very specific defined behaviour for integer overflow. You can add -fwrapv or you can add UBSAN.

This is an example of a compiler flag that adds definition to undefined behavior, which is of course, legal to do. That doesn't change that in the standard, it is undefined behavior, and in Rust, it is not.

> To say that overflow would be defined in Rust is at least half a lie.

In the context of "undefined behavior", it is not a lie at all.

> So the user has to constrain the set of valid inputs, and do manual sanitization, just like in C.

No, because the consequences of how the two languages define these behaviors are very, very different.


As you said, overflow is defined in Cargo, given a specific build type and/or specific build flags. It's not defined in Rust.

Just saying that it's defined and then not saying what the definition is, is no different from saying it's undefined.


No, “release mode vs debug mode” is defined in Cargo. What’s defined in Rust is the debug_assertions flag, which is one of the things that Cargo will set by default as part of the debug mode by default.

> Just saying that it's defined and then not saying what the definition is, is no different from saying it's undefined.

It actually is, because, as I said earlier, “undefined behavior” is a term of art with very specific meaning. Regardless, it is defined: there are two possible behaviors, with one guaranteed with that flag and the other chosen by implementations.


I think people make up way too much of it. What is the actual term of art? What is the meaning of UB? If you look in the standard, UB is basically what its name says, it is behaviour (or state) that is not defined. It can be anything. And that makes sense in many cases: What if you construct a random pointer, and read it or write it? It's not useful or practically possible to define the behaviour from then on. So the behaviour is left undefined, simple as that.

Now are there many cases of UB in C, many more than strictly need to exist on contemporary platforms? For sure there are. But does it affect me? Not unless I need a specific behaviour common to most contemporary platforms that I can't get within the confines of C, even considering compiler specific extensions. Honestly I can't come up with any of the top of my head. Maybe some integer-shifting stuff or such, if the compiler was able to prove I'm doing sth undefined, it can leave out that code (or delete my mail, for the doomers). Personally, it hasn't happened to me, and it's on the compiler authors to not do stupid things too.

Leaving all the semantic hair-splitting aside. What is the practical difference in how you write a Rust program compared to a C program, given that integer overflow is "defined" in Rust?


> It didn't assume in that case either. It did a well defined thing: add checks.

It did. The compiler added the checks (which panic on overflow, from a quick web search) precisely so it (and importantly, the developer!) can assume the overflow didn't happen in the subsequent code. Unless you consider a panic a defined state, and consider wrap-on-overflow equally valid in all cases, it's essentially the same as UB. (panic seems to be considered "unrecoverable").

Difference is _at most_ that C spec gives compiler more freedom to "implement UB", but then again, hit any unsafe code in Rust with wrapped around integer, you probably have comparable practical result -- machine doing random things, corrupting memory and so on.


Okay, I am going to leave this here, as it’s clear to me that we’re not coming to an understanding.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: