It's a cost/benefit thing. You should try to design your program sensibly when it comes to performance, because the cost of a failed architectural decision is high. What you shouldn't do is apply micro-optimizations that make the code longer and harder to read.
I'm not sure a 'readable' preemptive micro-optimization is any more sensible. I think we should limit 'sensible' to preemptive macro and architectural optimization. A programmer should examine what their compiler is spitting out before deciding they can do better.
What counts as a "micro-optimization" is debatable, but if you have two ways of doing the same thing, and you can do the fastest one without impacting safety or readability, it's not wrong to pick the fastest one (for instance, there's no point in building list A just to append A to list B when you can add the items directly to list B, like the code I refactored away an hour ago).
As for architectural optimization, I disagree. There are things you're really going to have a hard time adding post facto. If you know you're going to need performance, it's important to know what kind of program you're going to build (this may include benchmarking prototypes).
this feature [label address-of operator] is implemented as a nonstandard extension by most production grade C and C++ compilers
This feature is entirely a GCC extension, and is only implemented by "production grade" compilers to the extent that they try to be compatible with source that compiles with GCC.
Many C compilers that people consider "production grade" - probably a majority, I'd guess - do not in fact understand this construct.
This made me chuckle. The author has a very different perspective from most software developers I know.
...and it might seem that in many cases it should not matter how the control flow is described, and the same optimization might be applied as long as the resulting program is functionally the same. This raises the question of whether the compiler will do the same optimizations if the programmer uses if-then-else. In theory, it might be the same, or it might be different. But instead of guessing, let's find out.
I love the author's attitude towards experimentation. It reminds me of John Carmack: "If you aren't sure which way to do something, do it both ways and see which works better."
Back in the 90s, C was indeed considered high level. Most people had a background of assembly+something else (like Pascal or even Basic). The typical C+libs approach was rather high level in comparison.
Why disassemble when the source code is available?
Because there isn't a one-to-one relationship between the original source and the compiled code. Optimizing C compilers (meaning all modern C compilers) frequently go so far as to substitute completely different algorithms for code sequences that they recognize. The source code tells you the intended behavior, the assembly tells you what's actually happening.
Why not instruct the compiler to stop after code generating assembly from C?
You are right that you could get most of the same information by looking at the generated assembly (-S with gcc/icc/clang). There usually isn't much time saved by stopping there, though, as assembling is a fast operation.
Also, compiler-generated assembly is often messy and cluttered with machine-generated comments. Disassembled code is in a standard format and can be easier to read. The disassembly also tells you alignment and instruction length, which can occasionally be essential.
All who haven't taken note, please do!