All my methods take 316 arguments, and I like it that way

gecko · on Dec 30, 2009

This argument is absurd.

All functional languages have mutable global variables. Most provide explicit support for mutable values (e.g., Scheme, OCaml), but even in the "pure" ones, there are the secretly global mutable values. Clojure gets them from its STM and from classes in the JVM. Erlang stashes them in ETS, DETS, and Mnesia. Haskell shoves them into the IO and State monads (among others) and hopes you won't notice. In all of these cases, the variable changes may be isolated, threadsafe, or internally consistent, or however you wish to describe it, but if that's not mutable state at a global level, I don't know what on Earth you'd call it. In all cases, this means that a function called twice in a row from the same point in a program may return two different values--and that's exactly what the author is saying is a flaw of imperative languages.

Turning things around, while it's theoretically possible that any given function mutates global state in random ways, no good program is written that way. Instead, the relevant state is passed to a function in the form of a struct or class--exactly what you'd be doing in a functional language, as it happens. The only difference is that the function can, and sometimes does, mutate the structure passed to it, whereas the functional equivalent would likely return a slightly different version of that struct. But, again, a combination of passing things conservatively, naming functions well, and making heavy use of const/final/sealed/what have you to lock things down makes this problem largely absent in practice.

I am not saying that functional programming can't a huge improvement for some things. I am utterly convinced that a multithreaded program should be written in a functional way, if not in a purely functional language. I likewise tremendously favor functional languages for algorithm-heavy applications, where the ease and safety with which I can backtrack and memoize can yield huge productivity gains. But saying that all imperative programs have hundreds of arguments, while functional languages have only those defined? Unless you do no output with the outside world, that claim is simply ludicrous.

cemerick · on Dec 30, 2009

STM, agents/actors, et al. are not "mutable global variables" -- they are all ways of working with state within the bounds of defined semantics. Huge difference, and largely irrelevant to the current discussion.

Who said anything about mutating global state in random ways? All you need is to have some state mutated in some way that you aren't aware of to make life hell in an imperative environment. Considering that that happens all the time, especially given the modern conveniences of large frameworks and libraries...

> But, again, a combination of passing things conservatively, > naming functions well, and making heavy use of > const/final/sealed/what have you to lock things down makes > this problem largely absent in practice.

This is flatly wrong, and equivalent to saying that you can write concurrent code in an imperative environment without a problem as long as you use locks just right.

doty · on Dec 30, 2009

In all cases, this means that a function called twice in a row from the same point in a program may return two different values--and that's exactly what the author is saying is a flaw of imperative languages.

For Haskell, this is false. In order to get two different results from the invocation of the same function you must pass in two different arguments. The monadic functions work by yielding a new version of the world, as it were.

Here's a nice explanation of Haskell IO that makes sense to me: http://www.haskell.org/haskellwiki/IO_inside

Xichekolas · on Dec 30, 2009

Just to chip in a link, this explanation was the one that made Monads click for me:

http://blog.sigfpe.com/2006/08/you-could-have-invented-monad...

jrockway · on Dec 30, 2009

Not if you use IORefs, foreign calls, unsafePerformIO, MVars, TVars, etc. In all cases except unsafePerformIO, though, the type system ensures that you are prepared to handle this unusual behavior. That's the primary difference between Haskell and the imperative languages. (In Haskell, it's easy to not accidentally use effectful code. In most imperative languages I've used, it is impossible to control your use of side effects.)

derefr · on Dec 30, 2009

When people say "Haskell is a functional language," I believe they actually mean "the subset of Haskell that doesn't include unsafePerformIO et all is a functional language." (The fact that standard library functions call those functions is an implementation detail.)

jrockway · on Dec 30, 2009

Haskell shoves them into the IO and State monads (among others) and hopes you won't notice.

In both of these cases, if you don't notice, your program won't compile anymore. So really, your statement is the opposite of the truth. Haskell hopes you do notice.

jerf · on Dec 30, 2009

"Erlang stashes them in ETS, DETS, and Mnesia. Haskell shoves them into the IO and State monads (among others) and hopes you won't notice."

A global variable is a variable that is accessible and modifiable at all times in all parts of the program with no particular access control, since the idea of "access control" wasn't even around at the time. C, Python, and Perl all have global variables. Despite your contention, Erlang and Haskell do not.

Mnesia, ETS, and DETS all control access via message passing; it's a common misconception but you never get direct access to an Mnesia table and with ETS I think you get a choice of "direct access in one process only" or "ETS spawns a process to manage the table" (not sure about DETS, but I guarantee it still won't be global). How could you even get direct access to a table that may be on another physical server? Mnesia has transactions for Pete's sake; how much less "no particular access control" can you get?! The closest thing to global variables you can get in Erlang is the process dictionary, which is still process-local. Erlang is probably the hardest language I know to have a true global in, because even if you cheat and have a C extension that has a C global in it, odds are that Erlang will still access it via an Erlang process wrapped around it which still has the effect of de facto locking the value into atomic reads and writes.

Haskell's State monad does not have anything remotely resembling a "global" value; it is explicitly scoped, labelled by the type system so that you can't cheat your way into it, and it literally desugars into threading functional values through a series of function calls. If that's a global variable, then you've devalued the term "global variable" to the point where every parameter of every function call is a global variable. The IO monad can contain a global value by the definition I gave above, if and only if you use "unsafePerformIO", which is an escape hatch that loudly proclaims "I AM LEAVING PURE FUNCTIONAL BEHIND AND I AM ASKING FOR THE RESULTING PAIN", and, indeed, leaves pure functional behind and asks for the resulting pain. So while that does allow you a global variable, if you really push it, it will actually end up very hard to use due to the difficult-to-predict order of evaluation and isn't a useful thing to do with unsafePerformIO. Otherwise, you only get access to the variable in IO-labeled functions, which except in pathological programs won't be anything like "all functions" and is certainly not globally available to everything. And you'll still need to do some major-league cheating to get access to even the "global" variables I mention here without passing explicit references to them all around; I'm still not sure that's possible without a C extension though I don't know enough to rule it out completely.

I can't argue for the other languages you mention since I don't know them.

Your argument is flawed, since you have some sort of weirdly potent definition of "global variable" that results in numerous things that aren't "global" getting called "global".

tel · on Dec 31, 2009

At the same time the article IS absurd. The author even admits it.

FP is letting us get control of our state in ways undreamt of previously. Unfortunately, we're still not really good at it.

iamaleksey · on Dec 30, 2009

  ets:new(global_state, [set, public, named_table])

Now we've got a named table that any process may read or write to. It is possible.

jerf · on Dec 30, 2009

As I said, it is a common misconception, but you do not get direct access to the table you just created in this case. Ets sets up a process which manages the table and all manipulations on this table go through messages to that process. It's all nicely wrapped behind an API that hides these messages, but they are there nevertheless. A value that has to be obtained via message passing is not a global value. There is code in the way that is controlling access, verifying legality, and in general it compiles to a whole array of stuff, as opposed to the true global variable case which can be represented by one memory save.

You have to be careful to think about this clearly. The defining characteristic of a true global is not that it can be obtained in some manner from everywhere in the program; by that standard even in Haskell everything's global, as there exists some organization of the program that would permit access of a given value from everywhere (though it would be one hell of a bastardized program). The characteristics are that everybody has direct, simultaneous access. If "ets" tables qualify as "global variables", then so do values sitting on a MySQL server; in erlang the difference between a MySQL value and an ETS value is pretty minimal, you access both via message passing. This is not a useful definition of "global value".

Now, it may still turn out that even with this level of access, it is "too free" and still subject to a reduced amount of the pain that global variables can bring. Haskell is a constructive example of a language that implicitly makes that argument and further distinguishes this case. But we don't benefit from failing to discriminate between a managed value that the manager permits anything to access the gate (albeit still in a managed fashion), and a completely unmanaged global value. They do have some similarities, but nevertheless, managed variables are nowhere near as bad as truly global variables.

If you're not careful and you come at this with fuzzy enough thinking, you'll end up spinning yourself into a tizzy where everything's the same. In fact what the situation is is that there's tens or hundreds of ever-so-slightly different cases and terms that try to cross the language barriers often have some issue, but there's still some broad patterns we can find, and if you find yourself just smooshing them all together you're going to find it hard to pick the right tool for the right job.

grogers · on Dec 30, 2009

The point is that with the named_table option, you don't need the tid of the table to access it, you can just use the name. Which makes it a global entry point. It doesn't matter if it requires messages to make it happen, it is still an implicit dependency. The same applies to named processes, etc.

Because of that you can write this code (to piggyback on the example you are responding to):

foo(Key) -> ets:member(global_state, Key).

Instead of:

foo(Tid, Key) -> ets:member(Tid, Key).

In the second case, all dependencies are explicit, in the first case, there is a dependency on the atom global_state being the name of the table you are accessing.

badoumba · on Dec 31, 2009

agree

statictype · on Dec 30, 2009

Doesn't this fit the definition of a strawman argument?

It's entirely possible and very natural to write imperative programs that don't rely on every possible global variable your application has ever declared.

True, if you're writing a library for consumption by others, then you would have to make it clear that your function depends on certain global variables being defined. But I would like to believe most people writing APIs for use by other programmers would know better than to rely on this type of behavior.

nostrademons · on Dec 30, 2009

I think a lot of people missed the article that this one is replying to:

http://prog21.dadgum.com/54.html

That article's main point was that functional programming doesn't work because deep in the bowels of two otherwise-pure leaf functions, you might want to make their behavior interdependent. In an imperative language, you can do this through adding a simple global variable. In a functional language, you have to change the signatures of all the callers to add this new parameter.

A lot of people pointed out that in a real program, you're asking for massive maintenance headaches if you connect two independent leaf functions through a global variable and make their behavior interdependent. In that context, this isn't a strawman, because somebody actually did suggest structuring programs this way.

Make a silly suggestion, get a silly reply.

I think both authors would do well to focus less on individual languages or programming paradigms and more on writing actual large-scale software systems. Because the imperative, OO, and functional worlds basically converge on that point: don't do that. You don't want the behavior of apparently-pure functions to change based on hidden, undocumented state. And once you document it, you might as well add it to the function signature, which is your "executable documentation" anyway.

cemerick · on Dec 30, 2009

I wasn't replying to that article -- at best, I was going off on a tangent prompted by a comment to that article.

nostrademons · on Dec 30, 2009

Well, then it's a strawman. ;-) Nobody decent actually writes imperative programs where every function depends upon the global state of the whole program.

jrockway · on Dec 30, 2009

The problem is that you can't know for sure until you read every line of code in the program, or at least read every function in a given call chain.

In languages like Haskell, the type system ensures that there are functions whose effects can be understood by only reading that one function.

kelnos · on Dec 30, 2009

True, but you're essentially arguing for a "nanny" language model that doesn't let you do things that someone decided is "unsafe" or "bad design." While I'd agree that writing a program in an imperative language where everything depends on hidden global state is a very poor choice, I'd rather use a language that trusts me to design my application as I see fit.

jimbokun · on Dec 31, 2009

'True, but you're essentially arguing for a "nanny" language model that doesn't let you do things that someone decided is "unsafe" or "bad design."'

Isn't the same true for, say, garbage collection? Higher level languages are often defined by the things they don't let you do, compared to lower level languages, in order to make your life easier.

Of course, if you're a C++ programmer, you probably agree that both mandatory garbage collection and immutable state are equally nanny-language features.

kelnos · on Dec 31, 2009

"Of course, if you're a C++ programmer..."

Ugh, no. C programmer. I don't have an intrinsic problem with GC (as it allows me to be lazy), though admittedly there are quite a few bad GC implementations out there that can cause noticeable performance issues.

Immutable state has its uses, but, again, it's a question of whether or not it's mandatory.

jrockway · on Dec 30, 2009

No I'm not. I'm arguing for safety as the default, and un-safety as an option. Haskell gives me this.

The usual imperative languages don't even make safety an option. You just have to try hard and hope for the best.

olavk · on Dec 30, 2009

In Haskell you can easily fake an environment with a bunch of mutable global variables avaliable throughout the program, although this is frowned upon. Since global mutable variables are discouraged even in imperative languages, I agree that the argument is a strawman.

There is however a nice property that Haskell makes it explicit in the type signature if global mutable state is available to the function. This makes it a lot easier to guarantee that large parts of a program is indeed side-effect free.

ltbarcly3 · on Dec 30, 2009

I've got this car. It's absurdly easy to run over people. All you have to do is drive on the sidewalk. With a boat, this would be somewhere between completely re-engineering it and impossible.

bad_user · on Dec 30, 2009

Oh, you should drive your boat near a beach on a hot day. Either way you can run over strangers ... however, in general if you want to be prepared for a specific target (in general), then it's more likely to get the job done with a car since not all people go to the beach or take a swim :)

So while driving a boat is fun, keep that car nearby.

billjings · on Dec 30, 2009

Part of the craft of programming is expressing a computation or interaction as simply as possible.

Now, somehow the original point (http://prog21.dadgum.com/54.html) has missed its mark. The original point was this: that for certain kinds of interactions, an imperative solution with shared state solution is vastly less complicated and preferable to a purely functional solution. This is because in some cases the purely function solution will litter a whole hierarchy of function calls or an entire set of data structures with additional data, but the imperative, stateful solution will not.

Now, obviously the problem with imperative is:

"...that state gets threaded everywhere, and you can't look at any function individually and know how it will behave."

But what is the problem with functional programming? That you can't look at any bit of state in the program unless some other part of the program has explicitly given it to you.

Now, I'd say that 95% of the time I'd rather have that problem than the problems that come with state - like having to worry about call order semantics, reentrancy, etc etc. But dammit - 5% of the time I'd rather use the state and be done with it!

And more generally speaking, 100% of the time I'd rather have a concrete example of something that is implemented simply today than an abstract idea of why something might be complicated in the future.

fauigerzigerk · on Dec 30, 2009

"The behaviour of every function in a mutable, imperative environment is dependent upon the state of all of the other (variables|attributes|bindings|whatever) in your program at the time the function is invoked."

That's not true in general, but the problematic truth is that you don't know when it is true and when it is not true in any particular situation. That is, there are no formal guarantees that tell the caller of a function that the function does not depend on each and every mutable state in the program.

The second issue is the fact that methods in object oriented code tend to depend on too much state held in the instance variables of their containing class. This is mostly due to bad design. But there must be something else if even the best programmers out there feel the need to make this design mistake over and over again when using OO languages.

I am sympathetic to functional programming but for some of the code I write I have to carefully optimize the memory layout of my data structures. Adding an immutability requirement to very stringent memory usage and performance requirements is just not feasible in these situations.

eru · on Dec 30, 2009

From the article:

> Of course, there is a place for mutable, imperative programming. The fellow who wrote the blog post to which I linked above appears to work on games, one of the few places where one could unapologetically use an imperative programming language with mutable state.

Comtrast http://lambda-the-ultimate.org/node/1277 where the authors of Unreal point out that functional programs will be the future of game programming.

cemerick · on Dec 31, 2009

Thanks for that pointer -- I added that link to the original post.

showell · on Dec 30, 2009

This seems like a wild exaggeration:

"The behaviour of every function in a mutable, imperative environment is dependent upon the state of all of the other (variables|attributes|bindings|whatever) in your program at the time the function is invoked."

You can write a function in an otherwise mutable and imperative environment like Python:

pi = 3.15 # damn, typo for the value of pi! pi = pi * 15 # mutate it, wrongly again

def sum(a, b): return a + b print sum(7, 8)

Please tell me how sum() depends on pi. The program correctly prints 15.

silentbicycle · on Dec 30, 2009

It doesn't. It is a wild exaggeration. It usually isn't the 300+ variables, just a few obscured by multiple levels of indirection that can cause unexpected inconsistencies. (It gets even worse when you drag in inheritance...)

Each function only depends on the arguments passed in, and variables outside of its scope which are referenced. However, the same is true of every function called inside the function, and the dependency is transitive, so the impact of changes to variables that aren't explicitly threaded through leaks outside to functions that never mention them. That's the problem. Hidden state.

It's also the kind of issue that usually looks incredibly contrived in small (say blog-post-sized) code samples, but isn't funny anymore when you have to deal with big lumps of spaghetti code. Global variables make reasoning about dataflow hard. Real problem, poor description.

olavk · on Dec 30, 2009

Clearly you can see in the implementation of the function that it doesn't rely on any globals. However if you look at the function as a black box, you cant really be sure what global state it relies on. This could be an issue if you use larger libraries you didn't write yourself (or in my case, if I use code I wrote more than two weeks ago!).

In a language like Haskell the type signatures clearly indicates which global state the function (and other functions called by it) has access to, which makes it a lot easier to reason about side effects, even for code where you haven't read the source.

Of course, this approach also have its downsides. If you decide for debugging purposes to add a logging function to an otherwise pure function deep inside a pure part of your program, you may have to change a whole lot of code.

KirinDave · on Dec 30, 2009

"Sum" is written functionally, thus it is not dependent on the outside world. You used a functional stateless method to show that imperative stateful programming is safe.

jrockway · on Dec 30, 2009

How do you know that "+" doesn't depend on pi? Oh. Now you see the problem.

EricBurnett · on Dec 30, 2009

If you need 100% guarantees, then yes, do things in a purely functional way. However, I think the point is that in any sane program you can easily and safely make the assumption that + does not depend on pi in this situation.

Yes, anything _can_ depend on any and all globals, but with proper practices, documentation, code reviews and what have you, one can make simplifying assumptions on what _will_ happen.

jimbokun · on Dec 31, 2009

"...but with proper practices, documentation, code reviews and what have you, one can make simplifying assumptions on what _will_ happen."

Or you can leave out the practices, documentation, and code reviews and use a functional language and get the same result. Yes, those things are useful for other reasons, but the current argument is about avoiding the inherent dangers of global variables.

cemerick · on Dec 30, 2009

Of course, the point is that a caller of any function in an imperative environment doesn't know what state its implementation depends upon.

anonjon · on Dec 30, 2009

It is an exaggeration; but only a wild one if you are doing very simple examples.

Imagine you have a 'mutable' program and I have a line that says something like this.

(defun frob (a b) (+ (foo a) (bar b)))

Not only is frob dependent on the definitions of foo and bar, it is also dependent on the the definitions of any mutable globals that are within foo and bar.

If you imagine your program as a directed cyclic graph of functions, and pretend that mutable globals are really functions that set or get a position in memory, you can see that adding more globals to your program (and using them) increases the complexity of your graph mostly by adding cycles to it. (As well horizontal jumps across the entire graph).

It is true that sometimes you need these cycles to write a program (vs. a program that causes your computer to heat up and do nothing), but it is not wildly inaccurate to say that using them all the time makes your program more complex.

I'm not even sure that I'm on board with the supposition that global variables shouldn't be in the language; (saying 'shouldn't' exist at all to a language feature is entirely non-pragmatic).

I think it is more accurate to say that they should be in the language, but they should have a cost greater than locals, and it should be best practice to use them as little as is reasonably possible.

eru · on Dec 30, 2009

Also frob is dependent on the (mutable) definitions of foo and bar. Functions can be redefined in e.g. Python and Scheme. (And functions are variables, too.)

euccastro · on Dec 31, 2009

Same for +.

Jach · on Dec 30, 2009

I see the main point, and I try not to use globals when I can, but I can't help but recall a quote I read from somewhere: "People who are afraid of globals are usually afraid of girls, spiders, etc."

KirinDave · on Dec 30, 2009

I wouldn't trust the person who told you that, anymore.

Pretty much the last decade of software engineering has been a slow realization how how untenable the use of large amounts of global state is.

Jach · on Dec 30, 2009

You're right, of course, but sometimes globals can be the magic sauce you need to hack out something quickly. (I find global hate to be similar to 'goto' hate.)

cemerick · on Dec 31, 2009

Ah, the "I'm a big man, I code by writing bits to the platters with a magnet" argument. (Quick, prize to the first xkcd link here!)

Jach · on Jan 1, 2010

I found it an amusing quote, and I'm a strong believer in using the right tool for the right task. :)

francoisdevlin · on Dec 31, 2009

http://xkcd.com/378/

You asked :)

nerme · on Dec 30, 2009

Graphical data flow languages like Max/MSP, Puredata, LabVIEW, and even Quartz Composer make a lot more sense for functional programing.

Every time I see someone "explaining" what is going on in a functional language, it looks just like a Max patch! There are multiple inputs attached to multiple outputs. Of course, if you're well familiar with whatever textual based language you're working with, you are visualizing this process, but still, why aren't all functional languages moving in this direction?

The great thing is you can still drop text-based code in to one of these graphical dataflow languages.

I write a LOT of code in Max/MSP and Quartz Composer. Thankfully, both support Javascript, because some things are just plain easier if you can drop back in to a procedural, text-based language.

Even writing your own compiled plugins/externals is pretty straightforward. Quartz uses Obj-C and Max/MSP uses C++, although I think you can mix in or use pure Obj-C as well. I haven't had to write any functionality in to Max that wasn't already available with the core package or from CNMAT, UBC Toolbox, etc. (yet!)

Anyways, what I'm getting at is that there is a huge advantage in laying things out in a graphical way, especially for concurrent programming.

What I find interesting is that there is very little support for these languages. There are no patterns, best-practices, etc... I've had to bring over the basic concepts from other text-based languages when I'm writing a larger project... and often when I'm looking at other people's patches I'm astonished at the lack of maintainability, the amount of code repetition, etc... it can easily turn in to a spidery mess if you're not used to what you're doing!

eru · on Dec 30, 2009

Can you do higher-order functions in Max?

nerme · on Dec 30, 2009

Obviously not in the same manner... but I feel that proper use of patcher/bpatcher objects and utilizing the meta-programming capabilities of it's javascript environment by allowing you to dynamically generate objects and connections allow you to accomplish pretty much whatever it is that you need to do... while gaining my favorite benefits of higher-order functions, namely, tighter, more organized code with less redundancy and awesome interfaces.

That being said, Max is still missing some very important aspects, however, such as iteration and recursion.

Interestingly, Quartz Composer does have a method for iteration, one that I frequently use... it works really well, actually!

I really like text-based programming, but I'm increasingly seeing the benefits of graphical programming.

eru · on Dec 31, 2009

Graphical programming and lambda calculus can go together well. Have a look at "To Dissect a Mockingbird: A Graphical Notation for the Lambda Calculus with Animated Reduction" (http://dkeenan.com/Lambda/) for some ideas.

Quarrelsome · on Dec 30, 2009

Can we not just summarise this whole thing with the words:

Shared state should be avoided, if possible and pragmatic. Now go do this in whatever language you wish.

Psyonic · on Dec 30, 2009

No, because its not just about you. How can I as a developer know if a function I'm calling in C uses more than its inputs? That's the real issue.

vidarh · on Dec 30, 2009

Except it hardly ever is an issue. I can't remember last time I came across a situation in any of the languages I use that support global variables where some function depended on some global variable without the docs making a huge fuss about it. It was certainly many years ago.

cemerick · on Dec 30, 2009

None of this has to do with globals -- the key issue is accessibility of shared mutable state from within imperative functions. That is an inescapable constant outside of FP environments.

cemerick · on Dec 30, 2009

No, because what language and environment you choose impacts your ability / likelihood of actually doing that.

Quarrelsome · on Dec 30, 2009

Sorry, I meant mutable state.

stcredzero · on Dec 30, 2009

Reminds me of one day, when a Squeak Smalltalk newbie showed up on the mailing list, not understanding why his music-related program won't compile. He has his whole program in one method, and needs several hundred temporary variables to store all of his note objects. When informed of the 255 temp variable limit, he starts getting outraged.

ssp · on Dec 31, 2009

316 arguments is clearly an exaggeration, but there's a reason programmers are admonished against global variables. For example many of the classic Unix utilities were built around a pile of global variables, and they could be quite difficult to understand.

Sometimes you see people making fun of programmers who have eliminated the pile of global variables by passing around a pointer to a structure containing all the variables instead. But this actually is an improvement because the type signature of each function now indicates precisely what state can be modified.

cousin_it · on Dec 31, 2009

Late to the party again.

The way I see it, when you face a programming problem you mentally divide it into subtasks and use solve each one in the most natural, sweet-spot manner learned from experience. Have to parse a minilanguage? Use an FP approach. Control UI widgets? Do it in OOP style, that's what it was designed for. Keep track of global game state? Just write it imperatively and be done with it.

As a professional you need to understand the good ideas in your field, but as a responsible worker you need the wisdom to avoid pushing those good ideas beyond their proven area of application. One day you'll read an article somewhere that will demonstrate how Haskell is a great fit for ragdoll physics simulations (for example), like it happened before with Parsec for parsing (another example). Until that day... don't rush in blindly believing that purity is better for every class of problem. It might be, but we don't have anything resembling a proof yet. So learn from experience, and subdivide.

jamii · on Dec 31, 2009

The arguments here all seem to assume that you're writing the code yourself. I think the best argument against rampant global state is when dealing with other peoples code.

An example from a couple of months back. I was using an open source library for parsing LaTeX code before it was added to a search index. After getting the alpha version up and running I ran into a subtle bug - malformed input such as "$$ \dot{C}O_2" would break every future parse of any mathematical input. So somewhere in this 20k loc library a global variable is not being reset after parsing. Since this is an imperative language there is no way to figure out which part of the call chain is calling which global variables so I potentially have to read every line of code of find it.

After two days I managed to track it down by writing a script that reads the source code to identify every static global variable in the code and tracks their values between calls. This is not the ideal way to find bugs.

tlb · on Dec 30, 2009

Can someone point to a useful program written in a 99%+ functional style?

eru · on Dec 30, 2009

GHC is written in Haskell. Darcs may also be useful. You can also look at http://en.wikipedia.org/wiki/Haskell_(programming_language)#... for some examples. It's hard to escape functional style in Haskell.

eru · on Dec 31, 2009

There are also some interesting applications in Clean. They (used to) ship a jump-n-run game as an example with their compilers and IDE.

ajross · on Dec 30, 2009

This seems to be missing the point. Yes, it's true that the functional programming equivalent of functions with side effects is functions with lots of extra arguments to capture the equivalent state.

Which is kind of the point: in an imperative language, you can capture all that complexity -- all 316 arguments, even -- with a few quick hacks to the side effect behavior of your functions. And that's a feature. It makes things simpler sometimes (obviously it can hurt, too -- tools can't substitute for taste). And it just can't be expressed in functional terms.

jrockway · on Dec 30, 2009

And it just can't be expressed in functional terms.

Incorrect. Say you want to read some global state:

   a = do
       info <- ask
       return (info + 1)

   b = do
       info <- ask
       moreInfo <- a
       return (info + moreInfo)

   runReader b 42 -- ==> 42 + (42 + 1) ==> 85

This lets you weave information in as deeply as you want. If b doesn't need "info" but a does, and b calls a, that's still fine. You don't need to change b to make the information available to a. Just like a "global variable".

You can also accumulate arbitrary information:

  fib :: Int -> Writer [String] Int
  fib 0 = tell ["fib 0"] >> return 1
  fib 1 = tell ["fib 1"] >> return 1
  fib x = do
    tell ["fib " ++ show x]
    a <- fib (x - 1)
    b <- fib (x - 2)
    return $ a + b

  runWriter (fib 4) ==> (5, ["fib 4", "fib 3", ...])

Again, very much like a global variable holding log data, but without many undesirable possibilities (like clobbering the log history accidentally, or performing IO during the computation) that are inherent in "global variables".

And of course, you can read and write state, if you want:

  inc :: State Int Int
  inc = do
    old <- get
    let new = old + 1
    put new
    return new

  fib 0 = inc >> return 1
  fib 1 = inc >> return 1
  fib x = do
    inc
    a <- fib (x - 1)
    b <- fib (x - 2)
    return $ a + b

  runState (fib 25) 0 ==> (121393, 242785)

This is just like imperative programming, except easier and more fine-grained. Note how the number of times "fib" has been called is scoped to the "runState" invocation, and that we don't even need a "variable name". And as with Reader and Writer, if a called function needs the State, the calling function doesn't need to pass it in.

So basically, many variations on the hacks that imperative programmers can be implemented easily in functional languages. Imperative programming languages give you a box to do with as you please, but functional programming languages can give you a box that you can prevent yourself from accidentally misusing. The best of both worlds.

ajross · on Dec 30, 2009

Doesn't the fact that you had to write this treatise (which, due only partially to syntax impedance, I frankly don't understand) to explain how to set a flipping global variable maybe clue you in that it's kinda complicated in functional languages? :)

pictureotf · on Dec 30, 2009

This says more about your understanding of Haskell than anything else, really.

The GP's example is not a _global_ variable, by the way. Global variables are not really available in Haskell, except if you use unsafePerformIO:

  import System.IO.Unsafe
  {-# NOINLINE global #-}
  global :: IORef Int
  global = unsafePerformIO $ newIORef 0

ajross · on Dec 31, 2009

Ignorant as charged, I guess. And the fact that neither of you can see the embarassing complexity of these solutions relative to, again, setting a variable reflects what, your corresponding ignorance of von neumann architectures?

jrockway · on Dec 31, 2009

Oh, we see the "complexity" in implementing this. (Note: not very complex in real life.) But we also see the embarrassing complexity of maintaining software that uses a "quick global variable" here and there.

It's nice to know what your software does without having to guess and pray. That is worth saying "runState foo 42" instead of just "foo".

ajross · on Dec 31, 2009

Right. So you guys are purists who refuse to use imperative constructs even when they're objectively simpler, either because (1) you see hidden costs the rest of us don't or (2) you're wrong.

Isn't that the point of this whole argument?

jrockway · on Dec 31, 2009

due only partially to syntax impedance, I frankly don't understand

Consider:

   foo <- bar

the same as:

   var foo = bar();

and:

   foo >>= bar

like:

   var x = foo();
   return bar(x);

and:

   foo >> bar

as:

   foo();
   return bar();

jules · on Dec 30, 2009

Twisted thinking. He first mentions that functional programs have to thread state they use through the entire program by passing it as parameters to functions. Then he says that imperative procedures can depend implicitly on the global variables they use. The equivalent in a functional program is adding a parameter to all the functions that use the state, and to all functions that call these functions.

He goes on to say this:

> Would you ever intentionally write a method signature that takes 316 arguments? Would you use any library that contained such a function signature? No? Then why are you using tools that force such craziness upon you?

So we're now expecting to get a conclusion, namely that functional programming is inferior to imperative programming, because you get functions with 316 parameters that thread the state through the program. But no, despite setting up an entire argument against functional programming he acts like he's arguing against imperative programming:

> Of course, there is a place for mutable, imperative programming.

I'm very confused.

DougBTX · on Dec 30, 2009

I'm very confused.

I expected a different conclusion from the one you describe, so it didn't sound twisted to me.

In the last question you quote, he refers to "tools". The question will only make sense if you understand that he is referring to imperative tools, not functional ones.

because you get functions with 316 parameters that thread the state through the program.

He assumes that you would never write a single function which depends on 316 different parameters, but that there could be 313 global variables.

Now you have a function which takes three arguments.

His argument is that to understand what this function depends on will require reading all the code in the function and all the code in every function which it calls, to work out which global variables it accesses. Without doing that, you have to assume that this simple function depends on all 313 of them, even though you would never write a function which explicitly took that many arguments.

On the other hand, in a purely functional language, you would still have a function with only three arguments, and simply by reading the function signature, you can understand what it will depend on.

The middle ground is to say, hey, but the function really did depend on some of those global variables, so in the real world the function will have more than three arguments. More than three, but less than 316, lets call it five.

But you find yourself with six other functions which depend on the same two global variables, while nothing else does. So, you go off and make that explicit by having one class with two instance variables and a few methods, one of which takes three arguments. And then you ban global variables. Welcome to OOP.

jules · on Dec 31, 2009

> His argument is that to understand what this function depends on will require reading all the code in the function and all the code in every function which it calls, to work out which global variables it accesses. Without doing that, you have to assume that this simple function depends on all 313 of them, even though you would never write a function which explicitly took that many arguments.

That is not what he said. He said that a function in an imperative language depends on all global state in the program, not could depend:

> The behaviour of every function in a mutable, imperative environment is dependent upon the state of all of the other (variables|attributes|bindings|whatever) in your program at the time the function is invoked.

This is obviously false. And if you did have a function that was dependent on all 316 global variables, then your equivalent functional program will have functions all over the place threading 316 state parameters through the program! That is what I was trying to say in the post above.

Now lets consider the more reasonable situation that the function does not depend on all global variables:

> The middle ground is to say, hey, but the function really did depend on some of those global variables, so in the real world the function will have more than three arguments. More than three, but less than 316, lets call it five.

So what's the advantage of explicitly writing out the parameters? You could do the same in an imperative program: just write a comment above the function definition that says which globals it uses and which ones it modifies. The advantage of mutable state is that you don't have to do this: you can abstract over stateful entities. Consider logging for example. You could have a global variable called "log" that is a list of strings. Then define a log function:

    logmessages = []
    
    def log(msg):
      logmessages <- cons(msg,logmessages)

If you want to log something in the middle of your program you can just insert a log("message") call. If you wanted to do the same in a functional program you'd have to thread the logmessages variable through your entire program. You need to change all functions that want to log and all functions that call functions that want to log to take an extra argument and to return an extra value (changing the functions so that they return tuples, and all call sites have to unpack the tuple, etc). Which solution is better? The whole point of imperative programming is that you don't have to write these kludges.

It gets even worse very quickly if you don't just consider mutable global variables, but also mutable objects. It is not at all obvious how you'd translate an imperative program that mutates objects into a functional program without essentially writing an interpreter for an imperative language. There are papers on doing this when there is a single reference to the object that gets mutated (Zippers), but the real power of state comes from having multiple references to a single stateful entity.

Imperative programming has native support for stateful entities (or you could say has native support for time), in functional programming you have to manage time yourself. It's like manual memory management vs garbage collection.

cemerick · on Dec 30, 2009

You need to read the post again. What you thought I was writing, were quotes from others.

snorkel · on Dec 31, 2009

All my methods take 1 argument: the entire state of the universe and everything in it.

pwnstigator · on Dec 30, 2009

It's hard to avoid this, even in a functional language like OCaml (which has mutable ref cells).

Haskell does a pretty good job, by requiring functions that have side effects to show it in their type signature. If you're doing Haskell right, the number of functions that have side effects is very small as a proportion of the number of functions you'll write in total, so most functions you write will be purely referentially transparent. Of course, I'd argue that this is true if you're using any high-level language properly, except possibly in GUI programming.

jfl · on Dec 30, 2009

All my methods take just a few argument, I learned to design carefully the state of my programs and manage adequatly how function access that state. Imperative programming works for me (and many others). You say I am doing it wrong and want me to do it your way. Proselytism ?

KirinDave · on Dec 30, 2009

No one is going to argue that your program doesn't work when it seems to work.

Unless you're using threads and locks. Then it's probably a safe bet that it will fail in ways you don't expect. ;)

jfl · on Dec 31, 2009

Actually, my webapps are full of threads. Again, Imperative style is not the wrong way, and FP the right way. They are just different. Why not let people find their own way ? Just give the information they need to make up their mind. And why not be honest and tell them that learning Haskell is probably more painful that learning to deal with the mutable state.

KirinDave · on Dec 31, 2009

You don't understand what I'm saying... The likelihood that there are no deadlocks or race conditions in your threaded applications is incredibly low without a massive amount of testing. Even experts in the field can mess up and find bugs surfacing years later that they didn't expect. The fact that it's hard to make a non-trivial multi-threaded programs bulletproof is proven both theoretically and empirically.

That's why there is a movement in the software community to consolidate threading portions of the code and boil them down to the simplest constructs that could possibly work. It reduces the debugging load significantly. And that's why a lot of time is being invested developing strategies that are atomic and insensitive to threaded programming, because it lets you sidestep the problem entirely. Functional programming is just the leading edge of this trend, it's already sweeping through the imperative and objective world. Things like Software Transactional Memory and Erlang's thread management are examples of tenable approaches. These approaches are a roadmap of how we can move forward and make truly and clearly reliable threaded programs. Programs that don't have land mines waiting for a 1:1000000 shot to set them off and inexplicably ruin everything.

This isn't about "finding your own way." It's about groping for any handhold as we're about to fall into a dark precipice of totally unmanageable massively-parallel and/or distributed code.

P.S. Haskell only seems harder to you because you're clearly not dealing with all the problems that mutable state brings. It's much easier to ignore something than to learn to deal with it.

cemerick · on Dec 31, 2009

There is a reality out there, we're not arguing over pastels vs. chalk vs. pencils. You can write webapps in assembly if you're so motivated and tenacious.

on Dec 31, 2009

[dead]

cemerick · on Dec 31, 2009

A caller of a function in an imperative environment cannot assume anything about the scope of that function's access of mutable state within the program. My characterization of the situation might be a little spectacular w.r.t. specifics (e.g. 316 arguments, etc), but I've not seen a robust refutation of the principle.

There's no doubt that there is subjectivity w.r.t. practices and methodologies, not to mention domain-specific tendencies and requirements. But it's a little absurd to claim that, very specifically, mutable state is on par with persistent data structures (or, at the very least, immutable state constructs) on almost any axis of reliability or ability to reason about a program's operation, especially in conjunction with any degree of concurrency.

"Pain" is definitely subjective. IMO, if you're not feeling at least a little pain, you're not pushing your skills, your craft, etc. Too much pain, and it's possible you've not kept yourself limber enough to stretch for the big wins, regardless of context.

jfl · on Jan 1, 2010

"A caller of a function in an imperative environment cannot assume anything about the scope of that function's access of mutable state within the program"

This really doesn't make sense.

main() { A a = new A(); f(); ... } the caller main() can assume that "a" is not in scope of f(). Is it the robust refutation you need ?

The caller can further assume that the scope of f() is limited to the part of the state for which there is a path from a global variable or from one of its parameters (You can see the state as a graph. The nodes are the data and the links are the references) Therefore, this scope is not the whole state but a limited subset of the state. Claiming otherwise is wrong and does not help to bring attention to what need to be improved : this subset is itself a superset of the part of the state that the function really need. We need new techniques to hide more state to the function, for example some kind dereferencement control ?

KirinDave · on Jan 1, 2010

I believe cemerick meant, "A caller of a function in an imperative environment cannot assume anything about the scope of that function's access of mutable state within the program without reading the entire program." If they verify it by perfectly understanding the underlying code, the of course that doesn't hold.

But even in the code snippet you gave us, we really don't know if global state is being manipuated. For example, A could be handling global counters or referencing global variables (common example: stdin and stdout). And how do you know that a doesn't get manipulated by f()? You'd have to check! Perhaps A's constructor registers all A's it every creates and then f() marks all A's for protection from garbage collection?

Really, we can't assume anything about the code snippet you gave us without verifying it. The code might be modular, it might be reasonably modular, or it might be a complete mess.

This boundary of uncertainty is what functional programmers are railing against, and what you're defending.

jfl · on Jan 2, 2010

We have to agree to disagree. Peace.