For all the thought you've put into your comment, you clearly have a lot of ego -- probably too much -- invested in being the guy who's not afraid to publicly point out that it might be worth making scientific inquiry into racial differences.
Let me point out exactly how you're being stupid here.
The article is about future ramifications of genetic sequencing technology finally becoming economical enough that it's possible to cheaply sequence a person's complete DNA.
You're right that this will allow research to be done of the form: grab a DB with 100k different people's sequences; discover strong statistical associations between such-and-such genes and such-and-such amount of alcohol intolerance, and notice that those genes are particularly prevalent amongst people of asian descent.
Where you go awry is in your conceptualization of how this information will likely be used; you take a premise you've accepted in one part of your scenario -- the premise that DNA sequencing will be cheap-and-thorough enough to allow for research like that just sketched -- and fail to apply it to another part of your scenario -- the doctor-patient interaction, and so on.
Given a world in which DNA sequencing is super-cheap and super-affordable, which doctor-patient interaction is likelier:
- doctor says to patient: "you clearly are asian just by looking at you. asians have been shown to have lower alcohol tolerance on average. be careful with how much alcohol you imbibe."
- doctor says to patient: "your DNA sequence contains 7 of the 10 major genes associated with higher levels of alcohol intolerance. this isn't surprising given your family history, but it's nice to know concretely what you've got. be careful with how much alcohol you imbibe."
...which is why you're being stupid: race is a proxy measure for ancestry that's "field-performable" (just look!) and cost-effective, and it is not an entirely useless proxy measure for ancestry; it is indeed the case that, were political bugaboos different than they are, some useful research could be done to characterize racial differences that could be put to general use.
It's also a terribly inaccurate proxy, and given the ability to do direct DNA sequencing it would be dropped like a hot potato; why use the inaccurate proxy when you can make a direct measurement?
So even in your hypothetical example of looking for a wife: when this research is available, would you not be better served testing for specific genes in a particular individual rather than cutting it off at the level of race? You might still want to move to Ankara if you discovered that "Turks" are likelier than Brits to carry the genes you want your kids to have, but would you really not take it further and check the specific woman's DNA directly (the way that some Jews test for Tay-Sachs at some point in the courtship process)? If you take the research seriously enough to move to Ankara to stack the deck, why stop there?
You're essentially making the same kind of mental error you saw in the early dot-com days where people understood that "in the future, you'll buy stuff at home over the internet" -- so were semi-prescient -- but couldn't discard what they already knew about "shopping"; the consequence being that they'd write articles about how you'd sign into some 3d virtual world and then visit an online bookstore and browse the shelves in virtual reality (transporting outdated ideas into a world of new possibilities).
Say a proper Bayesian who's a fan of 'da Bears' just saw the Bears lose again, badly. How is he supposed to update these two beliefs:
- this season 'da Bears' will tear shit up every Sunday (fwiw assume prior on this has it as likely)
- I am usually over-optimistic about 'da Bears''s performance (fwiw assume prior on this also has it as likely)
...in light of the new evidence?
(Side note: I'm specifically not talking about bayesian analysis or statistics in a formal setting with a prespecified, finite list of allowed hypotheses (like spam or ham).
If I wrote this out in extreme detail I'd get nothing done today; I think you can charitably fill in the gaps and missing pieces of what I'm getting at, and if not I'll fill them out in a few.)
There are a ton of things to clarify, but I'll take a stab at it.
A proper Bayesian who is betting, doesn't want to lose badly, and still ends up a fan of 'da Bears' has probably got a prior that gives some confident edge to them winning. Belief A could be stated using this joint distribution as the product of their predicted chances of winning each game in order. If it's actually 'likely' then that means you've got a pretty incredible edge on them winning (its size proportional to how much of the season is left).
Belief B is a weird one though. It's a meta-hypothesis about the calibration of your own personal beliefs. The evidence you use to update on this belief is the discrepancy between what you would honestly predict and what actually happened. A proper Bayesian with money on the line would want to recalibrate as best as they could using data already available in order to get the probability of B as low as possible before starting to look at A.
So our proper Bayesian first looks over old predictions he has made about the Bears' performance, reworking whatever internal understanding he has of the factors that go into winning at football, until he is well calibrated. At this point, his probability for belief A has almost certainly dropped, because it's a pretty unlikely thing for a team to tear things apart every single game for the whole season; but if he still ends up with a strong prior on them winning, then a single loss, even a pretty bad one, won't shift it around a lot.
In short, he'll think about it a lot, cancel out whatever personal biases he can manage, then bet conservatively unless he has some sort of knowledge that provides a really, really strong edge on them winning. IMO, he's got an inside line with some dirty, dirty men.
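To make the shape of that concrete, here's a toy joint version of the update (numbers made up purely for illustration): model B as determining how much to discount your gut prior on A, and let the game outcome depend only on A.

P_B = 0.5                                 # prior: I'm over-optimistic (toy number)
P_A_given_B = {True: 0.3, False: 0.7}     # if I'm biased, the honest prior on A is lower
P_loss_given_A = {True: 0.1, False: 0.4}  # good teams rarely lose badly
joint = {}
for a in (True, False):
    for b in (True, False):
        p_b = P_B if b else 1 - P_B
        p_a = P_A_given_B[b] if a else 1 - P_A_given_B[b]
        joint[(a, b)] = P_loss_given_A[a] * p_a * p_b  # P(loss, A=a, B=b)
z = sum(joint.values())
posterior = dict((k, v / z) for k, v in joint.items())
print sum(v for (a, _), v in posterior.items() if a)  # P(A | loss): 0.5 -> 0.2
print sum(v for (_, b), v in posterior.items() if b)  # P(B | loss): 0.5 -> 0.62

A single bad loss simultaneously drags down belief in the team and pushes up belief in your own bias; the "meta" belief is just another node in the joint.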
The specific example of 'da Bears' was a weak attempt at humor. I can't easily clarify what I'm getting at and keep this short, and I've got limited time so I'll do the best I can.
I see a lot of people using "informal" bayesian reasoning (meaning a lot of talk about priors and updating and references to theorems, but never any use of actual distributions beyond super-super-cursory examples applied to trivial situations like the boy/girl thing here or stuff like the Monty Hall problem).
I don't have any problem at all with bayesian analysis applied in a rigorous setting to a rigorously specified problem (like spam detection and so on).
In an informal setting I'm extremely skeptical of the uses I tend to see b/c there's no careful attempt to clearly delineate which informally-statable hypotheses are valid and which are "invalid" "meta-hypotheses" like the optimism thing.
What you've described here is a way in which someone reasonably smart would eliminate the meta-hypothesis, which is fine. In general I wouldn't expect it to be feasible to take a full mental inventory, do a topological sort on your beliefs, and then apply the same procedure; most people, most of the time, will be running around holding partially-inconsistent beliefs. (Here "hold" means: if you asked them to estimate, say, what fraction of their beliefs are likely to be revealed as significantly off in the future, or how frequently they expect to encounter evidence forcing significant revisions of their beliefs, they'd have an answer on offer that would still have "work to do" -- the way the unexamined belief "I'm too optimistic about the Bears" really has work to be done.)
What I'm curious about is whether there's either a clearly-specifiable criterion for which types of beliefs or hypotheses are workable and which are "too meta to work", or some kind of theorem guaranteeing that, starting out with "inconsistent" beliefs -- in the sense of "meta-hypotheses" like with da Bears -- you can apply this algorithm to process evidence and over time converge on beliefs that're at least more consistent than you started with.
It's hard to say much more without getting formal and I'm out of time for now; since I'm mainly concerned with informal use of "bayesian" metaphors it's not hugely critical to formalize this stuff but later I could give it a proper whack.
That's definitely an interesting space. I think the highly principled side of the Bayesian boat would state that meta-hypotheses are tied into your prior on model-building information (Pr(I), etc.) and that it needs to be updated alongside everything else. So now if your hypothesis' posterior becomes something like Pr(H, theta, I), the whole business needs to be updated and will include all the meta-level intellectual rigor. At this point I feel like I might be walking into the space of Structural Causal Modeling, and I'm not too well versed there at all.
In the informal setting, though, you're only ever likely to be trying to "update" one belief at a time, so, yeah, it definitely requires intellectual care to make sure to follow dependencies. Worse, though, is that it should be possible to have two codependent estimates, and if you aren't aware of that codependency you won't ever be able to get anywhere.
I think that's all interesting, but I'm not sure it applies to informal situations as well as one might hope. Frequently, Bayesian techniques are only used informally in conjunction with strong rationalist heuristics which help to build these reductionist hierarchies of effects and then allow for clear(er) methodology to find an accurate answer.
Few people thinking carefully and rationally would be willing to bet on their beliefs so long as they know that they have an outstanding miscalibration. That's why scientists, good scientists anyway, will so often preface things with disclaimers. They want you to be aware of whatever biases they can identify before you start to judge their opinions.
Just something I had a hard time figuring out and seems poorly documented:
--type-set=scala=.scala # works in .ackrc
--type-set scala=.scala # doesn't work in .ackrc
...despite the one-equals version being the correct command-line syntax. This only seems to matter if you want each flag on its own line in your .ackrc file. Might not be true for everyone.
I have a longer comment here that argues the opposite: list comprehensions are only a syntactic pun or two away from set-builder notation, which is a higher level of abstraction (it declaratively states what the result is) than map+filter (which specify a procedure to generate it, albeit at a higher level of abstraction than a for-loop).
If you can put together a nontrivial usage of map+filter with at least three source collections that's more concise than the equivalent list comprehension I'll (figuratively) eat my hat.
I'm not so sure set notation is higher level, but for the example in your longer comment, map/filter/product is not that bad if you use it wisely. Here is my version:
from itertools import product  # Python 2; note the tuple-unpacking lambdas
map(lambda (w,s,l): {'widget': w, 'sprocket': s, 'location': l},
    filter(lambda (w,s,l): l.hasInStock(w) and l.hasInStock(s) and w.isUsableWith(s),
           product(widgets, sprockets, locations)))
Well, I agree it is not any more concise than the list comprehension (about the same, I guess?). Nothing is perfect; like you said, know your tools :)
You'll be thrilled (lol) to know that the lambda-tuple syntax isn't in python 3 (!); it does make my examples more concise.
The argument in favor of set notation being higher level is that it's less specific (it doesn't explicitly provide a sequence of operations, just an outcome).
List comprehensions look like set notation but have an implicit procedural translation you have to keep in mind to use them well, so it's a toss-up.
I prefer map/filter/reduce when sequencing has large performance implications but for simple filtering or raw-data-shaping comprehensions read more smoothly.
> You'll be thrilled (lol) to know that the lambda-tuple syntax isn't in python 3
Now you know why many people like me are getting gradually pissed off at Python and starting the exodus ... currently trying Scala, and it seems a nice language (with list comprehensions too! :)
> List comprehensions look like set notation but have an implicit procedural translation you have to keep in mind to use them well, so it's a toss-up.
Actually I think that's the problem I have with list comprehensions: I use them a lot in my code, usually nested 1~3 levels, and then have a hard time tracking down the order of the implicit loops (which is inner vs. outer) and making sure the intermediate variables (x for x in y for y in z ...) do not clash ... OK, maybe I'm using them too much and in the wrong way :(
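For what it's worth, the rule is mechanical: the for-clauses read left to right as the equivalent nested loops, outermost first. Quick sanity check:

z = [[1, 2], [3, 4]]
flat = [x for y in z for x in y]  # leftmost "for" is the outermost loop
flat2 = []
for y in z:          # outer loop (leftmost clause)
    for x in y:      # inner loop (rightmost clause)
        flat2.append(x)
assert flat == flat2 == [1, 2, 3, 4]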
> I prefer map/filter/reduce when sequencing has large performance implications but for simple filtering or raw-data-shaping comprehensions read more smoothly.
I didn't know map/filter was faster than list comprehensions? I thought both were optimized by the Python interpreter. But I like the idea of knowing that at least map can be parallelized easily. But since Python does not utilize multicore in a decent way, all bets are off :(
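I guess timeit would settle it on any given interpreter (results presumably vary by version and workload):

import timeit
setup = "data = range(1000)"
print timeit.timeit("[x * 2 for x in data]", setup, number=10000)
print timeit.timeit("map(lambda x: x * 2, data)", setup, number=10000)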
Using do-notation is probably cheating, but what do you think of this? (in Haskell):
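-- (this runs in the list monad; guard comes from Control.Monad)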
do { w <- widgets;
s <- sprockets;
l <- locations;
guard (l `hasInStock` w);
guard (l `hasInStock` s);
guard (w `isUsableWith` s);
return (w, s, l); }
We're trading horizontal space for vertical space. I think it's much clearer than either list comprehensions or plain map/filters. It's the best of both worlds.
I like the look of it. Am I correct that the order the statements are in translates into the order things are evaluated in when the code is called?
If eg I edited it to be:
do { l <- locations;
w <- widgets;
guard (l `hasInStock` w);
s <- sprockets;
guard (l `hasInStock` s);
guard (w `isUsableWith` s);
return (w,s,l); }
Does that force it to go through "location-first" and only check the w and s of (w,s,l) for compatibility if it has already ascertained that w and s are in stock @ l?
It depends slightly on how you came to functional programming.
You can arguably trace Python's list comprehension syntax all the way back to SETL, a "set-theoretic programming language", and the resemblance to mathematical set-builder notation is intentional; compare
- let B = { g(a) | a \in A and f(a) holds }
- B = [g(a) for a in A if f(a)]
If you're used to thinking in sets, having to decompose into "maps" and "filters" is a speedbump; easy to do but nice to avoid.
Where list comprehensions really start to shine is making it comparatively trivial to pull from multiple source collections without a lot of ugly machinery:
[{'widget':w,'sprocket':s,'location':l} for w in widgets for s in sprockets for l in locations if l.hasInStock(w) and l.hasInStock(s) and w.isUsableWith(s)]
...which is about where explicit map + filter start to become annoying. You can use:
map(lambda i: {'widget':i[0], 'sprocket':i[1], 'location':i[2]}, filter(lambda i: i[2].hasInStock(i[0]) and i[2].hasInStock(i[1]) and i[0].isUsableWith(i[1]), itertools.product(widgets,sprockets,locations)))
...but to my eyes that is not only very ugly, but, just going by character count, the # of characters given over to keywords (map, lambda, filter) instead of "what I'm doing here" is huge. Additionally, the use of itertools forces the use of tuples for your intermediate values, and thus the lambdas are gobbledegook until you get to the tail end of the statement and see that i[0] == widget, i[1] == sprocket, and i[2] == location. I could define some constants (WIDGET = 0, SPROCKET = 1, LOCATION = 2, etc.) but now it's even longer.
It gets even worse if you try to be clever with the sequence of operations.
You might look at that definition and say lo! I can pre-filter out stuff not in stock at each location and make things more efficient. Naively you'd wind up with:
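# one way to write it: reduce splices together per-location results
reduce(lambda acc, l: acc + map(
           lambda p: {'widget': p[0], 'sprocket': p[1], 'location': l},
           filter(lambda p: p[0].isUsableWith(p[1]),
                  itertools.product(filter(l.hasInStock, widgets),
                                    filter(l.hasInStock, sprockets)))),
       locations, [])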
Under some circumstances that might be substantially faster than the previous approach. But compare it to the equivalent list comprehension you could have instead gone with:
[{'widget':w,'sprocket':s,'location':l} for l in locations for w in [widget for widget in widgets if l.hasInStock(widget)] for s in [sprocket for sprocket in sprockets if l.hasInStock(sprocket)] if w.isUsableWith(s)]
So you lose a little flexibility in simple cases in exchange for increasing the scope of what you can get away with as "readable" one-liners.
Python's lambda sucks, but that does not necessarily mean map/filter sucks with it too. Of course there are cases where list comprehensions are more convenient and more powerful (esp. if they are more like their "real" counterparts in Haskell and friends). But in cases when map/filter are more convenient, I like the option of using them.
I'm partially defending list comprehensions here and partially challenging you to consider the possibility that there are higher levels of abstraction out there than those employed by functional programming primitives like map and filter.
Level 0: for-loops with an explicit accumulator
Level 1: map + filter (!)
Level 2: ??? arguably an atemporal set-theoretic approach
In practice in python list comprehensions are a superior syntax for computing with multiple source collections.
(!) Really all you need is reduce
map = lambda f,l: reduce(lambda h,t: h + [f(t)], l, [])
filter = lambda f,l: reduce(lambda h,t: h + [t] if f(t) else h, l, [])
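eg, with those definitions standing in for the builtins (Python 2, where reduce is still a builtin):

assert map(lambda x: x * 2, [1, 2, 3]) == [2, 4, 6]
assert filter(lambda x: x % 2, [1, 2, 3, 4]) == [1, 3]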
You'd be silly to implement them that way of course but know your tools.
2) IIRC, Google also has some functional/provable code
In general, large, critical datasets are routinely processed by businesses: businesses that use separate testing groups, code reviews, and tiger teams to validate each other's work. None of this is new information (at least to me).
Of course, transparency doesn't extend beyond the organization's walls in many instances. If you thought I was saying that it did then I mis-communicated.
However, in many instances, transparency does extend well beyond an organization's boundaries to the public domain.
Many scientific and business code-bases are open source. Since the advent of portable languages and VM-based back-ends, the practice of open-sourcing software has increased, not decreased. Many algorithms themselves are published and peer-reviewed. You can purchase Cormen's Introduction to Algorithms, Knuth's books, or Sedgewick's books, or browse the Stony Brook Algorithm Repository. Scientific software has been especially drawn to open source. From the beginning, scientific software efforts focused on increasing accuracy and minimizing error. Start with Anthony Ralston's books and Richard Hamming's book, for instance.
Peer review is available and performed at numerous levels. The original criticism that "no one" does peer review is overdrawn, misleading, and, I would say, unfounded.
Yes, you did miscommunicate; even allowing for the miscommunication you're making an extremely poorly-considered analogy.
The kind of peer review that matters for purposes of scientific integrity is review by outsiders; eg, a paper is "peer-reviewed" when experts in the field not involved in writing it or conducting its research give it a going-over and see if it appears to hold up.
The kind of transparency that matters for purposes of scientific integrity is making data available as-is to outsiders, so that they can meaningfully replicate your results ab initio (or, perhaps, not!).
Neither Google nor Amazon conducts meaningful amounts of peer review in the scientific sense nor are they transparent in the scientific sense (nor should they be, the last thing I want is any old anyone seeing the raw data backing someone else's gmail account or search history).
So you're making a useless assertion in the context of the issue at hand: neither Google nor Amazon does much "peer review" (to my knowledge Microsoft in fact does to some limited extent with shared source and hiring 3rd party auditors for some important code chunks); neither is "transparent".
In this thread you'll find multiple people speaking from positions of actual experience working on large-scale endeavors of scientific computation who've commented at length upon their internal practices.
I'm not going to repeat what they've typed but if you read those comments you will see that at least those commenting here were in fact engaging in what you're calling "peer review" and "transparency" wrt their development practices; their reports match my experience concerning scientific computation with large annual budgets but I won't claim any authority for my anecdotal experiences.
I will close out by posting a protip here; you might benefit from it but it's not specifically aimed at you.
If a blog with a name like "chicagoboyz" makes a bold, sweeping, and somewhat shocking assertion about an entire area of human endeavor -- one it has no plausible claim to expertise in (it's an econ blog, not a large-scale computation blog, and no claim of direct experience was made that I saw) -- and you find yourself nodding your head and thinking "yeah, that sounds plausible", proceed to do the following:
- slow down, step away from the computer, and count to 20 backwards in greek
- ask yourself: do I have any concrete knowledge, at all, about the area in which this claim is being made? Any involvement in a project in that area, or business involving that area, etc.? (In this case: do you have any direct experience with large-scale computational efforts in science? do you know anyone who's been involved in such a project? anything beyond the flamebait du jour?)
- if you do have any concrete knowledge: great, you have at least some nonzero evidence base from which your initial "yeah, that's right feeling" may or may not be substantiated. Think carefully about what you already know and see if the "yeah that's right" feeling holds up.
- if you don't have any concrete knowledge: you've given yourself an awesome opportunity for self-discovery and personal growth. Clearly there's something that makes you want to uncritically believe this specific sweeping claim about an area about which you literally know nothing concrete; we generally consider people who believe sweeping claims without evidence to be suckers, and we've found an area where your preexisting biases leave you a sucker, and therefore at the mercy of others. You might still have the right intuition about the sweeping claim, but at least take the opportunity to de-suckerify yourself on this front before drawing your conclusion.
Mostly, I found the way that the linked interpreter above implements dimensions rather confusing.
Thanks for the offer, but I didn't get that far because I decided I was better off focusing on Erlang for the time being. I'm very curious about that language family, I'm just trying to not spread myself too thin. When I get back to it, I'll try working through the J labs - it's probably a much better way to pick it up than untangling a semi-obfuscated interpreter.
Yeah, I've not studied that interpreter, so if that's your specific interest, no luck, sadly; rank is an easy idea, but it wouldn't surprise me if there's some trick to it in an implementation that short.
I think it'd be easier to untangle if you already had a working knowledge of the language but given the line-noisy look of J I can't blame you for trying the shortcut.
The interactive J labs are very good. You might also find the J for C programmers discussion of rank particularly helpful if you are looking at learning by understanding an implementation.
I figure the array languages will make a minor comeback soon what with gpus / larrabee / etc showing up on the horizon and what with more-versatile input devices showing up (making it easier to go back to funky symbols instead of line noise).
The fit imho is that at least if you stick to numerics you'd have to try pretty hard not to be writing code in a way that could be easily translated into something that'd be runnable on a gpu (or larrabee).
...(and gpus / larrabee etc. aren't solely vector processors, but the idea is apparent).
Most of the bulk numeric actions in an array language map pretty nicely to the data-parallel approach you need to use to take advantage of a gpu or larrabee (if it ever shows up); in particular take a look through this:
...and see how much more straightforward it'd be to take advantage of (compared to SSE and so on). Your interpreter has to be a little more sophisticated (work has to be kept in units of 512 bytes) but seems much more tractable than previously.
Since this isn't a new idea there's history to learn from; it was previously the case that you'd get a speedup from offloading work to the vector units but not really a cost-proportionate one. But now if you look at the performance differential between cpus and gpus and their relative costs it starts making sense again.
Another point of experience: out of the box it is pretty poor if you work on the macintosh; by default there's nothing mapped to "command" and there're not enough "keys" on it to remap anything to command without giving up something else you probably need (like shift or alt or control).
You can work around this with a footswitch but it's another point to keep in mind.
Mine suffered a similar fate to yours: in a box, awaiting some future period where I have enough downtime to ride out a month or two of reduced typing speed to train myself.
If the article's really about catastrophe theory I'd better motivate the catastrophe reference; dropping it in at the last second is pretty much a wtf moment for a reader not forewarned.
Motivate it in the abstract with something like this (but rephrased to match whatever style + diction constraints you're operating under):
Informally, a catastrophe means bad stuff happens all of a sudden; in the area of mathematics known as 'Catastrophe Theory' we have a more formal definition, but the same intuition applies, with a slight caveat we will come to momentarily.
Consider a car driving on an icy road. One minute it's handling smoothly, but then all of a sudden it starts drifting on the ice; the driver attempts to reacquire control, but without success. The car spins out of control and lodges in a snowbank (thankfully, everyone inside is unhurt).
Our intuition says this is a catastrophe (perhaps a small catastrophe, but a catastrophe nonetheless): one minute everything was as normal, but then something terrible happened.
A catastrophe theorist would agree -- a catastrophe did just occur -- but here the caveat comes into play: a mathematician's catastrophe isn't the horrible crash into the snowbank. Instead, the mathematician's catastrophe is the loss-of-control, as in the moment during which the car transitioned from still-steerable to uncontrollably-drifting.
Catastrophe theory is, loosely speaking, the attempt to characterize and understand the fine structure of transitions between different states-of-operation (like the transition from steerable to drifting).
Thankfully not all "catastrophes" are catastrophes in the casual sense of the word. To provide a sense of the flavor of catastrophe I've prepared a much happier example of "catastrophe" involving racing boats (no crashes, I promise!) and as a bonus you'll also learn quite a bit about what makes boats fast or slow.
...then in the conclusion reiterate that the transition between the planing mode and the "normal" mode is the catastrophe (it's the road, not the destination, that matters).
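(If you want a compact mathematical anchor for yourself, not necessarily for the article: the simplest catastrophe is the "fold". The equilibria of V_a(x) = x^3 + a*x are the solutions of V'(x) = 3x^2 + a = 0, which exist only for a <= 0; as the control parameter a rises through 0 the stable equilibrium merges with the unstable one and vanishes, and that sudden disappearance -- not whatever wreck follows -- is the mathematician's catastrophe.)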
===
Be careful with the use of "we".
It's good b/c it makes things friendly + inclusive, but it makes things very jarring when all of a sudden you drop to a 3rd-person neutral point of view (eg: "Our truck is now a sports car." is more coherent with your overall tone than "The truck is now a sports car.").
===
Then the idea of planing arose. When planing a boat is no longer displacing water, it's skipping over the top. Some of its "lift" comes from the dynamic force of the water hitting the bottom of the hull, and so less water has to be forced out of the way. Less pushing, less bow wave, more speed.
...is clunky. You introduce the concept (planing) before you define it. When in the next sentence you do define "planing" you do so indirectly: does "planing" mean "a planing boat is skipping over the top of the water, instead of sitting amidst the water" or is "planing" some as-yet unspecified thing that has as a side effect the property that when a boat is planing it's skipping over the water?
Not enough time to try rewriting this for you but consider defining-and-motivating planing first -- "If we could get out of the water somehow we could go faster" (but more accurate and better-phrased) -- and then introducing the term "planing" second (We can, and call this "planing", but again better-phrased).
===
But let's ask the reverse (actually "converse") question. For a given amount of drag, how fast are we going?
Too vague to be helpful; its only appeal is a seemingly-more-objective test for non-obviousness (nonobvious iff prerequisites developed AND not solved in the previous 10 years), but the vagueness of the relevant terms negates any putative step forward in "objectivity". It also provides incentives to slow down the rate of innovation and has the potential to be self-defeating.
I won't harass you about the vagueness as I think that's pretty obvious (do you really want either federal judges or, god forbid, a jury of citizens to be the judge of what is "commercially viable"?).
The incentive to slow down comes from your 10-year rule: if I have a good idea that is economically useful, ceteris paribus the longer I wait to patent it the likelier I am to actually get an enforceable patent (b/c the prerequisites aren't getting any younger). Now obviously in some cases there will be concern that rivals or competitors will jump in front and patent it instead, but in many smallish fields there really aren't that many people working on the same topic at the same time and what competitors there are will all be operating under the same overall incentive (and thus will themselves have some incentive to delay as long as possible).
So the incentive to file for a patent immediately is the same as under the status quo -- make sure you get out ahead of competitors -- but you add in a countervailing incentive to delay that the present system (for all its faults) does not have.
This is particularly egregious wrt brand-new technologies (like, eg, the transistor or the laser): anything making use of a laser or a transistor wouldn't be patentable for ~10 years after the development of that technology. This may help prevent some stupid patents -- land-grab stuff around doing stuff with a laser -- but it also means that any invention that's mostly-done but missing something (like, eg, it'd work if only I had a source of coherent light) now has a 10-year time-out on it as well, since it won't have been "solvable for 10 years" until 10 years have passed from the introduction of the missing piece.
You could try patching your solvability criterion to work around the "missing piece" phenomenon, but if you put a tiny loophole in, next thing you know people will be driving a caravan of 18-wheelers through it.
Your proposal's criteria are also, at the margins, somewhat self-defeating, especially in the case of a trial in which a patent's validity is called into question.
The person defending it basically has to defend the patent by minimizing its novelty: all the constituents have to be old + well-understood and so on. So the ideal patent is minimally-inventive: it solves a known problem using well-known stuff in novel combination...but not too novel, as something too novel runs the risk of exposing you to arguments that the problem you solved wasn't solvable b/c solving it required too many of your own innovations.
The person attacking it has to attack it by maximizing its novelty: claim that the problem wasn't solvable for as long as you're claiming b/c it depended on sub-inventions and so forth; rather than being a minimally-non-obvious recombination of existing technologies you're actually inventing stuff, you see, which works against your patent's validity.
The 10-year period + commercial viability argument cuts both ways: on the one hand, if it was solvable for a while and was commercially viable, you can use that as evidence for non-obviousness; on the other hand, your attacker can use the same facts to claim it either wasn't solvable before you went and invented stuff, or wasn't actually commercially viable, or some other reason. You're making an overbroad assumption about which assumptions about the invention process will seem reasonable for other people to hold.
So you wind up with inventors forced to downplay the inventiveness of their inventions -- and shy away from overly-novel areas of inquiry -- and attackers overplaying the inventiveness of the inventions whose patents they hope to invalidate, a scenario that is comical but unproductive.
What you see as deficiencies, I see as advantages.
The point of the patent system (at least in the USA) is to provide incentives for people to come up with ideas that otherwise people wouldn't come up with. If you are racing to the patent office for fear that someone else will come up with the idea, then your actions demonstrate that you likely shouldn't be granted the patent.
For instance take the "mostly done" issue. If you have it mostly done, until that last piece arrives nobody has any idea how many other people had it mostly done as well. Which means that it is impossible to figure out how innovative your idea is, and therefore we don't have good evidence that you really deserve a patent. Given that handing out patents causes real economic harm to others, I don't think they should be handed out on such poor evidence and therefore don't want to see those patents handed out so quickly.
I am strongly against allowing an exception because it opens up a common form of patent abuse. The story is that a small company invents a new technology and get a patent on it. A big company would like the patent, but can't get access. Instead they get patents on all of the things the small company needs to do to commercialize their invention. Effectively they build a "patent wall" around the existing patent. And now the small company is forced to come to terms with the big ones.
The one issue you raise that I agree with is that often people build on their own ideas. The better mousetrap is often better in several ways, not all of which you think of at one time. But in that case I see two possibilities. One is that the subsequent innovations can be used without your other innovations. In that case you can get a patent on them. The other is that the subsequent innovations can't be used without your other innovations. In that case the patent you can get on your previous innovations will protect your subsequent as well. Either way things work out.
I'd say you're slightly misreading the point of the patent system; I'd argue the underlying "point" is what it says it is in the constitution -- to promote the progress of the useful arts + sciences -- which isn't exactly the same thing as "come up with ideas that people wouldn't otherwise come up with".
EG: one aspect of the patent system is that it promotes invention of work-around, "me-too" inventions to get around patent restrictions (eg: PNG vs. GIF, for a computer example).
I'd argue that's more often a bug than a feature: worthwhile inventions would happen anyways, and engineering-around-a-known-solution is far more often a deadweight loss than a benefit to the economy as a whole; there are many others (mainly members of the patent bar) that'd argue that such work-arounds are a feature, as they promote "new invention" that otherwise wouldn't happen (since you'd just use the known-good solution).
I'm sort of assuming you usually see this "my way" on this issue; if you do, then framing it as "think of stuff that wouldn't be invented otherwise" doesn't give you much to stand on (as clearly most workarounds wouldn't be thought of without the patent system...).
Sticking with the promotion-of-progress language gives you a much firmer frame for your arguments.
Your framing also is making you overlook the importance of disclosure in the patent system; it's not an accident that a patent not only describes what it does (separate alumina from bauxite) but also how it works (supposedly in enough detail that someone else could implement the invention by reading the patent, though in practice there's a strong incentive to obfuscate that as much as you can get away with).
The argument here is that this promotes the progress of the useful arts and sciences as it makes the knowledge underlying a particular invention available essentially immediately -- as soon as the filer rushes to the patent office -- allowing work on derivative inventions to start immediately, thereby increasing the rate at which new ideas are come-up-with, etc.
So while on the one hand you're kind-of right -- patents that have lots of simultaneous inventors probably are too obvious to be useful -- you've not addressed the real thrust of the delay issue (which I admittedly could have made clearer):
- for "good" patents in your system there's an incentive to wait as long as possible (as the longer you wait the greater the odds your patent is valid); this'd be especially true for the ones that don't have much to worry about from co-filers b/c they're legit inventions
- this means that the rate of disclosure of the genuinely-novel inventions would be expected to go down, as ceteris paribus there's more incentive to delay filing and therefore delay disclosure of the underlying ideas
- so the calculation of the effect on the rate of progress of the useful arts + sciences under your proposal is roughly ("increased progress due to bogus patents no longer gumming up the works") - ("reduced rate of disclosure of truly novel inventions"), and imho the latter term would be quite substantial and would need more arguments to justify it
The simpler hack to get what you want is removing the presumption of validity (too lazy to check if I mentioned it already or not); this would change the patent-infringement workflow from:
- file lawsuit; patents assumed valid until (or if) defendant successfully challenges every relevant claim in every relevant patent
to:
- file lawsuit; filer must successfully "validate" each relevant claim in every relevant patent, and only then will proof of infringement imply damages are merited
...and this can be phased in in ways that'd not be crazily disruptive (eg: phased introduction and/or the presumption of (in)validity is on a per-area basis, so pharma is presumed valid but not software).
I accept your correction that there is a difference between "promoting progress" and "coming up with ideas". (By coincidence I just reread http://arxiv.org/abs/math.HO/9404236 which argues, among other things, that progress in mathematics can be helped by not proving facts too quickly.)
I agree with you on the intent of the design of the patent system. However in a world filled with overbroad junk patents, where accidental infringement is much better than willful infringement, there are real incentives for innovative people to _not_ read patents. Because reading them opens you up to liability without teaching you anything you couldn't have thought of on your own.
As for the cost of waiting, you are right that the incentive to wait is an inefficiency in my proposal. However the aim here is to achieve a good trade-off. Obviously more detailed analysis is required to find whether it achieves as good a trade-off as I think it does.
However for many years we blindly accepted very large increases in how easy it was to get a patent without substantive debate, and there is now substantial evidence that this has been a bad thing. So it seems to me that it shouldn't be hard to improve on the current state, and I think my proposal would be an improvement.
Finally, I have to say that I don't understand what you mean by "presumption of validity". The way things work now is that if IBM wants me to enter into a cross-licensing arrangement, we both know that they can threaten me with a large stack of patents. We both know that most of those patents won't stand up in court, but fighting will impose serious costs on me, and they are likely to get me on something. So even though we both presume the patents are invalid, the threat is still good.
How would you change that dynamic? If your idea is that they spend their money to validate their patent before I spend a dime, how will we know it has been validated to my satisfaction? In theory that kind of validation is the job of the patent office. However, regulatory capture has made that review a joke, and would eventually ruin any other level of review you could add. But if I have to spend money in the process of getting the patent validated, then IBM can still threaten me as they would today.
I am agnostic on the platonic notion of patents but generally would agree that what we have now is sufficiently far from optimal that the status quo ought not be accorded much deference.
I think generally the little-guy versus big-conglomerate isn't a winnable battle; if you "level the playing field" the elephant stomps the mouse, and if you try to give the mouse a hand grenade you usually wind up giving the elephant a nuke. Thus where I'm coming from there's a background assumption that the little guy will always and everywhere be prone to get screwed, and the only sensible discussions are "how many different ways?" and "what're the systemic effects of such-and-such policy?"
Thus in that light switching to a presumption of invalidity isn't a panacea -- and has its own drawbacks, as any "solution" would have -- and is mainly addressed at mitigating the effect of junk patents, which is what your proposal is also aimed at.
In writing my response to your IBM scenario I found what seems to be one of the ways the 18-wheeler would drive through your proposal so I've put it below, before I get to explaining how the IBM scenario plays out under reduced-presumption-of-validity.
Incidentally, one area where your proposal would fail is the IBM scenario. Say you invent something novel (patentable under your regime). For this to be patentable it has to be a nonobvious combination of existing tech that's been around and commercially viable for some duration of time.
Now IBM looks at it and figures out some other components you need to actually implement it and files junk patents on those in order to force you to cross-license. IBM argues:
- the technology needed has been around for however long your proposal requires (or else your patent wouldn't hold)
- the overall product your invention enabled has been commercially viable for as long as required (or else your patent wouldn't hold)
- their patents are thus "nonobvious" (lol) combinations of preexisting technology that solve problems related to bringing to market a technology that for a long time has been commercially viable if brought to market
In theory you can patch this by tweaking the notion of commercially-viable or solvable to be sufficiently narrow, but in practice that's very hard language to draft in a loophole-minimized fashion. You'd think you could set up criteria such that "mechanism for transforming lead to gold" is easily recognizable as unique and commercially viable while "mechanism for absorbing shock from spontaneous change-of-weight during transmutation from base metal to gold" isn't (eg: b/c the latter is only viable as part of a device that is itself commercially viable only due to your recent patent), but the more you predicate your notion of "commercial viability" on "independent of any other invention", the more you undercut your likelihood of getting a patent at all.
In some ways your proposed criteria make this worse: the kinds of stupid patents they'd look for are going to be super generic and useless (like the shock absorber) and thus not sought after until they're needed to try and screw you out of your invention; thus their uselessness works in their favor (they won't be invented yet b/c they're so dumb), and being generic they have a pretty easy time proving commercial viability (b/c shock absorbers already have a lot of uses, and ours is better than others in some way...).
So you wind up falling back on the same old, pretty-useless nonobviousness criteria to defend against IBM if they actively seek to encircle your patent with their junk. In theory they would potentially have a much smaller stock of junk patents lying around, but without patches they'd have a much easier time rounding up junk patents once you got your non-junk patent.
OK. The validity thing plays out like this. In the IBM-threat scenario the threat is that IBM has a bunch of junk patents in technologies around whatever it is you patented, so to actually produce a product using your patent you need to get licenses from them or risk being sued into oblivion; they use the threat of that outcome to compel a cross-licensing agreement, and then run off and make the product you were going to make, leaving you in trouble.
Introducing the presumption of invalidity changes that negotiation slightly.
On the one hand, IBM's threats become a lot more empty; if you go ahead and build your device -- IBM's patents notwithstanding -- then they might well sue you but they'd have to prove their BS patents are not-BS before the trial gets very far, and if they're actually BS then they probably can't get them proven not-BS (judges self-select to take directions about stuff like "presumption" very seriously).
On the other hand say IBM goes and makes a product incorporating your invention, and leaves it up to you to go and sue them. Things here are a little less clear, which is why it's not really a panacea (but what is?). It's easy to say something like "if it's really a good patent it'd hold up in court and you'd be fine", but with any change from the status quo predicting how things actually turn out is a little tricky.
So at the margins the lone-wolf / little-guy inventor has a harder time winning an infringement suit b/c they have to prove their patent is valid; it's thus a systemic tradeoff: far less "bite" for BS/junk patents in exchange for some patents, at the margins, winding up practically unenforceable.
The compare/contrast wrt your proposal is that your proposal leaves patents harder to get -- thereby cutting back on the BS-patent threat -- but doesn't touch the strength of issued patents; the presumption-of-validity tweak is instead intended to undermine the utility of issued junk patents without too deleterious an effect on the strength of legitimate ones.
As far as validating your own patent goes -- if it held up in court already, it's known-good. Some variants of this proposal include a notion of patents-plus: patents subjected to extra scrutiny up front; I'm not a fan of that for a number of reasons, but it's another option (essentially, pay extra to get more research done and have your patent enjoy a stronger presumption of validity).
It basically trades where we are now -- too easy to get junk patents and use them to extort money from productive enterprises -- for a setup where the junk-patent extortioneering becomes much harder, but at the expense of leaving some lone-wolf holders of valid patents less able to make use of their patents than they are under the current situation.
There are some counterbalances you can throw in -- eg, jacking up the infringement penalties => if you DO win against IBM it'll cost them a ton more money, thereby keeping the expected outcome in the same ballpark despite the reduced likelihood of the little guy winning -- but you can't get around the marginal effect of pushing some patents into the unenforceable range.
I tend to think that's probably an improvement over the current situation, but opinions can differ.