fredag 19 september 2014

Superintelligence odds and ends: index page

Nick Bostrom's Superintelligence: Paths, Dangers, Strategies is such an important, interesting and thought-provoking book that it has taken me several blog posts to comment on it. Here, to help the reader find her way in my writings on this topic, I provide a list of links to these posts, plus a few others.

After two initial mentions of Bostrom's book when it had just been released in July this year... ...I posted my review of the book on September 10: I then quickly followed up my review with a sequence of five blog posts with further comments on the book, under the joint heading Superintelligence odds and ends: That exhausts, for the time being, my list of blog posts devoted explicitly to Bostrom's Superintelligence, but I have a large number of further blog posts that treat the same or closely related topics as his book, such as the following: For those readers who, due to their weak or non-existent knowledge of Swedish, feel prevented from reading some of these posts, perhaps Google Translate can provide some assistance. Its translations are neither beautiful, nor perfectly accurate, but in many cases they can help readers identify the gist of a blog post.

torsdag 18 september 2014

Superintelligence odds and ends V: What is an important research accomplishment?

My review of Nick Bostrom's important book Superintelligence: Paths, Dangers, Strategies appeared first in Axess 6/2014, and then, in English translation, in a blog post here on September 10. The present blog post is the last in a series of five in which I offer various additional comments on the book (here is an index page for the series).


The following two-sentence paragraph, which opens Chapter 15 of Superintelligence, is likely to anger many of my mathematician colleagues.
    A colleague of mine likes to point out that a Fields Medal (the highest honor in mathematics) indicates two things about the recipient: that he was capable of accomplishing something important, and that he didn't. Though harsh, the remark hints at a truth.

At this point, I urge the angry mathematicians reading this not to stop reading, and not to conclude that Bostrom is a jackass and/or a moron unworthy of further attention. There is more to his "harsh" position than first meets the eye. And less, because those of us who continue reading quickly see that he is not saying that the mathematical results discovered by some or all Fields Medalists are unimportant. Instead, he has two interesting and original points to make about research in mathematics (and in other disciplines), one general and one more concrete. The general point is that the value of discovering a result does not equal the value of the result itself, but rather the value of learning the result earlier, thanks to the discovery, than we otherwise would have.1 The more concrete point is that even if deep and ground-breaking results in pure mathematics are valuable in themselves (as opposed to whatever scientific or engineering applications may eventually grow out of them), there may be a vastly more efficient way to advance mathematics than what the typical Fields Medalist engages in, namely to contribute to the development of AI or of transhumanistic technologies for the enhancement of human cognitive capacities, so that the next generation of mathematicians (made of flesh and blood or of silicon) will be in a vastly better position to make even deeper and even more ground-breaking discoveries. Here's how Bostrom explains his position:
    Think of a "discovery" as an act that moves the arrival of information from a later point in time to an earlier time. The discovery's value does not equal the value of the information discovered but rather the value of having the information available earlier than it otherwise would have been. A scientist or a mathematician may show great skill by being the first to find a solution that has eluded many others; yet if the problem would soon have been solved anyway, then the work probably has not much benefited the world. There are cases in which having a solution even slightly sooner is immensely valuable, but this is more plausible when the solution is immediately put to use, either by being deployed for some practical end or serving as the foundation to further theoretical work. And in the latter case [...] there is great value in obtaining the solution slightly sooner only if the further work it enables is itself both important and urgent.

    The question, then, is [...] whether it was important that the medalist enabled the publication of the result to occur at an earlier date. The value of this temporal transport should be compared to the value that a world-class mathematical mind could have generated by working on something else. At least in some cases, the Fields Medal might indicate a life spent solving the wrong problem - perhaps a problem whose allure consisted primarily in being famously difficult to solve.

    Similar barbs could be directed at other fields, such as academic philosophy. Philosophy covers some problems that are relevant to existential risk mitigation - we encountered several in this book. Yet there are also subfields within philosophy that have no apparent link to existential risk or indeed any practical concern. As with pure mathematics, some of the problems that philosophy studies might be regarded as intrinsically important, in the sense that humans have reason to care about them independently of any practical application. The fundamental nature of reality, for instance, might be worth knowing about, for its own sake. The world would arguably be less glorious if nobody studied metaphysics, cosmology, or string theory. However, the dawning prospect of an intelligence explosion shines a new light on this ancient quest for wisdom.

    The outlook now suggests that philosophic progress can be maximized via an indirect path rather than by immediate philosophizing. One of the many tasks on which superintelligence (or even just moderately enhanced human intelligence) would outperform the current cast of thinkers is in answering fundamental questions in science and philosophy. This reflection suggests a strategy of deferred gratification. We could postpone work on some of the eternal questions for a little while, delegating that task to our hopefully more competent successors - in order to focus our own attention on a more pressing challenge: increasing the chance that we will actually have competent successors. This would be high-impact philosophy and high-impact mathematics.

If this is not enough to calm down those readers feeling anger on behalf of mathematics and mathematicians, Bostrom furthermore offers the following conciliatory footnote:
    I am not suggesting that nobody should work on pure mathematics or philosophy. I am also not suggesting that these endeavors are especially wasteful compared to all the other dissipations of academia or society at large. It is probably very good that some people can devote themselves to the life of the mind and follow their intellectual curiosity wherever it leads, independent of any thought of utility or impact. The suggestion is that at the margin, some of the best minds might, upon realizing that their cognitive performance may become obsolete in the foreseeable future, want to shift their attention to those theoretical problems for which it makes a difference whether we get the solution a little sooner.
The view of research priorities and the value of mathematical, philosophical and scientific progress that Bostrom offers in the above passages may seem provocative at first, but in fact it strikes me as wise and balanced. Are there any aspects of this issue he has failed to take into account? Of course there are, but the question should be whether there are any such aspects that are sufficiently relevant to overthrow his conclusion. Here's the best one I can come up with for the moment:

Perhaps the main value of a mathematical discovery lies not in the result itself, but in the process leading up to the discovery, and perhaps it is important that the cognitive work is done by an ordinary human rather than an enhanced human or some super-AI. Well, a bit of enhancement is OK - many years of education, plus some caffeine - but anything much beyond that reduces the value of the discovery significantly.

Something along those lines. But, honestly, doesn't it sound arbitrary, artificial, and more than a little anthropochauvinistic? It is certainly not an argument with which the mathematical community can hope to convince taxpayers to support research in mathematics. Perhaps some similar argument might work for music or for literature, as the audience might have a preference for songs or novels they know are written by ordinary humans rather than by some superintelligence.2 But the case is very different for mathematics, because the population of people who can appreciate and enjoy, say, Wiles' proof of Fermat's Last Theorem or Perelman's proof of the Poincaré conjecture, is very small and consists almost exclusively of professional mathematicians. So using the argument for mathematics comes very close to asking taxpayers to support mathematical research because it is enjoyable to mathematicians.

The process-more-important-than-result objection fails to convince. All in all, I think that Bostrom's new perspective on the value of research findings, although of course not the only valid viewpoint, is very much worth putting on the table when discussing priorities regarding which research areas to fund.3


1) This notion of the value of a discovery is not entirely unproblematic, however. Consider the case of my friends Svante Linusson and Johan Wästlund, and their solution to the famous problem of proving Parisi's conjecture. On the very same day that they announced their result, another group, consisting of Chandra Nair, Balaji Prabhakar and Mayank Sharma, announced that they had achieved the same thing (using a different approach). For the sake of the argument, let us make the following simplifying assumptions:
    (a) the two works and their timings were independent (almost true),

    (b) there is no extra value in having the two different proofs of the result compared to having just one (plain false),

    (c) without the two works, it would have taken another ten years for the scientific community to come up with a proof of Parisi's conjecture (pure speculation on my part).

With these assumptions, Bostrom's way of attaching value to research discoveries has some strange consequences. The work of Linusson and Wästlund is deemed worthless (because in view of the Nair-Prabhakar-Sharma paper, they did not accelerate the proof of Parisi's conjecture). Similarly and symmetrically, the Nair-Prabhakar-Sharma paper is deemed worthless. Yet, Bostrom has to accept that the two papers, taken together, are valuable, because they gave us the proof of Parisi's conjecture ten years earlier than what would have been the case without them.

Such superadditivity of values is not unusual. A hot dog on its own may be worthless to me, and the same may go for a bun, but together they constitute a highly delicious and valuable meal. But the Linusson-Wästlund and the Nair-Prabhakar-Sharma papers, though exhibiting the same superadditivity, still do not fit the hot-dog-and-bun pattern, because unlike the hot dog and the bun, each of the papers contains, on its own, the whole thing we value (the early arrival of the proof of Parisi's conjecture). Strange.
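The valuation puzzle above can be made concrete in a toy model (a sketch only: the year 2003 and the ten-year counterfactual delay are illustrative stand-ins for assumption (c), and the function names are mine):

```python
# Toy model of Bostrom's "discovery as temporal transport" valuation,
# applied to the Parisi-conjecture example. All numbers are illustrative.

def proof_year(papers):
    """Year the proof becomes available, given which papers exist."""
    # Either paper alone delivers the proof; assumption (c) says that
    # without both, the proof would have taken ten more years.
    return 2003 if papers else 2013

ALL = frozenset({"LW", "NPS"})  # Linusson-Wastlund, Nair-Prabhakar-Sharma

def marginal_value(paper):
    """Bostrom-style value of one paper: years by which it, on its own,
    accelerated the proof's arrival, given that the other paper exists."""
    return proof_year(ALL - {paper}) - proof_year(ALL)

def joint_value():
    """Years of acceleration provided by the two papers together."""
    return proof_year(frozenset()) - proof_year(ALL)

print(marginal_value("LW"), marginal_value("NPS"), joint_value())  # 0 0 10
```

Each paper's marginal value comes out as zero, since the other paper alone would have delivered the proof in the same year, yet the pair jointly buys ten years of acceleration: exactly the superadditivity that fails to fit the hot-dog-and-bun pattern.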

2) And chess. As a chess amateur, I enjoy studying the games of world champions and other grandmasters. For more than a decade, there have been computer programs that play clearly better chess than the very best human chess players. And yet, I do not find even remotely the same thrill in studying games between these programs, compared to those played between humans.

3) It will be interesting to see how this statement will be received by my friends and colleagues in the mathematics community. My hope and my belief is that the position I'm endorsing will be appreciated for its nuances and recognized as a point of view that merits discussion. But I am not certain about this. If worst comes to worst, my statement will be widely condemned and perhaps even mark the end of a 15-or-so-year period during which I have received a steady stream of invitations and requests to take on various positions of trust in which I am expected to defend the interests of research mathematics. I would not welcome such a scenario, but I much prefer it to one in which I refrain from speaking openly on important issues.

tisdag 16 september 2014

Superintelligence odds and ends IV: Geniuses working on the control problem

My review of Nick Bostrom's important book Superintelligence: Paths, Dangers, Strategies appeared first in Axess 6/2014, and then, in English translation, in a blog post here on September 10. The present blog post is the fourth in a series of five in which I offer various additional comments on the book (here is an index page for the series).


The topic of Bostrom's Superintelligence is dead serious: the author believes the survival and future of humanity is at stake, and he may well be right. He treats the topic with utmost seriousness. Yet, his subtle sense of humor surfaces from time to time, detracting nothing from his serious intent, but providing bits of enjoyment for the reader. Here I wish to draw attention to a footnote which I consider a particularly striking example of Bostrom's way of exhibiting a slightly dry humor at the same time as he means every word he writes. What I have in mind is Footnote 10 in the book's Chapter 14, p 236. The context is a discussion of whether it improves or worsens the odds of a favorable outcome of an AI breakthrough with a fast takeoff (a.k.a. the Singularity) if, prior to that, we have performed transhumanistic cognitive enhancement of humans. As usual, there are pros and cons. Among the pros, Bostrom suggests that improved cognitive skills may make it easier for individual researchers as well as society as a whole to recognize the crucial importance of what he calls the control problem, i.e., the problem of how to turn an intelligence explosion into a controlled detonation with consequences that are in line with human values and favorable to humanity. And here's the footnote:
    Anecdotally, it appears those currently seriously interested in the control problem are disproportionately sampled from one extreme end of the intelligence distribution, though there could be alternative explanations of this impression. If the field becomes fashionable, it will undoubtedly be flooded with mediocrities and cranks.
The community of researchers currently working seriously on the control problem is very small - if their head count even reaches the realm of two-digit numbers, it is not by much. Bostrom is one of its two most well-known members; the other is Eliezer Yudkowsky. I'd judge both of them to have cognitive capacities fairly far into the high end of "the intelligence distribution" (and I imagine myself to be in a reasonable position to calibrate - as a research mathematician, I know a fair number of people (including Fields Medalists) in various parts of that high end). Bostrom is undoubtedly aware of his own unusual talents, as well as of the strong social norm saying that one should not talk about one's own high intelligence, yet his devotion to honest, unbiased, matter-of-fact presentation of what he perceives as the truth (always with uncertainty bars) leads him in this case to override the social norm.

I like that kind of honesty, even though it carries with it a nonnegligible risk of antagonizing others. Yudkowsky, in fact, has been known for going far - much further than Bostrom does here - in speaking openly about his own cognitive talents. And he does receive a good deal of shit for that, such as in Alexander Kruel's recent blog post devoted to what he considers to be "Yudkowsky's narcissistic tendencies".

All this makes the footnote multi-layered in a humorous kind of way. I also think the footnote's final sentence about what happens "if the field becomes fashionable" carries with it a nice touch of humor. Bostrom has a fairly extreme propensity to question premises and conclusions, and he is well aware of this; I do think this last sentence (which points out a downside to what is clearly a main purpose of the book - namely to draw attention to the control problem) is written with a wink to that propensity.

måndag 15 september 2014

Superintelligence odds and ends III: Political reality and second-guessing

My review of Nick Bostrom's important book Superintelligence: Paths, Dangers, Strategies appeared first in Axess 6/2014, and then, in English translation, in a blog post here on September 10. The present blog post is the third in a series of five in which I offer various additional comments on the book (here is an index page for the series).


A breakthrough in AI leading to a superintelligence would, as Bostrom underlines in his book, be a terribly dangerous thing. Among many other aspects and considerations, he discusses whether our chances of surviving such an event are better if technological progress in this area speeds up or slows down, and this turns out to be a complicated and far from straightforward issue. On balance, however, I tend to think that in most cases we're better off with slower progress towards an AI breakthrough.

Yet, in recent years I've participated in a couple of projects (with Claes Strannegård) ultimately aimed at creating an artificial general intelligence (AGI); see, e.g., this paper and this one. Am I deliberately worsening humanity's survival chances in order to do work I enjoy or to promote my academic career?

That would be bad, but I think what I'm doing is actually defensible. I might of course be deluding myself, but what I tell myself is this: The problem is not so much the speed of progress towards AGI itself, but rather the ratio between this speed and the speed at which we make concrete progress on what Bostrom calls the control problem, i.e., the problem of figuring out how to make sure that a future intelligence explosion becomes a controlled detonation with benign consequences for humanity. Even though the two papers cited in the previous paragraph show no hint of work on the control problem, I do think that in the slightly longer run it is probably on balance beneficial if, through my involvement in AI work and participation in the AI community, I improve the (currently dismally low) proportion of AI researchers caring about the control problem - both through my own head count of one, and by influencing others in the field. This is in line with a piece of advice recently offered by philosopher Nick Beckstead: "My intuition is that any negative effects from speeding up technological development in these areas are likely to be small in comparison with the positive effects from putting people in place who might be in a position to influence the technical and social context that these technologies develop in."

On p 239 of Superintelligence, Bostrom outlines an alternative argument, borrowed from Eric Drexler, that I might use to defend my involvement in AGI research:
    1. The risks of X are great.
    2. Reducing these risks will require a period of serious preparation.
    3. Serious preparation will begin only once the prospect of X is taken seriously by broad sectors of society.
    4. Broad sectors of society will take the prospect of X seriously only once a large research effort to develop X is underway.
    5. The earlier a serious research effort is initiated, the longer it will take to deliver (because it starts from a lower level of pre-existing enabling technologies).
    6. Therefore, the earlier a serious research effort is initiated, the longer the period during which serious preparation will be taking place, and the greater the reduction of the risks.
    7. Therefore, a serious research effort toward X should be initiated immediately.
Thus, in Bostrom's words, "what initially looks like a reason for going slow or stopping - the risks of X being great - ends up, on this line of thinking, as a reason for the opposite conclusion." The context in which he discusses this is the complexity of political reality, where, even if we figure out what needs to be done and go public with it, and even if our argument is watertight, we cannot take for granted that our proposal will be implemented. Any idea we have arrived at concerning the best way forward...
    ...must be embodied in the form of a concrete message, which is entered into the arena of rhetorical and political reality. There it will be ignored, misunderstood, distorted, or appropriated for various conflicting purposes; it will bounce around like a pinball, causing actions and reactions, ushering in a cascade of consequences, the upshot of which need bear no straightforward relationship to the intentions of the original sender. (p 238)
In such a "rhetorical and political reality" there may be reason to send not the message that most straightforwardly and accurately describes what's on our mind, but rather the one that we, after careful strategic deliberation, consider most likely to trigger the responses we're hoping for. The 7-step argument about technology X is an example of such second-guessing.

I feel very uneasy about this kind of strategic thinking. Here's my translation of what I wrote in a blog post in Swedish earlier this year:
    I am very aware that my statements and my actions are not always strategically optimal [and I often do this deliberately]. I am highly suspicious of too much strategic thinking in public debate, because if everyone just says what he or she considers strategically optimal to say, as opposed to offering their true opinions, then we'll eventually end up in a situation where we can no longer see what anyone actually thinks is right. To me that is a nightmare scenario.
Bostrom has similar qualms:
    There may [...] be a moral case for de-emphasizing or refraining from second-guessing moves. Trying to outwit one another looks like a zero-sum game - or negative-sum, when one considers the time and energy that would be dissipated by the practice as well as the likelihood that it would make it generally harder for anybody to discover what others truly think and to be trusted when expressing their own opinions. A full-throttle deployment of the practices of strategic communication would kill candor and leave truth bereft to fend for herself in the backstabbing night of political bogeys. (p 240)

lördag 13 september 2014

Superintelligence odds and ends II: The Milky Way preserve

My review of Nick Bostrom's important book Superintelligence: Paths, Dangers, Strategies appeared first in Axess 6/2014, and then, in English translation, in a blog post here on September 10. The present blog post is the second in a series of five in which I offer various additional comments on the book (here is an index page for the series).


Concerning the crucial problem of what values we should try to instill into an AI that may turn into a superintelligence, Bostrom discusses several approaches. In part I of this Superintelligence odds and ends series I focused on Eliezer Yudkowsky's so-called coherent extrapolated volition, which Bostrom holds forth as a major option worthy of further consideration. Today, let me focus on the alternative that Bostrom calls moral rightness, and introduces on p 217 of his book. The idea is that a superintelligence might be successful at the task (where we humans have so far failed) of figuring out what is objectively morally right. It should then take objective morality to heart as its own values.1,2

Bostrom sees a number of pros and cons of this idea. A major concern is that objective morality may not be in humanity's best interest. Suppose for instance (not entirely implausibly) that objective morality is a kind of hedonistic utilitarianism, where "an action is morally right (and morally permissible) if and only if, among all feasible actions, no other action would produce a greater balance of pleasure over suffering" (p 219). Some years ago I offered a thought experiment to demonstrate that such a morality is not necessarily in humanity's best interest. Bostrom reaches the same conclusion via a different thought experiment, which I'll stick with here in order to follow his line of reasoning.3 Here is his scenario:
    The AI [...] might maximize the surfeit of pleasure by converting the accessible universe into hedonium, a process that may involve building computronium and using it to perform computations that instantiate pleasurable experiences. Since simulating any existing human brain is not the most efficient way of producing pleasure, a likely consequence is that we all die.
Bostrom is reluctant to accept such a sacrifice for "a greater good", and goes on to suggest a compromise:
    The sacrifice looks even less appealing when we reflect that the superintelligence could realize a nearly-as-great good (in fractional terms) while sacrificing much less of our own potential well-being. Suppose that we agreed to allow almost the entire accessible universe to be converted into hedonium - everything except a small preserve, say the Milky Way, which would be set aside to accommodate our own needs. Then there would still be a hundred billion galaxies devoted to the maximization of pleasure. But we would have one galaxy within which to create wonderful civilizations that could last for billions of years and in which humans and nonhuman animals could survive and thrive, and have the opportunity to develop into beatific posthuman spirits.

    If one prefers this latter option (as I would be inclined to do), it implies that one does not have an unconditional lexically dominant preference for acting morally permissibly. But it is consistent with placing great weight on morality. (p 219-220)

What? Is it? Is it "consistent with placing great weight on morality"? Imagine Bostrom in a situation where he does the final bit of programming of the coming superintelligence, to decide between these two worlds, i.e., the all-hedonium one versus the all-hedonium-except-in-the-Milky-Way-preserve.4 And imagine that he goes for the latter option. The only difference it makes to the world is to what happens in the Milky Way, so what happens elsewhere is irrelevant to the moral evaluation of his decision.5 This may mean that Bostrom opts for a scenario where, say, 10^24 sentient beings will thrive in the Milky Way in a way that is sustainable for trillions of years, rather than a scenario where, say, 10^45 sentient beings will be even happier for a comparable amount of time. Wouldn't that be an act of immorality that dwarfs all other immoral acts carried out on our planet, by many many orders of magnitude? How could that be "consistent with placing great weight on morality"?6


1) It may well turn out (as I am inclined to believe) that no objective morality exists or that the notion does not make sense. We may instruct the AI to, in case it discovers that to be the case, shut itself down or to carry out some other default action that we have judged to be harmless.

2) A possibility that Bostrom does not consider is that perhaps any sufficiently advanced superintelligence will do so, i.e., it will discover objective morality and go on to act upon it. Perhaps there is some, yet unknown, principle of nature that dictates that any sufficiently intelligent creature will do so. In my experience, many people who are not used to thinking about superintelligence the way, e.g., Bostrom and Yudkowsky do, suggest that something like this might be the case. If I had to make a guess, I'd say this is probably not the case, but on the other hand it doesn't seem so implausible as to be ruled out. It would contradict Bostrom's so-called orthogonality thesis (introduced in Chapter 7 of the book and playing a central role in much of the rest of the book), which says (roughly) that almost any values are compatible with arbitrarily high intelligence. It would also contradict the principle of goal-content integrity (also defended in Bostrom's Chapter 7), stating (again roughly) that any sufficiently advanced intelligence will act to conserve its ultimate goal and value function. While I do think both the orthogonality thesis and the goal-content integrity principle are plausible, they have by no means been deductively demonstrated, and either of them (or both) might simply be false.7

3) A related scenario is this: Suppose that the AI figures out that hedonistic utilitarianism is the objectively true morality, and that it also figures out that any sentient being always comes out negatively on its "pleasure minus suffering" balance, so that the world's grand total of "pleasure minus suffering" will always sum up to something negative, except in the one case where there are no sentient creatures at all in the world. This one case of course yields a balance of zero, which turns out to be optimal. Such an AI would proceed to do the best it can to exterminate all sentient beings in the world.

But could such a sad statement about the set of possible "pleasure minus suffering" balances really be true? Well, why not? I am well aware that many people (including myself) report being mostly happy, and experiencing more pleasure than suffering. But are such reports trustworthy? Mightn't evolution have shaped us into having highly delusional views about our own happiness? I don't see why not.

4) For some hints about the kinds of lives Bostrom hopes we might live in this preserve, I recommend his 2006 essay Why I want to be a posthuman when I grow up.

5) Bostrom probably disagrees with me here, because his talk of "nearly-as-great good (in fractional terms)" suggests that the amount of hedonium elsewhere has an impact on what we can do in the Milky Way while still acting in a way "consistent with placing great weight on morality". But maybe such talk is as misguided as it would be (or so it seems) to justify murder with reference to the fact that there will still be over 7 billion other humans remaining well and alive?

6) I'm not claiming that if it were up to me, rather than Bostrom, I'd go for the all-hedonium option. I do share his intuitive preference for the all-hedonium-except-in-the-Milky-Way-preserve option. I don't know what I'd do under these extreme circumstances. Perhaps I'd even be seduced by the "in fractional terms" argument that I condemned in Footnote 5. But the issue here is not what I would do, or what Bostrom would do. The issue is what is "consistent with placing great weight on morality".

7) For a very interesting critique of the goal-content integrity principle, see Max Tegmark's very recent paper Friendly Artificial Intelligence: the Physics Challenge and his subsequent discussion with Eliezer Yudkowsky.

fredag 12 september 2014

Superintelligence odds and ends I: What if human values are fundamentally incoherent?

My review of Nick Bostrom's important book Superintelligence: Paths, Dangers, Strategies appeared first in Axess 6/2014, and then, in English translation, in a blog post here on September 10. The present blog post is the first in a series of five in which I offer various additional comments on the book (here is an index page for the series).


The control problem, in Bostrom's terminology, is the problem of turning the intelligence explosion into a controlled detonation, with benign consequences for humanity. The difficulties seem daunting, and certainly beyond our current knowledge and capabilities, but Bostrom does a good job (or seemingly so) systematically partitioning it into subtasks. One such subtask is to work out what values to instill in the AI that we expect to become our first superintelligence. What should it want to do?

Giving an explicit collection of values that does not admit what Bostrom calls perverse instantiation seems hard or undoable.1 He therefore focuses mainly on various indirect methods, and the one he seems most inclined to tentatively endorse is Eliezer Yudkowsky's so-called coherent extrapolated volition (CEV):
    Coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted.
The language here is, as Yudkowsky admits, a bit poetic. See his original paper for a careful explication of all the involved concepts. The idea is that we would like to have the AI take on our values, but for various reasons (we do not agree with each other, our values are confused and incoherent, many of us are jerks or jackasses, and so on) it is better that the AI works a bit more on our actual values to arrive at something that we would eventually recognize as better and more coherent.

CEV is an interesting idea. Maybe it can work, maybe it can't, but it does seem to be worth thinking more about. (Yudkowsky has thought about it hard for a decade now, and has attracted a sizeable community of followers.) Here's one thing that worries me:

Human values exhibit, at least on the surface, plenty of incoherence. That much is hardly controversial. But what if the incoherence goes deeper, and is fundamental in such a way that any attempt to untangle it is bound to fail? Perhaps any search for our CEV is bound to lead to more and more glaring contradictions? Of course any value system can be modified into something coherent, but perhaps not all value systems can be so modified without sacrificing some of their most central tenets? And perhaps human values have that property?

Let me offer a candidate for what such a fundamental contradiction might consist in. Imagine a future where all humans are permanently hooked up to life-support machines, lying still in beds with no communication with each other, but with electrodes connected to the pleasure centers of our brains in such a way as to constantly give us the most pleasurable experiences possible (given our brain architectures). I think nearly everyone would attach a low value to such a future, deeming it absurd and unacceptable (thus agreeing with Robert Nozick). The reason we find it unacceptable is that in such a scenario we no longer have anything to strive for, and therefore no meaning in our lives. So we want instead a future where we have something to strive for. Imagine such a future F1. In F1 we have something to strive for, so there must be something missing in our lives. Now let F2 be similar to F1, the only difference being that that something is no longer missing in F2; almost by definition, F2 is then better than F1 (because otherwise that something wouldn't be worth striving for). And as long as there is still something worth striving for in F2, there is an even better future F3 that we should prefer. And so on. What if any such procedure quickly takes us to an absurd and meaningless scenario with life-support machines and electrodes, or something along those lines? Then no future will be good enough for our preferences, so not even a superintelligence will have anything to offer us that aligns acceptably with our values.2
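The regress above can be made vivid with a toy model. The model is entirely my own construction (not from Bostrom or Yudkowsky), and the particular `improve` and `value` functions are illustrative assumptions: a future is a pair (how much is attained, how much is still missing), each improvement step fills in half of what is missing and so strictly raises the value, and yet the sequence converges to exactly the electrode scenario that the same value function rates as worst of all.

```python
# Toy model of the F1 -> F2 -> F3 regress. All definitions here are
# hypothetical illustrations, not anyone's actual theory of value.

def improve(future):
    """The F_n -> F_{n+1} step: fill in half of the 'something missing'."""
    attained, missing = future
    return (attained + missing / 2, missing / 2)

def value(future):
    """Assumed value function: attainment counts, but a future with
    nothing left to strive for is judged absurd (infinitely bad)."""
    attained, missing = future
    return attained if missing > 0 else float("-inf")

f = (0.0, 1.0)  # F1: little attained, plenty left to strive for
for _ in range(50):
    nxt = improve(f)
    assert value(nxt) > value(f)  # each F_{n+1} is strictly better than F_n...
    f = nxt

# ...yet the F_n converge to (1.0, 0.0): the life-support-and-electrodes
# scenario, which the very same value function ranks below every F_n.
```

The point of the sketch is only that "each step is an improvement" and "the limit is acceptable" can come apart: a preference ordering can be locally coherent at every step while having no coherent best future at all.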

Now, I don't know how serious this particular problem is. Perhaps there is some way to gently circumvent its contradictions. But even then, there might be some other fundamental inconsistency in our values - one that cannot be circumvented. If that is the case, it will throw a spanner in the works of CEV. And perhaps not only for CEV, but for any serious attempt to set up a long-term future for humanity that aligns with our values, with or without a superintelligence.


1) Bostrom gives many examples of perverse instantiations. Here's one: If we instill the value "make us smile", the AI might settle for paralyzing human facial musculatures in such a way that we go around endlessly smiling (regardless of mood).

2) And what about replacing "superintelligence" with "God" in this last sentence? I have often mocked Christians for their inability to solve the problem of evil: with an omnipotent and omnibenevolent God, how can there still be suffering in the world? Well, perhaps here we have stumbled upon an answer, and upon God's central dilemma. On one hand, he cannot go for a world of eternal life-support and electrical stimulation of our pleasure centers, because that would leave us bereft of any meaning in our lives. On the other hand, the alternative of going for one of the suboptimal worlds such as F1 or F2 would leave us complaining. In such a world, in order for us to be motivated to do anything at all, there must be some variation in our level of well-being. Perhaps it is the case that no matter how high our general level of well-being is, we will perceive any dip in it as suffering, and be morally outraged at the idea of a God who allows it. But he still had to choose some such level, and here we are.

(Still, I think there is something to be said about the level of suffering God chose for us. He opted for the Haiti 2010 earthquake and the Holocaust, when he could have gone for something on the level of, say, the irritation of a dust speck in the eye. How can we not conclude that this makes him evil?)

torsdag 11 september 2014

On weather and climate in Upsala Nya Tidning

Upsala Nya Tidning's track record when it comes to opinion pieces on the climate issue is far from uniformly glorious and brilliant. One might perhaps think that the 2009 article by Lennart Bengtsson headlined "Växthusgasernas inverkan är ringa" ("The influence of greenhouse gases is minor"), which I mentioned during the Bengtsson turbulence this spring, would be the absolute low-water mark, but in fact there are even worse examples, such as Wibjörn Karlén's 2009 op-ed and Sten Kaijser's from 2011.

Today, however, UNT's readers are treated to a marked improvement, in the form of a piece on the relationship between weather and climate, headlined "Sårbart samhälle ingen valfråga?" ("A vulnerable society not an election issue?"), signed by yours truly together with Mikael Karlsson, chairman of the European Environmental Bureau (and previously of the Swedish Society for Nature Conservation), and the meteorologist Pär Holmgren. Here is how it begins:
    The weather prevailing on a given day in a given place can be described as a small piece of a large jigsaw puzzle which, taken as a whole, gives a picture of the climate.

    The weather thus plays out within the frame of the climate puzzle, but the ongoing climate change means that the entire frame is shifting. Extreme weather lies close to the edge of the frame, and when the frame is in motion, the probabilities change for heavy precipitation and floods as well as for droughts and large wildfires. It is against this background that this summer's extreme weather should be understood, first of all. Secondly, the extreme weather is a present-day warning signal about a future in which today's extreme events may become the norm.

    Those who say that the extreme weather has nothing to do with climate change are therefore wrong.

    Certainly, this summer's extreme weather events could have occurred even without human influence on the climate, but the same can be said of the weather even in a changed climate in 2050. By that logic, no weather event would then be due to the climate. This is misleading rhetoric, based on a static and compartmentalized view of climate science.

    One explanation for the confused debate may be that extreme weather is a sensitive issue in an election year, especially when politics has been far too passive in view of the many studies pointing to great needs for climate adaptation. Those who have played down the climate issue, and the work of preventing emissions and adapting society to the climate change already under way, naturally prefer to blame the weather rather than admit that they have done too little.

Read the whole article here!