Category: Excerpt

The Power of Collider

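I have been reading The Book of Why, which I mentioned earlier, and I think the collider may be one of the most important concepts in the book. I could have written an introduction in my own words, but I could not be bothered, so here are a few passages excerpted from the book (the order is my own deliberate arrangement).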

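Three explanations for a correlation between X and Y:

  1. X is a cause of Y;
  2. X and Y share a common cause;
  3. a collider: X and Y both affect a third variable that has been conditioned on.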

We live our lives as if the common cause principle were true. Whenever we see patterns, we look for a causal explanation. In fact, we hunger for an explanation, in terms of stable mechanisms that lie outside the data. The most satisfying kind of explanation is direct causation: X causes Y. When that fails, finding a common cause of X and Y will usually satisfy us. By comparison, colliders are too ethereal to satisfy our causal appetites.

Judea Pearl. 2018. The Book of Why. Chapter 6

What is a collider?

A → B ← C. This is the most fascinating junction, called a “collider.” Felix Elwert and Chris Winship have illustrated this junction using three features of Hollywood actors: Talent → Celebrity ← Beauty. Here we are asserting that both talent and beauty contribute to an actor’s success, but beauty and talent are completely unrelated to one another in the general population.

We will now see that this collider pattern works in exactly the opposite way from chains or forks when we condition on the variable in the middle. If A and C are independent to begin with, conditioning on B will make them dependent. For example, if we look only at famous actors (in other words, we observe the variable Celebrity = 1), we will see a negative correlation between talent and beauty: finding out that a celebrity is unattractive increases our belief that he or she is talented.

This negative correlation is sometimes called collider bias or the “explain-away” effect. For simplicity, suppose that you don’t need both talent and beauty to be a celebrity; one is sufficient. Then if Celebrity A is a particularly good actor, that “explains away” his success, and he doesn’t need to be any more beautiful than the average person. On the other hand, if Celebrity B is a really bad actor, then the only way to explain his success is his good looks. So, given the outcome Celebrity = 1, talent and beauty are inversely related—even though they are not related in the population as a whole. Even in a more realistic situation, where success is a complicated function of beauty and talent, the explain-away effect will still be present. This example is admittedly somewhat apocryphal, because beauty and talent are hard to measure objectively; nevertheless, collider bias is quite real, and we will see lots of examples in this book.

Judea Pearl. 2018. The Book of Why. Chapter 3
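
The celebrity example can be checked with a small simulation. The sketch below is not from the book. It assumes talent and beauty are independent standard-normal scores and that an actor becomes a celebrity when their sum exceeds a hypothetical threshold of 1.5; the names `simulate_actors` and `pearson` are made up for illustration.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def simulate_actors(n=50_000, seed=0):
    """Return (talent, beauty, is_celebrity) triples for n hypothetical actors."""
    rng = random.Random(seed)
    actors = []
    for _ in range(n):
        talent = rng.gauss(0, 1)
        beauty = rng.gauss(0, 1)             # drawn independently of talent
        celebrity = talent + beauty > 1.5    # assumed rule: Talent -> Celebrity <- Beauty
        actors.append((talent, beauty, celebrity))
    return actors

actors = simulate_actors()
r_all = pearson([t for t, b, c in actors], [b for t, b, c in actors])

# Conditioning on the collider: keep only the rows with Celebrity = 1.
famous = [(t, b) for t, b, c in actors if c]
r_famous = pearson([t for t, b in famous], [b for t, b in famous])

print(f"all actors:       r = {r_all:+.3f}")
print(f"celebrities only: r = {r_famous:+.3f}")
```

Under these assumptions the correlation is close to zero across all actors and clearly negative among celebrities, which is the explain-away effect the excerpt describes.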

Two more examples of colliders:

Try this experiment: Flip two coins simultaneously one hundred times and write down the results only when at least one of them comes up heads. Looking at your table, which will probably contain roughly seventy-five entries, you will see that the outcomes of the two simultaneous coin flips are not independent. Every time Coin 1 landed tails, Coin 2 landed heads. How is this possible? Did the coins somehow communicate with each other at light speed? Of course not. In reality you conditioned on a collider by censoring all the tails-tails outcomes.

Judea Pearl. 2018. The Book of Why. Chapter 6
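
The two-coin experiment is easy to reproduce. The following is a minimal sketch rather than anything from the book: the function names and the seed are arbitrary choices, and the coins are assumed to be fair and independent.

```python
import random

def flip_pairs(trials=100, seed=1):
    """Flip two fair coins `trials` times; each row is (coin1, coin2)."""
    rng = random.Random(seed)
    return [(rng.choice("HT"), rng.choice("HT")) for _ in range(trials)]

def coin2_heads_share_given_coin1_tails(rows):
    """Among rows where Coin 1 is tails, the fraction where Coin 2 is heads."""
    coin2 = [c2 for c1, c2 in rows if c1 == "T"]
    return sum(c2 == "H" for c2 in coin2) / len(coin2)

all_flips = flip_pairs()

# Write down a result only when at least one coin shows heads,
# i.e. censor every tails-tails outcome.
recorded = [pair for pair in all_flips if "H" in pair]

print("rows recorded:", len(recorded))
print("uncensored:", coin2_heads_share_given_coin1_tails(all_flips))
print("censored:  ", coin2_heads_share_given_coin1_tails(recorded))  # 1.0
```

In the uncensored data, Coin 2 lands heads about half the time when Coin 1 is tails. In the censored table the share is exactly 1.0, because the only rows that could lower it are the ones that were never written down.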

The correlation we observe is, in the purest and most literal sense, an illusion. Or perhaps even a delusion: that is, an illusion we brought upon ourselves by choosing which events to include in our data set and which to ignore. It is important to realize that we are not always conscious of making this choice, and this is one reason that collider bias can so easily trap the unwary. In the two-coin experiment, the choice was conscious: I told you not to record the trials with two tails. But on plenty of occasions we aren’t aware of making the choice, or the choice is made for us.

The distorting prism of colliders is just as prevalent in everyday life. As Jordan Ellenberg asks in How Not to Be Wrong, have you ever noticed that, among the people you date, the attractive ones tend to be jerks? Instead of constructing elaborate psychosocial theories, consider a simpler explanation. Your choice of people to date depends on two factors: attractiveness and personality. You’ll take a chance on dating a mean attractive person or a nice unattractive person, and certainly a nice attractive person, but not a mean unattractive person. It’s the same as the two-coin example, when you censored tails-tails outcomes. This creates a spurious negative correlation between attractiveness and personality. The sad truth is that unattractive people are just as mean as attractive people—but you’ll never realize it, because you’ll never date somebody who is both mean and unattractive.

Judea Pearl. 2018. The Book of Why. Chapter 6

When controlling for variables, be sure never to control for a collider, because:

[I]n a collider, A → B ← C, exactly the opposite rules hold. The variables A and C start out independent, so that information about A tells you nothing about C. But if you control for B, then information starts flowing through the “pipe,” due to the explain-away effect.

Judea Pearl. 2018. The Book of Why. Chapter 4
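
"Controlling for B" often means statistical adjustment rather than filtering rows. As a rough sketch of that case (my own illustration, not the book's), assume A and C are independent standard-normal variables and B = A + C + noise. The partial correlation of A and C given B is computed with the standard first-order formula; `pearson` and `partial_corr` are hypothetical helper names.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def partial_corr(a, c, b):
    """Correlation between a and c after linearly adjusting both for b."""
    r_ac, r_ab, r_cb = pearson(a, c), pearson(a, b), pearson(c, b)
    return (r_ac - r_ab * r_cb) / ((1 - r_ab ** 2) * (1 - r_cb ** 2)) ** 0.5

rng = random.Random(42)
n = 20_000
A = [rng.gauss(0, 1) for _ in range(n)]
C = [rng.gauss(0, 1) for _ in range(n)]                   # independent of A
B = [a + c + rng.gauss(0, 1) for a, c in zip(A, C)]       # collider: A -> B <- C

r_marginal = pearson(A, C)            # no adjustment
r_adjusted = partial_corr(A, C, B)    # "controlling for" the collider

print(f"corr(A, C)     = {r_marginal:+.3f}")
print(f"corr(A, C | B) = {r_adjusted:+.3f}")
```

With these assumed equations the unadjusted correlation is near zero, while the adjusted one settles near -0.5: adjusting for B manufactures a dependence between A and C that was not there.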

Yu Qiuyu's recent lectures have been excellent

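Episode 117: Telling the Truth: Do Not Substitute Falsehood for Reality
Episode 118: The Habit of Posturing: Literati in an Ecosystem of Falsehood
Episode 119: Pseudo-Elites: Empty Talk Is Their Only Way of Life
Episode 120: Basic Criteria for Telling Genuine Literati from Fake Ones
Episode 121: Facing One's Elders: Do Not Turn Respect into a Myth
Episode 122: Eminent Authority (泰斗) or "Too Funny" (太逗): Art Unfolds Its Vitality Through Innovation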

Yu Qiuyu, 中国文化必修课 (A Required Course in Chinese Culture)

When I have time, I will pick a few passages from these lectures and post them here.

The Science of Well-Being

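I have not taken a course on Coursera in several years. Today, by chance, after re-watching the 「好和弦」 video on piano accompaniment for pop ballads, I visited the YouTube channel of JR, the vocalist in that video, and saw that a video he posted three days ago introduces Yale University's most popular course: The Science of Well-Being, by Laurie Santos.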

The following is from a New York Times report at the end of January this year:

Students have long requested that Yale offer a course on positive psychology, according to Woo-Kyoung Ahn, director of undergraduate studies in psychology, who said she was “blown away” by Dr. Santos’s proposal for the class.

Administrators like Dr. Ahn expected significant enrollment for the class, but none anticipated it to be quite so large. Psychology and the Good Life, with 1,182 undergraduates currently enrolled, stands as the most popular course in Yale’s 316-year history. The previous record-holder — Psychology and the Law — was offered in 1992 and had about 1,050 students, according to Marvin Chun, the Yale College dean. Most large lectures at Yale don’t exceed 600.

Yale’s Most Popular Class Ever: Happiness

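A quick Google search showed that the course is already live on Coursera (when The New York Times reported on it at the end of January this year, it only said the course would be online soon). In recent years, access to Coursera from within China has been quite poor in my experience. Admittedly I am not testing it all the time: after a few bad attempts I lost most of my motivation to take courses. But the connection was very good today, perhaps because I recently switched to a different proxy service...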

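I have long been interested in positive psychology, but I have not kept up with reading in this area for several years. I hope this course brings me something new. Positive psychology's research on personal well-being is a very important foundation of my own philosophy.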

Attached is the welcome letter from Professor Santos:

Dear Learner,

Congratulations on taking part in this journey! Over the next several weeks, we’ll explore what new results in psychological science teach us about how to be happier, how to feel less stressed, and how to flourish more. We’ll then have a chance to put these scientific findings into practice by building the sorts of habits that will allow us to live a happier and more fulfilling life.

In Spring 2018, I taught “Psychology and the Good Life” for the first time. I created this Yale course because I was worried about the levels of student depression, anxiety, and stress that I was seeing as a Professor and Head of College at Yale. I originally developed this course to teach Yale students how the science of psychology can provide important hints about how to make wiser choices and how to live a life that’s happier and more fulfilling. Since I’m not an expert on positive psychology, I began by learning more about this topic, diving into the work of pioneering scientists like Martin Seligman, Ed Diener, Barbara Fredrickson, Sonja Lyubomirsky, Mihaly Csikszentmihalyi, Daniel Gilbert, Robert Emmons, and others. I also learned more about work in social psychology and behavior change, including work by scholars such as Liz Dunn, Mike Norton, Nick Epley, Gabriele Oettingen, and others. The Yale course was my attempt at synthesizing work in positive psychology along with the science of behavior change. My goal was to present these scientific findings in a way that made it clear how this science could be applied in people’s daily lives.

When I first developed the class, I had no idea it would become the most popular class ever taught at Yale University. The Yale class was featured in both the national and international news media, and I was flooded with requests from people around the world to find a way to share the content of this Yale class more broadly.

This Coursera class is an attempt to do just that. My goal is to share the insights from that popular Yale class with learners far beyond Yale. To make the lectures feel more intimate, we filmed at my home in one of Yale’s residential colleges with a small group of Yale students in the audience. I hope you’ll enjoy this more personal format, which allows you to hear the sorts of questions Yale students had about the material and how they applied the science in their daily lives. We understand that many of you taking the course are not currently college students, but we hope you see yourselves as though you are part of this virtual classroom.

During this course, you’ll have the opportunity to enhance your own well-being by implementing a few simple research-based methods to your own life.

I am thrilled to share this information with a wider audience. As you go through the lessons please share your feedback with the course team! You can direct item-specific feedback via content flags and general course feedback in the Discussion Forums or in the post-course survey when you complete the course.

Sincerely,
Laurie

Causal Revolution: A Mathematical Language for Describing Causation

To appreciate the depth of this gap, imagine the difficulties that a scientist would face in trying to express some obvious causal relationships—say, that the barometer reading B tracks the atmospheric pressure P. We can easily write down this relationship in an equation such as B = kP, where k is some constant of proportionality. The rules of algebra now permit us to rewrite this same equation in a wild variety of forms, for example, P = B/k, k = B/P, or B–kP = 0. They all mean the same thing—that if we know any two of the three quantities, the third is determined. None of the letters k, B, or P is in any mathematical way privileged over any of the others. How then can we express our strong conviction that it is the pressure that causes the barometer to change and not the other way around? And if we cannot express even this, how can we hope to express the many other causal convictions that do not have mathematical formulas, such as that the rooster’s crow does not cause the sun to rise?
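
One way to see what the algebraic equation is missing is to write the barometer as a one-directional assignment instead of a symmetric equality. This is only an illustrative sketch: the constant `K`, the function `barometer_model`, and the `do_barometer` override are hypothetical names, and the model simply assumes that pressure is set by nature while the barometer responds to it.

```python
K = 2.0  # hypothetical constant of proportionality in B = k * P

def barometer_model(pressure, do_barometer=None):
    """Structural reading of B = k * P.

    Pressure is an input. The barometer normally "listens" to pressure,
    unless its needle is forced to a value from outside (do_barometer).
    """
    reading = K * pressure if do_barometer is None else do_barometer
    return {"P": pressure, "B": reading}

natural = barometer_model(pressure=10.0)
raised_pressure = barometer_model(pressure=12.0)                    # change P: B follows
forced_needle = barometer_model(pressure=10.0, do_barometer=30.0)   # force B: P stays put

print(natural)
print(raised_pressure)
print(forced_needle)
```

Changing the pressure moves the reading, but forcing the needle leaves the pressure where it was. The algebraic forms B = kP and P = B/k cannot distinguish these two cases; the assignment can.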

My college professors could not do it and never complained. I would be willing to bet that none of yours ever did either. We now understand why: never were they shown a mathematical language of causes; nor were they shown its benefits. It is in fact an indictment of science that it has neglected to develop such a language for so many generations. Everyone knows that flipping a switch will cause a light to turn on or off and that a hot, sultry summer afternoon will cause sales to go up at the local ice-cream parlor. Why then have scientists not captured such obvious facts in formulas, as they did with the basic laws of optics, mechanics, or geometry? Why have they allowed these facts to languish in bare intuition, deprived of mathematical tools that have enabled other branches of science to flourish and mature?

Part of the answer is that scientific tools are developed to meet scientific needs. Precisely because we are so good at handling questions about switches, ice cream, and barometers, our need for special mathematical machinery to handle them was not obvious. But as scientific curiosity increased and we began posing causal questions in complex legal, business, medical, and policy-making situations, we found ourselves lacking the tools and principles that mature science should provide.

Belated awakenings of this sort are not uncommon in science. For example, until about four hundred years ago, people were quite happy with their natural ability to manage the uncertainties in daily life, from crossing a street to risking a fistfight. Only after gamblers invented intricate games of chance, sometimes carefully designed to trick us into making bad choices, did mathematicians like Blaise Pascal (1654), Pierre de Fermat (1654), and Christiaan Huygens (1657) find it necessary to develop what we today call probability theory. Likewise, only when insurance organizations demanded accurate estimates of life annuity did mathematicians like Edmond Halley (1693) and Abraham de Moivre (1725) begin looking at mortality tables to calculate life expectancies. Similarly, astronomers’ demands for accurate predictions of celestial motion led Jacob Bernoulli, Pierre-Simon Laplace, and Carl Friedrich Gauss to develop a theory of errors to help us extract signals from noise. These methods were all predecessors of today’s statistics.

Ironically, the need for a theory of causation began to surface at the same time that statistics came into being. In fact, modern statistics hatched from the causal questions that Galton and Pearson asked about heredity and their ingenious attempts to answer them using cross-generational data. Unfortunately, they failed in this endeavor, and rather than pause to ask why, they declared those questions off limits and turned to developing a thriving, causality-free enterprise called statistics.

This was a critical moment in the history of science. The opportunity to equip causal questions with a language of their own came very close to being realized but was squandered. In the following years, these questions were declared unscientific and went underground. Despite heroic efforts by the geneticist Sewall Wright (1889–1988), causal vocabulary was virtually prohibited for more than half a century. And when you prohibit speech, you prohibit thought and stifle principles, methods, and tools.

Readers do not have to be scientists to witness this prohibition. In Statistics 101, every student learns to chant, “Correlation is not causation.” With good reason! The rooster’s crow is highly correlated with the sunrise; yet it does not cause the sunrise.

Unfortunately, statistics has fetishized this commonsense observation. It tells us that correlation is not causation, but it does not tell us what causation is. In vain will you search the index of a statistics textbook for an entry on “cause.” Students are not allowed to say that X is the cause of Y—only that X and Y are “related” or “associated.”

… I hope with this book to convince you that data are profoundly dumb. Data can tell you that the people who took a medicine recovered faster than those who did not take it, but they can’t tell you why. Maybe those who took the medicine did so because they could afford it and would have recovered just as fast without it.

Over and over again, in science and in business, we see situations where mere data aren’t enough. Most big-data enthusiasts, while somewhat aware of these limitations, continue the chase after data-centric intelligence, as if we were still in the Prohibition era.

As I mentioned earlier, things have changed dramatically in the past three decades. Nowadays, thanks to carefully crafted causal models, contemporary scientists can address problems that would have once been considered unsolvable or even beyond the pale of scientific inquiry. For example, only a hundred years ago, the question of whether cigarette smoking causes a health hazard would have been considered unscientific. The mere mention of the words “cause” or “effect” would create a storm of objections in any reputable statistical journal.

Even two decades ago, asking a statistician a question like “Was it the aspirin that stopped my headache?” would have been like asking if he believed in voodoo. To quote an esteemed colleague of mine, it would be “more of a cocktail conversation topic than a scientific inquiry.” But today, epidemiologists, social scientists, computer scientists, and at least some enlightened economists and statisticians pose such questions routinely and answer them with mathematical precision. To me, this change is nothing short of a revolution. I dare to call it the Causal Revolution, a scientific shakeup that embraces rather than denies our innate cognitive gift of understanding cause and effect.

Side by side with this diagrammatic “language of knowledge,” we also have a symbolic “language of queries” to express the questions we want answers to. For example, if we are interested in the effect of a drug (D) on lifespan (L), then our query might be written symbolically as: P(L|do(D)). In other words, what is the probability (P) that a typical patient would survive L years if made to take the drug? This question describes what epidemiologists would call an intervention or a treatment and corresponds to what we measure in a clinical trial. In many cases we may also wish to compare P(L|do(D)) with P(L|do(not-D)); the latter describes patients denied treatment, also called the “control” patients. The do-operator signifies that we are dealing with an intervention rather than a passive observation; classical statistics has nothing remotely similar to this operator.

We must invoke an intervention operator do(D) to ensure that the observed change in Lifespan L is due to the drug itself and is not confounded with other factors that tend to shorten or lengthen life. If, instead of intervening, we let the patient himself decide whether to take the drug, those other factors might influence his decision, and lifespan differences between taking and not taking the drug would no longer be solely due to the drug. For example, suppose only those who were terminally ill took the drug. Such persons would surely differ from those who did not take the drug, and a comparison of the two groups would reflect differences in the severity of their disease rather than the effect of the drug. By contrast, forcing patients to take or refrain from taking the drug, regardless of preconditions, would wash away preexisting differences and provide a valid comparison.

Mathematically, we write the observed frequency of Lifespan L among patients who voluntarily take the drug as P(L|D), which is the standard conditional probability used in statistical textbooks. This expression stands for the probability (P) of Lifespan L conditional on seeing the patient take Drug D. Note that P(L|D) may be totally different from P(L|do(D)). This difference between seeing and doing is fundamental and explains why we do not regard the falling barometer to be a cause of the coming storm. Seeing the barometer fall increases the probability of the storm, while forcing it to fall does not affect this probability.

Judea Pearl. 2018. The Book of Why
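
The gap between P(L|D) and P(L|do(D)) can be made concrete with a toy simulation. Everything numeric here is an assumption chosen for illustration, not taken from the book: 30% of patients are severely ill, severely ill patients choose the drug 90% of the time (others 10%), the drug raises the chance of survival by 0.1, and lifespan is simplified to a yes/no survival outcome. `simulate_patient` and `survival_rate` are hypothetical names.

```python
import random

def simulate_patient(rng, do_drug=None):
    """Return (takes_drug, survives) for one hypothetical patient.

    do_drug=None  -> the patient decides (passive observation, "seeing").
    do_drug=True/False -> the choice is overridden (intervention, "doing").
    """
    severe = rng.random() < 0.3
    if do_drug is None:
        takes_drug = rng.random() < (0.9 if severe else 0.1)
    else:
        takes_drug = do_drug
    p_survive = (0.2 if severe else 0.8) + (0.1 if takes_drug else 0.0)
    return takes_drug, rng.random() < p_survive

def survival_rate(rows, drug):
    """Share of survivors among rows whose drug status equals `drug`."""
    outcomes = [survived for took, survived in rows if took == drug]
    return sum(outcomes) / len(outcomes)

rng = random.Random(7)
n = 100_000
observed = [simulate_patient(rng) for _ in range(n)]
forced_on = [simulate_patient(rng, do_drug=True) for _ in range(n)]
forced_off = [simulate_patient(rng, do_drug=False) for _ in range(n)]

# Seeing: P(L | D) - P(L | not-D), confounded by severity.
seeing = survival_rate(observed, True) - survival_rate(observed, False)
# Doing: P(L | do(D)) - P(L | do(not-D)), the effect of the drug itself.
doing = survival_rate(forced_on, True) - survival_rate(forced_off, False)

print(f"seeing: {seeing:+.3f}")
print(f"doing:  {doing:+.3f}")
```

In this setup the observational comparison makes the drug look harmful, because the people who choose it are mostly the severely ill, while the interventional comparison recovers the assumed benefit of about +0.1.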

The Gene-Evolution Perspective vs. the Individual-Experience Perspective

The currency of evolution is neither hunger nor pain, but rather copies of DNA helixes. Just as the economic success of a company is measured only by the number of dollars in its bank account, not by the happiness of its employees, so the evolutionary success of a species is measured by the number of copies of its DNA. If no more DNA copies remain, the species is extinct, just as a company without money is bankrupt. If a species boasts many DNA copies, it is a success, and the species flourishes. From such a perspective, 1,000 copies are always better than a hundred copies. This is the essence of the Agricultural Revolution: the ability to keep more people alive under worse conditions.

Yet why should individuals care about this evolutionary calculus? Why would any sane person lower his or her standard of living just to multiply the number of copies of the Homo sapiens genome? Nobody agreed to this deal: the Agricultural Revolution was a trap.

The pursuit of an easier life resulted in much hardship, and not for the last time. It happens to us today. How many young college graduates have taken demanding jobs in high-powered firms, vowing that they will work hard to earn money that will enable them to retire and pursue their real interests when they are thirty-five? But by the time they reach that age, they have large mortgages, children to school, houses in the suburbs that necessitate at least two cars per family, and a sense that life is not worth living without really good wine and expensive holidays abroad. What are they supposed to do, go back to digging up roots? No, they double their efforts and keep slaving away.

Unfortunately, the evolutionary perspective is an incomplete measure of success. It judges everything by the criteria of survival and reproduction, with no regard for individual suffering and happiness. Domesticated chickens and cattle may well be an evolutionary success story, but they are also among the most miserable creatures that ever lived. The domestication of animals was founded on a series of brutal practices that only became crueler with the passing of the centuries.

Yet from the viewpoint of the herd, rather than that of the shepherd, it’s hard to avoid the impression that for the vast majority of domesticated animals, the Agricultural Revolution was a terrible catastrophe. Their evolutionary ‘success’ is meaningless. A rare wild rhinoceros on the brink of extinction is probably more satisfied than a calf who spends its short life inside a tiny box, fattened to produce juicy steaks. The contented rhinoceros is no less content for being among the last of its kind. The numerical success of the calf’s species is little consolation for the suffering the individual endures.

Yuval Noah Harari. 2011. Sapiens: A Brief History of Humankind
Chinese edition: Yuval Noah Harari, 《人类简史:从动物到上帝》, translated by 林俊宏