Information Processing

Pessimism of the Intellect, Optimism of the Will   Twitter: @steve_hsu

Saturday, May 23, 2015

Ioannidis at MSU

These videos are from an interview I did with John Ioannidis when he visited Michigan State earlier this month. The whole thing (29 min) and more short clips are available here.

Is 85% of NIH funding wasted?

Early candidate gene studies rarely replicated, but GWAS hits do.

The flyer for his talk:

Friday, May 22, 2015

Genetic architecture and predictive modeling of quantitative traits

As an experiment I recorded this video using slides from a talk I gave last week at NIH. I will be giving similar talks later this spring/summer at Human Longevity Inc. and BGI. The commonality between these institutions is that all three are on the road to accumulating a million human genomes. Who will get there first?

Recording the video was easy using Keynote, although it's a bit odd to talk to yourself for an hour. I recommend that everyone do this, in order to reach a much larger audience than can fit in a lecture hall :-)

Genetic architecture and predictive modeling of quantitative traits

I discuss the application of Compressed Sensing (L1-penalized optimization or LASSO) to genomic prediction. I show that matrices composed of human genomes are good compressed sensors, and that LASSO applied to genomic prediction exhibits a phase transition as the sample size is varied. When the sample size crosses the phase boundary, complete identification of the subspace of causal variants is possible. For typical traits of interest (e.g., with heritability ~ 0.5), the phase boundary occurs at N ~ 30s, where s (sparsity) is the number of causal variants. I give some estimates of sparsity associated with complex traits such as height and cognitive ability, which suggest s ~ 10k. In practical terms, these results imply that powerful genomic prediction will be possible for many complex traits once ~ 1 million genotypes are available for analysis.
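As a rough back-of-the-envelope check on the numbers in the abstract, the rule of thumb N ~ 30s can be evaluated directly. This is only a sketch: the function name is hypothetical, and treating the factor 30 as an exact constant (rather than an approximate value for heritability near 0.5) is an assumption.

```python
def samples_at_phase_boundary(sparsity, factor=30):
    """Approximate sample size N at the phase boundary, using N ~ factor * s.

    The default factor of 30 is the value quoted in the abstract for
    heritability ~ 0.5; it is approximate, not exact.
    """
    return factor * sparsity


s = 10_000  # sparsity estimate quoted for height and cognitive ability
n = samples_at_phase_boundary(s)
print(f"s = {s:,} causal variants -> N ~ {n:,} genotypes")
```

With s ~ 10k this gives N of a few hundred thousand, the same order of magnitude as the ~ 1 million genotypes mentioned in the abstract.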

Thursday, May 21, 2015

Fifty years of twin studies

The most interesting aspect of these results is that for many traits there is no detectable non-additivity. That is, gene-gene interactions seem to be insignificant, and a simple linear genetic architecture is consistent with the results.
Meta-analysis of the heritability of human traits based on fifty years of twin studies
Nature Genetics (2015) doi:10.1038/ng.3285

Despite a century of research on complex traits in humans, the relative importance and specific nature of the influences of genes and environment on human traits remain controversial. We report a meta-analysis of twin correlations and reported variance components for 17,804 traits from 2,748 publications including 14,558,903 partly dependent twin pairs, virtually all published twin studies of complex traits. Estimates of heritability cluster strongly within functional domains, and across all traits the reported heritability is 49%. For a majority (69%) of traits, the observed twin correlations are consistent with a simple and parsimonious model where twin resemblance is solely due to additive genetic variation. The data are inconsistent with substantial influences from shared environment or non-additive genetic variation. This study provides the most comprehensive analysis of the causes of individual differences in human traits thus far and will guide future gene-mapping efforts.
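To make the "additive only" statement concrete, here is a minimal sketch using the textbook Falconer relations between identical (MZ) and fraternal (DZ) twin correlations. This is not necessarily the procedure used in the meta-analysis, and the function name and example correlations are hypothetical.

```python
def falconer_estimates(r_mz, r_dz):
    """Textbook variance decomposition from twin correlations.

    Assumes the classical model in which MZ twins share all additive
    genetic variance and DZ twins share half of it.
    """
    h2 = 2.0 * (r_mz - r_dz)   # additive genetic share
    c2 = 2.0 * r_dz - r_mz     # shared-environment share
    e2 = 1.0 - r_mz            # non-shared environment (plus measurement error)
    return h2, c2, e2


# Hypothetical correlations: r_MZ exactly twice r_DZ, the purely additive pattern.
h2, c2, e2 = falconer_estimates(r_mz=0.5, r_dz=0.25)
print(h2, c2, e2)
```

When r_MZ is about twice r_DZ, the shared-environment term comes out near zero, which is the pattern the abstract describes for a majority of traits.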
See also Additivity and complex traits in mice:
You may have noticed that I am gradually collecting copious evidence for (approximate) additivity. Far too many scientists and quasi-scientists are infected by the epistasis or epigenetics meme, which is appealing to those who "revel in complexity" and would like to believe that biology is too complex to succumb to equations. ("How can it be? But what about the marvelous incomprehensible beautiful sacred complexity of Nature? But But But ...")

I sometimes explain things this way:

There is a deep evolutionary reason behind additivity: nonlinear mechanisms are fragile and often "break" due to DNA recombination in sexual reproduction. Effects which are only controlled by a single locus are more robustly passed on to offspring. ...

Many people confuse the following statements:

"The brain is complex and nonlinear and many genes interact in its construction and operation."

"Differences in brain performance between two individuals of the same species must be due to nonlinear (non-additive) effects of genes."

The first statement is true, but the second does not appear to be true across a range of species and quantitative traits.
On the genetic architecture of intelligence and other quantitative traits (p.16):
... The preceding discussion is not intended to convey an overly simplistic view of genetics or systems biology. Complex nonlinear genetic systems certainly exist and are realized in every organism. However, quantitative differences between individuals within a species may be largely due to independent linear effects of specific genetic variants. As noted, linear effects are the most readily evolvable in response to selection, whereas nonlinear gadgets are more likely to be fragile to small changes. (Evolutionary adaptations requiring significant changes to nonlinear gadgets are improbable and therefore require exponentially more time than simple adjustment of frequencies of alleles of linear effect.) One might say that, to first approximation, Biology = linear combinations of nonlinear gadgets, and most of the variation between individuals is in the (linear) way gadgets are combined, rather than in the realization of different gadgets in different individuals.

Linear models work well in practice, allowing, for example, SNP-based prediction of quantitative traits (milk yield, fat and protein content, productive life, etc.) in dairy cattle. ...
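A linear (additive) predictor of the kind described here can be sketched in a few lines. The effect sizes and genotypes below are made up for illustration; real SNP-based predictors estimate the effects from large training sets.

```python
def additive_score(genotype, effects, intercept=0.0):
    """Predicted trait value under a purely additive model.

    genotype: allele counts per SNP (0, 1, or 2)
    effects:  per-allele effect size for each SNP (hypothetical values here)
    """
    if len(genotype) != len(effects):
        raise ValueError("genotype and effects must have the same length")
    return intercept + sum(g * b for g, b in zip(genotype, effects))


effects = [0.5, -0.25, 1.0]   # hypothetical effect sizes
individual = [0, 1, 2]        # hypothetical allele counts
print(additive_score(individual, effects))
```

Each variant contributes independently of the others, so there are no interaction terms to estimate.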

Wednesday, May 20, 2015

Imperial exams and human capital

The dangers of rent seeking and the educational signaling trap. Although the imperial examinations were probably g loaded (and hence supplied the bureaucracy with talented administrators for hundreds of years), it would have been better to examine candidates on useful knowledge, which every participant would then acquire to some degree.

See also Les Grandes Ecoles Chinoises and History Repeats.
Farewell to Confucianism: The Modernizing Effect of Dismantling China’s Imperial Examination System

Ying Bai
The Hong Kong University of Science and Technology

Imperial China employed a civil examination system to select scholar bureaucrats as ruling elites. This institution dissuaded high-performing individuals from pursuing some modernization activities, such as establishing modern firms or studying overseas. This study uses prefecture-level panel data from 1896-1910 to compare the effects of the chance of passing the civil examination on modernization before and after the abolition of the examination system. Its findings show that prefectures with higher quotas of successful candidates tended to establish more modern firms and send more students to Japan once the examination system was abolished. As higher quotas were assigned to prefectures that had an agricultural tax in the Ming Dynasty (1368-1643) of more than 150,000 stones, I adopt a regression discontinuity design to generate an instrument to resolve the potential endogeneity, and find that the results remain robust.
From the paper:
Rent seeking is costly to economic growth if “the ablest young people become rent seekers [rather] than producers” (Murphy, Shleifer, and Vishny 1991: 529). Theoretical studies suggest that if a society specifies a higher payoff for rent seeking rather than productive activities, more talent would be allocated in unproductive directions (Acemoglu 1995; Baumol 1990; Murphy, Shleifer, and Vishny 1991, 1993). This was the case in late Imperial China, when a large part of the ruling class – scholar bureaucrats – was selected on the basis of the imperial civil examination.1 The Chinese elites were provided with great incentives to invest in a traditional education and take the civil examination, and hence few incentives to study other “useful knowledge” (Kuznets 1965), such as Western science and technology.2 Thus the civil examination constituted an institutional obstacle to the rise of modern science and industry (Baumol 1990; Clark and Feenstra 2003; Huff 2003; Lin 1995).

This paper identifies the negative incentive effect of the civil exam on modernization by exploring the impact of the system’s abolition in 1904-05. The main empirical difficulty is that the abolition was universal, with no regional variation in policy implementation. To better understand the modernizing effect of the system’s abolition, I employ a simple conceptual framework that incorporates two choices open to Chinese elites: to learn from the West and pursue some modernization activities or to invest in preparing for the civil examination. In this model, the elites with a greater chance of passing the examination would be less likely to learn from the West; they would tend to pursue more modernization activities after its abolition. Accordingly, the regions with a higher chance of passing the exam should be those with a larger increase in modernization activities after the abolition, which makes it possible to employ a difference-in-differences (DID) method to identify the causal effect of abolishing the civil examination on modernization.

I exploit the variation in the probability of passing the examination among prefectures – an administrative level between the provincial and county levels. To control the regional composition of successful candidates, the central government of the Qing dynasty (1644-1911) allocated a quota of successful candidates to each prefecture.3 In terms of the chances of individual participants – measured by the ratio of quotas to population – there were great inequalities among the regions (Chang 1955). To measure the level of modernization activities in a region, I employ (1) the number of newly modern private firms (per million inhabitants) above a designated size that has equipping steam engine or electricity as a proxy for the adoption of Western technology and (2) the number of new Chinese students in Japan – the most important host country of Chinese overseas students (per million inhabitants) as a proxy of learning Western science. Though the two measures might capture other things, for instance entrepreneurship or human capital accumulation, the two activities are both intense in modern science and technology, and thus employed as the proxies of modernization. ...
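The difference-in-differences logic described in the excerpt can be sketched with a two-group, two-period comparison. This is a simplification: the paper uses a continuous quota measure in panel data, and every number below is hypothetical.

```python
def did_estimate(treated_before, treated_after, control_before, control_after):
    """Two-group, two-period difference-in-differences estimate.

    Subtracts the change in the comparison group from the change in the
    group of interest, removing any trend common to both.
    """
    return (treated_after - treated_before) - (control_after - control_before)


# Hypothetical new modern firms per million inhabitants, before and after abolition.
high_quota = (1.0, 4.0)   # prefectures with a high quota per capita
low_quota = (1.0, 2.0)    # prefectures with a low quota per capita
print(did_estimate(high_quota[0], high_quota[1], low_quota[0], low_quota[1]))
```

The common rise of 1.0 in both groups drops out, leaving 2.0 as the extra increase in high-quota prefectures.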
From Credentialism and elite employment:
Evaluators relied so intensely on “school” as a criterion of evaluation not because they believed that the content of elite curricula better prepared students for life in their firms – in fact, evaluators tended to believe that elite and, in particular, super-elite instruction was “too abstract,” “overly theoretical,” or even “useless” compared to the more “practical” and “relevant” training offered at “lesser” institutions – but rather due to the strong cultural meanings and character judgments evaluators attributed to admission and enrollment at an elite school. I discuss the meanings evaluators attributed to educational prestige in their order of prevalence among respondents. ...

Saturday, May 16, 2015

The Grisly Folk

H.G. Wells on the first encounters between modern humans and Neanderthals. See also The Neanderthal problem and Neanderthals dumb?
The Grisly Folk: ... But one may doubt if the first human group to come into the grisly land was clever enough to solve the problems of the new warfare. Maybe they turned southward again to the gentler regions from which they had come, and were killed by or mingled with their own brethren again. Maybe they perished altogether in that new land of the grisly folk into which they had intruded. Yet the truth may be that they even held their own and increased. If they died there were others of their kind to follow them and achieve a better fate.

That was the beginning of a nightmare age for the little children of the human tribe. They knew they were watched.

Their steps were dogged. The legends of ogres and man-eating giants that haunt the childhood of the world may descend to us from those ancient days of fear. And for the Neandertalers it was the beginning of an incessant war that could end only in extermination.

The Neandertalers, albeit not so erect and tall as men, were the heavier, stronger creatures, but they were stupid, and they went alone or in twos and threes; the menfolk were swifter, quicker-witted, and more social — when they fought they fought in combination. They lined out and surrounded and pestered and pelted their antagonists from every side. They fought the men of that grisly race as dogs might fight a bear. They shouted to one another what each should do, and the Neandertaler had no speech; he did not understand. They moved too quickly for him and fought too cunningly.

Many and obstinate were the duels and battles these two sorts of men fought for this world in that bleak age of the windy steppes, thirty or forty thousand years ago. The two races were intolerable to each other. They both wanted the caves and the banks by the rivers where the big flints were got. They fought over the dead mammoths that had been bogged in the marshes, and over the reindeer stags that had been killed in the rutting season. When a human tribe found signs of the grisly folk near their cave and squatting place, they had perforce to track them down and kill them; their own safety and the safety of their little ones was only to be secured by that killing. The Neandertalers thought the little children of men fair game and pleasant eating. ...
Razib Khan discusses other examples from this genre.

The ravages of time

These make me happy and sad at the same time.

Tuesday, May 12, 2015

The view from here: vast and mysterious

An even better visualization would be the state vector of our universe rotating in a vast Hilbert space. Too bad my brain can't picture it!

Monday, May 11, 2015

New kids on the blockchain

WSJ reports on institutional interest in blockchain technologies.
WSJ: Nasdaq OMX Group Inc. is testing a new use of the technology that underpins the digital currency bitcoin, in a bid to transform the trading of shares in private companies.

The experiment joins a slew of financial-industry forays into bitcoin-related technology. If the effort is deemed successful, Nasdaq wants to use so-called blockchain technology in its stock market, one of the world’s largest, and potentially shake up systems that have facilitated the trading of financial assets for decades. ...

The blockchain is maintained, updated and verified by a vast global network of independently owned computers known as “miners” that collectively work to prove the ledger’s authenticity.

In theory, this decentralized system for verifying information means transactions need no longer be channeled through banks, clearinghouses and other middlemen. Advocates say this “trustless” structure means direct transfers of ownership can occur over the blockchain almost instantaneously without the risk of default or manipulation by an intermediating third party.

One idea is that encrypted, digital representations of share certificates could be inserted into minute bitcoin transactions known as “Satoshis,” facilitating an immediate, verifiable transfer of stock ownership from seller to buyer.

Still, bitcoin-based settlement remains untested in the real world. Regulators worry about the anonymous status of the bitcoin miners that collectively manage the system. It is conceivable that bad actors might one day take over the mining network and destroy the integrity of its verification system, some say.

... Real-time settlement has been a goal of regulators and investors alike as it would reduce the risk of counterparty failure and free up billions of dollars of capital that is sidelined during that wait period.

Oliver Bussmann, chief information officer of Swiss bank UBS AG, last year said the blockchain was the biggest disrupting force in the financial sector, meaning its success could potentially have far-reaching ramifications for banks, trading houses and others. His bank has since established a special blockchain lab to study uses of the technology.

Nasdaq named Fredrik Voss, a vice president, as its new “blockchain technology evangelist” to lead efforts to increase use of the technology.
See my earlier discussion Crypto-currencies, Bitcoin and Blockchain. For these kinds of applications I think miners should not be the primary mechanism for blockchain verification. The bank / exchange should do it themselves and then offer a large bounty (e.g., $50 million) to any miner who finds an error in the publicly available blockchain. Of course, in these scenarios it would be nice to have more functionality than just transfers (which is all Bitcoin can do now). It would be trivial to encode options, derivatives contracts, conditional agreements, etc. in the blockchain. See Ethereum.
7. One interesting scenario is for a country (Singapore? Denmark?) or large financial entity (Goldman, JPM, Visa) to issue its own crypto currency, managing the blockchain itself but leaving it in the public domain so that third parties (including regulators) can verify transactions. Confidence in this kind of "Institutional Coin" (IC) would be high from the beginning. An IC with Ethereum-like capabilities could revolutionize the financial industry. In place of an opaque web of counterparty relationships leading to systemic risk, the IC blockchain would be easily audited by machine. Regulators would require that the IC authority know its customers, so pseudonymity would only be partial.
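To illustrate why a publicly posted ledger is easy for third parties to audit, here is a toy hash-chained ledger with a verifier. It is a sketch under strong assumptions: there is no mining, no digital signatures, and no consensus protocol, and the record format and function names are hypothetical rather than Bitcoin's or Ethereum's actual formats.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, prev_hash, payload):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    record = json.dumps(
        {"index": index, "prev": prev_hash, "payload": payload}, sort_keys=True
    )
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


def build_chain(payloads):
    """The issuing institution appends blocks, each committing to the previous one."""
    chain, prev = [], GENESIS
    for i, payload in enumerate(payloads):
        h = block_hash(i, prev, payload)
        chain.append({"index": i, "prev": prev, "payload": payload, "hash": h})
        prev = h
    return chain


def first_invalid_block(chain):
    """Any third party can recompute the hashes; returns the first bad index or None."""
    prev = GENESIS
    for block in chain:
        expected = block_hash(block["index"], prev, block["payload"])
        if block["prev"] != prev or block["hash"] != expected:
            return block["index"]
        prev = block["hash"]
    return None


ledger = build_chain(["A pays B 5", "B pays C 2", "C pays A 1"])
print(first_invalid_block(ledger))   # None: the ledger is internally consistent
ledger[1]["payload"] = "B pays C 200"
print(first_invalid_block(ledger))   # 1: the altered block no longer matches its hash
```

An auditor hunting for the bounty would run a check like first_invalid_block on the published chain; any retroactive edit breaks the hash links from that block onward.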

Saturday, May 09, 2015

Our Kids and Coming Apart

Nick Lemann reviews Our Kids: The American Dream in Crisis by Robert D. Putnam. At the descriptive level, Putnam's conclusions seem very similar to those of Charles Murray in Coming Apart.

Of course, description is much easier to obtain than causality.
NYBooks: ... By the logic of the book, access to social capital ought to be strongly associated with going to college and doing well there—otherwise, why stress it so strongly? The syllogism would be: social capital leads to educational attainment, which leads to mobility. But for his classmates, Putnam reports, academic achievement was the factor most predictive of college attendance, and the link between such achievement and parental encouragement (of the kind he has copiously praised in the main body of the book) was only “modestly important,” and “much weaker” than the link between class rank and college attendance. Not only that:
No other measure of parental affluence or family structure or neighborhood social capital (or indeed anything else we had measured)—none of the factors that this book has shown are so important in producing today’s opportunity gap—had any appreciable effect on college attendance or other educational attainment.
In the methods appendix, Putnam refers readers to his website for more detail on his findings about his classmates. There, he writes:
No measure of parental resources adds any predictive power whatsoever—not parental occupational status, not parental unemployment, not family economic insecurity during high school, not homeownership, not neighborhood characteristics, and not family structure…. Parental education, parental encouragement, and class rank were all modestly predictive of extracurricular participation, but holding constant those variables, extracurricular participation itself was unrelated to college-going.
So is it really the case that Putnam has shown that strong social capital once produced individual opportunity—let alone that the deterioration of social capital has produced what he calls the opportunity gap? The passages I just quoted seem to indicate that the strong association between social capital and opportunity that is Putnam’s core assertion has not been proven. Putnam doesn’t define “social capital” precisely enough to rigorously test its effects, even on as small and unrepresentative a sample as the one in his survey, and he doesn’t attempt to test its effects precisely in the present. It could even be that, rather than social capital generating prosperity, prosperity might generate social capital, which would mean Putnam has been showing us the effects of inequality, not the causes.
It seems possible to me that:

1. American society has become increasingly meritocratic in the last 50 years, with advancement more and more dependent on largely heritable attributes such as cognitive ability, conscientiousness, future time orientation, etc. Consequently, gaps between different SES groups have become more and more difficult to remediate.

2. External forces, such as automation and global economic competition, have placed a larger and larger premium on attributes such as those listed above, leaving Americans of below average ability at a severe disadvantage.

The consequences of these observations are exacerbated by an increasingly winner take all economic system.

If these points are correct, then Our Kids and Coming Apart are documenting consequences, not causes.

See also Income, Wealth and IQ, US Economic Mobility and Random microworlds: the mystery of nonshared environment.

Friday, May 08, 2015

Rockhold winning and losing

I was surprised at how easily Luke Rockhold beat Lyoto Machida a few weeks ago. Once it went to the ground Rockhold completely dominated the fight.

If you're a grappler you might enjoy this video of Rockhold getting destroyed by a much smaller Rustam Chsiev at Grappler's Quest. Unbelievable how physical this match was. Luke could do nothing to Chsiev.

Thursday, May 07, 2015

Peter Visscher: Genomics, Big Data, Medicine, and Complex Traits

Another good talk from the Genomics, Big Data, and Medicine Seminar Series at the Icahn School of Medicine (Mt. Sinai). Peter starts his talk by discussing height as a classical model trait, giving credit to Galton for first investigating heritability and related ideas, and noting the approximate additivity of genetic effects. @16min, state of the art genomic prediction of height from GIANT collaboration.

Interestingly, Visscher is Dutch for Fisher -- as in Ronald Fisher (the father of population genetics and early pioneer in statistics).

See Maxwell's demon and genetic engineering.
Ronald Fisher on positive alleles for intelligence, in Mendelism and Biometry (1911):

Suppose we knew, for example, 20 pairs of mental characters [loci in the genome]. These would combine in over a million pure mental types; [some of] these would naturally occur rather less frequently than once in a billion; or in a country like England about once in 20,000 generations [assuming the positive variants are somewhat rare]; it will give some idea as to the excellence of the best of these types when we consider that the Englishmen from Shakespeare to Darwin have occurred within 10 generations; the thought of a race of men combining the illustrious qualities of these giants, and breeding true to them, is almost too overwhelming, but such a race will inevitably arise in whatever country first sees the inheritance of mental characters elucidated.
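Fisher's "over a million pure mental types" is simple counting. The sketch below assumes that a "pure" type means being homozygous for one of the two alternatives at each of the 20 loci, so there are two choices per locus; that reading is an interpretation, not something stated in the quote.

```python
loci = 20                 # "20 pairs of mental characters"
pure_types = 2 ** loci    # two homozygous alternatives per locus (assumed reading)
print(f"{pure_types:,}")  # 1,048,576
```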

Sunday, May 03, 2015

Replication is hard; understanding what that means is even harder

Bad news for psychology -- only 39 of 100 published findings were replicated in a recent coordinated effort.
Nature | News: An ambitious effort to replicate 100 research findings in psychology ended last week — and the data look worrying. Results posted online on 24 April, which have not yet been peer-reviewed, suggest that key findings from only 39 of the published studies could be reproduced. ...
The article goes on:
But the situation is more nuanced than the top-line numbers suggest (See graphic, 'Reliability test'). Of the 61 non-replicated studies, scientists classed 24 as producing findings at least “moderately similar” to those of the original experiments, even though they did not meet pre-established criteria, such as statistical significance, that would count as a successful replication.  [ Yeah, right. ]
This makes me suspect bounded cognition -- humans trusting their post hoc stories and intuition instead of statistical criteria chosen before planned replication attempts.

The most tragic thing about Ioannidis's work on low replication rates and wasted research funding is that while medical researchers might pay lip service to his results (which are highly cited), they typically have not actually grasped the implications for their own work. In particular, they typically have not updated their posteriors to reflect the low reliability of research results, even in the top journals.
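What updating a posterior means here can be sketched with Bayes' rule for a "statistically significant" finding. The prior, power, and significance threshold below are hypothetical inputs chosen for illustration, and the calculation ignores bias and multiple testing.

```python
def positive_predictive_value(prior, power, alpha):
    """Probability that a significant result reflects a true effect.

    prior: fraction of tested hypotheses that are actually true
    power: probability of detecting a true effect
    alpha: false-positive rate for a null effect
    """
    if not (0.0 < prior < 1.0 and 0.0 < power <= 1.0 and 0.0 < alpha < 1.0):
        raise ValueError("prior and alpha must be in (0, 1); power in (0, 1]")
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Hypothetical field where 1 in 10 tested hypotheses is true.
print(positive_predictive_value(prior=0.1, power=0.8, alpha=0.05))  # well powered
print(positive_predictive_value(prior=0.1, power=0.2, alpha=0.05))  # underpowered
```

Under these assumed inputs, a significant result from an underpowered study is more likely false than true, which is the kind of adjustment the paragraph above says researchers rarely make.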

Thursday, April 30, 2015

DNA Dreams at Harvard

This is a panel discussion of the documentary film DNA Dreams (see below), about BGI and its Cognitive Genomics Lab.

Moderator: Dr. Evelynn Hammonds, Director of the Project on Race & Gender in Science & Medicine, Hutchins Center for African & African American Research/Barbara Gutmann Rosenkrantz Professor of the History of Science

Panelists include: (L to R)
1. George Church, Robert Winthrop Professor of Genetics, Harvard Medical School
2. Bregtje van der Haak, Filmmaker
3. Arthur Kleinman, Director of the Harvard University Asia Center and Professor of Anthropology and Medical Anthropology at Harvard University
4. Peter Galison, Pellegrino University Professor, Director of the Collection of Historical Scientific Instruments, Harvard University
Peter Galison is dismissive of "single parameter" measures of cognitive ability. George Church replies quite effectively. Certainly anyone who has thought seriously about IQ or g knows that it is only a crude measure of (compressed approximation to) a multi-dimensional set of mental abilities. I wonder how Peter would react to learning that his grandchild would be born with a mutation depressing the meaningless "single parameter" in question to an SD below normal. Would he just shrug it off as unimportant?

I believe this is the entire documentary:

Wednesday, April 29, 2015

Value-Added College Rankings

New report from Brookings estimates value added (in terms of economic success) by university, controlling for input factors such as student quality and family income. This is just the first step toward outcomes-driven rankings of universities that will be far more useful than the existing rankings, which are largely based on prestige. Brief summary.

See also Defining Merit.
Brookings: The choice of whether and where to attend college is among the most important investment decisions individuals and families make, yet people know little about how institutions of higher learning compare along important dimensions of quality. This is especially true for the nearly 5,000 colleges granting credentials of two years or fewer, which together graduate nearly 2 million students annually, or about 39 percent of all postsecondary graduates. Moreover, popular rankings of college quality, such as those produced by U.S. News, Forbes, and Money, focus only on a small fraction of the nation’s four-year colleges and tend to reward highly selective institutions over those that contribute the most to student success.

Drawing on a variety of government and private data sources, this report presents a provisional analysis of college value-added with respect to the economic success of the college’s graduates, measured by the incomes graduates earn, the occupations in which they work, and their loan repayment rates. This is not an attempt to measure how much alumni earnings increase compared to forgoing a postsecondary education. Rather, as defined here, a college’s value-added measures the difference between actual alumni outcomes (like salaries) and predicted outcomes for institutions with similar characteristics and students. Value-added, in this sense, captures the benefits that accrue from both measurable aspects of college quality, such as graduation rates and the market value of the skills a college teaches, as well as unmeasurable “x factors,” like exceptional leadership or teaching, that contribute to student success.

While imperfect, the value-added measures introduced here improve on conventional rankings in several ways. They are available for a much larger number of postsecondary institutions; they focus on the factors that best predict objectively measured student economic outcomes; and their goal is to isolate the effect colleges themselves have on those outcomes, above and beyond what students’ backgrounds would predict.

Using a variety of private and public data sources, this analysis finds that:

Graduates of some colleges enjoy much more economic success than their characteristics at time of admission would suggest. Colleges with high value-added in terms of alumni earnings include not only nationally recognized universities such as Cal Tech, MIT, and Stanford, but also less well-known institutions such as Rose-Hulman Institute of Technology in Indiana, Colgate in upstate New York, and Carleton College in Minnesota. Two-year colleges with high-value added scores include the New Hampshire Technical Institute, Lee College near Houston, and Pearl River Community College in Mississippi. ...
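The value-added definition quoted above (actual outcome minus the outcome predicted from student characteristics) can be sketched as the residual of a regression. This toy version uses a single predictor and ordinary least squares; the report's actual model has many more inputs, and the data below are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def value_added(inputs, outcomes):
    """Actual outcome minus the outcome predicted from the input measure."""
    slope, intercept = fit_line(inputs, outcomes)
    return [y - (intercept + slope * x) for x, y in zip(inputs, outcomes)]


# Hypothetical colleges: standardized entry score and alumni salary (arbitrary units).
entry_scores = [1.0, 2.0, 3.0, 4.0]
salaries = [2.0, 4.0, 6.0, 9.0]
for score, va in zip(entry_scores, value_added(entry_scores, salaries)):
    print(f"entry score {score}: value-added {va:+.2f}")
```

A positive residual means graduates earn more than their characteristics at admission would predict, which is how a less selective school can outrank a more selective one on this measure.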
On the quality of PayScale compensation figures used in the analysis:
...There are a number of ways one can assess whether or not PayScale accurately captures the earnings of graduates—or whether the sample is statistically biased by the voluntary nature of its data collection.

Broadly, PayScale earnings by major for U.S. residents with bachelor’s degrees can be compared to similar data from the ACS, which annually samples 1 percent of the U.S. population.30 The correlation between the two is what matters most for this analysis, since value-added calculations are based on relative differences between predicted and actual earnings.

The correlation between bachelor’s degree holders on PayScale and median salaries by major for workers in the labor force from the Census Bureau is 0.85 across 158 majors matched between the two databases.
Effects of student ability and family SES:
For each of the three post-attendance outcomes measured here—mid-career salary, loan repayment rate, and occupational earnings power—student test scores, math scores in particular, are highly correlated: 0.76 for mid-career salaries and 0.69 for student loan repayment and occupational earnings power (Figure 3). Other student characteristics, such as the percentage receiving Pell grants, also correlate highly with these outcomes, though not as highly as test scores.
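The correlations quoted in these excerpts (0.85 across majors, 0.76 for mid-career salaries) are presumably Pearson correlation coefficients, though the excerpts do not say so explicitly. A minimal sketch of that statistic, with made-up data:

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical math scores and salaries for five institutions.
math_scores = [500, 550, 600, 650, 700]
salaries = [60, 70, 72, 90, 95]
print(round(pearson_r(math_scores, salaries), 2))
```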

Go Beavers! Note Caltech grads are also much more likely to win Nobel Prizes and be elected to the National Academy of Sciences or the National Academy of Engineering than graduates of any other university. Claims that a technical education bestows narrow technical skill without conveying deep ideas and teaching critical thinking are silly.
... Alumni from Cal Tech list the highest-value skills on their LinkedIn profiles (Table 3); their skills include algorithm development, machine learning, Python, C++, and startups (that is, starting a new business). Cal Tech is followed closely by Harvey Mudd and MIT. Babson College, also in the top 10, focuses on business rather than science; its course offerings teach many quantitative skills relevant for business-oriented STEM careers. Many graduates from the Air Force Academy are prepared for high-paying engineering jobs in the military and at large defense contractors. They list skills like aerospace and project planning.

Amusingly, you can see by looking at the tables of regression results that Asian share of enrollment is strongly positively correlated with mid-career earnings, whereas female share is negatively correlated. This is not surprising, given that STEM skills are big drivers of compensation.

Colgate's strong performance (and probably that of Carleton and some others) cannot be explained in terms of STEM skills -- see discussion and figure on page 16.

Thursday, April 23, 2015

CRISPR edits in human zygotes

Results such as these had been the subject of rumors for some time. Also covered in Nature News.

It is very early days for this technology -- the off-target rate can probably be reduced significantly using better methods. But in the near term, safety and efficacy issues make PGD a better technique for improving human reproduction. See, e.g., PGD in the US and Israel and Single Cell Sequencing in PGD.
CRISPR/Cas9-mediated gene editing in human tripronuclear zygotes
(Protein and Cell -- open access)

Genome editing tools such as the clustered regularly interspaced short palindromic repeat (CRISPR)-associated system (Cas) have been widely used to modify genes in model systems including animal zygotes and human cells, and hold tremendous promise for both basic research and clinical applications. To date, a serious knowledge gap remains in our understanding of DNA repair mechanisms in human early embryos, and in the efficiency and potential off-target effects of using technologies such as CRISPR/Cas9 in human pre-implantation embryos. In this report, we used tripronuclear (3PN) zygotes to further investigate CRISPR/Cas9-mediated gene editing in human cells. We found that CRISPR/Cas9 could effectively cleave the endogenous β-globin gene (HBB). However, the efficiency of homologous recombination directed repair (HDR) of HBB was low and the edited embryos were mosaic. Off-target cleavage was also apparent in these 3PN zygotes as revealed by the T7E1 assay and whole-exome sequencing. Furthermore, the endogenous delta-globin gene (HBD), which is homologous to HBB, competed with exogenous donor oligos to act as the repair template, leading to untoward mutations. Our data also indicated that repair of the HBB locus in these embryos occurred preferentially through the non-crossover HDR pathway. Taken together, our work highlights the pressing need to further improve the fidelity and specificity of the CRISPR/Cas9 platform, a prerequisite for any clinical applications of CRISPR/Cas9-mediated editing.
The table below is from the Supplement.

Wednesday, April 22, 2015

Earnings by educational attainment 1990-2013

This graphic is from today's NYTimes: Why American Workers Without Much Education Are Being Hammered.

Aside from the human capital (education) point the figure makes, I'm a bit puzzled by the following: real per-capita GDP is probably up at least ~50% (e.g., ~2% x 23 years) over the 1990-2013 period. Where did those gains go? Into the pockets of a small invisible group that doesn't show up in the graph (note use of medians, not averages)? It seems that everyone except the members of this small group was "hammered" over the last two decades ...

Note added (with better data): This source has 2013 GDP at $16 trillion versus $9 trillion in 1990 (both figures in 2009 dollars). Total US population went up 26% (316 million from 249 million). The percentage of the population with college degrees went from about 20% to 30%. It still appears to me that much of GDP increase during the period did not go to workers or ordinary people.
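The arithmetic in the note above can be checked with a few lines of Python. This is only a sketch using the rounded figures quoted in the note ($9 trillion and $16 trillion in 2009 dollars, 249 and 316 million people); the helper name `annualized` is my own, and better source data would shift the results slightly.

```python
# Figures quoted in the note above (2009 dollars; population in persons).
gdp_1990, gdp_2013 = 9e12, 16e12
pop_1990, pop_2013 = 249e6, 316e6
years = 2013 - 1990  # 23

def annualized(ratio, n_years):
    """Compound annual growth rate implied by a total growth ratio."""
    return ratio ** (1.0 / n_years) - 1.0

gdp_ratio = gdp_2013 / gdp_1990
per_capita_ratio = (gdp_2013 / pop_2013) / (gdp_1990 / pop_1990)

print(f"total real GDP growth:      {gdp_ratio - 1:.0%}")
print(f"real per-capita GDP growth: {per_capita_ratio - 1:.0%}")
print(f"annualized GDP growth:      {annualized(gdp_ratio, years):.1%}")
print(f"annualized per-capita:      {annualized(per_capita_ratio, years):.1%}")
```

With these inputs, real per-capita GDP comes out roughly 40% higher in 2013 than in 1990, or about 1.5% per year compounded, which is still well above the sub-1% annualized changes in median earnings.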

If you annualize any of the real income changes in the graph over 23 years, the change is small -- less than 1% per year. Yet real GDP grew at about 2.5% per year on average during the period (a factor of 16/9 over 23 years). The graph below (from this 2007 post) might shed some light on the mystery (even the top quintile saw little income appreciation):

More here.

Tuesday, April 21, 2015

China's Ideological Spectrum

These researchers identify a dominant principal component in the Chinese ideological spectrum. Discussed on Sinica podcast.

China's Ideological Spectrum

Jennifer Pan (Harvard University - Graduate School of Arts and Sciences)
Yiqing Xu (MIT - Department of Political Science)

We offer the first large scale empirical analysis of ideology in contemporary China to determine whether individuals fall along a discernible and coherent ideological spectrum, and whether there are regional and inter-group variations in ideological orientation. Using principal component analysis (PCA) on a survey of 171,830 individuals, we identify one dominant ideological dimension in China. Individuals who are politically conservative, who emphasize the supremacy of the state and nationalism, are also likely to be economically conservative, supporting a return to socialism and state-control of the economy, and culturally conservative, supporting traditional, Confucian values. In contrast, political liberals, supportive of constitutional democracy and individual liberty, are also likely to be economic liberals who support market-oriented reform and social liberals who support modern science and values such as sexual freedom. This uni-dimensionality of ideology is robust to a wide variety of diagnostics and checks. Using post-stratification based on census data, we find a strong relationship between liberal orientation and modernization -- provinces with higher levels of economic development, trade openness, urbanization are more liberal than their poor, rural counterparts, and individuals with higher levels of education and income are more liberal than their less educated and lower-income peers.
Warning: PCA is the tool of the devil ;-)
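To make "one dominant dimension" concrete, here is a toy sketch in pure Python. Everything in it is hypothetical: the synthetic survey, the loadings (0.9) and noise level (0.5), and the function names are my own assumptions, not the authors' data or method. It builds answers to three questions from a single latent score, then finds the leading principal component of the covariance matrix by power iteration.

```python
import random

def covariance_matrix(rows):
    """Sample covariance of a list of equal-length response vectors."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(d)] for i in range(d)]

def leading_component(cov, iters=200):
    """Power iteration: returns (eigenvalue, unit eigenvector) of the top PC."""
    d = len(cov)
    v = [1.0 / d ** 0.5] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return eigenvalue, v

# Hypothetical synthetic survey: one latent "ideology" score drives answers
# to political, economic, and cultural questions, plus independent noise.
rng = random.Random(0)
rows = []
for _ in range(2000):
    z = rng.gauss(0, 1)
    rows.append([0.9 * z + rng.gauss(0, 0.5) for _ in range(3)])

cov = covariance_matrix(rows)
top_eigenvalue, loadings = leading_component(cov)
explained = top_eigenvalue / sum(cov[i][i] for i in range(3))
print(f"share of variance on first component: {explained:.2f}")
print("loadings:", [round(x, 2) for x in loadings])
```

Because the toy data were generated from a single factor, the first component carries most of the variance and loads about equally on all three questions. That is also the reason for the warning: PCA will always return a first component, and interpreting it as a real "ideological axis" takes the kind of robustness checks the abstract mentions.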

Sunday, April 19, 2015

Ulam on physical intuition and visualization

The picture above is of von Neumann, Feynman, and Ulam. More Ulam. See also the nature of intuition and intuition and the two brains.
Adventures of a Mathematician: (p.147-148) ... the main ability to have was a visual, and also an almost tactile, way to imagine the physical situations, rather than a merely logical picture of the problems.

The feeling for problems in physics is quite different from purely theoretical mathematical thinking. It is hard to describe the kind of imagination that enables one to guess at or gauge the behavior of physical phenomena. Very few mathematicians seem to possess it to any great degree. Johnny [vN], for example, did not have to any extent the intuitive common sense and "gut" feeling or penchant for guessing what happens in given physical situations. His memory was mainly auditory, rather than visual.

Another thing that seems necessary is the knowledge of a dozen or so physical constants, not merely of their numerical value, but a real feeling for their relative orders of magnitude and interrelations, and, so to speak, an instinctive ability to "estimate."

I knew, of course, the values of constants like the velocity of light and maybe three or four other fundamental constants—the Planck constant h, a gas constant R, etc. Very soon I discovered that if one gets a feeling for no more than a dozen other radiation and nuclear constants, one can imagine the subatomic world almost tangibly, and manipulate the picture dimensionally and qualitatively, before calculating more precise relationships.

Most of the physics at Los Alamos could be reduced to the study of assemblies of particles interacting with each other, hitting each other, scattering, sometimes giving rise to new particles. Strangely enough, the actual working problems did not involve much of the mathematical apparatus of quantum theory although it lay at the base of the phenomena, but rather dynamics of a more classical kind—kinematics, statistical mechanics, large-scale motion problems, hydrodynamics, behavior of radiation, and the like. In fact, compared to quantum theory the project work was like applied mathematics as compared with abstract mathematics. If one is good at solving differential equations or using asymptotic series, one need not necessarily know the foundations of function space language. It is needed for a more fundamental understanding, of course. In the same way, quantum theory is necessary in many instances to explain the data and to explain the values of cross sections. But it was not crucial, once one understood the ideas and then the facts of events involving neutrons reacting with other nuclei.
This "dynamics of a more classical kind" did not require intuition for entanglement or high dimensional Hilbert spaces. But see von Neumann and the foundations of quantum statistical mechanics for examples of the latter.

Saturday, April 18, 2015

Summer of '69

I love this video. The clothes, the hair, the faces -- they're all so familiar. Every person in the video looks like someone I grew up with in the midwest :-)

Tuesday, April 14, 2015

2:1 faculty preference for women on STEM tenure track (PNAS)

The results described below suggest that faculty evaluators of STEM job applicants tend to favor women over men. Certainly, most departments receive strong incentives and signals from above to increase numbers of women and underrepresented minorities among their faculty. Women could still face obstacles at other points in their careers, such as during promotion or merit reviews, or in the competition for resources such as grant funding or lab space. Nevertheless, I think gender discrimination has decreased significantly during my adult life.

This article is also discussed in Nature. See also STEM, Gender, and Leaky Pipelines and Gender differences in preferences, choices, and outcomes. Earlier blog posts citing research by Ceci and Williams.
National hiring experiments reveal 2:1 faculty preference for women on STEM tenure track (PNAS)

Wendy M. Williams and Stephen J. Ceci

National randomized experiments and validation studies were conducted on 873 tenure-track faculty (439 male, 434 female) from biology, engineering, economics, and psychology at 371 universities/colleges from 50 US states and the District of Columbia. In the main experiment, 363 faculty members evaluated narrative summaries describing hypothetical female and male applicants for tenure-track assistant professorships who shared the same lifestyle (e.g., single without children, married with children). Applicants' profiles were systematically varied to disguise identically rated scholarship; profiles were counterbalanced by gender across faculty to enable between-faculty comparisons of hiring preferences for identically qualified women versus men. Results revealed a 2:1 preference for women by faculty of both genders across both math-intensive and non–math-intensive fields, with the single exception of male economists, who showed no gender preference. Results were replicated using weighted analyses to control for national sample characteristics. In follow-up experiments, 144 faculty evaluated competing applicants with differing lifestyles (e.g., divorced mother vs. married father), and 204 faculty compared same-gender candidates with children, but differing in whether they took 1-y-parental leaves in graduate school. Women preferred divorced mothers to married fathers; men preferred mothers who took leaves to mothers who did not. In two validation studies, 35 engineering faculty provided rankings using full curricula vitae instead of narratives, and 127 faculty rated one applicant rather than choosing from a mixed-gender group; the same preference for women was shown by faculty of both genders. These results suggest it is a propitious time for women launching careers in academic science. Messages to the contrary may discourage women from applying for STEM (science, technology, engineering, mathematics) tenure-track assistant professorships.

Saturday, April 11, 2015

IQ prediction from structural MRI

These authors use machine learning techniques to build sparse predictors based on grey/white matter volumes of specific regions. Correlations obtained are ~ 0.7 (see figure).

I predict that genomic estimators of this kind will be available once ~ 1 million genomes and cognitive scores are available for analysis. See also Myths, Sisyphus and g.
MRI-Based Intelligence Quotient (IQ) Estimation with Sparse Learning (PLOS)

In this paper, we propose a novel framework for IQ estimation using Magnetic Resonance Imaging (MRI) data. In particular, we devise a new feature selection method based on an extended dirty model for jointly considering both element-wise sparsity and group-wise sparsity. Meanwhile, due to the absence of large dataset with consistent scanning protocols for the IQ estimation, we integrate multiple datasets scanned from different sites with different scanning parameters and protocols. In this way, there is large variability in these different datasets. To address this issue, we design a two-step procedure for 1) first identifying the possible scanning site for each testing subject and 2) then estimating the testing subject’s IQ by using a specific estimator designed for that scanning site. We perform two experiments to test the performance of our method by using the MRI data collected from 164 typically developing children between 6 and 15 years old. In the first experiment, we use a multi-kernel Support Vector Regression (SVR) for estimating IQ values, and obtain an average correlation coefficient of 0.718 and also an average root mean square error of 8.695 between the true IQs and the estimated ones. In the second experiment, we use a single-kernel SVR for IQ estimation, and achieve an average correlation coefficient of 0.684 and an average root mean square error of 9.166. All these results show the effectiveness of using imaging data for IQ prediction, which is rarely done in the field according to our knowledge.
Training and testing of models was performed as described below. They had only 164 individuals in their sample, so IIUC the average correlation is computed on test samples of ~16 individuals. It would be good to see their predictors tested on larger data sets. I wonder how stable the predictor variables (feature coefficients) were across partitions.
We performed experiments with 10-fold cross-validations. Specifically, we randomly partitioned each dataset into 10 subsets with no replacement, and used 9 out of the 10 subsets for training and the remaining one for testing. To further avoid a possible bias during partitioning, we repeated the experiments 10 times.
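A quick sketch of why the test sets are so small. This assumes, for simplicity, that all 164 subjects are pooled into one dataset (the paper partitions each site's dataset separately, so its folds would be smaller still); `k_fold_indices` is a hypothetical helper, and model fitting is omitted.

```python
import random

def k_fold_indices(n_subjects, k=10, seed=0):
    """Randomly partition subject indices into k disjoint test folds."""
    idx = list(range(n_subjects))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

folds = k_fold_indices(164, k=10)
fold_sizes = [len(f) for f in folds]
print(fold_sizes)  # each fold holds 16 or 17 subjects

for test_fold in folds:
    held_out = set(test_fold)
    train = [i for i in range(164) if i not in held_out]
    # Fit on `train`, evaluate on `test_fold` (model fitting omitted here).
    assert len(train) + len(test_fold) == 164
```

A correlation computed on 16 or 17 held-out subjects is noisy, which is why averaging over folds and repeated partitions helps, and why a test on a larger independent sample would be more convincing.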
Some background from the paper. Strangely, they don't cite the Thompson lab (UCLA) results on brain size and intelligence (21k individuals). IIRC from their results, brain size alone correlates 0.4 with IQ.
... Uncovering human intelligence has always been of major interest in cognitive neuroscience. With the advent of brain imaging, there have been efforts to investigate the relation between brain anatomy and intelligence [3,4], and substantial understanding has been achieved in the field. For example, Supekar et al. showed that the size and circuitry of certain parts of children’s brains could be a potential predictor for how well they would respond to intensive math tutoring [5]. Chen et al. [6] demonstrated that the volumetric analysis of gray matter (GM) from structural Magnetic Resonance Imaging (MRI) could be used to predict a subsequent decline in IQ in children with sickle cell disease. McDaniel et al. [3] found that the volume of the brain is positively correlated with IQ according to MRI-based experiments. Frangou et al. [7] reported positive correlations between IQ score and GM density of the orbitofrontal cortex, cingulate gyrus, cerebellum, and thalamus, but negative correlation between IQ score and the caudate nucleus. On the other hand, Navas-Sanchez et al. [8] investigated the relationship between IQ score and microstructure of white matter (WM) tracts using diffusion tensor imaging (DTI), and found that IQ score is positively correlated with fractional anisotropy (FA). Kim et al. [9] found that lower performance in verbal IQ score is correlated with the decrease of FA values. In another DTI-based study, Welcome et al. [10] discovered that the volume of WM fiber tracts is correlated with nonverbal IQ score. Inspired by these strong correlations between brain anatomy and IQ score, we propose, in this study, a novel framework to estimate IQ by using GM and WM features extracted from structural MRI. ...
Their results might give some indication as to which regions of the brain are responsible for most of the population variation in IQ. Below are the brain regions most commonly identified as "features" by sparse learning methods.

From the comments (55% of variance means a correlation just larger than 0.7):
There are lots of recent studies that have tried to estimate IQ from MRI or EEG readings (sometimes called "neurometric" IQ); many of the teams are based in South Korea and Malaysia. The Malaysian group, based at the MARA University of Technology, has published about a dozen papers over the past two years, involving hundreds of subjects. They can now use EEG readings to sort subjects into one of seven IQ ranges (e.g. 90-100, 120-130) with 83% accuracy; this figure jumps to 98% when subjects are sorted into one of three IQ ranges (low, medium, or high). The South Korean researchers, at Seoul National University, have been combining MRI and fMRI scans to predict IQ scores, and in late 2012 they were granted a patent for their "neurobiological method for measuring human intelligence," which can explain up to 55% of the variance between individual IQ scores. An example (from Dec 2013) is at
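For reference, the conversion between variance explained and correlation used in the remark above, as a minimal sketch. It assumes "variance explained" means the squared correlation between predicted and measured IQ; the function name is my own.

```python
import math

def correlation_from_variance_explained(r_squared):
    """Correlation implied by a fraction of variance explained."""
    return math.sqrt(r_squared)

# 55% of variance, as quoted in the comment.
print(round(correlation_from_variance_explained(0.55), 3))

# Going the other way for the two correlations reported in the PLOS paper.
for r in (0.718, 0.684):
    print(r, "->", round(r * r, 3))
```

So 55% of variance corresponds to a correlation of about 0.74, while the paper's correlations of 0.718 and 0.684 correspond to roughly 52% and 47% of variance.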


Thursday, April 09, 2015

For this you went to Harvard?

Personal assistants of the world, unite!

Better to reign in Hell, than serve in Heaven  ;-)
Dissent Magazine: ... When I was an undergrad at Harvard, the English department produced fancy brochures about the opportunities available to its majors: teacher, editor, Rhodes scholar. Personal assistant was not listed. I hadn’t even heard of such positions until senior year, when older friends, artistically inclined friends, started snagging them. It’s the position I think I’ve heard most about now.

Nearly every exclusive field runs on assistants. The actor James Franco, like Buddha before him, had an assistant keep track of his meals and school assignments. The critic and writer Daphne Merkin has employed a steady stream of Ivy-educated elves. They’re tasked with everything from editing to returning dead houseplants. Bestselling novelist John Irving (The Cider House Rules, A Prayer for Owen Meany) has an assistant who types up his roughly twenty-five pages of handwritten manuscript a day. He recruits exclusively from liberal arts schools in cold climates like Middlebury and Vassar, to ensure his hires can survive the winter at his home in Dorset, Vermont. During the 2008 presidential season, recent Harvard grad Eric Lesser impressed senior advisor to the president, David Axelrod, with his color-coded system for tracking Obama’s campaign luggage. Lesser was taken on as Axelrod’s “special assistant,” assuming responsibility for everything from supervising his boss’s diet to organizing the first-ever presidential Seder.

Welcome to the main artery into creative or elite work—highly pressurized, poorly recompensed, sometimes exhilarating, sometimes menial secretarial assistance. From the confluence of two grand movements in American history—the continued flight of women out of the home and into the workplace, and the growing population of arts and politically oriented college graduates struggling to survive in urban epicenters that are increasingly ceded to bankers and consultants—the personal assistant is born. ...

... One of the most exceptional—and mysterious—personal assistantship programs is run by a hedge fund billionaire in New York. For years, his human resources staff used to tuck the same discreet, neatly boxed advertisement in alongside the dense criticism of the New Republic and the New York Review of Books, as well as in Ivy League alumni magazines:
RESEARCH ASSOCIATE/PERSONAL ASSISTANT New York City—Highly intelligent, resourceful individuals with exceptional communication skills sought to undertake research projects and administrative tasks for one of Wall Street’s most successful entrepreneurs. We welcome applications from writers, musicians, artists, or others who may be pursuing other professional goals in the balance of their time. $90-110k/yr to start (depending on qualifications). Resume to:
The firm recruits and interviews year-round, whether there are openings or not. In addition to ads, the billionaire’s people email Phi Beta Kappa and summa students from top colleges about openings at the firm, though they are also likely scouting for assistants. “Although much of our work involves the use of advanced mathematical and computational techniques,” the email reads, “we are equally interested in speaking with brilliant liberal arts graduates, regardless of major, who are open to the possibility of a career they may never have previously considered.” It might be the only time in their lives that art students or English majors are courted by a potential employer. “The firm,” the email continues, “ . . . can give serious consideration only to individuals having extraordinary intellectual capabilities, communication skills, and general ‘real world’ competence.” Of the many who apply, a handful are called to New York, where their “real world competence” is quantified in no fewer than five management consulting-style interviews. Interviewees sign non-disclosure forms, and if hired as personal assistants, are essentially barred from saying where they work. When pressed, they might say they are writing books or “making music.” ...

Saturday, April 04, 2015

Multigenerational mobility: does the Son Also Rise?

The working paper below on multigenerational mobility arrives at smaller intergenerational correlations than Greg Clark obtained (e.g., 0.4 vs 0.7). I found Clark's results hard to explain, at least in genetic terms, because estimates of assortativity in mating are much lower than required.

Related posts here and here. From the second link:
Correlations as high as 0.7 -- 0.8 are implausible from genetic factors alone without highly assortative mating. Traits such as height and IQ have narrow sense heritabilities as large as h^2 ~ 0.6, so the fraction of variance accounted for is ~ 60%, and midparent-child correlation as high as ~ 0.8, but under even somewhat random mating the parental midpoint is significantly closer to average than the phenotype of the more exceptional parent. This would cause children to regress to the mean much faster in height and IQ than in social status as indicated in Clark's data. It's also important to note that social status itself is only imperfectly correlated to observable phenotypes such as IQ, Conscientiousness or Extraversion. See Intergenerational mobility: Bowles, Gintis, Clark for more.
Solon's results seem to be consistent with Bowles and Gintis.
What Do We Know So Far about Multigenerational Mobility?

Gary Solon
Michigan State University

“Multigenerational mobility” refers to the associations in socioeconomic status across three or more generations. This article begins by summarizing the longstanding but recently growing empirical literature on multigenerational mobility. It then discusses multiple theoretical interpretations of the empirical patterns, including the one recently proposed in Gregory Clark’s book The Son Also Rises.

... contrary to Clark’s prediction, most group-average studies other than his own – including the surnames-based work by Chetty et al. – have estimated much smaller intergenerational associations.
Clark was recently interviewed on KQED Forum. Michael Krasny was willing to entertain Clark's Social Darwinistic perspective ;-)

Income, wealth, and IQ

I'm occasionally asked about financial returns to cognitive ability. As a rough rule of thumb, judging from the graphs below (obtained here), I would say:
On average, an increase of IQ by one SD corresponds to  ~ $30k per annum of additional income. (Somewhat less than 1 SD in income; the distribution is far from normal.)

By early middle age, individuals > 90th percentile in IQ have, typically, more than twice the wealth of individuals who are of average IQ.
If you can find better data than what is shown below, please let me know. (How do bottom decile adults manage to earn ~ $40k per annum, on average? Does this include transfer payments?)

Of course, you can turn this around to estimate the increased (heritable) cognitive ability endowments of high income parents relative to average parents. That might help to clarify causality in results such as this one: "Studies show that children from low-income families have smaller brains and lower cognitive abilities." (Even Nature is susceptible to faulty logic.)

Note, there is good evidence that positive returns to IQ persist above high thresholds (e.g., IQ=120, or even top 1 percent ability). See here and here.

Income mobility is strongly affected by IQ. In fact, IQ is a much stronger predictor variable than race for escaping the bottom quintile of income (Pew Trust report; NLSY again, AFQT=IQ scores):

This last figure is very problematic for the "Social Status/Wealth causes IQ" position. It seems to be the other way around: the kids escaping bottom quintile childhoods all experienced poverty, but the ones with higher cognitive ability were more likely to move up. (Recall that adopted children tend to resemble their biological parents much more than their adoptive ones; family environment has a limited effect on IQ, which is highly heritable.)
Pew: Individuals with higher test scores in adolescence are more likely to move out of the bottom quintile, and test scores can explain virtually the entire black-white mobility gap. Figure 13 plots the transition rates against percentiles of the AFQT test score distribution. The upward-sloping lines indicate that, as might be expected, individuals with higher test scores are much more likely to leave the bottom income quintile. For example, for whites, moving from the first percentile of the AFQT distribution to the median roughly doubles the likelihood from 42 percent to 81 percent. The comparable increase for blacks is even more dramatic, rising from 33 percent to 78 percent. Perhaps the most stunning finding is that once one accounts for the AFQT score, the entire racial gap in mobility is eliminated for a broad portion of the distribution. At the very bottom and in the top half of the distribution a small gap remains, but it is not statistically significant.
