				 Psychology's Replication Crisis Is Real, Many Labs 2 Says 
 
			
Source: https://www.theatlantic.com/science/...s-real/576223/
	Quote: 
	
Psychology’s Replication Crisis Is Running Out of Excuses
Another big project has found that only half of studies can be repeated. And this time, the usual explanations fall flat.
 
Over the past few years, an international team of almost 200 psychologists has been trying to repeat a set of previously published experiments from its field, to see if it can get the same results. Despite its best efforts, the project, called Many Labs 2, has only succeeded in 14 out of 28 cases. Six years ago, that might have been shocking. Now it comes as expected (if still somewhat disturbing) news.
 
 In recent years, it has become painfully clear that psychology is facing a “reproducibility crisis,” in which even famous, long-established phenomena—the stuff of textbooks and TED Talks—might not be real. There’s social priming, where subliminal exposures can influence our behavior. And ego depletion, the idea that we have a limited supply of willpower that can be exhausted. And the facial-feedback hypothesis, which simply says that smiling makes us feel happier.
 
One by one, researchers have tried to repeat the classic experiments behind these well-known effects—and failed. And whenever psychologists undertake large projects, like Many Labs 2, in which they replicate past experiments en masse, they typically succeed, on average, half of the time.
 
Ironically enough, it seems that one of the most reliable findings in psychology is that only half of psychological studies can be successfully repeated.
 
That failure rate is especially galling, says Simine Vazire from the University of California at Davis, because the Many Labs 2 teams tried to replicate studies that had made a big splash and been highly cited. Psychologists “should admit we haven’t been producing results that are as robust as we’d hoped, or as we’d been advertising them to be in the media or to policy makers,” she says. “That might risk undermining our credibility in the short run, but denying this problem in the face of such strong evidence will do more damage in the long run.”
 
Many psychologists have blamed these replication failures on sloppy practices. Their peers, they say, are too willing to run small and statistically weak studies that throw up misleading fluke results, to futz around with the data until they get something interesting, or to only publish positive results while hiding negative ones in their file drawers.

But skeptics have argued that the misleadingly named “crisis” has more mundane explanations. First, the replication attempts themselves might be too small. Second, the researchers involved might be incompetent, or lack the know-how to properly pull off the original experiments. Third, people vary, and two groups of scientists might end up with very different results if they do the same experiment on two different groups of volunteers.
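A quick way to see why small, statistically weak studies throw up fluke results, and why much larger replication samples matter, is to simulate them. The sketch below is purely illustrative (the effect size, sample sizes, and function name are invented for the example) and assumes Python with NumPy and SciPy:

Code:
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)

def significant_fraction(n_per_group, true_effect, n_studies=5_000, alpha=0.05):
    """Fraction of simulated two-group studies that reach p < alpha (i.e. statistical power)."""
    a = rng.normal(0.0, 1.0, size=(n_studies, n_per_group))          # control-group scores
    b = rng.normal(true_effect, 1.0, size=(n_studies, n_per_group))  # treated-group scores
    _, p = ttest_ind(a, b, axis=1)                                   # one t-test per simulated study
    return float((p < alpha).mean())

# A small study (20 people per group) of a modest true effect (d = 0.3) is badly underpowered:
print(significant_fraction(20, 0.3))    # roughly 0.15, so most honest attempts look like failures
# The same effect with a much larger sample is detected nearly every time:
print(significant_fraction(1200, 0.3))  # close to 1.0

With so little power, many of the significant results that do get published will be flukes or inflated estimates, which is one reason large, carefully run replications can come up empty.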
 
The Many Labs 2 project was specifically designed to address these criticisms. With 15,305 participants in total, the new experiments had, on average, 60 times as many volunteers as the studies they were attempting to replicate. The researchers involved worked with the scientists behind the original studies to vet and check every detail of the experiments beforehand. And they repeated those experiments many times over, with volunteers from 36 different countries, to see if the studies would replicate in some cultures and contexts but not others. “It’s been the biggest bear of a project,” says Brian Nosek from the Center for Open Science, who helped to coordinate it. “It’s 28 papers’ worth of stuff in one.”
 
Despite the large sample sizes and the blessings of the original teams, the team failed to replicate half of the studies it focused on. It couldn’t, for example, show that people subconsciously exposed to the concept of heat were more likely to believe in global warming, or that moral transgressions create a need for physical cleanliness in the style of Lady Macbeth, or that people who grow up with more siblings are more altruistic. And as in previous big projects, online bettors were surprisingly good at predicting beforehand which studies would ultimately replicate. Somehow, they could intuit which studies were reliable.
 
But other intuitions were less accurate. In 12 cases, the scientists behind the original studies suggested traits that the replicators should account for. They might, for example, only find the same results in women rather than men, or in people with certain personality traits. In almost every case, those suggested traits proved to be irrelevant. The results just weren’t that fickle.
 
Likewise, Many Labs 2 “was explicitly designed to examine how much effects varied from place to place, from culture to culture,” says Katie Corker, the chair of the Society for the Improvement of Psychological Science. “And here’s the surprising result: The results do not show much variability at all.” If one of the participating teams successfully replicated a study, others did, too. If a study failed to replicate, it tended to fail everywhere.
 
It’s worth dwelling on this because it’s a serious blow to one of the most frequently cited criticisms of the “reproducibility crisis” rhetoric. Surely, skeptics argue, it’s a fantasy to expect studies to replicate everywhere. “There’s a massive deference to the sample,” Nosek says. “Your replication attempt failed? It must be because you did it in Ohio and I did it in Virginia, and people are different. But these results suggest that we can’t just wave those failures away very easily.”
 
This doesn’t mean that cultural differences in behavior are irrelevant. As Yuri Miyamoto from the University of Wisconsin at Madison notes in an accompanying commentary, “In the age of globalization, psychology has remained largely European [and] American.” Many researchers have noted that volunteers from Western, educated, industrialized, rich, and democratic countries—WEIRD nations—are an unusual slice of humanity who think differently than those from other parts of the world.

In the majority of the Many Labs 2 experiments, the team found very few differences between WEIRD volunteers and those from other countries. But Miyamoto notes that its analysis was a little crude—in considering “non-WEIRD countries” together, it’s lumping together people from cultures as diverse as Mexico, Japan, and South Africa. “Cross-cultural research,” she writes, “must be informed with thorough analyses of each and all of the cultural contexts involved.”
 
Nosek agrees. He’d love to see big replication projects that include more volunteers from non-Western societies, or that try to check phenomena that you’d expect to vary considerably outside the WEIRD bubble. “Do we need to assume that WEIRDness matters as much as we think it does?” he asks. “We don’t have a good evidence base for that.”
 
Sanjay Srivastava from the University of Oregon says the lack of variation in Many Labs 2 is actually a positive thing. Sure, it suggests that the large number of failed replications really might be due to sloppy science. But it also hints that the fundamental business of psychology—creating careful lab experiments to study the tricky, slippery, complicated world of the human mind—works pretty well. “Outside the lab, real-world phenomena can and probably do vary by context,” he says. “But within our carefully designed studies and experiments, the results are not chaotic or unpredictable. That means we can do valid social-science research.”
 
The alternative would be much worse. If it turned out that people were so variable that even very close replications threw up entirely different results, “it would mean that we could not interpret our experiments, including the positive results, and could not count on them happening again,” Srivastava says. “That might allow us to dismiss failed replications, but it would require us to dismiss original studies, too. In the long run, Many Labs 2 is a much more hopeful and optimistic result.”