Wednesday, December 10, 2014

There is nothing new about the Knowledge Café or is there? by David Gurteen

When people say that something is not new, they usually mean that they are familiar with the concept and that it is in common practice. 

To my mind, when this objection is levelled at the Knowledge Café, it means that the objector does not fully understand it. 

When I look at how organizations operate and how people in them behave, it is quite apparent that people either are not aware of the fundamental principles and the power of good conversation, or they understand them but do not change their way of doing things, whether out of habit, laziness or choice. 

Why, in meetings and presentations, are we still so dependent on PowerPoint? Why is the dominant format of a talk a long presentation with lots of PowerPoint slides and a very short time for Q&A? Why is no time included for reflection, and no time for conversation amongst the participants so that they can engage with the topic or issue? Why do we insist on talking at each other rather than with each other? 

Why is the dominant layout of our meeting rooms either lecture style or large tables, when we know from experience and observation that these layouts are not conducive to good conversation? The research shows that good conversations take place in small groups of 3 or 4 people sitting around a small round table, or even no table at all. 

Why, in meetings, especially those where people do not know each other well, do we not allow time for socialisation and relationship building before getting down to business, when again the research shows that such socialisation improves people's cognitive skills? Why are circles so rarely used in meetings when the research and our own personal experience demonstrate their power? 

Why do managers and facilitators seek to control meetings so tightly, and why are they afraid of negative talk or dissent? By suppressing people's fears, doubts and uncertainties, you do not eliminate them; you just drive them underground. As Peter Block says, "Yes" has no meaning if there is not the option to say "No". You need to bring people's doubts and fears out into the open and talk about them at length. 

And why do we allow the same old people to dominate the conversations in our meetings, and do nothing to encourage the quieter ones to engage and speak up, when we know from research that group intelligence relates to how the members of a team talk to each other? It depends on the social sensitivity of the group members and on the readiness of the group to let members take equal turns in the conversation; groups where one person dominates are less collectively intelligent than groups where conversational turns are more evenly distributed. 

The Knowledge Café may not be totally new, but it addresses all these issues and more. As a conversational method, though, it is still sadly very poorly adopted. 

In fact in many organizations conversation is seen as wasting time. But slowly this is changing. More and more people are starting to understand the power of conversation and take a conversational approach to the way that they connect, relate and work with each other. They see themselves as Conversational Leaders.

Wednesday, November 12, 2014

The Most Hilarious Proofreading Mistake in a Scientific Paper Ever by George Dvorsky

This is an actual quote from a scientific paper, published recently, and evidently without editing. Apparently the authors didn't think much of one of the papers they were citing, and their publisher didn't bother to edit out their pre-publication snark.
Ugh, this is not the kind of thing you want to see in a scientific journal. It makes us lose faith in peer review and, by extension, the scientific method itself.
Four months after being published, someone finally noticed that a fish mating paper in the journal Ethology — "Variation in Melanism and Female Preference in Proximate but Ecologically Distinct Environments" — contained a rather embarrassing passage that both the authors and the peer reviewers failed to notice.
As Retraction Watch reported yesterday, the journal quickly removed the paper after the issue was brought to light.
Later, corresponding author Zach Culumber told Retraction Watch:
No, this was not intentional. It was added into the paper by a co-author during revision (after peer-review). It was unfortunately an oversight that became incorporated into the paper during the process of sending the manuscript back and forth between co-authors. The comment in question was not spotted during the proofing process with the journal. Neither myself nor any of the co-authors have any ill-will towards any other investigators, and I would never condone this sentiment towards another person or their work. We are working with the Journal now to correct the mistake. As the corresponding author, I apologize for the error.
Wiley says it's going to investigate the error and republish a corrected version as soon as possible, which now appears to have been done.

Sunday, November 9, 2014

Does Media Violence Predict Societal Violence? It Depends on What You Look at and When by Christopher J. Ferguson

This article presents 2 studies of the association of media violence rates with societal violence rates. In the first study, movie violence and homicide rates are examined across the 20th century and into the 21st (1920–2005). Throughout the mid-20th century small-to-moderate correlational relationships can be observed between movie violence and homicide rates in the United States. This trend reversed in the early and latter 20th century, with movie violence rates inversely related to homicide rates. In the second study, videogame violence consumption is examined against youth violence rates in the previous 2 decades. Videogame consumption is associated with a decline in youth violence rates. Results suggest that societal consumption of media violence is not predictive of increased societal violence rates.

Tuesday, October 21, 2014

Popular Mechanics: 6 Warning Signs That a Scientific Study is Bogus

Was the Paper Published in a Peer-Reviewed Journal?

"If it wasn't, you have no reason to trust it," says Ivan Oransky, former executive editor at Reuters and cofounder of the blog Retraction Watch. "The peer-review system, as flawed as it is, stands between us and really poor science." Also, find out if the journal or its publisher is on Jeffrey Beall's list of questionable open-access journals.

What is the Journal's Impact Factor?

The impact factor is the average number of times a journal's papers are cited by other researchers. You can usually find this information on the journal's home page or by searching "impact factor" along with its name. Check out the impact factor of other journals in that field of research to see how they compare. 

Do the Researchers Cite Their Own Papers?

Excessive self-citation can be a red flag that the authors are promoting views that fall outside the scientific consensus. Citations are listed at the end of a paper. 

How Many Test Subjects Were Used?

A large number of test subjects makes a study more robust and reduces the likelihood that the results are due to chance. In general, the more questions a paper asks, the larger its sample size should be. Most reliable papers report something called a p-value, which estimates the probability (p) of obtaining results at least as extreme as those observed, assuming chance alone is at work. By convention, a p-value below 0.05 suggests the study's conclusions may be meaningful. Smaller p-values are better. 
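As a concrete illustration of that convention (my sketch, not from the article): suppose a supposedly fair coin lands heads 60 times in 100 flips. The one-sided p-value is the chance of seeing at least that many heads from a genuinely fair coin:

```python
from math import comb

def one_sided_p_value(heads, flips, p=0.5):
    """P(at least `heads` successes in `flips` trials) under the null
    hypothesis that each trial succeeds with probability p."""
    return sum(comb(flips, k) * p**k * (1 - p)**(flips - k)
               for k in range(heads, flips + 1))

print(round(one_sided_p_value(60, 100), 4))  # ≈ 0.0284, below the 0.05 cutoff
```

Since 0.0284 is below 0.05, a fair coin would produce a result this lopsided less than 3% of the time, so by the usual convention the result would be called statistically significant.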

Does it Rely on Correlation?

Cigarette smoking has declined dramatically in the U.S. in the past few decades, and so has the national homicide rate. But just because two events occur at the same time doesn't mean that one caused the other. 

Have the Results Been Reproduced?

To find out, search the paper's name on Google Scholar and click on the Cited By link beneath the name. This will list other researchers who mention the paper in their own publications, and may also give you a clearer view of how other researchers critiqued the paper. 

Monday, August 18, 2014

Surveys Can Make People Go Extreme by Esther Inglis-Arkell

There are all kinds of reasons why people don't tell the truth when asked questions. Sometimes they suddenly turn into fanatics who hate, or love, everything. Here's how you catch people when they go extreme, or when they try to just get along.
We already know that people deliberately lie when given surveys on sex and drugs, but they also lie when given surveys about the importance of flossing and whether people should smoke in shopping malls. The difference is, many people don't even know that they're lying. People are driven to exaggerate (or even invent) their likes and dislikes, and so when they're asked to score, from one to five, their support for an issue or agreement with a statement, they avoid the middle and go right for one and for five.
This bias, called "extreme response bias", has annoyed many manufacturers and politicians who believed their target audience was passionately in favor of a new flavor of Coke or a ban on littering, trotted the idea out, and got a lackluster response. Sometimes people are actually passionate about a subject, and sometimes they just want to seem that way. Researchers took a look at separating the two, and came up with a few guidelines to tell whether people were inflating their opinions.
First of all, the more options you give a person, the more likely they are to go for the fringe opinion. A survey asking people to rate their experience on a scale of one to five will get far fewer extreme responses than a survey that asks people to rate their experience on a scale from one to ten. Individually, people with more education tend to be less extreme in their responses. The most telling variable, though, is another kind of bias.
Acquiescence bias is the tendency of a surveyed individual to go along with whatever the surveyor suggests. This is why researchers agonize over trying to make each question as neutral as possible. Ask people "don't you think smoking should be completely banned in malls," and they will tend to say yes. Ask them, "don't you think people should be allowed to smoke in public malls," and they will also tend to say yes. In order to be accurate, researchers can't tip their hands and let people know what answer they expect, or want. If, on the other hand, what the researchers want is to tell how many people responding to their questions are just going along with it, they can put out two surveys, one with a question that tips people one way, and one with a re-worded version of the question that tips people the other way.
Acquiescence bias tends to be a harbinger of extreme response. If people aren't going to be honest - either with the surveyors or themselves - they're at least going to be enthusiastically dishonest. So the more acquiescence everyone gets, the more extremity they should expect to see.

Friday, July 25, 2014

New algorithm identifies data subsets that will yield the most reliable predictions by Larry Hardesty

Much artificial-intelligence research addresses the problem of making predictions based on large data sets. An obvious example is the recommendation engines at retail sites like Amazon and Netflix.

But some types of data are harder to collect than online click histories —information about geological formations thousands of feet underground, for instance. And in other applications—such as trying to predict the path of a storm—there may just not be enough time to crunch all the available data.
Dan Levine, an MIT graduate student in aeronautics and astronautics, and his advisor, Jonathan How, the Richard Cockburn Maclaurin Professor of Aeronautics and Astronautics, have developed a new technique that could help with both problems. For a range of common applications in which data is either difficult to collect or too time-consuming to process, the technique can identify the subset of data items that will yield the most reliable predictions. So geologists trying to assess the extent of underground petroleum deposits, or meteorologists trying to forecast the weather, can make do with just a few, targeted measurements, saving time and money.
Levine and How, who presented their work at the Uncertainty in Artificial Intelligence conference this week, consider the special case in which something about the relationships between data items is known in advance. Weather prediction provides an intuitive example: Measurements of temperature, pressure, and wind velocity at one location tend to be good indicators of measurements at adjacent locations, or of measurements at the same location a short time later, but the correlation grows weaker the farther out you move either geographically or chronologically.

Tuesday, July 22, 2014

Emotional Contagion on Facebook? More Like Bad Research Methods by JOHN M. GROHOL, PSY.D.

A study (Kramer et al., 2014) was recently published that showed something astonishing — people altered their emotions and moods based upon the presence or absence of other people’s positive (and negative) moods, as expressed on Facebook status updates. The researchers called this effect an “emotional contagion,” because they purported to show that our friends’ words on our Facebook news feed directly affected our own mood.

Nevermind that the researchers never actually measured anyone’s mood.

And nevermind that the study has a fatal flaw. One that other research has also overlooked — making all these researchers’ findings a bit suspect.

Putting aside the ridiculous language used in these kinds of studies (really, emotions spread like a “contagion”?), these kinds of studies often arrive at their findings by conducting language analysis on tiny bits of text. On Twitter, they’re really tiny — at most 140 characters. Facebook status updates are rarely more than a few sentences. The researchers don’t actually measure anybody’s mood.

So how do you conduct such language analysis, especially on 689,003 status updates? Many researchers turn to an automated tool for this, something called the Linguistic Inquiry and Word Count application (LIWC 2007). This software application is described by its authors as:

The first LIWC application was developed as part of an exploratory study of language and disclosure (Francis, 1993; Pennebaker, 1993). As described below, the second version, LIWC2007, is an updated revision of the original application.
Note those dates. Long before social networks were founded, the LIWC was created to analyze large bodies of text — like a book, article, scientific paper, an essay written in an experimental condition, blog entries, or a transcript of a therapy session. Note the one thing all of these share in common — they are of good length, at minimum 400 words.

Why would researchers use a tool not designed for short snippets of text to, well… analyze short snippets of text? Sadly, it’s because this is one of the few tools available that can process large amounts of text fairly quickly.

Who Cares How Long the Text is to Measure?

You might be sitting there scratching your head, wondering why it matters how long the text you’re trying to analyze with this tool is. One sentence, 140 characters, 140 pages… why would length matter?

Length matters because the tool actually isn’t very good at analyzing text in the manner that Twitter and Facebook researchers have tasked it with. When you ask it to analyze positive or negative sentiment of a text, it simply counts negative and positive words within the text under study. For an article, essay or blog entry, this is fine — it’s going to give you a pretty accurate overall summary analysis of the article since most articles are more than 400 or 500 words long.
For a tweet or status update, however, this is a horrible analysis tool to use. That’s because it wasn’t designed to differentiate — and in fact, can’t differentiate — a negation word in a sentence.1

Let’s look at two hypothetical examples of why this is important. Here are two sample tweets (or status updates) that are not uncommon:
    “I am not happy.”
    “I am not having a great day.”
An independent rater or judge would rate these two tweets as negative — they’re clearly expressing a negative emotion. That would be +2 on the negative scale, and 0 on the positive scale.

But the LIWC 2007 tool doesn’t see it that way. Instead, it would rate these two tweets as scoring +2 for positive (because of the words “great” and “happy”) and +2 for negative (because of the word “not” in both texts).

That’s a huge difference if you’re interested in unbiased and accurate data collection and analysis.
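To make the failure concrete, here is a toy word-count scorer in the spirit of the approach described above. The word lists are hypothetical stand-ins (the real LIWC dictionaries are far larger), but the scoring logic is the same: count matches, ignore negation.

```python
# Toy LIWC-style scorer: counts positive and negative words, with no
# notion of negation. Word lists are illustrative, not the real LIWC ones.
POSITIVE = {"happy", "great", "love", "good"}
NEGATIVE = {"not", "sad", "hate", "bad"}

def naive_scores(text):
    """Return (positive_count, negative_count) by simple word matching."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return (sum(w in POSITIVE for w in words),
            sum(w in NEGATIVE for w in words))

for tweet in ("I am not happy.", "I am not having a great day."):
    print(tweet, naive_scores(tweet))  # each scores 1 positive, 1 negative
```

Both tweets come out as equal parts positive and negative, even though a human rater would call them plainly negative. A negation-aware scorer would have to inspect the words adjacent to each emotion term, which is exactly what the tool doesn't do.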

And since much of human communication includes subtleties such as this — without even delving into sarcasm, short-hand abbreviations that act as negation words, phrases that negate the previous sentence, emojis, etc. — you can’t even tell how accurate or inaccurate the resulting analysis by these researchers is. Since the LIWC 2007 ignores these subtle realities of informal human communication, so do the researchers.2

Perhaps it’s because the researchers have no idea how bad the problem actually is: they’re simply sending all this “big data” into the language analysis engine without actually understanding how the analysis engine is flawed. Is it 10 percent of all tweets that include a negation word? Or 50 percent? Researchers couldn’t tell you.3

Even if True, Research Shows Tiny Real World Effects

Which is why I have to say that even if you believe this research at face value despite this huge methodological problem, you’re still left with research showing ridiculously small correlations that have little to no meaning to ordinary users.

For instance, Kramer et al. (2014) found a 0.07% — that’s not 7 percent, that’s 1/15th of one percent!! — decrease in negative words in people’s status updates when the number of negative posts on their Facebook news feed decreased. Do you know how many words you’d have to read or write before you’ve written one less negative word due to this effect? Probably thousands.
This isn’t an “effect” so much as a statistical blip that has no real-world meaning. The researchers themselves acknowledge as much, noting that their effect sizes were “small (as small as d = 0.001).” They go on to suggest it still matters because “small effects can have large aggregated consequences” citing a Facebook study on political voting motivation by one of the same researchers, and a 22 year old argument from a psychological journal.4

But they contradict themselves in the preceding sentence, suggesting that emotion “is difficult to influence given the range of daily experiences that influence mood.” Which is it? Are Facebook status updates significantly impacting individuals’ emotions, or are emotions not so easily influenced by simply reading other people’s status updates?

Despite all of these problems and limitations, none of it stops the researchers in the end from proclaiming, “These results indicate that emotions expressed by others on Facebook influence our own emotions, constituting experimental evidence for massive-scale contagion via social networks.”5 Again, no matter that they didn’t actually measure a single person’s emotions or mood states, but instead relied on a flawed assessment measure to do so.

What the Facebook researchers clearly show, in my opinion, is that they put too much faith in the tools they’re using without understanding — and discussing — the tools’ significant limitations.6


Kramer, ADI, Guillory, JE, Hancock, JT. (2014). Experimental evidence of massive-scale emotional contagion through social networks. PNAS.

  1. This according to an inquiry to the LIWC developers who replied, “LIWC doesn’t currently look at whether there is a negation term near a positive or negative emotion term word in its scoring and it would be difficult to come up with an effective algorithm for this anyway.” []
  2. I could find no mention of the limitations of the use of the LIWC as a language analysis tool for purposes it was never designed or intended for in the present study, or other studies I’ve examined. []
  3. Well, they could tell you if they actually spent the time validating their method with a pilot study to compare against measuring people’s actual moods. But these researchers failed to do this. []
  4. There are some serious issues with the Facebook voting study, the least of which is attributing changes in voting behavior to one correlational variable, with a long list of assumptions the researchers made (and that you would have to agree with). []
  5. A request for clarification and comment by the authors was not returned. []
  6. This isn’t a dig at the LIWC 2007, which can be an excellent research tool — when used for the right purposes and in the right hands. []

Monday, July 14, 2014

io9: Anti-Obamacare Ads Backfired, Says A New Statistical Analysis by Mark Strauss

Opponents of the Affordable Care Act have spent an estimated $450 million on political ads attacking the law, outspending supporters of Obamacare 15-to-1. But a state-by-state comparison of negative ads and enrollment figures suggests the attack ads actually increased public awareness of the healthcare program.
Niam Yaraghi, a Brookings Institution expert on the economics of healthcare, based his analysis on recently released data (below) that tallies how much money was spent on anti-Obamacare ads in each state.
He then examined Affordable Care Act (ACA) data to determine enrollment ratios. Although more than 8 million Americans have signed up to purchase health insurance through the marketplaces during the first open enrollment period, that number masks the tremendous variation in participation across states. For instance, while the enrollment percentage in Minnesota is slightly above 5%, in Vermont close to 50% of all eligible individuals have signed up for Obamacare.
Yaraghi found that, after controlling for other state characteristics such as the size of the low-income population and average insurance premiums, there was a positive association between anti-ACA spending and enrollment:
This implies that anti-ACA ads may unintentionally increase the public awareness about the existence of a governmentally subsidized service and its benefits for the uninsured. On the other hand, an individual's prediction about the chances of repealing the ACA may be associated with the volume of advertisements against it. In the states where more anti-ACA ads are aired, residents were on average more likely to believe that Congress will repeal the ACA in the near future. People who believe that subsidized health insurance may soon disappear could have a greater willingness to take advantage of this one time opportunity.

Thursday, April 10, 2014

How to read and understand a scientific paper: a guide for non-scientists

Before you begin: some general advice
Reading a scientific paper is a completely different process than reading an article about science in a blog or newspaper. Not only do you read the sections in a different order than they’re presented, but you also have to take notes, read it multiple times, and probably go look up other papers for some of the details. Reading a single paper may take you a very long time at first. Be patient with yourself. The process will go much faster as you gain experience.

Monday, March 24, 2014

Use the "Triple Nod" in Interviews

The triple nod is the nonverbal equivalent of the ellipsis: a cue for someone to keep talking. If you are introverted and aren't great at making conversation, you want to encourage the person you are speaking with to keep talking. When they finish speaking and pause, nod three times in quick succession and they will often continue. If not, you can pick up where the conversation left off, but this is a great way of showing engagement and lengthening a discussion.

Thursday, February 27, 2014

Publishers withdraw more than 120 gibberish papers by Richard Van Noorden

Conference proceedings removed from subscription databases after scientist reveals that they were computer-generated.

The publishers Springer and IEEE are removing more than 120 papers from their subscription services after a French researcher discovered that the works were computer-generated nonsense.

Over the past two years, computer scientist Cyril Labbé of Joseph Fourier University in Grenoble, France, has catalogued computer-generated papers that made it into more than 30 published conference proceedings between 2008 and 2013. Sixteen appeared in publications by Springer, which is headquartered in Heidelberg, Germany, and more than 100 were published by the Institute of Electrical and Electronics Engineers (IEEE), based in New York. Both publishers, which were privately informed by Labbé, say that they are now removing the papers.

Among the works were, for example, a paper published as a proceeding from the 2013 International Conference on Quality, Reliability, Risk, Maintenance, and Safety Engineering, held in Chengdu, China. (The conference website says that all manuscripts are “reviewed for merits and contents”.) The authors of the paper, entitled ‘TIC: a methodology for the construction of e-commerce’, write in the abstract that they “concentrate our efforts on disproving that spreadsheets can be made knowledge-based, empathic, and compact”. (Nature News has attempted to contact the conference organizers and named authors of the paper but received no reply*; however at least some of the names belong to real people. The IEEE has now removed the paper).

*Update: One of the named authors replied to Nature News on 25 February. He said that he first learned of the article when conference organizers notified his university in December 2013; and that he does not know why he was a listed co-author on the paper. "The matter is being looked into by the related investigators," he said. 


How to create a nonsense paper

Labbé developed a way to automatically detect manuscripts composed by a piece of software called SCIgen, which randomly combines strings of words to produce fake computer-science papers. SCIgen was invented in 2005 by researchers at the Massachusetts Institute of Technology (MIT) in Cambridge to prove that conferences would accept meaningless papers — and, as they put it, “to maximize amusement” (see ‘Computer conference welcomes gobbledegook paper’). A related program generates random physics manuscript titles on the satirical website arXiv vs. snarXiv. SCIgen is free to download and use, and it is unclear how many people have done so, or for what purposes. SCIgen’s output has occasionally popped up at conferences, when researchers have submitted nonsense papers and then revealed the trick.

Labbé does not know why the papers were submitted — or even if the authors were aware of them. Most of the conferences took place in China, and most of the fake papers have authors with Chinese affiliations. Labbé has emailed editors and authors named in many of the papers and related conferences but received scant replies; one editor said that he did not work as a program chair at a particular conference, even though he was named as doing so, and another author claimed his paper was submitted on purpose to test out a conference, but did not respond on follow-up. Nature has not heard anything from a few enquiries.

“I wasn’t aware of the scale of the problem, but I knew it definitely happens. We do get occasional e-mails from good citizens letting us know where SCIgen papers show up,” says Jeremy Stribling, who co-wrote SCIgen when he was at MIT and now works at VMware, a software company in Palo Alto, California.
“The papers are quite easy to spot,” says Labbé, who has built a website where users can test whether papers have been created using SCIgen. His detection technique, described in a study1 published in Scientometrics in 2012, involves searching for characteristic vocabulary generated by SCIgen. Shortly before that paper was published, Labbé informed the IEEE of 85 fake papers he had found. Monika Stickel, director of corporate communications at IEEE, says that the publisher “took immediate action to remove the papers” and “refined our processes to prevent papers not meeting our standards from being published in the future”. In December 2013, Labbé informed the IEEE of another batch of apparent SCIgen articles he had found. Last week, those were also taken down, but the web pages for the removed articles give no explanation for their absence.
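The detection idea described above can be sketched in a few lines. Labbé's 2012 paper describes an inter-textual distance over word frequencies; this is a heavily simplified illustration of that idea, not his actual implementation:

```python
from collections import Counter

def distance(text_a, text_b):
    """Normalized word-frequency distance in [0, 1]: 0 means identical
    vocabulary profiles, 1 means no words in common."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    na, nb = sum(a.values()), sum(b.values())
    return sum(abs(a[w] / na - b[w] / nb) for w in set(a) | set(b)) / 2

# A suspect paper whose distance to a corpus of known SCIgen output falls
# below some calibrated threshold would be flagged for human review.
```

Because SCIgen draws from a fixed grammar and vocabulary, its output clusters tightly under a measure like this, which is why the papers are "quite easy to spot".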
Ruth Francis, UK head of communications at Springer, says that the company has contacted editors, and is trying to contact authors, about the issues surrounding the articles that are coming down. The relevant conference proceedings were peer reviewed, she confirms — making it more mystifying that the papers were accepted.
The IEEE would not say, however, whether it had contacted the authors or editors of the suspected SCIgen papers, or whether submissions for the relevant conferences were supposed to be peer reviewed. “We continue to follow strict governance guidelines for evaluating IEEE conferences and publications,” Stickel said.


A long history of fakes

Labbé is no stranger to fake studies. In April 2010, he used SCIgen to generate 102 fake papers by a fictional author called Ike Antkare [see pdf]. Labbé showed how easy it was to add these fake papers to the Google Scholar database, boosting Ike Antkare’s h-index, a measure of published output, to 94 — at the time, making Antkare the world's 21st most highly cited scientist. Last year, researchers at the University of Granada, Spain, added to Labbé’s work, boosting their own citation scores in Google Scholar by uploading six fake papers with long reference lists citing their own previous work2.

Labbé says that the latest discovery is merely one symptom of a “spamming war started at the heart of science” in which researchers feel pressured to rush out papers to publish as much as possible.
There is a long history of journalists and researchers getting spoof papers accepted in conferences or by journals to reveal weaknesses in academic quality controls — from a fake paper published by physicist Alan Sokal of New York University in the journal Social Text in 1996, to a sting operation by US reporter John Bohannon published in Science in 2013, in which he got more than 150 open-access journals to accept a deliberately flawed study for publication.

Labbé emphasizes that the nonsense computer science papers all appeared in subscription offerings. In his view, there is little evidence that open-access publishers — which charge fees to publish manuscripts — necessarily have less stringent peer review than subscription publishers.

Labbé adds that the nonsense papers were easy to detect using his tools, much like the plagiarism checkers that many publishers already employ. But because he could not automatically download all papers from the subscription databases, he cannot be sure that he has spotted every SCIgen-generated paper.

Sunday, January 26, 2014

The changing face of psychology

Psychology is championing important changes to culture and practice, including a greater emphasis on transparency, reliability, and adherence to the scientific method.

After 50 years of stagnation in research practices, psychology is leading reforms that will benefit all life sciences.

In 1959, an American researcher named Ted Sterling reported something disturbing. Of 294 articles published across four major psychology journals, 286 had reported positive results – that is, a staggering 97% of published papers were underpinned by statistically significant effects. Where, he wondered, were all the negative results – the less exciting or less conclusive findings? Sterling labelled this publication bias a form of malpractice. After all, getting published in science should never depend on getting the “right results”.

You might think that Sterling’s discovery would have led the psychologists of 1959 to sit up and take notice. Groups would be assembled to combat the problem, ensuring that the scientific record reflected a balanced sum of the evidence. Journal policies would be changed, incentives realigned.

Sadly, that never happened. Thirty-six years later, in 1995, Sterling took another look at the literature and found exactly the same problem – negative results were still being censored. Fifteen years after that, Daniele Fanelli from the University of Edinburgh confirmed it yet again. Publication bias had turned out to be the ultimate bad car smell, a prime example of how irrational research practices can linger on and on.
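The filtering mechanism Sterling identified is easy to demonstrate with a toy simulation (not from the article; the sample size, effect size, and thresholds below are assumed for illustration). If journals accept only statistically significant results, the "published" literature is almost entirely positive and the published effect sizes are inflated well beyond the true effect:

```python
# Illustrative simulation of publication bias (assumed numbers, not
# from the article): many small studies of a weak true effect, with
# "publication" granted only to results that reach p < .05.
import random
import math

def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    return (ma - mb) / math.sqrt(va / na + vb / nb)

random.seed(1)
n, true_effect, runs = 20, 0.2, 2000
published = []
for _ in range(runs):
    control = [random.gauss(0, 1) for _ in range(n)]
    treated = [random.gauss(true_effect, 1) for _ in range(n)]
    if abs(welch_t(treated, control)) > 2.02:  # ~ p < .05 at ~38 df
        published.append(sum(treated) / n - sum(control) / n)

print(f"studies run: {runs}, 'published': {len(published)}")
print(f"true effect: {true_effect}, mean published effect: "
      f"{sum(published) / len(published):.2f}")
```

Most studies vanish into the file drawer, and the mean effect among the survivors is several times larger than the truth, which is one reason censored literatures resist correction for so long.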

Now, finally, the tide is turning. A growing number of psychologists – particularly the younger generation – are fed up with results that don’t replicate, journals that value story-telling over truth, and an academic culture in which researchers treat data as their personal property. Psychologists are realising that major scientific advances will require us to stamp out malpractice, face our own weaknesses, and overcome the ego-driven ideals that maintain the status quo.
Here are five key developments to watch in 2014.
1. Replication

The problem: The best evidence for a genuine discovery is showing that independent scientists can replicate it using the same method. If it replicates repeatedly then we can use it to build better theories. If it doesn't then it belongs in the trash bin of history. This simple logic underpins all science – without replication we’d still believe in phlogiston and faster-than-light neutrinos.

In psychology, direct replications of previous methods are rarely attempted. Psychologists tend to see such work as boring, lacking in intellectual prowess, and a waste of limited resources. Some of the most prominent psychology journals even have explicit policies against publishing replications, instead offering readers a diet of fast food: results that are novel, eye-catching, and even counter-intuitive. Exciting results are fine provided they replicate. The problem is that nobody bothers to try, which litters the field with results of unknown (and likely low) value.

How it’s changing: The new generation of psychologists understands that independent replication is crucial for real advancement and to earn wider credibility in science. A beautiful example of this drive is the Many Labs project led by Brian Nosek from the University of Virginia. Nosek and a team of 50 colleagues located in 36 labs worldwide sought to replicate 13 key findings in psychology, across a sample of 6,344 participants. Ten of the effects replicated successfully.

Journals are also beginning to respect the importance of replication. The prominent outlet Perspectives on Psychological Science recently launched an initiative that specifically publishes direct replications of previous studies. Meanwhile, journals such as BMC Psychology and PLOS ONE officially disown the requirement for researchers to report novel, positive findings.

2. Open Access 

The problem: Strictly speaking, most psychology research isn’t really “published” – it is printed within journals that expressly deny access to the public (unless you are willing to pay for a personal subscription or spend £30+ on a single article). Some might say this is no different to traditional book publishing, so what's the problem? But remember that the public being denied access to science is the very same public that already funds most psychology research, including the subscription fees for universities. So why, you might ask, is taxpayer-funded research invisible to the taxpayers that funded it? The answer is complicated enough to fill a 140-page government report, but the short version is that the government places the business interests of corporate publishers ahead of the public interest in accessing science.
How it’s changing: The open access movement is growing in size and influence. Since April 2013, all research funded by UK research councils, including psychology, must now be fully open access – freely viewable to the public. Charities such as the Wellcome Trust have similar policies. These moves help alleviate the symptoms of closed access but don’t address the root cause, which is market dominance by traditional subscription publishers. Rather than requiring journals to make articles publicly available, the research councils and charities are merely subsidising those publishers, in some cases paying them extra for open access on top of their existing subscription fees. What other business in society is paid twice for a product that it didn’t produce in the first place? It remains a mystery who, other than the publishers themselves, would call this bizarre set of circumstances a “solution”.

3. Open Science

The problem: Data sharing is crucial for science but rare in psychology. Even though ethical guidelines require authors to share data when requested, such requests are usually ignored or denied, even when they come from other psychologists. Failing to publicly share data makes it harder to do meta-analysis and easier for unscrupulous researchers to get away with fraud. The most serious fraud cases, such as that of Diederik Stapel, would have been caught years earlier if journals required the raw data to be published alongside research articles.

How it’s changing: Data sharing isn’t yet mandatory, but it is gradually becoming unacceptable for psychologists not to share. Evidence shows that studies which share data tend to be more accurate and less likely to make statistical errors. Public repositories such as Figshare and the Open Science Framework now make the act of sharing easy, and new journals including the Journal of Open Psychology Data have been launched specifically to provide authors with a way of publicising data sharing.

Some existing journals are also introducing rewards to encourage data sharing. Since 2014, authors who share data at the journal Psychological Science will earn an Open Data badge, printed at the top of the article. Coordinated data sharing carries all kinds of other benefits too – for instance, it allows future researchers to run meta-analysis on huge volumes of existing data, answering questions that simply can’t be tackled with smaller datasets.

4. Bigger Data

The problem: We’ve known for decades that psychology research is statistically underpowered. What this means is that even when genuine phenomena exist, most experiments don’t have sufficiently large samples to detect them. The curse of low power cuts both ways: not only is an underpowered experiment likely to miss finding water in the desert, it’s also more likely to lead us to a mirage.
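What low power means in practice can be sketched with a short simulation (the effect size and sample sizes below are assumed for illustration, not taken from the article). With a modest true effect, a small sample detects it only a small fraction of the time, while a much larger sample detects it reliably:

```python
# Illustrative power simulation (assumed numbers, not from the article):
# estimate, by repetition, how often a two-group experiment of a given
# size detects a modest true effect at roughly p < .05.
import random
import math

def detects_effect(n, effect, crit=2.0):
    """Run one two-group experiment; True if |t| crosses the rough cutoff."""
    a = [random.gauss(effect, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    ma, mb = sum(a) / n, sum(b) / n
    va = sum((x - ma) ** 2 for x in a) / (n - 1)
    vb = sum((x - mb) ** 2 for x in b) / (n - 1)
    t = (ma - mb) / math.sqrt(va / n + vb / n)
    return abs(t) > crit  # crit = 2.0 is an approximate .05 threshold

random.seed(7)
effect, runs = 0.3, 2000
powers = {}
for n in (15, 150):
    powers[n] = sum(detects_effect(n, effect) for _ in range(runs)) / runs
    print(f"n = {n:3d} per group: estimated power ~ {powers[n]:.2f}")
```

The small sample misses the real effect most of the time, and when it does reach significance it can only have done so by wildly overestimating the effect: both the desert and the mirage.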

How it’s changing: Psychologists are beginning to develop innovative ways to acquire larger samples. An exciting approach is Internet testing, which enables easy data collection from thousands of participants. One recent study managed to replicate 10 major effects in psychology using Amazon’s Mechanical Turk. Psychologists are also starting to work alongside organisations that already collect large amounts of useful data (and no, I don’t mean GCHQ). A great example is collaborative research with online gaming companies. Tom Stafford from the University of Sheffield recently published an extraordinary study of learning patterns in over 850,000 people by working with a game developer.

5. Limiting Researcher "Degrees of Freedom"

The problem: In psychology, discoveries tend to be statistical. This means that to test a particular hypothesis, say, about motor actions, we might measure the difference in reaction times or response accuracy between two experimental conditions. Because the measurements contain noise (or “unexplained variability”), we rely on statistical tests to provide us with a level of certainty in the outcome. This is different to other sciences where discoveries are more black and white, like finding a new rock layer or observing a supernova.

Whenever experiments rely on inferences from statistics, researchers can exploit “degrees of freedom” in the analyses to produce desirable outcomes. This might involve trying different ways of removing statistical outliers, or trying different statistical models, and then only reporting the approach that “worked” best in producing attractive results. Just as buying all the tickets in a raffle guarantees a win, exploiting researcher degrees of freedom can guarantee a false discovery.

The reason we fall into this trap is because of incentives and human nature. As Sterling showed in 1959, psychology journals select which studies to publish not based on the methods but on the results: getting published in the most prominent, career-making journals requires researchers to obtain novel, positive, statistically significant effects. And because statistical significance is an arbitrary threshold (p<.05), researchers have every incentive to tweak their analyses until the results cross the line. These behaviours are common in psychology – a recent survey led by Leslie John from Harvard University estimated that at least 60% of psychologists selectively report analyses that “work”. In many cases such behaviour may even be unconscious.
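The raffle analogy can be made concrete with a toy simulation (purely illustrative; the outlier cutoffs and sample size are assumptions, not from the survey cited). The data here contain no real effect at all, yet trying several outlier-removal rules and keeping whichever one "works" inflates the false-positive rate past the nominal 5%:

```python
# Illustrative sketch of researcher "degrees of freedom" (assumed
# numbers): both groups are pure noise, but trying four outlier
# cutoffs and reporting the best result inflates false positives.
import random
import math

def welch_t(a, b):
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    return (ma - mb) / math.sqrt(va / na + vb / nb)

def trim(xs, cutoff):
    """Drop 'outliers' beyond the cutoff (a discretionary analysis choice)."""
    return [x for x in xs if abs(x) < cutoff] or xs

random.seed(3)
n, runs, crit = 30, 2000, 2.0          # crit ~ |t| threshold for p < .05
honest = hacked = 0
for _ in range(runs):
    a = [random.gauss(0, 1) for _ in range(n)]   # no true effect anywhere
    b = [random.gauss(0, 1) for _ in range(n)]
    if abs(welch_t(a, b)) > crit:
        honest += 1
    # try several cutoffs (math.inf = no trimming) and keep any that "works"
    if any(abs(welch_t(trim(a, c), trim(b, c))) > crit
           for c in (1.5, 2.0, 2.5, math.inf)):
        hacked += 1

print(f"false-positive rate, single pre-chosen analysis: {honest / runs:.3f}")
print(f"false-positive rate, best of four analyses     : {hacked / runs:.3f}")
```

A single pre-specified analysis stays near the nominal 5%; cherry-picking among even four mild variants pushes well past it, without the researcher ever fabricating a single data point.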

How it’s changing: The best cure for researcher degrees of freedom is to pre-register the predictions and planned analyses of experiments before looking at the data. This approach is standard practice in medicine because it helps prevent the desires of the researcher from influencing the outcome. Among the basic life sciences, psychology is now leading the way in advancing pre-registration. The journals Cortex, Attention, Perception, & Psychophysics, AIMS Neuroscience and Experimental Psychology offer pre-registered articles in which peer review happens before experiments are conducted. Not only does pre-registration put the reins on researcher degrees of freedom, it also prevents journals from selecting which papers to publish based on the results.

Journals aren’t the only organisations embracing pre-registration. The Open Science Framework invites psychologists to publish their protocols, and the 2013 Declaration of Helsinki now requires public pre-registration of all human research “before recruitment of the first subject”.

Friday, January 10, 2014

Feynman on Scientific Method

Now I'm going to discuss how we would look for a new law. In general, we look for a new law by the following process: first, we guess it, no, don’t laugh, that’s the truth. Then we compute the consequences of the guess, to see what, if this is right, if this law we guessed is right, to see what it would imply and then we compare the computation results to nature or we say compare to experiment or experience, compare it directly with observations to see if it works.

If it disagrees with experiment, it’s wrong! In that simple statement is the key to science. It doesn’t make any difference how beautiful your guess is, it doesn’t make a difference how smart you are, who made the guess, or what his name is… If it disagrees with experiment, it’s wrong. That’s all there is to it.

It is therefore not unscientific to make a guess, although many people who are not in science think it is. For instance, I had a conversation about flying saucers, some years ago, with a layman — because I am scientific I know all about flying saucers! I said “I don’t think there are flying saucers”. So the antagonist said, “Is it impossible that there are flying saucers? Can you prove that it’s impossible?” “No, I can’t prove it’s impossible. It’s just very unlikely”. At that he said, “You are very unscientific. If you can’t prove it impossible then how can you say that it’s unlikely?” But that is the way that is scientific. It is scientific only to say what is more likely and what less likely, and not to be proving all the time the possible and impossible. To define what I mean, I might have said to him, "Listen, I mean that from my knowledge of the world that I see around me, I think that it is much more likely that the reports of flying saucers are the results of the known irrational characteristics of terrestrial intelligence than of the unknown rational efforts of extra-terrestrial intelligence." It is just more likely. That is all, and it is a very good guess. And we always try to guess the most likely explanation, keeping in the back of our minds the fact that if it does not work, then we must discuss the other possibilities.

There was, for instance, for a while, a phenomenon called super-conductivity, there still is the phenomenon, which is that metals conduct electricity without resistance at low temperatures and it was not at first obvious that this was a consequence of the known laws with these particles. Now that it has been thought through carefully enough, it is seen in fact to be fully explainable in terms of our present knowledge.

There are other phenomena, such as extra-sensory perception, which cannot be explained by our knowledge of physics here. However, that phenomenon has not been well established, and we cannot guarantee that it is there. If it could be demonstrated, of course, that would prove that physics is incomplete, and it is therefore extremely interesting to physicists whether it is right or wrong. Many, many experiments exist which show that it doesn't work. The same goes for astrological influences. If it were true that the stars could affect the day that it was good to go to the dentist - in America we have that kind of astrology - then the physics theory would be wrong, because there is no mechanism understandable in principle from these things that would make it go. That is the reason that there is some scepticism among scientists with regard to those ideas.

Now you see of course that with this method we can disprove any definite theory. We have a definite theory, a real guess, from which you can clearly compute consequences which could be compared to experiment and in principle we can get rid of any theory. You can always prove any definite theory wrong. Notice however that we never prove it right.

Suppose you invent a good guess, calculate the consequences, and discover every time that the consequences you have calculated agree with experiment. The theory is then right? No, it is simply not proved wrong.

Another thing I must point out is that you cannot prove a vague theory wrong. If the guess that you make is poorly expressed and rather vague, and the method that you use for figuring out the consequences is a little vague, you are not sure, and you say, “I think everything’s right because it’s all due to so and so, and such and such do this and that more or less, and I can sort of explain how this works...”, then you see that this theory is good, because it cannot be proved wrong! Also, if the process of computing the consequences is indefinite, then with a little skill any experimental result can be made to look like the expected consequences. You are probably familiar with that in other fields. ‘A’ hates his mother. The reason is, of course, because she did not caress him or love him enough when he was a child. But if you investigate you find out that as a matter of fact she did love him very much, and everything was all right. Well then, it was because she was overindulgent when he was a child! By having a vague theory it is possible to get either result. The cure for this one is the following: if it were possible to state exactly, ahead of time, how much love is not enough, and how much love is over-indulgent, then there would be a perfectly legitimate theory against which you could make tests. It is usually said, when this is pointed out, that when you are dealing with psychological matters things can’t be defined so precisely. Yes, but then you cannot claim to know anything about it.

Leading Questions - Yes Prime Minister

Bernard Woolley: He's going to say something new and radical in the broadcast.

Sir Humphrey: What, that silly Grand Design? Bernard, that was precisely what you had to avoid! How did this come about? I shall need a very good explanation.

Bernard Woolley: Well, he's very keen on it.

Sir Humphrey: What's that got to do with it? Things don't happen just because Prime Ministers are very keen on them! Neville Chamberlain was very keen on peace.

Bernard Woolley: He thinks ... he thinks it’s a vote winner.

Sir Humphrey: Ah, that’s more serious. Sit down. What makes him think that?

Bernard Woolley: Well the party have had an opinion poll done and it seems all the voters are in favour of bringing back National Service.

Sir Humphrey: Well have another opinion poll done to show that they’re against bringing back National Service.

Bernard Woolley: They can’t be for and against.

Sir Humphrey: Oh, of course they can Bernard! Have you ever been surveyed?

Bernard Woolley: Yes, well not me actually, my house … Oh, I see what you mean.

Sir Humphrey: You know what happens: nice young lady comes up to you. Obviously you want to create a good impression, you don’t want to look a fool, do you?

Bernard Woolley: No

Sir Humphrey: So she starts asking you some questions: Mr. Woolley, are you worried about the number of young people without jobs?

Bernard Woolley: Yes

Sir Humphrey Appleby: Are you worried about the rise in crime among teenagers?

Bernard Woolley: Yes.

Sir Humphrey Appleby: Do you think there is lack of discipline in our Comprehensive Schools?

Bernard Woolley: Yes.

Sir Humphrey Appleby: Do you think young people welcome some authority and leadership in their lives?

Bernard Woolley: Yes.

Sir Humphrey Appleby: Do you think they respond to a challenge?

Bernard Woolley: Yes.

Sir Humphrey Appleby: Would you be in favour of reintroducing National Service?

Bernard Woolley: Oh, well I suppose I might.

Sir Humphrey Appleby: Yes or no?

Bernard Woolley: Yes.

Sir Humphrey: Of course you would, Bernard. After all, you could hardly say no after those questions. So they don’t mention the first five questions and they publish the last one.
Bernard Woolley: Is that really what they do?
Sir Humphrey: Well, not the reputable ones, no, but there aren’t many of those. So alternatively the young lady can get the opposite result.
Bernard Woolley: How?

Sir Humphrey Appleby: Mr. Woolley, are you worried about the danger of war?

Bernard Woolley: Yes.

Sir Humphrey Appleby: Are you worried about the growth of armaments?

Bernard Woolley: Yes.

Sir Humphrey Appleby: Do you think there's a danger in giving young people guns and teaching them how to kill?

Bernard Woolley: Yes.

Sir Humphrey Appleby: Do you think it's wrong to force people to take arms against their will?

Bernard Woolley: Yes.

Sir Humphrey Appleby: Would you oppose the reintroduction of National Service?

Bernard Woolley: Yes.

Sir Humphrey Appleby: There you are, you see, Bernard. The perfect balanced sample.