Tuesday, November 26, 2013

Twenty tips for interpreting scientific claims by William J. Sutherland, David Spiegelhalter & Mark Burgman


Differences and chance cause variation. The real world varies unpredictably. Science is mostly about discovering what causes the patterns we see. Why is it hotter this decade than last? Why are there more birds in some areas than others? There are many explanations for such trends, so the main challenge of research is teasing apart the importance of the process of interest (for example, the effect of climate change on bird populations) from the innumerable other sources of variation (from widespread changes, such as agricultural intensification and spread of invasive species, to local-scale processes, such as the chance events that determine births and deaths).
No measurement is exact. Practically all measurements have some error. If the measurement process were repeated, one might record a different result. In some cases, the measurement error might be large compared with real differences. Thus, if you are told that the economy grew by 0.13% last month, there is a moderate chance that it may actually have shrunk. Results should be presented with a precision that is appropriate for the associated error, to avoid implying an unjustified degree of accuracy.

Bias is rife. Experimental design or measuring devices may produce atypical results in a given direction. For example, determining voting behaviour by asking people on the street, at home or through the Internet will sample different proportions of the population, and all may give different results. Because studies that report 'statistically significant' results are more likely to be written up and published, the scientific literature tends to give an exaggerated picture of the magnitude of problems or the effectiveness of solutions. An experiment might be biased by expectations: participants provided with a treatment might assume that they will experience a difference and so might behave differently or report an effect. Researchers collecting the results can be influenced by knowing who received treatment. The ideal experiment is double-blind: neither the participants nor those collecting the data know who received what. This might be straightforward in drug trials, but it is impossible for many social studies. Confirmation bias arises when scientists find evidence for a favoured theory and then become insufficiently critical of their own results, or cease searching for contrary evidence.

Bigger is usually better for sample size. The average taken from a large number of observations will usually be more informative than the average taken from a smaller number of observations. That is, as we accumulate evidence, our knowledge improves. This is especially important when studies are clouded by substantial amounts of natural variation and measurement error. Thus, the effectiveness of a drug treatment will vary naturally between subjects. Its average efficacy can be more reliably and accurately estimated from a trial with tens of thousands of participants than from one with hundreds.
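
One way to see why the larger trial helps: under the textbook assumption of independent observations that share a common standard deviation, the uncertainty in an average (its standard error) shrinks with the square root of the number of observations. The sketch below is illustrative only; the spread of 10 units is an invented figure, not taken from any trial.

```python
import math

def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean of n independent observations."""
    return sd / math.sqrt(n)

# Hypothetical spread of individual responses (assumed value, for illustration).
sd = 10.0
for n in (100, 10_000):
    print(f"n = {n:>6}: standard error = {standard_error(sd, n):.2f}")
```

Under these assumptions, a hundredfold increase in participants narrows the uncertainty in the average about tenfold.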

Correlation does not imply causation. It is tempting to assume that one pattern causes another. However, the correlation might be coincidental, or it might be a result of both patterns being caused by a third factor, a 'confounding' or 'lurking' variable. For example, ecologists at one time believed that poisonous algae were killing fish in estuaries; it turned out that the algae grew where fish died. The algae did not cause the deaths [2].

Regression to the mean can mislead. Extreme patterns in data are likely to be, at least in part, anomalies attributable to chance or error. The next count is likely to be less extreme. For example, if speed cameras are placed where there has been a spate of accidents, any reduction in the accident rate cannot be attributed to the camera; a reduction would probably have happened anyway.
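
The speed-camera example can be mimicked with a small simulation. This is a sketch under invented assumptions: 1,000 hypothetical sites that all share the same underlying accident risk, observed over two periods with no intervention at all. The site counts, the risk of 0.05 and the function names are made up for illustration.

```python
import random

def simulate_counts(n_sites: int, n_trials: int, p: float, rng: random.Random) -> list[int]:
    """Accident count per site; every site has the same underlying risk p."""
    return [sum(rng.random() < p for _ in range(n_trials)) for _ in range(n_sites)]

def mean(values: list[int]) -> float:
    return sum(values) / len(values)

rng = random.Random(0)  # fixed seed so the run is repeatable
before = simulate_counts(1000, 100, 0.05, rng)
after = simulate_counts(1000, 100, 0.05, rng)  # nothing changed between periods

# Pick the 50 sites with the worst first-period record, as if siting cameras there.
worst = sorted(range(len(before)), key=lambda i: before[i], reverse=True)[:50]
worst_before = mean([before[i] for i in worst])
worst_after = mean([after[i] for i in worst])

print(f"worst sites, first period:  {worst_before:.1f}")
print(f"same sites, second period:  {worst_after:.1f}")
```

Because the sites were chosen for their extreme first-period counts, their second-period average falls back towards the overall average even though no camera was installed in the simulation.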

Extrapolating beyond the data is risky. Patterns found within a given range do not necessarily apply outside that range. Thus, it is very difficult to predict the response of ecological systems to climate change, when the rate of change is faster than has been experienced in the evolutionary history of existing species, and when the weather extremes may be entirely new.

Beware the base-rate fallacy. The ability of an imperfect test to identify a condition depends upon the likelihood of that condition occurring (the base rate). For example, a person might have a blood test that is '99% accurate' for a rare disease and test positive, yet they might be unlikely to have the disease. If 10,001 people have the test, of whom just one has the disease, that person will almost certainly have a positive test, but so too will a further 100 people (1%) even though they do not have the disease. This type of calculation is valuable when considering any screening procedure, say for terrorists at airports.
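
The blood-test arithmetic above can be written out as a short calculation. This sketch assumes that '99% accurate' means both that 99% of people with the disease test positive and that 1% of healthy people test positive; the source does not spell this out, and the function name is hypothetical.

```python
def chance_of_disease_given_positive(
    n_sick: int, n_healthy: int, sensitivity: float, false_positive_rate: float
) -> float:
    """Share of positive tests that belong to people who really have the disease."""
    true_positives = n_sick * sensitivity
    false_positives = n_healthy * false_positive_rate
    return true_positives / (true_positives + false_positives)

# 10,001 people tested: 1 has the disease, 10,000 do not (figures from the example).
p = chance_of_disease_given_positive(1, 10_000, 0.99, 0.01)
print(f"Chance that a positive result is a true case: {p:.1%}")
```

With one true case against roughly 100 false alarms, a positive result corresponds to the disease only about 1% of the time.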

Controls are important. A control group is dealt with in exactly the same way as the experimental group, except that the treatment is not applied. Without a control, it is difficult to determine whether a given treatment really had an effect. The control helps researchers to be reasonably sure that there are no confounding variables affecting the results. Sometimes people in trials report positive outcomes because of the context or the person providing the treatment, or even the colour of a tablet [3]. This underlies the importance of comparing outcomes with a control, such as a tablet without the active ingredient (a placebo).

Randomization avoids bias. Experiments should, wherever possible, allocate individuals or groups to interventions randomly. Comparing the educational achievement of children whose parents adopt a health programme with that of children of parents who do not is likely to suffer from bias (for example, better-educated families might be more likely to join the programme). A well-designed experiment would randomly select some parents to receive the programme while others do not.

Seek replication, not pseudoreplication. Results consistent across many studies, replicated on independent populations, are more likely to be solid. The results of several such experiments may be combined in a systematic review or a meta-analysis to provide an overarching view of the topic with potentially much greater statistical power than any of the individual studies. Applying an intervention to several individuals in a group, say to a class of children, might be misleading because the children will have many features in common other than the intervention. The researchers might make the mistake of 'pseudoreplication' if they generalize from these children to a wider population that does not share the same commonalities. Pseudoreplication leads to unwarranted faith in the results. Pseudoreplication of studies on the abundance of cod in the Grand Banks in Newfoundland, Canada, for example, contributed to the collapse of what was once the largest cod fishery in the world [4].

Scientists are human. Scientists have a vested interest in promoting their work, often for status and further research funding, although sometimes for direct financial gain. This can lead to selective reporting of results and occasionally, exaggeration. Peer review is not infallible: journal editors might favour positive findings and newsworthiness. Multiple, independent sources of evidence and replication are much more convincing.

Significance is significant. Expressed as P, statistical significance is a measure of how likely a result is to occur by chance. Thus P = 0.01 means there is a 1-in-100 probability that what looks like an effect of the treatment could have occurred randomly, and in truth there was no effect at all. Typically, scientists report results as significant when the P-value of the test is less than 0.05 (1 in 20).

Separate no effect from non-significance. The lack of a statistically significant result (say a P-value > 0.05) does not mean that there was no underlying effect: it means that no effect was detected. A small study may not have the power to detect a real difference. For example, tests of cotton and potato crops that were genetically modified to produce a toxin to protect them from damaging insects suggested that there were no adverse effects on beneficial insects such as pollinators. Yet none of the experiments had large enough sample sizes to detect impacts on beneficial species had there been any [5].

Effect size matters. Small responses are less likely to be detected. A study with many replicates might result in a statistically significant result but have a small effect size (and so, perhaps, be unimportant). The importance of an effect size is a biological, physical or social question, and not a statistical one. In the 1990s, the editor of the US journal Epidemiology asked authors to stop using statistical significance in submitted manuscripts because authors were routinely misinterpreting the meaning of significance tests, resulting in ineffective or misguided recommendations for public-health policy [6].

Study relevance limits generalizations. The relevance of a study depends on how much the conditions under which it is done resemble the conditions of the issue under consideration. For example, there are limits to the generalizations that one can make from animal or laboratory experiments to humans.

Feelings influence risk perception. Broadly, risk can be thought of as the likelihood of an event occurring in some time frame, multiplied by the consequences should the event occur. People's risk perception is influenced disproportionately by many things, including the rarity of the event, how much control they believe they have, the adverseness of the outcomes, and whether the risk is voluntary or not. For example, people in the United States underestimate the risks associated with having a handgun at home by 100-fold, and overestimate the risks of living close to a nuclear reactor by 10-fold [7].

Dependencies change the risks. It is possible to calculate the consequences of individual events, such as an extreme tide, heavy rainfall and key workers being absent. However, if the events are interrelated (for example, a storm causes a high tide, or heavy rain prevents workers from accessing the site), then the probability of their co-occurrence is much higher than might be expected [8]. The assurance by credit-rating agencies that groups of subprime mortgages had an exceedingly low risk of defaulting together was a major element in the 2008 collapse of the credit markets.
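
A small calculation shows how much a dependency can matter. Every number here is invented for illustration (a 1% chance of an extreme tide, a 1% chance of key workers being absent, and a 50% chance of absence once the storm that raises the tide has hit); none comes from the article, and the function name is hypothetical.

```python
def joint_probability(p_first: float, p_second_given_first: float) -> float:
    """P(first and second) = P(first) * P(second | first)."""
    return p_first * p_second_given_first

# Hypothetical figures, chosen only to illustrate the point.
p_tide = 0.01                # extreme tide on a given day
p_absent = 0.01              # key workers absent on a given day
p_absent_given_tide = 0.5    # absence is far likelier when the same storm strikes

independent = joint_probability(p_tide, p_absent)
dependent = joint_probability(p_tide, p_absent_given_tide)

print(f"if unrelated: {independent:.4%}")
print(f"if linked:    {dependent:.4%}")
print(f"ratio:        {dependent / independent:.0f}x")
```

Treating the two events as unrelated would, with these made-up figures, understate the chance of both happening together by a factor of about 50.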

Data can be dredged or cherry picked. Evidence can be arranged to support one point of view. To interpret an apparent association between consumption of yoghurt during pregnancy and subsequent asthma in offspring [9], one would need to know whether the authors set out to test this sole hypothesis, or happened across this finding in a huge data set. By contrast, the evidence for the Higgs boson specifically accounted for how hard researchers had to look for it: the 'look-elsewhere effect'. The question to ask is: 'What am I not being told?'
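
To see why it matters how many hypotheses were examined, consider a simplified case: several independent tests, none of which has any real effect behind it, each judged against the conventional 0.05 threshold. The independence assumption and the function name are mine, not the authors'; real data sets are usually messier.

```python
def chance_of_false_alarm(n_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one of n independent null tests looks 'significant'."""
    return 1 - (1 - alpha) ** n_tests

for n in (1, 20, 100):
    print(f"{n:>3} tests: {chance_of_false_alarm(n):.1%} chance of at least one false alarm")
```

Under these assumptions, a researcher who tries 20 unrelated comparisons has roughly a 64% chance of finding at least one 'significant' result by chance alone.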

Extreme measurements may mislead. Any collation of measures (the effectiveness of a given school, say) will show variability owing to differences in innate ability (teacher competence), plus sampling (children might by chance be an atypical sample with complications), plus bias (the school might be in an area where people are unusually unhealthy), plus measurement error (outcomes might be measured in different ways for different schools). However, the resulting variation is typically interpreted only as differences in innate ability, ignoring the other sources. This becomes problematic with statements describing an extreme outcome ('the pass rate doubled') or comparing the magnitude of the extreme with the mean ('the pass rate in school x is three times the national average') or the range ('there is an x-fold difference between the highest- and lowest-performing schools'). League tables, in particular, are rarely reliable summaries of performance.
Nature 503, 335–337 (2013)


  1. Doubleday, R. & Wilsdon, J. Nature 485, 301–302 (2012).
  2. Borsuk, M. E., Stow, C. A. & Reckhow, K. H. J. Water Res. Plan. Manage. 129, 271–282 (2003).
  3. Huskisson, E. C. Br. Med. J. 4, 196–200 (1974).
  4. Millar, R. B. & Anderson, M. J. Fish. Res. 70, 397–407 (2004).
  5. Marvier, M. Ecol. Appl. 12, 1119–1124 (2002).
  6. Fidler, F., Cumming, G., Burgman, M. & Thomason, N. J. Socio-Economics 33, 615–630 (2004).
  7. Fischhoff, B., Slovic, P. & Lichtenstein, S. Am. Stat. 36, 240–255 (1982).
  8. Billinton, R. & Allan, R. N. Reliability Evaluation of Power Systems (Plenum, 1984).
  9. Maslova, E., Halldorsson, T. I., Strøm, M. & Olsen, S. F. J. Nutr. Sci. 1, e5 (2012).

Friday, November 1, 2013

Inmates Program Logistics App For Prison


schweini writes "Inmates in an Oklahoma prison developed software that attempts to streamline the prison's food logistics. A state representative found out, and he's trying to get every other prison in Oklahoma to use it, too. According to the Washington Post, 'The program tracks inmates as they proceed through food lines, to make sure they don’t go through the lines twice... It can help the prison track how popular a particular meal is, so purchasers know how much food to buy in the future. And it can track tools an inmate checks out to perform their jobs.' The program also tracks supply shipments into the system, and it showed that food supplier Sysco had been charging different prices for the same food depending on which facility it was going to. Another state representative was impressed, but realized the need for oversight: 'If they build on what they’ve done here, they actually have to script it out. If you have inmates writing code, there has to be a continual auditing process. Food in prison is a commodity. It’s currency.'"

Wednesday, October 30, 2013

You can’t read just one: Reproducibility and multiple sources By Bonnie Swoger


There are lots of ways to mess with the heads of undergraduate students. Giving them a research assignment and failing to specify a minimum number of references needed is just one example.
“Include as many sources as you need to make your point and illustrate your thesis.”
For students, finding one scholarly article on their topic often seems to be enough. Researchers did an experiment, got some results, and answered the research question the student started with. All done, all set, time for dinner.
But science doesn’t work that way. One experiment may suggest something interesting, but it doesn’t prove anything. In fact, it is quite easy to point to many examples of intriguing scientific studies that were either proved false or that couldn’t be reproduced later on. Scientific ideas that are true should be reproducible: other researchers should be able to repeat the experiments and get similar results or use other methods to arrive at the same conclusions. You can’t say that you discovered something new if someone else can’t reproduce your result.

This fundamental scientific idea, reproducibility, may be in crisis. A recent article by Vasilevsky et al. in the journal PeerJ suggested that many scientific journal articles don’t provide the information that other scientists would need in order to replicate their results. Key information about chemicals, reactants or model organisms is often missing, despite journal requirements to include such information (Vasilevsky et al., 2013). And a recent item in The Economist suggests that this might not matter that much. The emphasis placed on new research (by funding agencies and tenure and promotion committees) means that few scientists even attempt to replicate the work of others (“Unreliable research: Trouble at the lab,” 2013).

All of this means trouble from the very beginning of a research project, before an experiment is even designed, when scientists start to do background research on their topics. In the same way that experimental scientists can’t rely on the results of just one experiment to prove something, relying on just one information source for knowledge is a sure way to end up with unreliable information. Journalists look for corroborating sources, Wikipedia flags articles that need a wider variety of citations, and scholars need to find multiple scholarly articles to support their ideas.

Some innovative people, companies and publishers are trying to sort this mess out. A collaboration between PLOS ONE, Mendeley, Figshare and the Science Exchange will be attempting to replicate the results of selected projects as a part of the Reproducibility Initiative. The Reproducibility Project is a crowdsourced effort to evaluate the reproducibility of experimental results in psychology. And the Reproducible Science project aims to make the results of computational experiments reproducible by ensuring the sharing of code and data and by making that information available to reviewers who can test the results described in a manuscript they are reviewing.
Unfortunately, these innovative programs are just a drop in the bucket of modern science. Funding agencies, publishers and tenure and promotion committees still value original work more highly than verification work. Scientists who concentrated on replicating the work of others would risk their careers.
As a result it is important for students and scholars to be aware of the challenges facing the reproducibility of science. We teach students in introductory science classes that reproducibility is one of the hallmarks of science. As they learn more about their disciplines, they need to be aware of the practical challenges involved in reproducing the work of others, and the importance of finding multiple sources about a topic needs to be emphasized.

As a librarian, part of my job is to help students find additional sources related to their research topics, even if there isn’t a published reproduction of an original source. This isn’t about which database to use or whether to put quotes around a phrase. It is about getting them to think critically about their topics. For example, while there might not be a second study that repeated the experiment of the first, students can look for:
  • Studies that examined the same topic in a different way
  • Studies that used the same methodology on a different species, geographic area, etc.
  • Background studies on individual aspects of their research question, including the statistical analyses used
  • Studies that cite the original study (even if no one has tried to reproduce the results, other scholars might express doubts about their conclusions when they cite the original).
The issues surrounding reproducibility in science won’t be solved overnight, and it will take a concerted effort from scientists at all levels of the modern scientific enterprise to steer this very big ship. In the meantime, students and scholars can make special efforts to ensure that they are using the highest quality information available as the basis of their original studies.

Works Cited:
“Unreliable research: Trouble at the lab.” (2013, October 19). The Economist, 409(8858), 26-30.
Vasilevsky, N. A., Brush, M. H., Paddock, H., Ponting, L., Tripathy, S. J., Larocca, G. M., & Haendel, M. A. (2013). On the reproducibility of science: unique identification of research resources in the biomedical literature. PeerJ, 1, e148. doi:10.7717/peerj.148.

Sunday, October 27, 2013

Ten Steps You Can Take Right Now Against Internet Surveillance

by Danny O'Brien - Electronic Frontier Foundation


One of the trends we've seen is how, as the word of the NSA's spying has spread, more and more ordinary people want to know how (or if) they can defend themselves from surveillance online. But where to start?
The bad news is: if you're being personally targeted by a powerful intelligence agency like the NSA, it's very, very difficult to defend yourself. The good news, if you can call it that, is that much of what the NSA is doing is mass surveillance on everybody. With a few small steps, you can make that kind of surveillance a lot more difficult and expensive, both against you individually, and more generally against everyone.
Here are ten steps you can take to make your own devices secure. This isn't a complete list, and it won't make you completely safe from spying. But every step you take will make you a little bit safer than average. And it will make your attackers, whether they're the NSA or a local criminal, have to work that much harder.
  1. Use end-to-end encryption. We know the NSA has been working to undermine encryption, but experts like Bruce Schneier who have seen the NSA documents feel that encryption is still "your friend". And your best friends remain open source systems that don't share your secret key with others, are open to examination by security experts, and encrypt data all the way from one end of a conversation to the other: from your device to the person you're chatting with. The easiest tool that achieves this end-to-end encryption is off-the-record (OTR) messaging, which gives instant messaging clients end-to-end encryption capabilities (and you can use it over existing services, such as Google Hangout and Facebook chat). Install it on your own computers, and get your friends to install it too. When you've done that, look into PGP: it's tricky to use, but used well it'll stop your email from being an open book to snoopers.
  2. Encrypt as much of your communications as you can. Even if you can't do end-to-end, you can still encrypt a lot of your Internet traffic. If you use EFF's HTTPS Everywhere browser addon for Chrome or Firefox, you can maximise the amount of web data you protect by forcing websites to encrypt webpages whenever possible. Use a virtual private network (VPN) when you're on a network you don't trust, like a cybercafe.
  3. Encrypt your hard drive. The latest versions of Windows, Macs, iOS and Android all have ways to encrypt your local storage. Turn it on. Without it, anyone with a few minutes' physical access to your computer, tablet or smartphone can copy its contents, even if they don't have your password.
  4. Strong passwords, kept safe. Passwords these days have to be ridiculously long to be safe against crackers. That includes the password to email accounts, and passwords to unlock devices, and passwords to web services. If it's bad to re-use passwords, and bad to use short passwords, how can you remember them all? Use a password manager. Even writing down your passwords and keeping them in your wallet is safer than re-using the same short memorable password; at least you'll know when your wallet is stolen. You can create a memorable strong master password using a random word system like that described at diceware.com.
  5. Use Tor. "Tor Stinks", this slide leaked from GCHQ says. That shows how much the intelligence services are worried about it. Tor is an open source program that protects your anonymity online by shuffling your data through a global network of volunteer servers. If you install and use Tor, you can hide your origins from corporate and mass surveillance. You'll also be showing that Tor is used by everyone, not just the "terrorists" that GCHQ claims.
  6. Turn on two-factor (or two-step) authentication. Google and Gmail have it; Twitter has it; Dropbox has it. Two-factor authentication, where you type a password and a regularly changed confirmation number, helps protect you from attacks on web and cloud services. When available, turn it on for the services you use. If it's not available, tell the company you want it.
  7. Don't click on attachments. The easiest ways to get intrusive malware onto your computer are through your email or through compromised websites. Browsers are getting better at protecting you from the worst of the web, but files sent by email or downloaded from the Net can still take complete control of your computer. Get your friends to send you information in text; when they send you a file, double-check it's really from them.
  8. Keep software updated, and use anti-virus software. The NSA may be attempting to compromise Internet companies (and we're still waiting to see whether anti-virus companies deliberately ignore government malware), but on balance, it's still better to have the companies trying to fix your software than have attackers be able to exploit old bugs.
  9. Keep extra secret information extra secure. Think about the data you have, and take extra steps to encrypt and conceal your most private data. You can use TrueCrypt to separately encrypt a USB flash drive. You might even want to keep your most private data on a cheap netbook, kept offline and only used for the purposes of reading or editing documents.
  10. Be an ally. If you understand and care enough to have read this far, we need your help. To really challenge the surveillance state, you need to teach others what you've learned, and explain to them why it's important. Install OTR, Tor and other software for worried colleagues, and teach your friends how to use them. Explain to them the impact of the NSA revelations. Ask them to sign up to Stop Watching Us and other campaigns against bulk spying. Run a Tor node, or hold a cryptoparty. They need to stop watching us; and we need to start making it much harder for them to get away with it.
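
The random-word approach mentioned in step 4 can be sketched in a few lines. This is only an illustration, not the official Diceware procedure (which uses physical dice): it draws words with Python's `secrets` module, and the eight-word `DEMO_WORDS` list and both function names are placeholders invented here. A real Diceware list has 7,776 words, one for each outcome of five dice rolls.

```python
import math
import secrets

def make_passphrase(wordlist: list[str], n_words: int = 6) -> str:
    """Join n_words picked uniformly at random using a cryptographic RNG."""
    return " ".join(secrets.choice(wordlist) for _ in range(n_words))

def entropy_bits(wordlist_size: int, n_words: int) -> float:
    """Bits of entropy when each word is chosen independently and uniformly."""
    return n_words * math.log2(wordlist_size)

# Placeholder list for illustration only; far too short for real use.
DEMO_WORDS = ["apple", "brick", "cloud", "drift", "ember", "flint", "grove", "haste"]

print(make_passphrase(DEMO_WORDS))
print(f"{entropy_bits(7776, 6):.1f} bits with six words from a 7,776-word list")
```

The strength comes from the size of the list and the number of words rather than from the secrecy of the method: each word drawn from a 7,776-word list adds about 12.9 bits.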

Tuesday, May 7, 2013

Why is this pioneering researcher fighting to have his work retracted?

When Rutgers evolutionary biologist Robert Trivers began to suspect that his co-author, William Brown, had faked the data on a widely circulated study, he was placed in the unenviable position of bringing the fraud to light. It has not been easy.
Trivers first began to suspect that something was amiss in 2007, two years after the study – which found Jamaican teens with a high degree of body symmetry were more likely to be rated "good dancers" by their peers – had been featured on the cover of Nature. As Nature News' Eugenie Samuel Reich reports, Trivers has been fighting since 2008 to have the results withdrawn from the scientific literature, at the occasional expense of his reputation: 

In seeking a retraction, Trivers self-published The Anatomy of a Fraud, a small book detailing what he saw as evidence of data fabrication. Later, Trivers had a verbal altercation over the matter with a close colleague and was temporarily banned from campus.
An investigation of the case, completed by Rutgers and released publicly last month, now seems to validate Trivers’ allegations. Brown disputes the university’s finding, but it could help to clear the controversy that has clouded Trivers’ reputation as the author of several pioneering papers in the 1970s. For example, Trivers advanced an influential theory of ‘reciprocal altruism’, in which people behave unselfishly and hope that they will later be rewarded for their good deeds. He also analysed human sexuality in terms of the investments that mothers and fathers each make in child-rearing.
Steven Pinker, a psychologist at Harvard University in Cambridge, Massachusetts, calls the dancing paper “a lark” and “journalist bait” that lacks a firm basis in theory. “It was cute rather than deep,” he says. But he describes Trivers’ earlier work as “monumental”, and says that it would be a travesty if Trivers became known for one controversial study rather than his wider contributions to evolutionary biology. “Trivers is one of the most important thinkers in the history of the biological and social sciences,” Pinker says.

Thursday, May 2, 2013

Scientificamerican.com: Research in the Digital Age: It’s More Than Finding Information… By Jody Passanisi and Shara Peters

The Role of Research in the Digital Age 

We all know that the Internet has led to an explosion of available information. When students search for information about a topic, they are met with a plethora of articles, from both credible and non-credible resources. The skill of research has always been considered to be a pillar of the social studies discipline, though the nature of research itself has been rapidly changing as the Internet develops and our society becomes less dependent on paper-bound books. As social studies teachers, it is our job to be cognizant of how these changes are having an impact on our discipline.

The Encyclopedia

Gone are the days of consulting the ever-trustworthy Encyclopedia Britannica, when we could trust that the information we found was the most relevant to our query, was presented in a (relatively) unbiased way, and was accurate. Now, finding the information is only a small fraction of the challenge of research. Students must now discern if the source they found contains accurate, factual, and documented information. Once they have done that, they must determine what the purpose of their source is, and whether or not it is presenting the information in a significantly skewed manner. This skill set is commonly found as part of the university-level history curriculum, but now students as young as 4th grade need to begin developing this proficiency.

The Value of a Website

After receiving too many research papers that relied solely on Wikipedia, we realized that these skills needed to be explicitly taught, and that they needed to be developed in our social studies class. When looking for previously published curricula about Internet skills, we found Common Sense Media’s Test Before You Trust materials, which were exactly what we were looking for. They guide students through asking tough questions about each source: Is the bias readily apparent? Who paid for the website? How many sources are cited for their information? Thanks to this material, students can at least ask the right questions about the online source.

To give students the skills to evaluate a source found on the Internet, we need to teach not only tools for doing so (like those found in Common Sense Media’s Test Before You Trust materials) but also how to evaluate the perspective of the sources they read, and we need to teach this to students even younger than before. In other words: we need to teach about bias.


Obviously, before the advent of the Internet, historians wrote from particular perspectives. The author of a primary source wrote from the perspective of personal experience. The letters of Abigail Adams reflect her perspective on politics, women’s rights, and slavery in a different way from the writings of Thomas Jefferson. Throughout history, historians have looked at events through the lens of their own biases: their writings are colored by their politics, culture, and experience. Also, the availability of certain information to those historians limited what they could and couldn’t write about. When we were in middle school, though, students did not often encounter a secondary or tertiary source beyond the encyclopedia, so teaching about bias wasn’t as necessary.

Now, by contrast, secondary and tertiary sources on the Internet can be found by anyone and written by anyone, so evaluating the bias of a source plays an important part in evaluating whether the site is useful. Since the Internet is not peer reviewed like academic journals, students are going to have to do the evaluation themselves. We teach our history students to evaluate bias by reading two different sources writing from different perspectives on the same historical event. Students find the details in the text that help shed light on what a source’s perspective is. Students find telling adjectives and figure out what information is included and what is omitted. Everything is data.

Analysis and Evaluation in Social Studies Research

The tools used for detecting the bias of a source, and the critical thinking skills they require, must become part of the social studies curriculum, and earlier now than ever before. However, the critical thinking skills of evaluation and analysis that are required to detect bias aren’t necessarily developed until students reach the formal operational stage described by Piaget. While the seeds of perspective analysis need to be planted early, some students may not yet be developmentally ready for learning how to discern on their own. To assist them, there are tools to help sort through the vast amount of resources available. For example, search engines like SweetSearch only display results appropriate for students (though that doesn’t mean the sites they find are without bias).

Today, people are considered knowledgeable not necessarily for how much information they know, but for how much facility they have with that information. As teachers in the discipline of history we have to own the idea that teaching students how to analyze and evaluate the information they find is more important than gathering that information together in one place. We ask our students to research, but it is not simply about finding information anymore. Students will need to sift through multiple perspectives on the Internet, and ultimately decide which perspectives are valuable and useful for their purpose. As social studies teachers, we have to show them HOW to research.

Thursday, April 11, 2013

The Reading Brain in The Digital Age: The Science of Paper versus Screens by Ferris Jabr

Image: Robert Drózd, Wikimedia Commons
In a viral YouTube video from October 2011 a one-year-old girl sweeps her fingers across an iPad's touchscreen, shuffling groups of icons. In the following scenes she appears to pinch, swipe and prod the pages of paper magazines as though they too were screens. When nothing happens, she pushes against her leg, confirming that her finger works just fine—or so a title card would have us believe.

The girl's father, Jean-Louis Constanza, presents "A Magazine Is an iPad That Does Not Work" as naturalistic observation—a Jane Goodall among the chimps moment—that reveals a generational transition. "Technology codes our minds," he writes in the video's description. "Magazines are now useless and impossible to understand, for digital natives"—that is, for people who have been interacting with digital technologies from a very early age.

Perhaps his daughter really did expect the paper magazines to respond the same way an iPad would. Or maybe she had no expectations at all—maybe she just wanted to touch the magazines. Babies touch everything. Young children who have never seen a tablet like the iPad or an e-reader like the Kindle will still reach out and run their fingers across the pages of a paper book; they will jab at an illustration they like; heck, they will even taste the corner of a book. Today's so-called digital natives still interact with a mix of paper magazines and books, as well as tablets, smartphones and e-readers; using one kind of technology does not preclude them from understanding another.

Nevertheless, the video brings into focus an important question: How exactly does the technology we use to read change the way we read? How reading on screens differs from reading on paper is relevant not just to the youngest among us, but to just about everyone who reads—to anyone who routinely switches between working long hours in front of a computer at the office and leisurely reading paper magazines and books at home; to people who have embraced e-readers for their convenience and portability, but admit that for some reason they still prefer reading on paper; and to those who have already vowed to forgo tree pulp entirely. As digital texts and technologies become more prevalent, we gain new and more mobile ways of reading—but are we still reading as attentively and thoroughly? How do our brains respond differently to onscreen text than to words on paper? Should we be worried about dividing our attention between pixels and ink or is the validity of such concerns paper-thin?

Since at least the 1980s researchers in many different fields—including psychology, computer engineering, and library and information science—have investigated such questions in more than one hundred published studies. The matter is by no means settled. Before 1992 most studies concluded that people read slower, less accurately and less comprehensively on screens than on paper. Studies published since the early 1990s, however, have produced more inconsistent results: a slight majority has confirmed earlier conclusions, but almost as many have found few significant differences in reading speed or comprehension between paper and screens. And recent surveys suggest that although most people still prefer paper—especially when reading intensively—attitudes are changing as tablets and e-reading technology improve and reading digital books for facts and fun becomes more common. In the U.S., e-books currently make up between 15 and 20 percent of all trade book sales.

Even so, evidence from laboratory experiments, polls and consumer reports indicates that modern screens and e-readers fail to adequately recreate certain tactile experiences of reading on paper that many people miss and, more importantly, prevent people from navigating long texts in an intuitive and satisfying way. In turn, such navigational difficulties may subtly inhibit reading comprehension. Compared with paper, screens may also drain more of our mental resources while we are reading and make it a little harder to remember what we read when we are done. A parallel line of research focuses on people's attitudes toward different kinds of media. Whether they realize it or not, many people approach computers and tablets with a state of mind less conducive to learning than the one they bring to paper.

"There is physicality in reading," says developmental psychologist and cognitive scientist Maryanne Wolf of Tufts University, "maybe even more than we want to think about as we lurch into digital reading—as we move forward perhaps with too little reflection. I would like to preserve the absolute best of older forms, but know when to use the new."

Navigating textual landscapes
Understanding how reading on paper is different from reading on screens requires some explanation of how the brain interprets written language. We often think of reading as a cerebral activity concerned with the abstract—with thoughts and ideas, tone and themes, metaphors and motifs. As far as our brains are concerned, however, text is a tangible part of the physical world we inhabit. In fact, the brain essentially regards letters as physical objects because it does not really have another way of understanding them. As Wolf explains in her book Proust and the Squid, we are not born with brain circuits dedicated to reading. After all, we did not invent writing until relatively recently in our evolutionary history, around the fourth millennium B.C. So the human brain improvises a brand-new circuit for reading by weaving together various regions of neural tissue devoted to other abilities, such as spoken language, motor coordination and vision.

Some of these repurposed brain regions are specialized for object recognition—they are networks of neurons that help us instantly distinguish an apple from an orange, for example, yet classify both as fruit. Just as we learn that certain features—roundness, a twiggy stem, smooth skin—characterize an apple, we learn to recognize each letter by its particular arrangement of lines, curves and hollow spaces. Some of the earliest forms of writing, such as Sumerian cuneiform, began as characters shaped like the objects they represented—a person's head, an ear of barley, a fish. Some researchers see traces of these origins in modern alphabets: C as crescent moon, S as snake. Especially intricate characters—such as Chinese hanzi and Japanese kanji—activate motor regions in the brain involved in forming those characters on paper: The brain literally goes through the motions of writing when reading, even if the hands are empty. Researchers recently discovered that the same thing happens in a milder way when some people read cursive.

Beyond treating individual letters as physical objects, the human brain may also perceive a text in its entirety as a kind of physical landscape. When we read, we construct a mental representation of the text in which meaning is anchored to structure. The exact nature of such representations remains unclear, but they are likely similar to the mental maps we create of terrain—such as mountains and trails—and of man-made physical spaces, such as apartments and offices. Both anecdotally and in published studies, people report that when trying to locate a particular piece of written information they often remember where in the text it appeared. We might recall that we passed the red farmhouse near the start of the trail before we started climbing uphill through the forest; in a similar way, we remember that we read about Mr. Darcy rebuffing Elizabeth Bennet on the bottom of the left-hand page in one of the earlier chapters.

In most cases, paper books have more obvious topography than onscreen text. An open paperback presents a reader with two clearly defined domains—the left and right pages—and a total of eight corners with which to orient oneself. A reader can focus on a single page of a paper book without losing sight of the whole text: one can see where the book begins and ends and where one page is in relation to those borders. One can even feel the thickness of the pages read in one hand and pages to be read in the other. Turning the pages of a paper book is like leaving one footprint after another on the trail—there's a rhythm to it and a visible record of how far one has traveled. All these features not only make text in a paper book easily navigable, they also make it easier to form a coherent mental map of the text.

In contrast, most screens, e-readers, smartphones and tablets interfere with intuitive navigation of a text and inhibit people from mapping the journey in their minds. A reader of digital text might scroll through a seamless stream of words, tap forward one page at a time or use the search function to immediately locate a particular phrase—but it is difficult to see any one passage in the context of the entire text. As an analogy, imagine if Google Maps allowed people to navigate street by individual street, as well as to teleport to any specific address, but prevented them from zooming out to see a neighborhood, state or country. Although e-readers like the Kindle and tablets like the iPad re-create pagination—sometimes complete with page numbers, headers and illustrations—the screen only displays a single virtual page: it is there and then it is gone. Instead of hiking the trail yourself, you watch the trees, rocks and moss move past you in flashes, with no trace of what came before and no way to see what lies ahead.

"The implicit feel of where you are in a physical book turns out to be more important than we realized," says Abigail Sellen of Microsoft Research Cambridge in England and co-author of The Myth of the Paperless Office. "Only when you get an e-book do you start to miss it. I don't think e-book manufacturers have thought enough about how you might visualize where you are in a book."

At least a few studies suggest that by limiting the way people navigate texts, screens impair comprehension. In a study published in January 2013 Anne Mangen of the University of Stavanger in Norway and her colleagues asked 72 10th-grade students of similar reading ability to study one narrative and one expository text, each about 1,500 words in length. Half the students read the texts on paper and half read them in PDF files on computers with 15-inch liquid-crystal display (LCD) monitors. Afterward, students completed reading-comprehension tests consisting of multiple-choice and short-answer questions, during which they had access to the texts. Students who read the texts on computers performed a little worse than students who read on paper.

Based on observations during the study, Mangen thinks that students reading PDF files had a more difficult time finding particular information when referencing the texts. Volunteers on computers could only scroll or click through the PDFs one section at a time, whereas students reading on paper could hold the text in its entirety in their hands and quickly switch between different pages. Because of their easy navigability, paper books and documents may be better suited to absorption in a text. "The ease with which you can find out the beginning, end and everything in between and the constant connection to your path, your progress in the text, might be some way of making it less taxing cognitively, so you have more free capacity for comprehension," Mangen says.

Supporting this research, surveys indicate that screens and e-readers interfere with two other important aspects of navigating texts: serendipity and a sense of control. People report that they enjoy flipping to a previous section of a paper book when a sentence surfaces a memory of something they read earlier, for example, or quickly scanning ahead on a whim. People also like to have as much control over a text as possible—to highlight with chemical ink, easily write notes to themselves in the margins as well as deform the paper however they choose.

Because of these preferences—and because getting away from multipurpose screens improves concentration—people consistently say that when they really want to dive into a text, they read it on paper. In a 2011 survey of graduate students at National Taiwan University, the majority reported browsing a few paragraphs online before printing out the whole text for more in-depth reading. A 2008 survey of millennials (people born between 1980 and the early 2000s) at Salve Regina University in Rhode Island concluded that, "when it comes to reading a book, even they prefer good, old-fashioned print". And in a 2003 study conducted at the National Autonomous University of Mexico, nearly 80 percent of 687 surveyed students preferred to read text on paper as opposed to on a screen in order to "understand it with clarity".

Surveys and consumer reports also suggest that the sensory experiences typically associated with reading—especially tactile experiences—matter to people more than one might assume. Text on a computer, an e-reader and—somewhat ironically—on any touch-screen device is far more intangible than text on paper. Whereas a paper book is made from pages of printed letters fixed in a particular arrangement, the text that appears on a screen is not part of the device's hardware—it is an ephemeral image. When reading a paper book, one can feel the paper and ink and smooth or fold a page with one's fingers; the pages make a distinctive sound when turned; and underlining or highlighting a sentence with ink permanently alters the paper's chemistry. So far, digital texts have not satisfyingly replicated this kind of tactility (although some companies are innovating, at least with keyboards).

Paper books also have an immediately discernible size, shape and weight. We might refer to a hardcover edition of War and Peace as a hefty tome or a paperback Heart of Darkness as a slim volume. In contrast, although a digital text has a length—which is sometimes represented with a scroll or progress bar—it has no obvious shape or thickness. An e-reader always weighs the same, regardless of whether you are reading Proust's magnum opus or one of Hemingway's short stories. Some researchers have found that these discrepancies create enough "haptic dissonance" to dissuade some people from using e-readers. People expect books to look, feel and even smell a certain way; when they do not, reading sometimes becomes less enjoyable or even unpleasant. For others, the convenience of a slim portable e-reader outweighs any attachment they might have to the feel of paper books.

Exhaustive reading
Although many old and recent studies conclude that people understand what they read on paper more thoroughly than what they read on screens, the differences are often small. Some experiments, however, suggest that researchers should look not just at immediate reading comprehension, but also at long-term memory. In a 2003 study Kate Garland of the University of Leicester and her colleagues asked 50 British college students to read study material from an introductory economics course either on a computer monitor or in a spiral-bound booklet. After 20 minutes of reading Garland and her colleagues quizzed the students with multiple-choice questions. Students scored equally well regardless of the medium, but differed in how they remembered the information.

Psychologists distinguish between remembering something—which is to recall a piece of information along with contextual details, such as where, when and how one learned it—and knowing something, which is feeling that something is true without remembering how one learned the information. Generally, remembering is a weaker form of memory that is likely to fade unless it is converted into more stable, long-term memory that is "known" from then on. When taking the quiz, volunteers who had read study material on a monitor relied much more on remembering than on knowing, whereas students who read on paper depended equally on remembering and knowing. Garland and her colleagues think that students who read on paper learned the study material more thoroughly more quickly; they did not have to spend a lot of time searching their minds for information from the text, trying to trigger the right memory—they often just knew the answers.

Other researchers have suggested that people comprehend less when they read on a screen because screen-based reading is more physically and mentally taxing than reading on paper. E-ink is easy on the eyes because it reflects ambient light just like a paper book, but computer screens, smartphones and tablets like the iPad shine light directly into people's faces. Depending on the model of the device, glare, pixelation and flickers can also tire the eyes. LCDs are certainly gentler on eyes than their predecessor, cathode-ray tubes (CRT), but prolonged reading on glossy self-illuminated screens can cause eyestrain, headaches and blurred vision. Such symptoms are so common among people who read on screens—affecting around 70 percent of people who work long hours in front of computers—that the American Optometric Association officially recognizes computer vision syndrome.

Erik Wästlund of Karlstad University in Sweden has conducted some particularly rigorous research on whether paper or screens demand more physical and cognitive resources. In one of his experiments 72 volunteers completed the Higher Education Entrance Examination READ test—a 30-minute, Swedish-language reading-comprehension exam consisting of multiple-choice questions about five texts averaging 1,000 words each. People who took the test on a computer scored lower and reported higher levels of stress and tiredness than people who completed it on paper.

In another set of experiments 82 volunteers completed the READ test on computers, either as a paginated document or as a continuous piece of text. Afterward researchers assessed the students' attention and working memory, which is a collection of mental talents that allow people to temporarily store and manipulate information in their minds. Volunteers had to quickly close a series of pop-up windows, for example, sort virtual cards or remember digits that flashed on a screen. Like many cognitive abilities, working memory is a finite resource that diminishes with exertion.

Although people in both groups performed equally well on the READ test, those who had to scroll through the continuous text did not do as well on the attention and working-memory tests. Wästlund thinks that scrolling—which requires a reader to consciously focus on both the text and how they are moving it—drains more mental resources than turning or clicking a page, which are simpler and more automatic gestures. A 2004 study conducted at the University of Central Florida reached similar conclusions.

Attitude adjustments
An emerging collection of studies emphasizes that in addition to screens possibly taxing people's attention more than paper, people do not always bring as much mental effort to screens in the first place. Subconsciously, many people may think of reading on a computer or tablet as a less serious affair than reading on paper. Based on a detailed 2005 survey of 113 people in northern California, Ziming Liu of San Jose State University concluded that people reading on screens take a lot of shortcuts—they spend more time browsing, scanning and hunting for keywords compared with people reading on paper, and are more likely to read a document once, and only once.

When reading on screens, people seem less inclined to engage in what psychologists call metacognitive learning regulation—strategies such as setting specific goals, rereading difficult sections and checking how much one has understood along the way. In a 2011 experiment at the Technion–Israel Institute of Technology, college students took multiple-choice exams about expository texts either on computers or on paper. Researchers limited half the volunteers to a meager seven minutes of study time; the other half could review the text for as long as they liked. When under pressure to read quickly, students using computers and paper performed equally well. When managing their own study time, however, volunteers using paper scored about 10 percentage points higher. Presumably, students using paper approached the exam with a more studious frame of mind than their screen-reading peers, and more effectively directed their attention and working memory.

Perhaps, then, any discrepancies in reading comprehension between paper and screens will shrink as people's attitudes continue to change. The star of "A Magazine Is an iPad That Does Not Work" is three-and-a-half years old today and no longer interacts with paper magazines as though they were touchscreens, her father says. Perhaps she and her peers will grow up without the subtle bias against screens that seems to lurk in the minds of older generations. In current research for Microsoft, Sellen has learned that many people do not feel much ownership of e-books because of their impermanence and intangibility: "They think of using an e-book, not owning an e-book," she says. Participants in her studies say that when they really like an electronic book, they go out and get the paper version. This reminds Sellen of people's early opinions of digital music, which she has also studied. Despite initial resistance, people love curating, organizing and sharing digital music today. Attitudes toward e-books may transition in a similar way, especially if e-readers and tablets allow more sharing and social interaction than they currently do. Books on the Kindle can only be loaned once, for example.

To date, many engineers, designers and user-interface experts have worked hard to make reading on an e-reader or tablet as close to reading on paper as possible. E-ink resembles chemical ink and the simple layout of the Kindle's screen looks like a page in a paperback. Likewise, Apple's iBooks attempts to simulate the overall aesthetic of paper books, including somewhat realistic page-turning. Jaejeung Kim of KAIST Institute of Information Technology Convergence in South Korea and his colleagues have designed an innovative and unreleased interface that makes iBooks seem primitive. When using their interface, one can see the many individual pages one has read on the left side of the tablet and all the unread pages on the right side, as if holding a paperback in one's hands. A reader can also flip bundles of pages at a time with a flick of a finger.

But why, one could ask, are we working so hard to make reading with new technologies like tablets and e-readers so similar to the experience of reading on the very ancient technology that is paper? Why not keep paper and evolve screen-based reading into something else entirely? Screens obviously offer readers experiences that paper cannot. Scrolling may not be the ideal way to navigate a text as long and dense as Moby Dick, but the New York Times, Washington Post, ESPN and other media outlets have created beautiful, highly visual articles that depend entirely on scrolling and could not appear in print in the same way. Some Web comics and infographics turn scrolling into a strength rather than a weakness. Similarly, Robin Sloan has pioneered the tap essay for mobile devices. The immensely popular interactive Scale of the Universe tool could not have been made on paper in any practical way. New e-publishing companies like Atavist offer tablet readers long-form journalism with embedded interactive graphics, maps, timelines, animations and sound tracks. And some writers are pairing up with computer programmers to produce ever more sophisticated interactive fiction and nonfiction in which one's choices determine what one reads, hears and sees next.

When it comes to intensively reading long pieces of plain text, paper and ink may still have the advantage. But text is not the only way to read.

Sunday, March 17, 2013

From idea to science: Knowing when you’ve got a good idea


by Aurich Lawson

One of the great untold stories in science is the process of science itself. I don't mean stories about what scientists have discovered and what that discovery tells us; we (and many others) cover those every day. I also don't mean stories about the pure joy of discovery and the excitement of finding out that everything you thought you understood was total bollocks. We cover that here at Ars occasionally, and there are plenty of books on it if you're hungry for more.
What's missing is the background for these stories of discovery. How do you take an idea from its very beginning as a casual musing through to an actual research program? What's involved in that process? How do you sort out good ideas from bad and choose what to pursue and what to abandon? That is the story that I want to tell.
Since this is the story of science-as-a-process rather than science-as-a-result, I will be using myself as an example. I am, as some of you may know, a tenure track faculty member at a research institute in the Netherlands. Being a researcher in the Netherlands is not that different from being a researcher anywhere else, so a lot of what I discuss will be familiar to scientists everywhere. Since I recently hopped on the tenure track, I have the next few years to prove that I am able to not only carry out research, but to start and manage entire research programs. And, as yet, I have no research program to manage.
What will follow is a series of posts that document my success or failure in this particular endeavor. It will not be a blog as such; instead, I am aiming to give you a flavor of what goes through our minds when we come up with an idea, and what happens afterwards. It's not enough just to have an idea—it has to meet all sorts of criteria, only some of which have anything to do with science. As such, we have to refine and structure an idea into something that could become a coherent body of research. Then we have to convince other people that it's a good idea.
This article will mainly be background: what does it mean to be a physics researcher in the Netherlands? What conditions do I have to meet? What sort of time-scale are we talking about in terms of viable ideas? As for the rest of the series:
  • The next post will discuss some of the specifics of the research program I want to build, the sort of physics that gets me out of bed in the morning.
  • The third post will be about getting the resources I need to carry out that research. How do I sell my idea, and to whom?
  • The fourth post will eventually describe the outcome of my salesmanship, and the perpetual question: what next?
There will be quite a delay between the latter posts in the series (it takes a while for grants to be evaluated). And, if I have not failed in the most abject manner possible, the process may continue from there. Success never ensues immediately after the money arrives.

The human factor

To understand how I choose between good ideas and bad ideas, we need to step back from actual physics and science and take a look at the structure of the research community that I work in. Research takes resources. I don't mean money—all right, I do mean money—but it also requires time and people and lab space and support. There is a human and physical infrastructure that I have to make use of. I may be part of a research organization, but I have no automatic right of access to any of this infrastructure.
In the Netherlands, doctoral candidates are not students (nor are they, as many think of students, free labor). Instead, they are full-time staff with four-year contracts. What does this mean? First, it is very difficult to organically grow an idea from small-scale research projects into something larger that has a doctoral student attached to it. The timing just doesn't work out with that four-year limit. It is difficult to begin a project with a master's or undergraduate student who will "just take a look" and then hand that over seamlessly to a PhD student should it appear promising.
This also has implications for scale. A PhD student has the right to expect a project that generates a decent body of work within those four years. A project that is going to take eight years of construction work before it produces any scientific results cannot and should not be built by a PhD student. On the other hand, a project that dries up in two years is equally bad. In other words, no matter what idea I come up with, I need to be able to say that all the candidates I hire should find enough material to write a thesis and graduate—no matter what the experimental outcome.
This means that any big idea I come up with also needs to be partitioned into chunks of the right size. If it can't, then it doesn't work in an academic institution.
Since all experimental results need to be thesis-worthy, the questions I want to answer should be open enough to accommodate failure. For instance, my ideas are often based on a single experiment: if we conduct experiment "a," we could measure property "b," and that would be so cool! But, what if "a" doesn't work? Does the student go home?
So, the core idea also needs to be structured so that, should certain experiments not work, they still build something that can lead to experiments which do work. Or, if the cool new instrument we want to build can't measure exactly what I intended, there are other things it can measure. One of those other things must be fairly certain of success.
To put it bluntly: all paths must lead to results of some form.

A ticking clock

So I owe it to my students to come up with research ideas that will generate some success in the right time frame. But there's another human side: mine. For a tenure tracker like me, time is a big boundary condition. If I choose to forgo PhD students, I could come up with an idea that involves eight years of construction before the first results might be expected (and first students hired). That would be acceptable in terms of meeting their requirements.
Unfortunately, my time would be up by then. Part way through the eight years, the director of the institute would look at my performance and promptly tell me to seek work elsewhere. The tenure track is there to give researchers a limited time to prove that they can do everything a tenured researcher should be able to. I must succeed with medium scale projects or many small scale projects rather than a big, long-term project.
That doesn't mean, however, these projects can't be pieces of some longer-term big project. I simply have to ensure that the project delivers results at all time-scales. In my case, the projects should be in the one-to-six PhD student range and should not require more than a year of instrumentation building (unless building the instrument can be counted as doing science). It should, from today, deliver lots of results within a four-year period. Or I'll be looking for another job.

Fitting in

Finally, there are institutional goals and resources. At the moment, I am in an institute that is going through a major restructuring and relocation. The institute will split up, with pieces moving to three different universities. Each piece will have a very different focus: energy research at Eindhoven, soft X-Ray optics at the University of Twente, and a free electron laser facility at the University of Nijmegen. At the moment, I am part of the soft X-Ray optics group, so my research should fit within that theme. On the other hand, as a tenure track researcher, I need to demonstrate some independence. My research still needs to be distinct from what the group already does.
These considerations, which are largely political in nature, are surprisingly important. They make the difference between enthusiastic institutional support (above and beyond what you are entitled to) and grudging assistance, limited to exactly what you are entitled to and delivered on someone else's schedule. In other words, the enthusiasm with which an institute supports a research program may very well be the difference between success and failure.
In the next installment, I will talk about the idea that I am planning: how it originated, and how I expanded it into a program that meets the requirements outlined in this post. As you will see, the raw idea, as expressed in its original form, doesn't fit well with the goals of the research group that I am a part of. My job became making it fit.

Sunday, February 3, 2013

Do objects of different masses really fall at the same rate? The Nordtvedt effect posits they don’t


by Esther Inglis-Arkell

It's been demonstrated since the 1500s that, when falling toward a certain body, objects fall at the same rate. Everyone from Galileo in Pisa to David Scott on the moon demonstrated that. But what if they're wrong? The Nordtvedt effect posits exactly that.

One of the most famous science legends has Galileo dropping two different-sized cannonballs off the Leaning Tower of Pisa, demonstrating that objects of different mass fall at the same rate. Actually, he rolled two balls down ramps to demonstrate the effect and the tower had nothing to do with it. At the end of the Apollo 15 mission, commander David Scott dropped a hammer and a feather on the moon - where there is no air resistance - to prove the same thing. The mass of the object falling doesn't matter. What matters is the mass of the thing making the object fall. Whether a planet is grabbing a cannonball or a feather, the object falls at the same rate. This is called the equivalence principle, and has been held as scientific truth for about four hundred years.

Just to show that no physics principle is sacred, Professor Kenneth Nordtvedt of Montana State University proposed the idea that objects fall at different rates due to their mass. Or actually, he outlined exactly what we'd see happening between the moon, the Earth, and the sun if an object's mass was taken into account in the gravitational pull between it and the body it was orbiting. The Earth, being the more massive orbiting body, would fall towards the sun at a faster rate than the moon.

The effect justifies this by playing with three different concepts. The first concept is gravitational self-energy. This is roughly the idea that all the little pieces of an object have an effect on each other. A solid ball, for example, could be carved up into little shells inside each other like Russian nesting dolls, all of which are pulling at each other. This energy would be larger for big objects than for small objects.
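To get a feel for how small this self-energy is, here is a rough Python sketch. It is not from Nordtvedt's work: it assumes each body is a uniform-density sphere, for which the textbook result is U = -(3/5) G M^2 / R, and it uses rounded values for the masses and radii of the Earth and the moon. The function name is made up for illustration.

```python
# Rough estimate of gravitational self-energy as a fraction of rest energy.
# Assumption: each body is a uniform-density sphere, so U = -(3/5) G M^2 / R.
# Real planets are denser toward the centre, so these are only ballpark figures.

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8    # speed of light, m/s


def self_energy_fraction(mass_kg, radius_m):
    """Return U / (M c^2) for a uniform sphere (dimensionless and negative)."""
    self_energy = -0.6 * G * mass_kg**2 / radius_m
    return self_energy / (mass_kg * C**2)


# Rounded masses (kg) and mean radii (m)
earth = self_energy_fraction(5.972e24, 6.371e6)
moon = self_energy_fraction(7.342e22, 1.737e6)

print(f"Earth: {earth:.2e}")
print(f"Moon:  {moon:.2e}")
```

Under these assumptions the fraction comes out at a few parts in ten billion for the Earth and roughly twenty times smaller for the moon, which is why any difference in how the two fall would be so hard to detect.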

The other two concepts are two different views of mass. There's inertial mass. Imagine an object is on perfectly greased wheels on a perfectly smooth, level floor. If you were to reach out and push it, you would have to exert enough force to move its inertial mass. Then there's gravitational mass. Imagine you now have to pick up that object against the pull of gravity. Outwardly, this seems like a harder task. I could easily push a large friend along on, for example, a wheeled office chair. I'd have a hard time picking them (and the chair) up off the ground. I'd have to use more force. But the difference in perceived mass is just because, when you lift, you're working against a force. Earth's gravity is pulling down. We know the force of Earth's gravity, and we know the mass of the Earth - take those away and I'd be using the same force pulling the person up into my arms as I would pushing them across a level floor. And if I'm using the same force, I must be moving the same mass. In other words, the inertial mass (the mass that resists a push, as for an object floating in space) and the gravitational mass (the mass that gravity pulls on, as for an object sitting on the Earth) are the same. It's just the gravity of the Earth that's making the difference.

Every experiment has found these two masses to be identical, but Nordtvedt's idea of gravitational self-energy might change that. He posited that all those little pieces pulling at each other with their gravity might contribute to an object's gravitational mass, but not to its inertial mass. Since a larger object has more gravitational self-energy, it would have proportionally more gravitational mass. So now, even accounting for the pull of gravity, more force is being exerted to lift a mass than to push it. And the bigger the mass is, the bigger the gap between the force required to lift it and the force required to shove it. So the Earth is exerting more force on bigger objects than smaller ones, and bigger objects fall faster.
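The argument can be written compactly. What follows is a sketch using the usual textbook parametrization rather than anything stated in this article: m_g and m_i are a body's gravitational and inertial masses, U is its gravitational self-energy (negative for a bound body), M_sun is the sun's mass, r is the distance to the sun, and the dimensionless number η (the Nordtvedt parameter) measures how strongly self-energy affects gravitational mass. If η is zero, the equivalence principle holds exactly.

```latex
% Acceleration of a body falling toward the sun
a = \frac{m_g}{m_i}\,\frac{G M_{\mathrm{sun}}}{r^2}

% Assumed parametrization of the mass ratio
\frac{m_g}{m_i} = 1 + \eta\,\frac{U}{m c^2}

% Fractional difference between the accelerations of the Earth (E) and the moon (M)
\frac{a_E - a_M}{a} \approx \eta \left( \frac{U_E}{m_E c^2} - \frac{U_M}{m_M c^2} \right)
```

Which body would fall faster depends on the sign of η, and the size of the difference depends on its magnitude. Both are exactly what the tests try to pin down.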

At least that's the idea. The Nordtvedt effect has been tested, and so far no evidence has been found that the more massive Earth is falling towards the sun faster than the moon. If there is an effect, it's very slight. But if it's there, everything we know about motion, and even relativity, changes. Wouldn't that be cool?