Human Subject: An Investigational Memoir


12. The Investigation Superhighway

“I am convinced that in the evolving experimental sciences, and particularly in those as complex as biology, the discovery of a new instrument of observation or experimentation renders far more service than many systematic or philosophical dissertations.”

As if the well-meaning people protecting us from the malevolent researchers didn’t have enough to worry about, along came the Internet, and suddenly they had to confront a host of new issues. Does monitoring chatrooms constitute human-subjects research? How do you obtain informed consent online? When does usability testing become human-subjects research?

A report published in 2004 by the American Psychological Association discussed the pros and cons of conducting psychological research online. Advantages include the ability to be more unobtrusive when observing behavior; the ease of recruiting large, diverse groups of subjects; the availability of transaction logs that track people’s conversations, Web browsing, purchase decisions, and other online behavior; and the cost savings that come with automation. (Kraut et al., 2004)

A disadvantage mentioned was the lack of quality control that is normally ensured by either peer review or the thorough scrutiny of funding agencies. However, most research conducted by someone affiliated with a university will need to be reviewed by an IRB (even if only to conclude that the research is exempt from review), and any paper submitted to a refereed journal or conference will need to win peer approval.

Other problems cited include the difficulty of getting a truly random sample of the general public online and the lack of control over the experimental environment. Internet users are disproportionately young, white, well-off, and male, and one can easily picture a subset of this demographic, say a group of bored teenagers, sitting at their computers and working together to sabotage online surveys.

Then of course (you knew it was coming) there’s the perennial problem of protecting human subjects. The APA report states that risks to subjects are no greater online than in a laboratory setting, but that it’s more difficult to assess the risks, and they may be of different kinds. Under the Common Rule, observation of public behavior, where subjects are not personally identifiable, does not constitute human-subjects research. In the online world, however, it can be difficult to figure out what’s public and what’s private, and preserving anonymity can be tricky.

Is a chat room or online forum that’s open to anyone a public or private place? Should participants have a “reasonable” (the lawyers’ favorite word) expectation of privacy? The authors of the report generally believe that there can be no reasonable expectation of privacy in such venues, but the question is still debated by IRBs and ethicists, and different circumstances will lead different researchers to different conclusions.

Can someone be assured of anonymity when posting to an online forum? The APA authors advise researchers that even if the text they’re quoting was posted under a pseudonym, they should alter the name to avoid possible identification. They should also avoid quoting directly any snippets of text from online discussions, since these can easily be traced back to their origin through the use of search engines. In other words, if a user with the handle Bart72 writes, “I’ve been a bank robber ever since I dropped out of Springfield Elementary School in 1993,” you should probably change it to say, “Homer49 announced that he had been active in the field of personal finance since 1994.”

When the research indisputably involves human subjects, as with the use of surveys or questionnaires, there’s the problem of how to obtain informed consent online. You can’t really call it an informed consent “process” when the subject is just required to click “I consent” on a form. One proposed alternative is to have the subject agree separately to each part of the consent form. Of course, in the online world you never know who someone really is, so you can’t be sure that the person consenting to be studied is even legally competent to do so. A 12-year-old boy can easily pose as a 52-year-old man (and vice versa, though that situation raises issues of a different kind).

As for the risk to subjects, the APA report identifies two possible sources of harm to participants in online research. One type of harm can result directly from participation in the research, e.g., “emotional reactions to questions” (which would include the “inflicted insight” mentioned in chapter 10). To put it in technical terms: “Experiments that deliberately manipulate a subject’s sense of self-worth, reveal a lack of cognitive ability, challenge deeply held beliefs or attitudes, or disclose some other real or perceived characteristic may result in mental or emotional harm to some subjects.” (Kraut et al., 2004, p. 111) This risk may be equivalent to that of a face-to-face interview.

The greater risk, the authors believe, is that a subject’s confidentiality will be violated. This would most likely result from a network security breach, whether intentional or accidental, but confidentiality is also at risk when personally identifiable information is required in order to pay subjects for their participation. To minimize that risk, some online researchers give their subjects gift certificates for online stores, a process that doesn’t require subjects to reveal their identities and that has the added benefit of giving free (temporarily anyway) money to online retailers.

The APA report’s final recommendation is that review boards should have a few members with knowledge and expertise in Internet technologies. This might be practical only at institutions that have created separate IRBs to review social-science research.

From my standpoint, the most valuable aspect of the report was its reference to some Web sites where one can go to participate in online studies. One such site is the Social Psychology Network’s Online Social Psychology Studies. I took two of the surveys posted there. One involved making judgments about my own and others’ abilities at such tasks as saving money, riding a bicycle, programming a computer, and juggling. The other study, which took a little longer, was concerned with “whether romantic relationships vary as a function of attachment style, marital quality transmission across generations or both.”

At the beginning of both studies there was some semblance of an informed consent document, which I had to “sign” by clicking a button. At the end of both studies there was a page that told what exactly the study was for and what results had been accumulated thus far. This “debriefing” page is required by the psychologists’ code of ethics; if there was any deception involved, that’s where you would tell the subjects how you had duped them. The studies I was in were reasonably aboveboard, but one of the debriefing pages had this alarming statement: “Your results are highly unstable because you have supplied only 8 data points.” I didn’t think I’d done anything wrong. I considered writing to the investigators and reporting the emotional trauma caused by that rebuke.

Of course, it isn’t only social scientists, or even only academics, who are conducting online research. A whole new field of “human computation” has created a demand for people with lots of free time to spend helping researchers and product developers. Human computation involves figuring out which tasks people can do more effectively than computers, and vice versa, and then assigning tasks accordingly. If researchers in human computation are affiliated with a university, or if they receive federal funding, then their work requires institutional review.

Luis von Ahn is a pioneer in human computation. (Some people even say that he gave the field its name.) An assistant professor at Carnegie Mellon University and a 2006 MacArthur fellow, he was instrumental in developing, and coining the term for, the Completely Automated Public Turing test to tell Computers and Humans Apart, better known as CAPTCHA. (The concept of the Turing test, devised by mathematician Alan Turing in 1950, is that a human judge interacting with both a human being and a truly intelligent computer would not be able to tell which was which.) You may not know the word, but you’ve probably encountered CAPTCHAs if you’ve tried to access certain features or resources on the Web. The CAPTCHA is basically designed to make sure that you’re a person and not a computer. Here’s what one might look like if it were generated by someone with no artistic talent but with access to Microsoft Word:

[Image: CAPTCHA example]

Your task is to type the characters you see into a box on the Web page in order to get access to the page you want. If you were a computer, you probably couldn’t do that, i.e., you would fail the Turing test.

CAPTCHAs are mainly a security device, but von Ahn and others are working on ways to harness the human computing power that goes into solving them. One such initiative is aimed at reducing the number of OCR (optical character recognition) errors in books that are scanned and digitized. The CAPTCHAs (or reCAPTCHAs, as the new version is called) thus become words that a computer has found difficult to interpret, and the user types the correct interpretation. A news release from May 2007 begins: “A Carnegie Mellon University computer scientist is enlisting the unwitting help of thousands, if not millions, of Web users each day . . . ” (Carnegie Mellon, 2007)

The use of CAPTCHAs doesn’t qualify as research, because there’s no attempt to gain “generalizable knowledge.” Some of von Ahn’s other projects, which include an image-labeling activity called the ESP Game and a fact-collecting game called Verbosity, are aimed at accumulating large quantities of knowledge, but I doubt that it’s considered generalizable. During the development of his projects, however, he tests them on human subjects in order to refine them, and he shares what he has learned from these tests with colleagues around the world. At that point the work is definitely subject to IRB scrutiny. He learned this the hard way, by not seeking approval the first time he ran such an experiment (“They made me feel like an evil person,” he said of the IRB). (Von Ahn, 2007) Now, even though his research probably falls into the exempt category more often than not, he’s aware of the regulations.

Outside of academia, as long as you aren’t getting government funding, you’re free to mess with people’s minds as much as you like. One company that does this is Cycorp, Inc., developers of the Cyc knowledge base, “a vast quantity of fundamental human knowledge: facts, rules of thumb, and heuristics for reasoning about the objects and events of everyday life.” You can help build the knowledge base by playing a game called the FACTory.

The developers of the FACTory call it “a fun trivia game,” but rather than asking you questions, it asks you to confirm the truth of a series of assertions (in a Java applet that I didn’t find to be much fun at all). When I’ve played the game, the assertions have been mainly of three types: (1) statements about athletes who play on various sports teams or celebrities who starred in or directed various movies, (2) ambiguous generalizations about the world, and (3) syntactically correct but utterly nonsensical statements. Here are some examples from my FACTory sessions:

I don’t think I need to worry about copyright violation by publishing these statements, because presumably Cyc has removed them from its collection of “knowledge.”

The Web site for the FACTory claims that “every question you answer brings the world a little closer to a truly intelligent computer.” I guess it depends on how you define “intelligent.” If it means having the same amount of knowledge and the same level of certainty as your average computer user, then you can’t go wrong with anonymous, fallible, overconfident, and possibly intoxicated human collaborators.

It certainly seems to me that Cycorp’s volunteers are contributing to the development of “generalizable knowledge.” An IRB would probably consider this type of research exempt from review, or at worst of minimal risk to subjects. But what if a participant is afraid of ducks and is therefore traumatized to learn that most ducks might be taller than most bones? Or what if just seeing the title “Alien” gives someone flashbacks to the terrifying nightmares caused by seeing the movie? If Cycorp receives no federal funds, these people have no one looking out for their interests and no laws to protect them.

An IRB would also require children to have a parent’s permission before they could take part in Cyc’s research by playing the FACTory game. (There’s more about research involving children in chapter 16.) This would probably also be true for von Ahn’s ESP Game. Since the ESP Game gets some of its funding from the National Science Foundation, it may indeed have required IRB approval. However, the rules regarding protection of children are not part of the Common Rule, so NSF funding doesn’t require adherence to those rules.

Not that I claim to understand all the rules pertaining to human-subjects protection. But then I guess if they were easy to understand and interpret, there wouldn’t be so many Web sites, academic journals, and discussion boards devoted to trying to figure them out.

I had recently fallen in with a crowd of computational linguists at Big U, so I learned about a lot of opportunities for participation in linguistic research. One study, conducted by a Ph.D. student, involved editing machine translations of 150 excerpts from newspaper articles to make them comparable to a given human translation. A special interface for this task had been designed by the student. The entire activity was supposed to take about two hours to complete, but it took me more than seven hours. I never did find out exactly what the point of the research was; by the time I finished it, I really didn’t care anymore.

Either that study wasn’t considered human-subjects research or the student and his adviser weren’t aware of the federal and university regulations, because they apparently hadn’t sought IRB approval or even an exemption from review. The kinds of text being read, and the decisions being made about it, didn’t seem any more innocuous than the material in my friend Bob’s study, about which InvestiGuard staff had felt the need to interrogate him at great length.

One online linguistics experiment I volunteered for was conducted by some researchers in Italy. It consisted of making judgments as to the similarity of meanings in 65 pairs of English words. The more similar pairs included “automobile ... car” and “boy ... lad.” The ones at the other end of the scale included “asylum ... fruit” and “grin ... implement.”

I was to rate the similarity for each pair on a scale of 0 to 4, using a virtual slider that was a real pain to deal with. A text box next to the slider seemed designed for typing in a value, but it didn’t allow direct input; the slider was the only way to get a number to appear in the box. If I had moused around with the slider long enough to select the precise numbers I wanted, the experiment would have taken me about three times longer than it did. By settling for unsatisfying numbers like .17 and 2.83, I still took at least twice as long to finish the study as the estimated 10 minutes.

I don’t know if I’m abnormally slow or if researchers routinely underestimate the time commitment so that more people will agree to participate. This ploy might backfire in the real, face-to-face world, where people would tell their friends, “Don’t believe them! It takes three hours, not 45 minutes.” But if you conduct—and recruit for—studies online, the worst that can happen is that you won’t get the same participants to come back for future research. In the online “community,” hardly anyone knows and communicates regularly with anyone else.

One day someone posted a message to the IRB Forum requesting volunteers for a 15-minute survey. The purpose was to assess current attitudes and practices regarding ethics and risk assessment in genetic research. Whenever research involves genes or a person’s DNA, red flags go up in the research-ethics community, so I’m sure the survey got a lot of respondents, most of whom probably didn’t agree with the laissez-faire attitude expressed by my answers: Did I think it was OK to use someone’s stored DNA for future research without telling the donor? You bet! On the other hand, I strongly disagreed with the practice of withholding results of genetic testing from a patient merely because there was no cure for the disease detected.

I was amused to see, at the beginning of the genetics survey, the following statement: “This project is approved by the Institutional Review Board of [the institution whose IRB was conducting the survey].” In other words, the IRB had approved its own survey. You may want to remember this incident when you read about conflict of interest a few chapters from now.
