
"Have you ever sent out a ‘tweet’ on the popular Twitter social media service? Congratulations: Your 140 characters or less will now be housed in the Library of Congress,” reported the official blog of the Library of Congress back in April 2010. Each and every public tweet since Twitter’s inception in March 2006 has been digitally archived in the Library of Congress. “That’s a LOT of tweets, by the way: Twitter processes more than 50 million tweets every day, with the total numbering in the billions,” said Matt Raymond of the Library of Congress blog.

The idea behind the project? Twitter has become an important source of information that reaches millions of users all over the globe. Raymond mentioned significant tweets from recent years, including:

- the first-ever tweet from Twitter co-founder Jack Dorsey
- President Obama’s tweet about winning the 2008 election
- a set of two tweets from James Buck, a graduate student at the University of California, Berkeley. Working as a photojournalist, he was arrested while covering an anti-government protest in Mahalla, Egypt. He was freed because of a series of events set in motion by his first tweet: “Arrested.” Buck literally “twittered” his way out of jail, as CNN reported.

The Library of Congress sees itself as a place where “important historical and other information in digital form should be preserved for the long haul,” according to Raymond. The acquisition of tweets by the Library of Congress, in turn, comes with scholarly and research implications, as well as numerous ethical questions.

First and foremost, we may ask ourselves, as some researchers have already done, whether it is ethical to harvest public tweets without first obtaining specific, informed consent from the subjects.

Michael Zimmer, Co-Director of the Center for Information Policy Research at the University of Wisconsin-Milwaukee, recalls a recent debate on precisely this subject during the workshop “Revisiting Research Ethics in the Facebook Era: Challenges in Emerging CSCW Research”: “Many in the room felt that consent was not necessary since the tweets are public, a conscious choice made by the user to allow the whole world see her activity,” Zimmer said. “In short, by not restricting access to one’s account, there is no expectation of privacy.”

Zimmer himself, however, argued that, “We cannot be so quick to presume the expectations of potential research subjects. Yes, setting one’s Twitter stream to public does mean that anyone can search for you, follow you, and view your activity. However, there is a reasonable expectation that one’s tweet stream will be 'practically obscure' within the thousands (if not millions) of tweets similarly publicly viewable.”

A public tweeter, according to Zimmer, consents to making his or her tweets visible to those who take the time and energy to seek him or her out. That user has not, however, automatically consented to having his or her tweet stream systematically followed, harvested, archived and mined by researchers. Zimmer believes strongly that researchers should seek consent before capturing and using this data.

The Library of Congress might argue that they gave an “initial heads-up to the Twitter community itself via our own feed @librarycongress,” as Raymond wrote. But when comparing the number of followers of @librarycongress (50,000) to the total number of Twitter users, very few users really know that their tweets are on file for research and more.

Moreover, what does it mean to use tweets for research? Are they really a reliable source of information? In Zimmer’s opinion, tweets “are reliable in terms of what public discourse on certain topics might be (or at least the discourse of those using Twitter).”

While Zimmer would “hesitate using [tweets] for journalistic purposes without getting additional verification,” others certainly have done so, and the practice has drawn criticism from viewers who question whether it makes for effective journalism. On his blog, Bob Cusick, who describes himself as a “tech guy, entrepreneur and consumer of all things digital,” writes:

“I was watching CNN the other day - and they actually showed a reporter posting tweets - on the air! What the?? If I'm watching CNN - I don't want to see people Twittering... or checking their email... or writing a blog post... I want to watch people report the news.”

The question appears to be whether we can make sense of the data extracted from tweets outside the context of Twitter, be it for research purposes or journalistic ones. Zimmer opines, “to me, I don't think there is a way for the data to make sense outside the context of Twitter.” He explains: “Anyone using this data must recognize the affordances of the platform itself, and how that shapes the content within.” He views the 140-character limit as a significant obstacle that makes messages brief and “therefore often lacking context or nuance. Tweets are often silly, tongue-in-cheek, aggressive, etc. There's little chance for actual conversation or thoughtful analysis.”

From all this tweeting, a new term, “Twossip,” meaning “Twitter + Gossip,” has been coined. The term specifically calls into question the reliability of the information contained in tweets. The German Twossip website states that “Twossip is an entertaining collection of news and rumors originating from the German Twitter sphere.” The line between news and rumors is clearly blurred. The site continues: “Our articles have no claim to any kind of veracity and serve only to entertain and divert the reader. Don’t take it personally.”

But if researchers use the information, they are, in a sense, taking it personally. Moreover, we might ask ourselves whether new legal measures need to be put in place to regulate tweets. Cusick asks: “Does saying something about a person and then broadcasting it to all your ‘followers’ constitute slander - or are you just doing a ‘really wide-reaching IM’?” It appears that new cyber laws need to be determined, put in place and followed by legal action. Law schools would need new departments, too.

Before you get into trouble or have your information harvested for research you didn’t agree to, think before you tweet. The responsibility lies with the individual user to reflect on what he or she posts online. As the Twossip site itself advises, “think carefully about whether you want to make that information public. There’s no turning back.”

Learn more about Isabel Eva Bohrer at
