First legal shot across the Semantic Web’s bow - Thomson suing Zotero

Last week Thomson Reuters (the owner of EndNote Software, a widely used proprietary tool for collecting and managing scholarly bibliographic information) filed a lawsuit against Zotero, the most popular open source, Semantic Web-enabled bibliographic tool. Zotero, packaged as a Firefox extension, is a handy tool for collecting bibliographic metadata to assist scholars in managing information necessary for their research (news story, complaint). Zotero can import and export a variety of different bibliographic formats and does so in a web-friendly, RDF-enabled way. Exchanging and linking bibliographic information (i.e., the title, author, publication venue) of scholarly communication is an important means to discover new links amongst individual pieces of research that are published around the world. This has been a high priority, for example, in the life sciences where new knowledge can be uncovered by linking individual pieces of research together.
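To make the linking idea concrete, here is a minimal sketch of bibliographic records stored as subject-predicate-object triples, the data model behind RDF. The record identifiers, titles and author names are invented for illustration, and plain Python tuples stand in for a real RDF library; this is not Zotero's actual export format. Only the Dublin Core property names (dcterms:title, dcterms:creator) are real vocabulary terms.

```python
from collections import defaultdict

# Hypothetical records expressed as (subject, predicate, object) triples.
TRIPLES = [
    ("ex:paper1", "dcterms:title", "Protein folding in yeast"),
    ("ex:paper1", "dcterms:creator", "A. Researcher"),
    ("ex:paper2", "dcterms:title", "Misfolded proteins and disease"),
    ("ex:paper2", "dcterms:creator", "A. Researcher"),
    ("ex:paper2", "dcterms:creator", "B. Scholar"),
    ("ex:paper3", "dcterms:title", "Unrelated survey"),
    ("ex:paper3", "dcterms:creator", "C. Writer"),
]


def papers_linked_by_author(triples):
    """Return {author: sorted list of papers} for authors with 2+ papers."""
    by_author = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "dcterms:creator":
            by_author[obj].add(subject)
    return {a: sorted(p) for a, p in by_author.items() if len(p) > 1}


# Papers that share an author are candidates for being read together.
links = papers_linked_by_author(TRIPLES)
print(links)
```

Because every record uses the same property names, records gathered by different tools can be merged into one list of triples and queried the same way. That is the kind of interoperability at stake in this dispute.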

The latest beta release of Zotero will read and write EndNote’s proprietary metadata format and import and export the citation formats that EndNote provides for a wide variety of academic journals. In response to this, Thomson sued the Zotero developers (an open source community hosted at George Mason University), charging that Zotero (and GMU) reverse engineered the EndNote citation file format in violation of EndNote’s end user license agreement (EULA).

The key effect of Thomson’s suit, if it succeeds, would be to create a legal doctrine that enables software developers to restrict the Semantic Web’s potential to promote data interoperability and data integration. The legal issue at bar has to do with reverse engineering and the enforceability of EULAs, both of which are important questions. And there’s a lot to say about whether or not the complaint will stand up to legal scrutiny. That said, the Web community, as well as the scholarly community, ought to pay careful attention to this case because its outcome could have real bearing on how free we will all be in the future to exchange information and realize the knowledge-enhancing benefits of the Web through collaborative research.

Justice Brandeis and privacy protection through usage restriction

For a couple of years, colleagues of mine and I have been writing about the need to protect privacy through rules and laws restricting how information is used, not just who can access the personal information. So, I was very happy to discover that a famous early exposition of privacy rights in United States law (Olmstead v. United States (1928)), by the most famous judicial advocate of privacy rights, Justice Louis Brandeis, expressed a clear sentiment in favor of protecting privacy based on how information is used, not just whether one is entitled to have access to it or not. In the course of explaining why earlier Supreme Court legal precedents should be understood to make wiretapping illegal, Brandeis wrote

Unjustified search and seizure violates the Fourth Amendment, whatever the character of the paper; [n4] whether the paper when taken by the federal officers was in the home, [n5] in an office, [n6] or elsewhere; [n7] whether the taking was effected by force, [n8] by [p478] fraud, [n9] or in the orderly process of a court’s procedure. [n10] From these decisions, it follows necessarily that the Amendment is violated by the officer’s reading the paper without a physical seizure, without his even touching it, and that use, in any criminal proceeding, of the contents of the paper so examined — as where they are testified to by a federal officer who thus saw the document, or where, through knowledge so obtained, a copy has been procured elsewhere [n11] — any such use constitutes a violation of the Fifth Amendment.

That is to say, even if the officer was in rightful possession of the private information, it still should be understood as a violation of privacy if the police use the information against the individual. This is privacy as a set of usage rules.

Brandeis was trying to argue that wiretapping should be considered illegal under the Court’s existing precedents, but the majority of the Court opposed him and asserted that wiretapping was constitutional because it did not involve any physical trespass into the private property of the telephone user. So, Brandeis lost the argument in this early case and wiretapping remained constitutional (though not always legal) in the US for another 40 years. Eventually, though, the Court came around to Brandeis’ view that how the government got access to the telephone call matters less than the fact that people have, and are entitled to have, an expectation that their calls are private; that government would become too powerful if allowed to use the contents of our private communications without a warrant.

Will John McCain help the NEXT Blackberry creator?

Today a senior McCain advisor, Doug Holtz-Eakin, proudly held up a Blackberry and declared:

“You’re looking at the miracle that John McCain helped create.”
AP, 16 September 2008

Bloggers on all sides of the partisan divide are having a field day with this, suggesting that the McCain campaign is out of touch, desperate, or trying to top the trouble VP Al Gore got into when he was falsely accused of claiming to have invented the Internet. At best, it suggests that Holtz-Eakin was just careless. At worst, it suggests that the campaign and the candidate have deeply irrational ideas about how to promote innovation. It’s also been pointed out that there’s some irony in McCain claiming credit for the success of a Canadian company.

The real question is: what would a McCain presidency do to help enable the NEXT innovative device, service or revolutionary use of the Web? (**Full disclosure here: I’m an active supporter of Senator Obama, though this post is entirely my own and not in any way made on behalf of the Obama campaign.**)

McCain’s record in promoting innovation on the Internet and in the larger information and communications marketplace is terrible. Mostly, he can claim credit for supporting incumbents over innovators and for failing, in his time as Chair of the Senate Commerce Committee, to do anything at all to support the innovative and socially beneficial aspects of the Internet. While he was in the leadership of the Senate Commerce Committee (1997 - 2001, and 2003 - 2005) his contributions included:

  • being entirely AWOL in defending the openness-protecting provisions of the Telecommunications Act of 1996. The parts of the Act that were supposed to help assure market access to innovative new services, such as the Blackberry, were weakened, ignored or attacked by the FCC and the courts. As Chair of the Committee responsible for the law, McCain did nothing. That’s why we have an anemic choice of broadband providers in most parts of the country. This is good news for incumbent cable and telecom companies but will make it harder for the next Blackberry to get to market.
  • opposing eRate legislation that extended Internet access to schools and libraries. Not only were his policies as committee chair bad for innovators, he sought to make it harder for the non-profit sector to pay for Internet access.

What did McCain do as chair of the most powerful congressional body in the communication and information market? He mostly stood up for the interests of incumbents. He wrote letters to the FCC supporting higher cable television rates and encouraged consolidation in the telecommunications market, reducing the number of local phone companies from 7 down to an eventual 3.

And today, even though he’s no longer in a leadership role on Internet and telecommunications policy, he’s still speaking up against innovation and for incumbents through opposition to even modest Net Neutrality provisions.

In the end, the campaign season slip of some advisor is nothing compared to the anti-innovation record of Senator McCain himself. We’re lucky (well, maybe :-) ) to have Blackberrys and other innovations today. They won’t likely go away. But the question is which presidential candidate is more likely to support policies that enable the NEXT Blackberry. History shows it certainly isn’t John McCain.

Microsoft on the need for openness in scholarly tools and data

I’m sitting at the Microsoft Faculty Summit, listening to Tony Hey (VP for External Research) talk about how critical it is for scientific researchers to have open access to data and open source tools (Tony actually said ‘free software tools’) in order to solve the most critical problems of the world. Among other things, Tony highlighted the importance of attaching metadata to documents and data, mentioning some of the MSFT tools such as a MS-Office plug-in that attaches Creative Commons labels to Office docs.

Anyone who thinks that institutional views are monolithic or easy to predict….

Conflicting voices in the liberal mainstream on FISA

Today, the Senate passed a much-debated revision to the Foreign Intelligence Surveillance Act (‘Senate Passes Surveillance Bill With Immunity for Telecom Firms’, Washington Post, William Branigin, 9 July 2008). There have been lots of different views out there on the FISA legislation, even amongst the mainstream liberal establishment.

In advance of this vote there has been much debate, most recently because Sen. Obama announced that he would support this compromise bill and not vote in support of a filibuster. (Full disclosure, I’m an Obama supporter and have helped the campaign on Internet policy issues.) In thinking about this, I thought I’d survey the range of opinion just on the liberal center. Here’s some of what I found:

Mort Halperin, highly regarded civil libertarian, former head of the ACLU Washington office, and himself a target of unwarranted government wiretapping when he was working for Henry Kissinger in the Nixon White House, writes in a New York Times Op-Ed (‘Listening to Compromise’, New York Times, 8 July 2008):

The compromise legislation that will come to the Senate floor this week is not the legislation that I would have liked to see, but I disagree with those who suggest that senators are giving in by backing this bill.

The fact is that the alternative to Congress passing this bill is Congress enacting far worse legislation that the Senate had already passed by a filibuster-proof margin, and which a majority of House members were on record as supporting.

What’s more, this bill provides important safeguards for civil liberties. It includes effective mechanisms for oversight of the new surveillance authorities by the FISA court, the House and Senate Intelligence Committees and now the Judiciary Committees. It mandates reports by inspectors general of the Justice Department, the Pentagon and intelligence agencies that will provide the committees with the information they need to conduct this oversight. (The reports by the inspectors general will also provide accountability for the potential unlawful misconduct that occurred during the Bush administration.) Finally, the bill for the first time requires FISA court warrants for surveillance of Americans overseas.

As someone whose civil liberties were violated by the government, I understand this legislation isn’t perfect. But I also believe — and here I am speaking only for myself — that it represents our best chance to protect both our national security and our civil liberties. For that reason, it has my personal support.

On the same day, the New York Times Editorial Board wrote against the bill (‘Compromising the Constitution’, 8 July 2008):

The Senate should reject a bill this week that would needlessly expand the government’s ability to spy on Americans and ensure that the country never learns the full extent of President Bush’s unlawful wiretapping.

[..]

Supporters will argue that the new bill still requires a warrant for eavesdropping that “targets” an American. That’s a smokescreen. There is no requirement that the government name any target. The purpose of warrantless eavesdropping could be as vague as listening to all calls to a particular area code in any other country.

The real reason this bill exists is because Mr. Bush decided after 9/11 that he was above the law. When The Times disclosed his warrantless eavesdropping, Mr. Bush demanded that Congress legalize it after the fact. The White House scared Congress into doing that last year, with a one-year bill that shredded FISA’s protections. Democratic lawmakers promised to fix it this year.

[..]

The bill dangerously weakens the 1978 Foreign Intelligence Surveillance Act, or FISA. Adopted after the abuses of the Watergate and Vietnam eras, the law requires the government to get a warrant to intercept communications between anyone in this country and anyone outside it — and show that it is investigating a foreign power, or the agent of a foreign power, that plans to harm America.
Proponents of the FISA deal say companies should not be “punished” for cooperating with the government. That’s Washington-speak for a cover-up. The purpose of withholding immunity is not to punish but to preserve the only chance of unearthing the details of Mr. Bush’s outlaw eavesdropping. Only a few senators, by the way, know just what those companies did.

And today, the Washington Post, often somewhat more centrist on civil liberties matters than the Times, editorialized (‘FISA’s Fetters’, 9 July 2008):

These are serious concerns, worth taking seriously. We are under no illusion that the measure is perfect; future fine-tuning may well be called for. The classified nature of the surveillance program makes it impossible to assess the implications with anything near certainty. But the legislation reflects, as far as we can tell, a reasonable compromise, worked out over long months of negotiations, between the legitimate needs of intelligence agencies and the legitimate privacy interests of Americans.

The measure requires an individualized, court-approved warrant to conduct surveillance targeted at Americans’ communications with those overseas and — in an expansion of existing FISA protections — at Americans abroad. Purely domestic-to-domestic communications, even among foreigners here, would require a warrant as well. Intelligence agencies would be able to target and collect the communications of non-Americans “reasonably believed to be located outside the United States,” even if their phone calls or e-mails passed through or were stored in the United States. But the agencies are required to adopt procedures to “prevent the intentional acquisition” of purely domestic communications and to minimize the retention and dissemination of such information.

more to come…

Google, Viacom, Privacy and Copyright meet the social web

In all the recent uproar (New York Times, “Google Told to Turn Over User Data of YouTube,” Michael Helft, 4 July 2008) about the fact that Google has been forced to turn over a large pile of personally-identifiable information to Viacom as part of a copyright dispute (Opinion), there is a really interesting angle pointed out by Dan Brickley (co-creator of FOAF and general Semantic Web troublemaker). Dan points out in a blog entry today that while the parties before the court are arguing about whether the YouTube ID is, by itself, personally identifiable information, the fact is that the publicly visible part of this ID in the context of other information on the Web is sufficient to identify a lot about a person, not the least of which is their name. Dan explains:

YouTube users who have linked their YouTube account URLs from other social Web sites (something sites like FriendFeed and MyBlogLog actively encourage), are no longer anonymous on YouTube. This is their choice. It can give them a mechanism for sharing ‘favourited’ videos with a wide circle of friends, without those friends needing logins on YouTube or other Google services. This clearly has business value for YouTube and similar ’social video’ services, as well as for users and Social Web aggregators.

Given such a trend towards increased cross-site profile linkage, it is unfortunate to read that YouTube identifiers are being presented as essentially anonymous IDs: this is clearly not the case. If you know my YouTube ID ‘modanbri’ you can quite easily find out a lot more about me, and certainly enough to find out with strong probability my real world identity. As I say, this is my conscious choice as a YouTube user; had I wanted to be (more) anonymous, I would have behaved differently. To understand YouTube IDs as being anonymous accounts is to radically misunderstand the nature of the modern Web.

Dan makes a really important point here. On the one hand, the fact that we are all more identifiable as a result of the social networks in which we exist suggests that the judge was just plain wrong (even wronger than others have already said) in saying that the YouTube IDs are not personally-identifiable. But on the other hand, to the extent that Dan is correct about the revealing nature of the social web (true for some of us now, more and more in the future), we have to face the fact that merely limiting disclosure of personal information from one source is less and less likely to protect privacy effectively across the Web.
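Dan's linkage argument can be sketched in a few lines of Python. Everything here is hypothetical: the site names, account IDs and people are invented, and a real aggregator would be crawled rather than held in a dictionary. The point is only that an account ID which looks anonymous on its own stops being anonymous once any public profile links to it.

```python
# Hypothetical public profile pages, keyed by site. Each entry lists the
# display name and the outbound account links a user chose to publish.
PUBLIC_PROFILES = {
    "aggregator.example": [
        {"display_name": "Jane Doe",
         "links": ["youtube:janed42", "photos.example:jdoe"]},
        {"display_name": "Sam Roe",
         "links": ["photos.example:sroe"]},
    ],
}


def names_for_account(account, profiles):
    """Find display names on any public profile that links to `account`."""
    return sorted(
        entry["display_name"]
        for entries in profiles.values()
        for entry in entries
        if account in entry["links"]
    )


# An ID that its owner linked from elsewhere resolves to a name...
print(names_for_account("youtube:janed42", PUBLIC_PROFILES))
# ...while an ID nobody has linked stays unresolved by this method.
print(names_for_account("youtube:unlinked", PUBLIC_PROFILES))
```

No access to the site's own records is needed for the first lookup, which is why restricting what one source discloses offers less protection as cross-site links multiply.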

Applying this view to the Viacom v. YouTube case suggests that privacy protection has to focus more on limiting how people and institutions can *use* personal information even as we recognize that it is harder and harder to protect privacy by access control alone.

Some of my colleagues and I have written about this view of privacy as Information Accountability in last month’s Communications of the ACM.

A Political Denial of Service (PDOS) attack on blogger.com?

A little transparency would go a long way toward helping keep online political discourse open, especially in the particular corner of the blogosphere run by Google (i.e., blogger.com). The Herald Tribune (Bloggers take aim at Google - International Herald Tribune) reports on a controversy involving pro-Clinton blogs that might have been blocked as spam due to what we might call a PDOS (Political Denial of Service Attack) in a skirmish between Obama and Clinton partisans. The IHT asks:

Was Google’s network of online services manipulated to silence critics of Barack Obama? That was the question buzzing on a corner of the blogosphere over the past few days, after several anti-Obama bloggers were unable to update their sites, which are hosted on Google’s Blogger service.

It is alleged that some pro-Clinton blogs were blocked after a number of pro-Obama users marked them as ’spam’ on blogger.com. A Google spokesperson explained:

“It appears that our anti-spam filters caused some Blogger accounts to be blocked from creating new posts,” a Google spokesman, Adam Kovacevich, said in a statement. “While we are still investigating, we believe this may have been caused by mass spam e-mails mentioning the ‘Just Say No Deal’ network of blogs, which in turn caused our system to classify the blog addresses mentioned in the e-mails as spam.”

Kovacevich said that Google had restored posting rights to the affected blogs and that it was “very important” to Google “that Blogger remain a tool for political debate and free expression.” He gave no further details about Google’s spam-monitoring techniques or how they relate to the Blogger service.

It certainly would be useful if Google could provide some transparency into what they block and why. That way, either Google or the possibly malicious spam-flaggers could be held accountable for their behavior. (In a recent CACM piece on Information Accountability we explain why accountability is so important on the Web and how we might have more of it through additions to the architecture of the Web.)

Google does a very good job of giving transparent explanations when their search results contain information that has been blocked for legal reasons such as copyright takedown notices. I hope they can find a way to bring similar transparency to their part of the blogosphere.

On the retirement of a wonderful teacher, Michael DiGennaro PhD

This week a wonderful high school teacher of mine retired. I was asked to write a remembrance of his teaching and thought I’d post it on this blog.

Junior year at Mamaroneck High School for me was about nothing if not being able to eat lunch in Mike DiGennaro’s classroom. By senior year, those of us who went on to take American Fiction with Mike felt we had earned the right to be there with him at lunch, but as a nerdy, socially uncomfortable 11th grader, to be invited to eat with this teacher for whom I had so much respect and fascination was the ultimate vindication of my nerdy existence.

In 1981, our class elected him to be the faculty speaker at MHS graduation. In the tradition of American literature that he loved so much, he spoke of rites of passage: our passage from high school into adulthood and his passage into the ripe old age of 40. I think this must have been a difficult transition for him, raising all sorts of questions about the meaning of his life and his sense of mortality. What has stuck with me ever since was the tremendous generosity of spirit he showed in being willing to link the passages in his life with the changes in ours. He made us feel that we were all in these transitions together and that the struggles we felt as teenagers heading off to the next stage in our lives were every bit as significant and worthy of serious regard as were the milestones in his adult life.

His graduation speech, and all of the rest of the wisdom he offered to us, imbued his relationship with us with narrative drama in a way that connected our lives to his. With this, I feel that he taught us how to take ourselves seriously at the same time as he taught us to read seriously. He taught us that these are really the same thing.

Just as Mike marked his 40 year milestone in that speech, Mike’s hero (and subject of his dissertation) Saul Bellow, began his Nobel Prize speech with a reflection on his own life 40 years prior to the award, when he was a restless undergraduate. At 40, Mike still had a lot of intellectual restlessness in him and passed that along to us. This is the restlessness of writers and readers that Bellow described in his speech:

[There is] an immense, painful longing for a broader, more flexible, fuller, more coherent, more comprehensive account of what we human beings are, who we are and what this life is for. (Nobel lecture, 1976)

The gift that Mike gave me was to look at literature and my own life with that longing but also to find some humor and perspective along the way. I think that’s why he offered us a calm place to eat lunch with him every day.

Wise thoughts on Clinton, Obama, Race and Gender

Freada Kapor Klein, a long-time leader in the field of workplace diversity and discrimination issues, has written a really thoughtful piece on Hillary Clinton’s approach to gender and race in the current (nearly over?!?!) campaign. In the course of explaining (Another 55-year-old White Woman for Obama. HuffPost 28 May 2008) why she, Freada, supports Obama and not Clinton, Klein begins by explaining:

After each loss, Hillary Clinton or her supporters raise the specter of sexism in a divisive way; this stands in sharp contrast to Barack Obama’s unifying discussion of race. Declaring gender as the “highest glass ceiling” has neither facts nor a hunger for uniting the country on her side.

She goes on to say that all quantitative research conducted by her Level Playing Field Institute on discrimination in the workplace finds that “People of color are more than three times as likely to leave solely due to unfairness (9.5%) than Caucasian heterosexual men (3.0%). In comparison, Caucasian women are only one-and-a-half times more likely to leave (4.6%).”

What really struck me was her discussion of the political behavior of some groups of white women. She writes:

[I] learned that white women voters in California and Michigan voted in favor of Propositions 209 and Proposal 2, both of which effectively ended affirmative action. Why would women, who had been assured equal opportunity under affirmative action, vote for such a measure? According to those who conducted interviews with them, because they believed that a vote for themselves and people of color was a vote against their white husbands and their white children…. To me, Hillary Clinton represents the status quo: privilege protecting privilege at the expense of less affluent or “connected” populations of our society — especially with regard to creating a level playing field in American workplaces. I am supporting Barack Obama because he challenges us to build empathy, to care about others as much as ourselves, and to ask hard questions–such as who really has the unfair advantage

I suggest reading her whole post.

Important New Jersey Supreme Court decision in Internet privacy

The New Jersey Supreme Court (State of New Jersey v. Shirley Reid (A-105-06)) has issued an important decision on Internet users’ right to privacy. The case involves a dispute about whether an ISP violated a user’s privacy rights by turning over subscriber information (name, address, billing details) associated with a particular IP address. It turns out that the subpoena served on the ISP was invalid for a variety of reasons. As the user had a ‘reasonable expectation of privacy’ in her Internet activities and identifying information, and because the subpoena served on the ISP was invalid, the New Jersey court determined that the ISP should not have turned over the personal data.

The important aspect of this case in the evolving understanding of privacy on the Internet is the court’s recognition that we must look at privacy from the broad perspective of what can actually be discovered about people online. In this way, the ruling has significant strengths and weaknesses from a privacy perspective. On the one hand, the court finds that there is, today, an expectation of privacy in IP addresses because they are currently hard to link to personal identity. There have been lots of disputes in the US and the EU about whether IP addresses are ‘personally identifying information.’ (“PII” in the jargon of privacy.) This court takes a pragmatic view of this question and finds that IP addresses should be considered private for now, but that this may change. The court finds:

the reasonableness of the privacy interest may change as technology evolves. A reasonable expectation of privacy is required to establish a protected privacy interest…. Internet users today enjoy relatively complete IP address anonymity when surfing the Web. Given the current state of technology, the dynamic, temporarily assigned, numerical IP address cannot be matched to an individual user without the help of an ISP. Therefore, we accept as reasonable the expectation that one’s identity will not be discovered through a string of numbers left behind on a website.

The availability of IP Address Locator Websites has not altered that expectation because they reveal the name and address of service providers but not individual users. Should that reality change over time, the reasonableness of the expectation of privacy in Internet subscriber information might change as well. For example, if one day new software allowed individuals to type IP addresses into a “reverse directory” and identify the name of a user — as is possible with reverse telephone directories — today’s ruling might need to be reexamined.
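The distinction the court draws between the two kinds of lookup can be sketched roughly as follows. This is an illustration under loud assumptions: the provider name, subscriber IDs and lease log are invented, the address block is one reserved for documentation, and real ISP records are of course not structured this simply. Only Python's standard `ipaddress` module is used.

```python
import ipaddress

# Hypothetical public allocation data: network block -> service provider.
# 203.0.113.0/24 is a block reserved for documentation examples.
PUBLIC_ALLOCATIONS = {
    ipaddress.ip_network("203.0.113.0/24"): "Example ISP",
}

# Hypothetical private lease log held only by the ISP:
# (address, hour of day) -> subscriber. Dynamic addresses change hands.
ISP_LEASE_LOG = {
    ("203.0.113.7", 9): "subscriber-1001",
    ("203.0.113.7", 21): "subscriber-2002",
}


def public_lookup(address):
    """What an IP locator website can reveal: the provider, not the user."""
    ip = ipaddress.ip_address(address)
    for network, provider in PUBLIC_ALLOCATIONS.items():
        if ip in network:
            return provider
    return None


def isp_lookup(address, hour, lease_log):
    """What only the ISP can reveal, given its own records."""
    return lease_log.get((address, hour))


print(public_lookup("203.0.113.7"))                 # provider only
print(isp_lookup("203.0.113.7", 9, ISP_LEASE_LOG))  # needs the ISP's log
```

The same address maps to different subscribers at different times, so the public table alone never names a person. The court's "reverse directory" hypothetical amounts to the second table becoming publicly searchable.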

Others have written about the legal details of this case and have suggested that it is a big win for privacy. Given the reliance on the shifting state of identity technology, I’m a little less sanguine.

This case is yet another reason why I believe (as I’ve explained elsewhere) that meaningful privacy on the Web requires rules that govern how personal information is used, not just what can be collected. Under the court’s reasoning, as our lives become more and more transparent, that would justify increasingly harmful use of personal data. While it’s pretty hard to control how exposed we all become, we still can limit how powerful institutions (governments, etc.) use personal data about us.

Bob Metcalfe’s wisdom on patents and innovation

Ethernet inventor, journalist and now venture capitalist Bob Metcalfe speaks on the lessons from the Internet community for the global warming arena. In looking at how to accelerate technical innovation to address climate change, Metcalfe asserts that:

“… the place to do research is in university labs. The best vehicle for technology innovation is not patents, it’s students.”

Of course, Bob also manages to express his disdain for monopoly, Bell Labs, and even Al Gore. (See report by Martin LaMonica.) I’m not sure about those but think he’s right on with respect to patents.

On meetings

Ever the astute observer of the various features and bugs of our collective behavior, a longtime mentor of mine, Mitch Kapor, has coined a new definition:

Meetingboarding: (n) the sensation of being unable to breathe arising from continuous immersion in meeting after meeting

I’d add to this a characterization of email that I learned from Mitch many years ago:

The problem with email is that it has low emotional bandwidth.
-Mitch Kapor, circa 1991

Today - NPR Science Friday program on Web privacy issues

National Public Radio’s Science Friday program will feature a discussion of online privacy with Alessandro Acquisti of CMU and yours truly a little later today. It’s live from 3:00 - 4:00 pm Eastern/US, rebroadcast at various times depending on where you live, and streamed on the Web.

Listen in. Call and challenge other listeners to think about the privacy questions raised by the Semantic Web!

Update: the broadcast is streamed at this link.

Transparency for behavioral profiling

Behavioral targeting is pervasive on the Web. As documented by a very nicely-researched New York Times story today (‘To Aim Ads, Web Is Keeping Closer Eye on You,’ NYT, by Louise Story, 10 March 2008), it’s now clear that each of us who uses popular search engines and portals is the subject of thousands of individual data collection events per month of Web usage.

I’m glad to see some clear analysis of the practice out there but would like to see an additional level of transparency. If it is the case that profiling is benign, then why not tell users what aspect of their profile triggered the placement of a particular ad? The ad delivery systems all make decisions about which ads to place for a given user from some properties of that user that are either known or inferred. Why not just tell us what those properties are along with the ad placement? This would go a long way toward eliminating the feeling that we’re being ‘spied on’ because it would eliminate any sense of secrecy about what is learned in the course of the behavioral monitoring. My guess is that many people would ignore the profile data, but some would check it, and we’d all have peace of mind from knowing that whatever is being done is happening out in the open.
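Here is a minimal sketch of what that kind of disclosure might look like. It assumes, purely for illustration, that an ad is triggered when a fixed set of profile properties are all present; the ad names, property names and the `place_ad` function are hypothetical, and real ad delivery systems are far more elaborate.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    name: str
    required: frozenset  # profile properties that trigger this ad


@dataclass
class Placement:
    ad_name: str
    because: list  # properties disclosed to the user alongside the ad


# Hypothetical ad inventory.
ADS = [
    Ad("dog food", frozenset({"has_dog"})),
    Ad("hiking boots", frozenset({"likes_outdoors", "age_25_34"})),
]


def place_ad(profile, ads):
    """Pick the first ad whose triggers are all in the profile and
    report exactly which properties caused the match."""
    for ad in ads:
        if ad.required <= profile:
            return Placement(ad.name, sorted(ad.required))
    return None


# The profile is a set of known or inferred properties about one user.
placement = place_ad({"has_dog", "reads_news"}, ADS)
print(placement.ad_name, "shown because of:", placement.because)
```

The design choice is that the explanation travels with the ad itself, so a curious user can check it without filing a request, and a user who does not care can ignore it.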

According to the Times, data is collected on which web pages we look at and is then combined with other data (demographics, browsing history, purchases on partner sites, etc.). Right on cue traditional privacy advocates declare that profiles developed in this way (based on our behavior) do (or should) make us feel uneasy:

“When you start to get into the details, it’s scarier than you might suspect,” said Marc Rotenberg, executive director of the Electronic Privacy Information Center, a privacy rights group. “We’re recording preferences, hopes, worries and fears.”

No doubt people (at least some people) feel alarmed about this and probably others are either implicitly or explicitly happy to have the right ads targeted to them. As an online ad agency exec said in the article:

“Everyone feels that if we can get more data, we could put ads in front of people who are interested in them,” he said. “That’s the whole idea here: put dog food ads in front of people who have dogs.”

Unless we’re going to require an outright ban on this sort of behavioral targeting, the question is what to do about it. Is the goal to allay people’s fears? To limit the use of the profiles? Or to help people avoid incorrect targeting?

The statistics developed by comScore for the New York Times article do a nice job of illustrating the magnitude of data collection that happens. Jules Polonetsky, AOL’s Chief Privacy Officer, is launching a new consumer education campaign to explain the mechanics of data collection and tracking to users. The light that both the Times stories and the AOL campaign shed on marketing practices is valuable.

Many people are going to be far more interested in how this profiling actually affects them than in the overall magnitude of the practice. Is there any reason not to be upfront with people about the basis for delivering an ad? If there is, then there is reason to feel that we’re being deceived or manipulated, not assisted, by the behavior tracking techniques.

The political power of (simple) Web computing

It’s pretty amazing what a little bit of structured computer power can do when deployed on the Web. Slate’s Delegate Calculator puts in the hands of Web-enabled citizens some simple computing power that helps us to understand how the delegate counts in the upcoming Democratic primaries may affect the final outcome for Obama and Clinton over the next hours, weeks and months. The knowledge about which states have how many delegates, how they might be apportioned, etc., is information that used to be a closely guarded secret of the political intelligentsia and the press. Now, it’s out there for all of us to see. It’s such a useful tool that many reporters from other publications are actually writing about it:

Jonathan Alter, Hillary’s Math Problem, Newsweek (4 March 2008)

Peter Baker, Clinton Down, but not Out, for the Count, Washington Post.

Jason Tuohey, Delegate Counter, Boston Globe

Carol Lockhead, Obama Wins Vermont, But Look at the Math, San Francisco Chronicle.

Granted, Slate has a relationship with some of those news outlets, but it’s still striking to see computing make the political news.

Important FCC hearing on Net Neutrality in Cambridge, MA

I’d encourage anyone in or around the Boston, MA area to come to the Federal Communications Commission’s field hearing on Broadband Network Management Practices. I’ll be testifying along with a range of witnesses, Dave Clark and David Reed (colleagues from MIT), representatives from various commercial groups, and a number of advocacy organizations such as Free Press. I understand Congressman Ed Markey, a longtime champion of the Internet and the Web, will also be appearing.

Here are the logistical details:

Monday, Feb 25, 2008
11:00 a.m. to 4:00 p.m.
Harvard Law School, Ames Courtroom, Austin Hall
1515 Massachusetts Avenue, Cambridge, Mass.

Using the Web as an independent source of linked facts to shed light on the news

Following the story about staff changes in Senator Clinton’s presidential campaign ( Clinton Campaign Manager Is Out - The Caucus - Politics - New York Times Blog), there seemed to be some question about whether the campaign cancelled a trip from Washington, DC down to Roanoke, VA due to weather or other factors (the implication being that the campaign was in disarray so cancelled the trip). The Times wrote:

The announcement of Ms. Solis Doyle’s replacement came minutes after Mrs. Clinton was grounded by what her campaign said were high winds at Dulles Airport. After arriving at the airport for a charter flight to Roanoke, Mrs. Clinton, her staff and the traveling press corps were not allowed to board the plane.

A spokesman for Mrs. Clinton said high winds at the airport had forced “a number of planes” to be kept on the ground, and that some planes that had taken off today had suffered structural damage. (Other planes at the airport were taking off as Mrs. Clinton’s motorcade drove away, en route to Washington.)

leaving open the suggestion that the report of high winds and flight cancellations could have been a ruse.

The Web provides independent, on the spot fact checking here with just a few clicks. Follow this link and you can see that there were a number of small planes that took off from Dulles and landed at Roanoke at the time in question, though a few were late.

It would actually be useful if news reports would just reference these clearly documented facts on the Web and allow readers to draw their own conclusions, rather than leave speculation hanging.

Reciprocal Privacy for the Social Web (a.k.a. FOAF)

I’ve loved the idea of FOAF for a long time but have always been bothered by the privacy risks that would result if FOAF really took off as a way to represent our social networks. Here’s an idea about how to address privacy in open social networks such as those represented by FOAF-like data structures.

It’s called (for now) REP: Reciprocal Privacy for Social Networks

ReP is a proposal to establish a reasonable privacy balance in social networking environments. Today, more and more social networks are coming onto the Web and are working to share more data across the established boundaries that have previously separated these networks. Participants in social networks should have the benefit of widely shared agreements about how the information they present in those networks will be analyzed and used. To encourage the development of these social and legal privacy norms, we need a simple policy language for expressing rules associated with personal information, and a reliable, scalable mechanism for assessing accountability with those rules. We propose a new protocol by which those who share personal information on the Web can have increased confidence that this information will be used in a transparent manner and that users of the personal information can be held accountable for complying with the stated usage rules.

Privacy policies and associated technologies must provide individuals harmed by breaches with legal recourse against those who abuse the norms of information usage. Hence, agreements must be clear and structured in such a manner that there is a chance that the existing legal system could provide a remedy for harm. We should neither expect nor require that a single set of norms will be adequate for all users, all social networking contexts or all cultures, but there should be a common framework and a basic policy vocabulary that can express commonly used rules and be easily extended.

The key to sharing personal information across a diversity of privacy policy frameworks is to establish legal and technical mechanisms that ensure a baseline of social and legal accountability across varying rulesets. Participants in the ReP web must agree as a condition of accessing anyone else’s personal information that usage of personal information will be reported by the user to a log specified by the data subject. Further, anyone who uses the personal information must agree to require that the same set of rules (both the logging requirement and whatever usage rules came with the data) be applied to any subsequent users of the data. The log will allow the data subject to check that a specific usage of personal information complies with the specified usage limitations, and to follow the trail of accountability from the initial access of the data through to the final usage event.

This copy-left-inspired viral policy is the most effective way to assure that the original rules associated with personal data are respected as that data is re-used over and over again in a variety of contexts. In the event of misuse, the logs will provide a means to locate the mis-user and seek correction or other redress. In the event that a use of personal information is discovered which is NOT recorded in the person’s accountability log, that use is by definition a violation of the ReP policy. In many cases where such unauthorized use does real harm to the data subject, it will be possible, with some amount of forensic effort, to find the mis-user and enable redress. Of course, there will be anonymous mis-users of personal information. We cannot insulate Web users from those risks with ReP, but neither can any other privacy protection strategy that is feasible in an inherently open information environment.
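To make the logging and pass-along rules a bit more concrete, here is a rough Python sketch of how they might fit together. Nothing like this has been built, and every name in it (SharedData, AccountabilityLog, use, complies) is invented for illustration; it is not taken from the design document. It assumes the log is a simple in-memory list, whereas a real ReP log would presumably live at a Web address chosen by the data subject.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional


@dataclass
class LogEntry:
    user: str              # who used the data
    purpose: str           # what they say they used it for
    source: Optional[str]  # who they got it from (None = the data subject)


@dataclass
class AccountabilityLog:
    """Hypothetical stand-in for the log specified by the data subject."""
    entries: List[LogEntry] = field(default_factory=list)

    def report(self, entry: LogEntry) -> None:
        self.entries.append(entry)

    def is_recorded(self, user: str, purpose: str) -> bool:
        return any(e.user == user and e.purpose == purpose for e in self.entries)

    def trail(self, user: str) -> List[str]:
        """Follow the chain from `user` back toward the initial access."""
        source_of: Dict[str, Optional[str]] = {}
        for e in self.entries:
            source_of.setdefault(e.user, e.source)  # keep the first report per user
        chain = [user]
        nxt = source_of.get(user)
        while nxt is not None and nxt not in chain:  # guard against cycles
            chain.append(nxt)
            nxt = source_of.get(nxt)
        return chain


@dataclass
class SharedData:
    subject: str
    payload: str
    allowed_purposes: FrozenSet[str]  # the usage rules that travel with the data
    log: AccountabilityLog            # the logging requirement that travels with it
    holder: Optional[str] = None      # current holder (None = the subject)


def use(data: SharedData, user: str, purpose: str) -> SharedData:
    """Report the use to the subject's log, then return a copy for `user`
    that carries the same rules and the same log (the 'viral' part)."""
    data.log.report(LogEntry(user, purpose, data.holder))
    return SharedData(data.subject, data.payload,
                      data.allowed_purposes, data.log, holder=user)


def complies(data: SharedData, user: str, purpose: str) -> bool:
    """A use complies only if it was logged AND falls within the stated rules."""
    return purpose in data.allowed_purposes and data.log.is_recorded(user, purpose)
```

In this sketch, a use that never reaches the log fails the check automatically, which mirrors the rule that an unrecorded use is by definition a violation. A logged use for a purpose outside the rules also fails, and the trail shows who passed the data along.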

There’s more to read in a skeletal REP design document.

The policy is still rough and the technology hasn’t been built yet, but I’d still really like reactions. :-)

GPS Luddites - the English countryside rebels against satnav

Seems that a number of villages in the English countryside are being overrun by errant trans-European trucks which are regularly misdirected by their GPS satnav systems onto roads that are better suited for horse-drawn carriages than big, long-distance trucks. According to the New York Times (“Wedmore Journal: Turn Back. Exit Village. Truck Shortcut Hitting Barrier.” Sarah Lyall, 4 December 2007, p.A7):

trucks and tractor-trailers come here [to Wedmore] all the time, as they do in similarly inappropriate spots across Britain, directed by G.P.S. navigation devices that fail to appreciate that the shortest route is not always the best route. “They have no idea where they are,” said Wayne Hahn, a local store owner who watches a daily parade of vehicles come to grief — hitting fences, shearing mirrors from cars and becoming stuck at the bottom of Wedmore’s lone hill.

The head of the parish council offers a practical suggestion:

John Sanderson, chairman of the parish council, has proposed a seemingly simple remedy: removing the route through Wedmore from the G.P.S. navigation systems used by large vehicles.

“We’d like them to have appropriate systems that would show some routes weren’t suitable for H.G.V.’s,” Mr. Sanderson said, using shorthand for heavy goods vehicles.

Mr. Sanderson said he would not go so far as to advocate eradicating Wedmore from the map.

But others go farther:

“We’ve said, ‘Just take us off the map,’ actually,” said Geoff Coombs, chairman of the parish council in Barrow Gurney, a village that, despite being too small to have a sidewalk, is host to some 15,000 vehicles a day, cars as well as larger vehicles, whose G.P.S. systems identify it as a good alternative route to Bristol Airport.

Semantic web geo-taggers, start your engines. There are lots of ways creative metadata could help here, but my guess is that as the Web gets ‘smarter,’ some of what happens out in the world as a result will seem just plain dumb. :-)

Free speech-related privacy rights of book buying (and reading?) records

Last week, a Federal Magistrate in Wisconsin published an important opinion articulating limits on the government’s power to demand access to records of individuals’ book-buying activity held by 3rd parties such as Amazon.com. The case (IN RE GRAND JURY SUBPOENA TO AMAZON.COM DATED AUGUST 7, 2006) arose in the course of an FBI/IRS investigation of an individual who sells lots of used books on Amazon and was suspected of large-scale tax evasion. In order to develop the case, the Federal investigators acting through a grand jury:

directed Amazon to provide virtually all of its records regarding D’Angelo, including the identities of the thousands of customers who had bought used books from D’Angelo. The government subsequently chose to reduce this scope of this request to the identification of 120 book buyers, 30 per year for the four years under investigation. The government’s plan was for special agents of the FBI and IRS to contact these 120 used book buyers in an attempt to develop concrete evidence necessary to lay a transactional foundation for criminal charges of fraud and tax evasion against D’Angelo. The government does not suspect Amazon or D’Angelo’s customers of any wrongdoing, nor does it consider them victims of D’Angelo; they simply are bricks in the evidentiary wall being erected by the grand jury.

Rather than comply with the subpoena, Amazon exercised its legal right to move to have the government’s request ‘quashed,’ as is allowed under law. Responding to this motion to quash, the Magistrate acted to protect the First Amendment rights of the buyers whose identities would be revealed if Amazon responded to the subpoena. The Magistrate concluded that “the government is not entitled to unfettered access to the identities of even a small sample of this group of book buyers without each book buyer’s permission.” Hence, he ordered a special procedure by which those Amazon customers who bought from the suspect during the relevant time period would be asked, in a manner that did not reveal their identity, whether they would be willing, on a voluntary basis, to have their records turned over to the government.

In the end, the government withdrew the subpoena altogether, telling the Wisconsin State Journal that they were able to get names by analyzing the suspect’s seized computer.

Beyond the First Amendment rationale offered in this case, more striking is the Magistrate’s assessment of the public mood with respect to privacy in general in the wake of the Patriot Act and warrantless wiretapping activity.

…[I]t is an unsettling and un-American scenario to envision federal agents nosing through the reading lists of law-abiding citizens while hunting for evidence against somebody else. In this era of public apprehension about the scope of the USAPATRIOT Act, the FBI’s (now-retired) “Carnivore” Internet search program, and more recent highly-publicized admissions about political litmus tests at the Department of Justice, rational book buyers would have a non-speculative basis to fear that federal prosecutors and law enforcement agents have a secondary political agenda that could come into play when an opportunity presented itself. Undoubtedly a measurable percentage of people who draw such conclusions would abandon online book purchases in order to avoid the possibility of ending up on some sort of perceived “enemies list.”

While cautioning (in a footnote) that he did not formally recognize these fears to be well-founded, he nonetheless felt he had to act to limit government power in this case because:

…if word were to spread over the Net–and it would–that the FBI and the IRS had demanded and received Amazon’s list of customers and their personal purchases, the chilling effect on expressive e-commerce would frost keyboards across America. Fiery rhetoric quickly would follow and the nuances of the subpoena (as actually written and served) would be lost as the cyberdebate roiled itself to a furious boil. One might ask whether this court should concern itself with blogger outrage disproportionate to the government’s actual demand of Amazon. The logical answer is yes, it should: well-founded or not, rumors of an Orwellian federal criminal investigation into the reading habits of Amazon’s customers could frighten countless potential customers into canceling planned online book purchases, now and perhaps forever.

There are three very important caveats to add, however. First, this opinion is only that of one Federal magistrate in one district court. It is not binding on any other part of the country and there are often widely divergent opinions from magistrates. Second, we don’t know how this reasoning might apply to a subpoena issued by a private party in civil litigation (say a divorce lawyer looking to impugn the integrity of an opposing spouse by revealing unsavory reading habits). Finally, as the government dropped its request altogether, this case will never be heard by any other court to be either affirmed or overturned. So, it will hang out there as one view of the privacy problems associated with subpoenas of private information held by 3rd parties.



Creative Commons Attribution-NonCommercial 3.0 Unported