Important New Jersey Supreme Court decision on Internet privacy

The New Jersey Supreme Court (State of New Jersey v. Shirley Reid (A-105-06)) has issued an important decision on Internet users’ right to privacy. The case involves a dispute about whether an ISP violated a user’s privacy rights by turning over subscriber information (name, address, billing details) associated with a particular IP address. It turns out that the subpoena served on the ISP was invalid for a variety of reasons. As the user had a ‘reasonable expectation of privacy’ in her Internet activities and identifying information, and because the subpoena served on the ISP was invalid, the New Jersey court determined that the ISP should not have turned over the personal data.

The important aspect of this case in the evolving understanding of privacy on the Internet is the court’s recognition that we must look at privacy from the broad perspective of what can actually be discovered about people online. In this way, the ruling has significant strengths and weaknesses from a privacy perspective. On the one hand, the court finds that there is, today, an expectation of privacy in IP addresses because they are currently hard to link to personal identity. There have been lots of disputes in the US and the EU about whether IP addresses are ‘personally identifying information’ (“PII” in the jargon of privacy). This court takes a pragmatic view of this question and finds that IP addresses should be considered private for now, but that this may change. The court finds:

the reasonableness of the privacy interest may change as technology evolves. A reasonable expectation of privacy is required to establish a protected privacy interest…. Internet users today enjoy relatively complete IP address anonymity when surfing the Web. Given the current state of technology, the dynamic, temporarily assigned, numerical IP address cannot be matched to an individual user without the help of an ISP. Therefore, we accept as reasonable the expectation that one’s identity will not be discovered through a string of numbers left behind on a website.

The availability of IP Address Locator Websites has not altered that expectation because they reveal the name and address of service providers but not individual users. Should that reality change over time, the reasonableness of the expectation of privacy in Internet subscriber information might change as well. For example, if one day new software allowed individuals to type IP addresses into a “reverse directory” and identify the name of a user — as is possible with reverse telephone directories — today’s ruling might need to be reexamined.

Others have written about the legal details of this case and have suggested that it is a big win for privacy. Given the reliance on the shifting state of identity technology, I’m a little less sanguine.

This case is yet another reason why I believe (as I’ve explained elsewhere) that meaningful privacy on the Web requires rules that govern how personal information is used, not just what can be collected. Under the court’s reasoning, the more transparent our lives become, the weaker our expectation of privacy, which would justify increasingly harmful uses of personal data. While it’s pretty hard to control how exposed we all become, we can still limit how powerful institutions (governments, etc.) use personal data about us.

Bob Metcalfe’s wisdom on patents and innovation

Ethernet inventor, journalist and now venture capitalist Bob Metcalfe speaks on the lessons from the Internet community for the global warming arena. In looking at how to accelerate technical innovation to address climate change, Metcalfe asserts that:

“… the place to do research is in university labs. The best vehicle for technology innovation is not patents, it’s students.”

Of course, Bob also manages to express his disdain for monopoly, Bell Labs, and even Al Gore. (See report by Martin LaMonica.) I’m not sure about those but think he’s right on with respect to patents.

On meetings

Ever the astute observer of the various features and bugs of our collective behavior, a longtime mentor of mine, Mitch Kapor, has coined a new definition:

Meetingboarding: (n) the sensation of being unable to breathe arising from continuous immersion in meeting after meeting

I’d add to this a characterization of email that I learned from Mitch many years ago:

The problem with email is that it has low emotional bandwidth.
-Mitch Kapor, circa 1991

Today - NPR Science Friday program on Web privacy issues

National Public Radio’s Science Friday program will feature a discussion of online privacy with Alessandro Acquisti of CMU and yours truly a little later today. It’s live from 3:00 - 4:00 pm Eastern/US, rebroadcast at various times depending on where you live, and streamed on the Web.

Listen in. Call and challenge other listeners to think about the privacy questions raised by the Semantic Web!

Update: the broadcast is streamed at this link.

Transparency for behavioral profiling

Behavioral targeting is pervasive on the Web. As documented by a very nicely researched New York Times story today (‘To Aim Ads, Web Is Keeping Closer Eye on You,’ NYT, by Louise Story, 10 March 2008), it’s now clear that each of us who uses popular search engines and portals is the subject of thousands of individual data collection events per month of Web usage.

I’m glad to see some clear analysis of the practice out there but would like to see an additional level of transparency. If it is the case that profiling is benign, then why not tell users what aspect of their profile triggered the placement of a particular ad? The ad delivery systems all make decisions about which ads to place for a given user from some properties of that user that are either known or inferred. Why not just tell us what those properties are along with the ad placement? This would go a long way toward eliminating the feeling that we’re being ‘spied on’ because it would eliminate any sense of secrecy about what is learned in the course of the behavioral monitoring. My guess is that many people would ignore the profile data, but some would check it, and we’d all have peace of mind from knowing that whatever is being done is happening out in the open.
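To make the suggestion concrete, here is a minimal sketch in Python of an ad selector that returns, along with the ad, the profile properties that triggered it. Everything in it (the `Ad` and `Placement` classes, the `place_ad` function, the property names) is hypothetical and invented for illustration; it does not describe how any real ad delivery system works.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Ad:
    name: str
    targets: Dict[str, str]  # profile properties this ad is aimed at (hypothetical)


@dataclass
class Placement:
    ad: str
    triggered_by: Dict[str, str]  # the properties that caused this placement


def place_ad(profile: Dict[str, str], ads: List[Ad]) -> Optional[Placement]:
    """Return the first ad whose targets all match the profile,
    together with the matching properties, or None if nothing matches."""
    for ad in ads:
        if all(profile.get(key) == value for key, value in ad.targets.items()):
            # Disclose exactly the properties used in the decision, nothing more.
            return Placement(ad=ad.name, triggered_by=dict(ad.targets))
    return None


if __name__ == "__main__":
    profile = {"has_dog": "yes", "region": "MA"}
    ads = [Ad("cat toys", {"has_cat": "yes"}), Ad("dog food", {"has_dog": "yes"})]
    print(place_ad(profile, ads))
```

The point of the sketch is only that the explanation falls out of the decision itself: whatever properties the system matched on are the ones it can show the user next to the ad.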

According to the Times, data is collected on which web pages we look at and is then combined with other data (demographics, browsing history, purchases on partner sites, etc.). Right on cue, traditional privacy advocates declare that profiles developed in this way (based on our behavior) do (or should) make us feel uneasy:

“When you start to get into the details, it’s scarier than you might suspect,” said Marc Rotenberg, executive director of the Electronic Privacy Information Center, a privacy rights group. “We’re recording preferences, hopes, worries and fears.”

No doubt people (at least some people) feel alarmed about this, and probably others are either implicitly or explicitly happy to have the right ads targeted to them. As an online ad agency exec said in the article:

“Everyone feels that if we can get more data, we could put ads in front of people who are interested in them,” he said. “That’s the whole idea here: put dog food ads in front of people who have dogs.”

Unless we’re going to require an outright ban on this sort of behavioral targeting, the question is what to do about it. Is the goal to allay people’s fears? To limit the use of the profiles? Or to help people avoid incorrect targeting?

The statistics developed by comScore for the New York Times article do a nice job of illustrating the magnitude of data collection that happens. Jules Polonetsky, AOL’s Chief Privacy Officer, is launching a new consumer education campaign to explain the mechanics of data collection and tracking to users. The light that both the Times stories and the AOL campaign shed on marketing practices is valuable.

Many people are going to be far more interested in how this profiling actually affects them than in the overall magnitude of the practice. Is there any reason not to be upfront with people about the basis for delivering an ad? If there is, then there is reason to feel that we’re being deceived or manipulated, not assisted, by the behavior tracking techniques.

The political power of (simple) Web computing

It’s pretty amazing what a little bit of structured computer power can do when deployed on the Web. Slate’s Delegate Calculator puts in the hands of Web-enabled citizens some simple computing power that helps us to understand how the delegate counts in the upcoming Democratic primaries may affect the final outcome for Obama and Clinton over the next hours, weeks and months. The knowledge about which states have how many delegates, how they might be apportioned, etc., is information that used to be a closely guarded secret of the political intelligentsia and the press. Now, it’s out there for all of us to see. It’s such a useful tool that many reporters from other publications are actually writing about it:

Jonathan Alter, Hillary’s Math Problem, Newsweek (4 March 2008)

Peter Baker, Clinton Down, but not Out, for the Count, Washington Post.

Jason Tuohey, Delegate Counter, Boston Globe

Carol Lockhead, Obama Wins Vermont, But Look at the Math, San Francisco Chronicle.

Granted, Slate has a relationship with some of those news outlets, but it’s still striking to see computing make the political news.

Important FCC hearing on Net Neutrality in Cambridge, MA

I’d encourage anyone in or around the Boston, MA area to come to the Federal Communications Commission’s field hearing on Broadband Network Management Practices. I’ll be testifying along with a range of witnesses, including Dave Clark and David Reed (colleagues from MIT), representatives from various commercial groups, and a number of advocacy organizations such as Free Press. I understand Congressman Ed Markey, a longtime champion of the Internet and the Web, will also be appearing.

Here are the logistical details:

Monday, Feb 25, 2008
11:00 a.m. to 4:00 p.m.
Harvard Law School, Ames Courtroom, Austin Hall
1515 Massachusetts Avenue, Cambridge, Mass.

Using the Web as an independent source of linked facts to shed light on the news

Following the story about staff changes in Senator Clinton’s presidential campaign (Clinton Campaign Manager Is Out - The Caucus - Politics - New York Times Blog), there seemed to be some question about whether the campaign cancelled a trip from Washington, DC down to Roanoke, VA due to weather or other factors (the implication being that the campaign was in disarray and so cancelled the trip). The Times wrote:

The announcement of Ms. Solis Doyle’s replacement came minutes after Mrs. Clinton was grounded by what her campaign said were high winds at Dulles Airport. After arriving at the airport for a charter flight to Roanoke, Mrs. Clinton, her staff and the traveling press corps were not allowed to board the plane.

A spokesman for Mrs. Clinton said high winds at the airport had forced “a number of planes” to be kept on the ground, and that some planes that had taken off today had suffered structural damage. (Other planes at the airport were taking off as Mrs. Clinton’s motorcade drove away, en route to Washington.)

leaving open the suggestion that the report of high winds and flight cancellations could have been a ruse.

The Web provides independent, on-the-spot fact checking here with just a few clicks. Follow this link and you can see that there were a number of small planes that took off from Dulles and landed at Roanoke at the time in question, though a few were late.

It would actually be useful if news reports would just reference these clearly documented facts on the Web and allow readers to draw their own conclusions, rather than leave speculation hanging.

Reciprocal Privacy for the Social Web (a.k.a. FOAF)

I’ve loved the idea of FOAF for a long time but have always been bothered by the privacy risks that would result if FOAF really took off as a way to represent our social networks. Here’s an idea about how to address privacy in open social networks such as those represented by FOAF-like data structures.

It’s called (for now) ReP: Reciprocal Privacy for Social Networks

ReP is a proposal to establish a reasonable privacy balance in social networking environments. Today, more and more social networks are coming onto the Web and are working to share more data across the boundaries that have previously separated these networks. Participants in social networks should have the benefit of widely shared agreements about how the information they present in those networks will be analyzed and used. To encourage the development of these social and legal privacy norms, we need a simple policy language for expressing rules associated with personal information, and a reliable, scalable mechanism for assessing accountability to those rules. We propose a new protocol by which those who share personal information on the Web can have increased confidence that this information will be used in a transparent manner and that users of the personal information can be held accountable for complying with the stated usage rules.

Privacy policies and associated technologies must provide individuals harmed by breaches with legal recourse against those who abuse the norms of information usage. Hence, agreements must be clear and structured in such a manner that there is a chance that the existing legal system could provide a remedy for harm. We should neither expect nor require that a single set of norms will be adequate for all users, all social networking contexts or all cultures, but there should be a common framework and a basic policy vocabulary that can express commonly used rules and be easily extended.

The key to sharing personal information across a diversity of privacy policy frameworks is to establish legal and technical mechanisms that ensure a baseline of social and legal accountability across varying rulesets. Participants in the ReP web must agree as a condition of accessing anyone else’s personal information that usage of personal information will be reported by the user to a log specified by the data subject. Further, anyone who uses the personal information must agree to require that the same set of rules (both the logging requirement and whatever usage rules came with the data) be applied to any subsequent users of the data. The log will allow the data subject to check that a specific usage of personal information complies with the specified usage limitations, and to follow the trail of accountability from the initial access of the data through to the final usage event.

This copy-left-inspired viral policy is the most effective way to assure that the original rules associated with personal data are respected as that data is re-used over and over again in a variety of contexts. In the event of misuse, the logs will provide a means to locate the mis-user and seek correction or other redress. In the event that a use of personal information is discovered which is NOT recorded in the person’s accountability log, that use is by definition a violation of the ReP policy. In many cases where such unauthorized use does real harm to the data subject, it will be possible, with some amount of forensic effort, to find the mis-user and enable redress. Of course, there will be anonymous mis-users of personal information. We cannot insulate Web users from those risks with ReP, but neither can any other privacy protection strategy that is feasible in an inherently open information environment.
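As a rough illustration of the mechanics described above (not the ReP design itself, which hasn’t been built), here is a minimal Python sketch. All of the names in it (`LogEntry`, `AccountabilityLog`, `ProtectedData`, `noncompliant_uses`, `is_unlogged`) are hypothetical. It assumes the usage rules are just a set of allowed purposes, and it simulates the log as an in-memory list rather than a service on the Web.

```python
from dataclasses import dataclass, field, replace
from typing import FrozenSet, List, NamedTuple


class LogEntry(NamedTuple):
    user: str    # who used or passed on the data
    action: str  # "use" or "share"
    detail: str  # purpose of a use, or recipient of a share


@dataclass
class AccountabilityLog:
    """The log specified by the data subject (in-memory stand-in)."""
    entries: List[LogEntry] = field(default_factory=list)

    def report(self, user: str, action: str, detail: str) -> None:
        self.entries.append(LogEntry(user, action, detail))


@dataclass(frozen=True)
class ProtectedData:
    """Personal data that always travels with its rules and its log."""
    subject: str
    value: str
    allowed_purposes: FrozenSet[str]
    log: AccountabilityLog
    holder: str

    def use(self, purpose: str) -> str:
        # Logging requirement: every use is reported, compliant or not.
        self.log.report(self.holder, "use", purpose)
        return self.value

    def share(self, recipient: str) -> "ProtectedData":
        # Viral rule: the recipient's copy carries the same rules and log.
        self.log.report(self.holder, "share", recipient)
        return replace(self, holder=recipient)


def noncompliant_uses(data: ProtectedData) -> List[LogEntry]:
    """Logged uses whose purpose falls outside the subject's rules."""
    return [e for e in data.log.entries
            if e.action == "use" and e.detail not in data.allowed_purposes]


def is_unlogged(log: AccountabilityLog, user: str, purpose: str) -> bool:
    """A discovered use missing from the log is a violation by definition."""
    return LogEntry(user, "use", purpose) not in log.entries
```

In this sketch the data subject can walk the log from the first share to the last use, and the two checks correspond to the two kinds of violation: a logged use that breaks the rules, and a discovered use that was never logged at all.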

There’s more to read in a skeletal ReP design document.

The policy is still rough and the technology hasn’t been built yet, but I’d still really like reactions. :-)

GPS Luddites - the English countryside rebels against satnav

Seems that a number of villages in the English countryside are being overrun by errant trans-European trucks which are regularly misdirected by their GPS satnav systems onto roads that were better suited for horse-drawn carriages than big, long-distance trucks. According to the New York Times (“Wedmore Journal: Turn Back. Exit Village. Truck Shortcut Hitting Barrier.” Sarah Lyall, 4 December 2007, p.A7):

trucks and tractor-trailers come here [to Wedmore] all the time, as they do in similarly inappropriate spots across Britain, directed by G.P.S. navigation devices that fail to appreciate that the shortest route is not always the best route. “They have no idea where they are,” said Wayne Hahn, a local store owner who watches a daily parade of vehicles come to grief — hitting fences, shearing mirrors from cars and becoming stuck at the bottom of Wedmore’s lone hill.

The head of the parish council offers a practical suggestion:

John Sanderson, chairman of the parish council, has proposed a seemingly simple remedy: removing the route through Wedmore from the G.P.S. navigation systems used by large vehicles.

“We’d like them to have appropriate systems that would show some routes weren’t suitable for H.G.V.’s,” Mr. Sanderson said, using shorthand for heavy goods vehicles.

Mr. Sanderson said he would not go so far as to advocate eradicating Wedmore from the map.

But others go farther:

“We’ve said, ‘Just take us off the map,’ actually,” said Geoff Coombs, chairman of the parish council in Barrow Gurney, a village that, despite being too small to have a sidewalk, is host to some 15,000 vehicles a day, cars as well as larger vehicles, whose G.P.S. systems identify it as a good alternative route to Bristol Airport.

Semantic web geo-taggers, start your engines. There are lots of ways creative metadata could help here, but my guess is that as the Web gets ‘smarter,’ some of what happens out in the world as a result will seem just plain dumb. :-)

Free speech-related privacy rights of book buying (and reading?) records

Last week, a Federal Magistrate in Wisconsin published an important opinion articulating limits on the government’s power to demand access to records of individuals’ book-buying activity held by 3rd parties such as Amazon.com. The case (IN RE GRAND JURY SUBPOENA TO AMAZON.COM DATED AUGUST 7, 2006) arose in the course of an FBI/IRS investigation of an individual who sells lots of used books on Amazon and was suspected of large-scale tax evasion. In order to develop the case, the Federal investigators acting through a grand jury:

directed Amazon to provide virtually all of its records regarding D’Angelo, including the identities of the thousands of customers who had bought used books from D’Angelo. The government subsequently chose to reduce this scope of this request to the identification of 120 book buyers, 30 per year for the four years under investigation. The government’s plan was for special agents of the FBI and IRS to contact these 120 used book buyers in an attempt to develop concrete evidence necessary to lay a transactional foundation for criminal charges of fraud and tax evasion against D’Angelo. The government does not suspect Amazon or D’Angelo’s customers of any wrongdoing, nor does it consider them victims of D’Angelo; they simply are bricks in the evidentiary wall being erected by the grand jury.

Rather than comply with the subpoena, Amazon exercised its legal right to move to have the government request ‘quashed.’ Responding to this motion to quash, the Magistrate acted to protect the First Amendment rights of the buyers whose identity would be revealed if Amazon responded to the subpoena. The Magistrate concluded that “the government is not entitled to unfettered access to the identities of even a small sample of this group of book buyers without each book buyer’s permission.” Hence, he ordered a special procedure by which those Amazon customers who bought from the suspect during the relevant time period would be asked, in a manner that did not reveal their identity, whether they would be willing, on a voluntary basis, to have their records turned over to the government.

In the end, the government withdrew the subpoena altogether, telling the Wisconsin State Journal that they were able to get names by analyzing the suspect’s seized computer.

Beyond the First Amendment rationale offered in this case, more striking is the Magistrate’s assessment of the public mood with respect to privacy in general in the wake of the Patriot Act and warrantless wiretapping activity.

…[I]t is an unsettling and un-American scenario to envision federal agents nosing through the reading lists of law-abiding citizens while hunting for evidence against somebody else. In this era of public apprehension about the scope of the USAPATRIOT Act, the FBI’s (now-retired) “Carnivore” Internet search program, and more recent highly-publicized admissions about political litmus tests at the Department of Justice, rational book buyers would have a non-speculative basis to fear that federal prosecutors and law enforcement agents have a secondary political agenda that could come into play when an opportunity presented itself. Undoubtedly a measurable percentage of people who draw such conclusions would abandon online book purchases in order to avoid the possibility of ending up on some sort of perceived “enemies list.”

While cautioning (in a footnote) that he did not formally recognize these fears to be well-founded, he nonetheless felt he had to act to limit government power in this case because:

…if word were to spread over the Net–and it would–that the FBI and the IRS had demanded and received Amazon’s list of customers and their personal purchases, the chilling effect on expressive e-commerce would frost keyboards across America. Fiery rhetoric quickly would follow and the nuances of the subpoena (as actually written and served) would be lost as the cyberdebate roiled itself to a furious boil. One might ask whether this court should concern itself with blogger outrage disproportionate to the government’s actual demand of Amazon. The logical answer is yes, it should: well-founded or not, rumors of an Orwellian federal criminal investigation into the reading habits of Amazon’s customers could frighten countless potential customers into canceling planned online book purchases, now and perhaps forever.

There are three very important caveats to add, however. First, this opinion is only that of one Federal magistrate in one district court. It is not binding on any other part of the country and there are often widely divergent opinions from magistrates. Second, we don’t know how this reasoning might apply to a subpoena issued by a private party in civil litigation (say a divorce lawyer looking to impugn the integrity of an opposing spouse by revealing unsavory reading habits). Finally, as the government dropped its request altogether, this case will never be heard by any other court to be either affirmed or overturned. So, it will hang out there as one view of the privacy problems associated with subpoenas of private information held by 3rd parties.


Data sharing and integration for local and state law enforcement

The Washington Post reports today (“System Lets Agencies In Area Share Data,” Mary Beth Sheridan, Thursday, November 29, 2007; Page B03) that over 60 state, local and federal law enforcement agencies in the Washington DC area announced a plan to share information (including 6 million mug shots and 14 million arrest records).

In what they called a breakthrough, law enforcement officials yesterday unveiled a computer system that will allow more than 60 state and local police agencies in the D.C. area to share mug shots and crime reports.

The system, Law Enforcement Information Exchange (LInX), functions like Google for police, except that the database contains law enforcement information.

This despite the fact that several years ago some civil libertarians criticized an earlier version of this multi-state data sharing system, ominously named MATRIX (Multistate Anti-Terrorism Information Exchange). The ACLU even went so far as to declare MATRIX ‘dead.’ However, now it seems that LInX includes participation from Florida, Georgia, Hawaii, Texas, Virginia, Washington, the DC-area, and soon New Mexico.

I don’t know what sort of technology the system is built on but if it’s not Semantic Web Linked Data-style architecture now, it should be and probably soon will be.

On the importance of ICANN - a wise view from Joi Ito

On leaving the ICANN Board of Directors after a 3 year term, Joi Ito, one of the true leaders of the global Internet/Web community, writes:

Joi Ito’s Web: Three years with ICANN
With all of it’s tumultuous history and bumps and warts, ICANN, in my opinion, is the best way that we can manage names and numbers on the Internet and any new thing to try to do what it does would be less fair and probably wouldn’t work.

There are some technical architectures and ideas that might make ICANN less relevant, which would be a good thing. However, even relatively obvious things like IPv6, IDNs and DNSEC are having a hard time getting traction. I think that it would be nearly impossible to “redesign the DNS” and get people to use it. It would be like trying to redesign a flying airplane. On the other hand, their might be some evolutionary changes that make domain names less relevant.

While ICANN must continue to improve its openness and public accountability, I wholeheartedly support Joi’s view. Anyone reading this post or able to follow the link to his original owes ICANN a real ‘thank you.’

Privacy Lost?

The New York Times writes about new location-sharing services and worries about Privacy Lost. No doubt there will be all sorts of privacy questions associated with these services, but it seems pretty clear that people are going to flock to location-based services of all kinds. Some data points:

  • As the reporter, Laura Holson, points out, over 50% of mobile phones now sold are GPS-capable. That number is certain to rise to near 100% over time.
  • Even without GPS, mobile networks are pretty good at inferring location by triangulation. In the US it is a requirement that mobile network operators deliver location data to 911 operators in real time, accurate to within 50-100 meters (depending on the application).
  • Non-real-time location can be just as revealing, if not more. Social- and Semantic Web enthusiasts are hot to deliver all sorts of geotagging services which are likely to be just as revealing as GPS. Yahoo and Apple are featuring geotagging in their photo services. New social mapping services such as Platial and Flappr will provide every bit as much detail, though not necessarily in real time. Imagine when the parent of a teenager discovers that the teen has posted photos that are geo-tagged to show that they were taken at a house the teen was barred from visiting.

Location is hot. In a nice post on the significance of Google’s acquisition of mobile/microblogger Jaiku, Chris Messina writes:

The Web 2.0 Address Book isn’t really about how you connect to someone. It’s not really about having their home, work and secret lair addresses. It’s not about having access to their 15 different cell phone numbers that change depending on whether they’re home, at work, in the car, on a plane, in front of their computer and so on. It’s not about knowing the secret handshake and token-based smoke-signal that gains you direct access to send someone a guaranteed email that will bypass their moats of antispam protection. In the real world (outside of Silicon Valley), people want to type in the name of the recipient and hit send and have it reach the destination, in whatever means necessary, and in as appropriate a manner as possible. For this to happen, recipients need to provide a whole lot more information about themselves and their contexts to the system in order for this whole song and dance to work.

And the founder of Jaiku explains to the New York Times:

Petteri Koponen, one of the two founders of Jaiku, described the service as a “holistic view of a person’s life,” rather than just short posts. “We extract a lot of information automatically, especially from mobile phones,” Mr. Koponen said from Mountain View, Calif., where the company is being integrated into Google. “This kind of information paints a picture of what a person is thinking or doing.”

Privacy is not lost simply because people find these services useful and start sharing location. Privacy could be lost if we don’t start to figure out what the rules are for how this sort of location data can be used. We’ve got to make progress in two areas:

  • technical: how can users’ sharing and usage preferences be easily communicated to and acted upon by others? Suppose I share my location with a friend but don’t want my employer to know it. What happens when my friend, intentionally or accidentally, shares a social location map with my employer or with the public at large? How would my friend know that this is contrary to the way I want my location data used? What sorts of technologies and standards are needed to allow location data to be freely shared while respecting users’ usage limitation requirements?
  • legal: what sort of limits ought there to be on the use of location data?
      • can employers require employees to disclose real-time location data?
      • is there any difference between real-time and historical location data traces? (I doubt it)
      • under what conditions can the government get location data?

There’s clearly a lot to think about with these new services. I hope that we can approach this from the perspective that lots of location data will be flowing around and realize that the big challenge is to develop social, technical and legal tools to be sure that it is not misused.

What to do about Google and Doubleclick? Hold Google to its word with some Extreme factfinding about privacy practices

The proposed merger between Google and Doubleclick has raised hackles among those concerned about potential domination of the online advertising marketplace (especially Microsoft) but even more worry among privacy advocates. After a short talk over the weekend with a friend, Peter Swire, a thoughtful and knowledgeable privacy scholar, I came to the view that regulators have to develop a new, robust and scalable means of keeping track of what large data handlers such as Google are actually doing with personal information. (While the conversation with Peter was quite stimulating, I don’t know whether or not he agrees with what I’ve written here.) The mechanisms that exist today to help users make informed choices and policy makers set sound directions are simply inadequate to answer the kinds of questions posed by the Google-Doubleclick deal. Instead of formal, highly negotiated and scripted hearings, we need a much more open, flexible process in which technical experts and the interested public are able to ask detailed questions about current practices. This is not a criticism of either US or EU regulators. On both sides of the Atlantic there is a fine tradition of EU Data Protection Commissions and the US Federal Trade Commission engaging in careful and thoughtful probes of privacy-sensitive activities. However, these processes often take too long and end up producing results that are quite out of date. A lot of energy goes into addressing last year’s data handling practices, by which time the leading edge of the industry has moved on.

In the 1990s, the FTC under Christine Varney’s leadership pushed operators of commercial websites to post policies stating how they handle personal information. That was an innovative idea at the time, but the power of personal information processing has swamped the ability of a static statement to capture the privacy impact of sophisticated services, and the level of generality at which these policies tend to be written often obscures the real privacy impact of the practices described. It’s time for regulators to take the next step and assure that both individuals and policy makers have the information they need.

So, as part of investigating the Google-Doubleclick merger, regulators should appoint an independent panel of technical, legal and business experts to help them review, on an ongoing basis, the privacy practices of Google. Key components of this process should be:

  • expert panel made up of those with technical, legal and business expertise from around the world
  • public hearings at which Google technical experts are available to answer questions about operational details of personal data handling
  • questions submitted by the public and organized in advance by the expert panel
  • staff support for the panel from participating regulatory agencies
  • real-time publication of questions and answers
  • an annual report summarizing what the panel has learned

The Internet open source and open standards communities have learned a lot over the last decade about how to use the Web to facilitate open, collaborative and often rapid development of new technologies. Web users reap the benefit of these open processes with easy access to high-quality software. Indeed, the very infrastructure of the Web and the Internet has been largely developed in this sort of open, extreme technology development process. Making public policy is different than developing technical designs, but the in-depth fact-finding that is needed to make sound policy decisions could benefit a lot from the open, collaborative, online information gathering and sifting process that we already use for Web technology development. Of course, this would not supplant the traditional policy making role of regulators. Rather, this process would serve as a fact-gathering process to help inform regulators. If everyone were feeling really ambitious, perhaps there could even be cooperation between the various regulators around the world with a commitment to study the results from this process. Despite differences in privacy policy in different parts of the world, there has been an impressive record of information cooperation, especially at the staff level, amongst various privacy regulators around the world. This could be a good next step to take in that direction.

By way of background, regulators in the US (Federal Trade Commission) and Europe (Article 29 Working Party representing the EU’s Data Protection Authorities) are investigating both antitrust and privacy questions regarding the merger. The key privacy concern seems to be that Google would take all of the personal information it has about users (search terms, IP addresses, contents of email, location from map applications, etc.) and combine it with the personal data that Doubleclick has (demographics, click stream data from ads served) and end up with a REALLY powerful private surveillance machine.

Google says that they care about their users’ privacy rights and would never abuse the newfound power they propose to acquire. According to Nicole Wong:

“User, advertiser and publisher trust is paramount to the success of our business and to the success of our acquisition…. We can’t imagine taking any actions that would undermine these relationships or the trust people have in using our products and service.” (Washington Post, 20 April 2007)

But the question is: how will either policy makers or users know that their trust is being violated or pushed to an extreme that they’re not comfortable with? Google, to its credit, sees the need to provide more information about what it does with personal data. In testifying before the United States Senate, Google’s chief lawyer, David Drummond, said:

We are also exploring other ways to create more transparency in our privacy practices and policies. We have a lot of information about our privacy practices on our website, and we’re making that information even more accessible to users by adding video-format “tutorials” to help users understand privacy issues online in plain English. The first of these video tutorials has been viewed about 43,500 times on YouTube, and the second video launched earlier this week and has already been viewed hundreds of times.

But will expanded privacy policies and videos really be enough to help users make sound decisions? Privacy regulations place a large, and I believe unsustainable, burden on users to learn the details of how services such as Google use their personal information and then weigh the current benefit of the service against the perceived privacy cost. There is mounting evidence that people will trade off a lot of future privacy risk in exchange for current convenience. I doubt that simply presenting users with more and more choices will help us arrive at a privacy policy that is sound in the long run. For example, some privacy advocates (EPIC) demand that Google be required to get explicit permission from all who have Doubleclick cookies before the information associated with those cookies can be used together with personal information from Google. EPIC also asks that a lot of other information about Google’s information handling practices be made available to users, consistent with traditional privacy notions of notice and access to personal information.

Imagine the question that Google might ask when seeking permission from a user to associate their Doubleclick cookie with Google data in a mobile search application:

Google Dialog Box (FAKE): We’d like to use some of the demographic information we have about you to give you more accurate, convenient directions on your mobile phone. We will also use this data to target ads to you, just like we do with your GMail account. Click ‘Yes’ to agree or ‘No’ and then you’ll be asked to type the latitude and longitude of your ten favorite locations.

The query may not be so extreme, but the idea will be the same.

So my view is that users could use a bit of help making these decisions. That help ought to come in the form of some baseline rules about how personal information can and cannot be used. The days of saying that all users need is ‘free choice’ are over. Of course, the problems discussed here with respect to Google apply equally to many other services on the Web that handle personal information. Google and its merger proposal present a good opportunity to start figuring out some of these questions, but the process and the answers would be applicable to many others as well. In order to figure out what policies should actually govern how data is used, a careful and ongoing investigation of Google’s practices, with the help of the independent board I have suggested above, would be a good place to start.

Technical standards and role of democracy

The pitched lobbying battle over whether the Microsoft Office Open XML specification should be recognized as an international technical standard has brought to the mainstream press in the US the important, though generally obscure, question of just how global information technology standards ought to be set. The debate has gotten a lot of attention because of accusations of ‘vote buying,’ something that everyone can understand and have an opinion about, but I think this misses the real issue: Is the traditional ‘one-country, one-vote’ approach to setting technical standards that we have inherited from the 19th century the best way to set global information technology standards for the Internet, the Web, and other widely used pieces of information infrastructure?

Here’s some background: This morning’s New York Times reports (“Microsoft Favored to Win Open Document Vote,” Kevin O’Brien, 4 Sept 2007) that

Amid intense lobbying, Microsoft is expected to squeak out a victory this week to have its open document format, Office Open XML, recognized as an international standard, people tracking the vote said Monday….“After what basically has amounted to unprecedented lobbying, I think that Microsoft’s standard is going to get the necessary amount of support,” said Pieter Hintjens, president of Foundation for a Free Information Infrastructure, a Brussels group that led the opposition.

But a recent report in PC Magazine (“ISO votes to reject Microsoft’s OOXML as standard,” Peter Sayer, 4 Sept 2007) indicates that

Microsoft Corp. has failed in its attempt to have its Office Open XML document format fast-tracked straight to the status of an international standard by the International Organization for Standardization.

The proposal must now be revised to take into account the negative comments made during the voting process.

This early report is confirmed by the Wall Street Journal (“Microsoft Fails to Win Approval On File Format for Office,” Charles Forelle, 4 Sept 2007).

All of this follows reports that Microsoft was engaging in vote-buying (“Sweden’s OOXML vote declared invalid,” Infoworld, Martin Wallström, 31 August 2007, and “Microsoft Memo to Partners in Sweden Surfaces: Vote Yes for OOXML - Updated,” Groklaw, 29 August 2007), so the question has become whether or not the process was fair.

This Tammany Hall-style debate really misses the point though. Of course it’s ethically objectionable for any participant in this process to ‘buy’ or offer incentives for votes. But let’s suppose that none of that sort of behavior happened: does the world really arrive at the best IT standards by taking a vote of all countries on the planet? Broad participation in standards development is vital, but the experience of the Internet and the Web suggests that the primary mode of participation should be software developers writing programs, not governments casting votes. The Internet (through the IETF) and the World Wide Web (through W3C), along with lots of valuable open source software, have evolved into global standards through a much more bottom-up, consensus-based process that sets standards based on a much more meritocratic, substantive assessment of which parts of which technical specifications are actually used to make systems interoperable. When it comes to Internet and Web standard setting, we don’t just take a vote to anoint a design as a standard; we combine working groups developing specifications with a requirement that the features proposed for standardization are actually implemented, widely used, and have been demonstrated to be interoperable across a range of products and services. (At W3C and (sometimes) OASIS we also require that everyone who participates in the standards setting process make assurances that the standard can be implemented without paying patent license fees.)

The Internet and the Web have grown into truly open platforms because of a process that grants ‘standards’ status to technology AFTER it has proven that it has consensus support behind it and that it is actually the basis for interoperability. The strength of this process is that design ideas are subjected to the test of user acceptance and the marketplace. It’s clear that this sort of technical scrutiny, rather than vote-counting, would serve the process well. As the Wall Street Journal article explains, the underlying dispute is really a set of technical questions:

Those opposed to Open XML say it isn’t really open at all — that it is actually so complex and so loaded with Microsoft-specific features that no one but Microsoft can use it fully. Critics also allege technical failings and say the format needlessly duplicates an existing format, called Open Document, used by IBM and many open-source programmers.

Microsoft says it has opened up the Office formats to encourage competition and interoperability, not squelch it. Open XML should be a standard in addition to Open Document, Microsoft argues, because Open XML allows for more features.

The way to figure out whether OOXML is good for interoperability or not is to see whether independent developers actually use it.

Is there politics in this process? Of course! But the fact is that the basic decisions to be made have to do with the assessment of implementation experience, as opposed to vote counting. This is in contrast to the traditional standards process, which accords the label ‘standard’ based on a vote of country representatives. Democracy is a great thing for government decisions, but not a great way to design new technology.

Does broadband speed matter

The Washington Post reports this morning (Japan’s Warp-Speed Ride to Internet Future, 29 August 2007) that in Japan

Broadband service here is eight to 30 times as fast as in the United States — and considerably cheaper. Japan has the world’s fastest Internet connections, delivering more data at a lower cost than anywhere else, recent studies show.

And that the comparative disadvantage faced by Internet users in the US means that

The speed advantage allows the Japanese to watch broadcast-quality, full-screen television over the Internet, an experience that mocks the grainy, wallet-size images Americans endure.

The clear message is that any country without widely deployed, fast broadband Internet access will fall behind in the competitive marketplace for developing new Internet services.

While it’s clear that Japanese users do actually have access to more affordable, high bandwidth access, it’s less clear to me that this will result in the US falling behind in the application/service development arena. The one example that the article cites for this proposition that Japan is racing ahead is that new telemedicine services are now possible:

The burgeoning optical fiber system is hurtling Japan into an Internet future that experts say Americans are unlikely to experience for at least several years.

Shoji Matsuya, director of diagnostic pathology at Kanto Medical Center in Tokyo, has tested an NTT telepathology system scheduled for nationwide use next spring.

It allows pathologists — using high-definition video and remote-controlled microscopes — to examine tissue samples from patients living in areas without access to major hospitals. Those patients need only find a clinic with the right microscope and an NTT fiber connection.

“Before, we did not have the richness of image detail,” Matsuya said, noting that Japan has a severe shortage of pathologists. “With this equipment, I think it is possible to make a definitive remote diagnosis of cancer.”

Japan’s leap forward, as the United States has lost ground among major industrialized countries in providing high-speed broadband connections, has frustrated many American high-tech innovators.

Is it really the case that this is impossible in the US? After all, what’s required for this sort of remote diagnostic service is just broadband to the clinic with the microscope. Surely if a clinic can afford one of these microscopes then it shouldn’t be too hard to pay for the added cost of broadband service, even if it’s a bit more expensive in the US than in Japan. What’s more, it’s not clear that many such new applications require the super-high bandwidth offered by fiber. A one or two minute delay in getting the microscope image from clinic to remote doctor hardly seems ‘fatal’.

All of the statistics about US broadband lag cited in this article and elsewhere refer to overall residential broadband penetration. But applications such as telemedicine don’t depend on universal broadband access. In the end I find the “we’re falling behind” panic to be a little thin.

More than bandwidth, I really worry about lack of openness in the operation of our Internet infrastructure. We’ve seen pretty extraordinary innovation in Web-based services over the last decade and I’d argue that the key to this has been open, non-discriminatory provision of Internet access services.

“The experience of the last seven years shows that sometimes you need a strong federal regulatory framework to ensure that competition happens in a way that is constructive,” said Vinton G. Cerf, a vice president at Google…. The opening of Japan’s copper phone lines to DSL competition launched a “virtuous cycle” of ever-increasing speed, said Cisco’s [Robert] Pepper. The cycle began shortly after Japanese politicians — fretting about an Internet system that in 2000 was slower and more expensive than what existed in the United States — decided to “unbundle” copper lines.
In the United States, a similar kind of competitive access to phone company lines was strongly endorsed by Congress in a 1996 telecommunications law. But the federal push fizzled in 2003 and 2004, when the Federal Communications Commission and a federal court ruled that major companies do not have to share phone or fiber lines with competitors. The Bush administration did not appeal the court ruling.

So, if there’s a choice between more bandwidth or more openness, I’m for the open platform.

More on privacy issues with Apple’s DRM-less iTunes Plus

There’s been more discussion of Apple iTunes Plus DRM-less music and its practice of embedding personal account information into the tracks that are sold without copy protection. I’ve earlier expressed my support for this accountability approach to copyright protection, as opposed to burdensome DRM systems. However, privacy complaints (BBC, Anger over DRM-free iTunes tracks) are appearing over the use of personal information in this way.

Looking through Apple’s privacy policy (updated 23 December 2004) and iTunes terms of service (updated 30 May 2007), I found no mention of this otherwise hidden use of personal information. The terms of service do say:

(xii) iTunes Plus Products do not contain security technology that limits your usage of such Products, and Usage Rules (iii) – (vi) do not apply to iTunes Plus Products. You may copy, store and burn iTunes Plus Products as reasonably necessary for personal, noncommercial use.

Seems that this would have been a good place to indicate the new use of users’ information. A simple notice here would do: that passing tracks around, which appears to be permitted as long as it is for “personal, non-commercial use,” also results in having your personal information passed around. Perhaps I missed this or perhaps Apple plans to add it. I’m going to ask around to get clarification.

Update: EFF and O’Reilly also report that the iTunes files may have individual differences (that could allegedly be used for individual tracking) even beyond the personal information that is visible.

A glimpse of sanity in the online copyright arena

With Apple’s announcement of DRM-free music downloadable through iTunes, it appears that we may actually be heading toward a sane, scalable approach to copyrighted commercial content on the Web. Tracks from EMI and other music publishers can now be purchased in two versions, a locked up version for the usual 99 cents or a higher-quality and DRM-free version for $1.29. I got an entire album (Jacqueline Du Pre playing the Dvorák & Elgar Cello Concertos with the Chicago Symphony) for a mere $9.95 in unlocked form.

As several observers have pointed out, these DRM-free tracks do come with a catch: your name is embedded inside the MPEG-4 file so that if you decide to casually share these files around with your hundred thousand closest friends on the Net (exactly the result that DRM has tried, unsuccessfully, to prevent) then you’re at some risk of getting caught and of having personal information spread around the Net with your illegally-copied files. Following some instructions from an independent Apple news blog, I was able to verify that my name was put into these files upon being downloaded:

[Daniel-Weitzners-Computer:iTunes Music/...] djweitzn% strings *.m4a | grep name
nameDaniel Weitzner
nameDaniel Weitzner
nameDaniel Weitzner
nameDaniel Weitzner

In addition to my name it appears that my .mac account id, through which I purchased the tracks, was also included.
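For readers without the strings utility at hand, the same check can be roughly approximated in a few lines of Python. This is only a sketch under stated assumptions: the function name find_marked_strings is made up for illustration, and it assumes, as the output above suggests, that the purchaser's name sits in the file as plain ASCII right after the literal text "name". It does not parse the MPEG-4 container; it simply pulls out runs of printable characters, much as strings does, and keeps the ones containing the marker.

```python
import re
import sys
from pathlib import Path

# Runs of 4 or more printable ASCII bytes; 4 is the usual default
# minimum length for the `strings` utility.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def find_marked_strings(data: bytes, marker: bytes = b"name") -> list[str]:
    """Return the printable runs in `data` that contain `marker`.

    Hypothetical helper: a rough stand-in for `strings FILE | grep name`,
    not a real parser for the MPEG-4 file format.
    """
    return [
        match.group().decode("ascii")
        for match in PRINTABLE_RUN.finditer(data)
        if marker in match.group()
    ]


if __name__ == "__main__":
    # Usage: python find_name.py track1.m4a track2.m4a ...
    for path in sys.argv[1:]:
        for hit in find_marked_strings(Path(path).read_bytes()):
            print(f"{path}: {hit}")
```

Passing a different marker (for example, part of an account id) would look for other embedded text in the same way, though whether that text is stored as plain ASCII is again an assumption.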

The big news here goes beyond just copyright. Apple has decided to jettison heavyweight DRM enforcement in favor of an approach that allows the free flow of data with back-end accountability. I believe this is just one step in a larger trend toward what I’ve been calling ‘accountable systems.’

An exclusive reliance on access restrictions such as DRM leads to technology and policy perspectives where information, once revealed, is completely uncontrolled. It’s like focusing all one’s attention on closing the barn door and ignoring what might happen to the horses after they’ve escaped. The reality is that even when information is widely available, society has interests in whether or not that information is used appropriately. Information policies should reflect those interests, and information technology should support those policies.

In research we’ve been doing on accountable systems approaches to privacy and copyright, we seek an alternative to the “hide it or lose it” approach that currently characterizes policy compliance on the Web. Our alternative is to design systems that are oriented toward information accountability and appropriate use, rather than information security and access restriction. I think what Apple is doing here will come to be seen as an early step in a large-scale transformation in how we approach a wide variety of policy issues on the Web.

Watch this space for more.

Updating network security community’s understanding of privacy

A few weeks ago a colleague reminded me of one of the early definitions of privacy in the computer security literature from Saltzer and Schroeder (The Protection of Information in Computer Systems):

“The term “privacy” denotes a socially defined ability of an individual (or organization) to determine whether, when, and to whom personal (or organizational) information is to be released.”

This definition reflects the view, widely held even today amongst computer security architects, that the way to achieve privacy policy ends is to control the release of information. To this end, great effort has been expended to design systems that control access to and flow of personal, sensitive information. While there are certainly good reasons to do this, access control alone has not been, and never will be, sufficient to achieve compliance with privacy, copyright or other information-related rules.

Creative Commons Attribution-NonCommercial 3.0 Unported