The PEP Report – October 2018

by Jane Bambauer

Welcome to the inaugural PEP Report, an occasional (roughly monthly) roundup of news, research, and events related to privacy and economics.

A lot is happening this year, including upcoming public hearings at the FTC on big data and consumer protection and the recent passage of California’s Consumer Privacy Act. Future PEP Reports will focus on those developments. For now, I would like to highlight research presented at the 46th TPRC Conference on Communications, Information, and Internet Policy.

TPRC recently took place at American University Washington College of Law. (NB: It’s called TPRC because it used to stand for the Telecommunications Policy Research Conference, but the scope has since expanded to include the Internet, of course.) As usual, TPRC brought together a great lineup of papers from multiple disciplines tackling current and future problems in communications policy. Below are my thoughts on some of my favorite papers.

One note about the conference overall: TPRC has had a Privacy/Security track for several years now, but some of my favorite papers were in other tracks and had only brief discussions of privacy law. This is telling. Too often, the authors of privacy papers are not forced to be in conversation with researchers who are trying to optimize the utility of information and communications technologies, and vice versa. In the future, I would love to see authors and audiences of privacy and data use papers in the same room so that tensions can be aired, acknowledged, and accounted for.

Read on for some of my favorite privacy-related papers from TPRC.

This study applies topic modeling (a methodology that looks for common clusters of words in unstructured survey answers) to see how social media users describe their privacy concerns. The authors find that while elite scholarly and policy conversations focus on vertical privacy (that is, concerns about data sharing between services that occurs in the background), social media users are typically concerned with horizontal privacy (that is, managing their reputation among peers).
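For readers unfamiliar with the method, topic modeling can be sketched in a few lines. Below is a minimal illustration using scikit-learn's LDA on four invented survey answers; the paper's actual corpus, preprocessing, and modeling choices are of course more elaborate.

```python
# Minimal topic-modeling sketch (LDA) on short free-text survey answers.
# The answers below are invented for illustration only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

answers = [
    "I worry my boss or family will see my posts",         # horizontal concern
    "embarrassing photos shared with friends of friends",  # horizontal concern
    "companies selling my browsing data to advertisers",   # vertical concern
    "apps sharing my location data with third parties",    # vertical concern
]

# Bag-of-words counts, then a two-topic LDA model
counts = CountVectorizer(stop_words="english").fit_transform(answers)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)

# Each answer gets a probability distribution over the two topics
doc_topics = lda.transform(counts)
print(doc_topics.shape)  # (4, 2)
```

Inspecting `lda.components_` would show which words load on each topic, roughly the clusters the authors interpret as horizontal versus vertical privacy concerns.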

This paper adds great value to the debates about whether content platforms like Google and Facebook should be pushed to provide “neutral” and “truthful” content through government regulation or through self-regulation. The paper does not take a position in that debate, but offers a model for the self-regulation approach—the Media Ratings Council. The MRC sets standards and accredits firms, like Nielsen, that are in the business of measuring audience sizes for television, radio, and the web. The advertising market, rather than regulators, pushed competing rating systems to get accredited. The author argues that a private, independent monitor of this sort could offer some of the same advantages to social media algorithms. Napoli notes that both audience measurement and services like Facebook’s newsfeed have significant power over culture and advertisers, and do important work to build trust and credibility in a shared set of facts (albeit imperfectly, and with some opportunities for gaming).

Would it be good for web users if social media companies sought to accredit their news feed algorithms with an independent body? Here’s the author’s take:

Given the size, reach and potential impact of social media platforms such as Facebook, YouTube, and Twitter on such a broad range of stakeholders, it seems increasingly problematic that these platforms can suddenly and unilaterally alter their curation algorithms or the privacy policies that affect the user data that feed into these algorithms. Of course, any such algorithmic or data-gathering changes are typically the result of rigorous internal analysis, but given the trust issues surrounding social media platforms, and their demonstrated vulnerability to misuse and manipulation, the question here is whether from an ethical and public interest standpoint, some more multilateral process of evaluation, input, and decision-making would be desirable.

I am not necessarily convinced that end users would benefit if changes in newsfeed algorithms had to undergo independent review. Such review would essentially kill machine learning algorithms, which are updated continuously, and could also make the curation algorithms more gameable because of the slow pace of correction. But it’s a very interesting idea that deserves attention and consideration.

If consumers are worried about the anticompetitive advantage that companies like Google and Facebook could have thanks to their large troves of collected personal data, they should be happy to discover that there is an emerging market in personal data porting that allows consumers to consolidate many sources of personal data and sell access to it to other companies. This paper provides theoretical models to explore the effects of personal data markets on the quality of content providers’ services and on consumer surplus.

For a long time I have wondered why there wasn’t a niche market for consumers who are willing to combine data from multiple sources and trade it to service providers. Thanks to this paper, I now know that such markets exist!

I had been thinking about how a Rawlsian approach to privacy policy might play out if we take concepts like maximin seriously, especially because many of the strongest privacy advocates I know take a Rawlsian approach to other areas of public policy. Thus, I was delighted to see Morten Bay take on the topic. As Bay points out, Rawls was not particularly interested in protecting privacy, especially to the extent that it hinders structured efforts to create a fair society. Quoting Rawls (with Bay’s added emphasis):

A well-ordered society, as thus specified, is not, then, a private society; for the well-ordered society of justice as fairness, citizens do have final ends in common.

Bay goes on to analyze the privacy-versus-national-security tension, and concludes (tentatively) that the unfairness society should correct consists of extreme privacy intrusions on relatively vulnerable people, especially because the intrusions made for national security purposes seem to have virtually no efficacy in deterring or detecting terrorism. This conclusion seems sound as long as the empirical assertion holds that data snooping in the name of national security does not actually protect anybody from terrorist attacks. If you loosen that assumption, then it seems that future terrorist victims are the ones whose rights we must bolster through aggressive monitoring—a conclusion I suspect Bay would be loath to fully embrace. Moreover, if we take the topic of data surveillance out of the realm of national security (where, I agree, efficacy is largely unproven) and ask about data surveillance for ordinary crime investigation, the victims of domestic violence, murder, sexual assault, and other violent crimes are surely the people whose interests a dedicated Rawlsian would want to favor, and the result very well could be significant privacy intrusions for the many in order to preserve the dignity and safety of a few.

One more note: the importance of efficacy and the probabilistic outcomes of terrorism investigations give the Rawlsian analysis a bit of a utilitarian flavor, since we wind up having to ask threshold questions like “is a .001% chance of thwarting a terrorist attack enough to implicate a future terrorist victim’s rights and require a fairness analysis?” Like other applications of deontological reasoning, the exercise of applying Rawls to privacy may prove that the division between Rawls and utilitarianism is not as bright as we often assume.

This paper provides clarity about what the purpose of digital consent should be (and, by extension, what the process should be as well). As is always the case with Meg Leta Jones’ work, the writing and the logic are extremely clear. Edenberg and Jones take seriously the idea that consent should “communicate to others the agent’s intention to undertake new obligations and/or convey to others new rights” and therefore find a lot of the boilerplate contract agreements insufficient for providing notice and voluntariness when rights or obligations change significantly. I agree. In tort law, we distinguish between contract “assent,” which does not require actual willingness to change the terms of a relationship, and “consent,” which generally does. The trouble, as the authors point out, is that the concept of consent is meant to be strong in order to protect critical interests that people have a moral right to control—for example, entry onto our land or contact with our bodies. Right now, the project is under-developed in terms of the moral core that privacy law and digital consent must protect. I suspect that when the authors state the parameters of this moral core more clearly in future drafts, there will be a lot of fodder for disagreement. But this early draft is nevertheless valuable. It argues, rightly in my view, that consent should be strong in form and tied to strong moral and societal interests.

This study uses SEC filings of publicly traded app companies to show that these companies (and presumably the privately held app companies, too) are at the mercy of Google’s and Apple’s app store policies. Many of them disclosed a risk to their core business models if the App Store or Play Store “change how the personal information of consumers is made available to developers.”

Given that about 90% of Facebook activity now takes place on mobile devices, the app stores are important choke points that could be targeted by privacy regulators or advocates. Google or Apple could also exploit privacy concerns to justify reducing access to user data by competitors.

This study examines whether the Chinese government’s poverty elimination programs comply with the standards expressed in the country’s privacy laws. The authors find that the majority of standards are either insufficiently achieved or “definitely not achieved.” The authors plan to use this study as a benchmark for analyzing the efficacy of future policies and enforcement practices.

I am not surprised that very few of China’s privacy rules (which are similar in form to the EU’s former data protection directive) are followed in practice, but the authors do not (and probably cannot) assess whether the lack of compliance stems from the programs treating privacy as a low priority or from the privacy law’s opportunity costs and its conflicts with the programs’ core missions. The privacy rules that I suspect are the most taxing on a program’s efficacy—consent requirements, de-identification requirements, and prohibitions on repurposing data—had some of the worst compliance rates.

Eli Noam (Columbia Business School) is a regular contributor at TPRC, and his work is always insightful, fun to read, and fun to hear about. Noam has the rare ability to take a soaring view of the terrain and where we are headed, or to dive into the nitty-gritty details, depending on what the occasion requires. This year, he presented a chapter from his forthcoming book on the future of media, focusing on the regulation of Internet TV. After charting the similarity of today’s policy debates to those over previous forms of media (albeit with different labels; net neutrality = common carriage, e.g.), Noam resists the conclusion that broadcast-style regulations are inevitable. He sees potential for learning from the past and from genuine differences in new media so that new regulatory options can be tried, such as encouraging or requiring resource sharing among competing firms. He acknowledges that the current political climate makes progress toward rational regulations more difficult. From the abstract:

The time has therefore come to engage in a new discussion over regulatory principles for internet-based activities. This is not easy. The rivals in the debate over the treatment of communications networks at times exhibit a messianic fervor and are quick to slay messengers of unwanted news. One side invokes a danger to either the survival of diversity, democracy, and the internet; while the other side predicts a grave damage to technology, national competitiveness, and the economy. The paper will discuss these themes, try to identify the telecom economics treatment of the various issues, their methodologies, and their conclusions. It will provide the lessons that can be applied, as well as where failure provides lessons for the future.

Noam’s discussion of privacy is less creative than other parts of the chapter. He endorses the traditional Fair Information Practice Principles and also recommends a property model for privacy that vests co-ownership of personal data in the data producer and the data subject. The result of that model would be to require explicit notice and opt-in consent before data is shared or used for something else. This is consistent with popular privacy proposals, but I was surprised to see Noam embrace it, since it’s a step away from the resource sharing that he finds so promising for avoiding the greatest risks of anticompetitive behavior by media firms. (For example, Noam suggested that a single log-in portal should be shared by a range of media companies to facilitate data security and competition and to reduce transaction costs. Why not explore this model with personal data as well, with the understanding that the data, like login credentials, must be protected against certain threats?)

For more on regulatory proposals that emphasize sharing bottleneck resources, see this TPRC paper by Douglas Sicker and William Lehr.

This paper describes a really cool general methodology for regulators who want to run simulations of alternative regulatory rules. Here’s the abstract:

Infrastructure decision-making is challenging due to high levels of future uncertainty. Indeed, when considering broadband coverage, we are also faced with a situation where operators are reluctant to share data, there are few existing open-source evaluation models, and most available tools cannot be used by non-technical users. Transparency is sometimes low. Often disagreements arise between operators, regulators and other actors, with little independent assessment of key issues. Consequently, this paper proposes the development of a Digital Twin as a virtual test-bed for evaluating telecommunication policies. The concept of a Digital Twin has been common for several years in aerospace engineering, since first proposed by NASA in 2010. The vision defined in this paper is for a Digital Twin to be a virtual engineering-economic representation of a real-world telecommunication network, whereby simulation techniques allow exploration of potential future states under different policy conditions. A Digital Twin is developed for the British incumbent’s fixed broadband network, spanning 30 million premises and over 4.3 million geospatial telecommunication network assets. Using a network subset for Cambridgeshire, England the rollout of Fibre-To-The-Premises and Fibre-To-The-Distribution-Point upgrade options are then tested. Under different demand scenarios, market-based rollout is compared to a subsidised rollout strategy. Independently testing broadband deployment strategies in a virtual market and evaluating their effectiveness can provide greater transparency for decision-making processes. Over the long-term, a Digital Twin could help to generate new knowledge by testing experimental policy options, fostering greater innovation in how we tackle both perennial and emerging digital divide issues.
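To give a flavor of the digital-twin idea at toy scale, here is a sketch that simulates rollout over hypothetical premises and compares a market-led strategy with a subsidised one. Every number, and the `Premise`/`rollout` structure itself, is invented for illustration; the paper's actual model covers 30 million premises with real geospatial and cost data.

```python
# Toy sketch of a "digital twin" policy test-bed: simulate broadband rollout
# over invented premises and compare market-led vs. subsidised strategies.
from dataclasses import dataclass

@dataclass
class Premise:
    cost: float     # cost to connect this premise
    revenue: float  # expected revenue under the current demand scenario

def rollout(premises, subsidy_per_premise=0.0):
    """Connect every premise that is profitable after any subsidy;
    return the resulting coverage share."""
    connected = [p for p in premises
                 if p.revenue + subsidy_per_premise >= p.cost]
    return len(connected) / len(premises)

# Hypothetical mix: cheap urban premises and expensive rural ones
premises = [Premise(cost=100, revenue=120)] * 70 + \
           [Premise(cost=300, revenue=120)] * 30

market = rollout(premises)                               # urban only: 0.7
subsidised = rollout(premises, subsidy_per_premise=200)  # subsidy closes the gap: 1.0
print(market, subsidised)
```

Re-running the comparison under different `revenue` assumptions is the toy analogue of the demand scenarios the authors explore.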

Data flows can be treated as trade in services, trade in information goods (analogous to books), or trade as a factor of production, like capital and labor. Considering the third, the authors attempt to quantify the amount of data that is flowing into and out of each of several countries—a simple-sounding preliminary question that is surprisingly difficult to estimate. From the abstract:

This paper is a preliminary attempt to analyze information as a factor of production in international trade. It is a first attempt to get a handle on the direction and balance of information flows. We have obtained quantitative data about Web-related data flows between countries, and we explore how those flows are correlated to trade in goods. Using Telegeography data on “Server Location as a Percentage of Top Websites,” we found that 2/3 of all web traffic is transnational. More than half of the top 100 web sites in 9 of the world’s 13 sub-regions are hosted in the United States. In Central Asia and Eastern Europe, for whom 37% and 41% of their most popular websites, respectively, are requested from the US. Even well-developed Western Europe makes almost half of its top 100 web site requests to US-based sites. For US users, on the other hand, only about 26 of the top 100 websites are hosted outside the country, and 20 of them are in Europe. Ironically, East Asia, which has a huge goods trade surplus with the developed economies, particularly with the US, has the largest negative balance in the relationship between incoming and outgoing Web requests. Indeed, we found a very strong negative correlation (-0.878) between web traffic balances and the balance of trade in goods across all sub-regions. Once these aspects of transnational data flows are quantified, the paper discusses the implications of these findings for policy, especially trade policy. It raises the question whether the goal of a free and open digital economy is best advanced by placing information exchanges in the trade paradigm and pushing for free trade, or by asserting a more general human right to free and open information exchanges across borders, which has social and political as well as economic consequences. These two approaches are not mutually exclusive, of course, but by making these distinctions we clarify the debate over international policy in the digital world.
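The headline figure is an ordinary Pearson correlation between two per-region balances. A minimal sketch with invented numbers (the authors' reported value across sub-regions is -0.878):

```python
# Pearson correlation between per-region web-traffic balance and goods-trade
# balance. The data pairs below are invented for illustration only.
from statistics import mean, stdev

def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))

# Hypothetical (web-traffic balance, goods-trade balance) per sub-region
traffic = [40, 25, 10, -15, -30]
trade   = [-50, -20, 5, 30, 60]
print(round(pearson(traffic, trade), 3))
```

A strongly negative value here mirrors the paper's finding: regions running goods-trade surpluses tend to run web-traffic deficits.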

From the Introduction:

Platform intermediaries can generate a compelling value position for consumers by opting to recover most, if not all costs, from upstream users, rather than split the financial burden between both upstream and downstream participants. Economists use the term two-sided markets to identify platform functions where transactions occur both upstream and downstream from the intermediary. Successful insertion of an intermediary platform has generated both positive and negative impacts on consumer welfare, competition, the rate of innovation, employment and other key factors. On the positive side, intermediaries can promote efficiency, economies of scale and positive network externalities. On the negative side, they may leverage dominant market shares to extract high prices from both upstream and downstream participants, after having acquired market dominance through a sustained period of below market pricing. . . This article asserts that any analysis of costs and benefits occurring via broadband intermediary transactions necessitates an assessment of impacts occurring on both sides of the platform. Heretofore, legislators, judges, regulators, policy makers, business executives and academics have emphasized, or solely examined the downstream impacts. Considering the decision by intermediaries to shift costs upstream, analysts of two-sided markets may overestimate consumer benefits by failing to consider offsetting costs occurring when upstream transactions are examined.

From the abstract:

What is it that separates “traditional” algorithms and machines that for decades have been subject to traditional product liability legal framework from what I would call “thinking algorithms,” that seem to warrant their own custom-made treatment? Why have “auto-pilots,” for example, been traditionally treated as “products,” while autonomous vehicles are suddenly perceived as a more “human-like” system that requires different treatment? Where is the line between machines drawn? Scholars who touch on this question, have generally referred to the system’s level of autonomy as a classifier between traditional products and systems incompatible with products liability laws (whether autonomy was mentioned expressly, or reflected in the specific questions posed). This article, however, argues that a classifier based on autonomy level is not a good one, given its excessive complexity, the vague classification process it dictates, the inconsistent results it might lead to, and the fact said results mainly shed light on the system’s level of autonomy, but not on its compatibility with products liability laws. This article therefore proposes a new approach to distinguishing traditional products from “thinking algorithms” for the determining whether products liability should apply. Instead of examining the vague concept of “autonomy,” the article analyzes the system’s specific features and examines whether they promote or hinder the rationales behind the products liability legal framework. The article thus offers a novel, practical method for decision-makers wanting to decide when products liability should continue to apply to “smart” systems and when it should not.