
Big Data RFI Index

 1. Annie Shebanow (3/21/2014)
 2. Jackamo (3/21/2014)
 3. Peter Muhlberger (3/24/2014)
 4. Consumer Action (3/25/2014)
 5. ARM and AMD (3/26/2014)
 6. USPIRG and Center for Digital Democracy (3/27/2014)
 7. Information Technology Industry Council (3/27/2014)
 8. Consumer Federation of America (3/28/2014)
 9. Leadership Conference on Civil and Human Rights et al. (3/28/2014)
10. MAG-Net (3/28/2014)
11. Abraham Wagner (3/28/2014)
12. Mary Culnan (3/29/2014)
13. Georgetown University (3/30/2014)
14. Intrical (3/30/2014)
15. Access (3/31/2014)
16. ACLU (3/31/2014)
17. Advertising Self-Regulatory Council, Council of Better Business Bureaus (3/31/2014)
18. Association for Computing Machinery (3/31/2014)
19. Association of National Advertisers (3/31/2014)
20. BSA | The Software Alliance (3/31/2014)
21. CDT (3/31/2014)
22. Center for Data Innovation (3/31/2014)
23. Center for Digital Democracy (3/31/2014)
24. Center for National Security Studies (3/31/2014)
25. Cloud Security Alliance (3/31/2014)
26. Common Sense Media (3/31/2014)
27. Computer and Communications Industry Association (3/31/2014)
28. Computing Community Consortium (3/31/2014)
29. Consumer Watchdog (3/31/2014)
30. Dell (3/31/2014)
31. Direct Marketing Association (3/31/2014)
32. Durrell Kapan (3/31/2014)
33. Electronic Transactions Association (3/31/2014)
34. Federation of American Societies for Experimental Biology (3/31/2014)
35. Financial Services Roundtable (3/31/2014)
36. Food Marketing Groups (3/31/2014)
37. Future of Privacy Forum (3/31/2014)
38. IMS Health (3/31/2014)
39. Interactive Advertising Bureau (3/31/2014)
40. IT Law Group (3/31/2014)
41. James Cooper (3/31/2014)
42. Jason Kint (3/31/2014)
43. Jonathan Sander, STEALTHbits (3/31/2014)
44. Marketing Research Association (3/31/2014)
45. McKenna Long & Aldridge (3/31/2014)
46. Microsoft (3/31/2014)
47. MITRE Corporation (3/31/2014)
48. Mozilla (3/31/2014)
49. NYU Center for Urban Science & Progress (3/31/2014)
50. Pacific Northwest National Laboratory (3/31/2014)
51. Privacy Coalition (3/31/2014)
52. Reed Elsevier (3/31/2014)
53. Frank Pasquale (3/31/2014)
54. Sidley Austin (3/31/2014)
55. Software & Information Industry Association (3/31/2014)
56. TechAmerica (3/31/2014)
57. TechFreedom (3/31/2014)
58. Technology Policy Institute (3/31/2014)
59. The Internet Association (3/31/2014)
60. U.S. Chamber of Commerce (3/31/2014)
61. US Leadership for the Revision of the 1967 Space Treaty (3/31/2014)
62. VIPR Systems (3/31/2014)
63. World Privacy Forum (3/31/2014)
64. Constellation Research (4/2/2014)
65. Fred Cate, Peter Cullen, and Viktor Mayer-Schönberger (4/4/2014)
66. Healthcare Leadership Council (4/4/2014)
67. Brennan Center for Justice (4/4/2014)
68. Making Change at Walmart (4/4/2014)
69. Online Trust Alliance (4/4/2014)
70. MIT (4/4/2014)
71. Coalition for Privacy and Free Trade (4/4/2014)
72. EFF (4/4/2014)
73. Privacy Coalition, updated (4/4/2014)
74. EPIC (4/5/2014)
75. Kaliya Identity Woman 1 (4/5/2014)
76. Kaliya Identity Woman 2 (4/6/2014)
77. Tyrone Grandison (4/6/2014)
78. Open Technology Institute, New America Foundation (4/8/2014)

Annie Shebanow

My notes on the Big Data RFI: Privacy notices are difficult to draft and deliver for each data use. Policies are needed to safeguard data and information. There should not be differences between the government and the private sector in policy frameworks or regulations for handling big data. Although we all know there are differences between government and private-sector data use, privacy policies should be general enough to apply to both.
For any government or private organization:

- Big data policies are needed for securely handling, storing, and protecting structured, unstructured, and semi-structured data.
- Privacy policy controls for big data must be part of an organization's operations, and those controls should be public information, listed on the organization's Web site. These could be the company's own policies as well as those of its regulatory environment. Transparency is the key to safeguarding data privacy. The policies should be in bullet-list format, not the large documents written by attorneys in attorneys' language, which no one reads, that some companies provide during the signup process.
- Different privacy policies are needed for different types of data use.
- Big data analytics policies should distinguish data collected directly from individuals (supplied by them), indirectly from third parties (posts to social media, videos, sensor data), and from public records.
- Global consortiums for big data analytics policies are needed to draft and modify standards and policies and to accept members (a U.N. of Big Data Analytics Policies).
- A country's membership protects its citizens' data; lack of membership may bring economic harm. This will incentivize best practices for big data analytics policies.
- Special privacy policies are needed for countries that are not members of big data privacy consortiums and for data originating from those countries.
- Fair information practices privacy policies are needed for information sharing.
- Big data privacy announcement policies are needed for data and information sharing.
- A random auditing policy should verify organizations' compliance with their privacy policies (it could be similar to the SOX-404 process).
- Whistleblower measures for big data analytics policies are needed.
- Policies are needed to encourage big data analytics privacy education in our educational systems, both to further the understanding and protection of citizens' fundamental rights and to provide information on existing data privacy policies and standards.

From: Jackamo <
Sent: Friday, March 21, 2014 8:23 PM
To: [email protected]
Subject: Big Data & Big Brother

Dear Big Brother,

I know you love me and want to take care of me; that's why you want "Big Data" to love me too. But I don't love you, Big Brother, or your friend Big Data. As a matter of fact, Big Brother, you can take a flying bit-of-my-ass and Big Data can choke on snot and slobber all over itself; just don't bother my right to be left alone. I hope this doesn't seem impolite or rude, Big Brother, but it seems to me you need to find something more important to do, like finding a cure for the clap or the common cold, and stop wasting the taxpayers' money on this frivolous, paranoid set of fantasies you've been having ad infinitum about monsters from the id, etc. Maybe you need a new therapist, Big Brother? Maybe you can discuss with him or her why you're so anal retentive and obsessed with collecting other people's business. At any rate, if all else fails, Big Brother: just relax, pour yourself a favorite drink, then lie down and take it easy, and all those bad thoughts will go away, at least for a while.

Love,
__JPC__

From: Muhlberger, Peter <[email protected]>
Sent: Monday, March 24, 2014 2:55 PM
To: [email protected]
Subject: [Big Data RFI]

To contextualize my comments, a little about me: until Oct. 2013, I was the National Science Foundation's program director for cyber social sciences. This position involved me in questions of cybersecurity and, thereby, data privacy.
I am now in an unrelated position but remain interested in issues of big data privacy. My views here are my own and not the views of the National Science Foundation. My background includes political science, public policy, political psychology, and data-intensive social science. I am still learning about the policy context of big data privacy, so what I say here is an impression at this point. Nevertheless, there do seem to be some important points to be made. The following clarifies what I perceive to be limitations of the current policy framework regarding big data and privacy issues, and I sketch a more complete framework. I will address specific questions from this RFI (request for information) after elaborating this framework; I comment on four of the five questions below, identifying them by question number.

It is my impression that policy makers are acutely aware of the potential social and economic value of big data, but are less knowledgeable regarding the long-term implications of big data analytics for individuals and society ('privacy') or the related ethical implications. For instance, at the recent MIT/White House meeting on big data privacy, Leon Panetta discussed detailed examples of the benefits of big data and, seemingly as an afterthought, mentioned that we should seek to achieve these benefits while preserving our ethical mores with respect to privacy. He did not mention what these mores are, how they bear on big data, or how they might be preserved. The White House data privacy framework identifies privacy as a means to achieving the trust needed to fully exploit big data, but does not elaborate on privacy as an ethical end or in terms of potential social implications. The discussion, then, seems framed in terms of how to extract the benefits of big data while providing some protection for a less than fully elaborated notion of privacy. That notion of privacy seems to consist largely in avoiding public alarm about the spread of individually identifiable data. Much attention is also focused on technical solutions, such as k-anonymity or differential privacy, that allow sharing of data--effectively, expanded use of big data--while making a bow to privacy by preventing identification of individuals. This was much of the focus of the MIT/White House meeting.

A notion of privacy focused on the spread of individually identifiable data cannot adequately capture the underlying issues behind privacy concerns. Privacy, narrowly construed, is only one of a set of privacy-related concerns raised by big data that pose substantial social and ethical risks. Technological solutions for ensuring that individuals are not identified in big data are quite inadequate to address these risks. What is needed is an expanded notion of risk and a robust regulatory regime, based on strong rights, that continually incorporates input from the public.

First, it is not evident that technological solutions can even protect the simple notion of privacy: that of anonymizing individual data. Syntactic models of anonymity, such as k-anonymity and l-diversity, have been shown to be seriously flawed because combining an anonymized dataset with other publicly available datasets allows re-identification of large numbers of people in many circumstances. Big data is precisely about integrating a diversity of datasets, but this then opens many paths to identifying individuals.
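To make the re-identification point concrete, here is a minimal sketch of such a linkage attack, with entirely fabricated records (the datasets and field names are invented for illustration): a release with names stripped but quasi-identifiers retained is joined against a public dataset on those same quasi-identifiers.

```python
# Fabricated illustration of a linkage attack: an "anonymized" release that
# keeps quasi-identifiers (ZIP code, birth year) is joined with a public
# voter roll sharing those fields, re-identifying the individuals.

medical_release = [  # names removed, quasi-identifiers kept
    {"zip": "02139", "birth_year": 1965, "diagnosis": "hypertension"},
    {"zip": "02139", "birth_year": 1982, "diagnosis": "diabetes"},
    {"zip": "10027", "birth_year": 1990, "diagnosis": "asthma"},
]

voter_roll = [  # public dataset: names plus the same quasi-identifiers
    {"name": "A. Smith", "zip": "02139", "birth_year": 1965},
    {"name": "B. Jones", "zip": "02139", "birth_year": 1982},
    {"name": "C. Lee", "zip": "10027", "birth_year": 1990},
]

# Index voters by quasi-identifier tuple, then join the two datasets.
voters_by_qid = {}
for voter in voter_roll:
    key = (voter["zip"], voter["birth_year"])
    voters_by_qid.setdefault(key, []).append(voter["name"])

for record in medical_release:
    matches = voters_by_qid.get((record["zip"], record["birth_year"]), [])
    if len(matches) == 1:  # a unique match re-identifies the patient
        print(matches[0], "->", record["diagnosis"])
```

Syntactic guarantees such as k-anonymity ensure only that at least k records share each quasi-identifier combination within one release; every additional external dataset shrinks the candidate sets further, which is why such models fail as data integration grows.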
Differential privacy circumvents this problem by not sharing anonymized datasets at all. Data is given to a curator who only answers questions based on the data, and the answers contain statistical noise to prevent leakage of individual information. This works, however, only if the number of related questions is constrained; that is, queries must be put on a budget. Overall, this arrangement is far too onerous for organizations to willingly adopt for internal purposes. It means giving up control over a crucial resource, making data difficult to analyze, and preventing resale to others, as well as constraining uses that are best done with non-anonymized data, such as advertising. The organizations and persons most likely to use big data for the public good--university researchers, non-partisan non-profits, and government agencies--are those most likely to be hamstrung by technical solutions for preserving individual anonymity in data. Such organizations and persons are also the least able to protect themselves from onerous legal requirements with respect to data privacy.
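The curator-plus-budget arrangement can be pictured with a toy sketch (illustrative only, not a production differential-privacy implementation; the class name, dataset, and epsilon values are invented):

```python
import random

class PrivateCurator:
    """Toy curator: raw data never leaves it, counting queries are answered
    with Laplace noise, and a finite privacy budget caps the questions."""

    def __init__(self, records, total_epsilon=0.2):
        self._records = records
        self._budget = total_epsilon  # cumulative epsilon available

    def noisy_count(self, predicate, epsilon=0.1):
        if epsilon > self._budget:
            raise RuntimeError("privacy budget exhausted; no more queries")
        self._budget -= epsilon
        true_count = sum(1 for r in self._records if predicate(r))
        # Laplace(0, 1/epsilon) noise, built as a difference of two
        # exponentials; a count has sensitivity 1, giving epsilon-DP.
        noise = random.expovariate(epsilon) - random.expovariate(epsilon)
        return true_count + noise

ages = [34, 51, 29, 44, 61, 38]
curator = PrivateCurator(ages, total_epsilon=0.2)
print(curator.noisy_count(lambda a: a > 40))  # first noisy answer
print(curator.noisy_count(lambda a: a > 40))  # second spends the rest
# A third identical query raises RuntimeError: without the budget, an
# analyst could simply repeat the question and average the noise away.
```

The budget is exactly what makes the scheme onerous in practice: once it is spent, no further questions can be asked of the curator, even for legitimate analysis, which is why organizations are unlikely to adopt it voluntarily for internal use.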
Second, the social and ethical risks of big data are appreciably broader than a narrow notion of privacy as individual anonymity. Some of this is evident from considering the functions privacy is meant to serve. An older academic literature on the social psychological functions of privacy, now largely ignored in the more technology-focused privacy literature, discusses privacy as a crucial element in a person's construction of their identity or sense of self. Privacy is about limiting information about oneself so as to construct boundaries that create the conditions for, and also help define, personal identities. The need for privacy comes about, in substantial part, from the recognition that the construction of personal identity is vulnerable to social influence. Other people could influence, perhaps even control, the construction of personal identity if they have access to sufficient private information. This information could allow them to use means of social control--social shaming, reputational attacks, and interpersonal manipulation--to alter the construction of personal identity and otherwise impose harms, such as limiting future options. Personal identity construction is not confined to young people, particularly in a society in which people move at an increasing pace into new jobs, new roles more generally, and new geographic locations. Research in psychology indicates that few adults have fully integrated identities, meaning that they accept alternate and often conflicting roles and identities. Numerous experimental studies have found that framing issues in different ways, ways that often evoke different identities and roles, elicits sharply different attitudes and decisions in individuals. In other words, many people have roles and identities that are sufficiently poorly integrated that knowledgeable actors can readily manipulate them: cause them to act or think in a way they would not if they knew as much as the manipulator. It is well known that social identities powerfully affect political attitudes and behavior, and such identities have apparently been used quite successfully to manipulate the public in the political sphere. Various lines of psychological and sociological research show that identities and roles guide much of day-to-day behavior as well, as people make their way through the day by enacting such roles as 'parent,' 'employee,' 'consumer,' 'friend,' 'husband,' 'citizen,' and many others.

Because these roles and identities are less than perfectly defined and integrated in most individuals, people are open to manipulation. An organization that could subtly shift what a person understands by, for example, 'employee,' 'consumer,' 'citizen,' or 'friend,' or that can affect when these roles are activated or the tradeoffs between them, could powerfully affect behavior. Big data increasingly puts large organizations in the position of being sufficiently knowledgeable about individuals or groups of individuals to effectively impose social controls, particularly manipulation, on individuals. The social science applications of big data and computationally intensive modeling and analysis techniques are expanding rapidly. Today, advertisers build profiles of individuals. In the not too distant future, social scientists may well be able to take the sum total of what people have said and done online, and done within view of increasingly omnipresent sensor networks, to construct detailed models of people's beliefs and understandings as well as their psychological, cognitive, and behavioral propensities. With such information, organizations will seek to determine what to say and do to people to shift them to a new belief system, personality, or identity configuration that benefits the organization doing the manipulating. 'Doing to' a person could be something as subtle as providing targeted discounts for certain activities that shift behavior and self-definitions over the long term. While organizations have thus far mostly shied away from other means of social control, such as shaming and reputational attacks, such means could come to be used in the absence of a clear and well-enforced regulatory regime against them, or given expanded possibilities to shame or attack people anonymously.

The prospects for capturing key aspects of people from social media and other data, and being able to use this information to influence them well beyond existing levels, are, of course, speculative. Nevertheless, large companies have placed multi-billion-dollar wagers that big data about people can help them affect behavior. We also know that people who know us intimately can 'push our buttons,' that most people are readily manipulated in psychological experiments, that people seem to be manipulated at substantial scale by political campaigns, and that a person's daily activity and interaction contexts greatly influence behavior. Big data is creating a world in which large organizations will likely have sufficient access to background information about people to know as much about them as intimates do, if not more. This is all the more possible to the extent that people increasingly use social media and related contexts to interact with their friends and to develop new identities and roles. These behaviors are permanently captured and then shared between organizations. What remains is for natural language processing, text mining, and other techniques to be further developed to make powerful inferences from this vast quantity of information. To the extent that good modeling of individual propensities becomes possible in the not too distant future, oligopolistic organizations could use such information to more thoroughly influence people by presenting them with certain frames and belief structures. This is already done, in relatively poorly targeted ways, in today's political campaigns.
Organizations could also tamper with people's marginal costs of activity, structuring their environments and behavioral affordances to influence people's development. For example, social media companies could alter whom people are likely to encounter online, to ensure they have the 'right' friends to influence them in a particular direction. Prices, advertising, and the availability of information (e.g., top search results) could be modified to ensure people encounter the 'right' information. Search companies already personalize search results based on past behavior, and companies already seek to influence the prices and information to which individuals are exposed. The full set of such developments in technologies of manipulation remains speculative, but such developments do seem to be in the realm of possibility. Consequently, the question is whether to risk potentially substantial social and ethical harm by allowing, as we currently are, large organizations to build a vast infrastructure for accumulating and analyzing big data about people and for influencing people's environments in various ways.

A technology of manipulation would, of course, be most effective if attuned to each individual. Nevertheless, even without information about specific individuals, a greatly improved technology of manipulation would still be possible with big data and emerging computationally intensive methods. Simply knowing, in vast quantity and detail, the behavior and verbal expressions of highly specific types of individuals should prove very useful in determining how to influence individuals of those types. Thus, even if individuals were successfully anonymized in big data, by technical or legal means, it would remain possible for organizations with access to such data to use it to determine how to manipulate rather specific types of people, whom these organizations could identify without big data. The privacy of individuals could be maintained even while individuals were being harmed. Because of this, the ethical implications of big data should be construed more broadly than the issue of individual privacy.

The solutions to the above concerns are apt to be complex and multi-faceted, particularly with respect to the tradeoff between the potential social and private value of big data and its potential harms. Nevertheless, any adequate solution is apt to embrace a number of key ethical principles that form a set of 'big data rights': the right to know, the right to anonymity, the right to be forgotten, and the right to be heard. Aspects of these principles also appear in the Consumer Privacy Bill of Rights (CPBR). There are, however, a number of ways in which the CPBR would need to be fortified to fully address the potential social and ethical harms of big data.

First, the rights must be treated as strong ethical rights. A strong right is not one subject to definition and voluntary self-restriction by the entities with the most to gain from violating it--organizations with large stakes in big data. The public and academia should be most involved in defining these rights and monitoring their protection in a changing environment.

Second, people have a deep right to know--one that extends to the inferences and profiles drawn about them and to exactly how these inferences and profiles are being used to affect them.
People have a deep right to know what organizations know about them, how this knowledge is being used with respect to them, and what would be different in their environment or information exposure were they not known to these organizations. For example, people should have access to a service that shows them search results that were withheld from them based on prior information held by a search or commercial firm. They should also have access to a service that presents an accessible summary of what organizations know about them, including inferences, along with access to organizations' detailed information about them and information on how the inferences organizations make about them are being used.

Once they know how their information is used, people need a practical mechanism to help them articulate their concerns and have these concerns impressed upon policy makers. That is, it is insufficient for people to be able only to express private alarm about what they learn concerning the uses of big data; they need public fora that allow them to articulate their concerns and relay collective concerns to policy makers. This is a meaningful 'right to be heard.' Policy makers should welcome continual input from the public with respect to something that could drastically, and not necessarily positively, shape society.

People should have a default right to anonymity, balanced against the social benefits of the data. Even when people are in public, they have the expectation that they are not being tracked and analyzed; tracking of that kind is stalking. People do not expect, even in public places, that someone is recording their every move and making inferences about those moves that add up to a detailed picture that may well invade their privacy. Similarly, people do not expect their online activities to be carefully tracked across websites, or their utterances on social media sites to be forever recorded and analyzed, with the data or inferences shared with other organizations. Until they obtain consent from affected individuals, organizations whose purposes are not entirely focused on the public good should not be allowed to track and analyze individuals via big data methods. Perhaps such organizations could persuade the public to make their data available through a third option of temporary tracking, which would let the organizations show people the benefits of maintaining information about them. Such data, and the consent to collect it, should also be erasable on request.

This is the right to be forgotten. People seeking a new start, as many do, were once able to move to a new town and begin anew, without others having preconceptions about them. Identity development, and change more generally, are also facilitated by a world in which others do not have perfect records of a person's past. A right to be forgotten is therefore important. The CPBR is thus far ambiguous with respect to a right to be forgotten. Corporations are, for example, not held responsible for data they do not control. This seems to allow a company to collect very substantial data about an individual and share it with third parties, data which is then out of the reach of a right to be forgotten because the company no longer controls it. A serious right to be forgotten would require every piece of data collected by a company to receive a unique identifier that would allow, with the right data systems, all instances of this data to be erased, even if held by third parties. Moreover, not only should the data be erasable; all inferences built on it should be undone as well. These inferences pose the greatest dangers to privacy.
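One way to picture such an identifier scheme is the following sketch (purely hypothetical; the stores and helper functions are invented and stand in for real data systems): each datum is tagged at collection with an ID that travels with every shared copy, so an erasure request can be cascaded to third-party holders.

```python
import uuid

def collect(store, subject, value):
    """Tag each datum at collection time with a globally unique ID."""
    datum_id = str(uuid.uuid4())
    store[datum_id] = {"subject": subject, "value": value}
    return datum_id

def share(source, recipient, datum_id):
    recipient[datum_id] = dict(source[datum_id])  # copy keeps the same ID

def forget(subject, *stores):
    """Erase every instance of a subject's data across all known holders."""
    for store in stores:
        doomed = [i for i, d in store.items() if d["subject"] == subject]
        for datum_id in doomed:
            del store[datum_id]
    # Undoing inferences, as argued above, would further require provenance
    # records linking each inference to the datum IDs it was built from.

first_party, third_party = {}, {}
datum_id = collect(first_party, "alice", "location trace")
share(first_party, third_party, datum_id)
forget("alice", first_party, third_party)
assert not first_party and not third_party  # every copy is gone
```

The scheme presumes cooperating data systems at every holder; without that, erasure cannot reach third-party copies, which is precisely the gap in the CPBR noted above.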
Having presented a framework for understanding big data and privacy-related issues, the specific questions from this RFI can be better addressed.

Question 1: The current U.S. policy framework is insufficient to fully address the ethical and social implications of big data. Too much emphasis is currently placed on technical solutions for ensuring the privacy of individual data, which is not the central issue with respect to the potentially adverse implications of big data. Organizations such as information-sector firms will likely not be required to use the more effective technical solutions internally. The existing framework does not capture the most important functions of privacy--namely, protecting people who are vulnerable to manipulation of their identities, roles, and beliefs. With such concerns in focus, the chief problem presented by big data shifts to the many possibilities it affords large organizations for improving their methods of influence. While the prospects for a vastly improved technology of manipulation based on big data remain speculative, they are not implausible; companies are making big wagers that such technologies will prove effective, and a risk-averse approach should be taken to minimize the potential harms of such advances. In particular, it is important to take seriously the notion that people have strong rights with respect to privacy that supersede the economic benefit of exploiting new technologies. Taking such rights seriously would require a far more stringent regulatory scheme than is currently being considered.

Question 2: Given the potential for large-scale abuse of big data technologies, measures should be in place to ensure that any organization that does not operate exclusively in the public interest, understood in a non-partisan way, is subject to constraints on its big data capabilities, particularly as outlined in the 'big data rights' above. Such rights implicitly call for strong regulation of these organizations. In the case of organizations that pursue the public good, rights should be balanced against the potential for contributions to the public welfare. Commercial organizations cannot be assumed to be exclusively or even primarily concerned with the public good. Universities and government agencies, on the other hand, are tasked with pursuing the common good and, ideally, act under careful scrutiny and transparency rules, unlike commercial organizations.

Question 3: This question is part of the problem. Technological solutions for ensuring the privacy of individual data are not a sufficient answer. Instead,
