Eighth Workshop on the Philosophy of Information



Developing formal ontologies for informed consent in medical research (Flavio D’Abramo)

Informed consent is meant to protect patients and research participants by providing all the information they need to decide autonomously about taking part in medical research projects. With current technologies – e.g. biobanking – informed consent is used to authorize both the sharing of individual biological samples and the circulation of data extracted from those samples, together with donors’ personal information. Different technologies and approaches have been put in place to handle biological data; in particular, these data have been enriched with semantic information to support their description, management, dissemination and analysis, though mostly within the biological sciences and less so in medical research. By contrast, the information expressed by research participants through informed consent processes is seldom handled electronically and almost never semantically enriched. Traditional informed consent is usually elicited once and for all, and information is conveyed only on paper and in face-to-face encounters. Using traditional informed consent for research based on high-throughput technologies distributed across different institutions and countries poses problems that challenge the validity of informed consent itself, of the related research and of international agreements. Developing ontologies for informed consent in medical research can help realize effective data sharing among researchers that respects the will of patients, by allowing the situated information expressed by research participants to travel with the data. Indeed, when coupled with general information about medical research (i.e.
general information on specific research projects, such as aims, benefits, risks, procedures and outcomes, project phases, related publications and funding), the semantically enriched information about the informed consent options chosen by patients might help realize the social pact between scientists and citizens according to which medical research must be at society’s service – here I see citizens’ will as a central component of the politics of science. An active participant might not only take part in research while controlling her choices about opting into specific projects, data sharing and research feedback, but might also enrich the data researchers need to test their hypotheses.
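A minimal sketch of what a semantically enriched, machine-readable consent record coupled with project metadata might look like (the schema, field names and policy values are all hypothetical illustrations, not an existing standard):

```python
# Hypothetical schema: a participant's consent options travel with the data.
consent_record = {
    "participant_id": "P-001",
    "options": {
        "sample_sharing": True,          # donate biological samples
        "data_sharing": "eu_only",       # restrict circulation of derived data
        "recontact_for_results": True,   # wants feedback on research outcomes
    },
}

# Hypothetical project metadata of the kind the abstract lists.
project_metadata = {
    "project_id": "BIO-42",
    "aims": "biomarker discovery",
    "risks": "re-identification from genomic data",
    "funding": "public",
}

def sharing_allowed(record, requesting_region):
    """Check a data request against the participant's expressed will."""
    policy = record["options"]["data_sharing"]
    return policy is True or (policy == "eu_only" and requesting_region == "EU")

print(sharing_allowed(consent_record, "EU"))  # request respects the stated policy
```

Once consent options are structured like this, a data request can be checked automatically against the participant’s will, rather than against a paper form held in a single institution.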

In this talk I propose some hypotheses describing the problems hindering such an implementation and some practical steps to develop such a project.

Epistemic Lenses: The Case of Informed Consent (Orlin Vakarelov & Ken Rogerson)

People lie. Lies have more impact in some situations than in others. If a child takes a cookie, the consequences are light. But if an untruth is part of a medical process, it can result in more troublesome outcomes. The medical field is well aware of its responsibility to “do no harm,” the historic vow of the Hippocratic Oath taken by physicians. But not all medical information is created equal. It may be impossible to explain the intricacies of brain surgery – and its consequences – to a layperson. What does this asymmetry in knowledge mean for the doctor and the patient?

One way to think about this medical information exchange is to imagine how vision changes through a pair of glasses. In a very real sense, optical lenses also lie (especially if we bracket the question of intentions and do not insist that lies are propositions of determined truth-value). We need to assume the following: 1) that there is a normal condition for optical exchange as an information transfer process and 2) that there is an agent that needs the information conveyed through the lens. Following these assumptions, it is possible that the presence of the lens, with its light-modifying properties, may lead to incorrect information getting to the agent. Lenses can also reveal the “truth,” so to speak. If one’s eyes distort the normal condition for information transfer, the correct lens can reverse the distortion. Note furthermore that because corrective optical lenses may provide epistemic benefits, they may have moral value. It is not unreasonable to say that in some circumstances one ought to provide for corrective lenses. Therefore, it is also not unreasonable to say that there should be public policy related to the availability of corrective optical lenses and, of course, there are such policies [1]. Optical lenses are, of course, not terribly interesting. We are interested in a different but analogous problem: Can there be epistemic lenses? Lenses that will aid improved understanding of transmitted information? Especially in the medical field?

What are epistemic lenses? Consider an example from medical research. In drug testing, researchers need subjects in order to test the effectiveness of pharmaceutical interventions. We need sick people to see if the drugs work. Various ethical considerations demand that patients provide informed consent to participate in these studies. Often, human studies for new drugs involve terminally ill patients who agree mainly because they think that the new drug gives them new hope.¹ Let us call this the hope factor. It is reasonable to argue that a correct assessment of the hope factor is important for informed consent. The patients are taking a risk and need a correct assessment of the return. Giving false hope would be like misrepresenting the value of the risk. The consent, thus, requires that the researchers provide accurate information about the experiment and the likelihood of a successful outcome for the patient.

The experimenter is ethically obligated to be as truthful as possible about the possible outcome of – and any possible risks from – the experiment. Here is the problem: there is ample evidence [2,3] that potential subjects of such experiments systematically misunderstand or even outright ignore the explanations, and are left with incorrect assessments of the outcomes. In the literature, there do not seem to be any good solutions to this problem, which we can describe as an epistemic distortion.

The situation resembles (suggestively) the case of optical distortion described above. There is an information medium – light beams in one case or a message in the other – that is subject to normal conditions of evaluation – no distortions of the light or of the interpretation of the message’s semantic content. The audience, however, in a sense distorts the medium in such a way that it effectively receives different information. The normal condition of the medium is not normal for the audience. The solution in the optical case, as we observed, is to introduce corrective distortions to the medium that compensate for the distortions introduced by the audience. A corrective epistemic lens is a distortion of the information medium that modifies the content of the message so that, when processed by the distorting cognitive agent, the agent will make judgments similar to those of a non-distorting agent. Let us call a non-distorting agent an epistemic expert.

In this paper, we show that the concept of an epistemic lens is well defined and identifiable. We further argue that in the case of medical informed consent, corrective epistemic lenses are morally obligatory. Finally, we will propose some optimal circumstances under which corrective epistemic lenses function best.

Let us begin by identifying several assumptions about the communications process that may make people hesitate to accept this type of approach.

First, there is a tendency to assume that semantic evaluation is categorical – a message is either true or false (or undetermined). Consider the following statements that might appear in a hypothetical medical communication:

  1. The probability of successful outcome of the experiment is 15%.
  2. The probability of successful outcome of the experiment is 17%.
  3. The probability of successful outcome of the experiment is 90%.

Clearly, the reactions that 1 and 2 ought to produce would be very similar to each other and very different from the reaction to 3. For the purposes of the patient’s decision making, the information in 1 and 2 is nearly identical, while the information in 3 may lead to a very different decision. Information media can convey information of varying non-categorical degrees (and dimensions) of variation. The decisions, or the confidence in decisions, made by a cognitive agent may vary by similar degrees. The categorical nature of truth-functional semantics often creates unnecessarily sharp corners for human decision-making. A semantics of non-categorical information media is possible; probability theory has been developed partly for this reason. Such sophisticated semantics are notoriously difficult for non-experts to follow correctly. It should be expected, thus, that corrections of epistemic distortions would be more appropriate for non-categorical media. As a hypothetical example, if John systematically “misunderstands” such statements by overestimating by 10% (in the mid-range of probabilities), this may be compensated for by reducing the stated number by 10%.
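The hypothetical compensation for John’s bias can be sketched in a few lines (the 10-point bias, the mid-range cutoffs and the function names are all assumptions for illustration, not an empirical model of how patients misread probabilities):

```python
def perceived(p, bias=0.10):
    # John's hypothetical systematic distortion: he overestimates
    # mid-range probabilities by about 10 percentage points.
    return p + bias if 0.2 <= p <= 0.8 else p

def corrective_lens(p, bias=0.10):
    # The corrective epistemic lens pre-compensates the stated figure
    # so that John's distorted reading lands on the true value.
    return p - bias if 0.2 <= p <= 0.8 else p

true_p = 0.5
stated = corrective_lens(true_p)  # the researcher states 0.4 instead of 0.5
print(perceived(stated))          # John's judgment ends up near the true 0.5
```

The point of the sketch is only the composition: the lens distorts the message in the direction opposite to the audience’s distortion, so the two cancel.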

Second, there is a tendency to assume that the content of verbal messages is determined entirely and uniquely by the form of the message and by various contextual parameters set at the time of the utterance. As a result, epistemic judgments, such as whether the message is true, or moral judgments, such as whether the message is a lie, are settled before the message is received or comprehended by the audience, and the significance of the audience in the analysis is diminished. When we say that “the experimenter is ethically obligated to be as truthful as possible about the outcome of and any possible risks from the experiment,” there is a tendency to think that this is determined independently of the audience.

We have argued elsewhere that the best analysis for information transfer from one agent to another is based on a pragmatic theory of information. The most important condition that a pragmatic theory of information adds to the traditional accounts is that the receiving agent must be able to act on the information in pursuit of a specific goal; that is, the information must be what we call actionable. The effect of the condition of actionability is that the state and dispositions of the receiving agent are relevant for the analysis of the message, including its semantic analysis.

While pragmatic accounts make notions such as “content of the message” dependent on the receiving agent, they allow the identification of normal conditions for communication. Such conditions may be based on “typicality” judgments, that is, how people “normally” understand certain expressions. However, such conditions may also be based on “rational” criteria, such as how the message ought to be interpreted according to some theoretical analysis. Either way, the pragmatic account allows for an assessment of how audience-based deviations from the norm affect communication. (Note that the case is similar for the optical problem, where the normal conditions about parallel light rays depend on the nature of the typical visual system.)

The case of the “rational” criteria for normal conditions is especially interesting for our case study. The reason is that many claims about medical experimental results, probabilities of outcome, etc., have precise theoretical conditions of interpretation. Correct interpretation of claims about probabilities, for example, and rational decisions based on such claims, demand special epistemic expertise. As a result, most non-experts, such as a typical terminally ill patient, may not only misunderstand such claims, but may do so in systematic ways.

If a researcher knows that a patient will systematically misunderstand the explanation of an experiment or the likelihood of its result, then constructing a message that meets the normal conditions of interpretation will effectively misinform the patient. If, moreover, the researcher can systematically modify the message so that the patient gets the correct interpretation, then the researcher ought to do so.

We would like to pre-empt several potential objections to a theory of epistemic lenses: (1) the do-not-lie objection, (2) the autonomy objection, (3) the alternative-solution objection, and (4) the slippery-slope objection.

The do-not-lie objection states that the modification of the message is a form of lying and lying is wrong. Indeed, if one talks to an expert, a modified message would be a lie. Such an analysis is based on a non-pragmatic theory of the communication process. According to our analysis, the objection is based on an incorrect application of the normal conditions to a non-normal case. If one is in a non-normal case of communication – not talking to an expert – then one ought to adjust accordingly. A difficulty may arise when the message is used in different conditions. For example, the recipient may use it at a later time, or the message may be communicated to another person. In such cases, maintaining normal conditions may be a safer policy. This suggests that epistemic lenses may be used more safely when the message is only significant in the specific circumstances and to the specific person. We believe that the case of informed consent for a medical study meets this condition.

The autonomy objection states that modifying the message is patronizing and violates the audience’s discretion to use the information as they please. Such an objection is misguided for two reasons. First, modification of the message would be patronizing only if the goal is to manipulate audience behavior, because one does not trust the audience’s ability to make good decisions, or if the message may harm the audience. The old, and now rejected, practice of hiding terminal diagnoses from patients in order to protect them from bad news indeed violates the autonomy of the patient. Modifying a message to correct misunderstanding of probabilities, for example, does not override a patient’s ability to make decisions according to personal goals and values. Second, not applying the epistemic lens may violate the patient’s autonomy because the patient is effectively misinformed. It should be noted that the researcher is not a neutral party. The patient is valuable as a data point. There is an opportunity for the researcher to appear truthful – in order to seem to comply with the informed consent rules (de jure) – while not being truthful de facto.

The alternative-solution objection states that instead of modifying the message, which looks too much like misinformation, the researchers ought to make sure that patients are in the normal condition. This all sounds good until one thinks about the burden this places on both the audience and the researchers. Fully meeting the normal conditions, in effect, means that patients must become experts in the medical topic at hand. In some simple cases, where one can easily educate the audience, this indeed may be preferred. [2] and [3] report that the most effective intervention techniques to improve patient understanding involve some form of targeted education. The best, but most expensive, strategy is to arrange an extended conversation between the researcher and the patient. Even such a strategy, however, shows only small improvement. Moreover, the studies show that overwhelmingly the most important factor for better understanding is the educational achievement of the patient. In other words, the closer one is to an expert, the better one understands. To have a real effect, it seems, targeted education would work only if it is real education. It is not reasonable that participants in a study must take a six-month course in probability theory before they can consent.

Finally, the slippery-slope objection is that, as a matter of policy, if we start allowing agencies to modify messages away from publicly examinable norms for content, the door may open to manipulation. This may be especially aggravated by the fact that the conditions that epistemic lenses correct are private and may reveal vulnerabilities. This is a very serious objection. As with most slippery-slope arguments, however, the objection shows that the conditions under which an epistemic lens is appropriate are complex and must be investigated independently for each case. We believe that, in the case of medical informed consent, an argument can be made for the benefit of epistemic lenses. In other cases, one may have to be more careful. We acknowledge that this is a difficult public policy problem that must be investigated with care. However, epistemic lenses offer an opportunity for a more efficient policy environment that is sensitive to individual differences and needs. This is in line with the emerging field of e-governance that emphasizes individualized solutions. We also agree strongly with the concern for abuse and endorse strong caution in the application of epistemic lenses.

For this paper, we want to explore the conditions under which information flowing through an epistemic lens has the greatest impact, as applied to the case of medical informed consent. The case is especially relevant given the advances in medicine and the access of more and more people to health care, whether through socialized medicine or greater availability of health insurance. Health care is expensive, can be invasive and is very personal. The epistemic lens can provide an avenue for a more in-depth conversation about the best way to help people understand both the risks and the benefits.

Data flow & the shaping of hereditary etiology. Past, present, future (Giulia Frezza)

In my paper I propose to look at the history of medical etiology to highlight how shifts in the understanding of data have shaped the etiological debate. My aim, in particular, is to show how the relation between nature and nurture interwove with and impacted the etiological debate within the framework of hereditary theories. Analyzing the crossroads between heredity and causality in the history of etiology will also allow me to outline more recent issues. The way we handle, observe and consider data ultimately depends on the entanglement/disentanglement of their sources, understood as natural or cultural – as is happening in environmental epigenetics and in the nascent debate about the novel germ-line applications of CRISPR-Cas technology, where the return of eugenics is at stake.

The cradle of pluralist etiology within the humoralist paradigm, where the cause was just one among many influences, broke around the middle of the nineteenth century, when a parallelism can be outlined between the rise of the mono-causal bacteriological model in etiology (one cause, one disease, and possibly one therapy), the establishment of the scientific explanation of heredity, and the spread of Galton’s nature/nurture distinction. While positivism and modern experimental medicine supported the idea of isolating the cause, rather than of interaction among influences, physicians became acquainted with heritability when confronting the trans-generational effects of unstable heritable diseases, establishing the idea of a latent causality of disease. Data opened a novel gaze in physicians’ analyses. The increasing amount of clinical data due to the urbanization process was compelling in assessing the variability and indeterminacy of effects visible at the level of individuals, families and subgroups of the population, eventually leading to an epidemiological approach.

Galton, by means of the nature/nurture distinction, introduced yet another novel gaze on data, outlining pioneering techniques and concepts such as the normal distribution of traits and the hereditary transmission of mental ability, especially through his famous twin studies. Briefly, according to Galton’s analysis, at birth there is great potential for change and development, but in the competition between nature and nurture, nature proves the stronger. However, Galton underlines that when the observed data concern the interaction between nature and circumstance, what rises is complexity, rather than distinction and separateness.

After the Mendelian scientific explanation of hereditary transmission, Morgan’s seminal fly-group experiments took place, opening a subtler level of analysis by magnifying data and discovering underlying genetic mechanisms. As a consequence, the extremely complex nature of embryonic development and inheritance was revealed, and the more “elementary” mechanics of gene transmission was exploited. Since the tremendous achievements of molecular biology around the middle of the twentieth century, the revolutionary notion of the gene as the core of genetic data analysis established the concept of linear and internal determination of hereditary transmission and development. Two metaphors became influential in this model: (i) the gene is the biologist’s atom; (ii) genes are linear, unidirectional flows of information. Both ideas concealed the action/reaction between individual development, the environment and the multiple set of causalities crossing over heredity and development.

This shift affected developments in biology, neglecting important topics of research recently raised again by evo-devo theories and epigenetic studies. Moreover, the need for a finer inquiry into the complex causality revealed by data flows and network interactions grew especially in biomedicine. Nature (gene) and nurture (environment), once considered isolated causes, are intertwined when dealing with the most common non-infective diseases, such as non-Mendelian complex disorders like cancer or cardiovascular events. Evidence of complex etiology again turned the flow of data upside down: from focusing on isolated natural/cultural causes to focusing on how gene and environment interact.

For instance, the epigenetic model proposes a reversal of the genetic information flow: from focusing on coding DNA to focusing on DNA expression, by means of multifarious epigenetic mechanisms – switching, silencing, reversible and heritable – which are even said to “bridge the gap” between nature and nurture by simply blurring their distinction. In this perspective, the multifactorial mix of distal social factors (poverty, inequality, stress, etc.) among etiological components can be traced back to underlying epigenetic mechanisms. From a sociological perspective this idea has been variously criticized, by stressing, for instance, the issue of parents’ responsibility vs. political engagement in health prevention, and by emphasizing that blurring the epistemic distinction between natural and socio-economic inequalities does not eliminate the very existence of those inequalities (from parents’ risky behaviors, such as smoking, to poverty, famine and so on).

Currently, a novel shift in the consideration of genetic and epigenetic flows may emerge thanks to a simple discovery that is opening a revolutionary chapter. CRISPR-Cas technology is an application to genetic engineering of a bacterial immune system for editing, regulating and targeting genomes; it is already changing our way of understanding biological life and will eventually yield tremendous heuristic insights and applications, from human health to reshaping the biosphere. The germ-line application of such a rapid, precise and inexpensive device – say, in pre-natal diagnosis – is potentially revolutionary, overturning bioethical and sociological debates about inequalities in health prevention, and the issue of eugenics looms again.

A fundamental task for future debate will be assessing the impact of what data stability, data reversibility and data design mean. How does genetic data change when an immediate and inexpensive application for “re-programming life” becomes available? In the light of their potential immediate application and spread, do those genetic data convey the same kind of information as before?

Representing and unifying biomedical knowledge: the case of Gene Ontology (Federico Boem)

In biomedical research an experimental result can be grounded on the consistency of the methods adopted and on the locality of its production, namely, the experimental conditions. As Jacob remarked, “in biology, any study [...] begins with the choice of a ‘system’. Everything depends on this choice: the range within which the experimenter can move, the character of the questions he is able to ask, and often also the answers he can give” (Jacob 1987). Thus biological findings seem to be strictly dependent on the locality of their production. The possibility of generalisation is very problematic in biology, and making claims about biological phenomena beyond the locality of data is often difficult within traditional approaches. Ontologies seem to overcome such locality (see for instance Leonelli 2009), since they can exploit the knowledge produced in a specific context and make it available to another (even very different) one. To put it differently, ontologies broaden Jacob’s notion of the experimental system to the entire realm of biological knowledge. However, stated in these terms, such a difference could be just an evocative picture. My proposal is to provide an epistemic justification for the unifying power of ontologies. In particular, I will focus on the structure of the Gene Ontology (GO). By examining both the epistemic reasons for its implementation and the type of analysis provided by GO, I will show how such a tool resembles some features of a map but nevertheless constitutes something new in the epistemological scenario. Not entirely a theory, more than a model (but structurally similar to one), GO, I will argue, is a novel category within the epistemic repertoire. I then claim that the knowledge provided by GO should be seen as a more or less effective tool through which we can discriminate, among an enormous amount of data, a convenient way of organising the empirical results that were at the basis of the GO analysis.
Accordingly, such a specific status will be better specified given that GO is both conventional, as the result of epistemic interactions towards a common agreement, and normative, since the tool shapes the representation of knowledge as it will be perceived by other, future researchers. In conclusion I will suggest that GO is an orienteering tool with which scientists can map their data onto a wider context and then, thanks to this, elaborate new experimental strategies. GO is thus a map for making the conceptual content of a particular experimental condition comparable across different research contexts. Such a map is essential not as a way to confirm experimental results but as a way to compare experimental results with the theoretical background (the so-called ‘big picture’). Lastly, I will address the fact that ontologies are considered a unification tool. In taking into account the possibility of such a generalisation (beyond the locality of data production), I will show that GO does not create, per se, a unification of theoretical content. My proposal is then to clarify what, and how exactly, GO unifies.
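The map-like role attributed to GO here can be illustrated with a toy sketch: two studies from different experimental contexts become comparable once their genes are annotated with a shared controlled vocabulary (the gene names are invented and the GO term IDs are used purely for illustration, not drawn from any actual annotation set):

```python
# Hypothetical annotations: each local study maps its genes to GO term IDs.
fly_study = {"geneA": {"GO:0006915", "GO:0008283"}}    # one experimental context
yeast_study = {"geneB": {"GO:0006915", "GO:0007049"}}  # a very different context

# The shared vocabulary is what makes the two local results comparable:
shared_terms = fly_study["geneA"] & yeast_study["geneB"]
print(shared_terms)  # the common functional ground between the two contexts
```

Nothing in either experiment refers to the other; the comparability comes entirely from the common annotation layer, which is the sense in which the ontology acts as a map rather than as a theory.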

Making Big Data Useful in Biomedical Sciences: The Need for Serious and Pluralistic Methodology (Stefano Canali)

As a consequence of the advent of big data, a number of characterisations regarding the role of data have been proposed. According to a popular view, the presence of more data revolutionises this role, making it the most important element of research. Other methodological elements, such as hypotheses and models, are either generated directly by the data (Kitchin, 2014: 2-3) or, according to stronger views, made redundant by the fact that in big data we can find all the information we need (Mayer-Schönberger and Cukier, 2013). A typical example used to illustrate how this view applies in biomedical sciences is Google Flu Trends, which analyses search data to predict flu activity: here, letting data ‘speak for itself’ and play its role as main character is all that is necessary (Mayer-Schönberger and Cukier, 2013: 26-28).

In recent discussions about biomedical sciences, however, many have argued that this view is implausible. Leonelli (2015) has proposed a novel definition of data, according to which data is what can be used for making statements about phenomena and can be shared with others. This definition, in my opinion, has consequences for the possible roles data can play in biomedical sciences. Namely, the two elements of the definition – usage for making claims and sharing with others – entail that data cannot drive research: work needs to be done with the data, insofar as it has to be used by someone to make some kind of statement and it has to be made mobile to be shared with others.

Thus, following Leonelli, the role data can play depends on the work scientists do on the data itself. Consequently, I think a number of questions arise concerning what kind of work should be done with big data in order to make it capable of playing a significant role in biomedical sciences. By studying methodological choices in big data research, it is possible to give answers to this question. Here, I present a comparison between two projects of big data research – Google Flu Trends (GFT) and EXPOsOMICS – arguing that a serious and pluralistic approach to methodology can help us exploit the value of big data.

GFT became famous at the end of 2009, as it predicted the spread of the swine flu epidemic weeks before the report of the Centers for Disease Control and Prevention (CDC); the project was described as a success in the letters section of Nature (Ginsberg et al., 2009). However, in 2013 a number of articles highlighted how GFT had predicted substantially different numbers for the spread of seasonal influenza, estimating roughly more than double the doctor visits reported by the CDC (Butler, 2013; Olson et al., 2013). Arguably as a consequence of this and other failures, in August 2015 Google announced that they would stop publishing flu estimates. According to Lazer et al. (2014), many lessons about big data methodology can be learnt from the failure of GFT. In particular, Lazer et al. argue that two main issues affected GFT: problems with Google’s algorithm dynamics and what they call “big data hubris”; let me focus on the latter. By big data hubris Lazer et al. mean that the software ignored methodological issues with combining big and small data, as well as with correlations more generally. That is, GFT’s methodology consisted in looking for strong correlations between roughly 50 million search terms and 1152 data points; with such a high disproportion between the two datasets, Lazer et al. suggest that “the odds of finding search terms that match the propensity of the flu but are structurally unrelated, and so do not predict the future, were quite high” (Lazer et al., 2014: 1203). Another crucial issue was the focus on search data only: comparing and combining search data with other health data would have made GFT more useful.
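The disproportion Lazer et al. point to can be illustrated with a small simulation (a toy sketch, not GFT’s actual pipeline: 2,000 random predictors and 20 data points stand in for the 50 million search terms and 1152 data points):

```python
import random

def pearson(x, y):
    """Plain Pearson correlation coefficient over two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

random.seed(0)
n_points = 20        # few ground-truth observations (CDC-style reports)
n_predictors = 2000  # many candidate search-term series, all pure noise

target = [random.gauss(0, 1) for _ in range(n_points)]
best = max(
    abs(pearson([random.gauss(0, 1) for _ in range(n_points)], target))
    for _ in range(n_predictors)
)
# With enough noise predictors, some of them correlate strongly with the
# target by chance alone -- correlations that predict nothing out of sample.
print(round(best, 2))
```

Every predictor here is structurally unrelated to the target, yet the best in-sample correlation is substantial, which is exactly the hubris: with millions of candidates and few data points, strong correlations are nearly guaranteed and carry no predictive weight.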

Considering the points made by Lazer et al., I would argue that GFT’s issues have to do with a lack of engagement with a serious (overlooking potential issues affecting correlations) and pluralistic methodology (in terms of the few data points, databases and statistical approaches used). As a consequence, from the analysis of this case study we may draw the suggestion that having a plurality of tools and a serious consideration of methodology is crucial for big data research. While this may seem a rather general suggestion that is difficult to implement, I think we can find positive examples of big data research applying such an approach. For instance, EXPOsOMICS (www.EXPOsOMICSproject.eu) is a big data project which studies the relation between environmental elements and disease. One of the many interesting aspects of this project (see Illari and Russo, 2013) regards its methodology. That is, EXPOsOMICS’ methodology can be defined as pluralistic, since researchers use a large variety of different databases (from omics to data collected through questionnaires and smartphones, see Vineis et al., 2009 and Wild, 2012: 26) as well as different statistical models (see Chadeau-Hyam et al., 2013). Moreover, methodology is a serious aspect of EXPOsOMICS research, and scientists dedicate entire papers to its discussion and presentation (see again Chadeau-Hyam et al., 2013 and, more recently, Assi et al., 2015). Through this pluralistic and serious way of working with big data, researchers aim to assess risk related to the environment, predict disease evolution and suggest consequent policy interventions. While EXPOsOMICS’ approach is considerably new and it may be too early to assess its success, points in its favour are the funding received from the European Commission (€8.7 million) and the number of important articles already published as part of the research.

By studying the methodological approach of EXPOsOMICS and comparing it to GFT, we can find significant suggestions for future big data research; additionally, in this way we can see how much work needs to be done with the data in order to make it useful in the first place and let it play a crucial role in current biomedical sciences.

Privacy without (too much) confidentiality: ethical considerations in genomic and big data health research (Alessandro Blasimme & Effy Vayena)

The problem of how to treat personal data in medical research is a special case of the more general ethical issue regarding the moral significance of privacy.

Most of the (philosophical and legal) literature on privacy has thus far produced a variety of competing definitions and ethical justifications1. The aim of this paper is not to propose a unified normative theory of privacy, nor to define general criteria for the obligations that derive from confidentiality. Rather, we explore if and how privacy can be defended on ethical grounds in the face of the huge practical transformations provoked by the use of genome sequencing and the increasing importance of big data in clinical research.

Experts argue that such developments will transform clinical research in the years to come, as huge amounts of personal information on unprecedentedly large cohorts of individuals will be generated, shared and analyzed for research purposes2,3. As a consequence, a much larger array of users will have access to extensive datasets about individuals. According to many, the benefits deriving from such extensive personal datasets (in terms of knowledge, new therapies and public health) will have to be balanced against a reduction in the stringency of current confidentiality standards4. On the basis of such predictions, and in light of people’s willingness to share their data with the research community, it has been argued that current confidentiality standards and privacy protections are anachronistic5.

It is however important to distinguish between relaxing confidentiality standards (thus fostering openness in the name of the expected benefits of genomic and big data health research) and downplaying the moral importance of privacy. Through a conceptual analysis of the meaning and significance of privacy violations in clinical research ethics, we will show that accepting reduced confidentiality obligations does not logically entail altering the moral significance of privacy (see below).

In the context of clinical research ethics, privacy generally means informational privacy and it is mostly grounded in the value of autonomy. What privacy protections are supposed to prevent are privacy invasions, that is, unauthorized access to or disclosure of health-related information: this is the so-called control-based account of privacy6,7,8.

Specific literature on genetic privacy has highlighted the limits of control-based theories9. We take stock of this critique and propose an agency-based account of privacy violations. In our account, privacy violations are not limited to privacy invasions, but also include privacy infringements: an infringement corresponds to a use of personal health information that limits a person’s capacity to access a number of fundamentally valuable freedoms in the sphere of private life (e.g. forming intimate relationships, developing a sense of self and personhood, maintaining authenticity, maintaining self-esteem, avoiding social disapproval etc.). For example, this capacity can be endangered by stigmatization and discrimination based on health-related information10,11,12, and it is independent of whether or not information is obtained following explicit authorization. As has rightly been pointed out, a person may experience such harms even without knowing that they were caused by abuse of health information13. Those who possess personal health information thus carry a responsibility to minimize the risk of privacy violations of this kind.

Autonomous authorization to access and to share personal data is just one of the necessary ethical conditions for saying that a person’s health privacy is respected. The other necessary and jointly sufficient condition has to do with avoidance of serious harms deriving from loss of fundamental freedoms in the sphere of private life. Unless both conditions are fulfilled, it makes little sense to say that a participant’s privacy is respected.

It follows that agreeing to share data with researchers under slack confidentiality standards does not imply that no use of those data can any longer constitute a violation of the participant’s privacy. Indeed, the harm that can derive from stigmatizing or discriminatory uses of personal health information does not disappear just because the information was lawfully collected (without privacy invasions), or because its circulation was explicitly authorized (through informed consent to specific rules of confidentiality) or otherwise permitted (for example by “broad-consenting” to lessened confidentiality standards).

This point, far from being of purely conceptual relevance, has practical implications for the way research participation is organized.
To harness the potential of genomic and big data health research, we might indeed need to accept less confidentiality – and thus allow researchers to share data about participants even without explicit authorization. However, this does not imply that participants no longer have stakes in avoiding privacy invasions, nor that they have less expectations and interests in avoiding the harms typically connected to privacy infringements.
As a consequence, relaxing confidentiality standards in view of a more open data ecosystem, we reckon, does not undermine the value of privacy. Privacy retains its importance, and arguably an acceptable level of privacy protection is still possible even without (too much) confidentiality, provided certain conditions are met that mitigate potentially increased privacy risks: that data can be tracked; that participants are empowered to select upfront authorized and non-authorized data uses and data users; and that data sharing takes place within a non-stigmatizing cultural environment and under the protection of effective anti-discrimination laws. However, even if those conditions are respected, it is likely that some of the burden of privacy protection will eventually shift from researchers to research participants.