Editor’s Note: The study, A Governance Framework for Algorithmic Accountability and Transparency, published in 2019 by the European Parliamentary Research Service (EPRS), develops policy options for the governance of algorithmic transparency and accountability, based on an analysis of the social, technical and regulatory challenges posed by algorithmic systems. Drawing on a review and analysis of existing proposals for the governance of algorithmic systems, the study proposes a set of four policy options, each addressing an aspect of algorithmic transparency and accountability. This study extract highlights technical solutions for reducing the opacity of algorithmic systems and may be useful for cybersecurity, information governance, and eDiscovery professionals as they consider algorithmic systems.
Study
A Governance Framework for Algorithmic Accountability and Transparency
By Ansgar Koene (main author), Chris Clifton, Yohko Hatada, Helena Webb, Menisha Patel, Caio Machado, Jack LaViolette, Rashida Richardson, and Dillon Reisman
Study Abstract
Algorithmic systems are increasingly being used as part of decision-making processes in both the public and private sectors, with potentially significant consequences for individuals, organizations, and societies as a whole. Algorithmic systems in this context refer to the combination of algorithms, data, and the interface process that together determine the outcomes that affect end users. Many types of decisions can be made faster and more efficiently using algorithms. A significant factor in the adoption of algorithmic systems for decision-making is their capacity to process large and varied data sets (i.e., big data), which can be paired with machine learning methods in order to infer statistical models directly from the data. The same properties of scale, complexity, and autonomous model inference, however, are linked to increasing concerns that many of these systems are opaque to the people affected by their use and lack clear explanations for the decisions they make. This lack of transparency risks undermining meaningful scrutiny and accountability, which is a significant concern when these systems are applied as part of decision-making processes that can have a considerable impact on people’s human rights (e.g., critical safety decisions in autonomous vehicles, or the allocation of health and social service resources).
Study Extract
Technical Solutions for Reducing Opacity
Just as there can be technical reasons for opacity of algorithmic systems, there are technical methods for reducing algorithmic opacity, or extracting explanations for the system behavior despite a lack of transparency.
First, we divide transparency and explanation into two categories: Understanding the overall system, and understanding a particular outcome. These may require quite different approaches. For each category, we list several approaches, and briefly discuss what each does and does not provide.
A key idea to keep in mind is the goal of transparency. Is it to understand how the system works? Or how it behaves? From a regulatory viewpoint, a primary issue is likely whether the outcome is fair and appropriate, so behavior is critical. The regulatory issues governing process are likely more straightforward: GDPR, for example, forbids processing of certain types of personal information except in prescribed circumstances. This simply requires determining if such information is used by the system; in cases where use is allowed, the onus could be placed on the developer to explain and ensure that such use was proper. The key challenge, then, is transparency into system behavior, and we should evaluate methods with respect to how they support explanation.
Understanding the Overall System
The goal here is to obtain a general understanding of the process by which an algorithmic system makes decisions. One challenge with the approaches described below is that they are likely to be difficult or impossible without direct involvement of system developers.
Design Review / Code Review
Design and code reviews are methods from software engineering, used to enhance the reliability of a system being developed and ensure that it satisfies requirements. Various techniques are used, such as mapping a specific requirement to the design-level and code-level modules that address that requirement. This does provide an opportunity for transparency, and research showing that traditional code reviews often find issues with code understandability rather than specific defects suggests the viability of code reviews for improving transparency.
Unfortunately, design and code reviews are expensive and time-consuming and typically operate at a level that involves proprietary information. Furthermore, as noted in section 3.5.4 (see complete study), in a system using machine learning, this provides little transparency. The review may show that the process for building the machine learning model is as expected, but provides little insight into what that model actually will do.
Input Data Analysis
Input data analysis can be used to determine if the data being used to make decisions is appropriate and consistent with legal requirements. The process is to analyze a system at either design or code level to determine all information that is provided to a system when making a decision. This can be useful for determining regulatory compliance, e.g., a system that does not have access to race, gender, etc. as input may not be capable of direct discrimination, and thus not in violation of GDPR Article 9. This provides little insight into system behavior but can be a useful step provided issues with proprietary information can be resolved.
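The check described above can be partly automated once the list of inputs has been extracted from the design or code. The following is a minimal sketch, assuming the system's input schema is available as a list of field names. The field names and the `SPECIAL_CATEGORY_FIELDS` set are hypothetical illustrations, not a legal mapping of GDPR Article 9.

```python
# Hypothetical list of field names treated as sensitive for this check.
# It follows the examples in the text (race, gender) plus a few assumed
# entries; it is NOT an authoritative reading of GDPR Article 9.
SPECIAL_CATEGORY_FIELDS = {
    "race", "ethnicity", "gender", "religion",
    "political_opinion", "health_status", "sexual_orientation",
}


def find_sensitive_inputs(input_fields):
    """Return the input fields whose names match the sensitive list."""
    # Normalize names so that "Gender" and " gender " are both caught.
    normalized = {name.strip().lower(): name for name in input_fields}
    return sorted(
        original for key, original in normalized.items()
        if key in SPECIAL_CATEGORY_FIELDS
    )


# Hypothetical input schema of a credit-decision system.
schema = ["income", "postcode", "Gender", "payment_history"]
flagged = find_sensitive_inputs(schema)
print(flagged)
```

A name-based check of this kind only finds direct use of a listed attribute. An empty result says nothing about attributes that may act as proxies, which is why the text treats this step as a starting point rather than evidence about system behavior.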
Statistical Analysis of Outcomes
For addressing some concerns, overall analysis of outcomes can be useful. For example, this can be used to identify indirect discrimination: is a protected group disproportionately affected in a negative way? The challenge is that this often requires obtaining data that would otherwise not be used. For example, a machine learning model may not make use of race or gender (avoiding direct discrimination); to store this information anyway conflicts with the principle of data minimization and places more individual data at risk, and requiring this could potentially be considered a violation of GDPR Article 11.
An alternative approach is to create test datasets (either in a protected regulatory environment, or using synthetic data) that can be used to evaluate if overall statistical outcomes suggest there are issues. For example, standard statistical evaluation techniques could be used to determine if outcomes or accuracy are different for specific subgroups of individuals, suggesting fairness problems. This is particularly useful with static models, although it may be more difficult with continuous learning systems.
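As a rough illustration of such a subgroup comparison, the sketch below computes the favorable-outcome rate for two groups in a synthetic test set and a two-proportion z statistic for the difference. The data, group labels, and function names are invented for illustration, and the z-test is one assumed choice among the standard techniques the text alludes to.

```python
import math
from collections import defaultdict


def outcome_counts(records):
    """Count favorable outcomes and totals per group.

    records: iterable of (group_label, outcome) pairs, outcome is 1 or 0.
    Returns a dict mapping group -> [favorable, total].
    """
    counts = defaultdict(lambda: [0, 0])
    for group, outcome in records:
        counts[group][0] += outcome
        counts[group][1] += 1
    return dict(counts)


def two_proportion_z(fav_a, n_a, fav_b, n_b):
    """z statistic for the difference between two favorable-outcome rates."""
    pooled = (fav_a + fav_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (fav_a / n_a - fav_b / n_b) / std_err


# Synthetic test data, fabricated purely for illustration.
test_records = (
    [("A", 1)] * 60 + [("A", 0)] * 40
    + [("B", 1)] * 30 + [("B", 0)] * 70
)
counts = outcome_counts(test_records)
z = two_proportion_z(*counts["A"], *counts["B"])
print(counts, round(z, 2))
```

A large absolute z value indicates that the gap between groups is unlikely to be sampling noise. It does not say whether the gap is justified, so a result like this is a prompt for further investigation rather than a verdict.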
One caveat is that absolute standards for what constitutes acceptable statistical outcomes may be problematic. There have been many definitions for fairness proposed, and it has been shown that it can be impossible to simultaneously satisfy multiple definitions; any hard requirement on fairness may have unintended impacts. The statistical analysis approach suggested can be useful in determining if there are large-scale issues with a system needing further exploration, rather than as a specific means of providing transparency into the decision-making process and criteria.
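A small worked example may help show how two fairness definitions can pull apart. The sketch below assumes two commonly discussed definitions that the study does not name here: equal selection rates across groups (often called demographic parity) and equal true-positive and false-positive rates (often called equalized odds). The confusion-matrix counts are synthetic.

```python
def fairness_metrics(tp, fp, fn, tn):
    """Compute group-level rates from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "selection_rate": (tp + fp) / total,       # share predicted favorable
        "true_positive_rate": tp / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
    }


# Synthetic counts: a perfectly accurate classifier applied to two groups
# whose underlying rates of the favorable label differ (50% vs. 20%).
group_a = fairness_metrics(tp=50, fp=0, fn=0, tn=50)
group_b = fairness_metrics(tp=20, fp=0, fn=0, tn=80)

same_error_rates = (
    group_a["true_positive_rate"] == group_b["true_positive_rate"]
    and group_a["false_positive_rate"] == group_b["false_positive_rate"]
)
same_selection_rate = group_a["selection_rate"] == group_b["selection_rate"]
print(same_error_rates, same_selection_rate)
```

In this constructed case the error rates match across groups while the selection rates do not. Equalizing the selection rates would require changing some predictions, which would then make the error rates differ. This is only one instance of the tension, not a general proof.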
Sensitivity Analysis
There is also the opportunity to test systems by providing carefully crafted inputs to better understand how the systems react. For example, providing multiple test cases where only one attribute has changed can provide insight into how that attribute is used in an algorithmic decision process. This is particularly important for machine learning approaches, where even the developers may have little insight into how particular decisions are made.
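The one-attribute-at-a-time probing described above can be sketched as follows. The `score_applicant` function is a hypothetical stand-in for an opaque system, and the attribute names and values are invented. In practice the model would be queried through whatever interface the real system exposes.

```python
def score_applicant(applicant):
    """Hypothetical stand-in for an opaque scoring system."""
    score = 0.5 * applicant["income"] / 1000 - 2.0 * applicant["late_payments"]
    if applicant["years_at_address"] >= 3:
        score += 5
    return score


def one_at_a_time(model, baseline, variations):
    """Change one attribute at a time and record the change in output."""
    base_output = model(baseline)
    effects = {}
    for attribute, values in variations.items():
        effects[attribute] = []
        for value in values:
            # Copy the baseline and override exactly one attribute.
            probe = dict(baseline, **{attribute: value})
            effects[attribute].append((value, model(probe) - base_output))
    return effects


baseline = {"income": 40000, "late_payments": 2, "years_at_address": 1}
variations = {
    "income": [30000, 50000],
    "late_payments": [0, 4],
    "years_at_address": [3],
}
effects = one_at_a_time(score_applicant, baseline, variations)
for attribute, changes in effects.items():
    print(attribute, changes)
```

Each entry pairs a probed value with the resulting change in score relative to the baseline, which gives a first impression of how strongly each attribute influences the output near that baseline.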
While a useful technique, this is by no means complete. Many algorithms, including most modern machine learning approaches, can take into account higher-order interactions between attributes. Evaluating all possible multi-way interactions is prohibitive, and as a result, such testing may fail to reveal particularly interesting cases. A potential direction arises in the development of adversarial manipulation techniques; these can identify minimal changes that result in a different outcome, thus identifying particularly sensitive combinations of inputs.
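To make the idea of a minimal outcome-changing perturbation concrete, the sketch below exhaustively searches a small grid of changes for the smallest one that flips a decision. It is a brute-force stand-in, not an actual adversarial manipulation technique, and the `approve` rule, step sizes, and cost measure (number of steps moved) are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical step sizes: income in thousands, late payments as a count.
STEP_SIZES = (5, 1)


def approve(applicant):
    """Hypothetical opaque decision with an interaction between attributes."""
    income, late_payments = applicant
    return income >= 45 and late_payments <= 1


def smallest_flip(model, start, max_steps=2):
    """Search a small grid for the cheapest change that flips the decision.

    Cost is the total number of steps moved across all attributes.
    Returns (cost, steps, candidate) or None if nothing in the grid flips.
    """
    original = model(start)
    step_range = range(-max_steps, max_steps + 1)
    best = None
    for steps in product(step_range, repeat=len(start)):
        candidate = tuple(
            value + step * size
            for value, step, size in zip(start, steps, STEP_SIZES)
        )
        if model(candidate) == original:
            continue  # decision unchanged, not a flip
        cost = sum(abs(step) for step in steps)
        if best is None or cost < best[0]:
            best = (cost, steps, candidate)
    return best


result = smallest_flip(approve, (40, 2))
print(result)
```

In this toy case no single-attribute change flips the decision; the cheapest flip changes both attributes together, which is the kind of interaction one-at-a-time testing can miss. The grid grows exponentially with the number of attributes, which is the cost problem noted above and the reason more targeted search techniques are of interest.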
A second issue is that care must be taken to distinguish causation from correlation. While there is a growing research literature in making this distinction, there are still open questions, and as such results need to be used carefully.
Algorithmic Accountability
Technical issues in algorithmic accountability are largely a question of whether the system behaves according to specifications. Accountability issues such as redress are really beyond the technical challenges of the algorithm; these are more a question about the actions implied by the specifications. While accountability for actions taken by algorithmic systems may need to differ from accountability for human actions, those differences are largely governed by the particular application. As a result, this section will only look at mechanisms for ensuring that algorithmic systems satisfy specifications.
Traditional software design processes include design review, code review, and testing procedures to ensure algorithmic systems meet specifications. Beyond this, formal verification techniques are making significant advances. Formal verification has been demonstrated on significant software artifacts, and it is likely that these techniques will become part of standard software engineering practice.
A second aspect of accountability is process standards and certification, such as the ISO/IEC JTC 1/SC 7 standards for software engineering, or the Capability Maturity Model Integration. These describe processes and procedures organizations should follow in systems design. Within the area of algorithmic transparency and accountability, the IEEE P7000 series of standards currently under development, particularly IEEE P7001 Transparency of Autonomous Systems, may provide good options.
Transparency of Individual Outcomes
A second type of transparency is understanding a particular outcome. Here understanding how a system works is likely of little value, and approaches providing explanation become more important.
Input Data Analysis
Understanding what data is used to determine an outcome can be useful in establishing confidence in the fairness of an individual outcome. Furthermore, the ability to evaluate correctness of that data can identify incorrect outcomes. GDPR Article 15 already requires that data subjects have access to the personal data being processed. While this does not of itself provide explanation of an outcome, it is important to determine if an individual outcome is based on correct or incorrect data. Combined with other explanation methods, this provides useful recourse for individuals concerned about outcomes.
There are numerous cases, however, where access to the data that produced an outcome might not be available. Data is often considered to be a valuable asset that organizations are reluctant to share. GDPR, for instance, does not compel access to non-personal data, e.g., statistical data about large population groups, that might have played an important role in a decision. Furthermore, unless efforts are put in place to ensure that data is retained, for instance for data audit purposes, it might get overwritten by new inputs. A typical example of a deliberate effort to retain data that would otherwise disappear is the flight data recorder. The mandatory inclusion of vehicle data recorders in autonomous vehicles has, for instance, been suggested in order to help future accident investigators get access to the input data that preceded self-driving car crashes.
Static Explanation
Systems can be designed to provide an explanation of the basis of individual outcomes. This can be either a specific design criterion incorporated into the entire system, or accomplished through techniques such as sensitivity analysis.
Such systems already exist in practice, even without regulatory requirements. As an example, the Fair-Isaac Corporation FICO score, commonly used in financial credit decisions in the United States, provides reports to individuals explaining their individual credit score. These provide ‘the top factors that affected your FICO Score 8 from each bureau with a detailed analysis’. Further, these factors have to be remediable by the individual; ‘You are a woman’ is not, but ‘You are too often late in making your credit card payment’ is.
Design / Code Review and Statistical Analysis
Techniques such as design and code review are of little direct relevance to understanding an individual outcome. However, disclosing synopses of such reviews can be part of the process of setting out ‘meaningful information about the logic involved’, helping to satisfy GDPR Article 15(1)(h).
Sensitivity Analysis
As with overall outcomes, sensitivity analysis can be used to determine what has led to a particular outcome. By perturbing inputs — sometimes referred to as testing counterfactuals — and evaluating the change in outcomes, insight can be gained into how a particular outcome has been arrived at. The ability to start with a particular set of inputs enables a wide variety of perturbations to be tried, potentially even capturing multi-variate factors. The previously discussed techniques for sensitivity analysis to study overall outcomes may provide appropriate starting points for such analysis.
Furthermore, enabling sensitivity analysis for individual outcomes not only provides greater transparency but also gives the data subject the opportunity to determine what actions might result in a different outcome, or to obtain information that can be useful in contesting an outcome. Such ‘what if’ analyses can provide useful information to individuals, as well as identify fairness issues that require further investigation.
In many cases, this is a tractable approach. For example, in the U.S., Fair-Isaac already offers consumers a FICO Score Simulator that shows ‘how different financial decisions — like getting a new credit card or paying down debt — may affect a FICO® score’.
An example of a powerful, model-agnostic explanation approach for machine learning classifiers that uses sensitivity analysis based on input feature perturbation is the LIME (Local Interpretable Model-agnostic Explanations) technique. LIME derives an easily interpretable model (e.g., a linear regression) that is locally faithful to the machine learning classifier in the vicinity of the individual prediction that it is seeking to explain. This is achieved by fitting the simplified model to input-output pairs that are generated by the machine learning classifier for input sample instances in the vicinity of the to-be-explained prediction.
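The core idea can be sketched in a few lines: sample inputs near the prediction of interest, query the classifier, weight the samples by proximity, and fit a weighted linear model. The sketch below is a simplified illustration of that idea only and does not use or reproduce any existing LIME implementation. The `black_box` function, the Gaussian sampling, the kernel, and all parameter values are assumptions.

```python
import math
import random


def black_box(x1, x2):
    """Hypothetical opaque classifier returning a probability-like score."""
    return 1.0 / (1.0 + math.exp(-(x1 * x2 - 1.0)))


def solve(matrix, vector):
    """Solve a small linear system by Gauss-Jordan elimination."""
    n = len(vector)
    aug = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Partial pivoting: move the largest entry in this column up.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def local_linear_explanation(model, point, n_samples=500, spread=0.3, seed=0):
    """Fit a proximity-weighted linear model around `point`.

    Returns [intercept, weight_1, ..., weight_k], with the features
    measured as offsets from `point`.
    """
    rng = random.Random(seed)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        sample = [p + rng.gauss(0, spread) for p in point]
        offsets = [s - p for s, p in zip(sample, point)]
        dist2 = sum(o * o for o in offsets)
        rows.append([1.0] + offsets)
        targets.append(model(*sample))
        # Assumed Gaussian kernel: nearby samples count more.
        weights.append(math.exp(-dist2 / (2 * spread ** 2)))
    k = len(point) + 1
    # Normal equations for weighted least squares: (X'WX) b = X'Wy.
    xtwx = [[sum(w * r[i] * r[j] for w, r in zip(weights, rows))
             for j in range(k)] for i in range(k)]
    xtwy = [sum(w * r[i] * y for w, r, y in zip(weights, rows, targets))
            for i in range(k)]
    return solve(xtwx, xtwy)


intercept, w1, w2 = local_linear_explanation(black_box, (2.0, 0.5))
print(intercept, w1, w2)
```

The fitted weights indicate how strongly each feature drives the score in the neighborhood of the chosen input. They describe the local behavior only and can change substantially at a different input.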
Reverse Engineering the ‘Black-Box’ – Putting It All Together
Reverse engineering the black-box relies on varying the inputs and paying close attention to the outputs, until it becomes possible to generate a theory, or at least a story, of how the algorithm works, including how it transforms each input into an output, and what kinds of inputs it’s using. Sometimes inputs can be partially observable but are not controllable; for instance, when an algorithm is being driven off public data but it’s not clear exactly what aspect of that data serves as inputs into the algorithm. In general, the observability of the inputs and outputs is a limitation and challenge to the use of reverse engineering in practice. There are many algorithms that are not public-facing, used behind an organizational barrier that makes them difficult to prod. In such cases, partial observability (e.g., of outputs) through FOIA, Web-scraping, or something like crowdsourcing can still lead to some interesting results.
Conclusions
Meaningful transparency into how outcomes are reached is technically challenging given modern computing systems; regulatory requirements for such transparency may significantly limit the ability to use advanced computing techniques for regulated purposes. Meaningful transparency into the behavior of computing systems is feasible, and can provide important benefits. Mechanisms for behavioral transparency may need to be designed into systems, and typically require participation of the developers or operators of systems.
Fairness, Accountability and Transparency/Explainability are some of the fastest-growing research areas for algorithmic decision-making systems, and especially machine learning. Not only academic funding bodies, but also industry is increasing its investment in this domain. This has resulted in the production of an increasing number of open-source libraries and tools to help developers address Fairness, Accountability and Transparency requirements.
EPRS Study (2019) 624262 - English
Reference: Koene, A., Clifton, C., Hatada, Y., Webb, H., Patel, M., & Machado, C. et al. (2019). A Governance Framework for Algorithmic Accountability and Transparency. European Parliamentary Research Service (EPRS). Retrieved from https://www.europarl.europa.eu/RegData/etudes/STUD/2019/624262/EPRS_STU(2019)624262_EN.pdf
Source: ComplexDiscovery