Artificial intelligence systems based on machine learning (ML) models are currently under rapid development, with many successful applications - so far predominantly in the private sector. Public sector entities are beginning to develop and implement ML algorithms in the provision of public services. The goal is a more efficient public administration with improved, possibly personalised services at lower costs.
The development and implementation of ML algorithms lead to new challenges, including the use of personal data versus privacy rights, inexplicable and therefore unjustifiable decisions, and potentially institutionalised discrimination through algorithmic bias. If an algorithm is not properly tailored to its objective and environment, it can result in higher workload, delays and frustrated staff. Use of a carelessly developed algorithm in public services can thus lead to obscured inefficiency, damage trust in the authorities and be detrimental to a well-functioning public sector. Both internal control mechanisms and external audits are needed to ensure the proper use of ML and prevent these dangerous side effects.
The first principles and guidelines to address AI-related risks have been developed both internationally and at a national level in several countries, and are likely to result in relevant legislation in the near future.[3] Independent third-party auditing is recommended not only in the context of the EU’s General Data Protection Regulation (GDPR) and subsequent interpretations,[4] but for all AI systems affecting fundamental rights; the EU’s Ethics Guidelines for Trustworthy Artificial Intelligence (High-Level Expert Group on AI, 2019) point out the need for a system to be lawful, ethical and robust. The guidelines further list accountability, including auditability, as one requirement for trustworthy AI, and explicitly state the necessity for independent internal and external audits. The topics of fairness, transparency and accountability of AI are extensively discussed in the global research community.[5] Although it is not obvious how to facilitate the auditability of ML algorithms, the necessity is widely acknowledged.[6] The idea of specialised, licensed AI system auditors has been put forward.
This paper outlines potential audits of AI systems by Supreme Audit Institutions (SAIs), covering risks related to the use of ML models in government agencies as well as possible tests to gain audit evidence. It further includes an auditability checklist that summarises the minimum prerequisites an auditee organisation should retain from the ML implementation phase to enable any subsequent audit.
An audit of ML algorithms can have components of both performance audit and compliance audit. It should always include a risk assessment of the related IT system, potentially leading to a wider IT audit that includes the AI system. ML algorithms are typically not used as stand-alone software, but rather they are embedded in a pipeline of procedures as part of a wider IT infrastructure. The focus in this paper lies in the audit of the ML component.
The suggested audit model is based on the literature given in the bibliography, as well as on the experiences of the authoring SAIs with audits of IT systems in general and, in particular, pilot audits of ML applications. It is thus focused on the most commonly used AI systems and those encountered in the pilot audits, and should eventually be updated with more audit experience and the results of new research where appropriate.
Chapter 3 is structured into five sections aligning with five audit topics. It suggests an ‘audit catalogue’ that specifies the relevant considerations with audit questions and risks for each point. A detailed list of practical audit tests and suggested contacts within the auditee organisations is given at the end of each section.
- Appendix One, ‘Classic IT audit components in ML/AI context’, summarises IT audit components relevant to ML.
- Appendix One, ‘Personal data and GDPR in the context of ML and AI’, covers risks related to personal data and violations of the GDPR.
- Appendix One, ‘Equality and Fairness measures in classification models’, gives an overview of the most relevant equality and fairness measures for classification applications.
- Appendix One, ‘Auditability checklist’, contains an auditability checklist of minimum requirements for an auditee using ML.
The ML audit helper tool (in Excel) is available as a separate file that accompanies this paper.
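To make the notion of equality and fairness measures for classification models more concrete, the following is a minimal sketch in Python of two widely used measures: the demographic parity difference (the gap in positive prediction rates between groups) and the equal opportunity difference (the gap in true positive rates). The function names, the toy data and the choice of these two measures are illustrative assumptions; they are not taken from the appendix or from the helper tool.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Return, per group, the selection rate and the true positive rate.

    y_true and y_pred are sequences of 0/1 labels; groups holds the
    protected-attribute value of each case (hypothetical input layout).
    """
    counts = defaultdict(lambda: {"n": 0, "pred_pos": 0, "actual_pos": 0, "true_pos": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        c = counts[group]
        c["n"] += 1
        c["pred_pos"] += pred
        c["actual_pos"] += truth
        c["true_pos"] += truth * pred  # 1 only if the case is positive and predicted positive

    rates = {}
    for group, c in counts.items():
        rates[group] = {
            "selection_rate": c["pred_pos"] / c["n"],
            # The true positive rate is undefined if the group has no positive cases.
            "tpr": c["true_pos"] / c["actual_pos"] if c["actual_pos"] else None,
        }
    return rates


def demographic_parity_difference(rates):
    """Largest gap in selection rate between any two groups."""
    values = [r["selection_rate"] for r in rates.values()]
    return max(values) - min(values)


def equal_opportunity_difference(rates):
    """Largest gap in true positive rate between groups where it is defined."""
    values = [r["tpr"] for r in rates.values() if r["tpr"] is not None]
    return max(values) - min(values)


# Toy data, invented for illustration only: eight cases in two groups.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = group_rates(y_true, y_pred, groups)
dp_diff = demographic_parity_difference(rates)
eo_diff = equal_opportunity_difference(rates)
print(f"Demographic parity difference: {dp_diff:.2f}")
print(f"Equal opportunity difference: {eo_diff:.2f}")
```

A value of zero for either measure means the groups are treated identically with respect to that measure. Which measure is appropriate, and what gap is tolerable, depends on the application and cannot be decided by the calculation itself.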
 M. Brundage et al. (2020): Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims, https://arxiv.org/abs/2004.07213.
 High-Level Expert Group on AI (2019): Ethics Guidelines for Trustworthy Artificial Intelligence, https://ec.europa.eu/digital-single-market/en/news/ethics-guidelines-trustworthy-ai.
Several countries are in the process of producing standards for AI similar to the EU’s guidelines; the European Commission has announced a legislative proposal; and the Organisation for Economic Co-operation and Development has already launched recommendations that were accepted by G20 ministers as guiding principles for trustworthy AI. See further  for an overview of existing AI laws and policies.