One of the promises of artificial intelligence is improving efficiency in various processes, including decision-making.
For specific decisions it is vital that human experts understand and are able to influence machine-made advice.
In my dissertation research, I design and study argumentation-based systems for transparent human-in-the-loop decision support.
Based on domain-specific knowledge or experience, these systems are able to construct initial advice on some decision (justification);
investigate the possibility that additional, yet uncertain, information can change the conclusion (stability) and if so,
which information is still worth investigating (relevance).
Being argumentation-based, these systems have the potential to automatically generate (interactive) explanations in various levels of detail.
The systems' requirements of detecting justification, stability and relevance correspond to theoretical problems in
computational argumentation, most of which are in high complexity classes.
In order to achieve reasonable estimations for these problems in polynomial time, I develop and investigate
not only exact algorithms but also approximations.
In my function as an AI scientist at the Dutch National Police, I implement these algorithms for various applications.
For more information on my research, see my
More information on the National Police Lab AI can be found on the
For one of the applications of my research, see this dialogue system aiming to assist victims of online trade fraud on the
Dutch National Police website.
Odekerken, D., Bex, F., & Prakken, H. (2023). Precedent-based reasoning with incomplete cases
To be presented at JURIX 2023
We extend the result model for precedent-based reasoning with incomplete case bases. In contrast to
regular case bases, these consist of incomplete cases for which not all dimension values need to be
specified, but rather each dimension is assigned a set of possible values. The outcome of cases then
applies for each (combination of) the possible dimension values. Building on earlier proposed notions of
justification and stability for incomplete focus cases, we introduce the notion of possible justification
statuses, which are required to maintain consistency of the incomplete case base. We demonstrate how these
theoretic notions can be applied in practice for human-in-the-loop decision support, discuss their
computational complexity and provide efficient algorithms.
Odekerken, D., Borg, A., Bex, F. (2023). Justification, Stability and Relevance in Incomplete
Argument & Computation
We explore the computational complexity of justification, stability and relevance in incomplete
argumentation frameworks (IAFs). IAFs are abstract argumentation frameworks that encode qualitative
uncertainty by distinguishing between certain and uncertain arguments and attacks. These IAFs can be
completed by deciding for each uncertain argument or attack whether it is present or absent. Such a
completion is an abstract argumentation framework, for which it can be decided which arguments are
acceptable under a given semantics. The justification status of an argument in a completion then
expresses whether the argument is accepted (IN), not accepted because it is attacked by an accepted
argument (OUT) or neither (UNDEC). For a given IAF and certain argument, the justification status of that
argument need not be the same in all completions. This is the issue of stability, where an argument
is stable if its justification status is the same in all completions. For arguments that are not stable in
an IAF, the relevance problem is of interest: which uncertain arguments or attacks should be
investigated for the argument to become stable? In this paper, we define justification, stability and
relevance for IAFs and provide a complexity analysis for these problems under grounded, complete,
preferred and stable semantics.
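The definitions in this abstract can be illustrated with a minimal brute-force sketch in plain Python. It is not the algorithm from the paper and does not use the PyArg API: all function names are hypothetical, only grounded semantics is covered, and the enumeration of completions is exponential in the number of uncertain elements, so it is only meant for tiny examples.

```python
from itertools import chain, combinations


def grounded_labelling(arguments, attacks):
    """Label each argument IN, OUT or UNDEC under grounded semantics."""
    label = {}
    changed = True
    while changed:
        changed = False
        for a in arguments:
            if a in label:
                continue
            attackers = [x for (x, y) in attacks if y == a]
            if all(label.get(x) == "OUT" for x in attackers):
                label[a] = "IN"  # every attacker is defeated (or there is none)
                changed = True
            elif any(label.get(x) == "IN" for x in attackers):
                label[a] = "OUT"  # attacked by an accepted argument
                changed = True
    # Whatever could not be decided is UNDEC.
    return {a: label.get(a, "UNDEC") for a in arguments}


def powerset(items):
    items = list(items)
    return chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))


def completions(certain_args, uncertain_args, certain_atts, uncertain_atts):
    """Yield every completion: each uncertain element is present or absent."""
    for extra_args in powerset(uncertain_args):
        args = set(certain_args) | set(extra_args)
        for extra_atts in powerset(uncertain_atts):
            candidate = set(certain_atts) | set(extra_atts)
            # Keep only attacks between arguments present in this completion.
            atts = {(x, y) for (x, y) in candidate if x in args and y in args}
            yield args, atts


def possible_statuses(argument, certain_args, uncertain_args,
                      certain_atts, uncertain_atts):
    """Collect the justification statuses of a certain argument."""
    return {
        grounded_labelling(args, atts)[argument]
        for args, atts in completions(
            certain_args, uncertain_args, certain_atts, uncertain_atts)
    }


def is_stable(argument, certain_args, uncertain_args,
              certain_atts, uncertain_atts):
    """Stable: the same justification status in all completions."""
    return len(possible_statuses(
        argument, certain_args, uncertain_args,
        certain_atts, uncertain_atts)) == 1


# Toy IAF: b attacks a, and the uncertain argument c attacks b.
statuses = possible_statuses("a", {"a", "b"}, {"c"},
                             {("b", "a"), ("c", "b")}, set())
print(sorted(statuses))  # ['IN', 'OUT']
```

In this toy example, argument a is OUT when c is absent and IN when c is present, so a is not stable and investigating c is worthwhile.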
Odekerken, D., Borg, A., & Berthold, M. (2023). Demonstrating PyArg 2.0
To be presented at the 7th Workshop on Advances in Argumentation in Artificial Intelligence
We demonstrate the latest release of PyArg, an open-source Python package of implementations of argumentation algorithms
with a web interface. PyArg provides various argumentation-based functionalities, including evaluation
and visualisation of abstract argumentation frameworks, ASPIC+ argumentation theories and assumption-based
argumentation frameworks; explanation algorithms; multiple generators; a learning environment;
implementations of theoretical papers and a showcase of a practical application.
Odekerken, D., Lehtonen, T., Borg, A., Wallner, J.P. & Järvisalo, M. (2023). Argumentative Reasoning in ASPIC+ under Incomplete Information
Proceedings of the 20th International Conference on Principles of Knowledge Representation and Reasoning
Reasoning under incomplete information is an important research direction in AI argumentation.
Most computational advances in this direction have so far focused on abstract argumentation frameworks.
Development of computational approaches to reasoning under incomplete information in structured formalisms
remains to date largely a challenge. We address this challenge by studying the so-called stability
and relevance problems (with the aim of analyzing aspects of resilience of acceptance statuses in light
of new information) in the central structured formalism of ASPIC+. Focusing on the case of the grounded
semantics and an ASPIC+ fragment motivated through application scenarios, we develop exact ASP-based
algorithms for stability and relevance in incomplete ASPIC+ theories, and pinpoint the complexity of
reasoning about stability (coNP-complete) and relevance (Sigma_2^P-complete), further justifying our
ASP-based approaches. Empirically, the algorithms exhibit promising scalability, outperforming even a
recent inexact approach to stability, with our ASP-based iterative approach being the first algorithm
proposed for reasoning about relevance in ASPIC+.
Odekerken, D., Borg, A. & Berthold, M. (2023). Accessible Algorithms for Applied Argumentation
First International Workshop on Argumentation and Applications
Computational argumentation is a promising research area, yet there is a gap between theoretical
contributions and practical applications.
Bridging this gap could potentially raise interest in this topic even more.
We argue that one part of the bridge could be an open-source package of implementations of argumentation
algorithms, visualised in a web interface.
Therefore we present a new release of PyArg, providing various new argumentation-based functionalities
(including multiple generators, a learning environment, implementations of theoretical papers and a
showcase of a practical application) in a new interface with improved accessibility.
Odekerken, D., Bex, F., & Prakken, H. (2023). Justification, Stability and Relevance for Case-based
Reasoning with Incomplete Focus Cases.
Nineteenth International Conference for Artificial Intelligence
We define and study the notions of stability and relevance for precedent-based reasoning, focusing on
Horty's result model of precedential constraint.
According to this model, precedents constrain the possible outcomes for a focus case, which is a yet
undecided case, where precedents and the focus case are compared on their characteristics (called dimensions).
In this paper, we refer to the enforced outcome for the focus case as its justification status.
In contrast to earlier work, we do not assume that all dimension values of the focus case have been
established with certainty: rather, each dimension has a set of possible value assignments.
We define a focus case as stable if its justification status is the same for every choice of the
possible value assignments.
For focus cases that are not stable, we study the task of identifying relevance: which possible
value assignments should be excluded to make the focus case stable?
We show how the tasks of identifying justification, stability and relevance can be exploited for
human-in-the-loop decision support.
Finally, we discuss the computational complexity of these tasks and provide efficient algorithms.
Winner of the Donald Berman Best Student Paper Award.
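The notions of justification and stability for incomplete focus cases can be sketched in a few lines of Python. This is a simplified illustration, not the efficient algorithm from the paper. The assumptions are all hypothetical: dimensions are numeric, the dictionary DIRECTION states which outcome higher values favour, the outcomes are called "pro" and "con", and the justification status is represented as the set of outcomes forced by some precedent.

```python
from itertools import product

# Hypothetical dimensions: +1 means higher values favour "pro",
# -1 means higher values favour "con".
DIRECTION = {"amount": +1, "years_active": -1}


def at_least_as_strong(focus, precedent, outcome):
    """Is the focus case at least as favourable to `outcome` on every dimension?"""
    for dim, direction in DIRECTION.items():
        sign = direction if outcome == "pro" else -direction
        if sign * focus[dim] < sign * precedent[dim]:
            return False
    return True


def forced_outcomes(focus, case_base):
    """Outcomes that some precedent enforces for a fully specified focus case."""
    return frozenset(
        outcome for facts, outcome in case_base
        if at_least_as_strong(focus, facts, outcome)
    )


def possible_statuses(incomplete_focus, case_base):
    """Statuses over every choice of the possible value assignments."""
    dims = list(incomplete_focus)
    statuses = set()
    for values in product(*(incomplete_focus[d] for d in dims)):
        statuses.add(forced_outcomes(dict(zip(dims, values)), case_base))
    return statuses


def is_stable(incomplete_focus, case_base):
    """Stable: the status is the same for every choice of values."""
    return len(possible_statuses(incomplete_focus, case_base)) == 1


case_base = [
    ({"amount": 100, "years_active": 2}, "pro"),
    ({"amount": 20, "years_active": 5}, "con"),
]

# The amount is still uncertain, but both options force "pro".
print(is_stable({"amount": {150, 200}, "years_active": {1}}, case_base))  # True

# years_active is uncertain and matters: 1 forces "pro", 3 forces nothing.
print(is_stable({"amount": {150}, "years_active": {1, 3}}, case_base))  # False
```

In the second focus case, excluding the value 3 for years_active would make the case stable, which is the kind of information the relevance task is meant to identify.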
Berthold, M., Knorr, M., & Odekerken, D. (2023). Forgetting Web.
39th International Conference on Logic Programming (Technical Communications)
The relatively young area of forgetting is concerned with the removal of selective information, while
preserving other knowledge. This might be useful or even necessary, for example, to simplify a
knowledge base or to comply with legal requests. In the last few years, there has been an ample amount of
research in the field, in particular with respect to logic programs, spanning from theoretical
considerations to more practical applications, starting at the conceptual proposal of forgetting, to suggestions
of properties that should be satisfied, followed by characterizations of abstract classes of operators
that satisfy these properties, and finally the definition of concrete forgetting procedures.
In this work we present novel Python implementations of all the forgetting procedures that have
been proposed to date on logic programs. We provide them in a web interface, and hope to thereby
give anybody who is interested a low-barrier overview of the landscape.
Odekerken, D. (2022). Justification, Stability and Relevance for Transparent and
Efficient Human-in-the-Loop Decision Support.
In Online Handbook of Argumentation for AI, Vol. 3
One of the promises of artificial intelligence is improving efficiency in various processes,
including decision-making. For specific decisions it is vital that human experts understand and
are able to influence machine-made advice. In my dissertation research, I design and study
argumentation-based systems for transparent human-in-the-loop decision support. Based on
a domain-specific argumentation setting, these systems are able to construct initial advice
on some decision (justification); investigate the possibility that additional, yet uncertain, information
can change the conclusion (stability) and if so, which information is worth investigating
(relevance). The systems’ requirements of detecting justification, stability and relevance
correspond to theoretical problems in computational argumentation, most of which are in high
complexity classes. In order to achieve reasonable estimations for these problems in polynomial time,
I develop and investigate not only exact algorithms but also approximations.
Odekerken, D., Borg, A., & Bex, F. (2022). Stability and Relevance in Incomplete Argumentation Frameworks.
In Computational Models of Argument
We explore the computational complexity of stability and relevance in incomplete argumentation frameworks
(IAFs), abstract argumentation frameworks that encode qualitative uncertainty by distinguishing between
certain and uncertain arguments and attacks. IAFs can be specified by, e.g., making uncertain arguments or
attacks certain; the justification status of arguments in an IAF is determined on the basis of the certain
arguments and attacks. An argument is stable if its justification status is the same in all specifications
of the IAF. For arguments that are not stable in an IAF, the relevance problem is of interest: which
uncertain arguments or attacks should be investigated for the argument to become stable? We redefine
stability and define relevance for IAFs and study their complexity.
Borg, A., & Odekerken, D. (2022). PyArg for Solving and Explaining Argumentation in Python: Demonstration.
In Computational Models of Argument
We introduce PyArg, a Python-based solver and explainer for both abstract argumentation
and ASPIC+. A large variety of extension-based semantics allows for flexible evaluation and several explanation functions are available.
Odekerken, D., Bex, F., Borg, A., & Testerink, B. (2022).
Approximating Stability for Applied Argument-based Inquiry. Intelligent Systems with Applications.
In argument-based inquiry, agents jointly construct arguments supporting or attacking a topic claim to find
out if the claim can be accepted given the agents’ knowledge bases. While such inquiry systems can be used
for various forms of automated information intake, several efficiency issues have so far prevented widespread
application. In this paper, we aim to tackle these efficiency issues by exploring the notion of stability:
can additional information change the justification status of the claim under discussion? Detecting stability
is not tractable for every input, since the problem is coNP-complete, yet in practical applications it is
essential to guarantee efficient computation. This makes approximation a viable alternative. We present a
sound approximation algorithm that recognises stability for many inputs in polynomial time and discuss
several of its properties. In particular, we show that the algorithm is sound and identify constraints on
the input under which it is complete. As a final contribution of this paper, we describe how the proposed
algorithm is used in three different case studies at the Netherlands Police.
Odekerken, D., Koops, H. V., & Volk, A. (2021). Improving Audio Chord Estimation by Alignment and
Integration of Crowd-Sourced Symbolic Music.
Transactions of the International Society for Music Information Retrieval, 4(1), 141-155.
Automatic Chord Estimation (ACE) is a fundamental task in Music Information Retrieval (MIR) and has
applications in both music performance and MIR research. The task consists of segmenting a music recording
or score and assigning a chord label to each segment. Although it has been a task in the annual benchmarking
evaluation MIREX for over 10 years, ACE is not yet a solved problem, since performance has stagnated and
modern systems have started to tune themselves to subjective training data. We propose DECIBEL, a new ACE
system that exploits heterogeneous musical representations, specifically MIDI and tab files, to improve
audio-based ACE methods. From an audio file and a set of MIDI and tab files corresponding to the same
popular music song, DECIBEL first estimates chord sequences. For audio, state-of-the-art audio ACE methods
are used. MIDI files are aligned to the audio, followed by a MIDI chord estimation step. Tab files are
transformed into untimed chord sequences and then aligned to the audio. Next, DECIBEL uses data fusion to
integrate all estimated chord sequences into one final output sequence. DECIBEL improves all tested
state-of-the-art ACE methods by 0.5 to 13.6 percentage points. This result shows that the integration of
crowd-sourced annotations from heterogeneous symbolic music representations using data fusion is a suitable
strategy for addressing challenging MIR tasks such as ACE.
Odekerken, D., & Bex, F. J. (2020).
Towards Transparent Human-in-the-Loop Classification of Fraudulent Web Shops.
In S. Villata, J. Harašta, & P. Křemen (Eds.), Legal Knowledge and Information Systems (Vol. 334, pp. 239-242).
(Frontiers in Artificial Intelligence and Applications). IOS Press.
We propose an agent architecture for transparent human-in-the-loop classification. By combining dynamic
argumentation with legal case-based reasoning, we create an agent that is able to explain its decisions at
various levels of detail and adapts to new situations. It keeps the human analyst in the loop by presenting
suggestions for corrections that may change the factors on which the current decision is based and by
enabling the analyst to add new factors. We are currently implementing the agent for classification of
fraudulent web shops at the Dutch Police.
Odekerken, D., Borg, A., & Bex, F. J. (2020). Estimating Stability for Efficient Argument-based Inquiry.
In Computational Models of Argument. Proceedings of COMMA 2020 (Frontiers in Artificial Intelligence and Applications).
We study the dynamic argumentation task of detecting stability: given a specific structured argumentation
setting, can adding information change the acceptability status of some propositional formula? Detecting
stability is not tractable for every input, but efficient computation is essential in practical applications.
We present a sound approximation algorithm that recognises stability for many inputs in polynomial time and
we discuss several of its properties. In particular, we show under which constraints on the input our
algorithm is complete. The proposed algorithm is currently applied for fraud inquiry at the Dutch National
Police - we provide an English demo version that also visualises the output of the algorithm.
Araszkiewicz, M., Amantea, I. A., Chakravarty, S., van Doesburg, R., Dymitruk, M., Garin, M., Gilpin, L.,
Odekerken, D., & Salehi, S. S. (2020). ICAIL Doctoral Consortium, Montreal 2019.
Artificial Intelligence and Law, 28(2), 267-280.
This is a report on the Doctoral Consortium co-located with the 17th International Conference on
Artificial Intelligence and Law in Montreal.
Odekerken, D., Testerink, B. J. G., & Bex, F. J. (2019). A method for efficient argument-based inquiry.
In Flexible Query Answering Systems: 13th International Conference, FQAS 2019, Amantea, Italy,
July 2–5, 2019, Proceedings (Lecture Notes in Artificial Intelligence). Springer Verlag.
In this paper we describe a method for efficient argument-based inquiry. In this method, an agent creates
arguments for and against a particular topic by matching argumentation rules with observations gathered by
querying the environment. To avoid making superfluous queries, the agent needs to determine if the
acceptability status of the topic can change given more information. We define a notion of stability,
where a structured argumentation setup is stable if no new arguments can be added, or if adding new
arguments will not change the status of the topic. Because determining stability requires hypothesizing
over all future argumentation setups, which is computationally very expensive, we define a less complex
approximation algorithm and show that this is a sound approximation of stability. Finally, we show how
stability (or our approximation of it) can be used in determining an optimal inquiry policy, and discuss
how this policy can be used to, for example, determine a strategy in an argument-based inquiry dialogue.
Testerink, B. J. G., Odekerken, D., & Bex, F. J. (2019). AI-assisted message processing for the Netherlands National Police.
In Proceedings of the ICAIL 2019 Workshop on AI and the Administrative State (AIAS 2019) CEUR Workshop Proceedings.
The number of messages that the Netherlands National Police (NNP) receives (e.g. from international
partner institutes and citizens) grows steadily every year. The NNP has initiated a number of projects to
develop artificial intelligence systems that assist in the processing of such messages. In this demo,
we show a prototype of one such system that will be used for supporting the processing of messages from
international (Interpol) partners.
Schraagen, M. P., Bex, F. J., Odekerken, D., & Testerink, B. J. G. (2018).
Argumentation-driven information extraction for online crime reports.
In International Workshop on Legal Data Analysis and Mining (LeDAM 2018) (CEUR workshop proceedings).
A new system is currently being developed to assist the Dutch National Police in the assessment of crime
reports submitted by civilians. This system uses Natural Language Processing techniques to extract
observations from text. These observations are used in a formal reasoning system to construct arguments
supporting conclusions based on the extracted observations, and possibly ask the complainant who files
the report extra questions during the intake process. The aim is to develop a dynamic question-asking
system which automatically learns effective and user-friendly strategies. The proposed approach is planned
to be integrated in the daily workflow at the Dutch National Police, in order to provide increased
efficiency and transparency for processing of crime reports.
Odekerken, D., Volk, A., & Koops, H. V. (2017). Rhythmic Patterns in Ragtime and Jazz.
In I. Barbancho, L. Tardón, & A. Peinado (Eds.), Proceedings of the 7th International Workshop on Folk
Music Analysis: 14-16 June 2017, Málaga, Spain (pp. 44-49). (FMA Proceedings; Vol. 7).
This paper presents a corpus-based study on rhythmic patterns in ragtime and jazz. Ragtime and jazz are
related genres, but there are open questions on what specifies the two genres. Earlier studies revealed that
variations of a particular syncopation pattern, referred to as 121, are among the most frequently used
patterns in ragtime music. Literature in musicology states that another pattern, clave, is often heard in
jazz, particularly in songs composed before 1945. Using computational tools, this paper tests three
hypotheses on the occurrence of 121 syncopation and clave patterns in ragtime and jazz. For this purpose,
we introduce a new data set of 252 jazz MIDI files with annotated melody and metadata. We also use the
RAG-collection, which consists of around 11000 ragtime MIDI files and metadata. Our analysis shows that
syncopation patterns are significantly more frequent in the melody of ragtime pieces than in jazz. Clave
on the other hand is found significantly more in jazz melodies than in ragtime. Our findings show that the
frequencies of rhythmic patterns differ significantly between music genres, and thus can be used as a
feature in automatic genre classification.
Software and Demos
PyArg is a Python-based solver and explainer for both abstract argumentation and ASPIC+. A large variety of
extension-based semantics allows for flexible evaluation and several explanation functions are available.
Source code | Visual interface | Documentation website
DECIBEL is a new system for Automatic Chord Estimation (ACE) which exploits MIDI and tab files to improve
audio ACE, thereby implicitly integrating musical knowledge.
Source code | Documentation website
ForgettingWeb is a web interface that provides implementations of forgetting procedures on logic programs.
LCBR is a repository with algorithms for justification, stability and relevance for case-based reasoning,
focused on Horty's result model of precedential constraint.