Seminar 1: BrIAS Fellow Prof. Rafael Bello
Considerations on XAI research at UCLV: generating factual and counterfactual explanations.
Abstract: The need to develop and apply responsible and safe Artificial Intelligence has led to the rise of explainable AI (XAI). Work on XAI nevertheless faces many challenges, stemming from factors such as the diversity of forms an explanation can take, the diversity of its stakeholders, and the diversity of methods created to generate explanations. After an initial overview of XAI, some of the results achieved at UCLV in this field are presented, including an approach to structuring the process of generating explanations, several application examples, and the construction of meta-explanations. Some research tasks currently in development are also presented, especially concerning counterfactual explanations. Counterfactual Explanations (CEs) have become one of the leading post-hoc methods for explaining AI models in the field of XAI. The core idea is that, given an input x to a model M, a CE presents the user with a new, slightly modified input x′, illustrating how a different outcome could be achieved if certain changes were applied to x. Various measures have been proposed to evaluate the quality of counterfactuals, such as actionability, causality, diversity, proximity, sparsity, plausibility, and robustness; the work related to the last two is presented in particular.
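To make the definition concrete, the following is a minimal, hypothetical sketch of a counterfactual search in Python. It is not the UCLV approach described in the talk; the toy classifier, the greedy one-feature-at-a-time search, and the parameter values (`step`, `max_steps`) are all illustrative assumptions. The sketch flips a model's prediction while tracking proximity (L2 distance) and favoring sparsity (one feature changed at a time), two of the quality measures listed above.

```python
# Hypothetical counterfactual search sketch (not the UCLV method):
# given model M and input x, find a nearby x' with a different prediction.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
M = LogisticRegression().fit(X, y)

def counterfactual(M, x, step=0.05, max_steps=200):
    """Greedy search changing one feature at a time (sparsity),
    keeping the closest flip found (proximity)."""
    original = M.predict(x.reshape(1, -1))[0]
    best = None
    for j in range(len(x)):              # try each feature alone
        for sign in (+1.0, -1.0):        # try both directions
            x_cf = x.copy()
            for _ in range(max_steps):
                x_cf[j] += sign * step
                if M.predict(x_cf.reshape(1, -1))[0] != original:
                    dist = np.linalg.norm(x_cf - x)   # L2 proximity
                    if best is None or dist < best[1]:
                        best = (x_cf.copy(), dist)
                    break
    return best  # (x', distance) or None if no flip was found

x = X[0]
result = counterfactual(M, x)
if result is not None:
    x_cf, dist = result
    print("x :", np.round(x, 3), "->", M.predict(x.reshape(1, -1))[0])
    print("x':", np.round(x_cf, 3), "->", M.predict(x_cf.reshape(1, -1))[0])
    print("L2 distance:", round(dist, 3))
```

Richer quality criteria such as plausibility and robustness, the focus of the talk, would replace this simple distance-based selection with additional constraints on x′.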
Seminar 2: BrIAS Fellow Prof. Willem Zuidema
Under the hood: what LLMs learn about our language, and what they teach us about us
Abstract: Large Language Models (LLMs) and Neural Speech Models (NSMs) have made major advances in recent years in their ability to mimic and process human language and speech. Their internal representations, however, are notoriously difficult to interpret, limiting their usefulness for cognitive science and neuroscience. A new generation of post-hoc interpretability techniques, based on causal interventions, now provides an increasingly detailed look under the hood. These techniques allow us, in some cases, to reveal the nature of the learned representations, assess how general the learned rules are, and formulate new hypotheses about how humans might process aspects of language and speech. I will discuss examples involving syntactic priming and phonotactics, and speculate on the future impact of AI models on the cognitive science of language.
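As an illustration of what a causal intervention can look like in practice, here is a minimal, hypothetical sketch of activation patching on a toy PyTorch network. The two-layer model, the random inputs, and the choice to transplant only the first eight hidden units are illustrative assumptions, not the specific models or experiments from the talk; the general idea is to record an internal activation on one input, splice it into the forward pass of another, and observe how the output shifts.

```python
# Hypothetical activation-patching sketch on a toy network
# (illustrative only; not a specific LLM or NSM experiment).
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))

x_base = torch.randn(1, 8)    # input whose run we intervene on
x_source = torch.randn(1, 8)  # input whose representation we transplant

hidden = model[1]  # intervene on the ReLU output

# 1. Record the hidden activation on the source input.
cache = {}
def save_hook(module, inputs, output):
    cache["h"] = output.detach()

handle = hidden.register_forward_hook(save_hook)
with torch.no_grad():
    model(x_source)
handle.remove()

# 2. Re-run the base input, patching part of the source activation in.
def patch_hook(module, inputs, output):
    patched = output.clone()
    patched[:, :8] = cache["h"][:, :8]  # transplant a subset of units
    return patched                      # returned value replaces the output

handle = hidden.register_forward_hook(patch_hook)
with torch.no_grad():
    out_patched = model(x_base)
handle.remove()

with torch.no_grad():
    out_base = model(x_base)

# If the patched output moves toward the source input's output, the
# transplanted units causally carry prediction-relevant information.
print("base   :", out_base)
print("patched:", out_patched)
```

In interpretability work on real LLMs and speech models, the same logic is applied to specific layers, positions, or directions, which is what lets such interventions test whether a representation is causally responsible for a behavior rather than merely correlated with it.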