Forests and Trees: the Formal Semantics of Collective Categorization (ROCKY)

ROCKY final workshop

10 – 11 July 2023
Huize Molenaar, Korte Nieuwstraat 6-8, Utrecht


Workshop participants from left to right: Maaike Smit, Sven Smeman, Lasha Abzianidze (holding Aron Abzianidze), James Hampton, Sofiya Ros (holding Sasha Ros), Imke Kruitwagen (holding Boris van Hilten), Denis Paperno, Yoad Winter, Sofia Nikiforova, Giada Palmieri
Photo taken by Anya Popova

Monday, 10 July 2023
Reception and opening
Session 1 – experimental semantics
James Hampton:
Long blue coats and Large dark glasses: Conjunctions of unidimensional vague adjectives
Imke Kruitwagen:
Reciprocal predicates: a typicality approach
10:45-11:00 Tea
Sven Smeman and Maaike Smit:
Methodological Choices in the Study of Mass-Count Comparisons
Yoad Winter:
Countability and measurement in comparatives
Session 2 – cross-linguistic semantics
Joost Zwarts:
Between a singular: The role of subatomic plurality
Giada Palmieri:
From Romance to Bantu and back: a journey through lexical reciprocal predicates
Sofiya Ros:
Some observations on reciprocals in Wolof
Invited Speaker – Rick Nouwen
Terribly nice, pretty ugly: the semantics and computational pragmatics of deadjectival intensifiers


Tuesday, 11 July 2023

Session 3 – computational semantics
Invited Speaker – Albert Gatt:
Describing objects, scenes and actions: Models and explanations
Lasha Abzianidze:
SpaceNLI: Evaluating Reasoning Capacity of Large Language Models in Space
Coffee and Tea
Denis Paperno:
Leverage Points in Modality Shifts: Comparing Language-only and Multimodal Word Representations
Sofia Nikiforova:
Imperfect AI in High-Accuracy Fields
James Hampton – Long blue coats and Large dark glasses: Conjunctions of unidimensional vague adjectives
Previous research has established how relative clause and other forms of conjunctions of semantic categories, such as Sports and Games or Pets and Birds, can generate non-logical effects. Notably, one category may have a large influence on the conjunction (dominance), conjunctions may be non-commutative when the order is reversed, and items may be included in the conjunction which are omitted from one or other category. I will discuss new data examining how people form conjunctive category decisions when the dimensions of colour and size are manipulated for three types of object – coats, sunglasses and tomatoes. The key issue will be whether/when people categorize in relation to a composite representation of the conjunction (e.g. an ideal long blue coat) as opposed to performing separate category judgments (Is it long? Is it blue?) and applying a Boolean conjunction rule to the answers.
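The contrast between the two strategies named in the abstract can be illustrated with a minimal Python sketch. Everything in it is a hypothetical assumption made for illustration and is not taken from the talk: the function names, the thresholds, the ideal values, the similarity measure, and the stimulus values.

```python
def boolean_conjunction(length_cm, blueness,
                        length_threshold=100.0, blue_threshold=0.5):
    """Two separate yes/no judgments combined with a Boolean AND.

    Thresholds are illustrative assumptions, not values from the study.
    """
    is_long = length_cm >= length_threshold
    is_blue = blueness >= blue_threshold
    return is_long and is_blue


def composite_match(length_cm, blueness,
                    ideal_length=120.0, ideal_blueness=1.0, criterion=0.7):
    """One judgment against a composite ideal (an 'ideal long blue coat').

    Similarity on each dimension is graded in [0, 1] and then averaged;
    this particular similarity measure is an assumption of the sketch.
    """
    length_sim = max(0.0, 1.0 - abs(length_cm - ideal_length) / ideal_length)
    blue_sim = max(0.0, 1.0 - abs(blueness - ideal_blueness))
    similarity = (length_sim + blue_sim) / 2
    return similarity >= criterion


# A coat that is very blue but just short of the 'long' threshold.
coat = {"length_cm": 95.0, "blueness": 0.95}
print(boolean_conjunction(**coat))  # the 'long' judgment fails, so AND fails
print(composite_match(**coat))      # strong blueness compensates for length
```

Under these assumed settings the two strategies disagree on the borderline coat: the Boolean rule rejects it because one component judgment fails, while the composite comparison accepts it because a strong value on one dimension compensates for a weak value on the other.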
Imke Kruitwagen – Reciprocal predicates: a typicality approach
In this talk, I present an experimental study that investigates how the different forms of reciprocal verbs are connected to each other. Reciprocal verbs like hug, fight and collide alternate between a unary collective form (1), a “with” form (2) and a binary form (3):
(1) Wendy and Pete are fighting.
(2) Wendy fights with Pete.
(3) Pete fights Wendy.
Currently, there is no general account of the semantic relations between these forms. I present a threshold-based model of those relations and discuss experimental data that allow me to evaluate the model. For instance, our new model predicts – in contrast to the dominant view in the literature – that (1) and (2) do not entail (3). In our model, a verbal root has a conceptual core (CC) which specifies the semantic attributes of the different verbal forms, where the weights of those attributes vary between alternations.
Sven Smeman and Maaike Smit – Methodological Choices in the Study of Mass-Count Comparisons
It is generally assumed that the baseline quantification strategy for mass noun denotations is non-cardinal measurement (e.g., mass or volume). At the same time, it is well known that conceptual packaging is possible, which in turn allows cardinal measurement (i.e., counting). In a series of experiments, we have attempted to investigate the effect of the morpho-syntactic properties of the comparison on the quantification of mass noun denotations. For each experiment, we created minimal pairs in which we compared a substance mass noun (e.g. soil) to the count or mass form of a flexible noun like stone/stones. In our talk, we will discuss our experiments, focusing on our methodological choices.
Yoad Winter – Countability and measurement in comparatives
The mass-count distinction can strongly affect meanings of nominal comparatives: ‘more stones/fruits/packs’ trigger counting; ‘more stone/fruit/sugar’ trigger measuring. This simple pattern often breaks down. One such case is counting in comparatives with mass nouns like ‘furniture’, ‘baggage’ and ‘weaponry’. Another is measuring with count nouns in mass-count comparatives like ‘more gold than diamonds’ and ‘more friends than money’. We report new results indicating that counting in comparatives is primed by both perceived discreteness and a grammatical ‘count’ status, but neither of these factors forces it. In cases of mismatch between grammar and perception, last resort operations come into play. This provides a unified picture of the grammatical, lexical and semantic-pragmatic effects on discrete and non-discrete meanings.
Joost Zwarts – Between a singular: The role of subatomic plurality
The Dutch preposition tussen ‘between’ is occasionally used with one singular count noun phrase (e.g., klem tussen de deur ‘caught in the door’, tussen het gewei ‘between the antlers’). A systematic collection of such uses from the Corpus of Contemporary Dutch offers insights into the role of ‘subatomic pluralities’ in semantics (e.g., the frame and panel of a door, the two antlers of one ‘gewei’).
Giada Palmieri – From Romance to Bantu and back: a journey through lexical reciprocal predicates
Languages like English make a clear distinction between instances where reciprocity is expressed lexically (through the meaning of the verb, as in Mary and Lisa hugged) and cases where reciprocal interpretations are the outcome of a productive grammatical operation (as in Mary and Lisa described each other). This distinction is not overtly encoded in a number of languages, in which all transitive verbs require the same morphosyntactic marking to denote reciprocal configurations, regardless of whether they originate from a lexical or grammatical strategy. This is the case in many Romance and Bantu languages, where reciprocal interpretations are systematically marked by the clitic se and by the verbal affix -an-, respectively. In this talk I will demonstrate that despite the absence of an overt morphosyntactic distinction, both lexical and grammatical reciprocity are operational in Romance and Bantu. Focusing primarily on Italian and Swahili, I will discuss the semantic properties shared by lexical reciprocal predicates in these two languages. I will argue that neither se nor -an- is the source of lexical reciprocal meanings, but these elements have different roles: Italian se is treated as a functional head projection, whereas Swahili -an- is lexicalized as part of the entry of lexical reciprocal verbs.
Sofiya Ros – Some observations on reciprocals in Wolof
Wolof is a Niger-Congo Atlantic language with a rich verbal and nominal morphology. Verbal derivations use distinct suffixes which may attach to a verb root and permit alterations to the category, valence and semantics of a verbal base. Reciprocity in Wolof is also derived with the use of verbal suffixes: -ante, -e and -oo. In my talk I will discuss their properties and show that they reflect different strategies: -ante is a valence-reducing morpheme that turns transitive verbs into reciprocal verbs; -e and -oo mark predicates with an inherent reciprocal meaning and do not operate on the verbs’ argument structure.
Rick Nouwen – Terribly nice, pretty ugly: the semantics and computational pragmatics of deadjectival intensifiers
Deadjectival intensifiers (e.g. “terribly” in “terribly nice”) are bleached, meaning that the lexical content of the adjectival base (e.g. “terrible”) is not conveyed in any way when the adverb is used as an intensifier. This observation has led to many proposals where such deadjectival adverbs of degree have extremely impoverished lexical content. In this talk I argue that the lexical content of the adjectival base of an intensifier, and in particular its connotative meaning, is directly relevant for how intensification works semantically and pragmatically. In particular, I highlight two generalisations that have remained unaccounted for so far. First, evaluative adjectives with a negative connotation tend to turn into deadjectival intensifiers expressing high degree, while adjectives with a positive evaluative connotation make intensifiers of medium degree. (Compare, for instance, “terribly” with “pretty”.) Second, negative modal adjectives can form deadjectival intensifiers, but positive ones cannot. (Compare, for instance, “Sue is unusually tall” with “?Sue is usually tall”.) I will argue that a relatively simple intersective semantics for evaluative and modal adverbs accounts for these observations, but that we can only show this if we supplement that semantic analysis with a computational probabilistic pragmatic component.


Albert Gatt – Describing objects, scenes and actions: Models and explanations
A visual scene can be described in many different ways. For example, one could describe the same image as “three people sitting on a rug eating a sandwich”, “people having a picnic” or “people having a good time”. In this talk, I will discuss some of our research on image captioning that goes beyond the standard object-centric style that characterises many datasets and models. I will introduce a new dataset that explicitly aligns object-centric captions with scene-level descriptions, as well as descriptions of actions and rationales. We use this dataset to study how neural models handle visual inputs when they describe images in these different ways. In particular, I will discuss methods using attention analysis, multimodal ablation, as well as a novel SHAP-based framework. These methods converge on insights which allow us to draw some parallels with research in human visual cognition, in particular, the perception of scenes.

Lasha Abzianidze – SpaceNLI: Evaluating Reasoning Capacity of Large Language Models in Space
While many natural language inference (NLI) datasets target certain semantic phenomena, e.g., negation, tense & aspect, monotonicity, and presupposition, to the best of our knowledge, there is no NLI dataset that involves diverse types of spatial expressions and reasoning. We fill this gap by semi-automatically creating an NLI dataset for spatial reasoning, called SpaceNLI. The data samples are automatically generated from a curated set of reasoning patterns, where the patterns are annotated with inference labels by experts. We test several SOTA NLI systems on SpaceNLI to gauge the complexity of the dataset and the systems’ capacity for spatial reasoning. We also introduce a pattern accuracy score that measures a system’s predictions on NLI patterns. We argue that it is a more reliable and stricter measure than the standard accuracy score based on generated NLI problems. Based on the evaluation results we find that the systems obtain moderate results on the spatial NLI problems but lack consistency per inference pattern. The results also reveal that non-projective spatial inferences are the most challenging ones.
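To illustrate why a per-pattern score can be stricter than standard accuracy, here is a minimal Python sketch. The abstract does not give the exact definition of the pattern accuracy score, so the version below is an assumption: a pattern counts as solved only if the share of its generated samples that are classified correctly reaches a threshold. The record layout, function names, and toy data are hypothetical.

```python
from collections import defaultdict


def sample_accuracy(records):
    """Standard accuracy over all generated NLI problems."""
    return sum(r["gold"] == r["pred"] for r in records) / len(records)


def pattern_accuracy(records, threshold=1.0):
    """Share of patterns whose per-pattern accuracy reaches `threshold`.

    Assumed definition for illustration; with threshold=1.0 a pattern
    only counts if every sample generated from it is predicted correctly.
    """
    per_pattern = defaultdict(list)
    for r in records:
        per_pattern[r["pattern"]].append(r["gold"] == r["pred"])
    solved = sum(
        1 for hits in per_pattern.values()
        if sum(hits) / len(hits) >= threshold
    )
    return solved / len(per_pattern)


# Toy data: two patterns with three generated samples each.
records = [
    {"pattern": "p1", "gold": "entailment", "pred": "entailment"},
    {"pattern": "p1", "gold": "entailment", "pred": "entailment"},
    {"pattern": "p1", "gold": "entailment", "pred": "entailment"},
    {"pattern": "p2", "gold": "contradiction", "pred": "contradiction"},
    {"pattern": "p2", "gold": "contradiction", "pred": "contradiction"},
    {"pattern": "p2", "gold": "contradiction", "pred": "neutral"},
]

print(sample_accuracy(records))
print(pattern_accuracy(records))
```

In this toy data five of the six samples are correct, but only one of the two patterns is solved consistently, so the per-pattern score is lower than the sample-level score. Lowering `threshold` relaxes the consistency requirement.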

Denis Paperno – Leverage Points in Modality Shifts: Comparing Language-only and Multimodal Word Representations
Multimodal embeddings aim to enrich the semantic information in neural representations of language compared to text-only models. While different embeddings exhibit different applicability and performance on downstream tasks, little is known about the systematic representation differences attributed to the visual modality. Our paper compares word embeddings from three vision-and-language models (CLIP, OpenCLIP and Multilingual CLIP) and three text-only models, with static (FastText) as well as contextual representations (multilingual BERT and XLM-RoBERTa). This is the first large-scale study of the effect of visual grounding on language representations, including 46 semantic parameters. We identify meaning properties and relations that characterize words whose embeddings are most affected by the inclusion of visual modality in the training data; that is, points where visual grounding turns out most important. We find that the effect of visual modality correlates most with denotational semantic properties related to concreteness, but is also detected for several specific semantic classes, as well as for valence, a sentiment-related connotational property of linguistic expressions.
Sofia Nikiforova – Imperfect AI in High-Accuracy Fields
Even the best AI models make mistakes. Does it mean that in domains where making a mistake is sometimes critical, AI adoption is unrealistic in practice? In this talk, I will discuss evidence from the fields of Medicine, Law and Education, focusing on the attitudes of experts and potential stakeholders towards the usage of imperfect AI applications.