In this page we present the first CLEF 2023 shared task evaluating the temporal persistence of information retrieval (IR) systems and text classifiers. The task is motivated by recent research showing that the performance of these models drops as the test data becomes more distant in time from the training data. LongEval differs from traditional IR and classification shared task with special considerations on evaluating models that mitigate performance drop over time. We envisage that this task will bring more attention from the NLP community to the problem of temporal generalisability of models, what enables or prevents it, potential solutions and limitations.
The CLEF 2023 LongEval Lab encourages participants to develop temporal information retrieval systems and longitudinal text classifiers that survive through dynamic temporal text changes, introducing time as a new dimension for ranking models performance.