Large language models (LLMs) are increasingly used for creative tasks, yet we lack principled ways to evaluate and understand their creative abilities. We provide the first systematic evaluation framework for combinatorial creativity (CC), uncovering fundamental limitations that persist even as models scale.
Our algorithmic framework for evaluating combinatorial creativity: Models are trained on concept-relation-concept triples from a conceptual graph, then prompted to generate creative "ideas" (paths) between distant concepts while satisfying inclusion-exclusion constraints.
Theoretical Framework: We propose the first mathematical framework for evaluating combinatorial creativity, modeling it as pathfinding in conceptual spaces where models must discover novel connections between concepts while satisfying logical constraints.
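To make the pathfinding view concrete, the following is a minimal sketch, not the paper's implementation. It assumes a toy conceptual graph given as (head, relation, tail) triples and treats an "idea" as a path from a source concept to a target concept that visits every concept in an inclusion set and none in an exclusion set. The triples and the names `TRIPLES`, `build_graph`, and `find_idea` are hypothetical and chosen only for illustration, and breadth-first search stands in for whatever a trained model would actually generate.

```python
from collections import deque

# Hypothetical toy conceptual graph: (head concept, relation, tail concept).
TRIPLES = [
    ("bird", "has", "wing"),
    ("wing", "enables", "flight"),
    ("flight", "inspires", "airplane"),
    ("bird", "eats", "seed"),
    ("seed", "grows_into", "plant"),
    ("plant", "provides", "fuel"),
    ("fuel", "powers", "airplane"),
]


def build_graph(triples):
    """Build an adjacency map: concept -> list of (relation, neighbour)."""
    graph = {}
    for head, relation, tail in triples:
        graph.setdefault(head, []).append((relation, tail))
    return graph


def find_idea(graph, source, target, include=(), exclude=()):
    """Return the shortest list of triples linking source to target that
    visits every concept in `include` and no concept in `exclude`,
    or None if no such path exists (assumed constraint semantics)."""
    include, exclude = frozenset(include), frozenset(exclude)
    if source in exclude or target in exclude:
        return None
    # A search state is (current concept, required concepts seen so far).
    start = (source, include & {source})
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        (node, seen), path = queue.popleft()
        if node == target and seen == include:
            return path
        for relation, nxt in graph.get(node, []):
            if nxt in exclude:
                continue  # exclusion constraint: never step onto this concept
            state = (nxt, seen | (include & {nxt}))
            if state not in visited:
                visited.add(state)
                queue.append((state, path + [(node, relation, nxt)]))
    return None


graph = build_graph(TRIPLES)
unconstrained = find_idea(graph, "bird", "airplane")
constrained = find_idea(graph, "bird", "airplane", include={"plant"})
infeasible = find_idea(graph, "bird", "airplane",
                       include={"plant"}, exclude={"fuel"})
```

In this toy graph the unconstrained search returns the three-step route through `wing` and `flight`, requiring `plant` forces the longer route through `seed` and `fuel`, and additionally excluding `fuel` leaves no valid path. Tracking the set of required concepts already seen inside the search state is what lets one breadth-first pass handle both kinds of constraint.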
Since Mednick (1962), creative ability has been associated with richer associative hierarchies that enable combinations of distant representations, leading to breakthrough discoveries.
Architectural Insights: Through systematic experiments on 1M-, 10M-, and 100M-parameter GPT-2-style decoder-only Transformers, we reveal:
Fundamental Limitation: We discover a novelty-utility tradeoff that persists unchanged across all scales: as models generate more novel ideas, they increasingly struggle to satisfy practical constraints. This explains the "ideation-execution gap," in which LLMs excel at generating creative ideas but fail to ensure their feasibility.
This scale-invariant tradeoff suggests that simply making models larger may not solve creativity limitations in current architectures. Instead of just needing more parameters, creative AI may require fundamental architectural innovations. Our framework provides the first systematic approach for understanding and improving AI creativity, revealing both the potential and the intrinsic limitations of current approaches.
@misc{schapiro2025combinatorialcreativitynew,
title={Combinatorial Creativity: A New Frontier in Generalization Abilities},
author={Samuel Schapiro and Sumuk Shashidhar and Alexi Gladstone and Jonah Black and Royce Moon and Dilek Hakkani-Tur and Lav R. Varshney},
year={2025},
eprint={2509.21043},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2509.21043},
}