This paper explores ChatGPT's geographic knowledge representation and reasoning abilities through three exploratory probes. The authors investigate the model's sensitivity to syntactic variations, its susceptibility to distributional shifts when composing tasks, and the limitations of relying solely on factual recall for evaluating geographic understanding. The vignettes reveal potential biases and brittleness in ChatGPT's geographic reasoning, highlighting the need for deeper investigation beyond simple accuracy metrics.
ChatGPT's geographic reasoning can be surprisingly brittle, with minor syntactic changes causing significant output variations and task composition revealing unexpected distributional shifts.
Understanding how AI will represent and reason about geography should be a key concern for all of us, as the broader public increasingly interacts with spaces and places through these systems. Similarly, in line with the nature of foundation models, our own research often relies on pre-trained models. Hence, understanding what world AI systems construct is as important as evaluating their accuracy, including factual recall. To motivate the need for such studies, we provide three illustrative vignettes, i.e., exploratory probes, in the hope that they will spark lively discussions and follow-up work: (1) Do models form strong defaults, and how brittle are model outputs to minute syntactic variations? (2) Can distributional shifts resurface from the composition of individually benign tasks, e.g., when using AI systems to create personas? (3) Do we overlook deeper questions of understanding when solely focusing on the ability of systems to recall facts such as geographic principles?