A benchmark for evaluating large language models (LLMs) on multi-step geospatial tasks relevant to commercial GIS practitioners is established, and an LLM-as-Judge evaluation framework is developed to compare agent solutions against reference implementations.
Turns out, Claude 3.5 Sonnet and o4-mini are surprisingly good at geospatial tasks, outperforming even GPT-4.1 and Gemini 2.5 Pro Preview on a new benchmark for tool-calling LLMs.
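To make the LLM-as-Judge idea concrete, here is a minimal Python sketch of how an agent's solution might be scored against a reference implementation. The rubric wording, the 0-10 scale, and all function names are illustrative assumptions, not the benchmark's actual protocol; the judge model call is left as a generic callable so any chat-completion API can be plugged in.

```python
# Hypothetical sketch of an LLM-as-Judge comparison for a geospatial task.
# The prompt, scale, and names are assumptions for illustration only.
from dataclasses import dataclass
from typing import Callable

JUDGE_PROMPT = """You are grading a GIS agent's solution to a multi-step geospatial task.

Task description:
{task}

Reference implementation:
{reference}

Agent solution:
{solution}

Compare the agent solution against the reference. Judge whether it performs
equivalent geoprocessing steps and would produce an equivalent result.
Reply with a single line: SCORE: <integer 0-10> followed by a one-sentence justification."""


@dataclass
class JudgeResult:
    score: int
    rationale: str


def judge_solution(
    task: str,
    reference: str,
    solution: str,
    call_judge_llm: Callable[[str], str],  # wrapper around any chat-completion API
) -> JudgeResult:
    """Ask a judge LLM to score an agent solution against a reference (0-10)."""
    reply = call_judge_llm(
        JUDGE_PROMPT.format(task=task, reference=reference, solution=solution)
    )
    # Expect a line like "SCORE: 7 ..."; fall back to 0 if the judge did not comply.
    for line in reply.splitlines():
        if line.strip().upper().startswith("SCORE:"):
            rest = line.split(":", 1)[1].strip()
            first = rest.split()[0] if rest else "0"
            score = int(first) if first.isdigit() else 0
            return JudgeResult(score=score, rationale=rest)
    return JudgeResult(score=0, rationale=reply.strip())
```

In practice a judge like this would be run per task and aggregated across the benchmark to produce the per-model comparisons described above.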