Search papers, labs, and topics across Lattice.
The authors introduce PaveInstruct, a large-scale dataset of 278k image-instruction-response pairs for pavement condition assessment, and use it to train PaveGPT, a domain-specific vision-language model. Instruction tuning on PaveInstruct significantly improves performance on spatial grounding, reasoning, and generation tasks related to pavement analysis. PaveGPT achieves ASTM D6433-compliant outputs, demonstrating the potential for unified conversational assessment tools in transportation infrastructure.
Domain-specific instruction tuning can transform general vision-language models into expert systems capable of comprehensive and standardized infrastructure assessment.
General-purpose vision-language models demonstrate strong performance in everyday domains but struggle with specialized technical fields requiring precise terminology, structured reasoning, and adherence to engineering standards. This work addresses whether domain-specific instruction tuning can enable comprehensive pavement condition assessment through vision-language models. PaveInstruct, a dataset containing 278,889 image-instruction-response pairs spanning 32 task types, was created by unifying annotations from nine heterogeneous pavement datasets. PaveGPT, a pavement foundation model trained on this dataset, was evaluated against state-of-the-art vision-language models across perception, understanding, and reasoning tasks. Instruction tuning transformed model capabilities, achieving improvements exceeding 20% in spatial grounding, reasoning, and generation tasks while producing ASTM D6433-compliant outputs. These results enable transportation agencies to deploy unified conversational assessment tools that replace multiple specialized systems, simplifying workflows and reducing technical expertise requirements. The approach establishes a pathway for developing instruction-driven AI systems across infrastructure domains including bridge inspection, railway maintenance, and building condition assessment.