This paper introduces a pipeline for constructing a large-scale, author-centric knowledge graph of battery research using the OpenAlex bibliographic catalogue. For each author, the pipeline generates a weighted research descriptor vector that combines OpenAlex concepts with keyphrases extracted by KeyBERT using ChatGPT as the backend model; vector components are weighted by descriptor origin, authorship position, and recency. The resulting knowledge graph, built from 189,581 battery-related works, enables author similarity computation, community detection, and integration with external linked data sources, offering a domain-semantic approach to expertise tracking.
Discover expertise and collaborators in battery research at a global scale, grounded in semantic understanding rather than just citations.
Battery research is a rapidly growing and highly interdisciplinary field, making it increasingly difficult to track relevant expertise and identify potential collaborators across institutional boundaries. In this work, we present a pipeline for constructing an author-centric knowledge graph of battery research built on OpenAlex, a large-scale open bibliographic catalogue. For each author, we derive a weighted research descriptor vector that combines coarse-grained OpenAlex concepts with fine-grained keyphrases extracted from titles and abstracts using KeyBERT with ChatGPT (gpt-3.5-turbo) as the backend model, selected after evaluating multiple alternatives. Vector components are weighted by research descriptor origin, authorship position, and temporal recency. The framework is applied to a corpus of 189,581 battery-related works. The resulting vectors support author-author similarity computation, community detection, and exploratory search through a browser-based interface. The knowledge graph is serialized in RDF and linked to Wikidata identifiers, making it interoperable with external linked open data sources and extensible beyond the battery domain. Unlike prior author-centric analyses confined to institutional repositories, our approach operates at cross-institutional scale and grounds similarity in domain semantics rather than in citation or co-authorship structure alone.
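To make the weighting scheme concrete, the following is a minimal sketch of how a weighted research descriptor vector and an author-author similarity could be computed. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the weight values, the recency function, or the similarity measure, so the numeric weights, the exponential half-life decay, the cosine similarity, the input record layout, and all function names below are hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical weights: the abstract names the three weighting factors
# (origin, authorship position, recency) but not their values.
ORIGIN_WEIGHT = {"concept": 0.5, "keyphrase": 1.0}
POSITION_WEIGHT = {"first": 1.0, "middle": 0.5, "last": 0.8}
HALF_LIFE_YEARS = 5.0  # assumed exponential decay; the paper's form may differ


def recency_weight(pub_year, current_year):
    """Halve a work's contribution every HALF_LIFE_YEARS years."""
    age = max(0, current_year - pub_year)
    return 0.5 ** (age / HALF_LIFE_YEARS)


def author_vector(works, current_year):
    """Build a sparse descriptor vector (term -> weight) for one author.

    Each work is a dict with the author's 'position', the publication
    'year', and 'descriptors' mapping an origin to a list of terms.
    """
    vec = defaultdict(float)
    for work in works:
        base = POSITION_WEIGHT[work["position"]] * recency_weight(
            work["year"], current_year
        )
        for origin, terms in work["descriptors"].items():
            for term in terms:
                vec[term] += ORIGIN_WEIGHT[origin] * base
    return dict(vec)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Usage with two made-up authors.
alice = author_vector(
    [{"position": "first", "year": 2025,
      "descriptors": {"concept": ["Electrolyte"],
                      "keyphrase": ["solid-state electrolyte"]}}],
    current_year=2025,
)
bob = author_vector(
    [{"position": "last", "year": 2020,
      "descriptors": {"concept": ["Electrolyte"],
                      "keyphrase": ["lithium dendrite growth"]}}],
    current_year=2025,
)
similarity = cosine_similarity(alice, bob)
```

Storing vectors as sparse dictionaries keeps the sketch dependency-free; at the scale of the corpus described above, a sparse matrix representation would likely be a more practical choice.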