Search papers, labs, and topics across Lattice.
IsalGraph encodes any finite, simple graph as a compact string using a nine-character instruction set executed by a virtual machine with a sparse graph and a circular doubly-linked list. The encoding guarantees that every string decodes to a valid graph, preventing invalid states. Experiments on real-world graph datasets demonstrate that the Levenshtein distance between IsalGraph strings correlates strongly with graph edit distance (GED), making it suitable for graph similarity search and generation.
Representing graphs as strings with a guaranteed-valid instruction set unlocks language model-based approaches for graph similarity, generation, and conditioned modeling.
We present IsalGraph, a method for representing the structure of any finite, simple graph as a compact string over a nine-character instruction alphabet. The encoding is executed by a small virtual machine comprising a sparse graph, a circular doubly-linked list (CDLL) of graph-node references, and two traversal pointers. Instructions either move a pointer through the CDLL or insert a node or edge into the graph. A key design property is that every string over the alphabet decodes to a valid graph, with no invalid states reachable. A greedy \emph{GraphToString} algorithm encodes any connected graph into a string in time polynomial in the number of nodes; an exhaustive-backtracking variant produces a canonical string by selecting the lexicographically smallest shortest string across all starting nodes and all valid traversal orders. We evaluate the representation on five real-world graph benchmark datasets (IAM Letter LOW/MED/HIGH, LINUX, and AIDS) and show that the Levenshtein distance between IsalGraph strings correlates strongly with graph edit distance (GED). Together, these properties make IsalGraph strings a compact, isomorphism-invariant, and language-model-compatible sequential encoding of graph structure, with direct applications in graph similarity search, graph generation, and graph-conditioned language modelling