This paper introduces a wireless image transmission method that leverages scene graph generation and text-to-image diffusion models to improve bandwidth efficiency and interference resilience. The method encodes semantic information extracted from images into scene graphs, transmits these graphs wirelessly, and then reconstructs the image using a diffusion model conditioned on the decoded scene graph. Experimental results, evaluated using a newly proposed Comprehensive Semantic Similarity Index (CSSI), demonstrate improved semantic similarity between reconstructed and original images compared to traditional methods.
Scene graphs plus diffusion models can dramatically cut bandwidth needs for wireless image transmission while maintaining semantic fidelity.
With the rise of concepts like the metaverse and the Internet of Everything, current wireless communication technologies struggle to meet growing demands in image transmission. This paper presents a novel wireless image transmission method built on a simple encoding-decoding framework with scene graph generation as a preprocessing step. The approach extracts semantic information from the original image, including object bounding box coordinates, categories, and inter-object relationships. This information is encoded and transmitted over the wireless channel. After decoding, a generative diffusion model conditioned on the recovered scene graph synthesizes an image semantically consistent with the original. Compared to traditional methods, this approach reduces bandwidth requirements and improves interference resilience. To evaluate performance, a new metric called the Comprehensive Semantic Similarity Index (CSSI) is introduced, combining deep feature similarity, semantic segmentation consistency, and object detection consistency. Experiments show that this method achieves higher semantic similarity between reconstructed and original images than the baselines.
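To make the bandwidth argument concrete, the sketch below serializes a tiny scene graph of the kind the abstract describes (object boxes, categories, relations) and combines three similarity terms into a CSSI-style score. This is a hypothetical illustration: the paper's actual wire format, field names, and CSSI weighting are not specified here, so the JSON layout and the equal-weight average are assumptions.

```python
import json

# Hypothetical scene graph for one image: object bounding boxes
# (x, y, w, h in pixels), categories, and pairwise relations.
# Field names are illustrative, not the paper's actual format.
scene_graph = {
    "objects": [
        {"id": 0, "category": "dog", "bbox": [34, 50, 120, 160]},
        {"id": 1, "category": "frisbee", "bbox": [140, 40, 60, 60]},
    ],
    "relations": [
        {"subject": 0, "predicate": "chasing", "object": 1},
    ],
}

# Compact serialization: the semantic payload is a few hundred bytes,
# versus kilobytes for even a heavily compressed pixel representation.
payload = json.dumps(scene_graph, separators=(",", ":")).encode("utf-8")
print(f"payload size: {len(payload)} bytes")


def cssi(feature_sim, seg_consistency, det_consistency,
         weights=(1 / 3, 1 / 3, 1 / 3)):
    """CSSI-style score: a weighted combination of deep feature
    similarity, semantic segmentation consistency, and object
    detection consistency. Equal weights are an assumption."""
    terms = (feature_sim, seg_consistency, det_consistency)
    return sum(w * t for w, t in zip(weights, terms))


# Example: three per-component similarities in [0, 1] for a
# reconstructed image, combined into one score.
print(f"CSSI: {cssi(0.9, 0.8, 0.7):.3f}")
```

Because only the scene graph crosses the channel, the payload size is independent of image resolution; the diffusion model at the receiver bears the cost of re-synthesizing pixels.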