This paper introduces ORGAN, a novel object-centric representation learning approach based on cycle-consistent GANs, to segment images into objects and represent them in a low-dimensional latent space. The method addresses limitations of autoencoder-based approaches, particularly in handling real-world datasets with numerous objects and low contrast. Experiments on synthetic and real-world datasets demonstrate that ORGAN performs comparably to state-of-the-art methods on synthetic data, while uniquely handling complex real-world scenarios and enabling object manipulation through expressive latent spaces.
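Cycle-consistent GANs can perform unsupervised object-centric representation learning: ORGAN matches autoencoder-based methods on synthetic data and is the only approach tested that handles real-world images with many objects and low visual contrast.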
Although data generation is often straightforward, extracting information from data is more difficult. Object-centric representation learning can extract information from images in an unsupervised manner. It does so by segmenting an image into its subcomponents: the objects. Each object is then represented in a low-dimensional latent space that can be used for downstream processing. Object-centric representation learning is dominated by autoencoder architectures (AEs). Here, we present ORGAN, a novel approach for object-centric representation learning, which is based on cycle-consistent Generative Adversarial Networks instead. We show that it performs similarly to other state-of-the-art approaches on synthetic datasets, while at the same time being the only approach tested here capable of handling more challenging real-world datasets with many objects and low visual contrast. Complementing these results, ORGAN creates expressive latent space representations that allow for object manipulation. Finally, we show that ORGAN scales well both with respect to the number of objects and the size of the images, giving it a unique edge over current state-of-the-art approaches.
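To make the term "cycle-consistent" concrete, the following is a minimal sketch of a generic cycle-consistency loss, not ORGAN's actual architecture or training objective. It assumes, purely for illustration, that the two domains are images and latent vectors, and that mapping one way and back should reproduce the input. The functions `toy_encode`, `toy_decode`, and `lossy_decode` are hypothetical stand-ins for learned networks, and the adversarial (discriminator) terms of a GAN are omitted.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Mapping = Callable[[Vector], List[float]]


def l1_distance(a: Vector, b: Vector) -> float:
    """Mean absolute difference between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def cycle_consistency_loss(
    image: Vector, latent: Vector, encode: Mapping, decode: Mapping
) -> float:
    """Sum of both cycle terms (illustrative assumption, not ORGAN's loss).

    image  -> encode -> decode should return the original image;
    latent -> decode -> encode should return the original latent.
    """
    image_cycle = l1_distance(decode(encode(image)), image)
    latent_cycle = l1_distance(encode(decode(latent)), latent)
    return image_cycle + latent_cycle


# Hypothetical toy mappings standing in for learned networks.
def toy_encode(image: Vector) -> List[float]:
    return [2.0 * pixel for pixel in image]


def toy_decode(latent: Vector) -> List[float]:
    return [0.5 * z for z in latent]


def lossy_decode(latent: Vector) -> List[float]:
    # Adds a constant offset, so it is not an exact inverse of toy_encode.
    return [0.5 * z + 0.1 for z in latent]


if __name__ == "__main__":
    image = [0.2, 0.8]
    latent = [1.0, -1.0]
    print(cycle_consistency_loss(image, latent, toy_encode, toy_decode))
    print(cycle_consistency_loss(image, latent, toy_encode, lossy_decode))
```

When the two mappings are exact inverses, as `toy_encode` and `toy_decode` are, both cycle terms vanish. Any mismatch, as with `lossy_decode`, produces a positive loss, which is the signal that pushes the two mappings toward mutual consistency during training.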