This paper presents a longitudinal analysis of 423 neuromorphic datasets, examining their characteristics, accessibility, and underlying task structures. It identifies challenges related to dataset size, lack of standardization, and access difficulties, while also highlighting the increasing prevalence of synthetic datasets. The authors propose meta-datasets as a potential solution to mitigate bias and reduce the demand for entirely new datasets.
Neuromorphic engineering's data deluge is more mirage than oasis: a comprehensive analysis reveals a landscape plagued by accessibility issues, lack of standardization, and a rising tide of potentially misleading synthetic data.
Neuromorphic engineering has a data problem. Despite the meteoric rise in the number of neuromorphic datasets published over the past ten years, a significant portion of neuromorphic research papers still conclude that there is a need for yet more data and even larger datasets. Whilst this need is driven in part by the sheer volume of data required by modern deep learning approaches, it is also fuelled by the current state of the available neuromorphic datasets and the difficulties in finding them, understanding their purpose, and determining the nature of their underlying task. This is further compounded by practical difficulties in downloading and using these datasets. This review starts by capturing a snapshot of the existing neuromorphic datasets, covering over 423 datasets, and then explores the nature of their tasks and the underlying structure of the presented data. Analysing these datasets reveals the problems arising from their size, the lack of standardisation, and the obstacles to accessing the actual data. This paper also highlights the growth in the size of individual datasets and the complexities involved in working with the data. However, a more important concern is the rise of synthetic datasets, created by either simulation or video-to-events methods. This review explores the benefits of simulated data for testing existing algorithms and applications, while highlighting the potential pitfalls of relying on it when exploring new applications of neuromorphic technologies. This review also introduces the concept of meta-datasets, created from existing datasets, as a way of both reducing the need for more data and removing potential bias arising from defining both the dataset and the task.