This paper analyzes the approximation error of DeepONets, showing that the branch network dominates the error when the internal dimension is sufficiently large and that the learned trunk basis can often be replaced by classical basis functions without a significant loss in performance. By replacing the trunk network with the left singular vectors of the training solution matrix, the authors identify a spectral bias in the branch network: coefficients of dominant, low-frequency modes are learned more effectively. Because the branch coefficients scale with the singular values, the overall branch error is dominated by modes with intermediate singular values rather than the smallest ones. The authors also find that a shared branch network improves generalization of small modes compared to a stacked architecture, and they identify strong, detrimental coupling between modes in parameter space.
In DeepONets, the branch network dominates the approximation error, and within it the largest contribution comes from modes with intermediate singular values rather than the smallest ones; the branch network also shows a spectral bias toward dominant, low-frequency modes.
Operator learning has the potential to strongly impact scientific computing by learning solution operators for differential equations, potentially accelerating multi-query tasks such as design optimization and uncertainty quantification by orders of magnitude. Despite proven universal approximation properties, deep operator networks (DeepONets) often exhibit limited accuracy and generalization in practice, which hinders their adoption. Understanding these limitations is therefore crucial for further advancing the approach. This work analyzes performance limitations of the classical DeepONet architecture. It is shown that the approximation error is dominated by the branch network when the internal dimension is sufficiently large, and that the learned trunk basis can often be replaced by classical basis functions without a significant impact on performance. To investigate this further, a modified DeepONet is constructed in which the trunk network is replaced by the left singular vectors of the training solution matrix. This modification yields several key insights. First, a spectral bias in the branch network is observed, with coefficients of dominant, low-frequency modes learned more effectively. Second, due to singular-value scaling of the branch coefficients, the overall branch error is dominated by modes with intermediate singular values rather than the smallest ones. Third, using a shared branch network for all mode coefficients, as in the standard architecture, improves generalization of small modes compared to a stacked architecture in which coefficients are computed separately. Finally, strong and detrimental coupling between modes in parameter space is identified.
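The modified architecture described above can be illustrated with a minimal sketch, assuming NumPy and synthetic toy data. The sine-mode data, the array sizes, the noise model standing in for branch-network error, and all names (`S`, `U_r`, `normalized`, `mode_error_contributions`) are illustrative assumptions and are not taken from the paper. The sketch builds a fixed trunk basis from the left singular vectors of a solution matrix, computes the coefficients a branch network would have to learn, and shows how the singular-value scaling weights each mode's contribution to the solution-space error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data (not from the paper): each column of S is one
# "solution" on a 1D grid, built from sine modes with decaying amplitude.
n_points, n_samples, n_modes = 64, 200, 10
x = np.linspace(0.0, 1.0, n_points)
freqs = np.arange(1, 21)
amps = rng.normal(size=(freqs.size, n_samples)) / freqs[:, None] ** 2
S = np.sin(np.pi * np.outer(x, freqs)) @ amps  # shape (n_points, n_samples)

# Fixed "trunk": the leading left singular vectors of the solution matrix.
U, sigma, Vt = np.linalg.svd(S, full_matrices=False)
U_r, sigma_r = U[:, :n_modes], sigma[:n_modes]

# Targets the branch network would have to learn: projection coefficients.
coeffs = U_r.T @ S                      # shape (n_modes, n_samples)
normalized = coeffs / sigma_r[:, None]  # rows have unit Euclidean norm

# Best possible reconstruction with this basis (a perfect branch network).
S_hat = U_r @ coeffs


def mode_error_contributions(pred_normalized, true_normalized, sigma):
    """Squared solution-space error contributed by each mode.

    Because the basis is orthonormal, mode k contributes
    sigma_k**2 * ||pred_k - true_k||**2.
    """
    diff = pred_normalized - true_normalized
    return sigma ** 2 * np.sum(diff ** 2, axis=1)


# Assumed stand-in for an imperfect branch network: equal noise on every mode.
pred = normalized + rng.normal(scale=0.05, size=normalized.shape)
contrib = mode_error_contributions(pred, normalized, sigma_r)
total = np.linalg.norm(U_r @ (sigma_r[:, None] * pred) - S_hat) ** 2
```

Here `contrib.sum()` matches `total`, so the solution-space error splits exactly into per-mode terms weighted by the squared singular values. With the equal-noise stand-in used in this sketch, the largest modes would dominate. If the normalized coefficient error instead grows toward the smaller modes, as the reported spectral bias suggests, the product of the two factors can peak at intermediate modes, which is one way to read the abstract's second finding.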