Steerable convolutional neural networks (CNNs) provide a general framework for building neural networks equivariant to translations and other transformations belonging to an origin-preserving group \(G\), such as reflections and rotations. They rely on standard convolutions with \(G\)-steerable kernels obtained by analytically solving the group-specific equivariance constraint imposed on the kernel space. As the solution is tailored to a particular group \(G\), the implementation of a kernel basis does not generalize to other symmetry transformations, which complicates the development of general group equivariant models. We propose using implicit neural representations via multi-layer perceptrons (MLPs) to parameterize \(G\)-steerable kernels. The resulting framework offers a simple and flexible way to implement Steerable CNNs and generalizes to any group \(G\) for which a \(G\)-equivariant MLP can be built. We demonstrate the effectiveness of our method on multiple tasks, including N-body simulations, point cloud classification, and molecular property prediction.
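To make the core idea of the abstract concrete, here is a minimal, deliberately *non-equivariant* sketch of an implicit convolutional kernel: an MLP maps relative positions to kernel matrices, which are then contracted against neighboring features. The class name, layer sizes, and the dense all-pairs aggregation are illustrative assumptions; in the paper's setting, the plain MLP would be replaced by a \(G\)-equivariant MLP so that the resulting kernel is \(G\)-steerable.

```python
import torch
import torch.nn as nn

class ImplicitKernelConv(nn.Module):
    """Continuous convolution whose kernel is parameterized implicitly by an MLP."""

    def __init__(self, c_in, c_out, hidden=64, dim=3):
        super().__init__()
        self.c_in, self.c_out = c_in, c_out
        # The MLP maps a relative position (x_j - x_i) to a (c_out x c_in) kernel matrix.
        self.kernel_mlp = nn.Sequential(
            nn.Linear(dim, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, c_out * c_in),
        )

    def forward(self, pos, feat):
        # pos: (n, dim) point coordinates; feat: (n, c_in) features
        rel = pos[None, :, :] - pos[:, None, :]             # (n, n, dim) pairwise offsets
        k = self.kernel_mlp(rel)                            # (n, n, c_out * c_in)
        k = k.view(*rel.shape[:2], self.c_out, self.c_in)   # (n, n, c_out, c_in)
        # y_i = sum_j K(x_j - x_i) f_j  (dense all-pairs sum, for clarity only)
        return torch.einsum('ijoc,jc->io', k, feat)

pos = torch.randn(16, 3)
feat = torch.randn(16, 8)
layer = ImplicitKernelConv(c_in=8, c_out=4)
print(layer(pos, feat).shape)  # torch.Size([16, 4])
```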
From the atomic level to the vast expanse of the universe, symmetry and equivariance are consistently observed.
Whether it is the behaviour of molecules or patterns in point clouds, there are often properties
of the system that are preserved under certain transformations.
Equivariant deep learning aims to encode these symmetries directly into the learning process,
yielding more efficient and generalizable models.
Such models guarantee that a transformation applied to the input data is carried through consistently to the model's output.
Convolutional Neural Networks (CNNs) serve as a classic example,
being equivariant to translations of the input: shifting an image and then applying a convolutional layer gives the same result as applying the layer first and shifting its output.
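This is easy to check directly in PyTorch; a quick sketch (shift sizes and tensor shapes are arbitrary, and circular padding is used so the check is exact, with no boundary effects):

```python
import torch
import torch.nn as nn

# Circular padding makes translation equivariance exact (no boundary effects)
conv = nn.Conv2d(1, 8, kernel_size=3, padding=1, padding_mode='circular', bias=False)
x = torch.randn(1, 1, 32, 32)

def shift(t, dy=2, dx=3):
    return torch.roll(t, shifts=(dy, dx), dims=(-2, -1))

# Equivariance: convolving a shifted image equals shifting the convolved image
print(torch.allclose(conv(shift(x)), shift(conv(x)), atol=1e-5))  # True
```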
However, to capture a broader range of symmetries found in complex systems,
especially in physics and chemistry, group equivariant CNNs
(G-CNNs) have been developed.
The feature spaces of G-CNNs are described as collections of feature fields.
Each field is essentially a feature map that assigns a tensor of values to each point in the input space.
Common instances include scalar fields, like grey-scale images and temperature distributions, and vector fields, such as wind velocity or electromagnetic fields.
For example, one can think of feature vectors in standard CNNs as collections of \(N_{\text{channels}}\) scalar fields
and generalize them to collections of fields with different types (e.g. \(N\) scalar channels, \(M\) vector channels).
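One widely used implementation of these ideas is the escnn library, which makes field types explicit objects. A minimal sketch (the choice of the C8 rotation group and the field multiplicities below are arbitrary, for illustration only):

```python
import torch
from escnn import gspaces, nn as enn

# Feature fields on the plane, equivariant to 8 discrete rotations (group C8)
gspace = gspaces.rot2dOnR2(N=8)

# Input: 3 scalar channels (e.g. an RGB image, trivial representation).
# Output: 8 scalar channels + 4 vector channels (irrep(1) is 2-dimensional),
# i.e. 8 + 4 * 2 = 16 raw tensor channels in total.
in_type = enn.FieldType(gspace, 3 * [gspace.trivial_repr])
out_type = enn.FieldType(gspace, 8 * [gspace.trivial_repr] + 4 * [gspace.irrep(1)])

conv = enn.R2Conv(in_type, out_type, kernel_size=5, padding=2)

x = enn.GeometricTensor(torch.randn(1, 3, 32, 32), in_type)
y = conv(x)  # y transforms according to out_type under rotations of the input
```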
Depending on its type, a field has specific transformation behaviour when subjected to a group element, e.g. a rotation.
This behaviour is described by a group representation, which is a mapping from group elements to linear operators on the field space.
For example, the values of a scalar field are left unchanged by a rotation (only the underlying sample points move), so the associated linear operator is the identity, corresponding to the trivial representation.
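A small numpy example makes the contrast explicit: under a rotation, the sample point moves in both cases, but the representation \(\rho\) determines how the feature value itself transforms.

```python
import numpy as np

theta = np.pi / 2  # rotate by 90 degrees
rho_vector = np.array([[np.cos(theta), -np.sin(theta)],
                       [np.sin(theta),  np.cos(theta)]])  # rep. on vector fields
rho_scalar = np.eye(1)                                    # trivial rep. on scalars

x = np.array([1.0, 0.0])   # a point in the plane
s = np.array([0.7])        # scalar feature observed at x
v = np.array([0.0, 1.0])   # vector feature observed at x

x_new = rho_vector @ x     # the sample point itself is rotated
s_new = rho_scalar @ s     # scalar value is unchanged (identity operator)
v_new = rho_vector @ v     # vector value is rotated along with the frame
print(x_new, s_new, v_new)
```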
We must furthermore require that the model respects the transformation laws of the input, intermediate, and output feature fields,
which is essentially the equivariance constraint covered next.
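In the steerable-CNN literature, this requirement is made explicit as a constraint on the convolution kernel itself. For input and output fields transforming under representations \(\rho_{in}\) and \(\rho_{out}\) of \(G\), a kernel \(k : \mathbb{R}^n \to \mathbb{R}^{c_{out} \times c_{in}}\) yields an equivariant convolution exactly when

\[
k(g \cdot x) \;=\; \rho_{out}(g)\, k(x)\, \rho_{in}(g)^{-1} \qquad \text{for all } g \in G,\; x \in \mathbb{R}^n.
\]

Solving this constraint analytically for each group \(G\) is precisely the step that the implicit, MLP-based kernel parameterization described in the abstract replaces.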
@inproceedings{zhdanov2023implicit,
  title     = {Implicit Convolutional Kernels for Steerable {CNN}s},
  author    = {Maksim Zhdanov and Nico Hoffmann and Gabriele Cesa},
  booktitle = {Thirty-seventh Conference on Neural Information Processing Systems},
  year      = {2023},
  url       = {https://openreview.net/forum?id=2YtdxqvdjX}
}