In this paper, the proxies are defined as a _trainable_ Module . The proxies' feature vectors contribute to the loss function. Thus, they are learnable.
That being said, other approaches define the proxies as _non-trainable_ constants. In this case, the proxies are initialized to impose a certain characteristic on the proxies. For example, you want to make sure the proxies are orthogonal to each other.
Of course, you can combine the merits of both approach. Make the proxies trainable while imposing an orthogonality regularizer