Federated learning is a training framework that allows multiple clients to collaboratively train a model without sharing their data. VerFedLogistic.jl is a Julia package for solving the following vertical federated multinomial logistic regression problem:

$\min_{\theta_1, \dots, \theta_M} \enspace \frac{1}{N}\sum_{i=1}^N \ell\left(\theta_1, \dots, \theta_M; \{x_i, y_i\} \right),$

where $$N$$ is the number of data points, $$M$$ is the number of clients, $$x_i\in\mathbb{R}^d$$ is the feature vector, and $$y_i\in\mathbb{N}$$ is the label. Every feature vector $$x_i$$ is distributed across the $$M$$ clients as $$\{x^m_i\in\mathbb{R}^{d^m}: m \in [M]\}$$, where $$d^m$$ is the feature dimension at client $$m$$ and $$\sum_{m=1}^M d^m = d$$. For a wide range of models, including linear regression, logistic regression, and support vector machines, the loss function has the form
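To make the vertical partitioning concrete, here is a minimal NumPy sketch (not the VerFedLogistic.jl API — the variable names, dimensions, and the use of Python are purely illustrative). Each client holds a disjoint block of columns of the full feature matrix, and the per-client dimensions $$d^m$$ sum to $$d$$:

```python
import numpy as np

rng = np.random.default_rng(0)

N, d, M = 6, 10, 3            # data points, total features, clients (illustrative)
dims = [4, 3, 3]              # d^m for each client m; must sum to d
assert sum(dims) == d

X = rng.normal(size=(N, d))   # full feature matrix (never assembled in one place in practice)

# Vertical partition: client m holds only the columns forming its block x_i^m.
splits = np.cumsum(dims)[:-1]
X_parts = np.split(X, splits, axis=1)

for m, Xm in enumerate(X_parts):
    print(f"client {m}: local features of shape {Xm.shape}")
```

Concatenating the blocks column-wise recovers the full feature vectors, which is exactly the sense in which the data is "vertically" split.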

$\ell\left(\theta_1, \dots, \theta_M; \{x_i, y_i\} \right) := f\left( h_i; y_i\right)$

where

$$h_i = \sum_{m=1}^M h_i^m, \enspace h_i^m = \langle \theta_m, x_i^m \rangle,$$ and $$f(\cdot; y)$$ is a differentiable function for any $$y$$. For each client $$m$$, the term $$h_i^m$$ can be viewed as the client's embedding of the local data point $$x_i^m$$ under the local model $$\theta_m$$. To preserve privacy, clients are not allowed to share their local data set $$\mathcal{D}^m$$ or local model $$\theta_m$$ with other clients or with the server. Instead, clients share only their local embeddings $$\{h_i^m\mid i\in[N]\}$$ with the server for training.
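The embedding-and-aggregate step above can be sketched as follows for the multinomial logistic case, where $$f(h_i; y_i)$$ is the softmax cross-entropy loss. This is an illustrative NumPy sketch of the math, not VerFedLogistic.jl code; the shapes (each local model mapping $$\mathbb{R}^{d^m}$$ to $$K$$ class scores) and all names are assumptions for the example:

```python
import numpy as np

rng = np.random.default_rng(1)

N, M, K = 6, 3, 4                      # data points, clients, classes (illustrative)
dims = [4, 3, 3]                       # assumed per-client feature dimensions d^m
X_parts = [rng.normal(size=(N, dm)) for dm in dims]   # client m's local data x_i^m
thetas = [rng.normal(size=(dm, K)) for dm in dims]    # client m's local model theta_m
y = rng.integers(0, K, size=N)         # labels y_i, held by the server

# Each client computes its local embeddings h_i^m = <theta_m, x_i^m>
# and sends only this (N x K) array to the server -- never X_parts or thetas.
H_parts = [Xm @ theta_m for Xm, theta_m in zip(X_parts, thetas)]

# Server aggregates the embeddings: h_i = sum_m h_i^m.
H = sum(H_parts)

# f(h_i; y_i): softmax cross-entropy, computed at the server from H and y alone.
logits = H - H.max(axis=1, keepdims=True)      # shift for numerical stability
log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
loss = -log_probs[np.arange(N), y].mean()
print(f"average loss: {loss:.4f}")
```

Note that the server only ever sees the $$N \times K$$ embedding arrays and the labels, which is what makes the gradient of $$f$$ with respect to each $$h_i^m$$ computable centrally while the raw features and models stay local.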