Fermionic Machine Learning

1 Overview

We introduce fermionic machine learning (FermiML), a machine learning framework based on fermionic quantum computation. FermiML models are expressed as parameterized matchgate circuits, a restricted class of quantum circuits that map exactly to systems of free Majorana fermions. The framework makes it possible to build a fermionic counterpart of any quantum machine learning (QML) model based on parameterized quantum circuits, including models that produce highly entangled quantum states. Importantly, matchgate circuits can be simulated efficiently on classical hardware, which makes FermiML a flexible framework for utility benchmarks of QML methods. Building on this foundation, we are extending FermiML beyond standard machine learning into deep learning, generative models, and optimization frameworks. A notable outcome of this work is MatchCake, an open-source Python package for simulating matchgate circuits; a publication is on the way.

2 Core Resources

3 Tutorials

4 Homework

4.1 Questions from Basic Quantum Mechanics & Linear Algebra

4.1.1 Quantum Physics

What is a quantum state? What is an observable in quantum mechanics, and what is a quantum operator?
What is an expectation value? How do you calculate it in quantum mechanics?
What is a Hamiltonian? What are the Pauli matrices?
What is the relationship between single-qubit rotation gates and the Pauli matrices?
What is a commutator, and what does it mean for two observables to commute?
What is meant by orthogonality and orthonormality?
What are the Bell states?

Exercise: Calculate the expectation value of the Pauli-\(Z\) operator for the state \(|+\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)\). What does this result tell you physically?
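As a starting point for checking the hand calculation numerically, the following minimal sketch evaluates \(\langle\psi|Z|\psi\rangle\) with NumPy. The helper name `expectation` is an illustrative choice and not part of any library. The example deliberately uses \(|0\rangle\) and \(|1\rangle\), so the \(|+\rangle\) case is left to you.

```python
import numpy as np

# Pauli-Z and the two computational basis states.
Z = np.array([[1, 0], [0, -1]], dtype=complex)
ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)


def expectation(operator, state):
    """Return <state|operator|state> for a normalized state vector.

    Hypothetical helper: the real part is taken because the expectation
    value of a Hermitian operator is always real.
    """
    # np.vdot conjugates its first argument, which gives the bra <state|.
    return np.vdot(state, operator @ state).real


print(expectation(Z, ket0))  # eigenstate of Z with eigenvalue +1
print(expectation(Z, ket1))  # eigenstate of Z with eigenvalue -1
```

Swapping in the vector for \(|+\rangle\) lets you compare the numerical value with your analytic result.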

4.1.2 Linear Algebra

What is the rank of a matrix?
What is a tensor? And by extension, what is a tensor network?
What are unitary, Hermitian, normal, symmetric/skew-symmetric, upper/lower triangular matrices?
Matrix operations: eigendecompositions, singular value decompositions, trace, determinant, inversion, Pfaffian, permanent.
What is a random matrix?
What is the Haar measure?
What are dot and vector products, inner and outer products, tensor product, Kronecker product, Hadamard product?
What is Einstein summation notation? Is it important in tensor networks?
How do you compute the gradient of a function (analytically & numerically)?
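The last question in the list above can be explored with a small experiment. The sketch below compares a hand-derived gradient with a central finite-difference estimate for a toy function. The function \(f\), the step size `eps`, and all helper names are assumptions chosen for illustration.

```python
import numpy as np


def f(x):
    """Toy function f(x) = sum_k x_k^2 + sin(x_0)."""
    return np.sum(x**2) + np.sin(x[0])


def grad_analytic(x):
    """Gradient of f derived by hand: 2x, plus cos(x_0) in the first entry."""
    g = 2.0 * x  # creates a new array, so x is not modified
    g[0] += np.cos(x[0])
    return g


def grad_numeric(func, x, eps=1e-6):
    """Central finite differences: (f(x + e_k*eps) - f(x - e_k*eps)) / (2*eps)."""
    g = np.zeros_like(x)
    for k in range(x.size):
        step = np.zeros_like(x)
        step[k] = eps
        g[k] = (func(x + step) - func(x - step)) / (2 * eps)
    return g


x0 = np.array([0.3, -1.2, 2.0])
# Largest absolute difference between the two gradients.
print(np.max(np.abs(grad_analytic(x0) - grad_numeric(f, x0))))
```

Varying `eps` over several orders of magnitude shows the trade-off between truncation error (large steps) and floating-point round-off (very small steps).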

Exercise 1: Given the matrix \(H = \begin{pmatrix} 1 & i \\ -i & 1 \end{pmatrix}\), (a) verify that it is Hermitian, (b) compute its eigendecomposition and verify that the eigenvectors are orthonormal, (c) compute its trace and determinant, and (d) verify that the matrix exponential \(U = e^{-iH}\) is unitary.
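The checks in this exercise can be scripted. The following sketch runs the same four steps on the Pauli-\(X\) matrix rather than on \(H\), so the exercise itself remains open. The helper names are hypothetical, and computing the exponential through the eigendecomposition is one possible choice among several.

```python
import numpy as np


def is_hermitian(a, tol=1e-12):
    """True if a equals its conjugate transpose."""
    return np.allclose(a, a.conj().T, atol=tol)


def is_unitary(u, tol=1e-12):
    """True if u^dagger u is the identity."""
    return np.allclose(u.conj().T @ u, np.eye(u.shape[0]), atol=tol)


def exp_minus_i(h):
    """Return exp(-i h) for a Hermitian h, using h = V diag(w) V^dagger."""
    evals, evecs = np.linalg.eigh(h)
    return evecs @ np.diag(np.exp(-1j * evals)) @ evecs.conj().T


X = np.array([[0, 1], [1, 0]], dtype=complex)
evals, evecs = np.linalg.eigh(X)

print(is_hermitian(X))                                     # (a)
print(np.allclose(evecs.conj().T @ evecs, np.eye(2)))      # (b) orthonormal eigenvectors
print(np.isclose(np.trace(X).real, evals.sum()),           # (c) trace = sum of eigenvalues
      np.isclose(np.linalg.det(X).real, evals.prod()))     #     det = product of eigenvalues
print(is_unitary(exp_minus_i(X)))                          # (d)
```

Replacing `X` with the matrix \(H\) from the exercise gives a numerical cross-check of your hand calculation.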

Exercise 2: Compute the tensor product \(|0\rangle \otimes |1\rangle\) explicitly in vector form. Then represent the two-qubit state as a rank-2 tensor \(T^{ij}\) and write the contraction \(\sum_{j} T^{ij} T^{*}_{jk}\) in Einstein summation notation. Sketch the corresponding tensor network diagram.
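A contraction of this kind maps directly onto `numpy.einsum`. The sketch below uses a Bell state instead of \(|0\rangle \otimes |1\rangle\) and assumes one particular reading of the contraction, namely summing over the second index of \(T\) and of its complex conjugate, which yields the reduced density matrix of the first qubit.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)

# Bell state (|00> + |11>)/sqrt(2) as a length-4 vector.
# np.kron(a, b)[2*i + j] = a[i] * b[j], i.e. the first qubit is the slow index.
bell = (np.kron(ket0, ket0) + np.kron(ket1, ket1)) / np.sqrt(2)

# Reshape into a rank-2 tensor: T[i, j] is the amplitude of |i>|j>.
T = bell.reshape(2, 2)

# Contraction over the shared index j (assumed convention):
# rho_a[i, k] = sum_j T[i, j] * conj(T[k, j])
rho_a = np.einsum("ij,kj->ik", T, T.conj())
print(rho_a)
```

The subscript string `"ij,kj->ik"` is the Einstein-summation form of the contraction: the repeated index `j` is summed, and the free indices `i` and `k` label the result.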

4.2 Questions from Basic Machine Learning

What is principal component analysis (PCA), and how is it used in machine learning?
What are some of the main ML paradigms and what are the differences between them?
What is the difference between regression and classification, and between binary and multiclass classification?
What are some common methods for data preprocessing with feature vectors, images and time series?
What is feature engineering, and how is it used?
What are classical kernel methods?
What is a kernel (Gram) matrix?
Explain the core principles of, and the differences between, these model families: linear, probabilistic, tree-based, and neural networks. Are there specific kinds of data for which each is particularly well suited?
What are the parameters and hyperparameters of a model?
What is a loss function?
What is gradient descent?
What are the different kinds of neural networks? How do you train a neural network? What is an optimizer? What are some current state-of-the-art options available?
What is a perceptron and a multi-layer perceptron?
What are hidden layers?
What are the common evaluation metrics for classification and regression?
What is cross-validation and why is it important?
What are training, test and cross-validation sets?
What is the difference between underfitting and overfitting?
What is generalization in machine learning?
What is parameter regularization?
What does “inductive bias” mean?
Explain the common methods to optimize the hyperparameters of a model.
What is a transformer?
What is the attention mechanism used in LLMs?
What is fine-tuning of an LLM?
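To make the gradient descent question in the list above concrete, here is a minimal sketch of the update rule \(\theta \leftarrow \theta - \eta\, \partial_\theta \mathcal{L}\) on a one-dimensional toy loss. The loss \((\theta - 3)^2\), the starting point, and the learning rate are all illustrative assumptions.

```python
def loss(theta):
    """Toy loss with its minimum at theta = 3."""
    return (theta - 3.0) ** 2


def grad(theta):
    """Analytic derivative of the toy loss."""
    return 2.0 * (theta - 3.0)


def gradient_descent(theta0, learning_rate, n_steps):
    """Run plain gradient descent and record the loss after every step."""
    theta = theta0
    history = [loss(theta)]
    for _ in range(n_steps):
        theta = theta - learning_rate * grad(theta)
        history.append(loss(theta))
    return theta, history


theta_final, history = gradient_descent(theta0=0.0, learning_rate=0.1, n_steps=100)
print(theta_final, history[-1])
```

For this quadratic loss, each step multiplies the distance to the minimum by \(1 - 2\eta\), so the iteration converges only when \(0 < \eta < 1\). The same loop structure carries over to the regression exercise once the scalar gradient is replaced by the gradient of the MSE.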

Exercise 1: Implement gradient descent from scratch in Python to minimize the mean squared error (MSE) loss \(\mathcal{L} = \frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2\) on a simple linear regression problem. (a) Generate a toy dataset of 100 points from \(y = 3x + 2 + \varepsilon\) where \(\varepsilon\) is Gaussian noise. (b) Train your model using gradient descent and plot the loss curve. (c) Experiment with at least three different learning rates \(\eta \in \{0.1, 0.01, 0.001\}\) and comment on the effect.

Exercise 2: Using NumPy only, code from scratch a two-layer multi-layer perceptron (MLP) trained with gradient descent to classify the Iris dataset. Then do the same using PyTorch with the Adam and AdamW optimizers.

Exercise 3: Using PyTorch, build a multi-layer perceptron (MLP) to classify the Wine dataset. (a) Define a network with at least 2 hidden layers and ReLU activations \(\sigma(x) = \max(0, x)\). (b) Train it using both SGD and Adam optimizers and compare their convergence curves. (c) Report accuracy on the test set and plot a confusion matrix.

4.3 Questions from Quantum Simulation, Quantum Computation & Quantum Machine Learning

What is a quantum gate, and how does it differ from a classical logic gate?
What is a random unitary circuit?
What is a qubit, and how does it differ from a classical bit?
What is quantum superposition, and why is it computationally useful?
What is a parameterized quantum circuit (PQC), and how is it used in quantum algorithms?
What is quantum entanglement, and how is it generated in a quantum circuit?
What is a unitary matrix, and why must quantum gates be unitary?
What is quantum measurement, and how does it affect the state of a quantum system?
What is the difference between a pure state and a mixed state?
What is a density matrix, and when is it used instead of a state-vector?
What is meant by strong and weak simulation?
How are results of computation actually measured on a quantum computer?
What is quantum machine learning?
What are quantum kernel methods?
How can a quantum circuit be used as a kernel model?
What are variational quantum algorithms? What is meant by variational training in “variational quantum algorithms”?
How can a quantum circuit be variationally trained?
What is the barren plateau problem in variational quantum algorithms, and what is exponential concentration in quantum kernel methods?
What is quantum data encoding, and what are the main strategies for embedding classical data into a quantum state?
What is quantum data?
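One way to approach the question of how a quantum circuit can be variationally trained is the parameter-shift rule, which obtains exact gradients from two extra circuit evaluations. The sketch below is a NumPy state-vector simulation of a single-qubit circuit \(R_Y(\theta)|0\rangle\) measured in \(Z\). The function names are hypothetical, and the shift of \(\pi/2\) with prefactor \(1/2\) is the standard form for gates generated by a Pauli operator.

```python
import numpy as np

Z = np.array([[1, 0], [0, -1]], dtype=complex)
ket0 = np.array([1, 0], dtype=complex)


def ry(theta):
    """Single-qubit rotation R_Y(theta) = exp(-i * theta * Y / 2)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)


def cost(theta):
    """Expectation value <Z> in the state R_Y(theta)|0>, equal to cos(theta)."""
    state = ry(theta) @ ket0
    return np.vdot(state, Z @ state).real


def parameter_shift_grad(theta):
    """Gradient of cost from two shifted evaluations of the same circuit."""
    return 0.5 * (cost(theta + np.pi / 2) - cost(theta - np.pi / 2))


theta = 0.7
print(cost(theta), np.cos(theta))                    # circuit value vs closed form
print(parameter_shift_grad(theta), -np.sin(theta))   # shift-rule gradient vs closed form
```

Feeding `parameter_shift_grad` into a gradient descent loop gives a complete, if tiny, variational training procedure.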

Exercise 1: Using PennyLane (preferably) or Qiskit, implement a simple variational quantum classifier on a binary classification toy dataset. (a) Encode the input data using angle encoding, mapping each feature \(x_i\) to a rotation gate \(R_X(x_i)\) on qubit \(i\). (b) Define a parameterized quantum circuit ansatz with at least 2 layers of parameterized rotations \(R_Y(\theta_i)\) and entangling \(\text{CNOT}\) gates. (c) Train the circuit variationally by minimizing the binary cross-entropy loss \[\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)\right]\] using gradient descent. (d) Plot the loss curve and report the classification accuracy on a test set.

Exercise 2: Implement a quantum kernel classifier using PennyLane (preferably) or Qiskit. (a) Define a feature map circuit \(U(x)\) and compute the quantum kernel (Gram) matrix \(K_{ij} = |\langle 0 | U^\dagger(x_i) U(x_j) | 0 \rangle|^2\) for a small dataset (e.g. the Iris dataset). (b) Plug the resulting kernel matrix \(K\) into a classical support vector machine (SVM) and report the classification accuracy. (c) Now consider a PQC ansatz with \(n \in \{2, 4, 6, 8\}\) qubits and compute the variance of the gradient \(\text{Var}\left[\partial_\theta \mathcal{L}\right]\) as a function of \(n\). What do you observe?
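Before moving to PennyLane or Qiskit, it can help to see the kernel definition in part (a) evaluated with plain state vectors. The sketch below assumes a very simple feature map, one \(R_X(x_k)\) rotation per qubit with no entangling gates, and random toy data. Both are illustrative choices, not the feature map or dataset the exercise asks for.

```python
import numpy as np


def rx(angle):
    """Single-qubit rotation R_X(angle) = exp(-i * angle * X / 2)."""
    c, s = np.cos(angle / 2), np.sin(angle / 2)
    return np.array([[c, -1j * s], [-1j * s, c]], dtype=complex)


def feature_state(x):
    """Return U(x)|0...0> for the assumed product feature map of R_X(x_k)."""
    ket0 = np.array([1, 0], dtype=complex)
    state = np.array([1.0 + 0j])
    for angle in x:
        # Tensor one rotated qubit onto the register at a time.
        state = np.kron(state, rx(angle) @ ket0)
    return state


def kernel_matrix(data):
    """K[i, j] = |<0|U^dagger(x_i) U(x_j)|0>|^2 from pairwise state overlaps."""
    states = np.array([feature_state(x) for x in data])
    overlaps = states.conj() @ states.T
    return np.abs(overlaps) ** 2


rng = np.random.default_rng(0)
data = rng.uniform(0, np.pi, size=(5, 2))  # 5 toy samples, 2 features each
K = kernel_matrix(data)
print(np.round(K, 3))
```

A matrix computed this way can be passed to a classical SVM that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. For this product feature map the kernel reduces to \(\prod_k \cos^2\big((x_{i,k} - x_{j,k})/2\big)\), which is a useful sanity check.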

4.4 Basics of Matchgates

What are matchgate circuits, and why are they efficiently simulable classically compared to general quantum circuits?
What is the connection between matchgate circuits and free Majorana fermions? How does this mapping motivate the FermiML framework?
What is a Majorana fermion? What is the relation between Majorana fermions and fermionic creation and annihilation operators?
What is the Jordan-Wigner transformation, and how does it map fermionic operators to qubit operators?
What is a free-fermionic system?
What is the Gaussian state formalism?
What is a covariance matrix in the context of fermionic Gaussian states, and how is it used to compute expectation values?
What is the Pfaffian of a matrix, and where does it appear in the computation of matchgate circuit outputs?
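Two of the objects in the questions above are easy to experiment with numerically. The sketch below builds Majorana operators on two qubits through the Jordan-Wigner transformation and checks the anticommutation relation \(\{c_j, c_k\} = 2\delta_{jk}\). It then checks the identity \(\mathrm{Pf}(A)^2 = \det(A)\) on a random \(4 \times 4\) skew-symmetric matrix. The ordering convention \(c_{2k} = Z^{\otimes k} X_k\), \(c_{2k+1} = Z^{\otimes k} Y_k\) and all helper names are assumptions, and this is not how MatchCake implements these objects.

```python
from functools import reduce

import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def kron_all(ops):
    """Tensor product of a list of single-qubit operators."""
    return reduce(np.kron, ops)


def majoranas(n_qubits):
    """2n Majorana operators under Jordan-Wigner (assumed ordering):
    c_{2k} = Z...Z X_k I...I and c_{2k+1} = Z...Z Y_k I...I."""
    ops = []
    for k in range(n_qubits):
        prefix = [Z] * k
        suffix = [I2] * (n_qubits - k - 1)
        ops.append(kron_all(prefix + [X] + suffix))
        ops.append(kron_all(prefix + [Y] + suffix))
    return ops


def pfaffian_4x4(a):
    """Closed-form Pfaffian of a 4x4 skew-symmetric matrix."""
    return a[0, 1] * a[2, 3] - a[0, 2] * a[1, 3] + a[0, 3] * a[1, 2]


c = majoranas(2)
anticommute = all(
    np.allclose(c[j] @ c[k] + c[k] @ c[j], 2 * (j == k) * np.eye(4))
    for j in range(4)
    for k in range(4)
)
print(anticommute)

rng = np.random.default_rng(1)
raw = rng.normal(size=(4, 4))
A = raw - raw.T  # random real skew-symmetric matrix
print(np.isclose(pfaffian_4x4(A) ** 2, np.linalg.det(A)))
```

The closed-form expression only covers the \(4 \times 4\) case. Larger matrices need a general Pfaffian algorithm, which is part of what the last question asks you to look into.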

4.5 Additional References

4.6 Basics of Quantum Mechanics, Quantum Computing, Quantum Information and Quantum Algorithms

[Sakurai Quantum Mechanics Book] https://icourse.club/uploads/files/65fb4c16648d2b93f82fe3271c3381c211f2f532.pdf
[Quantum Mechanics and Quantum Computing: An Introduction] https://www.macs.hw.ac.uk/~des/qcnotesaims17.pdf
[An Introduction to Quantum Computing for Non-Physicists] https://arxiv.org/abs/quant-ph/9809016
[Nielsen and Chuang] https://profmcruz.wordpress.com/wp-content/uploads/2017/08/quantum-computation-and-quantum-information-nielsen-chuang.pdf
[Quantum Computing Since Democritus] https://s3.amazonaws.com/arena-attachments/958521/7c581f75f258e9c36788c60cf45f3961.pdf?1491247031
[David Mermin] http://www-f1.ijs.si/~ramsak/Nanofizika/QCS/books_3092_0.pdf
[Thomas Wong, Introduction to Classical and Quantum Computing (beginner friendly)] https://www.thomaswong.net/introduction-to-classical-and-quantum-computing-1e4p.pdf

4.7 Quantum Machine Learning

[Supervised Learning with Quantum Computers (book available in print at the lab)] https://ndl.ethernet.edu.et/bitstream/123456789/73371/1/320.pdf
[An introduction to quantum machine learning] https://arxiv.org/abs/1409.3097
[Quantum Machine Learning: What Quantum Computing Means to Data Mining] https://www.researchgate.net/profile/Peter-Wittek/publication/264825604_Quantum_Machine_Learning_What_Quantum_Computing_Means_to_Data_Mining/links/562aacc208ae04c2aeb1cb64/Quantum-Machine-Learning-What-Quantum-Computing-Means-to-Data-Mining.pdf
[Quantum Machine Learning: A Review] https://mlazarovits.github.io/LearningMachineLearning/assets/papers_summer19/quantum_ML.pdf

4.8 Optional References for Fermionic Systems, Matchgates and Fermionic Quantum Computation

[Jozsa and Miyake: Matchgate Basics] https://arxiv.org/abs/0804.4050
[Bravyi and Kitaev: Fermionic Quantum Computation] https://arxiv.org/pdf/quant-ph/0003137
[Majorana-Based Fermionic Quantum Computation] https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.120.220504
[Jens Eisert, Basics of Second Quantization] https://www.physik.fu-berlin.de/en/einrichtungen/ag/ag-eisert/teaching/AdvancedQuantumMechanicsChapter32.pdf
[Fermionic Systems for Quantum Information People] https://arxiv.org/abs/2006.03087
[Fermionic Systems and Algebra] https://web2.ph.utexas.edu/~vadim/Classes/2022f/ffs.pdf
[Quantum Simulation of Fermionic Systems] https://indico.global/event/8922/contributions/85116/attachments/39584/73648/Quantum_Simulation_of_Fermion_Systems.pdf