## How to connect quantum states and PCA components

One of the greatest gifts of mathematics is its uncanny ability to generalize as much as our creativity allows. An important consequence of this generalizability is that we can use the same set of tools to formalize completely different topics. When we do this, as a side effect, some unexpected similarities emerge between these different domains. To illustrate what I mean, in this article I will try to convince you that the principal values of PCA coordinates and the energy of a quantum system are the same thing (mathematically).

For those of you unfamiliar with Principal Component Analysis (PCA), here's the bare minimum: the basic idea of PCA is to get a new set of coordinates based on your data, and then rewrite your original data in this new coordinate system so that the axes point in the directions with the highest variance.

Suppose you have a dataset of *n* samples (hereafter *individuals*), each individual having *m* features. For example, if you ask 10 people for their weight, height, and salary, then *n* = 10 and *m* = 3. In this example, we would expect there to be some relationship between weight and height, but no relationship, at least in principle, between these variables and salary. PCA can help us visualize these relationships more clearly. To understand how and why this happens, we will go through each step of the PCA algorithm.

First, to formalize things, each individual is represented as a vector **x**, where each element of the vector is a feature. Since each of the *n* vectors lives in *m*-dimensional space, the dataset can be viewed as a large *m* × *n* matrix *X*, in which we essentially arrange the individuals side by side (i.e., each individual is represented as a column vector).
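To make this concrete, here is a minimal sketch in NumPy of how such an *m* × *n* matrix could be assembled, with individuals as columns. The numbers are made up for illustration, loosely following the weight/height/salary example:

```python
import numpy as np

# Hypothetical numbers for the running example: n = 10 people (columns)
# and m = 3 features (rows): weight, height, and salary.
rng = np.random.default_rng(42)
height = rng.normal(170, 10, size=10)               # cm
weight = 0.9 * height - 85 + rng.normal(0, 5, 10)   # correlated with height
salary = rng.normal(5000, 1500, size=10)            # unrelated to the others

X = np.vstack([weight, height, salary])  # shape (m, n) = (3, 10)
print(X.shape)
```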

With this in mind, we can get a good start with the PCA algorithm.

## Center the data

Centering the data means shifting the data points so that they are distributed around the origin of the coordinate system. To do this, we calculate the mean of each feature and subtract it from the data points. The means of the features can be collected into a vector **µ** = (µ_1, µ_2, …, µ_m),

where *µ_i* is the average of the *i*-th feature. Centering the data gives a new matrix *B*, obtained by subtracting **µ** from every column of *X*.

This matrix *B* represents the dataset centered at the origin. Since we defined the mean vector as a row matrix, we need its *transpose* to calculate *B* (each individual is represented by a column matrix). But this is just a minor detail.
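In NumPy, centering amounts to a single subtraction. A minimal sketch, assuming the *m* × *n* layout described above (features as rows, individuals as columns; the numbers are invented for illustration):

```python
import numpy as np

# A toy (m, n) data matrix: 3 features, 4 individuals (columns).
X = np.array([[70.0, 80.0, 60.0, 75.0],
              [170.0, 185.0, 160.0, 178.0],
              [5000.0, 5200.0, 4800.0, 5100.0]])

mu = X.mean(axis=1, keepdims=True)  # mean of each feature, shape (m, 1)
B = X - mu                          # centered data: each row now has zero mean

print(B.mean(axis=1))               # each feature mean is now ~0
```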

## Calculate the covariance matrix

The covariance matrix *S* is obtained by multiplying *B* by its transpose and dividing by *n* − 1:

*S* = *B Bᵀ* / (*n* − 1)

The 1/(*n* − 1) factor in front is there to make the definition match the statistical one. The entry *S_ij* of this matrix is the covariance between feature *i* and feature *j*, and the diagonal entries *S_ii* are the variances of the *i*-th feature.
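As a sanity check, this definition agrees with NumPy's built-in covariance estimator, which also treats rows as variables and uses the same 1/(*n* − 1) normalization. A sketch with synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 100
X = rng.normal(size=(m, n))               # synthetic (m, n) dataset
B = X - X.mean(axis=1, keepdims=True)     # centered data

S = B @ B.T / (n - 1)                     # covariance matrix, shape (m, m)

# np.cov defaults to rows-as-variables and the 1/(n-1) normalization.
assert np.allclose(S, np.cov(X))
```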

## Find the eigenvalues and eigenvectors of a covariance matrix

Here are three important linear algebra facts about the covariance matrix *S* we have built so far (we won't prove them here):

- *S* is symmetric: the elements mirrored across the diagonal are equal (i.e., *S_ij = S_ji*);
- *S* is orthogonally diagonalizable: there exists a set of numbers (λ_1, λ_2, …, λ_m), called *eigenvalues*, and a set of vectors (**v**_1, **v**_2, …, **v**_m), called *eigenvectors*, such that, when *S* is expressed in the basis of its eigenvectors, it takes a diagonal form, with the eigenvalues as the diagonal elements;
- *S* has only real, non-negative eigenvalues.

In the PCA formalism, the eigenvectors of the covariance matrix are called principal components and the eigenvalues are called principal values.
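These three facts can be checked numerically. A minimal sketch using `np.linalg.eigh`, NumPy's eigensolver for symmetric matrices, on a small synthetic covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.normal(size=(3, 50))
B -= B.mean(axis=1, keepdims=True)
S = B @ B.T / (B.shape[1] - 1)            # covariance matrix

# eigh returns real eigenvalues (ascending) and orthonormal eigenvectors.
eigenvalues, eigenvectors = np.linalg.eigh(S)

assert np.allclose(S, S.T)                # fact 1: S is symmetric
assert np.all(eigenvalues >= -1e-12)      # fact 3: non-negative, up to round-off
# fact 2: S is diagonal when expressed in its eigenvector basis.
D = eigenvectors.T @ S @ eigenvectors
assert np.allclose(D, np.diag(eigenvalues))
```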

At first glance, this just seems like a series of mathematical operations on a set of data, but we'll wrap up our math lesson with a final linear algebra fact.

4. The trace of a matrix (i.e., the sum of the diagonal terms) is independent of the basis in which the matrix is represented.

This means that the sum of the diagonal terms of *S*, which is the total variance of the dataset, is the same in any basis; in particular, the sum of the *eigenvalues* of *S* is also the total variance of the dataset. We call this total variance *L*.

With this in mind, we can order the eigenvalues (λ_1, λ_2, …, λ_m) in descending order, λ_1 > λ_2 > … > λ_m, or equivalently λ_1/*L* > λ_2/*L* > … > λ_m/*L*, using each eigenvalue's share of the total variance as its importance measure. The first principal component **v**_1 points in the direction of maximum variance, since its eigenvalue λ_1 contributes the most to the total variance.
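Putting the ordering and the trace fact together, a sketch of the final step (sorting the spectrum and computing each component's share of the total variance):

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.normal(size=(3, 200))
B -= B.mean(axis=1, keepdims=True)
S = B @ B.T / (B.shape[1] - 1)

eigenvalues, eigenvectors = np.linalg.eigh(S)
order = np.argsort(eigenvalues)[::-1]         # descending: largest first
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]         # principal components, as columns

L_total = eigenvalues.sum()                   # total variance L
assert np.isclose(L_total, np.trace(S))       # trace = sum of eigenvalues

explained = eigenvalues / L_total             # λ_k / L, each component's share
assert np.isclose(explained.sum(), 1.0)
```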

This is the whole point of PCA. So what about quantum mechanics?

Perhaps the most important aspect of quantum mechanics for our discussion here is one of its axioms.

The state of a quantum system is represented as a vector (usually called the state vector) that lives in a vector space called Hilbert space.

As I write this, I realize this assumption feels very natural to me, because I see it every day and I'm used to it; but it is actually a bit wild, so take a moment to appreciate it. *State* is a common term in physics meaning "the configuration of something at a given point in time."

Once this axiom lets us treat physical systems as vectors, all the rules of linear algebra apply, and it is not surprising that some connection emerges between PCA (which also relies on linear algebra) and quantum mechanics.

Physics is a science interested in how physical systems *change*. In the formalism of quantum mechanics, *changing* a state vector means applying some operation to it, using (unsurprisingly) mathematical entities called operators. A particularly important class is that of linear operators. In fact, they're so important that when we talk about operators we usually drop the word "linear", because it's implicit that these are linear operators. So if you want to impress people at your bar table, just drop this bomb:

In quantum mechanics it's all about (state) vectors and (linear) operators.

## Measurement in quantum mechanics

If vectors represent physical states in the context of quantum mechanics, what do operators represent? They represent *measurements*. For example, if we want to measure the position of a quantum particle, quantum mechanics models this as applying a position operator to the state vector associated with the particle. Similarly, if we want to measure the energy of a quantum particle, we apply an energy operator. A final point connecting quantum mechanics and PCA is to remember that, given a choice of basis, linear operators can be represented as matrices.

A commonly used basis for representing quantum systems is the basis made up of eigenvectors of the energy operator. In this basis, the energy operator matrix is diagonal, and its diagonal terms are the energies for the various energy (eigen)states of the system. The sum of these energy values corresponds to the trace of the energy operator. If you think about it, a change of basis obviously cannot change this, as we saw earlier in this text. If it did, it would mean that we should be able to change the energy of the system by describing its components in a different way, which is absurd. For laboratory measurement equipment, it doesn't matter whether we use basis A or basis B to represent the system. If we measure energy, we measure energy and that's it.
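The basis independence of the trace is easy to verify numerically. A sketch with a toy "energy operator" (any real symmetric matrix stands in for it here) and a random orthogonal matrix playing the role of a change of basis:

```python
import numpy as np

rng = np.random.default_rng(3)

# A toy "energy operator": any real symmetric matrix will do here.
H = rng.normal(size=(4, 4))
H = (H + H.T) / 2

# A random orthogonal matrix Q plays the role of a change of basis.
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))
H_rotated = Q.T @ H @ Q

# The sum of the diagonal terms survives the change of basis.
assert np.isclose(np.trace(H), np.trace(H_rotated))
```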

So, a good interpretation of the principal values of a PCA decomposition is that they correspond to the “energy” of the system. If you write out the principal values (and principal components) in descending order, the “state” with the greatest “energy” of the system will take precedence.

This interpretation may be somewhat more insightful than trying to interpret statistics such as variance, and I believe we have a better intuition about energy because it is a fundamental physical concept.

"All this is obvious" was a provocative remark made by my good friend Rodrigo da Motta about the article you have just read.

When I write posts like this, I try to keep the reader in mind and explain things assuming minimal context. This exercise has led me to conclude that, with the right background, almost anything can be illuminating. Rodrigo and I are both physicists and data scientists, so the connection between quantum mechanics and PCA seems pretty obvious, *to us*.

Writing posts like this gives me more reason to believe that we should expose ourselves to all kinds of knowledge, because it fosters interesting connections. The same human brain that thinks about and produces an understanding of physics also produces an understanding of biology, history, and movies. If the possibilities of language and the connections in the brain are finite, then we will eventually reuse concepts from one field in another, whether we notice it or not, producing fundamental shared structures across domains of knowledge.

As scientists, we should take advantage of this.
