Tight informationally complete quantum measurements

In [101]:
import numpy as np
import scipy as sc
import scipy.linalg
np.set_printoptions(precision=6, suppress=True)

Spherical t-designs

We begin with "spherical t-designs." They're useful for numerical integration. Basically instead of integrating a polynomial over some space, you can instead take a finite sum of the values of the polynomial at some specially chosen points: and you get the same answer!

So: a spherical t-design is a set of $n$ normalized vectors such that the average value of any $t$-th order (homogeneous) polynomial over the set is equal to the average of over all normalized vectors.

In other words, working with a unit 2-sphere, and a $t$-th order polynomial $f_{t}(\psi)$, and $n$ points $\{ |\psi_{i}\rangle \}$:

$$ \frac{1}{4\pi}\int_{S^2} p(\psi) d\psi = \frac{1}{n}\sum_{i=0}^{n} p(\psi_{i}) $$

As we'll see, a set of $t$-design vectors have the property that they minimize the $t$-th order frame potential:

$$ \sum_{i,j} |\langle \psi_{i} | \psi_{j} \rangle|^{2t} $$

So let's look for one! We'll use jax for some just-in-time complication and for taking derivatives, and scipy.optimize for the constrained optimization.

In [102]:
import jax
import jax.numpy as jp

def spherical_design(d, n, t):
    @jax.jit
    def frame_potential(V):
        R = V.reshape(d, n)
        return sum([jp.abs(jp.dot(a, b))**(2*t) for b in R.T for a in R.T])
    frame_potential_jac = jax.jit(jax.jacrev(frame_potential))
    @jax.jit
    def norm_constraint(V):
        R = V.reshape(d,n)
        return jp.linalg.norm(jp.linalg.norm(R, axis=0) - jp.ones(n))**2
    norm_jac = jax.jit(jax.jacrev(norm_constraint))
    result = sc.optimize.minimize(frame_potential,\
                                  np.random.randn(d*n),\
                                  jac=frame_potential_jac,\
                                  constraints=[{"type": "eq",\
                                                "fun": norm_constraint,\
                                                "jac": norm_jac}],\
                                  tol=1e-14,\
                                  options={"ftol": 1e-14,\
                                           "disp": True,\
                                           "maxiter": 5000},\
                                  method="SLSQP")
    return result.x.reshape(d, n)

First, we'll look for a spherical 2-design on the 2-sphere in $\mathbb{R}^3$ with six elements.

In [103]:
d, n, t = 3, 6, 2
R = spherical_design(d, n, t); R
Optimization terminated successfully.    (Exit mode 0)
            Current function value: 7.2000274658203125
            Iterations: 178
            Function evaluations: 437
            Gradient evaluations: 178
Out[103]:
array([[ 0.654425, -0.138834, -0.029355, -0.804426, -0.871627, -0.378888],
       [ 0.251973, -0.97831 ,  0.589923,  0.586375, -0.381487,  0.378301],
       [-0.712908,  0.153738,  0.806925,  0.095198, -0.307788, -0.84459 ]])

Now let's make up some random 2nd order homogeneous polynomial:

In [104]:
poly = lambda xyz: xyz[0]**2 + xyz[0]*xyz[1] + 6*xyz[2]**2 + 5*xyz[2]*xyz[0] + xyz[1]**2

We'll use quadpy for the numerical integration. (Ironically, of course, under the hood, they basically do the same thing!)

In [105]:
import quadpy
scheme = quadpy.u3.get_good_scheme(19)

We have $\frac{1}{n}\sum_{i=0}^{n} p(\psi_{i})$:

In [106]:
(1/n)*sum([poly(r) for r in R.T])
Out[106]:
2.6670280946135927

And $\frac{1}{4\pi}\int_{S^2} p(\psi) d\psi$:

In [107]:
(1/(4*np.pi))*scheme.integrate(poly, np.zeros(3), 1)
Out[107]:
2.6666666666666665

Pretty darn close! Let's look at the magical points on the sphere:

In [108]:
import vpython as vp
scene = vp.canvas(background=vp.color.white)
vp.sphere(color=vp.color.blue, opacity=0.3)
points = [vp.sphere(radius=0.2, emissive=True, pos=vp.vector(*r)) for r in R.T]

Frames and Spherical $t$-designs

(The following exposition is adapted from Symmetric Informationally Complete Quantum Measurements.)

As we now know, a collection of $n$ vectors $\{| \psi_{i}\rangle \}$ forms a frame if for all vectors $|v\rangle$:

$$ a|\langle v | v \rangle|^2 \leq \sum_{i}|\langle v | \psi_{i} \rangle |^2 \leq b|\langle v | v \rangle|^2 $$

Where $a$ and $b$ are the "frame bounds." If $a=b$, the frame is a tight frame.

We can define the frame operator:

$$ S = \sum_{i} | \psi_{i} \rangle \langle \psi_{i}| $$

For a tight frame $S = aI$: in other words, $S$ is proportional to the identity.

One implication: a tight frame corresponds to a rank-1 POVM, whose elements are $\frac{1}{a}|\psi_{i}\rangle\langle\psi_{i}|$. We'll sometimes write $\frac{1}{a}\Pi_{i}$.


For a set of normalized vectors, we can define the frame potential:

$$ Tr[S^2] = \sum_{i,j} |\langle \psi_{i}|\psi_{j}\rangle|^2 $$

Given a dimension $d$ and $n$ frame vectors:

$$ Tr[S^2] \ge max(n, \frac{n^2}{d})$$

The bound is achieved if and only if either a) $ n \leq d$ and the vectors are all orthonormal or b) $ n \ge d $, and the vectors form a tight frame.

The proof: First, the the number of nonzero eigenvalues of $S$ is at most $q = min(n,d)$. So we have:

$$ Tr[S] = n = \sum_{k}^{q} \lambda_{k} $$

and

$$ Tr[S^2] = \sum_{k}^{q} \lambda_{k}^2 $$

By Cauchy-Schwartz, we have:

$$ \sum_{k}^{q} \lambda_{k}^2 \ge \frac{1}{q}(\sum_{k}^{q} \lambda_{k})^2$$$$ Tr[S^2] \ge \frac{n^2}{min(n,d)}$$$$ Tr[S^2] \ge max(n, \frac{n^2}{d})$$

Equality holds if and only if $\lambda_{k} = \frac{n}{q}$. If $n \leq d$, we have $\lambda_{k} = 1$. Thus $S$ is a projector onto an $n$-dimensional subspace, which implies that the vectors are orthogonal. If $n \ge d$, we have $\lambda_{k} = \frac{n}{d}$. Thus $S=\frac{n}{d}I$, which implies that the vectors form a tight frame.

So tight frames minimize $\sum_{i,j} |\langle \psi_{i}|\psi_{j}\rangle|^2 $. That also means that the vectors form a 1-design!


A symmetric informationally complete POVM, or SIC-POVM, corresponds to a set of $d^2$ normalized vectors $|\psi_{i}\rangle$ in $\mathbb{C}^d$ satisfying:

$$ |\langle \psi_{i} | \psi_{j} \rangle|^2 = \frac{1}{d+1}, i \neq j$$

In other words, they form an equiangular frame.

The POVM elements are $\frac{1}{d}|\psi_{i} \rangle \langle \psi_{i}| = \frac{1}{d}\Pi_{i}$, whose pairwise Hilbert-Schmidt inner product is:

$$ \frac{1}{d^2} Tr[\Pi_{i}^{\dagger}\Pi_{j}] = \frac{1}{d^2(d+1)}, i \neq j$$

So:

$$Tr[S^2] = \sum_{i,j} |\langle \psi_{i}|\psi_{j}\rangle|^2 = (n^2-n)\frac{1}{d+1} + n = \frac{d^2(d^2 -1)}{d+1} + d^2 = d^2(d-1) + d^2 = d^3 = \frac{n^2}{d}$$

Thus a SIC is a tight frame, and thus, a POVM.

Moreover, a SIC is informationally complete: the $\Pi_{i}$'s are linearly independent, and span the space of operators.

This follows from the fact that each row of its Gram matrix is a cyclic shift of the previous row. Thus it's a circulant matrix and its eigenvalues are given by the Fourier transform of one of the rows. It turns out in fact that the eigenvalues are exactly the same as the values in any row. Therefore, no eigenvalues are zero, the Gram is full rank, the $\Pi_{i}$'s are linearly independent, and the POVM is informationally complete.


As we discussed a moment ago, a spherical $t$-design is a set of $n$ normalized vectors such that the average value of any $t$-th order polynomial over the set is equal to the average of the polynomial over all normalized vectors.

The concept originally applied to the real Euclidian 2-sphere, but we'll extend the idea to normalized vectors in complex Hilbert space.

Let $\mathcal{H} = \mathbb{C}^{d}$, and $\mathcal{H}_{t} = \mathcal{H}^{\otimes t}$ be the $t$-fold tensor product. Furthermore, let $S_{t}$ be the symmetric subspace of $\mathcal{H}_{t}$, and let $|\Psi^{t}\rangle = |\psi\rangle^{\otimes t}$.

We can define a t-order polynomial function in terms of a symmetric operator $F_{t}$:

$$ f_{t}(\psi) = \langle \Psi^{t} | F_{t} | \Psi^{t} \rangle $$

We can decompose $F_{t}$ as a sum of product operators: $F_{t} = \sum_{k}\otimes_{j=1}^{t} A_{j;k}$, and so any such function can be decomposed into monomials like:

$$ \langle \Psi^{t} | \otimes_{j=1}^{t} A_{j} | \Psi_{t} \rangle = \prod_{j=1}^{t} \langle \psi | A_{j} |\psi \rangle$$

So let's just consider the monomials, without loss of generality. We can rewrite the above as:

$$ f_{t}(\psi) = \prod_{j=1}^{t} Tr[A_{j}|\psi\rangle\langle\psi |]\$$

If $\Pi_{\psi} = |\psi\rangle\langle \psi|$, we can further rewrite this as:

$$ f_{t}(\psi) = Tr[(\otimes_{j=1}^{t} A_{j})\Pi_{\psi}^{\otimes t}] $$

As we've said, the defining property of a $t$-design is that the average of $f_{t}$ over the design vectors $\{ | \psi_{i} \rangle \}$ is equal to the average over all normalized vectors $| \psi \rangle$. So, the average of $f_{t}$ over all vectors is:

$$\langle f_{t} \rangle = \int d\psi Tr[(\otimes_{j=1}^{t} A_{j})\Pi_{\psi}^{\otimes t}] = Tr[(\otimes_{j=1}^{t} A_{j}) \int d\psi \Pi_{\psi}^{\otimes t}] $$

This should be equivalent to:

$$\langle f_{t} \rangle = Tr[(\otimes_{j=1}^{t} A_{j}) K_{t}]$$

For some $K_{t}$ relating to the design vectors.

A spherical $t$-design is then a set of vectors $\{ | \psi_{i} \rangle \}$ for which:

$$ S_{t} = \sum_{i}^{n} | \Psi_{i}^{t} \rangle \langle \Psi_{i}^{t}| = nK_{t} $$

Where $ | \Psi_{i}^{t} \rangle = | \psi_{i} \rangle^{\otimes t}$, and $S_{t}$ is the $t$-fold tensor product analogue of the frame operator $S$.

$K_{t}$ only has support on the symmetric subpace. Further, since $K_{t}$ has to be invariant under any $U^{\otimes t}$ for $U \in SU(d)$, it has to be proportional to a projector $\Pi_{sym}^{(t)}$ onto the symmetric subspace. (This follows from Schur's lemma.) If we consider the average of $f_{t}(\psi) = 1$, then we get $\langle f_{t} \rangle = Tr[K_{t}] = 1$. Since the symmetric subspace has dimensions $\binom{t+d-1}{d-1}$, we have:

$$ K_{t} = \binom{t+d-1}{d-1}^{-1}\Pi_{sym} = \frac{t!(d-1)!}{(t+d-1)!}\Pi_{sym}^{(t)} $$

.

Putting it all together, a set of normalized vectors $\{ | \psi_{i} \rangle \}$ with $n \ge \binom{t+d-1}{d-1}$ forms a spherical t-design if and only if it minimizes the $t$-th order frame potential: indeed, that:

$$ Tr[S_{t}^2] = \sum_{i,j} |\langle \psi_{i} | \psi_{j} \rangle|^{2t} = \frac{n^2t!(d-1)!}{(t+d-1)!}$$

For the case of a 2-design with $d^2$ elements:

$$ Tr[S_{2}^2] = \frac{2 d^4 (d-1)!}{(d+1)!} = \frac{2d^4}{d(d+1)} = \frac{2d^3}{d+1}$$

Indeed, for a SIC-POVM, we have:

$$ Tr[S_{2}^2] = \sum_{i,j} |\langle \psi_{i} | \psi_{j} \rangle|^{4} = \frac{n^2-n}{(d+1)^2} + n = \frac{d^2(d+1)(d-1)}{(d+1)^2} + d^2 = d^2( \frac{d-1}{d+1} +1) = d^2( \frac{d-1 + d+1}{d+1}) = \frac{2d^3}{d+1}$$

Thus a SIC-POVM is a 2-design. A further argument, which we omit here, proves the converse, that every 2-design with $d^2$ elements is a SIC-POVM.


Complex Projective Designs

(The following exposition is adapted from Tight informationally complete quantum measurements.)

It might be better think of our construction from above as relating to "complex projective $t$-designs".

Recall that complex projective space $\mathbb{C}P^{d-1}$ is the space of lines passing through the origin in $\mathbb{C}^d$. Each point $x \in \mathbb{C}P^{d-1}$ can be represented as a unit vector $\mid \psi \rangle \in \mathbb{C}^d$ modulo phase, or as a rank-1 projector $\Pi_\psi \equiv | \psi \rangle \langle \psi |$.

One way of defining a complex projective t-design:

$$\frac{1}{n^2} \sum_{i,j} f_{t}(|\langle \psi_{i} | \psi_{j} \rangle|^{2t}) = \int \int_{\mathbb{C}P^{d-1}} d\mu(\psi)d\mu(\phi) f_{t}(|\langle \psi | \phi \rangle|^2) $$

Where $f_{t}$ is a real polynomial of degree $t$ or less, and $\mu$ is the unique unitarily invariant probability measure on $\mathbb{C}P^{d-1}$ induced by the Haar measure on $U(d)$.

Alternatively:

$$ \int_{\mathbb{C}P^{d-1}} d\mu(\psi) \Pi_{\psi}^{\otimes t} = \binom{t+d-1}{d-1}^{-1}\Pi_{sym}^{(t)}$$

Where $\Pi_{sym}^{(t)}$ is the projector onto the totally symmetric subspace of $(\mathbb{C}^d)^{\otimes t}$. As I said above, this follows from the fact that the left hand side is invariant under all unitaries $U^{\otimes t}$ that act irriducibly on the totally symmetric subspace.

Noting that $|\langle \psi | \phi \rangle|^{2t} = Tr[\Pi_{\psi}^{\otimes t}\Pi_{\phi}^{\otimes t}]$, we can define a complex projective t-design via:

$$ \frac{1}{n} \sum_{i} (|\psi_{i} \rangle \langle \psi_{i}|)^{\otimes t} = \binom{t+d-1}{d-1}^{-1}\Pi_{sym}^{(t)}$$

It's worth noting that $t$-designs for $\mathbb{C}P^{d-1}$ exist for any $t$ and $d$, but:

$$ n \ge \binom{d + \lceil\frac{t}{2}\rceil -1 }{ \lceil \frac{t}{2} \rceil }\binom{d + \lfloor \frac{t}{2} \rfloor -1}{ \lfloor \frac{t}{2} \rfloor}$$

"Tight designs" achieve this bound.

Here's some nice facts:

Tight $t$-designs in $\mathbb{C}P^{1}$ are equivalent to tight spherical $t$-designs on the Euclidian 2-sphere. Such designs exist for $t=1,2,3,5$.

When $d \ge 3$, tight t-designs for $\mathbb{C}P^{d-1}$ only exist for $t=1,2,3$.

Tight 1-designs exist in all dimensions. Tight 2-designs have been conjectured to exist for all $d$: this is the famous SIC existence queation.

We could also have a weighted $t$-design, such that given weights $w_{i}$, we have:

$$ \sum_{i} w_{i} (|\psi_{i} \rangle \langle \psi_{i}|)^{\otimes t} = \binom{t+d-1}{d-1}^{-1}\Pi_{sym}^{(t)}$$

The unweighted case implictly had $w_{i} = \frac{1}{n}$.

It's also clear that a (weighted) t-design is also a weighted (t-1)-design: if we trace out a subsystem, we'll still get a projector onto the symmetric subspace.


Okay, so let's find some!

In [109]:
import jax
import jax.numpy as jp

@jax.jit
def real_to_complex(z):    
    return z[:len(z)//2] + 1j*z[len(z)//2:]

@jax.jit
def complex_to_real(z):
    return jp.concatenate((jp.real(z), jp.imag(z)))

def complex_projective_design(d, n, t):
    @jax.jit
    def frame_potential(V):
        R = real_to_complex(V).reshape(d, n)
        return jp.sum(jp.array([jp.abs(jp.dot(jp.conjugate(a), b))**(2*t) for b in R.T for a in R.T]))
    frame_potential_jac = jax.jit(jax.jacrev(frame_potential))
    @jax.jit
    def norm_constraint(V):
        R = real_to_complex(V).reshape(d, n)
        return jp.linalg.norm(jp.linalg.norm(R, axis=0) - jp.ones(n))**2
    norm_jac = jax.jit(jax.jacrev(norm_constraint))
    result = sc.optimize.minimize(frame_potential,\
                                  complex_to_real(np.random.randn(d*n)+1j*np.random.randn(d*n)),\
                                  jac=frame_potential_jac,\
                                  constraints=[{"type": "eq",\
                                                "fun": norm_constraint,\
                                                "jac": norm_jac},
                                              ],\
                                   tol=1e-14,\
                                   options={"ftol": 1e-14,\
                                            "disp": True,\
                                            "maxiter": 5000},\
                                   method="SLSQP")
    return real_to_complex(result.x).reshape(d, n)
In [110]:
d = 2
n = d**2
SIC = complex_projective_design(d, n, 2)
Optimization terminated successfully.    (Exit mode 0)
            Current function value: 5.3333353996276855
            Iterations: 247
            Function evaluations: 1842
            Gradient evaluations: 247

We should minimize the $t=2$ frame potential with $\frac{2d^3}{d+1}$.

In [111]:
2*(d**3)/(d+1)
Out[111]:
5.333333333333333

Now let's check this business with the projector onto the symmetric subspace. We should have:

$$ \binom{t+d-1}{d-1}^{-1} = \binom{d+1}{d-1}^{-1} = \frac{(d-1)!(d+1 -d +1)}{(d+1)!} = \frac{2}{d(d+1)}$$$$ \frac{1}{d^2} \sum_{i} \Pi_{i} \otimes \Pi_{i} = \frac{2}{d(d+1)}\Pi_{sym}^{(2)}$$
In [112]:
from itertools import permutations, product
import functools

def basis(d, n):
    v = np.zeros(d)
    v[n] = 1
    return v

def kron(elements):
    return functools.reduce(lambda x, y: np.kron(x, y), elements)

def flatten(to_flatten):
    return [item for sublist in to_flatten for item in sublist]

def symmetrize(pieces):
    perms = list(permutations(range(len(pieces))))
    vec =  sum([kron([pieces[p] for p in perm]) for perm in perms])
    return vec/np.linalg.norm(vec)

def symmetric_projector(d, n):
    d = [basis(d, i) for i in range(d)]
    labels = list(filter(lambda b: sum(b) == n, list(product(range(n+1), repeat=len(d)))[::-1]))
    sym_basis = [symmetrize(flatten([[d[i]]*b for i, b in enumerate(label)])) for label in labels]
    return sum([np.outer(s, s.conj()) for s in sym_basis])
In [113]:
np.linalg.eig(symmetric_projector(2,2))
Out[113]:
(array([1., 0., 1., 1.]),
 array([[ 0.      ,  0.      ,  1.      ,  0.      ],
        [ 0.707107,  0.707107,  0.      ,  0.      ],
        [ 0.707107, -0.707107,  0.      ,  0.      ],
        [ 0.      ,  0.      ,  0.      ,  1.      ]]))
In [114]:
(2/(d*(d+1)))*symmetric_projector(2,2)
Out[114]:
array([[0.333333, 0.      , 0.      , 0.      ],
       [0.      , 0.166667, 0.166667, 0.      ],
       [0.      , 0.166667, 0.166667, 0.      ],
       [0.      , 0.      , 0.      , 0.333333]])
In [115]:
np.round((1/d**2)*sum([np.kron(np.outer(r, r.conj()), np.outer(r, r.conj())) for r in SIC.T]), decimals=3)
Out[115]:
array([[ 0.333+0.j,  0.   +0.j,  0.   +0.j, -0.   +0.j],
       [ 0.   -0.j,  0.167+0.j,  0.167+0.j, -0.   +0.j],
       [ 0.   -0.j,  0.167+0.j,  0.167+0.j, -0.   +0.j],
       [-0.   -0.j, -0.   -0.j, -0.   -0.j,  0.333+0.j]], dtype=complex64)

Operator Frames

Denoting the vectorized version of an operator $A$ by $|A)$, we have the Hilbert-Schmidt inner product $(A|B) = tr[A^{\dagger}B]$.

In what follows, we'll be considering "superoperators" aka linear maps on operators.

Suppose we fix an orthonormal operator basis such that $(H_{i}|H_{j}) = \delta_{ij}$. For $\mathbb{C}^{d}$, this will have $d^2$ elements.

We can then define a superoperator in two different ways:

$$ \mathcal{S} = \sum_{i,j} s_{ij}H_{i} \odot H_{j}^{\dagger} = \sum_{i,j} s_{ij}|H_{i})(H_{j}|, s_{ij} \in \mathbb{C} $$

The first representation gives the "ordinary action" of the superoperator on an operator $A$:

$$ \mathcal{S}(A) = \sum_{i,j} s_{ij}H_{i} A H_{j}^{\dagger} $$

The second representation gives the "left-right action" of the superoperator:

$$ \mathcal{S}|A) = \sum_{i,j} s_{ij}|H_{i})(H_{j}|A) = \sum_{i,j}s_{ij}H_{i}tr[H_{j}^{\dagger}A] $$

In what follows, we'll make use of the identity superoperators:

$$\mathcal{I} = I_{d} \odot I_{d} = |I_{d})(I_{d}|$$
In [116]:
np.outer(np.eye(d).reshape(d**2), np.eye(d).reshape(d**2))
Out[116]:
array([[1., 0., 0., 1.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [1., 0., 0., 1.]])

and

$$\textbf{I} = \sum_{i} |H_{i})(H_{i}| = I_{n}$$
In [117]:
def operator_basis(d):
    return [np.outer(basis(d, i), basis(d, j)) for j in range(d) for i in range(d)]

H = operator_basis(d)
sum([np.outer(h.reshape(d**2), h.reshape(d**2).conj()) for h in H])
Out[117]:
array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

Incidentally, let's check that we have $(H_{i}|H_{j}) = \delta_{ij}$:

In [118]:
np.array([[(H[i].conj().T @ H[j]).trace() for j in range(d**2)] for i in range(d**2)])
Out[118]:
array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

We can extend the notion of a frame to that of an operator frame, in other words, a set of operators $\{ E_{i} \}$, such that for some $a$ and $b$, and for all $C$ in the operator space:

$$ a(C|C) \le \sum_{i} |(E_{i}|C)|^2 \le b(C|C)$$

When $a=b$, we have a tight frame, and for an operator frame with $d^2$ elements, the frame is tight if and only if it is an orthonormal basis.

For any operator frame, there is a dual frame $\{ D_{i} \}$, such that:

$$ \sum_{i} |D_{i})(E_{i}| = \mathbf{I}$$

Thus any $C$ can be written:

$$ C = \sum_{i} (E_{i}|C) D_{i} = \sum_{i} (D_{i}|C) E_{i} $$

When $n > d^2$, the dual frame isn't unique. The canonical choice is defined in terms of the frame superoperator:

$$ F = \sum_{i} |E_{i})(E_{i}| $$

And is given by:

$$ |D_{i}) = F^{-1}|E_{i}) $$

So that:

$$\sum_{i} |D_{i})(E_{i}| = \sum_{i} F^{-1}|E_{i})(E_{i}| = F^{-1}F = \textbf{I}$$

Since for a tight operator frame $F = a\textbf{I}$, we have $|D_{i}) = \frac{1}{a}|E_{i})$. This is one way of seeing that a tight operator frame can't be a POVM: POVM elements are positive semi-definite, but dual elements are generally indefinite--and they can't be proportional to each other.


Okay, so let's try it out. We can get an operator frame from pretty much any old set of matrices: with probability one, they'll be linearly independent, and span the operator space. We'll then form the frame (super)operator $F$, its inverse $F^{-1}$, and get the canonical dual elements $\{ D_{i} \}$.

In [119]:
E = [np.random.randn(d,d)+1j*np.random.randn(d,d) for i in range(d**2)]
F = sum([np.outer(o.reshape(d**2), o.reshape(d**2).conj()) for o in E])
Finv = np.linalg.inv(F)
D = [(Finv @ e.reshape(d**2)).reshape(d,d) for e in E]

Notice:

In [120]:
(E[0].conj().T @ D[0]).trace()
Out[120]:
(1+9.71445146547012e-17j)

Now let's get a density matrix $\rho$ to play around with:

In [121]:
def rand_dm(d, n=2):
    A = np.random.randn(d, n) + 1j*np.random.randn(d, n) 
    rho = A @ A.conj().T
    return rho/rho.trace()
In [122]:
rho = rand_dm(2); rho
Out[122]:
array([[0.74103 +0.j      , 0.327131+0.000948j],
       [0.327131-0.000948j, 0.25897 +0.j      ]])

Let's expand it in operator frame elements:

In [124]:
p = np.array([(e.conj().T @ rho).trace() for e in E]); p
Out[124]:
array([ 0.719634-0.767312j, -0.784941+0.288272j,  0.685088+0.385522j,
        0.75119 -1.772347j])

And in dual elements:

In [125]:
q = np.array([(e.conj().T @ rho).trace() for e in D]); q
Out[125]:
array([-0.257759-0.021812j,  0.31128 +0.661881j,  0.234991+0.431962j,
        0.442281-0.221602j])

The former weight the dual elements:

In [127]:
sum([p[i]*D[i] for i in range(d**2)])
Out[127]:
array([[0.74103 -0.j      , 0.327131+0.000948j],
       [0.327131-0.000948j, 0.25897 +0.j      ]])

And the latter weight the frame elements:

In [128]:
sum([q[i]*E[i] for i in range(d**2)])
Out[128]:
array([[0.74103 -0.j      , 0.327131+0.000948j],
       [0.327131-0.000948j, 0.25897 -0.j      ]])

Finally, let's check out $\sum_{i} |D_{i})(E_{i}|$:

In [129]:
sum([np.outer(D[i].reshape(d**2), E[i].reshape(d**2).conj()) for i in range(d**2)])
Out[129]:
array([[ 1.+0.j,  0.-0.j,  0.-0.j,  0.-0.j],
       [-0.-0.j,  1.-0.j,  0.+0.j, -0.+0.j],
       [ 0.+0.j,  0.-0.j,  1.-0.j,  0.-0.j],
       [ 0.+0.j, -0.-0.j,  0.-0.j,  1.+0.j]])

Here's a nice trick: given any old operator frame, we can tighten it via:

$$ |E_{i}) \rightarrow F^{-\frac{1}{2}}|E_{i})$$
In [131]:
Fsqrt = sc.linalg.fractional_matrix_power(F, -1/2)
T = [(Fsqrt @ e.reshape(d**2)).reshape(d,d) for e in E]

Now $F = \sum_{i} |E_{i})(E_{i}| = \textbf{I}$:

In [132]:
TF = sum([np.outer(e.reshape(d**2), e.reshape(d**2).conj()) for e in T]); TF
Out[132]:
array([[ 1.+0.j,  0.+0.j, -0.+0.j, -0.+0.j],
       [ 0.-0.j,  1.+0.j, -0.-0.j, -0.-0.j],
       [-0.-0.j, -0.+0.j,  1.+0.j,  0.+0.j],
       [-0.-0.j, -0.+0.j,  0.-0.j,  1.+0.j]])

Let's expand $\rho$ in tight frame elements:

In [133]:
rho
Out[133]:
array([[0.74103 +0.j      , 0.327131+0.000948j],
       [0.327131-0.000948j, 0.25897 +0.j      ]])
In [134]:
p = np.array([(e.conj().T @ rho).trace() for e in T]); p
Out[134]:
array([ 0.079125-0.187423j, -0.098291+0.344842j,  0.231467+0.281337j,
        0.382004-0.61774j ])

And behold, the elements are the same as their duals:

In [135]:
sum([p[i]*T[i] for i in range(d**2)])
Out[135]:
array([[0.74103 +0.j      , 0.327131+0.000948j],
       [0.327131-0.000948j, 0.25897 -0.j      ]])

Finally, we have:

$$ \sum_{i,j} |(E_{i}|E_{j})|^2 \ge \frac{(Tr[F])^2}{d^2} $$

With equality for a tight operator frame, by Cauchy-Schwartz considerations.

In [136]:
sum([abs(np.vdot(T[i].reshape(d**2), T[j].reshape(d**2)))**2 for j in range(d**2) for i in range(d**2)])
Out[136]:
3.999999999999968
In [137]:
(TF.trace()**2)/d**2
Out[137]:
(3.999999999999969+0j)

(IC-)POVM's

A few conventions are worth establishing.

The POVM elements we'll denote $\{ E_{i} \}$.

In [139]:
def random_haar_povm(d, k=None, n=2):
    k = k if type(k) != type(None) else d**2
    povm = np.zeros((k, d, d), dtype=np.complex128)
    S = np.zeros(d, dtype=np.complex128)
    for i in range(k):
        Xi = (np.random.randn(d, n) + 1j*np.random.randn(d,n))/np.sqrt(2) 
        Wi = Xi @ Xi.conjugate().T
        povm[i, :, :] = Wi
        S = S + Wi
    S = sc.linalg.fractional_matrix_power(S, -1/2)
    for i in range(k):
        Wi = np.squeeze(povm[i, :, :])
        povm[i, :, :] = S @ Wi @ S
    return [e for e in povm]
In [140]:
E = random_haar_povm(d)

We'll denote the normalized elements $P_{i} = \frac{E_{i}}{Tr[E_{i}]}$.

In [141]:
P = [e/e.trace() for e in E]

Finally, we'll define the POVM superoperator:

$$ F = \sum_{i} tr[E_{i}]|P_{i})(P_{i}| $$
In [142]:
F = sum([E[i].trace()*np.outer(P[i].reshape(d**2), P[i].reshape(d**2).conj()) for i in range(d**2)])
np.linalg.eig(F)
Out[142]:
(array([1.      -0.j, 0.200444-0.j, 0.046406+0.j, 0.060062-0.j]),
 array([[ 0.707107+0.j      ,  0.430302+0.033293j, -0.363281-0.310925j,
         -0.080389-0.280396j],
        [-0.      +0.j      ,  0.560117+0.j      ,  0.080418+0.51467j ,
          0.64414 +0.j      ],
        [-0.      -0.j      ,  0.553451+0.086159j,  0.520915+0.j      ,
         -0.546292+0.341293j],
        [ 0.707107-0.j      , -0.430302-0.033293j,  0.363281+0.310925j,
          0.080389+0.280396j]]))

Note that the identity matrix is thus an eigenvector of $F$ with eigenvalue $1$:

In [143]:
F @ np.eye(d).reshape(d**2)
Out[143]:
array([ 1.+0.j, -0.+0.j, -0.-0.j,  1.-0.j])

In what follows, without loss of generality, we'll often be considering the case of equal weight POVM's.

In [144]:
# Generates a random unit norm tight (vector) frame aka a rank-1 equal weight POVM.
def random_funtf(d, n, rtol=1e-15, atol=1e-15):
    R = np.random.randn(d, n) + 1j*np.random.randn(d, n) 
    done = False
    while not (np.allclose(R @ R.T.conj(), (n/d)*np.eye(d), rtol=rtol, atol=atol) and\
               np.allclose(np.linalg.norm(R, axis=0), np.ones(n), rtol=rtol, atol=atol)):
        R = sc.linalg.polar(R)[0]
        R = np.array([state/np.linalg.norm(state) for state in R.T]).T
    return R

E = [(d/n)*np.outer(r, r.conj()) for r in random_funtf(d, d**2).T]
P = [e/e.trace() for e in E]
sum(E)
Out[144]:
array([[ 1.+0.j, -0.-0.j],
       [-0.+0.j,  1.+0.j]])

In this case:

$$ F = \frac{d}{n} \sum_{i} |P_{i})(P_{i}| $$
In [145]:
np.linalg.eig((d/n)*sum([np.outer(e.reshape(d**2), e.reshape(d**2).conj()) for e in P]))[0]
Out[145]:
array([1.      -0.j, 0.670354-0.j, 0.194656-0.j, 0.13499 +0.j])

Here's some facts whose proofs we omit:

A POVM superoperator is positive under the left/right action. It must have full rank to be informationally complete. $F$ and $F^{-1}$ map Hermitian to Hermitian operators, and elements and duals are Hermitian.


Tight IC-POVM's

So POVM's can't be tight operator frames per se, but maybe we can still define something analogous. Thus, we come to tight IC-POVM's.

First: we can embed the space of quantum states (density operators) into $\mathbb{R}^{d^2-1}$. First we map each $\rho$ to a traceless Hermitian operator:

$$ \rho \rightarrow \rho - \frac{1}{d}I $$

We have the Hilbert-Schmidt inner product $(A|B) = tr[A^{\dagger}B]$, and the Frobenius norm $||A|| = \sqrt{(A|A)}$. With this defined, the set of all traceless Hermitian operators forms a real inner product space.

Pure states lie on the surface of a $(d^2-2)$-sphere $|| |\psi \rangle \langle \psi | - \frac{1}{d}I || = \sqrt{\frac{d-1}{d}}$, and mixed states live inside the sphere.

In [146]:
rho = rand_dm(2, n=1) # pure state
np.linalg.norm(rho - np.eye(d)/d)
Out[146]:
0.7071067811865475
In [147]:
np.sqrt((d-1)/d) # radius
Out[147]:
0.7071067811865476

For $d=2$, this the familiar Bloch sphere representation: we embed $2 \times 2$ operators on $\mathbb{C}^{2}$ onto a ball in $\mathbb{R}^3$, where pure states live on the surface of the 2-sphere. The higher dimensional generalizations are more difficult to visualize.


Now consider the POVM superoperator of some arbitrary POVM. Recall we denote its normalized elements $P_{i}$. We have:

$$ F = \frac{1}{d}\mathcal{I} + \sum_{i}Tr[E_{i}]|P_{i} - \frac{1}{d}I)(P_{i} - \frac{1}{d}I| $$

Where $\mathcal{I} = \frac{1}{d}|I)(I|$.

In [148]:
E = random_haar_povm(d, d**2)
P = [e/e.trace() for e in E]
In [150]:
F = sum([E[i].trace()*np.outer(P[i].reshape(d**2), P[i].reshape(d**2).conj()) for i in range(d**2)]); F
Out[150]:
array([[ 0.535329-0.j      ,  0.002444-0.034646j,  0.002444+0.034646j,
         0.464671+0.j      ],
       [ 0.002444+0.034646j,  0.116376-0.j      , -0.045241-0.050318j,
        -0.002444-0.034646j],
       [ 0.002444-0.034646j, -0.045241+0.050318j,  0.116376-0.j      ,
        -0.002444+0.034646j],
       [ 0.464671-0.j      , -0.002444+0.034646j, -0.002444-0.034646j,
         0.535329-0.j      ]])
In [151]:
ID = np.outer(np.eye(d).reshape(d**2), np.eye(d).reshape(d**2)); ID
Out[151]:
array([[1., 0., 0., 1.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [1., 0., 0., 1.]])
In [152]:
ID/d + sum([E[i].trace()*np.outer((P[i] - np.eye(d)/d).reshape(d**2), (P[i] - np.eye(d)/d).reshape(d**2).conj()) for i in range(d**2)])
Out[152]:
array([[ 0.535329-0.j      ,  0.002444-0.034646j,  0.002444+0.034646j,
         0.464671+0.j      ],
       [ 0.002444+0.034646j,  0.116376-0.j      , -0.045241-0.050318j,
        -0.002444-0.034646j],
       [ 0.002444-0.034646j, -0.045241+0.050318j,  0.116376-0.j      ,
        -0.002444+0.034646j],
       [ 0.464671+0.j      , -0.002444+0.034646j, -0.002444-0.034646j,
         0.535329-0.j      ]])

Now $ \frac{1}{d}\mathcal{I} = \frac{1}{d}|I)(I|$ left/right projects onto the subspace spanned by the identity:

In [153]:
np.linalg.eig(ID/d)
Out[153]:
(array([1., 0., 0., 0.]),
 array([[ 0.707107, -0.707107,  0.      ,  0.      ],
        [ 0.      ,  0.      ,  1.      ,  0.      ],
        [ 0.      ,  0.      ,  0.      ,  1.      ],
        [ 0.707107,  0.707107,  0.      ,  0.      ]]))

Its orthogonal complement is the $(d^2 -1)$-dimensional subspace of traceless operators. So in terms of $\mathbf{I}= I_{n}$, we define:

$$ \Pi_{0} = \mathbf{I} - \frac{1}{d}\mathcal{I}$$

This operator projects onto the subspace of traceless operators.

In [154]:
PI0 = np.eye(d**2) - ID/d; PI0
Out[154]:
array([[ 0.5,  0. ,  0. , -0.5],
       [ 0. ,  1. ,  0. ,  0. ],
       [ 0. ,  0. ,  1. ,  0. ],
       [-0.5,  0. ,  0. ,  0.5]])
In [155]:
np.linalg.eig(PI0)
Out[155]:
(array([1., 0., 1., 1.]),
 array([[ 0.707107,  0.707107,  0.      ,  0.      ],
        [ 0.      ,  0.      ,  1.      ,  0.      ],
        [ 0.      ,  0.      ,  0.      ,  1.      ],
        [-0.707107,  0.707107,  0.      ,  0.      ]]))

Therefore, this operator realizes the embedding into the "Bloch body":

$$ \Pi_{0} | \rho ) = |\rho - \frac{1}{d}I)$$
In [156]:
rho = rand_dm(2); rho
Out[156]:
array([[ 0.752401+0.j      , -0.104281+0.165283j],
       [-0.104281-0.165283j,  0.247599+0.j      ]])
In [157]:
PI0 @ rho.reshape(d**2)
Out[157]:
array([ 0.252401+0.j      , -0.104281+0.165283j, -0.104281-0.165283j,
       -0.252401+0.j      ])
In [158]:
(rho - np.eye(d)/d).reshape(d**2)
Out[158]:
array([ 0.252401+0.j      , -0.104281+0.165283j, -0.104281-0.165283j,
       -0.252401+0.j      ])

We denote $ \textbf{I}_{0}$ the left/right identity superoperator for the space of traceless Hermitian operators.

Okay, so now for the punch line:

A tight IC-POVM is one for which the elements $P_{i}-\frac{1}{d}I$ form a tight operator frame for the space of traceless Hermitian operators. I.e.:

$$ \sum_{i} Tr[E_{i}] |P_{i} - \frac{1}{d}I)(P_{i} - \frac{1}{d}I| = a\textbf{I}_{0}$$

Or: $\Pi_{0}F\Pi_{0} = a\Pi_{0}$ for some $a > 0$.

Anticipating somewhat, let's try it out on the SIC we found before:

In [160]:
E = [(d/n)*np.outer(r, r.conj()) for r in SIC.T]
P = [e/e.trace() for e in E]
F = sum([E[i].trace()*np.outer(P[i].reshape(d**2), P[i].reshape(d**2).conj()) for i in range(d**2)])
In [164]:
np.round(PI0 @ F @ PI0, decimals=3)
Out[164]:
array([[ 0.167+0.j,  0.   +0.j,  0.   -0.j, -0.167+0.j],
       [ 0.   -0.j,  0.333+0.j, -0.   +0.j, -0.   +0.j],
       [ 0.   +0.j, -0.   -0.j,  0.333+0.j, -0.   -0.j],
       [-0.167+0.j, -0.   -0.j, -0.   +0.j,  0.167+0.j]])
In [162]:
PI0
Out[162]:
array([[ 0.5,  0. ,  0. , -0.5],
       [ 0. ,  1. ,  0. ,  0. ],
       [ 0. ,  0. ,  1. ,  0. ],
       [-0.5,  0. ,  0. ,  0.5]])

Proportionality looks good so far (up to numerical error!)

So: tight IC-POVM's are those with elements whose images under $\Pi_{0}$ form a tight operator frame for the space of traceless Hermitian matrices. As such, they are "as close as possible" to being orthonormal bases for the space of quantum states.

We can find the constant of proportionality $a$:

$$a = \frac{1}{d^2 -1}\sum_{i} Tr[E_{i}](P_{i} - \frac{1}{d}I|P_{i} - \frac{1}{d}I)$$
In [165]:
a = (1/(d**2 - 1))*sum([E[i].trace()*np.vdot((P[i] - np.eye(d)/d).reshape(d**2), (P[i] - np.eye(d)/d).reshape(d**2)) for i in range(d**2)]);a
Out[165]:
(0.33333332506703606+0j)

Behold:

In [166]:
(1/3)*PI0
Out[166]:
array([[ 0.166667,  0.      ,  0.      , -0.166667],
       [ 0.      ,  0.333333,  0.      ,  0.      ],
       [ 0.      ,  0.      ,  0.333333,  0.      ],
       [-0.166667,  0.      ,  0.      ,  0.166667]])

Simplifying things, we have for a tight IC-POVM:

$$ F = \frac{1}{d}\mathcal{I} + a\Pi_{0} = a\textbf{I} + \frac{1-a}{ad}\mathcal{I}$$

Recall: $ \mathcal{I} = |I)(I|$, and $\textbf{I} = I_{n}$.

In [167]:
ID/d + a*PI0
Out[167]:
array([[0.666667+0.j, 0.      +0.j, 0.      +0.j, 0.333333+0.j],
       [0.      +0.j, 0.333333+0.j, 0.      +0.j, 0.      +0.j],
       [0.      +0.j, 0.      +0.j, 0.333333+0.j, 0.      +0.j],
       [0.333333+0.j, 0.      +0.j, 0.      +0.j, 0.666667+0.j]])
In [168]:
np.round(F, decimals=3)
Out[168]:
array([[ 0.667+0.j,  0.   -0.j,  0.   +0.j,  0.333+0.j],
       [ 0.   +0.j,  0.333+0.j, -0.   +0.j, -0.   +0.j],
       [ 0.   -0.j, -0.   -0.j,  0.333+0.j, -0.   -0.j],
       [ 0.333+0.j, -0.   -0.j, -0.   +0.j,  0.666+0.j]], dtype=complex64)
In [169]:
a*np.eye(d**2) + ((1-a)/d)*ID
Out[169]:
array([[0.666667+0.j, 0.      +0.j, 0.      +0.j, 0.333333+0.j],
       [0.      +0.j, 0.333333+0.j, 0.      +0.j, 0.      +0.j],
       [0.      +0.j, 0.      +0.j, 0.333333+0.j, 0.      +0.j],
       [0.333333+0.j, 0.      +0.j, 0.      +0.j, 0.666667+0.j]])

Similarly, we have:

$$ F^{-1} = \frac{1}{a}\textbf{I} - \frac{1-a}{ad}\mathcal{I} $$

.

In [170]:
Finv = np.linalg.inv(F); np.round(Finv, decimals=3)
Out[170]:
array([[ 2.   +0.j   , -0.   +0.j   , -0.   -0.j   , -1.001+0.j   ],
       [-0.   -0.j   ,  2.999+0.j   ,  0.002-0.002j,  0.   -0.001j],
       [-0.   +0.j   ,  0.002+0.002j,  2.999+0.j   ,  0.   +0.001j],
       [-1.001-0.j   ,  0.   +0.001j,  0.   -0.001j,  2.001+0.j   ]],
      dtype=complex64)
In [172]:
np.eye(d**2)/a - ((1-a)/(a*d))*ID
Out[172]:
array([[ 2.+0.j,  0.+0.j,  0.+0.j, -1.+0.j],
       [ 0.+0.j,  3.+0.j,  0.+0.j,  0.+0.j],
       [ 0.+0.j,  0.+0.j,  3.+0.j,  0.+0.j],
       [-1.+0.j,  0.+0.j,  0.+0.j,  2.+0.j]])

And for the dual elements:

$$ D_{i} = \frac{1}{a}P_{i} - \frac{1-a}{ad}I $$
In [173]:
D = [(Finv @ e.reshape(d**2)).reshape(d,d) for e in P]; D[0]
Out[173]:
array([[ 0.478568+0.j      , -1.109467+1.009989j],
       [-1.109467-1.009989j,  0.520477+0.j      ]], dtype=complex64)
In [174]:
D2 = [e/a - ((1-a)/(a*d))*np.eye(d) for e in P]; D2[0]
Out[174]:
array([[ 0.47899 +0.j      , -1.108152+1.010721j],
       [-1.108152-1.010721j,  0.52101 +0.j      ]])

And we can express some $\rho$ as:

$$\rho = \frac{1}{a}\sum_{i} p_{i}P_{i} - \frac{1-a}{ad}I$$
In [178]:
p = np.array([(e.conj().T @ rho).trace() for e in E]).real; p
Out[178]:
array([0.342437, 0.341537, 0.149521, 0.166767])
In [179]:
rho
Out[179]:
array([[ 0.752401+0.j      , -0.104281+0.165283j],
       [-0.104281-0.165283j,  0.247599+0.j      ]])
In [180]:
(1/a)*sum([p[i]*P[i] for i in range(d**2)]) - ((1-a)/(a*d))*np.eye(d)
Out[180]:
array([[ 0.752979+0.j      , -0.104117+0.165952j],
       [-0.104117-0.165952j,  0.247807+0.j      ]])

Simplifying even further, in the case of a tight rank-one IC-POVM, we have:

$$ a = \frac{1}{d+1} $$
In [181]:
1/(d+1)
Out[181]:
0.3333333333333333

So that:

$$F = \frac{\textbf{I}+\mathcal{I}}{d+1}$$
In [182]:
(np.eye(d**2) + ID)/(d+1)
Out[182]:
array([[0.666667, 0.      , 0.      , 0.333333],
       [0.      , 0.333333, 0.      , 0.      ],
       [0.      , 0.      , 0.333333, 0.      ],
       [0.333333, 0.      , 0.      , 0.666667]])
In [183]:
np.round(F, decimals=3)
Out[183]:
array([[ 0.667+0.j,  0.   -0.j,  0.   +0.j,  0.333+0.j],
       [ 0.   +0.j,  0.333+0.j, -0.   +0.j, -0.   +0.j],
       [ 0.   -0.j, -0.   -0.j,  0.333+0.j, -0.   -0.j],
       [ 0.333+0.j, -0.   -0.j, -0.   +0.j,  0.666+0.j]], dtype=complex64)

And:

$$ \rho = (d+1)\sum_{i} p_{i}|\psi_{i}\rangle\langle\psi_{i}| - I $$
In [186]:
(d+1)*sum([p[i]*P[i] for i in range(d**2)]) - np.eye(d)
Out[186]:
array([[ 0.752979+0.j      , -0.104117+0.165952j],
       [-0.104117-0.165952j,  0.247807+0.j      ]])
In [187]:
rho
Out[187]:
array([[ 0.752401+0.j      , -0.104281+0.165283j],
       [-0.104281-0.165283j,  0.247599+0.j      ]])

We can rewrite $F = \sum_{i} Tr[E_{i}]|P_{i})(P_{i}| = \frac{1}{d+1} (\textbf{I} + \mathcal{I})$ in terms of the ordinary superoperator action as:

$$ \sum_{i} Tr[E_{i}]\Pi_{i} \odot \Pi_{i} = \frac{1}{d+1}(\sum_{i} H_{i} \odot H_{i}^{\dagger} + I \odot I) $$

Where the $H_{i}$'s are some orthonormal operator basis.

In [188]:
sum([E[i].trace()*P[i] @ rho @ P[i] for i in range(d**2)])
Out[188]:
array([[ 0.584326+0.j      , -0.034706+0.055317j],
       [-0.034706-0.055317j,  0.415936-0.j      ]])
In [189]:
H = operator_basis(d)
(1/(d+1))*(sum([ H[i] @ rho @ H[i].conj().T   for i in range(d**2)]) + np.eye(d) @ rho @ np.eye(d))
Out[189]:
array([[ 0.584134+0.j      , -0.03476 +0.055094j],
       [-0.03476 -0.055094j,  0.415866+0.j      ]])

And then in terms of the left/right action:

$$ \sum_{i} Tr[E_{i}]\Pi_{i} \otimes \Pi_{i} = \frac{1}{d+1}(\sum_{i} H_{i} \otimes H_{i}^{\dagger} + I \otimes I) $$
In [192]:
np.round(sum([E[i].trace()*np.kron(P[i], P[i]) for i in range(d**2)]), decimals=3)
Out[192]:
array([[ 0.667+0.j,  0.   +0.j,  0.   +0.j, -0.   +0.j],
       [ 0.   -0.j,  0.333+0.j,  0.333+0.j, -0.   +0.j],
       [ 0.   -0.j,  0.333+0.j,  0.333+0.j, -0.   +0.j],
       [-0.   -0.j, -0.   -0.j, -0.   -0.j,  0.666+0.j]], dtype=complex64)
In [191]:
(1/(d+1))*(sum([np.kron(H[i], H[i].conj().T) for i in range(d**2)]) + np.kron(np.eye(d), np.eye(d)))
Out[191]:
array([[0.666667, 0.      , 0.      , 0.      ],
       [0.      , 0.333333, 0.333333, 0.      ],
       [0.      , 0.333333, 0.333333, 0.      ],
       [0.      , 0.      , 0.      , 0.666667]])

Using the fact that for any orthonormal basis, we can write:

$$ \sum_{i} E_{i} \otimes E_{i}^{\dagger} = T = \sum_{i,j} |e_{i}\rangle\langle e_{j}| \otimes |e_{j}\rangle\langle e_{i}|$$

Where $|e_{i}\rangle$ is some orthonormal basis, and $T$ is a swap operator, we have:

$$ \sum_{i} Tr[E_{i}]\Pi_{i} \otimes \Pi_{i} = \frac{1}{d+1}(T + I \otimes I) = \frac{2}{d+1}\Pi_{sym}^{(2)}$$
In [78]:
T = sum([np.kron(np.outer(basis(d, i), basis(d, j)), np.outer(basis(d, j), basis(d, i))) for j in range(d) for i in range(d)]); T
Out[78]:
array([[1., 0., 0., 0.],
       [0., 0., 1., 0.],
       [0., 1., 0., 0.],
       [0., 0., 0., 1.]])
In [79]:
(1/(d+1))*(T + np.kron(np.eye(d), np.eye(d)))
Out[79]:
array([[0.666667, 0.      , 0.      , 0.      ],
       [0.      , 0.333333, 0.333333, 0.      ],
       [0.      , 0.333333, 0.333333, 0.      ],
       [0.      , 0.      , 0.      , 0.666667]])
In [198]:
(2/(d+1))*symmetric_projector(d, d)
Out[198]:
array([[0.666667, 0.      , 0.      , 0.      ],
       [0.      , 0.333333, 0.333333, 0.      ],
       [0.      , 0.333333, 0.333333, 0.      ],
       [0.      , 0.      , 0.      , 0.666667]])

So we've shown that a tight IC-POVM is a 2-design.

And indeed, the unique minimal tight rank-one IC-POVM in each dimension has $d^2$ elements and corresponds to a SIC.

$$ (\Pi_{i}|\Pi_{j}) = |\langle \psi_{i} | \psi_{j} \rangle|^2 = \frac{d \delta_{ij} + 1}{d+1}$$

Embedded in $\mathbb{R}^{d^2 -1}$ the elements of a SIC correspond to the vertices of a regular simplex:

$$ \frac{d}{d-1}(\Pi_{i} - \frac{1}{d}I | \Pi_{j} - \frac{1}{d}I) = \frac{d^2 \delta_{ij} - 1}{d^2 -1 }$$

But not all simplices are POVM's. (The factor of $\frac{d}{d-1}$ is due to embedding into a sphere of radius $\sqrt{\frac{d-1}{d}}$).

A uniform POVM has weights $\frac{d}{n}$, and equiangular when $(\Pi_{i}|\Pi_{j}) = c$, and SIC POVM's are uniquely picked out by these criteria.


Another example of a tight rank-one IC-POVM is a complete set of so-called mutually unbiased bases (MUB's): a set of $d+1$ othonormal bases for $\mathbb{C}^d$ with constant overlap $\frac{1}{d}$ between elements of different bases.

$$\begin{equation} (\Pi(e_{i}^{l}) | \Pi(e_{j}^{m})) = \left\{ \begin{array}{@{}ll@{}} \delta_{ij}, & \text{if}\ l=m \\ \frac{1}{d}, & \text{otherwise} \end{array}\right. \end{equation}$$

We should weight with $\frac{d}{n} = \frac{d}{d(d+1)} = \frac{1}{d+1}$.

For example, we can get one via the Pauli eigenstates:

$$\mathscr{B}_{Z} = \{ \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix} \}, \mathscr{B}_{Y} = \{ \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ i \end{pmatrix}, \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -i \end{pmatrix} \}, \mathscr{B}_{X} = \{ \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix} \}$$
In [194]:
X = [np.array([1,1])/np.sqrt(2), np.array([1,-1])/np.sqrt(2)]
Y = [np.array([1,1j])/np.sqrt(2), np.array([1,-1j])/np.sqrt(2)]
Z = [np.array([1,0]), np.array([0,1])]

MUB = [np.outer(x, x.conj()) for x in X] + [np.outer(y, y.conj()) for y in Y] + [np.outer(z, z.conj()) for z in Z]
In [195]:
np.array([[(a.conj().T @ b).trace() for b in MUB] for a in MUB]).real
Out[195]:
array([[1. , 0. , 0.5, 0.5, 0.5, 0.5],
       [0. , 1. , 0.5, 0.5, 0.5, 0.5],
       [0.5, 0.5, 1. , 0. , 0.5, 0.5],
       [0.5, 0.5, 0. , 1. , 0.5, 0.5],
       [0.5, 0.5, 0.5, 0.5, 1. , 0. ],
       [0.5, 0.5, 0.5, 0.5, 0. , 1. ]])
In [199]:
(1/(d+1))*sum([np.kron(e, e) for e in MUB])
Out[199]:
array([[0.666667+0.j, 0.      +0.j, 0.      +0.j, 0.      +0.j],
       [0.      +0.j, 0.333333+0.j, 0.333333+0.j, 0.      +0.j],
       [0.      +0.j, 0.333333+0.j, 0.333333+0.j, 0.      +0.j],
       [0.      +0.j, 0.      +0.j, 0.      +0.j, 0.666667+0.j]])
In [201]:
F_MUB = (1/(d+1))*sum([np.outer(e.reshape(d**2), e.reshape(d**2).conj()) for e in MUB]); F_MUB
Out[201]:
array([[0.666667+0.j, 0.      +0.j, 0.      +0.j, 0.333333+0.j],
       [0.      +0.j, 0.333333+0.j, 0.      +0.j, 0.      +0.j],
       [0.      +0.j, 0.      +0.j, 0.333333+0.j, 0.      +0.j],
       [0.333333+0.j, 0.      +0.j, 0.      +0.j, 0.666667+0.j]])

Finally, we have for an arbitrary POVM:

$$ \sum_{i,j} Tr[E_{i}]Tr[E_{j}] |(P_{i}|P_{j})|^2 \ge 1 + \frac{(Tr[F]-1)^2}{d^2 -1 }$$

With equality if we have a tight IC-POVM.

In [85]:
sum([E[i].trace()*E[j].trace()*abs(np.vdot(P[i].reshape(d**2), P[j].reshape(d**2)))**2 for j in range(d**2) for i in range(d**2)])
Out[85]:
(1.333333446217599+0j)
In [86]:
1 + ((F.trace() - 1)**2)/(d**2 -1)
Out[86]:
(1.3333332538604783+0j)
In [87]:
1 + ((F_MUB.trace() - 1)**2)/(d**2 -1)
Out[87]:
(1.3333333333333328+0j)

A POVM is rank-one if and only if $Tr[F] = d$.

In [88]:
F.trace()
Out[88]:
(1.9999999+0j)

So we can finagle this into:

$$ \sum_{i,j} Tr[E_{i}]Tr[E_{j}]|(P_{i}|P_{j})|^2 \ge \frac{2d}{d+1}$$

With equality if F is a tight IC-POVM.

In [89]:
2*d/(d+1)
Out[89]:
1.3333333333333333

Optimal Tomography

Suppose we take $N$ random samples $s_{1}, \dots s_{N}$ (i.e. perform our POVM $N$ times), and that outcome $i$ occurs with some "unknown probability" $p(i)$. Given the samples, our estimate for the probability could be written:

$$ \hat{p}(i) = \hat{p}(i; s_{1}, \dots s_{N}) = \frac{1}{N} \sum_{k=1}^{N} \delta_{i, s_{k}}$$

So that the expectation $E[\hat{p}(i)] = p(i)$.

Meanwhile, after some algebra, the expected covariance for $N$ samples can be written:

$$ E[(p(i) - \hat{p}(i))(p(j) - \hat{p}(j))] = \frac{1}{N}[p(i)\delta_{i,j} - p(i)p(j)]$$

Given the samples, our estimate for $\rho$ is:

$$ \hat{\rho} = \sum_{i} \frac{1}{Tr[E_{i}]}\hat{p}(i)D_{i} $$

The error of the estimate is:

$$ ||\rho - \hat{\rho}||^2 = (\rho - \hat{\rho}|\rho - \hat{\rho}) = \sum_{i,j} \frac{1}{Tr[E_{i}]Tr[E_{j}]}(p(i) - \hat{p}(i))(p(j) - \hat{p}(j))(D_{i}|D_{j})$$

The expectation of the error:

$$ E[||\rho - \hat{\rho}||^2 ] = \frac{1}{N} \sum_{i,j} \frac{1}{Tr[E_{i}]Tr[E_{j}]} (p(i)\delta_{i,j} - p(i)p(j))(D_{i}|D_{j}) = \frac{1}{N} (\sum_{i} \frac{1}{Tr[E_{i}]^2} p(i)(D_{i}|D_{i}) - Tr[\rho^2])$$

We therefore want to minimize $\sum_{i} \frac{1}{Tr[E_{i}]^2}p(i)(D_{i}|D_{i})$.

The POVM which minimizes this quantity, however, will generally depend on the paticular quantum state. So we want to average over states. We'll set $\rho = U\sigma U^{\dagger}$, and take the Haar average of $U = U(d)$, recalling that we have $p(i) = Tr[E_{i}\rho]$.

$$ \int_{U(d)} d\mu_{U} (\sum_{i} \frac{1}{Tr[E_{i}]^2}Tr[E_{i}U\sigma U^{\dagger}](D_{i}|D_{i}) $$

This works out to be:

$$ \frac{1}{d} (\sum_{i} \frac{1}{Tr[E_{i}]^2} Tr[E_{i}] Tr[\sigma](D_{i}|D_{i}) $$

But we know $Tr[\sigma] = 1$, and so:

$$ \frac{1}{d} \sum_{i} \frac{1}{Tr[E_{i}]} (D_{i}|D_{i}) $$

And if it isn't clear:

$$ Tr[F^{-1}] = \sum_{i} \frac{1}{Tr[E_{i}]} (D_{i}|D_{i}) $$

So we want to minimize $Tr[F^{-1}]$.


Now we've been supposing that $\{ D_{i} \}$ is the canonical dual frame. Suppose it weren't, and instead we used some other dual: $\{ B_{i} \}$.

Omitting the normalization, we can show that:

$\sum_{i} (D_{i}|D_{i}) \le \sum_{i} (B_{i}|B_{i}) $

Defining $C = B - D$, we have:

$$ \sum_{i} |D_{i})(C_{i}| = \sum_{i}|D_{i})(B_{i}| - \sum_{i} |D_{i})(D_{i}| = \sum_{i} F^{-1}|E_{i})(B_{i}| - \sum_{i} F^{-1}|E_{i})(E_{i}|F^{-1}$$

Which amounts to:

$$ F^{-1}\textbf{I} - F^{-1}FF^{-1} = 0$$

This means that $\sum_{i} (C_{i}|D_{i}) = 0$, so:

$$ \sum_{i} (B_{i}|B_{i}) = \sum_{i} (D_{i}|D_{i}) + \sum_{i} (D_{i}|C_{i}) + \sum_{i}(C_{i}|D_{i}) + \sum_{i}(C_{i}|C_{i}) $$$$ = \sum_{i} (D_{i}|D_{i}) + \sum_{i}(C_{i}|C_{i}) $$

So that $\sum_{i} (D_{i}|D_{i}) \le \sum_{i} (B_{i}|B_{i}) $.

Thus it suffices to minimize $ \sum_{i} \frac{1}{Tr[E_{i}]}(D_{i}|D_{i}) = Tr[F^{-1}]$.

It turns out:

$$ Tr[F^{-1}] \ge d(d(d+1) -1)$$

With equality for a tight rank-one IC-POVM.

We have $Tr[F^{-1}] = \sum_{k=1}^{d^2} \frac{1}{\lambda_{k}}$, where the $\lambda_{k}$'s are the left/right eigenvalues of $F$.

Now:

$$ Tr[F] = \sum_{k=1}^{d^2} \lambda_{k} = \sum_{i} Tr[E_{i}](P_{i}|P_{i}) \leq d$$

Since $Tr[P^2] \leq 1$.

But we know that the identity operator is a left/right eigenvector of $F$ with eigenvalue 1. So:

$$\sum_{k=2}^{d^2} \lambda_{k} \leq d-1$$

Working the numbers, we'll get a minimum if and only if the other $\lambda_{k}$'s are:

$\lambda_{k} = \frac{d-1}{d^2 -1} = \frac{1}{d+1}$, so that:

$$ F = 1 \cdot \frac{1}{d}\mathcal{I} + \frac{1}{d+1}\cdot\Pi_{0} = \frac{\textbf{I} + \mathcal{I}}{d+1}$$

.

So $Tr[F^{-1}]$ takes its minimum if and only if $F$ corresponds to a tight rank-one IC-POVM, and the minimum is $ 1 \cdot 1 + (d+1)\cdot(d^2 -1) = d(d(d+1) -1)$.

In [202]:
E = [(d/n)*np.outer(r, r.conj()) for r in SIC.T]
P = [e/e.trace() for e in E]
F = (d/n)*sum([np.outer(e.reshape(d**2), e.reshape(d**2).conj()) for e in P])
Finv = np.linalg.inv(F)
D = [(Finv @ e.reshape(d**2)).reshape(d,d) for e in E]
In [203]:
Finv.trace()
Out[203]:
(10.0000105-8.272215e-23j)
In [204]:
d*(d*(d+1)-1)
Out[204]:
10

So we've shown that it's optimal to use tight rank-one IC-POVM's for (linear) quantum state tomography. (For more discussion of the assumptions here, check out the paper!)