Hey again! This my second post, and the first of a short "trilogy of the unit sphere" that I intend to write before Christmas. The two other posts will deal with Gaussian-like samplings on the unit-sphere, and even distributions of points on the unit sphere.
More than simple mathematical curiosities, these three subjects turn out to be quite useful in the daily practice, and even though considered simple at a glance by most, you will see that they are far from trivial problems. If you're not familiar with random sampling, you might be interested to read the next paragraph. Otherwise, jump to the second one.
Pseudo-Random Sampling
An essential task when dealing with probabilistic models is that of generating pseudo-random numbers from arbitrary distributions. For those who are not comfortable with this last sentence, I'll break it down for you:
- Probabilistic model is what it says; it describes a natural or artificial process, which behavior is either truly random, or too complex to describe analytically. In both cases, the process is formalized in terms of quantitative variables caracterizing either its causes or consequences (note: the famous Bayes formula allows to switch between the two), whereas its behavior is specified as a function of these quantities. This "behavior" function is called the Probability Density Function (pdf) of the random process, or distribution, and serves as a measure of "How likely" a given combination of causes and effects is.
- Pseudo-random numbers in Computer Science usually refer to infinite sets of numbers, from which finite sequences can be extracted and considered as good approximations of truly random processes. Obviously these processes are always discrete in practice, but can approximate continuous ones. Note that random does not imply uniformly random (meaning that the behavior of the corresponding process may not be trivial); if we were to sort one such sequence in ascending order, and analyze the differences between two consecutive terms, we would notice "lower difference values" at places where the underlying pdf is high (meaning that a lot of samples are concentrated there, we talk about the modes of the pdf).
What I meant by "generating pseudo-random numbers from arbitrary distribution" should now be clearer. There are two standard methods to achieve this:
- Rejection Method: it applies when the behavior of the process is unknown, but "resembles" that of a known process. We're talking about maths here, so "resemblance" should be expressed formally, and the worse difference possible should be statically bounded. When this is true, rejection sampling allows to generate random variables that conform with the unknown behavior, with a "trial and error" approach so to speak.
- Inverse Distribution Sampling: these apply when the analytical expression of the unknown pdf is nice enough to be inverted. Why this is nice is out-of-scope here, but you can find more details here is you're interested.
The method I present next is not standard, and rather exploits an elegant approach to solving the specific problem of generating uniformly distributed samples on the unit sphere. Nevertheless, you might be interested in more standard approaches, such as those described in this article, and in this wiki.
Isotropic Gaussian Distribution's Level-sets
I bet you can tell where this is going from the title ;) Yes, it's that simple, but we're going to detail a little more why/how it works. For those who don't have time for a little chat, here is the solution: \[ \frac{1}{\sqrt{ x^2 + y^2 + z^2 }} \begin{pmatrix} x \\ y \\ z \end{pmatrix} \] where $x,y,z \sim \mathcal{N}(0,1)$. And.. why would this be uniformly distributed on the unit sphere?? Think about it. You've probably seen a Bell-curve in 3D before (look here), didn't you? Have you ever noticed that if you were to "cut" this surface with an horizontal plane, the section would be a circle?
If yes, then you will have no difficulty understanding that if we did the exact same thing, but in 4D (using a 3D space instead of a plane to cut the hypersurface), we would obtain a sphere centered at 0. These "sections", because they correspond to the set of points at the same level on the (hyper)surface, are called level-sets, and given what we just said, it happens that the level-sets of (n+1)-variate Gaussians are concentric n-spheres.
From there, there are two things to say to prove the above formulae:
- Because isotropic n-variate Gaussian distributions are separable, taking an n-variate sample is equivalent to taking n univariate samples;
- Given that any sample of such distribution is on a level-set, we can concentrate all samples on the unit sphere by simply dividing by the radius, which is actually the Euclidian norm of the sample.
Elegant right? ;) I hope you enjoyed this post as much as I did, thanks for reading!