Monte Carlo integration by Vegas¶

Vegas Algorithm¶

We proved before that the best weight function has the property: \begin{eqnarray} \left[\frac{|f(x_1,x_2,\cdots,x_n)|^2}{(w(x_1,x_2,\cdots,x_n))^2}\right] = const \end{eqnarray}

The Vegas method is primarily based on the importance sampling algorithm with the above-mentioned self-adapting strategy. The basic idea is to use a separable weight function: instead of a complicated $w(x_1,x_2,x_3,\cdots)$ one uses the ansatz $w=w_1(x_1)w_2(x_2)w_3(x_3)\cdots$.

The optimal separable weight functions are \begin{eqnarray} (w_1(x_1))^2 \propto \left[\int dx_2 dx_3\cdots dx_n \frac{|f(x_1,x_2,x_3,\cdots,x_n)|^2}{w_2(x_2)w_3(x_3)\cdots w_n(x_n)}\right] \end{eqnarray} as follows from our ansatz.

Proof of optimal weight function for separable weight:

As before, we want to minimize $$\langle(\frac{f}{w})^2\rangle-\langle\frac{f}{w}\rangle^2$$ with the constraints $\int w_i(x_i) dx_i=1$ for every dimension $i$.

\begin{eqnarray} \frac{\delta}{\delta w_i(x_i)}\left[ \int \frac{f(x_1\cdots x_n)^2}{(w_1(x_1)\cdots w_n(x_n))^2} w_1(x_1)\cdots w_n(x_n)dV - \left(\int \frac{f(x_1\cdots x_n)}{w_1(x_1)\cdots w_n(x_n)} w_1(x_1)\cdots w_n(x_n)dV\right)^2 +\sum_i \lambda_i\left(\int w_i(x_i) dx_i -1 \right) \right]=0 \end{eqnarray}

simplifies to

\begin{eqnarray} \frac{\delta}{\delta w_i(x_i)}\left[ \int \frac{f(x_1\cdots x_n)^2}{w_1(x_1)\cdots w_n(x_n)} dV - \left(\int f(x_1\cdots x_n) dV\right)^2 +\sum_i \lambda_i\left(\int w_i(x_i) dx_i -1 \right) \right]=0 \end{eqnarray}

and gives

\begin{eqnarray} -\frac{1}{(w_i(x_i))^2}\int \frac{f(x_1\cdots x_n)^2}{\prod_{j\ne i} w_j(x_j)} \prod_{j\ne i} dx_j +\lambda_i =0 \end{eqnarray}

Finally

\begin{eqnarray} (w_i(x_i))^2 \propto \int \frac{f(x_1\cdots x_n)^2}{\prod_{j\ne i} w_j(x_j)^2} \prod_{j\ne i} w_j(x_j) dx_j \end{eqnarray}

The power of Vegas is that by iteration it can resolve any divergent point which is separable, i.e., aligned with one of the axes. However, when the divergence runs along a diagonal, Vegas performs no better than ordinary MC sampling. Note also that the separable ansatz avoids the explosion of stratified regions: full stratification requires $K^d$ cells, while the separable ansatz needs only $K\times d$ grid points ($K$ is the number of points in one dimension; typically $K\approx 10^2$, $d\approx 10$, i.e., ZettaBytes versus KiloBytes of storage).

Vegas also does not compute the optimal weight functions $w_i(x_i)$ derived above exactly. It rather approximates them on a rather rough mesh (any number of grid points makes the algorithm more efficient than naive Monte Carlo), and this introduces no systematic error: in the limit of an infinite number of random points the integral is exact regardless of how well the weight functions are approximated on the finite mesh.

We will slightly change the notation now, and instead of weight functions $w_i(x_i)$, we will talk about grid functions $g_i(x_i)$, which are related to $w_i(x_i)$ by $\frac{1}{w_i(x_i)}=\frac{dg_i(x_i)}{dx_i}$.
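
As a worked one-dimensional example of this relation (added here for illustration), suppose the optimal weight in some dimension is $w(g)=2g$ on $[0,1]$. Requiring that uniformly distributed $x$ are mapped to points $g$ with density $w$, i.e., $dx = w(g)\, dg$, gives \begin{eqnarray} x=\int_0^{g} 2t\, dt = g^2 \quad\Rightarrow\quad g(x)=\sqrt{x},\qquad \frac{dg}{dx}=\frac{1}{2\sqrt{x}}=\frac{1}{w(g(x))}. \end{eqnarray} Steep regions of $g(x)$ are therefore regions of small weight, which the mapping crosses quickly and samples rarely.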

The algorithm starts with the product of grid functions $g_i(x)$ (separable weight functions). We want to evaluate \begin{eqnarray} &&\int_{a_1}^{b_1} dg_1 \int_{a_2}^{b_2} dg_2\cdots\int_{a_n}^{b_n} dg_n f(g_1,g_2,\cdots,g_n)= \nonumber\\ &&\int_0^1 dx_1\int_0^1 dx_2\cdots\int_0^1 dx_n f(g_1(x_1),g_2(x_2),\cdots,g_n(x_n)) \frac{dg_1}{dx_1}\frac{dg_2}{dx_2}\cdots\frac{dg_n}{dx_n} \end{eqnarray} Since we have cast the problem into multiple 1D problems, we can simply look at this as a change of integration variables. Note that all $x_i$ are now distributed in $[0,1]$, i.e., in the unit hypercube.

Note that $g_1$ depends only on $x_1$, and $g_2$ only on $x_2$, so that the Jacobian of such separable set of functions is just the product of all derivatives.
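
To make this explicit (a one-line step added for completeness): the Jacobian matrix of the map $(x_1,\cdots,x_n)\rightarrow (g_1(x_1),\cdots,g_n(x_n))$ is diagonal, \begin{eqnarray} \frac{\partial g_i}{\partial x_j} = \delta_{ij}\frac{dg_i}{dx_i} \quad\Rightarrow\quad \det\left[\frac{\partial g_i}{\partial x_j}\right]=\prod_{i=1}^n \frac{dg_i}{dx_i}. \end{eqnarray}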

We start with the linear grid functions $g_i(x)=a_i + x (b_i-a_i)$, so that the integration at the first iteration is equivalent to the usual Monte Carlo sampling.

We generate a few thousand sets of random points $(x_1,x_2,\cdots,x_n)$ and evaluate $f$ at these points. During the sampling we evaluate the integral \begin{eqnarray} ⟨f⟩=\frac{1}{N}\sum_{x_1,x_2,\cdots,x_n} f(g_1(x_1),g_2(x_2),\cdots,g_n(x_n)) \frac{dg_1}{dx_1}\frac{dg_2}{dx_2}\cdots\frac{dg_n}{dx_n}, \label{vegas_int} \end{eqnarray} where $N$ is the number of random points. Note that the random points are uniformly distributed on the mesh $x_i$ ($x_i \in [0,1)$), therefore the sampling volume is the unit volume, $\int dx_1 dx_2 \cdots =1$.

We will also construct the projection of the function $f$ onto each dimension. We start by sampling the following functions: \begin{eqnarray} f_1(x_1) = \sum_{x_2,x_3} |f(g_1(x_1),g_2(x_2),g_3(x_3)) \frac{dg_1}{dx_1}\frac{dg_2}{dx_2}\frac{dg_3}{dx_3}|^2\\ f_2(x_2) = \sum_{x_1,x_3} |f(g_1(x_1),g_2(x_2),g_3(x_3)) \frac{dg_1}{dx_1}\frac{dg_2}{dx_2}\frac{dg_3}{dx_3}|^2\\ f_3(x_3) = \sum_{x_1,x_2} |f(g_1(x_1),g_2(x_2),g_3(x_3))\frac{dg_1}{dx_1}\frac{dg_2}{dx_2}\frac{dg_3}{dx_3}|^2 \label{vegas_weight} \end{eqnarray} and then we normalize these projected functions and take the square root, because we squared the function before (notice that the optimal weight function squared is proportional to $f^2$). The projection of the function onto each dimension is then \begin{eqnarray} \widetilde{f}_i(x)= \sqrt{\frac{f_i(x)}{\int_0^1 f_i(x) dx}}, \end{eqnarray} so that we map the interval $[0,1]\rightarrow [0,1]$. This is chosen as our weight function $w_i\propto \widetilde{f}_i$.

In Vegas the weight functions are $$w_i(x) = \frac{1}{\frac{dg_i(x)}{dx}} \propto \widetilde{f}_i(g_i(x))$$ This requires that \begin{eqnarray} \widetilde{f}_i(g_i(x))\frac{dg_i(x)}{dx}=const \end{eqnarray}

This can also be understood as a simple 1D problem involving a change of variable. We aim to find a refined grid $g(x)$, which is the best mesh for the function $\widetilde{f}$. As proven in the statistics part of the course, this requires the cumulative distribution between consecutive grid points to be constant, $$ \int_{g_{l}}^{g_{l+1}} \widetilde{f}(g) dg = const$$ for every $l$. In other words, we want each small interval to contribute equally to the integral when using the improved grid function $g(x)$.

We can also write \begin{eqnarray} \int \widetilde{f}(g)dg = \int \widetilde{f}(g(x))\frac{dg}{dx}dx, \end{eqnarray} and since we would like to have \begin{eqnarray} \widetilde{f}(g(x))\frac{dg}{dx} = const \end{eqnarray} this is the same as the requirement above.

To determine the new grid $g_l$, we determine the grid points numerically so that \begin{eqnarray} \int_0^1 \widetilde{f}(g_{old}) dg_{old} = I\\ \int_{g^{new}_{l-1}}^{g^{new}_l} \widetilde{f}(g_{old})dg_{old} = \frac{I}{N_g} \label{f_int} \end{eqnarray} where $N_g$ is the number of grid points. Hence, we require that there is exactly the same weight between each two consecutive points (between $g_{0}$ and $g_1$, between $g_1$ and $g_2$, ...).
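
A minimal one-dimensional sketch of this repartitioning (our illustration; the notebook's full implementation is the RefineGrid method further below):

In [ ]:
from numpy import *

def refine_1d(g, ftilde):
    """Place new grid points so that each new bin carries the same weight.
       g[0..N-1] are the old grid points (g[N-1]=1), and ftilde[l] is the
       weight accumulated in the old bin ending at g[l]."""
    N = len(g)
    target = sum(ftilde)/N                 # I/N_g : weight per new bin
    gnew = zeros(N)
    acc, ibin, prev, cur = 0.0, -1, 0.0, 0.0
    for l in range(N-1):                   # all but the last point, which stays 1.0
        while acc < target:                # walk old bins until the target weight is passed
            ibin += 1
            acc += ftilde[ibin]
            prev, cur = cur, g[ibin]       # g^old_{l-1} and g^old_l
        acc -= target                      # overshoot inside the current old bin
        gnew[l] = cur - (cur-prev)*acc/ftilde[ibin]  # interpolate back by the overshoot
    gnew[N-1] = 1.0
    return gnew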

Once we have a new grid, we restart the sampling of the integral of $f$, just like in the equations for $⟨f⟩$, $f_1(x_1)$, $f_2(x_2)$, $f_3(x_3)$, but now using $g_{new}$. We again generate a few thousand random sets of points $(x_1,x_2,x_3)$ and obtain new $\widetilde{f}_i(x)$ functions. We repeat this procedure approximately 10 times, and we can slightly increase the number of random points each time, as the grid functions become more and more precise. At the end, we can perform a single long run with 10-times longer sampling to reduce the sampling error bar.

In practice, we will implement one more trick to make the algorithm more stable. When we compute the separable functions $\widetilde{f}_i(x)$, we will smoothen them, because we want to avoid excessive discontinuities due to the finite number of random points. We will just average over nearest neighbors \begin{eqnarray} \widetilde{f}_i \leftarrow \frac{\widetilde{f}_{i-1}+\widetilde{f}_i +\widetilde{f}_{i+1}}{3} \end{eqnarray} Note: be careful at the endpoints, where you need to average over two points only.

Finally, for the grid $g(x)$ we will have the grid of points distributed over $[0,1]$ so that $x_0 = \frac{1}{N}$, $x_1=\frac{2}{N}$, $\cdots$, $x_{N-2}=\frac{N-1}{N}$, $x_{N-1}=1$. We know that $g(x=0)=0$, hence we do not need to save the point $x=0$, but we need to be careful when interpolating at the point $x_0$. For such an equidistant mesh, given a point $x\in [0,1]$, we can compute $i=int(x N)$, and then we know that $x$ lies between $x_{i-1}$ and $x_i$, and linear interpolation gives \begin{eqnarray} g(x) &=& g_{i-1} + (g_{i}-g_{i-1})\frac{x-i/N}{1/N} \quad if\quad i>0\\ g(x) &=& g_{0}\; x N \quad if\quad i=0 \end{eqnarray}
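
A small self-contained sketch of this interpolation (our code; the same logic appears inside the Vegas loops below):

In [ ]:
from numpy import *

def interpolate_g(g, x):
    """Linear interpolation of the grid function at x in [0,1).
       g[0..N-1] stores g(1/N), g(2/N), ..., g(1); g(0)=0 is implicit."""
    N = len(g)
    i = int(x*N)               # x lies between x_{i-1}=i/N and x_i=(i+1)/N
    if i == 0:
        return g[0]*x*N        # g(x) = 0 + (g_0-0)*(x*N)
    return g[i-1] + (g[i]-g[i-1])*(x*N - i)

# quick check on the identity grid g(x)=x:
g = arange(1,129)/128.
print(interpolate_g(g, 0.3))   # prints 0.3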

Finally, we want to discuss the calculation of the error and the confidence in the result. We will perform $M$ outer iterations (which update the grid). Each such iteration consists of $n_i$ function evaluations (of the order of a few thousand to ten thousand). From all these $M\times n_i$ evaluations we want to obtain the best estimate of the integral and an estimate of its error.

At each iteration, we will sample the following quantities: \begin{eqnarray} && ⟨f_w⟩_i = \frac{1}{n_i}\sum_{j=1}^{n_i} f(g_1({x_1}_j),g_2({x_2}_j),g_3({x_3}_j))\frac{dg_1}{dx_1}({x_1}_j)\frac{dg_2}{dx_2}({x_2}_j)\frac{dg_3}{dx_3}({x_3}_j)\\ && ⟨f_w^2⟩_i = \frac{1}{n_i}\sum_{j=1}^{n_i} \left(f(g_1({x_1}_j),g_2({x_2}_j),g_3({x_3}_j))\frac{dg_1}{dx_1}({x_1}_j)\frac{dg_2}{dx_2}({x_2}_j)\frac{dg_3}{dx_3}({x_3}_j)\right)^2 \end{eqnarray} Then the estimate of the squared error of the MC sampling is \begin{eqnarray} \sigma_i^2 = \frac{ ⟨f_w^2⟩_i-⟨f_w⟩_i^2}{n_i-1} \end{eqnarray} Note that the standard deviation of the sampled function itself, $\sqrt{⟨f_w^2⟩_i-⟨f_w⟩_i^2}$, approaches a constant when the number of sampled points $n_i$ goes to infinity. The error of the MC estimate, however, is $\sigma_i \propto \frac{1}{\sqrt{n_i}}$, as expected.

From all accumulated evaluations of the function (in $M$ iterations), we can construct the best estimate of the integral. Naively, we would just calculate $1/M\sum_i ⟨f_w⟩_i$. However, in the first few iterations the error was much bigger than in the last iteration, and therefore we want to penalize those early, less accurate estimates. We achieve that by \begin{eqnarray} I_{best}=\frac{\sum_{i=1}^M \frac{⟨f_w⟩_i}{\sigma_i^2}}{\sum_{i=1}^M\frac{1}{\sigma_i^2}} \end{eqnarray} Similarly, the errors do not add up; the combined error is smaller than the smallest error (that of the last iteration). We have \begin{eqnarray} \frac{1}{\sigma_{best}^2} = \sum_{i=1}^M \frac{1}{\sigma_i^2} \end{eqnarray} and finally $\chi^2$ can be estimated by \begin{eqnarray} \chi^2 = \frac{1}{M-1}\sum_{i=1}^M \frac{(⟨f_w⟩_i -I_{best})^2}{\sigma_i^2} \end{eqnarray}

It measures the weighted squared deviations between the observed data ($⟨f_w⟩_i$) and the model prediction ($I_{best}$). If the estimates are mutually consistent, $\chi^2$ should be close to unity, because each $⟨f_w⟩_i$ should on average lie about $\sigma_i$ away from $I_{best}$.
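
A minimal sketch of this combination step (our code, with made-up per-iteration numbers, not results from the notebook):

In [ ]:
from numpy import *

# hypothetical per-iteration estimates <f_w>_i and their errors sigma_i
avg   = array([1.40,  1.392, 1.3935, 1.3929])
sigma = array([0.02,  0.008, 0.004,  0.002])

w     = 1/sigma**2                         # weights 1/sigma_i^2
Ibest = sum(w*avg)/sum(w)                  # penalizes the early, noisy estimates
err   = sqrt(1/sum(w))                     # combined error, smaller than min(sigma)
chisq = sum(w*(avg-Ibest)**2)/(len(avg)-1)
print(Ibest, '+-', err, 'chi2=', chisq)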

Brute force Monte Carlo¶

First we will implement the simple Monte Carlo method.

In [1]:
from numpy import *  # provides random, sqrt and the vectorized sum used below

# simple class to take care of statistically determined quantities
class Cumulants:
    def __init__(self):
        self.sum=0.0    # f_0 + f_1 +.... + f_N
        self.sqsum=0.0  # f_0^2 + f_1^2 +....+ f_N^2
        self.avg = 0.0  # I_best when many iterations, otherwise <f> = 1/N\sum_i f_i
        self.err = 0.0  # sigma of I_best when many iterations, otherwise sqrt( <f^2>-<f>^2 )/sqrt(N)
        self.chisq = 0.0
        self.weightsum=0.0 # \sum_i 1/sigma_i^2
        self.avgsum=0.0    # \sum_i <f>_i/sigma_i^2
        self.avg2sum=0.0   # \sum_i <f>_i^2/sigma_i^2
        self.neval=0    # number of function evaluations
    def __repr__(self):
        return f"Integral={self.avg}+-{self.err}\nChi2={self.chisq}\nNum of eval={self.neval}"
        
def SimpleMC(integrant, ndim, unit, maxeval, cum):
    """ integrant  -- function to integrate. It should be vectorized function!
        ndim       -- how many dimenions the function has
        unit       -- the size of the box we are using for integration [[0,L],[0,L]...]
        maxeval    -- total number of function evaluations we will perform
        cum        -- class Cumulants, which will hold all information we want to obatin.
    """
    nbatch=1000             # function will be evaluated in batches of 1000 evaluations at one time (for efficiency and storage issues)
    neval=0
    for nsamples in range(maxeval,0,-nbatch):  # loop over all_nsample evaluations in batches of nbatch
        # nsamples counts remaining number of function evaluations
        n = min(nbatch,nsamples)  # How many evaluations in this pass?
        xr = unit*random.random((n,ndim)) # generates 2-d array of random numbers in the interval [0,1)
        wfun = integrant(xr)  # n function evaluations required in single call
        neval += n  # We just added this many function evaluations
        cum.sum += sum(wfun)      # sum_i f_i = <fw> * neval
        cum.sqsum += sum(wfun*wfun)    # sum_i f_i^2 = <fw^2> * neval
    
    cum.avg = cum.sum/neval
    w0 = sqrt(cum.sqsum/neval)  # sqrt(<f^2>)
    cum.err = sqrt((w0-cum.avg)*(w0+cum.avg)/neval) # sqrt((<f^2>-<f>^2)/N)
    cum.avg *= unit**ndim  # adding units if not [0,1] interval
    cum.err *= unit**ndim  # adding units if not [0,1] interval
    return cum

We used here a simple trick to avoid overflow, i.e.,

\begin{eqnarray} \sqrt{\frac{\langle f^2\rangle-\langle f\rangle^2}{N}} = \sqrt{\frac{(\sqrt{\langle f^2\rangle}-\langle f\rangle)(\sqrt{\langle f^2\rangle}+\langle f\rangle)}{N}} \end{eqnarray}

For testing, we will integrate $$ \frac{1}{\pi^3}\int_0^\pi dx\int_0^\pi dy \int_0^\pi dz \frac{1}{1-\cos(x) \cos(y) \cos(z)}$$ This is Watson's integral, known in closed form: its value is $\Gamma(\tfrac{1}{4})^4/(4\pi^3)\approx 1.3932039$.

In [2]:
def my_integrant2(x):
    """ For testing, we are integration the function
       1/(1-cos(x)*cos(y)*cos(z))/pi^3
       in the interval [0,pi]**3
    """
    #nbatch,ndim = shape(x)
    return 1.0/(1.0-cos(x[:,0])*cos(x[:,1])*cos(x[:,2]))/pi**3
In [3]:
from scipy import *
from numpy import *
unit=pi
ndim=3
maxeval=200000
exact = 1.3932  # exact value of the integral
    
cum = Cumulants()
    
SimpleMC(my_integrant2, ndim, pi, maxeval, cum)
print(cum.avg, '+-', cum.err, 'exact=', exact)
print('how many sigma away', abs(cum.avg-exact)/cum.err)
1.3876998767965767 +- 0.02296769165452763 exact= 1.3932
how many sigma away 0.2394721805810668

Vegas¶

First we define the grid points $g(x)$. At the beginning, we just set $g(x)=x$.

In [4]:
class Grid:
    """Contains the grid points g_n(x) with x=[0...1], and g=[0...1]
       for Vegas integration. There are n-dim g_n functions.
       Constraints : g(0)=0 and g(1)=1.
    """
    def __init__(self, ndim, nbins):
        self.g = zeros((ndim,nbins+1))  
        # a bit dirty trick: We will later use also g[-1] in interpolation, which should be set to zero, hence
        # we allocate dimension nbins+1, rather than nbins
        self.ndim=ndim
        self.nbins=nbins
        # At the beginning we set g(x)=x
        # The grid-points are x_0 = 1/N, x_1 = 2/N, ... x_{N-1}=1.0. The extra slot g[nbins]
        # stays zero, so that g[-1]==0 plays the role of g(0)=0, which we skip on the mesh.
        for idim in range(ndim): # over all 3 dimensions
            self.g[idim,:nbins] = arange(1,nbins+1)/float(nbins)
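
A quick check of the extra-slot trick (our example, not from the original notebook): indexing g[dim,-1] picks up the zero stored in the unused last slot, which plays the role of $g_{-1}=g(0)=0$ in the interpolation near $x=0$.

In [ ]:
grid = Grid(ndim=3, nbins=8)
print(grid.g[0,:8])   # [0.125 0.25 ... 1.0] : the initial grid function g(x)=x
print(grid.g[0,-1])   # 0.0 : the extra slot, used as g_{-1}=g(0)=0 in the interpolation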
In [5]:
def Vegas_step1(integrant, unit, maxeval, nstart, nincrease, grid, cum):
    """
        integrant  -- function to integrate. It should be a vectorized function!
        unit       -- the size of the box we are using for integration [[0,L],[0,L]...]
        maxeval    -- total number of function evaluations we will perform
        nstart     -- number of MC steps in the first iteration (before refining the grid)
        nincrease  -- how many points we add once the grid is refined
        grid       -- the class which contains the current grid, and is self-consistently refined.
        cum        -- class Cumulants, which will hold all information we want to obtain.
    """
    ndim, nbins = grid.ndim,grid.nbins  # dimension of the integral, size of the grid for binning in each direction
    unit_dim = unit**ndim   # converts from unit cube integration to generalized cube with unit length
    nbatch=1000             # function will be evaluated in batches of 1000 evaluations at one time (for efficiency and storage issues)
    neval=0
    print ("""Vegas parameters:
       ndim = """+str(ndim)+"""
       unit = """+str(unit)+"""
       maxeval = """+str(maxeval)+"""
       nstart = """+str(nstart)+"""
       nincrease = """+str(nincrease)+"""
       nbins = """+str(nbins)+"""
       nbaths = """+str(nbatch)+"\n")

    bins = zeros((nbatch,ndim),dtype=int) # in which sampled bin does this point fall?
    
    all_nsamples = nstart
    for nsamples in range(all_nsamples,0,-nbatch):  # loop over all_nsample evaluations in batches of nbatch
        # nsamples is the number of points left to evaluate
        n = min(nbatch,nsamples)  # How many evaluations in this pass?
        # We are integrating f(g_1(x),g_2(y),g_3(z))*dg_1/dx*dg_2/dy*dg_3/dz dx*dy*dz
        # This is represented as  1/all_nsamples \sum_{x_i,y_i,z_i} f(g_1(x_i),g_2(y_i),g_3(z_i))*dg_1/dx*dg_2/dy*dg_3/dz
        #  where dg_1/dx = diff*NBINS
        wgh = zeros(n)               # weights for each random point in the batch
        xr = random.random((n,ndim)) # generates 2-d array of random numbers in the interval [0,1)
        # This part takes quite a lot of time and it would be nice to rewrite with numba!
        for i in range(n):   # over the batch, which is n-items long
            weight = 1.0/all_nsamples    # weight in this point of the batch, i.e., one configuration
            for dim in range(ndim):      # over all dimensions of the function evaluation
                # We want to evaluate the function f at point g(x), i.e, f(g_1(x_1),g_2(x_2),...)
                # Here we transform the points x_1,x_2,x_3 -> g_1(x_1), g_2(x_2), g_3(x_3)
                # We hence want to evaluate g(x) ~ g(x[i]), where x is the random number and g is the grid function
                # The discretized g(t) is defined on the grid :
                #       t[-1]=0, t[0]=1/N, t[1]=2/N, t[2]=3/N ... t[N-1]=1.
                # We know that g(0)=0 and g(1)=1, so that g[-1]=0.0 and g[N-1]=1.0
                # To interpolate g at x, we first compute  i=int(x*N) and then we use linear interpolation
                # g(x) = g[i-1] + (g[i]-g[i-1])*(x*N-i)  ;  if i>0
                # g(x) =   0    + (g[0]-0)*(x*N-0)       ;  if i=0
                #
                pos = xr[i,dim]*nbins               # which grid would it fit ? (x*N)
                ipos = int(pos)                     # the grid position is ipos : int(x*N)==i    
                diff = grid.g[dim,ipos] - grid.g[dim,ipos-1]  # g[i]-g[i-1]
                # linear interpolation for g(x) : 
                xr[i,dim] = (grid.g[dim,ipos-1] + (pos-ipos)*diff)*unit # g(xr) ~ ( g[i-1]+(g[i]-g[i-1])*(x*N-i) )*[units]
                bins[i,dim]=ipos                    # remember in which bin this random number falls.
                weight *= diff*nbins                #  weight for this dimension is dg/dx = (g[i]-g[i-1])*N
                                                    # because dx = i/N - (i-1)/N = 1/N
            wgh[i] = weight # total weight for this point is (dg/dx)*(dg/dy)*(dg/dz).../N_{samples}
        # Here we evaluate function f on all randomly generated x points above
        fx = integrant(xr)  # n function evaluations required in single call
        neval += n  # We just added this many function evaluations
        
        # Now we compute the integral as weighted average, namely, f(g(x))*dg/dx
        wfun = wgh * fx           # weight * function ~ f_i*w_i, here w_i has 1/N in its weight, hence actually we are 
                                  # evaluating  1/N * sum_i f_i*w_i
        cum.sum += sum(wfun)      # 1/N sum_i f_i*w_i = <fw>
        wfun *= wfun              # careful : we need 1/N (f_i w_i)^2, while this gives (f_i w_i/N)^2
        cum.sqsum += sum(wfun)    # sum_i (f_i*w_i/N)^2 = <fw^2>/all_nsamples
                                  # 
    cum.neval = neval 
    w0 = sqrt(cum.sqsum*all_nsamples)  # w0 = sqrt(<fw^2>)
    w1 = (w0 + cum.sum)*(w0 - cum.sum) # w1 = (w0^2 - <fw>^2) = (<fw^2>-<fw>^2)
    w = (all_nsamples-1)/w1            # w ~ 1/sigma_i^2 = (N-1)/(<fw^2>-<fw>^2)
    # Note that the error of the MC sampling is sigma_i^2 = (<f^2>-<f>^2)/N, hence w ~ 1/sigma_i^2
    cum.weightsum += w          # weightsum ~ \sum_i 1/sigma_i^2
    cum.avgsum += w*cum.sum     # avgsum    ~ \sum_i <fw>_i / sigma_i^2
    
    cum.avg = cum.avgsum/cum.weightsum     # I_best = (\sum_i <fw>_i/sigma_i^2 )/(\sum_i 1/sigma_i^2)
    cum.err = sqrt(1/cum.weightsum)        # err ~ sqrt(best sigma^2) = sqrt(1/(\sum_i 1/sigma_i^2))

    cum.avg *= unit**ndim
    cum.err *= unit**ndim
In [6]:
from scipy import *
from numpy import *

unit=pi
ndim=3
maxeval=200000
exact = 1.3932  # exact value of the integral
    
cum = Cumulants()
    
nbins=128
nstart =10000
nincrease=5000
grid = Grid(ndim,nbins)
#random.seed(0)

#SimpleMC(my_integrant2, ndim, pi, maxeval, cum)
Vegas_step1(my_integrant2, pi, maxeval, nstart, nincrease, grid, cum)
print (cum.avg, '+-', cum.err, 'exact=', exact)
print(cum)
Vegas parameters:
       ndim = 3
       unit = 3.141592653589793
       maxeval = 200000
       nstart = 10000
       nincrease = 5000
       nbins = 128
       nbatch = 1000

1.420606722953498 +- 0.04795781868621445 exact= 1.3932
Integral=1.420606722953498+-0.04795781868621445
Chi2=0.0
Num of eval=10000

Now we are going to insert the sampling of the projections of the function $f(x)$ onto all axes, $f_1(x), f_2(y)...$, from which we will calculate the new grids $g_1(x), g_2(y)...$.

In [7]:
def Vegas_step2(integrant, unit, maxeval, nstart, nincrease, grid, cum):
    ndim, nbins = grid.ndim,grid.nbins  # dimension of the integral, size of the grid for binning in each direction
    unit_dim = unit**ndim   # converts from unit cube integration to generalized cube with unit length
    nbatch=1000             # function will be evaluated in batches of 1000 evaluations at one time (for efficiency and storage issues)
    neval=0
    print ("""Vegas parameters:
       ndim = """+str(ndim)+"""
       unit = """+str(unit)+"""
       maxeval = """+str(maxeval)+"""
       nstart = """+str(nstart)+"""
       nincrease = """+str(nincrease)+"""
       nbins = """+str(nbins)+"""
       nbaths = """+str(nbatch)+"\n")

    bins = zeros((nbatch,ndim),dtype=int) # in which sampled bin does this point fall?
    
    all_nsamples = nstart  
    # fxbin is not the function being integrated; it is an envelope for importance sampling.
    fxbin = zeros((ndim,nbins))    #new2: after each iteration we reset the average function being binned
    for nsamples in range(all_nsamples,0,-nbatch):  # loop over all_nsample evaluations in batches of nbatch
        n = min(nbatch,nsamples)  # How many evaluations in this pass?
        # We are integrating f(g_1(x),g_2(y),g_3(z))*dg_1/dx*dg_2/dy*dg_3/dz dx*dy*dz
        # This is represented as  1/all_nsamples \sum_{x_i,y_i,z_i} f(g_1(x_i),g_2(y_i),g_3(z_i))*dg_1/dx*dg_2/dy*dg_3/dz
        #  where dg_1/dx = diff*NBINS
        wgh = zeros(n)               # weights for each random point in the batch
        xr = random.random((n,ndim)) # generates 2-d array of random numbers in the interval [0,1)
        for i in range(n):
            weight = 1.0/all_nsamples
            for dim in range(ndim):
                # We want to evaluate the function f at point g(x), i.e, f(g_1(x),g_2(y),...)
                # Here we transform the points x,y,z -> g_1(x), g_2(y), g_3(z)
                # We hence want to evaluate g(x) ~ g(x[i]), where x is the random number and g is the grid function
                # The discretized g(t) is defined on the grid :
                #       t[-1]=0, t[0]=1/N, t[1]=2/N, t[2]=3/N ... t[N-1]=1.
                # We know that g(0)=0 and g(1)=1, so that g[-1]=0.0 and g[N-1]=1.0
                # To interpolate g at x, we first compute  i=int(x*N) and then we use linear interpolation
                # g(x) = g[i-1] + (g[i]-g[i-1])*(x*N-i)  ;  if i>0
                # g(x) =   0    + (g[0]-0)*(x*N-0)       ;  if i=0
                #
                pos = xr[i,dim]*nbins               # which grid would it fit ? (x*N)
                ipos = int(pos)                     # the grid position is ipos : int(x*N)==i
                diff = grid.g[dim,ipos] - grid.g[dim,ipos-1]  # g[i]-g[i-1]
                # linear interpolation for g(x) : 
                xr[i,dim] = (grid.g[dim,ipos-1] + (pos-ipos)*diff)*unit # g(xr) ~ ( g[i-1]+(g[i]-g[i-1])*(x*N-i) )*[units]
                bins[i,dim]=ipos                    # remember in which bin this random number falls.
                weight *= diff*nbins                #  weight for this dimension is dg/dx = (g[i]-g[i-1])*N
                                                    # because dx = i/N - (i-1)/N = 1/N
            wgh[i] = weight # total weight is  (dg/dx)*(dg/dy)*(dg/dz).../N_{samples}
        # Here we evaluate function f on all randomly generated x points above
        fx = integrant(xr)  # n function evaluations required in single call
        neval += n  # We just added this many function evaluations

        # Now we compute the integral as weighted average, namely, f(g(x))*dg/dx
        wfun = wgh * fx           # weight * function ~ f_i*w_i
        cum.sum += sum(wfun)      # sum_i f_i*w_i = <fw>
        wfun *= wfun              # careful : this is like  (f_i * w_i/N)^2 hence  1/N (1/N (f_i*w_i)^2)
        cum.sqsum += sum(wfun)    # sum_i (f_i*w_i)^2 = <fw^2>/all_nsamples
                                  # 
        for dim in range(ndim):   #new2
            # projection of the sampled function^2 to each dimension, which will be used to improve the grid.
            for i in range(n):    #new2
                # loop over x_i: fxbin[dim,x_i] += sum_{x_1,x_2,..but not x_i}|f(g_1(x_1),g_2(x_2),...)*dg_1/dx*dg_2/dx...|^2
                fxbin[dim, bins[i,dim] ] += wfun[i] #new2: just bin the function f. We saved the bin position before.
                
    cum.neval = neval        
    w0 = sqrt(cum.sqsum*all_nsamples)  # w0 = sqrt(<fw^2>)
    w1 = (w0 + cum.sum)*(w0 - cum.sum) # w1 = (w0^2 - <fw>^2) = (<fw^2>-<fw>^2)
    w = (all_nsamples-1)/w1            # w ~ 1/sigma_i^2 = (N-1)/(<fw^2>-<fw>^2)
    # Note that the error of the MC sampling is sigma_i^2 = (<f^2>-<f>^2)/N, hence w ~ 1/sigma_i^2
    cum.weightsum += w          # weightsum ~ \sum_i 1/sigma_i^2
    cum.avgsum += w*cum.sum     # avgsum    ~ \sum_i <fw>_i / sigma_i^2
    
    cum.avg = cum.avgsum/cum.weightsum     # I_best = (\sum_i <fw>_i/sigma_i^2 )/(\sum_i 1/sigma_i^2)
    cum.err = sqrt(1/cum.weightsum)        # err ~ sqrt(best sigma^2) = sqrt(1/(\sum_i 1/sigma_i^2))
     
    cum.avg *= unit**ndim
    cum.err *= unit**ndim
    
    return fxbin
In [8]:
from scipy import *
from numpy import *

unit=pi
ndim=3
maxeval=200000
exact = 1.3932  # exact value of the integral
    
cum = Cumulants()
    
nbins=128
nstart =10000
nincrease=5000
grid = Grid(ndim,nbins)

#SimpleMC(my_integrant2, ndim, pi, maxeval, cum)
fxbin = Vegas_step2(my_integrant2, pi, maxeval, nstart, nincrease, grid, cum)
#print(cum.avg, '+-', cum.err, 'exact=', exact)
print('exact=', exact)
print(cum)
Vegas parameters:
       ndim = 3
       unit = 3.141592653589793
       maxeval = 200000
       nstart = 10000
       nincrease = 5000
       nbins = 128
       nbatch = 1000

exact= 1.3932
Integral=1.3451646613018884+-0.02410637407853138
Chi2=0.0
Num of eval=10000
In [9]:
from pylab import *
%matplotlib inline

plot(fxbin[0])
plot(fxbin[1])
plot(fxbin[2])
#xlim([0,128])
ylim([0,1e-7])
show()
[figure: the binned projections fxbin[0], fxbin[1], fxbin[2] versus bin index]
In [10]:
def Smoothen(fxbin):
    (ndim,nbins) = shape(fxbin)
    final = zeros(shape(fxbin))
    for idim in range(ndim):
        fxb = copy(fxbin[idim,:]) # the binned projection f_i(x), to be smoothed
        #**** smooth the widetilde{f} value stored for each bin ****
        # f[i] <- (f[i+1]+f[i]+f[i-1])/3.
        fxb[:nbins-1] += fxbin[idim,1:nbins]
        fxb[1:nbins]  += fxbin[idim,:nbins-1]
        fxb[1:nbins-1] *= 1/3.
        fxb[0] *= 1/2.
        fxb[nbins-1] *= 1/2.
        norm = sum(fxb)
        if( norm == 0 ):
            print ('ERROR can not refine the grid with zero grid function')
            return # can not refine the grid if the function is zero.
        fxb *= 1.0/norm         # we normalize the function.
        # Note that the normalization is such that the sum is 1.
        # And then we take the square root of the absolute value, just in case the value is negative at some point.
        final[idim,:] = sqrt(abs(fxb))
    return final
In [11]:
from pylab import *
%matplotlib inline

imp = Smoothen(fxbin)

plot(imp[0])
plot(imp[1])
plot(imp[2])
xlim([0,128])
show()
[figure: the smoothened projections imp[0], imp[1], imp[2] versus bin index]

Note that we require $$f(g(x))\frac{dg}{dx} = const$$ which is equivalent to saying that the weight function is $$w(x) \propto f(g(x))$$ because $\frac{dg}{dx}=\frac{1}{w(x)}$.

Next we want to rewrite this identity in a more useful way: over one grid cell of width $dx$ we have $$f(g(x))\frac{dg}{dx} dx = const\cdot dx = \int_{g(x)}^{g(x)+\frac{dg}{dx}dx} f(t) dt$$ which when discretized becomes

$$\int_{g_{l-1}}^{g_l} f(g_{old}) dg_{old} = \frac{I}{N_g}$$

where $I=\int_0^1 f(g_{old}) dg_{old}$ and $N_g$ is the number of discrete points $l$.

In [18]:
class Grid:
    """Contains the grid points g_n(x) with x=[0...1], and g=[0...1]
       for Vegas integration. There are n-dim g_n functions.
       Constraints : g(0)=0 and g(1)=1.
    """
    def __init__(self, ndim, nbins):
        self.g = zeros((ndim,nbins+1))  
        # a bit dirty trick: We will later use also g[-1] in interpolation, which should be set to zero, hence
        # we allocate dimension nbins+1, rather than nbins
        self.ndim=ndim
        self.nbins=nbins
        # At the beginning we set g(x)=x
        # The grid-points are x_0 = 1/N, x_1 = 2/N, ... x_{N-1}=1.0. 
        # Note that g(0)=0, and we skip this point on the mesh.
        for idim in range(ndim):
            self.g[idim,:nbins] = arange(1,nbins+1)/float(nbins)
            
    def RefineGrid(self, imp):
        # imp[idim,ibin] is the input function \widetilde{f}(g) from which we construct the grid
        (ndim,nbins) = shape(imp)
        gnew = zeros((ndim,nbins+1))
        for idim in range(ndim):
            avgperbin = sum(imp[idim,:])/nbins
            #**** redefine the size of each bin  ****
            newgrid = zeros(nbins)
            cur=0.0
            newcur=0.0
            thisbin = 0.0
            ibin = -1
            # we are trying to determine
            #   Int[ f(g) dg, {g, g_{i-1},g_i}] == I/N_g
            #   where I == avgperbin
            for newbin in range(nbins-1):  # all but the last bin, which is 1.0
                while (thisbin < avgperbin) :
                    ibin+=1
                    thisbin += imp[idim,ibin] # adding widetilde{f}(g)
                    prev = cur               # g^old_{l-1}
                    cur = self.g[idim,ibin]  # g^old_{l}
                # Explanation is in order : 
                #   g^new_l should be somewhere between g^old_l and g^old_{l-1}, because adding the last point
                #      exceeded the I/Ng value, while without the last point we do not have enough. Hence we need to interpolate.
                #   prev    -- g^{old}_{l-1}
                #   cur     -- g^{old}_l
                #   thisbin -- Sm = f_{l-k}+.... +f_{l-2}+f_{l-1}+f_l
                #   we know that  Sm is just a bit more than we need, i.e., I/N_g, hence we need to compute how much more
                #   using linear interpolation :
                #   g^{new} = g_l - (g_l-g_{l-1}) * (f_{l-k}+....+f_{l-2}+f_{l-1}+f_l - I/N_g)/f_l
                #    clearly
                #         if I/N_g == f_{l-k}+....+f_{l-2}+f_{l-1}+f_l
                #            we will get g^{new} = g_l
                #     and if I/N_g == f_{l-k}+....+f_{l-2}+f_{l-1}
                #            we will get g^{new} = g_{l-1}
                #     and if I/N_g  is between the two possibilities, we will get linear interpolation between
                #     g_{l-1} and g_l
                #     
                thisbin -= avgperbin   # thisbin <- (f_{l-k}+....+f_{l-2}+f_{l-1}+f_l - I/N_g)
                dx=thisbin/imp[idim,ibin] # dx is between 0 and 1
                # cur=g_l and prev=g_{l-1} and dx is the fraction of each we need to take for linear interpolation
                newgrid[newbin] = cur + (prev-cur)*dx # linear interpolation between g_l and g_{l-1}

            newgrid[nbins-1]=1.0
            gnew[idim,:nbins]= newgrid
        self.g = gnew
        return gnew

We updated the class Grid above so that it can now refine the grid.

In [19]:
from scipy import *
from numpy import *

unit=pi
ndim=3
maxeval=200000
exact = 1.3932  # exact value of the integral
    
cum = Cumulants()
    
nbins=128
nstart =10000
nincrease=5000

grid = Grid(ndim,nbins)

fxbin = Vegas_step2(my_integrant2, pi, maxeval, nstart, nincrease, grid, cum)
imp = Smoothen(fxbin)
grid.RefineGrid(imp)
print(cum.avg, '+-', cum.err, 'exact=', exact)

plot(grid.g[0,:nbins])
plot(grid.g[1,:nbins])
plot(grid.g[2,:nbins])
xlim([0,128])
show()
Vegas parameters:
       ndim = 3
       unit = 3.141592653589793
       maxeval = 200000
       nstart = 10000
       nincrease = 5000
       nbins = 128
       nbatch = 1000

1.3755414672846529 +- 0.03599525540482765 exact= 1.3932
[figure: the refined grid functions grid.g[0..2] versus bin index]
In [20]:
def Vegas_step3(integrant, unit, maxeval, nstart, nincrease, grid, cum):
    ndim, nbins = grid.ndim,grid.nbins  # dimension of the integral, size of the grid for binning in each direction
    unit_dim = unit**ndim   # converts from unit cube integration to generalized cube with unit length
    nbatch=1000             # function will be evaluated in batches of 1000 evaluations at one time (for efficiency and storage issues)
    neval=0
    print ("""Vegas parameters:
       ndim = """+str(ndim)+"""
       unit = """+str(unit)+"""
       maxeval = """+str(maxeval)+"""
       nstart = """+str(nstart)+"""
       nincrease = """+str(nincrease)+"""
       nbins = """+str(nbins)+"""
       nbaths = """+str(nbatch)+"\n")

    bins = zeros((nbatch,ndim),dtype=int) # in which sampled bin does this point fall?
    all_nsamples = nstart
    for iter in range(1000):         # NEW in step 3
        fxbin = zeros((ndim,nbins))    # after each iteration we reset the average function being binned
        for nsamples in range(all_nsamples,0,-nbatch):  # loop over all_nsample evaluations in batches of nbatch
            n = min(nbatch,nsamples)  # How many evaluations in this pass?
            # We are integrating f(g_1(x),g_2(y),g_3(z))*dg_1/dx*dg_2/dy*dg_3/dz dx*dy*dz
            # This is represented as  1/all_nsamples \sum_{x_i,y_i,z_i} f(g_1(x_i),g_2(y_i),g_3(z_i))*dg_1/dx*dg_2/dy*dg_3/dz
            #  where dg_1/dx = diff*NBINS
            wgh = zeros(n)             # weights for each random point in the batch
            xr = random.random((n,ndim)) # generates 2-d array of random numbers in the interval [0,1)
            for i in range(n):
                weight = 1.0/all_nsamples
                for dim in range(ndim):
                    # We want to evaluate the function f at point g(x), i.e, f(g_1(x),g_2(y),...)
                    # Here we transform the points x,y,z -> g_1(x), g_2(y), g_3(z)
                    # We hence want to evaluate g(x) ~ g(x[i]), where x is the random number and g is the grid function
                    # The discretized g(t) is defined on the grid :
                    #       t[-1]=0, t[0]=1/N, t[1]=2/N, t[2]=3/N ... t[N-1]=1.
                    # We know that g(0)=0 and g(1)=1, so that g[-1]=0.0 and g[N-1]=1.0
                    # To interpolate g at x, we first compute  i=int(x*N) and then we use linear interpolation
                    # g(x) = g[i-1] + (g[i]-g[i-1])*(x*N-i)  ;  if i>0
                    # g(x) =   0    + (g[0]-0)*(x*N-0)       ;  if i=0
                    #
                    pos = xr[i,dim]*nbins               # which grid would it fit ? (x*N)
                    ipos = int(pos)                     # the grid position is ipos : int(x*N)==i
                    diff = grid.g[dim,ipos] - grid.g[dim,ipos-1]  # g[i]-g[i-1]
                    # linear interpolation for g(x) : 
                    xr[i,dim] = (grid.g[dim,ipos-1] + (pos-ipos)*diff)*unit # g(xr) ~ ( g[i-1]+(g[i]-g[i-1])*(x*N-i) )*[units]
                    bins[i,dim]=ipos                    # remember in which bin this random number falls.
                    weight *= diff*nbins                #  weight for this dimension is dg/dx = (g[i]-g[i-1])*N
                                                        # because dx = i/N - (i-1)/N = 1/N
                wgh[i] = weight # total weight is  (dg/dx)*(dg/dy)*(dg/dz).../N_{samples}
            
            # Here we evaluate function f on all randomly generated x points above
            fx = integrant(xr)  # n function evaluations required in single call
            neval += n  # We just added this many function evaluations
            
            # Now we compute the integral as weighted average, namely, f(g(x))*dg/dx
            wfun = wgh * fx           # weight * function ~ f_i*w_i            
            cum.sum += sum(wfun)      # sum_i f_i*w_i = <fw>
            wfun *= wfun              # careful : this is like  (f_i * w_i/N)^2 hence  1/N (1/N (f_i*w_i)^2)
            cum.sqsum += sum(wfun)    # sum_i (f_i*w_i)^2 = <fw^2>/all_nsamples
                                      # 
            for dim in range(ndim):   #new2
                # Here we make a better approximation for the function, which we are integrating.
                for i in range(n):    #new2
                    fxbin[dim, bins[i,dim] ] += wfun[i] #new2: just bin the function f. We saved the bin position before.
            
        w0 = sqrt(cum.sqsum*all_nsamples)  # w0 = sqrt(<fw^2>)
        w1 = (w0 + cum.sum)*(w0 - cum.sum) # w1 = (w0^2 - <fw>^2) = (<fw^2>-<fw>^2)
        w = (all_nsamples-1)/w1            # w ~ 1/sigma_i^2 = (N-1)/(<fw^2>-<fw>^2)
        # Note that the error of the MC sampling is sigma_i^2 = (<f^2>-<f>^2)/N, hence w ~ 1/sigma_i^2
        cum.weightsum += w          # weightsum ~ \sum_i 1/sigma_i^2
        cum.avgsum += w*cum.sum     # avgsum    ~ \sum_i <fw>_i / sigma_i^2
        cum.avg2sum += w*cum.sum**2  # avg2sum   ~ \sum_i <fw>_i^2/sigma_i^2
        
        cum.avg = cum.avgsum/cum.weightsum     # I_best = (\sum_i <fw>_i/sigma_i^2 )/(\sum_i 1/sigma_i^2)
        cum.err = sqrt(1/cum.weightsum)        # err ~ sqrt(best sigma^2) = sqrt(1/(\sum_i 1/sigma_i^2))
     
        # NEW in this step3
        if iter>0:
            # <f_w>_i^2/sigma_i^2 - 2*<f_w>_i/sigma_i^2 * I_best + 1/sigma_i^2 * I_best^2 = (<f_w>_i-I_best)^2/sigma_i^2
            cum.chisq = (cum.avg2sum - 2*cum.avgsum*cum.avg + cum.weightsum*cum.avg**2)/iter
    
        print ("Iteration {:3d}: I= {:10.8f} +- {:10.8f}  chisq= {:10.8f} number of evaluations = {:7d} ".format(iter+1, cum.avg*unit_dim, cum.err*unit_dim, cum.chisq, neval))
        imp = Smoothen(fxbin)
        grid.RefineGrid(imp)
        
        cum.sum=0                    # clear the partial sum for the next step
        cum.sqsum=0
        all_nsamples += nincrease    # for the next time, increase the number of steps a bit
        if (neval>=maxeval): break
    cum.neval = neval 
    cum.avg *= unit**ndim
    cum.err *= unit**ndim
In [21]:
from scipy import *
from numpy import *

unit=pi
ndim=3
maxeval=20000000
exact = 1.3932  # exact value of the integral
    
cum = Cumulants()
    
nbins=128
nstart =100000
nincrease=5000

grid = Grid(ndim,nbins)

#random.seed(0)

Vegas_step3(my_integrant2, pi, maxeval, nstart, nincrease, grid, cum)

print (cum.avg, '+-', cum.err, 'exact=', exact, 'real error=', abs(cum.avg-exact)/exact)

plot(grid.g[0,:nbins])
plot(grid.g[1,:nbins])
plot(grid.g[2,:nbins])
show()
Vegas parameters:
       ndim = 3
       unit = 3.141592653589793
       maxeval = 20000000
       nstart = 100000
       nincrease = 5000
       nbins = 128
       nbatch = 1000

Iteration   1: I= 1.39319392 +- 0.01421017  chisq= 0.00000000 number of evaluations =  100000 
Iteration   2: I= 1.39166124 +- 0.00665566  chisq= 0.01490245 number of evaluations =  205000 
Iteration   3: I= 1.39325001 +- 0.00422883  chisq= 0.05523115 number of evaluations =  315000 
Iteration   4: I= 1.39354406 +- 0.00327450  chisq= 0.04084582 number of evaluations =  430000 
Iteration   5: I= 1.39211524 +- 0.00265027  chisq= 0.16863639 number of evaluations =  550000 
Iteration   6: I= 1.39135424 +- 0.00224732  chisq= 0.19359891 number of evaluations =  675000 
Iteration   7: I= 1.39112188 +- 0.00196600  chisq= 0.16892397 number of evaluations =  805000 
Iteration   8: I= 1.39126217 +- 0.00176274  chisq= 0.14850175 number of evaluations =  940000 
Iteration   9: I= 1.39164834 +- 0.00162061  chisq= 0.16870194 number of evaluations = 1080000 
Iteration  10: I= 1.39212509 +- 0.00150431  chisq= 0.21944941 number of evaluations = 1225000 
Iteration  11: I= 1.39241396 +- 0.00143549  chisq= 0.23874840 number of evaluations = 1375000 
Iteration  12: I= 1.39283302 +- 0.00136110  chisq= 0.29377766 number of evaluations = 1530000 
Iteration  13: I= 1.39333090 +- 0.00131411  chisq= 0.43364985 number of evaluations = 1690000 
Iteration  14: I= 1.39317381 +- 0.00125313  chisq= 0.41241620 number of evaluations = 1855000 
Iteration  15: I= 1.39324298 +- 0.00118493  chisq= 0.38501271 number of evaluations = 2025000 
Iteration  16: I= 1.39322328 +- 0.00112878  chisq= 0.35954438 number of evaluations = 2200000 
Iteration  17: I= 1.39346160 +- 0.00107611  chisq= 0.36763604 number of evaluations = 2380000 
Iteration  18: I= 1.39378891 +- 0.00103012  chisq= 0.41107725 number of evaluations = 2565000 
Iteration  19: I= 1.39400135 +- 0.00098653  chisq= 0.41675899 number of evaluations = 2755000 
Iteration  20: I= 1.39423218 +- 0.00094804  chisq= 0.43248972 number of evaluations = 2950000 
Iteration  21: I= 1.39419298 +- 0.00091240  chisq= 0.41202385 number of evaluations = 3150000 
Iteration  22: I= 1.39417226 +- 0.00087909  chisq= 0.39274609 number of evaluations = 3355000 
Iteration  23: I= 1.39419219 +- 0.00084856  chisq= 0.37523633 number of evaluations = 3565000 
Iteration  24: I= 1.39405876 +- 0.00082513  chisq= 0.37866580 number of evaluations = 3780000 
Iteration  25: I= 1.39379985 +- 0.00080110  chisq= 0.43435818 number of evaluations = 4000000 
Iteration  26: I= 1.39365609 +- 0.00077658  chisq= 0.43834991 number of evaluations = 4225000 
Iteration  27: I= 1.39348180 +- 0.00075311  chisq= 0.45403825 number of evaluations = 4455000 
Iteration  28: I= 1.39349041 +- 0.00073265  chisq= 0.43731242 number of evaluations = 4690000 
Iteration  29: I= 1.39375280 +- 0.00071273  chisq= 0.50709855 number of evaluations = 4930000 
Iteration  30: I= 1.39370242 +- 0.00069346  chisq= 0.49284113 number of evaluations = 5175000 
Iteration  31: I= 1.39374949 +- 0.00069225  chisq= 0.52042168 number of evaluations = 5425000 
Iteration  32: I= 1.39366095 +- 0.00068674  chisq= 0.53690296 number of evaluations = 5680000 
Iteration  33: I= 1.39394256 +- 0.00066987  chisq= 0.62842382 number of evaluations = 5940000 
Iteration  34: I= 1.39370695 +- 0.00065366  chisq= 0.68778910 number of evaluations = 6205000 
Iteration  35: I= 1.39365161 +- 0.00063742  chisq= 0.67185664 number of evaluations = 6475000 
Iteration  36: I= 1.39326622 +- 0.00062134  chisq= 0.86230497 number of evaluations = 6750000 
Iteration  37: I= 1.39321597 +- 0.00060778  chisq= 0.84256073 number of evaluations = 7030000 
Iteration  38: I= 1.39331900 +- 0.00060012  chisq= 0.85078566 number of evaluations = 7315000 
Iteration  39: I= 1.39329797 +- 0.00059280  chisq= 0.82972947 number of evaluations = 7605000 
Iteration  40: I= 1.39315448 +- 0.00058198  chisq= 0.84996371 number of evaluations = 7900000 
Iteration  41: I= 1.39304623 +- 0.00056942  chisq= 0.84898100 number of evaluations = 8200000 
Iteration  42: I= 1.39308702 +- 0.00055770  chisq= 0.83134578 number of evaluations = 8505000 
Iteration  43: I= 1.39312125 +- 0.00054728  chisq= 0.81397528 number of evaluations = 8815000 
Iteration  44: I= 1.39303878 +- 0.00053668  chisq= 0.80882363 number of evaluations = 9130000 
Iteration  45: I= 1.39290541 +- 0.00052589  chisq= 0.82566949 number of evaluations = 9450000 
Iteration  46: I= 1.39296894 +- 0.00051761  chisq= 0.81769965 number of evaluations = 9775000 
Iteration  47: I= 1.39295123 +- 0.00050844  chisq= 0.80064839 number of evaluations = 10105000 
Iteration  48: I= 1.39273842 +- 0.00049836  chisq= 0.87849227 number of evaluations = 10440000 
Iteration  49: I= 1.39264595 +- 0.00048918  chisq= 0.87984569 number of evaluations = 10780000 
Iteration  50: I= 1.39270314 +- 0.00048197  chisq= 0.87142999 number of evaluations = 11125000 
Iteration  51: I= 1.39274585 +- 0.00047462  chisq= 0.85918565 number of evaluations = 11475000 
Iteration  52: I= 1.39271355 +- 0.00046679  chisq= 0.84511617 number of evaluations = 11830000 
Iteration  53: I= 1.39296693 +- 0.00045884  chisq= 0.99663717 number of evaluations = 12190000 
Iteration  54: I= 1.39305174 +- 0.00045093  chisq= 0.99668884 number of evaluations = 12555000 
Iteration  55: I= 1.39302420 +- 0.00044363  chisq= 0.98038397 number of evaluations = 12925000 
Iteration  56: I= 1.39280309 +- 0.00043632  chisq= 1.10078322 number of evaluations = 13300000 
Iteration  57: I= 1.39279187 +- 0.00042936  chisq= 1.08149900 number of evaluations = 13680000 
Iteration  58: I= 1.39275704 +- 0.00042279  chisq= 1.06632309 number of evaluations = 14065000 
Iteration  59: I= 1.39275456 +- 0.00041640  chisq= 1.04795808 number of evaluations = 14455000 
Iteration  60: I= 1.39266859 +- 0.00040998  chisq= 1.05380857 number of evaluations = 14850000 
Iteration  61: I= 1.39273881 +- 0.00040659  chisq= 1.06589461 number of evaluations = 15250000 
Iteration  62: I= 1.39283231 +- 0.00040281  chisq= 1.09532789 number of evaluations = 15655000 
Iteration  63: I= 1.39287865 +- 0.00039770  chisq= 1.08612440 number of evaluations = 16065000 
Iteration  64: I= 1.39287793 +- 0.00039369  chisq= 1.06888691 number of evaluations = 16480000 
Iteration  65: I= 1.39281740 +- 0.00038890  chisq= 1.06744229 number of evaluations = 16900000 
Iteration  66: I= 1.39275577 +- 0.00038363  chisq= 1.06537797 number of evaluations = 17325000 
Iteration  67: I= 1.39267141 +- 0.00037871  chisq= 1.07800613 number of evaluations = 17755000 
Iteration  68: I= 1.39270363 +- 0.00037412  chisq= 1.06640414 number of evaluations = 18190000 
Iteration  69: I= 1.39267452 +- 0.00036931  chisq= 1.05420292 number of evaluations = 18630000 
Iteration  70: I= 1.39263768 +- 0.00036442  chisq= 1.04440223 number of evaluations = 19075000 
Iteration  71: I= 1.39260211 +- 0.00035959  chisq= 1.03465566 number of evaluations = 19525000 
Iteration  72: I= 1.39278014 +- 0.00035547  chisq= 1.17164777 number of evaluations = 19980000 
Iteration  73: I= 1.39284895 +- 0.00035124  chisq= 1.17734545 number of evaluations = 20440000 
1.3928489450759955 +- 0.0003512352693772236 exact= 1.3932 real error= 0.0002519774074105985
[figure: the final grid functions g(x) for all three dimensions after Vegas_step3]

Here we are going to speed up the code using vectorized numpy capabilities and numba.

Numba will be used to fill the fxbin array from the wfun array, which contains the projections of the integrated function onto all axes.

To interpolate the grid at the random points xr, we will use numpy vectorized functionality and fancy indexing to remove the loop over the batch. All loops over the batch size n are gone in the final version of the code.
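
A tiny illustration of the fancy-indexing idea (our example): a single gather replaces the per-point python loop, and the extra zero slot makes the wrap-around index i-1=-1 return $g_{-1}=0$ automatically.

In [ ]:
from numpy import *

g = zeros(9); g[:8] = arange(1,9)/8.   # 8-bin grid g(x)=x, plus the zero slot g[-1]=0
xr = random.random(5)                  # a batch of random points in [0,1)
pos  = xr*8
ipos = array(pos, dtype=int)           # bin indices for the whole batch at once
gi = g[ipos]                           # g[i]   via fancy indexing (no python loop)
gm = g[ipos-1]                         # g[i-1] ; for i=0 this wraps to the zero slot
gx = gm + (pos-ipos)*(gi-gm)           # vectorized linear interpolation; gx equals xr here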

In [22]:
from numba import jit

@jit(nopython=True)
def SetFxbin(fxbin,bins,wfun):
    (n,ndim) = bins.shape
    for dim in range(ndim):
        # Here we make a better approximation for the function, which we are integrating.
        for i in range(n):
            fxbin[dim, bins[i,dim] ] += abs(wfun[i]) # just bin the function f. We saved the bin position before.                


def Vegas_step3b(integrant, unit, maxeval, nstart, nincrease, grid, cum):
    ndim, nbins = grid.ndim,grid.nbins  # dimension of the integral, size of the grid for binning in each direction
    unit_dim = unit**ndim   # converts from unit cube integration to generalized cube with unit length
    nbatch=1000             # function will be evaluated in batches of 1000 evaluations at one time (for efficiency and storage issues)
    neval=0
    print ("""Vegas parameters:
       ndim = """+str(ndim)+"""
       unit = """+str(unit)+"""
       maxeval = """+str(maxeval)+"""
       nstart = """+str(nstart)+"""
       nincrease = """+str(nincrease)+"""
       nbins = """+str(nbins)+"""
       nbaths = """+str(nbatch)+"\n")

    bins = zeros((nbatch,ndim),dtype=int) # in which sampled bin does this point fall?
    
    
    all_nsamples = nstart
    for iter in range(1000):         # NEW in step 3
        fxbin = zeros((ndim,nbins))    # after each iteration we reset the average function being binned
        for nsamples in range(all_nsamples,0,-nbatch):  # loop over all_nsample evaluations in batches of nbatch
            n = min(nbatch,nsamples)  # How many evaluations in this pass?
            # We are integrating f(g_1(x),g_2(y),g_3(z))*dg_1/dx*dg_2/dy*dg_3/dz dx*dy*dz
            # This is represented as  1/all_nsamples \sum_{x_i,y_i,z_i} f(g_1(x_i),g_2(y_i),g_3(z_i))*dg_1/dx*dg_2/dy*dg_3/dz
            #  where dg_1/dx = diff*NBINS
            xr = random.random((n,ndim)) # generates 2-d array of random numbers in the interval [0,1)
            pos = xr*nbins                   # (x*N)
            bins = array(pos,dtype=int)      # which grid would it fit ? (x*N)
            wgh = ones(n)/all_nsamples       # 1/N weight for each point; n (not nbatch) also covers a possible shorter last batch
            for dim in range(ndim):   
                # We want to evaluate the function f at point g(x), i.e, f(g_1(x),g_2(y),...)
                # Here we transform the points x,y,z -> g_1(x), g_2(y), g_3(z)
                # We hence want to evaluate g(x) ~ g(x[i]), where x is the random number and g is the grid function
                # The discretized g(t) is defined on the grid :
                #       t[-1]=0, t[0]=1/N, t[1]=2/N, t[2]=3/N ... t[N-1]=1.
                # We know that g(0)=0 and g(1)=1, so that g[-1]=0.0 and g[N-1]=1.0
                # To interpolate g at x, we first compute  i=int(x*N) and then we use linear interpolation
                # g(x) = g[i-1] + (g[i]-g[i-1])*(x*N-int(x*N))
                gi = grid.g[dim,bins[:,dim]]            # g[i]
                gm = grid.g[dim,bins[:,dim]-1]          # g[i-1]
                diff = gi - gm                          # g[i]-g[i-1]
                gx = gm + (pos[:,dim]-bins[:,dim])*diff # linear interpolation g(xr)
                xr[:,dim] = gx*unit                     # xr <- g(xr)
                wgh *= diff*nbins                       # wgh = prod_{dim} dg/dx
            
            # Here we evaluate function f on all randomly generated x points above
            fx = integrant(xr)  # n function evaluations required in single call
            neval += n  # We just added this many function evaluations
            
            # Now we compute the integral as weighted average, namely, f(g(x))*dg/dx
            wfun = wgh * fx           # weight * function ~ f_i*w_i            
            cum.sum += sum(wfun)      # sum_i f_i*w_i = <fw>
            wfun *= wfun              # careful : this is like  (f_i * w_i/N)^2 hence  1/N (1/N (f_i*w_i)^2)
            cum.sqsum += sum(wfun)    # sum_i (f_i*w_i)^2 = <fw^2>/all_nsamples
                                      # 
            SetFxbin(fxbin,bins,wfun)
            #for dim in range(ndim):   #new2
            #    # Here we make a better approximation for the function, which we are integrating.
            #    for i in range(n):    #new2
            #        fxbin[dim, bins[i,dim] ] += wfun[i] #new2: just bin the function f. We saved the bin position before.
            
        w0 = sqrt(cum.sqsum*all_nsamples)  # w0 = sqrt(<fw^2>)
        w1 = (w0 + cum.sum)*(w0 - cum.sum) # w1 = (w0^2 - <fw>^2) = (<fw^2>-<fw>^2)
        w = (all_nsamples-1)/w1            # w ~ 1/sigma_i^2 = (N-1)/(<fw^2>-<fw>^2)
        # Note that the error of the MC sampling is sigma_i^2 = (<f^2>-<f>^2)/N, hence w ~ 1/sigma_i^2
        cum.weightsum += w          # weightsum ~ \sum_i 1/sigma_i^2
        cum.avgsum += w*cum.sum     # avgsum    ~ \sum_i <fw>_i / sigma_i^2
        cum.avg2sum += w*cum.sum**2  # avg2sum   ~ \sum_i <fw>_i^2/sigma_i^2
        
        cum.avg = cum.avgsum/cum.weightsum     # I_best = (\sum_i <fw>_i/sigma_i^2 )/(\sum_i 1/sigma_i^2)
        cum.err = sqrt(1/cum.weightsum)        # err ~ sqrt(best sigma^2) = sqrt(1/(\sum_i 1/sigma_i^2))
     
        # NEW in this step3
        if iter>0:
            cum.chisq = (cum.avg2sum - 2*cum.avgsum*cum.avg + cum.weightsum*cum.avg**2)/iter
    
        print ("Iteration {:3d}: I= {:10.8f} +- {:10.8f}  chisq= {:10.8f} number of evaluations = {:7d} ".format(iter+1, cum.avg*unit_dim, cum.err*unit_dim, cum.chisq, neval))
        imp = Smoothen(fxbin)
        grid.RefineGrid(imp)
        
        cum.sum=0                    # clear the partial sum for the next step
        cum.sqsum=0
        all_nsamples += nincrease    # for the next time, increase the number of steps a bit
        if (neval>=maxeval): break
    cum.neval = neval    
    cum.avg *= unit**ndim
    cum.err *= unit**ndim
In [23]:
from scipy import *
from numpy import *

unit=pi
ndim=3
maxeval=20000000
exact = 1.3932  # exact value of the integral
    
cum = Cumulants()
    
nbins=128
nstart =100000
nincrease=5000

grid = Grid(ndim,nbins)

random.seed(0)

Vegas_step3b(my_integrant2, pi, maxeval, nstart, nincrease, grid, cum)

print (cum.avg, '+-', cum.err, 'exact=', exact, 'real error=', abs(cum.avg-exact)/exact)

plot(grid.g[0,:nbins])
plot(grid.g[1,:nbins])
plot(grid.g[2,:nbins])
show()
Vegas parameters:
       ndim = 3
       unit = 3.141592653589793
       maxeval = 20000000
       nstart = 100000
       nincrease = 5000
       nbins = 128
       nbatch = 1000

Iteration   1: I= 1.36762346 +- 0.01023309  chisq= 0.00000000 number of evaluations =  100000 
Iteration   2: I= 1.38493614 +- 0.00552045  chisq= 4.03725299 number of evaluations =  205000 
Iteration   3: I= 1.39107078 +- 0.00365694  chisq= 3.11889891 number of evaluations =  315000 
Iteration   4: I= 1.39220291 +- 0.00285603  chisq= 2.16116961 number of evaluations =  430000 
Iteration   5: I= 1.39391923 +- 0.00246919  chisq= 1.97837591 number of evaluations =  550000 
Iteration   6: I= 1.39327395 +- 0.00216911  chisq= 1.64253221 number of evaluations =  675000 
Iteration   7: I= 1.39237495 +- 0.00190896  chisq= 1.49574064 number of evaluations =  805000 
Iteration   8: I= 1.39084621 +- 0.00171092  chisq= 1.74779014 number of evaluations =  940000 
Iteration   9: I= 1.39124007 +- 0.00159307  chisq= 1.57911830 number of evaluations = 1080000 
Iteration  10: I= 1.39178313 +- 0.00148354  chisq= 1.50089423 number of evaluations = 1225000 
Iteration  11: I= 1.39129403 +- 0.00138447  chisq= 1.43499906 number of evaluations = 1375000 
Iteration  12: I= 1.39136789 +- 0.00129897  chisq= 1.30670608 number of evaluations = 1530000 
Iteration  13: I= 1.39192235 +- 0.00122906  chisq= 1.34277224 number of evaluations = 1690000 
Iteration  14: I= 1.39210209 +- 0.00116712  chisq= 1.25622458 number of evaluations = 1855000 
Iteration  15: I= 1.39289525 +- 0.00112200  chisq= 1.60155180 number of evaluations = 2025000 
Iteration  16: I= 1.39279461 +- 0.00107659  chisq= 1.50154462 number of evaluations = 2200000 
Iteration  17: I= 1.39284042 +- 0.00106297  chisq= 1.41219776 number of evaluations = 2380000 
Iteration  18: I= 1.39295317 +- 0.00103524  chisq= 1.34198096 number of evaluations = 2565000 
Iteration  19: I= 1.39360680 +- 0.00099462  chisq= 1.55533457 number of evaluations = 2755000 
Iteration  20: I= 1.39377022 +- 0.00095542  chisq= 1.49186042 number of evaluations = 2950000 
Iteration  21: I= 1.39356055 +- 0.00091986  chisq= 1.45022690 number of evaluations = 3150000 
Iteration  22: I= 1.39346800 +- 0.00088810  chisq= 1.38827277 number of evaluations = 3355000 
Iteration  23: I= 1.39356357 +- 0.00085791  chisq= 1.33304529 number of evaluations = 3565000 
Iteration  24: I= 1.39348336 +- 0.00082902  chisq= 1.28082649 number of evaluations = 3780000 
Iteration  25: I= 1.39358231 +- 0.00080332  chisq= 1.23718491 number of evaluations = 4000000 
Iteration  26: I= 1.39371675 +- 0.00078013  chisq= 1.20738380 number of evaluations = 4225000 
Iteration  27: I= 1.39378904 +- 0.00075729  chisq= 1.16667047 number of evaluations = 4455000 
Iteration  28: I= 1.39369588 +- 0.00073682  chisq= 1.13397047 number of evaluations = 4690000 
Iteration  29: I= 1.39340503 +- 0.00071662  chisq= 1.19640849 number of evaluations = 4930000 
Iteration  30: I= 1.39302942 +- 0.00069683  chisq= 1.32902196 number of evaluations = 5175000 
Iteration  31: I= 1.39284884 +- 0.00067851  chisq= 1.32786718 number of evaluations = 5425000 
Iteration  32: I= 1.39253524 +- 0.00066360  chisq= 1.44353551 number of evaluations = 5680000 
Iteration  33: I= 1.39256824 +- 0.00064833  chisq= 1.40012422 number of evaluations = 5940000 
Iteration  34: I= 1.39270298 +- 0.00063255  chisq= 1.38490760 number of evaluations = 6205000 
Iteration  35: I= 1.39260687 +- 0.00061716  chisq= 1.35830699 number of evaluations = 6475000 
Iteration  36: I= 1.39266202 +- 0.00060602  chisq= 1.32587503 number of evaluations = 6750000 
Iteration  37: I= 1.39269778 +- 0.00059474  chisq= 1.29166805 number of evaluations = 7030000 
Iteration  38: I= 1.39261544 +- 0.00058300  chisq= 1.27001715 number of evaluations = 7315000 
Iteration  39: I= 1.39265521 +- 0.00057115  chisq= 1.23964002 number of evaluations = 7605000 
Iteration  40: I= 1.39263452 +- 0.00055972  chisq= 1.20870268 number of evaluations = 7900000 
Iteration  41: I= 1.39262307 +- 0.00054850  chisq= 1.17874879 number of evaluations = 8200000 
Iteration  42: I= 1.39254852 +- 0.00053745  chisq= 1.16129750 number of evaluations = 8505000 
Iteration  43: I= 1.39243857 +- 0.00052710  chisq= 1.15977315 number of evaluations = 8815000 
Iteration  44: I= 1.39247021 +- 0.00051732  chisq= 1.13508077 number of evaluations = 9130000 
Iteration  45: I= 1.39271351 +- 0.00050792  chisq= 1.24897036 number of evaluations = 9450000 
Iteration  46: I= 1.39273124 +- 0.00049882  chisq= 1.22197782 number of evaluations = 9775000 
Iteration  47: I= 1.39268508 +- 0.00048980  chisq= 1.20060371 number of evaluations = 10105000 
Iteration  48: I= 1.39270826 +- 0.00048160  chisq= 1.17649486 number of evaluations = 10440000 
Iteration  49: I= 1.39268980 +- 0.00047340  chisq= 1.15289105 number of evaluations = 10780000 
Iteration  50: I= 1.39277335 +- 0.00046527  chisq= 1.14803488 number of evaluations = 11125000 
Iteration  51: I= 1.39276749 +- 0.00045757  chisq= 1.12517076 number of evaluations = 11475000 
Iteration  52: I= 1.39279949 +- 0.00045023  chisq= 1.10612186 number of evaluations = 11830000 
Iteration  53: I= 1.39282711 +- 0.00044337  chisq= 1.08724099 number of evaluations = 12190000 
Iteration  54: I= 1.39269319 +- 0.00043621  chisq= 1.12046711 number of evaluations = 12555000 
Iteration  55: I= 1.39263143 +- 0.00043015  chisq= 1.11316659 number of evaluations = 12925000 
Iteration  56: I= 1.39259639 +- 0.00042410  chisq= 1.09724364 number of evaluations = 13300000 
Iteration  57: I= 1.39266669 +- 0.00041788  chisq= 1.09451755 number of evaluations = 13680000 
Iteration  58: I= 1.39269151 +- 0.00041183  chisq= 1.07746788 number of evaluations = 14065000 
Iteration  59: I= 1.39275558 +- 0.00040646  chisq= 1.07498678 number of evaluations = 14455000 
Iteration  60: I= 1.39273543 +- 0.00040089  chisq= 1.05829837 number of evaluations = 14850000 
Iteration  61: I= 1.39278217 +- 0.00039653  chisq= 1.05112443 number of evaluations = 15250000 
Iteration  62: I= 1.39276559 +- 0.00039173  chisq= 1.03508563 number of evaluations = 15655000 
Iteration  63: I= 1.39279028 +- 0.00038664  chisq= 1.02087328 number of evaluations = 16065000 
Iteration  64: I= 1.39288265 +- 0.00038148  chisq= 1.03883198 number of evaluations = 16480000 
Iteration  65: I= 1.39293693 +- 0.00037982  chisq= 1.05905661 number of evaluations = 16900000 
Iteration  66: I= 1.39293378 +- 0.00037673  chisq= 1.04282903 number of evaluations = 17325000 
Iteration  67: I= 1.39296399 +- 0.00037183  chisq= 1.03079886 number of evaluations = 17755000 
Iteration  68: I= 1.39297885 +- 0.00036871  chisq= 1.01684201 number of evaluations = 18190000 
Iteration  69: I= 1.39301103 +- 0.00036469  chisq= 1.00705290 number of evaluations = 18630000 
Iteration  70: I= 1.39300538 +- 0.00035991  chisq= 0.99259139 number of evaluations = 19075000 
Iteration  71: I= 1.39299462 +- 0.00035519  chisq= 0.97890211 number of evaluations = 19525000 
Iteration  72: I= 1.39296368 +- 0.00035098  chisq= 0.96964893 number of evaluations = 19980000 
Iteration  73: I= 1.39298080 +- 0.00034688  chisq= 0.95760365 number of evaluations = 20440000 
1.3929807984198352 +- 0.00034687790730998316 exact= 1.3932 real error= 0.00015733676440191774
[figure: the final grid functions g(x) for all three dimensions after Vegas_step3b]

Homework¶

  • generalize the Vegas algorithm so that the integration limits are given by arbitrary numbers [a,b]. You can still assume a hypercube in which all dimensions have the same limits [a,b].
  • Test Vegas on the same example $f=1/(1-\cos(k_x)\cos(k_y)\cos(k_z))/\pi^3$ but use limits $[-\pi,\pi]$ instead of $[0,\pi]$.
  • Speed up the part of the code that refines the grid (RefineGrid).
  • generalize Vegas so that it works for complex functions.
  • Using Vegas, evaluate the following Lindhard function (a minimal sketch of the integrand is given after this list):
\begin{eqnarray} P(\Omega,\vec{q}) = -2 \int \frac{d^3k}{(2\pi)^3} \frac{f(\varepsilon_{\vec{k}+\vec{q}})-f(\varepsilon_{\vec{k}})}{\Omega-\varepsilon_{\vec{k}+\vec{q}}+\varepsilon_\vec{k}+i\delta} \end{eqnarray}

Here $\varepsilon_\vec{k}=k^2-k_F^2$.

We will use $k_F=\frac{(\frac{9\pi}{4})^{1/3}}{r_s}$ with $r_s=2$, $f(x) = 1/(\exp(x/T)+1)$, $T=0.02 k_F^2$, $\delta = 0.002 k_F^2$, $\Omega=0$, $q=0.1 k_F$; the integration limits can be set to $[-3 k_F,3 k_F]$.

The result for $P(\Omega=0,q\ll k_F)$ should be close to $-n_F$, where $n_F = \frac{k_F}{2\pi^2}$.

  • Optional: Change the Vegas algorithm so that it computes $P(\Omega,\vec{q})$ at once for an array of $\Omega$ points, such as linspace(0,0.5*kF*kF,200). Use $P(\Omega=0)$ to redefine the grid, so that the integration is most efficient at $\Omega=0$ (where the function is hardest to integrate). More info can be found at:

https://en.wikipedia.org/wiki/Lindhard_theory
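
For orientation, here is a minimal sketch of the Lindhard integrand (our code, not a solution of the assignment; it assumes the complex-capable Vegas from the bullet above, puts $\vec{q}$ along the z-axis, and uses the overflow-safe identity $1/(e^{y}+1)=\frac{1}{2}(1-\tanh(y/2))$ for the Fermi function):

In [ ]:
from numpy import *

rs    = 2.0
kF    = (9*pi/4)**(1/3.)/rs
T     = 0.02*kF**2
delta = 0.002*kF**2
q     = 0.1*kF
Omega = 0.0

def ferm(x):
    # Fermi function 1/(exp(x/T)+1), written so that it never overflows
    return 0.5*(1.0 - tanh(x/(2*T)))

def linhard(x):
    # x is an (n,3) array of k-points generated by Vegas; q points along z
    eps_k  = x[:,0]**2 + x[:,1]**2 + x[:,2]**2 - kF**2
    eps_kq = x[:,0]**2 + x[:,1]**2 + (x[:,2]+q)**2 - kF**2
    return -2*(ferm(eps_kq)-ferm(eps_k))/(Omega-eps_kq+eps_k+1j*delta)/(2*pi)**3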

In [ ]: