Introduction ------------- In bmcmc, we fit a statistical model specified by a set of parameters :math:`\theta` to some data :math:`D`. Using MCMCM we sample the posterior :math:`p(\theta|D)`. A statistical model is specified by defining a subclass derived from the parent class ``bmcmc.Model`` and the user has to write 3 methods for his subclass. We show this with a simple example. Fitting a Gaussian distribution -------------------------------- A Gaussian or a normal distribution is given by :math:`\mathcal{N}(x|\mu,\sigma^2)=\frac{1}{\sqrt{2 \pi}\sigma}\exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]` and we fit this to an array of values representing samples from a one dimensional distribution.:: import bmcmc import scipy.stats class gauss1(bmcmc.Model): def set_descr(self): # setup descriptor # self.descr[varname]=[type, value, sigma, latexname, min_val, max_val] self.descr['mu'] =['l0', 0.0, 1.0, r'$\mu$', -500, 500.0] self.descr['sigma'] =['l0', 1.0, 1.0, r'$\sigma$', 1e-10, 1e3] def set_args(self): # setup data points # Use self.args and self.eargs to define your variables. # During sampling, the code will pass self.args as an argument to lnfunc. np.random.seed(11) self.args['x']=np.random.normal(loc=0.0,scale=1.0,size=self.eargs['dsize']) def lnfunc(self,args): # define the function which needs to be sampled (log posterior) # Do not use self.args in this function. You can use self.eargs. return scipy.stats.norm.logpdf(args['x'],loc=args['mu'],scale=args['sigma']) Create an object. The ``bmcmc.Model`` accepts a keyword argument named ``eargs`` which should be a dictionary. In the class, it is available as ``self.eargs``. Through this keyword one can pass any external information to the class. >>> mymodel=gauss1(eargs={'dsize':50}) Run the sampler for two free parameters, for 50000 iterations, printing summary every 10000 iterations. >>> mymodel.sample(['mu','sigma'],50000,ptime=10000) Print final results, discarding first 10000 iterations >>> mymodel.info(burn=5000) no name mean stddev [ 16%, 50%, 84%] Level-0 0 mu 0.0121747 0.0955931 [ -0.0818376, 0.0127852, 0.106895] 1 sigma 0.948579 0.0680425 [ 0.880681, 0.944834, 1.01692] Plot final results , discarding first 5000 iterations >>> mymodel.plot(burn=5000) .. figure:: images/gauss1.png Print chain values for a few given parameters >>> print mymodel.chain['mu'] >>> print mymodel.chain['sigma'] To print final results as a latex table, discarding the first 5000 iterations. >>> mymodel.info(burn=5000,latex=True) \begin{tabular} { l c} Parameter & 16 and 84 \% limits\\ \hline $\mu$ & $0.01^{+0.09}_{-0.09}$ \\ $\sigma$ & $0.95^{+0.07}_{-0.07}$ \\ \hline \end{tabular} Plot final results for specific parameters >>> mymodel.plot(keys=['mu','sigma'],burn=1000) .. For a derived class named ``gauss1`` the object is instantiated by :: mymodel=gauss1(eargs={'dsize':50}) .. We then run the MCMC sampler. The first argument is the list of parameter names which we wish to constrain, the second argument is the number of iterations we want to run. The keyword ``ptime`` is the frequency (in number of iterations) with which to print summary on the screen. :: #mymodel.sample(varnames,iterations,ptime=1000) mymodel.sample(['mu','sigma'],10000,ptime=1000) .. To print final results, discarding the first 1000 iterations. :: mymodel.info(burn=1000) .. To print final results as a latex table, discarding the first 1000 iterations. :: mymodel.info(burn=1000, latex=True) .. To plot results, discarding the first 1000 iterations. :: mymodel.plot(burn=1000) .. To plot results, for specific parameters. :: mymodel.plot(keys=['mu','sigma'],burn=1000) .. To print the MCMC chain for a few given parameters. :: print mymodel.chain['m'] print mymodel.chain['c'] Bayesian hierarchical model (BHM). ----------------------------------- We again consider the example of fitting a Gaussian distribution to some observed data. This time the observed data :math:`x` has uncertainty :math:`\sigma_x` associated with it. The true values :math:`x_t` is unknown which we estimate alongside the parameters of the Gaussian distribution :math:`\theta=\{ \mu,\sigma\}`. The posterior is given by .. math:: p(\theta,x_t|x)=\sum_i \mathcal{N}(x_{t,i}|\mu,\sigma^2)\mathcal{N}(x|x_{t,i},\sigma_{x,i}^2). The Metropolis-Within-Gibbs scheme is used for sampling the posterior. While :math:`(\mu,\sigma)` are level-0 parameters, :math:`x_t` is at level-1. This makes the model hierarchical. A wide variety of real world problems fall into this category. This method provides a way to marginalize over observational uncertainties via sampling but rather than doing any integration. Missing or hidden variables can also be marginalized in a similar fashion. :: class gauss2(bmcmc.Model): def set_descr(self): # setup descriptor self.descr['mu'] =['l0',0.0,1.0,r'$\mu$' ,-500,500.0] self.descr['sigma'] =['l0',1.0,1.0,r'$\sigma$',1e-10,1e3] self.descr['xt'] =['l1',0.0,1.0,r'$x_t$' ,-500.0,500.0] def set_args(self): # setup data points np.random.seed(11) # generate true coordinates of data points self.args['x']=np.random.normal(loc=self.eargs['mu'],scale=self.eargs['sigma'],size=self.eargs['dsize']) # add observational uncertainty to each data point self.args['sigma_x']=np.zeros(self.args['x'].size,dtype=np.float64)+0.5 self.args['x']=np.random.normal(loc=self.args['x'],scale=self.args['sigma_x'],size=self.eargs['dsize']) def lnfunc(self,args): # log posterior temp1=scipy.stats.norm.logpdf(args['xt'],loc=args['mu'],scale=args['sigma']) temp2=scipy.stats.norm.logpdf(args['x'],loc=args['xt'],scale=args['sigma_x']) return temp1+temp2 Create an object and run the sampler. >>> mymodel=gauss2(eargs={'dsize':100,'mu':1.0,'sigma':2.0}) >>> mymodel.sample(['mu','sigma','xt'],50000,ptime=1000) Note, for level-1 parameters, each data point has its own value for the parameter. Rather than storing the full chain, only mean and standard-deviation for each data point are stored and made available. These can be accessed as follows. To print mean value of ``xt`` for each data point >>> print mymodel.mu['xt'] [ 1.68764301 -0.191843 ..., 0.80656432 0.47805343] To print stddev of ``xt`` for each data point >>> print mymodel.sigma['xt'] [ 0.44846551 0.43683076 ..., 0.44343331 0.44246093] Print final results. For level-1 parameters the printed values are just average taken over all data points. >>> mymodel.info() no name mean stddev [ 16%, 50%, 84%] Level-0 0 mu 0.0141504 0.106783 [ -0.0919234, 0.013786, 0.119224] 1 sigma 0.929101 0.0875563 [ 0.842694, 0.923683, 1.01629] Level-1 0 xt 0.0132397 0.441688 >>> mymodel.plot(burn=5000) .. figure:: images/gauss2.png Data Augmentation ==================== We can also estimate :math:`\theta=\{\mu,\sigma\}` by explicitly integrating out the :math:`x_t`. The posterior is given by .. math:: p(\theta|x)=\sum_i \mathcal{N}(x_i|\mu,\sigma^2+\sigma_{x,i}^2) Autocorrelation analysis shows that explcit integration reduces the correlation length. However, not all functions can be easily integrated. :: class gauss3(bmcmc.Model): def set_descr(self): # setup descriptor self.descr['mu'] =['l0',0.0,1.0,r'$\mu$ ',-500,500.0] self.descr['sigma'] =['l0',1.0,1.0,r'$\sigma$ ',1e-10,1e3] def set_args(self): # setup data points np.random.seed(11) # generate true coordinates of data points self.args['x']=np.random.normal(loc=self.eargs['mu'],scale=self.eargs['sigma'],size=self.eargs['dsize']) # add observational uncertainty to each data point self.args['sigma_x']=np.zeros(self.args['x'].size,dtype=np.float64)+0.5 self.args['x']=np.random.normal(loc=self.args['x'],scale=self.args['sigma_x'],size=self.eargs['dsize']) def lnfunc(self,args): # log posterior, xt has been integrated out sigma=np.sqrt(args['sigma']*args['sigma']+args['sigma_x']*args['sigma_x']) temp=scipy.stats.norm.logpdf(args['x'],loc=args['mu'],scale=sigma) return temp # Data augmentation, making use of Hierarchical modelling # marginalization using sampling. m2=gauss2(eargs={'dsize':1000,'mu':0.0,'sigma':1.0}) m2.sample(['mu','sigma','xt'],100000,ptime=10000) # Marginalization using direct integration m3=gauss3(eargs={'dsize':1000,'mu':0.0,'sigma':1.0}) m3.sample(['mu','sigma'],100000,ptime=10000) plt.subplot(2,2,1) plt.hist(m2.chain['mu'],range=[-0.2,0.2],bins=100,normed=True,histtype='step',lw=2.0) plt.hist(m3.chain['mu'],range=[-0.2,0.2],bins=100,normed=True,histtype='step',lw=2.0) plt.ylabel('p') plt.xlabel(r'$\mu$') plt.xlim([-0.2,0.2]) plt.xticks([-0.2,-0.1,0.0,0.1,0.2]) plt.subplot(2,2,2) plt.hist(m2.chain['sigma'],range=[0.85,1.15],bins=100,normed=True,histtype='step',lw=2.0) plt.hist(m3.chain['sigma'],range=[0.85,1.15],bins=100,normed=True,histtype='step',lw=2.0) plt.ylabel('p') plt.xlabel(r'$\sigma$') plt.xlim([0.85,1.15]) plt.xticks([0.9,1.0,1.1]) plt.subplot(2,2,3) nsize=50 plt.plot(np.arange(nsize),bmcmc.autocorr(m2.chain['mu'])[0:nsize],label='DA',lw=2.0) plt.plot(np.arange(nsize),bmcmc.autocorr(m3.chain['mu'])[0:nsize],label='Integration',lw=2.0) plt.ylabel(r'$\rho_{\mu \mu}(t)$') plt.xlabel(r'lag $t$') plt.legend() plt.axis([0,50,0,1]) plt.subplot(2,2,4) plt.plot(np.arange(nsize),bmcmc.autocorr(m2.chain['sigma'])[0:nsize],lw=2.0) plt.plot(np.arange(nsize),bmcmc.autocorr(m3.chain['sigma'])[0:nsize],lw=2.0) plt.ylabel(r'$\rho_{\sigma \sigma}(t)$') plt.xlabel(r'lag $t$') plt.axis([0,50,0,1]) plt.tight_layout() .. figure:: images/da_demo1.png Fitting a straight line with outliers -------------------------------------- We discuss the case of fitting a straight line to some data points :math:`D=\{(x_1,y_1),(x_2,y_2)...(x_N,y_N)\}`. The straight line model to decribe the points is .. math:: p(y_i| m, c, x_i, \sigma_{y,i}) = \frac{1}{\sqrt{2 \pi} \sigma_{y,i}}\exp\left(-\frac{(y_i - mx_i )^2}{2 \sigma_{y,i}^2}\right) The background model to take the outliers into account is .. math:: p(y_i|\mu_b,\sigma_b,x_i,\sigma_{y,i})=\frac{1}{\sqrt{2\pi(\sigma_{y,i}^2+\sigma_b^2)}}\exp\left(-\frac{(y_i-\mu_b)^2}{2 (\sigma_{y,i}^2+\sigma_b^2)}\right) The full model to describe the data is .. math:: p(Y|m,c,\mu_b,P_b,\sigma_b,X,\sigma_y)=\prod_{i=1}^N [p(y_i|m,c,x_i,\sigma_{y,i})P_b+p(y_i|\mu_b,\sigma_b,x_i,\sigma_{y,i})(1-P_b)] The methods in the derived class for implementing the straight line model are as follows. :: class stlineb(bmcmc.Model): def set_descr(self): # setup descriptor self.descr['m'] =['l0', 1.0, 0.2,'$m$', -1e10,1e10] self.descr['c'] =['l0',10.0, 1.0,'$c$', -1e10,1e10] self.descr['mu_b'] =['l0', 1.0, 1.0,'$\mu_b$', -1e10,1e10] self.descr['sigma_b']=['l0', 1.0, 1.0,'$\sigma_b$', 1e-10,1e10] self.descr['p_b'] =['l0',0.1,0.01,'$P_b$', 1e-10,0.999] def set_args(self): # setup data points np.random.seed(11) self.args['x']=0.5+np.random.ranf(self.eargs['dsize'])*9.5 self.args['sigma_y']=0.25+np.random.ranf(self.eargs['dsize']) self.args['y']=np.random.normal(self.args['x']*2+10,self.args['sigma_y']) # add outliers self.ind=np.array([0,2,4,6,8,10,12,14,16,18]) self.args['y'][self.ind]=np.random.normal(30,5,self.ind.size) self.args['y'][self.ind]=self.args['y'][self.ind]+np.random.normal(0.0,self.args['sigma_y'][self.ind]) def lnfunc(self,args): # log posterior if self.eargs['outliers'] == False: temp1=(args['y']-(self.args['m']*self.args['x']+self.args['c']))/args['sigma_y'] return -0.5*(temp1*temp1)-np.log(np.sqrt(2*np.pi)*args['sigma_y']) else: temp11=scipy.stats.norm.pdf(args['y'],loc=(self.args['m']*self.args['x']+self.args['c']),scale=args['sigma_y']) sigma_b=np.sqrt(np.square(args['sigma_y'])+np.square(args['sigma_b'])) temp22=scipy.stats.norm.pdf(args['y'],loc=self.args['mu_b'],scale=sigma_b) return np.log((1-args['p_b'])*temp11+args['p_b']*temp22) def myplot(self,chain): # optional method for plotting # plot best fit line corrsponding to chain of this model plt.clf() x = np.linspace(0,10) burn=self.chain['m'].size/2 vals=self.best_fit(burn=burn) plt.errorbar(self.args['x'], self.args['y'], yerr=self.args['sigma_y'], fmt=".k") plt.errorbar(self.args['x'][self.ind], self.args['y'][self.ind], yerr=self.args['sigma_y'][self.ind], fmt=".r") plt.plot(x,vals[0]*x+vals[1], color="g", lw=2, alpha=0.5) for i,key in enumerate(self.names0): print key plt.text(0.5,0.3-i*0.06,self.descr[key][3]+'='+bmcmc.stat_text(self.chain[key][burn:]),transform=plt.gca().transAxes) # plot best fit line corrsponding to some other chain vals1=[] burn1=chain['m'].size/2 for i,key in enumerate(['m','c']): print key plt.text(0.05,0.5-i*0.05,self.descr[key][3]+'='+bmcmc.stat_text(chain[key][burn1:]),transform=plt.gca().transAxes) vals1.append(np.mean(chain[key][burn1:])) plt.plot(x,vals1[0]*x+vals1[1], 'g--', lw=2, alpha=0.5) plt.xlabel(r'$x$') plt.ylabel(r'$y$') plt.axis([0,10,5,40,]) >>> model1=stlineb(eargs={'dsize':50}) Run without any model for outliers >>> model1.eargs['outliers']=False >>> model1.sample(['m','c'],10000) >>> chain1=model1.chain Run with a model for outliers >>> model1.eargs['outliers']=True >>> model1.sample(['m','c','p_b','mu_b','sigma_b'],20000) Plot the results. Expected values are (m, c, p_b , mu_b, sigma_b)=(2.0, 10.0, 0.2, 30, 5.0) >>> model1.myplot(chain1) .. figure:: images/stline.png Analysis of a binary system using radial velocity measurement --------------------------------------------------------------- The radial velocity of a star of mass :math:`M` in a binary system with companion of mass :math:`m` in an orbit with time period :math:`T` and eccentricity :math:`e` is given by .. math:: v(t)=\kappa \left[\cos(f+\omega)+e\cos \omega\right]+v_{0},\:\:{\rm with\ } \kappa=\frac{(2\pi G)^{1/3}m \sin I}{T^{1/3}(M+m)^{2/3}\sqrt{1-e^2}}. The true anomaly :math:`f` is a function of time, but depends upon :math:`e`, :math:`T`, and :math:`\tau` via, .. math:: \tan(f/2)=\sqrt{\frac{1+e}{1-e}}\tan(u/2), \quad u-e\sin u=\frac{2\pi}{T}(t-\tau). The actual radial velocity data will differ from the perfect relationship given above due to the observational uncertainty (variance :math:`\sigma_v^2`) and intrinsic variability of a star (variance :math:`S^2`) and we can model this by a Gaussian function :math:`\mathcal{N}(.|v,\sigma_v^2+S^2)`. For radial velocity data :math:`D` defined as a set of radial velocities :math:`\{v_1,...,v_M\}` at various times :math:`\{t_1,...,t_M\}`, one can fit and constrain seven parameters, :math:`\theta=(v_{0}, \kappa, T, e, \tau, \omega, S)`, using the Bayes theorem as shown below .. math:: p(\theta|D)=p(D|\theta) p(\theta)= p(\theta) \prod_{i=1}^{M} \mathcal{N}(v_i|v(t_i;\theta),\sigma_v^2+S^2). We implement this as shown below. :: # functions for computing the radial velocity curve def true_anomaly(t,tp,e,tau): temp1=np.min((t-tau)/tp)-1 temp2=np.max((t-tau)/tp)+1 u1=np.linspace(2*np.pi*temp1,2*np.pi*temp2,1000) ma=u1-e*np.sin(u1) myfunc=scipy.interpolate.interp1d(ma,u1) u=myfunc((2*np.pi)*(t-tau)/tp) return 2*np.arctan(np.sqrt((1+e)/(1-e))*np.tan(0.5*u)) def kappa(t,e,m,mc,i): au_yr=4.74057 # km/s G=4*np.pi*np.pi*np.power(au_yr,3.0) # [km/s]^{3}*[yr]^{-1}*[M_sol] nr=np.power(2*np.pi*G,1/3.0)*mc*np.sin(np.radians(i)) dr=np.power(t/365.0,1/3.0)*np.power(m+mc,1/3.0)*np.sqrt(1-e*e) return nr/dr def vr(t,kappa,tp,e,tau,omega,v0): omega=np.radians(omega) f=true_anomaly(t,tp,e,tau) return kappa*(np.cos(f+omega)+e*np.cos(omega))+v0 # model for describing a binary system class binary_model(bmcmc.Model): def set_descr(self): # setup descriptor self.descr['kappa'] =['p',0.1,0.1,r'$\kappa$ ',1e-10,10.0] self.descr['tp'] =['p',365.0,10.0,r'$T$ ',1e-10,1e6] self.descr['e'] =['p',0.5,0.1,r'$e$ ',0,1.0] self.descr['tau'] =['p',50.0,10.0,r'$\tau$ ',-360.0,360.0] self.descr['omega']=['p',180.0,10.0,r'$\omega$ ',-360,360.0] self.descr['v0']=['p',0.0,1.0,r'$v_0$ ',-1e3,1e3] self.descr['s']=['p',0.02,1.0,r'$\sigma$ ',1e-10,1e3] def set_args(self): # setup data points np.random.seed(17) kappa=0.15 tp=350.0 e=0.3 tau=87.5 omega=-90.0 v0=0.0 print tau,omega self.args['t']=np.linspace(0,self.args['tp']*1.5,self.eargs['dsize']) vr1=vr(self.args['t'],kappa,tp,e,tau,omega,v0) self.args['vr']=vr1+np.random.normal(0.0,self.args['s'],size=self.eargs['dsize']) def lnfunc(self,args): vr1=vr(args['t'],args['kappa'],args['tp'],args['e'],args['tau'],args['omega'],args['v0']) temp=-np.square(args['vr']-vr1)/(2*args['s']*args['s'])-np.log(np.sqrt(2*np.pi)*args['s']) return temp Create an object of the model. >>> m1=binary_model(eargs={'dsize':50}) Run the sampler. >>> m1.sample(['kappa','tp','e','tau','omega','v0'],50000,ptime=10000) Plot the results. >>> plt.figure() >>> m1.plot(keys=['kappa','tp','e'],burn=10000) .. figure:: images/rv_mcmc_params.png Plot the best fit model. :: res=m1.best_fit() args=m1.args t=np.linspace(0,np.max(m1.args['t']),1000) plt.figure() plt.errorbar(args['t'],args['vr'],yerr=args['s'],fmt='o',label='data') plt.plot(t,vr(t,res[0],res[1],res[2],res[3],res[4],res[5]),label='best fit',lw=2.0) plt.plot(t,vr(t,res[0],res[1],0.0,res[3],res[4],res[5]),'r--',label='e=0.0',lw=1.0) plt.title(r'$\kappa=0.15$ km/s, $T=350$ days, $e=0.3$, $\omega=-90.0^{\circ}$, $\tau=87.5$ days') plt.xlabel('Time (days)') plt.ylabel('radial velocity (km/s)') plt.legend(loc='lower left',frameon=False) plt.axis([0,600,-0.35,0.35]) .. figure:: images/rv_mcmc.png Group means ------------ Given data :math:`Y=\{y_{ij}|0>> m1=gmean(eargs={'dsize':40}) >>> m1.sample(['mu','omega','alpha'],10000) >>> m1.myplot() .. figure:: images/gmean.png Extra details --------------- To better understand as to how the code works, it is instructive to look at the ``__init__`` method of ``bmcmc.Model``. It creates the dictionaries ``self.descr`` and ``self.args`` so the user does not have to create them. It then calls ``self.set_descr()``. After this it initializes all level-0 parameters in ``self.args`` to values from ``self.descr``. So parameters of the model that are to be kept free do not need to be initialized. The user can reinitialize them ``set_args`` which is called next. If ``self.eargs['dsize']`` , which is the number of data points, has not been set earlier, it must be specified in ``set_args``. Any level-1 parameter that is not initialized in ``set_args``, is done using values from ``self.descr``. :: def __init__(self,eargs=None): self.eargs=eargs self.args={} self.descr={} self.set_descr() for name in self.descr.keys(): if (self.descr[name][0]=='l0'): self.args[name]=np.float64(self.descr[name][1]) self.set_args() for name in self.descr.keys(): if (self.descr[name][0]=='l1'): if name not in self.args.keys(): self.args[name]=np.zeros(self.eargs['dsize'],dtype=np.float64)+self.descr[name][1] .. * ``set_descr``: This sets the dictionary ``self.descr`` that describes the parameters :math:`\theta` of the model.:: def set_descr(self): #self.descr[name] =[level, value, sigma, latex_name, value_min, value_max] self.descr['m'] =['l0', 1.0, 0.2, '$m$', -1e10, 1e10] self.descr['c'] =['l0', 10.0, 1.0, '$c$' , -1e10, 1e10] self.descr['mu_b'] =['l0', 1.0, 1.0, '$\mu_b$', -1e10, 1e10] self.descr['sigma_b']=['l0', 1.0, 1.0, '$\sigma_b$', 1e-10, 1e10] self.descr['p_b'] =['l0', 0.15, 0.01, '$P_b$', 1e-10, 0.999] self.descr['x'] =['l1', 0.0, 1.0, '$x$', -500.0, 500.0] self.descr['sigma_y']=['l1', 1.0, 1.0, '$\sigma_x$', 1e-10, 1e3] self.descr['y'] =['l1', 0.0, 1.0, '$y$' , -500.0, 500.0] For a hierarchical model parameters can exist at various levels. We have two options to choose from ``l0`` (top level for example hyper-parameters) or ``l1`` (the level below it). * ``set_args``: This sets the dictionary ``self.args`` that specifies the data :math:`D` to be used by the model. Any parameters :math:`\theta` or variables that will be used to compute the posterior, should be initialized here. The dictionary itself is already defined in ``__init__`` method, which also initializes all level-0 parameters to values from ``self.descr``. The user can reset them if needed. If not set earlier the number of data points must be specified by assigning a value to ``self.eargs['dsize']``. If any level-1 parameter is not initialized , the base class will attempt to initialize it from ``self.descr``. :: def set_args(self): # The dict self.args is already defined and all level-0 # parameters have been initialized to values from self.descr np.random.seed(11) self.args['x']=0.5+np.random.ranf(self.eargs['dsize'])*9.5 self.args['sigma_y']=0.25+np.random.ranf(self.eargs['dsize']) # Generate data satisfying y=mx+c+e self.args['y']=np.random.normal(self.args['x']*2+10,self.args['sigma_y']) # add outliers self.ind=np.array([0,2,4,6,8,10,12,14,16,18]) self.args['y'][self.ind]=np.random.normal(30,5,self.ind.size) # add error to outliers self.args['y'][self.ind]=self.args['y'][self.ind]+np.random.normal(0.0,self.args['sigma_y'][self.ind]) * ``lnfunc(self,args)``: This is for specifying the posterior probability :math:`p(\theta|D)` that we want to sample using MCMC. This should be an array of size ``self.eargs['dsize']``, i.e., posterior for each data point. :: def lnfunc(self,args): # likelihood of the straight line model temp1=(args['y']-(self.args['m']*self.args['x']+self.args['c']))/args['sigma_y'] temp1=temp1.clip(max=30.0) temp11=(1-args['p_b'])*np.exp(-0.5*temp1*temp1)/(np.sqrt(2*np.pi)*args['sigma_y']) # likelihood of the outlier model sigma_b=np.sqrt(np.square(args['sigma_y'])+np.square(args['sigma_b'])) temp2=(args['y']-self.args['mu_b'])/sigma_b temp2=temp2.clip(max=30.0) temp22=args['p_b']*np.exp(-0.5*temp2*temp2)/(np.sqrt(2*np.pi)*sigma_b) # full log likelihood return np.log(temp11+temp22) A full class for the model is as follows:: class stlineb(bmcmc.Model): def set_descr(self): self.descr['m'] =['l0', 1.0, 0.2,'$m$', -1e10,1e10] self.descr['c'] =['l0',10.0, 1.0,'$c$', -1e10,1e10] self.descr['mu_b'] =['l0', 1.0, 1.0,'$\mu_b$', -1e10,1e10] self.descr['sigma_b']=['l0', 1.0, 1.0,'$\sigma_b$',1e-10,1e10] self.descr['p_b'] =['l0',0.15,0.01,'$P_b$', 1e-10,0.999] self.descr['x'] =['l1', 0.0, 1.0,'$x$', -500.0,500.0] self.descr['sigma_y']=['l1', 1.0, 1.0,'$\sigma_x$',1e-10,1e3] self.descr['y'] =['l1', 0.0, 1.0,'$y$', -500.0,500.0] def set_args(self): np.random.seed(11) self.args['x']=0.5+np.random.ranf(self.eargs['dsize'])*9.5 self.args['sigma_y']=0.25+np.random.ranf(self.eargs['dsize']) self.args['y']=np.random.normal(self.args['x']*2+10,self.args['sigma_y']) self.ind=np.array([0,2,4,6,8,10,12,14,16,18]) self.args['y'][self.ind]=np.random.normal(30,5,self.ind.size) self.args['y'][self.ind]=self.args['y'][self.ind]+np.random.normal(0.0,self.args['sigma_y'][self.ind]) def lnfunc(self,args): temp1=(args['y']-(self.args['m']*self.args['x']+self.args['c']))/args['sigma_y'] sigma_b=np.sqrt(np.square(args['sigma_y'])+np.square(args['sigma_b'])) temp2=(args['y']-self.args['mu_b'])/sigma_b temp1=temp1.clip(max=30.0) temp2=temp2.clip(max=30.0) temp11=(1-args['p_b'])*np.exp(-0.5*temp1*temp1)/(np.sqrt(2*np.pi)*args['sigma_y']) temp22=args['p_b']*np.exp(-0.5*temp2*temp2)/(np.sqrt(2*np.pi)*sigma_b) return np.log(temp11+temp22) def myplot(self): plt.clf() x = np.linspace(0,10) plt.errorbar(self.args['x'], self.args['y'], yerr=self.args['sigma_y'], fmt=".k") plt.errorbar(self.args['x'][self.ind], self.args['y'][self.ind], yerr=self.args['sigma_y'][self.ind], fmt=".r") plt.plot(x,vals[0]*x+vals[1], color="g", lw=2, alpha=0.5) plt.xlabel(r'$x$') plt.ylabel(r'$y$') Hierarchical model using MWG ---------------------------- Exoplanets and binary system ----------------------------