From: Kevin J. McCann on 27 Apr 2010 04:06

I am using a Markov Chain Monte Carlo (MCMC) approach to evaluate a multidimensional probability density function. The output is a large number of multidimensional points {x1,x2,...,xn}. I can use BinCounts to gather the points into a PDF (after appropriate normalization). I would like to then define a function, p[X_], which is the multidimensional interpolation of the BinCounts output, but I can't figure out how to automate this for an arbitrary number of dimensions.

Any ideas?

For the 2d case I did the following:

    tbl = Partition[
       Flatten[Table[{xmin + i*\[CapitalDelta]x + \[CapitalDelta]x/2,
          ymin + j*\[CapitalDelta]y + \[CapitalDelta]y/2,
          counts[[i + 1, j + 1]]/(\[ScriptCapitalN] \[CapitalDelta]x \[CapitalDelta]y)},
         {i, 0, nx - 1}, {j, 0, ny - 1}]], 3];

    f = Interpolation[tbl]

But as you can see, this is not easily extended to higher dimensions.

Kevin
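
One way the 2d construction above might be generalized is sketched below. This is only an illustration, not code from the thread: the name binnedPDF and its arguments pts (the list of sample points) and specs (one {min, max, width} triple per dimension) are hypothetical, and the normalization assumes that \[ScriptCapitalN] is the total number of samples.

```mathematica
(* pts:   list of n-dimensional sample points (assumed input)
   specs: one {min, max, width} bin specification per dimension (assumed input) *)
binnedPDF[pts_, specs_] := Module[{counts, centers, grid, vals},
  counts = BinCounts[pts, Sequence @@ specs];
  (* bin centers along each axis, sized to match the counts array *)
  centers = MapThread[#1[[1]] + #1[[3]] (Range[#2] - 1/2) &,
    {specs, Dimensions[counts]}];
  (* Tuples varies the last coordinate fastest, the same order as Flatten[counts] *)
  grid = Tuples[centers];
  (* normalize by sample count times bin volume (assumes N = Length[pts]) *)
  vals = Flatten[counts]/(Length[pts] Times @@ specs[[All, 3]]);
  Interpolation[Transpose[{grid, vals}]]
]

(* usage with a hypothetical uniform sample in the unit cube *)
pts = RandomReal[1, {20000, 3}];
p = binnedPDF[pts, {{0, 1, 0.2}, {0, 1, 0.2}, {0, 1, 0.2}}];
p[0.5, 0.5, 0.5]
```

The result is called as p[x1, ..., xn]; a vector form can be had with pX[X_List] := p @@ X. With the default interpolation order, Interpolation needs at least four bins along every dimension, and points that fall outside the bin ranges are not counted, so the normalization is only approximate in that case.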
From: dh on 27 Apr 2010 08:48

Hi Kevin,

if I understand correctly, your problem is the generation of a suitable grid of data points for "Interpolation". Assume you have a function bins[{i1,i2,..,in}] of n integer arguments. The i-th argument runs from 0 to ni - 1. The vector of the ni is called bounds={n1,n2,..,nn}. We can now define the function "dataGrid" that creates a rectangular multidimensional structure for the input to Interpolation:

    dataGrid[bins_, bounds_] := Module[{iter},
      (* one iterator {xi, 0, ni - 1} per dimension *)
      iter = {x, 0, n - 1} /.
        Table[{x -> Symbol["x" <> ToString[i]], n -> bounds[[i]]},
          {i, 1, Length[bounds]}];
      (* list of {{i1, ..., in}, value} pairs, flattened to a single level *)
      Flatten[
        Table[{iter[[All, 1]], bins[iter[[All, 1]]]},
          Evaluate[Sequence @@ iter]],
        Length[bounds] - 1]
    ]

If we choose an example for bins:

    bins[v : {_ ..}] := Times @@ v;

we can calculate an interpolation:

    Interpolation@dataGrid[bins, {4, 4, 4}]

cheers, Daniel

--
Daniel Huber
Metrohm Ltd.
Oberdorfstr. 68
CH-9100 Herisau
Tel. +41 71 353 8585, Fax +41 71 353 8907
E-Mail:<mailto:dh(a)metrohm.com>
Internet:<http://www.metrohm.com>
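
To connect dataGrid with an actual BinCounts array, bins could be defined as a lookup into that array. The sketch below assumes dataGrid from the post above has already been evaluated; the sample pts, the names mins and widths, and the wrapper p are hypothetical and only illustrate the idea. Because dataGrid works with 0-based bin indices, the wrapper converts physical coordinates to (fractional) indices before calling the interpolation.

```mathematica
ClearAll[bins];

(* hypothetical sample: 5000 points in the unit cube *)
pts = RandomReal[1, {5000, 3}];
mins = {0, 0, 0};
widths = {0.25, 0.25, 0.25};

(* one {min, max, width} specification per dimension *)
counts = BinCounts[pts, {0, 1, 0.25}, {0, 1, 0.25}, {0, 1, 0.25}];

(* map 0-based bin indices to a normalized density value *)
bins[v : {__Integer}] :=
  Extract[counts, v + 1]/(Length[pts] Times @@ widths);

(* interpolation over bin indices 0 .. ni - 1 *)
f = Interpolation@dataGrid[bins, Dimensions[counts]];

(* bin i has its center at min + (i + 1/2) width, so invert that relation *)
p[X_List] := f @@ ((X - mins)/widths - 1/2)

p[{0.5, 0.5, 0.5}]
```

Points closer than half a bin width to the boundary map to indices outside the grid, so Interpolation will warn about extrapolation there.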
From: Kurt TeKolste on 29 Apr 2010 02:52

If I understand this algorithm, it will feed the counts for all of the bins into Interpolation. If this is correct, read on.

One of the problems in dealing with multidimensional data is that it takes quite large samples to fill in the huge multidimensional volume. In other words, it is hard to get bins fine enough in all dimensions without having almost all of your bin counts be zero.

I suspect that the interpolation will not be very satisfying unless your sample size is huge or you only need relatively coarse bins. Note that dividing each of four dimensions into 20 bins already gives 160,000 bins, with an average probability that a randomly chosen sample will be in any particular bin of 1/160000 = 6.25x10^-6. It takes a long time for the Monte Carlo to look like a real distribution ...

I am not an expert in this area, but I would be tempted to use only the bins with non-zero values. I recall reading about some techniques for dealing with this -- something about trying to sample where the density is highest -- but do not recall the reference. Also, if you start with an a priori distribution rather than trying to construct the distribution based solely on data you have more tools available.

ekt
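
The bin-count arithmetic above can be checked directly. The sketch below also estimates how many bins stay empty, under the simplifying assumption (not made in the post) that samples are spread uniformly over the bins; the name emptyFraction is hypothetical. For a peaked density the empty fraction would be larger still.

```mathematica
nBins = 20^4           (* 160000 bins for 4 dimensions with 20 bins each *)
N[1/nBins]             (* 6.25*10^-6, the mean probability per bin *)

(* expected fraction of empty bins after n uniformly distributed samples *)
emptyFraction[n_] := (1 - 1/nBins)^n

N[emptyFraction[10^5]] (* about 0.535 *)
```

So even with 100,000 samples, roughly half of the bins would be expected to contain no points at all.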
From: DrMajorBob on 30 Apr 2010 05:49

If Kevin wants to approximate a PDF, perhaps he should start with the sample CDF, interpolate and smooth it, then differentiate.

Bobby

--
DrMajorBob(a)yahoo.com
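
A rough one-dimensional sketch of the CDF idea follows. It is an assumption-laden illustration rather than Bobby's method: the test data are hypothetical, the "smoothing" is just a coarse quantile grid with far fewer knots than samples, and the multidimensional case (which would need a mixed partial derivative of the joint CDF) is not covered.

```mathematica
(* hypothetical 1D sample; RandomVariate requires Mathematica 8 or later *)
data = RandomVariate[NormalDistribution[0, 1], 10^4];

(* coarse quantile grid: few knots relative to the sample size acts as crude smoothing *)
qs = Range[0.02, 0.98, 0.02];
xs = Quantile[data, qs];

(* interpolate the sample CDF through the points {x, q} *)
cdf = Interpolation[Transpose[{xs, qs}]];

(* differentiate the InterpolatingFunction to estimate the density *)
pdf = cdf';

pdf[0.]   (* should be near PDF[NormalDistribution[0, 1], 0], about 0.399 *)
```

The estimate is only defined between the smallest and largest quantile used, and since the default interpolation is not guaranteed to be monotone, the derivative can dip slightly below zero in places.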