From: blamm64 on 20 Nov 2009 06:38

I have a couple of functions designed to poke a single hole, and to poke multiple holes, in a one-level list:

We define a function which, given the imported pressure data, finds the subset of that pressure data excluding the pressure data points between "targetL" and "targetU".

In[5]:= findsubset[data_?VectorQ,targetL_?NumericQ,targetU_?NumericQ] := Select[data,(#<=targetL || #>=targetU &)]

This function will pluck out multiple holes in the data list.

In[6]:= subsets[data_?VectorQ,tarList_?ListQ]:=Module[{tmp,tmp1},
  tmp=data;
  Do[tmp1=findsubset[tmp,tarList[[i,1]],tarList[[i,2]]];tmp=tmp1,
    {i,Dimensions[tarList][[1]]}];
  tmp
]

The following works fine (big holes chosen not to give large result):

In[7]:= datalist=Range[11,3411,10];

In[12]:= targetlist={{40, 1500},{1600,3300}};

In[13]:= resultdata=subsets[datalist,targetlist]

Out[13]= {11,21,31,1501,1511,1521,1531,1541,1551,1561,1571,1581,1591,3301,3311,3321,3331,3341,3351,3361,3371,3381,3391,3401,3411}

But if "datalist" happens to be very large, surely there is a (much) more efficient method?

I tried unsuccessfully to use pure functions with Select, but have a somewhat nebulous feeling there's a pure function way of doing this effectively much more efficiently.

I know, I know: the above have no consistency checking. I also know "subsets" could be used in place of "findsubset" just by replacing the call of "findsubset" with the code of "findsubset" in "subsets".

From what I've seen on this forum there are some really experienced people who might provide an efficient way of implementing the above.

-Brian L.

From: Daniel Lichtblau on 21 Nov 2009 03:33

If you are working with integers then the method below should be fine. Otherwise you may need to "fuzzify" a bit differently.
I use IntervalMemberQ to determine which elements in the data list to omit, and then do the selection using Select (I tried Pick, and it was perhaps half a hair slower).

subsets2[data_?VectorQ,tarList_?ListQ] := Module[
  {intv=Apply[Interval,Map[#+{.5,-.5}&,tarList]]},
  Select[data, !IntervalMemberQ[intv,#]&]]

Here is a quick but slightly large test.

datalist = RandomInteger[11000,100000];
targetlist = Table[{n,n+20}, {n,100,10000,100}];

In[47]:= Timing[resultdata = subsets[datalist,targetlist];]
Out[47]= {14.4878, Null}

In[48]:= Timing[resultdata2 = subsets2[datalist,targetlist];]
Out[48]= {0.179973, Null}

In[49]:= resultdata === resultdata2
Out[49]= True

In[50]:= Length[resultdata2]
Out[50]= 82596

Daniel Lichtblau
Wolfram Research

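For comparison, the Pick variant mentioned above might look roughly like the following. This is only a sketch, not the code that was actually timed: the name subsetsPick and the intermediate variable omit are made up for illustration, and it keeps the same half-unit shrinking of each hole, so it carries the same integer-data assumption as subsets2.

```mathematica
(* subsetsPick is a hypothetical name; same integer-data assumption as subsets2 *)
subsetsPick[data_?VectorQ, tarList_?ListQ] := Module[
  {intv = Apply[Interval, Map[# + {.5, -.5} &, tarList]], omit},
  (* omit[[k]] is True when data[[k]] falls strictly inside one of the holes *)
  omit = Map[IntervalMemberQ[intv, #] &, data];
  (* keep the elements whose flag is False *)
  Pick[data, omit, False]]
```

Checking subsetsPick[datalist, targetlist] === subsets2[datalist, targetlist] on the test data is a quick way to confirm the two agree before comparing timings.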
From: Raffy on 21 Nov 2009 03:36

I didn't do any speed testing yet, but this functionality is available through Interval.

With[{interval = Interval[{40, 1500}, {1600, 3300}]},
  Select[Range[11, 123123, 10], ! IntervalMemberQ[interval, #] &]
]

From: Bill Rowe on 21 Nov 2009 03:37

On 11/20/09 at 6:38 AM, blamm64(a)charter.net (blamm64) wrote:

>I have a couple of functions designed to poke a single hole, and to
>poke multiple holes, in a one-level list:
>We define a function which, given the imported pressure data, finds
>the subset of that pressure data excluding the pressure data points
>between "targetL" and "targetU".
>In[5]:= findsubset[data_?VectorQ,targetL_?NumericQ,targetU_?NumericQ] :=
>Select[data,(#<=targetL || #>=targetU &)]

On my machine the following has the same result but executes faster:

fs[data_?VectorQ, targetL_?NumericQ, targetU_?NumericQ] :=
 Join[
  SparseArray[Clip[data, {First[data], targetL}, {0, 0}]] /. SparseArray[_, _, _, {_, _, a_}] -> a,
  SparseArray[Clip[data, {targetU, Last[data]}, {0, 0}]] /. SparseArray[_, _, _, {_, _, a_}] -> a]

>This function will pluck out multiple holes in the data list.
>In[6]:= subsets[data_?VectorQ,tarList_?ListQ]:=Module[{tmp,tmp1},
>tmp=data;
>Do[tmp1=findsubset[tmp,tarList[[i,1]],tarList[[i,2]]];tmp=tmp1,
>{i,Dimensions[tarList][[1]]}]; tmp
>]

I think the following does the same thing and is simpler in my view:

subs[data_?VectorQ, tarList_?ListQ] :=
 Fold[fs[#1, First[#2], Last[#2]] &, data, tarList]

>The following works fine (big holes chosen not to give large
>result):
>In[7]:= datalist=Range[11,3411,10];
>In[12]:= targetlist={{40, 1500},{1600,3300}};

First, to demonstrate that my solution produces the same result:

In[7]:= subs[datalist, targetlist] == subsets[datalist, targetlist]
Out[7]= True

and then timing data on my system:

In[8]:= Timing[subsets[datalist, targetlist];]
Out[8]= {0.000894,Null}

In[9]:= Timing[subs[datalist, targetlist];]
Out[9]= {0.000175,Null}

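A note on the Fold formulation: it does not depend on fs. Because fs uses First[data] and Last[data] as clipping bounds and discards zeros, it appears to rely on the data being sorted and zero-free, which holds for the Range example but not for arbitrary input. A minimal sketch of the same Fold idea built on the original findsubset is below; the name subsFold is made up for illustration.

```mathematica
(* findsubset as defined in the original post *)
findsubset[data_?VectorQ, targetL_?NumericQ, targetU_?NumericQ] :=
  Select[data, (# <= targetL || # >= targetU &)]

(* subsFold is a hypothetical name: apply one hole after another via Fold *)
subsFold[data_?VectorQ, tarList_?ListQ] :=
  Fold[findsubset[#1, First[#2], Last[#2]] &, data, tarList]

(* unsorted input keeps its original order *)
subsFold[{3301, 11, 1501, 50, 2000}, {{40, 1500}, {1600, 3300}}]
(* {3301, 11, 1501} *)
```

This still makes one pass over the data per hole, so it only tidies the loop; it is not expected to change the scaling with the number of holes.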
From: Ray Koopman on 21 Nov 2009 03:37
In[1]:= datalist = Range[11,3411,10];
        targetlist = {{40, 1500},{1600,3300}}
        rejectinterval = Interval@@({1,-1}+#&)/@targetlist
        Select[datalist,!IntervalMemberQ[rejectinterval,#]&]

Out[2]= {{40,1500},{1600,3300}}

Out[3]= Interval[{41,1499},{1601,3299}]

Out[4]= {11,21,31,1501,1511,1521,1531,1541,1551,1561,1571,1581,1591,
         3301,3311,3321,3331,3341,3351,3361,3371,3381,3391,3401,3411}
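The {1,-1} and {.5,-.5} offsets used in this thread assume integer data. For real-valued data, one possible alternative is to test the open intervals directly, which keeps the endpoints exactly as the original findsubset does. The sketch below is an illustration only: the name subsetsExact is made up, and no timing claim is made for it.

```mathematica
(* subsetsExact is a hypothetical name; tarList is a list of {lower, upper} pairs *)
subsetsExact[data_?VectorQ, tarList_?MatrixQ] :=
  Select[data,
    (* keep x unless it lies strictly inside at least one hole *)
    Function[x, Not[Or @@ ((#1 < x < #2 &) @@@ tarList)]]]

subsetsExact[{39.9, 40, 40.2, 1500, 1500.3}, {{40, 1500}}]
(* {39.9, 40, 1500, 1500.3} *)
```

On the thread's example, subsetsExact[Range[11, 3411, 10], {{40, 1500}, {1600, 3300}}] gives the same list as Out[13] in the original post.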