From: blamm64 on
I have a couple of functions designed to poke a single hole, and to
poke multiple holes, in a one-level list:

We define a function which, given the imported pressure data, finds
the subset of that pressure data excluding the pressure data points
between "targetL" and "targetU".

In[5]:= findsubset[data_?VectorQ,targetL_?NumericQ,targetU_?NumericQ] :=
Select[data,(#<=targetL || #>=targetU &)]

This function will pluck out multiple holes in the data list.

In[6]:= subsets[data_?VectorQ,tarList_?ListQ]:=Module[{tmp,tmp1},
tmp=data;
Do[tmp1=findsubset[tmp,tarList[[i,1]],tarList[[i,2]]];tmp=tmp1,
{i,Dimensions[tarList][[1]]}];
tmp
]

The following works fine (the holes are chosen large so that the result stays short):

In[7]:= datalist=Range[11,3411,10];

In[12]:= targetlist={{40, 1500},{1600,3300}};

In[13]:= resultdata=subsets[datalist,targetlist]

Out[13]=
{11,21,31,1501,1511,1521,1531,1541,1551,1561,1571,1581,1591,3301,3311,3321,3331,3341,3351,3361,3371,3381,3391,3401,3411}

But if "datalist" happens to be very large, surely there is a (much)
more efficient method?

I tried unsuccessfully to use pure functions with Select, but I have a
somewhat nebulous feeling there's a pure-function way of doing this
much more efficiently.

I know, I know: the above have no consistency checking. I also know
"subsets" could be used in place of "findsubset" just by replacing the
call of "findsubset" with the code of "findsubset" in "subsets".

From what I've seen on this forum there are some really experienced
people who might provide an efficient way of implementing the above.

-Brian L.

From: Daniel Lichtblau on
blamm64 wrote:
> But if "datalist" happens to be very large, surely there is a (much)
> more efficient method?

If you are working with integers then the method below should be fine.
Otherwise you may need to "fuzzify" a bit differently. I use
IntervalMemberQ to determine which elements in the data list to omit,
and then do the selection using Select (I tried Pick, and it was
perhaps half a hair slower). Each target interval is shrunk by 0.5 at
both ends so that integer endpoints are kept, as in the original
subsets.

subsets2[data_?VectorQ,tarList_?ListQ] := Module[
{intv=Apply[Interval,Map[#+{.5,-.5}&,tarList]]},
Select[data, !IntervalMemberQ[intv,#]&]]
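
For non-integer data, one way to avoid the half-unit shift is to test
the strict inequalities directly. The following is only a minimal
sketch, assuming the intent is to drop just the points strictly between
the bounds. The names keepQ and subsetsOpen are hypothetical (they do
not appear in the posts above), and this version has not been timed
against the Interval approach.

```mathematica
(* keepQ and subsetsOpen are hypothetical names used for illustration *)
(* True unless x lies strictly inside some {lo, hi} pair of tarList *)
keepQ[x_, tarList_] :=
  Not[Apply[Or, Map[(#[[1]] < x < #[[2]]) &, tarList]]]

subsetsOpen[data_?VectorQ, tarList_?MatrixQ] :=
  Select[data, keepQ[#, tarList] &]

(* Endpoints survive, as with the original findsubset *)
subsetsOpen[{39.5, 40, 40.2, 1500, 1600.}, {{40, 1500}, {1600, 3300}}]
(* {39.5, 40, 1500, 1600.} *)
```

Because every element is compared against every pair, this is likely
slower than the Interval version when tarList is long; it is meant to
show the endpoint handling, not to be the fastest option.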

Here is a quick but slightly large test.

datalist = RandomInteger[11000,100000];
targetlist = Table[{n,n+20}, {n,100,10000,100}];

In[47]:= Timing[resultdata = subsets[datalist,targetlist];]
Out[47]= {14.4878, Null}

In[48]:= Timing[resultdata2 = subsets2[datalist,targetlist];]
Out[48]= {0.179973, Null}

In[49]:= resultdata === resultdata2
Out[49]= True

In[50]:= Length[resultdata2]
Out[50]= 82596

Daniel Lichtblau
Wolfram Research

From: Raffy on
On Nov 20, 3:38 am, blamm64 <blam...(a)charter.net> wrote:
> But if "datalist" happens to be very large, surely there is a (much)
> more efficient method?

I didn't do any speed testing yet, but this functionality is available
through Interval.

With[{interval = Interval[{40, 1500}, {1600, 3300}]},
Select[Range[11, 123123, 10], ! IntervalMemberQ[interval, #] &]
]
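
One caveat worth checking before relying on this form: Interval is
closed, so a data point equal to a bound is removed, whereas the
original findsubset keeps it. With the test data above no point lands
exactly on a bound, so the results agree. A small sketch of the
difference, with findsubset repeated here so the example is
self-contained:

```mathematica
(* findsubset as defined in the original post *)
findsubset[data_?VectorQ, targetL_?NumericQ, targetU_?NumericQ] :=
  Select[data, (# <= targetL || # >= targetU &)]

(* The bound 40 is kept by the inequality test ... *)
findsubset[{30, 40, 50}, 40, 1500]
(* {30, 40} *)

(* ... but removed by the closed-interval membership test *)
Select[{30, 40, 50}, ! IntervalMemberQ[Interval[{40, 1500}], #] &]
(* {30} *)
```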

From: Bill Rowe on
On 11/20/09 at 6:38 AM, blamm64(a)charter.net (blamm64) wrote:

>I have a couple of functions designed to poke a single hole, and to
>poke multiple holes, in a one-level list:

>We define a function which, given the imported pressure data, finds
>the subset of that pressure data excluding the pressure data points
>between "targetL " and "targetU".

>In[5]:= findsubset[data_?VectorQ,targetL_?NumericQ,targetU_?
>NumericQ] := Select[data,(#<=targetL || #>=targetU &)]

On my machine the following gives the same result but executes faster:

fs[data_?VectorQ, targetL_?NumericQ, targetU_?NumericQ] :=
Join[SparseArray[Clip[data, {First[data], targetL}, {0, 0}]] /.
SparseArray[_, _, _, {_, _, a_}] -> a,
SparseArray[Clip[data, {targetU, Last[data]}, {0, 0}]] /.
SparseArray[_, _, _, {_, _, a_}] -> a]

>This function will pluck out multiple holes in the data list.

>In[6]:= subsets[data_?VectorQ,tarList_?ListQ]:=Module[{tmp,tmp1},
>tmp=data;
>Do[tmp1=findsubset[tmp,tarList[[i,1]],tarList[[i,2]]];tmp=tmp1,
>{i,Dimensions[tarList][[1]]}]; tmp
>]

I think the following does the same thing and, in my view, is simpler:

subs[data_?VectorQ, tarList_?ListQ] :=
Fold[fs[#1, First[#2], Last[#2]] &, data, tarList]
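
The Clip-based fs appears to rely on two properties of the input: the
data must be sorted in increasing order (so that First[data] and
Last[data] are the extremes), and it must not contain a genuine 0,
since zeros are what SparseArray discards. The sketch below illustrates
both points on small lists. The "NonzeroValues" property is used here as
an assumed, more readable stand-in for the pattern-based extraction; it
is not part of the code above.

```mathematica
data = {11, 21, 31, 41, 51};

(* Values outside {11, 25} are replaced by 0, per the {0, 0} argument *)
Clip[data, {11, 25}, {0, 0}]
(* {11, 21, 0, 0, 0} *)

(* Assumption: "NonzeroValues" returns the stored nonzero entries,
   which is what the SparseArray[_, _, _, {_, _, a_}] pattern reads *)
SparseArray[Clip[data, {11, 25}, {0, 0}]]["NonzeroValues"]

(* With unsorted input, First[data] is not the minimum, so smaller
   values are clipped to 0 and would be lost *)
Clip[{31, 11, 21}, {31, 40}, {0, 0}]
(* {31, 0, 0} *)
```

For sorted, nonzero data such as the pressure list in the question
neither restriction matters.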

>The following works fine (big holes chosen not to give large
>result):

>In[7]:= datalist=Range[11,3411,10];

>In[12]:= targetlist={{40, 1500},{1600,3300}};

First, to demonstrate that my solution produces the same result:

In[7]:= subs[datalist, targetlist] == subsets[datalist, targetlist]

Out[7]= True

and then timing data on my system

In[8]:= Timing[subsets[datalist, targetlist];]

Out[8]= {0.000894,Null}

In[9]:= Timing[subs[datalist, targetlist];]

Out[9]= {0.000175,Null}


From: Ray Koopman on
On Nov 20, 3:38 am, blamm64 <blam...(a)charter.net> wrote:
> But if "datalist" happens to be very large, surely there is a (much)
> more efficient method?

In[1]:=
datalist = Range[11,3411,10];
targetlist = {{40, 1500},{1600,3300}}
rejectinterval = Interval@@({1,-1}+#&)/@targetlist
Select[datalist,!IntervalMemberQ[rejectinterval,#]&]

Out[2]=
{{40,1500},{1600,3300}}

Out[3]=
Interval[{41,1499},{1601,3299}]

Out[4]=
{11,21,31,1501,1511,1521,1531,1541,1551,1561,1571,1581,1591,
3301,3311,3321,3331,3341,3351,3361,3371,3381,3391,3401,3411}