related lists mean value [Python]


From: John Posner on 8 Mar 2010 21:43

On 3/8/2010 9:39 PM, John Posner wrote:

<snip>

> # gather data
> tally_dict = defaultdict(Tally)
> for i in range(len(x)):
>     obj = tally_dict[y[i]]
>     obj.id = y[i]    <--- statement redundant, remove it
>     obj.total += x[i]
>     obj.count += 1

-John

From: John Posner on 8 Mar 2010 21:53

On 3/8/2010 9:43 PM, John Posner wrote:
> On 3/8/2010 9:39 PM, John Posner wrote:
>
> <snip>

>> obj.id = y[i] <--- statement redundant, remove it

Sorry for the thrashing! It's more correct to say that the Tally class
doesn't require an "id" attribute at all. So the code becomes:

#---------
from collections import defaultdict

class Tally:
    def __init__(self):
        self.total = 0
        self.count = 0

x = [1 ,2, 8, 5, 0, 7]
y = ['a', 'a', 'b', 'c', 'c', 'c']

# gather data
tally_dict = defaultdict(Tally)
for i in range(len(x)):
    obj = tally_dict[y[i]]
    obj.total += x[i]
    obj.count += 1

# process data
result_list = []
for key in sorted(tally_dict):
    obj = tally_dict[key]
    mean = 1.0 * obj.total / obj.count
    result_list.extend([mean] * obj.count)
print(result_list)
#---------

-John

From: Michael Rudolf on 9 Mar 2010 05:30

On 08.03.2010 23:34, dimitri pater - serpia wrote:
> Hi,
>
> I have two related lists:
> x = [1 ,2, 8, 5, 0, 7]
> y = ['a', 'a', 'b', 'c', 'c', 'c' ]
>
> what I need is a list representing the mean value of 'a', 'b' and 'c'
> while maintaining the number of items (len):
> w = [1.5, 1.5, 8, 4, 4, 4]

This kinda looks like you used the wrong data structure.
Maybe you should have used a dict, like:
{'a': [1, 2], 'c': [5, 0, 7], 'b': [8]} ?

> I have looked at iter(tools) and next(), but that did not help me. I'm
> a bit stuck here, so your help is appreciated!

As I said, I'd have used a dict in the first place, so let's transform
this straightforwardly into one:

x = [1 ,2, 8, 5, 0, 7]
y = ['a', 'a', 'b', 'c', 'c', 'c' ]

# initialize dict
d = {}
for idx in set(y):
    d[idx] = []

#collect values
for i, idx in enumerate(y):
    d[idx].append(x[i])

print("d is now a dict of lists: %s" % d)

#calculate average
for key, values in d.items():
    d[key] = sum(values) / len(values)

print("d is now a dict of averages: %s" % d)

# build the final list
w = [ d[key] for key in y ]

print("w is now the list of averages, corresponding with y:\n \
\n x: %s \n y: %s \n w: %s \n" % (x, y, w))

Output is:
d is now a dict of lists: {'a': [1, 2], 'c': [5, 0, 7], 'b': [8]}
d is now a dict of averages: {'a': 1.5, 'c': 4.0, 'b': 8.0}
w is now the list of averages, corresponding with y:

x: [1, 2, 8, 5, 0, 7]
y: ['a', 'a', 'b', 'c', 'c', 'c']
w: [1.5, 1.5, 8.0, 4.0, 4.0, 4.0]

Could have used a defaultdict to avoid dict initialisation, though.
Or write a custom class:

x = [1 ,2, 8, 5, 0, 7]
y = ['a', 'a', 'b', 'c', 'c', 'c' ]

class A:
    def __init__(self):
        self.store = {}
    def add(self, key, number):
        if key in self.store:
            self.store[key].append(number)
        else:
            self.store[key] = [number]
a = A()

# collect data
for idx, val in zip(y, x):
    a.add(idx, val)

# build the final list:
w = [ sum(a.store[key])/len(a.store[key]) for key in y ]

print("w is now the list of averages, corresponding with y:\n \
\n x: %s \n y: %s \n w: %s \n" % (x, y, w))

Produces same output, of course.

Note that neither of these solutions is very efficient, but who cares ;)
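
The defaultdict variant mentioned above might look roughly like the sketch below. It is only an illustration, not tested against anything beyond the sample lists; the names groups and means are made up for this example. It groups the values in one pass and computes each mean once instead of once per element of y:

```python
from collections import defaultdict

x = [1, 2, 8, 5, 0, 7]
y = ['a', 'a', 'b', 'c', 'c', 'c']

# collect values per key; defaultdict(list) creates the empty list
# on first access, so no separate initialisation loop is needed
groups = defaultdict(list)
for key, val in zip(y, x):
    groups[key].append(val)

# compute every mean exactly once (float() keeps this correct on Python 2 too)
means = {key: sum(vals) / float(len(vals)) for key, vals in groups.items()}

# build the final list in the order given by y
w = [means[key] for key in y]

print(w)  # [1.5, 1.5, 8.0, 4.0, 4.0, 4.0]
```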

> thanks!

No problem,

Michael

From: Steve Howell on 9 Mar 2010 10:21

On Mar 8, 6:39 pm, John Posner <jjpos...(a)optimum.net> wrote:
> On 3/8/2010 5:34 PM, dimitri pater - serpia wrote:
>
> > Hi,
>
> > I have two related lists:
> > x = [1 ,2, 8, 5, 0, 7]
> > y = ['a', 'a', 'b', 'c', 'c', 'c' ]
>
> > what I need is a list representing the mean value of 'a', 'b' and 'c'
> > while maintaining the number of items (len):
> > w = [1.5, 1.5, 8, 4, 4, 4]
>
> > I have looked at iter(tools) and next(), but that did not help me. I'm
> > a bit stuck here, so your help is appreciated!
>
> Nobody expects object-orientation (or the Spanish Inquisition):
>

Heh. Yep, I avoided OO for this. Seems like a functional problem.
My solution is functional on the outside, imperative on the inside.
You could add recursion here, but I don't think it would be as
straightforward.

def num_dups_at_head(lst):
    assert len(lst) > 0
    val = lst[0]
    i = 1
    while i < len(lst) and lst[i] == val:
        i += 1
    return i

def smooth(x, y):
    result = []
    while x:
        cnt = num_dups_at_head(y)
        avg = sum(x[:cnt]) * 1.0 / cnt
        result += [avg] * cnt
        x = x[cnt:]
        y = y[cnt:]
    return result
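
For comparison, the same run-based averaging can probably be sketched with itertools.groupby, which does the "count duplicates at the head" step for you. This is an alternative written for illustration, not the code above; smooth_groupby is a made-up name, and like smooth() it averages each consecutive run of equal labels separately:

```python
from itertools import groupby

def smooth_groupby(x, y):
    """Average x over each run of consecutive equal labels in y."""
    result = []
    pos = 0  # index in x where the current run starts
    for _, run in groupby(y):
        cnt = len(list(run))          # length of this run of equal labels
        chunk = x[pos:pos + cnt]      # the values belonging to the run
        result.extend([sum(chunk) / float(cnt)] * cnt)
        pos += cnt
    return result

print(smooth_groupby([1, 2, 8, 5, 0, 7], ['a', 'a', 'b', 'c', 'c', 'c']))
# [1.5, 1.5, 8.0, 4.0, 4.0, 4.0]
```

Unlike the slicing version, this does not copy the tail of both lists on every step.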

> #-------------------------
> from collections import defaultdict
>
> class Tally:
>     def __init__(self, id=None):
>         self.id = id
>         self.total = 0
>         self.count = 0
>
> x = [1 ,2, 8, 5, 0, 7]
> y = ['a', 'a', 'b', 'c', 'c', 'c']
>
> # gather data
> tally_dict = defaultdict(Tally)
> for i in range(len(x)):
>     obj = tally_dict[y[i]]
>     obj.id = y[i]
>     obj.total += x[i]
>     obj.count += 1
>
> # process data
> result_list = []
> for key in sorted(tally_dict):
>     obj = tally_dict[key]
>     mean = 1.0 * obj.total / obj.count
>     result_list.extend([mean] * obj.count)
> print(result_list)
> #-------------------------

From: Steve Howell on 9 Mar 2010 11:29

On Mar 8, 2:34 pm, dimitri pater - serpia <dimitri.pa...(a)gmail.com>
wrote:
> Hi,
>
> I have two related lists:
> x = [1 ,2, 8, 5, 0, 7]
> y = ['a', 'a', 'b', 'c', 'c', 'c' ]
>
> what I need is a list representing the mean value of 'a', 'b' and 'c'
> while maintaining the number of items (len):
> w = [1.5, 1.5, 8, 4, 4, 4]
>

What results are you expecting if you have multiple runs of 'a' in a
longer list?
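
To make the question concrete, here is a small sketch with made-up data (not from the original post) in which 'a' appears in two separate runs. The two readings of the problem give different answers; which one is wanted is exactly what is unclear:

```python
from collections import defaultdict
from itertools import groupby

# hypothetical input: 'a' occurs in two separate runs
x = [1, 3, 10, 5, 7]
y = ['a', 'a', 'b', 'a', 'a']

# Reading 1: one mean per label, over all values that share the label
by_label = defaultdict(list)
for key, val in zip(y, x):
    by_label[key].append(val)
per_label = [sum(by_label[k]) / float(len(by_label[k])) for k in y]

# Reading 2: one mean per consecutive run of equal labels
per_run = []
pos = 0
for _, run in groupby(y):
    cnt = len(list(run))
    chunk = x[pos:pos + cnt]
    per_run.extend([sum(chunk) / float(cnt)] * cnt)
    pos += cnt

print(per_label)  # [4.0, 4.0, 10.0, 4.0, 4.0]
print(per_run)    # [2.0, 2.0, 10.0, 6.0, 6.0]
```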
