From: rickman on
Wilco Dijkstra wrote:
> Sure, the current devices use a lot of power. We'll have to wait till
> production, they might fix the issues and hopefully move to a
> better process. However they are ahead on other aspects, I like the
> 50MHz zero-waitstate flash - nobody has flash that fast.

Aren't you aware that the current devices *are* production? Most of
our apps are very power sensitive and I asked when they would be moving
to a newer, lower power process. The answer was that they have no
plans.


> Dhrystone numbers are available everywhere, including from Luminary:
>
> ARM7: ARM 0.9, Thumb 0.7 DMIPS/MHz
> ARM9: ARM 1.1, Thumb 0.9 DMIPS/MHz
> Cortex-M3 1.25 DMIPS/MHz
>
> These are comparisons of ARM7/ARM9/M3 hardware using the
> ARM compiler running from 32-bit zero-waitstate memory. As you
> know speeds will differ depending on the flash implementation. So
> if you want to know the speed of a particular MCU, you have to
> benchmark it yourself (manufacturers typically quote max speed
> of ARM code running from SRAM).

Yes, so these are not measured benchmarks at all, they are simulations.
We can assume they were realistic about the Cortex processor, but do
we know anything about how realistic the assumptions are about the ARM7
and ARM9 simulations?


> I always recommend you benchmark your own application on your
> MCU using your toolset rather than rely on micro benchmarks done
> by others. As I've explained before, Dhrystone is not the best possible
> benchmark, it underestimates the difference between ARM and Thumb
> and overestimates micro architecture features like fast branches.

That is what people often say when claims are made about power levels
on the LM parts. In reality there currently is no data that actually
compares real ARM7 to real Cortex processors.


> It is a well-known fact that 32-bit RISC CPUs are many times faster than
> 8/16-bit (mostly CISC) chips - around 20 times in the case of 8051...
> Yes, 1 ARM instruction can do the work of around 20 8-bit instructions!
> There aren't many comparisons available because they are hard to do
> and mostly pointless as you know who is going to win. One glance at
> the cycle timings does it for me :-)

But as you say, there are many factors involved when comparing speeds,
not just cycle counts or clock speeds. For now I won't believe
that the CM3 processors are any better than the ARM7 parts until I see
evidence measured on real parts.

From: Wilco Dijkstra on

"rickman" <gnuarm(a)gmail.com> wrote in message
news:1160312205.310331.133880(a)m7g2000cwm.googlegroups.com...
> Wilco Dijkstra wrote:
>> Sure, the current devices use a lot of power. We'll have to wait till
>> production, they might fix the issues and hopefully move to a
>> better process. However they are ahead on other aspects, I like the
>> 50MHz zero-waitstate flash - nobody has flash that fast.
>
> Aren't you aware that the current devices *are* production?

Where did you get that from? Volume production was scheduled
for Q3, and it will take a few months before first shipments. And
all documentation still refers to the initial preproduction run.

>Most of
> our apps are very power sensitive and I asked when they would be moving
> to a newer, lower power process. The answer was that they have no
> plans.

That's a shame.

>> Dhrystone numbers are available everywhere, including from Luminary:
>>
>> ARM7: ARM 0.9, Thumb 0.7 DMIPS/MHz
>> ARM9: ARM 1.1, Thumb 0.9 DMIPS/MHz
>> Cortex-M3 1.25 DMIPS/MHz
>>
>> These are comparisons of ARM7/ARM9/M3 hardware using the
>> ARM compiler running from 32-bit zero-waitstate memory. As you
>> know speeds will differ depending on the flash implementation. So
>> if you want to know the speed of a particular MCU, you have to
>> benchmark it yourself (manufacturers typically quote max speed
>> of ARM code running from SRAM).
>
> Yes, so these are not measured benchmarks at all, they are simulations.

Hardware simulation (whether in software or on an FPGA) is as accurate
as real hardware. They are based on the same RTL and so give identical
results, the only difference is elapsed time... So yes, these are real
measurements.

> We can assume they were realistic about the Cortex processor, but do
> we know anything about how realistic the assumptions are about the ARM7
> and ARM9 simulations?

The ARM7 and ARM9 numbers are realistic when running from single
cycle memory, so you get identical results on chips that have a cache,
SRAM or TCM (eg. ARM920, ARM946). On MCUs with flash speeds
can vary greatly due to width and waitstates. The Philips and Atmel
chips are slowed down by waitstates, the Luminary devices aren't.
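
To put the per-MHz figures quoted above into absolute terms, here is a
minimal sketch that multiplies DMIPS/MHz by the clock frequency. The
waitstate penalty is a deliberately crude assumption (every access stalls
for the full number of waitstates, with no prefetch buffer or flash
accelerator) and does not model any particular Philips, Atmel or Luminary
part. The function name dmips() is hypothetical.

```python
# DMIPS/MHz figures quoted in this thread (32-bit zero-waitstate memory).
DMIPS_PER_MHZ = {
    "ARM7 (ARM)": 0.9,
    "ARM7 (Thumb)": 0.7,
    "ARM9 (ARM)": 1.1,
    "ARM9 (Thumb)": 0.9,
    "Cortex-M3": 1.25,
}


def dmips(core, clock_mhz, waitstates=0):
    """Estimate DMIPS for a core at a given clock.

    ASSUMPTION: each waitstate stretches every cycle by one full cycle
    (worst case). Real MCUs with prefetch or wide flash do better.
    """
    return DMIPS_PER_MHZ[core] * clock_mhz / (1 + waitstates)


if __name__ == "__main__":
    for core in DMIPS_PER_MHZ:
        fast = dmips(core, 50)
        slow = dmips(core, 50, waitstates=1)
        print(f"{core:13s} 50 MHz: {fast:6.2f} DMIPS (0 ws), {slow:6.2f} DMIPS (1 ws)")
```

With these inputs a Cortex-M3 at 50 MHz and no waitstates comes out at
62.5 DMIPS. The waitstate column is only an upper bound on the penalty,
so measuring on the actual MCU remains the real test.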

>> I always recommend you benchmark your own application on your
>> MCU using your toolset rather than rely on micro benchmarks done
>> by others. As I've explained before, Dhrystone is not the best possible
>> benchmark, it underestimates the difference between ARM and Thumb
>> and overestimates micro architecture features like fast branches.
>
> That is what people often say when claims are made about power levels
> on the LM parts. In reality there currently is no data that actually
> compares real ARM7 to real Cortex processors.

You won't ever see public benchmarking info from ARM or any of
its licensees about actual chips, this is all under NDA. The best you
get is amateurish rubbish like the Keil benchmarking pages...

I agree there is a general lack of impartial benchmarking, EEMBC is
a mess (more marketing than benchmarking) and there is nothing else.

>> It is a well-known fact that 32-bit RISC CPUs are many times faster than
>> 8/16-bit (mostly CISC) chips - around 20 times in the case of 8051...
>> Yes, 1 ARM instruction can do the work of around 20 8-bit instructions!
>> There aren't many comparisons available because they are hard to do
>> and mostly pointless as you know who is going to win. One glance at
>> the cycle timings does it for me :-)
>
> But as you say, there are many factors involved when comparing speeds,
> not just cycle counts or clock speeds.

Like using the right compiler (and options) for example...

Wilco


From: Jim Granville on
Wilco Dijkstra wrote:
> "rickman" <gnuarm(a)gmail.com> wrote in message
> news:1160312205.310331.133880(a)m7g2000cwm.googlegroups.com...
>
>>Wilco Dijkstra wrote:
>>
>>>Sure, the current devices use a lot of power. We'll have to wait till
>>>production, they might fix the issues and hopefully move to a
>>>better process. However they are ahead on other aspects, I like the
>>>50MHz zero-waitstate flash - nobody has flash that fast.
>>
>>Aren't you aware that the current devices *are* production?
>
>
> Where did you get that from? Volume production was scheduled
> for Q3, and it will take a few months before first shipments. And
> all documentation still refers to the initial preproduction run.

Ouch - so design these in slowly, then?

>
>>Most of
>>our apps are very power sensitive and I asked when they would be moving
>>to a newer, lower power process. The answer was that they have no
>>plans.
>
>
> That's a shame.

It is for Luminary :) - but low power is not a rare requirement.

Others are not standing still:

http://www.automotivedesignline.com/products/193006369;jsessionid=HVMIQR1HTAYY0QSNDLQSKHSCJUNN2JVN

These are now 72 MHz devices, but comparing MHz alone is not such a big
deal anymore. These are microcontrollers, and MANY things dictate design
selection apart from some MHz spec. These new Philips devices have DMA
(as do Atmel's) and Philips keep expanding the peripheral support.
I see DMA, CAN, USB and Ethernet on the Philips devices, at a price
point that will depress Luminary investors...


<snip>
>
> You won't ever see public benchmarking info from ARM or any of
> its licensees about actual chips, this is all under NDA. The best you
> get is amateurish rubbish like the Keil benchmarking pages...
>
> I agree there is a general lack of impartial benchmarking, EEMBC is
> a mess (more marketing than benchmarking) and there is nothing else.

Amazing, so ARM restricts what their users can say?
Do ARM not realize that's a rather dumb thing to do, and will
harm their users' efforts to compete against a growing number of
competing devices?

-jg

From: Wilco Dijkstra on

"Jim Granville" <no.spam(a)designtools.maps.co.nz> wrote in message
news:45294ac8$1(a)clear.net.nz...
> Wilco Dijkstra wrote:
>> "rickman" <gnuarm(a)gmail.com> wrote in message
>> news:1160312205.310331.133880(a)m7g2000cwm.googlegroups.com...

>> Where did you get that from? Volume production was scheduled
>> for Q3, and it will take a few months before first shipments. And
>> all documentation still refers to the initial preproduction run.
>
> Ouch - so design these in slowly, then?

Do you have any idea how long it takes from start of production to
actual shipment? If anything, LM is very aggressive in their timescales.

> http://www.automotivedesignline.com/products/193006369;jsessionid=HVMIQR1HTAYY0QSNDLQSKHSCJUNN2JVN
>
> These are now 72 MHz devices, but comparing MHz alone is not such a big
> deal anymore. These are microcontrollers, and MANY things dictate design
> selection apart from some MHz spec.

Maximum frequency has never been a big deal in the ARM world - most
MCUs run at a fraction of their maximum frequency. I've never heard
anybody complaining that ARM chips are too slow!

> These new Philips devices have DMA (as do Atmel's) and Philips keep
> expanding the peripheral support.
> I see DMA, CAN, USB and Ethernet on the Philips devices, at a price point
> that will depress Luminary investors...

It's good competition is heating up a bit - low-cost highly integrated
devices like that mean more 8/16-bit users will switch.

>> You won't ever see public benchmarking info from ARM or any of
>> its licensees about actual chips, this is all under NDA. The best you
>> get is amateurish rubbish like the Keil benchmarking pages...
>>
>> I agree there is a general lack of impartial benchmarking, EEMBC is
>> a mess (more marketing than benchmarking) and there is nothing else.
>
> Amazing, so ARM restricts what their users can say?

It's not just ARM, it is common across the industry. Many commercial
compiler vendors (ARM included) have anti-benchmarking clauses
which stop users from publishing benchmark scores.

EEMBC forbids any publication of scores until chips are finished, runs
are certified and published on their website. This takes a lot of time
and money, so everybody just shows scores to customers under NDA
and never certifies or publishes them.

> Do ARM not realize that's a rather dumb thing to do, and will
> harm their users' efforts to compete against a growing number of
> competing devices?

I agree it is a bad strategy indeed. How much harm it does I'm not
sure. At the low end people are more interested in codesize and
peripherals rather than performance, and the new competitors
(eg. AVR32, ZNEO) are not aiming at performance at all.

Wilco


From: rickman on
> rickman wrote:
> > > Which ARM and AVR did you compare? At what speed?
>
> > ATmega128 and SAM7S64 at 4 MHz.
>
>
> Hmm, the PLL on the SAM7 probably takes more current than the entire
> ATmega128, but I guess you can disable that and run at 4 MHz directly
> from the crystal. Do you remember what your total current was at
> 4 MHz on the SAM7 device?
>
> steve


I can't reply directly to Steve because Google won't let me reply to a
post older than 30 days. So I am replying to my post.

I don't recall the exact numbers or method that we used last spring to
figure this out, but I expect we measured it. I currently have a
spreadsheet from Atmel for the power consumption and it linearly
derates the power by frequency with no offset. 18.236 mA is the figure
it uses for 50 MHz. The only power drains I can't turn off in the
spreadsheet are 10 uA for the POR, 1 uA for the BOD (when off), 2 uA
for the RC oscillator and 2 uA for the ADC (when off). This is
executing from RAM with Flash disabled and everything off including the
internal LDO (external core power).
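
The derating described above can be sketched in a few lines. The 18.236 mA
at 50 MHz figure and the four fixed drains are the ones quoted from the
spreadsheet. Treating the fixed drains as simply additive on top of the
derated core current is an assumption, and sam7_current_ma() is a
hypothetical name, not anything from Atmel.

```python
# Figures quoted above from the Atmel spreadsheet (running from RAM,
# flash disabled, peripherals off, external core supply).
I_CORE_MA_AT_50MHZ = 18.236
FIXED_DRAINS_UA = {
    "POR": 10,
    "BOD (off)": 1,
    "RC oscillator": 2,
    "ADC (off)": 2,
}


def sam7_current_ma(clock_mhz):
    """Linear derating with no offset, plus the always-on drains.

    ASSUMPTION: the fixed drains add directly to the derated core current.
    """
    core_ma = I_CORE_MA_AT_50MHZ * clock_mhz / 50.0
    fixed_ma = sum(FIXED_DRAINS_UA.values()) / 1000.0
    return core_ma + fixed_ma


if __name__ == "__main__":
    for mhz in (50, 4, 3.125):
        print(f"{mhz:7.3f} MHz -> {sam7_current_ma(mhz):.3f} mA")
```

Under that assumption the 4 MHz case from the original question works out
to roughly 1.47 mA for the bare core.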

In a real app where I would use every peripheral except for one of the
two UARTs and the USB, the current is still below 30 mA at 50 MHz or 2
mA at 3.125 MHz. I think that will give any of the 8-bit parts a run
for their money, especially considering that the SAM7 can be clocked
slower given the higher performance of the CPU and the DMA that can
keep the CPU in sleep mode until I/O data has been transferred to
memory.

In very low speed mode the SAM7S parts can run under 35 uA with the CPU
chugging along at 500 Hz from RAM or 46 uA at 32 kHz. This is almost
as good as sleep mode!
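
As a rough cross-check of the two low-speed figures above, a straight line
through them separates a frequency-dependent slope from a fixed floor.
This is only a two-point fit, and it assumes "32 kHz" means exactly
32.0 kHz (a 32.768 kHz crystal would shift the result slightly).
fit_line() is a hypothetical helper.

```python
def fit_line(f1_khz, i1_ua, f2_khz, i2_ua):
    """Return (slope in uA per kHz, floor in uA) of the line through two points."""
    slope = (i2_ua - i1_ua) / (f2_khz - f1_khz)
    floor = i1_ua - slope * f1_khz
    return slope, floor


if __name__ == "__main__":
    # Quoted points: 35 uA at 500 Hz and 46 uA at 32 kHz, running from RAM.
    slope, floor = fit_line(0.5, 35.0, 32.0, 46.0)
    print(f"slope: {slope:.4f} uA/kHz")
    print(f"floor: {floor:.2f} uA")
    # Extrapolate the slope alone to 50 MHz (50000 kHz), in mA.
    print(f"slope at 50 MHz: {slope * 50000 / 1000:.2f} mA")
```

Under those assumptions the floor comes out a little under 35 uA and the
slope at roughly 0.35 uA per kHz, which extrapolates to about 17.5 mA at
50 MHz, in the same range as the spreadsheet's 18.236 mA.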