From: aleksa on
I've started programming the AT91SAM9260, and some
things are not clear to me.

At first, sam-ba was giving me problems
(http://embdev.net/topic/184418),
so I created a small win32 app to mimic sam-ba.

On reset, the sam9260 sends "RomBOOT>" and then listens for commands.
I then issue some commands to write code to internal SRAM
and then execute it. All that works.

Speed is what interests me now, so I set the LED,
then clear it, in a loop. The loop has some 80 sets/clears
and a single jump, to minimize the effect of the jump's execution time.

The instructions are:
str r3, [r2, #-203] ; 0xcb // clear
str r3, [r2, #-207] ; 0xcf // set

I have an 18.432 MHz clock, and this is what I use for the PLL:

#define PLLA_SETTINGS 0x2060BF09

or, to write it in a more readable way:

#define DIVA 9
#define PLLACOUNT 63
#define OUTA 2
#define MULA 96
#define PLLA_SETTINGS (1 << 29 | MULA << 16 | OUTA << 14 | PLLACOUNT << 8 | DIVA)
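
As a sanity check on the two forms of PLLA_SETTINGS, here is a minimal sketch that packs the fields and computes the resulting PLLA frequency. The helper names plla_pack and plla_output_hz are hypothetical, and the field widths and the formula Fout = Fin * (MULA + 1) / DIVA are my reading of the CKGR_PLLAR description in the datasheet, so they should be verified there.

```c
#include <assert.h>
#include <stdint.h>

/* Pack the CKGR_PLLAR fields the same way the PLLA_SETTINGS macro does.
 * Bit 29 is always written as 1. The field widths checked below are
 * assumptions taken from the datasheet register description. */
static uint32_t plla_pack(uint32_t mula, uint32_t outa,
                          uint32_t pllacount, uint32_t diva)
{
    assert(mula <= 0x7FF);      /* MULA: 11 bits, [26:16] */
    assert(outa <= 0x3);        /* OUTA: 2 bits, [15:14] */
    assert(pllacount <= 0x3F);  /* PLLACOUNT: 6 bits, [13:8] */
    assert(diva <= 0xFF);       /* DIVA: 8 bits, [7:0] */
    return (1u << 29) | (mula << 16) | (outa << 14) | (pllacount << 8) | diva;
}

/* PLLA output in Hz, assuming Fout = Fin * (MULA + 1) / DIVA.
 * DIVA = 0 is treated as "divider output off" and returns 0. */
static uint64_t plla_output_hz(uint64_t fin_hz, uint32_t mula, uint32_t diva)
{
    if (diva == 0)
        return 0;
    return fin_hz * (mula + 1u) / diva;
}
```

With MULA = 96, OUTA = 2, PLLACOUNT = 63 and DIVA = 9, plla_pack() gives 0x2060BF09, matching the hex constant, and plla_output_hz(18432000, 96, 9) gives 198656000, i.e. 198.656 MHz.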

The PLL's R1, C1 and C2 are 1k, 10n and 1n, as in all schematics.

Before I touch PLLA, I first switch the master clock from PLLB
(which RomBOOT was using) to the slow clock.

The fastest I got was 40ns per instruction
(i.e. 40ns per set, 40ns per clear),
and that is only if the MDIV field in PMC_MCKR is set to 00
(master clock is the processor clock).

If MDIV is 01 (master clock is processor clock divided by 2)
then I have 60ns per instruction. If it were 80ns, that
would make more sense... but it isn't.
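
To put the two measurements in terms of clock cycles, here is a small sketch. It assumes the processor clock equals the PLLA output with no prescaler (198.656 MHz, taking Fout = Fin * (MULA + 1) / DIVA), and that the master clock is the processor clock divided by 1 or 2 depending on MDIV. The names ns_to_cycles, mck_hz and PCK_HZ are hypothetical.

```c
#include <assert.h>

/* Assumed processor clock: PLLA output, no prescaler. */
#define PCK_HZ 198656000.0

/* Convert a measured duration in nanoseconds to a number of clock cycles. */
static double ns_to_cycles(double ns, double clock_hz)
{
    return ns * 1e-9 * clock_hz;
}

/* Master clock for a given divisor (1 for MDIV = 00, 2 for MDIV = 01). */
static double mck_hz(unsigned mdiv_divisor)
{
    assert(mdiv_divisor != 0);
    return PCK_HZ / (double)mdiv_divisor;
}
```

Under these assumptions, 40ns at MDIV = 00 is about 7.9 processor cycles (which are also master clock cycles in that setting), while 60ns at MDIV = 01 is about 11.9 processor cycles, or about 6 master clock cycles.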

First of all, I thought that the master clock had nothing to
do with the CPU clock, i.e. with the execution of CPU instructions.
I thought that the master clock was for SDRAM, peripherals,
sampling the inputs, etc., but not for clocking CPU instructions.

Or are instructions that access peripherals executed
differently from the others?