From: Wolfgang Kern on 8 Mar 2007 23:46

Hallo Guga,

> Thanks, Robert.. I think I got it..
> Assuming I'm using them continuously, I made a simple formula that
> shows the number of clock cycles those instructions take when used
> that way (continuously):
> Clocks = Latency+(Throughput*N-1)
> N = number of instructions used (all of the same type), like the 1000
> example you gave.
> Latency = latency of the mnemonic
> Throughput = throughput of the mnemonic
> Clocks = total number of clocks for the sequence of mnemonics used
> continuously
> Is that it?

It would also work if throughput is <1,
which means several instructions may execute in parallel.

There is a timing calculation example in my AMD docs ...
This formula spans half a page and is impractical for daily use,
so I just use the lists as prepared by AMD:

|Instruction-group |Latency |Throughput |affected PIPES|

Intel has similar lists, but I also missed SSE timing there.

I once had timing information in my x86 disassembler,
but it used latency values only.
As this was neither exact nor even close, I removed it for x86 entirely.
But other CPUs (good olde Z80 and followers) can work as RTCL :)

__
wolfgang
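The formula quoted above can be tried out with a small sketch. It assumes the intended reading is Latency + Throughput*(N-1): the first result is ready after the latency, and each further independent instruction completes one throughput interval later. The function name `pipelined_clocks` is hypothetical, the input values are only illustrative, and the model ignores dependencies, decode limits and cache effects.

```python
def pipelined_clocks(latency, throughput, n):
    """Estimate clocks for n independent instructions of the same type.

    Assumed reading of the formula in the post:
        clocks = latency + throughput * (n - 1)
    """
    if n < 1:
        return 0
    return latency + throughput * (n - 1)


# Illustrative values only: latency 1, throughput 0.5, 1000 instructions.
print(pipelined_clocks(1, 0.5, 1000))
```

With a throughput below 1 the estimate drops below one clock per instruction, which matches the remark that several instructions may execute in parallel.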
From: Guga on 9 Mar 2007 10:53

On Mar 8, 8:46 pm, "Wolfgang Kern" <nowh...(a)never.at> wrote:
> This formula spans half a page and is impractical for daily usage,
> so I just use the lists as prepared by AMD:
>
> |Instruction-group |Latency |Throughput |affected PIPES|
>
> Intel got similar lists, but I also missed SSE-timing there.

Hi Wolfgang,

nice to see you again :)

Those lists are a bit confusing.. I think I'll do as you did and just
use the AMD list with latencies. I'm trying to make a list
containing the clock cycles of each mnemonic, but there are so many
different processors that it only adds to the confusion.

The best list I have found so far is here:
http://www.logix.cz/michal/doc/i386/chp17-00.htm

Sure.. it is old.. it is for the 386, but it displays the clock cycles in
an easy-to-read way.

The list Robert provided also covers the general-purpose
mnemonics, like:

CMP/TEST latency: 1, Throughput = 0.5

So I presume they behave the same way as the SSE
instructions, right?

I mean, they work more or less like the formula I posted before,
right?

But.. if that is true... then why does this document say that Jcc doesn't
have a latency?

Table C10 says that for a 0F2 processor the latency of Jcc is "not
applicable", but it has a throughput of 0.5... How is that
possible?

If an instruction doesn't have a latency from which to compute the clock
cycles needed before it is issued.. how does it work? I mean, it _could_ work
from the throughput alone, but.. if the latency is 0, shouldn't the
throughput also be 0? I thought throughput and latency were
related to each other.

Best Regards,

Guga
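On the question of whether latency and throughput must move together, a common simplified model treats them as separate figures: latency matters when an instruction has to wait for the result of the previous one, throughput matters when instructions are independent. One plausible reading of the "not applicable" entry is that a conditional jump produces no register result for a later instruction to wait on, so only its issue rate is listed. The sketch below only illustrates that model; both function names are hypothetical and the numbers are illustrative.

```python
def independent_clocks(latency, throughput, n):
    """n independent instructions: a new one can start every `throughput` clocks."""
    if n < 1:
        return 0
    return latency + throughput * (n - 1)


def dependent_clocks(latency, n):
    """n instructions in a dependency chain: each waits for the previous result."""
    return latency * n


# Same instruction type (latency 1, throughput 0.5), two usage patterns.
print(independent_clocks(1, 0.5, 1000))  # bounded by throughput
print(dependent_clocks(1, 1000))         # bounded by latency
```

In this model a throughput figure is meaningful even when no latency is given, because it only describes how often a new instruction of that type can be started.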
From: Guga on 9 Mar 2007 13:20

On Mar 9, 7:53 am, "Guga" <Guga...(a)gmail.com> wrote:
> I'm trying to make a list
> containing the clock cycles of each mnemonic, but there are so many
> different processors that it only adds to the confusion.

Does anyone know where to get a list of the CPUID signatures of all
processors?

For example:
Pentium M - Banias is 0x69X
Pentium M - Dothan is 0x6DX
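Signatures such as 0x69X pack several fields into one value. As a minimal sketch, assuming the value is the EAX result of CPUID leaf 1 with the field layout documented by Intel and AMD (stepping in bits 0-3, model in bits 4-7, family in bits 8-11, type in bits 12-13, extended model in bits 16-19, extended family in bits 20-27), it can be split like this. The function name `decode_signature` and the dictionary keys are hypothetical.

```python
def decode_signature(eax):
    """Split a CPUID leaf-1 EAX signature into its documented bit fields."""
    stepping = eax & 0xF
    model = (eax >> 4) & 0xF
    family = (eax >> 8) & 0xF
    cpu_type = (eax >> 12) & 0x3
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # Extended family is added only when the base family is 0xF.
    display_family = (family + ext_family) if family == 0xF else family
    # Extended model is prepended for family 6 or 0xF (Intel's rule;
    # AMD documents it for family 0xF).
    display_model = ((ext_model << 4) | model) if family in (0x6, 0xF) else model

    return {
        "family": display_family,
        "model": display_model,
        "stepping": stepping,
        "type": cpu_type,
    }


# 0x6D8: a Dothan signature with stepping 8.
print(decode_signature(0x6D8))
```

Read this way, the trailing X in 0x69X and 0x6DX stands for the stepping digit, so one family/model pair covers every stepping of that core.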
From: Guga on 9 Mar 2007 18:26

On Mar 9, 10:20 am, "Guga" <Guga...(a)gmail.com> wrote:
> Does anyone know where to get a list of the CPUID signatures of all
> processors?
>
> For example:
> Pentium M - Banias is 0x69X
> Pentium M - Dothan is 0x6DX

Damn.. this is a hell of a lot of work.. but I'm building the list.
:):)

So far I've got:

CPUID   Name
04F4    AMD 5x86-133 P75 (X5) in 4x clock mode
0600    Cyrix/IBM 6x86MX PR166-266 or Cyrix MII PR300-433
0650    Pentium II / Celeron Processor Deschutes / Covington dA0 SECC / SEPP
0651    Pentium II / Celeron Processor Deschutes / Covington dA0 SECC / SECC2 / SEPP
0652    Pentium II Processor Deschutes dB0 SECC/SECC2
0653    Pentium II Processor Deschutes dB1 SECC/SECC2
0660    Intel Celeron-A 300/333/366/400 A0-step with 128 KB integrated L2 cache
0660    Intel Celeron Processor Mendocino mA0 SEPP
0665    Intel Celeron Processor Mendocino mB0 PPGA
0672    Pentium III Processor Katmai kB0 SECC2
0673    Pentium III Processor Katmai kC0 SECC2
0681    Pentium III Processor Coppermine cA2 SECC/SECC2
0681    Pentium III Processor Coppermine cA2 FC-PGA
0683    Pentium III Processor Coppermine cB0 SECC2
0683    Pentium III / Celeron Processor Coppermine cB0 FC-PGA / PPGA
0686    Pentium III Processor Coppermine cC0 SECC2
0686    Pentium III / Celeron Processor Coppermine cC0 FC-PGA / PPGA
068A    Pentium III / Celeron Processor Coppermine cD0 FC-PGA / PPGA
069X    80686 - Pentium M - Banias
06B1    Intel® Celeron® processor
06B1    Pentium III / Celeron Processor Tualatin tA1 PPGA-370
06B4    Pentium III / Celeron Processor Tualatin tB1 PPGA-370
06DX    80686 - Pentium M - Dothan
06D8    Pentium M 740 Processor 1.73GHz Processor
06E8    Core Solo T1300 1.66GHz Processor - 32-bit Dynamic Execution Microarchitecture
06F5    Xeon Dual-Core 3040 1.86GHz Processor. 64-bit Core Microarchitecture
06F5    Xeon Dual-Core 3060 2.4GHz Processor. 64-bit Core Microarchitecture
06F5    Xeon Dual-Core 3050 2.13GHz Processor. 64-bit Core Microarchitecture
06F5    Xeon Dual-Core 3050 2.13GHz Processor - 64-bit Core Microarchitecture
06F6    Core2 Duo T5500 1.66GHz Mobile Processor - 64-bit Core Microarchitecture
06F6    Intel Core2 Duo T7400 Mobile Processor
06F6    Core2 Duo T5600 1.83GHz Mobile Processor. 64 Bit Core Microarchitecture
06F6    Core2 Duo T5200 2.0GHz Mobile Processor. 64 Bit Core Microarchitecture
06F6    Core2 Duo T7400 2.16GHz Mobile Processor. 64 Bit Core Microarchitecture
06F7    Intel Core 2 Extreme QX6700 Processor
06F7    Intel Core 2 Quad Q6600 Kentsfield
07A0    AMD Athlon XP 2600
0F07    Pentium 4 Processor Willamette B2 PPGA-423 INT2
0F0A    Pentium 4 Processor Willamette C1 PPGA-423 INT2
0F0A    Pentium 4 Processor Willamette C1 PPGA-478 FC-PGA2
0F12    Intel Pentium 4 P68, Willamette, A80528
0F12    Pentium 4 Processor Willamette D0 PPGA-423 INT2
0F12    Pentium 4 Processor Willamette D0 PPGA-478 FC-PGA2
0F13    Pentium 4 / Celeron Processor Willamette E0 PPGA-478 FC-PGA2
0F24    Pentium 4 Processor Northwood B0 PPGA-478
0F27    Pentium 4 / Celeron Processor Northwood C1 PPGA-478
0F29    Pentium 4 / Celeron Processor Northwood D1 PPGA-478
0F33    Pentium 4 / Celeron Processor Prescott C0 All
0F34    Xeon (Nocona)
0F34    Pentium 4 (Prescott)
0F41    Intel® Celeron® D 336 64 Bit NetBurst Microarchitecture
0F41    Intel® Celeron® D 346 64 Bit NetBurst Microarchitecture
0F41    Celeron D 331 2.66GHz Processor 64-bit
0F41    Celeron D 336 2.80GHz Processor 64-bit
0F41    Celeron D 351 3.20GHz Processor - 64 Bit NetBurst Microarchitecture
0F41    Celeron D 346 3.06GHz Processor - 64 Bit NetBurst Microarchitecture
0F41    Pentium 4 541 - 3.20GHz Processor 64-bit NetBurst Microarchitecture
0F48    Xeon 2.80GHz Dual-Core Processor. 64-bit NetBurst Microarchitecture
0F64    Intel Celeron D 347 - 64 Bit NetBurst Microarchitecture
0F64    Celeron D 347 3.06GHz Processor - 64 Bit NetBurst Microarchitecture
From: Guga on 9 Mar 2007 21:12
I'm still completing it.. 120 more CPUIDs to go :)

CPUID   Name
04F4    AMD 5x86-133 P75 (X5) in 4x clock mode
0600    Cyrix/IBM 6x86MX PR166-266 or Cyrix MII PR300-433
0650    Pentium II / Celeron Processor Deschutes / Covington dA0 SECC / SEPP
0651    Pentium II / Celeron Processor Deschutes / Covington dA0 SECC / SECC2 / SEPP
0652    Pentium II Processor Deschutes dB0 SECC/SECC2
0653    Pentium II Processor Deschutes dB1 SECC/SECC2
0660    Intel Celeron-A 300/333/366/400 A0-step with 128 KB integrated L2 cache
0660    Intel Celeron Processor Mendocino mA0 SEPP
0665    Intel Celeron Processor Mendocino mB0 PPGA
0672    Pentium III Processor Katmai kB0 SECC2
0673    Pentium III Processor Katmai kC0 SECC2
0681    Pentium III Processor Coppermine cA2 SECC/SECC2
0681    Pentium III Processor Coppermine cA2 FC-PGA
0683    Pentium III Processor Coppermine cB0 SECC2
0683    Pentium III / Celeron Processor Coppermine cB0 FC-PGA / PPGA
0686    Pentium III Processor Coppermine cC0 SECC2
0686    Pentium III / Celeron Processor Coppermine cC0 FC-PGA / PPGA
068A    Pentium III / Celeron Processor Coppermine cD0 FC-PGA / PPGA
069x    80686 - Pentium M - Banias
06B1    Intel® Celeron® processor
06B1    Pentium III / Celeron Processor Tualatin tA1 PPGA-370
06B4    Pentium III / Celeron Processor Tualatin tB1 PPGA-370
06Dx    80686 - Pentium M - Dothan
06D8    Pentium M 740 Processor 1.73GHz Processor
06D8    Pentium M 780 2.26GHz Processor
06D8    Processor ( mobile ) - 1 x Intel Pentium M 760 2 GHz 32-bit Dynamic Execution Microarchitecture
06D8    Intel Celeron M - 1.5GHz Processor - 1.5GHz 32-bit Dynamic Execution Microarchitecture
06E8    Core Solo T1300 1.66GHz Processor - 32-bit Dynamic Execution Microarchitecture
06F5    Xeon Dual-Core 3040 1.86GHz Processor. 64-bit Core Microarchitecture
06F5    Xeon Dual-Core 3060 2.4GHz Processor. 64-bit Core Microarchitecture
06F5    Xeon Dual-Core 3050 2.13GHz Processor. 64-bit Core Microarchitecture
06F5    Xeon Dual-Core 3050 2.13GHz Processor - 64-bit Core Microarchitecture
06F5    Xeon Dual-Core 3050 2.13GHz Processor - 64-bit Core Microarchitecture
06F5    Xeon Dual-Core 3060 2.4GHz Processor - 64-bit Core Microarchitecture
06F5    Xeon Dual-Core 3070 2.66GHz Processor - 64-bit Core Microarchitecture
06F5    Intel Core 2 Extreme X6800 (Conroe rev. B1)
06F5    Processor - 1 x Intel Dual-Core Xeon 3060 / 2.4 GHz 64-bit Core Microarchitecture
06F5    Processor - 1 x Intel Dual-Core Xeon 3050 / 2.13 GHz 64-bit Core Microarchitecture
06F5    Processor - 1 x Intel Dual-Core Xeon 3040 / 1.86 GHz - 64-bit Core Microarchitecture
06F6    Core2 Duo T5500 1.66GHz Mobile Processor - 64-bit Core Microarchitecture
06F6    Intel Core2 Duo T7400 Mobile Processor
06F6    Core2 Duo T5600 1.83GHz Mobile Processor. 64 Bit Core Microarchitecture
06F6    Core2 Duo T5200 2.0GHz Mobile Processor. 64 Bit Core Microarchitecture
06F6    Core2 Duo T7400 2.16GHz Mobile Processor. 64 Bit Core Microarchitecture
06F6    Core2 Duo T7600 2.33GHz Mobile Processor. 64 Bit Core Microarchitecture
06F6    Core2 Duo T5600 1.83GHz Mobile Processor - 64-bit Core Microarchitecture
06F6    Core2 Duo T7200 2.0GHz Mobile Processor - 64-bit Core Microarchitecture
06F6    Core2 Duo T7400 2.16GHz Mobile Processor - 64-bit Core Microarchitecture
06F6    Core2 Duo T7600 2.33GHz Mobile Processor - 64-bit Core Microarchitecture
06F7    Intel Core 2 Extreme QX6700 Processor
06F7    Intel Core 2 Quad Q6600 Kentsfield
06F7    Core 2 Duo QX6700 Extreme Processor. 2.66 GHz.
07A0    AMD Athlon XP 2600
0F07    Pentium 4 Processor Willamette B2 PPGA-423 INT2
0F0A    Pentium 4 Processor Willamette C1 PPGA-423 INT2
0F0A    Pentium 4 Processor Willamette C1 PPGA-478 FC-PGA2
0F12    Intel Pentium 4 P68, Willamette, A80528
0F12    Pentium 4 Processor Willamette D0 PPGA-423 INT2
0F12    Pentium 4 Processor Willamette D0 PPGA-478 FC-PGA2
0F13    Pentium 4 / Celeron Processor Willamette E0 PPGA-478 FC-PGA2
0F24    Pentium 4 Processor Northwood B0 PPGA-478
0F25    Intel Pentium 4 Extreme Edition 3.46GHz Processor - 3.46GHz 32-bit NetBurst Microarchitecture
0F27    Pentium 4 / Celeron Processor Northwood C1 PPGA-478
0F29    Pentium 4 / Celeron Processor Northwood D1 PPGA-478
0F33    Pentium 4 / Celeron Processor Prescott C0 All
0F34    Xeon (Nocona)
0F34    Pentium 4 (Prescott)
0F41    Intel® Celeron® D 336 64 Bit NetBurst Microarchitecture
0F41    Intel® Celeron® D 346 64 Bit NetBurst Microarchitecture
0F41    Celeron D 331 2.66GHz Processor 64-bit
0F41    Celeron D 336 2.80GHz Processor 64-bit
0F41    Celeron D 351 3.20GHz Processor - 64 Bit NetBurst Microarchitecture
0F41    Celeron D 346 3.06GHz Processor - 64 Bit NetBurst Microarchitecture
0F41    Pentium 4 541 - 3.20GHz Processor 64-bit NetBurst Microarchitecture
0F41    Intel Pentium 4 541 - 3.20GHz Processor - 3.20GHz - 64-bit NetBurst Microarchitecture
0F41    Intel Celeron D 326 2.53GHz Processor - 2.53GHz 64-bit
0F43    Xeon 3.20GHz Processor
0F43    Xeon 3.4GHz Processor
0F43    Xeon 3.60GHz Processor - Extended Memory 64 Technology Hyper-Threading Technology
0F43    Intel Xeon 3.0GHz Processor - 3.0GHz
0F43    Intel Xeon 3.20GHz Processor - 3.20GHz
0F44    Processor - 1 x Intel Pentium D 830 3 GHz ( 800 MHz ) Dual-Core NetBurst Microarchitecture
0F48    Xeon 2.80GHz Dual-Core Processor. 64-bit NetBurst Microarchitecture
0F4A    Xeon 2.8GHz Processor - 64-bit NetBurst Microarchitecture
0F4A    Xeon 3.60GHz Processor - 64-bit NetBurst Microarchitecture
0F4A    Xeon 3.80GHz Processor - Extended Memory 64 Technology Enhanced SpeedStep Technology Hyper-Threading Technology
0F4A    Intel Xeon 2.80GHz Processor - 2.8GHz 64-bit NetBurst Microarchitecture
0F64    Intel Celeron D 347 - 64 Bit NetBurst Microarchitecture
0F64    Celeron D 347 3.06GHz Processor - 64 Bit NetBurst Microarchitecture
0F64    Processor - 1 x Intel Pentium D 945 / 3.4 GHz 64 Bit
0F64    Intel Celeron D 347 3.06GHz Processor - 3.06GHz 64-bit NetBurst Microarchitecture
020F32  AMD Athlon 64 X2 3800+, 2.0 GHz (Manchester rev. E6)
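A list like this could be queried with a small lookup that treats the trailing X (or x) in entries such as 069X as a wildcard for the stepping digit. This is only a sketch under that assumption: the names `SIGNATURES`, `matches` and `lookup` are hypothetical, and the table holds just a few rows copied from the list above.

```python
# A few rows copied from the list; "X" marks a wildcard stepping digit.
SIGNATURES = [
    ("069X", "80686 - Pentium M - Banias"),
    ("06DX", "80686 - Pentium M - Dothan"),
    ("06D8", "Pentium M 740 Processor 1.73GHz Processor"),
    ("0F34", "Xeon (Nocona)"),
    ("0F34", "Pentium 4 (Prescott)"),
]


def matches(pattern, signature):
    """True if a hex signature string fits a pattern such as '069X'."""
    pattern = pattern.upper()
    return len(pattern) == len(signature) and all(
        p == "X" or p == s for p, s in zip(pattern, signature)
    )


def lookup(eax):
    """Return every name whose pattern matches the signature value."""
    signature = format(eax, "04X")
    return [name for pattern, name in SIGNATURES if matches(pattern, signature)]


print(lookup(0x6D8))
print(lookup(0xF34))
```

Returning a list rather than a single name fits the data, since the same signature appears next to several product names.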