| 1 | |
| 2 | ************************************************************* |
| 3 | AMD |
| 4 | http://lkml.org/lkml/2005/11/4/173 |
| 5 | http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/26094.PDF |
| 6 | http://developer.amd.com/article_print.jsp?id=92 (RH) |
| 7 | |
| 8 | [7] AMD's 7th generation processors return a CPUID base |
| 9 | family value of '7'. These include AMD Athlon, AthlonXP, |
| 10 | AthlonMP, and Duron. |
| 11 | |
| 12 | [8] AMD's 8th generation processors return an effective |
| 13 | CPUID family of '0x0F'. These include AMD Opteron, |
| 14 | Athlon64, and Turion. |
| 15 | |
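The footnotes above distinguish the CPUID "base family" from the "effective family". A minimal sketch of how the two can be decoded from the EAX value returned by CPUID function 1 follows. The function names are hypothetical, and the bit positions (base family in bits 11:8, extended family in bits 27:20, added only when the base family is 0xF) are the standard CPUID layout rather than something stated in these notes.

```c
#include <stdint.h>

/* Base family: bits 11:8 of EAX from CPUID function 1. */
static unsigned cpuid_base_family(uint32_t eax)
{
    return (eax >> 8) & 0xFu;
}

/* Effective family: when the base family is 0xF, the extended family
 * (bits 27:20) is added to it; otherwise it equals the base family. */
static unsigned cpuid_effective_family(uint32_t eax)
{
    unsigned base = cpuid_base_family(eax);
    unsigned ext  = (eax >> 20) & 0xFFu;

    return (base == 0xFu) ? base + ext : base;
}
```

With this decoding, an 8th generation part whose extended family field is zero reports both a base family and an effective family of 0x0F.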
| 16 | |
| 17 | before 7th gen : ok |
| 18 | 7th gen: |
| 19 | P-state (performance state) change |
| 20 | UP : warn about time inaccuracy |
| 21 | SMP |
| 22 | sol : disable powernow |
| 23 | Use monotonic pseudo-TSC |
| 24 | STPCLK-Throttling (temperature) : only done on UP, ok |
| 25 | UP : warn about time inaccuracy |
| 26 | 8th gen : |
| 27 | P-state change |
| 28 | UP : inaccuracy |
| 29 | dual-core : locked-step ; inaccuracy |
| 30 | SMP : may drift |
| 31 | sol : disable powernow |
| 32 | Use monotonic pseudo-TSC |
| 33 | SMP, dual core : C1-clock ramping (halt) (power state : C-state) |
| 34 | sol : idle=poll or disable C1-ramping |
| 35 | Use monotonic pseudo-TSC |
| 36 | STPCLK-Throttling (temperature) : |
| 37 | single processor dual-core ok ; inaccuracy |
| 38 | SMP : NOT ok (rare) |
| 39 | Use monotonic pseudo-TSC |
| 40 | |
| 41 | |
| 42 | Until TSC becomes invariant, AMD recommends that operating |
| 43 | system developers avoid TSC as a fast timer source on |
| 44 | affected systems. (AMD recommends that the operating system |
| 45 | should favor these time sources in a prioritized manner: |
| 46 | HPET first, then ACPI PM Timer, then PIT.) The following |
| 47 | pseudo-code shows one way of determining when to use TSC: |
| 48 | |
| 49 | use_AMD_TSC() { // returns TRUE if ok to use TSC |
| 50 | if (CPUID.base_family < 0xf) { |
| 51 | // TSC drift doesn't exist on 7th Gen or less |
| 52 | // However, OS still needs to consider effects |
| 53 | // of P-state changes on TSC |
| 54 | return TRUE; |
| 55 | } else if (CPUID.AdvPowerMgmtInfo.TscInvariant) { |
| 56 | // Invariant TSC on 8th Gen or newer, use it |
| 57 | // (assume all cores have invariant TSC) |
| 58 | return TRUE; |
| 59 | } else if ((number_processors == 1)&&(number_cores == 1)){ |
| 60 | // OK to use TSC on uni-processor-uni-core |
| 61 | // However, OS still needs to consider effects |
| 62 | // of P-state changes on TSC |
| 63 | return TRUE; |
| 64 | } else if ( (number_processors == 1) && |
| 65 | (CPUID.effective_family == 0x0f) && |
| 66 | !C1_ramp_8gen() ){ |
| 67 | // Use TSC on 8th Gen uni-proc with C1_ramp off |
| 68 | // However, OS still needs to consider effects |
| 69 | // of P-state changes on TSC |
| 70 | return TRUE; |
| 71 | } else { |
| 72 | return FALSE; |
| 73 | } |
| 74 | } |
| 75 | C1_ramp_8gen() { |
| 76 | // Check if C1-Clock ramping enabled in PMM7.CpuLowPwrEnh |
| 77 | // On 8th-Generation cores only. Assume BIOS has setup |
| 78 | // all Northbridges equivalently. |
| 79 | return (1 & read_pci_byte(bus=0,dev=0x18,fcn=3,offset=0x87)); |
| 80 | } |
| 81 | |
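The decision logic of use_AMD_TSC() can be sketched in plain C as a pure function over values gathered beforehand. The struct and its field names are hypothetical; in particular, treating number_cores as "cores per processor" is an assumption, since the pseudo-code does not define it. Gathering the inputs (CPUID, PCI config reads) is left out.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the facts the pseudo-code consults. */
struct amd_tsc_info {
    unsigned base_family;      /* CPUID base family */
    unsigned effective_family; /* CPUID effective family */
    bool     tsc_invariant;    /* CPUID 0x80000007, EDX bit 8 */
    unsigned num_processors;   /* populated sockets */
    unsigned num_cores;        /* cores per processor (assumption) */
    bool     c1_ramp_enabled;  /* what C1_ramp_8gen() would return */
};

/* Returns true when the pseudo-code above would allow using the TSC.
 * As in the original, P-state effects still need separate handling
 * in every branch except the invariant-TSC one. */
static bool use_amd_tsc(const struct amd_tsc_info *s)
{
    if (s->base_family < 0xF)
        return true;   /* no TSC drift on 7th gen or earlier */
    if (s->tsc_invariant)
        return true;   /* invariant TSC */
    if (s->num_processors == 1 && s->num_cores == 1)
        return true;   /* uni-processor, single core */
    if (s->num_processors == 1 &&
        s->effective_family == 0x0F &&
        !s->c1_ramp_enabled)
        return true;   /* 8th gen uni-processor, C1 ramping off */
    return false;
}
```

Keeping the hardware probing outside the function makes each branch easy to check against the outline at the top of these notes.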
| 82 | |
| 83 | When an operating system cannot avoid using TSC in the |
| 84 | short-term, the operating system will need to either |
| 85 | re-synchronize the TSC of the halted core when exiting halt |
| 86 | or disable C1-clock ramping. The pseudo-code for disabling |
| 87 | C1-clock ramping follows: |
| 88 | |
| 89 | if ( !use_AMD_TSC() && |
| 90 | (CPUID.effective_family == 0x0f) && |
| 91 | C1_ramp_8gen() ){ |
| 92 | for (i=0; i < number_processors; ++i){ |
| 93 | // Do for all NorthBridges in platform |
| 94 | tmp = read_pci_byte(bus=0,dev=0x18+i,fcn=3,offset=0x87); |
| 95 | tmp &= 0xFC; // clears pmm7[1:0] |
| 96 | write_pci_byte(bus=0,dev=0x18+i,fcn=3,offset=0x87,tmp); |
| 97 | } |
| 98 | } |
| 99 | |
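The read-modify-write in the loop above comes down to clearing the two low bits of the PMM7 byte while leaving the rest untouched. A small sketch of that step, run over an in-memory array that stands in for one PMM7 byte per Northbridge, is shown here. The macro and function names are hypothetical, and real code would go through the platform's PCI configuration accessors instead of an array.

```c
#include <stdint.h>

#define PMM7_OFFSET       0x87u /* config offset used in the pseudo-code */
#define PMM7_C1_RAMP_MASK 0x03u /* pmm7[1:0] */

/* Return the PMM7 byte with pmm7[1:0] cleared (same as "tmp &= 0xFC"). */
static uint8_t pmm7_disable_c1_ramp(uint8_t pmm7)
{
    return (uint8_t)(pmm7 & ~PMM7_C1_RAMP_MASK);
}

/* Apply the change to every simulated Northbridge PMM7 byte. */
static void disable_c1_ramp_all(uint8_t *pmm7_bytes, unsigned count)
{
    for (unsigned i = 0; i < count; ++i)
        pmm7_bytes[i] = pmm7_disable_c1_ramp(pmm7_bytes[i]);
}
```

Masking rather than writing a constant preserves whatever the BIOS programmed into the upper six bits of the register.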
| 100 | |
| 101 | Future TSC Directions and Solutions |
| 102 | =================================== |
| 103 | Future AMD processors will provide a TSC that is P-state and |
| 104 | C-State invariant and unaffected by STPCLK-throttling. This |
| 105 | will make the TSC immune to drift. Because using the TSC |
| 106 | for fast timer APIs is a desirable feature that helps |
| 107 | performance, AMD has defined a CPUID feature bit that |
| 108 | software can test to determine if the TSC is |
| 109 | invariant. Issuing a CPUID instruction with an %eax register |
| 110 | value of 0x8000_0007, on a processor whose base family is |
| 111 | 0xF, returns "Advanced Power Management Information" in the |
| 112 | %eax, %ebx, %ecx, and %edx registers. Bit 8 of the return |
| 113 | %edx is the "TscInvariant" feature flag which is set when |
| 114 | TSC is P-state, C-state, and STPCLK-throttling invariant; it |
| 115 | is clear otherwise. |
| 116 | |
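As a sketch of the feature test described above, the check reduces to looking at bit 8 of the EDX value returned for CPUID function 0x8000_0007. The names below are hypothetical. Obtaining EDX is left to the caller; on GCC or Clang for x86 one option is `__get_cpuid()` from `<cpuid.h>`, which is an assumption about the toolchain and not part of these notes.

```c
#include <stdbool.h>
#include <stdint.h>

#define AMD_CPUID_ADV_POWER_MGMT 0x80000007u /* "Advanced Power Management Information" */
#define TSC_INVARIANT_BIT        8           /* EDX bit 8: TscInvariant */

/* True when the TscInvariant flag is set in the EDX value returned
 * by CPUID function AMD_CPUID_ADV_POWER_MGMT. */
static bool tsc_invariant_from_edx(uint32_t edx)
{
    return ((edx >> TSC_INVARIANT_BIT) & 1u) != 0;
}
```

A caller should first confirm that the processor supports extended function 0x8000_0007 before trusting the returned EDX.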
| 117 | The rate of the invariant TSC is implementation-dependent |
| 118 | and will likely *not* be the frequency of the processor |
| 119 | core; however, its period should be short enough such that |
| 120 | it is not possible for two back-to-back rdtsc instructions |
| 121 | to return the same value. Software which is trying to |
| 122 | measure actual processor frequency or cycle-performance |
| 123 | should use Performance Event 76h, CPU Clocks not Halted, |
| 124 | rather than the TSC to count CPU cycles. |
| 125 | |
| 126 | |
| 127 | |
| 128 | ************************************************************* |
| 129 | Intel |
| 130 | |
| 131 | Pentium M |
| 132 | family [06H], models [09H, 0DH] |
| 133 | UP |
| 134 | warn about time inaccuracy |
| 135 | SMP |
| 136 | SOL : disable speedstep |
| 137 | Pentium 4 processors, Intel Xeon |
| 138 | family [0FH], models [00H, 01H, or 02H] |
| 139 | UP |
| 140 | warn about time inaccuracy |
| 141 | SMP |
| 142 | SOL : disable speedstep |
| 143 | |
| 144 | Other : ok |
| 145 | |
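The Intel family/model list above can be written as a small predicate. This is only a sketch of the rule as these notes state it; the function name is hypothetical, and every family/model not listed is treated as "ok", as the notes do.

```c
#include <stdbool.h>

/* True for the family/model combinations listed above, where the TSC
 * follows the core clock and SpeedStep transitions change its rate. */
static bool intel_tsc_varies_with_core_clock(unsigned family, unsigned model)
{
    if (family == 0x06)                      /* Pentium M */
        return model == 0x09 || model == 0x0D;
    if (family == 0x0F)                      /* Pentium 4, Xeon */
        return model <= 0x02;
    return false;                            /* "Other : ok" */
}
```

On UP systems a true result calls for the inaccuracy warning; on SMP systems the notes suggest disabling SpeedStep.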
| 146 | http://download.intel.com/design/Xeon/specupdt/30675712.pdf |
| 147 | http://forum.rightmark.org/post.cgi?id=print:6:269&user=%20Dmitri%20Besedin&color=1 |
| 148 | http://bochs.sourceforge.net/techspec/24161821.pdf cpuid |
| 149 | |
| 150 | cpuz |
| 151 | AFAIK this problem only concerns Prescott CPUs, but I bet future production will |
| 152 | also use the same rule. |
| 153 | |
| 154 | Well, from what Intel states in one of their docs (Intel(R) Pentium(R) M |
| 155 | Processor on 90 nm Process with 2-MB L2 Cache, Specification Update, Document |
| 156 | No. 302209-010 (http://www.intel.com/design/mobile/specupdt/302209.htm) ), your |
| 157 | supposition is true: |
| 158 | |
| 159 | |
| 160 | Article V. For Pentium M processors (family [06H], models [09H, 0DH]); for |
| 161 | Pentium 4 processors, Intel Xeon processors (family [0FH], models [00H, 01H, or |
| 162 | 02H]); and for P6 family processors: the timestamp counter increments with every |
| 163 | internal processor clock cycle. The internal processor clock cycle is determined |
| 164 | by the current core-clock to bus-clock ratio. Intel(R) SpeedStep(R) technology |
| 165 | transitions may also impact the processor clock. |
| 166 | |
| 167 | Article VI. For Pentium 4 processors, Intel Xeon processors (family [0FH], |
| 168 | models [03H and higher]): the time-stamp counter increments at a constant rate. |
| 169 | That rate may be set by the maximum core-clock to bus-clock ratio of the |
| 170 | processor or may be set by the frequency at which the processor is booted. The |
| 171 | specific processor configuration determines the behavior. Constant TSC behavior |
| 172 | ensures that the duration of each clock tick is uniform and supports the use of |
| 173 | the TSC as a wall clock timer even if the processor core changes frequency. This |
| 174 | is the architectural behavior moving forward. |
| 175 | |
| 176 | |
| 177 | It's a pity they call this sucking TSC behavior an "architectural behavior |
| 178 | moving forward" |
| 179 | |
| 180 | |
| 181 | |
| 182 | ************************************************************* |
| 183 | HPET |
| 184 | |
| 185 | http://www.intel.com/hardwaredesign/hpetspec_1.pdf |
| 186 | |
| 187 | |
| 188 | |
| 189 | |