From: Alexander Graf on 19 Jul 2010 08:10
Milton Miller wrote:
> I wrote:
>
>> On Mon Jul 19 2010 at about 03:36:51 EST, Alexander Graf wrote:
>>
>>> On 19.07.2010, at 03:11, Benjamin Herrenschmidt wrote:
>>>
>>>> On Thu, 2010-07-15 at 17:05 +0530, Subrata Modak wrote:
>>>>
>>>>> commit e62cee42e66dcca83aae02748535f62e0f564a0c solved the problem for
>>>>> 2.6.34-rc6. However some other bad relocation warnings generated against
>>>>> 2.6.35-rc5 on Power7/ppc64 below:
>>>>>
>>>>> MODPOST 2004 modules
>>>>> WARNING: 2 bad relocations
>>>>> c000000000008590 R_PPC64_ADDR32 .text+0x4000000000008460
>>>>> c000000000008594 R_PPC64_ADDR32 .text+0x4000000000008598
>>>>>
>>>> I think this is KVM + CONFIG_RELOCATABLE. Caused by:
>>>>
>>>> .global kvmppc_trampoline_lowmem
>>>> kvmppc_trampoline_lowmem:
>>>> .long kvmppc_handler_lowmem_trampoline - CONFIG_KERNEL_START
>>>>
>>>> .global kvmppc_trampoline_enter
>>>> kvmppc_trampoline_enter:
>>>> .long kvmppc_handler_trampoline_enter - CONFIG_KERNEL_START
>>>>
>>>> Alex, can you turn these into 64-bit on ppc64 so the relocator
>>>> can grok them ?
>>>>
>>> If I turn them into 64-bit, will the values be > RMA? In that case
>>> things would break anyways. How does relocation work on PPC? Are the
>>> first few megs copied over to low memory? Would I have to mask anything
>>> in the above code to make sure I use the real values?
>>>
>>> Alex
>>>
>> You can still do the subtraction, but you have to allocate 64 bits for
>> storage. Relocatable ppc64 kernels work by adjusting PPC64_RELOC_RELATIVE
>> entries during early boot (reloc in reloc_64.S called from head_64.S).
>>
>> The code purposely only supports 64 bit relative addressing.
>>
>
> Oh yea, and for book-3s, the code copies from 0x100 to __end_interrupts
> in arch/powerpc/kernel/exceptions-64s.S down to the real 0, but the rest
> of the kernel is at some disjointed address. The interrupt will go to
> the copy at the real zero. Any references to code outside that region
> must be done via a full indirect branch (not a relative one), similar to
> the secondary startup (via following the function pointer in a descriptor
> set in very low memory), or syscall entry and exception vectors via paca.
>

That would still break on normal PPC boxes, as any address accessed in
real mode has to be inside the RMA. And the #include for
kvm/book3s_rmhandlers.S happens after __end_interrupts. So I'd end up
with code that gets executed outside of the RMA after a relocation, right?

Alex

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
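Editor's note: the relocation step Milton describes in the quoted text can be modelled in a few lines of C. This is only a simplified sketch, not the code in reloc_64.S. The names `struct rel_entry`, `apply_relative` and `fits_in_long` are hypothetical, and the addresses are illustrative. It assumes each relocation entry names a 64-bit slot whose link-time value is adjusted by the load offset at early boot, which is why a 32-bit `.long` slot cannot take part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified relocation record (not the kernel's layout). */
struct rel_entry {
    size_t   slot;    /* index of the 64-bit slot to patch */
    uint64_t addend;  /* link-time address the slot should hold */
};

/*
 * Patch every listed 64-bit slot so that it holds the link-time
 * address plus the offset the image was actually loaded at.
 * Slots that have no relocation entry are left untouched.
 */
static void apply_relative(uint64_t *image, const struct rel_entry *rel,
                           size_t count, uint64_t load_offset)
{
    assert(image != NULL && rel != NULL);
    for (size_t i = 0; i < count; i++)
        image[rel[i].slot] = rel[i].addend + load_offset;
}

/* Returns 1 if the value survives being stored in a 32-bit .long slot. */
static int fits_in_long(uint64_t value)
{
    return (uint64_t)(uint32_t)value == value;
}
```

In this model, a kernel virtual address such as 0xc000000000008460 fails `fits_in_long`, so it needs 64 bits of storage before a relocator of this kind can adjust it.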
From: Milton Miller on 20 Jul 2010 03:30
On Mon, 19 Jul 2010 about 14:00:56 +0200, Alexander Graf wrote:
> Milton Miller wrote:
>> Oh yea, and for book-3s, the code copies from 0x100 to __end_interrupts
>> in arch/powerpc/kernel/exceptions-64s.S down to the real 0, but the rest
>> of the kernel is at some disjointed address. The interrupt will go to
>> the copy at the real zero. Any references to code outside that region
>> must be done via a full indirect branch (not a relative one), similar to
>> the secondary startup (via following the function pointer in a descriptor
>> set in very low memory), or syscall entry and exception vectors via paca.
>>
>
> That would still break on normal PPC boxes, as any address accessed in
> real mode has to be inside the RMA. And the #include for
> kvm/book3s_rmhandlers.S happens after __end_interrupts. So I'd end up
> with code that gets executed outside of the RMA after a relocation, right?
>
> Alex
>

Whether it's outside of the RMA or not, DO_KVM is creating a branch outside
of the code copied to lowmem. This is BROKEN.

We have a hard limit that we can't extend __end_interrupts past 0x7000, and
a soft limit that we can't exceed 0x6000.

If there is space, we could move the real mode handler extensions inside
__end_interrupts in exceptions-64s.S, and store the full address in a .quad
so it gets relocated properly. Don't subtract the start; we have designed
the kernel to run with start at a VA that can be used as an EA in real mode.

Otherwise we need to mark KVM_BOOK3S_64 depends on (!RELOCATABLE || BROKEN)
for 2.6.35 until we get fixes.

I took a read through the book3s code as of 2.6.34. A few things I noticed:

(1) The code is using slb large to control the segment size. It should be
    using the SLB B field (or just implement 256M segments only).

(2) It appears that the mtspr and mfspr code is using the same storage for
    bats 4-7 as 0-3 ... I would have expected a 4 + in a few places.

(3) It's not clear to me that you clear RI when transitioning to the guest,
    but it's obviously required because you place state in srr0 & srr1.

(4) I don't understand why __kvmppc_vcpu_run turns on interrupts so that
    __kvmppc_vcpu_entry can turn them back off. Something to do with irq
    trace annotations?

milton
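Editor's note: point (2) above is about mapping an SPR number to a BAT storage slot. The C sketch below shows what the "4 +" would look like. It is not the KVM book3s code. The `DEMO_` constants, `struct demo_bat` and both functions are hypothetical, and the SPR numbers are assumed values that should be checked against the ISA before use. It further assumes that each BAT has an upper and a lower SPR at consecutive numbers, with BATs 0-3 and 4-7 in two separate SPR ranges.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed, illustrative SPR numbers for the two instruction-BAT banks. */
enum {
    DEMO_IBAT0U  = 528,  /* first SPR of the BAT 0-3 bank (assumption) */
    DEMO_IBAT4U  = 560,  /* first SPR of the BAT 4-7 bank (assumption) */
    DEMO_NR_BATS = 8
};

/* Hypothetical per-BAT storage: one upper and one lower half. */
struct demo_bat {
    uint64_t upper;
    uint64_t lower;
};

/* Map an SPR number to a BAT index 0..7, or -1 if it is not a BAT SPR. */
static int demo_bat_index(int sprn)
{
    if (sprn >= DEMO_IBAT0U && sprn < DEMO_IBAT0U + 8)
        return (sprn - DEMO_IBAT0U) / 2;
    if (sprn >= DEMO_IBAT4U && sprn < DEMO_IBAT4U + 8)
        return 4 + (sprn - DEMO_IBAT4U) / 2;  /* the "4 +" for bank 4-7 */
    return -1;
}

/* Store val in the half selected by sprn; even SPR numbers are "upper". */
static void demo_write_bat(struct demo_bat bats[DEMO_NR_BATS], int sprn,
                           uint64_t val)
{
    int idx = demo_bat_index(sprn);

    assert(idx >= 0);
    if ((sprn & 1) == 0)
        bats[idx].upper = val;
    else
        bats[idx].lower = val;
}
```

Without the `4 +` in the second branch, a write to the first SPR of the upper bank would land in slot 0 and overwrite the state of BAT 0.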