From: Avi Kivity on 18 Jul 2010 11:40

On 07/18/2010 06:23 PM, Gleb Natapov wrote:
> On Sun, Jul 18, 2010 at 06:14:11PM +0300, Avi Kivity wrote:
>> On 07/17/2010 07:31 AM, Gleb Natapov wrote:
>>>>> Currently pages allocated for guest memory are required to be RW, so after your series
>>>>> behaviour will remain exactly the same as before.
>>>>>
>>>> Except KSM pages.
>>>>
>>> KSM page will be COWed by __get_user_pages_fast(addr, 1, 1, page) in
>>> get_user_page_and_protection() just like it COWed now, no?
>>>
>> Well, we don't want to COW it on write faults.
>>

I meant read faults here.

>> The optimal behaviour is:
>>
>> - write faults: COW and instantiate a writeable spte
>>
> So do we or don't we want to COW on write faults?
>

We do (no choice).

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Lai Jiangshan on 28 Jul 2010 22:20

On 07/16/2010 03:19 PM, Gleb Natapov wrote:
>> +/* get a current mapped page fast, and test whether the page is writable. */
>> +static struct page *get_user_page_and_protection(unsigned long addr,
>> +	int *writable)
>> +{
>> +	struct page *page[1];
>> +
>> +	if (__get_user_pages_fast(addr, 1, 1, page) == 1) {
>> +		*writable = 1;
>> +		return page[0];
>> +	}
>> +	if (__get_user_pages_fast(addr, 1, 0, page) == 1) {
>> +		*writable = 0;
>> +		return page[0];
>> +	}
>> +	return NULL;
>> +}
>> +
>> +static pfn_t kvm_get_pfn_for_page_fault(struct kvm *kvm, gfn_t gfn,
>> +	int write_fault, int *host_writable)
>> +{
>> +	unsigned long addr;
>> +	struct page *page;
>> +
>> +	if (!write_fault) {
>> +		addr = gfn_to_hva(kvm, gfn);
>> +		if (kvm_is_error_hva(addr)) {
>> +			get_page(bad_page);
>> +			return page_to_pfn(bad_page);
>> +		}
>> +
>> +		page = get_user_page_and_protection(addr, host_writable);
>> +		if (page)
>> +			return page_to_pfn(page);
>> +	}
>> +
>> +	*host_writable = 1;
>> +	return kvm_get_pfn_for_gfn(kvm, gfn);
>> +}
>> +
> kvm_get_pfn_for_gfn() returns fault_page if page is mapped RO, so caller
> of kvm_get_pfn_for_page_fault() and kvm_get_pfn_for_gfn() will get
> different results when called on the same page. Not good.
> kvm_get_pfn_for_page_fault() logic should be folded into
> kvm_get_pfn_for_gfn().
>

The different results are the things we just need.
We don't want to copy and write a page which is mapped RO when
only read fault.
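Editor's note: the probe order in get_user_page_and_protection() above can be sketched outside the kernel. This is a minimal userspace model, not kernel code. struct model_page, model_gup_fast() and model_page_and_protection() are hypothetical names invented for illustration, and the model assumes that the fast lookup only reports what is already mapped, without faulting a page in or breaking COW.

```c
#include <stddef.h>

/* Hypothetical stand-in for a host page; NOT the kernel's struct page. */
struct model_page {
	int present;        /* page is currently mapped in the host */
	int host_writable;  /* host mapping allows writes */
};

/*
 * Hypothetical stand-in for __get_user_pages_fast(addr, 1, write, page).
 * Assumption: it returns 1 only if the page is already mapped with at
 * least the requested access, and 0 otherwise.
 */
static int model_gup_fast(const struct model_page *pg, int write)
{
	if (!pg->present)
		return 0;
	if (write && !pg->host_writable)
		return 0;
	return 1;
}

/* Same probe order as the quoted patch: writable first, then read-only. */
static struct model_page *model_page_and_protection(struct model_page *pg,
						    int *writable)
{
	if (model_gup_fast(pg, 1)) {
		*writable = 1;
		return pg;
	}
	if (model_gup_fast(pg, 0)) {
		*writable = 0;
		return pg;
	}
	return NULL;	/* not mapped: the caller must take a slower path */
}
```

In this model a page mapped read-only is returned with *writable set to 0, which is the case Lai wants for read faults: the page is used as it is, with no copy made.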
From: Gleb Natapov on 29 Jul 2010 02:00

On Thu, Jul 29, 2010 at 10:15:22AM +0800, Lai Jiangshan wrote:
> On 07/16/2010 03:19 PM, Gleb Natapov wrote:
> >> +/* get a current mapped page fast, and test whether the page is writable. */
> >> +static struct page *get_user_page_and_protection(unsigned long addr,
> >> +	int *writable)
> >> +{
> >> +	struct page *page[1];
> >> +
> >> +	if (__get_user_pages_fast(addr, 1, 1, page) == 1) {
> >> +		*writable = 1;
> >> +		return page[0];
> >> +	}
> >> +	if (__get_user_pages_fast(addr, 1, 0, page) == 1) {
> >> +		*writable = 0;
> >> +		return page[0];
> >> +	}
> >> +	return NULL;
> >> +}
> >> +
> >> +static pfn_t kvm_get_pfn_for_page_fault(struct kvm *kvm, gfn_t gfn,
> >> +	int write_fault, int *host_writable)
> >> +{
> >> +	unsigned long addr;
> >> +	struct page *page;
> >> +
> >> +	if (!write_fault) {
> >> +		addr = gfn_to_hva(kvm, gfn);
> >> +		if (kvm_is_error_hva(addr)) {
> >> +			get_page(bad_page);
> >> +			return page_to_pfn(bad_page);
> >> +		}
> >> +
> >> +		page = get_user_page_and_protection(addr, host_writable);
> >> +		if (page)
> >> +			return page_to_pfn(page);
> >> +	}
> >> +
> >> +	*host_writable = 1;
> >> +	return kvm_get_pfn_for_gfn(kvm, gfn);
> >> +}
> >> +
> > kvm_get_pfn_for_gfn() returns fault_page if page is mapped RO, so caller
> > of kvm_get_pfn_for_page_fault() and kvm_get_pfn_for_gfn() will get
> > different results when called on the same page. Not good.
> > kvm_get_pfn_for_page_fault() logic should be folded into
> > kvm_get_pfn_for_gfn().
> >
>
> The different results are the things we just need.

How so? Users of kvm_get_pfn_for_gfn() will think that the page is invalid
and may report misconfiguration to userspace, and users of
kvm_get_pfn_for_page_fault() will think that the access to the page is OK.
There are not many users of kvm_get_pfn_for_gfn(), and maybe your patch
replaces all of them with kvm_get_pfn_for_page_fault(), but this just
strengthens the point that they should be merged.

> We don't want to copy and write a page which is mapped RO when
> only read fault.

I don't see how returning inconsistent results helps us achieve that.
BTW, since kvm_get_pfn_for_gfn() will never map an RO page,
get_user_page_and_protection() will never find any RO pages. It looks like
kvm_get_pfn_for_page_fault() is equivalent to kvm_get_pfn_for_gfn(), since
the !write_fault section will at best find a mapped RW page.

--
			Gleb.
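
Editor's note: the disagreement between the two lookups that Gleb objects to can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the KVM code. struct model_mapping, MODEL_FAULT_PFN and both model_* functions are hypothetical names. The model takes as given a page that is already mapped read-only in the host, and it follows the thread's description that kvm_get_pfn_for_gfn() returns fault_page for such a page.

```c
/* Hypothetical marker for "fault page"; not a real kernel value. */
enum { MODEL_FAULT_PFN = -1 };

/* Hypothetical description of one host mapping. */
struct model_mapping {
	long pfn;           /* frame number backing the mapping */
	int host_writable;  /* host mapping allows writes */
};

/*
 * Models kvm_get_pfn_for_gfn() as described in the thread:
 * a page mapped read-only is reported as the fault page.
 */
static long model_get_pfn_for_gfn(const struct model_mapping *m)
{
	return m->host_writable ? m->pfn : MODEL_FAULT_PFN;
}

/*
 * Models kvm_get_pfn_for_page_fault() from the quoted patch:
 * read faults accept the page as mapped and report its protection,
 * write faults fall through to the plain lookup.
 */
static long model_get_pfn_for_page_fault(const struct model_mapping *m,
					 int write_fault, int *host_writable)
{
	if (!write_fault) {
		*host_writable = m->host_writable;
		return m->pfn;
	}
	*host_writable = 1;
	return model_get_pfn_for_gfn(m);
}
```

For a mapping with host_writable set to 0, the first function reports the fault page while the second, on a read fault, reports the real frame. That is the "different results on the same page" case the two authors disagree about.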