From: Avi Kivity on 28 Jun 2010 06:00 On 06/25/2010 03:06 PM, Xiao Guangrong wrote: > Consider using small page to fit guest's large page mapping: > > If the mapping is writable but the dirty flag is not set, we will find > the read-only direct sp and setup the mapping, then if the write #PF > occur, we will mark this mapping writable in the read-only direct sp, > now, other real read-only mapping will happily write it without #PF. > > It may hurt guest's COW > > Fixed by re-install the mapping when write #PF occur. > > Signed-off-by: Xiao Guangrong<xiaoguangrong(a)cn.fujitsu.com> > --- > arch/x86/kvm/mmu.c | 3 ++- > arch/x86/kvm/paging_tmpl.h | 18 ++++++++++++++++++ > 2 files changed, 20 insertions(+), 1 deletions(-) > > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c > index 556a798..0412ba4 100644 > --- a/arch/x86/kvm/mmu.c > +++ b/arch/x86/kvm/mmu.c > @@ -153,7 +153,8 @@ module_param(oos_shadow, bool, 0644); > #define CREATE_TRACE_POINTS > #include "mmutrace.h" > > -#define SPTE_HOST_WRITEABLE (1ULL<< PT_FIRST_AVAIL_BITS_SHIFT) > +#define SPTE_HOST_WRITEABLE (1ULL<< PT_FIRST_AVAIL_BITS_SHIFT) > +#define SPTE_NO_DIRTY (2ULL<< PT_FIRST_AVAIL_BITS_SHIFT) > > #define SHADOW_PT_INDEX(addr, level) PT64_INDEX(addr, level) > > diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h > index e46eb8a..fdba751 100644 > --- a/arch/x86/kvm/paging_tmpl.h > +++ b/arch/x86/kvm/paging_tmpl.h > @@ -325,6 +325,20 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr, > break; > } > > + if (*sptep& SPTE_NO_DIRTY) { > + struct kvm_mmu_page *child; > + > + WARN_ON(level != gw->level); > + WARN_ON(!is_shadow_present_pte(*sptep)); > + if (dirty) { > + child = page_header(*sptep& > + PT64_BASE_ADDR_MASK); > + mmu_page_remove_parent_pte(child, sptep); > + __set_spte(sptep, shadow_trap_nonpresent_pte); > + kvm_flush_remote_tlbs(vcpu->kvm); > + } > + } > + > Instead of adding a new bit, can you encode the protection in the direct sp's access bits? So we'll have one sp for read-only or writeable-but-not-dirty small pages, and another sp for writeable-and-dirty small pages. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Xiao Guangrong on 28 Jun 2010 06:10 Avi Kivity wrote: > > Instead of adding a new bit, can you encode the protection in the direct > sp's access bits? So we'll have one sp for read-only or > writeable-but-not-dirty small pages, and another sp for > writeable-and-dirty small pages. > It looks like it can't solve all problems, it fix the access corrupted, but will cause D bit losed: mapping A and mapping B both are writable-and-dirty, when mapping A write #PF occurs, the mapping is writable, then we can't set B's D bit anymore. Anyway, i think we should re-intall the mapping when the state is changed. :-( -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Avi Kivity on 28 Jun 2010 07:20 On 06/28/2010 01:02 PM, Xiao Guangrong wrote: > > Avi Kivity wrote: > > >> Instead of adding a new bit, can you encode the protection in the direct >> sp's access bits? So we'll have one sp for read-only or >> writeable-but-not-dirty small pages, and another sp for >> writeable-and-dirty small pages. >> >> > It looks like it can't solve all problems, it fix the access corrupted, > but will cause D bit losed: > > mapping A and mapping B both are writable-and-dirty, when mapping A write > #PF occurs, the mapping is writable, then we can't set B's D bit anymore. > If B is writeable-and-dirty, then it's D bit is already set, and we don't need to do anything. If B is writeable-and-clean, then we'll have an spte pointing to a read-only sp, so we'll get a write fault on access and an opportunity to set the D bit. > Anyway, i think we should re-intall the mapping when the state is changed. :-( > When the gpte is changed from read-only to writeable or from clean to dirty, we need to update the spte, yes. But that's true for other sptes as well, not just large gptes. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Xiao Guangrong on 28 Jun 2010 21:30 Avi Kivity wrote: > On 06/28/2010 01:02 PM, Xiao Guangrong wrote: >> >> Avi Kivity wrote: >> >> >>> Instead of adding a new bit, can you encode the protection in the direct >>> sp's access bits? So we'll have one sp for read-only or >>> writeable-but-not-dirty small pages, and another sp for >>> writeable-and-dirty small pages. >>> >>> >> It looks like it can't solve all problems, it fix the access corrupted, >> but will cause D bit losed: >> >> mapping A and mapping B both are writable-and-dirty, when mapping A write >> #PF occurs, the mapping is writable, then we can't set B's D bit anymore. >> > > If B is writeable-and-dirty, then it's D bit is already set, and we > don't need to do anything. > > If B is writeable-and-clean, then we'll have an spte pointing to a > read-only sp, so we'll get a write fault on access and an opportunity to > set the D bit. > Sorry, a typo in my reply, i mean mapping A and B both are writable-and-clean, while A occurs write-#PF, we should change A's spte map to writable sp, if we only update the spte in writable-and-clean sp(form readonly to writable), the B's D bit will miss set. >> Anyway, i think we should re-intall the mapping when the state is >> changed. :-( >> > > When the gpte is changed from read-only to writeable or from clean to > dirty, we need to update the spte, yes. But that's true for other sptes > as well, not just large gptes. > I think the indirect sp is not hurt by this bug since for the indirect sp, the access just form its upper-level, and the D bit is only in the last level, when we change the pte's access, is not affect its sp's access. But for direct sp, the sp's access is form all level. and different mapping that not share the last mapping page will have the same last sp. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
From: Avi Kivity on 29 Jun 2010 03:10 On 06/29/2010 04:17 AM, Xiao Guangrong wrote: > >> If B is writeable-and-dirty, then it's D bit is already set, and we >> don't need to do anything. >> >> If B is writeable-and-clean, then we'll have an spte pointing to a >> read-only sp, so we'll get a write fault on access and an opportunity to >> set the D bit. >> >> > Sorry, a typo in my reply, i mean mapping A and B both are writable-and-clean, > while A occurs write-#PF, we should change A's spte map to writable sp, if we > only update the spte in writable-and-clean sp(form readonly to writable), the B's > D bit will miss set. > Right. We need to update something to notice this: - FNAME(fetch)() to replace the spte - FNAME(walk_addr)() to invalidate the spte I think FNAME(walk_addr) is the right place, we're updating the gpte, so we should update the spte at the same time, just like a guest write. But that will be expensive (there could be many sptes, so we have to call kvm_mmu_pte_write()), so perhaps FNAME(fetch) is easier. We have now if (is_shadow_present_pte(*sptep) && !is_large_pte(*sptep)) continue; So we need to add a check, if sp->role.access doesn't match pt_access & pte_access, we need to get a new sp with the correct access (can only change read->write). >>> Anyway, i think we should re-intall the mapping when the state is >>> changed. :-( >>> >>> >> When the gpte is changed from read-only to writeable or from clean to >> dirty, we need to update the spte, yes. But that's true for other sptes >> as well, not just large gptes. >> >> > I think the indirect sp is not hurt by this bug since for the indirect sp, the access > just form its upper-level, and the D bit is only in the last level, when we change the > pte's access, is not affect its sp's access. > > But for direct sp, the sp's access is form all level. and different mapping that not share > the last mapping page will have the same last sp. > Right. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo(a)vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
|
Next
|
Last
Pages: 1 2 3 4 Prev: KVM: MMU: fix forgot to flush all vcpu's tlb Next: [GIT pull] core fixes for 2.6.35 |