[PATCH] hugetlb: preserve hugetlb pte dirty state
author Ken Chen <kenchen@google.com>
Thu, 8 Feb 2007 22:20:27 +0000 (14:20 -0800)
committer Linus Torvalds <torvalds@woody.linux-foundation.org>
Fri, 9 Feb 2007 17:25:46 +0000 (09:25 -0800)
__unmap_hugepage_range() is buggy in that it does not preserve the dirty
state of the huge_pte when unmapping a hugepage range.  This causes data
corruption when a sysadmin uses drop_caches.  For example, an application
creates a hugetlb file, modifies its pages, then unmaps it.  While the
hugetlb file is still alive, a sysadmin comes along and does an "echo 3 >
/proc/sys/vm/drop_caches".

drop_pagecache_sb() will happily free all pages that aren't marked dirty if
there are no active mappings.  Later, when the application maps the hugetlb
file again, all of its data is gone, triggering a catastrophic failure in
the application.

Not only that, the internal resv_huge_pages count also becomes
inconsistent.  Fix both by marking the page dirty appropriately.
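
The sequence described above can be illustrated with a small user-space
model.  This is only a sketch, not kernel code: every toy_* name below is
hypothetical, and the model assumes that drop_caches discards exactly those
pages that are clean and unmapped.  The `fixed` flag selects between the
behaviour before and after this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel objects; these are NOT the real kernel types. */
struct toy_page {
	bool dirty;     /* models the dirty flag on the page-cache page */
	bool in_cache;  /* true while the page still holds the file's data */
	int mapcount;   /* number of active mappings */
};

struct toy_pte {
	struct toy_page *page;
	bool present;
	bool dirty;     /* models the dirty bit held in the huge PTE */
};

/* Map the page; a write through the mapping dirties only the PTE. */
void toy_map(struct toy_pte *pte, struct toy_page *page, bool write)
{
	assert(pte != NULL && page != NULL);
	pte->page = page;
	pte->present = true;
	pte->dirty = write;
	page->mapcount++;
}

/* Models the unmap path; with `fixed` set, PTE dirty state reaches the page. */
void toy_unmap(struct toy_pte *pte, bool fixed)
{
	assert(pte != NULL && pte->present);
	if (fixed && pte->dirty)
		pte->page->dirty = true;
	pte->page->mapcount--;
	pte->present = false;
	pte->dirty = false;
}

/* Models drop_caches under the stated assumption: clean, unmapped pages go. */
void toy_drop_caches(struct toy_page *page)
{
	assert(page != NULL);
	if (!page->dirty && page->mapcount == 0)
		page->in_cache = false;
}

/* Runs map, write, unmap, drop_caches; reports whether the data survived. */
bool toy_data_survives(bool fixed)
{
	struct toy_page page = { false, true, 0 };
	struct toy_pte pte = { NULL, false, false };

	toy_map(&pte, &page, true);
	toy_unmap(&pte, fixed);
	toy_drop_caches(&page);
	return page.in_cache;
}
```

In this model toy_data_survives(false) loses the page because the dirty bit
is discarded along with the PTE, while toy_data_survives(true) keeps it.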

Signed-off-by: Ken Chen <kenchen@google.com>
Cc: "Nish Aravamudan" <nish.aravamudan@gmail.com>
Cc: Adam Litke <agl@us.ibm.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: <stable@kernel.org>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fs/hugetlbfs/inode.c
mm/hugetlb.c

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 4f4cd132b571337b8145d2c7432056b6d92b8d61..e6bd553fdc4cf29214e6f2b3a7fe50817e9e8d80 100644
@@ -449,10 +449,13 @@ static int hugetlbfs_symlink(struct inode *dir,
 }
 
 /*
- * For direct-IO reads into hugetlb pages
+ * mark the head page dirty
  */
 static int hugetlbfs_set_page_dirty(struct page *page)
 {
+       struct page *head = (struct page *)page_private(page);
+
+       SetPageDirty(head);
        return 0;
 }
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index cb362f761f174b926c1c41b72ec2940c5d45763d..36db012b38dde252c827d0af4719c2e7b79e5083 100644
@@ -389,6 +389,8 @@ void __unmap_hugepage_range(struct vm_area_struct *vma, unsigned long start,
                        continue;
 
                page = pte_page(pte);
+               if (pte_dirty(pte))
+                       set_page_dirty(page);
                list_add(&page->lru, &page_list);
        }
        spin_unlock(&mm->page_table_lock);