Muli Ben-Yehuda's journal

July 31, 2003

Dave McCracken’s Shared Page Tables talk

Filed under: Uncategorized — Muli Ben-Yehuda @ 3:55 PM

Thu 15:00

Listening to Dave McCracken’s Shared Page Tables talk. This is the most interesting talk I’ve heard so far, not least because it’s something that I want to work on. [Later in the day, I did start working on it].

– shared memory areas mapped in many address spaces can take up more space in page table space than in data space.
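To put rough numbers on this, here is a back-of-the-envelope sketch. It assumes i386 without PAE (4 KB pages, 4-byte ptes, so 1024 ptes per pte page) and that every mapper has touched every page; the function name and the example sizes are mine, not from the talk.

```python
PAGE_SIZE = 4096                        # bytes per page (i386 assumption)
PTE_SIZE = 4                            # bytes per pte (i386 without PAE assumption)
PTES_PER_PAGE = PAGE_SIZE // PTE_SIZE   # 1024 ptes fit in one pte page


def pte_page_overhead(area_bytes, n_mappers):
    """Return (data_pages, pte_pages) for an area mapped by n_mappers address spaces.

    Assumes the area is aligned so that it needs the minimum number of
    pte pages, and that nothing is shared between the page tables.
    """
    data_pages = -(-area_bytes // PAGE_SIZE)                # ceiling division
    pte_pages_per_mapper = -(-data_pages // PTES_PER_PAGE)  # ceiling division
    return data_pages, pte_pages_per_mapper * n_mappers


# Hypothetical example: a 1 GB shared segment mapped by 2000 processes.
data_pages, pte_pages = pte_page_overhead(1 << 30, 2000)
print(data_pages, pte_pages)
```

Under these assumptions the page tables outgrow the data as soon as there are more than 1024 mappers, while a single shared set of pte pages for the same segment would need only 256 pages.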

– mm_struct: one per address space

– vma: one vma per mapped area per address space – linked list and tree anchored in mm_struct – describes a virtual address range and protection – reference to the backing file – anonymous vmas – have no backing file

– page table – one page table for each address space – pointed to from mm_struct – three levels – pgd, pmd, pte – doubles as hardware page table for most archs
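A small sketch of how a virtual address selects an entry at each of the three levels. The bit widths are an assumption on my part: they match i386 with PAE (2 bits of pgd, 9 of pmd, 9 of pte, 12 of page offset), and other architectures split the address differently.

```python
PAGE_SHIFT = 12   # 4 KB pages
PTE_BITS = 9      # 512 ptes per pte page (PAE assumption)
PMD_BITS = 9      # 512 pmd entries per pmd page
PGD_BITS = 2      # 4 pgd entries

PMD_SHIFT = PAGE_SHIFT + PTE_BITS    # 21: one pte page maps 2 MB
PGDIR_SHIFT = PMD_SHIFT + PMD_BITS   # 30: one pmd page maps 1 GB


def split_vaddr(vaddr):
    """Split a 32-bit virtual address into (pgd, pmd, pte, offset) indices."""
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    pte = (vaddr >> PAGE_SHIFT) & ((1 << PTE_BITS) - 1)
    pmd = (vaddr >> PMD_SHIFT) & ((1 << PMD_BITS) - 1)
    pgd = (vaddr >> PGDIR_SHIFT) & ((1 << PGD_BITS) - 1)
    return pgd, pmd, pte, offset


print(split_vaddr(0xC0101234))
```

Two addresses that agree on the pgd and pmd indices live in the same pte page, which is the unit the rest of the talk is about sharing.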

– one address_space structure per open file. struct address_space does not describe an address space! it describes a file… – anchors list of all vmas that map a region of the file – contains a page cache of all physical pages containing data from the file

– struct page: one per physical page – describes how the page is used – has a pointer to address_space if it’s mapping data from a file – all page structs live in mem_map – with rmap – has a back pointer (or array of back pointers) to all of the ptes that map the page

– to create a new memory area – either mmap or shmemap – all shmem is file backed, either explicitly or implicitly via shmfs (internal file system) – if a page is marked private and read/write, modified pages are converted to anonymous and backed by swap

– a page is only mapped when a task faults trying to access it – fault code finds the correct vma and pte entry, then finds and maps the page. if necessary, the pte page is allocated on the fly.
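The fault path above can be modelled in a few lines. This is a toy, not kernel code: the class names, the dictionary-based page table (which collapses pgd and pmd into one lookup) and the `next_frame` parameter are all my own inventions for illustration, and the 1024 ptes per pte page assume i386 without PAE.

```python
PAGE_SIZE = 4096
PTES_PER_PAGE = 1024   # i386 without PAE assumption


class Vma:
    """Toy stand-in for a vma: a half-open virtual address range [start, end)."""

    def __init__(self, start, end):
        self.start, self.end = start, end


class Mm:
    """Toy stand-in for mm_struct: a vma list plus a simplified page table."""

    def __init__(self, vmas):
        self.vmas = vmas
        self.pte_pages = {}   # pte page index -> {pte index: physical frame}


def handle_fault(mm, addr, next_frame):
    """Map the page containing addr, allocating the pte page on the fly."""
    vma = next((v for v in mm.vmas if v.start <= addr < v.end), None)
    if vma is None:
        raise LookupError("fault outside any vma")   # the real kernel would signal the task
    vpn = addr // PAGE_SIZE
    # The pte page only comes into existence when a fault first needs it.
    pte_page = mm.pte_pages.setdefault(vpn // PTES_PER_PAGE, {})
    # An already-mapped page keeps its frame; a new one gets next_frame.
    return pte_page.setdefault(vpn % PTES_PER_PAGE, next_frame)


mm = Mm([Vma(0x08048000, 0x08050000)])
handle_fault(mm, 0x08048010, next_frame=42)
print(len(mm.pte_pages))
```

Nothing is mapped, and no pte page exists, until the first fault touches the range.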

– mm subsystem has three primary locks: – read/write semaphore, mmap_sem in mm_struct, protects the vma chain. taken for read during a page fault, taken for write for mmap, for example – spinlock page_table_lock protects the page table – i_shared_sem in address_space protects a file’s vma chain. used to be a spinlock in 2.4, turned into a semaphore in 2.5

– sharing pte pages: – overhead for singly mapped area is small – overhead for each area grows linearly with number of mappings – massively mapped areas could use more physical memory for page tables than for data pages – pte pages for large shared areas are identical in each address_space

[shared segments which aren’t mapped in the same virtual addresses aren’t currently considered shared – TODO ;-)]

– finding shareable pages: – vma must be shareable, must span entire pte page – walk address_space chain of vmas looking for one mapping the range – check the pte page for each mapping vma to see if it can be shared
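The first condition (the vma must be shareable and must span an entire pte page) boils down to an alignment check. A minimal sketch, assuming one pte page maps 4 MB (i386 without PAE); the function name is hypothetical and the walk over the address_space vma chain is not modelled here.

```python
PTE_PAGE_SPAN = 4 * 1024 * 1024   # bytes mapped by one pte page (i386 without PAE assumption)


def shareable_pte_ranges(vma_start, vma_end, vma_is_shared):
    """Return the pte-page-aligned ranges that the vma [vma_start, vma_end) covers completely."""
    if not vma_is_shared:
        return []
    first = -(-vma_start // PTE_PAGE_SPAN) * PTE_PAGE_SPAN   # round the start up
    last = (vma_end // PTE_PAGE_SPAN) * PTE_PAGE_SPAN        # round the end down
    return [(base, base + PTE_PAGE_SPAN) for base in range(first, last, PTE_PAGE_SPAN)]


# A shared vma covering 0x3ff000..0x801000 fully spans exactly one pte page.
print([(hex(lo), hex(hi)) for lo, hi in shareable_pte_ranges(0x3FF000, 0x801000, True)])
```

Partially covered pte pages at either end of the vma are left out, since they may also hold ptes belonging to some other mapping.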

– setting the pmd entry read-only allows you to do copy-on-write of pte pages?
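If I understood this correctly, it works like copy-on-write for data pages, one level up. Here is a toy model of that idea; the classes, the reference count and the function names are my own illustration and not how the patch actually implements it.

```python
class PtePage:
    """Toy pte page: pte index -> frame, plus a count of pmd entries pointing at it."""

    def __init__(self, entries=None):
        self.entries = dict(entries or {})
        self.refcount = 1


class PmdEntry:
    """Toy pmd entry: a pointer to a pte page and a writable bit."""

    def __init__(self, pte_page, writable=True):
        self.pte_page, self.writable = pte_page, writable


def fork_share(parent):
    """At fork, point the child at the same pte page and make both pmd entries read-only."""
    parent.pte_page.refcount += 1
    parent.writable = False
    return PmdEntry(parent.pte_page, writable=False)


def set_pte(pmd, index, frame):
    """Modify a pte, first unsharing (copying) the pte page if it is still shared."""
    if not pmd.writable:
        if pmd.pte_page.refcount > 1:
            pmd.pte_page.refcount -= 1
            pmd.pte_page = PtePage(pmd.pte_page.entries)   # private copy for this address space
        pmd.writable = True
    pmd.pte_page.entries[index] = frame


parent = PmdEntry(PtePage({0: 100}))
child = fork_share(parent)
set_pte(child, 1, 200)                       # child unshares; parent is untouched
print(parent.pte_page.entries, child.pte_page.entries)
```

The last holder of a formerly shared pte page does not need a copy: it simply gets its pmd entry made writable again.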

[forks slowed down significantly in 2.5, due to rmap pte chains, and then shared pte sped that up again]

– locking changes: page_table_lock breaks when pte pages are shared – new lock in pte_page_lock protects pte page

– complications – reverse mapping includes pointer to mm_struct – shared page table pages may need pointers to multiple mm_structs – pointer had to be converted to a chain – several system calls may modify mappings and require unsharing pte pages

[philosophy: better safe than sorry, if not 100% sure that the sharing is correct, unshare it]

– primary motivation of the project is reduction of memory overhead [page tables live in lowmem]

– COW improves fork performance by factor of 10 – unsharing costs as much as fork without COW, plus a little extra – all programs unshare at least 3 pte pages – small programs only have 3 pte pages – simple hack is to not do COW for such programs (with only 3 pte pages)

– kernel compile showed no change when sharing pte pages – applications with massively shared areas benefited indirectly from the extra available memory

– status: patch was stable in about mid-November last year – the patch is still there and dmc is still maintaining it – talk to dmc for his copy of the patch

During the break, I met Jeff Dike in person, who told me that shared page tables should go into UML rather effortlessly, since the code is very similar in its organization. I also talked to dmc, who said that the patch in -mjb is pretty much up to date.
