Operating System: Three Easy Pieces --- Paging: Small Tables (Note)
We now tackle the second problem that paging introduces: page tables are too big and thus consume
too much memory. Let us start out with a linear page table. As you might recall, linear page tables get
pretty big. Assume again a 32-bit address space, with 4KB pages and 4 bytes page table entry. An address
space that has roughly one million virtual pages in it. Multiply by the page table entry size and you see
that our page table is 4MB in size. Recall also: we usually have one page table for every process in the
system. With one hundred active process (not uncommon on a modern system), we will be allocating
hundreds of megabytes of memory just for page tables. As a result, we are insearch of some techniques
to reduce this heavy burden. There are a lot of them, so let us get going. But not before our crux: How
to make page tables smaller?
Simple array-based page tables (usually called linear page tables) are too big, taking up for too much
memory on typical systems. How can we make page tables smaller? What are the key issues? What
inefficiencies arise as a result of these new data structures?
1. Hybrid Approach: Paging and Segments
Whenever you have reasonable but different approaches to something in life. You should always examine
the combination of the two to see if you can obtain the best of both worlds. We call such a combination
2. Multi-Level Page Tables
A different approach does not rely on segmentation, but attcks the same problem: how to get rid of all
these invalid regions in the page table instead of keeping them all in memory? We call this approach a
multi-level page table, as it turns the linear page table into something like a tree. This approach is so
effective that many modern operating systems employ it.
The basic idea behind a multi-level page table is simple. First, chop up the page table into page-sized units;
then, if an entire page of page-table entries (PTEs) is invalid, do not allocate that page of the page table
at all. To track whether a page of the page table is valid (and if valid, where it is in memory), use a new
structure, called the page directory. The page directory thus either can be used to tell you where a page
of the page table is, or that the entire page of page table contains no valid pages. The page directory, in
a simple two-level page table, contains one entry per page of the page table. It consists of a number of
page directory entries (PDE). A PDE minimally has a valid bit and a page frame number (PFN), similar to a
PTE. However, as hinted at above, the meaning of this valid bit is slightly different: If the PDE entry is valid,
it means that at least one of the pages of the page table that the entry points to (via the PFN) is valid. e.g.
in at least one PTE on that page pointed to by this PDE; the valid bit in that PTE is set to one. If the PDE entry
is not valid, the rest of the PDE is not defined.
Multi-level page tables have some obvious advantages over approaches we have seen thus far. First, and
perhaps most obviously, the multi-level page table only allocates page-table space in proportion to the amount
of address space you are using. thus it is generally compact and supports sparse address space. Second, if
carefully constructed, each portion of the page table fits neatly within a page, making it easier to manage memory;
the OS can simply grab the next free page when it needs to allocate or grow a page table. Contrast this to a
simple linear page table, whic is just an array of PTEs indexed by VPN; with such a structure, the entire linear
page table must reside contiguously on physical memory. For a large page table (say 4MB), finding such a large
chunk of unused contiguous free physical memory can be quite a challenge.
With a multi-level structure, we add a level of indirection, through use of the page directory, which points to pieces
of the page table. That indirection allows us to place page table pages wherever we would like in physical memory.
It shoud be noted that there is a cost to multi-level tables; on a TLB miss, two loads from memory will be required
to get the right translation information from the page table (one for the page directory, and one for the PTE itself),
in contrast to just one load with linear page table. Thus, the multi-level page table is a small example of a time-space
trade-off. We wanted smaller tables and get them, but not for free; although in the common case (TLB hit), performance
is obviously identical, a TLB miss suffers from a higher cost with this smaller table. Another obvious negative is
complexity, whether it is the hardware of OS handling the page-table lookup on a TLB miss, doing so is undoubtedly
more involved than a simpler linear page table lookup. Often we are willing to increase complexity in order to improve
performance or reduce overheads; in the case of multi-level table, we make page table lookups more complicated
in order to save valuable memory.优质内容筛选与推荐>>