Outline

• Introduction
• Virtual memory concepts
  * Page replacement policies
  * Write policy
  * Page size tradeoff
  * Page mapping
• Page table organization
  * Page table entries
• Translation lookaside buffer

• Page table placement
  * Searching hierarchical page tables
• Inverted page table organization
• Segmentation
• Example implementations
  * Pentium
  * PowerPC
  * MIPS
Introduction

- Virtual memory deals with the main memory size limitations
  * Provides an illusion of having more memory than the system’s RAM
  * Virtual memory separates logical memory from physical memory
    » Logical memory: A process’s view of memory
    » Physical memory: The processor’s view of memory
  * Before virtual memory
    » Overlaying was used
      – It is a programmer controlled technique
Introduction (cont’d)

• Virtual memory
  * Automates the overlay management process
    » Big relief to programmers

• Virtual memory also provides
  * Relocation
    » Each program can have its own virtual address space
    » Run-time details do not have any impact on code generation
  * Protection
    » Programs are isolated from each other
      – A benefit of working in their own address spaces
    » Protection can be easily implemented
Introduction (cont’d)

- Principles involved are similar to those of cache memory systems
  * Details are quite different
    » Due to different objectives
  * Concept of locality is key
    » Exploits both types of locality
      – Temporal
      – Spatial
  * Implementation is different
    » Due to different lower-level memory (disk)
      – Several orders of magnitude slower
Virtual Memory Concepts

- Implements a mapping function
  - Between virtual address space and physical address space
- Examples
  - PowerPC
    » 52-bit virtual address
    » 32-bit physical address
  - Pentium
    » Both are 32-bit addresses
      - But uses segmentation
Virtual Memory Concepts (cont’d)

• Virtual address space is divided into fixed-size chunks
  * These chunks are called virtual pages
  * Virtual address is divided into
    » Virtual page number
    » Byte offset into a virtual page
  * Physical memory is also divided into similar-size chunks
    » These chunks are referred to as physical pages
    » Physical address is divided into
      – Physical page number
      – Byte offset within a page
Virtual Memory Concepts (cont’d)

• A page plays the same role as a cache line: it is the unit of transfer between memory and disk

• Typical page size
  » 4 KB

• Example
  * 32-bit virtual address to 24-bit physical address
  * If page size is 4 KB
    » Page offset: 12 bits
    » Virtual page number: 20 bits
    » Physical page number: 12 bits
  * Virtual memory maps $2^{20}$ virtual pages to $2^{12}$ physical pages
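
The field widths in this example follow directly from the page size. A minimal sketch of the arithmetic, assuming 4 KB pages (the function names are illustrative, not from the slides):

```python
PAGE_SIZE = 4 * 1024                        # 4 KB pages, as in the example
OFFSET_BITS = PAGE_SIZE.bit_length() - 1    # log2(4096) = 12

def field_widths(address_bits):
    """Return (page-number bits, offset bits) for an address of the given width."""
    return address_bits - OFFSET_BITS, OFFSET_BITS

def split_address(address):
    """Split an address into (page number, byte offset within the page)."""
    return address >> OFFSET_BITS, address & (PAGE_SIZE - 1)

print(field_widths(32))    # virtual address:  (20, 12)
print(field_widths(24))    # physical address: (12, 12)
vpn, offset = split_address(0x12345ABC)
print(hex(vpn), hex(offset))    # 0x12345 0xabc
```

Only the page number is translated; the byte offset is copied unchanged into the physical address.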
Virtual Memory Concepts (cont’d)

An example mapping of 32-bit virtual address to 24-bit physical address
Virtual Memory Concepts (cont’d)

Virtual to physical address mapping

[Figure: pages 00000H, 00001H, 00002H, ... of the virtual address space (20-bit virtual page numbers) are mapped to pages 000H through FFFH of the physical address space (12-bit physical page numbers)]
Virtual Memory Concepts (cont’d)

Page fault handling routine

1. Page fault occurs
2. Memory full?
   - Yes: replace a page to make room (uses page-replacement policy)
   - No: continue
3. Consult disk address page table
4. Transfer the faulted page from disk to memory
5. Update page tables
6. Return to faulted instruction
Virtual Memory Concepts (cont’d)

- A virtual page can be
  * In main memory
  * On disk
- Page fault occurs if the page is not in memory
  * Like a cache miss
- OS takes control and transfers the page
  * Demand paging
    » Pages are transferred on demand
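
The page fault handling steps can be sketched as a toy simulation. This is only an illustration under simplifying assumptions: the class and method names are hypothetical, the disk transfer is not modeled, and FIFO stands in for whatever replacement policy a real OS would use.

```python
from collections import OrderedDict

class DemandPager:
    """Toy demand pager; all names here are illustrative."""

    def __init__(self, num_frames):
        self.num_frames = num_frames
        self.resident = OrderedDict()   # VPN -> frame number, oldest first
        self.faults = 0

    def access(self, vpn):
        """Return the frame holding vpn, handling a page fault if needed."""
        if vpn in self.resident:                 # page is in main memory
            return self.resident[vpn]
        self.faults += 1                         # page fault: OS takes control
        if len(self.resident) == self.num_frames:          # memory full?
            _, frame = self.resident.popitem(last=False)   # replace oldest page
        else:
            frame = len(self.resident)           # next free frame
        self.resident[vpn] = frame               # "transfer" page, update table
        return frame                             # faulting access can now retry

pager = DemandPager(num_frames=2)
for vpn in [1, 2, 1, 3, 1]:
    pager.access(vpn)
print(pager.faults)    # 4
```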

© S. Dandamudi
Chapter 18: Page 12
Virtual Memory Concepts (cont’d)

• Page Replacement Policies
  * Similar to cache replacement policies
  * Implemented in software
    » As opposed to cache’s hardware implementation
  * Can use elaborate policies
    » Due to slow lower-level memory (disk)
  * Several policies
    » FIFO
    » Second chance
    » NFU
    » LRU (popular)
      – Pseudo-LRU implementation approximates LRU
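
To make the difference between policies concrete, the sketch below counts page faults for FIFO and true LRU on a short reference string. The function name and the reference string are illustrative choices, not taken from the slides.

```python
from collections import OrderedDict

def count_faults(refs, num_frames, policy):
    """Count page faults for a reference string under 'fifo' or 'lru'."""
    frames = OrderedDict()      # resident pages, eviction candidate first
    faults = 0
    for page in refs:
        if page in frames:
            if policy == "lru":
                frames.move_to_end(page)    # a hit refreshes recency under LRU
            continue
        faults += 1
        if len(frames) == num_frames:
            frames.popitem(last=False)      # oldest (FIFO) or least recent (LRU)
        frames[page] = True
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(count_faults(refs, 3, "fifo"))   # 9
print(count_faults(refs, 3, "lru"))    # 10
```

On this particular string FIFO happens to fault less often than LRU; LRU is popular because it tends to do better on reference patterns with strong temporal locality.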
Virtual Memory Concepts (cont’d)

• Write policies
  * For cache systems, we used
    » Write-through
      – Not good for VM due to disk writes
    » Write-back
      – Used for VM: a modified page is written to disk only when it is replaced

• Page size tradeoffs
  * Factors favoring small page sizes
    » Internal fragmentation
    » Better match with working set
  * Factors favoring large page sizes
    » Smaller page tables
    » Disk access time

• Example page sizes
  * Pentium, PowerPC: 4 KB
  * MIPS: 7 page sizes between 4 KB and 16 MB
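
A rough calculation shows the two sides of the page size tradeoff. The sketch assumes a 32-bit virtual address and the common rule of thumb that the last page of an allocated region is half empty on average; both function names are illustrative.

```python
def page_table_entries(va_bits, page_size):
    """Number of virtual pages, hence entries in a flat page table."""
    return 2 ** va_bits // page_size

def avg_internal_fragmentation(page_size):
    """Expected wasted bytes per region (rule of thumb: half a page)."""
    return page_size // 2

for size in (4 * 1024, 64 * 1024, 16 * 1024 * 1024):
    print(size, page_table_entries(32, size), avg_internal_fragmentation(size))
# 4096      1048576  2048
# 65536     65536    32768
# 16777216  256      8388608
```

Larger pages shrink the page table but waste more memory per region, which is the tension the bullets describe.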
Virtual Memory Concepts (cont’d)

• Page mapping

  * Miss penalty is high
    » Should minimize miss rate

  * Can use fully associative mapping
    » Could not use this for cache systems due to hardware complexity

  * Uses a translation table
    » Called page table

  * Several page table organizations are possible
Page Table Organization

• Simple page table organization
  * Each entry in a page table consists of
    » A virtual page number (VPN)
    » Corresponding physical page number (PPN)
  * Unacceptable overhead
    » Every entry stores a VPN, and a lookup must search for the matching VPN

• Improvement
  * Use virtual page number as index into the page table

• Typical page table is implemented using two data structures
Page Table Organization (cont’d)

Two data structures
Page Table Organization (cont’d)

• Page table entry
  * Physical page number
    » Gives location of the page in memory if it is in memory
  * Disk page address
    » Specifies location of the page on the disk
  * Valid bit
    » Indicates whether the page is in memory
      – As in cache memory
  * Dirty bit
    » Indicates whether the page has been modified
      – As in cache memory
Page Table Organization (cont’d)

* Reference bit
  » Used to implement pseudo-LRU
    – OS periodically clears this bit
    – Accessing the page turns it on

* Owner information
  » Needed to implement proper access control

* Protection bits
  » Indicate the type of access permitted
    – Read-only, execute, read/write
  » Example: PowerPC uses three protection bits
    – Control the types of access granted to user-mode and supervisor-mode requests
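
The page table entry fields can be pictured as a small record. This is a sketch only: the field names and the string encoding of the protection bits are assumptions made for illustration, and owner information is left out.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PageTableEntry:
    disk_address: int                       # where the page lives on disk
    physical_page: Optional[int] = None     # meaningful only when valid is True
    valid: bool = False                     # page is in memory
    dirty: bool = False                     # page has been modified
    referenced: bool = False                # set on access, cleared by the OS
    protection: str = "rw"                  # hypothetical encoding: "r", "rw", "x"

def access(pte, write=False):
    """Record an access; raise on an absent page or a protection violation."""
    if not pte.valid:
        raise LookupError("page fault")
    if write and "w" not in pte.protection:
        raise PermissionError("write to read-only page")
    pte.referenced = True
    pte.dirty = pte.dirty or write

def clear_reference_bits(table):
    """What the OS does periodically to approximate LRU."""
    for pte in table:
        pte.referenced = False
```

Pages whose reference bit is still clear at the next sweep were not used recently, which makes them reasonable replacement candidates.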
Translation Lookaside Buffer

• For large virtual address spaces
  * Translation table must be stored in virtual address space
    » Every memory reference then requires two memory accesses:
      – To get physical page number for the virtual page number
      – To get the program’s memory location
  
• To reduce this overhead, most recently used PTEs are kept in a cache
  * This is called the *translation lookaside buffer* (TLB)
    » Small in size
    » 32 – 256 entries (typical)
Translation Lookaside Buffer (cont’d)

- Each TLB entry consists of
  - A virtual page number
  - Corresponding physical page number
  - Control bits
    » Valid bit
    » Reference bit
    » . . .

- Most systems keep separate TLBs for data and instructions
  - Examples: Pentium and PowerPC
Translation Lookaside Buffer (cont’d)

Translation using a TLB

1. Program requests a virtual page number (VPN)
2. Perform TLB lookup
3. Requested page table entry in TLB?
   - Yes: generate physical address
   - No: perform page table lookup
4. After a page table lookup: requested page in memory?
   - Yes: update TLB, then generate physical address
   - No: handle page fault
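
The translation flow can be sketched in a few lines. The sketch assumes 4 KB pages, a dictionary standing in for the page table (holding resident pages only), and LRU replacement inside the TLB; the class and function names are illustrative.

```python
from collections import OrderedDict

PAGE_BITS = 12    # assumes 4 KB pages

class TLB:
    def __init__(self, capacity=64):
        self.capacity = capacity
        self.entries = OrderedDict()    # VPN -> PPN, least recently used first

    def lookup(self, vpn):
        if vpn in self.entries:
            self.entries.move_to_end(vpn)       # mark as most recently used
            return self.entries[vpn]
        return None

    def update(self, vpn, ppn):
        if vpn in self.entries:
            self.entries.move_to_end(vpn)
        elif len(self.entries) == self.capacity:
            self.entries.popitem(last=False)    # evict least recently used PTE
        self.entries[vpn] = ppn

def translate(vaddr, tlb, page_table):
    """page_table maps VPN -> PPN for resident pages only."""
    vpn, offset = vaddr >> PAGE_BITS, vaddr & ((1 << PAGE_BITS) - 1)
    ppn = tlb.lookup(vpn)
    if ppn is None:                             # TLB miss: consult page table
        if vpn not in page_table:
            raise LookupError("page fault")     # page not in memory
        ppn = page_table[vpn]
        tlb.update(vpn, ppn)
    return (ppn << PAGE_BITS) | offset

tlb = TLB(capacity=2)
print(hex(translate(0x12345678, tlb, {0x12345: 0xABC})))    # 0xabc678
```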
Page Table Placement

- Large virtual address spaces need large translation tables
  * Placed in virtual address space
  * Since the entire page table is not needed, we can use a hierarchical design
    » Partition the page table into pages (like the user space)
    » Bring in the pages that are needed
    » We can use a second-level page table to point to these first-level page-table pages
    » We can recursively apply this procedure until we get a small page table
Page Table Placement (cont’d)

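Three-level hierarchical page table

Virtual address fields: 8 bits | 10 bits | 10 bits | 12 bits (byte offset)

  * Base address (in a register) locates the level-3 page table (256 PTEs), indexed by the 8-bit field
  * The selected level-3 PTE locates a level-2 page table (1024 PTEs), indexed by the first 10-bit field
  * The selected level-2 PTE locates a level-1 page table (1024 PTEs), indexed by the second 10-bit field
  * The selected level-1 PTE supplies the physical page number, which is combined with the byte offset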
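
A top-down walk of this three-level table can be sketched as follows. Nested dictionaries stand in for the page-table pages, and the field widths are the 8, 10, 10, and 12 bits of the figure; the function names and sample values are illustrative.

```python
FIELD_BITS = (8, 10, 10)    # level-3, level-2, level-1 index widths
OFFSET_BITS = 12

def split(vaddr):
    """Return ((level-3, level-2, level-1) indices, byte offset)."""
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    rest = vaddr >> OFFSET_BITS
    indices = []
    for bits in reversed(FIELD_BITS):       # peel off the level-1 index first
        indices.append(rest & ((1 << bits) - 1))
        rest >>= bits
    return tuple(reversed(indices)), offset

def walk(root, vaddr):
    """Top-down search starting at the level-3 table (the root)."""
    indices, offset = split(vaddr)
    node = root
    for index in indices:
        node = node[index]      # one memory access per level; KeyError = unmapped
    return (node << OFFSET_BITS) | offset

root = {3: {5: {7: 0x42}}}      # only the needed table pages exist
vaddr = (3 << 32) | (5 << 22) | (7 << 12) | 0x9A
print(hex(walk(root, vaddr)))   # 0x4209a
```

Only the table pages that are actually populated need to exist, which is the point of the hierarchical design.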
Page Table Placement (cont’d)

• Hierarchical page tables are also called *forward-mapped page tables*
  * Translation proceeds from virtual page number to physical page number
    » In contrast to inverted page table

• Examples
  * Pentium
    » 2-level hierarchy
    » Details later
  * Alpha
    » 4-level hierarchy
Page Table Placement (cont’d)

• Searching hierarchical page tables
  * Two strategies
    » Top-down
      – Starts at the root and follows all the levels
      – Previous example requires four memory accesses
        ➔ 3 for the three page tables
        ➔ 1 to read user data
    » Bottom-up
      – Reduces the unacceptable overhead of top-down search
      – Starts at the bottom level
        ➔ If the page is in memory, gets the physical page number
        ➔ Else, resorts to top-down search
Inverted Page Table Organization

• Number of entries in the forward page table
  * Proportional to number of virtual pages
    » Quite large for modern 64-bit processors
    » Why?
      – We use virtual page number as index
• To reduce the number of entries
  * Use physical page number as index
    » Table grows with the size of memory
• Only one system-wide page table
  * VPN is hashed to generate index into the table
    » Needs to handle collisions
Inverted Page Table Organization (cont’d)

Hash anchor table is used to reduce collision frequency

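[Figure: The virtual page number of the virtual address is passed through a hash function to produce an index into the hash anchor table. The selected anchor points into the inverted page table, whose entries hold Pid, CB, and VPN fields. The position of the matching entry gives the physical page number, which is combined with the byte offset to form the physical page address.]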
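
An inverted page table with a hash anchor table might be sketched as below. The class is hypothetical: Python's built-in `hash` is a placeholder for the hardware hash function, collisions are resolved by chaining through the table entries, and removal of mappings is not modeled.

```python
class InvertedPageTable:
    """One system-wide table with one entry per physical page (frame)."""

    def __init__(self, num_frames, num_anchors):
        self.entries = [None] * num_frames     # index = physical page number
        self.anchors = [None] * num_anchors    # hash anchor table: bucket -> frame

    def _bucket(self, pid, vpn):
        return hash((pid, vpn)) % len(self.anchors)    # placeholder hash function

    def insert(self, pid, vpn, frame):
        bucket = self._bucket(pid, vpn)
        # New entry becomes the head of its chain and links to the old head
        self.entries[frame] = (pid, vpn, self.anchors[bucket])
        self.anchors[bucket] = frame

    def lookup(self, pid, vpn):
        """Return the physical page number, or None if the page is not resident."""
        frame = self.anchors[self._bucket(pid, vpn)]
        while frame is not None:
            entry_pid, entry_vpn, next_frame = self.entries[frame]
            if (entry_pid, entry_vpn) == (pid, vpn):
                return frame
            frame = next_frame                 # follow the collision chain
        return None
```

The table size depends on the number of physical pages, not on the size of the virtual address space; the process id in each entry lets a single table serve all processes.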
Segmentation

• Virtual address space is linear and 1-dimensional
  * Segmentation adds a second dimension
• Each process can have several virtual address spaces
  * These are called segments
  * Example: Pentium
    » Segments can be as large as 4 GB
• Address consists of two parts
  * Segment number
  * Offset within the segment
Segmentation (cont’d)

• Pentium and PowerPC support segmented-memory architecture
  * Paging is transparent to programmer
    » Segmentation is not
      – Pentium assembly programming makes it obvious
        ➔ Uses three segments: data, code, and stack

• Segmentation offers some key advantages
  * Protection
    » Can be provided on segment-by-segment basis
  * Multiple address spaces
  * Sharing
    » Segments can be shared among processes
Segmentation (cont’d)

Dynamic data structure allocation

[Figure: (a) stack; (b) code segment, data segment, and stack segment]

2003 © S. Dandamudi
Example Implementations

• Pentium

  * Supports both paging and segmentation
    » Paging can be turned off
    » Segmentation can be turned off

  * Segmentation translates a 48-bit logical address to a 32-bit linear address
    » If paging is used
      – Paging translates the 32-bit linear address to a 32-bit physical address
    » If paging is off
      – Linear address is treated as the physical address
Example Implementations (cont’d)

Pentium’s logical to physical address translation
In Pentium, each segment can have its own page table.

Diagram:

- Local descriptor table (LDT)
  - Descriptor
  - Descriptor

- Page directory
  - PDE

- Page table
  - PTE
  - PTE
  - PTE

- Physical pages
Example Implementations (cont’d)

Pentium’s page directory & page table entry format

(a) Page directory entry and (b) page table entry use the same 32-bit layout:

<table>
<thead>
<tr>
<th>Bit(s)</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>31–12</td>
<td>20-bit physical page number</td>
</tr>
<tr>
<td>11–9</td>
<td>Avail: Available for system programmer use</td>
</tr>
<tr>
<td>7</td>
<td>Page size in PDE (0 for 4 KB pages)</td>
</tr>
<tr>
<td>6</td>
<td>0 in a page directory entry; dirty bit (D) in a page table entry</td>
</tr>
<tr>
<td>5</td>
<td>Page accessed (A)</td>
</tr>
<tr>
<td>4</td>
<td>Page-level cache-disable (PCD)</td>
</tr>
<tr>
<td>3</td>
<td>Page-level write-through (PWT)</td>
</tr>
<tr>
<td>2</td>
<td>User/supervisor (U)</td>
</tr>
<tr>
<td>1</td>
<td>Writes permitted (W)</td>
</tr>
<tr>
<td>0</td>
<td>Page present bit (P)</td>
</tr>
</tbody>
</table>

Shaded bits (in the original figure): Reserved by Intel (must be zero)
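
Using the bit positions listed for the entry format, a 32-bit entry can be decoded with shifts and masks. This is a sketch: the function name is illustrative, and the label "PS" for the page size bit is an assumption, since the slides do not abbreviate it.

```python
FLAG_BITS = {"P": 0, "W": 1, "U": 2, "PWT": 3, "PCD": 4, "A": 5, "D": 6, "PS": 7}

def decode_entry(entry):
    """Decode a 32-bit page directory or page table entry into named fields."""
    fields = {name: (entry >> bit) & 1 for name, bit in FLAG_BITS.items()}
    fields["avail"] = (entry >> 9) & 0b111        # bits 11-9
    fields["ppn"] = (entry >> 12) & 0xFFFFF       # bits 31-12
    return fields

fields = decode_entry(0x12345067)
print(hex(fields["ppn"]), fields["P"], fields["W"], fields["D"])   # 0x12345 1 1 1
```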

Example Implementations (cont’d)

Pentium’s protection rings

- Level 0: OS kernel
- Level 1: System calls
- Level 2: Shared libraries
- Level 3: Applications
Example Implementations (cont’d)

• PowerPC
  * Supports both segmentation and paging
  * Logical and physical addresses are 32 bits long
  * 32-bit logical address consists of
    » 12-bit byte offset
    » 16-bit page index
    » 4-bit segment number
      – Selects one of 16 segment registers
      – Segment descriptor is a 24-bit virtual segment id (VSID)
  * 52-bit virtual address consists of
    » 40-bit VPN
    » 12-bit offset
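
The construction of the 52-bit virtual address from a 32-bit logical address can be sketched as follows. The function name and the sample register contents are illustrative; the field widths are the ones listed above.

```python
def to_virtual(logical, segment_registers):
    """Expand a 32-bit logical address into a 52-bit virtual address."""
    segment = logical >> 28                       # 4-bit segment number
    page_index = (logical >> 12) & 0xFFFF         # 16-bit page index
    offset = logical & 0xFFF                      # 12-bit byte offset
    vsid = segment_registers[segment] & 0xFFFFFF  # 24-bit VSID
    vpn = (vsid << 16) | page_index               # 40-bit virtual page number
    return (vpn << 12) | offset

registers = [0] * 16            # the 16 segment registers
registers[0xA] = 0xABCDEF       # made-up VSID for segment 0xA
print(hex(to_virtual(0xA1234567, registers)))    # 0xabcdef1234567
```

The 24-bit VSID and the 16-bit page index together form the 40-bit VPN, which is what the page table lookup operates on.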
Example Implementations (cont’d)

PowerPC’s logical to physical address translation process
Example Implementations (cont’d)

• PowerPC uses inverted page table
  * Uses two hash tables
    » Primary
      – Uses 8-way associative page table entry groups
    » Secondary
      – 1s complement of the primary hash function
  * PTEs are 8-bytes wide
    » Stores valid bit, reference bit, changed bit (i.e., dirty bit)
    » W bit (write-through)
      – W = 1: write-through policy
      – W = 0: write-back policy
    » I bit (cache inhibit)
      – I = 1: cache inhibited (accesses main memory directly)
Example Implementations (cont’d)

Hash table organization in PowerPC

[Figure: A hash function applied to the 40-bit virtual page number produces the 19-bit primary hash, which selects one PTE group (8 PTEs) in the primary hash table. If the PTE is not in the primary hash table, the secondary hash function (the 1s complement of the primary hash) selects a PTE group in the secondary hash table. The matching PTE supplies the 20-bit physical page number.]
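
The two-step search can be sketched as below. The slides do not spell out the primary hash function, so the XOR of the VSID with the page index used here is an assumption made for illustration; the 19-bit width, the 1s-complement secondary hash, and the 8-entry groups come from the figure. The dictionary of groups is a stand-in for the hash table in memory.

```python
HASH_BITS = 19
HASH_MASK = (1 << HASH_BITS) - 1

def primary_hash(vpn):
    """Assumed mixing function: VSID XOR page index, kept to 19 bits."""
    vsid, page_index = vpn >> 16, vpn & 0xFFFF
    return (vsid ^ page_index) & HASH_MASK

def secondary_hash(vpn):
    """1s complement of the primary hash, kept to 19 bits."""
    return ~primary_hash(vpn) & HASH_MASK

def lookup(vpn, groups):
    """groups maps a hash value to a PTE group: up to 8 (vpn, ppn) pairs."""
    for hash_value in (primary_hash(vpn), secondary_hash(vpn)):
        for entry_vpn, ppn in groups.get(hash_value, []):
            if entry_vpn == vpn:
                return ppn
    return None     # not in either group: page fault path
```

Trying a second group before giving up reduces the chance that a full primary group forces a page fault for a resident page.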

Example Implementations (cont’d)

• MIPS R4000
  * Segmentation is not used
    » Uses address space identifiers (ASIDs) for
      – Protection
      – Virtual address space extension
  * 32-bit virtual address is extended by an 8-bit ASID
    » 8-bit ASID
    » 20-bit VPN (with 4 KB pages)
      – Width depends on the page size
    » 12-bit offset (with 4 KB pages)
      – Width depends on the page size
      – Supports pages from 4 KB to 16 MB
Example Implementations (cont’d)

Virtual to physical address translation with 4 KB pages
Example Implementations (cont’d)

Virtual to physical address translation with 16 MB pages

Virtual address: | 8-bit ASID | 8-bit VPN | 24-bit byte offset |

TLB/page table translation maps the VPN to a physical page number; the byte offset is unchanged

36-bit physical address: | 12-bit physical page number | 24-bit byte offset |
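
How the VPN and offset widths change with page size can be computed directly. The sketch assumes the seven supported sizes are successive powers of four from 4 KB to 16 MB and that the address (excluding the ASID) is 32 bits; the names are illustrative.

```python
PAGE_SIZES = [4 * 1024 * 4 ** i for i in range(7)]    # 4 KB, 16 KB, ..., 16 MB

def split_widths(page_size, va_bits=32):
    """Return (VPN bits, offset bits) for the given page size."""
    offset_bits = page_size.bit_length() - 1          # log2 of the page size
    return va_bits - offset_bits, offset_bits

for size in PAGE_SIZES:
    print(size // 1024, "KB:", split_widths(size))
# first line: 4 KB: (20, 12); last line: 16384 KB: (8, 24)
```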
Example Implementations (cont’d)

MIPS TLB entry format

| Word | High bit | Fields, most significant first (width in bits) |
| Mask | 127 | 0, Mask, 0 |
| EntryHi | 95 | VPN2 (19), G (1), 0 (4), ASID (8) |
| EntryLo1 | 63 | 0 (2), Physical page number (24), C (3), D (1), V (1) |
| EntryLo0 | 31 | 0 (2), Physical page number (24), C (3), D (1), V (1) |
Example Implementations (cont’d)

• MIPS R4000 supports two TLB replacement policies
  * Random
    » Randomly selects an entry
  * Indexed
    » Selects the entry specified

• Two registers support these two policies
  * A Random register for the random policy
  * An Index register for the indexed policy
    » Specifies the entry to be replaced
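
The two policies differ only in where the slot number comes from. A toy model, assuming a 48-entry TLB and using Python's pseudo-random generator in place of the hardware Random register (class and method names are hypothetical):

```python
import random

class TLBRegisters:
    """Toy model of random and indexed TLB replacement."""

    def __init__(self, num_entries=48, seed=None):
        self.entries = [None] * num_entries
        self.index = 0                         # Index register, set by software
        self._rng = random.Random(seed)        # stands in for the Random register

    def write_indexed(self, entry):
        """Indexed policy: replace the entry named by the Index register."""
        self.entries[self.index] = entry
        return self.index

    def write_random(self, entry):
        """Random policy: replace a randomly selected entry."""
        slot = self._rng.randrange(len(self.entries))
        self.entries[slot] = entry
        return slot
```

Indexed writes give the OS full control over which translation is evicted, while random writes need no bookkeeping at all.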