This document assumes that the project will be split up into different sections that can be implemented and tested separately for the most part. It can also be done all at once, but this means that the entire thing has to be finished before any testing can be done. While this wasn't too much of an issue in Project 3, it's a BAD idea for Project 4. And yes, the caps are warranted there.

 

 

The TLB

========

Up until now, Nachos has used a page table to translate from a virtual address to a physical address. This project switches from the page table to a TLB. The TLB is (by default) a 4-entry TranslationEntry array; in effect, it's a miniature page table. Nachos uses the TLB to convert from the virtual address that it receives into a physical address in order to actually access Nachos main memory.

 

The problem with the TLB, just like the page table, is updating it. With a page table, it was simple; whenever there was a context switch, the machine->pageTable pointer was updated to point to the page table of the thread taking over the CPU. A TLB isn't quite as simple. There's only the one single TLB for the entire instance of Nachos, and Nachos itself has no clue what to do when it can't find something in the TLB. When this happens, it throws a PageFaultException.

 

PageFaultExceptions, just like all of the other varieties, wind up in ExceptionHandler, at the bottom of exception.cc. To begin with, it is only set up to handle SyscallExceptions. It must be modified to handle PageFaultExceptions. Creating a new function keeps things a bit cleaner. Name it something bold and imaginative, like HandlePageFault.

 

Since a PageFaultException is the result of Nachos being unable to find a particular virtual address in the TLB, handling the page fault must include putting that particular virtual address-physical address translation into the TLB before returning. At the lowest level, this consists of finding that entry in the current thread's address space and placing it in the TLB. Since the page table is indexed by virtual address, it is very simple to find the entry in the address space.

 

Once found, the translation must be placed into the TLB. This is to be done sequentially, replacing TLB entry 0, 1, 2, 3, 0, and so on. This can be accomplished with a simple counter.

 

NOTE: A TLB replacement can actually replace the translation needed to get the instruction that is being executed. Don't be surprised if you get back-to-back PageFaultExceptions. So long as no single instruction requires more than four different pages in memory, it will eventually be able to execute.

 

NOTE: A PageFaultException, unlike a SyscallException, should *not* increment the PC, as the instruction at that location in memory has not been completed.

 

NOTE: Nachos looks for a particular virtual address in the TLB. It does not have any concept of multiprogramming, so it can't tell the virtual address of one process from that of any other process. Be sure to invalidate the contents of the TLB on a context switch. If you want to get fancy, you can do a check and only invalidate the TLB if the incoming thread is from a different process than the outgoing thread. If they are from the same process, they have the same address space.

 

NOTE: Any time that the TLB is invalidated, the dirty bits must be checked. If something has been changed, this change must be propagated back to the page table/IPT. Basically, this means that the dirty bit must be set. The data has already been changed in memory, but if the page table/ITP don't know about this, that page in memory could potentially be overwritten.

 

NOTE: There are several functions (copyin, copyinz, copyout) that use the machine->pageTable pointer by default. When using the TLB, this will always be null. These functions must be modified to work without the machine->pageTable pointer.

 

Simple Algorithm

----------------

A. Get the virtual page required by dividing the virtual address by the page size.

B. Find that entry in the current thread's page table.

C. Copy the virtual and physical page numbers from the page table to the slot in the TLB.

   1. At this point, there is no need to propagate back to the page table, as everything is pre-loaded.

 

 

The IPT

=======

The IPT is an inverted page table; rather than being indexed by virtual page, it is indexed by physical page. It contains the information about what is located in different pages of Nachos main memory. Nachos knows nothing about an IPT. You must create it from the ground up.

 

The TranslationEntry class is insufficient for the IPT (and the page table, for that matter). Looking ahead to demand-paged virtual memory, you might consider keeping track of some or all of the following in a new class:

   1. Physical Page

   2. Virtual Page

   3. Page Type (code, data, mixed, etc.)

   4. Page Location (main memory, swapfile, executable, etc.)

   5. Dirty Bit (whether a page has been modified -- VERY important)

   6. Use Bit (Never did find anywhere that this is used...)

   7. Read-Only Bit (Never found this being used either)

   7. Process ID (Important because there are multiple virtual page Xs)

   8. Timestamp (For Nachos LRU page replacement later)

 

Please note that the above is not a definitive, complete list. You may want to add more to it.

 

The IPT is supposed to keep track of what is in memory. In order to do this, it needs to be updated any time something is moved into/out of memory. With pre-loading, this occurs only when a new address space is created. When demand-paged virtual memory is implemented, this will occur much more frequently during page faults.

 

Another Simple Algorithm

------------------------

A. When allocating memory in the AddrSpace constructor, also set the information in the IPT.

   1. At this point, the timestamp is irrelevant.

 

 

Demand-Paged Virtual Memory

===========================

Prior to this, everything was pre-loaded into Nachos main memory in the AddrSpace constructor. This is no longer the case. A page in main memory is only to be used when it is actually needed during program execution. Hence a page will only be loaded into main memory on a page fault. There is no more pre-loading of memory. This means that you cannot close the executable when creating a process. Check progtest for this. Since the executable remains open, you must keep a handle on it for later use.

 

The basic idea of virtual memory is to be able to hold more pages in "memory" without actually adding more RAM. The pages that get pushed out of main memory are copied into a swap file on a hard drive. If they are needed again, they get copied back into main memory.

 

As referenced in the previous section, the TranslationEntry class is no longer sufficient. The page table for each address space needs to be modified to keep track of where a particular page is; if a page is swapped to disk, the only place that knows about it is the page table for that process. Modify the page table to keep track of the additional information that is needed.

 

Page replacement also requires additional information in the form of a timestamp (at least for the Nachos LRU method of replacement). Nachos knows nothing about the IPT, or any enhanced TranslationEntry class; you'll need to keep track of the timestamp in the IPT. Since Nachos doesn't know about the IPT, it won't update the timestamp for you. The only real place where you can do so is while handling a page fault. At that point, update the timestamp for each page in the TLB (provided, of course, that said page is valid). Note that this is only an approximation, as four pages could potentially have the same timestamp, even if they were actually used at different times by Nachos. Nachos LRU selects the page with the lowest timestamp (lowest tick count) to replace.

 

All of this references a swap file; all this is in the case of Nachos is a plain old file that you copy stuff into and out of. See the AddrSpace constructor for an example of reading from a file. You'll need to create and open the swap file when Nachos starts, and close it when Nachos finishes. It must be accessible by all processes, as there is a single swap file per instance of Nachos.

 

On a page fault, if there is no more space in memory, something will need to be removed to make space. If the page is dirty, it must be copied into the swap file and the page table updated. The easiest way to do this is to use the OpenFile WriteAt function. WriteSegment also works, but it's harder to use. Just copy a page sized chunk from Nachos main memory into the swap file. Note that you will need some way to keep track of where in the swap file a particular page has been placed.

 

 

Integration

===========

All three of these sections can actually be done independently; the moment of truth arrives when the time comes to put everything together. In theory, it should be a matter of changing what gets updated, and where information is pulled from.

 

Integration Steps

-----------------

1. Remove any code that pre-loads memory (generally only in the AddrSpace constructor). You will only be loading into/out of memory on a page fault from here on out.

2. Modify the TLB to update from the IPT on a page fault.

3. Modify the IPT to load from the executable/swap file if something is not in memory.

4. Hope that everything works.

 

The following is *one* possible algorithm for handling a page fault.

 

Algorithm of Gigantitude

------------------------

A. Get the virtual page required.

   1. Ensure that the page is within a legal range.

B. Update the timestamp in the IPT for each valid entry of the TLB.

C. Check to see whether the needed page is in the IPT.

   1. If so, go on to step D.

   2. If not, then the page is either in the executable or the swap file and needs to be loaded into memory.

      a. This in turn means that there must be an available slot in main memory.

   3. Select a page in main memory in which to place the new page.

             a. If there is an unused page, use it.

             b. If there are no unused pages and random page replacement is being used, select a random page.

      c. If there are no unused pages and Nachos LRU page replacement is being used, select the page in the IPT with the lowest timestamp.

   4. Check to see whether the selected page is in the TLB.

      a. If so, set it invalid and propagate the dirty bit back to the IPT.

   5. Swap the page from main memory to the swap file.

      a. Depending on which implementation of the swap file you use, the page may or may not already have a swap file page associated with it.

             b. Write the contents of the page into the swap file from main memory.

             c. Update the page table to reflect the location of the evicted page.

   6. Now that there is an available slot in main memory, copy the needed page into it.

      a. Depending on where the page is stored (executable or swap file), read it from there into main memory.

   7. Update the IPT entry to correspond to the new page (PID, virtual page number, physical page number).

D. Update the TLB with the needed page.

   1. If the entry in the TLB that is to be replaced is valid and dirty, propagate the dirty bit back to the IPT.

   2. Copy the information about the needed page into the TLB (virtual page, physical page).

   3. Set the TLB entry's valid bit to true.

E. Increment the TLB counter used to select an entry to replace.

 

 

Remote Procedure Calls

======================

Get used to 'em. They'll be back with friends in Project 4. Armed with this knowledge, make sure to set them up well now; it will save you a great many headaches and sleepless nights later. One *IMPORTANT* thing to note is that while the Nachos packet is a character array, the data you send need not be a string. All of Nachos memory is a character array, so it's fairly simple to copy from one place to the other. How you structure your messages is up to you; there are quite a few ways to do it.

 

 

General Hints and Tricks

========================

1. Be careful with your macro guards (#ifdef VARIABLE, etc.). They can make life easier, or make a mess of your implementation.

 

2. Note that this project spans two directories: vm and network. When compiled from the vm directory, it *should not* use RPCs. Make sure that you don't delete the code you put together in the previous projects. Instead, use a macro (#ifdef NETWORK, or somesuch) to selectively compile one version or the other, depending on where it's being compiled from.

 

3. Whenever you find yourself doing the same thing many times (especially if you wind up copy/pasting code), put it in a function and make sure it works. Put together a little unit test. You really, *really* don't want to find out that your parsing code doesn't work after copying it to 47 different places in the code.

 

4. Use DDD. It makes life much easier. Run Nachos from within DDD if you hit segmentation faults/bus errors. When that happens, DDD will halt Nachos, but keep the stack. You can use that to find exactly where the segmentation fault is occurring. This is not the be-all and end-all of debugging, though. It really doesn't help when you get a segmentation fault in realfree under malloc.

 

5. Nachos packets give you a character array of something like 36 bytes by default. Note that you DO NOT have to send messages as strings.

 

6. When compiling in the network directory, Nachos gives each instance a PostOffice. Said PostOffice comes with its own thread, which keeps checking for messages. This thread never goes away, so Nachos will never halt on its own unless you explicitly halt it as part of Exit.

 

7. Archive your work. It can be handy to look at an earlier project. A good way to do this is to stick the code directory into a .tar.gz file once you're done with a project (and after you run gmake clean). Something like this:

   gtar -cf project1.tar code/

   gzip project1.tar

   This creates project1.tar.gz, which is the code directory in a compacted format. The code directory is still there as well. To untar, run the following command:

   gtar zxf project1.tar.gz

   This will re-inflate it into the code directory. It is strongly recommended that you do this somewhere other than in nachos-csci402; if you untar it here, it may overwrite your up-to-date files. Just put it in a temporary directory somewhere.

The University of Southern California does not screen or control the content on this website and thus does not guarantee the accuracy, integrity, or quality of such content. All content on this website is provided by and is the sole responsibility of the person from which such content originated, and such content does not necessarily reflect the opinions of the University administration or the Board of Trustees