CSCI402: Frequently Asked Questions about the Nachos Assignments

Last updated: 17 February 2000

You can jump to frequently asked questions about:
General Questions about the Class
Questions about submitting your assignment
Questions about Nachos Programming
Project 1 Questions
Project 2 Questions
Project 3 Questions
Project 4 Questions


General Class Questions

  1. AM I ALLOWED TO CHEAT?

    NO. You are welcome to discuss the projects and the design of your code with other members of the class -- in fact, you are encouraged to do so. But this collaboration is allowed only as far as sharing ideas and discussing designs; you absolutely MUST NOT share code.

    The University has specific recommendations on how to deal with students who copy code from other students for assignments, and for the students who allow the code to be copied. And we have automatic tools that detect copying, so you will be caught. Students have gotten F in the course in the past, so DON'T DO IT.

  2. Can you debug my code for me?

    No. Debugging is your job. If you have a SPECIFIC question, you might want to make the question clearer by emailing us a SHORT (10-15 lines) piece of your code. However, please don't mail us an entire file and ask a general question like "Does this look correct to you?" or "Can you tell me why this doesn't work?".

  3. Is this class hard?

    Yes. You need significant programming experience and plenty of time to do the assignments. Also, you'll be working in a group, so you need to know how to manage small teams and, how to make sure that things get done when there is more than one person responsible.

Questions about Submitting your Assignment

  1. How do I submit my assignment?

    See this page.

  2. How do I know what files will be submitted when I submit?

    Run the 402submit program in "debug mode", i.e. with the "-d" argument.

  3. When we try to submit, we get "disk quota exceeded". What do we do?

    This actually means that your disk quota is being exceeded. You have a one meg disk quota on /auto/submit-scf, so don't try to submit more than one meg of files or 100 files. If you need to submit more than once, and your second submission pushes you over your disk quota, wait a few minutes for your first submission to be cleared out of /auto/submit-scf. This should happen within 15 minutes. You can check your disk quota by typing "quota -v" and looking under the entry marked "/auto/submit-scf".

    If you have trouble submitting, DO NOT EMAIL CODE TO THE TAs. You MUST wait for your disk partition to be cleared and then use the 402submit program again. This means you should PLAN YOUR WORK ACCORDINGLY - do not work right up until the deadline!

  4. How should the writeup be formatted?

    In plain ASCII (NOT postscript, NOT MS-Word, etc.).

Questions about Nachos Programming

  1. How do I get a printout of the entire source?

    Are you sure you really want to print all the code? It can be useful, but be warned that printing all the code will:

    Much more recommended is to use the code browsing facility in Emacs enabled by creating tags as described below.

    If you really want to print the code, first, go into the Makefile and find the line that says "LPR = lpr", and change it to read "LPR = lpr -Pxxxx", replacing "xxxx" with the name of the printer you want to use. Then type

    cd nachos-csci402/code
    gmake print
    
  2. How do I create a TAGS file?

    cd nachos-csci402/code
    etags -t `find . \( -name '*.c' -o -name '*.cc' -o -name '*.h' \)`
    
    In Emacs, you can then use M-. to find the highlighted symbol's definitions. This is very cool, since it allows you, while reading the code to jump into the definition of a subroutine from the place where it is used.

  3. I get the following error message when I try to run nachos
    ld.so.1: nachos: fatal: libucb.so.1: can't open file: errno=2
    

    Add this code to your .cshrc
    #!/bin/csh 
    #
    if ( ${?LD_LIBRARY_PATH} ) then
            setenv LD_LIBRARY_PATH /usr/ucblib:$LD_LIBRARY_PATH
    else
            setenv LD_LIBRARY_PATH /usr/ucblib
    endif
    

  4. I am trying to pass a class member function to the Fork() function but the linker says it's not found!

    You can only pass a pointer to a class member function if it is a static member function. (If you think about it, it should be clear why this is true -- think about the "this" pointer.)

  5. When I do a make, I get an error like make: Fatal error in reader: ../Makefile.common, line 126: Extra `:', `::', or `:=' on dependency line

    You have to use 'gmake' instead of 'make'.

  6. What does a "segmentation fault" or "bus error" mean?

    It often means you're not using pointers correctly. UNIX usually generates this error if you try to dereference a NULL pointer, or a pointer that you never initialized.

  7. coff2noff fails when I try to run it on programs in the 'test' directory; it gives me error messages like "can't handle both bss and sbss" or "can't handle both data and rdata".

    This is a known limitation of the standard Nachos distribution. Here's a workaround provided by Dr. F:
    It's a known limitation of our configuration.  It's actually the
    linker that's screwed up.
    
    Have him put the following in his test directory, called nachos.ld and
    change -T script to -T ./nachos.ld to his LDFLAGS.  Works like a
    charm, and took me forever to find. :-)  (Most people don't have the
    problem, so I didn't put the code in the tarball.)
    
     SEARCH_DIR(/vex/gnu/decstation-ultrix/lib);
    ENTRY(__start)
    SECTIONS
    {
      .text : {
         _ftext = . ;
        *(.init)
        *(.text)
         etext  =  .;
         _etext  =  .;
      }
      .data : {
        *(.rdata)
        *(.data)
        *(.sdata)
        *(.scommon)
      }
      .bss : {
        *(.bss)
        *(.sbss)
        *(COMMON)
      }
       end = .;
       _end = .;
    }
    
    

Assignment 1: Threads and Synchronization

Question 1

  1. Why do we need to pass the lock into the Signal() function?

    Because a thread should not be able to signal a condition variable unless it is already holding the lock associated with that variable.

  2. What is the difference between Signal and Broadcast?

    Signal() signals ONE of the processes waiting on a condition variable. (If there's more than one, ONLY one of the processes should be signalled -- it doesn't matter which one.)

    Broadcast() signals ALL of the processes waiting on a condition variable.

  3. Can I change the interface of the member functions of Condition and Lock classes?

    No, our test cases (and yours) depend on the standard interface for locks and condition variables, so you shouldn't change them.

Question 2

  1. Our test suite for locks and condition variables doesn't look exactly like the sample output shown as part of the assignment. Is that okay?

    Yes. You can decide what you want the result of an illegal operation should be (e.g., a process trying to release a lock that it doesn't hold). If you decide that this should just be ignored, your output will probably look similar to the sample output in the assignment. If you decide that the errant process should be terminated, your output will look different. And, whatever you do, make sure to document how you decided to design your code and describe it in the writeup so that your grader will understand it.

    However, note that in no case should a process actually be allowed to release a lock that it isn't holding!

  2. Exactly what tests are required for the test suite? How many tests should we include?

    To start with, your test suite should include, but NOT be limited to, the test described in Part 2 of the assignment -- i.e. making sure that processes can not release locks that they do not hold. But you'll have to write other tests, too. Your solution to Part 2 should consist of a set of tests such that all the tests, taken together, exhaustively test the feature set of your locks and condition variables. It is up to you to determine exactly what tests are required -- the design of the test suite is part of the assignment.

Question 3

  1. What should the customers do with their machines?

    After machines are allocated, you can simply print a message (something like "Customer 4 using machines 4, 5, and 6") and then release the machines again.

  2. How do we show that the given allocate() code can fail?

    Remember, in correctly written code, you should ALWAYS be able to insert a "currentThread->Yield()" statement anywhere that interrupts are enabled without changing the correctness of the code. This is really important, so I'll say it again: anywhere that interrupts are ENABLED, inserting a currentThread->Yield() should not change the correctness of the code. (The exact order in which events happen may change, but if your code is correct -- i.e., uses thread synchronization primitives correctly -- the Yield statements should never be able to make the code act incorrectly.)

    You should be able to get allocate() and release() to break with appropriate insertions of Yield statements.

  3. Do we have to worry about deadlock?

    Good question. If you assume that customers allocate one machine at a time, and don't use any of their machines until all machines have been allocated, then there is a possibility of deadlock. However, this problem was really designed to explore race conditions, so we're not going to worry about that right now. You can either have an allocation fail completely if there aren't enough machine, OR, have the customer use whatever machines were allocated successfully, release them, and then allocate more machines to complete the task. Either way, deadlock won't occur.

Question 4

  1. How do we use threads to solve this problem?

    It's up to you. Remember, a big part of this class is learning how to design code. A lot of these problems have more than one correct solution. Whatever you do, MAKE SURE to tell the grader what you did in your writeup! If the grader doesn't understand what you did, he won't give you any points.

    That said, here are some suggestions. One is you can have one thread exist for each Missionary or Cannibal that is there -- in other words, if you have 10 missionaries and 5 cannibals waiting for a boat, there would be 15 threads. Another way you can do it is have a Missionary thread and Cannibal thread; and each one keeps track of how many of its kind of people are waiting -- in this case, there would be 2 threads no matter how many missionaries and cannibals there are.

    These are just two suggestions. You can do something else if you have a better idea, as long as it inolves multiple threads that are communicating via locks and condition variables. (Well, I should say you can do ALMOST anything else... silly solutions will not get points.)

  2. What happens when the missionaries and cannibals get across the river?

    They disappear. The point of this assignment is just to get the missinaries and cannibals arranged into safe boatloads. Once the groupings are determined, you can simply print out a message saying what happened (e.g,. "Boat leaving with Missionary 4, Missionary 9, and Cannibal 1!").


Assignment 2: Multiprogramming and System Calls

General Questions

  • Why does the machine hang when I use the SynchConsole object?

    This is sort of a bug in Nachos (or, call it a feature, or a quirk, or whatever). The machine only halts when there are no runnable threads and no interrupts pending, but the SynchConsole adds a pending interrupt so the machine never halts. You can work around this fairly easily, though: in your implementation of the Exit system call, call machine->Halt() manually if Exit sees that the last thread is exiting.


    Assignment 3: Virtual Memory

    General Questions

    1. Do we need Assignment 2 to be working fully to do Assignment 3?

      You don't need SetPriority(), but you do need all the system calls to work, as well as multiprogramming.

    2. We may need to do a complete redesign of our assignment 2... will that be a problem?

      You can change all that you want; your code does not have to be anything like what you turned in for any previous assignment.

    3. What command-line arguments are required?

      Here are the legal and illegal combinations:
      legal:
      
      nachos -x noff_filname  // defualt NRU
      nachos -P LRU -x filename  // LRU -- NRU, RAND, FIFO also okay
      
      illegal:
      
      nachos -P filename // **ILLEGAL**
      nachos -P LRU noff_filename // **ILLEGAL**
      nachos -P -x noff_filename // **ILLEGAL
      

    Question 1 -- Using the TLB

    1. We are supposed to implement a TLB which is only a partial table of translations. Where do we put the rest of the translations? We cannot use the same page table used in assignment 2. Where else?

      You can use the same page table used in assignment 2. Just like assignment 2, you have one page table for each process. The difference is that instead of handing the entire page table to the hardware on each context switch (which is what you did last time by saying something like pageTable = currentThread->space->pagetable), you instead spoon-feed individual lines of that page table into the TLB, one line at a time, as the system generates page fault exceptions.

    2. But there's an assert statement(s) in translate.cc (func. Translate() line 203) saying you can have either a page table or a TLB, but not both. How can we have both?

      In assignment 2, you probably had one page table per address space, that you keep internally somewhere, AND, one page table that the hardware "knows about". (On each context switch you installed one of your internally kept page tables into hardware.) Now, you're right that the hardware can only have a TLB or a page table that it knows about. This is completely independent of the page tables that you keep internally in your structures. In assignment 2 the hardware knew about one page table. Now it knows about one TLB instead.

    Question 2 -- Virtual Memory

    1. Should we have one swap file per process, or one swap file for the entire system?

      It's up to you.

    2. according to the problem description of Assignment 3, NRU is to be implemented with "a reasonable clock tick". Does that mean we are to apply the clock algorithm?

      No, it means you pick a reasonable time interval in between times when you clear all the "referenced" bits.

    3. I was wondering what the number of ticks should be before we change pages from recently used to not recently used. Could you give me a ballpark range?

      No, it's up to you. If you don't know a good range, do some experiments.

    4. To implement Perfect LRU, we'd need hardware support (i.e., the hardware setting a timestamp on every reference), but the Nachos machine does not have this support. How do we do this without hardware support?

      You can timestamp the pages every time you load them into the TLB, instead of every time they are actually referenced.

    5. Are we allowed to implement the LRU any way we want? In the Anderson notes it says that we should use a linked list (descending order based on age) but this seems unnecessary; I'd rather use an array instead.

      Yes, you can implement it however you like, including as an array.

    6. In project 2, when a new address space was created, we immediately loaded the entire executable into memory. Do we do the same thing for project 3, or do we the load the pages only on a per-need-basis (Demand paging).

      It's up to you. Either one is fine.

    7. Can we swap out a page that is currently being referenced?

      First, remember that a single instruction may reference more than one page of memory. And, when you restart an instruction after it generates a page fault, the entire instruction is re-executed from the beginning including the instruction fetch. So, if you throw out one of the pages that an instruction needs in order to load one of the other pages the instruction needs, it will generate another page fault when you restart.

      Assuming your memory is larger than 3 physical pages, this should never be a problem with with LRU, will usually not happen with NRU, and should never happen with FIFO. With Random, if you are very unlucky this might keep happening over and over again, leading to "livelock" (starvation), so if you like you may explicitly remove the pages involved in the current instruction from the list of candidate pages for ejection from physical memory. However this check is not required by the assignment.

    Question 3 -- Experiments and Test Cases

    1. I can't seperate part one and two of assignment three. In both of them TLB_USE and VM are defined. So, even tests for part one will use the disk. Is that ok?

      Yes, it's okay if your test cases test the entire thing. It is not necessary to write test cases that test the TLB without disk.

    2. For part 3 of assignment 3, it asks for the rows on the matrices to occupy 5 pages. How can we actually tell how many pages a row takes?

      You can easily calculate it based on the number of bytes per page, as defined in machine.h.

    3. What do you mean by "page faults that modify control bits"?

      In general, the class of page faults that are not TLB misses and the page was in physical memory. For example, page faults that convert a page from read-only to dirty + read/write in case you are simulating the modified bit as described in the Anderson notes. Or, page faults that convert a page from invalid to used and valid in case you are simulating a use bit.

      Most implementations don't have these types of page faults, so this statistic will most likely simply be 0.

    Written Questions

    1. What is the 'R' bit?

      On page 108 of the book, at the bottom of the page (section 3.4.2), it says: "R is set whenever the page is referenced (read or written)." So the R bit is the referenced bit for a page.

    Assignment 4: File Systems

    General Questions

    1. Are there test cases available?

      Yes, in fact, there are two sets of test cases available. The first set runs in kernel space, so that you can do this assignment, even if you completely screwed up the second and third assignments. The second set are user-level programs that will work only if you've done all the other projects correctly.

    2. Since we're not required to use VM in this assignment, can we set numPhyPages in machine.h back to 512 instead of 32? If this is not possible then I don't know how we're supposed to do this without VM.

      No, you should not change numPhysPages. If your VM is broken, you can avoid using it by using the test cases that run as kernel threads, instead of the more difficult test cases that run as user programs.

    3. How are we graded?

      You only need to get the kernel-level test cases working in order to get full credit. However, if you get the user-level test cases working too (i.e., get everything from the entire course working together at the same time), you get a 10% bonus.

    4. Where is the Remove system call?

      Remove is already implemented in the kernel filesystem code that we gave you (i.e., the FileSystem::Remove function), but standard Nachos doesn't have a Remove system call. You should add one, the like you did in assignment 2 when you added a SetPriority system call (e.g., by by modifying syscall.h and start.s, and code in exception.cc that calls FileSystem::Remove).

    5. The Remove system call doesn't seem to be working!

      Check out this nice answer by fellow 402 student Rachel Limon, posted to the Class Newsgroup:
      Just because you still see it on the disk does not mean that it is not "removed". The pointer to it is deleted, so to really test it, run the "nachos -l" command to list what is in the directory. It should not be listed. Also, you may try copying a new file to the disk and see if it overwrites what you remove.

    6. The test cases pass two arguments to Create, but the Create system call prototype in syscall.h only takes one argument. What do we do?

      For this assignment, you should change the Create system call prototype in syscall.h to take two arguments -- the name of the file, and the size. Also, in exception.cc, make sure to pass the second argument (the size) to the FileSystem::Create function.

      Once you implement extensible files, your filesystem should work even if 0 is passed as the second argument of Create.

    7. Does a thread need to have a file open for it to be deleted (Removed)?

      No. A thread does not have to have a file open for it to delete the file. A thread can even delete a file that it has never opened.

    Question 1: Building a thread-safe filesystem

    1. Should we have a unique file position for each thread or each process/address-space?

      You should have one file pointer per thread. Note that this deviates from the semantics of a UNIX filesystem that keeps one file pointer per process.

    2. Should operations be atomic at the level of an individual block, or an entire system call?

      THIS IS VERY IMPORTANT -- Contrary to the way the assignment was originally presented, you should make entire filesystem operations atomic (i.e., an entire call to Read or Write). In other words, if one thread starts a Read on a file, no other thread should be allowed to write to any part of that file until the Read completes.

      This is explained in Maurice Bach's book The Design of the UNIX Operating System in the following way (note that a Nachos FileHdr is equivalent to a UNIX inode):

      When a process invokes the read system call, the kernel locks the inode for the duration of the call. Afterwards, it could go to sleep reading a buffer associated with data or with indirect blocks of the inode. If another process were allowed to change the file while the first process was sleeping, read could return inconsistent data. For example, a process may read several blocks of a file; if it slept while reading the first block and a second process were to write the other blocks, the returned data would contain a mixture of old and new data. Hence, the inode is left locked for the duration of the read call, affording the process a consistent view of the file as it existed at the start of the call.

      In other words, when one process is writing to a file, no other processes may read from or write to that file until the first process is finished with the entire write (even if that write modifies more than one disk block).

      Similarly, when one process is reading from a file, no other processes may write to that file until the first process is finished. However, you may allow more than one process to perform parallel Reads on the same file; allowing parallel reads is not required, though, because I think it is a little tricky (you'd have to make sure that a long series of reads won't lead to starvation of a writer waiting to write).

      Note that you must allow parallel reads and writes that are accessing different files. In other words, Process A writing to File A should never block Process B from writing to File B.

    3. Do the FetchFrom and WriteBack functions need to be atomic?

      No, it's really only the filesystem operations -- as seen from the user program's perspective -- that must be atomic. Those functions themselves aren't necessarily required to be atomic. There might be more than one FetchFrom running at the same time, as long as each one is running on a different file.

    4. Does making a function atomic imply that the function cannot be interrupted?

      Your filesystem should not prevent other processes from running while one process is blocked waiting for disk I/O to complete. However, you should, for example, block a process that's trying to start a write, if that write is on a file that has another write in progress by a different thread.

    5. To make these functions atomic, should I disable interrupts?

      You can't disable interrupts when you're using the disk. You should use sempahores, or the locks you implemented in Assignment 1.

    6. When a thread is the ONLY thread to have a file open, can that thread delete the file? Does the thread have to close the file before deleting it?

      A thread can delete a file that is has open. The semantics are exactly the same as if one thread deletes a file that a different thread has open -- in other words, the file doesn't actually "go away" until all threads have closed it.

    7. If a file is Removed (with the Remove system call), but has not yet been actually deleted because a thread still has it open, can another thread still open the file?

      No. When a file is Removed, its directory entry is immediately deleted, which means that no other threads can open that file. However, if the file was already opened by other threads at the time, the resources associated with the file (i.e., its FileHdr, indirect blocks, and data blocks) are not freed until the last thread closes the file. You should keep track of how many threads have each file open, and free the resources associated with the file when the last thread closes it.

      Of course, if a file is Removed and no threads have it open at the time, you can free that file's resources immediately.

    8. The assignment says that more than one thread should be able to read AND write to a file. Read, OK, but write as well??? Doesn't that almost guarantee some kind of conflict?

      You should allow multiple readers and writers in the same file. It is not your problem if that leads to a conflict -- it's the problem of the person who wrote the program. Your job as the author of the filesystem is just to make sure that reads/writes are atomic, not that the reads and writes make sense to the application.

    9. You said that the filesystem operations should be atomic. Does that include Create, Open, Close, and Remove?

      You should be careful to avoid race conditions in any operation that modifies a FileHdr. You should lock a FileHdr before a system call accesses it, and unlock it when the system call is done. This should make everything atomic that needs to be atomic, including Read and Write.

    Question 2: Large files, and Extensible Files

    1. What is an extensible file?

      A file that can grow (in size) beyond the size specified when the file was originally created. If you implement it correctly, you should be able to create files of size 0, and let them grow as needed.

    2. What's the difference between the two parts of this question, asking us to implement extensible files, and large files?

      These are really two separate requirements, that are almost independent:

      1. You should be able to create a single file that is as big as the entire disk. (The current filesystem doesn't let you create files more than about 4K.)

      2. You should be able to create a file of 0 size, and have that file grow dynamically as processes write to the file. (The current filesystem requires that the file be created with a fixed size that never changes.)

    3. Do we have to worry about files that are decreasing in size?

      No. For this assignment, you can assume that files will only get bigger.

    4. How do we implement extensible directories?

      Remember that a directory really is just a file (that contains a list of other files, and pointers to their FileHeaders). The directory-file should be able to grow like any other file.

    Question 3: Optimizing Filesystem Performance

    1. Nachos does not seem to have a -DREALISM flag. Where is it?

      Sorry, I did not quite explain this correctly in class. Nachos always tries to model the performance of the simulated disk by computing the read head seek time, rotational latency, and other factors. You do not need to add a flag to turn on this simulation support. -DREALISM is a flag that you should add to your code, surrounding the code that you write for this question.

    2. Does it count as an optimization if we allow block-by-block atomicity (like the original assignment, before the clarification)? This should be faster because multiple reads can happen in parallel.

      No. This is not "just an optimization" -- it will also change the semantics of the Read and Write system calls. An optimization should never change semantics.

    3. Just how much optimization work do we have to do in order to get full credit?

      The professor writes:
      I tried to make the question as clear as possible on this point. It says, here are some possible approaches, they can make a design decision to choose one and then implement it.

      In terms of grading guidelines, what I had in mind was the following. Implementing any one of the approaches correctly gets you 3/4 of the marks. Picking the one right one for the benchmark and explaining why gets you another quarter. (Its a sequential testcase, that runs just once through a file, so the sequential disk block laying appraoch will work best). And we can give a 10% bonus for implementing each additional approach and doing the comparison between them.


    This page will always be under construction; last updated 17 Feb 2000.

    The University of Southern California does not screen or control the content on this website and thus does not guarantee the accuracy, integrity, or quality of such content. All content on this website is provided by and is the sole responsibility of the person from which such content originated, and such content does not necessarily reflect the opinions of the University administration or the Board of Trustees