Lecture 5 - Process Scheduling; Protection: Kernel and Address Spaces

CSCI402 - Operating Systems

Lecture 5 - Process Scheduling; Protection: Kernel and Address Spaces

Administrivia

Project 1 is due February 14, Sunday, by midnight. You can download the details from the class web site.

The mid-term is March 12th. I don't have a room number, yet.

I have typed up my lecture notes for the fourth lecture. They are available from the class web site.

Readings for this week

Sections 2.4, 3.1

This lecture covers Tom Anderson's lectures 11 and 12.

Today's Lecture

Process Scheduling

Definitions

Scheduler

The Scheduler's job is to decide who gets the CPU next.

Scheduling Algorithm

The algorithm used by the Scheduler to decide who gets the CPU next.

Policy

The rules used by the Scheduler to decide who gets the CPU next.

Mechanism

How the Scheduler has implemented its Policy. Note that the mechanism can be parameterized (parameters that can change) or fixed.

Goal of the Scheduler

Note that some of the goals listed above conflict. As batch jobs tend to be longer than interactive jobs, then minimizing the turnaround time for long batch jobs means that the response time for interactive users may increase. Also, it will more than likely reduce the number of jobs that are completed each hour.

The big questions then is...

How Long To Allow A Job To Have The CPU?

Nonpreemptive Scheduling

Early batch systems ran each job to completion. There was little need for a scheduler, other than being able to detect when a job finished, so that the next job in the queue could be started.

Preemptive Scheduling

The Scheduler interrupts jobs when they have had their share of the CPU. This is done today because the number of interactive jobs far exceed the number of batch (background) jobs. Most systems have a clock that interrupts the CPU 50 to 60 times a second (or less). Jobs typically receive from 10 to 100 milliseconds of CPU time before they are preempted. You don't want to interrupt the CPU too often as there is overhead associated with context switching.

Assumptions

CPU scheduling was a hot research topic in the 1970s. Because of this, all of the good scheduling algorithms are known (provably optimal). The algorithms typically make the following assumptions:


However, these assumptions are not true in most cases today, however, simplifying the problem often allows us to derive a solution that still works.

Scheduling Algorithms

Let's look at some scheduling algorithms and see how well they meet the goals of the Scheduler.

First-In First-Out (FIFO)

FIFO was used in early batch systems of the 1960s. It runs a job until it is done. This algorithm is completely fair, as it is like standing in line. The job at the front of the line receives the CPU until it is finished. Then the next job gets the CPU.

For example, say we have 5 jobs. Job 1 takes 24 hours and jobs 2 through 4 take 10 minutes. We run the first job for 24 hours and then run the next 4 after that. The average time to complete all 5 jobs is 24 hours 20 minutes. Not very good!

Round Robin

Round Robin is a preemptive scheduling algorithm that uses a timer to interrupt the CPU after a certain amount of time. The Scheduler maintains a queue of runnable jobs. The first job in the queue is executed until preempted. It is put at the end of the queue and the next job is executed.

This is also a fair algorithm as jobs receive essentially the same amount of CPU time. However, this can be a very inefficient algorithm if you give every process the same amount of time.

Comparing Performance of FIFO and Round Robin

Assume that we have 10 jobs, each of which takes 100 sec to execute. Let's compare the average turnaround time for the 10 jobs between FIFO and Round Robin.

 Job Completion Times
Job #FIFORound Robin
1100991
2200992
3300993
4400994
5500995
6600996
7700997
8800998
9900999
1010001000
Average turnaround time550 sec995 sec

FIFO and Round Robin Summary

Round Robin is good for handling short jobs. With jobs that are all the same length and for long jobs, Round Robin is not good, as you end up with a lot of context switching and jobs don't complete for a long time because of all the context switching.

You can also see that when all jobs take the same length, FIFO is optimal. However, the chance of this happening is very small.

Shortest Job First

There are a number of variations to SJF. The most common are shortest time to complete first and shortest remaining time to complete first. The goal of this class of scheduling algorithms is to get the short jobs out of the computer as quickly as possible.

The algorithms are most efficient, but least fair. This is because longer jobs will not be executed if there is a shorter job. In the worst case, the longest jobs would never run. Definitely not a fair algorithm.

Comparing Performance of SJF with FIFO and Round Robin

Example of Performance between FIFO, Round Robin and SJF

We have 3 jobs - A, B, and C. A and B are CPU bound jobs that take 1 week of computer time to execute. C is an I/O bound job that executes in a loop of 1 msec of CPU time and 10 msec of disk I/O. Essentially A and B require 100% CPU time and C requires 10% CPU time.

FIFO

We run A for 1 week, B for 1 week and then C runs. So the person wanting C to execute won't see any results for more than 2 weeks.

Round Robin

We will use a 100 msec time slice for our scheduler. Since C only requires 1 msec of CPU time, the 3 jobs will each run about 5 times in 1 second (500 msec for A, 500 msec for B, and 5 msec for C - close enough to 1 second for us). Thus, there is only about 5% utilization of the disk (5 disk accesses for 10 msec each during 1000 msec).

C will definitely finish sooner than in the FIFO case, however, A and B will both finish within 100 msec of each other, somewhere around 2 weeks after the jobs were submitted. It would be nice to be able to get a higher utilization of the disk, however.

SJF

There is really no need to preempt C, since it is going to interrupt itself every 1 msec of CPU time. In addition, when a disk I/O for C is complete, we can force a hardware interrupt of the CPU, causing A or B to release the CPU. This is what SJF can do for us.

What happens is that when C is ready to run its CPU job, it preempts A or B, runs for 1 msec, posts an I/O request and goes to sleep. One of the long jobs gets the CPU for 10 msec until C's I/O request is fulfilled. This process repeats until C completes.

With this scheme, we are still keeping the CPU 100% busy and we also get 90% usage of our disk. Much better than either of the other methods.

SJF Summary

Disadvantages

From the previous example, SJF seems to be wonderful. However, there are some problems with a pure SJF scheduler.

IBM used a scheduler on its older mainframes that required you to estimate how long your job would take. It based its scheduling on the value you provided. If you underestimated, your job was dropped and had to start over from the beginning. This led to serious over-estimating to avoid having your job killed. Thus, the optimality was lost because the scheduling was not based on the real time required to complete a job, but on a over-estimated time.

Advantages

The biggest advantage is that SJF is optimal for average response time and the number of jobs completed in an hour.

Knowledge of the Future

We can't always predict how long a job will take, but we can use the past history of the job's execution patterns to derive an estimate of the time required for the current job session.

Multilevel Feedback

The job time for the current session of the job is based on past performance. Jobs that only require short amounts of CPU time will tend to continue that pattern and jobs that are CPU bound tend to stay that way. Priority scheduling is based upon the estimates calculated.

Adaptive Policies

Scheduling decisions are based on job prior activity in some fashion. Thus, the priority of a job can change over time.

Multilevel Feedback Queues

Multilevel Feedback Queues were first implemented in CTSS (MIT). CTSS was the first real timesharing system. The problem with CTSS was that it could only have one process in memory at a time. This meant that a context switch required the entire running process to be swapped out to memory and the next process loaded in its entirety into memory. Because of this, the designers of CTSS needed to minimize context switching.

They developed an adaptive policy. They have multiple priority queues. Each priority queue used round robing scheduling within the queue. Jobs of lower-level priority would only run if there were no jobs of higher priority waiting to run.

By itself, this doesn't help the context switching problem. What the CTSS designers did that helped was to exponentially increase the time slice as the priority level dropped. The highest priority queue got a time slice of 1 unit. The next highest priority received a time slice quantum of 2. Each lower level priority queue had there time slice quantum doubled.

Which Priority Level for Jobs

So how do we decide which priority queue to place jobs? All jobs start in priority level 1. If a job uses up all of its allotted time (gets preempted based on using all of its time slice), it is bumped down one level. Thus, the next time it runs, it gets twice the CPU time. For CPU-bound jobs, they will work their way down the priority levels getting longer and longer pieces of CPU time.

This worked as far as minimizing context switching, however, during busy times, CPU-bound jobs would not get executed. There was no way to force a job to run as long as there were higher priority jobs available.

Scheduling for Fairness

Let's emphasize fairness over efficiency.

Adjust Job Priorities Over Time

One way to be more fair is to increase the priority level of jobs the longer they take to complete. Thus, big jobs that might start out at a lower priority would have their priority increased over time, giving them a larger share of the CPU time.

The problem with this mechanism is how to determine how fast priorities should increase. If they increase too fast, you will give too much CPU time to long jobs, at the expense of the shorter jobs (like interactive ones). In addition, if the CPU is very busy, all of the jobs will have increased priority so that new jobs entering the computer will be at the bottom of the queue.

Lottery Scheduling

Let's give every job some number of lottery tickets (that we don't take away). At each time slice, the Scheduler randomly picks a winning ticket. The priority level of the job is determined by the number of lottery tickets a process has. On average, the amount of CPU time a process gets is proportional to the number of lottery tickets they started with.

How do we assign tickets to jobs?
We can approximate the SJF scheduling algorithm if we give short jobs more tickets than long jobs. To avoid starvation, every job gets at least one ticket.

The advantages to this method is that as the load increases, long jobs are not delayed as much as in SJF (because they have at least one ticket). In addition, when jobs are added, or complete, the effect on the remaining jobs is spread equally. Everyone has their chance of executing reduced/increased proportionally.

Queueing Theory (When do you buy a faster computer?)

Queueing theory helps to predict response time. Response time is based on the number of users, the speed of the CPU and the speed of the disk drives. You want to buy a faster computer when queueing theory can show you that you will improve response time.

How does response time vary anyway?

Worst Case: All users submit jobs at the same time

Response time in this case decreases linearly with the number of users and increases linearly with a faster CPU.

Best Case: Each user submits a job when the last job finished

Response time does not change with increasing users until you reach 100% CPU utilization. Once that is reached, response gets linearly worse like in the worst case.

Protection: Kernel and Address Spaces

The goal of an OS is to keep processes and threads from hurting each other and from hurting the OS. Most OSes accomplish this through a protected kernel mode and an unprotected user mode. In kernel mode, you have access to all of physical memory. In user mode, you only have access to the memory allocated to your process. This is called your address space.

Operating System Organization

To better understand where modern OS protection schemes come from, we need to review some of the older techniques, some of which are still in use in desktop computer OSes.

Uniprogramming Without Protection

The OS only allowed one process at a time in memory. Your program could only hurt itself. There was no need for protection (what fun is it to hack your own program, anyway?).

Multiprogramming Without Protection

This is also called a Linker-Loader. When a program is copied into memory, all addresses are changed to reference addresses appropriate to where the program is loaded into memory. This takes time to convert all the addresses before you can start running your program. This must be done by software, as well.

Multiprogrammed OS With Protection

To successfully implement a protection scheme, we need help from the computer hardware. If our solution was solely dependent on software, then some condition could arise where a user program could break through the protection and change the protection software. We will also take advantage of kernel mode vs. user mode.

Address Translation

We said earlier that a program's address space is all of the memory that a program has available to it. Protection involves restricting the memory that a program can access. We will use the concept of address translation in our protection scheme.

Address translation will help because there is no way for a program to address any memory outside of its address space. We will use an MMU (Memory Management Unit) that will sit between the CPU and physical memory. Any address requested by a user program will be "translated" by the MMU to a location in physical memory. The MMU will know what is a valid address for a user program and what is not. Any invalid request will cause a trap to the OS, generating an error, or some other response by the OS.

We will see in next week's lecture how address translation can be implemented.

Dual Mode Operation: Kernel Mode vs. User Mode

In user mode, there is no access to physical memory. All addresses are translated by the MMU. In kernel mode, there is no translation. Kernel-mode programs (like the OS) have access to all physical memory. Nachos, Unix, and most OSes run in kernel mode.

Isses with Dual Mode Operation

Kernel Mode to User Mode

When the OS executes a user program, it creates a thread to do the following:

User Mode to Kernel Mode

How does a user program get back into the kernel so that it can do things like printing and reading/writing files? There are 2 ways a user program can access kernel mode:
Voluntarily: The program makes a system call. System calls are special instructions that jump to the OS. Essentially, your program is asking the OS to do something for you.

Involuntarily: A hardware interrupt or program exceptions cause the kernel to activate.

How Do System Calls Work?

The activities of a system call are mostly atomic (we generally don't want to interrupt them). The following things are fairly standard for system calls:

If we could interrupt a system call at the right time, we could have our user program run in kernel mode and really have a good time killing other processes and the OS.

Context Switching Between Programs

When we switch between programs we need to save/restore all of the registers (you have no way of knowing what registers someone else used), save/restore the address translation table and jump back to the value of the program counter.

System Call Arguments

System calls use CPU registers and can also write to physical memory. User program addresses are translated through the MMU, but kernel addresses are not.

Communication Between Address Spaces

Programs do not share memory (no sharing of address spaces). Thus all communication between processes (inter-process communication) is done through system calls. Once you allow processes to communicate with each other, you allow for the possibility of bugs. Just like system calls must validate the input, when sharing information between processes/threads you must validate the input. Most bugs in commercial software arise because someone assumed the input data would have certain characteristics.

Example: Application - Kernel Interaction: Shells and Unix Fork

A shell in Unix is a user program that prompts users to enter commands. The shell executes a system call to perform the command entered by the user.

Running a program on Unix causes the following to happen:

OK, what's a fork and an exec in Unix, anyway? A fork (in Unix, anyway) creates an exact copy of the current process (including the data and variables already created); it does not change the program that is to be run. The process executing the fork is stopped and the new process is put on the ready list.

An exec, on the other hand, does not create a new process, it just causes the current process to run a new program.

So why do we care about this, anyway?

Unix Does Not Support Threads

Because Unix does not have threads, there is no direct way to communicate between processes. Since we can't share address spaces, we need some other way to share information between processes. The Unix designers decided to split the fork and exec processes to simulate the sharing that can be done with threads. Since a fork creates an exact duplicate, all of the data available to the parent process is available to the child process created by the fork.

The disadvantage to this mechanism is that the child process has no easy way to give its results back to the parent. Since the information is copied, the parent cannot directly see what data was created by the child process. You can pipe the results to the parent (essentially like reading from a file), or physically store the information in a database or separate file.

What About Nachos

Since Nachos does support threads, it doesn't need to perform the same trick as Unix does. Nachos combines the fork and exec into a single operation.

The University of Southern California does not screen or control the content on this website and thus does not guarantee the accuracy, integrity, or quality of such content. All content on this website is provided by and is the sole responsibility of the person from which such content originated, and such content does not necessarily reflect the opinions of the University administration or the Board of Trustees