Chapter 2

Project Phase 1

1. Introduction

In this phase, you are given a skeleton source code of a simple operating system and asked to complete it. This implementation assumes many programs are running on the computer at a time, and that they reside in main memory starting at location 0. The implementation works with a version of the Z502 machine with no address translation hardware of any kind. Memory management is the focus of Project 2. The functions that the kernel supports in project 1 are multiprocessing, interaction between these processes, and some simple error handling.

This chapter contains a vast amount of information about operating systems in general, and your system in particular. It concludes with a description of what, specifically, you are to do.

2. The Basic Interface

In project 1 your operating system will provide several services that fall into two categories: those that the user program requests and those that the hardware signals.

2.1 Kernel Service Requests

As far as the code you must write for project 1 is concerned, all services the user program can request act practically the same. Consequently we only describe one such request in detail here.

What must a user program do when it wants to get the current time? Notice that there is no instruction in the user mode instruction set (see Appendix C) that causes hardware access directly (or any I/O for that matter). This is because hardware access is only managed by the operating system. In fact, the I/O instructions are only part of the kernel instruction set. Consequently, the ability to read the clock is considered a service that the operating system must perform on behalf of the user program.

To request the service of reading the clock, the user program communicates its request by executing a SOFTWARE_TRAP instruction (UUO, SVC, POP, EMT, or KSR instruction on other computers). When the Z502 CPU encounters such an instruction, it generates a trap and switches to some operating system specified routine to handle it. Let's explain a little more about what happens when a trap occurs.

Whenever an exception occurs, the Z502 hardware always switches control of the CPU to an exception handler routine that is part of the operating system. Which routine to switch to depends on the type of exception. The addresses of all the exception handler routines are stored in the set of TO_VECTOR registers. (These registers are described in Appendix A, section 3.) Therefore, when the SOFTWARE_TRAP exception occurs, the Z502 hardware invokes the exception handler whose address is stored in the register TO_VECTOR[TO_VECTOR_TRAP_HANDLER_ADDR] . (The hardware also does other things when an exception occurs. Consult Appendix A, section 6 for details.) Part of what you have to do in Project 1 is to write this trap handler routine that is invoked when a SOFTWARE_TRAP instruction is encountered.

What this trap handler routine has to do is as follows. The operand field of the STAT_VECTOR register contains a number that indicates what service is being requested (a complete list is in Appendix C). Based on this code, the SOFTWARE_TRAP handler should simply call the proper service handler routine to provide that service. Those service handler routines that are required in Project 1 must be written by you to support various system calls. Let's go through one of those calls in detail.

Suppose that the service requested was a SLEEP. The service handler does some argument shuffling and some poking about in its data structures and finally decides to initiate the I/O instruction. It does this by calling Z502_DELAY_TIMER. Z502_DELAY_TIMER is a kernel I/O instruction that merely starts the I/O operation, thus when the function (the Z502_DELAY_TIMER instruction) returns, the operation has not necessarily been completed (refer to Appendix A, section 5 for a detailed description of Z502_DELAY_TIMER). Most I/O operations occur asynchronously and in parallel with respect to the central processor. An I/O exception will be generated by the Z502 I/O circuitry at some later time, when the I/O operation has completed.

Meanwhile, the operating system has control over the CPU. The service handler now has nothing to do until the time interval completes. (The definition of the "interval timer" service is such that the USER program may assume that when it gains control over the CPU again the service has been completed.) Consequently the service handler calls the routine

WAIT_FOR_INTERRUPT ( <reason> )

where <reason> indicates what is being waited for (in this case the symbolic constant TIMER_INTERRUPT, the only possibility in project 1). WAIT_FOR_INTERRUPT is written by you. It is a routine within the operating system that, when called, takes the CPU away from the caller, after remembering where to come back to the caller when the event indicated by <reason> does occur.

What should WAIT_FOR_INTERRUPT do then? If there is anything else that can be done while waiting for the timer to finish, then WAIT_FOR_INTERRUPT should ensure that it gets done (waiting for the timer to finish is a perfect time to wash a few windows).

We now digress for a moment to discuss one of the most important aspects of most operating systems: sharing of resources among more than one process. When one process has to wait for any reason (e.g. for an I/O operation to complete) the operating system should allow another process to run if possible. Changing from one process to another is called a process switch. Switching processes involves saving all information not normally saved by a context switch, e.g. a process id number, and restoring this same information for the new process. The place where process state information is stored is often called a process control block or PCB. In the Z502 machine, the routine switch_context automatically saves all of the register information described in the context structure. You are responsible for anything else that describes the process. While one process is waiting for an I/O interrupt, another process can be switched in to make use of the idle CPU. You must be careful to save all of the information associated with the process waiting for the interrupt, so that when it finally occurs, that process can be properly restarted from where it left off. Your operating system must also be able to handle the case where a process switch from process A to process B occurs, and then another switch occurs from process B to process C, and then finally the CPU gets switched back to process A.

Remember that I/O operations may not be serviced in the order requested. Matching an I/O response with a request is a device specific task.

WAIT_FOR_INTERRUPT must determine if some other process is ready to run by calling the dispatcher. If there's no job to be done, then WAIT_FOR_INTERRUPT should call Z502_IDLE(). The idea is to switch the CPU into idle context, which causes neither the user program nor the operating system to execute until some external event (for instance an interrupt) occurs. Let's now make a small diversion to explain this instruction. (It is also described in Appendix A, section 5.)

Whenever the Z502 CPU is executing code, either the user's or the operating system's, we say it does so in some context. A context is a state of the CPU. It includes the registers, the program counter, the local variables, and the status. While in the middle of performing a task, perhaps a service for an exception or trap, it is often nice to go off and do something else while waiting for things to complete. (This is when WAIT_FOR_INTERRUPT should be called.) At some later time you would like to come back and complete the service at the appropriate time. Thus the Z502 kernel instruction set includes an instruction that lets one context put itself on hold (so to speak), later to be restarted from where it left off.

The Z502 CPU maintains a register called Z502_REG_CURRENT_CONTEXT that contains a pointer to an image of the context for the currently running code. It is this pointer that the operating system must save if it ever wants to restart the context after it has been stalled. To cause a context to be put on hold you should call the function switch_context. For example, if a context A wants to be put on hold then it should store a copy of the value of Z502_REG_CURRENT_CONTEXT (that points to its own context) somewhere before it calls switch_context. That way, when later on some other context, say context B, decides that it is time to resume context A then it can call switch_context using the value of Z502_REG_CURRENT_CONTEXT stored by context A (which points to context A). Your routines WAIT_FOR_INTERRUPT and RESUME_PROCESS should make use of the switch_context instruction to implement this type of processing. (RESUME_PROCESS is discussed later.) Appendix A, section 5.3, explains in detail the steps taken during a context switch, and describes the primitives to handle contexts. We encourage you to read and understand that section after reading this chapter.

At this point in our discussion of a service request we are now, in the simplest case, in the idle context with the SOFTWARE_TRAP service context currently on hold. What happens next? Well. . .

2.2 Hardware Signals

Hardware signals occur asynchronously and are termed exceptions. You will also have to handle software traps that occur synchronously in software as explained above. A hardware signal causes the Z502 CPU to invoke your operating system via the routines you specify in the vector TO_VECTOR. We have already seen one such signal, SOFTWARE_TRAP, that occurs when the user program executes a SOFTWARE_TRAP instruction.

Another interrupt in our example is DELAY_TIMER. This happens when the requested I/O operation to the timer completes its action. For instance when the timer initiated with Z502_DELAY_TIMER o-so-many paragraphs ago finishes, an operating system routine is invoked from TO_VECTOR[TO_VECTOR_INT_HANDLER_ADDR] . You will have to supply this routine that directs what happens when the exception is received.

Your interrupt handler routine at this point must decide which device interrupted and call a handler for that device. That handler in turn determines which context is waiting for this exception and sets up that context so it may complete its service. This should be done by calling

RESUME_PROCESS (<reason> )

where <reason> is the same type of argument as for WAIT_FOR_INTERRUPT. Remember, RESUME_PROCESS is also written by you. RESUME_PROCESS must decide which context is to be resumed, what its value of Z502_REG_CURRENT_CONTEXT was and then must do the resumption. Resumption means making a process ready to run so that later on the dispatcher will run it. Once the dispatcher does run, the resumed context continues onward. Unfortunately the user process starts up at its initial entry ( at the start of the routine; sorry that's one of the hacks in the Z502 hardware.)

When the operating system finally switches to the user's program's context, the user's program resumes execution assuming that its service request has been satisfied. Gee, an awful lot of things happened simply to SLEEP!

The following list shows the major steps taken to service a SOFTWARE_TRAP request. Let us explain those steps briefly to review what was said in sections 2.1 and 2.2.

The user's code is running and Z502_REG_CURRENT_CONTEXT points to an image of the user's context.
The user issues a system call that causes a SOFTWARE_TRAP. The address of the SOFTWARE_TRAP handler was obtained from TO_VECTOR[TO_VECTOR_TRAP_HANDLER_ADDR].)
The trap handler initiates the I/O service and calls WAIT_FOR_INTERRUPT. WAIT_FOR_INTERRUPT stores a pointer to the handler's context in a variable and calls switch_context to put the CPU into some other process' context. Z502_REG_CURRENT_CONTEXT now points to the newly-running context.
When the I/O service is completed, the device causes an interrupt. The device handler (its address was stored in TO_VECTOR), starts running. Z502_REG_CURRENT_CONTEXT continues to point to the context of the process that was interrupted (WATCH OUT - this could be dangerous.)
The device handler calls RESUME_PROCESS that puts the context on the ready queue.

2.3 Resource Sharing

Since more than one process can be active in the system, it's possible for more than one process to request service concurrently. Your kernel must make sure that:

No more than one request for the disk can be issued at a time; otherwise, a fatal I/O error occurs, and simulation aborts.
The CPU is shared equitably among ready processes. The CPU is allowed to idle only if there are no processes ready to run. This means for instance that given a scheduling choice of running another process or waiting around for a timer or disk to complete, you should choose running another process.

Because of the first requirement, you have to design and implement a basic synchronization mechanism and to maintain a queue of processes waiting for access to each device. For the second requirement, you have to maintain a queue of processes waiting to run on the CPU. You should use the Z502 timer to prevent any process from hogging the CPU. The DELAY_TIMER can be used to set an interval timer on user processes for cpu scheduling. A DELAY_TIMER interrupt is generated when the value of TIMER reaches zero. You are free to design your own dispatching policy.

One last note. While it might seem that the Z502's very high level instructions for context switching are not very realistic, they do exist on real machines. At the heart of almost all operating systems is a layer just above the hardware layer that provides some set of concepts and instructions. These extended services are often "hardware assisted" through careful design of the instruction set, and on some machines fully implemented in micro-code.

2.4 Schedulers

There are lots of possible variations on schedulers you could include in this project. Required are only two:

First Come First Served ( FCFS ): services next the process that has been waiting the longest.
Simple priority scheduling: services next that process having the most favorable priority.

You can write other schedulers if you want, but these are required. The toughest part of designing a scheduler is in writing the testing and reporting mechanism which will assure to the reader that the schedulers are operating as expected.

2.5 Process Management Interface

A number of process management system calls are defined. They are listed here, and are detailed in Appendix C. It is recommended that you implement these calls in the order listed; it may be that some of the calls are beyond the time resources you have available. However, do note that several of the calls are needed for project 2, and you will be stuck if they aren't implemented.

Again, you are urged to use simple rather than complex designs. It's best, given a choice, to have the calls working correctly rather than elegantly.

Essential in Project 1 ( many used in Project 2, also )

GET_TIME_OF_DAY
SLEEP
CREATE_PROCESS
GET_PROCESS_ID
TERMINATE_PROCESS
SUSPEND_PROCESS
RESUME_PROCESS
CHANGE_PRIORITY
SEND_MESSAGE
RECEIVE_MESSAGE

Implemented in Project 2:

MEM_READ
MEM_WRITE
DISK_READ
DISK_WRITE
READ_MODIFY
DEFINE_SHARED_AREA

3. Provided Routines

Some routines reside in the supplied code. The base operating system provided to you consists of base.c and scheduler_printer.c.

The code for the Z502 simulator can be found in z502.c Read it and enjoy it if you wish. When the written instructions and descriptions differ from the code, believe the code. It is hoped, however, that you won't need to spend much time reading the simulator code since this document should describe everything you need for your interface.

3.1 Initialization Routines

The operating system is initialized by a call to os_init, a semi-provided routine residing in base.c. Os_init sets up the TO_VECTOR exception vector registers, and defines a single user process as the first to be run. You may well want to add to this code to get it to do more. A brief description of the initialization steps follows:

The TO_VECTOR exception handling vector is filled with the names of routines to call for each possible exception. (Names and routines to handle three possible exception types must be supplied by the student.)
Structures to support a scheduler, etc. are created.
A context switch is done to the user process, causing it to begin execution.

Each of these steps is conceptually very simple, but the process as a whole needs to be well understood in order to implement later projects. See also Appendix A, section 10.

3.2 Termination Routines

There are two types of termination: termination of the user process and termination of the system. A process termination can be due to either a TERMINATE_PROCESS Software_Trap or a process error (illegal instruction, illegal page reference, etc. ). In either of these situations, an exception is generated and the handler gives control to the provided routine. It is NOT legal in a user program to simply "return" or reach the end of a routine; the simulation will end if you do so.

A call to Z502_HALT() halts the simulation. This might occur when all user contexts have been terminated, and is under OS502 control.

3.3 Scheduler Printer

A routine has been provided that prints what's going on with the processes in various states within the scheduling mechanism. In essence, you make numerous calls to the printer manager, defining the various items you want to print. Then, when you've defined everything, you give the command to actually do the print. You should understand that this routine may be too painful for you to use; it's supplied so as to be an example of a detailed printout - modify it any way you wish. Details on its usage are given in Appendix D.

4. Project Phase 1 Assignment

Your assignment, which you've already decided to accept, is to provide a design document, source code, and test document as defined in the Student Manual. The subject of these documents is described below. The correct approach here is to do what it takes to get the test programs running. For instance, test1a requires that you provide several system call handlers; as you go on to other tests you will need interrupt and fault handlers, and also a scheduler. Specific milestones include:

Understand the Z502 machine architecture and study the code provided for you in this phase. Pay special attention to how the SWITCH_CONTEXT kernel instruction works and what exactly the machine does when an exception occurs. You will be writing a scheduler that can do both FCFS and priority scheduling; it will be based on this SWITCH_CONTEXT instruction.
Modify and add to the given code to dispatch SOFTWARE_TRAP's and to deal with I/O exceptions, initializing the TO_VECTOR register elements corresponding to these exceptions with the entry names of your own handler functions You are required to write functions that will handle:
- The hardware interrupts generated by the completion of a DELAY_TIMER.
- The faults generated by the Z502 processor.
- The execution of a SOFTWARE_TRAP instruction by a user program.
Support for the process related user system calls; the memory commands are part of Project 2.
Write WAIT_FOR_INTERRUPT and RESUME_PROCESS, described above, and other pieces in order to make a scheduler. You will also need a dispatcher to implement both FIFO and priority driven schedulers.
Run supplied tests and write some of your own.

5. Some Final Advice

Project Phase 1 requires a fairly substantial piece of implementation. It might well require several thousand lines of code (not counting blank lines or comments) in addition to the base operating system given to you in Project Phase 1. We encourage you to get started early, both on your preliminary design specification and your coding.