CSCI201 - Software Development

The Curriculum

January 29, 2012

1      Introduction

Software development is about building applications that run on computers. It is much more than algorithm design, selecting a programming language, and coding. Those are important, but for larger applications we need a larger perspective, or system view. And we must worry more about the creation process, often called software engineering.

To emphasize the system view, this class will take a system’s approach to teaching you about software development.

1.1    What is a system’s approach?

Large software applications involve interacting systems of:

·         stakeholders who have a need for an application;

·         developers who together design and build the application;

·         software components that together are the application;

·         maintainers who deploy, fix, and change the application;

·         environments in which the application is built and in which the application  runs.

Stakeholders, developers, components, maintainers, and environments each have constraints and hierarchical relationships.  A system’s approach sensitizes you to the sophisticated, subtle, deep thinking required to succeed in a complex world. As we’ll see, even job descriptions can cause problems.

1.2    How this class will work

Young computer scientists learn about system building by experience. Very little is theoretical. This class is about software development, but the larger issues of software engineering will always be in focus. Though you later will have formal classes in CSCI477ab, the software engineering needs of this class are covered in section 3.

You will be taught about writing code that is part of a system by immersing you into large bodies of code. The approach is more like mentoring than teaching.

1.2.1    Immersion into existing projects

You will be asked to enhance existing projects. You are going to be thrown into them without knowing (except through documentation) how the systems got to be the way they are. Our motivation is simple:

·         Young programmers don’t deal much with design.

·         They are often thrown into existing projects without much handholding.

·         They must learn how to read and analyze code.

·         They must learn to work with many different computer systems.

Along with class lectures to fill in gaps, you will leave this class as programmers who will care more about how the system works than just how the code works.

1.2.2    The Class Applications

You will have problems for different applications. The goal is to sharpen your software development skills by immersing you in “large” projects that you could never do by yourself in just one semester. Each application emphasizes different computer science themes:

Restaurant Application -- Concurrent application in which Customers go to a restaurant to get fed. The starting point is a version with customers, waiters, a cook and a host. You will add a cashier and a market.

 

Factory Application -- You will be given a specification of a factory floor cell that has various components. You will build the entire control structure and animation.

 

Both of these applications depends on agents. See the agent roadmap for details. We have versions of the agents in Java and C++. Agents  work through concurrency and messaging; the skills you will develop include encapsulation, testing, and refactoring;

2      Recurrent Themes in Problem Solving and Programming

Solving problems is where the money is. It is no surprise that many of the themes of general problem solving find their way into programming. We will consider two themes: (1) Divide and Conquer, and (2) Reuse.

2.1    Divide and Conquer

Many problems seem hopelessly complicated. Fortunately, they can often be divided into a collection of subproblems whose solutions together conquer the original problem. Divide and conquer has many benefits:

·         Easier - The subproblems may be “easier” to solve, i.e. less time, less space, or less complexity.

·         Concurrency - The subproblems can be allocated to different processors/people for concurrent solution and execution.

·         Reuse - Maybe the subproblems are common and their solutions can be reused in another application. Reuse will be the subject of the next section.

And I’m not just talking about programming. For example, building a car seems a formidable, almost impossible task. But, if you think about designing a body, an engine, and a drive-train separately and then putting them together, the problem is approachable. For our above three advantages, the car example offers:

Easier – An engine is easier think about and design than the whole car. One key issue will be integration, i.e., assembling the components into the whole. Of course, the design of the parts contributes directly to the ease of integration.

Concurrency - Notice also that you could have three separate teams working on each part at the same time. They need to interact somewhat during the design, and much more during integration.

Reuse – A particular engine might fit into different models of your car line.

For a software example, consider designing an automated teller machine for a bank. Almost any first pass design will divide the problem into three parts: 1) the physical ATM machine, 2) the central database of accounts, and 3) the network connecting them. Or did you have a different decomposition?

How you divide the problem is not always obvious. Much of this skill comes from experience. Besides just trying to design functional, buildable parts, you might consider trying to design (1) interesting, reusable parts that are (2) easy to fit together.

Building any of the components mentioned above is a sophisticated problem in its own right. Each in turn yields to divide and conquer techniques as well. In other words divide and conquer is naturally recursive.

2.2    Reuse

Reuse is the “Holy Grail” of programming and many other enterprises. The whole discussion of standards is about reusability. The effort and expense to develop working solutions forces us to think about reusing solutions. How would we develop anything complex if we always had to build everything from scratch? Think about how knowledge and science are based upon building on the ideas of the past. That’s a part of reusability.

2.2.1    Classes, components, subsystems, and applications

The reusable objects of programming fall roughly into four categories:

·         Class – Lowest level reusable entity, e.g., the collection classes. Before object-oriented programming, reusable libraries of functions were the lowest level.

·         Component – A class that has semantic value to an application, e.g., an employee class in a business system.

·         Subsystem – A component that is free running, i.e., has its own thread of control, but is generally not useful by itself. For example, a data base server.

·         Application – A free running object that completely solves some problem.

As you move up in complexity, the objects do more, but are harder to reuse. [Make sure you understand that.] When I don’t care about the distinction, I will use the term component to mean anything that can be part of an application (including another application).

2.2.2    Inheritance

Inheritance, as we know it from object-oriented programming, is a separate feature of reusability. A key part of working with components of any type and in any domain is understanding them:

·         knowing what they do;

·         knowing any constraints they have;

·         knowing how they interact;

·         knowing how much they cost.

Relating two components by an inheritance-type statement, simplifies your understanding of them. Consider:

If I tell you that B is a kind of A but that B handles two more cases,

and if you already understand A,

then you only have to understand the two new cases in order to understand B.

One of the constraints of object-oriented programming is to force you into thinking about classes and inheritance. This extra work benefits reusability.

2.2.3    Patterns

Physical objects are reusable, but so are ideas. Patterns try to capture the ideas behind design decisions so that they can be reused. For example, using class constructors explicitly intertwines your application tightly with the classes you are instantiating. Someone invented the idea of a Factory to which you describe the object you want from some family of objects, and it “constructs” it for you. Turns out that idea is useful and reusable.

We will study patterns later in section 9.

2.2.4    Encapsulation

Reuse depends not only on the functionality of the component, but on how simple it is to describe and use. Mostly, that means:

·         Can you describe what it does?

·         Can you describe how it interacts with other components?

If you can’t do both, how can you use it? For a class in an object-oriented programming language that means:

·         Understanding what the public methods do;

·         Understanding what the “public” instance variables represent. Even if the instance variables are made accessible only by get() and set() routines, you still have to understand their purpose.

·         Understanding the protocol of using the classes. Just because you know what the methods are doesn’t mean you know how to use the class. That’s why sample programs often accompany packages of classes.

You will learn that entangled, or highly connected, code is difficult to understand, debug and maintain. In theory, each component should be as self-contained as possible. It should have a “small” easy to understand public interface. Mostly, this is about scooping the data and methods that an application uses:

·         Private – Visible only to the single component.

·         Shared – Visible to a collection of cooperating components.

·         Global – Visible to all the components of an application.

Data encapsulation is particularly important. Dozens of books have been written over the years to explain that:

·         global data can wreck a design,

·         it is important to limit shared data, and

·         shared data is difficult to handle.

A good part of this class is about exactly these issues.

This problem is so important that whole methodologies are invented specifically to deal with encapsulation. The Agent architecture of our first programming project has encapsulation as one of its principal design features.

2.2.5    Granularity

Another key part in designing reusable components is getting the size, or granularity, right. A component that is too big is too specialized and hard to understand. A component that is too small is often of little use. So, how big is too big? How small is too small? Experience will be your guide. Being able to design the right classes, the right components, and the right subsystems is a sure sign of an experienced designer.

3      A Little Software Engineering

So far your programming career probably consists only of homework problems that you did quickly by yourself, mostly by directly coding a solution. Real-world projects are different because they are bigger in several ways:

·         Large projects have several distinct phases. (Section 3.1)

·         The process of doing the project needs managing. (Section 3.2)

·         It takes a team to do the project. (Section 3.2.3)

·         Large software projects have special technical concerns. (Section 3.4)

3.1    The Phases

The next few sections lay out the phases typically found in large projects.

3.1.1    Requirements and Domain Analysis

Every project must specify what the thing being developed is supposed to do. That specification is called requirements. Requirement elicitation is the process of gathering the requirements from the owners and users of the proposed system.

Understanding the key objects in the domain is part of requirement elicitation. During this domain analysis the subtleties of the domain are explored to uncover hidden constraints, needs, and so forth. Typically domain experts are used during requirement elicitation for precisely this purpose. I remember once looking at a ball-bearing feeder to be used in a manufacturing application I was programming. As I watched it operating, I noticed that feeding errors increased as it got close to being empty. I asked the engineer about this and he told me that you should always keep the feeder at least Ľ full. On further probing I found out that the machine did not have a sensor, so I would have to count based on a guess of how many bearings fill up the feeder. Since the count didn’t have to be perfect, that solution was good enough. Do you see how poor domain analysis can wreck a design?

Requirements can be functional or non-functional. For example, you may specify the need for a feeding machine. That’s a functional requirement; it describes something the system must do.  Saying that the machine should be kept at least Ľ full is a non-functional requirement; it makes a statement of how the function is supposed to behave.

Requirements are tricky. In particular:

·         Requirements can be underspecified. For example, I may tell you to display some data in order.  If I don't tell you what kind of order, then the requirement is underspecified. If the data are strings, then you might assume alphabetic—that would be natural and part of the domain knowledge you bring to the party. You have to know in order to code it. Notice if the requirement says to sort the data then you have told me how to order it. That's a well-know problem.

·         Requirements can be overspecified. If you require that the display of the order data must be created in linear time, you have overspecified the problem. No known sort algorithms work in linear time.

·         Requirements can be contradictory. If you say that 1) the system has to run on a single server, and 2) that the system has to be running 24x7, i.e., 100% of the time, then you had better not have another requirement that says 3) the system will be taken down for an hour each week for maintenance.

·         Requirements may become irrelevant. It is tempting to try to think of every feature a system might need during the requirements phase. Unfortunately, features complicate a design. This fact of life is particularly aggravating if the need for the feature doesn’t survive through to the system’s delivery.

·         Requirement can change during development and new ones discovered. More about this when we talk about the process (section 3.2).

How do you specify requirements? For us, it will be a combination of:

·         English statements

·         Big Block diagrams (identifying major components of the system)

·         High level interaction diagrams (showing how the components interact)

These will be shown in the Agent document.

3.1.2    Design

Design is taking requirements and figuring out how the system should work. Design is the mapping between “what” (requirements) and “how” (the implementation). In this phase, you analyze the requirements and design a solution that can be implemented (section 3.1.3) and deployed (section 3.1.4). For our simple ordering example, the requirement to order the input must be turned into an algorithm we can implement. That’s easy to understand.

For larger systems, the design phase is more complex, but the idea is the same. The designer must take the requirements and turn it into something that can be coded and/or constructed. The complexity forces the designer to think about the structure of the application and the tools needed to implement it.

Many young programmers don't appreciate design. They are ready to code a problem directly from the requirements; after all, that is how they address their school problems. But, there are serious questions that "design" answers where "coding" does not:

·         How does the stakeholder know that you, the developers, understand the problem?

·         How does the stakeholder help make decisions about design alternatives?

·         How does the stakeholder know if you are making timely progress?

·         How does the stakeholder know if you are producing something that is flexible and maintainable?

If all the stakeholder has to look at is code, the answer to all these questions is "He doesn't!"

So, let's assume you acknowledge that design is important. How do you specify a design? There are dozens of design methodologies. UML (Unified Modeling Language) is a famous one that we will have more to say about later in the semester. For now we will ignore them all and focus on a few methods that seem to be part of any design:

·         Detailed interaction diagrams

·         Class description of system objects

·         Pseudo-code

We will see all of these in detail in our Agent application.

Inevitably, design discussions must answer questions about environments, applications, frameworks and architecture. What are these things?

·         An environment is something in which you build and run applications.

·         An application is a solution to some need. An application can be static like a piece of furniture or a building, or dynamic like a business or some software. You use static applications; you run or operate dynamic ones.

·         A framework is an environment in which you can generate specific types of applications. Applications built in a framework generally have an architecture.

·         An architecture is the structure, or organization, of an application. A colonial house has an architecture seen in 18th century America.  The first programming system used by this class has an agent architecture. All applications have an architecture, however weak.

A framework forces some decisions on you—perhaps what tools exist, what expertise is needed, and so forth. These restrictions are outweighed by the simplification in developing the class of applications for which the framework was designed. For example, a woodshop is a framework for making wood products. It has special tools and requires special skills.

A programming environment is a framework that helps you generate software. The Java J2EE framework helps you generate web-based software. The dot Net framework does the same thing using Microsoft products.

By the way, just using classes in object-oriented programming doesn’t imply an architecture. Saying you used object-oriented programming doesn’t say much. Using an object-oriented design methodology is much different. [Isn’t it? Is this point too subtle for the class?]

3.1.3    Implementation

Implementation starts with a design. The developers identify existing components they can reuse, code ones they can’t, and integrate everything. Implementation takes place in an environment that supports programmers, mostly to write code.

Programmers must validate their code (determine if it satisfies the requirements) and test it (see if it performs as designed). The testing must be not only in the environments on which they code, but also in the deployment environment on which the system will run. Testing inevitably requires debugging. Debugging can be about:

·         finding errors introduced by the coder, e.g. a pointer error;

·         finding flaws in the design, e.g., a class is missing a method “needed” by another class; or

·         finding flaws in the requirements, e.g., discovering that some requirements can’t be met.

Validation and testing are often haphazard affairs, testing especially. Programmers sit at a terminal and run test cases often by typing directly into user interfaces. For small programs this technique might work. For large systems, it is hopeless.

In this class you will be exposed to two techniques to support validation and testing:

·         Code Inspection – Inspection means reading aloud every line of code. A team inspects all code in the presence of the code’s author. Any questions that arise are fielded immediately. It is amazing how many bugs are discovered this way. Not only just coding bugs, but also sophisticated design and requirement flaws.

·         Unit Testing – Every component, or unit, of your system must be independently testable for obvious reasons. And the tests must be saved in order to be rerun whenever necessary. Why? Code changes and you can’t possibly afford full at-the-computer testing each time it does (as if that were even possible).

Both techniques will be a part of every homework assignment. Teaching programming is not easy. And, most of what you are going to learn is the discipline to do these unglamorous tasks. You are going to learn by doing.

3.1.4    Deployment Strategies

The implementation environment may not be, and typically isn’t, the same as the deployment environment in which the application is going to run. In fact, deployment decisions may influence the design. We’ll see later in the discussion about concurrency (section 4) how multi-machine environments can change the low-level details of how processes communicate.

3.1.5    Monitoring and Maintenance

Once deployed, systems need monitoring and maintenance usually by teams different from the developers. Some of their tasks include:

·         Running help centers and Customer Relationship Management (CRM) tools to support end users.

·         Inspecting logs and gathering statistics to determine how the system is behaving.

·         Making sure databases are consistent.

·         Bringing new releases and updates online seamlessly.

These efforts are expensive but no large project can afford to be without them. We won’t be spending any time on this phase of software development. That doesn’t mean it is not important, only that we have other things to do.

3.2    The Process

As you can see, production quality systems are a lot different from school problems. The process of building them is different as well. Homework has a “see it-do it” flavor. Real systems have the above phases that, not surprisingly, must be managed.

3.2.1    Waterfall

It seems logical to do these phases one at a time, delivering the results of each phase to the team responsible for the next phase, like a waterfall cascading over a cliff with many levels.

Managers love the waterfall metaphor. Just schedule each phase, do it as best you can, and move on to the next phase. Unfortunately, it doesn’t work that way in the real world. Mostly the problem is with requirements. They change and developers must respond.

3.2.2    Iteration

Even if the requirements are perfectly specified and without conflict, requirements change. Mainly for two reasons:

Requirements evolve through understanding of the problem. When requirements are analyzed and code is being developed, requirements change, mostly by becoming more complex, catching missed cases, making delayed decisions, etc.

Introducing a computerized solution to a group that has been something manually changes the environment in ways that cannot be anticipated. Think about how email systems must now handle spam. The requirements of the original email programs couldn’t know that without special handling, spam overwhelms most users. The design of Digital Video Receiver (DVR) software will face similar problems. A DVR is not just a VCR with a hard disk, as the developers will learn.

The result is that the waterfall process is fiction. Instead, building most systems involves iteration. That is:

·         A subset of the key requirements is identified. A minimum system called a prototype is designed and implemented.

·         The client uses the prototype and then reviews the requirements. Perhaps some new ones are added, some old ones discarded, existing ones changed.

·         A next set of requirements to be covered by the next release is decided upon and implemented.

·         The last two steps are repeated until the system meets the satisfaction of the people paying the bills. The prototype is renamed and deployed.

Another name for this process is spiral development. Whatever it’s called, it has many advantages.

·         A complete and accurate set of requirements does not need to be specified before coding begins.

·         The iteration accounts for inevitably changing requirements.

·         Clients get to see preliminary working systems, review them, and comment before final delivery. The waterfall process often delivered systems that were out of date before users even touched them. Iteration produces useful and usable systems.

3.2.3    Extreme Programming (XP)

Extreme Programming (XP) embraces the iterative process described above. In fact as we’ll see, it embraces it extremely, and hence its name. In particular, two of its principles are especially relevant to this class:

·         Many quick, small releases – While the iterative process acknowledges doing releases until the project is done, it doesn’t say how many to do or how big each should be. XP believes in small releases, perhaps one every few weeks. This procedure is taxing, but it forces the design team to constantly prioritize the requirements to determine what goes in the next release. That counters the problem of requirements becoming irrelevant—a requirement doesn’t influence the design until it becomes part of a release. XP denies the long-held tradition of building a design that can “support” all features. XP believes (1) you account for the features only when they make it into a release, and (2) you will be able to easily refactor your design (see section 7.3) to accommodate the new feature. We’ll see if these two XP features are true.

·         Pair programming – XP believes that all at-the-computer time should be done by a pair of programmers: one actually responsible for the coding, and one thinking about the big picture. Pair programming will be a step on the way to learning how to work in teams or groups.

Part of the complexity of figuring out how do quick releases depends on your ability to plan. For this, XP has something called the Planning Game.

3.2.3.1  The Planning Game

In the planning game business people decide the scope, priority, and composition releases. They also decide on the dates of releases. Of course, it is ridiculous to think they can make any date-related decisions in a vacuum. What do they know of the technical issues of developing the code that goes into their release?

In fact, technical people make estimates, elaborate the consequences of the business decisions, and, most importantly, do the detailed scheduling. With this feedback the business people can make informed “final” decisions about the composition of a release.

This Planning Game as thus described has been formalized into a series of three moves:

·         Exploration - Find out what's new to do. Write stories, scenarios, interaction diagrams, whatever is your favorite way of specifying requirements and try to estimate them.

·         Commitment - Decide on a subset as business sorts the options by value, cost, and risk. Then prioritizes the list.

·         Steer - Guide development as reality hits. Plan on iteration of 1-3 weeks of work.  Of course if the plan is failing, you must plan for a recovery strategy that probably includes new scenarios and reestimates.

3.2.3.2  Developmental Iteration

The iteration by the development team mentioned in step 3 just above has the same three planning game steps, but with different actions:

·         Exploration - Design the tasks and document them on Engineering Task Cards. Split and combine tasks as required.

·         Commitment - Programmers accept tasks and then estimates them. This is so reasonable. How can anyone but the programmer who is going to do the work do the estimate? Of course the manager must do load balancing, i.e. make sure all the assigned tasks are doable in a sensible schedule and that no one is overcommitted.

·         Steering - Keep checking your progress, perhaps plan some recovery, and redesign if things aren't going well.

3.2.3.3  An Engineering Task Card

Task cards are used by the programmer to chart and document his progress. The task card is analogous to the journal kept by experimental scientists. All reputable scientists keep journals of their experiments. How else do they know what they are doing. Why don’t programmers do the same? There is no magic in a task card. Design it in any way you want, with whatever information is relevant to you. All that’s important is that it has everything you need to chart your progress on a task and know when it is done. It might look like this:

Engineering Task Card

 

Date:___________  Engineer________________________ Task Estimate____________ 

 

Task Description:

 

 

 

 

 

 

 

Engineer’s Notes:

 

 

 

 

 

 

 

Task Tracking:

 

Date

Done

To Do

Comments

 

 

 

 

 

 

 

 

                                                                                                      

 

 

 

 

 

 

3.2.3.4  Pair Programming

XP believes that all at-the-computer time should be done by a pair of programmers: one actually responsible for the coding (the driver), and one thinking about the big picture (the navigator). Pair programming will be a step on the way to learning how to work in teams or groups.

Pair programming has been adopted by eXtreme programmers. Many books have been written about this technique dating back to the early days of programming. Pair programming is not about one person programming and one watching. It is about a highly interactive session of programming, brainstorming, reading code, discussing ideas, creating tests, and other jobs primary to the job of programming.

Pair programming is not a tutoring session. Often experienced programmers may be paired with inexperienced ones, but mentoring is not the goal. Better programs are.

Pairs are not fixed for all time. In fact, pairs join together only for a single session. It is part of the methodology to make sure that all members of a team can pair with one another.

The Seven synergistic behaviors of Pair programming – Pair programming is said to increase productivity of the two programmers that previously worked alone. The pair programming people say there are seven examples of this synergistic behaviors:

1.      Positive Pair Pressure – Programmers work harder and smarter. They don’t want to let partner down; they are motivated to complete tasks; and working in pairs helps them both in following procedures.

2.      Pair Negotiation – Each partner brings different skills to the terminal. Research has shown that two heads are better than one in cognitive tasks, which programming certainly is. The natural negotiation that arises between pairs leads to more alternatives. And, of course, there is shared satisfaction of the team effort it takes to complete a task.

3.      Pair Courage – Programmers need courage when undertaking challenging tasks. It is impossible to underestimate the benefit of having a partner share your risk. Often one asks “Does this look right to you?” A positive response gives support to the driver. The affirmation is a confidence builder. With confidence comes the courage to admit we don’t know. Then, when you have a problem you can ask your partner for his idea of an alternative without losing face. Otherwise you might just muddle through.

4.      Pair Reviews – The navigator watches as code is entered. He reads the code as best he can and comments on anything that he needs explained. These observations are just another form of code reviews. And code reviews are IMPORTANT!!! Another pair of eyes is critical to this process. The human eye doesn’t see what it doesn’t want to see and, often, after you’ve worked on some code for a long time, you no longer SEE it. It is simply a matter of fact that third party code reviewers find glaring errors that developers overlook.

5.      Pair Debugging – Explaining your code by verbalizing it to your partner will root out bugs. If you can’t explain it, you don’t know it. The navigator often asks questions that force explanations that turn up flaws.

6.      Pair Learning – During a pair programming session knowledge and skills are passed in both directions. This flow is obvious when the pairs are mentor/apprentice, but it works in all relationships. Managers get more people familiar with more of the code. That’s a terrific by-product of pair programming.

7.      Pair Trust – And, finally, as you work together, you learn trust one another. Trust engenders more trust, team play, etc.

The seven myths – Was all that too touchy, feely for you? Are you skeptical? You’re not alone. Here are the seven myths that often hold teams back from adopting pair programming:

1.      It will double the workload – It appears that you have two people doing the job of one. If that really were true, we would indeed be only half as productive. But, if the two together can produce better code, find more bugs, and, in general, prevent problems that will be found down the line, the “increased” cost will easily be worth it.

2.      I’ll never get to work alone – Certainly, the time you spend alone will diminish. But, you still will have plenty of time alone. You don’t compose the code at the terminal, you just type it in there. That leaves plenty of thinking time to design your algorithms. Also it is clear that leaving programmers alone for too long is a mistake. They can easily get on the wrong track. Pair programming prevents that from happening.

3.      It only works with the right partner – Sure, a good partner is better than a bad one. But, all pairs will have their strengths and weaknesses. The random mixing of pairs will strengthen all the individuals. If a particular programmer consistently produces bad pair programming sessions, then remove him from those sessions. If you insist on doing pair programming, fire him.

4.      It’s only good for training or mentoring – Pair programming is a good way to train or mentor. But there is enough evidence to show that the value of pair programming goes beyond training or mentoring.

5.      I’ll never get credit for doing anything – Even if that were true, so what? What’s important is the project succeeds. That means producing the required product on time and on budget. If that miracle happens, you will get plenty of credit from management.

6.      Navigators only find syntax errors – Only if your navigator is poor. Finding syntax errors is his least important job. The strategic thinking required of the navigator can be liberating to the driver, who has to worry about getting a particular piece of code to run. The big picture is hard to keep in focus while you’re debugging. That’s why it takes two to do the overall job.

7.      I work best alone – That may be true. But like (2) above, working alone may not be best. Learning to work with others, as pair programming insists, can force you to grow as a team player. You might work best alone now, but with some pair programming training, you may be better in the future.

Seven Habits of Effective Pair Programmers – People have noticed some things that effective pair programmers do:

1.      Take Breaks – Pair programming is exhaustive but productive. You need breaks: check your email, take a walk, something to take your mind off your work and refresh yourself.

2.      Practice Humility – The idea of egoless programming has been around for decades. Excess ego can wreck a collaborative relationship. No one is perfect. John von Neumann, the inventor of our computer architecture and reputed to be the smartest person who ever lived, was famous for asking for advice.

3.      Be Confident / Be Receptive – Everyone is nervous about their image and adequacy. Everyone! Be confident in your abilities. After all you’ve been admitted to an elite program–you must be doing something right. At the same time, don’t be overconfident. Be receptive to new ideas or ideas that run counter to your own. Accepting advice is not a sign of weakness. Remember von Neumann.

4.      Communicate – That’s the point of pair programming. Just talking aloud can be enough to provoke a solution. This is the key part of collaboration. How many times has someone come to talk to me about a problem, and in presenting it, thought of the answer?

5.      Listen – It is said that it is more important to understand than to be understood. In other words, listen and listen carefully, with respect. Unsurprisingly, people listen more to good listeners than good talkers (who they tune out).

6.      Be a Team Player – That’s the point. The goal of the pair programming collaboration is to make the driver and the team succeed.

7.      Hone the Balance between Compromise and Standing Firm – Developers who are too egotistical may lack the ability to compromise and may become argumentative when paired. Conversely, people who always agree in order to minimize tension aren’t getting the full benefit of collaboration. Some of both are needed.

3.3    Working in a group

Large projects involve teams of people. That brings with it (1) social issues of working together and (2) management problems of hiring, firing, promotion, etc. Dozens of books have been written about creating and managing teams of programmers. All have something to offer. But, managing is not that important a topic for you, yet. Much more important is teaching you how to be manageable—how to work within a group, how to make your manager and team more effective, how to develop good habits, etc.

It is not enough just to be technically competent. While there is always a place for the lone genius, the real world puts the rest of us in groups. And it’s not easy. Group dynamics are usually a mystery to young employees. Some never get it. With few exceptions the university emphasizes individual work (mostly to make grading easier). In pair programming you will learn how to communicate with your partner, how to divide up work, and other skills.

Before discussing some group issues, you should know what kinds of colleagues you will surely encounter during your career.

3.3.1    Types of Colleagues

We would like to think that all our colleagues are like us—qualified, competent, cooperative, and normal. Unfortunately that’s not true. You will certainly meet the following characters:

Fake Superstar – Knows it all, let’s everyone know it, but somehow never seems to be around when the hard problems get solved.

Superstar - Does know it all, lives for hard problems; can bring confidence to a team. The superstar is rare, but can be hard to live with.

Underachiever – Could be qualified, competent, and cooperative, yet always seems to under-perform. Frustrating to a manager.

Overachiever – Whatever their abilities, overachievers always perform excellently. Loved by managers.

Normal Achiever – Performs up to their capabilities.

Not qualified – Could be very competent at something else, but not at what they are currently doing. Great coders might be poor designers, while good programmers may be poor managers. Good management gets qualified people into all positions.

In addition, people have different personalities:

Shy – Shy is often mistaken for incompetent. Shy, competent people often have the “most” to say.

Assertive – Assertive doesn’t mean competent, just loud. Still, many competent people are overly confident and can seem overly aggressive.

Rude – Being assertive is one thing, being rude is much another. If you are rude, you had better be a true superstar.

Cooperative – A great personality when paired with competence. Incompetent, overly cooperative people are obsequious (brown-nosers, yes-men).

Ladder climber – All of us want success and want to get ahead. Unfortunately, these types are more interested in their career than the success of their projects.

What happens when these diverse types are thrown together? The answer depends on how members of each team perceive one another.  

3.3.2    Skill Sets, Job Titles, and Status

Companies have suborganizations—executive, technical, financial, legal, marketing, human resources, manufacturing, sales, and others. Depending on the culture, the company, and the people in it, where you are may lead to undeniable differences in status. The differences can be important.

Some think that Japan’s superiority in manufacturing emanates from the high cultural status given to the work done on the factory floor. In Japan it is common to see design engineers on the floor working directly with manufacturing personnel. Stories abound about top executives marveling over a simple elegant switch. Undeniably, such high manufacturing status engenders better production.

Differences in experience also result in differences in status. A competent, experienced person naturally has more status than a competent inexperienced person. Status gets more complicated when the skill sets are related, even perhaps overlap, but are not identical.

For example, developers are often divided into designers and programmers. Designers have status, programmers don’t. Why? Because designers formulate the design (the hard, fun part) that programmers simply code (the so-called easy, routine part). However, the design/code split is not so simple. Few designs, especially large ones, are put down so precisely that all it takes is “coding.” Programmers are often the ones who find design flaws, suggest fixes, make key implementation decisions, and more. In other words programmers bridge the (wide) gap between designing and coding. If designers appreciate this reality, they have more respect for experienced programmers and will include and value them on the design team. In fact, some think that the best designers emerge from the best implementers, whatever the field.

Successful organizations solve the status problem, especially at meetings.

3.3.3    Meetings

Meetings can be large involving a whole team or as small as two people face-to-face. They can be about anything, but all meetings have a purpose. Good meetings can give energy and purpose to a project.

We take meetings for granted. Everyone has been to them, but only experienced people understand meeting dynamics. You should consider the two roles you must play in a meeting, the obvious one of participant and the subtle one of observer.

As a participant, you contribute to fulfilling the meeting’s agenda, whatever it may be. You are in the moment: listening to suggestions, giving advice, voting, trying to reach consensus, whatever it takes to achieving the meeting’s purpose.

As an observer, you are out-of-body, paying attention to the meta-issues of a meeting. In particular:

·         What are the politics involved? If your boss’ boss is at the meeting, it may be more important to make sure that your boss shines rather than worry about key technical decisions, which can be sorted out later.

·         Is the meeting going well? If not, what dynamics can you change—call a break, change the topic? Can you or should you take control? Before you take such action, make sure you have the authority.

·         What if two geniuses are in conflict? If you should be so lucky to be among more than one genius, prepare yourself for the meeting where they have conflicting ideas. Be careful. A genius rebuffed can sulk and turn ineffective, or think about leaving the organization.

·         What is the proper role of a manager in a technical meeting? A manager is there to facilitate a meeting and make decisions. Strangely, a great manager can make great decisions without knowing or understanding all the details, technical or otherwise. Besides being cooperative, you can help your manager make great decisions by giving him great input. The next section discusses this in detail.

These are not easy questions. Many of the Dilbert cartoons capture perfectly the absurdity of dysfunctional meetings. Personalities can certainly destroy a meeting, but often the problem is ethics.

3.3.4    Ethics and Power

Ethics describes correct behavior. At a minimum ethics is about doing the right thing. Aristotle thought ethical behavior was more: doing the right thing at the right time. For example, donating money when you are rich is one thing, donating when you are poor is much another. And, strangely, power has a lot to do with ethics.

Power motivates people. The work environment breeds people seeking power; some deserve it, some don’t. Power has come to have negative connotations:

·         Often the wrong people come to power.

·         Power seeking seems unseemly.

·         Being less powerful seems weak.

Perhaps. But power used wisely is potent. The German philosopher Friedrich Nietzsche talks about deriving power through truth and will. He means that 1) by being completely honest and facing the truth despite the consequences and 2) having the will to see struggles through, that 3) you will be seen as powerful by your peers. What could be more ethical?

I agree strongly with Nietzsche, certainly when it applies to group dynamics:

·         Truth and bad outcomes – The truth can hurt. Sometimes you hurt others when you dispute their arguments or point out their flaws. Hurt feelings are common. Sometimes you hurt yourself if you disagree with your boss. Still, the truth is necessary (but not sufficient) to correct decision making. Having said all this, make sure you understand the difference between truth and opinion.

·         Dealing with bad news – No one likes bad news. It is easy to deny bad news or shrink from it. But it takes will to face it squarely and consider what to do next. Believe me, in your working career you will face bad news many times. Will you show power or cowardice?

·         Cooperation or furthering a personal agenda – You are in an argument. Are you positive you are right? Can you consider that you may be wrong? Yielding is not weak if you have Nietzsche’s kind of power. More meetings have been ruined by conflicts between unyielding members, each sure that he is right. Doesn’t Nietzsche say that will is important? Yes, but make sure you understand that yielding to a position may be appropriate and correct, not weak.

·         Acknowledging power differences properly – We are not equal in capability, status, or power. How you react to those differences is important. Do you cooperate with your boss? Do you respect him even he is technically inferior to you? How do you treat those less competent or powerful than you? Are you arrogant in your superiority?

I often hear of effective people. When I probe, it is less about technical competency and more about the above issues. It takes a variety of skills to ensure a project’s success.

[I often dream of meetings where the participants, in true Nietzschean fashion, put their true cards on the table, discuss honestly the reality of the situation, and plan effectively from this state of affairs. Can you imagine the power that comes from such openness? Is it in you?]

3.4    Special Technical Concerns for “Large” Software

3.4.1    Management of the code

One of the tricks you can play on programmers is to ask them how much of their time they spend actually writing code. You will hear estimates running anywhere from 70-90%. When you tell them that studies have shown that even the best programmers tend to produce on average about 15 lines of code per day, the programmers are shocked. What they fail to realize is that on the days that they are writing code. production is high. But, many more days are spent on managing the code. The next few sections describe some of these non-coding activities.

3.4.1.1  Build Tools

If your program consists of one all-inclusive file, just compile and run it. For anything more complicated, organizing the source code and using build tools are essential. The recurrent tasks for build tools include:

·         Compiling source code and pointing to files necessary to compilation.

·         Storing the resulting object and executable files.

·         Creating aggregate files (tar, jar, zip)

and many other things as we’ll see.

We create scripts of actions that carry out these tasks. Scripts are programs and must be coded so they can be understood, reused, modified, and debugged.

Unix people are familiar with makefiles. Java has a new system called Ant. You will be using both in your projects.

3.4.1.2  Unit testing

For your typical school problems, you write code, test it, turn it in, never to see it again. Investing a lot of energy in the testing phase makes no sense. But large systems have a long lifetime. The kind of sitting-at-terminal testing you do for small projects is inappropriate for large systems that evolve over time. You must have a testing methodology that permanently captures the test cases and can be rerun at will.

This is not theoretical work as you will see. Mostly it takes discipline to write and save the tests. Fortunately, serious developers have built frameworks for unit testing. JUnit is for Java. Several have been built for C++. CppUnit at http://sourceforge.net/projects/cppunit claims to be a port of JUnit.

You will become familiar with both in your project work.

3.4.1.3  Source Code Control

With teams of people working on a project you simple must use tools to help with controlling the source code. There are many systems out there for doing this. One for free is at http://www.gnu.org/software/cvs/

I have decided NOT to have you use any source code management tools in this class. With 2-person teams, they are simply not needed. But I do want you to be aware of what they can do. Below is information taken from the CVS website.

What is CVS About? CVS maintains a history of the source and how it was developed. It stamps each change with the time of the change and the user who made it. Usually, the person provides a bit of text describing why they made the change. Given that information, CVS can help developers answer questions like:

·         Who made a given change?

·         When did they make it?

·         Why did they make it?

·         What other changes did they make at the same time?

Checking out a working directory—CVS doesn’t work on ordinary directory trees; you need to work within a directory that CVS created for you. Just as you check out a book from a library before taking it home to read it, you use a checkout command to get a directory tree from CVS before working on it.

Merging your changes—Since each developer uses their own working directory, the changes you make to your working directory aren’t automatically visible to the other developers on your team. CVS doesn’t publish your changes until you’re ready. When you’re done testing your changes, you must commit them to the repository to make them available to the rest of the group.

However, what if another developer has changed the same files you have, or the same lines? Whose changes should prevail? It’s generally impossible to answer this question automatically; CVS certainly isn’t competent to make that judgment. Thus, before you can commit your changes, CVS requires your sources to be in sync with any changes committed by the other team members.

Committing your changes—Now that you have brought your sources up to date with the rest of the group and tested them, you are ready to commit your changes to the repository and make them visible to the rest of the group.

Examining changes—At this point, you might well be curious what changes the other developer made. To look at the log entries for a given file, you can use the cvs log command.

You can find plenty of documentation at the website.

Source code management prevents cataclysmic disasters that many of us working on large projects have been through.  Here is a quote from the “Ant in Anger” paper in the Ant documentation:

“If you don’t have a source code management system, you are going to end up hosed. If you don’t have everything under source code management, including web pages, dependent jars, installation files, you are still going to end up hosed, it’s just a question of when it’s going to happen. CVS is effectively free and works well with Ant, but Sourcesafe, Perforce, Clearcase and StarTeam also have Ant tasks. These tasks let you have auto-incrementing build counters, and automated file update processes.“

I couldn’t have said it better.

3.4.2    Configuration Files

Often, large software system require startup information in order to:

·         run in a certain environment,

·         run a particular set of cases;

·         execute in different modes, such as debugging.

·         and anything else that might affect its behavior

This configuration information can be passed as command line parameters, but is more often stored in an external file that the program reads during startup. The question is what is the format of the file?

Mostly, developers invent a homegrown format convenient for their application. They write (simple) parsers that process the configuration file. As XML became more popular, developers started it for formatting their configuration files. Freely available XML parsers saved a lot of work. The result is that XML-based configuration files have become standard.

The Tomcat java web server and the Ant build-management system both use XML formatted files for their configuration. Since you will be using both systems, you need to learn a little about XML. Section 10 discusses XML in detail.

3.4.3    Documentation

Code is not sufficient as documentation of a software system. Code shows only how something works. Code is just the implementation of a design. For sufficient documentation the following is absolutely the minimum required:

·         What the requirements are, i.e. what is the system supposed to do.

·         What the design is, i.e., how the system is going to fulfill its requirements.

·         How each design feature is motivated by a requirement, i.e., make sure the system doesn’t do unwanted things.

·         How each requirement is expressed in the design, i.e., make sure the system does all the required things.

That seems straightforward. In fact, your textbook says that for all systems, “If not documented, then maintenance can be farcical.” Of course. How can you maintain or modify a system if you don’t know why the code is written the way it is?

Even once we agree on documentation, system builders face the following problems:

·         What format should the documentation be in?

·         What formalisms should be used to describe the requirements and the design?

·         How easy will it be to maintain the documentation?

These are hard questions and you will study them in your software engineering classes. The next few sections will talk about them briefly, enough for our class.

3.4.3.1  Format of the Documentation

Over the years, many organizations and researchers have tried to specify a format for their documentation. If you work for an organization doing government or military work, you will become quite familiar with their comprehensive documentation requirements. Some of those organizations have settled on a format called MBASE developed right here at USC by our own Professor Barry Boehm.

MBASE stands for Model Based System Architecting and Software Engineering. MBASE identifies absolutely everything that must be documented, from requirements, to design, to implementation plan, to maintenance, everything. MBASE is for large projects that take years to complete. It is overkill for our needs.

Instead you will be using an informal and non-standard format called a Roadmap. A roadmap satisfies all the basic documentation needs. Don’t be fooled by its informal nature. It will be your key to successfully navigating though our code.

3.4.3.2  Design Formalisms and Diagrams

Whether using MBASE or Roadmaps, diagrams are necessary. Common ones are:

·         Big Block Architecture Diagrams – What are the major components of our system and how do they interrelate?

·         Interaction Diagrams – How does an end user interact with the system?

·         Flow Diagrams – How does the code execute?

·         Timing Diagrams – How does the system behave over time?

There are many formalisms for expressing these diagrams. UML (Unified Modeling Language) has become the standard. UML has a way to build a diagram for everything, from requirements, to object-oriented classes, to dependencies. Your textbook covers UML in depth. Industry uses it. You will use some of it. I’m not a big fan.

UML tries to document everything using diagrams. And its unification is open to question. The computer tools for building UML diagrams are clunky. It’s hard to draw the pictures, it takes a lot of time, and it is hard to maintain the pictures. A common complaint that I hear from my industry colleagues is that people spend too much time on UML when they should be working on code. I don’t want you spending a lot of time on making pretty pictures. I’m much more interested in pretty code.

Remember documentation is about exchanging information. The best way may be not always be pictures. Sometimes text is fine.

3.4.4    Scheduling

The school semester is so constrained, that most projects require only the most rudimentary schedules. Real world software projects are different. Management has many tools to help scheduling. All are base on:

·         Describing Tasks – What they are, how they long they will take, who will work on them.

·         Describing Dependencies – What task(s) does a task depend on.

With this information, tools can create activity graphs and Pert charts for seeing the task dependencies and showing how long a project will take. The so-called critical path, the minimum time it will take to do the project, is of particular importance to a manager.

This kind of scheduling is beyond the scope of this course. Instead, I will do the scheduling for you. As you remember, we will be using an Extreme Programming methodology. That means lots of small releases—perfect for the classroom experience.

4      Concurrency

Concurrency is about events or actions happening simultaneously. Multi-tasking is doing multiple tasks at the same time. We will investigate multi-tasking for both computers and people carefully, since the analysis will help us understand concurrent software.

4.1    Logical Concurrency

Most problems have solutions involving concurrency. Solving them requires divide and conquer techniques:

·         Divide problem into logical tasks that can be carried out simultaneously, or in parallel.

·         Assign each logical task to a physical execution entity. Let’s call this pairing a physical task (or task for short). Tasks can be scheduled. Also, you may assign several logical tasks to a single execution entity; this will influence how the execution entity carries out its multiple responsibilities.

·         Determine the tasks’ dependencies.  The execution entities may need to communicate or share data.

These steps apply both to human and computer systems. In human systems:

·         The tasks can be instructions, directions, or recipes.

·         The execution engine is a person. Groups of people can form a team.

·         The dependencies are the messages that people send (or speak) to one another and the information they share.

In computer systems (explain in detail in section 4.5.1):

·         The tasks are threads within a process. A process can have multiple threads.

·         The execution engine is a CPU. Multiple CPUs can cooperate.

·         The dependencies are the shared and global data, methods or services, or messages between processes.

This compelling analogy is one of the reasons why I often use examples from own lives to demonstrate the issues raised by concurrency. People have the same problems with concurrency as computers do. One major problem, often underestimated, is how to represent a solution.

4.2    Representation of Concurrency

In a family application whose current goal is to have dinner, the following might be one requirement of many:

“Have cooked carrots for four.”

Notice that the requirement is under specified. It doesn’t say how to cook the carrots or what style to prepare them. We could microwave, mash and serve them. That would fulfill the requirement. Do you see how this under specified requirement could give the cook (the designer) too much freedom?

On the other hand, what if the designer was a good chef. Do you see how this under specified requirement could give the cook (the designer) just the right flexibility? Perhaps if you had a truly great chef, you should only have the one requirement: “Prepare dinner for four at 8pm.” Requirements are tricky. [This actually happened to me recently in Italy. Our host was an American friend who rented a villa near Sienna. Though he knows nothing about Italian food, he would insist on crafting menus for our private, expert cooks. No amount of prodding from me could convince him to leave the decisions (the design) up to the cooks.]

In any case here’s a recipe/design for the “Cooked carrots for four” requirement:

Cooked carrots: Put chopped carrots into boiling water for 15 minutes. Drain and serve.

Notice that the algorithm is also under specified. How much water should we use? How many carrots should we chop? A designer must be careful about such details. But here I’m concerned about concurrency issues.

If I gave this algorithm to an inexperience cook (the execution entity), you might see the following execution scenario:

1.      Chop carrots. [Takes 5 minutes.]

2.      Put on water to boil. [Takes 5 minutes.]

3.      Put carrots in boiling water. [Takes 15 minutes.]

4.      Drain the carrots.

5.      Serve the carrots

This cook simply followed the recipe linearly and didn’t realize that putting on the water should have been done first. That way, the water would be boiling when the cook finished chopping the carrots. An experienced cook would know that. Perhaps, a more expressive algorithm would help a novice.

The problem is that the recipe didn’t explicitly say that boiling the water and chopping the carrots can be done concurrently. The textual, linear representation is not conducive to showing concurrency. Perhaps we should have written the recipe as:

Cooked carrots (numberOfServings){

Concurrently do {chop carrots; boil water};

Put carrots in boiling water for 15 minutes;

Drain;

Serve.

}

That’s more expressive, but we might not sell many cookbooks with these kind of directions.  Yet the advantages are obvious. It is easy to see that if there were three cooks, one could chop the carrots, one could boil the water, the other could do the rest. A single cook who understood these directions would know that “chop carrots” and “boil water” are concurrent actions and could be done in any order. In any case the algorithm shows the logical concurrency, and it is up to the execution entities to schedule the work intelligently.

Note that I also put a parameter list in the first line, but I didn’t use it. I had in mind a parameter for number of portions. Then that parameter could be used to compute how many carrots to chop, how much water to boil, and so forth. Do you thinks home cooks would like recipes written that way?

These problem may not be too important for cooks, but they are critical for software developers, especially concurrency. Later we will see graphical representations for modeling concurrent behavior. There are even graphical programming languages, mostly found in manufacturing, that emphasize concurrency.

4.3    Task Dependencies

Concurrent does not mean independent. A good design limits dependencies. The more dependent one task is on others, the more difficult it will be to understand, implement and change the design. Many books have been written on this subject. Managing dependencies is what encapsulation is all about: try to keep as much private as you can; recognize what must be global or public; limit the task interactions.

4.3.1    Private Data

Private data is just that, private. Only the task that owns private data can view and modify it. This restricted locality limits where you have to look to see how the private data is used. The private data is encapsulated.

4.3.2    Shared Data - Global and Public Data

Sometimes it is convenient and necessary to share data. Here are some examples from human systems:

·         In the classroom, all students can see anything written on the blackboard. The writing on the blackboard is global to the classroom application. All the students and the teacher are acting concurrently.

·         In a delicatessen, customers take a number from a single dispenser in order to queue up to order food. The dispenser is public and shared by all customers in the delicatessen application. All the customers are acting concurrently.

·         Two cars are going in the same direction in lanes one and three of a three-lane highway. The middle lane, lane two, is empty and either car could change lanes into it. All drivers share the lanes. All the drivers are acting concurrently.

These situations are so familiar that we aren’t even aware of the sharing problems:

·         Two people cannot write simultaneously on the same space on the blackboard. Yet, everyone can read the blackboard simultaneously.

·         Two people cannot grab a delicatessen number at exactly the same time.

·         Two cars in the outside lanes cannot both merge into the same space of the center lane.

Think about how you solve these problems in real life. Later in section 4.5.2 we will see how to solve these problems in computer systems.

4.3.3    Task Interactions

Two (or more) tasks may interact in carrying out their responsibilities:

·         Providing Status – One may simply inform the other of some event, e.g. “I will be free at 3pm.”

·         Asking for services – One may make a request of the other, e.g. “Can you get me Fred’s email address?” Traditional method calls are requests for service.

These interactions can be synchronous or asynchronous:

Synchronous means the initiator waits for a status acknowledgement or the service to be completed.

Asynchronous means the initiator doesn’t wait.

In human systems the interactions are communicated in two ways:

Directly through voice, gestures, etc. This can be done synchronously or asynchronously.

Indirectly through mail. This is always asynchronous.

4.4    Concurrency in Software systems

Concurrent software applications are all around us. Web browsers and web servers are an obvious example. Web applications are also examples of a distributed system. We’ll spend a lot of time on this application later in the course in section 8.

Another smaller example of a concurrent application is your email program. All email programs have the following basic functional requirements:

·         We should be able to receive and read incoming mail.

·         We should be able to compose and send outgoing mail.

In most implementations the following happens:

Downloading Mail: On a timer or mouse click, our local email program connects to our mail server to download new messages. It adds them to the input mail folder and displays key headers (from, to, subject) on the screen.

Composing Mail: On a mouse click, a mail template is put on the screen for you to fill out. When done, your SMTP (Simple Mail Transport Protocol) server sends the mail to the recipients.

Downloading and composing mail are logically concurrent. They could be happening simultaneously: the incoming mail can be downloaded (in the background), while you are composing mail (in the foreground).

And yet, in many of the early email systems when you clicked “Read Mail” nothing else could be done until the incoming mail was completely processed. If you had a lot of mail, this wait was particularly annoying. In modern email systems, the two tasks are assigned to different threads and the operating system interleaves their execution. The next few section will explain these concepts.

4.5    A Little About Operating Systems (OS)

We’ve spent a lot of time investigating applications that have been divided into concurrent tasks. Now it is time to see how operating systems work and support concurrent software.

4.5.1    Software and How it Executes

. In general:

·         Programs are textual representations of tasks that are going to execute on computers. Programs are compiled from a source language, such as Java or C++, into machine code that will execute on a target computer, such as a Unix or Windows machine.

·         The compilation produces an object, often called an executable, which can be loaded. Most of the executable is the code plus data (the logical address space).

·         When loaded, the executable becomes a process. [Some programs are compiled into libraries whose routines that can be called by a process. Libraries can be shared among processes or linked directly into the process].

·         A process can have one or more threads of control.

·         The operating system scheduler time-shares among all the active threads. This time-sharing interleaves the execution of the threads.

Remember, computers don’t execute programs. Programs are simply the source text that is be compiled into executables.

4.5.1.1  Processes

A process is often called a program in execution. That informal statement captures the notion of a passive program versus an active process. Once the program’s executable is stored onto disk, you can run it. First, the operating creates a process. Here are some of the steps:

·         The OS loads the logical address space of the executable into main memory. The newly loaded physical address space contains the code to be executed, storage for the data used by code, and stack space given to the process for use during method calls

·         The OS creates a thread of control by assigning a program counter (PC) to point to the current instruction. On initial load the PC points to first instruction of the main routine.

·         The OS puts the thread on the scheduler’s ready queue.

As we will now see, it is threads that the CPU executes.

4.5.1.2  Threads

We often think of threads as physical things. People have trouble visualizing what a thread is. You won’t be one of them after this class.

As mentioned above, each process has a default thread that starts executing at the beginning of the main routine. However, all modern operating systems and languages have support for multi-threaded processes. In other words the thread that is running can spawn another thread (or whole process). All that is required is to create an instance of the code at which to start execution.

In Java, you create a class that extends Thread or implements the runnable interface. Then you invoke the instance’s start() method, which tells the JVM to create a thread and begin its execution at the run() method. Easy.

In C++ , you use the Pthreads package (see 13.1).
http://www.awprofessional.com/articles/article.asp?p=169479&seqNum=1

In both cases the operating system creates a new thread of control by assigning a program counter to the first instruction, assigning some stack space (and a few other things), and putting the thread on the ready queue. The new thread competes for the CPU time just like any other thread.

However, and I can’t say this too strongly, newly created threads are part of the same address space as the thread that created them. Any global data in that address space is available to all the threads within the process. The next section explains.

4.5.1.3   Address Space, Program Counters, and Execution Details

We will use the Agent application to show how memory and execution are managed. Figure 1 shows the state of memory after loading Main.java of v1.0. Here is the code we care about:

            MaitreDAgent maitreD = new MaitreDAgent("MaitreD");   maitreD.startThread();
            CustomerAgent c1 = new CustomerAgent("Fred");
            CustomerAgent c2 = new CustomerAgent("Ethel");
            CustomerAgent c3 = new CustomerAgent("Ginger");
            c1.startThread();
            c2.startThread();
            c3.startThread();

We’ve called the main thread t1. Its program counter is denoted by an arrow pointing to the next instruction to be executed in that thread. Before going into the details of the applications, there are several things to note about this “representation”:

A computer’s memory is shared amongst all the active processes. Each process will look like this one with at least one thread of control. Each address space is protected from one another so they do not interfere (you will learn about memory management in detail in your operating system class).

Each cell is meant to represent one word of memory. Generally, a word of memory is 4 bytes long, or long enough to hold enough to hold a computer instruction or a pointer to another location in memory.

The loaded code is shown in source form (with some comments). In reality, there are no comments and the code is compiled into word-sized machine instructions.

Notice that an address space contains:

·         One copy of all the global variables.

·         One copy of the code.

You will see that each class variable points to its own copy of the local variables associated the class.

 

//All Code below for Test.java

//Data for Test.java Follows

t1

 

 
//main routine

//global variables (if any)

maitreD=new MaitreDAgent(); maitreD.startThread();

c1=new CustomerAgent("Fred"); c1.startThread();

//local variable of main routine

c2=new CustomerAgent("Ethel"); c2.startThread();

maitreD: null

c3=new CustomerAgent("Ginger"); c3.startThread();

c1: null

c1.setMaitreD(maitreD);

c2: null

c2.setMaitreD(maitreD);

c3: null

c3.setMaitreD(maitreD);

c1.setHungry();

c2.setHungerLevel(7);

c2.setHungry();

c3.setHungry();

currentThread.kill();//put in by compiler

//AgentThread Methods

//run()

goOn = True

while (goOn) {

}

currentThread.kill();//put in by compiler

//other AgentThread methods

//CustomerAgent Code

//CustomerAgent Constructor

CustomerAgent(String name){

}

//other CustomerAgent methods

respondToStateChange() {

}

//MaitreDAgent Code

//MaitreDAgent Constructor

//Stack space for thread t1 follows

MaitreDAgent(){

}

//other MaitreDAgent methods

respondToStateChange() {

}

//Code for utility classes

//TimerQueue code

schedule(){

}

Figure 1 Memory after Loading Test.java

When executed, the line of code pointed to by t1’s program counter causes execution of the MaitreDAgent constructor: maitreD=new MaitreDAgent().  Once the startThread() is called, the full instantiation is shown in Figure 2. In particular:

Another thread, t2, is started. The new program counter points to the first instruction of run() (inherited by the MaitreDAgent from the Agent base class).

A stack has been allocated for t2.

Some data space has been allocated for the local variables of this instance of a MaitreDAgent. After execution of the constructor, the variable, maitreD, is associated with this space (as well as the methods of MaitreDAgent).

The new thread competes with all threads for the CPU. Currently, t1 is running. Any thread runs until:

·         the scheduler picks another thread after a certain amount of time elapses;

·         the thread blocks waiting for I/O;

·         the thread blocks waiting for an event. For example, in the infinite loop in the run() method, our agent can block while waiting for its semaphore to be releaseed.

Thread t1 may run to completion before thread t2 executes even one instruction. Or t2 might be selected right now. These timing uncertainties are called race conditions (see section 4.5.2).

 

//All Code below for Test.java

//Data for Test.java Follows

//main routine

//global variables (if any)

maitreD=new MaitreDAgent()

c1=new CustomerAgent("Fred")…

//local variable of main routine

c2=new CustomerAgent("Ethel")…

maitreD: //points to data below

c3=new CustomerAgent("Ginger")…

c1: null

c1.setMaitreD(maitreD);

c2: null

c2.setMaitreD(maitreD);

c3: null

c3.setMaitreD(maitreD);

c1.setHungry();

//Instance data of maitreD variable

c2.setHungerLevel(7);

//Agent local variables

c2.setHungry();

stateChange:

c3.setHungry();

agentThread://pointer to thread

currentThread.kill();//put in by compiler

//AgentThread Class variables

//AgentThread Methods

goOn:

//run()

//MaiterDAgent local variables

goOn = True

waitingCustomers: pointer

while (goOn) {

tables: pointer

}

currentThread.kill();//put in by compiler

//other AgentThread methods

//CustomerAgent Code

//CustomerAgent Constructor

CustomerAgent(String name){

}

//other CustomerAgent methods

respondToStateChange() {

}

//Stack space for thread t1 follows

//MaitreDAgent Code

//MaitreDAgent Constructor

MaitreDAgent(){

}

//other MaitreDAgent methods

respondToStateChange() {

//Stack space for thread t2 follows

}

//Code for utility classes

//TimerQueue code

schedule(){

}

Figure 2 Memory After Instantiating MaitreD agent

Figure 3 shows memory after executing the next line of code in thread t1:

c1=new CustomerAgent("Fred"); c1.startThread();

Notice the following:

·         Another thread, t3, has been created for the new customer. Note that t2 and t3 point to the same instruction. It is imperative that you understand that the threads execute within the same copy of code.

·         A stack has been allocated for t3.

·         Space has been associated with the variable, c1. This space has the local variables of the class CustomerAgent. Some of those variables come from the inherited base class.

In thread t1, after execution of

c2=new CustomerAgent("Ethel"); c2.startThread();

c3=new CustomerAgent("Ginger"); c3.startThread();

there will be a total of five threads running in the Test.java process: the main thread, one MaitreDAgent thread, and three CustomerAgent threads.

The main thread, t1, might continue until it finishes. In that case t1 will exit, leaving the other threads to continue competing for CPU time.

You should be able to simulate execution of the CPU by:

·         Picking an active thread;

·         “Executing” the instruction pointed to by its program counter.

·         “Incrementing” the program counter to point to the next instruction.

·         Continuing the Executing/Incrementing until you pick another thread.

Notice that that there is no real concurrency going on here since there is only one CPU. However, the scheduling behavior allows interleaving and gives the illusion of concurrency. If you assigned each thread to a separate processor, then you would have real concurrency. Since typical computers only have one CPU for executing code, only distributed applications (section 5) have real concurrency.


 

 

//All Code below for Test.java

//Data for Test.java Follows

//main routine

//global variables (if any)

maitreD=new MaitreDAgent()

c1=new CustomerAgent("Fred");…

//local variable of main routine

c2=new CustomerAgent("Ethel");…

maitreD: //points to data below

c3=new CustomerAgent("Ginger");…

c1: //points to data below

c1.setMaitreD(maitreD);

c2: null

c2.setMaitreD(maitreD);

c3: null

c3.setMaitreD(maitreD);

c1.setHungry();

//Instance data of variable maitreD

c2.setHungerLevel(7);

//Agent local variables

c2.setHungry();

stateChange:

c3.setHungry();

agentThread://pointer to thread

currentThread.kill();//put in by compiler

//AgentThread Class variables

//AgentThread Methods

goOn:

//run()

//MaiterDAgent local variables

goOn = True

waitingCustomers: pointer

t3

 

 
while (goOn) {

tables: pointer

}

//Instance data of variable c1

currentThread.kill();//put in by compiler

//Agent local variables

//other AgentThread methods

stateChange:

agentThread://pointer to thread

//CustomerAgent Code

//AgentThread Class variables

//CustomerAgent Constructor

goOn:

CustomerAgent(String name){

//CustomerAgent local variables

name: “Fred”

}

hungerLevel: 5

//other CustomerAgent methods

maitreD: null

respondToStateChange() {

isHungry: false

isWaiting: false

isSeated: false

}

//Stack space for thread t1 follows

//MaitreDAgent Code

//MaitreDAgent Constructor

MaitreDAgent(){

//Stack space for thread t2 follows

}

//other MaitreDAgent methods

respondToStateChange() {

//Stack space for thread t3 follows

}

//Code for utility classes

//TimerQueue code

schedule(){

}

Figure 3 Memory After Instantiating MaitreD agent and Customer Agent Fred

4.5.1.4  Data Sharing Problems - Producer-Consumer example

Concurrency and data sharing are separate issues:

·         The illusion of concurrency via threads produces interleaving behavior.

·         Data Sharing between threads allows information exchange.

Concurrency and data sharing together cause the Critical Section Problem. The Agent application has examples of this problem, but the classic Producer-Consumer problem shows it more simply. Assume there are multiple producers of items that are used by multiple consumers as follows:

·         Producers put items into a circular array called buffer whose length is size.

·         The next free location for a new producer item is called in.

·         Consumers take elements out of buffer from the location called out.

·         If in==out then buffer is empty.

·         If in+1==out then buffer is full.

·         The variables, in, out, and buffer are shared by all producers and consumers.

·         Each producer and consumer is executing in its own thread.

Figure 4 shows the address space for this problem with two producers—threads P1 and P2—and one consumer with thread C1.

//producer code

//global variables

while (true) {

in:   7

  while ((in+1)mod size)==out);//wait

out:  4

  buffer[in]=nextProduced();

size: 10

  in=(in+1)mod size;

buffer[0]:

}

buffer[1]:

buffer[2]:

//consumer code

buffer[3]:

while (true) {

buffer[4]:   xyz

  while (in==out);//wait, empty

buffer[5]:   abc

  next=buffer[out];

buffer[6]:   def

  out=(out+1)mod size;

buffer[7]:   ghi //by P1

}

buffer[8]:

buffer[9]:

Figure 4 An address space for Producer Consumer

Let’s suppose that in this execution:

·         P1 is running and has finished executing buffer[in]=nextProduced(); The result put “ghi” into buffer[7]. P1’s program counter has been incremented and now points to the code that is about to increment in.

·         Let’s say the operating system scheduler interrupts P1 at exactly this point and gives control to P2. Remember that P1 and P2 are executing the same code and sharing the global variables.

·         P2 sees there is room in the buffer and moves on to the next instruction.

·         P2  executes buffer[in]=nextProduced() and puts its output, say “mno,” into buffer[7], the same place that P1 did. Yikes!!

·         P2 executes the increment statement, making in=8.

·         Let’s say P2 in interrupted right now.

Memory now looks like the following:

 

//producer code

//global variables

while (true) {

in:    8

  while ((in+1)mod size)==out);//wait

out:  4

  buffer[in]=nextProduced();

size: 10

  in=(in+1)mod size;

buffer[0]:

}

buffer[1]:

buffer[2]:

//consumer code

buffer[3]:

while (true) {

buffer[4]:   xyz

  while (in==out);//wait, empty

buffer[5]:   abc

  next=buffer[out];

buffer[6]:   def

  out=(out+1)mod size;

buffer[7]:   mno

}

buffer[8]:

buffer[9]:

Figure 5 Just after P2 runs

If control is given back to P1, then P1 increments in to be 9. Yikes again!!! The state of the system is terrible:

·         P1’s output into buffer[7] has been overwritten and lost.

·         in is now pointing to element 9 of buffer.

·         buffer[8] does not have a valid item in it. What will happen to the consumer when it gets there?

4.5.2    Race Conditions and the Critical Section Problem

This disaster occurred because P1 was interrupted before it got a chance to update the in variable. However next time this program is run, P1 might complete its update and all might be well. P1 and P2 are in a race to complete their update before they get interrupted. Race conditions are part of any multi-threaded application. The bug here is particularly difficult to track down because it may not be reproducible—every mix of code in the operating system will produce different timings and results. We can’t prevent race conditions, but we can write better code.

The above analysis shows that only one producer thread should be allowed to execute the code in the while(true) loop. That code is called a critical section, and its execution must be atomic, i.e., its execution must complete before any other thread can enter it. You will study this problem in depth in your Operating System class. For now, here are two techniques to solve this error, synchronization and messaging.

4.5.2.1  Synchronization to the Rescue

Synchronization controls event ordering. We need them anytime we want concurrent events to happen simultaneously or in some sequence. For the Critical Section problem we want to control entry in the critical section—only letting one thread in at a time—and upon exit from the critical section we can let another thread in, i.e., provide serial access to the critical section.

Both semaphores and monitors provide synchronization.

4.5.2.1.1   Semaphore

A semaphore, s, is an object with two operations, acquire() and release(). wait() and release() are also known as wait and signal, p and v, down and up, or take and give. Semaphore has a slot called permits, which controls the number of thread that can "get into" the rest of the code. Whatever they are called, they have the following semantics:

acquire() {

   while (s.permits <= 0); //wait

   s--;

}

release(){

   s.permits++

}

 

This is not code, but how any implementation should behave. Acquire and Release are meant to execute atomically, that is, they can’t be interrupted.

A semaphore can be used for simple synchronization between threads. Suppose thread A must wait for thread B to finish some task S before proceeding. They can use a semaphore as follow:

//global

semaphore synch = new Semaphore(0); //initial value set to zero.

//thread A

//thread B

synch.acquire();

S;

synch.release()

A semaphore also can guard a critical section as follows:

//global variable

Semaphore s = new Semaphore(1); //initial value set to one.

 

while (true)

  s.acquire();  //grab semaphore lock

  critical_section();

  s.release(); //release semaphore lock

  other_code();

}

 

In java versions beyond 1.5, we can try to acquire a semaphore. The two versions  returns a boolean if we were successful. Here are the two versions:

boolean tryAcquire()
          Acquires a permit from this semaphore, only if one is available at the time of invocation.    

 

boolean tryAcquire(long timeout, TimeUnit unit)
          Acquires a permit from this semaphore, if one becomes available within the given waiting time and the current thread has not been interrupted.

 

How would you use these calls. Here are two examples:

 

while (!sem.tryAcquire()) {

  ... //perform this until the semaphore is available

}

//code that depends on having the semaphore

...

or

while (true) {

if (sem.tryAcquire(100 milliseconds)) {

... //activity for which semaphore was needed

} else {

...      //see if there’s something else I should be doing,

//which doesn’t require the semaphore. 

//Or log a warning that I’m still waiting for the semaphore.

//Or return failure.

//Or ...

    }   

}

4.5.2.1.2   Monitor

Semaphores are easy to misuse. Monitors offer an alternative that can be enforced by the compiler. A monitor encapsulates the various pieces of code that need serial access. For example:

monitor MyMonitor{

//shared data declarations here

initialization_code(){…} // could be a constructor

 

// monitor operations follow.

//Only one can be active at a time

f1() {…}

f2() {…}

}

This monitor says that only one of the monitor operations can be active at a time. Monitors should be put “around” all shared data.

Here is the producer consumer problem in pseudo-code using monitors:

monitor ProducerConsumerMonitor {

//shared data declarations here

int size=10;

private buffer[size];

int in=0, out=0;

 

//monitor operations

insert() {…}

remove() {…; return item}

}

 

//global variable

ProducerConsumerMonitor theMonitor;

 

//producer code (in its own thread)

while(true) {

Item data = produce_item();

theMonitor.insert(data);

}

 

//consumer code (in its own thread)

while(true) {

Item data = theMonitor.remove();

consume_item(data);

}

Since insert() and remove() are in the monitor, access to the shared variables are guaranteed to be serial. Monitors elegantly solve this difficult problem.

Java has a mechanism that approximates monitors. By assigning the keyword synchronized to a method, you are declaring that only serial access is allowed. For a full Java implementation of the Producer-Consumer problem, see ProducerConsumer Monitor,  ProducerConsumer Application,   Another App. With Multiple Producers and Consumers,    We will discuss it in detail in class.

There are also several instances of synchronization in the agent application.

[Aside: Producer-Consumer is a pattern. If you can cast your problem as one of producer-consumer, than you can immediately use the solutions shown above. On simple reflection this can be done to many problems. For example, in the email application, the thread that gets mail from the internet server is a producer; the thread that puts summaries on the screen is a consumer.]

4.5.2.1  Messaging to the Rescue

Semaphores, monitors, and synchronized methods are fraught with danger. The very best programmers have trouble getting this kind of code to work. If you could desynchronize the code in your application by NOT sharing data, things would be much simpler.

In messaging, a sender process creates a message and sends it to a recipient process. The message is not shared: during creation the sender owns it, after reception the recipient owns it. You can build messaging based applications. The Agent Roadmap details such an application.

5      Distributed Systems

5.1    Introduction

So far all of our interactions have been between threads running in the same process. However, with the advent of networks, processes, perhaps on different computers, can communicate with one another in at least three logical ways:

·         Exchanging steams of data, which we will investigate via Socket Programming.

·         Messaging, which we have already discussed.

·         Ordinary subroutine, or method, calls, which we will investigate via Java’s Remote Method Invocation (RMI).

Whatever method is used, the receiving process receives a deep copy of the data. The result is:

·         No data is shared.

·         The invocation—whether it is the sending of a data stream, a message, or a method call—is physically asynchronous because the two communicating processes are in separate processes.

·         The “calling” application can treat the invocation as logically synchronous by waiting for the return result, if there is one. If nothing is returned, then the effect is like messaging.

Though the communication methods are different logically, they all work similarly at the lowest levels by exchanging streams of data. Let’s look at that layer first.

5.2    Distributed Programming with Sockets

Here is a summary of how a client-server system works with sockets.

·         A Server process “listens” on a socket;

·         Client processes “connect” to that socket and start the conversation;

·         Each Client and the Server have a “conversation” over 2 channels:

·         One from client to server;

·         The other from server to client.

·         The details of the conversation depend on the protocol.

Client and Server can be written in different computer languages.

The following examples come from “Core Web Programming” by Marty Hall and Larry Brown, ISBN 0-13089793-0 ( www.corewebprogramming.com).

5.2.1    Implementing a Client with Sockets

First comes the main() routine to house the client.

public class NetworkClientTest {

  public static void main(String[] args) {

    String host = "localhost";

    if (args.length > 0)

      host = args[0];

    int port = 8088;

    if (args.length > 1)

      port = Integer.parseInt(args[1]);

    NetworkClient nwClient

      = new NetworkClient(host, port);

    nwClient.connect();

  }

}

 

We will show the detail of the NetworkClient class in a moment. We are covering clients before we cover servers because clients are slightly easier. We can test clients by connecting to existing servers that are already on the Internet. For example, using this client we can connect to Netscape’s ftp server over port 21 as follows:

 

>java NetworkClientTest ftp.netscape.com 21

Generic Network Client:

Made connection to ftp.netscape.com and got

‘220 ftp26 FTP server (UNIX(r) System V Release 4.0)

ready.’ in response
>

Here are the steps of creating a client in the
NetworkClient class:

1)  Create a Socket object
Socket client = new Socket("hostname", portNumber);

2)  Create an output stream that can be used to send info to the Socket (to the server):
// Last arg of true means autoflush -- flush stream
// when println is called
PrintWriter out = new PrintWriter(client.getOutputStream(), true);


3)  Create an input stream to read the response from the server (from the socket):
BufferedReader in = new BufferedReader
    (new InputStreamReader(client.getInputStream()));

4)  Do I/O with the input and output Streams

5)  For the output stream, PrintWriter, use print and println, similar to System.out.println.

6)  For the input stream, BufferedReader, you can call read to get a single character or an array of characters,
or call readLine to get a whole line.

7)  Close the socket when done:
client.close();//closes the associated input and output streams

Here is the actual code of the NetworkClient class, which implements the above five steps.

import java.net.*; 

import java.io.*;

 

public class NetworkClient {

  protected String host;

  protected int port;

 

  public NetworkClient(String host, int port) {

    this.host = host;

    this.port = port;

  }

  public String getHost() {

    return(host);

  } 

  public int getPort() {

    return(port);

  } 

  /** Establishes the connection, then passes the socket

   *  to handleConnection. */

  // see http://java.sun.com/j2se/1.4.2/docs/api/index.html

  // Look for Socket class 

  public void connect() {

    try {

      Socket client = new Socket(host, port);

      handleConnection(client);

    } catch(UnknownHostException uhe) {

      System.out.println("Unknown host: " + host);

      uhe.printStackTrace();

    } catch(IOException ioe) {

      System.out.println("IOException: " + ioe);

      ioe.printStackTrace();

    }

  }

    /** This is the method you will override when

   *  making a network client for your task.

   *  This default version sends a single line

   *  ("Generic Network Client") to the server,

   *  reads one line of response, prints it, then exits.

   */ 

  protected void handleConnection(Socket client)

      throws IOException {

    PrintWriter out = SocketUtil.getPrintWriter(client);

    BufferedReader in =

      SocketUtil.getBufferedReader(client);

    out.println("Generic Network Client");//to the server

    System.out.println

      ("Generic Network Client:\n" +

       "Made connection to " + host +

       " and got '" + in.readLine() + "' in response");

    client.close();

  }

}

 

So, it creates the two streams from the “client” socket and

·         sends the opening message via out.println()and

·         reads the response via in.readLine(), which blocks until the server responds, thus making the exchange synchronous).

SocketUtil  has the code for the creation of the Reader and Writer:

import java.net.*;

import java.io.*;

public class SocketUtil {

  /** Make a BufferedReader to get incoming data.  */ 

  public static BufferedReader getBufferedReader

                        (Socket s) throws IOException {

    return(new BufferedReader(

      new InputStreamReader(s.getInputStream())));

  }

  /** Make a PrintWriter to send outgoing data.

   *  This PrintWriter will automatically flush stream

   *  when println is called.

   */

  public static PrintWriter getPrintWriter(Socket s)

      throws IOException {// true means autoflush

    return(new PrintWriter(s.getOutputStream(), true));

  }

}

 

5.2.2    Implementing a Server with Sockets

Here is the main routine to house the Server:

public class NetworkServerTest {

  public static void main(String[] args) {

    int port = 8088;

    if (args.length > 0) {

      port = Integer.parseInt(args[0]);

    }

    NetworkServer server = new NetworkServer(port, 1);

    server.listen();

  }

}

We will show the detail of the
NetworkServer class in a moment. NetworkServerTest can accept a connection from any WWW Browser over port 8088 once we start it on server.com as follows:    

server.com>java NetworkServerTest

Then, if a standard Web browser on client.com requests file foo whose URL is  http://server.com:8088/foo/, the following will be output by our server:

Generic Network Server:

got connection from client.com

with first line 'GET /foo/ HTTP/1.0'

Here are the steps of creating a server in the NetworkServer class:

1)  Create a ServerSocket object
ServerSocket listenSocket = new ServerSocket(portNumber);

2)  Create a Socket object from ServerSocket
while(someCondition) {
  Socket server = listenSocket.accept();
  doSomethingWith(server);//often new thread}

3)  Create an input stream to read client input
BufferedReader in =
 new BufferedReader
  (new InputStreamReader(server.getInputStream()));

4)  Create an output stream that can be used to send info back to the client.
// Last arg of true means autoflush stream
// when println is called
PrintWriter out = new PrintWriter(server.getOutputStream(), true)

5)  Do I/O with input and output Streams:

·         Most common input: readLine

·         Most common output: println

6)  Close the socket when done
server.close();

Here is the actual code of the  NetworkServer class, which implements the above six steps.

import java.net.*;

import java.io.*;

/** A starting point for network servers. */

 

public class NetworkServer {

  protected int port, maxConnections;

  /** Build a server on specified port. It will continue

   *  to accept connections (passing each to

   *  handleConnection) until an explicit exit

   *  command is sent (e.g. System.exit) or the

   *  maximum number of connections is reached. Specify

   *  0 for maxConnections if you want the server

   *  to run indefinitely.

   */ 

  public NetworkServer(int port, int maxConnections) {

    this.port = port;

    this.maxConnections = maxConnections;

  }

  /** Monitor a port for connections. Each time one

   *  is established, pass resulting Socket to

   *  handleConnection.

   */ 

  public void listen() {

    int i=0;

    try {

      ServerSocket listener = new ServerSocket(port);

      Socket server;

      while((i++ < maxConnections) ||

            (maxConnections == 0)) {

        server = listener.accept();

        handleConnection(server);

      }

    } catch (IOException ioe) {

      System.out.println("IOException: " + ioe);

      ioe.printStackTrace();

    }

  }

  protected void handleConnection(Socket server)

      throws IOException{

    BufferedReader in =

      SocketUtil.getBufferedReader(server);

    PrintWriter out =

      SocketUtil.getPrintWriter(server);

    System.out.println

      ("Generic Network Server:\n" +

       "got connection from " +

       server.getInetAddress().getHostName() + "\n" +

       "with first line '" +

       in.readLine() + "'");

    out.println("Generic Network Server");

    server.close();

  }

}

5.2.3    Aside: Parsing Strings Using StringTokenizer

When using encoded strings to pass information, you need methods to break out the tokens. The general method works as follows:

·         Build a tokenizer from an initial string.

·         Retrieve tokens one at a time with nextToken.

·         You can also see how many tokens are remaining (countTokens) or simply test if the number of tokens remaining is nonzero (hasMoreTokens)

 

The code will look as follows:

 

StringTokenizer tok

     = new StringTokenizer(input, delimiters);

while (tok.hasMoreTokens()) {

     doSomethingWith(tok.nextToken());

}

There are many packages to do this. One can be found in the Core Web Programming book. We won’t be doing this in our class.

5.2.4    A Little about the HTTP Protocol

Thus far we have learned how to set up connections using sockets, talked a little about tokenizing, but have ignored the most complicated part, what do the strings look like that we send and receive. That depends on the protocol. As we mentioned before, you can roll your own protocol, or use one of the established ones. Here we will look briefly at HTTP.

To show what HTTP strings are like, let’s try to retrieve the page  http://www.apl.jhu.edu/~lmb using telnet over port 80. Here’s how:

Unix> telnet www.apl.jhu.edu 80

Trying 128.220.101.100 ...

Connected to aplcenmp.apl.jhu.edu.

Escape character is '^]'.

GET /~lmb/ HTTP/1.0

 

HTTP/1.0 200 Document follows

Date: Sat, 30 Jun 2001 14:34:58 GMT

Server: NCSA/1.5.2

Last-modified: Tue, 11 Jul 2001 15:13:56 GMT

Content-type: text/html

Content-length: 50479

 

<!DOCTYPE HTML PUBLIC

          "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML>

...

</HTML>Connection closed by foreign host.

Unix>

Here is what happened:

Via telnet, we can directly pass strings to a Web server over port 80: telnet www.apl.jhu.edu 80

We typed GET /~lmb/ HTTP/1.0 (and a blank line) to the server. GET is one of the commands in the HTTP protocol.

The document is returned. Notice that after the string: HTTP/1.0 200 Document follows there are 5 HTTP headers giving information about the retrieval and a blank line (to signify the end of the headers):

The file follows starting with <!DOCTYPE HTML PUBLIC

Finally, the connection is closed terminating the GET request.

The documentation for the HTTP protocols describes all the commands and headers.

With the above example, it is fairly easy to see how we could build a simple graphical user interface to communicate with HTTP servers. User can interactively specify:

·         Host and Port

·         HTTP request line

·         HTTP request headers

Then the HTTP request could be performed in a separate thread and the response document put into a scrollable text area. Here is what a GUI might look like:

The request in the picture asked for the (default) html file at the URL www.apache.org (usually this is a file called index.html)  The If-Modified-Since header asked only for a file modified since the date given. There were none, as can be seen from the results.

To build such a client, you have to know what HTTP requests and HTTP responses look like. In general a request is as follows:

Request

GET /filename/ HTTP/1.0

Header1: …

Header2: …

HeaderN: …

Blank Line

All request headers are optional except for Host (required only for HTTP/1.1 requests). Responses are like this:

Response

HTTP/1.0 200 OK

Content-Type: text/html

Header2: …

HeaderN: …

Blank Line

<!DOCTYPE …>

<HTML>

</HTML>

All response headers are optional except for Content-Type. You can research all the HTTP details online as needed.

So, creating a simple HTTP Server is straightforward:

·         Read all the lines sent by the browser, storing them in an array.

·         Use readLine (a line at a time) until you reach a blank line.

·         Note: with the POST request you have to read some extra data

·         Send an HTTP response line (e.g. "HTTP/1.0 200 OK")

·         Send a Content-Type line. This indicates the file type being returned; text/html is the most common.

·         Send a blank line.

·         Send the contents of the file.

·         Close the connection.

You can download all source files for this WebClient and WebServer from http://archive.corewebprogramming.com/Chapter17.html

5.3    Distributed Objects

Socket programming is hard!! With Sockets: not only did you have to manage the sending and receiving of data, you had to design a protocol (decide what goes into the strings and how the conversation should proceed) or use an existing one. For many applications we just want a “simple” procedure-call protocol. In other words we want to invoke public methods on remote objects. We want to make all the hard network stuff transparent to the application programmer. Distributed Object systems do exactly that. Let’s make sure we understand how distributed objects differ from local objects.

In a local method call, the data in the parameters are passed directly to the target method by putting a pointer to the data on the stack. This “sharing” of data doesn’t cause any synchronization problems because control is passed to the method along with the data; all access is serial.

With distributed objects, method calls can be remote, i.e., between objects running in different processes on different computers. The key question is how to make this transparent to the application programmer, i.e., how to allow a process on one machine to make normal looking method calls on an object running on another machine. This transparency is the key to distributed objects. Systems that offer this transparency are called middleware. CORBA and Java’s RMI are two such systems.

In CORBA the communicating objects do NOT need to be programmed in the same language. In Java’s RMI (Remote Method Invocation) the objects all have to be coded in Java. Both work similarly:

·         A special compiler uses the remote interface specification to build stubs and skeletons.

·         The remote object is started and identifies itself to some registry.

·         The caller finds the object it wants to use in the registry. This lookup is a substitute for a constructor call.

·         The caller invokes methods on the object normally.

·         The method invocation is fielded by the stub. Every call in the remote object’s interface has a method in the stub. The stub marshals the call as a data stream and sends it to the remote object.

·         The stream is read by the skeleton, parsed, and the actual method in the remote object is invoked.

·         The skeleton marshalls the return value and sends it back to the stub, which returns it to the original invocation.

That is a lot of work when compared to a simple local method call. Figure 6 below shows some pseudo-code that implements this scenario.

Figure 6 - A Distributed Object Architecture

 

5.4    Java’s Remote Method Invocation (RMI)

Using RMI you will be able to deploy objects on different machines. RMI handles network connections automatically. Java “serialization” lets you pass complex objects over the network without writing code to parse and reconstruct them.

5.4.1    RMI Details

To do RMI you must specify four classes:

1.      An interface for the remote object - Used by both the client and the server

2.      The RMI client - This will look up the object on the remote server, cast it to the type of the interface from Step 1, then use it like a local object. Note that as long as there is a “live” reference to the remote object, an open network connection is maintained. The connection will be automatically closed when the remote object is garbage collected on the client.

3.      The object implementation - This object needs to implement the interface of Step 1, and will be used by the server.

4.      The RMI server - This will create an instance of the object from Step 3 and register it with its URL.

Once you have written the four classes:

·         Compiles the remote object interface and implementation automatically.

 

The client system will need the client class, the interface class, and the client stub class

The server system will need the server class, the remote object interface and implementation, and the server skeleton class

Then to run the system:

·         This only needs to be done once, not for each remote object

·         The current version of RMI requires this registry to be running on the same system as server

·         This step must be on the same machine as the registry of step 7.

 

This step can be done on an arbitrary machine.

5.4.2    The Four Required Classes: The Interface

 

import java.rmi.*;

 

/** The RMI client will use this interface directly.

 *  The RMI server will make a real remote object that

 *  implements this, then register an instance of it

 *  with some URL.

 */

 

public interface Rem extends Remote {

  public String getMessage() throws RemoteException;

}

5.4.3    The Four Required Classes: The Client

 

import java.rmi.*; // For Naming, RemoteException, etc.

import java.net.*; // For MalformedURLException

import java.io.*;  // For Serializable interface

 

public class RemClient {

  public static void main(String[] args) {

    try {

      String host = (args.length > 0) ? args[0] : "localhost";

      Rem remObject = (Rem)Naming.lookup("rmi://" + host + "/Rem");

      System.out.println(remObject.getMessage());

    } catch(RemoteException re) {

      System.out.println("RemoteException: " + re);

    } catch(NotBoundException nbe) {

      System.out.println("NotBoundException: " + nbe);

    } catch(MalformedURLException mfe) {

      System.out.println("MalformedURLException: " + mfe);

    }

  }

}

 

5.4.4    The Four Required Classes: The Remote Object Implementation

 

import java.rmi.*;

import java.rmi.server.UnicastRemoteObject;

 

public class RemImpl extends UnicastRemoteObject

                     implements Rem {

  public RemImpl() throws RemoteException {}

 

  public String getMessage() throws RemoteException {

    return("Here is a remote message.");

  }

}

 

5.4.5    The Four Required Classes:  The RMI Server

 

import java.rmi.*;

import java.net.*;

 

public class RemServer {

  public static void main(String[] args) {

    try {

      RemImpl localObject = new RemImpl();

      Naming.rebind("rmi:///Rem", localObject);

    } catch(RemoteException re) {

      System.out.println("RemoteException: " + re);

    } catch(MalformedURLException mfe) {

      System.out.println("MalformedURLException: " + mfe);

    }

  }

}

 

5.4.6    Compiling and Running the System

 

1.  Compile the Client and the Server
Prompt> javac RemClient.java

2.  Compile the Rem interface and RemImpl object automatically
Prompt> javac RemServer.java

3.  Generate the Client Stub and Server Skeleton
Prompt> rmic –classpath . RemImpl

·         This builds RemImpl_Stub.class and RemImpl_Skeleton.class

·         The client machine needs Rem.class, RemClient.class, and RemImpl_Stub.class

·         The server machine needs Rem.class, RemImpl.class, RemServer.class, and RemImpl_Skeleton.class

·         Java 1.4 no longer builds the skeleton class explicitly.

·         With Java 1.5 rmic is done automatically during compilation and NO stub or skeleton classes are created explicitly.

4.  Start the RMI Registry
Server> rmiregistry

5.  Start the Server
Server> java –classpath . RemServer

6.  Start the Client
Client> java RemClient –classpath . hostname
Here is a remote message.

5.4.7    Summary

 

 

5.5    Physical Deployment and Distributed Systems

Once an application has been divided into logical processes, the designer must decide how deploy these processes to physical hardware before the final programming details can be determined. The key ideas are:

Once those decisions have been made, there are several configurations to consider.

5.5.1    Single Process, multiple thread

The single process can be hosted on any computer that can run it. Typically that means the program must be compiled on a computer of the same type as the host. [Java is different. Java compilers all produce the same byte codes; the host computer just needs an installation of the Java Virtual Machine.]

All the shared data issues must be solved using semaphores or monitors.

5.5.2    Single Process, single thread, simulated concurrency

Before multi-threading, programmers could simulate concurrency by organizing their code into sections and calling them in sequence as follows:

//case 1: simulated concurrency

//global variables

main () {

  forever do {

     call A();

     call B();

     call C();

}

Now, of course one would reorganize the sections as threads and write something like:

//case 2: concurrency with threads

//global variables

main () {

     start_Thread A; //A is: forever do {call A();}

     start_Thread B; //B is: forever do {call B();}

     start_Thread C; //C is: forever do {call C();}

}

In case 1 there no interleaving of code between the sections because the operating system schedule sees just one process with one thread of control. A() runs to completion, then B() runs to completion, then C() runs to completion, and so on. Thus even though the three sections share data there is no critical section problem. [Make sure you understand this.]

In case 2, the operating system sees each thread and can schedule them separately. Thread A might be interrupted in the middle of executing A() and thread B continued. We have interleaving among A(), B(), and C() that we didn’t have in the first case. And, of course, we have all the data sharing problems.

5.5.3    Multiple-Process

Multiple processes can be deployed on separate computers. Each process will obviously have its own address space. Therefore:

5.6    Another Deployment Strategy - Web Services

We know that reusing code is wonderful. But what about offering services to application programmers dynamically through distributed servers? The big questions are how do you write such a service, how does an application programmer find out about your service and how does he use it? The idea of remote services is not new.

5.6.1    A Little History about Application/Service Topologies

Before distributed hardware, services were offered through subroutines. In single-machine, multi-process configurations, IPC (inter-process communication) was a way for different processes to communicate. Soon, multi-machine communications became real.

Distributed Applications (1st Gen.)

Local Area Networks (LAN) featured remote procedure calls, messaging, and remote file servers. With the invention of the Internet came new applications such as ftp and email communicating using protocols built on top of TCP/IP.

5.6.1.1  Distributed Applications – 2nd Gen.

Still there were no BIG systems linked together. National Software Works (1976) tried to link software via a procedure-call-protocol over the Internet. Limited Client/Server configurations came into being, especially involving databases. But, still no large integrated system existed featuring interoperability.

5.6.1.2  Distributed Applications – 3rd Gen.

Maybe the answer was to give programmers more powerful tools. CORBA was such a system. It featured

Despite GIANT effort, CORBA had no real impact. Programmers like to roll their own application! What will programmer's unify around? The Web!!

5.6.2    World Wide Web (WWW)

5.6.2.1  An Overview

The WWW is a 3-tiered architecture:

·         Tier one is the browser displaying HTML pages. It communicates with the web server via the HTTP protocol.

·         Tier two is the web browser retrieving pages (and generating them as well).

·         Tier three is the database to store application data.

Applications written for the web are loosely connected via HTTP protocol, not the tight subroutine interface common to traditional programming. At the web server, scripts (perl, servlets, asp) decode the HTTP and invoke subroutines or methods like any other code. Mostly the web is about CONTENT retrieval. eCommerce has extended the Web with a powerful application.

What's appears to be on the horizon is more powerful web applications! Many technologies are shifting their focus to be compliant with the Web. A few are:

·         Knowledge Representation ŕ Semantic Web

·         Java Platform ŕ J2EE

·         CORBA ŕ Web Services

·         Windows Platform ŕ .NET Platform

 

5.6.3    Motivation for Web Services

Programmers want their code to be used. The web offers a way of making that code available as a service rather than a Product. And why not?

·         It will be easy to maintain

·         There is a central point of change, rather than distributing service packs or hot fixes.

·         The Black Box model is convenient. The reference to the service remains the same.

·         Easy to charge for services (has not been shown to be the case).

·         Easy to avoid piracy

·         Achieve a balance between cost and usage.

At least that is the theory.  It’s easy to identify services that programmers want. Here are some of the services already in .NET.

·         Identity - Passport

·         Notification and Messaging - email, fax

·         Personalization

·         Store

·         Calendar

·         Directory and Search - find Services and people

·         Dynamic Delivery

 

5.6.4    Requirements of a Service

 

Clients must be able to discover services. (Universal Description Discovery and Integration - UDDI)

Providers must be able to describe their service (Web Services Description Language – WSDL)

Providers must be able to describe the interactions (conversations) to use the service (Web Services Conversation Language - WSCL)

 

5.6.5    Universal Description Discovery and Integration - UDDI

The UDDI XML schema (which we won’t go into here) defines four core types of service information:

·         Business information (such as business name and contact information)

·         Business service information (general technical and business descriptions of web services),

·         Binding information (specific information needed to invoke a service)

·         Service specification information (associating a service’s binding information with the business service information it implements).

 

5.6.6    Web Services Description Language – WSDL

A service must be able to describe its abstract interfaces and protocol bindings so that clients can figure out how to invoke it. It is much more complicated than our normal programming language interface.

See WSDL Schema for an example of the description.

See WSDL Example (Instance) for an example of an invocation.

 

5.6.7    Web Services Conversation Language - WSCL

A service must be able to describe the kinds of interactions (conversations) that it supports (e.g., that it expects clients to login before they can request a catalog) so that it can engage in complex exchanges with its clients. That why programming packages often come with sample programs—you can’t figure out how to use the code without an example.

WSCL is a graph formalism for describing proper usage. See WSDL Graph (p. 5) for an example.

 

5.6.8    SOAP (Simple Object Access Protocol)

Soap is a syntax for encoding calls to web services. It offers a standard on-wire protocol to describe the action and results of a web service function invocation. It is basically a way of describing a remote procedure call (RPC) as text. Some features:

·         It is XML Based

·         It uses HTTP (thus can bypass firewalls) or any other protocol (SMTP, etc).

·         It is a Cross Platform/Language standard

 

SOAP is not a big deal. It’s just a format.

5.6.8.1  SOAP Example

Here is the SOAP response to a service request that asked for the name of a snowboarder who endorsed a particular snow board:

<SOAP-ENV:Envelope

  xmlns:SOAP-ENV=http://schemas.xmlsoap.org/soap/envelope/
  SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">

  <SOAP-ENV:Body>
      <m:GetEndorsingBoarderResponse
            xmlns:m="http://namespaces.snowboard-info.com">
            <endorsingBoarder>Chris Englesmann</endorsingBoarder>
      </m:GetEndorsingBoarderResponse>
  </SOAP-ENV:Body>

</SOAP-ENV:Envelope>

You can see the answer is “Chris Englesmann.All the rest is “tags” that form the structure.

Web servers must be “Soap-enabled” to be able to deal with web services. Most are. The most common:

IIS 5.0 – Microsoft’s .NET implementation

Apache –  SOAP Module Built by IBM

Tomcat (Java Server) - Apache Axis is a package to help build web services.

 

We’ll see how success Web Services become.

6      An Example of a Distributed Application - Agent Systems

You will be working with an agent application as your first system. The Agent Roadmap will document the system. In working with this system you will be exposed to most of the issues raised above:

·         Concurrency – Each agent is logically a separate process.

·         Multi-threaded process – The first implementation assigns a thread to each agent within a single process.

·         Distributed deployment – You will change the agent application from being multi-threaded to being distributed and multi-process using Java’s Remote Method Invocation.

·         Synchronization – There are several examples of the need for synchronized access to shared data.

·         Semaphores – The agent scheduler uses a semaphore to know when something has happened.

·         Messaging – Agents communicate using messaging.

·         Requirement Analysis – You will see how implementation-shortcuts are justified by analyzing the requirement, especially with messaging.

·         Unit Testing – You will use the JUNIT package to make your unit tests persistent.

…and many more things as well.

7      Object-Oriented Design

Section 2.1 introduced Divide and Conquer as a fundamental technique for solving complex problems. We didn’t say how to divide up the problem, only that decomposition into smaller parts works. For complex software problems what those parts look like defines various software methodologies. In vogue right now is object-oriented design and analysis.

7.1    Object-Oriented Design Basics

Chapter 13 in our text does a nice job of explaining object-oriented design. The highlights are:

·         Requirements elicitation identifies the use-cases, i.e. what the system is supposed to do.

·         Convert the use-cases into scenarios that expose the detail of the behavior

·         Identify the objects (classes) in those scenarios.

·         Identify the data and methods of the classes.

·         Build a main routine to program the application.

Of course, it’s not that simple. There will be iteration. Programming classes that are not part of the domain will emerge. Conflicts will be found and so on. Still, object-oriented analysis seems to be an improvement for programming development.   

7.2    Decomposition – Algorithmic vs. Object-Oriented

Before object-oriented there was algorithmic decomposition. The idea was to divide a problem into modules where each module denotes a major step in the overall process. Each module could then be further refined in a similar manner. This decomposition technique, best known as top-down structured programming, was taught by Edsger Dijksta, one of our most famous computer scientists (he died in 2002). Dijskstra was also responsible for identifying and solving the critical section problem we studied in section 4.5.2.

Algorithmic decomposition would yield trees like the following:

 

Update File

 
 

 

 

 

 

 

 

 

 

 

 

 


Figure 7 – An algorithmic decomposition

Thus, to do an Update File, first you do Get Master, then Get Updates, then Match, then Do Update. Any of these steps could be further decomposed. The process of doing this decomposition was natural to us but hard. Interestingly, hardly any two designers would come up with anything resembling the same design using top-down techniques.

Then came object-oriented design, in which the boxes were abstractions, not of the functions of the process, but the objects in the domain. So, for the above problem an object-oriented diagram might look like:

 

Figure 8 – An object-oriented decomposition

The picture doesn’t show the process anymore. It shows the domain objects and the messages that can be sent between them. When decoding requirements, object-oriented practitioners talk about identifying objects by finding the nouns in the sentences, and finding the methods by finding the verbs.

There is a duality between algorithmic and object-oriented design. By that I mean that both are different views of the same thing. And we can implement either view to get the same functionality. The key question is why would you choose one view over the other? Here are some thoughts about that question:

Object-oriented analysis aligns better with domain analysis. Domain analysis, part of any requirements phase, involves understanding the domain of the application, especially identifying the objects of the domain and what constraints exist between them. That analysis translates well to an object-oriented design.

The programmed flow of algorithmic code aligns with how the system is supposed to behave. It is undeniable that many people have trouble understanding how an object-oriented system works by just looking at the code. A “trace” of the execution of the object-oriented code will match the flow “programmed into” the algorithmic code. This duality is not easily understood.

If you coded an application both ways, all the code used in programming algorithmically will be “found” somewhere in the object-oriented system. It’s just organized differently. The object-oriented view produces modules that are more reusable and that may be all the advantage object-oriented needs to supplant algorithmic design.

Changing an application designed algorithmically can be devastatingly hard. Somewhat surprisingly, object-oriented systems seem more accommodating to change. Mostly because of process called refactoring.

7.3    Refactoring

Suppose you were designing a system to play the 4-person card game of Hearts. As part of an object-oriented design you might have a class specification like:

public class HeartGame{

     private Player[] player = new Player[4];

     private int gameID;

     private DeckOfCards deck;

     private HeartsScore score;

 

     public HeartGame () {} //constructor

     public boolean Register(Player p) {}

     private void startGame(){}

     private void scoreGame() {}

}

Don’t worry about the details. Just assume this design for now. Now suppose you want to add the 4-person game of Bridge to the system. A BridgeGame class might look like:

public class BridgeGame{

     private Player[] player = new Player[4];

     private int gameID;

     private DeckOfCards deck;

     private BridgeScore score;

 

     public BridgeGame() {} //constructor

     public boolean Register(Player p);

     private void startGame();

     private void scoreGame();

     private in getGameID();

}

A (not very) close look shows that the two classes have some of the same data (player, gameID, deck, score) have two common methods (Register(),  getGameID()), and share some of the same method abstractions (startGame(), scoreGame()). As good object-oriented programmers you would recognize these similarities and refactor the classes as follows:

·         Create a base class called Game;

·         Move the common data to the base class;

·         Move the common methods to the base class.

·         Create abstract methods for the methods you want inherited classes to implement.

When you’ve done all this, you will have three classes as follows:

public class Game{

     private Player[] player = new Player[4];

     private int gameID;

     private DeckOfCards deck;

 

     public Game () ; //constructor

     public boolean Register(Player p);

 

//derived class must implement all abstract methods

     abstract private void startGame();

     abstract private void scoreGame();

}

 

public class HeartGame extends Game{

     private HeartsScore score;

 

     public HeartGame () ; //constructor

     private void startGame();

     private void scoreGame();

}

 

public class BridgeGame extends Game{

     private BridgeScore score;

 

     public BridgeGame () ; //constructor

     private void startGame();

     private void scoreGame();

}

An expert designer might argue that you should have anticipated that this was going to happen and had a Game base class right from the beginning. If your requirements had mentioned that both Hearts and Bridge were going to be implemented, the expert might have a point.

However, Extreme Programming says not to do this for two main reasons:

·         The requirements may change and the second game may never be implemented. The time and effort designing and managing the base class will have been wasted. And a design without the base class is easier to understand.

·         Refactoring is NOT THAT HARD.

Sometimes this argument is framed in terms of flexibility. The argument goes that you should design your code so that the inevitable changes will be easy to incorporate. Expert designers believe strongly in this principal and will design frameworks of abstract base classes that can be extended in any way possible. These flexible designs can be very complicated and that’s the point. Often, that flexibility is unused and all that effort and complexity are wasted. We will argue in class about this controversial issue.

7.4    Is Object-Oriented Design the same as Object-Oriented Programming?

The short answer is NO. Programmers have always understood the need for data structures. And, of course, data structures came with operations and semantics. So, many algorithmic programmers just used classes to define their data structures and did everything else the same way.

They ignored class hierarchies, inheritance, and polymorphism. They divided their main routine into the processing chunks we saw in  Figure 7. Then they further divided those into smaller chunks.

Even “experienced” object-oriented programmers sometimes code too much “process” into a method. Utilizing the full object-oriented methodology takes practice and plenty of object-oriented thinking. The agent methodology we explored in section 5.5 forces object-oriented thinking without using all parts of object-oriented design.

8      3-Tier Architecture and Web Applications

As we’ve said before, an architecture defines an object’s structure, or organization. We use the term freely to describe anything whose structure we want to name. The first programming system used by this class has an agent architecture.

We also use architecture to describe the hardware or deployment structure of applications. Deployment architectures and software architectures are related, but they are not the same. Sometimes this can be confusing. Our discussion of 3-tier architecture will clarify all the issues.

8.1    What is a 3-Tier architecture?

A 3-tier architecture has three logical components:

·         The first tier is the user interface.

·         The second tier runs the business logic.

·         The third tier stores the data.

 

It has the following big block structure:


 

 

 

 

 

 

 


Figure 9 - 3-Tier Architecture

Each of these three components can be hosted on separate computers, and typically are. Often, however, the database runs on the same computer as the business logic. No problem. All we really care about in drawing such a diagram is to emphasize that we have three separate processes, each of which has its own address space.

The most interesting thing about 3-tier is defining the arrows, i.e., how do the different components communicate. In virtually every application, the backend two tiers (the business logic and data base) communicate using a database protocol called SQL (Standard Query Language). That is, the database is set up as a server and any business logic component can access it in a standard way as shown in Figure 10.

 

 

 

 

 

 

 

 


Figure 10 – Multiple business applications accessing the same data base server

The user process and business logic can communicate using any protocol the designers wish. Typically choices are technologies like Corba or TCP/IP. The World Wide Web changed all that.

8.2    What is a Web Application?

A web application is a specific kind of 3-tier application:

·         The first tier is a web browser. Web browsers display and execute documents written in the HyperText Markup Language (HTML).

·         The second tier is a web server. It runs the code that responds to requests from web browsers.

·         The first two tiers communicate using the HyperText Transfer Protocol (HTTP).

Though web applications can vary widely, most have these features:

·         User interfaces run on web browsers using HTML. Fortunately, HTML can be augmented by images, script languages, code for plug-ins, and so forth. Almost any interface need can be implemented in HTML, though sometimes the look and feel seems clumsy when compared to a native GUI.

·         Data is gathered through the use of HTML forms. Forms are linked to code that receive and process the information. The code runs on a web server.

·         The receiving code must be HTTP compliant. This means parsing the HTTP request stream (for parameters), doing the work required by the form, and generating an HTML reply. The receiving code, often called CGI code, can be written in many languages (Perl, PHP, C, C++, Java, C#). When the language is Java, the code is called a servlet.

8.3    What is a Web Framework?

A framework is an environment that simplifies building certain kinds of applications by forcing some decisions on you. By accepting those decisions some capabilities are specialized, emphasized, and simplified. The Web currently has two dominant frameworks: .Net for Microsoft applications and J2EE for Java applications.

Both frameworks offer elaborate interactive development environments (IDE). Several vendors supply the J2EE framework. Both frameworks have:

·         Extensive class libraries and tools to support HTTP programming;

·         Tools for generating HTML documents;

·         Tools to support deploying a Web Application;

·         An embedded Web Server to help test the application.

8.4    An Example of a Web Application - Multi-User Game Playing

Your last programming system will be a web application for playing multi-user games. The Games Roadmap will document the system. In working with this system you will be exposed to most of the issues raised in section 5 and section 8:

·         Object-Oriented Design – Initially, the application will be implemented using object-oriented analysis and design.

·         3-Tier architecture – You will rearchitect your application so that it can be deployed as a 3-tier application in the J2EE framework.

·         Web Browser Front-End – You will implement the user interface as HTML pages with embedded forms.

·         Refactoring – You will refactor your object-oriented classes when you add a second game to the application.

·         Requirement Analysis – You will see how implementation-shortcuts are justified by analyzing the requirement, especially with messaging.

·         Unit Testing – You will use the JUNIT package to make your unit tests persistent.

…and many more things as well.

9      Design Patterns

As we have said before in section 2.2, reuse is one of themes of an effective decomposition. Nothing is more fulfilling to a manager than to see his team save time, effort, and money by being able to reuse something they have previously done. And reuse is not limited to physical objects or code, we can reuse ideas as well. In the programming world Design Patterns has come into fashion.

Chapter 6 in our textbook discusses design patterns. The book says “Design patterns are class combinations and accompanying algorithms that fulfill common design purposes. A design pattern expresses an idea rather than a fixed class combination. Accompanying algorithms express the pattern’s basic operation.”

Design patterns fall into one of three categories:

·         Creational Patterns – Creating a collection of objects in flexible ways. Examples are Factory, Prototype, Singleton.

·         Structural Patterns – Representing a collection of related objects. Examples are Composite, Decorator, Façade.

·         Behavioral Patterns – Capturing behavior among a collection of objects. Examples are Chain of Responsibility, Command, Mediator.

We will discuss many of these patterns in class.

10  XML

10.1Introduction

EXtensible Markup Language (XML) is a language for describing the content of a document. XML does not specify the tag set or grammar of the language; that is up to the application. The Tag Grammar is described via Document Type Definition (DTD) or XML Schema. XML with a DTD or Schema is designed to be self-descriptive. But remember, XML by itself does not do anything!!!

When people refer to XML, they typically are referring to XML and related technologies.

 

 


XML is a simple syntax for formatting documents. It uses “tags” to separate fields so that it is easily parsable. As we see from the above diagram XML is used in many different technologies: The Semantic Web as RDF, HTML, ANT  and web server configuration files, etc.

The examples and much of the text that follows come from “Core Web Programming” by Marty Hall and Larry Brown.

10.1.1 Simple XML Document

 

<?xml version="1.0"encoding="ISO-8859-1"?>
<authors>    <!----- root element -->
  <name>     <!----- child element -->
    <firstname>Larry</firstname>
    <lastname>Brown</lastname>
  </name>
  <name>
    <firstname>Marty</firstname>
    <lastname>Hall</lastname>
  </name>
</authors>
<!-- closing tag of root element -->

The above document is about authors, two of which are shown by their name, which is in turn a  firstname followed by a  lastname. The allowable tags and their placement are defined either by a Document Type Definition (DTD) or an XML Schema (XSD).

10.1.2 Defining the Tags with a DTD

Here is the DTD for the authors document above:

<DOCTYPE authors [
<!ELEMENT authors (name)*>
<!ELEMENT name (firstname, lastname)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
]>

It says, in BNF-like form, that authors comprise 0 or more instances of name. That a name is a firstname followed by a lastname, that a firstname and lastname are PCDATA (parsed strings). These definitions can be inline or in a separate file. The details will be discussed later

10.1.3 XML Resources

There are many places to go to find information about XML. Here are a few:

10.2XML Components

An XML document is made up a prolog (the xml version, entity definitions, and DOCTYPE) followed by the XML Building Blocks (elements, tags and attributes, PCDATA and CDATA, entities, and comments)

10.2.1 XML Prolog

Here is an example:

 <?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>

 

We will see that the prolog can contain inline entity and DTD definitions

10.2.2 XML DOCTYPE

The doctype usually specifies the location of the defining DTD; the doctype can also contain the actually DTD definitions inline.  Here are the common forms: 

<!DOCTYPE root [InternalDTD]>

<!DOCTYPE root SYSTEM URL>

<!DOCTYPE root PUBLIC FPI-identifier URL>

<!DOCTYPE root [dtd-definitions]>

                

Here are some examples

<!DOCTYPE book SYSTEM "DTDs/CWP.dtd">
     
This specifies a dtd relative to the XML document it came from.

 

<!DOCTYPE book SYSTEM  "http://www.corewebprogramming.com/DTDs/CWP.dtd">
     
This specifies a dtd via a full URL..

A public DTD is slightly more complicated:

  <!DOCTYPE root PUBLIC FPI-identifier URL>

The Formal Public Identifier (FPI) has four parts:

Here are some FPI examples

<!DOCTYPE Book

   PUBLIC "-//W3C//DTD XHMTL 1.0 Transistional//EN"

   "http://www.w3.org/TR?xhtml1/DTD/xhtml1-transitional.dtd">

 

<!DOCYTPE CWP

   PUBLIC "-//Prentice Hall//DTD Core Series 1.0//EN"

   "http://www.prenticehall.com/DTD/Core.dtd">

10.2.3 XML Comments   

XML comments are the same as HTML comments:

  <!-- This is an XML and HTML comment -->

10.2.4 XML Root Element

All XML-aware applications recognize beginning and end of document via the root element. In the example below, the root element is <book>.

 

      <?xml version="1.0" ?>

      <book>

        <title>Core Web Programming</title>

        <contents>

          <chapter number="1">

            Designing Web Pages with HTML

          </chapter>

          <chapter number="2">

            Block-level Elements in HTML 4.0

          </chapter>

          <chapter number="3">

            Text-level Elements in HTML 4.0

          </chapter>

          ...

        </contents>

      </book>

10.2.5 XML Tags and Elements

The typical XML element has a start tag and an end tag with the value of the element between them:

        <elementOne> … </elementOne>

When an element has no value, the end tag is shown only by a “/”. This is called an empty element tag:

        <elementTwo />

Tag names are case sensitive, and start with a letter or underscore. After first character numbers, - and . are allowed, but no white space. Don't use colon—they are for indicating namespaces. All tags are completely nested (tag order cannot be mixed).

10.2.6 XML Element Attributes

Elements can also have attributes, which are located in the start tag:

  <message to="Gates@microsoft.com" from="Gosling@sun.com">

    <text>We put the . in .com.

          What did you do?

    </text>

  </message>

Every attribute must be enclosed in "…" with no commas in between.

The general rule is to use elements for presentable data and attributes for meta data. Here are two ways to model a textbook chapter:

<chapter number="23" focus="Server-side programming"

title=”XML Processing with Java”>

This chapter is about ...

</chapter>

 

In the above method, the author has decided that the value of the chapter is the text in between the two tags. The number, focus, and title are simply information about the chapter (i.e., meta) and therefore are represented as attributes.

In the method below, the author has made all the information presentable, i.e., as elements of the value for the chapter.

<chapter>

  <number>23</number>

  <focus>Server-side programming</focus>

  <title>XML Processing with Java</title>
  <text> This chapter is about...</text>

</chapter>

Which do you prefer? It’s not always obvious which method is “correct.”

10.3Well-Formed versus Valid                                 

An XML document is well-formed if it follows basic syntax rules, i.e. tags match, tags embed correctly, etc. An XML document is valid if its structure matches a Document Type Definition (DTD).

10.4Document Type Definition (DTD)

A DTD defines the structure of the document. In particular:

As we saw the prolog can contain an inline DTD or a reference to an external DTD. Here is an inline example:

<?xml version="1.0"?>

<!DOCTYPE perennials [

<!ELEMENT perennials (daylily)*>

<!ELEMENT daylily (variety, award*, bloom, cost)+>

<!ATTLIST daylily

   status (in-stock | limited | sold-out) #REQUIRED>

<!ELEMENT variety (#PCDATA)>

<!ELEMENT award (name, year)>

<!ELEMENT name (#PCDATA)>

<!ATTLIST name note CDATA #IMPLIED>

<!ELEMENT year (#PCDATA)>

<!ELEMENT bloom (#PCDATA)>

<!ATTLIST bloom code (E | EM | M | ML | L | E-L) #REQUIRED>

<!ELEMENT cost (#PCDATA)>

<!ATTLIST cost discount CDATA #IMPLIED>

<!ATTLIST cost currency (US | UK | CAN) "US">

]

<perennials>

    

</perennials>

10.4.1 Defining DTD Elements

Defining DTD elements is like using BNF to define a grammar.

<!ELEMENT name elementDefinition>

 

The elementDefinition types can be any of:

In defining elements, you can add cardinality constraints and list operators:

[none]     Default (one and only one instance)

?          0, 1

*          0, 1, …, N

+         1, 2, …, N

,          Sequence (in order)

|         Choice (one of several)

So that in the element:

<!ELEMENT daylily (cultivar, award*, bloom, cost)+>

·         exactly 1 cultivar

·         0 or more awards (because of the *)

·         exactly 1 bloom

·         exactly 1 cost

In the following element:

<!ELEMENT id (#PCDATA | catalog_id)>

In the following element:

<!ELEMENT Elem0 (Elem1?, Elem2*)+>

In the following element:

<!ELEMENT Elem0 ((Elem1, Elem2) | Elem3)*>

This formulation comes from the specification of Regular Sets.

Here is a full example with inline definitions:

  <?xml version="1.0" standalone="yes"?>

  <!DOCTYPE Person [

      <!ELEMENT Person ( (Mr|Ms|Miss)?, FirstName,

            MiddleName*, LastName, (Jr|Sr)? )>

  <!ELEMENT FirstName (#PCDATA)>

  <!ELEMENT MiddleName (#PCDATA)>

  <!ELEMENT LastName (#PCDATA)>

  <!ELEMENT Mr EMPTY>

  <!ELEMENT Ms EMPTY>

  <!ELEMENT Miss EMPTY>

  <!ELEMENT Jr EMPTY>

  <!ELEMENT Sr EMPTY>

  ]>

  <Person>

    <Mr/>

    <FirstName>Lawrence</FirstName>

    <LastName>Brown</LastName>

  </Person>

 

10.4.2 Defining Attributes

The attribute definition looks like:

<!ATTLIST element attrName type modifier>

Examples:

<!ELEMENT Customer  (#PCDATA )>

<!ATTLIST Customer id CDATA  #IMPLIED>

 

<!ELEMENT Product (#PCDATA )>

<!ATTLIST Product

          cost CDATA  #FIXED "200"

          id   CDATA  #REQUIRED>

10.4.3 Attribute Types

Here are the types allowable for attributes:

      <!ATTLIST Customer id CDATA  #IMPLIED>

 

<!ATTLIST restaurant charges
          (cheap|moderate|expensive)  "cheap">

 

ID, IDREF, NMTOKEN, NMTOKENS, ENTITY, ENTITIES, NOTATION

10.4.4 Attribute Modifiers

There are a variety ways to modify an attribute:

     <!ATTLIST cost discount CDATA #IMPLIED>

<!ATTLIST account balance CDATA #REQUIRED>

 

<!ATTLIST interpreter language CDATA #FIXED "EN">

     <!ATTLIST car color (red | white | blue) "white" )

10.4.5 Defining Entities

Entities are shortcuts to common text.

<!ENTITY name "replacement">

For example:    <!ENTITY copyright "Copyright 2001">

You could use the copyright as follows:

    <book>

      <title>Core Web Programming, &copyright;</title>

    </book>

Entities can be defined in an external file:

<!ENTITY copyright SYSTEM "http://x.y.com/entities.dtd">

 

 

10.4.6 Limitations of DTDs

DTD’s have limitations:

XML Schema is intended to resolve these issues but DTDs are going to be around for a while. Before we discuss XML Schema, we must learn about XML Namespaces.

10.5XML Namespaces

XML Namespaces provides a method to avoid element name conflicts. The following XML element is about an HTML table:

<table>
   <tr> <td>Apples</td> <td>Bananas</td> </tr>
</table>

The following XML element is about a piece of furniture:

     <table>

          <name>African Coffee Table</name>

          <width>80</width>
          <length>120</length>

     </table>

These two elements can't coexist in the same document. We need a namespace convention to distinguish them. The general namespace specification is:

xmlns:namespace-prefix="namespace"

where the namespace is an URI. Here's an example:

<f:table xmlns:f="http://www.w3schools.com/furniture">   

<f:name>African Coffee Table</f:name>

<f:width>80</f:width>
     <f:length>120</f:length>
</f:table>

The URI is not used by the parser to look up information. The only purpose is to give the namespace a unique name. However, it is often a pointer to a real Web page containing information about the namespace.

A default namespace for an element saves us from using prefixes in all the child elements. In the following example notice that no prefix is given after the xmlns attribute:

<table  xmlns="http://www.w3schools.com/furniture">

     <name>African Coffee Table</name>

     <width>80</width>
     <length>120</length>
</table>

Here is an example with two namespaces, the (unspecified default) is HTML, the other is from XSL:

<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/xsl">
<xsl:template match="/">
     <html>
     <body>
          <table border="2" bgcolor="yellow">
          <tr>
              <th>Title</th> <th>Artist</th>
          </tr>
          <xsl:for-each select="CATALOG/CD">
          <tr>
              <td><xsl:value-of select="TITLE"/></td>
              <td><xsl:value-of select="ARTIST"/></td>
          </tr>
          </xsl:for-each>

                </table>
     </body>
     </html>
</xsl:template>
</xsl:stylesheet>

 

10.6XML Schema Definition (XSD)

XML Schema is an alternative to a DTD. It became the official W3C recommendation in May 2001. Some benefits are:

Clearly the future but there is limited support right now.

10.6.1 The Old Way

A Sample XML file

<?xml version="1.0"?>
<note>
     <to>Tove</to>
     <from>Jani</from>
     <heading>Reminder</heading>
     <body>Don't forget me this weekend!</body>
</note>

It's DTD might be:

<!ELEMENT note (to, from, heading, body)>

<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>

10.6.2 Using XML Schema

Here’s how it might be defined using an XML schema located in note.xsd:

<?xml version="1.0"?>
<xs:schema xmlns:xs=http://www.w3.org/2001/XMLSchema
     targetNamespace=http://www.w3schools.com
     xmlns=http://www.w3schools.com
     elementFormDefault="qualified">

  <xs:element name="note">   
     <xs:complexType>     
       <xs:sequence>   
          <xs:element name="to" type="xs:string"/>
          <xs:element name="from" type="xs:string"/>
          <xs:element name="heading" type="xs:string"/>
          <xs:element name="body" type="xs:string"/>
      </xs:sequence>   
     </xs:complexType>
  </xs:element>
</xs:schema>

The details will follow.

10.6.3 References in the XML File

A reference to a DTD file might be:

<!DOCTYPE note SYSTEM
         "http://www.w3schools.com/dtd/note.dtd">
<note>
     <to>Tove</to>
     <from>Jani</from>
     <heading>Reminder</heading>
     <body>Don't forget me this weekend!</body>
</note>

A reference to a Schema file might be:

<note
     xmlns=http://www.w3schools.com
     xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance
     xsi:schemaLocation=
          "http://www.w3schools.com/schema/note.xsd">

     <to>Tove</to>
     <from>Jani</from>
     <heading>Reminder</heading>
     <body>Don't forget me this weekend!</body>
</note>

10.6.4 XSD Simple Elements

Here is how to define a simple element:

<xs:element name="xxx" type="yyy"/>

Where yyy can be one of:

xs:string

xs:decimal

xs:integer

xs:boolean

xs:date

xs:time

For example:

<xs:element name="color" type="xs:string" default="red"/>

<xs:element name="age" type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>


10.6.5 Complex Elements

A complex element is an XML element that contains other elements and possibly attributes. Here is an example.

<xs:element name="employee">

<xs:complexType>
 <xs:sequence>
   <xs:element name="firstname" type="xs:string"/>
   <xs:element name="lastname" type="xs:string"/>
 </xs:sequence>
</xs:complexType>

</xs:element>

10.6.6 XSD Attributes

All attributes are declared as simple types. Only complex elements can have attributes! An example attribute in a "shoesize" def:

<xs:element name="shoesize">
  <xs:complexType>
     <xs:simpleContent>
       <xs:extension base="xs:integer">
        
<xs:attribute name="country" type="xs:string" />
       </xs:extension>
     </xs:simpleContent>
  </xs:complexType>
</xs:element>

With the above definition, the following could appear in XML statement:f

     <shoesize country="france">35</shoesize>

10.6.7 XSD Restrictions

Restrictions are used to control acceptable values for XML elements or attributes. Restrictions on XML elements are called facets. Here is an example which constrains age to be between 0 and 100.

<xs:element name="age">
  <xs:simpleType>
     <xs:restriction base="xs:integer">
          <xs:minInclusive value="0"/>
          <xs:maxInclusive value="100"/>
     </xs:restriction>
  </xs:simpleType>
</xs:element>

 

10.6.8 A Modern Schema Example   (shiporder.xsd)

Here is a complete example of an XSD specification:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<!-- definition of simple elements -->
<xs:element name="orderperson" type="xs:string"/>
<xs:element name="name" type="xs:string"/>
<xs:element name="address" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
<xs:element name="country" type="xs:string"/>
<xs:element name="title" type="xs:string"/>
<xs:element name="note" type="xs:string"/>
<xs:element name="quantity" type="xs:positiveInteger"/>
<xs:element name="price" type="xs:decimal"/>

<!-- definition of attributes -->
<xs:attribute name="orderid" type="xs:string"/>

<!-- definition of complex elements -->
<xs:element name="shipto">
  <xs:complexType>
     <xs:sequence>
          <xs:element ref="name"/>
          <xs:element ref="address"/>
          <xs:element ref="city"/>
          <xs:element ref="country"/>
     </xs:sequence>
  </xs:complexType>
</xs:element>

 

<xs:element name="item">
  <xs:complexType>
     <xs:sequence>
          <xs:element ref="title"/>
          <xs:element ref="note" minOccurs="0"/>
          <xs:element ref="quantity"/>
          <xs:element ref="price"/>
     </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:element name="shiporder">
  <xs:complexType>
     <xs:sequence>
        <xs:element ref="orderperson"/>
        <xs:element ref="shipto"/>
        <xs:element ref="item" maxOccurs="unbounded"/>
     </xs:sequence>
     <xs:attribute ref="orderid" use="required"/>
  </xs:complexType>
</xs:element>

</xs:schema>

Here is an XML file using this schema:

<?xml version="1.0" encoding="ISO-8859-1"?>
<shiporder orderid="889923"
   xmlns:xsi=
     http://www.w3.org/2001/XMLSchema-instance
   xsi:NamespaceSchemaLocation="shiporder.xsd">

     <orderperson>John Smith</orderperson>
     <shipto>
          <name>Ola Nordmann</name>
          <address>Langgt 23</address>
          <city>4000 Stavanger</city>
          <country>Norway</country>
     </shipto>
    
<item>
          <title>Empire Burlesque</title>
          <note>Special Edition</note>
          <quantity>1</quantity>
          <price>10.90</price>
     </item>
     <item>
          <title>Hide your heart</title>
          <quantity>1</quantity>
          <price>9.90</price>
     </item>

</shiporder>

 

10.7Summary

11  Database Systems

Database systems answer a recurrent requirement of system designers: to store and retrieve voluminous amounts of persistent data.  In so doing they solve all of the difficult problem:

·         Atomicity of updates. An update must be considered as a critical section. Failures may leave database in an inconsistent state with partial updates carried out. E.g. transfer of funds from one account to another should either complete or not happen at all.

·         Concurrent access by multiple users.

·         Manage security centrally.

Database systems offer solutions to all the above problems.

The figure below shows a table from a database. It is a collection of 5 instances (or records), each record defined by its attributes (or fields). A database is defined formally by its schema.


11.1Schemas and Instances

The database schema defines the logical structure of the database:

·         What the fields of the table are. :

·         How tables relate to one another. :

The schema is analogous to class/type information of programming languages.An instance (or record) of the database is part of the actual content of the database at a particular point in time. This is like a programming variable and its value.

11.2Entity-Relationship Model

Most database people characterize a database using the Entity-Relationship (ER) Model. The example below shows two entities, a customer and an account related by the depositor relationship. A customer has four attributes (customer-id, etc.), while an account has two attributes (account-number and balance).

 

 

 

 

 

 

 

 

 

 

 


11.3A Sample Relational Database

A particular database is a collection of tables. Each table is a relation. Below we see the three tables that might exist for the ER diagram shown in the last section.


11.4Data Definition Language (DDL)

With an ER model of our database, we are ready to specify the database schema using the Data Definition Language (DDL). Here is how to define the account relation.

create table account (
            account-number    char(10),
            balance                 integer) 

The DDL compiler generates table definitions and stores in a data dictionary ready for users to create the instances.

11.5Data Manipulation Language (DML)

Once table instances are created, a Data Manipulation Language (DML) accesses and manipulates the data. SQL is the most widely used query language. We will look at SQL in detail, but here are two simple examples:

            select   customer.customer-name
            from     customer
            where  customer.customer-id = ‘192-83-7465’

 

          select   account.balance
            from     depositor, account
            where  depositor.customer-id = ‘192-83-7465’ and
               
depositor.account-number = account.account-number

11.6Database Administrator

Companies with large, critical data often have a separate database administrator with duties that include:

 

11.7Designing the Relations

Table relations are designed with many requirements in mind:

·         Ease of access

·         Storage efficiency

·         Simplifying maintenance

·         “Naturalness”

When you study the theoretical issues about database systems, you will learn about normalization. Normalization takes existing table relations and transforms according to certain rules to satisfy various requirements. The whole idea is to create optimum schemas. For our purposes, just think about creating tables that store only information related to the intent of the relation, and no more.

For example, information about a banking enterprise could be broken up into parts, each represented by a relation.

The question is why have three tables? Couldn’t we put all the information in one table?

11.8Why not one table?

Storing all information about customers and accounts as a single relation such as:
   bank (account-number, branch-name, balance, customer-name, customer-street, customer-city)

This might produce a table that looked like:

account-number

branch-name

balance

customer-name

customer-street

customer-city

A-101

Downtown

500

Smith     

4 North St.

Rye

A-102

Downtown

600

Smith     

4 North St.

Rye

A-103

Perryridge

700

Hayes

3 Main St.

Harrison

null

null

null

White

75 First St.

Palo Alto

The above table has two serious problems:

Normalization deals with how to design relational schemas to avoid those problems. Normalization centers on which attributes uniquely identifies an entity.

11.9Keys

Let K be a subset of the attributes. K is a key of R if values for K are sufficient to identify a unique tuple (row). Consider the following customer table:


 

Note that there is no ID field here. Do you see any field that by itself could be a key. The customer-name field doesn’t work—last names are not unique. The {customer-name, customer-street} combination doesn’t work since the same street name could exist in different cities. The {customer-name, customer-street, customer-city} doesn’t work either. What if two members of the same household each have an account?

The easiest way to solve this problem is introduce the notion of a unique identifier for use as the key. So, the customer table below works. Look at the first two rows; two different customers who coincidently have the same name, street and city can be told apart.

 

customer-id

customer-name

customer-street

customer-city

c-001-001

Smith     

4 North St.

Rye

c-001-002

Smith     

4 North St.

Rye

c-001-003

Hayes

3 Main St.

Harrison

c-001-004

White

75 First St.

Palo Alto

 

11.10                SQL

SQL is the defacto standard for accessing data. Only a small subset of SQL will be introduced here—more than enough for you to do your homework. In the examples below the following schema will be used. The highlighted names are the keys within the relations. The arrows show which fields are in common. For example, the branch-name of the account and the branch-name of the loan are intended to be the same as the branch-name in the branch relation—that’s how you link entities together.

 


11.11                Basic Structure

SQL is based on set and relational operations.A typical SQL query has the form:

            select A1, A2, ..., An
            from r
1, r2, ..., rm
            where P

 

The result of an SQL query is a relation (i.e. a table).

11.12                The select Clause

The select clause is used to list the attributes desired in the result of a query.

Example: Find the names of all branches in the loan relation.

            select branch_name
            from loan

An asterisk in the select clause denotes “all attributes”. The following example returns the whole loan table.

select *
            from loan

NOTE:  SQL names are case insensitive, meaning you can use upper case or lower case. 

SQL allows duplicates in relations as well as in query results. To force the elimination of duplicates, insert the keyword distinct after select. For example, to find the names of all branches in the loan relations, and remove duplicates

select distinct branch_name
            from
loan

The keyword all specifies that duplicates not be removed.

select all branch_name

from loan

The select clause can contain arithmetic expressions involving the operation, +, –, *, and /, and operating on constants or attributes of tuples.

For example, the following query: would return a relation which is the same as the loan relations, except that the attribute amount is multiplied by 100.

select loan_number, branch_name, amount * 100
            from loan

 

11.13                The where Clause

The where clause is a predicate involving attributes of the relations in the from clause. Logical connectives are and, or, and not.

For example, find all loan number of loans made at the Perryridge branch whose loan amounts are greater than $1200

            select loan_number
            from
loan
            where
branch_name = ‘Perryridge’ and amount >1200

SQL Includes a between comparison operator to simplify queries.

For example, find the loan number of those loans with loan amounts between $90,000 and $100,000.

            select loan_number
            from loan
            where amount between 90000 and 100000

11.14                The from Clause

The from clause lists the relations to be scanned in the evaluation of the expression.

For example find the Cartesian product of borrower “cross” loan.

select *
            from borrower, loan

This will create a relation with all the attributes of borrower and loan and all the records in both.

For example, find the name, loan number and loan amount of all customers having a loan at the Perryridge branch.

            select customer_name, borrower.loan_number, amount
           
from borrower, loan
           
where   borrower.loan_number = loan.loan_number  and
                  
branch_name = ‘Perryridge’

11.15                Tuple Variables

Find the customer names and their loan numbers for all customers having a loan at some branch.

      select customer_name, T.loan_number, S.amount
     
from borrower as T, loan as S
     
where  T.loan_number = S.loan_number

The above is just syntactic sugar.

Find the names of all branches that have greater assets than some branch located in Brooklyn.

            select distinct T.branch_name
           
from branch as T, branch as S
           
where T.assets > S.assets and
                        
S.branch_city = ‘Brooklyn

Here it is required.

11.16                String Operations

SQL includes a string-matching operators. Patterns are described using two special characters:

Find the names of all customers whose street includes the substring “Main”.

            select customer_name
           
from customer
           
where customer_street like
%Main%

SQL supports a variety of string operations such as:

 

11.17                Ordering the Display of Tuples

You can also sort the results of a retrieval. For example: list in alphabetic order the names of all customers having a loan in Perryridge branch

select distinct customer_name
           
from    borrower, loan
           
where borrower loan_number = loan.loan_number

                                    and branch_name = Perryridge
order by
customer_name

We may specify desc for descending order or asc for ascending order, for each attribute; ascending order is the default.

For example,  order by customer_name desc

11.18                Set Operations

The set operations union, intersect, and except operate on relations. Each automatically eliminates duplicates; to retain all duplicates use the corresponding multiset versions union all, intersect all and except all.

For example, find all customers who have a loan, an account, or both:

(select customer_name from depositor)
union
(select
customer_name from borrower)

Find all customers who have both a loan and an account.

(select customer_name from depositor)
            intersect
            (select
customer_name from borrower)

Find all customers who have an account but no loan.

(select customer_name from depositor)
            except
            (select
customer_name from borrower)

 

11.19                Aggregate Functions

These functions operate on the multiset of values of a column of a relation, and return a value:

Here’s an example. Find the average account balance at the Perryridge branch.

            select avg (balance)
            from account
            where branch_name = ‘Perryridge’

Find the number of tuples in the customer relation.

            select count (*)
            from customer

Find the number of depositors in the bank.

            select count (distinct customer_name)
             
from depositor

 

11.20                Aggregate Functions – Group By

Find the number of depositors for each branch.

          select branch_name, count (distinct customer_name)
           
from depositor, account
           
where depositor.account-number
                                    = account.account-number
           
group by branch_name

Note:  Attributes in select clause outside of aggregate functions must appear in the group by list.

11.21                Aggregate Functions – Having Clause

Find the names of all branches where the average account balance is more than $1,200.

            select branch_name, avg (balance)
           
from account
           
group by branch_name
           
having avg (balance) > 1200

Note:  predicates in the having clause are applied after the formation of groups whereas predicates in the where clause are applied before forming groups.

11.22                Null Values

It is possible for attributes to have no value. null signifies an unknown or non-existent value. The predicate  is null can be used to check for null values.

E.g. Find all loan number in the loan relation with null values for amount.

            select loan_number
           
from loan
           
where amount is null

The result of any arithmetic expression involving null is null, e.g.  5 + null  returns null. Aggregate functions simply ignore nulls.

11.23                Null and Three Valued Logic

Any comparison with null returns unknown

E.g.  5 < null   or   null <> null    or    null = null all return unknown

Three-valued logic using the truth value unknown:

OR:  (unknown or true) = true,
            (unknown or false) = unknown
            (unknown or unknown) = unknown
AND:  (true and unknown) = unknown,
     (false
and unknown) = false,
     (unknown
and unknown) = unknown
NOT:  (not unknown) = unknown

P is unknown” is true if P is unknown.

Result of where clause predicate is treated as false if it evaluates to unknown.

11.24                Null Values and Aggregates

Null values don’t affect aggregates. For example suppose you want to total all loan amounts. The following statement ignores null amounts:

select sum (amount)
           
from loan

The result is null if there is no non-null amount.

All aggregate operations except count(*) ignore tuples with null values on the aggregated attributes.

11.25                And there is more retrieval stuff…

There are more things that can go into retrievals

But, we won’t cover them here.

11.26                Modification – Deletion

To delete all account records at the Perryridge branch

delete from account                   
where branch_name = Perryridge

Of course these can get complicated.

11.27                Modification – Insertion

To add a new tuple to account (according to schema):

insert into account
values (‘A-9732’, ‘Perryridge’,1200)

Or equivalently if you change the order:

insert into account (branch_name, balance, account-number)
           
values (‘Perryridge’, 1200, ‘A-9732’)

Add a new tuple to account with balance set to null

insert into account
values (‘A-777’,‘Perryridge’, null)

 

11.28                Modification – Updates

To increase all accounts with balances over $10,000 by 6%, the rest receive 5% increase, write two update statements:

     

update account
           
set balance = balance * 1.06
            where balance > 10000

            update account
           
set balance = balance * 1.05
            where balance <= 10000

The order is important. Why?

11.29                Data Definition Language (DDL)

Allows the specification of not only a set of relations but also information about each relation, including:

11.30                Domain Types in SQL

11.31                Date/Time Types in SQL

11.32                Create Table Construct

An SQL relation is defined using the create table command:

                          

create table r (A1 D1, A2 D2, ..., An Dn,
           
(integrity-constraint1),
            ...,
            (integrity-constraintk))

Where:

 

For example:

                          

create table branch (
            branch_name  char(15),
            branch-city     char(30),
            assets integer)

 

11.33                Integrity Constraints in create table

There are three ways to express constraints in this command:

Example:  Declare branch_name as the primary key for branch, ensure it is not null, and ensure that the values of assets are non-negative.

                           create table branch
                           (branch_name char(15)
not null,
                           branch-city     char(30)
                           assets             integer,
                          
primary key (branch_name),
                          
check (assets >= 0))

 

11.34                SQL Drivers

An SQL Driver contains application program interface (API) to:

Below are example from Perl, C, and Java.

11.35                PERL/DBI Code

$dbh=DBI->connect("dbi:mysql:cs351:nunki.usc.edu:9776", "cs351","cs351") || die
$DBI::err;

$sth=$dbh->prepare('SELECT * from data');
$sth->execute;
print "Student\tFavorite Color\n";
print "-------\t--------------\n";
while (@row=$sth->fetchrow_array) {
print $row[0]."\t".$row[1]."\n";
}
$sth->finish;
$dbh->disconnect;

 

11.36                ODBC Code

int ODBCexample()
{
  RETCODE error;
  HENV    env;     /* environment */
  HDBC    conn;  /* database connection */
  SQLAllocEnv(&env);
  SQLAllocConnect(env, &conn);
  SQLConnect(conn, "aura.bell-labs.com", SQL_NTS, "avi", SQL_NTS,
"avipasswd", SQL_NTS);
  { …. Do actual work … }

  SQLDisconnect(conn);
  SQLFreeConnect(conn);
  SQLFreeEnv(env);
}

 

11.37                JDBC Code

public static void JDBCexample(String dbid, String userid, String passwd) {     
     try {
  Class.forName ("oracle.jdbc.driver.OracleDriver"); 
 
Connection conn = DriverManager.getConnection(   "jdbc:oracle@bell-labs.com:2000:bankdb", userid, passwd); 
          Statement stmt = conn.createStatement();
            … Do Actual Work ….
          stmt.close();    
          conn.close();  
   }                   
   catch (SQLException sqle) {                  
        System.out.println("SQLException : " + sqle);                   
   }                   
  }

 

12  JDBC – Java Database Connectivity

12.1JDBC Overview

Java has a full package called JDBC to work with databases. All examples from Core Web Programming by Marty Hall and Larry Brown. As with most Java packages there are many on-line resources:

In the picture below you can see that JDBC consists of two parts: 

 


12.2Using JDBC

There are seven basic steps in using JDBC:

  1. Load the driver
  2. Define the Connection URL
  3. Establish the Connection
  4. Create a Statement object
  5. Execute a query
  6. Process the results
  7. Close the connection

 

1.      Load the driver

  try {

    Class.forName("oracle.jdbc.driver.OracleDriver");

    Class.forName("com.mysql.jdbc.Driver");

  } catch { ClassNotFoundException cnfe) {

    System.out.println("Error loading driver: " cnfe);

  }

2.      Define the Connection URL to the database

  String host = "dbhost.yourcompany.com";

  String dbName = "someName";

  int port = 1234;

  String oracleURL = "jdbc:oracle:thin:@" + host +

                     ":" + port + ":" + dbName;

  String mysqlURL = "jdbc:mysql://" + host  +

                    ":" + port + "/" + dbName;

3.      Establish the Connection

     String username = "jay_debesee";

     String password = "secret";

      Connection connection =

        DriverManager.getConnection(mysqlURL,
                                username,
                                password);

 

     Optionally, get information about the db system

     DatabaseMetaData dbMetaData = connection.getMetaData();

     String productName =

       dbMetaData.getDatabaseProductName();

     System.out.println("Database: " + productName);

     String productVersion =

       dbMetaData.getDatabaseProductVersion();

     System.out.println("Version: " + productVersion);

 

4.      Create a Statement

     Statement statement = connection.createStatement();

      // discuss PreparedStatements later

      

 

5.      Execute a Query

     String query = "SELECT col1, col2, col3 FROM sometable";

     ResultSet resultSet = statement.executeQuery(query);

To modify the database, use executeUpdate, supplying a string that uses UPDATE, INSERT, or DELETE

Use statement.setQueryTimeout to specify a maximum delay to wait for results

 

6.      Process the Result

while(resultSet.next()) {

System.out.println(resultSet.getString(1) + " " +

                           resultSet.getString(2) + " " +

                           resultSet.getString(3));

     }

Note that the first column has index 1, not 0.ResultSet provides various getXxx methods that take a column index or name and returns the data.

7.      Close the Connection

      connection.close();

 

12.3A Complete Basic JDBC Example

import java.sql.*;

 

public class TestDriver {

   public static void main(String[] Args) {

try { 
          Class.forName
               ("com.mysql.jdbc.Driver").newInstance();

}             

     catch (Exception E) {

          System.err.println("Unable to load driver.");

          E.printStackTrace();

     }

     try {

          Connection C = DriverManager.getConnection(

              "jdbc:mysql://almaak.usc.edu:3307/menagerie",

              "root", "xyz"); //?user=root&password=xyz");

             

          Statement s = C.createStatement();

          String sql="select * from pet";    

          s.execute(sql);

          ResultSet res=s.getResultSet();

          if (res!=null)   {

              while(res.next()){//MySql start with 1

              System.out.println("\n"+res.getString(1)

                         + "\t"+res.getString(2));

              }

          }

          c.close();

     }

     catch (SQLException E) {

          System.out.println("SQLException: " + E.getMessage());

          System.out.println("SQLState:     " + E.getSQLState());

          System.out.println("VendorError:  " + E.getErrorCode());

     }

   }

}

 

12.4ResultSet Methods

All ResultSet methods can throw a SQLException

float        short               long               Time   Object

 

12.5Using MetaData

You can derive a ResultSetMetaData object from a ResultSet in order to look up information about the result set. In particular

ResultSetMetaData answers the following questions:

Here are some useful metadata methods:

Here is a full example of how to use MetaData:

  Connection connection =

  DriverManager.getConnection(url, username, password);

 

  // Look up info about the database as a whole.

  DatabaseMetaData dbMetaData = connection.getMetaData();

  String productName =

    dbMetaData.getDatabaseProductName();

  System.out.println("Database: " + productName);

  String productVersion =

      dbMetaData.getDatabaseProductVersion();

 

  //Make a query

  Statement statement = connection.createStatement();

  String query = "SELECT * FROM pet";

  ResultSet resultSet = statement.executeQuery(query);

  // Look up information about a particular table.

  ResultSetMetaData resultsMetaData =
  resultSet.getMetaData();

  int columnCount = resultsMetaData.getColumnCount();

  // Column index starts at 1 (a la SQL) not 0 (a la Java).

  for(int i=1; i<columnCount+1; i++) {

    System.out.print(resultsMetaData.getColumnName(i) + 
                   "  ");

  }

  System.out.println();

 

  // Print results.

  while(resultSet.next()) {

    // Name of animal

    System.out.print("    " + resultSet.getString(1));

    // Name of owner
 ...

  }

 

12.6Using MetaData, Result

Here is the result of the above code on the pet database.

Database: MySQL

Product Version: 3.23.52-log

Pet Table from menagerie Database

name

owner

species

sex

birth

death

Puffball2

Diane

hamster

f

2000-01-30

0000-00-00

Puffball3

Diane

hamster

m

2004-03-23

2005-03-30

Freddy

Dave

terrier

m

1999-03-30

0000-00-00

Buster

Jeannette

cat         

m

1992-06-30

2005-02-20

Annabelle

Jeannette

cat         

f

1992-06-30

0000-00-00


 

12.7Using the Statement Object

Through the Statement object, SQL statements are sent to the database. There are three types of statement objects are available:

Here are some useful statement methods:

       ResultSet results =

         statement.executeQuery("SELECT a, b FROM table");

       int rows =

         statement.executeUpdate("DELETE FROM EMPLOYEES" +

                                 "WHERE STATUS=0");

12.8Prepared Statements (Precompiled Queries)

If you are going to execute similar SQL statements multiple times, using “prepared” (parameterized) statements will be more efficient. First create it for compilation before actually being used. Then, each time you use it, you simply replace some of the marked parameters using the setXxx methods. PreparedStatement's execute methods have no parameters:

Here is an example:

Connection connection =

  DriverManager.getConnection(url, user, password);

PreparedStatement statement =

  connection.prepareStatement("UPDATE employees " +

                              "SET salary = ? " +

                              "WHERE id = ?");

float[] newSalaries = getSalaries();

int[] employeeIDs = getIDs();

for(int i=0; i<employeeIDs.length; i++) {

  statement.setFloat(1, newSalaries[i]);

  statement.setInt(2, employeeIDs[i]);

  statement.executeUpdate();

}

Here are a few useful prepared statement methods:

12.9SQL Exception Handling

Nearly every JDBC method can throw a SQLException in response to a data access error. If more than one error occurs, they are chained together. SQL exceptions contain:

Here is an SQL exception example:

try {

     ... // JDBC statement.

} catch (SQLException sqle) {

  while (sqle != null) {

    System.out.println("Message:  " + sqle.getMessage());

    System.out.println("SQLState: " + sqle.getSQLState());

    System.out.println("Vendor Error: " +

                       sqle.getErrorCode());

    sqle.printStackTrace(System.out);

    sqle = sqle.getNextException();

  }

}

12.10                SQL Warnings

SQLWarnings are rare, but provide information about database access warnings. The warning are chained to object whose method produced the warning. The following objects can receive a warning:

Call getWarning to obtain the warning object, and getNextWarning (on the warning object) for any additional warnings. Warnings are cleared on the object each time the statement is executed.

Here is an example:

ResultSet results = statement.executeQuery(someQuery);

SQLWarning warning = statement.getWarnings();

while (warning != null) {

  System.out.println("Message: " + warning.getMessage());

  System.out.println("SQLState: " + warning.getSQLState());

  System.out.println("Vendor Error: " +

                     warning.getErrorCode());

  warning = warning.getNextWarning();

}

while (results.next()) {

  int value = results.getInt(1);

     ... // Call additional methods on result set.

  SQLWarning warning = results.getWarnings();

  while (warning != null) {

    System.out.println("Message: " + warning.getMessage());

    System.out.println("SQLState: " + warning.getSQLState());

    System.out.println("Vendor Error: " +

                       warning.getErrorCode());

    warning = warning.getNextWarning();

  }

 }

 

12.11                Transactions

By default, after each SQL statement is executed the changes are automatically committed to the database. It is often the case that you want two or more SQL statements to be considered atomic, that is, they all must finish successfully or you don’t want any. A transaction encapsulates SQL statements in the same way as a critical section does for code.

In order to group two or more statements together into a transaction, turn auto-commit off as follows:

connection.setAutoCommit(false)

 

Then make your SQL statements and, if successful, call connection.commit() to permanently record the changes. If an error occurs call connection.rollback().  Here is an example:

 

 Connection connection =

   DriverManager.getConnection(url, username, passwd);

 connection.setAutoCommit(false);

 try {

   statement.executeUpdate(...);

   statement.executeUpdate(...);

   ...

 } catch (Exception e) {

   try {

     connection.rollback();

   } catch (SQLException sqle) {
   // report problem

   }

 } finally {

   try {

     connection.commit();

     connection.close();

   } catch (SQLException sqle) { }

 }

Here is a review of the transaction methods:

12.12                Some JDBC Utilities

Performing JDBC queries and formatting output are common tasks, so the authors of the above code have created helper classes to perform this function: DatabaseUtilities and DBResults.

DatabaseUtilities has 3 static utilities:

The DBResults class stores completed results of a JDBC Query. It differs from a ResultSet in several ways:

·         ResultSet doesn't necessarily have all the data; reconnection to database occurs as you ask for later rows.

·         This class stores results as strings, in arrays.

·         This class includes DatabaseMetaData (database product name and version) and ResultSetMetaData (the column names).

·         This class has a toHTMLTable(headingColor) method that turns the results into a long string corresponding to an HTML table.

·         See the class definitions for all the methods.

Here is a simple example:

DBResults results =

  DatabaseUtilities.getQueryResults(driver, url,

                                    username, password,

                                    query, true);

out.println(results.toHTMLTable("RED"));

12.13                Summary

We have covered a lot of material here. Here a few summary points about JDBC by the authors of Core Web Programming.

13  Miscellaneous Topics

13.1Pthreads

Pthreads are for threading in C++. Here is sample threading code followed by an explanation:.

#include <iostream>

#include <pthread.h>

 

void *task1(void *X) //task to be executed by ThreadA

{

   //...

   cout << "Thread A complete" << endl;

}

 

void *task2(void *X) //task to be executed by ThreadB

{

   //...

   cout << "Thread B complete" << endl;

}

 

int main(int argc, char *argv[])

{

   pthread_t ThreadA,ThreadB; // declare threads

   pthread_create(&ThreadA,NULL,task1,NULL);
   pthread_create(&ThreadB,NULL,task2,NULL);

   // additional processing

   pthread_join(ThreadA,NULL); // wait for threads

   pthread_join(ThreadB,NULL);

   return(0);

}

In this example, the main line of the example defines the set of instructions for the primary thread. The primary thread, in this case, is also the boss thread. The boss thread declares two threads, ThreadA and ThreadB. By using the pthread_create() function, these two threads are associated with the tasks they are to execute. The two tasks, task1 and task2, are defined. They simply send a message to the standard out but could be programmed to do anything. The pthread_create() function causes the threads to immediately execute their assigned tasks. The pthread_join() function works the same way as wait() for processes. The primary thread waits until both threads return. Figure 11 shows the layout of the example. It also shows what happens to the flow of controls as the pthread_create() and pthread_join() functions are called.

In Figure 11, the pthread_create() function causes a fork in the flow of control in the main line or primary thread. Two additional flows of control, one for each thread, are executing concurrently. The pthread_create() function returns immediately after the threads are created. It is an asynchronous function. As each thread executes its set of instructions, the primary thread executes its instructions. The pthread_join() causes the primary thread to wait until each thread terminates and rejoins the main control flow.

http://www.awprofessional.com/content/images/chap4_0131013769/elementLinks/04fig11.gif

Figure 11 The layout, output, and flow of control for the example .

The University of Southern California does not screen or control the content on this website and thus does not guarantee the accuracy, integrity, or quality of such content. All content on this website is provided by and is the sole responsibility of the person from which such content originated, and such content does not necessarily reflect the opinions of the University administration or the Board of Trustees