CSCI201 - Software Development
The Curriculum
January 29, 2012
Software development is about building applications that run on computers. It is much more than algorithm design, selecting a programming language, and coding. Those are important, but for larger applications we need a larger perspective, or system view. And we must worry more about the creation process, often called software engineering.
To emphasize the system view, this class will take a system’s approach to teaching you about software development.
Large software applications involve interacting systems of:
· stakeholders who have a need for an application;
· developers who together design and build the application;
· software components that together are the application;
· maintainers who deploy, fix, and change the application;
· environments in which the application is built and in which the application runs.
Stakeholders, developers, components, maintainers, and environments each have constraints and hierarchical relationships. A system’s approach sensitizes you to the sophisticated, subtle, deep thinking required to succeed in a complex world. As we’ll see, even job descriptions can cause problems.
Young computer scientists learn about system building by experience. Very little is theoretical. This class is about software development, but the larger issues of software engineering will always be in focus. Though you later will have formal classes in CSCI477ab, the software engineering needs of this class are covered in section 3.
You will be taught about writing code that is part of a system by immersing you into large bodies of code. The approach is more like mentoring than teaching.
You will be asked to enhance existing projects. You are going to be thrown into them without knowing (except through documentation) how the systems got to be the way they are. Our motivation is simple:
· Young programmers don’t deal much with design.
· They are often thrown into existing projects without much handholding.
· They must learn how to read and analyze code.
· They must learn to work with many different computer systems.
Along with class lectures to fill in gaps, you will leave this class as programmers who will care more about how the system works than just how the code works.
You will have problems for different applications. The goal is to sharpen your software development skills by immersing you in “large” projects that you could never do by yourself in just one semester. Each application emphasizes different computer science themes:
Restaurant Application -- Concurrent application in which Customers go to a restaurant to get fed. The starting point is a version with customers, waiters, a cook and a host. You will add a cashier and a market.
Factory Application -- You will be given a specification of a factory floor cell that has various components. You will build the entire control structure and animation.
Both of these applications depends on agents. See the agent roadmap for details. We have versions of the agents in Java and C++. Agents work through concurrency and messaging; the skills you will develop include encapsulation, testing, and refactoring;
Solving problems is where the money is. It is no surprise that many of the themes of general problem solving find their way into programming. We will consider two themes: (1) Divide and Conquer, and (2) Reuse.
Many problems seem hopelessly complicated. Fortunately, they can often be divided into a collection of subproblems whose solutions together conquer the original problem. Divide and conquer has many benefits:
· Easier - The subproblems may be “easier” to solve, i.e. less time, less space, or less complexity.
· Concurrency - The subproblems can be allocated to different processors/people for concurrent solution and execution.
· Reuse - Maybe the subproblems are common and their solutions can be reused in another application. Reuse will be the subject of the next section.
And I’m not just talking about programming. For example, building a car seems a formidable, almost impossible task. But, if you think about designing a body, an engine, and a drive-train separately and then putting them together, the problem is approachable. For our above three advantages, the car example offers:
Easier – An engine is easier think about and design than the whole car. One key issue will be integration, i.e., assembling the components into the whole. Of course, the design of the parts contributes directly to the ease of integration.
Concurrency - Notice also that you could have three separate teams working on each part at the same time. They need to interact somewhat during the design, and much more during integration.
Reuse – A particular engine might fit into different models of your car line.
For a software example, consider designing an automated teller machine for a bank. Almost any first pass design will divide the problem into three parts: 1) the physical ATM machine, 2) the central database of accounts, and 3) the network connecting them. Or did you have a different decomposition?
How you divide the problem is not always obvious. Much of this skill comes from experience. Besides just trying to design functional, buildable parts, you might consider trying to design (1) interesting, reusable parts that are (2) easy to fit together.
Building any of the components mentioned above is a sophisticated problem in its own right. Each in turn yields to divide and conquer techniques as well. In other words divide and conquer is naturally recursive.
Reuse is the “Holy Grail” of programming and many other enterprises. The whole discussion of standards is about reusability. The effort and expense to develop working solutions forces us to think about reusing solutions. How would we develop anything complex if we always had to build everything from scratch? Think about how knowledge and science are based upon building on the ideas of the past. That’s a part of reusability.
The reusable objects of programming fall roughly into four categories:
· Class – Lowest level reusable entity, e.g., the collection classes. Before object-oriented programming, reusable libraries of functions were the lowest level.
· Component – A class that has semantic value to an application, e.g., an employee class in a business system.
· Subsystem – A component that is free running, i.e., has its own thread of control, but is generally not useful by itself. For example, a data base server.
· Application – A free running object that completely solves some problem.
As you move up in complexity, the objects do more, but are harder to reuse. [Make sure you understand that.] When I don’t care about the distinction, I will use the term component to mean anything that can be part of an application (including another application).
Inheritance, as we know it from object-oriented programming, is a separate feature of reusability. A key part of working with components of any type and in any domain is understanding them:
· knowing what they do;
· knowing any constraints they have;
· knowing how they interact;
· knowing how much they cost.
Relating two components by an inheritance-type statement, simplifies your understanding of them. Consider:
If I tell you that B is a kind of A but that B handles two more cases,
and if you already understand A,
then you only have to understand the two new cases in order to understand B.
One of the constraints of object-oriented programming is to force you into thinking about classes and inheritance. This extra work benefits reusability.
Physical objects are reusable, but so are ideas. Patterns try to capture the ideas behind design decisions so that they can be reused. For example, using class constructors explicitly intertwines your application tightly with the classes you are instantiating. Someone invented the idea of a Factory to which you describe the object you want from some family of objects, and it “constructs” it for you. Turns out that idea is useful and reusable.
We will study patterns later in section 9.
Reuse depends not only on the functionality of the component, but on how simple it is to describe and use. Mostly, that means:
· Can you describe what it does?
· Can you describe how it interacts with other components?
If you can’t do both, how can you use it? For a class in an object-oriented programming language that means:
· Understanding what the public methods do;
· Understanding what the “public” instance variables represent. Even if the instance variables are made accessible only by get() and set() routines, you still have to understand their purpose.
· Understanding the protocol of using the classes. Just because you know what the methods are doesn’t mean you know how to use the class. That’s why sample programs often accompany packages of classes.
You will learn that entangled, or highly connected, code is difficult to understand, debug and maintain. In theory, each component should be as self-contained as possible. It should have a “small” easy to understand public interface. Mostly, this is about scooping the data and methods that an application uses:
· Private – Visible only to the single component.
· Shared – Visible to a collection of cooperating components.
· Global – Visible to all the components of an application.
Data encapsulation is particularly important. Dozens of books have been written over the years to explain that:
· global data can wreck a design,
· it is important to limit shared data, and
· shared data is difficult to handle.
A good part of this class is about exactly these issues.
This problem is so important that whole methodologies are invented specifically to deal with encapsulation. The Agent architecture of our first programming project has encapsulation as one of its principal design features.
Another key part in designing reusable components is getting the size, or granularity, right. A component that is too big is too specialized and hard to understand. A component that is too small is often of little use. So, how big is too big? How small is too small? Experience will be your guide. Being able to design the right classes, the right components, and the right subsystems is a sure sign of an experienced designer.
So far your programming career probably consists only of homework problems that you did quickly by yourself, mostly by directly coding a solution. Real-world projects are different because they are bigger in several ways:
· Large projects have several distinct phases. (Section 3.1)
· The process of doing the project needs managing. (Section 3.2)
· It takes a team to do the project. (Section 3.2.3)
· Large software projects have special technical concerns. (Section 3.4)
The next few sections lay out the phases typically found in large projects.
Every project must specify what the thing being developed is supposed to do. That specification is called requirements. Requirement elicitation is the process of gathering the requirements from the owners and users of the proposed system.
Understanding the key objects in the domain is part of requirement elicitation. During this domain analysis the subtleties of the domain are explored to uncover hidden constraints, needs, and so forth. Typically domain experts are used during requirement elicitation for precisely this purpose. I remember once looking at a ball-bearing feeder to be used in a manufacturing application I was programming. As I watched it operating, I noticed that feeding errors increased as it got close to being empty. I asked the engineer about this and he told me that you should always keep the feeder at least Ľ full. On further probing I found out that the machine did not have a sensor, so I would have to count based on a guess of how many bearings fill up the feeder. Since the count didn’t have to be perfect, that solution was good enough. Do you see how poor domain analysis can wreck a design?
Requirements can be functional or non-functional. For example, you may specify the need for a feeding machine. That’s a functional requirement; it describes something the system must do. Saying that the machine should be kept at least Ľ full is a non-functional requirement; it makes a statement of how the function is supposed to behave.
Requirements are tricky. In particular:
· Requirements can be underspecified. For example, I may tell you to display some data in order. If I don't tell you what kind of order, then the requirement is underspecified. If the data are strings, then you might assume alphabetic—that would be natural and part of the domain knowledge you bring to the party. You have to know in order to code it. Notice if the requirement says to sort the data then you have told me how to order it. That's a well-know problem.
· Requirements can be overspecified. If you require that the display of the order data must be created in linear time, you have overspecified the problem. No known sort algorithms work in linear time.
· Requirements can be contradictory. If you say that 1) the system has to run on a single server, and 2) that the system has to be running 24x7, i.e., 100% of the time, then you had better not have another requirement that says 3) the system will be taken down for an hour each week for maintenance.
· Requirements may become irrelevant. It is tempting to try to think of every feature a system might need during the requirements phase. Unfortunately, features complicate a design. This fact of life is particularly aggravating if the need for the feature doesn’t survive through to the system’s delivery.
· Requirement can change during development and new ones discovered. More about this when we talk about the process (section 3.2).
How do you specify requirements? For us, it will be a combination of:
· English statements
· Big Block diagrams (identifying major components of the system)
· High level interaction diagrams (showing how the components interact)
These will be shown in the Agent document.
Design is taking requirements and figuring out how the system should work. Design is the mapping between “what” (requirements) and “how” (the implementation). In this phase, you analyze the requirements and design a solution that can be implemented (section 3.1.3) and deployed (section 3.1.4). For our simple ordering example, the requirement to order the input must be turned into an algorithm we can implement. That’s easy to understand.
For larger systems, the design phase is more complex, but the idea is the same. The designer must take the requirements and turn it into something that can be coded and/or constructed. The complexity forces the designer to think about the structure of the application and the tools needed to implement it.
Many young programmers don't appreciate design. They are ready to code a problem directly from the requirements; after all, that is how they address their school problems. But, there are serious questions that "design" answers where "coding" does not:
· How does the stakeholder know that you, the developers, understand the problem?
· How does the stakeholder help make decisions about design alternatives?
· How does the stakeholder know if you are making timely progress?
· How does the stakeholder know if you are producing something that is flexible and maintainable?
If all the stakeholder has to look at is code, the answer to all these questions is "He doesn't!"
So, let's assume you acknowledge that design is important. How do you specify a design? There are dozens of design methodologies. UML (Unified Modeling Language) is a famous one that we will have more to say about later in the semester. For now we will ignore them all and focus on a few methods that seem to be part of any design:
· Detailed interaction diagrams
· Class description of system objects
· Pseudo-code
We will see all of these in detail in our Agent application.
Inevitably, design discussions must answer questions about environments, applications, frameworks and architecture. What are these things?
· An environment is something in which you build and run applications.
· An application is a solution to some need. An application can be static like a piece of furniture or a building, or dynamic like a business or some software. You use static applications; you run or operate dynamic ones.
· A framework is an environment in which you can generate specific types of applications. Applications built in a framework generally have an architecture.
· An architecture is the structure, or organization, of an application. A colonial house has an architecture seen in 18th century America. The first programming system used by this class has an agent architecture. All applications have an architecture, however weak.
A framework forces some decisions on you—perhaps what tools exist, what expertise is needed, and so forth. These restrictions are outweighed by the simplification in developing the class of applications for which the framework was designed. For example, a woodshop is a framework for making wood products. It has special tools and requires special skills.
A programming environment is a framework that helps you generate software. The Java J2EE framework helps you generate web-based software. The dot Net framework does the same thing using Microsoft products.
By the way, just using classes in object-oriented programming doesn’t imply an architecture. Saying you used object-oriented programming doesn’t say much. Using an object-oriented design methodology is much different. [Isn’t it? Is this point too subtle for the class?]
Implementation starts with a design. The developers identify existing components they can reuse, code ones they can’t, and integrate everything. Implementation takes place in an environment that supports programmers, mostly to write code.
Programmers must validate their code (determine if it satisfies the requirements) and test it (see if it performs as designed). The testing must be not only in the environments on which they code, but also in the deployment environment on which the system will run. Testing inevitably requires debugging. Debugging can be about:
· finding errors introduced by the coder, e.g. a pointer error;
· finding flaws in the design, e.g., a class is missing a method “needed” by another class; or
· finding flaws in the requirements, e.g., discovering that some requirements can’t be met.
Validation and testing are often haphazard affairs, testing especially. Programmers sit at a terminal and run test cases often by typing directly into user interfaces. For small programs this technique might work. For large systems, it is hopeless.
In this class you will be exposed to two techniques to support validation and testing:
· Code Inspection – Inspection means reading aloud every line of code. A team inspects all code in the presence of the code’s author. Any questions that arise are fielded immediately. It is amazing how many bugs are discovered this way. Not only just coding bugs, but also sophisticated design and requirement flaws.
· Unit Testing – Every component, or unit, of your system must be independently testable for obvious reasons. And the tests must be saved in order to be rerun whenever necessary. Why? Code changes and you can’t possibly afford full at-the-computer testing each time it does (as if that were even possible).
Both techniques will be a part of every homework assignment. Teaching programming is not easy. And, most of what you are going to learn is the discipline to do these unglamorous tasks. You are going to learn by doing.
The implementation environment may not be, and typically isn’t, the same as the deployment environment in which the application is going to run. In fact, deployment decisions may influence the design. We’ll see later in the discussion about concurrency (section 4) how multi-machine environments can change the low-level details of how processes communicate.
Once deployed, systems need monitoring and maintenance usually by teams different from the developers. Some of their tasks include:
· Running help centers and Customer Relationship Management (CRM) tools to support end users.
· Inspecting logs and gathering statistics to determine how the system is behaving.
· Making sure databases are consistent.
· Bringing new releases and updates online seamlessly.
These efforts are expensive but no large project can afford to be without them. We won’t be spending any time on this phase of software development. That doesn’t mean it is not important, only that we have other things to do.
As you can see, production quality systems are a lot different from school problems. The process of building them is different as well. Homework has a “see it-do it” flavor. Real systems have the above phases that, not surprisingly, must be managed.
It seems logical to do these phases one at a time, delivering the results of each phase to the team responsible for the next phase, like a waterfall cascading over a cliff with many levels.
Managers love the waterfall metaphor. Just schedule each phase, do it as best you can, and move on to the next phase. Unfortunately, it doesn’t work that way in the real world. Mostly the problem is with requirements. They change and developers must respond.
Even if the requirements are perfectly specified and without conflict, requirements change. Mainly for two reasons:
Requirements evolve through understanding of the problem. When requirements are analyzed and code is being developed, requirements change, mostly by becoming more complex, catching missed cases, making delayed decisions, etc.
Introducing a computerized solution to a group that has been something manually changes the environment in ways that cannot be anticipated. Think about how email systems must now handle spam. The requirements of the original email programs couldn’t know that without special handling, spam overwhelms most users. The design of Digital Video Receiver (DVR) software will face similar problems. A DVR is not just a VCR with a hard disk, as the developers will learn.
The result is that the waterfall process is fiction. Instead, building most systems involves iteration. That is:
· A subset of the key requirements is identified. A minimum system called a prototype is designed and implemented.
· The client uses the prototype and then reviews the requirements. Perhaps some new ones are added, some old ones discarded, existing ones changed.
· A next set of requirements to be covered by the next release is decided upon and implemented.
· The last two steps are repeated until the system meets the satisfaction of the people paying the bills. The prototype is renamed and deployed.
Another name for this process is spiral development. Whatever it’s called, it has many advantages.
· A complete and accurate set of requirements does not need to be specified before coding begins.
· The iteration accounts for inevitably changing requirements.
· Clients get to see preliminary working systems, review them, and comment before final delivery. The waterfall process often delivered systems that were out of date before users even touched them. Iteration produces useful and usable systems.
Extreme Programming (XP) embraces the iterative process described above. In fact as we’ll see, it embraces it extremely, and hence its name. In particular, two of its principles are especially relevant to this class:
· Many quick, small releases – While the iterative process acknowledges doing releases until the project is done, it doesn’t say how many to do or how big each should be. XP believes in small releases, perhaps one every few weeks. This procedure is taxing, but it forces the design team to constantly prioritize the requirements to determine what goes in the next release. That counters the problem of requirements becoming irrelevant—a requirement doesn’t influence the design until it becomes part of a release. XP denies the long-held tradition of building a design that can “support” all features. XP believes (1) you account for the features only when they make it into a release, and (2) you will be able to easily refactor your design (see section 7.3) to accommodate the new feature. We’ll see if these two XP features are true.
· Pair programming – XP believes that all at-the-computer time should be done by a pair of programmers: one actually responsible for the coding, and one thinking about the big picture. Pair programming will be a step on the way to learning how to work in teams or groups.
Part of the complexity of figuring out how do quick releases depends on your ability to plan. For this, XP has something called the Planning Game.
In the planning game business people decide the scope, priority, and composition releases. They also decide on the dates of releases. Of course, it is ridiculous to think they can make any date-related decisions in a vacuum. What do they know of the technical issues of developing the code that goes into their release?
In fact, technical people make estimates, elaborate the consequences of the business decisions, and, most importantly, do the detailed scheduling. With this feedback the business people can make informed “final” decisions about the composition of a release.
This Planning Game as thus described has been formalized into a series of three moves:
· Exploration - Find out what's new to do. Write stories, scenarios, interaction diagrams, whatever is your favorite way of specifying requirements and try to estimate them.
· Commitment - Decide on a subset as business sorts the options by value, cost, and risk. Then prioritizes the list.
· Steer - Guide development as reality hits. Plan on iteration of 1-3 weeks of work. Of course if the plan is failing, you must plan for a recovery strategy that probably includes new scenarios and reestimates.
The iteration by the development team mentioned in step 3 just above has the same three planning game steps, but with different actions:
· Exploration - Design the tasks and document them on Engineering Task Cards. Split and combine tasks as required.
· Commitment - Programmers accept tasks and then estimates them. This is so reasonable. How can anyone but the programmer who is going to do the work do the estimate? Of course the manager must do load balancing, i.e. make sure all the assigned tasks are doable in a sensible schedule and that no one is overcommitted.
· Steering - Keep checking your progress, perhaps plan some recovery, and redesign if things aren't going well.
Task cards are used by the programmer to chart and document his progress. The task card is analogous to the journal kept by experimental scientists. All reputable scientists keep journals of their experiments. How else do they know what they are doing. Why don’t programmers do the same? There is no magic in a task card. Design it in any way you want, with whatever information is relevant to you. All that’s important is that it has everything you need to chart your progress on a task and know when it is done. It might look like this:
Engineering Task Card
Date:___________ Engineer________________________ Task Estimate____________
Task Description:
Engineer’s Notes:
Task Tracking:
|
Date |
Done |
To Do |
Comments |
XP believes that all at-the-computer time should be done by a pair of programmers: one actually responsible for the coding (the driver), and one thinking about the big picture (the navigator). Pair programming will be a step on the way to learning how to work in teams or groups.
Pair programming has been adopted by eXtreme programmers. Many books have been written about this technique dating back to the early days of programming. Pair programming is not about one person programming and one watching. It is about a highly interactive session of programming, brainstorming, reading code, discussing ideas, creating tests, and other jobs primary to the job of programming.
Pair programming is not a tutoring session. Often experienced programmers may be paired with inexperienced ones, but mentoring is not the goal. Better programs are.
Pairs are not fixed for all time. In fact, pairs join together only for a single session. It is part of the methodology to make sure that all members of a team can pair with one another.
The Seven synergistic behaviors of Pair programming – Pair programming is said to increase productivity of the two programmers that previously worked alone. The pair programming people say there are seven examples of this synergistic behaviors:
1. Positive Pair Pressure – Programmers work harder and smarter. They don’t want to let partner down; they are motivated to complete tasks; and working in pairs helps them both in following procedures.
2. Pair Negotiation – Each partner brings different skills to the terminal. Research has shown that two heads are better than one in cognitive tasks, which programming certainly is. The natural negotiation that arises between pairs leads to more alternatives. And, of course, there is shared satisfaction of the team effort it takes to complete a task.
3. Pair Courage – Programmers need courage when undertaking challenging tasks. It is impossible to underestimate the benefit of having a partner share your risk. Often one asks “Does this look right to you?” A positive response gives support to the driver. The affirmation is a confidence builder. With confidence comes the courage to admit we don’t know. Then, when you have a problem you can ask your partner for his idea of an alternative without losing face. Otherwise you might just muddle through.
4. Pair Reviews – The navigator watches as code is entered. He reads the code as best he can and comments on anything that he needs explained. These observations are just another form of code reviews. And code reviews are IMPORTANT!!! Another pair of eyes is critical to this process. The human eye doesn’t see what it doesn’t want to see and, often, after you’ve worked on some code for a long time, you no longer SEE it. It is simply a matter of fact that third party code reviewers find glaring errors that developers overlook.
5. Pair Debugging – Explaining your code by verbalizing it to your partner will root out bugs. If you can’t explain it, you don’t know it. The navigator often asks questions that force explanations that turn up flaws.
6. Pair Learning – During a pair programming session knowledge and skills are passed in both directions. This flow is obvious when the pairs are mentor/apprentice, but it works in all relationships. Managers get more people familiar with more of the code. That’s a terrific by-product of pair programming.
7. Pair Trust – And, finally, as you work together, you learn trust one another. Trust engenders more trust, team play, etc.
The seven myths – Was all that too touchy, feely for you? Are you skeptical? You’re not alone. Here are the seven myths that often hold teams back from adopting pair programming:
1. It will double the workload – It appears that you have two people doing the job of one. If that really were true, we would indeed be only half as productive. But, if the two together can produce better code, find more bugs, and, in general, prevent problems that will be found down the line, the “increased” cost will easily be worth it.
2. I’ll never get to work alone – Certainly, the time you spend alone will diminish. But, you still will have plenty of time alone. You don’t compose the code at the terminal, you just type it in there. That leaves plenty of thinking time to design your algorithms. Also it is clear that leaving programmers alone for too long is a mistake. They can easily get on the wrong track. Pair programming prevents that from happening.
3. It only works with the right partner – Sure, a good partner is better than a bad one. But, all pairs will have their strengths and weaknesses. The random mixing of pairs will strengthen all the individuals. If a particular programmer consistently produces bad pair programming sessions, then remove him from those sessions. If you insist on doing pair programming, fire him.
4. It’s only good for training or mentoring – Pair programming is a good way to train or mentor. But there is enough evidence to show that the value of pair programming goes beyond training or mentoring.
5. I’ll never get credit for doing anything – Even if that were true, so what? What’s important is the project succeeds. That means producing the required product on time and on budget. If that miracle happens, you will get plenty of credit from management.
6. Navigators only find syntax errors – Only if your navigator is poor. Finding syntax errors is his least important job. The strategic thinking required of the navigator can be liberating to the driver, who has to worry about getting a particular piece of code to run. The big picture is hard to keep in focus while you’re debugging. That’s why it takes two to do the overall job.
7. I work best alone – That may be true. But like (2) above, working alone may not be best. Learning to work with others, as pair programming insists, can force you to grow as a team player. You might work best alone now, but with some pair programming training, you may be better in the future.
Seven Habits of Effective Pair Programmers – People have noticed some things that effective pair programmers do:
1. Take Breaks – Pair programming is exhaustive but productive. You need breaks: check your email, take a walk, something to take your mind off your work and refresh yourself.
2. Practice Humility – The idea of egoless programming has been around for decades. Excess ego can wreck a collaborative relationship. No one is perfect. John von Neumann, the inventor of our computer architecture and reputed to be the smartest person who ever lived, was famous for asking for advice.
3. Be Confident / Be Receptive – Everyone is nervous about their image and adequacy. Everyone! Be confident in your abilities. After all you’ve been admitted to an elite program–you must be doing something right. At the same time, don’t be overconfident. Be receptive to new ideas or ideas that run counter to your own. Accepting advice is not a sign of weakness. Remember von Neumann.
4. Communicate – That’s the point of pair programming. Just talking aloud can be enough to provoke a solution. This is the key part of collaboration. How many times has someone come to talk to me about a problem, and in presenting it, thought of the answer?
5. Listen – It is said that it is more important to understand than to be understood. In other words, listen and listen carefully, with respect. Unsurprisingly, people listen more to good listeners than good talkers (who they tune out).
6. Be a Team Player – That’s the point. The goal of the pair programming collaboration is to make the driver and the team succeed.
7. Hone the Balance between Compromise and Standing Firm – Developers who are too egotistical may lack the ability to compromise and may become argumentative when paired. Conversely, people who always agree in order to minimize tension aren’t getting the full benefit of collaboration. Some of both are needed.
Large projects involve teams of people. That brings with it (1) social issues of working together and (2) management problems of hiring, firing, promotion, etc. Dozens of books have been written about creating and managing teams of programmers. All have something to offer. But, managing is not that important a topic for you, yet. Much more important is teaching you how to be manageable—how to work within a group, how to make your manager and team more effective, how to develop good habits, etc.
It is not enough just to be technically competent. While there is always a place for the lone genius, the real world puts the rest of us in groups. And it’s not easy. Group dynamics are usually a mystery to young employees. Some never get it. With few exceptions the university emphasizes individual work (mostly to make grading easier). In pair programming you will learn how to communicate with your partner, how to divide up work, and other skills.
Before discussing some group issues, you should know what kinds of colleagues you will surely encounter during your career.
We would like to think that all our colleagues are like us—qualified, competent, cooperative, and normal. Unfortunately that’s not true. You will certainly meet the following characters:
Fake Superstar – Knows it all, let’s everyone know it, but somehow never seems to be around when the hard problems get solved.
Superstar - Does know it all, lives for hard problems; can bring confidence to a team. The superstar is rare, but can be hard to live with.
Underachiever – Could be qualified, competent, and cooperative, yet always seems to under-perform. Frustrating to a manager.
Overachiever – Whatever their abilities, overachievers always perform excellently. Loved by managers.
Normal Achiever – Performs up to their capabilities.
Not qualified – Could be very competent at something else, but not at what they are currently doing. Great coders might be poor designers, while good programmers may be poor managers. Good management gets qualified people into all positions.
In addition, people have different personalities:
Shy – Shy is often mistaken for incompetent. Shy, competent people often have the “most” to say.
Assertive – Assertive doesn’t mean competent, just loud. Still, many competent people are overly confident and can seem overly aggressive.
Rude – Being assertive is one thing, being rude is much another. If you are rude, you had better be a true superstar.
Cooperative – A great personality when paired with competence. Incompetent, overly cooperative people are obsequious (brown-nosers, yes-men).
Ladder climber – All of us want success and want to get ahead. Unfortunately, these types are more interested in their career than the success of their projects.
What happens when these diverse types are thrown together? The answer depends on how members of each team perceive one another.
Companies have suborganizations—executive, technical, financial, legal, marketing, human resources, manufacturing, sales, and others. Depending on the culture, the company, and the people in it, where you are may lead to undeniable differences in status. The differences can be important.
Some think that Japan’s superiority in manufacturing emanates from the high cultural status given to the work done on the factory floor. In Japan it is common to see design engineers on the floor working directly with manufacturing personnel. Stories abound about top executives marveling over a simple elegant switch. Undeniably, such high manufacturing status engenders better production.
Differences in experience also result in differences in status. A competent, experienced person naturally has more status than a competent inexperienced person. Status gets more complicated when the skill sets are related, even perhaps overlap, but are not identical.
For example, developers are often divided into designers and programmers. Designers have status, programmers don’t. Why? Because designers formulate the design (the hard, fun part) that programmers simply code (the so-called easy, routine part). However, the design/code split is not so simple. Few designs, especially large ones, are put down so precisely that all it takes is “coding.” Programmers are often the ones who find design flaws, suggest fixes, make key implementation decisions, and more. In other words programmers bridge the (wide) gap between designing and coding. If designers appreciate this reality, they have more respect for experienced programmers and will include and value them on the design team. In fact, some think that the best designers emerge from the best implementers, whatever the field.
Successful organizations solve the status problem, especially at meetings.
Meetings can be large involving a whole team or as small as two people face-to-face. They can be about anything, but all meetings have a purpose. Good meetings can give energy and purpose to a project.
We take meetings for granted. Everyone has been to them, but only experienced people understand meeting dynamics. You should consider the two roles you must play in a meeting, the obvious one of participant and the subtle one of observer.
As a participant, you contribute to fulfilling the meeting’s agenda, whatever it may be. You are in the moment: listening to suggestions, giving advice, voting, trying to reach consensus, whatever it takes to achieving the meeting’s purpose.
As an observer, you are out-of-body, paying attention to the meta-issues of a meeting. In particular:
· What are the politics involved? If your boss’ boss is at the meeting, it may be more important to make sure that your boss shines rather than worry about key technical decisions, which can be sorted out later.
· Is the meeting going well? If not, what dynamics can you change—call a break, change the topic? Can you or should you take control? Before you take such action, make sure you have the authority.
· What if two geniuses are in conflict? If you should be so lucky to be among more than one genius, prepare yourself for the meeting where they have conflicting ideas. Be careful. A genius rebuffed can sulk and turn ineffective, or think about leaving the organization.
· What is the proper role of a manager in a technical meeting? A manager is there to facilitate a meeting and make decisions. Strangely, a great manager can make great decisions without knowing or understanding all the details, technical or otherwise. Besides being cooperative, you can help your manager make great decisions by giving him great input. The next section discusses this in detail.
These are not easy questions. Many of the Dilbert cartoons capture perfectly the absurdity of dysfunctional meetings. Personalities can certainly destroy a meeting, but often the problem is ethics.
Ethics describes correct behavior. At a minimum ethics is about doing the right thing. Aristotle thought ethical behavior was more: doing the right thing at the right time. For example, donating money when you are rich is one thing, donating when you are poor is much another. And, strangely, power has a lot to do with ethics.
Power motivates people. The work environment breeds people seeking power; some deserve it, some don’t. Power has come to have negative connotations:
· Often the wrong people come to power.
· Power seeking seems unseemly.
· Being less powerful seems weak.
Perhaps. But power used wisely is potent. The German philosopher Friedrich Nietzsche talks about deriving power through truth and will. He means that 1) by being completely honest and facing the truth despite the consequences and 2) having the will to see struggles through, that 3) you will be seen as powerful by your peers. What could be more ethical?
I agree strongly with Nietzsche, certainly when it applies to group dynamics:
· Truth and bad outcomes – The truth can hurt. Sometimes you hurt others when you dispute their arguments or point out their flaws. Hurt feelings are common. Sometimes you hurt yourself if you disagree with your boss. Still, the truth is necessary (but not sufficient) to correct decision making. Having said all this, make sure you understand the difference between truth and opinion.
· Dealing with bad news – No one likes bad news. It is easy to deny bad news or shrink from it. But it takes will to face it squarely and consider what to do next. Believe me, in your working career you will face bad news many times. Will you show power or cowardice?
· Cooperation or furthering a personal agenda – You are in an argument. Are you positive you are right? Can you consider that you may be wrong? Yielding is not weak if you have Nietzsche’s kind of power. More meetings have been ruined by conflicts between unyielding members, each sure that he is right. Doesn’t Nietzsche say that will is important? Yes, but make sure you understand that yielding to a position may be appropriate and correct, not weak.
· Acknowledging power differences properly – We are not equal in capability, status, or power. How you react to those differences is important. Do you cooperate with your boss? Do you respect him even he is technically inferior to you? How do you treat those less competent or powerful than you? Are you arrogant in your superiority?
I often hear of effective people. When I probe, it is less about technical competency and more about the above issues. It takes a variety of skills to ensure a project’s success.
[I often dream of meetings where the participants, in true Nietzschean fashion, put their true cards on the table, discuss honestly the reality of the situation, and plan effectively from this state of affairs. Can you imagine the power that comes from such openness? Is it in you?]
One of the tricks you can play on programmers is to ask them how much of their time they spend actually writing code. You will hear estimates running anywhere from 70-90%. When you tell them that studies have shown that even the best programmers tend to produce on average about 15 lines of code per day, the programmers are shocked. What they fail to realize is that on the days that they are writing code. production is high. But, many more days are spent on managing the code. The next few sections describe some of these non-coding activities.
If your program consists of one all-inclusive file, just compile and run it. For anything more complicated, organizing the source code and using build tools are essential. The recurrent tasks for build tools include:
· Compiling source code and pointing to files necessary to compilation.
· Storing the resulting object and executable files.
· Creating aggregate files (tar, jar, zip)
and many other things as we’ll see.
We create scripts of actions that carry out these tasks. Scripts are programs and must be coded so they can be understood, reused, modified, and debugged.
Unix people are familiar with makefiles. Java has a new system called Ant. You will be using both in your projects.
For your typical school problems, you write code, test it, turn it in, never to see it again. Investing a lot of energy in the testing phase makes no sense. But large systems have a long lifetime. The kind of sitting-at-terminal testing you do for small projects is inappropriate for large systems that evolve over time. You must have a testing methodology that permanently captures the test cases and can be rerun at will.
This is not theoretical work as you will see. Mostly it takes discipline to write and save the tests. Fortunately, serious developers have built frameworks for unit testing. JUnit is for Java. Several have been built for C++. CppUnit at http://sourceforge.net/projects/cppunit claims to be a port of JUnit.
You will become familiar with both in your project work.
With teams of people working on a project you simple must use tools to help with controlling the source code. There are many systems out there for doing this. One for free is at http://www.gnu.org/software/cvs/
I have
decided NOT to have you use any source code management tools in this class.
With 2-person teams, they are simply not needed. But I do want you to be aware
of what they can do. Below is information taken from the CVS website.
What is CVS About? CVS maintains a history of the source and how it was developed. It stamps each change with the time of the change and the user who made it. Usually, the person provides a bit of text describing why they made the change. Given that information, CVS can help developers answer questions like:
· Who made a given change?
· When did they make it?
· Why did they make it?
· What other changes did they make at the same time?
Checking out a working directory—CVS doesn’t work on ordinary directory trees; you need to work within a directory that CVS created for you. Just as you check out a book from a library before taking it home to read it, you use a checkout command to get a directory tree from CVS before working on it.
Merging your changes—Since each developer uses their own working directory, the changes you make to your working directory aren’t automatically visible to the other developers on your team. CVS doesn’t publish your changes until you’re ready. When you’re done testing your changes, you must commit them to the repository to make them available to the rest of the group.
However, what if another
developer has changed the same files you have, or the same lines? Whose changes
should prevail? It’s generally impossible to answer this question
automatically; CVS certainly isn’t competent to make that judgment. Thus,
before you can commit your changes, CVS requires your sources to be in sync
with any changes committed by the other team members.
Committing your changes—Now that you have brought your sources up to date with the rest of the group and tested them, you are ready to commit your changes to the repository and make them visible to the rest of the group.
Examining changes—At this point, you might well be curious what changes the other developer made. To look at the log entries for a given file, you can use the cvs log command.
You can find plenty of documentation at the website.
Source code management prevents cataclysmic disasters that many of us working on large projects have been through. Here is a quote from the “Ant in Anger” paper in the Ant documentation:
“If you don’t have a source code management system, you are going to end up hosed. If you don’t have everything under source code management, including web pages, dependent jars, installation files, you are still going to end up hosed, it’s just a question of when it’s going to happen. CVS is effectively free and works well with Ant, but Sourcesafe, Perforce, Clearcase and StarTeam also have Ant tasks. These tasks let you have auto-incrementing build counters, and automated file update processes.“
I couldn’t have said it better.
Often, large software system require startup information in order to:
· run in a certain environment,
· run a particular set of cases;
· execute in different modes, such as debugging.
· and anything else that might affect its behavior
This configuration information can be passed as command line parameters, but is more often stored in an external file that the program reads during startup. The question is what is the format of the file?
Mostly, developers invent a homegrown format convenient for their application. They write (simple) parsers that process the configuration file. As XML became more popular, developers started it for formatting their configuration files. Freely available XML parsers saved a lot of work. The result is that XML-based configuration files have become standard.
The Tomcat java web server and the Ant build-management system both use XML formatted files for their configuration. Since you will be using both systems, you need to learn a little about XML. Section 10 discusses XML in detail.
Code is not sufficient as documentation of a software system. Code shows only how something works. Code is just the implementation of a design. For sufficient documentation the following is absolutely the minimum required:
· What the requirements are, i.e. what is the system supposed to do.
· What the design is, i.e., how the system is going to fulfill its requirements.
· How each design feature is motivated by a requirement, i.e., make sure the system doesn’t do unwanted things.
· How each requirement is expressed in the design, i.e., make sure the system does all the required things.
That seems straightforward. In fact, your textbook says that for all systems, “If not documented, then maintenance can be farcical.” Of course. How can you maintain or modify a system if you don’t know why the code is written the way it is?
Even once we agree on documentation, system builders face the following problems:
· What format should the documentation be in?
· What formalisms should be used to describe the requirements and the design?
· How easy will it be to maintain the documentation?
These are hard questions and you will study them in your software engineering classes. The next few sections will talk about them briefly, enough for our class.
Over the years, many organizations and researchers have tried to specify a format for their documentation. If you work for an organization doing government or military work, you will become quite familiar with their comprehensive documentation requirements. Some of those organizations have settled on a format called MBASE developed right here at USC by our own Professor Barry Boehm.
MBASE stands for Model Based System Architecting and Software Engineering. MBASE identifies absolutely everything that must be documented, from requirements, to design, to implementation plan, to maintenance, everything. MBASE is for large projects that take years to complete. It is overkill for our needs.
Instead you will be using an informal and non-standard format called a Roadmap. A roadmap satisfies all the basic documentation needs. Don’t be fooled by its informal nature. It will be your key to successfully navigating though our code.
Whether using MBASE or Roadmaps, diagrams are necessary. Common ones are:
· Big Block Architecture Diagrams – What are the major components of our system and how do they interrelate?
· Interaction Diagrams – How does an end user interact with the system?
· Flow Diagrams – How does the code execute?
· Timing Diagrams – How does the system behave over time?
There are many formalisms for expressing these diagrams. UML (Unified Modeling Language) has become the standard. UML has a way to build a diagram for everything, from requirements, to object-oriented classes, to dependencies. Your textbook covers UML in depth. Industry uses it. You will use some of it. I’m not a big fan.
UML tries to document everything using diagrams. And its unification is open to question. The computer tools for building UML diagrams are clunky. It’s hard to draw the pictures, it takes a lot of time, and it is hard to maintain the pictures. A common complaint that I hear from my industry colleagues is that people spend too much time on UML when they should be working on code. I don’t want you spending a lot of time on making pretty pictures. I’m much more interested in pretty code.
Remember documentation is about exchanging information. The best way may be not always be pictures. Sometimes text is fine.
The school semester is so constrained, that most projects require only the most rudimentary schedules. Real world software projects are different. Management has many tools to help scheduling. All are base on:
· Describing Tasks – What they are, how they long they will take, who will work on them.
· Describing Dependencies – What task(s) does a task depend on.
With this information, tools can create activity graphs and Pert charts for seeing the task dependencies and showing how long a project will take. The so-called critical path, the minimum time it will take to do the project, is of particular importance to a manager.
This kind of scheduling is beyond the scope of this course. Instead, I will do the scheduling for you. As you remember, we will be using an Extreme Programming methodology. That means lots of small releases—perfect for the classroom experience.
Concurrency is about events or actions happening simultaneously. Multi-tasking is doing multiple tasks at the same time. We will investigate multi-tasking for both computers and people carefully, since the analysis will help us understand concurrent software.
Most problems have solutions involving concurrency. Solving them requires divide and conquer techniques:
· Divide problem into logical tasks that can be carried out simultaneously, or in parallel.
· Assign each logical task to a physical execution entity. Let’s call this pairing a physical task (or task for short). Tasks can be scheduled. Also, you may assign several logical tasks to a single execution entity; this will influence how the execution entity carries out its multiple responsibilities.
· Determine the tasks’ dependencies. The execution entities may need to communicate or share data.
These steps apply both to human and computer systems. In human systems:
· The tasks can be instructions, directions, or recipes.
· The execution engine is a person. Groups of people can form a team.
· The dependencies are the messages that people send (or speak) to one another and the information they share.
In computer systems (explain in detail in section 4.5.1):
· The tasks are threads within a process. A process can have multiple threads.
· The execution engine is a CPU. Multiple CPUs can cooperate.
· The dependencies are the shared and global data, methods or services, or messages between processes.
This compelling analogy is one of the reasons why I often use examples from own lives to demonstrate the issues raised by concurrency. People have the same problems with concurrency as computers do. One major problem, often underestimated, is how to represent a solution.
In a family application whose current goal is to have dinner, the following might be one requirement of many:
“Have cooked carrots for four.”
Notice that the requirement is under specified. It doesn’t say how to cook the carrots or what style to prepare them. We could microwave, mash and serve them. That would fulfill the requirement. Do you see how this under specified requirement could give the cook (the designer) too much freedom?
On the other hand, what if the designer was a good chef. Do you see how this under specified requirement could give the cook (the designer) just the right flexibility? Perhaps if you had a truly great chef, you should only have the one requirement: “Prepare dinner for four at 8pm.” Requirements are tricky. [This actually happened to me recently in Italy. Our host was an American friend who rented a villa near Sienna. Though he knows nothing about Italian food, he would insist on crafting menus for our private, expert cooks. No amount of prodding from me could convince him to leave the decisions (the design) up to the cooks.]
In any case here’s a recipe/design for the “Cooked carrots for four” requirement:
Cooked carrots: Put chopped carrots into boiling water for 15 minutes. Drain and serve.
Notice that the algorithm is also under specified. How much water should we use? How many carrots should we chop? A designer must be careful about such details. But here I’m concerned about concurrency issues.
If I gave this algorithm to an inexperience cook (the execution entity), you might see the following execution scenario:
1. Chop carrots. [Takes 5 minutes.]
2. Put on water to boil. [Takes 5 minutes.]
3. Put carrots in boiling water. [Takes 15 minutes.]
4. Drain the carrots.
5. Serve the carrots
This cook simply followed the recipe linearly and didn’t realize that putting on the water should have been done first. That way, the water would be boiling when the cook finished chopping the carrots. An experienced cook would know that. Perhaps, a more expressive algorithm would help a novice.
The problem is that the recipe didn’t explicitly say that boiling the water and chopping the carrots can be done concurrently. The textual, linear representation is not conducive to showing concurrency. Perhaps we should have written the recipe as:
Cooked
carrots (numberOfServings){
Concurrently do {chop
carrots; boil water};
Put carrots in
boiling water for 15 minutes;
Drain;
Serve.
}
That’s more expressive, but we might not sell many cookbooks with these kind of directions. Yet the advantages are obvious. It is easy to see that if there were three cooks, one could chop the carrots, one could boil the water, the other could do the rest. A single cook who understood these directions would know that “chop carrots” and “boil water” are concurrent actions and could be done in any order. In any case the algorithm shows the logical concurrency, and it is up to the execution entities to schedule the work intelligently.
Note that I also put a parameter list in the first line, but I didn’t use it. I had in mind a parameter for number of portions. Then that parameter could be used to compute how many carrots to chop, how much water to boil, and so forth. Do you thinks home cooks would like recipes written that way?
These problem may not be too important for cooks, but they are critical for software developers, especially concurrency. Later we will see graphical representations for modeling concurrent behavior. There are even graphical programming languages, mostly found in manufacturing, that emphasize concurrency.
Concurrent does not mean independent. A good design limits dependencies. The more dependent one task is on others, the more difficult it will be to understand, implement and change the design. Many books have been written on this subject. Managing dependencies is what encapsulation is all about: try to keep as much private as you can; recognize what must be global or public; limit the task interactions.
Private data is just that, private. Only the task that owns private data can view and modify it. This restricted locality limits where you have to look to see how the private data is used. The private data is encapsulated.
Sometimes it is convenient and necessary to share data. Here are some examples from human systems:
· In the classroom, all students can see anything written on the blackboard. The writing on the blackboard is global to the classroom application. All the students and the teacher are acting concurrently.
· In a delicatessen, customers take a number from a single dispenser in order to queue up to order food. The dispenser is public and shared by all customers in the delicatessen application. All the customers are acting concurrently.
· Two cars are going in the same direction in lanes one and three of a three-lane highway. The middle lane, lane two, is empty and either car could change lanes into it. All drivers share the lanes. All the drivers are acting concurrently.
These situations are so familiar that we aren’t even aware of the sharing problems:
· Two people cannot write simultaneously on the same space on the blackboard. Yet, everyone can read the blackboard simultaneously.
· Two people cannot grab a delicatessen number at exactly the same time.
· Two cars in the outside lanes cannot both merge into the same space of the center lane.
Think about how you solve these problems in real life. Later in section 4.5.2 we will see how to solve these problems in computer systems.
Two (or more) tasks may interact in carrying out their responsibilities:
· Providing Status – One may simply inform the other of some event, e.g. “I will be free at 3pm.”
· Asking for services – One may make a request of the other, e.g. “Can you get me Fred’s email address?” Traditional method calls are requests for service.
These interactions can be synchronous or asynchronous:
Synchronous means the initiator waits for a status acknowledgement or the service to be completed.
Asynchronous means the initiator doesn’t wait.
In human systems the interactions are communicated in two ways:
Directly through voice, gestures, etc. This can be done synchronously or asynchronously.
Indirectly through mail. This is always asynchronous.
Concurrent software applications are all around us. Web browsers and web servers are an obvious example. Web applications are also examples of a distributed system. We’ll spend a lot of time on this application later in the course in section 8.
Another smaller example of a concurrent application is your email program. All email programs have the following basic functional requirements:
· We should be able to receive and read incoming mail.
· We should be able to compose and send outgoing mail.
In most implementations the following happens:
Downloading Mail: On a timer or mouse click, our local email program connects to our mail server to download new messages. It adds them to the input mail folder and displays key headers (from, to, subject) on the screen.
Composing Mail: On a mouse click, a mail template is put on the screen for you to fill out. When done, your SMTP (Simple Mail Transport Protocol) server sends the mail to the recipients.
Downloading and composing mail are logically concurrent. They could be happening simultaneously: the incoming mail can be downloaded (in the background), while you are composing mail (in the foreground).
And yet, in many of the early email systems when you clicked “Read Mail” nothing else could be done until the incoming mail was completely processed. If you had a lot of mail, this wait was particularly annoying. In modern email systems, the two tasks are assigned to different threads and the operating system interleaves their execution. The next few section will explain these concepts.
We’ve spent a lot of time investigating applications that have been divided into concurrent tasks. Now it is time to see how operating systems work and support concurrent software.
. In general:
· Programs are textual representations of tasks that are going to execute on computers. Programs are compiled from a source language, such as Java or C++, into machine code that will execute on a target computer, such as a Unix or Windows machine.
· The compilation produces an object, often called an executable, which can be loaded. Most of the executable is the code plus data (the logical address space).
· When loaded, the executable becomes a process. [Some programs are compiled into libraries whose routines that can be called by a process. Libraries can be shared among processes or linked directly into the process].
· A process can have one or more threads of control.
· The operating system scheduler time-shares among all the active threads. This time-sharing interleaves the execution of the threads.
Remember, computers don’t execute programs. Programs are simply the source text that is be compiled into executables.
· The OS loads the logical address space of the executable into main memory. The newly loaded physical address space contains the code to be executed, storage for the data used by code, and stack space given to the process for use during method calls
· The OS creates a thread of control by assigning a program counter (PC) to point to the current instruction. On initial load the PC points to first instruction of the main routine.
· The OS puts the thread on the scheduler’s ready queue.
As we will now see, it is threads that the CPU executes.
We often think of threads as physical things. People have trouble visualizing what a thread is. You won’t be one of them after this class.
As mentioned above, each process has a default thread that starts executing at the beginning of the main routine. However, all modern operating systems and languages have support for multi-threaded processes. In other words the thread that is running can spawn another thread (or whole process). All that is required is to create an instance of the code at which to start execution.
In Java, you create a class that extends Thread or implements the runnable interface. Then you invoke the instance’s start() method, which tells the JVM to create a thread and begin its execution at the run() method. Easy.
In C++ , you use the Pthreads
package (see 13.1).
http://www.awprofessional.com/articles/article.asp?p=169479&seqNum=1
In both cases the operating system creates a new thread of control by assigning a program counter to the first instruction, assigning some stack space (and a few other things), and putting the thread on the ready queue. The new thread competes for the CPU time just like any other thread.
However, and I can’t say this too strongly, newly created threads are part of the same address space as the thread that created them. Any global data in that address space is available to all the threads within the process. The next section explains.
We will use the Agent application to show how memory and execution are managed. Figure 1 shows the state of memory after loading Main.java of v1.0. Here is the code we care about:
MaitreDAgent maitreD = new MaitreDAgent("MaitreD"); maitreD.startThread();
CustomerAgent
c1 = new CustomerAgent("Fred");
CustomerAgent
c2 = new CustomerAgent("Ethel");
CustomerAgent
c3 = new CustomerAgent("Ginger");
c1.startThread();
c2.startThread();
c3.startThread();
We’ve called the main thread t1. Its program counter is denoted by an arrow pointing to the next instruction to be executed in that thread. Before going into the details of the applications, there are several things to note about this “representation”:
A computer’s memory is shared amongst all the active processes. Each process will look like this one with at least one thread of control. Each address space is protected from one another so they do not interfere (you will learn about memory management in detail in your operating system class).
Each cell is meant to represent one word of memory. Generally, a word of memory is 4 bytes long, or long enough to hold enough to hold a computer instruction or a pointer to another location in memory.
The loaded code is shown in source form (with some comments). In reality, there are no comments and the code is compiled into word-sized machine instructions.
Notice that an address space contains:
· One copy of all the global variables.
· One copy of the code.
You
will see that each class variable points to its own copy of the local variables
associated the class.
|
//All
Code below for Test.java |
//Data
for Test.java Follows |
||
|
t1 |
//global
variables (if any) |
||
|
|
… |
||
|
c1=new CustomerAgent("Fred");
c1.startThread(); |
//local
variable of main routine |
||
|
c2=new CustomerAgent("Ethel");
c2.startThread(); |
maitreD: null |
||
|
c3=new CustomerAgent("Ginger");
c3.startThread(); |
c1:
null |
||
|
c1.setMaitreD(maitreD); |
c2:
null |
||
|
c2.setMaitreD(maitreD); |
c3:
null |
||
|
c3.setMaitreD(maitreD); |
|||
|
c1.setHungry(); |
|||
|
c2.setHungerLevel(7); |
|||
|
c2.setHungry(); |
|||
|
c3.setHungry(); |
|||
|
currentThread.kill();//put in by compiler |
|||
|
//AgentThread Methods |
|||
|
//run() |
|||
|
goOn = True |
|||
|
while
(goOn) { |
|||
|
… |
|||
|
} |
|||
|
currentThread.kill();//put in by compiler |
|||
|
//other
AgentThread methods |
|||
|
… |
|||
|
//CustomerAgent Code |
|||
|
//CustomerAgent Constructor |
|||
|
CustomerAgent(String name){ |
|||
|
… |
|||
|
} |
|||
|
//other
CustomerAgent methods |
|||
|
respondToStateChange() { |
|||
|
… |
|||
|
… |
|||
|
} |
|||
|
… |
|||
|
//MaitreDAgent Code |
|||
|
//MaitreDAgent Constructor |
//Stack
space for thread t1 follows |
||
|
MaitreDAgent(){ |
|||
|
… |
|||
|
} |
|||
|
//other
MaitreDAgent methods |
|||
|
respondToStateChange() { |
|||
|
… |
|||
|
… |
|||
|
} |
|||
|
… |
|||
|
//Code
for utility classes |
|||
|
//TimerQueue code |
|||
|
schedule(){ |
|||
|
… |
|||
|
} |
|||
|
… |
Figure 1 Memory after Loading Test.java
When executed, the line of code pointed to by t1’s program counter causes execution of the MaitreDAgent constructor: maitreD=new MaitreDAgent(). Once the startThread() is called, the full instantiation is shown in Figure 2. In particular:
Another thread, t2, is started. The new program counter points to the first instruction of run() (inherited by the MaitreDAgent from the Agent base class).
A stack has been allocated for t2.
Some data space has been allocated for the local variables of this instance of a MaitreDAgent. After execution of the constructor, the variable, maitreD, is associated with this space (as well as the methods of MaitreDAgent).
The new thread competes with all threads for the CPU. Currently, t1 is running. Any thread runs until:
· the scheduler picks another thread after a certain amount of time elapses;
· the thread blocks waiting for I/O;
· the thread blocks waiting for an event. For example, in the infinite loop in the run() method, our agent can block while waiting for its semaphore to be releaseed.
Thread t1 may run to completion before thread t2 executes even one instruction. Or t2 might be selected right now. These timing uncertainties are called race conditions (see section 4.5.2).
|
//All
Code below for Test.java |
//Data
for Test.java Follows |
|
//main
routine |
//global
variables (if any) |
|
|
… |
|
c1=new CustomerAgent("Fred")… |
//local
variable of main routine |
|
c2=new CustomerAgent("Ethel")… |
|
|
c3=new CustomerAgent("Ginger")… |
c1:
null |
|
c1.setMaitreD(maitreD); |
c2:
null |
|
c2.setMaitreD(maitreD); |
c3:
null |
|
c3.setMaitreD(maitreD); |
|
|
c1.setHungry(); |
//Instance
data of maitreD variable |
|
c2.setHungerLevel(7); |
//Agent
local variables |
|
c2.setHungry(); |
|
|
c3.setHungry(); |
agentThread://pointer
to thread |
|
currentThread.kill();//put in by compiler |
//AgentThread Class variables |
|
|
goOn: |
|
//run() |
//MaiterDAgent local variables |
|
goOn = True |
waitingCustomers: pointer |
|
while
(goOn) { |
tables:
pointer |
|
… |
|
|
} |
|
|
currentThread.kill();//put in by compiler |
|
|
//other
AgentThread methods |
|
|
… |
|
|
//CustomerAgent Code |
|
|
//CustomerAgent Constructor |
|
|
CustomerAgent(String name){ |
|
|
… |
|
|
} |
|
|
//other
CustomerAgent methods |
|
|
respondToStateChange() { |
|
|
… |
|
|
… |
|
|
} |
//Stack
space for thread t1 follows |
|
… |
|
|
//MaitreDAgent Code |
|
|
//MaitreDAgent Constructor |
|
|
MaitreDAgent(){ |
|
|
… |
|
|
} |
|
|
//other
MaitreDAgent methods |
|
|
respondToStateChange() { |
//Stack
space for thread t2 follows |
|
… |
|
|
… |
|
|
} |
|
|
… |
|
|
//Code
for utility classes |
|
|
//TimerQueue code |
|
|
schedule(){ |
|
|
… |
|
|
} |
|
|
… |
Figure 2 Memory After Instantiating MaitreD agent
Figure 3 shows memory after executing the next
line of code in thread t1:
c1=new CustomerAgent("Fred"); c1.startThread();
Notice the following:
·
Another thread, t3, has been created for the new customer. Note
that t2 and t3 point to the same instruction. It is imperative that you understand
that the threads execute within the same copy of code.
·
A stack has been allocated for t3.
·
Space has been associated with the variable, c1. This space has
the local variables of the class CustomerAgent. Some of
those variables come from the inherited base class.
In thread t1, after execution of
c2=new CustomerAgent("Ethel"); c2.startThread();
c3=new CustomerAgent("Ginger"); c3.startThread();
there will be a total of five
threads running in the Test.java process: the main thread, one MaitreDAgent thread, and three CustomerAgent
threads.
The main thread, t1, might
continue until it finishes. In that case t1 will exit, leaving the other
threads to continue competing for CPU time.
You should be able to simulate execution of the CPU by:
·
Picking an active thread;
·
“Executing” the instruction pointed to by its program counter.
·
“Incrementing” the program counter to point to the next
instruction.
·
Continuing the Executing/Incrementing until you pick another
thread.
Notice that that there is no
real concurrency going on here since there is only one CPU. However, the
scheduling behavior allows interleaving and gives the illusion of
concurrency. If you assigned each thread to a separate processor, then you
would have real concurrency. Since typical computers only have one CPU
for executing code, only distributed applications (section 5) have real concurrency.
|
//All
Code below for Test.java |
//Data
for Test.java Follows |
||
|
//main
routine |
//global
variables (if any) |
||
|
maitreD=new MaitreDAgent() |
… |
||
|
|
//local
variable of main routine |
||
|
c2=new CustomerAgent("Ethel");… |
|
||
|
c3=new CustomerAgent("Ginger");… |
|
||
|
c1.setMaitreD(maitreD); |
|
||
|
c2.setMaitreD(maitreD); |
c3:
null |
||
|
c3.setMaitreD(maitreD); |
|||
|
c1.setHungry(); |
//Instance
data of variable maitreD |
||
|
c2.setHungerLevel(7); |
//Agent
local variables |
||
|
c2.setHungry(); |
|
||
|
c3.setHungry(); |
agentThread://pointer
to thread |
||
|
currentThread.kill();//put in by compiler |
//AgentThread Class variables |
||
|
|
goOn: |
||
|
//run() |
//MaiterDAgent local variables |
||
|
|
waitingCustomers: pointer |
||
|
t3 |
tables:
pointer |
||
|
… |
|||
|
} |
//Instance
data of variable c1 |
||
|
currentThread.kill();//put in by compiler |
//Agent
local variables |
||
|
//other
AgentThread methods |
|
||
|
… |
agentThread://pointer
to thread |
||
|
//CustomerAgent Code |
//AgentThread Class variables |
||
|
//CustomerAgent Constructor |
goOn: |
||
|
CustomerAgent(String name){ |
//CustomerAgent local variables |
||
|
… |
name:
“Fred” |
||
|
} |
hungerLevel: 5 |
||
|
//other
CustomerAgent methods |
maitreD: null |
||
|
respondToStateChange() { |
isHungry: false |
||
|
… |
isWaiting: false |
||
|
… |
isSeated: false |
||
|
} |
|||
|
… |
//Stack
space for thread t1 follows |
||
|
//MaitreDAgent Code |
|||
|
//MaitreDAgent Constructor |
|||
|
MaitreDAgent(){ |
|||
|
… |
//Stack
space for thread t2 follows |
||
|
} |
|||
|
//other
MaitreDAgent methods |
|||
|
respondToStateChange() { |
|||
|
… |
|||
|
… |
//Stack
space for thread t3 follows |
||
|
} |
|||
|
… |
|||
|
//Code
for utility classes |
|||
|
//TimerQueue code |
|||
|
schedule(){ |
|||
|
… |
|||
|
} |
|||
|
… |
Figure 3 Memory
After Instantiating MaitreD agent and Customer Agent
Fred
Concurrency and data sharing are separate issues:
· The illusion of concurrency via threads produces interleaving behavior.
· Data Sharing between threads allows information exchange.
Concurrency and data sharing together cause the Critical Section Problem. The Agent application has examples of this problem, but the classic Producer-Consumer problem shows it more simply. Assume there are multiple producers of items that are used by multiple consumers as follows:
·
Producers put items into a circular array called buffer whose
length is size.
·
The next free location for a new producer item is called in.
·
Consumers take elements out of buffer from the location called
out.
·
If in==out then buffer is empty.
·
If in+1==out then buffer is full.
·
The variables, in, out, and buffer are shared by all producers and
consumers.
·
Each producer and consumer is executing in its own thread.
Figure 4 shows the address space for this problem with two producers—threads P1 and P2—and one consumer with thread C1.
|
|
//global variables |
|
while (true) { |
in: 7 |
|
|
out: 4 |
|
buffer[in]=nextProduced(); |
size: 10 |
|
in=(in+1)mod size; |
buffer[0]: |
|
} |
buffer[1]: |
|
buffer[2]: |
|
|
//consumer code |
buffer[3]: |
|
while (true) { |
buffer[4]:
xyz |
|
|
buffer[5]:
abc |
|
next=buffer[out]; |
buffer[6]:
def |
|
out=(out+1)mod size; |
buffer[7]:
ghi //by P1 |
|
} |
buffer[8]: |
|
buffer[9]: |
Figure 4 An address
space for Producer Consumer
Let’s suppose that in this execution:
·
P1 is running and has finished executing buffer[in]=nextProduced(); The result put “ghi”
into buffer[7]. P1’s program counter has been incremented and now points to the
code that is about to increment in.
·
Let’s say the operating system scheduler interrupts P1 at exactly
this point and gives control to P2. Remember that P1 and P2 are executing the
same code and sharing the global variables.
·
P2 sees there is room in the buffer and moves on to the next
instruction.
·
P2 executes buffer[in]=nextProduced() and puts its output, say “mno,” into buffer[7], the same place that P1 did. Yikes!!
·
P2 executes the increment statement, making in=8.
· Let’s say P2 in interrupted right now.
Memory now looks like the following:
|
//producer code |
//global variables |
|
while (true) { |
in: 8 |
|
|
out: 4 |
|
|
size: 10 |
|
in=(in+1)mod size; |
buffer[0]: |
|
} |
buffer[1]: |
|
buffer[2]: |
|
|
//consumer code |
buffer[3]: |
|
while (true) { |
buffer[4]:
xyz |
|
|
buffer[5]:
abc |
|
next=buffer[out]; |
buffer[6]:
def |
|
out=(out+1)mod size; |
buffer[7]:
mno |
|
} |
buffer[8]: |
|
buffer[9]: |
Figure 5 Just after P2 runs
If control is given back to P1, then P1 increments in to be 9. Yikes again!!! The state of the system is terrible:
· P1’s output into buffer[7] has been overwritten and lost.
· in is now pointing to element 9 of buffer.
· buffer[8] does not have a valid item in it. What will happen to the consumer when it gets there?
This disaster occurred because P1 was interrupted before it got a chance to update the in variable. However next time this program is run, P1 might complete its update and all might be well. P1 and P2 are in a race to complete their update before they get interrupted. Race conditions are part of any multi-threaded application. The bug here is particularly difficult to track down because it may not be reproducible—every mix of code in the operating system will produce different timings and results. We can’t prevent race conditions, but we can write better code.
The above analysis shows that only one producer thread should be allowed to execute the code in the while(true) loop. That code is called a critical section, and its execution must be atomic, i.e., its execution must complete before any other thread can enter it. You will study this problem in depth in your Operating System class. For now, here are two techniques to solve this error, synchronization and messaging.
Synchronization controls event ordering. We need them anytime we want concurrent events to happen simultaneously or in some sequence. For the Critical Section problem we want to control entry in the critical section—only letting one thread in at a time—and upon exit from the critical section we can let another thread in, i.e., provide serial access to the critical section.
Both semaphores and monitors provide synchronization.
A semaphore, s, is an object with two operations, acquire()
and release(). wait() and release() are also known as wait and signal, p and v,
down and up, or take and give. Semaphore has a slot called permits, which
controls the number of thread that can "get into" the rest of the
code. Whatever they are called, they have the following semantics:
|
acquire() { while (s.permits <= 0); //wait s--; } |
release(){ s.permits++ } |
This is not code, but how any implementation should behave. Acquire and Release are meant to execute atomically, that is, they can’t be interrupted.
A semaphore can be used for simple synchronization between threads. Suppose thread A must wait for thread B to finish some task S before proceeding. They can use a semaphore as follow:
|
//global semaphore synch = new Semaphore(0); //initial
value set to zero. |
|
|
//thread A |
//thread B |
|
… synch.acquire(); … |
… S; synch.release() … |
A semaphore also can guard a critical section as follows:
//global
variable
Semaphore
s = new Semaphore(1); //initial value set to one.
while
(true)
s.acquire(); //grab semaphore
lock
critical_section();
s.release();
//release semaphore lock
other_code();
}
In java versions beyond 1.5, we can try to acquire a semaphore. The two versions returns a boolean if we were successful. Here are the two versions:
boolean tryAcquire()
Acquires a permit
from this semaphore, only if one is available at the time of invocation.
boolean tryAcquire(long timeout, TimeUnit unit)
Acquires a permit
from this semaphore, if one becomes available within the given waiting time and
the current thread has not been interrupted.
How would you use these calls. Here are two examples:
while (!sem.tryAcquire())
{
... //perform
this until the semaphore is available
}
//code that depends on having the semaphore
...
or
while (true) {
if (sem.tryAcquire(100
milliseconds)) {
... //activity for which semaphore was needed
} else {
... //see
if there’s something else I should be doing,
//which doesn’t require the semaphore.
//Or log a warning that I’m still waiting for
the semaphore.
//Or return failure.
//Or ...
}
}
Semaphores are easy to misuse. Monitors offer an alternative that can be enforced by the compiler. A monitor encapsulates the various pieces of code that need serial access. For example:
monitor MyMonitor{
//shared
data declarations here
…
initialization_code(){…} // could be a constructor
// monitor
operations follow.
//Only
one can be active at a time
f1() {…}
f2() {…}
…
}
This monitor says that only one of the monitor operations can be active at a time. Monitors should be put “around” all shared data.
Here is the producer consumer problem in pseudo-code using monitors:
monitor ProducerConsumerMonitor {
//shared
data declarations here
int size=10;
private buffer[size];
int in=0, out=0;
//monitor
operations
insert()
{…}
remove()
{…; return item}
}
//global
variable
ProducerConsumerMonitor theMonitor;
//producer
code (in its own thread)
while(true)
{
Item data = produce_item();
theMonitor.insert(data);
}
//consumer
code (in its own thread)
while(true)
{
Item data = theMonitor.remove();
consume_item(data);
}
Since insert() and remove() are in the monitor, access to the shared variables are guaranteed to be serial. Monitors elegantly solve this difficult problem.
Java has a mechanism that approximates monitors. By assigning the keyword synchronized to a method, you are declaring that only serial access is allowed. For a full Java implementation of the Producer-Consumer problem, see ProducerConsumer Monitor, ProducerConsumer Application, Another App. With Multiple Producers and Consumers, We will discuss it in detail in class.
There are also several instances of synchronization in the agent application.
[Aside: Producer-Consumer is a pattern. If you can cast your problem as one of producer-consumer, than you can immediately use the solutions shown above. On simple reflection this can be done to many problems. For example, in the email application, the thread that gets mail from the internet server is a producer; the thread that puts summaries on the screen is a consumer.]
Semaphores, monitors, and synchronized methods are fraught with danger. The very best programmers have trouble getting this kind of code to work. If you could desynchronize the code in your application by NOT sharing data, things would be much simpler.
In messaging, a sender process creates a message and sends it to a recipient process. The message is not shared: during creation the sender owns it, after reception the recipient owns it. You can build messaging based applications. The Agent Roadmap details such an application.
So far all of our interactions have been between threads running in the same process. However, with the advent of networks, processes, perhaps on different computers, can communicate with one another in at least three logical ways:
· Exchanging steams of data, which we will investigate via Socket Programming.
· Messaging, which we have already discussed.
· Ordinary subroutine, or method, calls, which we will investigate via Java’s Remote Method Invocation (RMI).
Whatever method is used, the receiving process receives a deep copy of the data. The result is:
· No data is shared.
· The invocation—whether it is the sending of a data stream, a message, or a method call—is physically asynchronous because the two communicating processes are in separate processes.
· The “calling” application can treat the invocation as logically synchronous by waiting for the return result, if there is one. If nothing is returned, then the effect is like messaging.
Though the communication methods are different logically, they all work similarly at the lowest levels by exchanging streams of data. Let’s look at that layer first.
Here is a summary of how a client-server system works with sockets.
· A Server process “listens” on a socket;
· Client processes “connect” to that socket and start the conversation;
· Each Client and the Server have a “conversation” over 2 channels:
· One from client to server;
· The other from server to client.
· The details of the conversation depend on the protocol.
Client and Server can be written in different computer languages.
The following examples come from “Core Web Programming” by Marty Hall and Larry Brown, ISBN 0-13089793-0 ( www.corewebprogramming.com).
First comes the main() routine to house the client.
public
class NetworkClientTest {
public static void main(String[] args) {
String host = "localhost";
if (args.length
> 0)
host = args[0];
int port = 8088;
if (args.length
> 1)
port = Integer.parseInt(args[1]);
NetworkClient nwClient
= new NetworkClient(host,
port);
nwClient.connect();
}
}
We will show the detail of the NetworkClient class in a moment. We are covering clients before we cover servers because clients are slightly easier. We can test clients by connecting to existing servers that are already on the Internet. For example, using this client we can connect to Netscape’s ftp server over port 21 as follows:
>java
NetworkClientTest ftp.netscape.com 21
Generic
Network Client:
Made
connection to ftp.netscape.com and got
‘220
ftp26 FTP server (UNIX(r) System V Release 4.0)
ready.’
in response
>
Here are the steps of creating a client in the NetworkClient class:
1)
Create a Socket object
Socket client = new
Socket("hostname", portNumber);
2)
Create an output stream that can be used to
send info to the Socket (to the server):
// Last arg
of true means autoflush -- flush stream
// when println is called
PrintWriter out = new PrintWriter(client.getOutputStream(), true);
3)
Create an input stream to read the response
from the server (from the socket):
BufferedReader in = new BufferedReader
(new InputStreamReader(client.getInputStream()));
4)
Do I/O with the input and output Streams
5)
For the output
stream, PrintWriter, use print and println, similar to System.out.println.
6)
For the input stream, BufferedReader, you can call read to get a single character or an array of
characters,
or call readLine to
get a whole line.
7)
Close the socket when done:
client.close();//closes the associated input and output
streams
Here is the actual code of the NetworkClient class, which implements the above five steps.
import
java.net.*;
import
java.io.*;
public
class NetworkClient {
protected String host;
protected int port;
public NetworkClient(String
host, int port) {
this.host = host;
this.port = port;
}
public String getHost()
{
return(host);
}
public int getPort() {
return(port);
}
/** Establishes the connection, then passes
the socket
* to handleConnection. */
// see http://java.sun.com/j2se/1.4.2/docs/api/index.html
// Look for Socket class
public void connect() {
try {
Socket client = new Socket(host, port);
handleConnection(client);
} catch(UnknownHostException
uhe) {
System.out.println("Unknown
host: " + host);
uhe.printStackTrace();
} catch(IOException
ioe) {
System.out.println("IOException: " + ioe);
ioe.printStackTrace();
}
}
/** This is the method you will override when
*
making a network client for your task.
*
This default version sends a single line
*
("Generic Network Client") to the server,
* reads one line of response, prints it, then
exits.
*/
protected void handleConnection(Socket
client)
throws IOException
{
PrintWriter out = SocketUtil.getPrintWriter(client);
BufferedReader in =
SocketUtil.getBufferedReader(client);
out.println("Generic Network Client");//to
the server
System.out.println
("Generic Network Client:\n" +
"Made connection to " + host +
" and got '" + in.readLine() + "' in response");
client.close();
}
}
So, it creates the two streams from the “client” socket and
· sends the opening message via out.println()and
· reads the response via in.readLine(), which blocks until the server responds, thus making the exchange synchronous).
SocketUtil has the code for the creation of the Reader and Writer:
import
java.net.*;
import
java.io.*;
public
class SocketUtil {
/** Make a BufferedReader
to get incoming data. */
public static BufferedReader
getBufferedReader
(Socket s) throws IOException {
return(new BufferedReader(
new InputStreamReader(s.getInputStream())));
}
/** Make a PrintWriter
to send outgoing data.
*
This PrintWriter will automatically flush
stream
*
when println is called.
*/
public static PrintWriter
getPrintWriter(Socket s)
throws IOException
{// true means autoflush
return(new PrintWriter(s.getOutputStream(), true));
}
}
Here is the main routine to house the Server:
public
class NetworkServerTest {
public
static void main(String[] args) {
int port = 8088;
if (args.length
> 0) {
port = Integer.parseInt(args[0]);
}
NetworkServer
server = new NetworkServer(port, 1);
server.listen();
}
}
We will show the detail of the NetworkServer class
in a moment. NetworkServerTest
can accept a connection from any WWW Browser over port 8088 once we start it on
server.com as follows:
server.com>java
NetworkServerTest
Then, if a standard Web browser on client.com requests file foo whose URL is http://server.com:8088/foo/, the following will be output by our server:
Generic
Network Server:
got
connection from client.com
with
first line 'GET /foo/ HTTP/1.0'
Here are the steps of creating a server in the NetworkServer class:
1)
Create a ServerSocket
object
ServerSocket listenSocket = new ServerSocket(portNumber);
2)
Create a Socket object from ServerSocket
while(someCondition)
{
Socket server = listenSocket.accept();
doSomethingWith(server);//often
new thread}
3)
Create an input stream to read client input
BufferedReader in =
new BufferedReader
(new InputStreamReader(server.getInputStream()));
4)
Create an output stream that can be used to send
info back to the client.
// Last arg of true means autoflush
stream
// when println is called
PrintWriter out = new PrintWriter(server.getOutputStream(), true)
5)
Do I/O with input and output Streams:
·
Most common input: readLine
·
Most common output: println
6)
Close the socket when done
server.close();
Here is the actual code of the NetworkServer class, which implements the above six steps.
import
java.net.*;
import
java.io.*;
/** A
starting point for network servers. */
public
class NetworkServer {
protected int port,
maxConnections;
/** Build a server on specified port. It will
continue
* to
accept connections (passing each to
* handleConnection) until an explicit exit
*
command is sent (e.g. System.exit) or the
*
maximum number of connections is reached. Specify
* 0
for maxConnections if you want the server
* to
run indefinitely.
*/
public NetworkServer(int port, int maxConnections)
{
this.port = port;
this.maxConnections
= maxConnections;
}
/** Monitor a port for connections. Each time
one
* is
established, pass resulting Socket to
* handleConnection.
*/
public void listen() {
int i=0;
try {
ServerSocket
listener = new ServerSocket(port);
Socket server;
while((i++ <
maxConnections) ||
(maxConnections
== 0)) {
server = listener.accept();
handleConnection(server);
}
} catch (IOException
ioe) {
System.out.println("IOException: " + ioe);
ioe.printStackTrace();
}
}
protected void handleConnection(Socket
server)
throws IOException{
BufferedReader in
=
SocketUtil.getBufferedReader(server);
PrintWriter out =
SocketUtil.getPrintWriter(server);
System.out.println
("Generic Network Server:\n" +
"got connection from " +
server.getInetAddress().getHostName() + "\n" +
"with first line '" +
in.readLine()
+ "'");
out.println("Generic
Network Server");
server.close();
}
}
When using encoded
strings to pass information, you need methods to break out the tokens. The
general method works as follows:
· Build a tokenizer from an initial string.
· Retrieve tokens one at a time with nextToken.
· You can also see how many tokens are remaining (countTokens) or simply test if the number of tokens remaining is nonzero (hasMoreTokens)
The code will look as follows:
StringTokenizer tok
= new StringTokenizer(input,
delimiters);
while
(tok.hasMoreTokens()) {
doSomethingWith(tok.nextToken());
}
There are many packages to do this. One can be found in the Core Web Programming book. We won’t be doing this in our class.
Thus far we have learned how to set up connections using sockets, talked a little about tokenizing, but have ignored the most complicated part, what do the strings look like that we send and receive. That depends on the protocol. As we mentioned before, you can roll your own protocol, or use one of the established ones. Here we will look briefly at HTTP.
To show what HTTP strings are like, let’s try to retrieve the page http://www.apl.jhu.edu/~lmb using telnet over port 80. Here’s how:
Unix> telnet www.apl.jhu.edu 80
Trying
128.220.101.100 ...
Connected to aplcenmp.apl.jhu.edu.
Escape
character is '^]'.
GET /~lmb/ HTTP/1.0
HTTP/1.0 200
Document follows
Date: Sat, 30
Jun 2001 14:34:58 GMT
Server:
NCSA/1.5.2
Last-modified:
Tue, 11 Jul 2001 15:13:56 GMT
Content-type:
text/html
Content-length:
50479
<!DOCTYPE HTML
PUBLIC
"-//W3C//DTD HTML 4.0
Transitional//EN">
<HTML>
...
</HTML>Connection closed by foreign host.
Unix>
Here is what happened:
Via telnet, we can directly pass strings to a Web server over port 80: telnet www.apl.jhu.edu 80
We typed GET /~lmb/ HTTP/1.0 (and a blank line) to the server. GET is one of the commands in the HTTP protocol.
The document is returned. Notice that after the string: HTTP/1.0 200 Document follows there are 5 HTTP headers giving information about the retrieval and a blank line (to signify the end of the headers):
The file follows starting with <!DOCTYPE HTML PUBLIC
Finally, the connection is closed terminating the GET request.
The documentation for the HTTP protocols describes all the commands and headers.
With the above example, it is fairly easy to see how we could build a simple graphical user interface to communicate with HTTP servers. User can interactively specify:
· Host and Port
· HTTP request line
· HTTP request headers
Then the HTTP request could be performed in a separate thread and the response document put into a scrollable text area. Here is what a GUI might look like:

The request in the picture asked for the (default) html file at the URL www.apache.org (usually this is a file called index.html) The If-Modified-Since header asked only for a file modified since the date given. There were none, as can be seen from the results.
To build such a client, you have to know what HTTP requests and HTTP responses look like. In general a request is as follows:
Request
GET
/filename/ HTTP/1.0
Header1:
…
Header2:
…
…
HeaderN:
…
Blank
Line
All request headers are optional except for Host (required only for HTTP/1.1 requests). Responses are like this:
Response
HTTP/1.0
200 OK
Content-Type:
text/html
Header2:
…
…
HeaderN:
…
Blank
Line
<!DOCTYPE
…>
<HTML>
…
</HTML>
All response headers are optional except for Content-Type. You can research all the HTTP details online as needed.
So, creating a simple HTTP Server is straightforward:
· Read all the lines sent by the browser, storing them in an array.
· Use readLine (a line at a time) until you reach a blank line.
· Note: with the POST request you have to read some extra data
· Send an HTTP response line (e.g. "HTTP/1.0 200 OK")
· Send a Content-Type line. This indicates the file type being returned; text/html is the most common.
· Send a blank line.
· Send the contents of the file.
· Close the connection.
You can download all source files for this WebClient and WebServer from http://archive.corewebprogramming.com/Chapter17.html
Socket programming is hard!! With Sockets: not only did you have to manage the sending and receiving of data, you had to design a protocol (decide what goes into the strings and how the conversation should proceed) or use an existing one. For many applications we just want a “simple” procedure-call protocol. In other words we want to invoke public methods on remote objects. We want to make all the hard network stuff transparent to the application programmer. Distributed Object systems do exactly that. Let’s make sure we understand how distributed objects differ from local objects.
In a local method call, the data in the parameters are passed directly to the target method by putting a pointer to the data on the stack. This “sharing” of data doesn’t cause any synchronization problems because control is passed to the method along with the data; all access is serial.
With distributed objects, method calls can be remote, i.e., between objects running in different processes on different computers. The key question is how to make this transparent to the application programmer, i.e., how to allow a process on one machine to make normal looking method calls on an object running on another machine. This transparency is the key to distributed objects. Systems that offer this transparency are called middleware. CORBA and Java’s RMI are two such systems.
In CORBA the communicating objects do NOT need to be programmed in the same language. In Java’s RMI (Remote Method Invocation) the objects all have to be coded in Java. Both work similarly:
· A special compiler uses the remote interface specification to build stubs and skeletons.
· The remote object is started and identifies itself to some registry.
· The caller finds the object it wants to use in the registry. This lookup is a substitute for a constructor call.
· The caller invokes methods on the object normally.
· The method invocation is fielded by the stub. Every call in the remote object’s interface has a method in the stub. The stub marshals the call as a data stream and sends it to the remote object.
· The stream is read by the skeleton, parsed, and the actual method in the remote object is invoked.
· The skeleton marshalls the return value and sends it back to the stub, which returns it to the original invocation.
That is a lot of work when compared to a simple local method call. Figure 6 below shows some pseudo-code that implements this scenario.

Figure 6 - A Distributed Object Architecture
Using RMI you will
be able to deploy objects on different machines. RMI handles network
connections automatically. Java “serialization” lets you pass complex
objects over the network without writing code to parse and reconstruct
them.
To do RMI you must specify four classes:
1. An interface for the remote object - Used by both the client and the server
2. The RMI client - This will look up the object on the remote server, cast it to the type of the interface from Step 1, then use it like a local object. Note that as long as there is a “live” reference to the remote object, an open network connection is maintained. The connection will be automatically closed when the remote object is garbage collected on the client.
3. The object implementation - This object needs to implement the interface of Step 1, and will be used by the server.
4. The RMI server - This will create an instance of the object from Step 3 and register it with its URL.
Once you have written the four classes:
· Compiles the remote object interface and implementation automatically.
The client system will need the client class, the interface class, and the client stub class
The server system will need the server class, the remote object interface and implementation, and the server skeleton class
Then to run the system:
· This only needs to be done once, not for each remote object
· The current version of RMI requires this registry to be running on the same system as server
· This step must be on the same machine as the registry of step 7.
This step can be done on an arbitrary machine.
import
java.rmi.*;
/**
The RMI client will use this interface directly.
* The
RMI server will make a real remote object that
*
implements this, then register an instance of it
* with
some URL.
*/
public
interface Rem extends Remote {
public String getMessage() throws
RemoteException;
}
import
java.rmi.*; // For Naming, RemoteException, etc.
import
java.net.*; // For MalformedURLException
import
java.io.*; // For Serializable interface
public
class RemClient {
public static void main(String[] args) {
try {
String host = (args.length > 0) ?
args[0] : "localhost";
Rem remObject =
(Rem)Naming.lookup("rmi://" + host + "/Rem");
System.out.println(remObject.getMessage());
} catch(RemoteException re) {
System.out.println("RemoteException:
" + re);
} catch(NotBoundException nbe) {
System.out.println("NotBoundException: " + nbe);
} catch(MalformedURLException mfe) {
System.out.println("MalformedURLException: " + mfe);
}
}
}
import
java.rmi.*;
import
java.rmi.server.UnicastRemoteObject;
public
class RemImpl extends UnicastRemoteObject
implements Rem {
public RemImpl() throws RemoteException {}
public String getMessage() throws
RemoteException {
return("Here is a remote
message.");
}
}
import
java.rmi.*;
import
java.net.*;
public
class RemServer {
public static void main(String[] args) {
try {
RemImpl localObject = new RemImpl();
Naming.rebind("rmi:///Rem",
localObject);
} catch(RemoteException re) {
System.out.println("RemoteException:
" + re);
} catch(MalformedURLException mfe) {
System.out.println("MalformedURLException: " + mfe);
}
}
}
1. Compile
the Client and the Server
Prompt> javac
RemClient.java
2. Compile the Rem interface and RemImpl object
automatically
Prompt> javac
RemServer.java
3. Generate
the Client Stub and Server Skeleton
Prompt> rmic –classpath
. RemImpl
·
This
builds RemImpl_Stub.class and RemImpl_Skeleton.class
·
The
client machine needs Rem.class, RemClient.class, and RemImpl_Stub.class
·
The
server machine needs Rem.class, RemImpl.class, RemServer.class, and RemImpl_Skeleton.class
·
Java 1.4 no longer builds the skeleton class
explicitly.
·
With Java 1.5 rmic is done automatically during
compilation and NO stub or skeleton classes are created explicitly.
4. Start
the RMI Registry
Server> rmiregistry
5. Start the Server
Server> java
–classpath . RemServer
6. Start the Client
Client> java
RemClient –classpath . hostname
Here is a remote message.
Once an application has been divided into logical processes, the designer must decide how deploy these processes to physical hardware before the final programming details can be determined. The key ideas are:
Once those decisions have been made, there are several configurations to consider.
The single process can be hosted on any computer that can run it. Typically that means the program must be compiled on a computer of the same type as the host. [Java is different. Java compilers all produce the same byte codes; the host computer just needs an installation of the Java Virtual Machine.]
All the shared data issues must be solved using semaphores or monitors.
Before multi-threading, programmers could simulate concurrency by organizing their code into sections and calling them in sequence as follows:
//case 1: simulated concurrency
//global variables
…
main () {
forever do {
call A();
call B();
call C();
}
Now, of course one would reorganize the sections as threads and write something like:
//case 2: concurrency with threads
//global variables
…
main () {
start_Thread A; //A is:
forever do {call A();}
start_Thread B; //B is:
forever do {call B();}
start_Thread C; //C is:
forever do {call C();}
}
In case 1 there no interleaving of code between the sections because the operating system schedule sees just one process with one thread of control. A() runs to completion, then B() runs to completion, then C() runs to completion, and so on. Thus even though the three sections share data there is no critical section problem. [Make sure you understand this.]
In case 2, the operating system sees each thread and can schedule them separately. Thread A might be interrupted in the middle of executing A() and thread B continued. We have interleaving among A(), B(), and C() that we didn’t have in the first case. And, of course, we have all the data sharing problems.
Multiple processes can be deployed on separate computers. Each process will obviously have its own address space. Therefore:
We know that reusing code is wonderful. But what about offering services to application programmers dynamically through distributed servers? The big questions are how do you write such a service, how does an application programmer find out about your service and how does he use it? The idea of remote services is not new.
Before distributed hardware, services were offered through subroutines. In single-machine, multi-process configurations, IPC (inter-process communication) was a way for different processes to communicate. Soon, multi-machine communications became real.
Local Area Networks (LAN) featured remote procedure calls, messaging, and remote file servers. With the invention of the Internet came new applications such as ftp and email communicating using protocols built on top of TCP/IP.
Still there were no BIG systems linked
together. National Software Works (1976) tried to link software via a
procedure-call-protocol over the Internet. Limited Client/Server configurations
came into being, especially involving databases. But, still no large integrated
system existed featuring interoperability.
Maybe the answer was to give programmers more powerful tools. CORBA was such a system. It featured
Despite GIANT effort, CORBA had no real impact. Programmers like to roll their own application! What will programmer's unify around? The Web!!
The WWW is a 3-tiered architecture:
· Tier one is the browser displaying HTML pages. It communicates with the web server via the HTTP protocol.
· Tier two is the web browser retrieving pages (and generating them as well).
· Tier three is the database to store application data.
Applications written for the web are loosely connected via HTTP protocol, not the tight subroutine interface common to traditional programming. At the web server, scripts (perl, servlets, asp) decode the HTTP and invoke subroutines or methods like any other code. Mostly the web is about CONTENT retrieval. eCommerce has extended the Web with a powerful application.
What's appears to be on the horizon is more powerful web applications! Many technologies are shifting their focus to be compliant with the Web. A few are:
· Knowledge Representation ŕ Semantic Web
· Java Platform ŕ J2EE
· CORBA ŕ Web Services
· Windows Platform ŕ .NET Platform
Programmers want their code to be used. The web offers a way of making that code available as a service rather than a Product. And why not?
· It will be easy to maintain
· There is a central point of change, rather than distributing service packs or hot fixes.
· The Black Box model is convenient. The reference to the service remains the same.
· Easy to charge for services (has not been shown to be the case).
· Easy to avoid piracy
·
Achieve a balance between cost and usage.
At least that is the theory. It’s easy to identify services that programmers want. Here are some of the services already in .NET.
· Identity - Passport
· Notification and Messaging - email, fax
· Personalization
· Store
· Calendar
· Directory and Search - find Services and people
· Dynamic Delivery
Clients must be able to discover services. (Universal Description Discovery and Integration - UDDI)
Providers must be able to describe their service (Web Services Description Language – WSDL)
Providers must be able to describe the interactions (conversations) to use the service (Web Services Conversation Language - WSCL)
The UDDI XML schema (which we won’t go into here) defines four core types of service information:
· Business information (such as business name and contact information)
· Business service information (general technical and business descriptions of web services),
· Binding information (specific information needed to invoke a service)
· Service specification information (associating a service’s binding information with the business service information it implements).
A service must be able to describe its abstract interfaces and protocol bindings so that clients can figure out how to invoke it. It is much more complicated than our normal programming language interface.
See WSDL Schema for
an example of the description.
See WSDL Example (Instance) for
an example of an invocation.
A service must be able to describe the kinds of interactions (conversations) that it supports (e.g., that it expects clients to login before they can request a catalog) so that it can engage in complex exchanges with its clients. That why programming packages often come with sample programs—you can’t figure out how to use the code without an example.
WSCL is a graph formalism for describing proper usage. See WSDL Graph (p. 5) for an example.
Soap is a syntax for encoding calls to web services. It offers a standard on-wire protocol to describe the action and results of a web service function invocation. It is basically a way of describing a remote procedure call (RPC) as text. Some features:
· It is XML Based
· It uses HTTP (thus can bypass firewalls) or any other protocol (SMTP, etc).
· It is a Cross Platform/Language standard
SOAP is not a big deal. It’s just a format.
Here is the SOAP response to a service request that asked for the name of a snowboarder who endorsed a particular snow board:
<SOAP-ENV:Envelope
xmlns:SOAP-ENV=http://schemas.xmlsoap.org/soap/envelope/
SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
<SOAP-ENV:Body>
<m:GetEndorsingBoarderResponse
xmlns:m="http://namespaces.snowboard-info.com">
<endorsingBoarder>Chris
Englesmann</endorsingBoarder>
</m:GetEndorsingBoarderResponse>
</SOAP-ENV:Body>
</SOAP-ENV:Envelope>
You can see the answer is “Chris Englesmann.” All the rest is “tags” that form the structure.
Web servers must be “Soap-enabled” to be able
to deal with web services. Most are. The most common:
IIS 5.0 – Microsoft’s .NET implementation
Apache – SOAP Module Built by IBM
Tomcat (Java Server) - Apache Axis is a package to help build web services.
We’ll see how success Web Services become.
You will be working with an agent application as your first system. The Agent Roadmap will document the system. In working with this system you will be exposed to most of the issues raised above:
· Concurrency – Each agent is logically a separate process.
· Multi-threaded process – The first implementation assigns a thread to each agent within a single process.
· Distributed deployment – You will change the agent application from being multi-threaded to being distributed and multi-process using Java’s Remote Method Invocation.
· Synchronization – There are several examples of the need for synchronized access to shared data.
· Semaphores – The agent scheduler uses a semaphore to know when something has happened.
· Messaging – Agents communicate using messaging.
· Requirement Analysis – You will see how implementation-shortcuts are justified by analyzing the requirement, especially with messaging.
· Unit Testing – You will use the JUNIT package to make your unit tests persistent.
…and many more
things as well.
Section 2.1 introduced Divide and Conquer as a fundamental technique for solving complex problems. We didn’t say how to divide up the problem, only that decomposition into smaller parts works. For complex software problems what those parts look like defines various software methodologies. In vogue right now is object-oriented design and analysis.
Chapter 13 in our text does a nice job of explaining object-oriented design. The highlights are:
· Requirements elicitation identifies the use-cases, i.e. what the system is supposed to do.
· Convert the use-cases into scenarios that expose the detail of the behavior
· Identify the objects (classes) in those scenarios.
· Identify the data and methods of the classes.
· Build a main routine to program the application.
Of course, it’s not that simple. There will be iteration. Programming classes that are not part of the domain will emerge. Conflicts will be found and so on. Still, object-oriented analysis seems to be an improvement for programming development.
Before object-oriented there was algorithmic decomposition. The idea was to divide a problem into modules where each module denotes a major step in the overall process. Each module could then be further refined in a similar manner. This decomposition technique, best known as top-down structured programming, was taught by Edsger Dijksta, one of our most famous computer scientists (he died in 2002). Dijskstra was also responsible for identifying and solving the critical section problem we studied in section 4.5.2.
Algorithmic decomposition would yield trees like the following:
Update File

Figure
7 – An algorithmic decomposition
Thus, to do an Update File, first you do Get Master, then Get Updates, then Match, then Do Update. Any of these steps could be further decomposed. The process of doing this decomposition was natural to us but hard. Interestingly, hardly any two designers would come up with anything resembling the same design using top-down techniques.
Then came object-oriented design, in which the boxes were abstractions, not of the functions of the process, but the objects in the domain. So, for the above problem an object-oriented diagram might look like:

Figure 8 – An object-oriented
decomposition
The picture doesn’t show the process anymore. It shows the domain objects and the messages that can be sent between them. When decoding requirements, object-oriented practitioners talk about identifying objects by finding the nouns in the sentences, and finding the methods by finding the verbs.
There is a duality between algorithmic and object-oriented design. By that I mean that both are different views of the same thing. And we can implement either view to get the same functionality. The key question is why would you choose one view over the other? Here are some thoughts about that question:
Object-oriented analysis aligns better with domain analysis. Domain analysis, part of any requirements phase, involves understanding the domain of the application, especially identifying the objects of the domain and what constraints exist between them. That analysis translates well to an object-oriented design.
The programmed flow of algorithmic code aligns with how the system is supposed to behave. It is undeniable that many people have trouble understanding how an object-oriented system works by just looking at the code. A “trace” of the execution of the object-oriented code will match the flow “programmed into” the algorithmic code. This duality is not easily understood.
If you coded an application both ways, all the code used in programming algorithmically will be “found” somewhere in the object-oriented system. It’s just organized differently. The object-oriented view produces modules that are more reusable and that may be all the advantage object-oriented needs to supplant algorithmic design.
Changing an application designed algorithmically can be devastatingly hard. Somewhat surprisingly, object-oriented systems seem more accommodating to change. Mostly because of process called refactoring.
Suppose you were designing a system to play the 4-person card game of Hearts. As part of an object-oriented design you might have a class specification like:
public class
HeartGame{
private Player[] player = new Player[4];
private int gameID;
private DeckOfCards deck;
private HeartsScore score;
public HeartGame () {} //constructor
public boolean Register(Player p) {}
private void startGame(){}
private void scoreGame() {}
}
Don’t worry about the details. Just assume this design for now. Now suppose you want to add the 4-person game of Bridge to the system. A BridgeGame class might look like:
public class
BridgeGame{
private Player[] player = new Player[4];
private int gameID;
private DeckOfCards deck;
private BridgeScore score;
public BridgeGame() {} //constructor
public boolean Register(Player p);
private void startGame();
private void scoreGame();
private in getGameID();
}
A (not very) close look shows that the two classes have some of the same data (player, gameID, deck, score) have two common methods (Register(), getGameID()), and share some of the same method abstractions (startGame(), scoreGame()). As good object-oriented programmers you would recognize these similarities and refactor the classes as follows:
·
Create
a base class called Game;
·
Move
the common data to the base class;
·
Move
the common methods to the base class.
·
Create
abstract methods for the methods you want inherited classes to implement.
When you’ve done all this, you will have three classes as follows:
public class Game{
private Player[] player = new Player[4];
private int gameID;
private DeckOfCards deck;
public Game () ; //constructor
public boolean Register(Player p);
//derived class must
implement all abstract methods
abstract private void startGame();
abstract private void scoreGame();
}
public class
HeartGame extends Game{
private HeartsScore score;
public HeartGame () ; //constructor
private void startGame();
private void scoreGame();
}
public class
BridgeGame extends Game{
private BridgeScore score;
public BridgeGame () ; //constructor
private void startGame();
private void scoreGame();
}
An expert designer might argue that you should have anticipated that this was going to happen and had a Game base class right from the beginning. If your requirements had mentioned that both Hearts and Bridge were going to be implemented, the expert might have a point.
However, Extreme Programming says not to do this for two main reasons:
·
The
requirements may change and the second game may never be implemented. The time
and effort designing and managing the base class will have been wasted. And a
design without the base class is easier to understand.
·
Refactoring
is NOT THAT HARD.
Sometimes this argument is framed in terms of flexibility. The argument goes that you should design your code so that the inevitable changes will be easy to incorporate. Expert designers believe strongly in this principal and will design frameworks of abstract base classes that can be extended in any way possible. These flexible designs can be very complicated and that’s the point. Often, that flexibility is unused and all that effort and complexity are wasted. We will argue in class about this controversial issue.
The short answer is NO. Programmers have always understood the need for data structures. And, of course, data structures came with operations and semantics. So, many algorithmic programmers just used classes to define their data structures and did everything else the same way.
They ignored class hierarchies, inheritance, and polymorphism. They divided their main routine into the processing chunks we saw in Figure 7. Then they further divided those into smaller chunks.
Even “experienced” object-oriented programmers sometimes code too much “process” into a method. Utilizing the full object-oriented methodology takes practice and plenty of object-oriented thinking. The agent methodology we explored in section 5.5 forces object-oriented thinking without using all parts of object-oriented design.
As we’ve said before, an architecture defines an object’s structure, or organization. We use the term freely to describe anything whose structure we want to name. The first programming system used by this class has an agent architecture.
We also use architecture to describe the hardware or deployment structure of applications. Deployment architectures and software architectures are related, but they are not the same. Sometimes this can be confusing. Our discussion of 3-tier architecture will clarify all the issues.
A 3-tier architecture has three logical components:
· The first tier is the user interface.
· The second tier runs the business logic.
· The third tier stores the data.
It has the following big block structure:

Figure 9 - 3-Tier Architecture
Each of these three components can be hosted on separate computers, and typically are. Often, however, the database runs on the same computer as the business logic. No problem. All we really care about in drawing such a diagram is to emphasize that we have three separate processes, each of which has its own address space.
The most interesting thing about 3-tier is defining the arrows, i.e., how do the different components communicate. In virtually every application, the backend two tiers (the business logic and data base) communicate using a database protocol called SQL (Standard Query Language). That is, the database is set up as a server and any business logic component can access it in a standard way as shown in Figure 10.

Figure 10 – Multiple business applications accessing the same data base server
The user process and business logic can communicate using any protocol the designers wish. Typically choices are technologies like Corba or TCP/IP. The World Wide Web changed all that.
A web application is a specific kind of 3-tier application:
· The first tier is a web browser. Web browsers display and execute documents written in the HyperText Markup Language (HTML).
· The second tier is a web server. It runs the code that responds to requests from web browsers.
· The first two tiers communicate using the HyperText Transfer Protocol (HTTP).
Though web applications can vary widely, most have these features:
· User interfaces run on web browsers using HTML. Fortunately, HTML can be augmented by images, script languages, code for plug-ins, and so forth. Almost any interface need can be implemented in HTML, though sometimes the look and feel seems clumsy when compared to a native GUI.
· Data is gathered through the use of HTML forms. Forms are linked to code that receive and process the information. The code runs on a web server.
· The receiving code must be HTTP compliant. This means parsing the HTTP request stream (for parameters), doing the work required by the form, and generating an HTML reply. The receiving code, often called CGI code, can be written in many languages (Perl, PHP, C, C++, Java, C#). When the language is Java, the code is called a servlet.
A framework is an environment that simplifies building certain kinds of applications by forcing some decisions on you. By accepting those decisions some capabilities are specialized, emphasized, and simplified. The Web currently has two dominant frameworks: .Net for Microsoft applications and J2EE for Java applications.
Both frameworks offer elaborate interactive development environments (IDE). Several vendors supply the J2EE framework. Both frameworks have:
· Extensive class libraries and tools to support HTTP programming;
· Tools for generating HTML documents;
· Tools to support deploying a Web Application;
·
An embedded Web Server to help test the
application.
Your last programming system will be a web application for
playing multi-user games. The Games Roadmap will
document the system. In working with this system you
will be exposed to most of the issues raised in section 5 and section 8:
· Object-Oriented Design – Initially, the application will be implemented using object-oriented analysis and design.
· 3-Tier architecture – You will rearchitect your application so that it can be deployed as a 3-tier application in the J2EE framework.
· Web Browser Front-End – You will implement the user interface as HTML pages with embedded forms.
· Refactoring – You will refactor your object-oriented classes when you add a second game to the application.
· Requirement Analysis – You will see how implementation-shortcuts are justified by analyzing the requirement, especially with messaging.
· Unit Testing – You will use the JUNIT package to make your unit tests persistent.
…and many more things as well.
As we have said before in section 2.2, reuse is one of themes of an effective decomposition. Nothing is more fulfilling to a manager than to see his team save time, effort, and money by being able to reuse something they have previously done. And reuse is not limited to physical objects or code, we can reuse ideas as well. In the programming world Design Patterns has come into fashion.
Chapter 6 in our textbook discusses design patterns. The book says “Design patterns are class combinations and accompanying algorithms that fulfill common design purposes. A design pattern expresses an idea rather than a fixed class combination. Accompanying algorithms express the pattern’s basic operation.”
Design patterns fall into one of three categories:
· Creational Patterns – Creating a collection of objects in flexible ways. Examples are Factory, Prototype, Singleton.
· Structural Patterns – Representing a collection of related objects. Examples are Composite, Decorator, Façade.
· Behavioral Patterns – Capturing behavior among a collection of objects. Examples are Chain of Responsibility, Command, Mediator.
We will discuss many of these patterns in class.
EXtensible Markup Language (XML) is a language for describing the content of a document. XML does not specify the tag set or grammar of the language; that is up to the application. The Tag Grammar is described via Document Type Definition (DTD) or XML Schema. XML with a DTD or Schema is designed to be self-descriptive. But remember, XML by itself does not do anything!!!
When people refer to XML, they typically are referring to XML and related technologies.

XML is a simple syntax for formatting documents. It uses “tags” to separate fields so that it is easily parsable. As we see from the above diagram XML is used in many different technologies: The Semantic Web as RDF, HTML, ANT and web server configuration files, etc.
The examples and much of the text that follows come from “Core Web Programming” by Marty Hall and Larry Brown.
<?xml version="1.0"encoding="ISO-8859-1"?>
<authors> <!-----
root element -->
<name> <!----- child element -->
<firstname>Larry</firstname>
<lastname>Brown</lastname>
</name>
<name>
<firstname>Marty</firstname>
<lastname>Hall</lastname>
</name>
</authors> <!-- closing tag of root element -->
The above document is about authors, two of which are shown by their name, which is in turn
a firstname followed by a lastname. The allowable tags and their placement are
defined either by a Document Type Definition (DTD) or an XML Schema (XSD).
Here is the DTD for the authors document above:
<DOCTYPE
authors [
<!ELEMENT authors (name)*>
<!ELEMENT name (firstname, lastname)>
<!ELEMENT firstname (#PCDATA)>
<!ELEMENT lastname (#PCDATA)>
]>
It says, in BNF-like form, that authors comprise 0 or more instances of name. That a name is a firstname followed by a lastname, that a firstname and lastname are PCDATA (parsed strings). These definitions can be inline or in a separate file. The details will be discussed later
There are many places to go to find information about XML. Here are a few:
An XML document is made up a prolog (the xml version, entity definitions, and DOCTYPE) followed by the XML Building Blocks (elements, tags and attributes, PCDATA and CDATA, entities, and comments)
Here is an example:
<?xml version="1.0"
encoding="ISO-8859-1" standalone="no"?>
We will see that the prolog can contain inline entity and DTD definitions
The doctype usually specifies the location of the defining
DTD; the doctype can also contain the actually DTD definitions inline. Here are the common forms:
<!DOCTYPE root [InternalDTD]>
<!DOCTYPE root SYSTEM URL>
<!DOCTYPE root PUBLIC FPI-identifier
URL>
<!DOCTYPE root [dtd-definitions]>
Here are some examples
<!DOCTYPE book SYSTEM
"DTDs/CWP.dtd">
This specifies a dtd relative to the XML document it came from.
<!DOCTYPE book SYSTEM
"http://www.corewebprogramming.com/DTDs/CWP.dtd">
This specifies a dtd via a full URL..
A public DTD is slightly more complicated:
<!DOCTYPE root PUBLIC FPI-identifier
URL>
The Formal Public Identifier (FPI) has four parts:
Here are some FPI examples
<!DOCTYPE
Book
PUBLIC "-//W3C//DTD XHMTL 1.0
Transistional//EN"
"http://www.w3.org/TR?xhtml1/DTD/xhtml1-transitional.dtd">
<!DOCYTPE
CWP
PUBLIC "-//Prentice Hall//DTD Core
Series 1.0//EN"
"http://www.prenticehall.com/DTD/Core.dtd">
XML comments are the same as HTML comments:
<!-- This is an XML and HTML comment
-->
All XML-aware applications recognize beginning and end of
document via the root element. In the example below, the root element is <book>.
<?xml
version="1.0" ?>
<book>
<title>Core Web Programming</title>
<contents>
<chapter number="1">
Designing Web Pages with HTML
</chapter>
<chapter number="2">
Block-level Elements in HTML 4.0
</chapter>
<chapter
number="3">
Text-level Elements in HTML 4.0
</chapter>
...
</contents>
</book>
The typical XML element has a start tag and an end tag with the value of the element between them:
<elementOne> … </elementOne>
When an element has no value, the end tag is shown only by a “/”. This is called an empty element tag:
<elementTwo />
Tag names are case sensitive, and start with a letter or
underscore. After first character numbers, -
and . are allowed, but no white
space. Don't use colon—they are for
indicating namespaces. All tags are completely nested (tag order cannot
be mixed).
Elements can also have attributes, which are located in the start tag:
<message
to="Gates@microsoft.com" from="Gosling@sun.com">
<text>We put the . in .com.
What did you do?
</text>
</message>
Every attribute must be enclosed in "…" with no commas in between.
The general rule is to use elements for presentable data and attributes for meta data. Here are two ways to model a textbook chapter:
<chapter
number="23" focus="Server-side programming"
title=”XML
Processing with Java”>
This chapter is
about ...
</chapter>
In the above method, the author has decided that the value of the chapter is the text in between the two tags. The number, focus, and title are simply information about the chapter (i.e., meta) and therefore are represented as attributes.
In the method below, the author has made all the information presentable, i.e., as elements of the value for the chapter.
<chapter>
<number>23</number>
<focus>Server-side
programming</focus>
<title>XML Processing with
Java</title>
<text> This chapter is
about...</text>
</chapter>
Which do you prefer? It’s not always obvious which method is “correct.”
An XML document is well-formed if it follows basic syntax rules, i.e. tags match, tags embed correctly, etc. An XML document is valid if its structure matches a Document Type Definition (DTD).
A DTD defines the structure of the document. In particular:
As we saw the prolog can contain an inline DTD or a reference to an external DTD. Here is an inline example:
<?xml
version="1.0"?>
<!DOCTYPE
perennials [
<!ELEMENT
perennials (daylily)*>
<!ELEMENT
daylily (variety, award*, bloom, cost)+>
<!ATTLIST
daylily
status (in-stock | limited | sold-out)
#REQUIRED>
<!ELEMENT
variety (#PCDATA)>
<!ELEMENT
award (name, year)>
<!ELEMENT
name (#PCDATA)>
<!ATTLIST
name note CDATA #IMPLIED>
<!ELEMENT
year (#PCDATA)>
<!ELEMENT
bloom (#PCDATA)>
<!ATTLIST
bloom code (E | EM | M | ML | L | E-L) #REQUIRED>
<!ELEMENT
cost (#PCDATA)>
<!ATTLIST
cost discount CDATA #IMPLIED>
<!ATTLIST
cost currency (US | UK | CAN) "US">
]
<perennials>
…
</perennials>
Defining DTD elements is like using BNF to define a grammar.
<!ELEMENT
name elementDefinition>
The
elementDefinition types can be any of:
In defining elements, you can add cardinality constraints and list operators:
[none] Default
(one and only one instance)
? 0, 1
* 0,
1, …, N
+ 1, 2, …, N
, Sequence
(in order)
| Choice (one of several)
So that in the element:
<!ELEMENT
daylily (cultivar, award*, bloom, cost)+>
·
exactly
1 cultivar
·
0 or
more awards (because of the *)
·
exactly
1 bloom
·
exactly
1 cost
In the following
element:
<!ELEMENT
id (#PCDATA | catalog_id)>
In the following
element:
<!ELEMENT
Elem0 (Elem1?, Elem2*)+>
In the following
element:
<!ELEMENT
Elem0 ((Elem1, Elem2) | Elem3)*>
This formulation comes from the specification of Regular Sets.
Here is a full example with inline definitions:
<?xml version="1.0"
standalone="yes"?>
<!DOCTYPE Person [
<!ELEMENT Person ( (Mr|Ms|Miss)?,
FirstName,
MiddleName*, LastName, (Jr|Sr)?
)>
<!ELEMENT FirstName (#PCDATA)>
<!ELEMENT MiddleName (#PCDATA)>
<!ELEMENT LastName (#PCDATA)>
<!ELEMENT Mr EMPTY>
<!ELEMENT Ms EMPTY>
<!ELEMENT Miss EMPTY>
<!ELEMENT Jr EMPTY>
<!ELEMENT Sr EMPTY>
]>
<Person>
<Mr/>
<FirstName>Lawrence</FirstName>
<LastName>Brown</LastName>
</Person>
The attribute definition looks like:
<!ATTLIST
element attrName type modifier>
Examples:
<!ELEMENT
Customer (#PCDATA )>
<!ATTLIST
Customer id CDATA #IMPLIED>
<!ELEMENT
Product (#PCDATA )>
<!ATTLIST
Product
cost CDATA #FIXED "200"
id
CDATA #REQUIRED>
Here are the types allowable for attributes:
<!ATTLIST
Customer id CDATA #IMPLIED>
<!ATTLIST
restaurant charges
(cheap|moderate|expensive) "cheap">
ID,
IDREF, NMTOKEN, NMTOKENS,
ENTITY, ENTITIES, NOTATION
There are a variety ways to modify an attribute:
<!ATTLIST
cost discount CDATA #IMPLIED>
<!ATTLIST account balance CDATA #REQUIRED>
<!ATTLIST
interpreter language CDATA #FIXED "EN">
<!ATTLIST car color (red | white |
blue) "white" )
Entities are shortcuts to common text.
<!ENTITY
name "replacement">
For example: <!ENTITY copyright "Copyright
2001">
You could use the copyright as follows:
<book>
<title>Core Web Programming,
©right;</title>
</book>
Entities can be defined in an external file:
<!ENTITY
copyright SYSTEM "http://x.y.com/entities.dtd">
DTD’s have limitations:
XML Schema is intended to resolve these issues but DTDs are going to be around for a while. Before we discuss XML Schema, we must learn about XML Namespaces.
XML Namespaces provides a method to avoid element name
conflicts. The following XML element is about an HTML table:
<table>
<tr> <td>Apples</td>
<td>Bananas</td> </tr>
</table>
The following XML element is about a piece of furniture:
<table>
<name>African Coffee
Table</name>
<width>80</width>
<length>120</length>
</table>
These two elements can't coexist in the same document. We need a namespace convention to distinguish them. The general namespace specification is:
xmlns:namespace-prefix="namespace"
where the namespace
is an URI. Here's an example:
<f:table
xmlns:f="http://www.w3schools.com/furniture">
<f:name>African
Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
The URI is not used by the parser to look up information. The only purpose is to give the namespace a unique name. However, it is often a pointer to a real Web page containing information about the namespace.
A default namespace for an element saves us from using prefixes in all the child elements. In the following example notice that no prefix is given after the xmlns attribute:
<table xmlns="http://www.w3schools.com/furniture">
<name>African
Coffee Table</name>
<width>80</width>
<length>120</length>
</table>
Here is an example with two namespaces, the (unspecified default) is HTML, the other is from XSL:
<?xml
version="1.0" encoding="ISO-8859-1"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/xsl">
<xsl:template match="/">
<html>
<body>
<table border="2"
bgcolor="yellow">
<tr>
<th>Title</th>
<th>Artist</th>
</tr>
<xsl:for-each
select="CATALOG/CD">
<tr>
<td><xsl:value-of
select="TITLE"/></td>
<td><xsl:value-of
select="ARTIST"/></td>
</tr>
</xsl:for-each>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
XML Schema is an alternative to a DTD. It became the official W3C recommendation in May 2001. Some benefits are:
Clearly the future but there is limited support right now.
A Sample
XML file
<?xml
version="1.0"?>
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this
weekend!</body>
</note>
It's DTD might be:
<!ELEMENT note
(to, from, heading, body)>
<!ELEMENT to
(#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
Here’s how it might be defined using an XML schema located in note.xsd:
<?xml
version="1.0"?>
<xs:schema xmlns:xs=http://www.w3.org/2001/XMLSchema
targetNamespace=http://www.w3schools.com
xmlns=http://www.w3schools.com
elementFormDefault="qualified">
<xs:element
name="note">
<xs:complexType>
<xs:sequence>
<xs:element name="to" type="xs:string"/>
<xs:element name="from" type="xs:string"/>
<xs:element name="heading" type="xs:string"/>
<xs:element name="body" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
The details will follow.
A reference to a DTD file might be:
<!DOCTYPE note SYSTEM
"http://www.w3schools.com/dtd/note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this
weekend!</body>
</note>
A reference to a Schema file might be:
<note
xmlns=http://www.w3schools.com
xmlns:xsi=http://www.w3.org/2001/XMLSchema-instance
xsi:schemaLocation=
"http://www.w3schools.com/schema/note.xsd">
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this
weekend!</body>
</note>
Here is how to define a simple element:
<xs:element
name="xxx" type="yyy"/>
Where yyy can be one of:
xs:string
xs:decimal
xs:integer
xs:boolean
xs:date
xs:time
For example:
<xs:element name="color"
type="xs:string" default="red"/>
<xs:element name="age"
type="xs:integer"/>
<xs:element name="dateborn" type="xs:date"/>
A complex element is an XML element that contains other
elements and possibly attributes. Here is an example.
<xs:element
name="employee">
<xs:complexType>
<xs:sequence>
<xs:element
name="firstname" type="xs:string"/>
<xs:element
name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
All attributes are declared as simple types. Only complex elements can have attributes! An example attribute in a "shoesize" def:
<xs:element
name="shoesize">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:integer">
<xs:attribute
name="country" type="xs:string" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
With the above definition, the following could appear in XML statement:f
<shoesize
country="france">35</shoesize>
Restrictions are used to control acceptable values for XML elements or attributes. Restrictions on XML elements are called facets. Here is an example which constrains age to be between 0 and 100.
<xs:element
name="age">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:minInclusive
value="0"/>
<xs:maxInclusive
value="100"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
Here is a complete example of an XSD specification:
<?xml
version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<!-- definition of simple elements -->
<xs:element name="orderperson" type="xs:string"/>
<xs:element name="name" type="xs:string"/>
<xs:element name="address" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
<xs:element name="country" type="xs:string"/>
<xs:element name="title" type="xs:string"/>
<xs:element name="note" type="xs:string"/>
<xs:element name="quantity"
type="xs:positiveInteger"/>
<xs:element name="price" type="xs:decimal"/>
<!--
definition of attributes -->
<xs:attribute name="orderid" type="xs:string"/>
<!-- definition of complex elements -->
<xs:element name="shipto">
<xs:complexType>
<xs:sequence>
<xs:element
ref="name"/>
<xs:element ref="address"/>
<xs:element
ref="city"/>
<xs:element
ref="country"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element
name="item">
<xs:complexType>
<xs:sequence>
<xs:element
ref="title"/>
<xs:element
ref="note" minOccurs="0"/>
<xs:element
ref="quantity"/>
<xs:element
ref="price"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="shiporder">
<xs:complexType>
<xs:sequence>
<xs:element ref="orderperson"/>
<xs:element ref="shipto"/>
<xs:element ref="item" maxOccurs="unbounded"/>
</xs:sequence>
<xs:attribute
ref="orderid" use="required"/>
</xs:complexType>
</xs:element>
</xs:schema>
Here is an XML file using this schema:
<?xml version="1.0"
encoding="ISO-8859-1"?>
<shiporder orderid="889923"
xmlns:xsi=
http://www.w3.org/2001/XMLSchema-instance
xsi:NamespaceSchemaLocation="shiporder.xsd">
<orderperson>John
Smith</orderperson>
<shipto>
<name>Ola
Nordmann</name>
<address>Langgt
23</address>
<city>4000
Stavanger</city>
<country>Norway</country>
</shipto>
<item>
<title>Empire
Burlesque</title>
<note>Special
Edition</note>
<quantity>1</quantity>
<price>10.90</price>
</item>
<item>
<title>Hide your
heart</title>
<quantity>1</quantity>
<price>9.90</price>
</item>
</shiporder>
Database systems answer a recurrent requirement of system designers: to store and retrieve voluminous amounts of persistent data. In so doing they solve all of the difficult problem:
· Atomicity of updates. An update must be considered as a critical section. Failures may leave database in an inconsistent state with partial updates carried out. E.g. transfer of funds from one account to another should either complete or not happen at all.
· Concurrent access by multiple users.
· Manage security centrally.
Database systems offer solutions to all the above problems.
The figure below shows a table from a database. It is a collection of 5 instances (or records), each record defined by its attributes (or fields). A database is defined formally by its schema.

The database schema defines the logical structure of the database:
· What the fields of the table are. :
· How tables relate to one another. :
The schema is analogous to class/type information of programming languages.An instance (or record) of the database is part of the actual content of the database at a particular point in time. This is like a programming variable and its value.
Most database people characterize a database using the Entity-Relationship (ER) Model. The example below shows two entities, a customer and an account related by the depositor relationship. A customer has four attributes (customer-id, etc.), while an account has two attributes (account-number and balance).

A particular database is a collection of tables. Each table is a relation. Below we see the three tables that might exist for the ER diagram shown in the last section.

With an ER model of our database, we are ready to specify the database schema using the Data Definition Language (DDL). Here is how to define the account relation.
create
table account
(
account-number char(10),
balance integer)
The DDL compiler generates table definitions and stores in a data dictionary ready for users to create the instances.
Once table instances are created, a Data Manipulation Language (DML) accesses and manipulates the data. SQL is the most widely used query language. We will look at SQL in detail, but here are two simple examples:
select customer.customer-name
from customer
where customer.customer-id = ‘192-83-7465’
select account.balance
from depositor, account
where depositor.customer-id = ‘192-83-7465’ and
depositor.account-number =
account.account-number
Companies with large, critical data often have a separate database administrator with duties that include:
Table relations are designed with many requirements in mind:
· Ease of access
· Storage efficiency
· Simplifying maintenance
· “Naturalness”
When you study the theoretical issues about database systems, you will learn about normalization. Normalization takes existing table relations and transforms according to certain rules to satisfy various requirements. The whole idea is to create optimum schemas. For our purposes, just think about creating tables that store only information related to the intent of the relation, and no more.
For example, information about a banking enterprise could be broken up into parts, each represented by a relation.
The question is why have three tables? Couldn’t we put all the information in one table?
Storing all information about customers and accounts as a
single relation such as:
bank (account-number, branch-name,
balance, customer-name, customer-street, customer-city)
This might produce a table that looked like:
|
account-number |
branch-name |
balance |
customer-name |
customer-street |
customer-city |
|
A-101 |
Downtown |
500 |
Smith |
4 North St. |
Rye |
|
A-102 |
Downtown |
600 |
Smith |
4 North St. |
Rye |
|
A-103 |
Perryridge |
700 |
Hayes |
3 Main St. |
Harrison |
|
null |
null |
null |
White |
75 First St. |
Palo Alto |
The above table has two serious problems:
Normalization deals with how to design relational schemas to avoid those problems. Normalization centers on which attributes uniquely identifies an entity.
Let K be a subset of the attributes. K is a key of R if
values for K are sufficient to identify a unique tuple (row). Consider the
following customer table:

Note that there is no ID field here. Do you see any field that by itself could be a key. The customer-name field doesn’t work—last names are not unique. The {customer-name, customer-street} combination doesn’t work since the same street name could exist in different cities. The {customer-name, customer-street, customer-city} doesn’t work either. What if two members of the same household each have an account?
The easiest way to solve this problem is introduce the notion of a unique identifier for use as the key. So, the customer table below works. Look at the first two rows; two different customers who coincidently have the same name, street and city can be told apart.
|
customer-id |
customer-name |
customer-street |
customer-city |
|
c-001-001 |
Smith |
4 North St. |
Rye |
|
c-001-002 |
Smith |
4 North St. |
Rye |
|
c-001-003 |
Hayes |
3 Main St. |
Harrison |
|
c-001-004 |
White |
75 First St. |
Palo Alto |
SQL is the defacto standard for accessing data. Only a small subset of SQL will be introduced here—more than enough for you to do your homework. In the examples below the following schema will be used. The highlighted names are the keys within the relations. The arrows show which fields are in common. For example, the branch-name of the account and the branch-name of the loan are intended to be the same as the branch-name in the branch relation—that’s how you link entities together.

SQL is based on set and relational operations.A typical SQL query has the form:
select A1,
A2, ..., An
from r1,
r2, ..., rm
where P
The result of an SQL query is a relation (i.e. a table).
The select clause is used to list the attributes desired in the result of a query.
Example: Find the names of all branches in the loan relation.
select branch_name
from loan
An asterisk in the
select clause denotes “all attributes”. The following example returns the
whole loan table.
select
*
from loan
NOTE: SQL names are case insensitive, meaning you can use upper case or lower case.
SQL allows duplicates in relations as well as in query
results. To force the elimination of duplicates, insert the keyword distinct after select. For example, to
find the names of all branches in the
loan relations, and remove duplicates
select
distinct branch_name
from loan
The keyword all
specifies that duplicates not be removed.
select
all branch_name
from
loan
The select clause can contain arithmetic expressions involving the operation, +, –, *, and /, and operating on constants or attributes of tuples.
For example, the following query: would return a relation which is the same as the loan relations, except
that the attribute amount
is multiplied by 100.
select loan_number, branch_name, amount *
100
from loan
The where clause is a predicate involving attributes of the relations in the from clause. Logical connectives are and, or, and not.
For example, find all loan number of loans made at the Perryridge branch whose loan amounts are greater than $1200
select loan_number
from loan
where branch_name = ‘Perryridge’ and
amount >1200
SQL Includes a between comparison operator to simplify queries.
For example, find the loan number of those loans with loan amounts between $90,000 and $100,000.
select loan_number
from loan
where amount between 90000 and
100000
The from clause lists the relations to be scanned in the evaluation of the expression.
For example find the Cartesian product of borrower “cross” loan.
select
*
from borrower, loan
This will create a relation with all the attributes of borrower and loan and all the records in both.
For example, find the name, loan number and loan amount of
all customers having a loan at the Perryridge branch.
select customer_name,
borrower.loan_number, amount
from borrower, loan
where borrower.loan_number =
loan.loan_number and
branch_name = ‘Perryridge’
Find the customer names and their loan numbers for all
customers having a loan at some branch.
select customer_name,
T.loan_number, S.amount
from borrower as T, loan as
S
where T.loan_number = S.loan_number
The above is just syntactic sugar.
Find the names of all branches that have greater assets than some branch located in Brooklyn.
select distinct T.branch_name
from branch as T, branch as S
where T.assets > S.assets and
S.branch_city = ‘Brooklyn’
Here it is required.
SQL includes a string-matching operators. Patterns are described using two special characters:
Find the names of all customers whose street includes the
substring “Main”.
select customer_name
from customer
where customer_street
like ‘%Main%’
SQL supports a variety of string operations such as:
You can also sort the results of a retrieval. For example: list in alphabetic order the names of all customers having a loan in Perryridge branch
select
distinct customer_name
from
borrower,
loan
where borrower loan_number =
loan.loan_number
and branch_name = ‘Perryridge’
order by customer_name
We may specify desc for descending order or asc for ascending order, for each attribute; ascending order is the default.
For example, order by customer_name desc
The set operations union, intersect, and except
operate on relations. Each automatically eliminates duplicates; to retain
all duplicates use the corresponding multiset versions union all, intersect all
and except
all.
For example, find all customers who have a loan, an account, or both:
(select customer_name from
depositor)
union
(select customer_name from borrower)
Find all customers who have both a loan and
an account.
(select
customer_name from depositor)
intersect
(select customer_name from
borrower)
Find all customers who have an account but no loan.
(select
customer_name from depositor)
except
(select customer_name from
borrower)
These functions operate on the multiset of values of a column of a relation, and return a value:
Here’s an example. Find the average account balance at the Perryridge branch.
select avg (balance)
from account
where branch_name
= ‘Perryridge’
Find the number of tuples in the customer relation.
select count (*)
from customer
Find the number of depositors in the bank.
select count (distinct customer_name)
from depositor
Find the number of depositors for each branch.
select branch_name,
count (distinct customer_name)
from depositor,
account
where depositor.account-number
=
account.account-number
group by branch_name
Note: Attributes in
select clause outside of aggregate functions must appear in the group by list.
Find the names of all branches where the average account balance is more than $1,200.
select branch_name,
avg (balance)
from account
group by
branch_name
having avg (balance) > 1200
Note: predicates in
the having clause are applied after the formation of groups whereas predicates
in the where clause are applied before forming groups.
It is possible for attributes to have no value. null signifies an unknown
or non-existent value. The predicate is null
can be used to check for null values.
E.g. Find all loan number in the loan relation with null values for amount.
select loan_number
from loan
where amount is
null
The result of any arithmetic expression involving null is
null, e.g. 5 + null returns null. Aggregate functions simply
ignore nulls.
Any comparison with null returns unknown
E.g. 5 < null or null
<> null or null
= null all return unknown
Three-valued logic using the truth value unknown:
OR: (unknown
or true) = true,
(unknown or false)
= unknown
(unknown or
unknown) = unknown
AND: (true and unknown) = unknown,
(false and unknown) = false,
(unknown and unknown) = unknown
NOT: (not unknown) = unknown
“P is unknown” is true if P is unknown.
Result of where
clause predicate is treated as
false if it evaluates to unknown.
Null values don’t affect aggregates. For example suppose you want to total all loan amounts. The following statement ignores null amounts:
select
sum (amount)
from loan
The result is null if there is no non-null amount.
All aggregate operations except count(*) ignore tuples with null values on the aggregated attributes.
There are more things that can go into retrievals
But, we won’t cover them here.
To delete all account records at the Perryridge branch
delete from account
where branch_name
= ‘Perryridge’
Of course these can get complicated.
To add a new tuple to account (according to schema):
insert into account
values (‘A-9732’,
‘Perryridge’,1200)
Or equivalently if you change the order:
insert
into account (branch_name, balance, account-number)
values (‘Perryridge’,
1200, ‘A-9732’)
Add a new tuple to account with balance set to null
insert
into account
values (‘A-777’,‘Perryridge’, null)
To increase all accounts with balances over $10,000 by 6%, the rest receive 5% increase, write two update statements:
update account
set balance = balance * 1.06
where balance
> 10000
update account
set balance =
balance * 1.05
where balance
<= 10000
The order is important. Why?
Allows the specification of not only a set of relations but also information about each relation, including:
An SQL relation is defined using the create table command:
create
table r (A1 D1, A2 D2, ..., An Dn,
(integrity-constraint1),
...,
(integrity-constraintk))
Where:
For example:
create
table branch (
branch_name char(15),
branch-city char(30),
assets integer)
There are three ways to express constraints in this command:
Example: Declare branch_name as the primary
key for branch,
ensure it is not null, and ensure that the values of assets are non-negative.
create table branch
(branch_name
char(15) not null,
branch-city char(30)
assets integer,
primary
key (branch_name),
check
(assets >= 0))
An SQL Driver contains application program interface (API) to:
Below are example from Perl, C, and Java.
$dbh=DBI->connect("dbi:mysql:cs351:nunki.usc.edu:9776",
"cs351","cs351")
|| die
$DBI::err;
$sth=$dbh->prepare('SELECT * from data');
$sth->execute;
print "Student\tFavorite Color\n";
print "-------\t--------------\n";
while (@row=$sth->fetchrow_array) {
print $row[0]."\t".$row[1]."\n";
}
$sth->finish;
$dbh->disconnect;
int ODBCexample()
{
RETCODE error;
HENV
env; /* environment */
HDBC
conn; /* database connection */
SQLAllocEnv(&env);
SQLAllocConnect(env, &conn);
SQLConnect(conn, "aura.bell-labs.com",
SQL_NTS, "avi", SQL_NTS, "avipasswd", SQL_NTS);
{ …. Do actual work … }
SQLDisconnect(conn);
SQLFreeConnect(conn);
SQLFreeEnv(env);
}
public static void
JDBCexample(String dbid, String userid, String passwd) {
try {
Class.forName ("oracle.jdbc.driver.OracleDriver");
Connection conn = DriverManager.getConnection( "jdbc:oracle@bell-labs.com:2000:bankdb",
userid, passwd);
Statement stmt =
conn.createStatement();
… Do Actual Work ….
stmt.close();
conn.close();
}
catch (SQLException sqle) {
System.out.println("SQLException : " + sqle);
}
}
Java has a full package called JDBC to work with
databases. All examples from Core Web Programming by Marty Hall and Larry
Brown. As with most Java packages there are many on-line resources:
In the picture below you can see that JDBC consists of two parts:

There are seven basic steps in using JDBC:
1. Load
the driver
try
{
Class.forName("oracle.jdbc.driver.OracleDriver");
Class.forName("com.mysql.jdbc.Driver");
} catch { ClassNotFoundException cnfe) {
System.out.println("Error loading
driver: " cnfe);
}
2. Define the Connection URL to the database
String host =
"dbhost.yourcompany.com";
String dbName = "someName";
int port = 1234;
String oracleURL =
"jdbc:oracle:thin:@" + host +
":" + port +
":" + dbName;
String mysqlURL = "jdbc:mysql://" +
host +
":" + port +
"/" + dbName;
3. Establish the Connection
String username = "jay_debesee";
String password = "secret";
Connection connection =
DriverManager.getConnection(mysqlURL,
username,
password);
Optionally, get information about the db system
DatabaseMetaData dbMetaData =
connection.getMetaData();
String productName =
dbMetaData.getDatabaseProductName();
System.out.println("Database: "
+ productName);
String productVersion =
dbMetaData.getDatabaseProductVersion();
System.out.println("Version: " +
productVersion);
4. Create a Statement
Statement statement =
connection.createStatement();
// discuss
PreparedStatements later
5. Execute a Query
String query = "SELECT col1, col2,
col3 FROM sometable";
ResultSet resultSet =
statement.executeQuery(query);
To modify the database, use executeUpdate, supplying a string that uses UPDATE, INSERT, or DELETE
Use statement.setQueryTimeout to specify a maximum delay to wait for results
6. Process the Result
while(resultSet.next()) {
System.out.println(resultSet.getString(1) +
" " +
resultSet.getString(2) + " " +
resultSet.getString(3));
}
Note that the first column has
index 1, not 0.ResultSet provides various getXxx methods
that take a column index or name and returns the data.
7. Close the Connection
connection.close();
import java.sql.*;
public class TestDriver {
public static void main(String[] Args) {
try {
Class.forName
("com.mysql.jdbc.Driver").newInstance();
}
catch (Exception E) {
System.err.println("Unable to load driver.");
E.printStackTrace();
}
try {
Connection C = DriverManager.getConnection(
"jdbc:mysql://almaak.usc.edu:3307/menagerie",
"root", "xyz");
//?user=root&password=xyz");
Statement s = C.createStatement();
String sql="select * from pet";
s.execute(sql);
ResultSet res=s.getResultSet();
if (res!=null) {
while(res.next()){//MySql start with 1
System.out.println("\n"+res.getString(1)
+
"\t"+res.getString(2));
}
}
c.close();
}
catch (SQLException E) {
System.out.println("SQLException: " +
E.getMessage());
System.out.println("SQLState: " + E.getSQLState());
System.out.println("VendorError: " + E.getErrorCode());
}
}
}
All ResultSet methods can throw a SQLException
float short long Time Object
You can derive a ResultSetMetaData object from a ResultSet in order to look up information about the result set. In particular
ResultSetMetaData answers the following questions:
Here are some useful metadata methods:
Here is a full example of how to use MetaData:
Connection connection =
DriverManager.getConnection(url, username,
password);
// Look up info about the database as a
whole.
DatabaseMetaData dbMetaData =
connection.getMetaData();
String productName =
dbMetaData.getDatabaseProductName();
System.out.println("Database: " +
productName);
String productVersion =
dbMetaData.getDatabaseProductVersion();
//Make a query
Statement statement =
connection.createStatement();
String query = "SELECT * FROM pet";
ResultSet resultSet =
statement.executeQuery(query);
// Look up information about a particular
table.
ResultSetMetaData resultsMetaData =
resultSet.getMetaData();
int columnCount =
resultsMetaData.getColumnCount();
// Column index starts at 1 (a la SQL) not 0
(a la Java).
for(int i=1; i<columnCount+1; i++) {
System.out.print(resultsMetaData.getColumnName(i) +
" ");
}
System.out.println();
//
Print results.
while(resultSet.next()) {
// Name of animal
System.out.print(" " + resultSet.getString(1));
// Name of owner
...
}
Here is the result of the above code on the pet database.
Database: MySQL
Product Version: 3.23.52-log
Pet Table from menagerie Database
|
name |
owner |
species |
sex |
birth |
death |
|
Puffball2 |
Diane |
hamster |
f |
2000-01-30 |
0000-00-00 |
|
Puffball3 |
Diane |
hamster |
m |
2004-03-23 |
2005-03-30 |
|
Freddy |
Dave |
terrier |
m |
1999-03-30 |
0000-00-00 |
|
Buster |
Jeannette |
cat |
m |
1992-06-30 |
2005-02-20 |
|
Annabelle |
Jeannette |
cat |
f |
1992-06-30 |
0000-00-00 |
Through the Statement object, SQL statements are sent to the database. There are three types of statement objects are available:
Here are some useful statement methods:
ResultSet
results =
statement.executeQuery("SELECT a,
b FROM table");
int
rows =
statement.executeUpdate("DELETE
FROM EMPLOYEES" +
"WHERE
STATUS=0");
If you are going to execute similar SQL statements multiple times, using “prepared” (parameterized) statements will be more efficient. First create it for compilation before actually being used. Then, each time you use it, you simply replace some of the marked parameters using the setXxx methods. PreparedStatement's execute methods have no parameters:
Here is an example:
Connection
connection =
DriverManager.getConnection(url, user,
password);
PreparedStatement
statement =
connection.prepareStatement("UPDATE
employees " +
"SET salary
= ? " +
"WHERE id =
?");
float[] newSalaries
= getSalaries();
int[] employeeIDs =
getIDs();
for(int i=0;
i<employeeIDs.length; i++) {
statement.setFloat(1, newSalaries[i]);
statement.setInt(2, employeeIDs[i]);
statement.executeUpdate();
}
Here are a few useful prepared statement methods:
Nearly every JDBC method can throw a SQLException in response to a data access error. If more than one error occurs, they are chained together. SQL exceptions contain:
Here is an SQL exception example:
try {
... // JDBC statement.
} catch (SQLException sqle)
{
while (sqle != null) {
System.out.println("Message: " + sqle.getMessage());
System.out.println("SQLState: " +
sqle.getSQLState());
System.out.println("Vendor Error:
" +
sqle.getErrorCode());
sqle.printStackTrace(System.out);
sqle = sqle.getNextException();
}
}
SQLWarnings are rare, but provide information about database access warnings. The warning are chained to object whose method produced the warning. The following objects can receive a warning:
Call getWarning to obtain the warning object, and getNextWarning (on the warning object) for any additional warnings. Warnings are cleared on the object each time the statement is executed.
Here is an example:
ResultSet results =
statement.executeQuery(someQuery);
SQLWarning warning =
statement.getWarnings();
while (warning != null) {
System.out.println("Message: " +
warning.getMessage());
System.out.println("SQLState: " +
warning.getSQLState());
System.out.println("Vendor Error: "
+
warning.getErrorCode());
warning = warning.getNextWarning();
}
while (results.next()) {
int value = results.getInt(1);
... // Call additional methods on result
set.
SQLWarning warning = results.getWarnings();
while (warning != null) {
System.out.println("Message: " +
warning.getMessage());
System.out.println("SQLState: " +
warning.getSQLState());
System.out.println("Vendor Error:
" +
warning.getErrorCode());
warning = warning.getNextWarning();
}
}
By default, after each SQL statement is executed the changes are automatically committed to the database. It is often the case that you want two or more SQL statements to be considered atomic, that is, they all must finish successfully or you don’t want any. A transaction encapsulates SQL statements in the same way as a critical section does for code.
In order to group two or more statements together into a
transaction, turn auto-commit off as follows:
connection.setAutoCommit(false)
Then make your SQL statements and, if successful, call connection.commit() to permanently record the changes. If an error occurs call connection.rollback(). Here is an example:
Connection connection =
DriverManager.getConnection(url, username,
passwd);
connection.setAutoCommit(false);
try {
statement.executeUpdate(...);
statement.executeUpdate(...);
...
} catch (Exception e) {
try {
connection.rollback();
} catch (SQLException sqle) {
// report problem
}
} finally {
try {
connection.commit();
connection.close();
} catch (SQLException sqle) { }
}
Here is a review of the transaction methods:
Performing JDBC queries and formatting output are common tasks, so the authors of the above code have created helper classes to perform this function: DatabaseUtilities and DBResults.
DatabaseUtilities has 3 static utilities:
The DBResults class stores completed results of a JDBC Query. It differs from a ResultSet in several ways:
· ResultSet doesn't necessarily have all the data; reconnection to database occurs as you ask for later rows.
· This class stores results as strings, in arrays.
· This class includes DatabaseMetaData (database product name and version) and ResultSetMetaData (the column names).
·
This class has a toHTMLTable(headingColor)
method that turns the results into a long string corresponding to an HTML
table.
·
See the class definitions for all the methods.
Here is a simple example:
DBResults results =
DatabaseUtilities.getQueryResults(driver,
url,
username,
password,
query,
true);
out.println(results.toHTMLTable("RED"));
We have covered a lot of material here. Here a few summary points about JDBC by the authors of Core Web Programming.
Pthreads are for threading in C++. Here is sample threading code followed by an explanation:.
#include
<iostream>
#include
<pthread.h>
void
*task1(void *X) //task to be executed by ThreadA
{
//...
cout << "Thread A complete"
<< endl;
}
void
*task2(void *X) //task to be executed by ThreadB
{
//...
cout << "Thread B complete"
<< endl;
}
int
main(int argc, char *argv[])
{
pthread_t ThreadA,ThreadB; // declare
threads
pthread_create(&ThreadA,NULL,task1,NULL);
pthread_create(&ThreadB,NULL,task2,NULL);
// additional processing
pthread_join(ThreadA,NULL); // wait for
threads
pthread_join(ThreadB,NULL);
return(0);
}
In this example, the main line of the example defines the set of instructions for the primary thread. The primary thread, in this case, is also the boss thread. The boss thread declares two threads, ThreadA and ThreadB. By using the pthread_create() function, these two threads are associated with the tasks they are to execute. The two tasks, task1 and task2, are defined. They simply send a message to the standard out but could be programmed to do anything. The pthread_create() function causes the threads to immediately execute their assigned tasks. The pthread_join() function works the same way as wait() for processes. The primary thread waits until both threads return. Figure 11 shows the layout of the example. It also shows what happens to the flow of controls as the pthread_create() and pthread_join() functions are called.
In Figure 11, the pthread_create() function causes a fork in the flow of control in the main line or
primary thread. Two additional flows of control, one for each thread, are
executing concurrently. The pthread_create() function returns
immediately after the threads are created. It is an asynchronous function. As
each thread executes its set of instructions, the primary thread executes its
instructions. The pthread_join() causes the primary thread to wait until each thread terminates
and rejoins the main control flow.

Figure 11 The
layout, output, and flow of control for the example .