Any single technology in Windows DNA can be applied to specific problems or features, but taken together, the tools that make up the DNA architecture are all about applications that live on, use, or are accessed by networks.
Formally speaking, we'll define network applications as software applications that require or substantially benefit from the presence of networked computers.
You could run a multi-user database (like SQL Server) or a directory service without a network, but what would be the point? Web-based applications effectively require a network.
Breaking applications into their functional pieces and deploying them across a network lets us make the best possible use of an organization's resources. Once you do this, however, you become reliant on the network for the full use of your applications. This implies requirements for reliability, scalability, and security, and soon you realize you need a well-planned architecture. In our case, that's Windows DNA. It's about adopting network applications as the future of general purpose computing, then developing an architecture that supports them.
The definition of network applications given above is rather vague. You probably know what is meant intuitively, but intuition doesn't go very far in programming. So let's look at the characteristic problems that network applications will need to deal with.
Network Application Characteristics
We've just said that network applications break their implementation into functional modules and rely on (or at least substantially benefit from) the presence of a network of computers. Some characteristics follow from that:
- Communications
- Concurrent usage
- State management
- Latency
- Rigorous encapsulation
Let's look at these in more detail.
Communications
The first point is essentially a given - if my applications work through a network, they must have some means of communication. However, as we'll see in a little while, communications can become a deep topic. Network applications raise issues of protocols and data formats. Simply running a cable between two computers and configuring them for the network is the easy part. After that, life for the application programmer can get very interesting.
While you may not actually have to worry about the detailed implementation of network communications, you must be concerned with the issues that arise when applications span multiple computers.
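To make the "protocols and data formats" point a little more concrete, here is a minimal sketch in Python. It assumes a made-up wire format (a 4-byte length header followed by a JSON body); the format and the names encode_message and decode_message are ours, chosen purely for illustration, and are not part of Windows DNA.

```python
import json
import struct

# Hypothetical wire format (an assumption for illustration):
# a 4-byte big-endian length header followed by a UTF-8 JSON body.
HEADER = struct.Struct(">I")


def encode_message(payload: dict) -> bytes:
    """Serialize a dictionary into a length-prefixed frame."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return HEADER.pack(len(body)) + body


def decode_message(frame: bytes) -> dict:
    """Parse one length-prefixed frame back into a dictionary."""
    if len(frame) < HEADER.size:
        raise ValueError("frame shorter than header")
    (length,) = HEADER.unpack_from(frame)
    body = frame[HEADER.size:HEADER.size + length]
    if len(body) != length:
        raise ValueError("truncated frame")
    return json.loads(body.decode("utf-8"))


if __name__ == "__main__":
    frame = encode_message({"order": 42, "action": "ship"})
    print(decode_message(frame))
```

Both ends must agree on every detail here: the byte order of the header, the text encoding, and the structure of the body. That agreement is the protocol, and it is exactly what a standalone application never has to think about.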
Concurrent Usage
You could deploy network applications in a single-user manner. You might insist that every client be matched with an individual server, or you might have a multi-user server that forced clients to wait for service in series. This would simplify the programming task, but it would also negate many of the benefits of using networks in the first place. We want applications to dynamically access some bit of software on the network, obtain service, and go about the rest of their processing. Forcing them through the iron gates of single-use software would mean incurring all the overhead of distributed processing while also accepting the limitations of standalone software. You'd feel cheated, wouldn't you?
Even if you don't develop multi-user software, you rely on concurrent access to system services and network servers. I can write a web page with single-user client-side script, but I want the web server it accesses to be able to accommodate multiple users at the same time. Could you imagine a corporate database application denying a user service because one other user somewhere in the organization was already connected?
Some part of a network application, then, must handle the tough tasks of concurrent access. These include multithreading, concurrency control, and data integrity. Multithreading is what enables a single piece of software to have more than one task underway at any given time. Concurrency control is what keeps one task distinct from another. Most importantly, it concerns itself with how to maintain the integrity of the data or process when different users want to modify the same bit of information.
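As a small illustration of maintaining integrity when several tasks modify the same bit of information, here is a sketch in Python. The Counter class and run_workers function are hypothetical names of our own; the point is only that the read-modify-write step is guarded by a lock so that no update is lost.

```python
import threading


class Counter:
    """A shared value guarded by a lock so concurrent updates are not lost."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # The read-modify-write sequence must run as one unit.
        with self._lock:
            self._value += 1

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


def run_workers(counter: Counter, workers: int, per_worker: int) -> int:
    """Start several threads that all update the same counter."""

    def work() -> None:
        for _ in range(per_worker):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


if __name__ == "__main__":
    print(run_workers(Counter(), workers=8, per_worker=1000))
```

Without the lock, two threads could read the same old value and each write back the same new one, silently dropping an update. A database server faces the same problem on a much larger scale.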
State Management
State management is closely related to concurrency. Technically, this is another facet of concurrency, but we'll consider it as a characteristic in its own right because this is a topic that almost all programmers will encounter when writing network applications.
If an application is using other applications or components on the network, it must keep track of where it is in the process - the state of the process. Single-user, standalone applications find it easy to maintain state. The value of the variables is the state of your data, while the line of code that is currently executing defines where you are in the overall process.
Multiuser, distributed applications have it harder. Suppose I have an e-commerce web site whose implementation involves sending a message to another application and receiving a reply. I have to maintain a set of data for each user. When I send a message, I have to record where that particular user is in the overall process, together with the current state of that user's data. When a reply comes in, I have to be able to determine which user is affected by the reply and retrieve the data I saved.
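The bookkeeping just described can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the names UserState and PendingRequests are hypothetical, and the "message" is reduced to a correlation ID that the other application is assumed to echo back in its reply.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class UserState:
    """Per-user data saved while a request is outstanding."""
    user_id: str
    step: str
    data: dict = field(default_factory=dict)


class PendingRequests:
    """Maps a correlation ID to the state saved when a message was sent."""

    def __init__(self) -> None:
        self._pending = {}  # correlation ID -> UserState

    def send(self, state: UserState) -> str:
        # Record where this user is in the process before the message leaves.
        correlation_id = uuid.uuid4().hex
        self._pending[correlation_id] = state
        return correlation_id

    def receive(self, correlation_id: str, reply: dict) -> UserState:
        # Work out which user the reply belongs to and restore the saved data.
        state = self._pending.pop(correlation_id)
        state.data.update(reply)
        state.step = "reply_received"
        return state


if __name__ == "__main__":
    pending = PendingRequests()
    ticket = pending.send(UserState("alice", "awaiting_payment", {"cart": 2}))
    print(pending.receive(ticket, {"approved": True}))
```

Replies may arrive in any order, so the correlation ID, not the order of arrival, is what ties a reply back to the right user.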
If you've worked with the Session and Application objects in Active Server Pages, you've already programmed state management. You've told the ASP component to keep track of something you'll need again later. The more widely distributed you make your application, the more state information needs to be coordinated. It's best to try to minimize state information on remote servers using a stateless server model, which we'll see a little more about later in the book.
Latency
How long does it take to communicate with other components on the network? The time attributed solely to the network is the network latency of the application.
At first glance, this would seem too small to worry about. How fast does a signal move through a wire? Individual electrons actually drift quite slowly. What matters is the electrical signal, which for practical purposes travels through copper wire at roughly two-thirds the speed of light. A good rule of thumb is 200 meters per microsecond.
Surely that's good enough, you might say. In a standalone application, though, the time to access a function might be a fraction of a millisecond. Now measure the path through your network - seldom a straight line on a LAN - and multiply by two. Add the time imposed by routers or switches, and you find that networks have latency that is significant compared to the time to execute instructions within a component.
Latency is especially important for Internet applications. The distance alone is significant. A round trip across the United States should take, in theory, 50 milliseconds. But that's a direct hop. When I send a packet from Philadelphia to a particular server in San Francisco, I find that it takes almost three times as long to get there and back. My packet is bouncing around my service provider, then heading south to the major interconnect MAE East in Virginia, then making its way across country. Each router or switch takes its toll along the way.
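The arithmetic behind these figures is easy to check. The sketch below, in Python, applies the rule of thumb of 200 meters per microsecond. The 5,000 km one-way path is our own assumption, chosen because it reproduces the 50 millisecond theoretical figure; the function name is likewise ours.

```python
# Rule of thumb from the text: signals cover about 200 meters per microsecond.
METERS_PER_MICROSECOND = 200.0


def propagation_delay_ms(path_meters: float, round_trip: bool = True) -> float:
    """Return the time a signal needs to cover a path, in milliseconds."""
    one_way_us = path_meters / METERS_PER_MICROSECOND
    total_us = one_way_us * 2 if round_trip else one_way_us
    return total_us / 1000.0


if __name__ == "__main__":
    # Assumed 5,000 km one-way path across the United States.
    print(propagation_delay_ms(5_000_000))  # 50.0
    # Assumed 100 m cable run inside an office.
    print(propagation_delay_ms(100))        # 0.001
```

This covers only the time spent on the wire. Routers, switches, and indirect routes add to it, which is why the measured round trip is so much longer than the theoretical one.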
This is an extreme case, but even crossing the office is more expensive than moving between addresses in a single computer's memory. Programmers and architects need to consider how they can minimize the number of times their systems need to call on a remote server if they want to maintain acceptable performance. Latency changes the way we design applications.
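One common way to cut down the number of remote calls is to ask for several things in one request. The following Python sketch uses a hypothetical FakeRemoteCatalog class of our own that counts round trips rather than making real network calls, so the cost of a chatty design can be compared with a batched one.

```python
class FakeRemoteCatalog:
    """Stand-in for a remote server; it counts round trips instead of sleeping."""

    def __init__(self, prices: dict, round_trip_ms: float) -> None:
        self._prices = prices
        self.round_trip_ms = round_trip_ms
        self.round_trips = 0

    def get_price(self, item: str) -> float:
        self.round_trips += 1  # one network round trip per item
        return self._prices[item]

    def get_prices(self, items: list) -> dict:
        self.round_trips += 1  # one round trip for the whole batch
        return {item: self._prices[item] for item in items}

    @property
    def latency_spent_ms(self) -> float:
        return self.round_trips * self.round_trip_ms


if __name__ == "__main__":
    prices = {"book": 30.0, "pen": 2.0, "lamp": 45.0}

    chatty = FakeRemoteCatalog(prices, round_trip_ms=50.0)
    for name in prices:
        chatty.get_price(name)
    print(chatty.latency_spent_ms)   # 150.0

    batched = FakeRemoteCatalog(prices, round_trip_ms=50.0)
    batched.get_prices(list(prices))
    print(batched.latency_spent_ms)  # 50.0
```

The batched interface does the same work for a third of the latency in this example, and the gap widens as the number of items grows.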
Rigorous Encapsulation
Encapsulation is a technique in which you hide - or encapsulate - the details of an implementation from the software that uses it. Object-oriented programming is a classic example of encapsulation. An application using an object has no idea how the object maintains its data or implements its methods. Structured programming, the traditional function-by-function method of building an application, may also practice encapsulation by hiding the implementation details of a subroutine from the main program. An Application Programming Interface (API) is a form of encapsulation.
In a standalone application, encapsulation is a good idea. It helps programmers develop and maintain software effectively and efficiently. Network applications have no choice but to practice rigorous encapsulation, because different programming teams may write the components of a network application. One team may not have any influence over another, or even know who wrote a given component.
If I were to write a shipping application that relied on information from the Federal Express tracking application on the Web, for example, I would have no choice but to use their application using their HTTP-based API, as I have no other access to the application. Certainly, I cannot call Federal Express and ask them to make some internal modifications for me. Network applications live and die by clean interfaces, behind which implementation details are encapsulated.
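Here is a minimal Python sketch of coding against a clean interface. Everything in it is hypothetical: TrackingService, InMemoryTracking, and describe_shipment are names we made up, and the in-memory class merely stands in for a remote provider. It does not represent any real carrier's API.

```python
from typing import Protocol


class TrackingService(Protocol):
    """The only thing the shipping code knows about a tracking provider."""

    def status(self, tracking_number: str) -> str:
        ...


class InMemoryTracking:
    """Hypothetical stand-in; a real provider would sit behind HTTP."""

    def __init__(self, records: dict) -> None:
        self._records = records  # private detail, free to change

    def status(self, tracking_number: str) -> str:
        return self._records.get(tracking_number, "unknown")


def describe_shipment(service: TrackingService, tracking_number: str) -> str:
    """Depends on the interface only, never on how status is produced."""
    return f"{tracking_number}: {service.status(tracking_number)}"


if __name__ == "__main__":
    service = InMemoryTracking({"A1": "in transit"})
    print(describe_shipment(service, "A1"))
```

The shipping code never touches the provider's internal records. The provider could replace its storage or rewrite its implementation entirely, and describe_shipment would keep working as long as the status method behaves the same way.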
We're now going to take a possibly familiar trip down memory lane, tracing the history of applications from monoliths to component-based distributed applications. You may have seen it before, but it's still necessary to discuss this because it's central to DNA's concept.