SOAP is the foundation on which the plethora of WS-* specifications is built. Despite the hype and antihype it’s been subjected to, there’s amazingly little to this specification. You can take any XML document (so long as it doesn’t have a DOCTYPE or processing instructions), wrap it in two little XML elements, and you have a valid SOAP document. For best results, though, the document’s root element should be in a namespace.
Here’s an XML document:
<hello-world xmlns="http://example.com"/>
Here’s the same document, wrapped in a SOAP envelope:
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <hello-world xmlns="http://example.com"/>
  </soap:Body>
</soap:Envelope>
The only catch is that the SOAP
Envelope must have the same character encoding
as the document it encloses. That’s pretty much all there is to it.
Wrapping an XML document in two extra elements is certainly not an
unreasonable or onerous task, but it doesn’t exactly solve all the
world’s problems either.
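To make the "two little XML elements" concrete, here is a minimal sketch of the wrapping step using Python's standard library. The function name `wrap_in_soap` is my own invention for illustration, not part of any SOAP toolkit.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def wrap_in_soap(document_xml):
    """Wrap an XML document (a string) in a SOAP 1.1 Envelope and Body.

    Hypothetical helper for illustration only.
    """
    # Ask the serializer to use the conventional "soap" prefix.
    ET.register_namespace("soap", SOAP_NS)
    payload = ET.fromstring(document_xml)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return ET.tostring(envelope, encoding="unicode")


print(wrap_in_soap('<hello-world xmlns="http://example.com"/>'))
```

The serializer may choose a generated prefix for the payload's namespace instead of a default namespace declaration. The result is equivalent XML, but it won't be character-for-character identical to the hand-written envelope.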
Example 10-1. A SOAP envelope to be submitted to Google’s SOAP search service
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <gs:doGoogleSearch xmlns:gs="urn:GoogleSearch">
      <key>00000000000000000000000000000000</key>
      <q>REST book</q>
      <start>0</start>
      <maxResults>10</maxResults>
      <filter>true</filter>
      <restrict/>
      <safeSearch>false</safeSearch>
      <lr/>
      <ie>latin1</ie>
      <oe>latin1</oe>
    </gs:doGoogleSearch>
  </soap:Body>
</soap:Envelope>
This document describes a Call to the Remote Procedure
gs:doGoogleSearch. All of the query parameters are neatly tucked into
named elements. This example is fully functional, though if you POST it
to Google you’ll get back a fault document saying that the key is not
valid.
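Seen from the receiving end, the envelope really is just a procedure name plus named arguments. The following sketch pulls both out of an abbreviated copy of Example 10-1. The function name `parse_rpc_call` is hypothetical, and a real toolkit would do considerably more validation.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

# An abbreviated copy of Example 10-1.
ENVELOPE = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <gs:doGoogleSearch xmlns:gs="urn:GoogleSearch">
      <key>00000000000000000000000000000000</key>
      <q>REST book</q>
      <start>0</start>
      <maxResults>10</maxResults>
      <restrict/>
    </gs:doGoogleSearch>
  </soap:Body>
</soap:Envelope>"""


def parse_rpc_call(envelope_xml):
    """Return (namespace, procedure name, parameters) for an RPC-style body."""
    root = ET.fromstring(envelope_xml)
    body = root.find(f"{{{SOAP_NS}}}Body")
    call = body[0]  # the single element inside the Body
    # ElementTree spells namespaced tags as "{namespace}name".
    if call.tag.startswith("{"):
        namespace, _, name = call.tag[1:].partition("}")
    else:
        namespace, name = "", call.tag
    # Empty elements such as <restrict/> become empty strings.
    params = {child.tag: (child.text or "") for child in call}
    return namespace, name, params


namespace, name, params = parse_rpc_call(ENVELOPE)
print(namespace, name, params["q"])
```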
This style of encoding parameters to a remote function is sometimes called RPC/encoded or Section 5 encoding, after the section of the SOAP 1.1 specification that defines the encoding rules used for RPC. But over time, fashions change. Later versions of the specification made support of this encoding optional, and so it’s now effectively deprecated. It was largely replaced by an encoding called document/literal, and then by wrapped document/literal. Wrapped document/literal looks largely the same as Section 5 encoding, except that the parameters tend to be scoped to a namespace.
One final note about body elements: the parameters may be
annotated with data type information based on XML Schema Data Types.
This annotation goes into attributes, and generally reduces the
readability of the document. Instead of
<ie>latin1</ie> you might see
<ie xsi:type="xsd:string">latin1</ie>. Multiply that by
the number of arguments in Example 10-1 and
you may start to see why many recoil in horror when they hear “SOAP.”
In Chapter 1 I said that HTTP and SOAP
are just different ways of putting messages in envelopes. HTTP’s main
moving parts are the entity-body and the headers. With a SOAP element
named Body, you might expect to also find a
Header element. You’d be
right. Anything that can go into the
Body element (any namespaced document which has
no DOCTYPE or processing instructions) can go into the
Header. But while you tend to find only a
single element inside the Body, the
Header can contain any number of
elements. Header elements also tend
to be small.
Recalling the terminology used in “HTTP: Documents in Envelopes” in Chapter 1, headers are like “stickers” on an envelope. SOAP headers tend to contain information about the data in the body, such as security and routing information. The same is true of HTTP headers.
SOAP defines two attributes for header entities: actor and
mustUnderstand. If you know in advance that
your message is going to pass through intermediaries on the way to its
destination, you can identify (via a URI) the
actor that’s the target of any particular
header. The mustUnderstand attribute
is used to impose restrictions on those intermediaries (or on the final
destination). If the actor doesn’t
understand a header addressed to it, and
mustUnderstand is true, it must reject the
message, even if it thinks it could handle the message otherwise. An
example of this would be a header associated with a two-phase commit
operation. If the destination doesn’t understand two-phase commit, you
don’t want the operation to proceed.
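The rejection rule can be sketched in a few lines. This is a simplified illustration, not a complete SOAP processor: the function name `unmet_must_understand`, the `tx:transaction` header, and its namespace are all made up for the example. I treat both "1" and "true" as true values for mustUnderstand, and I assume a header with no actor attribute is addressed to the final destination.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

# A hypothetical message carrying a mandatory two-phase commit header.
MESSAGE = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Header>
    <tx:transaction xmlns:tx="http://example.com/tx" soap:mustUnderstand="1">
      42
    </tx:transaction>
  </soap:Header>
  <soap:Body>
    <hello-world xmlns="http://example.com"/>
  </soap:Body>
</soap:Envelope>"""


def unmet_must_understand(envelope_xml, my_actor, understood):
    """List mandatory headers addressed to my_actor that it doesn't understand.

    my_actor: this node's actor URI, or None for the final destination.
    understood: set of header tags in ElementTree's "{namespace}name" form.
    A non-empty result means the message must be rejected.
    """
    root = ET.fromstring(envelope_xml)
    header = root.find(f"{{{SOAP_NS}}}Header")
    if header is None:
        return []
    unmet = []
    for entry in header:
        if entry.get(f"{{{SOAP_NS}}}actor") != my_actor:
            continue  # addressed to some other node
        mandatory = entry.get(f"{{{SOAP_NS}}}mustUnderstand") in ("1", "true")
        if mandatory and entry.tag not in understood:
            unmet.append(entry.tag)
    return unmet


# A destination that knows nothing about transactions must reject this message.
print(unmet_must_understand(MESSAGE, None, set()))
```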
Beyond that, there isn’t much to SOAP. Requests and responses have
the same format, similar to HTTP. There’s a separate format for a SOAP
Fault, used to signify an error
condition. Right now the only thing that can go into a SOAP document is
an XML document. There have been a few attempts to define mechanisms for
attaching binary data to messages, but no clear winner has emerged.
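Because a Fault travels inside an ordinary envelope, a client has to look inside the Body to find out whether a response is an error. Here is a minimal sketch, assuming the SOAP 1.1 layout in which the Fault element contains unqualified faultcode and faultstring children. The sample fault and the function name `parse_fault` are invented for illustration and are not a real service response.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

# A made-up fault document for illustration.
FAULT = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <soap:Fault>
      <faultcode>soap:Client</faultcode>
      <faultstring>Invalid key</faultstring>
    </soap:Fault>
  </soap:Body>
</soap:Envelope>"""


def parse_fault(envelope_xml):
    """Return (faultcode, faultstring), or None if the body is not a Fault."""
    root = ET.fromstring(envelope_xml)
    body = root.find(f"{{{SOAP_NS}}}Body")
    if body is None:
        return None
    fault = body.find(f"{{{SOAP_NS}}}Fault")
    if fault is None:
        return None
    return (
        fault.findtext("faultcode", default=""),
        fault.findtext("faultstring", default=""),
    )


print(parse_fault(FAULT))
```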
Given this fairly simple protocol, what’s the basis for the hype and controversy? SOAP is mainly infamous for the technologies built on top of it, and I’ll cover those next. It does have one alleged benefit of its own: transport independence. The headers are inside the message, which means they’re independent of the protocol used to transport the message. You don’t have to send a SOAP envelope inside an HTTP envelope. You can send it over email, instant messaging, raw TCP, or any other protocol. In practice, this feature is rarely used. There’s been some limited public use of SMTP transports, and some use of JMS transports behind the corporate firewall, but the overwhelming majority of SOAP traffic is over HTTP.
SOAP is almost always sent over HTTP, but SOAP toolkits make little use of HTTP status codes, and tend to coerce all operations into POST methods. This is not technically disallowed by the REST architectural style, but it’s a degenerate sort of RESTful architecture that doesn’t get any of the benefits REST is supposed to provide. Most SOAP services support multiple operations on diverse data, all mediated through POST on a single URI. This isn’t resource-oriented: it’s RPC-style.
The single most important change you can make is to split your service into resources: identify every “thing” in your service with a separate URI. Pretty much every SOAP toolkit in existence provides access to the request URI, so use it! Put the object reference up front. Such usages may not feel idiomatic at first, but if you stop and think about it, this is what you’d expect to be doing if SOAP were really a Simple Object Access Protocol. It’s the difference between object-oriented programming in a function-oriented language like C:
my_function(object, argument);
and in an object-oriented language like C++:
object->my_method(argument);
When you move the scoping information outside the parentheses
(or, in this case, the Envelope),
you’ll soon find yourself identifying large numbers of resources with
common functionality. You’ll want to refactor your logic to exploit
these commonalities.
The next most important change has to do with the
object-oriented concept of polymorphism. You should try to make
objects of different types respond to method calls with the same name.
In the world of the Web, this means (at a minimum) supporting HTTP’s
GET method. Why is this
important? Think about a programming language’s standard library.
Pretty much every object-oriented language defines a standard class
hierarchy, and at its root you find an
Object class which defines a
toString method. The details are different
for every language, but the result is always the same: every object
has a method that provides a canonical representation of the object.
The GET method provides a similar
function for HTTP resources.
Once you do this, you’ll inevitably notice that the
GET method is used more heavily than all
the other methods you have provided. Combined. And by a wide margin.
That’s where conditional GET and caching come in. Implement these
standard features of HTTP, make your representations cacheable, and
you make your application more scalable. That has direct and tangible
benefits.
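Here is a rough sketch of conditional GET from the server’s side, using an ETag derived from the representation itself. It’s deliberately simplified: it only handles an If-None-Match header carrying a single ETag, the function name `conditional_get` is hypothetical, and the 60-second max-age is an arbitrary value chosen for the example.

```python
import hashlib


def conditional_get(representation, request_headers):
    """Return (status, headers, body) for a GET, honoring If-None-Match.

    representation: the resource's current representation, as bytes.
    request_headers: a dict of request header names to values.
    """
    # Derive a strong ETag from the bytes of the representation.
    etag = '"' + hashlib.sha256(representation).hexdigest() + '"'
    headers = {"ETag": etag, "Cache-Control": "max-age=60"}
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is still good: send no body.
        return 304, headers, b""
    return 200, headers, representation


# First request: full response. Second request: the client replays the ETag.
status, headers, body = conditional_get(b"<hello-world/>", {})
status_again, _, body_again = conditional_get(
    b"<hello-world/>", {"If-None-Match": headers["ETag"]}
)
print(status, status_again)
```

The second response carries no entity-body at all, which is where the savings in bandwidth and server work come from.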
Once you’ve done these three simple things, you may find yourself wanting more. Chapter 8 is full of advice on these topics.