Chapter 1. Introducing C#

C#—pronounced “See Sharp”—is a programming language designed for Microsoft’s .NET platform. Since its first release in 2002, C# has found many roles. It is widely used on the server side of websites, and also on both the client and server in line-of-business Windows desktop applications. You can write smartphone user interfaces and Xbox 360 games in C#. More recently, Microsoft’s Silverlight platform has made C# an option for writing Rich Internet Applications that run in a web browser.

But what kind of language is C#? To understand a language well enough to use it effectively, it’s not enough to focus purely on the details and mechanisms, although we’ll be spending plenty of time on those in this book. It is equally important to understand the thinking behind the details. So in this chapter, we’ll look at what problems C# was built to solve. Then we’ll explore the style of the language, through aspects that distinguish it from other languages. And we’ll finish the chapter with a look at the latest step in the evolution of C#, its fourth version.

Why C#? Why .NET?

Programming languages exist to help developers be more productive. Many successful languages simplify or automate tedious tasks that previously had to be done by hand. Some offer new techniques that allow old problems to be tackled more effectively, or on a larger scale than before. How much difference C# can make to you will depend on your programming background, of course, so it’s worth considering what sorts of people the language designers had in mind when they created C#.

C# is aimed at developers working on the Windows platform, and its syntax is instantly familiar to users of C or C++, or other languages that draw from the same tradition, such as JavaScript and Java. Fundamental language elements such as statements, expressions, function declarations, and flow control are modeled as closely as possible on their equivalents in C family languages.

A familiar syntax is not enough of a reason to pick a language, of course, so C# offers productivity-enhancing features not found in some of its predecessors. Garbage collection frees developers from the tyranny of common memory management problems such as memory leaks and circular references. Verifiable type safety of compiled code rules out a wide range of bugs and potential security flaws. C or C++ Windows developers may not be accustomed to those features, although they will seem old hat to Java veterans. But Java has nothing to compete with the “LINQ” features C# offers for working with collections of information, whether in object models, XML documents, or databases. Integrating code from external components, even those written in other languages, is remarkably painless. C# also incorporates support for functional programming, a powerful feature previously most commonly seen in academic languages.

Many of the most useful features available to C# developers come from the .NET Framework, which provides the runtime environment and libraries for C#, and all other .NET languages, such as VB.NET. C# was designed for .NET, and one of the main benefits of its close relationship with the .NET Framework is that working with framework features such as the class library feels very natural.

The .NET Framework Class Library

Working in C# means more than using just the language—the classes offered by the .NET Framework are an extremely important part of the C# developer’s everyday experience (and they account for a lot of this book’s content). Most of the library functionality falls into one of three categories: utility features written in .NET, wrappers around Windows functionality, and frameworks.

The first group comprises utility types such as dictionaries, lists, and other collection classes, as well as string manipulation facilities such as a regular expression engine. There are also features that operate on a slightly larger scale, such as the object models for representing XML documents.

Some library features are wrappers around underlying OS functionality. For example, there are classes for accessing the filesystem, and for using network features such as sockets. And there are classes for writing output to the console, which we can illustrate with the obligatory first example of any programming language book, shown in Example 1-1.

Example 1-1. The inevitable “Hello, world” example

class Program
{
    static void Main()
    {
        System.Console.WriteLine("Hello, world");
    }
}

We’ll examine all the pieces shown here in due course, but for now, note that even this simplest of examples depends on a class from the library—the System.Console class in this case—to do its job.

Finally, the class library offers whole frameworks to support building certain kinds of applications. For example, Windows Presentation Foundation (WPF) is a framework for building Windows desktop software; ASP.NET (which is not an acronym, despite appearances) is a framework for building web applications. Not all frameworks are about user interfaces—Windows Communication Foundation (WCF) is designed for building services accessed over the network by other computer systems, for instance.

These three categories are not strict, as quite a few classes fit into two. For example, the parts of the class library that provide access to the filesystem are not just thin wrappers around existing Win32 APIs. They add new object-oriented abstractions, providing significant functionality beyond the basic file I/O services, so these types fit into both the first and second categories. Likewise, frameworks usually need to integrate with underlying services to some extent. For example, although the Windows Forms UI framework has a distinctive API of its own, a lot of the underlying functionality is provided by Win32 components. The categories simply offer a useful idea of what sorts of things you can find in the class libraries.
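To give a feel for how the filesystem classes straddle those categories, here is a minimal sketch that mixes a collection class with the file I/O types. The class name FileDemo, the RoundTrip method, and the filename are invented for this illustration; only the System.IO and collection types come from the class library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

class FileDemo
{
    // Writes the lines to a file, then reads them back into a list.
    // (Hypothetical helper, not part of the class library.)
    public static List<string> RoundTrip(string path, string[] lines)
    {
        File.WriteAllLines(path, lines);
        return new List<string>(File.ReadAllLines(path));
    }

    static void Main()
    {
        // Path deals with directory separators for us; the filename is arbitrary.
        string path = Path.Combine(Path.GetTempPath(), "class-library-demo.txt");
        List<string> lines = RoundTrip(path, new[] { "first", "second" });

        // FileInfo presents the file as an object with properties,
        // an abstraction layered on top of the OS file services.
        var info = new FileInfo(path);
        Console.WriteLine(info.Name + " holds " + lines.Count + " lines");

        File.Delete(path);
    }
}
```

Nothing here calls a Win32 API directly, yet every file operation ultimately relies on services the operating system provides.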

Language Style

C# is not the only language that runs on the .NET Framework. Indeed, support for multiple languages has always been a key feature of .NET, reflected in the name of its runtime engine, the CLR or Common Language Runtime. As this name implies, .NET is not just for one language—numerous languages have access to the services of the .NET Framework class library. Why might you choose C# over the others?

We already mentioned one important reason: C# was designed specifically for .NET. If you are working with .NET technologies such as WPF or ASP.NET, you’ll be speaking their language if you work in C#. Compare this with C++, which supports .NET through extensions to the original language. The extensions are carefully thought out and work well, but code that uses .NET libraries just looks different from normal C++, so programs that bridge the worlds of .NET and standard C++ never feel completely coherent. And the dual personality often presents dilemmas—should you use standard C++ collection classes or the ones in the .NET class library, for example? In native .NET languages such as C#, such questions do not emerge.

But C# is not unique in this respect. Visual Studio 2010 ships with three languages designed for .NET: C#, VB.NET, and F#. (Although VB.NET follows on from its non-.NET Visual Basic predecessors, it was radically different in some important ways. It is a native .NET language with a VB-like syntax rather than VB 6 with .NET capabilities bolted on.) The choice between these languages comes down to what style of language you prefer.

F# is the odd one out here. It’s a functional programming language, heavily influenced by a language called ML. Back in 1991, when your authors were first-year students, our university’s computer science course chose ML for the first programming language lectures in part because it was so academic that none of the students would previously have come across anything like it. F# is still at the academic end of the spectrum despite having climbed far enough down the ivory tower to be a standard part of a mainstream development environment. It excels at complicated calculations and algorithms, and has some characteristics that can help with parallel execution. However, as with many functional languages, the cost of making some hard problems easier is that a lot of things that are easy in more traditional languages are remarkably hard in F#—functional languages are adept at complex problems, but can be clumsy with simple ones. It seems likely that F# will mostly be used in scientific or financial applications where the complexity of the computation to be performed dwarfs the complexity of the code that needs to act on the results of those calculations.

While F# feels distinctly other, VB.NET and C# have a lot of similarities. The most obvious factor in choosing between these is that VB.NET is easier to learn for someone familiar with Visual Basic syntax, while C# will be easier for someone familiar with a C-like language. However, there is a subtler difference in language philosophy that goes beyond the syntax.

Composability

A consistent theme in the design of the C# programming language is that its creators tend to prefer general-purpose features over specialized ones. The most obvious example of this is LINQ, the Language INtegrated Query feature added in C# 3.0. Superficially, this appears to add SQL-like query features to the language, providing a natural way to integrate database access into your code. Example 1-2 shows a simple query.

Example 1-2. Data access with LINQ

var californianAuthors = from author in pubs.authors
                         where author.state == "CA"
                         select new
                         {
                             author.au_fname,
                             author.au_lname
                         };
foreach (var author in californianAuthors)
{
    Console.WriteLine(author);
}

Despite appearances, C# doesn’t know anything about SQL or databases. To enable this syntax, C# 3.0 added a raft of language features which, in combination, allow code of this sort to be used not just for database access, but also for XML parsing, or working with object models. Moreover, many of the individual features can be used in other contexts, as we’ll see in later chapters. C# prefers small, composable, general-purpose features over monolithic, specialized ones.
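As a rough sketch of that generality, the following code applies the same query syntax to an ordinary in-memory list, with no database in sight. The Author class, its properties, and the sample data are all made up for this illustration. The second method expresses the same query with extension methods and lambdas, which is broadly the form the compiler translates a query expression into.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type standing in for a row of author data.
class Author
{
    public string Name { get; set; }
    public string State { get; set; }
}

class LinqToObjects
{
    // A query expression over a plain collection of objects.
    public static List<string> NamesInState(IEnumerable<Author> authors, string state)
    {
        var query = from author in authors
                    where author.State == state
                    select author.Name;
        return query.ToList();
    }

    // The same query written with extension methods and lambdas.
    public static List<string> NamesInStateWithLambdas(IEnumerable<Author> authors, string state)
    {
        return authors.Where(a => a.State == state)
                      .Select(a => a.Name)
                      .ToList();
    }

    static void Main()
    {
        var authors = new List<Author>
        {
            new Author { Name = "Ann", State = "CA" },
            new Author { Name = "Bob", State = "WA" },
            new Author { Name = "Cy", State = "CA" }
        };

        // Prints Ann, then Cy.
        foreach (string name in NamesInState(authors, "CA"))
        {
            Console.WriteLine(name);
        }
    }
}
```

Lambdas, extension methods, object initializers, and var are each useful on their own, which is the sense in which these features are composable.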

A striking example of this philosophy is a feature that was demonstrated in prototype form in C#, but which eventually got left out: XML literals. This experimental syntax allowed inline XML, which compiled into code that built an object model representing that XML. The C# team’s decision to omit this feature illustrates a stylistic preference for generality over highly specialized features—while the LINQ syntax has many applications, XML literal syntax cannot be used for anything other than XML, and this degree of specialization would feel out of place in C#.[1]

Managed Code

The .NET Framework provides more than just a class library. It also provides services in subtler ways that are not accessed explicitly through library calls. For example, earlier we mentioned that C# can automate some aspects of memory management, a notorious source of bugs in C++ code. Abandoning heap-allocated objects once you’re done with them is a coding error in C++, but it’s the normal way to free them in .NET. This service is provided by the CLR—the .NET Framework’s runtime environment. Although the C# compiler works closely with the runtime to make this possible, providing the necessary information about how your code uses objects and data, it’s ultimately the runtime that does the work of garbage collection.
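To make this concrete, here is a minimal sketch of code that abandons objects as a matter of course. The class and method names are invented for illustration, and the number of collections reported will vary from one run and one machine to the next.

```csharp
using System;

class Abandon
{
    // Allocates 'count' buffers and never frees any of them explicitly.
    public static long SumOfLengths(int count)
    {
        long total = 0;
        for (int i = 0; i < count; i++)
        {
            // Once an iteration ends, nothing refers to its buffer,
            // so the runtime is free to reclaim the memory.
            byte[] buffer = new byte[1024];
            total += buffer.Length;
        }
        return total;
    }

    static void Main()
    {
        Console.WriteLine(SumOfLengths(100000));   // 102400000

        // How many times the runtime has collected generation 0 so far.
        // The exact figure is not predictable.
        Console.WriteLine(GC.CollectionCount(0));
    }
}
```

There is no counterpart to the C++ delete operator anywhere in this code. Simply letting go of the buffers is enough.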

Depending on what sorts of languages you may have worked with before, the idea that the language depends heavily on the runtime might seem either completely natural or somewhat disconcerting. It’s certainly different from how C and C++ work—with those languages, the compiler’s output can be executed directly by the computer, and although those languages have some runtime services, it’s possible to write code that can run without them. But C# code cannot even execute without the help of the runtime. Code that depends entirely on the runtime is called managed code.

Managed compilers do not produce raw executable code. Instead, they produce an intermediate form of code called IL, the Intermediate Language.[2] The runtime decides exactly how to convert it into something executable. One practical upshot of managed code is that a compiled C# program can run on both 32-bit and 64-bit systems without modification, and can even run on different processor architectures—it’s often possible for code that runs on an ARM-based handheld device to run unmodified on Intel-based PCs, or on the PowerPC architecture found in the Xbox 360 game console.
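One simple way to observe this is to ask the runtime how wide a pointer is in the current process. The sketch below assumes the program was compiled to target any CPU rather than one specific architecture; the class and method names are ours.

```csharp
using System;

class Bitness
{
    // IntPtr.Size is 4 in a 32-bit process and 8 in a 64-bit one.
    public static int PointerBits()
    {
        return IntPtr.Size * 8;
    }

    static void Main()
    {
        // The same compiled program can report different values on
        // different machines, because the runtime chooses how to turn
        // the IL into executable code.
        Console.WriteLine("Running as a " + PointerBits() + "-bit process");
        Console.WriteLine("CLR version: " + Environment.Version);
    }
}
```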

As interesting as CPU independence may be, in practice the most useful aspect of managed code and IL is that the .NET runtime can provide useful services that are very hard for traditional compilation systems to implement well. In other words, the point is to make developers more productive. The memory management mentioned earlier is just one example. Others include a security model that takes the origin of code into account rather than merely the identity of the user who happens to be running the code; flexible mechanisms for loading shared components with robust support for servicing and versioning; runtime code optimization based on how the code is being used in practice rather than how the compiler guesses it might be used; and as already mentioned, the CLR’s ability to verify that code conforms to type safety rules before executing it, ruling out whole classes of security and stability bugs.

If you’re a Java developer, all of this will sound rather familiar—just substitute bytecode for IL and the story is very similar. Indeed, a popular but somewhat ignorant “joke” among the less thoughtful members of the Java community is to describe C# as a poor imitation of Java. When the first version of C# appeared, the differences were subtle, but the fact that Java went on to copy several features from C# illustrates that C# was always more than a mere clone. The languages have grown more obviously different with each new version, but one difference, present from the start, is particularly important for Windows developers: C# has always made it easy to get at the features of the underlying Windows platform.

Continuity and the Windows Ecosystem

Software development platforms do not succeed purely on their own merits—context matters. For example, widespread availability of third-party components and tools can make a platform significantly more compelling. Windows is perhaps the most striking example of this phenomenon. Any new programming system attempting to gain acceptance has a considerable advantage if it can plug into some existing ecosystem, and one of the biggest differences between C# and Java is that C# and the .NET Framework positively embrace the Windows platform, while Java goes out of its way to insulate developers from the underlying OS.

If you’re writing code to run on a specific operating system, it’s not especially helpful for a language to cut you off from the tools and components unique to your chosen platform. Rather than requiring developers to break with the past, .NET offers continuity by making it possible to work directly with components and services either built into or built for Windows. Most of the time, you won’t need to use this—the class library provides wrappers for a lot of the underlying platform’s functionality. However, if you need to use a third-party component or a feature of the operating system that doesn’t yet have a .NET wrapper, the ability to work with such unmanaged features directly from managed code is invaluable.
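As a small taste of what calling unmanaged code looks like, the following sketch declares the Win32 GetTickCount function from kernel32.dll and calls it as though it were an ordinary static method. It runs only on Windows, and the class and wrapper method names are invented for the illustration. In practice you would probably use the class library's Environment.TickCount for this particular job, which rather proves the point about wrappers.

```csharp
using System;
using System.Runtime.InteropServices;

class NativeCall
{
    // Tells the runtime where to find the unmanaged function.
    [DllImport("kernel32.dll")]
    static extern uint GetTickCount();

    // Hypothetical wrapper: milliseconds since the system started.
    public static uint MillisecondsSinceStartup()
    {
        return GetTickCount();
    }

    static void Main()
    {
        Console.WriteLine("Milliseconds since startup: " + MillisecondsSinceStartup());
    }
}
```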

Note

While .NET offers features to ease integration with the underlying platform, there is still support for non-Windows systems. Microsoft’s Silverlight can run C# and VB.NET code on Mac OS X as well as Windows. There’s an open source project called Mono which enables .NET code to run on Linux, and the related Moonlight project is an open source version of Silverlight. So the presence of local platform integration features doesn’t stop C# from being useful on multiple platforms—if you want to target multiple operating systems, you would just choose not to use any platform-specific features.

So the biggest philosophical difference between C# and Java is that C# provides equal support for direct use of operating-system-specific features and for platform independence. Java makes the former disproportionately harder than the latter.

The latest version of C# contains features that enhance this capability further. Several of the new C# 4.0 features make it easier to interact with Office and other Windows applications that use COM automation—this was a weak spot in C# 3.0. The relative ease with which developers can reach outside the boundaries of managed code makes C# an attractive choice—it offers all the benefits of managed execution, but retains the ability to work with any code in the Windows environment, managed or not.

C# 4.0, .NET 4, and Visual Studio 2010

Since C# favors general-purpose language features designed to be composed with one another, it often doesn’t make sense to describe individual new features on their own. So rather than devoting sections or whole chapters to new features, we cover them in context, integrated appropriately with other, older language features. The section you’re reading right now is an exception, of course, and the main reason is that we expect people already familiar with C# 3.0 to browse through this book in bookstores looking for our coverage of the new features. If that’s you, welcome to the book! If you look in the Preface you’ll find a guide to what’s where in the book, including a section just for you, describing where to find material about C# 4.0 features.

That being said, a theme unites the new language features in version 4: they support dynamic programming, with a particular focus on making certain interoperability scenarios simpler. For example, consider the C# 3.0 code in Example 1-3 that uses part of the Office object model to read the Author property from a Word document.

Example 1-3. The horrors of Office interop before C# 4.0

// Requires using directives for System and System.Reflection.
static void Main(string[] args)
{
    var wordApp = new Microsoft.Office.Interop.Word.Application();

    object fileName = @"WordFile.docx";
    object missing = System.Reflection.Missing.Value;
    object readOnly = true;
    Microsoft.Office.Interop.Word._Document doc =
        wordApp.Documents.Open(ref fileName, ref missing, ref readOnly,
            ref missing, ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing);

    object docProperties = doc.BuiltInDocumentProperties;
    Type docPropType = docProperties.GetType();
    object authorProp = docPropType.InvokeMember("Item",
        BindingFlags.Default | BindingFlags.GetProperty,
        null, docProperties,
        new object[] { "Author" });
    Type propType = authorProp.GetType();
    string authorName = propType.InvokeMember("Value",
        BindingFlags.Default | BindingFlags.GetProperty,
        null, authorProp,
        new object[] { }).ToString();

    object saveChanges = false;
    doc.Close(ref saveChanges, ref missing, ref missing);

    Console.WriteLine(authorName);
}

That’s some pretty horrible code—it’s hard to see what the example does because the goal is lost in the details. The reason it is so unpleasant is that Office’s programming model is designed for dynamic languages that can fill in a lot of the details at runtime. C# 3.0 wasn’t able to do this, so developers were forced to do all the work by hand.

Example 1-4 shows how to do exactly the same job in C# 4.0. This is a lot easier to follow, because the code contains only the relevant details. It’s easy to see the sequence of operations—open the document, get its properties, retrieve the Author property’s value, and close the document. C# 4.0 is now able to fill in all the details for us, thanks to its new dynamic language features.

Example 1-4. Office interop with C# 4.0

// Requires a using directive for System.
static void Main(string[] args)
{
    var wordApp = new Microsoft.Office.Interop.Word.Application();

    Microsoft.Office.Interop.Word._Document doc =
        wordApp.Documents.Open("WordFile.docx", ReadOnly: true);
    dynamic docProperties = doc.BuiltInDocumentProperties;
    string authorName = docProperties["Author"].Value;
    doc.Close(SaveChanges: false);

    Console.WriteLine(authorName);
}

This example uses a couple of C# 4.0 features. It uses the new dynamic keyword for runtime binding to members. It also uses the support for optional arguments, together with named arguments such as ReadOnly: true to pick out the ones it wants to set. The Open and Close methods take 16 and 3 arguments, respectively, and as you can see from Example 1-3, you need to provide all of them in C# 3.0. But Example 1-4 provides values only for the arguments it wants to set to something other than the default.

Besides using these two new features, a project containing this code would usually be built using a third new interop feature called no-PIA. There’s nothing to see in the preceding example, because when you enable no-PIA in a C# project, you do not need to modify your code—no-PIA is essentially a deployment feature. In C# 3.0, you had to install special support libraries called primary interop assemblies (PIAs) on the target machine to be able to use COM APIs such as Office automation, but in C# 4.0 you no longer have to do this. You still need these PIAs on your development machine, but the C# compiler can extract the information your code requires, and copy it into your application. This saves you from deploying PIAs to the target machine, hence the name, “no-PIA”.

While these new language features are particularly well suited to COM automation interop scenarios, they can be used anywhere. (The “no-PIA” feature is narrower, but it’s really part of the .NET runtime rather than a C# language feature.)
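To show that nothing about these features is specific to COM, here is a minimal sketch that uses optional arguments, named arguments, and dynamic with ordinary .NET code. The Greeter class and its methods are invented for this illustration.

```csharp
using System;

class Greeter
{
    // 'greeting' and 'punctuation' are optional, so callers may omit them.
    public static string Greet(string name, string greeting = "Hello",
                               string punctuation = "!")
    {
        return greeting + ", " + name + punctuation;
    }

    // With dynamic, the Length member is looked up at runtime, so this
    // works for any object that turns out to have a suitable property.
    public static int LengthOf(dynamic value)
    {
        return value.Length;
    }

    static void Main()
    {
        Console.WriteLine(Greet("world"));                     // Hello, world!

        // A named argument lets us skip 'greeting' but still set 'punctuation'.
        Console.WriteLine(Greet("world", punctuation: "?"));   // Hello, world?

        Console.WriteLine(LengthOf("abc"));                    // 3
        Console.WriteLine(LengthOf(new int[5]));               // 5
    }
}
```

Passing an object with no Length member to LengthOf would compile but fail at runtime, which is the trade-off dynamic binding makes.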

Summary

In this chapter we provided a quick overview of the nature of the C# language, and we showed some of its strengths and how the latest version has evolved. There’s one last benefit you should be aware of before we get into the details in the next chapter, and that’s the sheer quantity of C# resources available on the Internet. When the .NET Framework first appeared, C# adoption took off much faster than the other .NET languages. Consequently, if you’re searching for examples of how to get things done, or solutions to problems, C# is an excellent choice because it’s so well represented in blogs, examples, tools, open source projects, and webcasts—Microsoft’s own documentation is pretty evenhanded between C# and VB.NET, but on the Web as a whole, you’re far better served if you’re a C# developer. So with that in mind, we’ll now look at the fundamental elements of C# programs.



[1] VB.NET supports XML literals. Since C# 2.0 shipped, the C# and VB.NET teams have operated a policy of keeping the feature sets of the two languages similar, so the fact that VB.NET picked up a feature that C# abandoned shows a clear difference in language philosophy.

[2] Depending on whether you read Microsoft’s documentation, or the ECMA CLI (Common Language Infrastructure) specifications that define the standardized parts of .NET and C#, IL’s proper name is either MSIL (Microsoft IL) or CIL (Common IL), respectively. The unofficial name, IL, seems more popular in practice.

