Talk given November 3, 1997 at Computer Literacy Bookstore in San Jose, CA
Bjarne Stroustrup is the creator of the C++ language and a graduate of the University of Aarhus, Denmark and Cambridge University, England. He is currently the head of AT&T Labs’ Large-scale Programming Research Department and is an AT&T Bell Laboratories Fellow. Dr. Stroustrup is the author of The Annotated C++ Reference Manual, The Design and Evolution of C++ and the newly revised classic The C++ Programming Language, 3rd Edition.
Introduction
Good evening. I would like to introduce Bjarne Stroustrup, the designer and original implementer of C++. He is the author of The C++ Programming Language, first, second, and third editions; The Annotated C++ Reference Manual, often called the ARM; and The Design and Evolution of C++, which is one of my favorite books of all time. He's a graduate of Cambridge University with a Ph.D. in Computer Science. He's the head of AT&T Labs Large-Scale Programming Research Department and an AT&T Bell Labs Fellow. His interests are distributed systems, operating systems, simulation, design, and programming. I now give you Dr. Bjarne Stroustrup.
Slide
#1: Dr. Stroustrup's Opening Comments
Slide
#2: Overview Learning C++
Slide
#3: Why would anyone care what programming language was used to implement a system?
One question that you may like to think a little about is why would anybody care which programming language you use? I mean, the user can't see what language you used. Even if he could, he shouldn't, in my opinion. I don't have to know in detail how the engine of my car works. If, when driving a car, I can recognize what engine it has, there's something wrong with that engine. It's intruding itself on my consciousness in a way it's not supposed to. I just want to drive a car, I don't want to know which programming language was used to program its fuel injectors. I don't care; I want it to work; I want it to be cheap; and I preferably want it delivered yesterday.
System suppliers have a whole list of questions that they are interested in. The suppliers resemble users, to a certain extent, but they have slightly more detailed views.
Programmers have a lot of opinions, as you all know, but why do their opinions matter? Why, as non-programmers, should we care? I think that, as programmers, we ought to have a good answer to that question, because if it doesn't matter, then the managers are right in just hiring the cheapest people they can find and telling them exactly what to do and how to do it. We ought to have a higher degree of professionalism. We can only do that if we can answer simple questions like this.
Slide
#4: Style of code matters for program structure.
My answer is that the style of code matters because it helps us to express program structure. The internal structure of a program is crucial. It is unseen by the user, but it is what determines how you can maintain things, how you can change things to fit new conditions, for different computers, for different user communities, and for different natural languages. If my program has to run in Finland, its users might object to having it spew English text. Or if I'd written it to produce Danish, which would come naturally to me, you might find it annoyingly hard to understand. All of these issues - the maintainability, the extensibility, the portability - depend on internal structure. Internal structure can be enhanced or obscured by a language. That's why our programming technique and our programming languages matter.
My conjecture is that a programming language can support a clean internal structure by making it easier to express that structure. If we have to write in, say, hexadecimal machine code, we might have an idea, but that idea is unlikely to come unmodified into the code. As we raise the level of programming languages, and when we provide more facilities for expressing structure, there's a greater chance that the structure is actually in the code the way we thought about it in design.
We can also automate frequent tasks. There are lots of boring, unpleasant things that you'd rather not do if you could help it. In the old days you had to load the registers yourself before you could add values. And you had to say whether you were adding integers or floating point numbers. After a day you could have written a thousand instructions and they worked. After an effort like that you could feel good - I actually love writing assembler. However, I think if you can write a couple of thousand lines of code in a day, then there's typically something wrong. You should have had a neat little tool that helped you to get it done in half an hour. You don't want to do simple mechanical tasks by hand just because you can do them easily.
I think a programming language must allow you to do more than its designers envision. That is, you can't build a language that has built-in all the neat things you would ever need. And this, of course, is what most commercial organizations try to do: to give you exactly what you want. No more, no less. I think it is impossible to give people all they need. No designer, no organization can know enough about the needs of system builders to put in everything. Therefore, extensibility becomes crucial. In this case, I expressed it as "programmers have to create their own concepts." These concepts are expressed by things like classes, modules, templates - whatever they're called in various languages. So, this is one thing a language can do.
The second is simply to make a concept explicit so that it can be seen. I don't know if you have experience with code generators that allow you to work from a very high level but generate low-level code from which you can't get back to the high level again. In that case, you have to maintain some rather unpleasant machine-generated code in a low-level language. I don't like that myself.
Most of the time, I'm not actually writing code. I'm trying to figure out what the code is doing, either my code or somebody else's code. I like a language to help by making the code more amenable to analysis. An optimizer is simply something that analyzes the code to figure out how to remove inefficiencies from it. You can have analyzers that create dependency graphs, you can have analyzers that find atypical structures, non-idiomatic expressions. You want as much help as you can get when building systems, and a programming language can help by making it possible to write programs that analyze our code.
These are some of my first level, and most important, most fundamental issues. As usual, discussions of really fundamental issues are somewhat indistinguishable from arm waving. So, I'll go down one level and get to something that I know a bit about: how and why did C++ come about?
Slide
#5
My comments about programming languages in general are both the starting point of C++ and my conclusion from years of work with C++. I had the fundamental notions before I started C++, and I learned much more over the years. Here's the original idea about C++. I wanted C's strength for a systems programming language, and I wanted Simula's facilities for program organization. Simula is the source of most of what we call object-oriented programming and object-oriented design these days. It's a very interesting language, and when you look back, its designers (Dahl and Nygaard) had a clear feeling, not just of language issues and programming issues, but also of the design issues. They saw the programming language in the context of the design techniques. I think that's something that is very, very often forgotten in the haste of getting the next release out.
Unfortunately, having gotten a fair bit of experience with Simula, I found I simply couldn't afford to run my programs written in it. Then, I came to the conclusion that I did not want to have the choice between writing code that's elegant and code that is efficient. C++ is an attempt to ensure that I can do most things elegantly, and that most of the things that I can do elegantly, I can also afford to use. For things I'm doing, C++ is a pretty good approximation to that ideal.
Slide
#6: C with Classes - why C? but: improve static type checking
When C++ first came about, I called it "C with Classes." C because C was the best systems programming language around: it was efficient, it was portable, it was very flexible. It was available, and it was known. C was not as well known as it is today, however. A lot of people asked: "C? Why didn't you use Pascal like everybody else?" My answer was, "Well, I don't like Pascal. It's a straitjacket. I'm not looking for a straitjacket, I'm looking for flexibility, efficiency, portability and so on." Some then said: "But C's pretty awful." My answer to that was: "No, no. C has some pretty awful parts. I don't like the sloppy type checking; I don't like its declarator syntax; I don't like the conversion rules for built-in types. However, these are second-order issues. I've never noticed that these things stop people from writing a good program. On the other hand, I have seen an overly rigid type system stopping people from writing a good program. In particular, Pascal has stopped me from writing good programs. So, these problems with C are second-order things which you can deal with."
However, one of the first things I did to build C with Classes from C was to improve the static type system because I don't really want "sqrt(2)" to mean "segment violation." I would like to make sure that run-time errors happen as infrequently as possible.
Slide
#7: C with Classes - why Classes?
The other part of "C with Classes" that became C++ was the classes; the Simula aspects of program organization; the notion that you write code by figuring out what your concepts are, and mapping them into classes in your program; taking the view that a class is a type, and static checking. I just mentioned that I found Pascal's static checking unpleasant and unhelpful. It took me some time to realize why stronger static checking in Simula didn't bother me at all. The reason is that the Simula type system is extensible and flexible.
So, what this static type checking does is to check my rules for my (user-defined) types, rather than some language designer's rules for the built-in types he provided. And that makes all the difference. If you have an extensible system, you don't have a straitjacket, you have an enforcement of your own rules. This is a radical difference.
Slide
#8: C++ Design Rules
Over the years, a fair number of rules came about for the design of C++. You can't just
sit down and design a language. As soon as you have a little bit of success,
everybody comes and wants the language to be stable except for two things
they absolutely want added to it. So, you can't just design the language,
sort of from day to day. You have to slowly build up a set of rules by
which you live. Rules allow you to say "yes, your suggestions are very
nice ideas, but the rules I operate under are these, and those suggestions
don't quite fit."
I consider theory
a great constraint on solutions. But it's not a good guide to choosing
the problems you want to solve. So, you look for real problems or you
wait for real problems to come to you. Then, you use all the relevant
theory you can find to make sure that the solution you choose fits this
problem.
There has been a fair
amount of thinking about these great languages we'll need to solve these
great problems we get in a few years for these great machines we'll have
in a few years, and for these programmers that are so much smarter than
the "dumb folk who are programming now." I decided that I know the machines
I've got now, I know my friends that are writing programs, I know the
problems they're dealing with, and the systems they use. Maybe I can do
something to help - and I'll leave the far future to people who have working
crystal balls. So, in designing C++, I tried to deal with problems that
I knew, current problems, and I tried to make C++ useful now. I supported
my first non-research user six months after I started C with classes.
Another thing I decided
was, that I was designing a programming language, not an operating system,
not a filing system, not a user interface system, etc., etc. There are
a lot of questions you can't ask about C++, like "what's a binary file?"
It's not a language issue, it's a system issue. On the other hand, we
can take C++ and use it on a variety of systems. I found that aspect of
C++ particularly useful for writing code in the areas where you might
not have an operating system. I'm not sure that a fuel injector has been
programmed in C++, but I know there's a lot of gadgetry that you can carry
around in your hand that has.
Slide
#9: C++ is a Better C
The net effect of
all of this is that C++ is a better C - meaning that it is a language
where you do the things you usually do in C, in better ways, without additional
overhead and without restrictions on what you can do. It supports data
abstraction - the notion of having concepts represented directly. It supports
object-oriented programming - from Simula in the form of class hierarchies
and use of such hierarchies. And it supports generic programming - the
ability to parameterize types and functions with other types, which is
very useful.
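As a small illustration of the generic programming mentioned above, here is a minimal sketch of a function parameterized by an element type. It is not taken from the talk's slides; the name `largest` is a hypothetical helper chosen for the example, and the only assumption about `T` is that it supports `operator<`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the talk): returns the largest element
// of a non-empty vector. T may be any type that supports operator<.
template<class T>
const T& largest(const std::vector<T>& v)
{
    assert(!v.empty());               // precondition: at least one element
    std::size_t best = 0;
    for (std::size_t i = 1; i < v.size(); ++i)
        if (v[best] < v[i]) best = i; // only operator< is required of T
    return v[best];
}
```

The same source text serves `std::vector<double>` and `std::vector<std::string>` alike; the compiler generates a separate, statically checked version for each element type.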
Slide
#10
This is how you
almost certainly would write a solution in C. I'm assuming you've all
seen C before.
Slide
#11
You have some variables,
like the double you read in, the mean, and the number of elements. You
read them in and check them each time and update the running mean. That's
very easy. Then you sort the array and output the median. Now, if you
see this in the second week of a first programming course, or the first
week of a course in a programming language you've never seen before, preventing
overflow is actually somewhat difficult. You have to figure out how memory
is managed, how to extend the input buffer or prevent it from overflowing.
The average novice does not get it right the first time. Or the second
time.
Explaining qsort to
somebody who is not a programmer - or hasn't encountered C before - is
a nightmare. I want to sort the buffer, right? Why do I have to say "n,"
"size of double," and "compare?" Is this machine so stupid it doesn't
know how to compare two doubles? The answer is: "yes it is that stupid."
You have to write the compare function yourself - and the compare function
is almost as big as the code you see here. Next, it doesn't know the size
of the double? Well, you see, really qsort() doesn't know it's sorting
doubles. Now you're really in trouble! The clever students know that you're
waffling. This is not good. Yes, it can be done, yes it's been done a
million times, at least. That's not the point. The point is that we are
getting into hot water very early on for no apparent good reason. And
so, I'm going to show how we can do it, using C++ as it's being defined
now by the draft standard. And, see how little we have to know to get
it done.
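The slide with the C-style program is not reproduced in this transcript. The following is only a rough sketch of the qsort part of such a solution, written so that it also compiles as C++. The names `compare_doubles` and `median_c_style` are hypothetical, and the input loop with its buffer management is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Comparison callback required by qsort. It receives untyped pointers,
// so the casts back to double are the programmer's responsibility.
int compare_doubles(const void* p, const void* q)
{
    double x = *static_cast<const double*>(p);
    double y = *static_cast<const double*>(q);
    if (x < y) return -1;
    if (y < x) return 1;
    return 0;
}

// Hypothetical helper: sorts buf in place and returns the median.
// The caller must pass the element count; qsort additionally needs the
// element size and the comparison function spelled out.
double median_c_style(double* buf, std::size_t n)
{
    assert(buf && n > 0);
    std::qsort(buf, n, sizeof(double), compare_doubles);
    std::size_t mid = n / 2;
    return (n % 2) ? buf[mid] : (buf[mid - 1] + buf[mid]) / 2;
}
```

Every argument to `qsort` here is one of the things the paragraph above says a novice has to have explained: the count, the size of a double, and the comparison function.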
Slide
#12
This program is short,
it is safe, it is simple, and it is easy to explain. What I mean by safe
is that if you write sloppy code, it comes back to you fast with a
compile-time error, rather than waiting until run time.
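The slide itself is not reproduced in this transcript. As a hedged reconstruction of the style being described, the sketch below reads numbers into a `std::vector`, computes the mean, sorts with `std::sort`, and picks out the median. The names `Stats`, `mean_and_median`, and `mean_and_median_of_text` are assumptions made for this example, not names from the talk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct Stats { double mean; double median; };

// Hypothetical helper: reads whitespace-separated doubles until the input
// ends (or a non-number appears), then returns the mean and the median.
Stats mean_and_median(std::istream& in)
{
    std::vector<double> buf;
    double sum = 0;
    double val;
    while (in >> val) {
        buf.push_back(val);                 // the vector grows as needed
        sum += val;
    }
    assert(!buf.empty());                   // precondition: some input was read
    std::sort(buf.begin(), buf.end());      // uses < on double; no callback
    std::size_t mid = buf.size() / 2;
    double median = (buf.size() % 2) ? buf[mid]
                                     : (buf[mid - 1] + buf[mid]) / 2;
    Stats s = { sum / buf.size(), median };
    return s;
}

// Convenience wrapper for trying the function on a literal string.
Stats mean_and_median_of_text(const std::string& text)
{
    std::istringstream in(text);
    return mean_and_median(in);
}
```

Calling `mean_and_median(std::cin)` gives the interactive version. There is no buffer size to choose, no element size to pass, and no comparison function to write.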
One question people
have asked is "what about efficiency?" I mean, it costs more to make a
vector and then expand it gradually, than just putting values into a fixed-sized
buffer. It doesn't actually cost that much more than to use the malloc+realloc
solution, but it costs a bit more. Fortunately, this sort runs about five
times the speed of qsort. That's primarily because you don't have to call
the stupid compare function; you just use the less-than operator built
into the language.
That's basically it.
You have two examples. The first one, unfortunately, is the traditional
way in which we teach programmers in languages such as C or Pascal. The
second one is what we could do. One of the problems we had is we could
only teach really elegant, easy style like this in languages that didn't
quite scale. If you're a LISP or a Smalltalk programmer, you just yawn
and say "of course, we've been doing this for a couple of decades". Yeah,
but in Standard C++ this can be done within a framework that expands to
do all the things that C and C++ have been used for, and you don't pay
in run time, so you can actually afford to do things right. As I said
before, I don't want to force people to choose between elegance and efficiency.
Slide
#13
I think education
is key - not just training. Organizations like to send their programmers
on a one-week training course. This will allow them to write using C syntax
instead of using Pascal syntax, or Pascal syntax instead of C syntax.
It will not teach them any new techniques or teach them to think in different
ways. They'll come up with equivalent solutions in different languages.
In other words, they haven't learned that much.
I think education,
not just training, is the key. And the way to deal with that is to focus
on concepts and techniques, teaching language features later. When people
come and say I'm drowning in language features, usually what they mean
is "I'm trying to use all these features, and I can't figure out what
they're supposed to be doing." Well, if you don't know what the features
are supposed to do, what are you doing with them? First you have a problem,
then you have the concepts for the solution, and then you look for the
tools to solve them.
Programming language
features have to be seen in the context of programming and design techniques.
Initial teaching should be based on higher-level data structures, algorithms,
the standard library. The understanding of lower-level facilities will
come later, at least for a lot of people, after
the fundamental issues are in place. Yes, we can deal with pointers and
arrays and free store. But let's deal with them after we know how to call
a function, how to declare a variable, how to write a loop, and how to
stay out of trouble, and not before we need to get into
the lower-level facilities.
I'll take questions
now.
Question and Answer
Q: Why does C++ not have a realloc?
A: Well, it has a realloc, if you use malloc, which you shouldn't. But, basically, you don't need realloc in C++ because the standard data containers, such as vectors and lists, expand with the technique I just showed you. So instead of declaring a simple array, figuring out you've run out of space, and then trying to realloc, you simply use a vector that keeps expanding as needed. This, of course, is implemented at a lower level using something that looks rather like realloc. But it is far less error-prone and usually more efficient. Most people don't seem to realize that realloc moves every element, sometimes. They get bitten when they have a pointer into the array, then they realloc the array, and then they get surprised the array moved. So, yes, you can use malloc/realloc in C++, but you shouldn't. And you needn't because this is a better facility - there are safer and more convenient ways of getting the same thing done.
Q: In the language Common LISP there is a notion of closure in which one writes an ad hoc function, that has a lot of lexical context at the point of the function generation. And its proper use is to pass this function around...
A: I know what a closure is, and there are dialects of C with nested functions that try to get to that idea. C++, standard C++, does not have closures or nested functions and won't get them, at least not for the next five or ten years. Part of the problem is how to define the context well enough, and part of the problem is that you can get too much context and get obscure code. That at least is the traditional answer in the C++ and C world. And in a lot of areas that is a reasonable answer. On the other hand, there are a lot of algorithms where it's nice to have the equivalent of a closure. The simplest case is a "for each" in which you do something to all elements of a sequence.
Another example is taking the sum of a set of elements; where do you get the context to put the sum in and return it? Or compare two sets: how do you specify what the comparison criterion is? In C++, you generally use an object that acts like a function when given to an algorithm such as for_each, accumulate, or compare. This is directly supported in the standard library through "function objects." Some people call them "functors." There's a variety of names for them, but basically, you define a class of which you can initialize objects. This initialization explicitly gives an object its initial state. That is, you don't pick up the context, where a context is everything that could affect you. You initialize an object with a specific set of elements (from the context) that defines what the object can refer to. Then, you go and apply the call operator on the object, and at the end of the algorithm you can call any of the functions that you have defined for that object to extract information from it. That way you can pass a fair bit of context through many iterations, for instance. So, there are no closures in C++, and no nested functions. However, there are function objects that in many areas serve the same purpose. Function objects allow us to approximate some functional programming techniques, and actually do it fairly elegantly and very efficiently.
Q: So now that machines are fast, how hard is it to run LISP? Also, Java has a garbage collector; is that going to be offered in a future version?
A: Machines have become faster every year for about the last 40 years or so. My experience is that the only thing that grows faster than hardware cycles is human expectation. I think that throwing away a factor of 5, 10, 30, 50 of efficiency is acceptable in some areas. But there are many areas where it is not acceptable.
Traditionally, I have been most interested in applications that are rather demanding on time or space, and for those the basic efficiency of a programming language matters. Essentially, I think there are three parts to your question. One is the machine efficiency, another is LISP, and the third is garbage collection. I think they are separate issues. I am sure that today there are many things that you could do in LISP that you couldn't get away with doing in LISP ten years ago. LISP happens not to be my favorite language for most of the things that I'm doing. But clearly the hardware improvements help all languages including LISP. I think there are some very nice aspects of LISP that have been quietly forgotten, or maybe people have been ruling out LISP for other reasons. Now, garbage collection is a much more interesting issue. If you like garbage collection, it's usually because you are doing things where you can afford it. And I'm sure we can afford it in many more applications than people think. If I wanted a garbage-collected language, I would, out of personal inclination, probably use C++. There are pretty good garbage collectors available for C++. They're pretty good, meaning at least as good as what you can get for any other language. And they work. I tried to get the standards committee to write into the text the fact that garbage collection could be used and to document two or three positions that have to be explicit for using garbage collection in C++. The third of the C++ committee that really likes C had conniptions. Therefore the fact that garbage collection is a valid implementation technique for C++ is still implicit in the standard. There is nothing in C++ that prohibits garbage collection. Various people, such as me, have been going around saying that you can actually do it for about 15 years. You can get a very nice free garbage collector and you can also get commercial ones if you need support and things like that.
My guess is that we'll see significant experimentation with different garbage collectors for C++ in the next few years. It's already happening.
Q: Now, if I can just add on to that question... perhaps the largest remaining weakness in C++ is the fact that it started out in C, and there are still various vestiges of C in the language that are unpleasant. Have you got any thoughts on designing, once C++ is finished, a successor language that may correct some of those things? High on my list would be pointer arithmetic.
A: In the early years of C++, I used to go around with a slide that had two lists: advantages and disadvantages. You found C prominent in both columns. There's no doubt that some of the main problems that people have had with C++ have to do with its C heritage. On the other hand, there is no doubt that one of the things that attracts people the most - and is seen as most important by a lot of people - is the C heritage. I am opposed to anything that decreases the compatibility between C and C++. I hope that the C community will have a similar attitude toward C++. I have no plans of going to another language. I think that people that go and build general purpose programming languages must be nuts. It's a field that has essentially a zero success rate, and if you - against the odds - have any degree of success you get wrapped up in a lot of unpleasant work. Or, you get wrapped up in some commercial unpleasantness and hype that warps people's brains. My only excuse for getting into the language design business is that I didn't mean to. I had some specific problems that I looked for solutions for: that's how C++ came about. Should I do another language, it would be because I've gotten myself into a hole that I didn't know any other way out of. I am not in such a hole now. I think for real problems the conversion rules between the various integer types rank very high on my list of annoyances in C and C++. For sheer annoyance, the syntax also comes high.
But the syntax you get used to after a week. The only people I really worry about are people who are proud that they can write things like the definition of a function returning a pointer to function without using a typedef or looking in a manual. When things like that become an issue of pride, there's something wrong. I think that C's model of arrays and pointers is actually very fundamental and has a very good match to real hardware, so I don't particularly want to throw that away. So I would think very hard before doing something like that, but I don't have to think very hard because I'm probably not going to touch it.
Q: You talked about extensibility as being important for C++, and for the evolution of C++... have you done any thinking about having the structures, the actual programmatic structures being extensible as well, such as in languages like Forth, where you can create your own if-thens?
A: I thought about that a long time ago in several contexts. I would very much have liked to integrate a variety of control structures. I was working on concurrency at the time and I was also thinking of having a more flexible expression syntax. Consequently, I looked at languages where you could define control structures. That was about fifteen years ago. I looked at the examples that the designers put forward as good examples. I found them hard to read. That was discouraging. I talked to people who had written in a language where they could define their own control structures and operators. Basically, they said: "don't do it." And so I set out to design a language that was extensible, but not mutable. Yes, it was a deliberate design decision. By the way, since I'm in a bookstore, I guess I should point out that I have yet to answer a question that is not answered in this book (The Design and Evolution of C++). I took the time a few years ago to actually sit down and think what is this, why is it here, how did it come about, and to write it down.
Q: How would you compare C++ to Java?
A: I will answer one question about Java. And I'll try to give a reasonably exhaustive answer, instead of getting into all kinds of lengthy discussions. Before Java there were other languages that offered similar facilities - Smalltalk, Modula-3, Eiffel and such. So Java doesn't look as new to me as it does to some people. I dislike hype rather strongly. And I think Java is floating on hype. I dislike proprietary corporate languages rather strongly and Java is one of those. So that might color my answer a bit. I think Java as a programming language is a fairly uninteresting programming language, not up to, say, Smalltalk and not up to Modula-3. What it had that was different from those is a lot of corporate backing and some timing to do with the Internet. For a while, as far as I understand, Java was a solution looking for a problem, and the Internet came just at the right time. From a language perspective, I often hear that Java is so very much like C++ - just with the errors fixed. I even heard that Java is the programming language that Bjarne would have designed if he didn't have the C compatibility concerns. This is not so. If you read anything I've been writing for the last 5 years or so, even as far back as 85, you see that Java doesn't have some of the things that are really useful in programming languages. The ability to efficiently provide new primitives. Try writing a new string class in Java, that you could use to replace the existing one. Having true local variables, having user-defined and built-in types work according to the same rules. Having standard type-safe containers. Java is still incomplete and growing. Java is only just starting down the road to encompass some facilities provided by C++ - just as other languages have done. No doubt we'll see a lot of extensions to Java. I don't think Java's claims of simplicity make sense.
It is simpler than older languages, but that's partly because Java is new and incomplete. It's partly because the complexity is in the libraries rather than in the language. I don't believe that the efficiency will get closer to C++ than a factor of 2.5 to 3; Java's object model will get in the way, even when using proper compilation. I don't believe in the portability of Java, because the portability goes out the window the second you start using system resources. That was a long answer and it wasn't particularly positive to Java. I guess I don't have to apologize too much for that because you've heard some of the things which Java evangelists have been writing and saying about C++. So it is not unreasonable that I should at least be given two or three minutes to give my side of the story, even if I cannot - and would not - take out television ads telling you not to use any programming language but mine. One of the design criteria for C++ was coexistence with other languages. This is another criterion that Java hasn't considered - or rather, has rejected. Well, I'll tell you a story. I had an agent from one of the big publishers phone me a couple of weeks ago, who said: "C++ book sales are dramatically up, not just your book, and I was just thinking, could it be refugees from Java?" I very sadly had to explain to him that no, that could not be the case because there certainly weren't enough of them.
Q: What dramatic productivity increases do you foresee for the next ten to twenty years?
A: I don't foresee any dramatic productivity increases. What I see is a steady improvement. An individual or an organization can get dramatic productivity increases, depending, of course, on what you think dramatic is. In my world, we consider a halving of the error rate, for the same amount of effort, dramatic, because it halves the cost. And that we saw by introducing C++ of the 1990s vintage. I think it's reasonably easy to argue for doublings here and there.
But it depends where you start from. I mean, if you're writing C code, use C++ as a better C. Use vectors instead of arrays. Use the standard algorithms instead of the C library functions. You'll probably double your productivity by some measure or other. Don't ask me for orders of magnitude of improvement. You get orders of magnitude of improvement by getting a lot of little things right. Look at engineering. Cars did not become as fuel efficient as they are now through a dramatic change in engine technology that happened overnight. There were lots of little improvements. If you look for revolutions, you just go from one fad to the next.
Q: What improvements do you see coming next?
A: It depends where you're starting from. For the things I think about most, I still have too much code using pointers and C arrays. And I want to clean that up. I have too much code that splatters all over the global name space. I would like to experiment more with exceptions. That's a little bit trickier to introduce into old code, just as I never did find a really good way of building class hierarchies into an existing system. It is rare that I get to write a system completely from scratch.
Q: This is an easy, light question. When you started creating the language, what kind of a machine did you develop it on?
A: I started out on a PDP 11/70 with 128K of memory. I realized after a while that it was probably not necessary to make do with as little as that. I thought that in the future people would be able to afford 1 Meg and 1 MIPS. And so I built C++ and its first compiler to fit that. Fortunately, I overshot the target, so my first C++ compiler could run on an IBM AT. I was actually the first person to put C++ on a PC. In particular, I put it on an AT. And then I did one of the few clever things I ever did: I threw the port away, so that I wouldn't have to maintain it in that environment and could get on with my real work of improving C++ and the ways of using it.
Thank you.
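A note on the closure question above: the function-object technique described in that answer can be sketched roughly as follows. This is an illustration under assumptions, not code shown in the talk. The class name `Sum` and the helper `sum_of` are hypothetical; `std::for_each` is the standard algorithm, which returns the function object it was given after applying it to each element.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical function object: it is initialized explicitly with the one
// piece of "context" it needs (a starting value) and accumulates a sum.
class Sum {
    double total;
public:
    explicit Sum(double initial = 0) : total(initial) {}
    void operator()(double x) { total += x; }  // the call operator
    double result() const { return total; }    // extract the state afterwards
};

// Hypothetical helper: applies a Sum to every element of v.
double sum_of(const std::vector<double>& v, double initial)
{
    // for_each takes the object by value and returns the updated copy.
    Sum s = std::for_each(v.begin(), v.end(), Sum(initial));
    return s.result();
}
```

The object carries exactly the state it was initialized with through all the iterations, which is the sense in which it stands in for a closure without capturing the surrounding context implicitly.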
Copyright © 1996 - 1999 Computer Literacy, Inc.