Beautiful JavaScript


Leading Programmers Explain How They Think
Edited by Anton Kovalyov

JavaScript is arguably the most polarizing and misunderstood programming language in the world. Many have attempted to replace it as the language of the Web, but JavaScript has survived, evolved, and thrived. Why did a language created in such a hurry succeed where others failed?

This guide gives you a rare glimpse into JavaScript from people intimately familiar with it. Chapters contributed by domain experts such as Jacob Thornton, Ariya Hidayat, and Sara Chipps reveal what they love about their favorite language, whether it’s turning the most feared features into useful tools or using JavaScript for self-expression.

About the editor: Anton Kovalyov is a software engineer at Medium, creator of JSHint, and coauthor of Third-Party JavaScript (Manning).

Contributors include: Jonathan Barronville, Sara Chipps, Angus Croll, Marijn Haverbeke, Ariya Hidayat, Daryl Koopersmith, Anton Kovalyov, Rebecca Murphey, Daniel Pupius, Graeme Roberts, Jenn Schiffer, Jacob Thornton, Ben Vinegar, Rick Waldron, Nicholas Zakas

“Reading this book is like sitting down with some of the masters of JavaScript for lunch and hearing them talk about what's on their mind at the moment. You'll leave with a new appreciation for the language, and with something you can use to make your next project better.”
-- Dave Camp, Director of Engineering, Firefox

ISBN: 978-1-449-37075-6

Copyright © 2015 Anton Kovalyov. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use.
Online editions are also available for most titles (http://safaribooksonline.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or corporate@oreilly.com.

Acquisitions Editor: Simon St. Laurent
Editor: Allyson MacDonald
Production Editor: Matthew Hacker
Copyeditor: Rachel Head
Proofreader: Rachel Monaghan
Indexer: WordCo Indexing Services, Inc.
Interior Designer: David Futato
Cover Designer: Susan Thompson
Illustrator: Rebecca Demarest

August 2015: First Edition

Revision History for the First Edition
2015-08-07: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781449370756 for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Beautiful JavaScript, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

While the publisher and the authors have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the authors disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights.

TABLE OF CONTENTS

Preface

1 Beautiful Mixins
    Classical Inheritance
    Prototypes
    Mixins
    The Basics
    The Use Case
    Classic Mixins
    The extend Function
    Functional Mixins
    Adding Options
    Adding Caching
    Advice
    Wrapup

2 eval and Domain-Specific Languages
    What About “eval Is Evil”?
    History and Interface
    Performance
    Common Uses
    A Template Compiler
    Speed
    Mixing Languages
    Dependencies and Scopes
    Debugging Generated Code
    Binary Pattern Matches
    Closing Thoughts

3 How to Draw a Bunny
    What Is a Rabbit?
    What Is a Bunny?
    What Does This Have to Do with JavaScript?
    With So Much Variation, Which Way Is Correct?
    How Does This Affect the Classroom?
    Is This Art? And Why Does That Matter?
    What Does This Look Like?
    What Did I Just Read?

4 Too Much Rope, or JavaScript for Teams
    Know Your Audience
    Stupid Good
    Keep It Classy
    Style Rules
    Evolution of Code
    Conclusion

5 Hacking JavaScript Constructors for Model Harmony
    Doppelgangers
    Miniature Models of Factories
    Constructor Identity Crisis
    Making It Scale
    Conclusion

6 One World, One Language
    An Imperative, Dynamic Proposal
    The Paradox of Choice
    Globalcommunicationscript

7 Math Expression Parser and Evaluator
    Lexical Analysis and Tokens
    Syntax Parser and Syntax Tree
    Tree Walker and Expression Evaluator
    Final Words

8 Evolution
    Backbone
    New Possibilities

9 Error Handling
    Assume Your Code Will Fail
    Throwing Errors
    When to Throw Errors
    Types of Errors
    Custom Errors
    Handling Errors
    Global Error Handling in Browsers
    Global Error Handling in Node.js
    Summary

10 The Node.js Event Loop
    Event-Driven Programming
    Asynchronous, Nonblocking I/O
    Concurrency
    Adding Tasks to the Event Loop

11 JavaScript Is…
    JavaScript Is Dynamic
    JavaScript Can Be Static
    JavaScript Is Functional
    JavaScript Does Everything

12 Coding Beyond Logic
    0. The Basement
    1. Quine’s Paradox
    2. The Conjecture
    3. Peer Review

13 JavaScript Is Cutieful
    All This Loose Beauty
    The Absurdity of Dalí
    Dalí’s JavaScript
    Is This Beauty Just Ugly?
    An Unfortunate Necessity
    The Beauty Is in the Madness
    Let’s Have a Wee Look at map
    Hello, thisArg
    Okay! So That’s a Bunch of Stuff I Already Knew About [].map—Now What?
    calling All Cars
    Number
    Now I Know Everything
    Wild

14 Functional JavaScript
    Functional Programming
    Functional JavaScript
    Objects
    Now What?

15 Progress

Index

Preface

Functions are first-class citizens, syntax resembles Java, inheritance is prototypal, and (+"") equals zero. This is JavaScript, arguably the most polarizing and misunderstood programming language in the world. It was created in 10 days and had a lot of warts and rough edges. Since then, there have been many attempts to replace it as the language of the Web. And yet, the language and the ecosystem around it are thriving. JavaScript is the most popular language in the world, and the only true language of the web platform. What made JavaScript special?
Why did a language that was created in such a hurry succeed where others failed?

I believe the reasons why JavaScript (and the Web in general) survived lie in its omnipresence (it’s practically impossible to find a personal computer that doesn’t have some sort of JavaScript interpreter) and its ability to gain from disorder, to use its stressors for self-improvement.

JavaScript, like no other language, brought all kinds of different people to the platform. Anyone with a text editor and a web browser could get started with JavaScript, and many did. Its expressiveness and limited standard library prompted those people to experiment with the language and push it to its limits. People were not only making websites and applications; they were writing libraries and creating programming languages that could be compiled back into JavaScript. Those libraries competed with each other, and their approaches to solving problems often contradicted one another. The JavaScript ecosystem was a mess, but it was bursting with life.

Many of those libraries and languages are now forgotten. Their best ideas, however (the ones that proved themselves and stood the test of time), were absorbed into the language. They made their way into JavaScript’s standard library and its syntax. They made the language better.

Then there were languages and technologies that were designed to replace JavaScript. But instead of succeeding, they unwillingly became its necessary stressors. Every time a new language or system to replace JavaScript emerged, browser vendors would find a way to make it faster, more powerful, and more robust. Once again, good ideas were absorbed into newer versions of the language, and the bad ones were discarded. These competing technologies didn’t replace JavaScript; instead, they made it better.

Today, JavaScript is unbelievably popular. Will it last? I don’t know. I cannot predict whether it will still be popular 5 or 10 years from now, but it doesn’t really matter.
For me, JavaScript will always be a great example of a language that survived not despite its flaws but because of them, and a language that brought people of so many different backgrounds into this wonderful world of computer programming.

About This Book

This book was written by people who are intimately familiar with the language. Each and every person who contributed a chapter is an expert in his or her domain. The authors highlight different sides of JavaScript, some of which you can discover only by writing lots of code, experimenting and making mistakes. As you make your way through this book, you’ll get to see what JavaScript movers and shakers love about their favorite language.

You’ll also learn a lot. I did. But do not mistake this book for a JavaScript tutorial, because it is much bigger than that. There are chapters that challenge the conventional wisdom and show how even the most feared features can be used as helpful tools. Some authors show that JavaScript can be a tool for self-expression and a form of art, while others share the wisdom of using JavaScript in codebases with hundreds of contributors. Some authors share personal stories, while others look into the future.

There’s no common pattern that goes from one chapter to another; there’s even a purely satirical chapter. This is intentional. I tried to give the authors as much freedom as possible to see what they would come up with, and they came up with something incredible. They came up with a book that resembles JavaScript itself, where each chapter is a reflection of its author.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic
    Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width
    Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

TIP
    This element signifies a tip or suggestion.

NOTE
    This element signifies a general note.

Using Code Examples

Supplemental material (code examples, exercises, etc.) is available for download at https://github.com/oreillymedia/beautiful_javascript.

This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Beautiful JavaScript, edited by Anton Kovalyov (O’Reilly). Copyright 2015 Anton Kovalyov, 978-1-449-37075-6.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at permissions@oreilly.com.

Safari® Books Online

Safari Books Online is an on-demand digital library that delivers expert content in both book and video form from the world’s leading authors in technology and business.

Technology professionals, software developers, web designers, and business and creative professionals use Safari Books Online as their primary resource for research, problem solving, learning, and certification training.

Safari Books Online offers a range of plans and pricing for enterprise, government, education, and individuals.

Members have access to thousands of books, training videos, and prepublication manuscripts in one fully searchable database from publishers like O’Reilly Media, Prentice Hall Professional, Addison-Wesley Professional, Microsoft Press, Sams, Que, Peachpit Press, Focal Press, Cisco Press, John Wiley & Sons, Syngress, Morgan Kaufmann, IBM Redbooks, Packt, Adobe Press, FT Press, Apress, Manning, New Riders, McGraw-Hill, Jones & Bartlett, Course Technology, and hundreds more. For more information about Safari Books Online, please visit us online.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

    O’Reilly Media, Inc.
    1005 Gravenstein Highway North
    Sebastopol, CA 95472
    800-998-9938 (in the United States or Canada)
    707-829-0515 (international or local)
    707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at http://bit.ly/beautiful_javascript.

To comment or ask technical questions about this book, send email to bookquestions@oreilly.com.

For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.

Find us on Facebook: http://facebook.com/oreilly
Follow us on Twitter: http://twitter.com/oreillymedia
Watch us on YouTube: http://www.youtube.com/oreillymedia

CHAPTER ONE

Beautiful Mixins
Angus Croll

    Developers love to create overly complex solutions to things that aren’t really problems.
    -- Thomas Fuchs

In the beginning there was code, and the code was verbose, so we invented functions that the code might be reused. But after a while there were also too many functions, so we looked for a way to reuse those too. Developers often go to great lengths to apply “proper” reuse techniques to JavaScript. But sometimes when we try too hard to do the right thing, we miss the beautiful thing right in front of our eyes.

Classical Inheritance

Many developers schooled in Java, C++, Objective-C, and Smalltalk arrive at JavaScript with an almost religious belief in the necessity of the class hierarchy as an organizational tool. Yet humans are not good at classification. Working backward from an abstract superclass toward real types and behaviors is unnatural and restrictive: a superclass must be created before it can be extended, yet classes closer to the root are by nature more generic and abstract and are more easily defined after we have more knowledge of their concrete subclasses. Moreover, the need to tightly couple types a priori such that one type is always defined solely in terms of another tends to lead to an overly rigid, brittle, and often ludicrous model (“Is a button a rectangle or is it a control? Tell you what, let’s make Button inherit from Rectangle, and Rectangle can inherit from Control…no, wait a minute…”). If we don’t get it right early on, our system is forever burdened with a flawed set of relationships, and on those rare occasions that, by chance or genius, we do get it right, anything but a minimal tree structure usually represents too complex a mental model for us to readily visualize.

Classical inheritance is appropriate for modeling existing, well-understood hierarchies; it’s okay to unequivocally declare that a FileStream is a type of InputStream. But if the primary motivation is function reuse (and it usually is), classical hierarchies can quickly become gnarly labyrinths of meaningless subtypes, frustrating redundancies, and unmanageable logic.

Prototypes

It’s questionable whether the majority of behaviors can ever be mapped to objectively “right” classifications. And indeed, the classical inheritance lobby is countered by an equally fervent band of JavaScript loyalists who proclaim that JavaScript is a prototypal, not classical, language and is deeply unsuited to any approach that includes the word class.
But what does “prototypal” mean, and how do prototypes differ from classes?

In generic programming terms, a prototype is an object that supplies base behavior to a second object. The second object can then extend this base behavior to form its own specialization. This process, also known as differential inheritance, differs from classical inheritance in that it doesn’t require explicit typing (static or dynamic) or attempt to formally define one type in terms of another. While classical inheritance is planned reuse, true prototypal inheritance is opportunistic.

    In general, when working with prototypes, one typically chooses not to categorize but to exploit alikeness.
    -- Antero Taivalsaari, Nokia Research Center

In JavaScript, every object references a prototype object from which it can inherit properties. JavaScript prototypes are great instruments for reuse: a single prototype instance can define properties for an infinite number of dependent instances. Prototypes may also inherit from other prototypes, thus forming prototype chains.

So far, so good. But, with a view to emulating Java, JavaScript tied the prototype property to the constructor. As a consequence, more often than not, multilevel object inheritance is achieved by chaining constructor-prototype couplets. The standard implementation of a JavaScript prototype chain is too grisly to appear in a book about beautiful JavaScript, but suffice it to say, creating a new instance of a base prototype in order to define the initial properties of its inheritor is neither graceful nor intuitive. The alternative (manually copying properties between prototypes and then meddling with the constructor property to fake real prototypal inheritance) is even less becoming.
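
For readers who have not met it, the constructor-prototype chaining described above tends to look something like the sketch below. The Shape and Circle names are hypothetical, chosen only for illustration, and this is one common form of the pattern rather than a canonical one.

```javascript
// Hypothetical base type, for illustration only.
var Shape = function(label) {
  this.label = label;
};
Shape.prototype.describe = function() {
  return 'shape: ' + this.label;
};

// Hypothetical subtype.
var Circle = function(label, radius) {
  Shape.call(this, label); // borrow the base constructor's initialization
  this.radius = radius;
};

// A throwaway Shape instance becomes Circle's prototype...
Circle.prototype = new Shape();
// ...which clobbers the constructor reference, so it is patched by hand.
Circle.prototype.constructor = Circle;

Circle.prototype.area = function() {
  return Math.PI * this.radius * this.radius;
};

var circle = new Circle('dot', 2);
console.log(circle.describe()); // 'shape: dot'
console.log(circle.area());     // 4 * Math.PI, roughly 12.57
```

Note that the base constructor runs twice in spirit: once with no arguments to build the prototype, and once per instance through Shape.call. That double duty is a large part of what makes the pattern feel awkward.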

Syntactic awkwardness aside, constructor-prototype chaining requires upfront planning and results in structures that more closely resemble the traditional hierarchies of classical languages than a true prototypal relationship: constructors represent types (classes), each type is defined as a subtype of one (and only one) supertype, and all properties are inherited via this type chain. The ES6 class keyword merely formalizes the existing semantics. Leaving aside the gnarly and distinctly unbeautiful syntax characteristic of constructor-prototype chains, traditional JavaScript is clearly less prototypal than some would claim.

In an attempt to support less rigid, more opportunistic prototypes, the ES5 specification introduced Object.create. This method allows a prototype to be assigned to an object directly and therefore liberates JavaScript prototypes from constructors (and thus categorization) so that, in theory, an object can acquire behavior from any other arbitrary object and be free from the constraints of typecasting:

    var circle = Object.create({
      area: function() {
        return Math.PI * this.radius * this.radius;
      },
      grow: function() {
        this.radius++;
      },
      shrink: function() {
        this.radius--;
      }
    });

Object.create accepts an optional second argument representing the object to be extended. Sadly, instead of accepting the object itself (in the form of a literal, variable, or argument), the method expects a full-blown meta property definition:

    var circle = Object.create({
      /*see above*/
    }, {
      radius: {
        writable: true,
        configurable: true,
        value: 7
      }
    });

Assuming no one actually uses these unwieldy beasts in real code, all that remains is to manually assign properties to the instance after it has been created. Even then, the Object.create syntax still only enables an object to inherit the properties of a single prototype.
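
As a minimal sketch of that manual alternative, the object can be created from its prototype and then given its instance properties by plain assignment. The circleProto name is an assumption introduced here for illustration.

```javascript
// Hypothetical name for the prototype object shown earlier.
var circleProto = {
  area: function() { return Math.PI * this.radius * this.radius; },
  grow: function() { this.radius++; },
  shrink: function() { this.radius--; }
};

var circle = Object.create(circleProto);
circle.radius = 7; // plain assignment instead of a property descriptor

circle.grow();
console.log(circle.radius);                                 // 8
console.log(Object.getPrototypeOf(circle) === circleProto); // true
console.log(circle.hasOwnProperty('area'));                 // false: area is inherited
```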
In real scenarios, we often want to acquire behavior from multiple prototype objects: for example, a person can be an employee and a manager.

Mixins

Fortunately, JavaScript offers viable alternatives to inheritance chaining. In contrast to objects in more rigidly structured languages, JavaScript objects can invoke any function property regardless of lineage. In other words, JavaScript functions don’t need to be inheritable to be visible, and with that simple observation, the entire justification for inheritance hierarchies collapses like a house of cards.

The most basic approach to function reuse is manual delegation: any public function can be invoked directly via call or apply. It’s a powerful and easily overlooked feature. However, aside from the verbosity of serial call or apply directives, such delegation is so convenient that, paradoxically, it sometimes actually works against structural discipline in your code; the invocation process is sufficiently ad hoc that in theory there is no need for developers to organize their code at all.

Mixins are a good compromise: by encouraging the organization of functionality along thematic lines they offer something of the descriptive prowess of the class hierarchy, yet they are light and flexible enough to avoid the premature organization traps (and head-spinning dizziness) associated with deeply chained, single-ancestry models. Better still, mixins require minimal syntax and play very well with unchained JavaScript prototypes.

The Basics

Traditionally, a mixin is a class that defines a set of functions that would otherwise be defined by a concrete entity (a person, a circle, an observer). However, mixin classes are considered abstract in that they will not themselves be instantiated; instead, their functions are copied (or borrowed) by concrete classes as a means of inheriting behavior without entering into a formal relationship with the behavior provider.
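
Borrowing of this kind needs no special machinery in JavaScript. As a rough sketch (the circleBehavior and coin objects are hypothetical), a function that lives on one object can be invoked against an unrelated object with call or apply:

```javascript
// Hypothetical provider of circle behavior.
var circleBehavior = {
  area: function() { return Math.PI * this.radius * this.radius; },
  scale: function(factor) { this.radius *= factor; }
};

// An unrelated object: it does not inherit from circleBehavior in any way.
var coin = { radius: 2 };

circleBehavior.scale.call(coin, 3);      // arguments passed one by one
console.log(coin.radius);                // 6
circleBehavior.scale.apply(coin, [0.5]); // arguments passed as an array
console.log(coin.radius);                // 3
console.log(circleBehavior.area.call(coin)); // roughly 28.27
```

Nothing is copied onto coin here; each borrowed call has to name the provider explicitly, which is the verbosity that mixins are meant to remove.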

Okay, but this is JavaScript, and we have no classes per se. This is actually a good thing because it means we can use objects (instances) instead, which offer clarity and flexibility: our mixin can be a regular object, a prototype, a function, whatever, and the mixin process becomes transparent and obvious.

The Use Case

I’m going to discuss a number of mixin techniques, but all the coding examples are directed toward one use case: creating circular, oval, or rectangular buttons (something that would not be readily possible using conventional classical inheritance techniques). Here’s a schematic representation: square boxes represent mixin objects, and rounded boxes represent the actual buttons.

Classic Mixins

Scanning the first two pages returned from a Google search for “javascript mixin,” I noticed the majority of authors define the mixin object as a full-blown constructor type with its function set defined in the prototype. This could be seen as a natural progression: early mixins were classes, and this is the closest thing JavaScript has to a class. Here’s a circle mixin modeled after that style:

    var Circle = function() {};
    Circle.prototype = {
      area: function() {
        return Math.PI * this.radius * this.radius;
      },
      grow: function() {
        this.radius++;
      },
      shrink: function() {
        this.radius--;
      }
    };

In practice, however, such a heavyweight mixin is unnecessary.
A simple object literal will suffice:

    var circleFns = {
      area: function() {
        return Math.PI * this.radius * this.radius;
      },
      grow: function() {
        this.radius++;
      },
      shrink: function() {
        this.radius--;
      }
    };

Here’s another mixin defining button behavior (for the sake of demonstration, I’ve substituted a simple log call for the working implementation of some function properties):

    var clickableFns = {
      hover: function() {
        console.log('hovering');
      },
      press: function() {
        console.log('button pressed');
      },
      release: function() {
        console.log('button released');
      },
      fire: function() {
        // action is a plain function supplied by the button
        return this.action();
      }
    };

The extend Function

How does a mixin object get mixed into your object? By means of an extend function (sometimes known as augmentation). Usually extend simply copies (not clones) the mixin’s functions into the receiving object. A quick survey reveals some minor variations in this implementation. For example, the Prototype.js framework omits a hasOwnProperty check (suggesting the mixin is not expected to have enumerable properties in its prototype chain), while other versions assume you want to copy only the mixin’s prototype object. Here’s a version that is both safe and flexible:

    function extend(destination, source) {
      for (var key in source) {
        if (source.hasOwnProperty(key)) {
          destination[key] = source[key];
        }
      }
      return destination;
    }

Now let’s extend a base prototype with the two mixins we created earlier to make a RoundButton.prototype:

    var RoundButton = function(radius, label, action) {
      this.radius = radius;
      this.label = label;
      this.action = action; // needed by fire
    };

    extend(RoundButton.prototype, circleFns);
    extend(RoundButton.prototype, clickableFns);

    var roundButton = new RoundButton(3, 'send', function() {return 'message sent'});
    roundButton.grow();
    roundButton.fire(); //'message sent'

Functional Mixins

If the functions defined by mixins are intended solely for the use of other objects, why bother creating mixins as regular objects at all? Isn’t it more intuitive to think of mixins as processes instead of objects?
Here are the circle and button mixins rewritten as functions. We use the context (this) to represent the mixin’s target object:

    var withCircle = function() {
      this.area = function() {
        return Math.PI * this.radius * this.radius;
      };
      this.grow = function() {
        this.radius++;
      };
      this.shrink = function() {
        this.radius--;
      };
    };

    var withClickable = function() {
      this.hover = function() {
        console.log('hovering');
      };
      this.press = function() {
        console.log('button pressed');
      };
      this.release = function() {
        console.log('button released');
      };
      this.fire = function() {
        // action is a plain function supplied by the button
        return this.action();
      };
    };

And here’s our RoundButton constructor. We’ll want to apply the mixins to RoundButton.prototype:

    var RoundButton = function(radius, label, action) {
      this.radius = radius;
      this.label = label;
      this.action = action;
    };

Now the target object can simply inject itself into the functional mixin by means of Function.prototype.call, cutting out the middleman (the extend function) entirely:

    withCircle.call(RoundButton.prototype);
    withClickable.call(RoundButton.prototype);

    var button1 = new RoundButton(4, 'yes!', function() {return 'you said yes!'});
    button1.fire(); //'you said yes!'

This approach feels right. Mixins as verbs instead of nouns; lightweight one-stop function shops. There are other things to like here too. The programming style is natural and concise: this always refers to the receiver of the function set instead of an abstract object we don’t need and will never use; moreover, in contrast to the traditional approach, we don’t have to protect against inadvertent copying of inherited properties, and (for what it’s worth) functions are now cloned instead of copied.

Adding Options

This functional strategy also allows mixed in behaviors to be parameterized by means of an options argument.
The following example creates a withOval mixin with a custom grow and shrink factor:

    var withOval = function(options) {
      this.area = function() {
        return Math.PI * this.longRadius * this.shortRadius;
      };
      this.ratio = function() {
        return this.longRadius/this.shortRadius;
      };
      this.grow = function() {
        this.shortRadius += (options.growBy/this.ratio());
        this.longRadius += options.growBy;
      };
      this.shrink = function() {
        this.shortRadius -= (options.shrinkBy/this.ratio());
        this.longRadius -= options.shrinkBy;
      };
    };

    var OvalButton = function(longRadius, shortRadius, label, action) {
      this.longRadius = longRadius;
      this.shortRadius = shortRadius;
      this.label = label;
      this.action = action;
    };

    // withClickable is the button mixin defined earlier
    withClickable.call(OvalButton.prototype);
    withOval.call(OvalButton.prototype, {growBy: 2, shrinkBy: 2});

    var button2 = new OvalButton(3, 2, 'send', function() {return 'message sent'});
    button2.area(); //18.84955592153876
    button2.grow();
    button2.area(); //52.35987755982988
    button2.fire(); //'message sent'

Adding Caching

You might be concerned that this approach creates additional performance overhead because we’re redefining the same functions on every call. Bear in mind, however, that when we’re applying functional mixins to prototypes, the work only needs to be done once: during the definition of the constructors. The work required for instance creation is unaffected by the mixin process, since all the behavior is preassigned to the shared prototype. This is how we support all function sharing on the twitter.com site, and it produces no noticeable latency. Moreover, it’s worth noting that performing a classical mixin requires property getting as well as setting, and in fact functional mixins appear to benchmark quicker in the Chrome browser than traditional ones (although this is obviously subject to considerable variance).

That said, it is possible to optimize functional mixins further.
By forming a closure around the mixins we can cache the results of the initial definition run, and the performance improvement is impressive. Functional mixins now easily outperform classic mixins in every browser.

Here’s a version of the withRectangle mixin with added caching:

    var withRectangle = (function() {
      function area() {
        return this.length * this.width;
      }
      function grow() {
        this.length++;
        this.width++;
      }
      function shrink() {
        this.length--;
        this.width--;
      }
      return function() {
        this.area = area;
        this.grow = grow;
        this.shrink = shrink;
        return this;
      };
    })();

    var RectangularButton = function(length, width, label, action) {
      this.length = length;
      this.width = width;
      this.label = label;
      this.action = action;
    };

    withClickable.call(RectangularButton.prototype);
    withRectangle.call(RectangularButton.prototype);

    var button3 = new RectangularButton(4, 2, 'delete', function() {return 'deleted'});
    button3.area(); //8
    button3.grow();
    button3.area(); //15
    button3.fire(); //'deleted'

Advice

One danger with any kind of mixin technique is that a mixin function will accidentally overwrite a property of the target object that, coincidentally, has the same name. Twitter’s Flight framework, which makes use of functional mixins, guards against clobbering by temporarily locking existing properties (using the writable meta property) during the mixin process.

Sometimes, however, instead of generating a collision error we might want the mixin to augment the corresponding method on the target object. advice redefines a function by adding custom code before, after, or around the original implementation.
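
To make the idea concrete, here is a minimal sketch of advice written as a functional mixin in the style of this chapter. It is an illustration under assumptions, not Flight's actual implementation: the withAdvice name matches the one this chapter uses, but the body is written from scratch, offers only before, after, and around, and assumes the named method already exists on the target. The lamp object is hypothetical.

```javascript
// A hand-rolled sketch; Flight's real advice module may differ in detail.
var withAdvice = function() {
  // Run extra code before the original method.
  this.before = function(name, fn) {
    var original = this[name];
    this[name] = function() {
      fn.apply(this, arguments);
      return original.apply(this, arguments);
    };
  };
  // Run extra code after the original method, preserving its result.
  this.after = function(name, fn) {
    var original = this[name];
    this[name] = function() {
      var result = original.apply(this, arguments);
      fn.apply(this, arguments);
      return result;
    };
  };
  // Hand the original method to the wrapper as its first argument.
  this.around = function(name, fn) {
    var original = this[name];
    this[name] = function() {
      var args = [original.bind(this)].concat([].slice.call(arguments));
      return fn.apply(this, args);
    };
  };
};

// Hypothetical target object, for demonstration only.
var log = [];
var lamp = {
  on: function() { log.push('on'); return 'lit'; }
};
withAdvice.call(lamp);
lamp.before('on', function() { log.push('check bulb'); });
lamp.after('on', function() { log.push('start timer'); });

console.log(lamp.on()); // 'lit'
console.log(log);       // [ 'check bulb', 'on', 'start timer' ]
```

Each call to before, after, or around replaces the method with a wrapper that closes over the previous version, so advice can be stacked in any order without the mixin needing to know what came before.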
The Underscore framework implements a basic function wrapper that enables advice:

    button.press = function() {
      mylib.appendClass('pressed');
    };

    //after pressing button, reduce shadow (using underscore)
    button.pressWithShadow = _.wrap(button.press, function(fn) {
      fn();
      button.reduceShadow();
    });

The Flight framework takes this a stage further: now the advice object is itself a functional mixin that can be mixed into target objects to enable advice for subsequent mixins.

Let’s use this advice mixin to augment our rectangular button actions with shadow behavior. First we apply the advice mixin, followed by the two mixins we used earlier:

    withAdvice.call(RectangularButton.prototype);
    withClickable.call(RectangularButton.prototype);
    withRectangle.call(RectangularButton.prototype);

And now the withShadow mixin that will take advantage of the advice mixin:

    var withShadow = function() {
      this.after('press', function() {
        console.log('shadow reduced');
      });
      this.after('release', function() {
        console.log('shadow reset');
      });
    };

    withShadow.call(RectangularButton.prototype);

    var button4 = new RectangularButton(5, 4);
    button4.press(); //'button pressed' 'shadow reduced'
    button4.release(); //'button released' 'shadow reset'

[1] See Charles Miller’s entire post at his blog, The Fishbowl.

The Flight framework sugarcoats this process. All Flight components get withAdvice mixed in for free, and there’s also a defineComponent method that accepts multiple mixins at a time.
So, if we were using Flight we could further simplify the process (in Flight, constructor properties such as rectangle dimensions are defined as attr properties in the mixins):

    var RectangularButton =
      defineComponent(withClickable, withRectangle, withShadow);
    var button5 = new RectangularButton(3, 2);
    button5.press(); //'button pressed' 'shadow reduced'
    button5.release(); //'button released' 'shadow reset'

With advice we can define functions on mixins without having to guess whether they’re also implemented on the target object, so the mixin can be defined in isolation (perhaps by another vendor). Conversely, advice allows us to augment third-party library functions without resorting to monkey patching.

Wrapup

When possible, cut with the grain. The grain tells you which direction the wood wants to be cut. If you cut against the grain, you’re just making more work for yourself, and making it more likely you’ll spoil the cut.
Charles Miller [1]

As programmers, we’re encouraged to believe that certain techniques are indispensable. Ever since the early 1990s, object-oriented programming has been hot, and classical inheritance has been its poster child. It’s not hard to see how a developer eager to master a new language would feel under considerable pressure to fit classical inheritance under the hood.

But peer pressure is not an agent of beautiful code, and neither is serpentine logic. When you find yourself writing Circle.prototype.constructor = Circle, ask yourself if the pattern is serving you, or you’re serving the pattern. The best patterns tread lightly on your process and don’t interfere with your ability to use the full power of the language.

By repeatedly defining an object solely in terms of another, classical inheritance establishes a series of tight couplings that glue the hierarchy together in an orgy of mutual dependency.
Mixins, in contrast, are extremely agile and make very few organizational demands on your codebase: mixins can be created at will, whenever a cluster of common, shareable behavior is identified, and all objects can access a mixin’s functionality regardless of their role within the overall model. Mixin relationships are entirely ad hoc: any combination of mixins can be applied to any object, and objects can have any number of mixins applied to them. Here, at last, is the opportunistic reuse that prototypal inheritance promised us.

CHAPTER TWO

eval and Domain-Specific Languages

Marijn Haverbeke

eval is a language construct that takes a string and executes it as code. This means that in a language with an eval construct, the code that is being executed can come not just from input files, but also from the running code itself.

There are several reasons why this is interesting and useful. In this chapter, I will explore the degree to which JavaScript’s eval can be used to create simple language-based abstractions.

What About “eval Is Evil”?

I know that some of my readers, at the mention of the word eval, are feeling the adrenaline shoot into their veins, and hearing the solemn voice of a certain bearded JavaScript evangelist boom in the back of their heads. “eval is evil!” this voice proclaims.

I’ve never found absolute moral judgments very applicable in engineering. But if you do, and don’t want to reevaluate your faith, feel free to skip this chapter.

Practically speaking, there are a number of problematic issues that come up when eval is used. Its semantics are confusing and error-prone, and its impact on performance is not always obvious. I’m going to approach it as a tool, and try to clarify and study these issues, in order to help you use the tool effectively.

History and Interface

An interpreter (in the broad sense of the word) for a language is a program that takes text and executes it as code.
When you have an interpreter available, exposing it as an eval construct, which does pretty much the same thing, is easy and obvious.

The first language to do this was an early dialect of Lisp. More recent dynamic languages (Perl, Python, PHP, Ruby, and of course JavaScript) followed suit. Most of these languages went through a similar process, where they initially introduced a straightforward, naive evaluation construct, and later tried to refine, extend, or disable it as a form of damage control.

The subtlety in designing an interface for code execution lies in the environment in which the code is to be interpreted, that is, the question of which variables it can see. In a primitive interpreter, which often represents variables in a way that makes it easy to inspect and manipulate them, it is no problem to give evaluated code full access to all the variables that are visible at the point where the eval construct is used. The initial design of a dynamic language is often intertwined with the first implementation of its interpreter, and this makes it tempting to go with the model where the evaluated code has access to the local environment.

There are two reasons why this is problematic. Firstly, there’s rarely a reason to want to access local scope. You’ll occasionally see some confused JavaScript programmers do something like eval("obj." + propertyName) because they fail to realize that the language allows dynamic property access, or eval("var result = " + code) because they are ignorant of the fact that eval already returns the result of the evaluation, and the var result = part could be lifted out. When the code string comes from an external source, there’s also the risk of a variable in the string accidentally using a variable name that is also defined locally, causing a conflict between the two uses.
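The two confusions mentioned above have direct replacements in the language. The following is a small sketch; the object and the code string are made up for illustration.

```javascript
var obj = { width: 4, height: 2 };
var propertyName = "height";

// Bracket notation is the dynamic property access that
// eval("obj." + propertyName) was reaching for.
var viaBrackets = obj[propertyName];

// eval already returns the value of the code it evaluates,
// so there is no need to smuggle a "var result = " into the string.
var code = "1 + 2 * 3";
var result = eval(code);

console.log(viaBrackets, result);
```

Neither line needs the evaluated string to see local variables, which is the point of the paragraph above: the local scope is rarely what the caller actually wants.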
The one case where access to a local scope is not completely wrongheaded is when evaluated code needs to have access to utility functions defined in the module that evaluates it. We’ll see a decent way to work around that later.

The second reason that evaluating in the local scope is not a good idea is that it makes life quite a bit harder for the compiler. Knowing exactly what the code it’s compiling looks like enables a compiler to make a lot of decisions at compile time (rather than runtime), which makes the code it produces faster. Most importantly, if it knows a variable x refers to a specific x variable defined either globally or in one of its enclosing scopes, it can generate very simple code to access this x. An eval could introduce a new variable x, forcing the compiler to represent its environment in a more complex way and to output more expensive code for each variable access.

And this last point is the reason for the very odd way in which JavaScript eval behaves: the distinction between local and global evaluation.

eval is, historically, a regular global variable that holds a function. That means you can do everything with it that you can do with other values: store it in another variable or in a data structure, pass it to a function, and so on. But because the people trying to optimize JavaScript execution did not want to represent all environments and variable accesses in the expensive, dynamic way I described previously, they introduced a subtle rule, probably initially as a hack, that was later standardized into ECMAScript.

This rule is: the eval is only done in the local scope if we can see, during compilation, that a call to eval takes place; there has to be a function call to the actual global variable named eval in the code (and this global must still have its original value).
If you call eval in any other, more indirect way, it will not have access to the local scope, and thus will be a global evaluation. For example, eval("foo") is local, while (0 || eval)("foo") is global, and so is var lave = eval; lave("foo").

Though this was conceived purely as an efficiency kludge, not as an attempt to provide a better interface, people have been intentionally making use of it, since global evaluation is often more useful and less error-prone than local evaluation.

Another variant of global evaluation is the Function constructor. It takes strings for the argument names and function body as arguments, and returns a function in the context of the global scope (it does not close over variables in the scope where it was created). Note that the argument names can be passed either as separate arguments (new Function("a", "b", "return a + b")) or as a single comma-separated string (new Function("a, b", "return a + b")). For most purposes, this is the preferred way to evaluate code.

Performance

Evaluating code is expensive. Not only does the JavaScript compiler have to be invoked to compile the code, but modern JavaScript engines also tend to perform analysis on the loaded program in order to perform certain optimizations. Introducing new code can invalidate the results of such analysis, and cause recompilation of other parts of the program.

Evaluation in local scope is extra worrying, for the reasons discussed before. I ran a number of benchmarks on modern JavaScript engines, and found that variable access that goes through a scope that can be accessed by a local eval form is significantly slower. This means that if you’re using the closure module pattern (an anonymous function as module scope), having a local evaluation anywhere in your module will incur a cost for all code in the module. The scope just needs to have such a call (it doesn’t even have to execute it) to incur this cost.
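The local/global distinction and the scoping of the Function constructor described above can be observed directly. This is a small sketch; the variable name `scopeDemo` is made up, and the first line deliberately creates a global property so that the global evaluations have something to find.

```javascript
// An indirect eval of "this" yields the global object in any environment.
var globalObject = (0 || eval)("this");
globalObject.scopeDemo = "global";

function demo() {
  var scopeDemo = "local";
  // Direct call to eval: evaluated in the local scope.
  var direct = eval("scopeDemo");
  // Indirect call: evaluated in the global scope.
  var indirect = (0 || eval)("scopeDemo");
  // Function constructor: the body only sees the global scope.
  var viaFunction = new Function("return scopeDemo;")();
  return [direct, indirect, viaFunction];
}

var seen = demo();
console.log(seen);
```

Only the direct call picks up the local variable; the other two forms resolve the name against the global scope.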
On the other hand, the speed of a function created by new Function or a global eval is not adversely affected by the fact that it was created dynamically.

So, a desirable pattern is one where the evaluation happens once (at program startup), or outside of hot loops (we’re talking about few-millisecond delays here, not interface-freezing disasters). The functions generated by the evaluation can then be used as intensively as needed.

Common Uses

The most obvious use of eval is dynamically running code from an external source: for example, in a module-manager library that fetches code from somewhere and then uses a global eval to inject it into the environment, or an interactive REPL (read-eval-print loop) that executes code that the user types.

In the past, eval was the easiest way to parse strings of JSON data, whose representation is a subset of JavaScript’s own syntax. In modern implementations we have JSON.parse for that, which has the significant advantage of not enabling code injection attacks when parsing untrusted data.

Most JavaScript-based text templating systems use some form of eval to precompile templates. They parse the template text once, produce a program that instantiates the template, and use eval to have the JavaScript compiler compile that. In some cases this is simply an optimization, but in others the templates may contain JavaScript code, so some form of eval has to be involved. We’ll go over the compiler for a simple JavaScript-based templating language in the next section.

A template is a kind of domain-specific language (DSL), a language designed to solve a specific problem (in this case, building up strings) by being specialized to express the elements of that problem more directly than plain JavaScript. Domain-specific languages are a more interesting application of eval. We’ll cover another one, a compact and efficient notation for matching and extracting binary data, later on in this chapter.
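The JSON point made above is easy to see with a hostile input. In this sketch the payload string and the `sideEffects` array are invented for the demonstration; the old eval idiom runs the payload as code, while JSON.parse rejects it.

```javascript
var sideEffects = [];

// Looks like data to a careless caller, but it is really a program.
var payload = '(function() { sideEffects.push("code ran"); return {}; })()';

// The old idiom: wrap in parentheses and evaluate. The payload executes.
var viaEval = eval("(" + payload + ")");

// JSON.parse only accepts JSON syntax, so the payload is rejected.
var parseError = null;
try {
  JSON.parse(payload);
} catch (e) {
  parseError = e;
}

// Well-formed JSON parses without running anything.
var safe = JSON.parse('{"title": "Groceries", "items": [1, 2]}');

console.log(sideEffects, parseError instanceof SyntaxError, safe.items.length);
```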
A Template Compiler

Before you look at the code that follows, I should warn you. You opened a book called Beautiful JavaScript, and I’m about to confront you with some rather ugly code. That may seem disingenuous.

Code that builds up strings of code tends to look bad. If we had string interpolation, a code-oriented templating system, or even a data structure that represented code, things might be slightly better. But as it is, we’ll be crudely concatenating lots of strings, many of them containing the same keywords and syntactic patterns as the code around them. This does not make for very elegant or readable code.

The function shown here accepts a template string as an argument and returns a function that represents a compiled version of this template. It recognizes templating directives written between hash signs. Here’s an example of a trivial template that it parses:

    #$in.title#
    ==============

    Items on today's list:
    #for item in $in.items#
     * #item.name##if item.note# (Note: #item.note#) #end#
    #end#

A directive starting with for opens a loop (over an array). An if directive opens a conditional. Both are closed by an end directive. Anything else is interpreted as a value that should simply be inserted as text into the output. The variable $in is used to refer to the value passed into the template.

For brevity, the code does no input checking whatsoever.
Here’s the implementation of that function:

    function compile(template) {
      var code = "var _out = '';", uniq = 0;
      var parts = template.split("#");
      for (var i = 0; i < parts.length; ++i) {
        var part = parts[i], m;
        if (i % 2) { // Odd elements are templating directives
          if (m = part.match(/^for (\S+) in (.*)/)) {
            var loopVar = m[1], arrayExpr = m[2];
            var indexVar = "_i" + (++uniq), arrayVar = "_a" + uniq;
            code += "for (var " + indexVar + " = 0, " + arrayVar + " = " +
              arrayExpr + ";" + indexVar + "<" + arrayVar + ".length; ++" +
              indexVar + ") {" + "var " + loopVar + " = " + arrayVar +
              "[" + indexVar + "];";
          } else if (m = part.match(/^if (.*)/)) {
            code += "if (" + m[1] + ") {";
          } else if (part == "end") {
            code += "}";
          } else {
            code += "_out += " + part + ";";
          }
        } else if (part) { // Even elements are plain text
          code += "_out += " + JSON.stringify(part) + ";";
        }
      }
      return new Function("$in", code + "return _out;");
    }

To locate the directives, the function simply splits the template on hash characters, and considers the even-numbered parts to be plain text and the odd-numbered elements (the parts that appear between hash characters) as templating directives. Regular expressions are used to recognize the if and for directives.

The _out variable in the generated code is used to build up the output string. The underscore is an attempt to avoid name clashes, since we’ll be mixing generated code with code found in the template.

To build a loop for a for directive, we need to introduce two additional variables into the generated code: one for the index and one to hold the array. We need a variable that holds the array to ensure that whatever expression is used to produce it is not evaluated repeatedly, since it might be expensive to compute or have side effects. In order to make sure that these variable names do not clash, even for nested loops, a counter (uniq) is added to the variable name (_i1, _i2, etc.).
Finally, the Function constructor is used to create a function with our generated code as the body and a single argument, $in.

If we feed the template compiler the example template, it will spit out a function like this (whitespace added):

    function($in) {
      var _out = '';
      _out += $in.title;
      _out += "\n==============\n\nItems on today's list:\n";
      for (var _i1 = 0, _a1 = $in.items; _i1 < _a1.length; ++_i1) {
        var item = _a1[_i1];
        _out += "\n * ";
        _out += item.name;
        if (item.note) {
          _out += " (Note: ";
          _out += item.note;
          _out += ") ";
        }
      }
      return _out;
    }

We could make that code cleaner by adding some intelligence to the compiler (for example, it could combine subsequent += statements to simply use +), but you can see how it expresses the steps needed to instantiate the template.

With a few extensions, such as the option to escape the inserted strings for your output format of choice (HTML, for example), and some error checking, this code can be built into a practical templating engine.

Speed

It is always possible to interpret a domain-specific language on demand. But just as compilers tend to run programs faster than interpreters, precompiling a template leads to faster instantiation than interpreting it from its source every time it is instantiated.

If we forget for a second that the templating language contains JavaScript code, it would be possible to do a form of compilation without new Function: we could parse the template, and build up a data structure that allows us to instantiate it quickly with little repeated work. But it’d take a lot of effort to come close to the speed of the preceding approach that way.

The JavaScript compiler is much more powerful (and has more direct access to the machine) than our puny compiler, so by first translating to JavaScript and then handing off the rest of the work to its more advanced peer, we can get good results with very little work.
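As a usage sketch, the compiler can be run once and the resulting function called many times, which is the pattern the performance discussion recommends. The compile function below repeats the chapter's implementation so that the block runs on its own; the template text and the data objects are made up for illustration.

```javascript
// The chapter's template compiler, repeated so this example is runnable.
function compile(template) {
  var code = "var _out = '';", uniq = 0;
  var parts = template.split("#");
  for (var i = 0; i < parts.length; ++i) {
    var part = parts[i], m;
    if (i % 2) { // Odd elements are templating directives
      if (m = part.match(/^for (\S+) in (.*)/)) {
        var loopVar = m[1], arrayExpr = m[2];
        var indexVar = "_i" + (++uniq), arrayVar = "_a" + uniq;
        code += "for (var " + indexVar + " = 0, " + arrayVar + " = " +
          arrayExpr + ";" + indexVar + "<" + arrayVar + ".length; ++" +
          indexVar + ") {" + "var " + loopVar + " = " + arrayVar +
          "[" + indexVar + "];";
      } else if (m = part.match(/^if (.*)/)) {
        code += "if (" + m[1] + ") {";
      } else if (part == "end") {
        code += "}";
      } else {
        code += "_out += " + part + ";";
      }
    } else if (part) { // Even elements are plain text
      code += "_out += " + JSON.stringify(part) + ";";
    }
  }
  return new Function("$in", code + "return _out;");
}

// Compile once (the expensive step)...
var greet = compile("Hello #$in.name#!#for t in $in.tags# [#t#]#end#");

// ...then instantiate as often as needed (plain function calls).
var first = greet({ name: "Ada", tags: ["a", "b"] });
var second = greet({ name: "Bob", tags: [] });

console.log(first);  // Hello Ada! [a] [b]
console.log(second); // Hello Bob!
```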
This idea of building on top of a compiler for another language in order to run your own language or notation is widely applicable. The various compile-to-JavaScript languages make use of it. But it also works well on a smaller scale, such as for writing a tiny compiler for a simple language to solve a very specific problem.

Mixing Languages

Let’s look a bit more at the fact that the templates in the toy templating language contain JavaScript code. They are, in a way, JavaScript programs with a syntactic extension that optimizes them for text expansion.

Whether this is a good idea is a question that can be answered in several ways. If you don’t trust the source of your templates, or you want to expand the templates in an environment that doesn’t run JavaScript, then it is definitely a bad idea. The authors of the templates can inject arbitrary code into your program, and expanding these templates in, for example, a Ruby program would be awkward.

But we do get the full expressive power of a real programming language in our templates. The alternative would be to define a simple expression language as part of the templating language, parse that, and either interpret it during expansion or convert it to the output language (JavaScript, in our case). This approach has its own problems, though. It’s more work, obviously. But it is also hard to find a balance between offering enough features to allow people to do what they need to do without the language becoming huge and complex.

We already know JavaScript, so if we wanted, in the example template, to render only items whose category property contains the string important, we could simply type #if /\bimportant\b/.test(item.category)#. If we had to express that in a sublanguage, we’d either be out of luck if the language didn’t have string search, or need to first spend 10 minutes digging through documentation to figure out how to express string search in the language.
(Tangentially related is the argument that templating languages should be weak because they should contain presentation logic only. My take on that is that, firstly, presentation logic can get quite complicated, and secondly, taking away my hammer to ensure that I don’t use it on screws is a lousy way of enforcing good style.)

A tricky issue that comes up when you’re mixing languages is “hygiene.” The generated code and the code that appeared in the template both run in the same scope. Thus, there is a danger that the two sources of code will disagree on what a certain variable name refers to.

The toy template compiler generates variables like _a3 to avoid accidentally clashing with variables from the included code. This mostly works, but is of course far from perfect (#for _a1 in [1, 2, 3]# causes a clash). You could use more obscure variable names (_$$_o_O_a3) to further reduce the chance of clashes, but it’ll never be elegant. Languages that use this kind of metaprogramming more intensively have mechanisms to cope with these kinds of problems. JavaScript doesn’t, but because its metaprogramming support is so minimal, that’s usually not a problem.

Dependencies and Scopes

Since the toy template compiler used new Function to evaluate its code, that code will only be able to see the global scope.

What if the code that sits in the template needs access to, for example, a date formatting function? Or what if the generated part of the code needs an HTML escaping function to escape the dynamic parts of the output? You could put them in the global scope, but if you’re using modern, disciplined scoping in the style of CommonJS (Node.js) or RequireJS modules, that would be unfortunate.

The key to a workable solution to this problem is that, though we can’t control what the generated function itself closes over, we can wrap our result function in an additional function, and thus inject stuff for it to close over.
Here’s a crude utility that does this:

    function newFunctionWith(env, args, body) {
      var code = "";
      for (var prop in env)
        code += "var " + prop + " = $$env." + prop + ";";
      code += "return function(" + args + ") {" + body + "};";
      return new Function("$$env", code)(env);
    }

    console.log(newFunctionWith({x: 10}, "y", "return x + y;")(20));
    // → 30

Given an object mapping variables to values, an argument list string, and a function body string, this helper acts like new Function(args, body), except that it makes sure that all the properties in the env object are visible as closed-over variables to the body of the function.

It does this by generating a wrapping function that unpacks its argument into local variables, and then, immediately after evaluating this function, calling it. For simple values like integers, it could also have inserted the string form of the value directly into the wrapping function (var x = 10;). However, that doesn’t work for complex values, so we need to pass the environment object to the evaluated code, allowing it to extract the actual values from that object.

Using this utility, the templating system could do something like allowing templates to declare their dependencies and require-ing those in, making the code close over them.

Debugging Generated Code

Debugging generated code is rarely a pleasant experience. When you write a compiler like the one we just looked at, and try it out, you will most likely be greeted by some kind of syntax error. Details differ between JavaScript engines, but if this error has origin information at all, it’ll often point to the line that did the evaluation, not to the generated code. So what now?

Unfortunately, there’s no good answer that I know of. One approach is to make your compiler function log the code before it evaluates it, autoformat it, put it in a file, and try to load it.
Then, the error will at least point to the actual place where the code is broken. If it’s not a syntax error but a logic error, this might not be necessary; you might just be able to insert console.log or debugger statements into your generated code.

Where it gets really bad is when, as in the templating system I discussed, code from the input is mixed into the generated code. Debugging a compiler once is one thing. Getting strange, contextless exceptions whenever you make a typo in your template can ruin your whole day.

For production-strength systems, you probably want serious syntax checking of your templates. There are a variety of good JavaScript parsers (written in JavaScript) available nowadays, and they can be used to properly parse the expressions or statements you expect in your template, at compile time. This also helps to determine their extent in a reliable way (a directive like #if $in.type == "#" # would not parse in the code shown earlier, because it doesn’t understand that the second hash sign is quoted), and would make it possible to emit a meaningful error (including the template name and line offset) when nonsense is encountered.

Binary Pattern Matches

The second example I want to show you largely follows the same pattern as the first: we compile a domain-specific language down to JavaScript, in order to gain both speed and expressivity.

There is a feature in the Erlang programming language that allows you to pattern-match against binary data by specifying a sequence of fields and, for each field, a variable name or constant. Variables will be bound to the content of the field, and constants will be compared to the content of the field in order to determine whether the pattern matches. This provides a very convenient way of checking and extracting data from binary blobs.

Let’s say we want something like this in JavaScript.
Ideally, it’d look like this:

    function gifSize(bytes) {
      binswitch (bytes) {
        case <<"GIF89a" width:uint16 height:uint16>>:
          return {width: width, height: height};
        default:
          throw new Error("not a GIF file");
      }
    }

where binswitch is like switch, except that it matches a series of fields in the given chunk of binary data (a typed array, presumably). This pattern would mean “first the bytes corresponding to the string "GIF89a", then a two-byte unsigned integer, which is bound to width, and finally another unsigned integer bound to height.” Patterns that bind variables like that are found in many modern programming languages, and are a very pleasant feature.

If you’re willing to do heavyweight full-file preprocessing, you could write your own JavaScript dialect in which this code is valid. But in this chapter, we’re looking for lightweight tricks, not alternative languages. We need to find some kind of operator that gets us close enough to this goal, but can be expressed in the existing syntax of the language. Here’s what I came up with:

    var pngHead = binMatch("'\x89PNG\\r\\n\x1a\\n':str8 _:uint4 'IHDR':str4 " +
                           "width:uint4 height:uint4 depth:uint1");

    function pngSize(bytes) {
      var match;
      if (match = pngHead(bytes, 0))
        return {width: match.width, height: match.height};
      else
        throw new Error("Not a PNG file.");
    }

Patterns are precompiled from strings to functions, much like in the template example. The pattern string contains any number of binding:type pairs, where type is a word like str or uint followed by a byte size, and binding can be _ (an underscore) to ignore a field, a literal (in which case the pattern matches only when the value is equal to the literal), or a field name in which to store the value.

The very ugly string at the start of the pattern contains the first eight bytes of the PNG header.
The double backslashes are needed because the content of the string is interpreted as a string literal (again) in the generated code, so it may not contain raw newlines. After the file-identifying string, a four-byte field is found, which we ignore. Next, the string 'IHDR' announces the start of the image header, which starts with width, height, and color depth fields.

A function produced by binMatch takes a Uint8Array and an offset integer, and returns null for failed matches and an object containing the matched values when the match succeeds. The return object will have an additional field, end, which indicates the byte offset of the end of the match.

Here is the core of the match compiler. It is pleasantly small:

    function binMatch(spec) {
      var totalSize = 0, code = "", match;
      // The type name is letters only, so that a multi-digit size such as
      // the 16 in str16 is read as a whole rather than split.
      while (match = /^([^:]+):([a-zA-Z]+)(\d+)\s*/.exec(spec)) {
        spec = spec.slice(match[0].length);
        var pattern = match[1], type = match[2], size = Number(match[3]);
        totalSize += size;

        if (pattern == "_") {
          code += "pos += " + size + ";";
        } else if (/^[\w$]+$/.test(pattern)) {
          code += "out." + pattern + " = " + binMatch.read[type](size) + ";";
        } else {
          code += "if (" + binMatch.read[type](size) + " !== " +
            pattern + ") return null;";
        }
      }

      code = "if (input.length - pos < " + totalSize + ") return null;" +
        "var out = {end: pos + " + totalSize + "};" + code + "return out;";
      return new Function("input, pos", code);
    }

It does a (crude, non-error-checking) parse of the input string using a regular expression that matches a single pattern:type element. For wildcard (_) patterns, it simply generates code to move the offset (pos) forward. For other patterns, it uses a helper from binMatch.read (which we’ll look at momentarily) to generate an expression that builds up a JavaScript value from the bytes at the current position. For literals, it generates an if that returns null when the value found doesn’t match the literal.
Finally, an extra conditional is generated at the start of the function, which verifies that there are enough bytes in the array to match the pattern, and code that initializes and returns the output object is added.

These are the type-parsing functions needed for the example:

    binMatch.read = {
      uint: function(size) {
        for (var exprs = [], i = 1; i <= size; ++i)
          exprs.push("input[pos++] * " + Math.pow(256, size - i));
        return exprs.join(" + ");
      },
      str: function(size) {
        for (var exprs = [], i = 0; i < size; ++i)
          exprs.push("input[pos++]");
        return "String.fromCharCode(" + exprs.join(", ") + ")";
      }
    };

Given a size, they return a string that contains the expression that will advance the pos variable and produce a value of the specified type. Note that uint is big-endian (network byte order). Obvious extensions would be to write a little-endian type (uintL), which we’d need when parsing our earlier GIF example, and of course signed types (int, intL).

Further optimizations are possible. For example, we could pick literal strings and integers apart into bytes at compile time, and compare those bytes one by one instead of building up the composite value and comparing that. Or, we could first check all literals in a pattern and only then extract the output fields, so that the match does as little work as possible if it fails.

This is a nice property of static metaprogramming: the static part of the input (in this case, the pattern string) gives us a rather high-level view of the desired dynamic behavior, and we can pick a compilation strategy based on that information. If you were to interpret such a pattern at runtime, there would be less room for such decisions.
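The little-endian extension mentioned above might look like the following sketch. The `uintL` reader is an assumption about how such a type could be written, not code from the chapter, and the sample bytes are synthetic. The match compiler is repeated so that the block runs on its own; its regular expression restricts the type name to letters, so that multi-digit sizes are read correctly.

```javascript
// Match compiler as in the chapter (type name restricted to letters).
function binMatch(spec) {
  var totalSize = 0, code = "", match;
  while (match = /^([^:]+):([a-zA-Z]+)(\d+)\s*/.exec(spec)) {
    spec = spec.slice(match[0].length);
    var pattern = match[1], type = match[2], size = Number(match[3]);
    totalSize += size;
    if (pattern == "_") {
      code += "pos += " + size + ";";
    } else if (/^[\w$]+$/.test(pattern)) {
      code += "out." + pattern + " = " + binMatch.read[type](size) + ";";
    } else {
      code += "if (" + binMatch.read[type](size) + " !== " +
        pattern + ") return null;";
    }
  }
  code = "if (input.length - pos < " + totalSize + ") return null;" +
    "var out = {end: pos + " + totalSize + "};" + code + "return out;";
  return new Function("input, pos", code);
}

binMatch.read = {
  str: function(size) {
    for (var exprs = [], i = 0; i < size; ++i)
      exprs.push("input[pos++]");
    return "String.fromCharCode(" + exprs.join(", ") + ")";
  },
  // Hypothetical little-endian reader: the first byte read is the least
  // significant, so byte i is weighted by 256^i.
  uintL: function(size) {
    for (var exprs = [], i = 0; i < size; ++i)
      exprs.push("input[pos++] * " + Math.pow(256, i));
    return exprs.join(" + ");
  }
};

var gifHead = binMatch("'GIF89a':str6 width:uintL2 height:uintL2");

// "GIF89a", then width 320 (0x0140) and height 200 (0x00C8), little-endian.
var gifBytes = new Uint8Array([71, 73, 70, 56, 57, 97, 0x40, 0x01, 0xC8, 0x00]);
var gifMatch = gifHead(gifBytes, 0);
var noMatch = gifHead(new Uint8Array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]), 0);
var tooShort = gifHead(new Uint8Array([71, 73, 70]), 0);

console.log(gifMatch, noMatch, tooShort);
```

Because the byte weights are computed when the pattern is compiled, the generated function contains only constant multiplications, which is the kind of compile-time decision the preceding paragraph describes.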
If you want to test this code out, here’s a tiny HTML page that, using the code shown previously, allows you to pick a PNG file and will console.log its size:

The binary pattern compiler, by putting pieces of code (literals) from the input string directly in the generated code (without sanity-checking them), could, in slightly contrived situations such as building up the pattern string from user input, be used to inject code into a system. Always take a moment to consider this angle when you use eval-like constructs. For some tools, like the template compiler, giving the sublanguage the ability to run arbitrary code is part of the design. For others, like this one, it isn’t, and it is a good idea to make sure they can’t be used for that purpose. We could fix this by checking whether the syntax of the literals actually conforms to JavaScript literals, or by defining and parsing our own string and number syntax (which could also get rid of the double backslash problem) and not inserting any raw, unparsed code from the template at all.

Closing Thoughts

There is a major convenience gap between my fantasy syntax for pattern matching and the reality of what I came up with. Instead of elegantly expressing our pattern inline, we have to build it up beforehand, in order to ensure that it is built only once; reparsing and recompiling it every time it gets run would, in a situation where the matching happens multiple times, be embarrassingly wasteful. Instead of simply binding the variables in the pattern to local variables, we have to store them in an object.

In this case, I think that if you are doing actual binary parsing, the abstraction is helpful enough to live with the not-quite-ideal interface. But the case is representative of a wall that you hit when trying to push eval-based abstractions beyond a certain point.

There’s a pattern that works well: compiling a domain-specific language down to a piece of code.
Some languages can be expressed as JSON-like composite data, rather than plain strings (for example, a decision tree modeled as nested objects).

The awkward part lies in the interaction between the domain-specific language and the code around it. They can’t be mixed, due to the requirement that the compilation happens only once, whereas the code that makes use of the domain-specific functionality will typically run many times.

Small snippets of code with few external dependencies can be made part of the domain language. In some cases, you might even decide to include closures in your source data structure, in order to be able to access the local environment—yet even those won’t be able to close over the incoming data for a specific invocation of the functionality, but only over data that has the same lifetime as the compiled artifact.

For this reason, many domain-specific languages are better expressed using interpretation rather than compilation. jQuery is a good example of a successful interpreted domain language in JavaScript—it hacks method chaining in a way that allows for succinct DOM operations. This abstraction would be completely impractical (though probably faster) when executed as a compiled language.

The pattern where you should consider reaching for a compiled domain-specific language is:

• You’re writing chunks of repetitive, low-density code.
• Performance is important.
• The code chunks can conveniently be isolated in functions.
• You can think of a shorter, more elegant notation.

CHAPTER THREE
How to Draw a Bunny
Jacob Thornton

This chapter is not about rendering rabbits with JavaScript.

This chapter is about language and the difference between what it means to draw a “rabbit” and what it means to draw a “bunny.”

This chapter is not a tutorial. It’s an exegesis.

This chapter is at play.

What Is a Rabbit?
So she was considering, in her own mind (as well as she could, for the hot day made her feel very sleepy and stupid), whether the pleasure of making a daisy-chain would be worth the trouble of getting up and picking the daisies, when suddenly a White Rabbit with pink eyes ran close by her.
—Lewis Carroll, Down the Rabbit Hole

A “rabbit” is an animal you might find in a field, forest, or pet shop. It is a gregarious plant-eater with a short tail and floppy ears. It is an actual rabbit existing in reality. A “rabbit” cannot talk to itself. A “rabbit” does not run late. From this point forward, when we speak of rabbits, we speak of these ordinary, everyday rabbits.

For the purposes of this chapter, to “draw a rabbit” is to apply various drawing techniques in such a way as to render an image of a rabbit indistinguishable from the actual rabbit itself. It is to approach a level of realism on par with that of a photograph. A rabbit drawing is strictly referential. It strives to be a copy.

Drawing a rabbit is mechanical and spec-based. There is a correct way to draw a rabbit and an incorrect way to draw a rabbit.

When you draw a rabbit, you are always drawing a very particular rabbit. Deviations from the rabbit model should be regarded as errors. The more your rabbit rendering stays on model, the better.

What Is a Bunny?

After a time she heard a little pattering of feet in the distance, and she hastily dried her eyes to see what was coming. It was the White Rabbit returning, splendidly dressed, with a pair of white kid gloves in one hand and a large fan in the other: he came trotting along in a great hurry, muttering to himself as he came, “Oh! the Duchess, the Duchess! Oh! won’t she be savage if I’ve kept her waiting!”
—Lewis Carroll, Down the Rabbit Hole

A “bunny” is not just a young, cute rabbit. A bunny is a splendidly dressed abstraction. A playful resemblance that prioritizes an identity other than the rabbit. It is a symbol.
There are several examples from pop culture of bunnies: Bugs Bunny, the Energizer Bunny, etc. These icons are always characters first and rabbits second (or third). Here, the rabbit identity is hijacked and subjugated to serve a new ruling identity.

To “draw a bunny” is to play within the loose constraints of an already existing identity (the rabbit) to create something entirely new. The connotation of the word “bunny” itself invokes a lack of seriousness which serves to disarm and undermine the rigid structure of the rabbit, promoting both creative exploration and expression.

Consider the bunny heads of Ray Johnson (pictured above), a correspondence artist from New York.

In January 1964, Ray Johnson signed a letter to his friend William (Bill) S. Wilson with a small picture of a bunny head next to his name. This image rapidly proliferated, primarily becoming Johnson’s signature and “self portrait” as personifications of how he felt on a given day. Johnson also used the bunny head to represent other “characters” who populate his works, as well as the subject of one of his “How to draw” series.
—Frances F.L. Beatty, Ph.D., The Ray Johnson Estate

When you draw bunnies, their proximity to a real image of a rabbit isn’t called into question. For Johnson, the bunnies ceased to be rabbits, instead becoming a vehicle for alternative expression; a means to creativity; and an exercise in play, imagination, inventiveness, and originality.

What Does This Have to Do with JavaScript?

JavaScript is an expressive language. Expressions are what lie beyond the literal compiled logic of a program. They are what we as humans read and interpret. The expressiveness of JavaScript is a vehicle through which software developers speak. It is a way for developers to infuse their code with semantic value: different styles, dialects, and character.
And this potential for linguistic play inherent in JavaScript is precisely where we begin to see “bunnies.”

To draw a rabbit in JavaScript is to copy patterns out of books and slides, to mimic specific styles from blogs, and more generally to reproduce already established forms and expressions.

Alternatively, to draw a bunny here is to undertake an exercise in experimentation. It is to unearth alternative forms from within the language and then combine these forms in functional yet inventive ways.

In drawing JavaScript bunnies, you’re playing. It’s fun. It challenges and evolves both your individual and the community’s understanding of the language. It opens up new potential solutions to old problems, and exposes flaws in old assumptions. It establishes a personal relationship between you and the code you produce. It makes writing JavaScript a craft. An art. It makes reading software personal and purposeful. It establishes an audience for your program other than just the compiler. Intent becomes clearer. Code becomes more consistent. And you grow as a developer.

With this in mind, consider the following conditional statement, which checks to see if a property exists; if the property doesn’t exist, it calls a method to set it. Traditionally, this logic might have looked something like this:

    if (!this.username) {
      this.setUsername();
    }

As an expression, this logic reads: if not a username, then set a username. However, using the logical OR operator you could express this same statement in a more minimalist way:

    this.username || this.setUsername()

The expression: a username exists, or set a username.

These two code blocks are functionally equivalent, yet their expressions are different. They read differently. Where the former has a sort of exactness and formality, the latter is pithy and short. Exploring these variations in expression is precisely what drawing bunnies is all about.
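The equivalence of the two forms can be checked with a small sketch. The makeAccount helper, its calls counter, and the 'anonymous' default are all hypothetical, introduced only to observe how often setUsername runs; nothing here comes from a real library.

```javascript
// Hypothetical object used only to observe when setUsername is called.
function makeAccount(username) {
  return {
    username: username,
    calls: 0,
    setUsername: function () {
      this.calls++;
      this.username = 'anonymous';
    },
    // Statement form from the text.
    ensureWithIf: function () {
      if (!this.username) {
        this.setUsername();
      }
    },
    // Expression form from the text: || evaluates its right-hand side
    // only when the left-hand side is falsy.
    ensureWithOr: function () {
      this.username || this.setUsername();
    }
  };
}

var named = makeAccount('alice');
named.ensureWithIf();
named.ensureWithOr();

var unnamed = makeAccount('');
unnamed.ensureWithOr();
unnamed.ensureWithOr(); // no-op the second time: username is now set

console.log(named.calls, unnamed.calls, unnamed.username);
```

In both forms the method is skipped whenever username is truthy, so the choice between them really is one of expression rather than behavior.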
And what’s more, by using expressions in conjunction with other like expressions a developer can begin to architect an overarching voice or tone in a program.

Let’s consider a second reduced example. Imagine looking inside an array for a username. If the username is not present, you want to add the username to the array. The logic for this might be expressed as follows:

    if (users.indexOf(this.username) === -1) {
      users.push(this.username)
    }

This code reads: if the username has an index in the users array that is equal to -1, then push the username into the users array.

An alternative way to express this statement might be to make use of the bitwise NOT operator. The bitwise NOT operator inverts the bits of its operand, turning a -1 into a 0 (or falsy). The preceding logic might then be rewritten simply as:

    ~users.indexOf(this.username) || users.push(this.username)

The expression: the username is in the array, or add it.

As you begin to build up these expressions into programs, a certain rhythm and time signature emerges. And as you improve as an engineer, you can begin to orchestrate different phrasings and melodies into your software as well. This establishes a consistent rhythm at the project level, which will make it much easier to flow from one piece of a program to another.

The following is a simple function that, given x, y, w, h, and placement arguments, returns an offset object with a top and left value.
It is written in a decidedly slow manner, with a very deliberate, heavy rhythm (switch > case… case… case… case… return):

    function getOffset (x, y, w, h, placement) {
      var offset
      switch (placement) {
        case 'bottom':
          offset = { top: y + h, left: x + w/2 }
          break
        case 'top':
          offset = { top: y, left: x + w/2 }
          break
        case 'left':
          offset = { top: y + h/2, left: x }
          break
        case 'right':
          offset = { top: y + h/2, left: x + w }
          break
      }
      return offset
    }

Notice the difference between this function and the following function, not in terms of computing performance (where the difference is inconsequential), but rather in pure cognitive pacing. The next function returns the same result, but with a quicker, more succinct rhythm (return > this/that, this/that, this/that):

    function getOffset (x, y, w, h, placement) {
      return placement == 'bottom' ? { top: y + h, left: x + w/2 } :
             placement == 'top'    ? { top: y, left: x + w/2 } :
             placement == 'left'   ? { top: y + h/2, left: x } :
                                     { top: y + h/2, left: x + w }
    }

A third function might even exaggerate the pacing further, focusing in on the return object itself—clearly calling out expected properties “top” and “left”—but with a more complex rhythm, forking the conditions at the object’s properties:

    function getOffset (x, y, w, h, placement) {
      return {
        top  : placement == 'bottom' ? y + h :
               placement == 'top'    ? y :
                                       y + h/2,
        left : placement == 'right'  ? x + w :
               placement == 'left'   ? x :
                                       x + w/2
      }
    }

As you’ve begun to see, expressions guide our reading of software. In JavaScript, the potential for this sort of variation both enables and is enabled by experimentation and play—which therefore should be championed and not discouraged.

With So Much Variation, Which Way Is Correct?

Imagine sitting several adults down in a room and providing them with an actual image of a rabbit and adequate drawing supplies. Imagine asking them each to draw a rabbit.
Depending on the group’s exposure to various drawing techniques, you’d likely receive a variety of renderings, ranging from rather crude to rather capable.

Variety here becomes a metric for the lack of experience in drawing amongst the group. Which is to say, if everyone were perfect at illustration they would each have rendered a photorealistic image, indistinguishable from the image of the rabbit; there wouldn’t have been any variety at all.

This is because to draw a rabbit is to exercise one’s ability to duplicate. It is an exercise in experience and mimicry. There is a right answer, and thus, there isn’t room for creativity.

But what if you had asked the same group to draw a bunny?

Arguably the request is at once less threatening, less rigid, and less scientific. To draw a bunny is to draw a rabbit-like thing. It is exceedingly difficult to be critical of a bunny drawing because at most it’s only ever a resemblance.

Following this, you could expect the variety in the group’s images to be even more exaggerated. To draw a bunny is to celebrate and to lean on variety. Here, however, variety no longer takes a negative form. Instead, it is symptomatic of the potential for creative expression implicit in the act of drawing without bounds. It is a positive metric for inventiveness and imagination.

To draw a bunny is to engage with variety. It serves to challenge the image of the rabbit by introducing new means of achieving likeness.

Consider immediately invoked function expressions (IIFEs). By convention, an IIFE takes one of the two following forms:

    (function (){})()
    (function (){}())

But drawing bunnies is not about convention. Rather, it’s an exercise in upsetting convention. And yet at the same time it’s about positive variation—one manifestation of an expression not being absolutely superior to another.
With this in mind, here are a few other ways you may write an IIFE:

    !function (){}()
    ~function (){}()
    +function (){}()
    -function (){}()
    new function (){}
    1,function (){}()
    1&&function (){}()
    var i=function (){}()

Each manifestation has its own unique qualities and advantages—some with fewer bytes, some safer for concatenation, each valid and each executable.

How Does This Affect the Classroom?

Because school is limited by grades, it spends much of its time propagandizing the drawing of rabbits.

If you’ve taken a drawing class, you’ve almost certainly drawn a block of wood. You’ve spent hours shading a piece of fruit. You’ve studied proportions. You’ve been lectured on perspective. You’ve been given tools to break things down to a grid. And, after a few months of intense studying, your apple does begin to look a bit more like the apple sitting in front of you.

To be sure, this isn’t a bad thing. In fact, quite the opposite. These practices give you foundational knowledge on top of which you can build more complex structures. Furthermore, you can turn the tools in on themselves and exploit them in very interesting ways. And perhaps best of all, they introduce conventions and a new language through which you can engage with your peers.

The problem emerges when students think of these tools in absolute ways. This is the right way to do X; this is the only way to do Y. As you might imagine, this absolutism breeds arrogance, narcissism, and an environment rooted in peer opposition.

Is This Art? And Why Does That Matter?

It’s true to say that when you paint anything, you are also painting not only the subject, but you are painting yourself as well as the object that you are trying to record. Because painting is a dual performance. Because, for instance, if you look at a Rembrandt painting, I feel like I know very much more about Rembrandt than I do about the sitter.
—Francis Bacon, interview with David Sylvester

Briefly consider two libraries I’ve contributed to this past year: Ratchet and Bootstrap.

Functionally, the content of both libraries is as it should be. What’s interesting are the undertones—or rather, the potential for the same sort of undertones you would expect to find in painting, music, or creative writing. Which is to say, the differences in style between these two projects aren’t just arbitrary preferences. They’re very definite, derived expressions, representative of a certain mood over time.

Bootstrap reads very fun, not serious—nearly every line is a joke. It’s trying to provoke you. Taking shortcuts. Demanding that you reread it. Reread it again. It’s very pop. Very optimistic. Forward. Playful.

The code for Ratchet is very different. It’s very conservative. It’s not meant to draw attention to itself. It’s very explicit. Assertive, necessary. It’s easy to approach. It’s a vanilla milkshake.

Insofar as art has been characterized in terms of mimesis, expression, communication of emotion, and other such values, it follows that software, when written expressively, is also an artistic gesture. What’s more, this realization reinforces our insistence on the importance of drawing bunnies inasmuch as the exercise stretches one’s creative and expressive capacities, enabling the formation of opinions and development of style, while also helping to strengthen communication, exploration, and imaginative faculties in the programmer.

Along these lines, my good friend Angus Croll has been exploring further creative manifestations of code with his great articles on literary figures writing JavaScript. In his articles, he writes several functions to return a Fibonacci series of a given length, each program in the style of a different literary figure: Hemingway, Breton, Shakespeare, Poe.
The results are comedic, but the point is consistent:

The joy of JavaScript is rooted in its lack of rigidity and the infinite possibilities that this allows for. Natural languages hold the same promise. The best authors and the best JavaScript developers are those who obsess about language, who explore and experiment with language every day and in doing so develop their own style, their own idioms, and their own expression.
—Angus Croll, If Hemingway wrote JavaScript

Beautiful JavaScript is an art. Reading through it should feel uniform; it should allow you to flow from expression to expression. It’s not just about executing logic; it’s about establishing pace and reflecting a little bit of yourself. It’s about taking pride in what you create.

What Does This Look Like?

In 1945, Picasso released a suite of 11 lithographs entitled “Bull.” In this series he deconstructs the image of the bull, from realist rendering to hyperreduced line drawing, progressively subtracting from and reimagining its form with each plate.

What’s of particular interest here is the progression. Beginning with the realistic brush drawing, Picasso bulks the form up, increasing its expression of power before dissecting it with lines of force, following the contours of its muscles and skeleton, ultimately reducing and simplifying the image into a line.

This study is considered the ultimate master class in abstraction, and what’s more, it’s a classic example of Picasso drawing bunnies.

This same exercise in abstraction can be applied to JavaScript.

I had the privilege of working with Alex MacCaw during my time at Twitter. There, we had a number of conversations about interview philosophies and code challenges. During one of our discussions he mentioned that he had always asked the same introductory interview question during phone screens—and since then, I have adopted it as my first question as well.
The question goes, given the following condition, define explode:

    if ('alex'.explode() === 'a l e x') interview.nextQuestion()
    else interview.terminate()

There are a number of ways to answer this question. Let’s begin with the most verbose:

    String.prototype.explode = function () {
      var i
      var result = ''
      for (i = 0; i < this.length; i++) {
        result = result + this[i]
        // compare against the source string's length, not the result's
        if (i < this.length - 1) {
          result = result + ' '
        }
      }
      return result
    }

This block is swollen and distended, yet deliberate. There’s nothing clever. It’s by the book. And it’s easily the most common response to the question.

Simply put, we declare variables i and result, iterating over the string’s value, pushing its characters to result and conditionally adding a space between each character until eventually we return. Fine.

But now let’s try something a bit cleverer:

    String.prototype.explode = function (f,a,t) {
      for (f = a = '', t = this.length; a++ < t;) {
        f += this[a-1]
        a < t && (f += ' ')
      }
      return f //ollow @fat
    }

If you write code like this, people will hate you. Without question.

It’s playful. It looks to trick you. To trick the language. It assaults the reader. It’s concerned with everything, except its own logic. It’s vain. But it’s beautiful (to me).

In this block, we’re scoping the variables to the function by including them as pseudoarguments (which spell my Twitter handle). The for loop saves some characters by setting both f and a to an empty string; in the loop condition, the ++ increment operator then coerces a to a number, so the comparison sees 0 and a is 1 by the time the loop body runs. On the next line the program subtracts 1 from a before indexing the string to make up for starting the loop at 1 (rather than 0). It then conditionally adds a space to the end, before completing the loop and returning the result.

The next iteration of the solution is by far the simplest, leaning heavily on the language’s tool belt.
Perhaps surprisingly, this response is actually very uncommon to receive in real interviews:

    String.prototype.explode = function () {
      return this.split('').join(' ')
    }

This solution is about getting to the next question. It’s clever, but not overly so. It’s blunt. It’s mature. If the previous solution was crass, this one is urbane.

And finally, the absolute simplest:

    String.prototype.explode = function (/*smart a$$*/) {
      return 'a l e x'
    }

Which I’ve never gotten.

What Did I Just Read?

If drawing rabbits in JavaScript means copying patterns out of books or mimicking specific styles from blogs, drawing bunnies is about experimentation and creative expression.

To draw a bunny is to pervert the conventions of the language. To draw your breath or to get it all out as fast as possible. It’s an exercise in discovering and pushing the bounds of your understanding of the language. It’s about reinforcing and challenging JavaScript as a craft.

In drawing JavaScript bunnies, you’re always at play. And you’re getting better.

CHAPTER FOUR
Too Much Rope, or JavaScript for Teams
Daniel Pupius

Beauty is power and elegance, right action, form fitting function, intelligence, and reasonability.
—Kim Stanley Robinson, Red Mars

JavaScript is a flexible language. In fact, this entire book is a testament to its expressiveness and dynamism. Within these pages you’ll hear stories of how to bend the language to your will, descriptions of how to use it to experiment and play, and suggestions for seemingly contradictory ways to write it.

My job is to tell a more cautionary tale.

I’m here to ask the question: what does it mean to write JavaScript in a team? How do you maintain sanity with 5, 10, 100 people committing to the same codebase? How do you make sure new team members can orient themselves quickly? How do you keep things DRY without forcing broken abstractions?
Know Your Audience

In 2005 I joined the Gmail team in sunny Mountain View, California. The team was building what many considered at the time to be the pinnacle of web applications. They were awesomely smart and talented, but across Google, JavaScript wasn’t considered a “real programming language”—you engineered backends, you didn’t engineer web UIs—and this mentality affected how they thought about the code.

Furthermore, even though the language was 10 years old, JavaScript engines were still limited: they were designed for basic form validation, not building applications. Gmail was starting to hit performance bottlenecks. To get around these limitations much of the application was implemented as global functions, anything requiring a dot lookup was avoided, sparse arrays were used in place of templates, and string concatenation was a no-no. The team was writing first and foremost for the JavaScript engine, not for themselves or others. This led to a codebase that was hard to follow, inconsistent, and sprawling.

Instead of optimizing by hand, we transitioned to a world where code was written for humans and the machine did the optimizations. This wasn’t a new language, mind you—it was important that the raw code be valid JavaScript, for ease of understanding, testing, and interoperability. Using the precursor to the Closure Compiler, we developed optimization passes that would collapse namespaces, optimize strings, inline functions, and remove dead code. This is work much better suited to a machine, and it allowed the raw code to be more readable and more maintainable.

TIP
Lesson 1: Code for one another, and use tools to perform mechanical optimizations.

Stupid Good

As the old adage goes, debugging is harder than writing code, so if you write the cleverest code you can, you’ll never be clever enough to debug it.

It can be fun to come up with obscure and arcane ways of solving problems, especially since JavaScript gives you so much flexibility.
But save it for personal projects and JavaScript puzzlers.

When working in a team you need to write code that everyone is going to understand. Some parts of the codebase may go unseen for months, until a day comes when you need to debug a production issue. Or perhaps you have a new hire with little JavaScript experience. In these types of situations, keeping code simple and easy to understand will be better for everyone. You don’t want to spend time decoding some bizarro, magical incantation at two in the morning while debugging production issues.

Consider the following:

    var el = document.querySelector('.profile');
    el.classList[['add','remove'][+el.classList.contains('on')]]('on');

And an alternative way of expressing the same behavior:

    var el = document.querySelector('.profile');
    if (el.classList.contains('on')) el.classList.remove('on');
    else el.classList.add('on');

Saying that the second snippet is better than the first may seem in conflict with the concept that “succinctness = power.” But I believe there is a disconnect that stems from the common synonyms for succinct: compact, brief. I prefer terse as a synonym:

using few words, devoid of superfluity, smoothly elegant

The first snippet is more compact than the second snippet, but it is denser and actually includes more symbols. When reading the first snippet you have to know how coercion rules apply when using a numeric operator on a Boolean, you have to know that methods can be invoked using subscript notation, and you have to notice that square brackets are used for both defining an array literal and method lookup.

The second snippet, while longer, actually has less syntax for the reader to process.
Furthermore, it reads like English: “If the element’s class list contains ‘on’, then remove ‘on’ from the class list; otherwise, add ‘on’ to the class list.”

All that said, an even better solution would be to abstract this functionality and have the very simple, readable, and succinct:

    toggleCssClass(document.querySelector('.profile'), 'on');

TIP
Lesson 2: Keep it simple; compactness != succinctness.

Keep It Classy

When I’m talking with “proper programmers,” they often complain about how terrible JavaScript is. I usually respond that JavaScript is misunderstood, and that one of the main issues is that it gives you too much rope—so inevitably you end up hanging yourself.

There were certainly questionable design decisions in the language, and it is true that the early engines were quite terrible, but many of the problems that occur as JavaScript codebases scale can be solved with pretty standard computer science best practices. A lot of it comes down to code organization and encapsulation.

Unfortunately, until we finally get ES6 we have no standard module system, no standard packaging mechanisms, and a prototypal inheritance model that confuses a lot of people and begets a million different class libraries.

While JavaScript’s prototypal inheritance allows instance-based inheritance, I generally suggest when working in a team that you simulate classical inheritance as much as possible, while still utilizing the prototype chain. Let’s consider an example:

    var log = console.log.bind(console);

    var bob = {
      money: 100,
      toString: function() { return '$' + this.money }
    };

    var billy = Object.create(bob);
    log('bob:' + bob, 'billy:' + billy); // bob:$100 billy:$100

    bob.money = 150;
    log('bob:' + bob, 'billy:' + billy); // bob:$150 billy:$150

    billy.money = 50;
    log('bob:' + bob, 'billy:' + billy); // bob:$150 billy:$50

    delete billy.money;
    log('bob:' + bob, 'billy:' + billy); // bob:$150 billy:$150

In this example, billy inherits from bob.
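The relationship between the two objects can also be inspected directly. The following is a minimal sketch that reuses bob and billy and relies only on standard built-ins (Object.getPrototypeOf and Object.prototype.hasOwnProperty); the variables holding the results are names chosen here for illustration.

```javascript
var bob = {
  money: 100,
  toString: function () { return '$' + this.money; }
};
var billy = Object.create(bob);

// billy's prototype is bob itself, not a copy of it.
var protoIsBob = Object.getPrototypeOf(billy) === bob;

// Before any assignment, money is found on bob through the prototype chain.
var ownBefore = Object.prototype.hasOwnProperty.call(billy, 'money');

// Assigning creates an own property on billy that shadows bob's.
billy.money = 50;
var ownAfterSet = Object.prototype.hasOwnProperty.call(billy, 'money');

// Deleting removes only billy's own property; lookups delegate to bob again.
delete billy.money;
var ownAfterDelete = Object.prototype.hasOwnProperty.call(billy, 'money');
var moneyAfterDelete = billy.money;

console.log(protoIsBob, ownBefore, ownAfterSet, ownAfterDelete, moneyAfterDelete);
```

Checking for an own property before and after each step is one way to tell a value that billy holds himself from one he is merely borrowing from bob.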
What that means in practice is that billy’s prototype is bob (Object.getPrototypeOf(billy) === bob), and nonmatching property lookups on billy will delegate to bob. In other words, to begin with billy’s $100 is bob’s $100; billy isn’t a copy of bob. Then, when billy gets his own money, it essentially overrides the property that was being inherited from bob. Deleting billy’s money doesn’t set it to undefined; instead, bob’s money becomes billy’s again.

This can be rather confusing to newcomers. In fact, developers can go a long time without ever knowing precisely how prototypes work. So, if you use a model that simulates classical inheritance, it increases the chances that people on your team will get on board quickly and allows them to be productive without necessarily needing to understand the details of the language.

Both the Closure library’s goog.inherits and Node.js’s util.inherits make it easy to write class-like structures while still relying on the prototype for wiring:

    // Node.js core modules used by this example
    var util = require('util');
    var EventEmitter = require('events').EventEmitter;

    function Bank(initialMoney) {
      EventEmitter.call(this);
      this.money = initialMoney;
    }
    util.inherits(Bank, EventEmitter);

    Bank.prototype.withdraw = function (amount) {
      if (amount <= this.money) {
        this.money -= amount;
        this.emit('balance_changed', this.money); // inherited
        return true;
      } else {
        return false;
      }
    };

This looks very similar to inheritance in other languages. Bank inherits from EventEmitter; the superclass’s constructor is called in the context of the new instance; util.inherits wires up the prototype chain just like we saw with bob and billy earlier; and then the property lookup for emit falls to the EventEmitter “class.”

A suggested exercise for the reader is to create instances of a class without using the new keyword.

TIP
Lesson 3: Just because you can doesn’t mean you should.

TIP
Lesson 4: Utilize familiar paradigms and patterns.

Style Rules

The need for consistent style as codebases and teams grow is nothing unique to JavaScript. However, where many languages are opinionated about coding style, JavaScript is lenient and forgiving. This means it’s all the more important to define a set of rules the team should stick to.

Good style is subjective and can be difficult to define, but there are many cases where certain style choices are quantifiably better than others. In the cases where there isn’t a quantifiable difference, there is still value in making an arbitrary choice one way or the other.

TIP
Style guides provide a common vocabulary so people can concentrate on what you’re saying instead of how you’re saying it.

A good style guide should set out rules for code layout, indentation, whitespace, capitalization, naming, and comments. It is also good to create usage guides that explain best practices and provide guidance on how to use common APIs. Importantly, these guides should explain why a rule exists; over time you will want to reevaluate the rules and should avoid them becoming cargo cults.

Style guides should be enforced by a linter and if possible coupled with a formatter to remove the mechanical steps of adhering to the guide. You don’t want to waste cycles correcting style nits in code reviews.

The ultimate goal is to have all code look like it was written by the same person.

TIP
Lesson 5: Consistency is king.

Evolution of Code

When I was first working on Google Closure there was no simple utility for making XMLHttpRequests; everything was rolled up in large, application-specific request utilities. So, in my naiveté XhrLite was born.

XhrLite became popular—no one wants to use a “heavy” implementation—but its users kept finding features that were missing. Over time small patches were submitted, and XhrLite accumulated support for form encoded data, JSON decoding, XSSI handling, headers, and more—even fixes for obscure bugs in FF3.5 web workers.
Needless to say, the irony of “XhrLite” becoming a distinctly heavy behemoth was not lost, and eventually it was renamed “XhrIo.” The API, however, remained bloated and cumbersome.

TIP Small changes—reasonable in isolation—evolve into a system that no one would ever design if given a blank canvas.

Evolutionary complexity is almost a force of nature in software development, but it has always seemed more pronounced with JavaScript. One of the strengths that helped spur JavaScript’s popularity is that you can get up and running quickly. Whether you’re creating a simple web app or a Node.js server, a minimal dev environment and a few lines of code yield something functional. This is great when you’re learning, or prototyping, but can lead to fragile foundations for a growing team.

You start out with some simple HTML and CSS. Perhaps you add some event handlers using jQuery. You add some XHRs, maybe you even start to use pushState. Before long you have an actual single-page application, something you never intended at first. Performance starts to suffer, there are weird race conditions, your code is littered with setTimeouts, there are hard-to-track-down memory leaks…you start wondering if a traditional web page would be better.

You have the duck-billed platypus of applications.

TIP Lesson 6: Lay good foundations. Be mindful of evolutionary complexity.

Conclusion

JavaScript’s beauty is in its pervasiveness, its flexibility, and its accessibility. But beauty is also contextual. What started as a “scripting language” is now used by hundred-plus-person teams and forms the building blocks of billion-dollar products. In such situations you can’t write code in the same way you would hacking up a one-person website. So…

1. Code for one another, and use tools to perform mechanical optimizations.
2. Keep it simple; compactness != succinctness.
3. Just because you can doesn’t mean you should.
4. Utilize familiar paradigms and patterns.
5. Consistency is king.
6. Lay good foundations. Be mindful of evolutionary complexity.

CHAPTER FIVE
Hacking JavaScript Constructors for Model Harmony
Ben Vinegar

JavaScript MVC—or MVW (Model, View, “Whatever”)—frameworks come in many flavors, shapes, and sizes. But as the name suggests, they all provide developers with a fundamental component: models, which “model” the data associated with the application. In client-side web apps, they typically represent a database-backed object.

Last year at Disqus, we rewrote our embedded client-side application in Backbone, a minimal MVC framework. Backbone is often criticized for having an unsophisticated “view” layer, but one thing it does particularly well is managing models.

Defining a new model in Backbone looks like this:

    var User = Backbone.Model.extend({
        defaults: {
            username: '',
            firstName: '',
            lastName: ''
        },

        idAttribute: 'username',

        fullName: function () {
            return this.get('firstName') + ' ' + this.get('lastName');
        }
    });

Here’s some sample code that initializes a new model, and demonstrates how that model instance might be used in an application:

    var user = new User({
        username: 'john_doe',
        firstName: 'John',
        lastName: 'Doe'
    });

    user.fullName(); // John Doe
    user.set('firstName', 'Bill');
    user.save(); // PUTs changes to server endpoint

These are simple examples, but client-side models can be very powerful, and they are typically—ahem—the backbone of any nontrivial MVC app.

Additionally, Backbone provides what are called “collection” classes, which help developers easily manipulate common sets of model instances.
You can think of them as superpowered arrays, loaded with helpful utility functions:

    var UserCollection = Backbone.Collection.extend({
        model: User,
        url: '/users'
    });

    var users = new UserCollection();
    users.fetch(); // Fetches user records via HTTP

    var johndoe = users.get('john_doe'); // Find by primary idAttribute

Not all MVC frameworks implement a Collection class exactly like Backbone does. For example, Ember.js defines a CollectionView class, which similarly maintains a set of common models, but tied to a DOM representation. API differences aside, it’s clear that developers commonly manipulate and render sets of objects, and frameworks provide different facilities for doing so.

Doppelgangers

When you’re working with large or even medium-sized client applications, it’s common to have multiple model instances representing the same database-backed object. This usually happens when you have multiple views of some data, such that a model appears in two or more views.

Consider this example, which introduces two new collections of users: Followers, for users that are following a given user (say, on a social network), and Following, for users whom a given user happens to be following. A user who is both a follower and being followed will appear in both collections, in which case we will have duplicate instances of the same database-backed model:

    var FollowingCollection = UserCollection.extend({
        url: '/following'
    });

    var FollowersCollection = UserCollection.extend({
        url: '/followers'
    });

    var following = new FollowingCollection();
    var followers = new FollowersCollection();

    following.fetch();
    followers.fetch();

    var user1 = following.get('johndoe');
    var user2 = followers.get('johndoe');

    user1 === user2; // false

Having multiple instances of the same model has two major downsides.

First, you are using additional memory to represent the same object.
Depending on the complexity of the model and the sizes of the attributes it holds, it’s not unreasonable for a single instance to consume kilobytes of memory. If instances are duplicated dozens or hundreds of times—a very possible scenario for long-lived single-page applications—they can quickly become a memory sink.

Second, if you or the user modifies the state of one of these models on the client, other instances of that model will fall out of sync. This can happen through a number of means, like if the user changes the state of the object via the UI, or an update created by another user is sent to the client via a real-time service:

    user1.set('firstName', 'Johnny');
    user2.get('firstName'); // still John

In this simple example, where the same user appears in only two different collections, it might seem trivial to update both instances manually with the new property. But it’s easy to imagine how in a complex application the same user object might exist across dozens of different collections—not just follower/following lists, but also notifications, feed items, logs, and so on.

It would be terrific if, instead of having to track down every instance of a given model, we could have each instance update itself intelligently. Or better yet, if we never had duplicated instances to begin with.

Miniature Models of Factories

A common solution for handling duplicate instances is to use a factory function when you create a new model instance. If the factory detects that a model instance already exists, it will just return the existing instance instead:

    var userCache = {};

    function UserFactory(attrs, options) {
        var username = attrs.username;

        // Create and remember the instance the first time a username is seen
        if (!userCache[username]) {
            userCache[username] = new User(attrs, options);
        }
        return userCache[username];
    }

    var user1 = UserFactory({ username: 'johndoe' });
    var user2 = UserFactory({ username: 'johndoe' });

    user1 === user2; // true

In order to use this pattern effectively, you must always use this factory function when creating new instances. This is a simple enough chore when managing your own code. But difficulty arises when you try to enforce this pattern in codebases you aren’t responsible for, like third-party libraries and plugins.

Consider, for example, the Collection.prototype._prepareModel function from Backbone’s source code. Backbone uses this function to “prepare” and ultimately create a new model instance to add to a collection. It is invoked by a variety of means, such as when you’re populating a collection with models returned from an HTTP resource:

    // Prepare a hash of attributes (or other model) to be added to this
    // collection.
    Backbone.Collection.prototype._prepareModel = function(attrs, options) {
        if (attrs instanceof Model) {
            if (!attrs.collection) attrs.collection = this;
            return attrs;
        }
        options || (options = {});
        options.collection = this;
        var model = new this.model(attrs, options);
        if (!model._validate(attrs, options)) {
            this.trigger('invalid', this, attrs, options);
            return false;
        }
        return model;
    };

Of particular importance is this line:

    var model = new this.model(attrs, options);

This is what actually creates a new instance of the model associated with this collection. this.model is a reference to the constructor of the model class the collection wraps.
It’s specified when you define a new collection class, like we did earlier:

    var UserCollection = Backbone.Collection.extend({
        model: User,
        url: '/users'
    });

What’s pretty cool is that instead of passing the User class to the collection definition, we can pass the UserFactory class (our factory function that returns unique model instances):

    var UserCollection = Backbone.Collection.extend({
        model: UserFactory,
        url: '/users'
    });

UserFactory will then be assigned to this.model, and will be invoked by the new operator when the collection creates a new instance:

    var model = new this.model(attrs, options); // this.model is UserFactory

But wait a minute. Now we’re invoking UserFactory via the new operator. We weren’t doing that earlier; we were calling the function directly. Does this even work?

It turns out it does.

Constructor Identity Crisis

What exactly happens when you use the new operator on a function? A few things:

1. It creates a new object.
2. It sets that object’s prototype (its internal [[Prototype]] link) to the prototype property of the constructor function.
3. It invokes the constructor function, with this assigned to the newly created object.
4. It returns the object, unless the constructor function returns a nonprimitive value. In that case, the nonprimitive value is returned instead.

That last one is the neat part. If your constructor function returns a nonprimitive value, that becomes the result of the new operation. Since UserFactory returns a nonprimitive, that means that these two operations return the same value:

    var user1 = UserFactory({ username: 'johndoe' });
    var user2 = new UserFactory({ username: 'johndoe' });

    user1 === user2; // true

This property of the new operator is pretty handy. It means that you can essentially discard the object created by new, and return what you want—in our case, a unique user model instance.
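The return rule in step 4 can be checked in isolation, without Backbone. The following is a minimal sketch; ReturnsObject and ReturnsPrimitive are hypothetical constructors invented for this illustration and are not part of the chapter's code.

```javascript
// Hypothetical constructor that returns a nonprimitive value.
function ReturnsObject() {
    this.tag = 'discarded';           // set on the object created by new
    return { tag: 'replacement' };    // nonprimitive: becomes the result of new
}

// Hypothetical constructor that returns a primitive value.
function ReturnsPrimitive() {
    this.tag = 'kept';                // set on the object created by new
    return 42;                        // primitive: ignored by new
}

var a = new ReturnsObject();
var b = new ReturnsPrimitive();

console.log(a.tag);                         // 'replacement'
console.log(a instanceof ReturnsObject);    // false
console.log(b.tag);                         // 'kept'
console.log(b instanceof ReturnsPrimitive); // true
```

Note that the replacement object is not an instance of the constructor it came from, because it never received that constructor's prototype.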
Making It Scale

In the examples so far, UserFactory has been a single-purpose factory function; it only guarantees uniqueness of User instances. While that’s super handy, there are probably other models for which we’ll want to guarantee uniqueness. So, it would be nice to have a general-purpose wrapper that can work for any model class.

In a moment we’ll look at a function called UniqueFactory. It’s actually a constructor function that is invoked with the new operator, and takes as input a normal Backbone model class. It returns a wrapped constructor function that generates unique instances of that class. For example, it can actually generate a UserFactory class:

    var UserFactory = new UniqueFactory(User);

    var user1 = UserFactory({ username: 'johndoe' });
    var user2 = new UserFactory({ username: 'johndoe' });

    user1 === user2; // true

The UniqueFactory implementation is shown here:

    /**
     * UniqueFactory takes a class as input, and returns a wrapped version of
     * that class that guarantees uniqueness of any generated model instances.
     *
     * Example:
     *   var UniqueUser = new UniqueFactory(User);
     */
    function UniqueFactory (Model) {
        var self = this;

        // The underlying Backbone Model class
        this.Model = Model;

        // Tracked instances of this model class
        this.instances = {};

        // Constructor to return that will be used for creating new instances
        var WrappedConstructor = function (attrs, options) {
            return self.getInstance(attrs, options);
        };

        // For compatibility with Backbone collections, our wrapped
        // model prototype should point to the *actual* Model prototype
        WrappedConstructor.prototype = this.Model.prototype;

        return WrappedConstructor;
    }

    UniqueFactory.prototype.getInstance = function (attrs, options) {
        options = options || {};

        var id = attrs && attrs[this.Model.prototype.idAttribute];

        // If there's no ID, this model isn't being tracked, and
        // cannot be tracked; return a new instance
        if (!id)
            return new this.Model(attrs, options);

        // Attempt to restore a cached instance
        var instance = this.instances[id];
        if (!instance) {
            // If we haven't seen this instance before, start caching it
            instance = this.createInstance(id, attrs, options);
        } else {
            // Otherwise update the attributes of the cached instance
            instance.set(attrs);
        }
        return instance;
    };

    UniqueFactory.prototype.createInstance = function (id, attrs, options) {
        var instance = new this.Model(attrs, options);
        this.instances[id] = instance;
        return instance;
    };

Let’s take a closer look at the UniqueFactory constructor, because it’s doing some tricky stuff.

First recall that UniqueFactory is intended to be invoked with the new operator, which creates a new object and assigns it to this (which is immediately aliased to self).

The constructor creates a new function, WrappedConstructor, whose signature matches that of a Backbone.Model constructor function.
But instead of invoking the actual constructor, it calls the getInstance prototype method of the UniqueFactory instance we just created:

    var WrappedConstructor = function (attrs, options) {
        return self.getInstance(attrs, options);
    };

Then, on the last line of this function, UniqueFactory returns WrappedConstructor. Once again, we’ve decided to ignore the object created by the new operator, and instead return an entirely different object—a function, even.

This means that when we invoke UniqueFactory, the return value is actually our wrapped constructor:

    var UserFactory = new UniqueFactory(User); // WrappedConstructor

However, this time we actually used the object created by the new operator. We just didn’t return it. And it still exists: in the closure created by the WrappedConstructor function (self).

Phew. Did you catch all that?

This is kind of a funny implementation. It’s not necessarily ideal, but I presented it to you to demonstrate how the new operator can be abused in an interesting—if somewhat confusing—way. Namely, a constructor function can both make use of the object created by new and return an entirely new value, at the same time.

Beware of Memory Leaks

In the example factory implementations here, I’ve glossed over an important detail: they maintain an ever-growing global cache of model instances. Since instances are never removed from the cache even when they’re no longer needed, they continue occupying memory forever (or at least, until the page refreshes).

For example, suppose a unique model instance is destroyed via Model.prototype.destroy:

    (function () {
        var user = UserFactory({ username: 'johndoe' });
        user.destroy(); // sends HTTP DELETE to API server
    })();

Despite the user variable not existing outside the function scope in which it is declared, and despite the johndoe record being destroyed on the server, the instance lives on inside our UserFactory instance cache.
This is particularly bad in long-lived single-page applications. A proper implementation would “track” instance creation and dismissal, and remove the instance from the cache when it is no longer required to be there.

Conclusion

In this chapter, we’ve identified the “uniqueness” problem that affects applications where the same database-backed object appears in multiple collections. We explored a powerful solution for this problem: functions that wrap a class constructor, and guarantee the uniqueness of any returned objects. Lastly, we introduced a utility, UniqueFactory, that generates model classes that similarly guarantee uniqueness.

What we’ve covered isn’t necessarily unique to JavaScript. Factory methods that return unique instances are tried-and-true patterns that can be—and certainly have been—implemented in any number of languages.

But one clever trick that JavaScript has up its sleeve is the new operator. Specifically, the function on which new is called can ignore the newly created object (this) and return what it pleases. This little quirk is deceptively powerful, because it allows you to emulate object creation when object creation is expected—for instance, when you’re working with external libraries like Backbone.

In my experience, JavaScript has never been accused of being a particularly flexible language. It still bears the marks of being designed in 10 days. But for all its warts, occasionally I discover new things about it that particularly please me. This small property of the new operator is one of them. Hopefully, having read this chapter, you’ll feel similarly.

CHAPTER SIX
One World, One Language
Jenn Schiffer

There sure are a lot of languages.
—Jenn Schiffer

It was September 2003 when I began my undergraduate studies in computer science.
Having chosen a liberal arts school, I was required to select a number of general education course requirements that lived outside the realm of my major. One of those requirements was two foreign language courses. When I inquired about using Java to fulfill that sequence, my request was immediately shut down. “You have to pick a real foreign language, like Spanish or French,” my undergraduate advisor told me.

Perhaps I should have asked about JavaScript.

To be multilingual, or a polyglot, has always been presented as superior to being able to speak one’s native language only. I have never understood why people believe this. Living under one roof, having one job for an extended amount of time, and being in a long-term monogamous relationship: these are seen as qualities of a stable life. Being an expert in a single subject, as opposed to knowing a little bit about a lot, is championed. So should be the case with programming.

JavaScript is a single, stable language that is powerful enough to build the World Wide Web, make robots move, and convince publishers to print entire books about it. If we were required to pick a single “best” programming language, JavaScript seems like a no-brainer.

It is understandably controversial to say that a specific language is better than the rest and that it should, therefore, become the official language of programming. Who am I to decide which language every other programmer should learn and build with? In my favor, one of the greatest aspects of web development in the 21st century is the expression of opinions so strong they are worthy of becoming web standards.

An Imperative, Dynamic Proposal

Imagine you are an academic advisor at a liberal arts college and are tasked with defining the choices given to students for their foreign language requirements. A language called “JavaScript” comes up in a proposal, and you need to study it and determine if it is a viable option.
Naturally, you just so happen to be a fluent JavaScript expert, yet you are not sure it would be more useful than, say, Java.

Java is notoriously simple to learn at the college freshman level, regardless of the student’s experience:

    /**
     * Hello World in Java
     */
    class Example {
        public static void main(String[] args) {
            System.out.println("Hello World.");
        }
    }

To run Java, though, the client must also be running the Java virtual machine (J.V.M.). It would be silly to ask students to carry multiple machines around to all of their classes, so a language that does not require a JVM would be a better option.

You might be thinking, “Maybe this is a weird joke I just don’t get?” Perhaps the author, yours truly, is trying to make a joke, and you feel like there are much better ones she could make. But this is no joke: JavaScript does not require a Java virtual machine.

Neither does Haskell:

    -- Hello World in Haskell
    main = putStrLn "Hello World."

The problem with Haskell is that, unlike JavaScript, it requires installation of a compiler. It is also a functional programming language that, like Latin, is considered “dead” and referenced only in historical texts. Yes, it is useful to learn Haskell in order to understand the context of programming today, but not for making useful products. It would be irresponsible to require students to learn something that would not help them build client-side web applications.

Ruby happens to be quite useful in building web applications:

    # Hello World in Ruby
    puts "Hello World."

One of the features of Ruby is flexibility in the form of having dozens of different versions, the most popular of which is called Rails. Rails itself has many versions—dialects, if you will—which causes communication breakdowns between apps. Multiple versions work for operating system releases, but not for web development.
JavaScript versions do not matter to the user or developer because it is not server-side, and removing that headache makes it a better option for teaching.

Cascading Style Sheets (C.S.S.) is also not server-side and does not require a compiler or virtual machine:

    /* Hello World in C.S.S. */
    #example { content:'Hello World.'}

But much like hardware does not work without software, C.S.S. does not work without other languages. In the previous example, the browser looks for an element on the page with the ID “example.” If the developer did not use another language to create that element, the C.S.S. cannot do anything. The professor teaching the foreign language course would have to teach another language in addition to C.S.S., and that is asking a lot of the staff.

JavaScript does not need other languages to work. It just works.

How about HyperText Markup Language (H.T.M.L.)? It works on its own and does not need a compiler installed:

Actually, H.T.M.L. does need another language to work, and it is JavaScript. Sure, in the past, H.T.M.L. used to be all you needed to create a web page. In the current state of the Semantic Web, though, the use of frontend JavaScript frameworks like Ember.js is required to bind text to a document.

JavaScript does not need a JavaScript framework to run, because it is JavaScript already:

    // Hello World in JavaScript
    alert('Hello World');

And there you have it. Simple, pure, vanilla, untouched, beautiful JavaScript. Short, effective, and simple to teach. You can rightfully count JavaScript among the options for teaching foreign languages to your college’s student body.

The Paradox of Choice

As hard as it is to choose the options of foreign language courses a student can take, it is even harder for the student to decide among those options. One of the hardest problems in computer science is choosing the right tool to use, and the same certainly goes for communication.

It is an impossible question to ask: “German or JavaScript?” Why can a student not learn both? This may seem like an NP-complete problem. You cannot teach JavaScript in German, because JavaScript syntax is in American English:

    Benachrichtigung('Hello World');

Although semantically, factually, and tactfully correct, the preceding code is syntactically incorrect:

    >> ReferenceError: Benachrichtigung is not defined

It turns out, though, that you can teach German in JavaScript:

    alert('Hallo Welt');

If one can learn a language within JavaScript, then it is clear that JavaScript can be the only foreign language course offered that will not prevent students from learning how to communicate in foreign countries.

Globalcommunicationscript

College is the basis of learning for all web developers, as is evident with the current education revolution within the software industry.
As more programming jobs are created, educators grow more responsible for fostering the growth of new developers. To make this job easy, it only makes perfect sense to choose a language that everyone can communicate and learn with. As we discovered in our foreign language course narrative, that language is JavaScript.

Simple, pure, vanilla, untouched, beautiful JavaScript.

CHAPTER SEVEN
Math Expression Parser and Evaluator
Ariya Hidayat

Domain-specific languages (DSLs) are encountered in many aspects of a software engineer’s life: configuration file formats, data transfer protocols, model schemas, application extensions, interface definition languages, and many others. Because of the nature of such languages, the language expression needs to be straightforward and easy to understand.

In this chapter, we will explore the use of JavaScript to implement a simple language that can be used to evaluate a mathematical expression. In a way, it is very similar to a classic handheld programming calculator. Besides the typical math syntax, our JavaScript code should handle operator precedence and understand predefined functions.

Given a math expression as a string, these are the processing steps applied to that string:

• The string is split into a stream of tokens.
• The tokens are used to construct the syntax tree.
• The syntax tree is traversed to evaluate the expression.

Each step will be described in the following sections.

Lexical Analysis and Tokens

The first important thing to do to a string representing a math expression is lexical analysis—that is, splitting the string into a stream of tokens. Quite expectedly, a function that does this is often called a tokenizer. Alternatively, it is also known as a lexer or a scanner.

We first need to define the types of the tokens. Since we’ll be dealing with simple math expressions, all we really need are number, identifier, and operator.
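As a rough illustration of these three token types, a token stream can be pictured as a list of small objects, each carrying a type and a value. The sketch below is written by hand rather than produced by a lexer; the lookup table name T, the sampleTokens array, and the exact object shape are assumptions made for this example.

```javascript
// Assumed lookup table of token type names (the name T is hypothetical).
var T = {
    Operator: 'Operator',
    Identifier: 'Identifier',
    Number: 'Number'
};

// A hand-written token stream for the expression "x = 40 + 2".
// Number values are kept as strings, as they would be read from the input.
var sampleTokens = [
    { type: T.Identifier, value: 'x' },
    { type: T.Operator, value: '=' },
    { type: T.Number, value: '40' },
    { type: T.Operator, value: '+' },
    { type: T.Number, value: '2' }
];

// Print one token per line, for example "Identifier: x"
sampleTokens.forEach(function (token) {
    console.log(token.type + ': ' + token.value);
});
```

Whitespace does not show up in the list at all; only the meaningful pieces of the expression become tokens.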
Before we can identify a portion of a string as one of these tokens, we need some helper functions (they are self-explanatory):

    function isWhiteSpace(ch) {
        return (ch === '\u0009') || (ch === ' ') || (ch === '\u00A0');
    }

    function isLetter(ch) {
        return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z');
    }

    function isDecimalDigit(ch) {
        return (ch >= '0') && (ch <= '9');
    }

Another very useful auxiliary function is the following createToken, used mostly to avoid repetitive code in the later stages. It basically creates an object for the given token type and value:

    function createToken(type, value) {
        return {
            type: type,
            value: value
        };
    }

As we iterate through the characters in the math expression, we will need a way to advance to the next character and another method to have a peek at the next character without advancing our position:

    function getNextChar() {
        var ch = '\x00',
            idx = index;
        if (idx < length) {
            ch = expression.charAt(idx);
            index += 1;
        }
        return ch;
    }

    function peekNextChar() {
        var idx = index;
        return ((idx < length) ? expression.charAt(idx) : '\x00');
    }

In our expression language, spaces do not matter: 40 + 2 is treated the same as 40+2. Thus, we need a function that ignores whitespace and continues to move forward until there is no whitespace anymore:

    function skipSpaces() {
        var ch;

        while (index < length) {
            ch = peekNextChar();
            if (!isWhiteSpace(ch)) {
                break;
            }
            getNextChar();
        }
    }

Suppose we want to support standard arithmetic operations, brackets, and simple assignment. The operators we need to support are +, -, *, /, =, (, and ). A method to scan such an operator can be constructed as follows. Note that rather than checking the character against all possible choices, we just use a simple trick utilizing the String.indexOf method.
By convention, if this scanOperator function is called but no operator is detected, it returns undefined:

    function scanOperator() {
        var ch = peekNextChar();
        if ('+-*/()='.indexOf(ch) >= 0) {
            return createToken('Operator', getNextChar());
        }
        return undefined;
    }

Deciding whether a series of characters is an identifier or not is slightly more complex. Let us assume we allow the first character to be a letter or an underscore. The second, third, and subsequent characters can each be another letter or a decimal digit. We disallow a decimal digit to start an identifier to avoid confusion with a number. Let’s begin with two simple helper functions that do these checks:

    function isIdentifierStart(ch) {
        return (ch === '_') || isLetter(ch);
    }

    function isIdentifierPart(ch) {
        return isIdentifierStart(ch) || isDecimalDigit(ch);
    }

The identifier check can now be written as a simple loop like this:

    function scanIdentifier() {
        var ch, id;

        ch = peekNextChar();
        if (!isIdentifierStart(ch)) {
            return undefined;
        }

        id = getNextChar();
        while (true) {
            ch = peekNextChar();
            if (!isIdentifierPart(ch)) {
                break;
            }
            id += getNextChar();
        }

        return createToken('Identifier', id);
    }

Since we want to process math expressions, it would be absurd not to be able to recognize numbers. We want to support simple integers such as 42, floating points like 3.14159, and also numbers written in scientific notation like 6.62606957e-34. A skeleton for such a function is:

    function scanNumber() {
        // return a token representing a number
        // or undefined if no number is recognized
    }

And here is the breakdown of the function implementation.

First and foremost, we need to detect the presence of a number.
It’s rather easy—we just check whether the next character is a decimal digit or a decimal point (because .1 is a valid number):

    ch = peekNextChar();
    if (!isDecimalDigit(ch) && (ch !== '.')) {
        return undefined;
    }

And if that is the case, we need to process each following character as long as it is a decimal digit:

    number = '';
    if (ch !== '.') {
        number = getNextChar();
        while (true) {
            ch = peekNextChar();
            if (!isDecimalDigit(ch)) {
                break;
            }
            number += getNextChar();
        }
    }

Since we want to support floating-point numbers, potentially we will see a decimal point coming (for example, for 3.14159, so far we have processed only the 3). If that is the case, we need to loop again and process all the digits after the decimal point:

    if (ch === '.') {
        number += getNextChar();
        while (true) {
            ch = peekNextChar();
            if (!isDecimalDigit(ch)) {
                break;
            }
            number += getNextChar();
        }
    }

Supporting scientific notation with exponents means we may see an “e” after those digits. For example, if we are supposed to scan 6.62606957e-34, the previous code will get us only up to 6.62606957. We need to process the “e,” and more digits after the exponent sign. Note that there can be a plus or a minus sign as well:

    if (ch === 'e' || ch === 'E') {
        number += getNextChar();
        ch = peekNextChar();
        if (ch === '+' || ch === '-' || isDecimalDigit(ch)) {
            number += getNextChar();
            while (true) {
                ch = peekNextChar();
                if (!isDecimalDigit(ch)) {
                    break;
                }
                number += getNextChar();
            }
        } else {
            throw new SyntaxError('Unexpected character after exponent sign');
        }
    }

The exception is needed because we want to tackle invalid numbers such as 4e.2 (there cannot be a decimal point after the exponent sign) or even just 4e (there must be some digits after the exponent sign).

If we want to consume a math expression and produce a list of tokens represented by the expression, we need a function that recognizes and gets the next token.
This is easy, since we have three individual functions that can handle a number, an operator, or an identifier:

    function next() {
        var token;

        skipSpaces();
        if (index >= length) {
            return undefined;
        }

        token = scanNumber();
        if (typeof token !== 'undefined') {
            return token;
        }

        token = scanOperator();
        if (typeof token !== 'undefined') {
            return token;
        }

        token = scanIdentifier();
        if (typeof token !== 'undefined') {
            return token;
        }

        throw new SyntaxError('Unknown token from character ' + peekNextChar());
    }

Syntax Parser and Syntax Tree

The stream of tokens produced by the lexer does not give us enough information to compute the math expression. Before we can evaluate the expression, an abstract syntax tree (AST) corresponding to the expression needs to be constructed. This procedure is commonly known as syntactic analysis, and it is usually carried out by a syntax parser.

Consider the following expression:

    x = -6 * 7

The associated syntax tree for this expression is depicted in the following diagram.

A popular technique to construct the syntax tree is recursive-descent parsing. In such a parsing strategy, we go top down and match the possible parse tree from the highest level.
For this particular problem, the simplified grammar of the math expression we want to handle is written as follows (in Backus-Naur Form):

    Expression ::= Assignment

    Assignment ::= Identifier '=' Assignment |
                   Additive

    Additive ::= Multiplicative |
                 Additive '+' Multiplicative |
                 Additive '-' Multiplicative

    Multiplicative ::= Unary |
                       Multiplicative '*' Unary |
                       Multiplicative '/' Unary

    Unary ::= Primary |
              '-' Unary |
              '+' Unary

    Primary ::= Identifier |
                Number |
                '(' Assignment ')' |
                FunctionCall

    FunctionCall ::= Identifier '(' ')' |
                     Identifier '(' ArgumentList ')'

    ArgumentList ::= Expression |
                     Expression ',' ArgumentList

The following code walkthrough illustrates the process of matching the expression from the topmost level (Expression). The lexer itself comes from the implementation of the lexical analyzer shown earlier. The main entry point for the parsing looks like this:

    function parse(expression) {
        var expr;

        lexer.reset(expression);
        expr = parseExpression();

        return {
            'Expression': expr
        };
    }

From this, we go to the main parseExpression function, which is surprisingly simple. This is because in our grammar an expression is nothing more than an assignment. For other languages with more elaborate control flow (branching, loops, etc.) or some form of DSL, assignment may not be the only form of expression:

    function parseExpression() {
        return parseAssignment();
    }

For the subsequent parseFoo variants, we need a function that can match an operator. If the incoming token is an operator and is the same as the expected one, then it returns true:

    function matchOp(token, op) {
        return (typeof token !== 'undefined') &&
            token.type === T.Operator &&
            token.value === op;
    }

An example form of assignment is x = 42. However, we also want to tackle cases where the expression is as plain as 42, or a nested assignment such as x = y = 42.
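The parsing functions in this section rely on three lexer methods: reset, peek, and next. For readers who want to experiment with the parser without the full lexer, here is a hypothetical stand-in with the same three methods. It is a sketch under several assumptions: it tokenizes the whole input up front with a regular expression instead of scanning on demand, the string values of the T constants are invented, the tokenize helper is not from the chapter, and scientific notation is not supported.

```javascript
// Assumed token type constants; the chapter only shows the names.
var T = { Number: 'Number', Operator: 'Operator', Identifier: 'Identifier' };

// Hypothetical helper: split an expression into { type, value } tokens.
function tokenize(expression) {
    // Sticky pattern: optional spaces, then a number, identifier, or operator.
    var pattern = /\s*(?:(\d+(?:\.\d*)?|\.\d+)|([A-Za-z_]\w*)|([-+*\/()=,]))/y;
    var tokens = [], position = 0, match;

    while (position < expression.length) {
        pattern.lastIndex = position;
        match = pattern.exec(expression);
        if (match === null) {
            if (/^\s*$/.test(expression.slice(position))) {
                break;                        // only trailing spaces remain
            }
            throw new SyntaxError('Unknown token at position ' + position);
        }
        position = pattern.lastIndex;
        if (match[1] !== undefined) {
            tokens.push({ type: T.Number, value: match[1] });
        } else if (match[2] !== undefined) {
            tokens.push({ type: T.Identifier, value: match[2] });
        } else {
            tokens.push({ type: T.Operator, value: match[3] });
        }
    }
    return tokens;
}

// Stand-in lexer with one-token lookahead.
var lexer = (function () {
    var tokens = [], position = 0;

    return {
        reset: function (expression) {
            tokens = tokenize(expression);
            position = 0;
        },
        // Look at the next token without consuming it (undefined at the end).
        peek: function () {
            return tokens[position];
        },
        // Consume and return the next token (undefined at the end).
        next: function () {
            var token = tokens[position];
            if (position < tokens.length) {
                position += 1;
            }
            return token;
        }
    };
}());
```

The important property is that peek can be called any number of times without advancing, while next moves forward by exactly one token. That is all the recursive-descent functions need in order to decide which grammar rule applies before committing to it.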
See if you can understand how the following implementation of parseAssignment handles all three cases, that is, a plain expression, a single assignment, and a nested assignment (hint: recursion is involved):

    function parseAssignment() {
        var token, expr;

        expr = parseAdditive();

        if (typeof expr !== 'undefined' && expr.Identifier) {
            token = lexer.peek();
            if (matchOp(token, '=')) {
                lexer.next();
                return {
                    'Assignment': {
                        name: expr,
                        value: parseAssignment()
                    }
                };
            }
        }

        return expr;
    }

The function parseAdditive processes both addition and subtraction; that is, it creates a binary operator node. There will be two child nodes, the left and right ones. They represent the two subexpressions, further handled by parseMultiplicative, to be added or subtracted:

    function parseAdditive() {
        var expr, token;

        expr = parseMultiplicative();
        token = lexer.peek();
        while (matchOp(token, '+') || matchOp(token, '-')) {
            token = lexer.next();
            expr = {
                'Binary': {
                    operator: token.value,
                    left: expr,
                    right: parseMultiplicative()
                }
            };
            token = lexer.peek();
        }
        return expr;
    }

The same logic follows for parseMultiplicative. It handles both multiplication and division:

    function parseMultiplicative() {
        var expr, token;

        expr = parseUnary();
        token = lexer.peek();
        while (matchOp(token, '*') || matchOp(token, '/')) {
            token = lexer.next();
            expr = {
                'Binary': {
                    operator: token.value,
                    left: expr,
                    right: parseUnary()
                }
            };
            token = lexer.peek();
        }
        return expr;
    }

Before we check the details of parseUnary, you may wonder why parseAdditive is called first, and why it in turn calls parseMultiplicative. This is done in order to satisfy the operator precedence requirement. Consider the expression 2 + 4 * 10, which actually evaluates to 42 (multiply 4 by 10, then add 2) rather than 60 (add 2 to 4, then multiply by 10). This is possible only if the topmost node in the syntax tree is the binary operator +, which has two child nodes: the left one is just the number 2, and the right one is actually another binary operator, *.
The latter holds two numbers as the corresponding child nodes, 4 and 10.

To handle a negation, like -42, we use the concept of unary operation. In the syntax tree, this is represented by a unary operator node and it has only one child node (hence the name). While negation is one form of unary operation, we also need to take into account the unary positive operator, as in +42. Thanks to the function’s recursive nature, expressions like ----42 or even -+-+42 can be handled without any problem as well. The code to handle the unary operation is as simple as the following:

    function parseUnary() {
        var token, expr;

        token = lexer.peek();
        if (matchOp(token, '-') || matchOp(token, '+')) {
            token = lexer.next();
            expr = parseUnary();
            return {
                'Unary': {
                    operator: token.value,
                    expression: expr
                }
            };
        }

        return parsePrimary();
    }

Now here comes one of the most important functions of all: parsePrimary. First of all, let’s consider the four possible forms of primary node:

• An identifier (basically referring to a variable in this context), for example, x
• A number, for example, 3.14159
• A function call, for example, sin(0)
• Another expression enclosed in brackets, for example, (4 + 5)

Fortunately, deciding whether the incoming tokens will form one of these possibilities is rather easy, as we just need to examine the token type. The only ambiguity is between an identifier and a function call, which can be resolved if we peek at the next token (i.e., whether it is an open bracket or not).
Without further ado, here is the code:

    function parsePrimary() {
        var token, expr;

        token = lexer.peek();

        // Guard against running out of tokens, as in "2 +"
        if (typeof token === 'undefined') {
            throw new SyntaxError('Unexpected termination of expression');
        }

        if (token.type === T.Identifier) {
            token = lexer.next();
            if (matchOp(lexer.peek(), '(')) {
                return parseFunctionCall(token.value);
            } else {
                return {
                    'Identifier': token.value
                };
            }
        }

        if (token.type === T.Number) {
            token = lexer.next();
            return {
                'Number': token.value
            };
        }

        if (matchOp(token, '(')) {
            lexer.next();
            expr = parseAssignment();
            token = lexer.next();
            if (!matchOp(token, ')')) {
                throw new SyntaxError('Expecting )');
            }
            return {
                'Expression': expr
            };
        }

        throw new SyntaxError('Parse error, cannot process token ' + token.value);
    }

Now the remaining part is parseFunctionCall. A function call like sin(0) consists of a function name, an open bracket, a function argument, and a close bracket. It is important to realize that there can be more than one argument (foo(1, 2, 3)) or no argument at all (random()), depending on the function itself. For simplicity, we split out the handling of the function arguments to parseArgumentList. Here are both functions:

    function parseArgumentList() {
        var token, expr, args = [];

        while (true) {
            expr = parseExpression();
            if (typeof expr === 'undefined') {
                break;
            }
            args.push(expr);
            token = lexer.peek();
            if (!matchOp(token, ',')) {
                break;
            }
            lexer.next();
        }

        return args;
    }

    function parseFunctionCall(name) {
        var token, args = [];

        token = lexer.next();
        if (!matchOp(token, '(')) {
            throw new SyntaxError('Expecting ( in a function call "' + name + '"');
        }

        token = lexer.peek();
        if (!matchOp(token, ')')) {
            args = parseArgumentList();
        }

        token = lexer.next();
        if (!matchOp(token, ')')) {
            throw new SyntaxError('Expecting ) in a function call "' + name + '"');
        }

        return {
            'FunctionCall': {
                'name': name,
                'args': args
            }
        };
    }

Voilà! That’s all our parser code.
When combined properly into a functional object, it is just about 200 lines of code, supporting various math operations with proper precedence, brackets, variables, and function calls.

Tree Walker and Expression Evaluator

Once a syntax tree is obtained, evaluating the expression associated with it is surprisingly easy. It is simply a matter of walking the tree, from the topmost syntax node through all children, and executing a specific instruction related to the type of each syntax node. For example, a binary operator node means that we need to add (or subtract, or multiply, or divide) the two values obtained from each child node.

Looking at the previous example:

    x = -6 * 7

the generated syntax tree as a JavaScript object is:

    {
        "Expression": {
            "Assignment": {
                "name": {
                    "Identifier": "x"
                },
                "value": {
                    "Binary": {
                        "operator": "*",
                        "left": {
                            "Unary": {
                                "operator": "-",
                                "expression": {
                                    "Number": "6"
                                }
                            }
                        },
                        "right": {
                            "Number": "7"
                        }
                    }
                }
            }
        }
    }

The code to interpret this JSON-formatted tree is quite straightforward.
Let’s start from the leaf, such as a number (we assume from here on that node points to the current node we need to evaluate):

    if (node.hasOwnProperty('Number')) {
        return parseFloat(node.Number);
    }

For a unary operation node, we need to evaluate the child node first and then apply the unary operation, either + or -:

    if (node.hasOwnProperty('Unary')) {
        node = node.Unary;
        expr = exec(node.expression);
        switch (node.operator) {
        case '+':
            return expr;
        case '-':
            return -expr;
        default:
            throw new SyntaxError('Unknown operator ' + node.operator);
        }
    }

A binary node is handled similarly: we just need to process both child nodes for the left and right side of the operator:

    if (node.hasOwnProperty('Binary')) {
        node = node.Binary;
        left = exec(node.left);
        right = exec(node.right);
        switch (node.operator) {
        case '+':
            return left + right;
        case '-':
            return left - right;
        case '*':
            return left * right;
        case '/':
            return left / right;
        default:
            throw new SyntaxError('Unknown operator ' + node.operator);
        }
    }

Before we continue to tackle variable assignment, let’s take a step back and consider the concept of evaluation context. For this purpose, we define the context as an object that holds the variables, constants, and function definitions. When we evaluate an expression, we also need to pass a context so that the evaluator knows where to fetch the value of a variable, store a value to a variable, and invoke a certain function. Keeping the context as a different object promotes the separation of logic: the interpreter knows nothing about the context, and the context does not really care how the interpreter works.
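Before the context enters the picture, the number, unary, and binary handlers shown above can already be assembled into one recursive function. The following is a minimal sketch, assuming the function is called exec (as the recursive calls in the fragments suggest). The handling of the Expression wrapper node and the final error for unrecognized nodes are additions made for illustration; identifiers, assignments, and function calls are deliberately left out.

```javascript
// Sketch: evaluates only Expression, Number, Unary, and Binary nodes.
function exec(node) {
    var left, right, expr;

    // Assumption: a wrapper node simply evaluates to its content.
    if (node.hasOwnProperty('Expression')) {
        return exec(node.Expression);
    }

    if (node.hasOwnProperty('Number')) {
        return parseFloat(node.Number);
    }

    if (node.hasOwnProperty('Unary')) {
        node = node.Unary;
        expr = exec(node.expression);
        switch (node.operator) {
        case '+':
            return expr;
        case '-':
            return -expr;
        default:
            throw new SyntaxError('Unknown operator ' + node.operator);
        }
    }

    if (node.hasOwnProperty('Binary')) {
        node = node.Binary;
        left = exec(node.left);
        right = exec(node.right);
        switch (node.operator) {
        case '+':
            return left + right;
        case '-':
            return left - right;
        case '*':
            return left * right;
        case '/':
            return left / right;
        default:
            throw new SyntaxError('Unknown operator ' + node.operator);
        }
    }

    // Assumption: anything else is not supported by this sketch.
    throw new SyntaxError('Unknown syntax node');
}

// The right-hand side of x = -6 * 7, written out by hand.
var tree = {
    'Binary': {
        operator: '*',
        left: { 'Unary': { operator: '-', expression: { 'Number': '6' } } },
        right: { 'Number': '7' }
    }
};

console.log(exec(tree)); // -42
```

Because each handler calls exec on its children before combining the results, the shape of the tree alone determines the order of evaluation; the evaluator never needs to know about operator precedence.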
In our evaluator, the simplest possible context is:

    context = {
        Constants: {},
        Functions: {},
        Variables: {}
    }

A slightly more useful context (that can be used as a default) is:

    context = {
        Constants: {
            pi: 3.1415926535897932384,
            phi: 1.6180339887498948482
        },
        Functions: {
            abs: Math.abs,
            acos: Math.acos,
            asin: Math.asin,
            atan: Math.atan,
            ceil: Math.ceil,
            cos: Math.cos,
            exp: Math.exp,
            floor: Math.floor,
            ln: Math.log,    // Math.log is the natural logarithm
            random: Math.random,
            sin: Math.sin,
            sqrt: Math.sqrt,
            tan: Math.tan
        },
        Variables: {}
    }

We still do not have any variables (because the context is freshly created), but there are two common constants ready to use. The difference between a constant and a variable in this example is simple: you cannot change a constant or create a new one, but you can do both with a variable.

With the context and its variables and constants ready, now we can handle identifier lookup (e.g., in an expression like x + 2):

    if (node.hasOwnProperty('Identifier')) {
        if (context.Constants.hasOwnProperty(node.Identifier)) {
            return context.Constants[node.Identifier];
        }
        if (context.Variables.hasOwnProperty(node.Identifier)) {
            return context.Variables[node.Identifier];
        }
        throw new SyntaxError('Unknown identifier');
    }

Assignment (like x = 3) works the other way around. Note that it writes only to Variables, so a constant is never overridden; because the identifier lookup checks Constants first, assigning to a name such as pi has no visible effect:

    if (node.hasOwnProperty('Assignment')) {
        right = exec(node.Assignment.value);
        context.Variables[node.Assignment.name.Identifier] = right;
        return right;
    }

Finally, the remaining function node is handled as follows. Basically, the function arguments (if any) are prepared in an array and then passed to the actual function.
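Passing an array of prepared arguments to a function is what Function.prototype.apply is for. As a small illustration of that mechanism alone, here is a sketch with a hypothetical function table and a hypothetical helper named callFunction; neither is part of the evaluator in this chapter, and the argument values are assumed to have been evaluated already.

```javascript
// Hypothetical function table, wired to two built-in Math methods.
var functions = {
    max: Math.max,
    min: Math.min
};

// Hypothetical helper: look up a function by name and call it with args.
function callFunction(name, args) {
    if (!functions.hasOwnProperty(name)) {
        throw new SyntaxError('Unknown function ' + name);
    }
    // apply(null, [1, 5, 3]) calls the function as if written f(1, 5, 3).
    return functions[name].apply(null, args);
}

console.log(callFunction('max', [1, 5, 3])); // 5
console.log(callFunction('min', [1, 5, 3])); // 1
```

The first argument to apply is the value of this inside the called function. The Math methods do not depend on it, which is why null is sufficient.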
Note that in our default context, we simply wire a bunch of functions to the methods of the built-in Math object:

    if (node.hasOwnProperty('FunctionCall')) {
        expr = node.FunctionCall;
        if (context.Functions.hasOwnProperty(expr.name)) {
            args = [];
            for (i = 0; i < expr.args.length; i += 1) {
                args.push(exec(expr.args[i]));
            }
            return context.Functions[expr.name].apply(null, args);
        }
        throw new SyntaxError('Unknown function ' + expr.name);
    }

What if we want to have a custom function, maybe because it is not supported by the Math object? It couldn’t be easier: all we have to do is define the function for the context. As an example, let’s implement sum, which adds all the numbers passed as arguments. Since we’re dealing with a function that may have a variable number of arguments, we use the special arguments object instead of named parameters:

    context.Functions.sum = function () {
        var i, total = 0;
        for (i = 0; i < arguments.length; i += 1) {
            total += arguments[i];
        }
        return total;
    };

Final Words

The simple example presented here can be easily extended or modified for a wide range of domain-specific languages. For a simpler language, the lexer can be implemented as a collection of regular expressions. Alternatively, a simple state machine is often suitable. On the other hand, a language with a complex grammar may require deeper recursive-descent parsing. In some cases, it is more convenient to handle some of the recursive aspects with a stack-based shift-reduce approach.

Some languages are known to have peculiar cases that complicate both the lexer and the parser. For example, doing lexical analysis on JavaScript code is notoriously difficult because the symbol / is ambiguous: it can signify either a division operator or the beginning of a regular expression.
In addition to that, the famous automatic semicolon insertion feature requires various parts of the parser to take that into account wherever it is mandated by the language specification. It is instructive to learn how various parsers handle these types of edge cases.

Happy parsing!

CHAPTER EIGHT
Evolution
Rebecca Murphey

In March 2009, Paul Irish published a blog post, “Markup-based Unobtrusive Comprehensive DOM-ready Execution,” describing a solution to a pesky problem familiar to every newcomer to the world of client-side JavaScript at the time: executing only the code that was required for a given page.

Back in 2009, it was common for client-side JavaScript developers to just put all of their code (for all of their pages) inside one giant $(document).ready() callback; some were a bit cleverer, and tested for the presence of an element with a certain ID in order to determine the page they were on. A newcomer to such code struggled mightily to mentally parse hundreds of lines where function declarations, anonymous functions, and long chains of jQuery methods intermingled.

The method proposed in this blog post was simple: put a class on the <body> element, and then use a simple helper function to look up a corresponding initialization method in an application object:

    UTIL = {
        loadEvents : function () {
            var bodyId = document.body.id;

            $.each(document.body.className.split(/\s+/), function (i, className) {
                UTIL.fire(className);
                UTIL.fire(className, bodyId);
            });
        },

        fire : function (func, funcname, args) {
            var namespace = APP; // indicate your obj literal namespace here

            funcname = (funcname === undefined) ? 'init' : funcname;
            if (
                func !== '' &&
                namespace[func] &&
                typeof namespace[func][funcname] == 'function'
            ) {
                namespace[func][funcname](args);
            }
        }
    };

    $(document).ready(UTIL.loadEvents);

The code, written by a self-taught and largely unknown frontend developer with a degree in technical communications, was mediocre. The idea, though, was transformative, especially for a community with lots of similarly self-taught developers: if we could organize our code somehow, maybe writing ever-bigger JavaScript applications didn’t have to be such a messy affair.

A few months after Paul wrote his post, I published “Using Objects to Organize Your Code” and gave a talk on the same topic at the 2009 jQuery Conference. My post suggested having one object per “feature” (a piece of functionality on a page), and encapsulating all of the functionality of that feature in methods on that object. For example, a list of email messages might be one feature; a list of mailboxes might be another.

I know for a fact that I had only the most cursory understanding of .call() and .apply() at the time, and though $.proxy didn’t exist yet, I’m not sure I’d have fully understood it if it did. John Resig had posted his micro-templating snippet a year before, and I’d read JavaScript: The Good Parts, yet the post contained no consideration of client-side templating or being able to create instances of these feature objects. “If I tried to think of the simplest JavaScript thing I could write a post about,” my friend Alex Sexton said to me recently (in the nicest way possible, because he is Alex), “I’d still never come up with something as simple as that.”

And yet this too seemed to have a transformative effect in the still largely self-taught JavaScript community of the time. Not only could we break our code down per page; we could also break it down per component, and those components could be clearly represented by distinct pieces of code.
We could even…not to get crazy here, but…we could put those pieces of code in separate files, using a global object as a namespace, right? Granted, loading all of those separate files as