PHP 6 and MySQL 5 for Dynamic Web Sites


Visual QuickPro Guide
PHP 6 and MySQL 5 for Dynamic Web Sites
Larry Ullman

Peachpit Press
1249 Eighth Street
Berkeley, CA 94710
510/524-2178
510/524-2221 (fax)

Find us on the Web at: www.peachpit.com
To report errors, please send a note to: errata@peachpit.com
Peachpit Press is a division of Pearson Education.

Copyright © 2008 by Larry Ullman

Editor: Rebecca Gulick
Copy Editor: Bob Campbell
Production Coordinator: Becky Winter
Compositors: Myrna Vladic, Jerry Ballew, and Rick Gordon
Indexer: Rebecca Plunkett
Cover Production: Louisa Adair
Technical Reviewer: Arpad Ray

Notice of rights
All rights reserved. No part of this book may be reproduced or transmitted in any form by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. For information on getting permission for reprints and excerpts, contact permissions@peachpit.com.

Notice of liability
The information in this book is distributed on an “As Is” basis, without warranty. While every precaution has been taken in the preparation of the book, neither the author nor Peachpit Press shall have any liability to any person or entity with respect to any loss or damage caused or alleged to be caused directly or indirectly by the instructions contained in this book or by the computer software and hardware products described in it.

Trademarks
MySQL is a registered trademark of MySQL AB in the United States and in other countries. Macintosh and Mac OS X are registered trademarks of Apple Computer, Inc. Microsoft and Windows are registered trademarks of Microsoft Corporation. Other product names used in this book may be trademarks of their own respective owners. Images of Web sites in this book are copyrighted by the original holders and are used with their kind permission.
This book is not officially endorsed by nor affiliated with any of the above companies, including MySQL AB. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and Peachpit was aware of a trademark claim, the designations appear as requested by the owner of the trademark. All other product names and services identified throughout this book are used in editorial fashion only and for the benefit of such companies with no intention of infringement of the trademark. No such use, or the use of any trade name, is intended to convey endorsement or other affiliation with this book.

ISBN-13: 978-0-321-52599-4
ISBN-10: 0-321-52599-X

987654321

Printed and bound in the United States of America

Dedication
Dedicated to the fine faculty at my alma mater, Northeast Missouri State University. In particular, I would like to thank: Dr. Monica Barron, Dr. Dennis Leavens, Dr. Ed Tyler, and Dr. Cole Woodcox, whom I also have the pleasure of calling my friend. I would not be who I am as a writer, as a student, as a teacher, or as a person if it were not for the magnanimous, affecting, and brilliant instruction I received from these educators.

Special Thanks to:
My heartfelt thanks to everyone at Peachpit Press, as always.

My gratitude to editor extraordinaire Rebecca Gulick, who makes my job so much easier. And thanks to Bob Campbell for his hard work, helpful suggestions, and impressive attention to detail. Thanks also to Rebecca Plunkett for indexing and Becky Winter, Myrna Vladic, Jerry Ballew, and Rick Gordon for laying out the book, and thanks to Arpad Ray for his technical review.

Kudos to the good people working on PHP, MySQL, Apache, phpMyAdmin, and XAMPP, among other great projects. And a hearty “cheers” to the denizens of the various newsgroups, mailing lists, support forums, etc., who offer assistance and advice to those in need.
Thanks, as always, to the readers, whose support gives my job relevance. An extra helping of thanks to those who provided the translations in Chapter 15, “Example—Message Board,” and who offered up recommendations as to what they’d like to see in this edition.

Thanks to Nicole and Christina for entertaining and taking care of the kids so that I could get some work done.

Finally, I would not be able to get through a single book if it weren’t for the love and support of my wife, Jessica. And a special shout out to Zoe and Sam, who give me reasons to, and not to, write books!

Table of Contents

Introduction
    What Are Dynamic Web Sites?
    What You’ll Need
    About This Book
    Companion Web Site

Chapter 1: Introduction to PHP
    Basic Syntax
    Sending Data to the Web Browser
    Writing Comments
    What Are Variables?
    Introducing Strings
    Concatenating Strings
    Introducing Numbers
    Introducing Constants
    Single vs. Double Quotation Marks

Chapter 2: Programming with PHP
    Creating an HTML Form
    Handling an HTML Form
    Conditionals and Operators
    Validating Form Data
    Introducing Arrays
    For and While Loops

Chapter 3: Creating Dynamic Web Sites
    Including Multiple Files
    Handling HTML Forms, Revisited
    Making Sticky Forms
    Creating Your Own Functions

Chapter 4: Introduction to MySQL
    Naming Database Elements
    Choosing Your Column Types
    Choosing Other Column Properties
    Accessing MySQL

Chapter 5: Introduction to SQL
    Creating Databases and Tables
    Inserting Records
    Selecting Data
    Using Conditionals
    Using LIKE and NOT LIKE
    Sorting Query Results
    Limiting Query Results
    Updating Data
    Deleting Data
    Using Functions

Chapter 6: Advanced SQL and MySQL
    Database Design
    Performing Joins
    Grouping Selected Results
    Creating Indexes
    Using Different Table Types
    Performing FULLTEXT Searches
    Performing Transactions

Chapter 7: Error Handling and Debugging
    Error Types and Basic Debugging
    Displaying PHP Errors
    Adjusting Error Reporting in PHP
    Creating Custom Error Handlers
    PHP Debugging Techniques
    SQL and MySQL Debugging Techniques

Chapter 8: Using PHP with MySQL
    Modifying the Template
    Connecting to MySQL
    Executing Simple Queries
    Retrieving Query Results
    Ensuring Secure SQL
    Counting Returned Records
    Updating Records with PHP

Chapter 9: Common Programming Techniques
    Sending Values to a Script
    Using Hidden Form Inputs
    Editing Existing Records
    Paginating Query Results
    Making Sortable Displays

Chapter 10: Web Application Development
    Sending Email
    Date and Time Functions
    Handling File Uploads
    PHP and JavaScript
    Understanding HTTP Headers

Chapter 11: Cookies and Sessions
    Making a Login Page
    Making the Login Functions
    Using Cookies
    Using Sessions
    Improving Session Security

Chapter 12: Security Methods
    Preventing Spam
    Validating Data by Type
    Preventing XSS Attacks
    Preventing SQL Injection Attacks
    Database Encryption

Chapter 13: Perl-Compatible Regular Expressions
    Creating a Test Script
    Defining Simple Patterns
    Using Quantifiers
    Using Character Classes
    Finding All Matches
    Using Modifiers
    Matching and Replacing Patterns

Chapter 14: Making Universal Sites
    Character Sets and Encoding
    Creating Multilingual Web Pages
    Unicode in PHP
    Collation in PHP
    Transliteration in PHP
    Languages and MySQL
    Time Zones and MySQL
    Working with Locales

Chapter 15: Example—Message Board
    Making the Database
    Writing the Templates
    Creating the Index Page
    Creating the Forum Page
    Creating the Thread Page
    Posting Messages

Chapter 16: Example—User Registration
    Creating the Templates
    Writing the Configuration Scripts
    Creating the Home Page
    Registration
    Activating an Account
    Logging In and Logging Out
    Password Management

Chapter 17: Example—E-Commerce
    Creating the Database
    The Administrative Side
    Creating the Public Template
    The Product Catalog
    The Shopping Cart
    Recording the Orders

Appendix A: Installation
    Installation on Windows
    Installation on Mac OS X
    MySQL Permissions
    Testing Your Installation
    Configuring PHP

Index

Introduction

Today’s Web users expect exciting pages that are updated frequently and provide a customized experience. For them, Web sites are more like communities, to which they’ll return time and again. At the same time, Web site administrators want sites that are easier to update and maintain, understanding that’s the only real way to keep up with visitors’ expectations. For these reasons and more, PHP and MySQL have become the de facto standards for creating dynamic, database-driven Web sites.

This book represents the culmination of my many years of Web development experience coupled with the value of having written several previous books on the technologies discussed herein. The focus of this book is on covering the most important knowledge in the most efficient manner. It will teach you how to begin developing dynamic Web sites and give you plenty of example code to get you started. All you need to provide is an eagerness to learn.

Well, that and a computer.

What Are Dynamic Web Sites?

Dynamic Web sites are flexible and potent creatures, more accurately described as applications than merely sites. Dynamic Web sites

◆ Respond to different parameters (for example, the time of day or the version of the visitor’s Web browser)
◆ Have a “memory,” allowing for user registration and login, e-commerce, and similar processes
◆ Almost always have HTML forms, so that people can perform searches, provide feedback, and so forth
◆ Often have interfaces where administrators can manage the site’s content
◆ Are easier to maintain, upgrade, and build upon than statically made sites

There are many technologies available for creating dynamic Web sites. The most common are ASP.NET (Active Server Pages, a Microsoft construct), JSP (Java Server Pages), ColdFusion, Ruby on Rails, and PHP. Dynamic Web sites don’t always rely on a database, but more and more of them do, particularly as excellent database applications like MySQL are available at little to no cost.

Figure i.1 The home page for PHP.

What is PHP?

PHP originally stood for “Personal Home Page” as it was created in 1994 by Rasmus Lerdorf to track the visitors to his online résumé.
As its usefulness and capabilities grew (and as it started being used in more professional situations), it came to mean “PHP: Hypertext Preprocessor.”

According to the official PHP Web site, found at www.php.net (Figure i.1), PHP is a “widely-used general-purpose scripting language that is especially suited for Web development and can be embedded into HTML.” It’s a long but descriptive definition, whose meaning I’ll explain.

Starting at the end of that statement, to say that PHP can be embedded into HTML means that you can take a standard HTML page, drop in some PHP wherever you need it, and end up with a dynamic result. This attribute makes PHP very approachable for anyone that’s done even a little bit of HTML work.

Also, PHP is a scripting language, as opposed to a programming language: PHP was designed to write Web scripts, not stand-alone applications (although, with some extra effort, you can now create applications in PHP). PHP scripts run only after an event occurs, for example when a user submits a form or goes to a URL.

I should add to this definition that PHP is a server-side, cross-platform technology, both descriptions being important. Server-side refers to the fact that everything PHP does occurs on the server. A Web server application, like Apache or Microsoft’s IIS (Internet Information Services), is required and all PHP scripts must be accessed through a URL (http://something). Its cross-platform nature means that PHP runs on most operating systems, including Windows, Unix (and its many variants), and Macintosh. More important, the PHP scripts written on one server will normally work on another with little or no modification.

At the time the book was written, PHP was at version 5.2.4, with version 4.4.7 still being maintained. Support for version 4 is being dropped, though, and it’s recommended that everyone use at least version 5 of PHP.
This edition of this book actually focuses on version 6 of PHP, to be released in late 2007 or in 2008.

If you’re still using version 4, you really should upgrade. If that’s not in your plans, then please grab the second edition of this book instead. If you’re using PHP 5, either the second or this edition of the book will work for you. In this edition, I will make it clear which features and functions are PHP 6–specific.

What’s new in PHP 6

Because of the planned extinction of PHP 4, many users and Web hosting companies will likely make a quick transition from PHP 4 to PHP 5 to PHP 6. To discuss what’s new in PHP 6, I’ll start with the even bigger differences between PHP 4 and 5.

PHP 5, like PHP 4 before it, is a major new development of this popular programming language. The most critical changes in PHP 5 involve object-oriented programming (OOP). Those changes don’t really impact this book, as OOP isn’t covered (I do so in my book PHP 5 Advanced: Visual QuickPro Guide). With respect to this book, the biggest change in PHP 5 is the addition of the Improved MySQL Extension, which is used to communicate with MySQL. The Improved MySQL Extension offers many benefits over the older MySQL extension and will be used exclusively.

The big change in PHP 6 is support for Unicode, which is to say that PHP can now handle characters in every language in the world. This is huge, and it’s also one of the reasons it’s taken a while to release PHP 6. What this means in terms of programming is covered in Chapter 14, “Making Universal Sites.” The information in that chapter is also used in Chapter 15, “Example—Message Board.”

Beyond Unicode support, PHP 6 cleans up a lot of garbage that was left in PHP 5 even though the recommendation was not to use such things. The two biggest removals are the “Magic Quotes” and “register globals” features.

Why use PHP?
Put simply, when it comes to developing dynamic Web sites, PHP is better, faster, and easier to learn than the alternatives. What you get with PHP is excellent performance, a tight integration with nearly every database available, stability, portability, and a nearly limitless feature set due to its extendibility. All of this comes at no cost (PHP is open source) and with a very manageable learning curve. PHP is one of the best marriages I’ve ever seen between the ease with which beginning programmers can start using it and the ability for more advanced programmers to do everything they require.

Finally, the proof is in the pudding: PHP has seen an exponential growth in use since its inception, overtaking ASP as the most popular scripting language being used today. It’s the most requested module for Apache (the most-used Web server), and by the time this book hits the shelves, PHP will be on nearly 25 million domains.

Of course, you might assume that I, as the author of a book on PHP (several, actually), have a biased opinion. Although not nearly to the same extent as PHP, I’ve also developed sites using Java Server Pages (JSP), Ruby on Rails (RoR), and ASP.NET. Each has its pluses and minuses, but PHP is the technology I always return to. You might hear that it doesn’t perform or scale as well as other technologies, but Yahoo! handles over 3.5 billion hits per day using PHP (yes, billion).

You might also wonder how secure PHP is. But security isn’t in the language; it’s in how that language is used. Rest assured that a complete and up-to-date discussion of all the relevant security concerns is provided by this book!

How PHP works

As previously stated, PHP is a server-side language. This means that the code you write in PHP sits on a host computer called a server. The server sends Web pages to the requesting visitors (you, the client, with your Web browser).
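
To make “server-side” a little more concrete, here is a minimal sketch, not taken from the book’s scripts. The function name greeting_for_hour() and the greeting text are hypothetical, chosen only for illustration. The script picks a message from the server’s clock and sends plain HTML to the browser, so the visitor never sees the PHP source.

```php
<?php
// Hypothetical helper (not from the book): choose a greeting
// for an hour given in 24-hour form (0-23).
function greeting_for_hour($hour)
{
    if ($hour < 12) {
        return 'Good morning';
    }
    if ($hour < 18) {
        return 'Good afternoon';
    }
    return 'Good evening';
}

// date('G') gives the server's current hour (0-23) as a string.
$hour = (int) date('G');

// Only the resulting HTML is sent to the Web browser.
echo '<p>' . greeting_for_hour($hour) . ', visitor!</p>';
```

Requesting this script through a Web server at different times of day should return different HTML, which is one small case of a page responding to a parameter.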
When a visitor goes to a Web site written in PHP, the server reads the PHP code and then processes it according to its scripted directions. In the example shown in Figure i.2, the PHP code tells the server to send the appropriate data (HTML code) to the Web browser, which treats the received code as it would a standard HTML page.

This differs from a static HTML site where, when a request is made, the server merely sends the HTML data to the Web browser and there is no server-side interpretation occurring (Figure i.3). Because no server-side action is required, you can run HTML pages in your Web browser without using a server at all.

To the end user and their Web browser there is no perceptible difference between what home.html and home.php may look like, but how that page’s content was created will be significantly different.

Figure i.2 How PHP fits into the client/server model when a user requests a Web page.

Figure i.3 The client/server process when a request for a static HTML page is made.

What is MySQL?

MySQL (www.mysql.com, Figure i.4) is the world’s most popular open-source database. In fact, today MySQL is a viable competitor to the pricey goliaths such as Oracle and Microsoft’s SQL Server. Like PHP, MySQL offers excellent performance, portability, and reliability, with a moderate learning curve and little to no cost.

MySQL is a database management system (DBMS) for relational databases (therefore, MySQL is an RDBMS). A database, in the simplest terms, is a collection of interrelated data, be it text, numbers, or binary files, that are stored and kept organized by the DBMS.

There are many types of databases, from the simple flat-file to relational and object-oriented. A relational database uses multiple tables to store information in its most discernable parts.
While relational databases may involve more thought in the design and programming stages, they offer an improvement to reliability and data integrity that more than makes up for the extra effort required. Further, relational databases are more searchable and allow for concurrent users.

By incorporating a database into a Web application, some of the data generated by PHP can be retrieved from MySQL (Figure i.5). This further moves the site’s content from a static (hard-coded) basis to a flexible one, flexibility being the key to a dynamic Web site.

MySQL is an open-source application, like PHP, meaning that it is free to use or even modify (the source code itself is downloadable). There are occasions in which you should pay for a MySQL license, especially if you are making money from the sales or incorporation of the MySQL product. Check MySQL’s licensing policy for more information on this.

Figure i.4 The home page for the MySQL database application.

The MySQL software consists of several pieces, including the MySQL server (mysqld, which runs and manages the databases), the MySQL client (mysql, which gives you an interface to the server), and numerous utilities for maintenance and other purposes. PHP has always had good support for MySQL, and that is even more true in the most recent versions of the language.

MySQL has been known to handle databases as large as 60,000 tables with more than five billion rows. MySQL can work with tables as large as eight million terabytes on some operating systems, generally a healthy 4 GB otherwise. MySQL is used by NASA and the United States Census Bureau, among many others.

At the time of this writing, MySQL is on version 5.0.45, with versions 5.1 and 6.0 in development. The version of MySQL you have affects what features you can use, so it’s important that you know what you’re working with. For this book, MySQL 5.0.45 was used, although you should be able to do everything in this book as long as you’re using a version of MySQL greater than 4.1. (My book MySQL: Visual QuickStart Guide goes into the more advanced and newer features of MySQL 5 that aren’t used in this book.)

Pronunciation Guide
Trivial as it may be, I should clarify up front that MySQL is technically pronounced “My Ess Que Ell,” just as SQL should be said “Ess Que Ell.” This is a question many people have when first working with these technologies. While not a critical issue, it’s always best to pronounce acronyms correctly.

Figure i.5 How most of the dynamic Web applications in this book will work, using both PHP and MySQL.

What You’ll Need

To follow the examples in this book, you’ll need the following tools:

◆ A Web server application (for example, Apache, Abyss, or IIS)
◆ PHP
◆ MySQL
◆ A Web browser (Microsoft’s Internet Explorer, Mozilla’s Firefox, Apple’s Safari, etc.)
◆ A text editor, PHP-capable WYSIWYG application (Adobe’s Dreamweaver qualifies), or IDE (integrated development environment)
◆ An FTP application, if using a remote server

One of the great things about developing dynamic Web sites with PHP and MySQL is that all of the requirements can be met at no cost whatsoever, regardless of your operating system! Apache, PHP, and MySQL are each free; most Web browsers can be had without cost; and many good text editors are available for nothing.

The appendix discusses the installation process on the Windows and Mac OS X operating systems. If you have a computer, you are only a couple of downloads away from being able to create dynamic Web sites (in that case, your computer would represent both the client and the server in Figures i.2 and i.5). Conversely, you could purchase Web hosting for only dollars per month that will provide you with a PHP- and MySQL-enabled environment already online.

About This Book

This book teaches how to develop dynamic Web sites with PHP and MySQL, covering the knowledge that most developers might require. In keeping with the format of the Visual QuickPro series, the information is discussed using a step-by-step approach with corresponding images. The focus has been kept on real-world, practical examples, avoiding “here’s something you could do but never would” scenarios. As a practicing Web developer myself, I wrote about the information that I use and avoided those topics immaterial to the task at hand. As a practicing writer, I made certain to include topics and techniques that I know readers are asking about.

The structure of the book is linear, and the intention is that you’ll read it in order. It begins with three chapters covering the fundamentals of PHP (by the second chapter, you will have already developed your first dynamic Web page). After that, there are three chapters on SQL (Structured Query Language, which is used to interact with all databases) and MySQL. They teach the basics of SQL, database design, and the MySQL application in particular. Then there’s one chapter on debugging and error management, information everyone needs. This is followed by a chapter introducing how to use PHP and MySQL together, a remarkably easy thing to do.

The following five chapters teach more application techniques to round out your knowledge. Security, in particular, is repeatedly addressed in those pages. Chapter 14, “Making Universal Sites,” is entirely new to this edition of the book, showing you how to broaden the reach of your sites. Finally, I’ve included three example chapters, in which the heart of different Web applications is developed, with instructions.

Is this book for you?

This book was written for a wide range of people within the beginner-to-intermediate range.
The book makes use of XHTML for future compatibility, so solid experience with XHTML, or its forebear HTML, is a must. Although this book covers many things, it does not formally teach HTML or Web page design. Some CSS is sprinkled about these pages but also not taught.

Second, this book expects that you have one of the following:

◆ The drive and ability to learn without much hand holding, or…
◆ Familiarity with another programming language (even solid JavaScript skills would qualify), or…
◆ A cursory knowledge of PHP

Make no mistake: This book covers PHP and MySQL from A to Z, teaching everything you’ll need to know to develop real-world Web sites, but particularly the early chapters cover PHP at a quick pace. For this reason I recommend either some programming experience or a curious and independent spirit when it comes to learning new things. If you find that the material goes too quickly, you should probably start off with the latest edition of my book PHP for the World Wide Web: Visual QuickStart Guide, which goes at a more tempered pace.

No database experience is required, since SQL and MySQL are discussed starting at a more basic level.

What’s new in this edition

The first two editions of this book have been very popular, and I’ve received a lot of positive feedback on them (thanks!). In writing this new edition, I wanted to do more than just update the material for the latest versions of PHP and MySQL, although that is an overriding consideration throughout the book.
Other new features you’ll find are:

◆ New examples demonstrating techniques frequently requested by readers
◆ Some additional advanced MySQL and SQL examples
◆ A dedicated chapter on thwarting common Web site abuses and attacks
◆ A brand-new chapter on working with multiple languages and time zones
◆ A brand-new example chapter on creating a message board (or forum)
◆ Expanded and updated installation and configuration instructions
◆ Removal of outdated content (e.g., things used in older versions of PHP or not applicable to PHP 6)

For those of you that also own the first and/or second edition (thanks, thanks, thanks!), I believe that these new features will also make this edition a required fixture on your desk or bookshelf.

How this book compares to my other books

This is my fourth PHP and/or MySQL title, after (in order)

◆ PHP for the World Wide Web: Visual QuickStart Guide
◆ PHP 5 Advanced for the World Wide Web: Visual QuickPro Guide
◆ MySQL: Visual QuickStart Guide

I hope this résumé implies a certain level of qualification to write this book, but how do you, as a reader standing in a bookstore, decide which title is for you? Of course, you are more than welcome to splurge and buy the whole set, earning my eternal gratitude, but…

The PHP for the World Wide Web: Visual QuickStart Guide book is very much a beginner’s guide to PHP. This title overlaps it some, mostly in the first three chapters, but uses new examples so as not to be redundant. For novices, this book acts as a follow-up to that one. The advanced book is really a sequel to this one, as it assumes a fair amount of knowledge and builds upon many things taught here. The MySQL book focuses almost exclusively on MySQL (there are but two chapters that use PHP).

With that in mind, read the section “Is this book for you?” and see if the requirements apply. If you have no programming experience at all and would prefer to be taught PHP more gingerly, my first book would be better.
If you are already very comfortable with PHP and want to learn more of its advanced capabilities, pick up the second. If you are most interested in MySQL and are not concerned with learning much about PHP, check out the third.

That being said, if you want to learn everything you need to know to begin developing dynamic Web sites with PHP and MySQL today, then this is the book for you! It references the most current versions of both technologies, uses techniques not previously discussed in other books, and contains its own unique examples.

And whatever book you do choose, make sure you’re getting the most recent edition or, barring that, the edition that best matches the versions of the technologies you’ll be using.

Companion Web Site

I have developed a companion Web site specifically for this book, which you may reach at www.DMCinsights.com/phpmysql3/ (Figure i.6). There you will find every script from this book, a text file containing lengthy SQL commands, and a list of errata that occurred during publication. (If you have a problem with a command or script, and you are following the book exactly, check the errata to ensure there is not a printing error before driving yourself absolutely mad.) At this Web site you will also find useful Web links, a highly popular forum where readers can ask and answer each other’s questions (I answer many of them myself), and more!

Questions, comments, or suggestions?

If you have any questions on PHP or MySQL, you can turn to one of the many Web sites, mailing lists, newsgroups, and FAQ repositories already in existence. A quick search online will turn up virtually unlimited resources. For that matter, if you need an immediate answer, those sources or a quick Web search will most assuredly serve your needs (in all likelihood, someone else has already seen and solved your exact problem).

You can also direct your questions, comments, and suggestions to me.
You’ll get the fastest reply using the book’s corresponding forum (I always answer those questions first). If you’d rather email me, my contact information is available on the Web site. I do try to answer every email I receive, although I cannot guarantee a quick reply.

Figure i.6 The companion Web site for this book.

1 Introduction to PHP

To use an old chestnut, every journey starts with one small step, and the first step in developing dynamic Web applications with PHP and MySQL is to learn the fundamentals of the scripting language itself.

Although this book focuses on using MySQL and PHP in combination, you’ll do a vast majority of your legwork using PHP alone. In this and the following chapter, you’ll learn its basics, from syntax to variables, operators, and language constructs (conditionals, loops, and whatnot). At the same time you are picking up these fundamentals, you’ll also begin developing usable code that you’ll integrate into larger applications later in the book.

This introductory chapter will cruise through most of the basics of the PHP language. You’ll learn the syntax for coding PHP, how to send data to the Web browser, and how to use two kinds of variables (strings and numbers) plus constants. Some of the examples may seem inconsequential, but they’ll demonstrate ideas you’ll have to master in order to write more advanced scripts further down the line.

Basic Syntax

As stated in the book’s introduction, PHP is an HTML-embedded scripting language. This means that you can intermingle PHP and HTML code within the same file. So to begin programming with PHP, start with a simple Web page. Script 1.1 gives an example of a no-frills, no-content XHTML Transitional document, which will be used as the foundation for every Web page in the book (this book does not formally discuss [X]HTML; see a resource dedicated to the topic for more information).
To add PHP code to a page, place it within PHP tags:

<?php
?>

Anything placed within these tags will be treated by the Web server as PHP (meaning the PHP interpreter will process the code). Any text outside of the PHP tags is immediately sent to the Web browser as regular HTML.

Along with placing PHP code within PHP tags, your PHP files must have a proper extension. The extension tells the server to treat the script in a special way, namely, as a PHP page. Most Web servers will use .html or .htm for standard HTML pages, and normally, .php is preferred for your PHP files.

Script 1.1 A basic XHTML 1.0 Transitional Web page.

To make a basic PHP script:

1. Create a new document in your text editor or Integrated Development Environment (Script 1.2).
It generally does not matter what application you use, be it Dreamweaver (a fancy IDE), BBEdit (a great and popular Macintosh plain-text editor), or vi (a plain-text Unix editor, lacking a graphical interface). Still, some text editors and IDEs make typing and debugging HTML and PHP easier (conversely, Notepad on Windows does some things that make coding harder). If you don’t already have an application you’re attached to, search the Web or use the book’s corresponding forum (www.DMCInsights.com/phorum/) to find one.

2. Start a basic HTML document.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
<title>Basic PHP Page</title>
</head>
<body>
<p>This is standard HTML.</p>
Although this is the syntax being used throughout the book, you can change the HTML to match whichever standard you intend to use (e.g., HTML 4.0 Strict). Again, see a dedicated (X)HTML resource if you’re unfamiliar with this HTML code (see the first tip).

Script 1.2 This first PHP script doesn’t do anything, per se, but does demonstrate how a PHP script is written. It’ll also be used as a test, prior to getting into elaborate PHP code.

3. Before the closing body tag, insert your PHP tags.
<?php
?>
</body>
</html>
These are the formal PHP tags, also known as XML-style tags. Although PHP supports other tag types (see the second tip), I recommend that you use the formal type, and I will do so throughout this book.

4. Save the file as first.php.
Remember that if you don’t save the file using an appropriate PHP extension, the script will not execute properly.

5. Place the file in the proper directory of your Web server.
If you are running PHP on your own computer (presumably after following the installation directions in Appendix A, “Installation”), you just need to move, copy, or save the file to a specific folder on your computer. Check the documentation for your particular Web server to identify the correct directory, if you don’t already know what it is.
If you are running PHP on a hosted server (i.e., on a remote computer), you’ll need to use an FTP application to upload the file to the proper directory. Your hosting company will provide you with access and the other necessary information.

6. Run first.php in your Web browser (Figure 1.1).
Because PHP scripts need to be parsed by the server, you absolutely must access them via the URL. You cannot simply open them in your Web browser as you would a file in other applications.
If you are running PHP on your own computer, you’ll need to go to something like http://localhost/first.php, http://127.0.0.1/first.php, or http://localhost/~<user>/first.php (on Mac OS X, using your actual username for <user>). If you are using a Web host, you’ll need to use http://your-domain-name/first.php (e.g., http://www.example.com/first.php).

Figure 1.1 While it seems like any other (simple) HTML page, this is in fact a PHP script and the basis for the rest of the examples in the book.

7. If you don’t see results like those in Figure 1.1, start debugging.
Part of learning any programming language is mastering debugging. It’s a sometimes-painful but absolutely necessary process. With this first example, if you don’t see a simple, but perfectly valid, Web page, follow these steps:
   1. Confirm that you have a working PHP installation (see Appendix A for testing instructions).
   2. Make sure that you are running the script through a URL. The address in the Web browser must begin with http://. If it starts with file://, that’s the problem (Figure 1.2).
   3. If you get a file not found (or similar) error, you’ve likely put the file in the wrong directory or mistyped the file’s name (either when saving it or in your Web browser).
If you’ve gone through all this and are still having problems, turn to the book’s corresponding forum (www.DMCInsights.com/phorum/list.php?20).

Figure 1.2 If you see the actual PHP code (in this case, the tags) in the Web browser, this means that the PHP Web server is not running the code for one reason or another.

✔ Tips
■ To find more information about HTML and XHTML, check out Elizabeth Castro’s excellent book HTML, XHTML, and CSS, Sixth Edition: Visual QuickStart Guide (Peachpit Press, 2006) or search the Web.
■ There are actually three different pairs of PHP tags. Besides the formal (<?php and ?>), there are the short tags (<? and ?>), and the script style (<script language="php"> and </script>). This last style is rarely used, and the formal style is recommended.
■ Because I am running PHP on my own computer, you will sometimes see URLs like http://127.0.0.1:8000/first.php in this book’s figures. The important thing is that I’m running these scripts via http://; don’t let the rest of the URL confuse you.
■ You can embed multiple sections of PHP code within a single HTML document (i.e., you can go in and out of the two languages). You’ll see examples of this throughout the book.
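As a rough sketch of that last tip, the fragment below steps into and out of PHP twice within one page. The $sections variable and its values are invented purely for this illustration (variables and echo() are covered later in the chapter); only the pattern of alternating PHP tags and plain HTML matters here.

```php
<?php
// First PHP section. $sections is a made-up variable for this sketch.
$sections = 1;
?>
<p>This paragraph is plain HTML and is sent to the browser as-is.</p>
<?php
// Second PHP section: values set in the first section are still available.
$sections = $sections + 1;
echo "<p>This page used $sections PHP sections.</p>";
// The final closing tag is optional when nothing follows the PHP code.
```

Everything between the two PHP sections reaches the browser untouched, while the code inside the tags is processed on the server first.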
Sending Data to the Web Browser

To create dynamic Web sites with PHP, you must know how to send data to the Web browser. PHP has a number of built-in functions for this purpose, the most common being echo() and print(). I personally tend to favor echo():

echo 'Hello, world!';
echo "What's new?";

You could use print() instead, if you prefer:

print "Hello, world!";
print "What's new?";

As you can see from these examples, you can use either single or double quotation marks (but there is a distinction between the two types of quotation marks, which will be made clear by the chapter’s end). The first quotation mark after the function name indicates the start of the message to be printed. The next matching quotation mark (i.e., the next quotation mark of the same kind as the opening mark) indicates the end of the message to be printed.

Along with learning how to send data to the Web browser, you should also notice that in PHP all statements (a line of executed code, in layman’s terms) must end with a semicolon. Also, PHP is case-insensitive when it comes to function names, so ECHO(), echo(), eCHo(), and so forth will all work. The all-lowercase version is easiest to type, of course.

Needing an Escape

As you might discover, one of the complications with sending data to the Web involves printing single and double quotation marks. Either of the following will cause errors:

echo "She said, "How are you?"";
echo 'I'm just ducky.';

There are two solutions to this problem. First, use single quotation marks when printing a double quotation mark and vice versa:

echo 'She said, "How are you?"';
echo "I'm just ducky.";

Or, you can escape the problematic character by preceding it with a backslash:

echo "She said, \"How are you?\"";
print 'I\'m just ducky.';

An escaped quotation mark will merely be printed like any other character.
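The two fixes can be compared side by side in a short sketch. The variable names below are invented for the illustration (variables are introduced later in this chapter); storing each message first simply makes it easy to see that both approaches yield identical text.

```php
<?php
// Fix 1: wrap the message in the other kind of quotation mark.
$mixed_a = 'She said, "How are you?"';
$mixed_b = "I'm just ducky.";

// Fix 2: keep the same kind of mark and escape the inner ones with a backslash.
$escaped_a = "She said, \"How are you?\"";
$escaped_b = 'I\'m just ducky.';

// Each pair holds the same characters; the backslashes themselves are not printed.
echo $mixed_a . "\n";
echo $escaped_a . "\n";
echo $mixed_b . "\n";
echo $escaped_b . "\n";
```

Which fix to prefer is largely a matter of taste; swapping the outer quotation marks tends to be easier to read when a message contains only one kind of mark.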
Understanding how to use the backslash to escape a character is an important concept, and one that will be covered in more depth at the end of the chapter.

Script 1.3 Using print() or echo(), PHP can send data to the Web browser (see Figure 1.3).

To send data to the Web browser:

1. Open first.php (refer to Script 1.2) in your text editor or IDE.

2. Between the PHP tags (lines 9 and 10), add a simple message (Script 1.3).
echo 'This was generated using PHP!';
It truly doesn’t matter what message you type here, which function you use (echo() or print()), or which quotation marks, for that matter. Just be careful if you are printing a single or double quotation mark as part of your message (see the sidebar “Needing an Escape”).

3. If you want, change the page title to better describe this page (line 5).
<title>Using Echo()</title>
This change only affects the browser window’s title bar.

4. Save the file as second.php, place it in your Web directory, and test it in your Web browser (Figure 1.3).

Figure 1.3 The results still aren’t glamorous, but this page was in part dynamically generated by PHP.

5. If necessary, debug the script.
If you see a parse error instead of your message (see Figure 1.4), check that you have both opened and closed your quotation marks and escaped any problematic characters (see the sidebar). Also be certain to conclude each statement with a semicolon.
If you see an entirely blank page, this is probably for one of two reasons:
▲ There is a problem with your HTML. Test this by viewing the source of your page and looking for HTML problems there (Figure 1.5).
▲ An error occurred, but display_errors is turned off in your PHP configuration, so nothing is shown. In this case, see the section in Appendix A on how to configure PHP so that you can turn display_errors back on.

Figure 1.4 This may be the first of many parse errors you see as a PHP programmer (this one is caused by an un-escaped quotation mark).

Figure 1.5 One possible cause of a blank PHP page is a simple HTML error, like the closing title tag here (it’s missing the slash).

✔ Tips
■ Technically, echo() and print() are language constructs, not functions. That being said, don’t be flummoxed as I continue to call them “functions” for convenience. Also, I include the parentheses when referring to functions (say echo(), not just echo) to help distinguish them from variables and other parts of PHP. This is just my own little convention.
■ You can, and often will, use echo() and print() to send HTML code to the Web browser, like so (Figure 1.6):
echo '<p>Hello, <b>world</b>!</p>';
■ Both echo() and print() can be used to print text over multiple lines:
echo 'This sentence is
printed over two lines.';
What happens in this case is that the return (created by pressing Enter or Return) becomes part of the printed message, which isn’t terminated until the closing single quotation mark. The net result will be the “printing” of the return in the HTML source code (Figure 1.7). This will not have an effect on the generated page (Figure 1.8). For more on this, see the sidebar “Understanding White Space.”

Figure 1.6 PHP can send HTML code (like the formatting here) as well as simple text (see Figure 1.3) to the Web browser.

Figure 1.7 Printing text and HTML over multiple PHP lines will generate HTML source code that also extends over multiple lines. Note that extraneous white spacing in the HTML source will not affect the look of a page (see Figure 1.8) but can make the source easier to review.

Figure 1.8 The return in the HTML source (Figure 1.7) has no effect on the rendered result. The only way to alter the spacing of a displayed Web page is to use HTML tags (like <br /> and <p></p>).

Understanding White Space

With PHP you send data (like HTML tags and text) to the Web browser, which will, in turn, render that data as the Web page the end user sees. Thus, what you are doing with PHP is creating the HTML source of a Web page. With this in mind, there are three areas of notable white space (extra spaces, tabs, and blank lines): in your PHP scripts, in your HTML source, and in the rendered Web page.

PHP is generally white space insensitive, meaning that you can space out your code however you want to make your scripts more legible. HTML is also generally white space insensitive. Specifically, the only white space in HTML that affects the rendered page is a single space (multiple spaces still get rendered as one). If your HTML source has text on multiple lines, that doesn’t mean it’ll appear on multiple lines in the rendered page (see Figures 1.7 and 1.8).

To alter the spacing in a rendered Web page, use the HTML tags <br /> (line break, <br> in older HTML standards) and <p></p>
(paragraph).

To alter the spacing of the HTML source created with PHP, you can
◆ Use echo() or print() over the course of several lines.
or
◆ Print the newline character (\n) within double quotation marks.

Writing Comments

Creating executable PHP code is only a part of the programming process (admittedly, it’s the most important part). A secondary but still crucial aspect to any programming endeavor involves documenting your code.

In HTML you can add comments using special tags:

<!-- Comment goes here. -->

HTML comments are viewable in the source (Figure 1.9) but do not appear in the rendered page.

Figure 1.9 HTML comments appear in the browser’s source code but not in the rendered Web page.

PHP comments are different in that they aren’t sent to the Web browser at all, meaning they won’t be viewable to the end user, even when looking at the HTML source.

PHP supports three comment types. The first uses the pound or number symbol (#):

# This is a comment.

The second uses two slashes:

// This is also a comment.

Both of these cause PHP to ignore everything that follows until the end of the line (when you press Return or Enter). Thus, these two comments are for single lines only. They are also often used to place a comment on the same line as some PHP code:

print 'Hello!'; // Say hello.

A third style allows comments to run over multiple lines:

/* This is a longer comment
that spans two lines. */

Script 1.4 These basic comments demonstrate the three syntaxes you can use in PHP.

To comment your scripts:

1. Begin a new PHP document in your text editor or IDE, starting with the initial HTML (Script 1.4).
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
<title>Comments</title>
</head>
<body>

2. Add the initial PHP tag and write your first comments.

3. Send some HTML to the Web browser.
echo '<p>This is a line of text.<br />This is another line of text.</p>';
It doesn’t matter what you do here, just so the Web browser has something to display. For the sake of variety, I’ll have the echo() statement print some HTML tags, including a line break (<br />) to add some spacing to the generated HTML page.

4. Use the multiline comments to comment out a second echo() statement.
/*
echo 'This line will not be executed.';
*/
By surrounding any block of PHP code with /* and */, you can render that code inert without having to delete it from your script. By later removing the comment tags, you can reactivate that section of PHP code.

5. Add a final comment after a third echo() statement.
echo "<p>Now I'm done.</p>"; // End of PHP code.
This last (superfluous) comment shows how to place one at the end of a line, a common practice. Note that I used double quotation marks to surround the message, as single quotation marks would conflict with the apostrophe (see the “Needing an Escape” sidebar, earlier in the chapter).

6. Close the PHP section and complete the HTML page.
?>
</body>
</html>

7. Save the file as comments.php, place it in your Web directory, and test it in your Web browser (Figure 1.10).

Figure 1.10 The PHP comments in Script 1.4 don’t appear in the Web page or the HTML source (Figure 1.11).

8. If you’re the curious type, check the source code in your Web browser to confirm that the PHP comments do not appear there (Figure 1.11).

Figure 1.11 The PHP comments from Script 1.4 are nowhere to be seen in the client’s browser.

✔ Tips
■ You shouldn’t nest (place one inside another) multiline comments (/* */). Doing so will cause problems.
■ Any of the PHP comments can be used at the end of a line (say, after a function call):
echo 'Howdy'; /* Say 'Howdy' */
Although this is allowed, it’s far less common.
■ It’s nearly impossible to over-comment your scripts. Always err on the side of writing too many comments as you code. That being said, in the interest of saving space, the scripts in this book will not be as well documented as I would suggest they should be.
■ It’s also important that as you change a script you keep the comments up-to-date and accurate. There’s nothing more confusing than a comment that says one thing when the code really does something else.

What Are Variables?

Variables are containers used to temporarily store values. These values can be numbers, text, or much more complex data. PHP has eight types of variables.
These include four scalar (single-valued) types: Boolean (TRUE or FALSE), integer, floating point (decimals), and strings (characters); two nonscalar (multivalued) types: arrays and objects; plus resources (which you’ll see when interacting with databases) and NULL (which is a special type that has no value).

Regardless of what type you are creating, all variables in PHP follow certain syntactical rules:
◆ A variable’s name (also called its identifier) must start with a dollar sign ($), for example, $name.
◆ The variable’s name can contain a combination of letters, numbers, and the underscore, for example, $my_report1.
◆ The first character after the dollar sign must be either a letter or an underscore (it cannot be a number).
◆ Variable names in PHP are case-sensitive. This is a very important rule. It means that $name and $Name are entirely different variables.

To begin working with variables, let’s make use of several predefined variables whose values are automatically established when a PHP script is run. Before getting into this script, there are two more things you should know. First, variables can be assigned values using the equals sign (=), also called the assignment operator. Second, variables can be printed without quotation marks:

print $some_var;

Or variables can be printed within double quotation marks:

print "Hello, $name";

You cannot print a variable’s value within single quotation marks; the variable’s name is printed literally instead:

print 'Hello, $name'; // Won't work!

Script 1.5 This script prints three of PHP’s many predefined variables.

To use variables:

1. Begin a new PHP document in your text editor or IDE, starting with the initial HTML (Script 1.5).
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
<title>Predefined Variables</title>
</head>
<body>

2. Add your opening PHP tag and your first comment.
<?php # Script 1.5 - predefined.php
From here on out, my scripts will no longer comment on the creator, creation date, and so forth, although you should continue to document your scripts thoroughly. I will, however, make a comment listing the script number and filename for ease of cross-referencing (both in the book and when you download them from the book’s supporting Web site, www.DMCInsights.com/phpmysql3/).

3. Create a shorthand version of the first variable to be used in this script.
$file = $_SERVER['SCRIPT_FILENAME'];
This script will use three variables, each of which comes from the larger and predefined $_SERVER variable. $_SERVER refers to a mass of server-related information. The first variable the script uses is $_SERVER['SCRIPT_FILENAME']. This variable stores the full path and name of the script being run (for example, C:\Program Files\Apache\htdocs\predefined.php).
The value stored in $_SERVER['SCRIPT_FILENAME'] will be assigned to the new variable $file. Creating new variables with shorter names and then assigning them values from $_SERVER will make it easier to refer to the variables when printing them. (It also gets around some other issues you’ll learn about in due time.)

4. Create a shorthand version of the other two variables.
$user = $_SERVER['HTTP_USER_AGENT'];
$server = $_SERVER['SERVER_SOFTWARE'];
$_SERVER['HTTP_USER_AGENT'] represents the Web browser and operating system of the user accessing the script. This value is assigned to $user.
$_SERVER['SERVER_SOFTWARE'] represents the Web application on the server that’s running PHP (e.g., Apache, Abyss, Xitami, IIS). This is the program that must be installed (see Appendix A) in order to run PHP scripts on that computer.

5. Print out the name of the script being run.
echo "<p>You are running the file:<br /><b>$file</b>.</p>\n";
The first variable to be printed is $file. Notice that this variable must be printed out within double quotation marks and that I also make use of the PHP newline (\n), which will add a line break in the generated HTML source. Some basic HTML tags (paragraph and bold) are added to give the generated page some flair.

6. Print out the information of the user accessing the script.
echo "<p>You are viewing this page using:<br /><b>$user</b></p>\n";
This line prints the second variable, $user. To repeat what’s said in the fourth step, $user correlates to $_SERVER['HTTP_USER_AGENT'] and refers to the operating system, browser type, and browser version being used to access the Web page.

7. Print out the server information.
echo "<p>This server is running:<br /><b>$server</b>.</p>\n";

8. Complete the HTML and PHP code.
?>
</body>
</html>

9. Save your file as predefined.php, place it in your Web directory, and test it in your Web browser (Figure 1.12).

Figure 1.12 The predefined.php script reports back to the viewer information about the script, the Web browser being used to view it, and the server itself.

Figure 1.13 This is the book’s first truly dynamic script, in that the Web page changes depending upon the server running it and the Web browser viewing it (compare with Figure 1.12).

✔ Tips
■ If you have problems with this, or any other script, turn to the book’s corresponding Web forum (www.DMCInsights.com/phorum/) for assistance.
■ If possible, run this script using a different Web browser and/or on another server (Figure 1.13).
■ The most important consideration when creating variables is to use a consistent naming scheme. In this book you’ll see that I use all-lowercase letters for my variable names, with underscores separating words ($first_name). Some programmers prefer to use capitalization instead: $FirstName.
■ PHP is very casual in how it treats variables, meaning that you don’t need to initialize them (set an immediate value) or declare them (set a specific type), and you can convert a variable among the many types without problem.

Introducing Strings

The first variable type to delve into is strings. A string is merely a quoted chunk of characters: letters, numbers, spaces, punctuation, and so forth. These are all strings:
◆ 'Tobias'
◆ "In watermelon sugar"
◆ '100'
◆ 'August 2, 2006'

To make a string variable, assign a string value to a valid variable name:

$first_name = 'Tobias';
$today = 'August 2, 2006';

When creating strings, you can use either single or double quotation marks to encapsulate the characters, just as you would when printing text. Likewise, you must use the same type of quotation mark for the beginning and the end of the string.
If that same mark appears within the string, it must be escaped:

$var = "Define \"platitude\", please.";

To print out the value of a string, use either echo() or print():

echo $first_name;

To print the value of a string within a context, use double quotation marks:

echo "Hello, $first_name";

You’ve already worked with strings once, when using the predefined variables in the preceding section. In this next example, you’ll create and use new strings.

Script 1.6 String variables are created and their values sent to the Web browser in this introductory script.

To use strings:

1. Begin a new PHP document in your text editor or IDE, starting with the initial HTML and including the opening PHP tag (Script 1.6).
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
<title>Strings</title>
</head>
<body>
<?php # Script 1.6 - strings.php

2. Within the PHP tags, create the $first_name, $last_name, and $book variables.

3. Print the values of the variables.
echo "<p>The book <em>$book</em> was written by $first_name $last_name.</p>";
All this script does is print a statement of authorship based upon three established variables. A little HTML formatting (the emphasis on the book’s title) is thrown in to make it more attractive. Remember to use double quotation marks here for the variable values to be printed out appropriately (more on the importance of double quotation marks at the chapter’s end).

4. Complete the HTML and PHP code.
?>
</body>
</html>

5. Save the file as strings.php, place it in your Web directory, and test it in your Web browser (Figure 1.14).

6. If desired, change the values of the three variables, save the file, and run the script again (Figure 1.15).

Figure 1.14 The resulting Web page is based upon printing out the values of three variables.

Figure 1.15 The output of the script is changed by altering the variables in it.

✔ Tips
■ If you assign another value to an existing variable (say $book), the new value will overwrite the old one. For example:
$book = 'High Fidelity';
$book = 'The Corrections';
/* $book now has a value of 'The Corrections'. */
■ PHP has no set limits on how big a string can be. It’s theoretically possible that you’ll be limited by the resources of the server, but it’s doubtful that you’ll ever encounter such a problem.

Concatenating Strings

Concatenation is like addition for strings, whereby characters are added to the end of the string. It’s performed using the concatenation operator, which is the period (.):

$city = 'Seattle';
$state = 'Washington';
$address = $city . $state;

The $address variable now has the value SeattleWashington, which almost achieves the desired result (Seattle, Washington). To improve upon this, you could write

$address = $city . ', ' . $state;

so that a comma and a space are added to the mix.

Concatenation works with strings or numbers. Either of these statements will produce the same result (Seattle, Washington 98101):

$address = $city . ', ' . $state . ' 98101';
$address = $city . ', ' . $state . ' ' . 98101;

Let’s modify strings.php to use this new operator.

Script 1.7 Concatenation gives you the ability to easily manipulate strings, like creating an author’s name from the combination of their first and last names.

To use concatenation:

1. Open strings.php (refer to Script 1.6) in your text editor or IDE.

2. After you’ve established the $first_name and $last_name variables (lines 11 and 12), add this line (Script 1.7):
$author = $first_name . ' ' . $last_name;
As a demonstration of concatenation, a new variable, $author, will be created as the concatenation of two existing strings and a space in between.

3. Change the echo() statement to use this new variable.
echo "<p>The book <em>$book</em> was written by $author.</p>";
Since the two variables have been turned into one, the echo() statement should be altered accordingly.

4. If desired, change the HTML page title and the values of the first name, last name, and book variables.

5. Save the file as concat.php, place it in your Web directory, and test it in your Web browser (Figure 1.16).

Figure 1.16 In this revised script, the end result of concatenation is not apparent to the user (compare with Figures 1.14 and 1.15).

✔ Tips
■ PHP has a slew of useful string-specific functions, which you’ll see over the course of this book. For example, to calculate how long a string is (how many characters it contains), use strlen():
$num = strlen('some string');
■ You can have PHP convert the case of strings with: strtolower(), which makes it entirely lowercase; strtoupper(), which makes it entirely uppercase; ucfirst(), which capitalizes the first character; and ucwords(), which capitalizes the first character of every word.
■ If you are merely concatenating one value to another, you can use the concatenation assignment operator (.=). The following are equivalent:
$title = $title . $subtitle;
$title .= $subtitle;
■ The initial example in this section could be rewritten using either
$address = "$city, $state";
or
$address = $city;
$address .= ', ';
$address .= $state;

Introducing Numbers

In introducing variables, I was explicit in stating that PHP has both integer and floating-point (decimal) number types. In my experience, though, these two types can be classified under the generic title numbers without losing any valuable distinction (for the most part). Valid numbers in PHP can be anything like
◆ 8
◆ 3.14
◆ 10980843985
◆ -4.2398508
◆ 4.4e2

Notice that these values are never quoted (in which case they’d be strings with numeric values), nor do they include commas to indicate thousands. Also, a number is assumed to be positive unless it is preceded by the minus sign (-).
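To make the difference between a number and a quoted numeric string concrete, here is a small sketch using PHP’s built-in type-checking functions is_int(), is_float(), and is_string(). The variable names are invented for this example and do not come from the book’s scripts.

```php
<?php
// Hypothetical variables, chosen only for this illustration.
$count = 8;        // an integer
$pi = 3.14;        // a floating-point number
$big = 4.4e2;      // exponent notation: a float equal to 440
$quoted = '100';   // quoted, so this is a string with a numeric value

var_dump(is_int($count));      // bool(true)
var_dump(is_float($pi));       // bool(true)
var_dump(is_float($big));      // bool(true)
var_dump($big == 440);         // bool(true)
var_dump(is_string($quoted));  // bool(true)
var_dump(is_int($quoted));     // bool(false)
```

In everyday scripts the distinction rarely gets in the way, since PHP converts between these types as needed, but it helps explain why '100' was listed among the strings earlier in the chapter.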
Along with the standard arithmetic operators you can use on numbers (Table 1.1), there are dozens of functions. Two common ones are round() and number_format().

Table 1.1 Arithmetic Operators. The standard mathematical operators.

Operator    Meaning
+           Addition
-           Subtraction
*           Multiplication
/           Division
%           Modulus
++          Increment
--          Decrement

The former rounds a decimal to the nearest integer:

$n = 3.14;
$n = round ($n); // 3

It can also round to a specified number of decimal places:

$n = 3.142857;
$n = round ($n, 3); // 3.143

The number_format() function turns a number into the more commonly written version, grouped into thousands using commas:

$n = 20943;
$n = number_format ($n); // 20,943

This function can also set a specified number of decimal places:

$n = 20943;
$n = number_format ($n, 2); // 20,943.00

To practice with numbers, let’s write a mock-up script that performs the calculations one might use in an e-commerce shopping cart.

To use numbers:

1. Begin a new PHP document in your text editor or IDE (Script 1.8).

Script 1.8 The numbers.php script demonstrates basic mathematical calculations, like those used in an e-commerce application.

Figure 1.17 The numbers PHP page (Script 1.8) performs calculations based upon set values.

Figure 1.18 To change the generated Web page, alter any or all of the three variables (compare with Figure 1.17).

The second line then adds the amount of tax to the total (calculated by multiplying the tax rate by the total).

4. Format the total.

$total = number_format ($total, 2);

The number_format() function will group the total into thousands and round it to two decimal places. This will make the display more appropriate to the end user.

5. Print the results.

echo 'You are purchasing ' . $quantity . ' widget(s) at a cost of $' . $price . ' each. With tax, the total comes to $' . $total . '.';

The last step in the script is to print out the results. To use a combination of HTML, printed dollar signs, and variables, the echo() statement uses both single-quoted text and concatenated variables. You could also put this all within a double-quoted string (as in previous examples), but when PHP encounters, for example, at a cost of $$price in the echo() statement, the double dollar sign would cause problems. You’ll see an alternative solution in the last example of this chapter.

6. Complete the PHP code and the HTML page.

?>

7. Save the file as numbers.php, place it in your Web directory, and test it in your Web browser (Figure 1.17).

8. If desired, change the initial three variables and rerun the script (Figure 1.18).

✔ Tips

■ PHP supports a maximum integer of around two billion on 32-bit platforms (64-bit builds allow far larger values). With numbers larger than the platform’s maximum, PHP will automatically use a floating-point type.

■ When dealing with arithmetic, the issue of precedence arises (the order in which complex calculations are made). While the PHP manual and other sources tend to list out the hierarchy of precedence, I find programming to be safer and more legible when I group clauses in parentheses to force the execution order (see line 17 of Script 1.8).

■ Computers are notoriously poor at dealing with decimals. For example, a value you expect to be 2.0 may actually be stored as 1.99999. Most of the time this won’t be a problem, but in cases where mathematical precision is paramount, rely on integers, not decimals. The PHP manual has information on this subject, as well as alternative functions for improving computational accuracy.

■ Many of the mathematical operators also have a corresponding assignment operator, letting you create a shorthand for assigning values.
This line,

$total = $total + ($total * $taxrate);

could be rewritten as

$total += ($total * $taxrate);

■ If you set a $price value without using two decimals (e.g., 119.9 or 34), you would want to apply number_format() to $price before printing it.

Introducing Constants

Constants, like variables, are used to temporarily store a value, but otherwise, constants and variables differ in many ways. For starters, to create a constant, you use the define() function instead of the assignment operator (=):

define ('NAME', 'value');

Notice that, as a rule of thumb, constants are named using all capitals, although this is not required. Most importantly, constants do not use the initial dollar sign as variables do (because constants are not variables).

A constant can only be assigned a scalar value, like a string or a number. And unlike variables, a constant’s value cannot be changed.

To access a constant’s value, like when you want to print it, you cannot put the constant within quotation marks:

echo "Hello, USERNAME"; // Won't work!

With that code, PHP would literally print Hello, USERNAME and not the value of the USERNAME constant (because there’s no indication that USERNAME is anything other than literal text). Instead, either print the constant by itself:

echo 'Hello, ';
echo USERNAME;

or use the concatenation operator:

echo 'Hello, ' . USERNAME;

PHP runs with several predefined constants, much like the predefined variables used earlier in the chapter. These include PHP_VERSION (the version of PHP running) and PHP_OS (the operating system of the server).

To use constants:

1. Begin a new PHP document in your text editor or IDE (Script 1.9).

echo 'Today is ' . TODAY . '. This server is running version ' . PHP_VERSION . ' of PHP on the ' . PHP_OS . ' operating system.';

Since constants cannot be printed within quotation marks, use the concatenation operator to create the echo() statement.

Script 1.9 Constants are another temporary storage tool you can use in PHP, distinct from variables.

4. Complete the PHP code and the HTML page.

?>

5. Save the file as constants.php, place it in your Web directory, and test it in your Web browser (Figure 1.19).

✔ Tips

■ If possible, run this script on another PHP-enabled server (Figure 1.20).

■ In Chapter 11, “Cookies and Sessions,” you’ll learn about another constant, SID (which stands for session ID).

Figure 1.19 By making use of PHP’s constants, you can learn more about your PHP setup.

Figure 1.20 Running the same script (refer to Script 1.9) on different servers garners different results.

Single vs. Double Quotation Marks

In PHP it’s important to understand how single quotation marks differ from double quotation marks. With echo() and print(), or when assigning values to strings, you can use either, as in the examples used so far. But there is a key difference between the two types of quotation marks and when you should use which. I’ve introduced this difference already, but it’s an important enough concept to merit more discussion.

In PHP, values enclosed within single quotation marks will be treated literally, whereas those within double quotation marks will be interpreted. In other words, placing variables and special characters (Table 1.2) within double quotes will result in their represented values printed, not their literal values. For example, assume that you have

$var = 'test';

The code

echo "var is equal to $var";

will print out var is equal to test, whereas the code

echo 'var is equal to $var';

will print out var is equal to $var. Using an escaped dollar sign, the code

echo "\$var is equal to $var";

will print out $var is equal to test, whereas the code

echo '\$var is equal to $var';

will print out \$var is equal to $var.

As these examples should illustrate, double quotation marks will replace a variable’s name ($var) with its value (test) and a special character’s code (\$) with its represented value ($).
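The four cases described above can be verified by storing each string and comparing the results. This is a small sketch; apart from $var, which mirrors the text, the variable names are made up for illustration.

```php
<?php
// $var mirrors the example in the text; the other names are hypothetical.
$var = 'test';

$double        = "var is equal to $var";   // variable is interpolated
$single        = 'var is equal to $var';   // treated literally
$escapedDouble = "\$var is equal to $var"; // \$ becomes a plain dollar sign
$escapedSingle = '\$var is equal to $var'; // backslash and $var both stay

echo $double, "\n";        // var is equal to test
echo $single, "\n";        // var is equal to $var
echo $escapedDouble, "\n"; // $var is equal to test
echo $escapedSingle, "\n"; // \$var is equal to $var
```

Only the double-quoted strings change between what is typed and what is printed, which is the whole distinction in practice.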
Single quotes will always display exactly what you type, except for the escaped single quote (\') and the escaped backslash (\\), which are printed as a single quotation mark and a single backslash, respectively.

As another example of how the two quotation marks differ, let’s modify the numbers.php script as an experiment.

Table 1.2 Escape Sequences. These characters have special meanings when used within double quotation marks.

Code    Meaning
\"      Double quotation mark
\'      Single quotation mark
\\      Backslash
\n      Newline
\r      Carriage return
\t      Tab
\$      Dollar sign

To use single and double quotation marks:

1. Open numbers.php (refer to Script 1.8) in your text editor or IDE.

2. Delete the existing echo() statement (Script 1.10).

Script 1.10 This, the final script in the chapter, demonstrates the differences between using single and double quotation marks.

3. Print a caption and then rewrite the original echo() statement using double quotation marks.

echo 'Using double quotation marks:';
echo "You are purchasing $quantity widget(s) at a cost of \$$price each. With tax, the total comes to \$$total.\n";

In the original script, the results were printed using single quotation marks and concatenation. The same result can be achieved using double quotation marks. When using double quotation marks, the variables can be placed within the string.

There is one catch, though: trying to print a dollar amount as $12.34 (where 12.34 comes from a variable) would suggest that you would code $$var. That will not work; instead, escape the initial dollar sign, resulting in \$$var, as you see twice in this code. The first dollar sign will be printed, and the second becomes the start of the variable name.

4. Repeat the echo() statements, this time using single quotation marks.

echo 'Using single quotation marks:';
echo 'You are purchasing $quantity widget(s) at a cost of \$$price each. With tax, the total comes to \$$total.\n';

This echo() statement is used to highlight the difference between using single or double quotation marks. It will not work as desired, and the resulting page will show you exactly what does happen instead.

5. If you want, change the page’s title.

6. Save the file as quotes.php, place it in your Web directory, and test it in your Web browser (Figure 1.21).

7. View the source of the Web page to see how using the newline character (\n) within each quotation mark type also differs.

You should see that when you place the newline character within double quotation marks it creates a newline in the HTML source. When placed within single quotation marks, the literal characters \ and n are printed instead.

Figure 1.21 These results demonstrate when and how you’d use one type of quotation mark as opposed to the other. If you’re still unclear as to the difference between the types, use double quotation marks and you’re less likely to have problems.

✔ Tips

■ Because PHP will attempt to find variables within double quotation marks, using single quotation marks is theoretically faster. To have a variable’s value inserted directly within a string, though, you must use double quotation marks (or concatenate the variable onto a single-quoted string).

■ As valid HTML often includes a lot of double-quoted attributes, it’s often easiest to use single quotation marks when printing HTML with PHP:

echo '';

If you were to print out this HTML using double quotation marks, you would have to escape all of the double quotation marks in the string:

echo "";

2 Programming with PHP

Now that you have the fundamentals of the PHP scripting language down, it’s time to build on those basics and start truly programming. In this chapter you’ll begin creating more elaborate scripts while still learning some of the standard constructs, functions, and syntax of the language.

You’ll begin by creating an HTML form, then learning how you can use PHP to handle the submitted values. From there, the chapter covers conditionals and the remaining operators (Chapter 1, “Introduction to PHP,” presented the assignment, concatenation, and mathematical operators), arrays (another variable type), and one last language construct, loops.

Creating an HTML Form

Handling an HTML form with PHP is perhaps the most important process in any dynamic Web site. Two steps are involved: first you create the HTML form itself, and then you create the corresponding PHP script that will receive and process the form data.

It would be outside the realm of this book to go into HTML forms in any detail, but I will lead you through one quick example so that it may be used throughout the chapter. If you’re unfamiliar with the basics of an HTML form, including the various types of elements, see an HTML resource for more information.

An HTML form is created using the form tags and various elements for taking input. The form tags look like

In terms of PHP, the most important attribute of your form tag is action, which dictates to which page the form data will be sent. The second attribute, method, has its own issues (see the “Choosing a Method” sidebar), but post is the value you’ll use most frequently.

The different inputs (be they text boxes, radio buttons, select menus, check boxes, etc.) are placed within the opening and closing form tags. As you’ll see in the next section, what kinds of inputs your form has makes little difference to the PHP script handling it.
You should, however, pay attention to the names you give your form inputs, as they’ll be of critical importance when it comes to your PHP code.

Choosing a Method

The method attribute of a form dictates how the data is sent to the handling page. The two options (get and post) refer to the HTTP (Hypertext Transfer Protocol) method to be used.

The get method sends the submitted data to the receiving page as a series of name-value pairs appended to the URL. For example,

http://www.example.com/script.php?name=Homer&gender=M&age=35

The benefit of using the get method is that the resulting page can be bookmarked in the user’s Web browser (since it’s a URL). For that matter, you can also click Back in your Web browser to return to a get page, or reload it without problems (none of which is true for post). But there is a limit in how much data can be transmitted via get, and this method is less secure (since the data is visible).

Generally speaking, get is used for requesting information, like a particular record from a database or the results of a search (searches almost always use get). The post method is used when an action is required, as when a database record will be updated or an email should be sent. For these reasons I will primarily use post throughout this book, with noted exceptions.

To create an HTML form:

1. Begin a new HTML document in your text editor (Script 2.1).

There’s nothing significantly new here. The document still uses the same basic syntax for an HTML page as in the previous chapter. An HTML comment indicates the file’s name and number.

2. Add the initial form tag.

Since the action attribute dictates to which script the form data will go, you should give it an appropriate name (handle_form to correspond with this script: form.html) and the .php extension (since a PHP page will handle this form’s data).

3. Begin the HTML form.
Enter your ➝ information in the form ➝ below: 35 Programming with PHP Creating an HTML Form 1 2 3 4 5 Simple HTML Form 6 7 8 9 10 11 12
Enter your information in the form below: 13 14

Name:

15 16

Email Address:

17 18

Gender: Male Female

19 20

Age: 21

Email Address:

These are just simple text inputs, allowing the user to enter their name and email address (Figure 2.1). In case you are wondering, the extra space and slash at the end of each input’s tag are required for valid XHTML. With standard HTML, these tags would conclude, for instance, with maxlength="40"> or maxlength="60"> instead.

5. Add a pair of radio buttons.

Gender: Male Female

The radio buttons (Figure 2.2) both have the same name, meaning that only one of the two can be selected. They have different values, though. 6. Add a pull-down menu.

Age:

26 27

Comments:

28 29
30 31
Figure 2.1 Two text inputs.

Figure 2.2 If multiple radio buttons have the same name, only one can be chosen by the user.

Script 3.7 continued

    echo "\n";
    }
    echo '';

} // End of the function definition.

// Create the form tags:
echo 'Select a Date:';

// Call the function.
make_calendar_pulldowns();

echo '';

include ('includes/footer.html');
?>

This code is exactly as it was in the original script, only it’s now placed within a function definition.

4. Close the function definition.

} // End of the function definition.

It’s helpful to place a comment at the end of a function definition so that you know where a definition starts and stops.

5. Create the form and call the function.

echo 'Select a Date:';
make_calendar_pulldowns();
echo '';

This code will create a header tag, plus the tags for the form. The call to the make_calendar_pulldowns() function will have the end result of creating the code for the three pull-down menus.

6. Complete the PHP script by including the HTML footer.

include ('includes/footer.html');
?>

7. Save the file as dateform.php, place it in your Web directory (in the same folder as index.php), and test it in your Web browser (Figure 3.13).

Figure 3.13 These pull-down menus are generated by a user-defined function.

✔ Tips

■ If you ever see a call to undefined function function_name error, this means that you are calling a function that hasn’t been defined. This can happen if you misspell the function’s name (either when defining or calling it) or if you fail to include the file where the function is defined.

■ Because a user-defined function takes up some memory, you should be prudent about when to use one. As a general rule, functions are best used for chunks of code that may be executed in several places in a script or Web site.

Creating a function that takes arguments

Just like PHP’s built-in functions, those you write can take arguments (also called parameters). For example, the isset() function takes as an argument the name of a variable to be tested. The strlen() function takes as an argument the string whose character length will be determined.

A function can take any number of arguments, but the order in which you list them is critical. To allow for arguments, add variables to a function’s definition:

function print_hello ($first, $last) {
    // Function code.
}

The variable names you use for your arguments are irrelevant to the rest of the script (more on this in the “Variable Scope” sidebar toward the end of this chapter), but try to use valid, meaningful names.
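To make the point about argument order concrete, here is a minimal sketch of print_hello() with a body filled in. The body is an assumption, since the definition above only shows a placeholder comment.

```php
<?php
// Hypothetical body: the source shows only "// Function code." here.
function print_hello($first, $last)
{
    // $first and $last exist only inside this function.
    echo "Hello, $first $last!\n";
}

// Values are matched to the arguments by position, not by name.
print_hello('Jimmy', 'Stewart'); // Hello, Jimmy Stewart!
print_hello('Stewart', 'Jimmy'); // Hello, Stewart Jimmy!
```

Swapping the two values changes the output because PHP assigns the first value passed to $first and the second to $last, whatever the caller intended.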
Once the function is defined, you can then call it as you would any other function in PHP, sending literal values or variables to it:

print_hello ('Jimmy', 'Stewart');
$surname = 'Stewart';
print_hello ('Jimmy', $surname);

As with any function in PHP, failure to send the right number of arguments results in an error (Figure 3.14).

Figure 3.14 Failure to send a function the proper number (and sometimes type) of arguments creates an error.

To demonstrate this concept, let’s rewrite the calculator process as a function.

To define functions that take arguments:

1. Open calculator.php (Script 3.6) in your text editor or IDE.

2. After including the header file, define the calculate_total() function (Script 3.8).

function calculate_total ($qty, $cost, $tax) {
    $total = ($qty * $cost);
    $taxrate = ($tax / 100);
    $total += ($total * $taxrate);
    echo 'The total cost of purchasing ' . $qty . ' widget(s) at $' . number_format ($cost, 2) . ' each, including a tax rate of ' . $tax . '%, is $' . number_format ($total, 2) . '.';
}

This function performs the same calculations as it did before and then prints out the result. It takes three arguments: the quantity being ordered, the price, and the tax rate. Notice that the variables used as arguments are not $_POST['quantity'], $_POST['price'], and $_POST['tax']. The function’s argument variables are particular to this function and have their own names. Notice as well that the calculations, and the printed result, use these function-specific variables, not those in $_POST (which will actually be sent to this function when it’s called).

3. Change the contents of the validation conditional (where the calculations were previously made) to read

echo 'Total Cost';
calculate_total ($_POST['quantity'], $_POST['price'], $_POST['tax']);

Again, this is just a minor rewrite of the way the script worked before. Assuming that all of the submitted values are numeric, a heading is printed (this is not done within the function) and the function is called (which will calculate and print the total).

When calling the function, three arguments are passed to it, each of which is a $_POST variable. The value of $_POST['quantity'] will be assigned to the function’s $qty variable; the value of $_POST['price'] will be assigned to the function’s $cost variable; and the value of $_POST['tax'] will be assigned to the function’s $tax variable.

4. Save the file as calculator.php, place it in your Web directory, and test it in your Web browser (Figure 3.15).

Script 3.8 The calculator.php script now uses a function to perform its calculations. Unlike the make_calendar_pulldowns() user-defined function, this one takes arguments.

} // End of function.

// Check for form submission:
if (isset($_POST['submitted'])) {

    // Minimal form validation:
    if ( is_numeric($_POST['quantity']) && is_numeric($_POST['price']) && is_numeric($_POST['tax']) ) {

        // Print the heading:
        echo 'Total Cost';

        // Call the function:
        calculate_total ($_POST['quantity'], $_POST['price'], $_POST['tax']);

    } else { // Invalid submitted values.
        echo 'Error! Please enter a valid quantity, price, and tax.';
    }

} // End of main isset() IF.

// Leave the PHP section and create the HTML form:
?>
Figure 3.15 Although a user-defined function is used to perform the calculations (see Script 3.8), the end result is no different to the user (see Figure 3.11).

Setting default argument values

Another variant on defining your own functions is to preset an argument’s value. To do so, assign the argument a value in the function’s definition:

function greet ($name, $msg = 'Hello') {
    echo "$msg, $name!";
}

The end result of setting a default argument value is that that particular argument becomes optional when calling the function. If a value is passed to it, the passed value is used; otherwise, the default value is used.

You can set default values for as many of the arguments as you want, as long as those arguments come last in the function definition. In other words, the required arguments should always be listed first.

With the example function just defined, any of these will work:

greet ($surname, $message);
greet ('Zoe');
greet ('Sam', 'Good evening');

However, just greet() will not work. Also, there’s no way to pass $msg a value without passing one to $name as well (argument values must be passed in order, and you can’t skip a required argument).

To set default argument values:

1. Open calculator.php (refer to Script 3.8) in your text editor or IDE.

2. Change the function definition line (line 9) so that only the quantity and cost are required (Script 3.9).

function calculate_total ($qty, $cost, $tax = 5) {

The value of the $tax variable is now hard-coded in the function definition, making it optional.

Script 3.9 The calculate_total() function now assumes a set tax rate unless one is specified when the function is called.

} // End of function.

// Check for form submission:
if (isset($_POST['submitted'])) {

    // Minimal form validation:
    if ( is_numeric($_POST['quantity']) && is_numeric($_POST['price']) ) {

        // Print the heading:
        echo 'Total Cost';

        // Call the function, with or without tax:
        if (is_numeric($_POST['tax'])) {
            calculate_total ($_POST['quantity'], $_POST['price'], $_POST['tax']);
        } else {
            calculate_total ($_POST['quantity'], $_POST['price']);
        }

    } else { // Invalid submitted values.
        echo 'Error! Please enter a valid quantity and price.';
    }

} // End of main isset() IF.

// Leave the PHP section and create the HTML form:
?>

3. Change the form validation to read

if (is_numeric($_POST['quantity']) && is_numeric($_POST['price'])) {

Because the tax value will be optional, only the other two variables are required and need to be validated.

4. Change the function call line to

if (is_numeric($_POST['tax'])) {
    calculate_total ($_POST['quantity'], $_POST['price'], $_POST['tax']);
} else {
    calculate_total ($_POST['quantity'], $_POST['price']);
}

If the tax value has also been submitted (and is numeric), then the function will be called as before, providing the user-submitted tax rate. Otherwise, the function is called providing just the two arguments, in which case the default value will be used for the tax rate.

5. Change the error message to only report on the quantity and price.

echo 'Error! Please enter a valid quantity and price.';

Since the tax will now be optional, the error message is changed accordingly.

6. If you want, mark the tax value in the form as optional.

Tax (%): (optional)

A parenthetical is added to the tax input, indicating to the user that this value is optional.

7. Save the file, place it in your Web directory, and test it in your Web browser (Figures 3.16 and 3.17).

Figure 3.16 If no tax value is entered, the default value of 5% will be used in the calculation.

Figure 3.17 If the user enters a tax value, it will be used instead of the default value.

✔ Tips

■ To pass a function no value for an argument, use either an empty string (''), NULL, or FALSE.

■ In the PHP manual, square brackets ([]) are used to indicate a function’s optional parameters (Figure 3.18).

Figure 3.18 The PHP manual’s description of the number_format() function shows that only the first argument is required.

Returning values from a function

The final attribute of a user-defined function to discuss is that of returning values. Some, but not all, functions do this. For example, print() returns a value (always 1), whereas echo() returns nothing. As another example, the strlen() function returns a number correlating to the number of characters in a string.

To have a function return a value, use the return statement.

function find_sign ($month, $day) {
    // Function code.
    return $sign;
}

A function can return a value (say a string or a number) or a variable whose value has been created by the function. When calling a function that returns a value, you can assign the function result to a variable:

$my_sign = find_sign ('October', 23);

or use it as an argument when calling another function:

print find_sign ('October', 23);

Let’s update the calculate_total() function one last time so that it returns the calculated total instead of printing it.

To have a function return a value:

1. Open calculator.php (refer to Script 3.9) in your text editor or IDE.

2. Remove the echo() statement from the function definition and replace it with a return statement (Script 3.10):

return number_format($total, 2);

This version of the function will not print the results.
Instead it will return just the calculated total, formatted to two decimal places.

3. Change the function call lines to

if (is_numeric($_POST['tax'])) {
    $sum = calculate_total ($_POST['quantity'], $_POST['price'], $_POST['tax']);
} else {
    $sum = calculate_total ($_POST['quantity'], $_POST['price']);
}

Since the function now returns instead of prints the calculation results, the invocation of the function needs to be assigned to a variable so that the total can be printed later in the script.

Script 3.10 The calculate_total() function now performs the calculations and returns the calculated result.

        // Call the function, with or without tax:
        if (is_numeric($_POST['tax'])) {
            $sum = calculate_total ($_POST['quantity'], $_POST['price'], $_POST['tax']);
        } else {
            $sum = calculate_total ($_POST['quantity'], $_POST['price']);
        }

        // Print the results:
        echo 'The total cost of purchasing ' . $_POST['quantity'] . ' widget(s) at $' . number_format ($_POST['price'], 2) . ' each, with tax, is $' . $sum . '.';

    } else { // Invalid submitted values.
        echo 'Error! Please enter a valid quantity and price.';
    }

} // End of main isset() IF.

// Leave the PHP section and create the HTML form:
?>

4. Add a new echo() statement that prints the results.

echo 'The total cost of purchasing ' . $_POST['quantity'] . ' widget(s) at $' . number_format ($_POST['price'], 2) . ' each, with tax, is $' . $sum . '.';

Since the function just returns a value, a new echo() statement must be added to the main code. This statement uses the quantity and price from the form (both found in $_POST) and the total returned by the function (assigned to $sum). It does not, however, report on the tax rate used (see the final tip).

5. Save the file, place it in your Web directory, and test it in your Web browser (Figure 3.19).

Figure 3.19 The calculator’s user-defined function now returns, instead of prints, the results, but this change has little impact on what the user sees.

✔ Tips

■ Although this last example may seem more complex (with the function performing a calculation and the main code printing the results), it actually demonstrates better programming style. Ideally, functions should perform universal, obvious tasks (like a calculation) and be independent of page-specific factors like HTML formatting.

■ The return statement terminates the code execution at that point, so any code within a function after an executed return will never run.

■ A function can have multiple return statements (e.g., in a switch statement or conditional) but only one, at most, will ever be invoked. For example, functions commonly do something like this:

function some_function () {
    if (/* condition */) {
        return TRUE;
    } else {
        return FALSE;
    }
}

■ To have a function return multiple values, use the array() function to return an array. By changing the return line in Script 3.10 to

return array ($total, $tax);

the function could return both the total of the calculation and the tax rate used (which could be the default value or a user-supplied one).
■ When calling a function that returns an array, use the list() function to assign the array elements to individual variables:

    list ($sum, $taxrate) = calculate_total ($_POST['quantity'], $_POST['price'], $_POST['tax']);

Variable Scope

Every variable in PHP has a scope to it, which is to say a realm in which the variable (and therefore its value) can be accessed. For starters, variables have the scope of the page in which they reside. So if you define $var, the rest of the page can access $var, but other pages generally cannot (unless you use special variables).

Since included files act as if they were part of the original (including) script, variables defined before an include() line are available to the included file (as you’ve already seen with $page_title and header.html). Further, variables defined within the included file are available to the parent (including) script after the include() line.

User-defined functions have their own scope: variables defined within a function are not available outside of it, and variables defined outside of a function are not available within it. For this reason, a variable inside of a function can have the same name as one outside of it but still be an entirely different variable with a different value. This is a confusing concept for many beginning programmers.

To alter the variable scope within a function, you can use the global statement.

    function function_name() {
        global $var;
    }
    $var = 20;
    function_name(); // Function call.

In this example, $var inside of the function is now the same as $var outside of it. This means that the function $var already has a value of 20, and if that value changes inside of the function, the external $var’s value will also change.

Another option for circumventing variable scope is to make use of the superglobals: $_GET, $_POST, $_REQUEST, etc.
These variables are automatically accessible within your functions (hence, they are superglobal). You can also add elements to the $GLOBALS array to make them available within a function.

All of that being said, it’s almost always best not to use global variables within a function. Functions should be designed so that they receive every value they need as arguments and return whatever value (or values) need to be returned. Relying upon global variables within a function makes them more context-dependent, and consequently less useful.

Chapter 4: Introduction to MySQL

Because this book discusses how to integrate several technologies (primarily PHP, SQL, and MySQL), a solid understanding of each individually is important before you begin writing PHP scripts that use SQL to interact with MySQL. This chapter is a departure from its predecessors in that it temporarily leaves PHP behind to delve into MySQL.

MySQL is the world’s most popular open-source database application (according to MySQL’s Web site, www.mysql.com) and is commonly used with PHP. The MySQL software comes with the database server (which stores the actual data), different client applications (for interacting with the database server), and several utilities. In this chapter you’ll see how to define a simple table using MySQL’s allowed data types and other properties. Then you’ll learn how to interact with the MySQL server using two different client applications. All of this information will be the foundation for the SQL taught in the next two chapters.

This chapter assumes you have access to a running MySQL server. If you are working on your own computer, see Appendix A, “Installation,” for instructions on installing MySQL, starting MySQL, and creating MySQL users (all of which must already be done in order to finish this chapter). If you are using a hosted server, your Web host should provide you with the database access.

Naming Database Elements

Before you start working with databases, you have to identify your needs. The purpose of the application (or Web site, in this case) dictates how the database should be designed. With that in mind, the examples in this chapter and the next will use a database that stores some user registration information.

When creating databases and tables, you should come up with names (formally called identifiers) that are clear, meaningful, and easy to type. Also, identifiers

◆ Should only contain letters, numbers, and the underscore (no spaces)
◆ Should not be the same as an existing keyword (like an SQL term or a function name)
◆ Should be treated as case-sensitive
◆ Cannot be longer than 64 characters (approximately)
◆ Must be unique within their realm

This last rule means that a table cannot have two columns with the same name and a database cannot have two tables with the same name. You can, however, use the same column name in two different tables in the same database (in fact, you often will do this). As for the first three rules, I use the word should, as these are good policies more than exact requirements. Exceptions can be made to these rules, but the syntax for doing so can be complicated. Abiding by these suggestions is a reasonable limitation and will help avoid complications.

Table 4.1 The users table will have these six columns, to store records like the sample data here.

users Table
Column Name | Example
user_id | 834
first_name | Larry
last_name | David
email | ld@example.com
pass | emily07
registration_date | 2007-12-31 19:21:03

To name a database’s elements:

1. Determine the database’s name.
This is the easiest and, arguably, least important step. Just make sure that the database name is unique for that MySQL server. If you’re using a hosted server, your Web host will likely provide a database name that may or may not include your account or domain name.
For this first example, the database will be called sitename, as the information and techniques could apply to any generic site.

2. Determine the table names.
The table names just need to be unique within this database, which shouldn’t be a problem. For this example, which stores user registration information, the only table will be called users.

3. Determine the column names for each table.
The users table will have columns to store a user ID, a first name, a last name, an email address, a password, and the registration date. Table 4.1 shows these columns, with sample data, using proper identifiers. As MySQL has a function called password, I’ve changed the name of that column to just pass. This isn’t strictly necessary but is really a good idea.

✔ Tips

■ Chapter 6, “Advanced SQL and MySQL,” discusses database design in more detail, using a more complex example.

■ To be precise, the length limit for the names of databases, tables, and columns is actually 64 bytes, not characters. While most characters in many languages require one byte apiece, it’s possible to use a multibyte character in an identifier. But 64 bytes is still a lot of space, so this probably won’t be an issue for you.

■ Whether or not an identifier in MySQL is case-sensitive actually depends upon many things. On Windows and normally on Mac OS X, database and table names are generally case-insensitive. On Unix and some Mac OS X setups, they are case-sensitive. Column names are always case-insensitive. It’s really best, in my opinion, to always use all lowercase letters and work as if case-sensitivity applied.
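
The naming rules above mention that exceptions are possible but awkward. As a rough sketch of what that looks like in practice, the statements below first create a table that follows the rules and then one that breaks them. The table and column names here (user_list, `user list`, `order`) are hypothetical and are not part of the sitename database; the sketch also assumes you are connected to MySQL with a scratch database already selected.

```sql
-- Follows the rules: letters, numbers, and underscores only.
CREATE TABLE user_list (list_id INT);

-- Hypothetical rule-breaking names: one contains a space and the
-- other matches a reserved word. Each must be wrapped in backticks
-- every time it is used.
CREATE TABLE `user list` (`order` INT);
SELECT `order` FROM `user list`;

-- Remove the demonstration tables.
DROP TABLE user_list, `user list`;
```

Having to remember the backticks in every later query is the main argument for sticking to plain, lowercase, underscore-separated names.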
Choosing Your Column Types

Once you have identified all of the tables and columns that the database will need, you should determine each column’s data type. When creating a table, MySQL requires that you explicitly state what sort of information each column will contain. There are three primary types, which is true for almost every database application:

◆ Text (aka strings)
◆ Numbers
◆ Dates and times

Within each of these, there are a number of variants (some of which are MySQL-specific) you can use. Choosing your column types correctly not only dictates what information can be stored and how but also affects the database’s overall performance. Table 4.2 lists most of the available types for MySQL, how much space they take up, and brief descriptions of each type.

Table 4.2 The common MySQL data types you can use for defining columns. Note: some of these limits may change in different versions of MySQL, and the character set may also impact the size of the text types.

MySQL Data Types
Type | Size | Description
CHAR[Length] | Length bytes | A fixed-length field from 0 to 255 characters long
VARCHAR[Length] | String length + 1 or 2 bytes | A variable-length field from 0 to 65,535 characters long
TINYTEXT | String length + 1 bytes | A string with a maximum length of 255 characters
TEXT | String length + 2 bytes | A string with a maximum length of 65,535 characters
MEDIUMTEXT | String length + 3 bytes | A string with a maximum length of 16,777,215 characters
LONGTEXT | String length + 4 bytes | A string with a maximum length of 4,294,967,295 characters
TINYINT[Length] | 1 byte | Range of –128 to 127 or 0 to 255 unsigned
SMALLINT[Length] | 2 bytes | Range of –32,768 to 32,767 or 0 to 65,535 unsigned
MEDIUMINT[Length] | 3 bytes | Range of –8,388,608 to 8,388,607 or 0 to 16,777,215 unsigned
INT[Length] | 4 bytes | Range of –2,147,483,648 to 2,147,483,647 or 0 to 4,294,967,295 unsigned
BIGINT[Length] | 8 bytes | Range of –9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 or 0 to 18,446,744,073,709,551,615 unsigned
FLOAT[Length, Decimals] | 4 bytes | A small number with a floating decimal point
DOUBLE[Length, Decimals] | 8 bytes | A large number with a floating decimal point
DECIMAL[Length, Decimals] | Length + 1 or 2 bytes | A DOUBLE stored as a string, allowing for a fixed decimal point
DATE | 3 bytes | In the format of YYYY-MM-DD
DATETIME | 8 bytes | In the format of YYYY-MM-DD HH:MM:SS
TIMESTAMP | 4 bytes | In the format of YYYY-MM-DD HH:MM:SS (YYYYMMDDHHMMSS in versions of MySQL before 4.1); acceptable range ends in early 2038
TIME | 3 bytes | In the format of HH:MM:SS
ENUM | 1 or 2 bytes | Short for enumeration, which means that each column can have one of several possible values
SET | 1, 2, 3, 4, or 8 bytes | Like ENUM except that each column can have more than one of several possible values

Many of the types can take an optional Length attribute, limiting their size. (The square brackets, [], indicate an optional parameter to be put in parentheses.) For performance purposes, you should place some restrictions on how much data can be stored in any column. But understand that attempting to insert a string five characters long into a CHAR(2) column will result in truncation of the final three characters: only the first two characters would be stored, and the rest would be lost forever (or, if the server runs in strict SQL mode, the insertion fails with an error instead). This is true for any text field in which the size is set (CHAR, VARCHAR); for the number types, the length works differently, as the tips explain. So your length should always correspond to the maximum possible value (as a number) or longest possible string (as text) that might be stored.

The various date types have all sorts of unique behaviors, which are documented in the MySQL manual. You’ll use the DATE and TIME fields primarily without modification, so you need not worry too much about their intricacies.

There are also two special types, ENUM and SET, that allow you to define a series of acceptable values for that column.
An ENUM column can have only one value out of a defined list of up to 65,535 possibilities, while a SET column can hold several values at once out of up to 64 possibilities. These types are available in MySQL but aren’t present in every database application.

To select the column types:

1. Identify whether a column should be a text, number, or date/time type (Table 4.3).
This is normally an easy and obvious step, but you want to be as specific as possible. For example, the date 2006-08-02 (MySQL format) could be stored as a string: August 2, 2006. But if you use the proper date format, you’ll have a more useful database (and, as you’ll see, there are functions that can turn 2006-08-02 into August 2, 2006).

2. Choose the most appropriate subtype for each column (Table 4.4).
For this example, the user_id is set as a MEDIUMINT, allowing for up to nearly 17 million values (as an unsigned, or non-negative, number). The registration_date will be a DATETIME. It can store both the date and the specific time a user registered. When deciding among the date types, consider whether or not you’ll want to access just the date, the time, or possibly both. If unsure, err on the side of storing too much information.
The other fields will be mostly VARCHAR, since their lengths will differ from record to record. The only exception is the password column, which will be a fixed-length CHAR (you’ll see why when inserting records in the next chapter). See the sidebar “CHAR vs. VARCHAR” for more information on these two types.
Table 4.3 The users table with assigned generic data types.

users Table
Column Name | Type
user_id | number
first_name | text
last_name | text
email | text
pass | text
registration_date | date/time

Table 4.4 The users table with more specific data types.

users Table
Column Name | Type
user_id | MEDIUMINT
first_name | VARCHAR
last_name | VARCHAR
email | VARCHAR
pass | CHAR
registration_date | DATETIME

Table 4.5 The users table with set length attributes.

users Table
Column Name | Type
user_id | MEDIUMINT
first_name | VARCHAR(20)
last_name | VARCHAR(40)
email | VARCHAR(60)
pass | CHAR(40)
registration_date | DATETIME

3. Set the maximum length for text columns (Table 4.5).
The size of any field should be restricted to the smallest possible value, based upon the largest possible input. For example, if a column is storing a state abbreviation, it would be defined as a CHAR(2). Other times you might have to guess somewhat: I can’t think of any first names longer than about 10 characters, but just to be safe I’ll allow for up to 20.

✔ Tips

■ The length attribute for numeric types does not affect the range of values that can be stored in the column. Columns defined as TINYINT(1) or TINYINT(20) can store the exact same values. Instead, for integers, the length dictates the display width; for decimals, the length is the total number of digits that can be stored.

■ Many of the data types have synonymous names: INT and INTEGER, DEC and DECIMAL, etc.

■ The TIMESTAMP field type is automatically set as the current date and time when an INSERT or UPDATE occurs, even if no value is specified for that particular field. If a table has multiple TIMESTAMP columns, only the first one will be updated when an INSERT or UPDATE is performed.

■ MySQL also has several variants on the text types that allow for storing binary data. These types are BINARY, VARBINARY, TINYBLOB, BLOB, MEDIUMBLOB, and LONGBLOB. Such types are used for storing files or encrypted data.
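
Two of the points above, that a text length really does cap what is stored while an integer length does not change the range, can be checked with a small experiment. The following is a sketch only: type_demo and its column names are hypothetical, and it assumes a scratch database is selected. How the oversized string is handled depends on the server’s SQL mode, and newer MySQL releases may also warn that integer display widths are deprecated.

```sql
-- Scratch table; type_demo and its columns are hypothetical names.
CREATE TABLE type_demo (
    state CHAR(2),
    width_1 TINYINT(1),
    width_20 TINYINT(20)
);

-- The length on an integer type sets only a display width,
-- so both columns accept the full TINYINT range.
INSERT INTO type_demo (width_1, width_20) VALUES (127, 127);

-- A five-character string does not fit in CHAR(2). In non-strict SQL
-- mode only 'Te' is stored (with a warning); in strict mode MySQL
-- rejects the statement with an error instead.
INSERT INTO type_demo (state) VALUES ('Texas');

SELECT state, width_1, width_20 FROM type_demo;

DROP TABLE type_demo;
```

Either outcome for the string makes the same case: size each text column for the longest value it could reasonably need to hold.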
CHAR vs. VARCHAR

Both of these types store strings and can be set with a maximum length. One primary difference between the two is that anything stored as a CHAR will always be stored as a string the length of the column (using spaces to pad it; these spaces will be removed when you retrieve the stored value from the database). Conversely, strings stored in a VARCHAR column will require only as much space as the string itself. So the word cat in a VARCHAR(10) column requires four bytes of space (the length of the string plus 1), but in a CHAR(10) column, that same word requires 10 bytes of space. So, generally speaking, VARCHAR columns tend to take up less disk space than CHAR columns.

However, databases are normally faster when working with fixed-size columns, which is an argument in favor of CHAR. And that same three-letter word, cat, in a CHAR(3) only uses 3 bytes but in a VARCHAR(10) requires 4. So how do you decide which to use?

If a string field will always be of a set length (e.g., a state abbreviation), use CHAR; otherwise, use VARCHAR. You may notice, though, that in some cases MySQL defines a column as the one type (like CHAR) even though you created it as the other (VARCHAR). This is perfectly normal and is MySQL’s way of improving performance.

Choosing Other Column Properties

Besides deciding what data types and sizes you should use for your columns, you should consider a handful of other properties.

First, every column, regardless of type, can be defined as NOT NULL. The NULL value, in databases and programming, is equivalent to saying that the field has no value. Ideally, in a properly designed database, every column of every row in every table should have a value, but that isn’t always the case. To force a field to have a value, add the NOT NULL description to its column type.
For example, a required dollar amount can be described as

    cost DECIMAL(5,2) NOT NULL

When creating a table, you can also specify a default value for any column, regardless of type. In cases where a majority of the records will have the same value for a column, presetting a default will save you from having to specify a value when inserting new rows (unless that row’s value for that column is different from the norm).

    gender ENUM('M', 'F') default 'F'

With the gender column, if no value is specified when adding a record, the default will be used.

If a column does not have a default value and one is not specified for a new record, that field will be given a NULL value. However, if no value is specified and the column is defined as NOT NULL, an error will occur (or, depending upon the server’s SQL mode, MySQL will substitute an implicit default for that type and issue a warning).

The number types can be marked as UNSIGNED, which limits the stored data to positive numbers and zero. This also effectively doubles the range of positive numbers that can be stored (because no negative numbers will be kept; see Table 4.2). You can also flag the number types as ZEROFILL, which means that any extra room will be padded with zeros (ZEROFILLs are also automatically UNSIGNED).

Finally, when designing a database, you’ll need to consider creating indexes, adding keys, and using the AUTO_INCREMENT property. Chapter 6 discusses these concepts in greater detail, but in the meantime, check out the sidebar “Indexes, Keys, and AUTO_INCREMENT” to learn how they affect the users table.

To finish defining your columns:

1. Identify your primary key.
The primary key is, paradoxically, both arbitrary and critically important. Almost always a number value, the primary key is a unique way to refer to a particular record. For example, your phone number has no inherent value but is unique to you (your home or mobile phone).
In the users table, the user_id will be the primary key: an arbitrary number used to refer to a row of data. Again, Chapter 6 will go into the concept of primary keys in more detail.

2. Identify which columns cannot have a NULL value.
In this example, every field is required (cannot be NULL). If you stored people’s addresses, by contrast, you might have address_line1 and address_line2, with the latter one being optional (it could have a NULL value). In general, tables that have a lot of NULL values suggest a poor design (more on this in…you guessed it…Chapter 6).

3. Make any numeric type UNSIGNED if it won’t ever store negative numbers.
The user_id, which will be a number, should be UNSIGNED so that it’s always positive. Other examples of UNSIGNED numbers would be the price of items in an e-commerce example, a telephone extension for a business, or a zip code.

4. Establish the default value for any column.
None of the columns here logically implies a default value.

5. Confirm the final column definitions (Table 4.6).
Before creating the tables, you should revisit the type and range of data you’ll store to make sure that your database effectively accounts for everything.

✔ Tip

■ Text columns can also have defined character sets and collations. This will mean more once you start working with multiple languages (see Chapter 14, “Making Universal Sites”).

Indexes, Keys, and AUTO_INCREMENT

Two concepts closely related to database design are indexes and keys. An index in a database is a way of requesting that the database keep an eye on the values of a specific column or combination of columns (loosely stated). The end result of this is improved performance when retrieving records but marginally hindered performance when inserting records or updating them.

A key in a database table is integral to the normalization process used for designing more complicated databases (see Chapter 6). There are two types of keys: primary and foreign.
Each table should have one primary key, and the primary key in one table is often linked as a foreign key in another.

A table’s primary key is an artificial way to refer to a record and should abide by three rules:

1. It must always have a value.
2. That value must never change.
3. That value must be unique for each record in the table.

In the users table, the user_id will be designated as a PRIMARY KEY, which is both a description of the column and a directive to MySQL to index it. Since the user_id is a number (which primary keys almost always will be), also add the AUTO_INCREMENT description to the column, which tells MySQL to use the next-highest number as the user_id value for each added record. You’ll see what this means in practice when you begin inserting records.

Table 4.6 The final description of the users table. The user_id will also be defined as an auto-incremented primary key.

users Table
Column Name | Type
user_id | MEDIUMINT UNSIGNED NOT NULL
first_name | VARCHAR(20) NOT NULL
last_name | VARCHAR(40) NOT NULL
email | VARCHAR(60) NOT NULL
pass | CHAR(40) NOT NULL
registration_date | DATETIME NOT NULL

Accessing MySQL

In order to create tables, add records, and request information from a database, some sort of client is necessary to communicate with the MySQL server. Later in the book, PHP scripts will act in this role, but being able to use another interface is necessary. Although there are oodles of client applications available, I’ll focus on two: the mysql client (or mysql monitor, as it is also called) and the Web-based phpMyAdmin. A third option, the MySQL Query Browser, is not discussed in this book but can be found at the MySQL Web site (www.mysql.com), should you not be satisfied with these two choices.

Using the mysql Client

The mysql client is normally installed with the rest of the MySQL software.
Although the mysql client does not have a pretty graphical interface, it’s a reliable, standard tool that’s easy to use and behaves consistently on many different operating systems.

The mysql client is accessed from a command-line interface, be it the Terminal application in Linux or Mac OS X (Figure 4.1), or a DOS prompt in Windows (Figure 4.2). If you’re not comfortable with command-line interactions, you might find this interface to be challenging, but it becomes easy to use in no time.

Figure 4.1 A Terminal window in Mac OS X.

Figure 4.2 A Windows DOS prompt or console (although the default is for white text on a black background).

To start an application from the command line, type its name and press Return or Enter:

    mysql

When invoking this application, you can add arguments to affect how it runs. The most common arguments are the username, password, and hostname (computer name or URL) with which you want to connect. You establish these arguments like so:

    mysql -u username -p -h hostname

The -p option will cause the client to prompt you for the password. You can also specify the password on this line if you prefer, by typing it directly after the -p option, but it will be visible, which is insecure. The -h hostname argument is optional, and you can leave it off unless you cannot connect to the MySQL server without it.

Within the mysql client, every statement (SQL command) needs to be terminated by a semicolon. These semicolons are an indication to the client that the query is complete and should be run. The semicolons are not part of the SQL itself (this is a common point of confusion). What this also means is that you can continue the same SQL statement over several lines within the mysql client, which makes it easier to read and to edit, should that be necessary.
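
To make the role of the semicolon concrete, here is a small sketch of what could be typed at the mysql prompt once you are connected. The two queries are arbitrary illustrations (they are not taken from the book’s examples) and rely only on MySQL’s built-in NOW() and VERSION() functions.

```sql
-- One statement on one line; the semicolon tells the client to run it.
SELECT NOW();

-- The same kind of statement spread over several lines. The client
-- shows a -> continuation prompt until it reads the semicolon.
SELECT
    NOW(),
    VERSION();
```

Pressing Return or Enter without a semicolon never runs anything; the client simply waits for the rest of the statement.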
As a quick demonstration of accessing and using the mysql client, these next steps will show you how to start the mysql client, select a database to use, and quit the client. Before following these steps,

◆ The MySQL server must be running.
◆ You must have a username and password with proper access.

Both of these ideas are explained in Appendix A.

As a side note, in the following steps and throughout the rest of the book, I will continue to provide images using the mysql client on both Windows and Mac OS X. While the appearance differs, the steps and results will be identical. So in short, don’t be concerned about why one image shows the DOS prompt and the next a Terminal.

To use the mysql client:

1. Access your system from a command-line interface.
On Unix systems and Mac OS X, this is just a matter of bringing up the Terminal or a similar application.
If you are using Windows and followed the instructions in Appendix A, you can choose Start > Programs > MySQL > MySQL Server X.X > MySQL Command Line Client (Figure 4.3). Then you can skip to Step 3. If you don’t have a MySQL Command Line Client option available, you’ll need to choose Run from the Start menu, type cmd in the window, and press Enter to bring up a DOS prompt (then follow the instructions in the next step).

Figure 4.3 The MySQL Windows installer creates a link in your Start menu so that you can easily get into the mysql client.

2. Invoke the mysql client, using the appropriate command (Figure 4.4).

    /path/to/mysql/bin/mysql -u username -p

The /path/to/mysql part of this step will be largely dictated by the operating system you are running and where MySQL was installed.
This might therefore be

▲ /usr/local/mysql/bin/mysql -u username -p (on Mac OS X and Unix)

or

▲ C:\mysql\bin\mysql -u username -p (on Windows)

The basic premise is that you are running the mysql client, connecting as username, and requesting to be prompted for the password. Not to overstate the point, but the username and password values that you use must already be established in MySQL as a valid user (see Appendix A).

Figure 4.4 Access the mysql client by entering the full path to the utility, along with the proper arguments.

3. Enter the password at the prompt and press Return/Enter.
The password you use here should be for the user you specified in the preceding step. If you used the MySQL Command Line Client link on Windows (Figure 4.3), the user is root, so you should use that password (probably established during installation and configuration; see Appendix A).
If you used the proper username/password combination (i.e., someone with valid access), you should be greeted as shown in Figure 4.5. If access is denied, you’re probably not using the correct values (see Appendix A for instructions on creating users).

Figure 4.5 If you are successfully able to log in, you’ll see a welcome message like this.

4. Select the database you want to use (Figure 4.6).

    USE test;

The USE command selects the database to be used for every subsequent command. The test database is one that MySQL installs by default. Assuming it exists on your server, all users should be able to access it.

5. Quit out of mysql (Figure 4.7).

    quit

You can also use the command exit to leave the client. This step, unlike most other commands you enter in the mysql client, does not require a semicolon at the end.
If you used the MySQL Command Line Client, this will also close the DOS prompt window.
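
Steps 4 and 5 can be sketched as a single short session, shown below. The USE and quit lines come from the steps; the SELECT DATABASE() and SHOW TABLES statements are additions for illustration, included only to confirm that the selection worked. The sketch assumes a test database exists on your server, which may not be the case on every installation.

```sql
-- Select the database to use for every subsequent command.
USE test;

-- Confirm which database is currently selected.
SELECT DATABASE();

-- List its tables (possibly none in a fresh test database).
SHOW TABLES;

-- quit is a client command rather than SQL, so no semicolon is needed.
quit
```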
✔ Tips

■ If you know in advance which database you will want to use, you can simplify matters by starting mysql with

    /path/to/mysql/bin/mysql -u username -p databasename

■ To see what else you can do with the mysql client, type

    /path/to/mysql/bin/mysql --help

■ The mysql client on most systems allows you to use the up and down arrows to scroll through previously entered commands. If you make a mistake in typing a query, you can scroll up to find it, and then correct the error.

■ If you are in a long statement and make a mistake, cancel the current operation by typing \c and pressing Return or Enter. If mysql thinks a closing single or double quotation mark is missing (as indicated by the '> and "> prompts), you’ll need to enter the appropriate quotation mark first.

Figure 4.6 After getting into the mysql client, run a USE command to choose the database with which you want to work.

Figure 4.7 Type either exit or quit to terminate your session and leave the mysql client.

Using phpMyAdmin

phpMyAdmin (www.phpmyadmin.net) is one of the best and most popular applications written in PHP. Its sole purpose is to provide an interface to a MySQL server. It’s somewhat easier and more natural to use than the mysql client but requires a PHP installation and must be accessed through a Web browser. If you’re running MySQL on your own computer, you might find that using the mysql client makes more sense, as installing and configuring phpMyAdmin constitutes unnecessary extra work (although all-in-one PHP and MySQL installers may do this for you). If using a hosted server, your Web host is virtually guaranteed to provide phpMyAdmin as the primary way to work with MySQL and the mysql client may not be an option.

Using phpMyAdmin isn’t hard, but the next steps run through the basics so that you’ll know what to do in the following chapters.
Figure 4.8 The first phpMyAdmin page (when connected as a MySQL user that can access multiple databases).

To use phpMyAdmin:

1. Access phpMyAdmin through your Web browser (Figure 4.8).
The URL you use will depend upon your situation. If running on your own computer, this might be http://localhost/phpMyAdmin/. If running on a hosted site, your Web host will provide you with the proper URL. In all likelihood, phpMyAdmin would be available through the site’s control panel (should one exist).
Note that phpMyAdmin will only work if it’s been properly configured to connect to MySQL with a valid username/password/hostname combination. If you see a message like the one in Figure 4.9, you’re probably not using the correct values (see Appendix A for instructions on creating users).

Figure 4.9 Every client application requires a proper username/password/hostname combination in order to interact with the MySQL server.

2. If possible and necessary, use the menu on the left to select a database to use (Figure 4.10).
What options you have here will vary depending upon what MySQL user phpMyAdmin is connecting as. That user might have access to one database, several databases, or every database. On a hosted site where you have just one database, that database will probably already be selected for you (Figure 4.11). On your own computer, with phpMyAdmin connecting as the MySQL root user, you would see a pull-down menu (Figure 4.10) or a simple list of available databases (Figure 4.8).

Figure 4.10 Use the list of databases on the left side of the window to choose with which database you want to work. This is the equivalent of running a USE databasename query within the mysql client (see Figure 4.6).

Figure 4.11 If phpMyAdmin only has access to one database, it’ll likely already be selected when you load the page.

3. Use the SQL tab (Figure 4.12) or the SQL query window (Figure 4.13) to enter SQL commands.
The next two chapters, and the occasional one later in the book, will provide SQL commands that must be run to create, populate, or alter tables. These might look like

    INSERT INTO tablename (col1, col2) VALUES (x, y)

These commands can be run using the mysql client, phpMyAdmin, or any other interface. To run them within phpMyAdmin, just type them into one of the SQL prompts and click Go.

Figure 4.12 The SQL tab, in the main part of the window, can be used to run any SQL command.

Figure 4.13 The SQL window can also be used to run commands. It pops up after clicking the SQL icon at the top of the left side of the browser (see the second icon from the left in Figure 4.10).

✔ Tips

■ There’s a lot more that can be done with phpMyAdmin, but full coverage would require a chapter in its own right (and a long chapter at that). The information presented here will be enough for you to follow any of the examples in the book, should you not want to use the mysql client.

■ phpMyAdmin can be configured to use a special database that will record your query history, allow you to bookmark queries, and more.

■ One of the best reasons to use phpMyAdmin is to transfer a database from one computer to another. Use the Export tab in phpMyAdmin connected to the source computer to create a file of data. Then, on the destination computer, use the Import tab in phpMyAdmin (connected to that MySQL server) to complete the transfer.

Chapter 5: Introduction to SQL

The preceding chapter provides a quick introduction to MySQL. The focus there is on two topics: using MySQL’s rules and data types to define a database, and how to interact with the MySQL server. This chapter moves on to the lingua franca of databases: SQL.

SQL, short for Structured Query Language, is a group of special words used exclusively for interacting with databases. Every major database uses SQL, and MySQL is no exception. There are multiple versions of SQL and MySQL has its own variations on the SQL standards, but SQL is still surprisingly easy to learn and use. In fact, the hardest thing to do in SQL is use it to its full potential!

In this chapter you’ll learn all the SQL you need to know to create tables, populate them, and run other basic queries. The examples will all use the users table discussed in the preceding chapter. Also, as with that other chapter, this chapter assumes you have access to a running MySQL server and know how to use a client application to interact with it.

Creating Databases and Tables

The first logical use of SQL will be to create a database. The syntax for creating a new database is simply

    CREATE DATABASE databasename

That’s all there is to it (as I said, SQL is easy to learn)!

The CREATE term is also used for making tables:

    CREATE TABLE tablename (
    column1name description,
    column2name description
    …)

As you can see from this syntax, after naming the table, you define each column within parentheses. Each column-description pair should be separated from the next by a comma. Should you choose to create indexes at this time, you can add those at the end of the creation statement, but you can add indexes at a later time as well. (Indexes are more formally discussed in Chapter 6, “Advanced SQL and MySQL,” but Chapter 4, “Introduction to MySQL,” introduced the topic.)

In case you were wondering, SQL keywords are case-insensitive. However, I strongly recommend making it a habit to capitalize the SQL keywords as in the preceding example syntax and the following steps. Doing so helps to contrast the SQL terms from the database, table, and column names.

To create databases and tables:

1. Access MySQL using whichever client you prefer.
Chapter 4 shows how to use two of the most common interfaces, the mysql client and phpMyAdmin, to communicate with a MySQL server.
Using the steps in the last chapter, you should now connect to MySQL. Throughout the rest of this chapter, most of the SQL examples will be entered using the mysql client, but they will work just the same in phpMyAdmin or any other client tool.

2. Create and select the new database (Figure 5.1).

    CREATE DATABASE sitename;
    USE sitename;

The first line creates the database (assuming that you are connected to MySQL as a user with permission to create new databases). The second line tells MySQL that you want to work within this database from here on out. Remember that within the mysql client, you must terminate every SQL command with a semicolon, although these semicolons aren’t technically part of SQL itself. If executing multiple queries at once within phpMyAdmin, they should also be separated by semicolons (Figure 5.2). If running only a single query within phpMyAdmin, no semicolons are necessary.

If you are using a hosting company’s MySQL, they will probably create the database for you. In that case, just connect to MySQL and select the database.

3. Create the users table (Figure 5.3).

    CREATE TABLE users (
        user_id MEDIUMINT UNSIGNED NOT NULL AUTO_INCREMENT,
        first_name VARCHAR(20) NOT NULL,
        last_name VARCHAR(40) NOT NULL,
        email VARCHAR(60) NOT NULL,
        pass CHAR(40) NOT NULL,
        registration_date DATETIME NOT NULL,
        PRIMARY KEY (user_id)
    );

The design for the users table is developed in Chapter 4. There, the names, types, and attributes of each column in the table are determined based upon a number of criteria (see that chapter for more information). Here, that information is placed within the CREATE TABLE syntax to actually make the table in the database.

Because the mysql client will not run a query until it encounters a semicolon, you can enter statements over multiple lines as in Figure 5.3 (by pressing Return or Enter at the end of each line). This often makes a query easier to read and debug.
In phpMyAdmin, you can also run queries over multiple lines, although they will not be run until you click Go.

Figure 5.1 A new database, called sitename, is created in MySQL. It is then selected for future queries.

Figure 5.2 The same commands for creating and selecting a database can be run within phpMyAdmin’s SQL window.

Figure 5.3 This CREATE SQL command will make the users table.

4. Confirm the existence of the table (Figure 5.4).

    SHOW TABLES;
    SHOW COLUMNS FROM users;

The SHOW command reveals the tables in a database or the column names and types in a table.

Also, you might notice in Figure 5.4 that the default value for user_id is NULL, even though this column was defined as NOT NULL. This is actually correct and has to do with user_id being an automatically incremented primary key. MySQL will often make minor changes to a column’s definition for better performance or other reasons.

In phpMyAdmin, a database’s tables are listed on the left side of the browser window, under the database’s name (Figure 5.5). Click a table’s name to view its columns (Figure 5.6).

✔ Tips
■ The rest of this chapter assumes that you are using the mysql client or comparable tool and have already selected the sitename database with USE.
■ The order in which you list the columns when creating a table has no functional impact, but there are stylistic suggestions for how to order them. I normally list the primary-key column first, followed by any foreign-key columns (more on this subject in the next chapter), followed by the rest of the columns, concluding with any date columns.
■ When creating a table, you have the option of specifying its type. MySQL supports many table types, each with its own strengths and weaknesses. If you do not specify a table type, MySQL will automatically create the table using the default type for that MySQL installation. Chapter 6 discusses this in more detail.
■ When creating tables and text columns, you have the option to specify their collation and character set. Both come into play when using multiple languages or languages not native to the MySQL server. Chapter 14, “Making Universal Sites,” covers these subjects.
■ DESCRIBE tablename is the same statement as SHOW COLUMNS FROM tablename.

Figure 5.4 Confirm the existence of, and columns in, a table using the SHOW command.

Figure 5.5 phpMyAdmin shows that the sitename database contains one table, named users.

Figure 5.6 phpMyAdmin shows a table’s definition on this screen (accessed by clicking the table’s name in the left-hand column).

Inserting Records

After a database and its table(s) have been created, you can start populating them using the INSERT command. There are two ways that an INSERT query can be written. With the first method, you name the columns to be populated:

    INSERT INTO tablename (column1, column2 …) VALUES (value1, value2 …)
    INSERT INTO tablename (column4, column8) VALUES (valueX, valueY)

Using this structure, you can add rows of records, populating only the columns that matter. The result will be that any columns not given a value will be treated as NULL (or given a default value, if one was defined). Note that if a column cannot have a NULL value (it was defined as NOT NULL) and does not have a default value, not specifying a value will cause an error.

The second format for inserting records is not to specify any columns at all but to include values for every one:

    INSERT INTO tablename VALUES (value1, NULL, value2, value3, …)

If you use this second method, you must specify a value, even if it’s NULL, for every column. If there are six columns in the table, you must list six values. Failure to match the number of values to the number of columns will cause an error. For this and other reasons, the first format of inserting records is generally preferable.
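To see the two INSERT formats and their failure cases side by side without a MySQL server at hand, the following is a minimal sketch that runs the SQL through Python's built-in sqlite3 module. SQLite is only a stand-in here, which is an assumption of this example and not part of the chapter's setup: its auto-incrementing key is spelled differently from MySQL's, and the password hash and dates are written as placeholder literals rather than generated by functions. The helper name try_insert is hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL server (an assumption).
conn = sqlite3.connect(":memory:")

# Simplified users table. SQLite writes the automatic key as
# INTEGER PRIMARY KEY AUTOINCREMENT instead of MySQL's AUTO_INCREMENT.
conn.execute("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY AUTOINCREMENT,
        first_name VARCHAR(20) NOT NULL,
        last_name VARCHAR(40) NOT NULL,
        email VARCHAR(60) NOT NULL,
        pass CHAR(40) NOT NULL,
        registration_date DATETIME NOT NULL
    )
""")

# Format 1: name the columns being populated; user_id is filled in automatically.
conn.execute("""
    INSERT INTO users (first_name, last_name, email, pass, registration_date)
    VALUES ('Larry', 'Ullman', 'email@example.com', 'placeholder-hash',
            '2008-01-26 12:00:00')
""")

# Format 2: no column list, so every column needs a value, in table order.
# NULL for user_id asks for the next automatic number.
conn.execute("""
    INSERT INTO users
    VALUES (NULL, 'Zoe', 'Isabella', 'email2@example.com', 'placeholder-hash',
            '2008-01-26 12:05:00')
""")


def try_insert(sql):
    """Run an INSERT and report whether the database rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.Error as exc:
        return f"error: {exc}"


# Second format with too few values for the six columns.
too_few = try_insert("INSERT INTO users VALUES (NULL, 'Only', 'Three')")

# First format that leaves out NOT NULL columns having no default value.
missing = try_insert("INSERT INTO users (first_name) VALUES ('Solo')")

rows = conn.execute(
    "SELECT user_id, first_name FROM users ORDER BY user_id"
).fetchall()
print(rows)
print(too_few)
print(missing)
```

Both rejected statements leave the table untouched, so only the two valid rows remain. The exact wording of the error messages is SQLite's and will differ from what MySQL reports.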
MySQL also allows you to insert multiple rows at one time, separating each record by a comma.

    INSERT INTO tablename (column1, column4) VALUES (valueA, valueB), (valueC, valueD), (valueE, valueF)

While you can do this with MySQL, the syntax is not supported by all database applications.

Note that in all of these examples, placeholders are used for the actual table names, column names, and values. Furthermore, the examples forgo quotation marks. In real queries, you must abide by certain rules to avoid errors (see the “Quotes in Queries” sidebar).

Quotes in Queries

In every SQL command:
◆ Numeric values shouldn’t be quoted.
◆ String values (for CHAR, VARCHAR, and TEXT column types) must always be quoted.
◆ Date and time values must always be quoted.
◆ Functions cannot be quoted.
◆ The word NULL must not be quoted.

Unnecessarily quoting a numeric value normally won’t cause problems (although you still shouldn’t do it), but misusing quotation marks in the other situations will almost always mess things up. Also, it does not matter if you use single or double quotation marks, so long as you consistently pair them (an opening mark with a matching closing one). And, as with PHP, if you need to use a quotation mark in a value, either use the other quotation mark type to encapsulate it or escape the mark by preceding it with a backslash:

    INSERT INTO tablename (last_name) VALUES ('O\'Toole')

To insert data into a table:

1. Insert one row of data into the users table, naming the columns to be populated (Figure 5.7).

    INSERT INTO users (first_name, last_name, email, pass, registration_date)
    VALUES ('Larry', 'Ullman', 'email@example.com', SHA1('mypass'), NOW());

Again, this syntax (where the specific columns are named) is more foolproof but not always the most convenient.
For the first name, last name, and email columns, simple strings are used for the values (and strings must always be quoted). For the password and registration date columns, two functions are being used to generate the values (see the sidebar “Two MySQL Functions”). The SHA1() function will hash the password (mypass in this example). The NOW() function will set the registration_date as this moment.

When using any function in an SQL statement, do not place it within quotation marks. You also must not have any spaces between the function’s name and the following parenthesis (so NOW() not NOW ()).

Figure 5.7 This query inserts a single record into the users table. The 1 row affected message indicates the success of the insertion.

2. Insert one row of data into the users table, without naming the columns (Figure 5.8).

    INSERT INTO users
    VALUES (NULL, 'Zoe', 'Isabella', 'email2@example.com', SHA1('mojito'), NOW());

In this second syntactical example, every column must be provided with a value. The user_id column is given a NULL value, which will cause MySQL to use the next logical number, per its AUTO_INCREMENT description. In other words, the first record will be assigned a user_id of 1, the second, 2, and so on.

Figure 5.8 Another record is inserted into the table, this time by providing a value for every column in the table.

3. Insert several rows into the users table (Figure 5.9).

    INSERT INTO users (first_name, last_name, email, pass, registration_date) VALUES
    ('John', 'Lennon', 'john@beatles.com', SHA1('Happin3ss'), NOW()),
    ('Paul', 'McCartney', 'paul@beatles.com', SHA1('letITbe'), NOW()),
    ('George', 'Harrison', 'george@beatles.com', SHA1('something'), NOW()),
    ('Ringo', 'Starr', 'ringo@beatles.com', SHA1('thisboy'), NOW());

Since MySQL allows you to insert multiple rows at once, you can take advantage of this and fill up the table with records.
Figure 5.9 This one query, which MySQL allows but some other databases will not, inserts several records into the table at once.

Two MySQL Functions

Although functions are discussed in more detail later in this chapter, two need to be introduced at this time: SHA1() and NOW().

The SHA1() function is one way to protect data by hashing it (a process often loosely described as one-way encryption). This function creates a string that is always exactly 40 characters long (which is why the users table’s pass column is defined as CHAR(40)). SHA1() works in one direction only, meaning that it cannot be reversed. It’s useful for storing sensitive data that need not be viewed in its original form again, but it’s obviously not a good choice for sensitive data that should be protected but later seen (like credit card numbers). SHA1() is available as of MySQL 4.0.2; if you are using an earlier version, you can use the MD5() function instead. This function does the same task, using a different algorithm, and returns a 32-character long string (if using MD5(), your pass column could be defined as a CHAR(32) instead).

The NOW() function is handy for date, time, and timestamp columns, since it will insert the current date and time (on the server) for that field.

4. Repeat Steps 1 and 2 until you’ve thoroughly populated the users table.

Throughout the rest of this chapter I will be performing queries based upon the records I entered into my database. Should your database not have the same specific records as mine, change the particulars accordingly. The fundamental thinking behind the following queries should still apply regardless of the data, since the sitename database has a set column and table structure.

✔ Tips
■ On the downloads page of the book’s supporting Web site (www.DMCInsights.com/phpmysql3/), you can download all of the SQL commands for the book. Using some of these commands, you can populate your users table exactly as I have.
■ The term INTO in INSERT statements is optional in current versions of MySQL.
■ phpMyAdmin’s INSERT tab allows you to insert records using an HTML form (Figure 5.10).

Figure 5.10 phpMyAdmin’s INSERT form shows a table’s columns and provides text boxes for entering values. The pull-down menu lists functions that can be used, like SHA1() for the password or NOW() for the registration date.

Selecting Data

Now that the database has some records in it, you can retrieve the stored information with the most used of all SQL terms, SELECT. A SELECT query returns rows of records using the syntax

    SELECT which_columns FROM which_table

The simplest SELECT query is

    SELECT * FROM tablename

The asterisk means that you want to view every column. The alternative would be to specify the columns to be returned, with each separated from the next by a comma:

    SELECT column1, column3 FROM tablename

There are a few benefits to being explicit about which columns are selected. The first is performance: There’s no reason to fetch columns you will not be using. The second is order: You can return columns in an order other than their layout in the table. Third (and you’ll see this later in the chapter), naming the columns allows you to manipulate the values in those columns using functions.

To select data from a table:

1. Retrieve all the data from the users table (Figure 5.11).

    SELECT * FROM users;

This very basic SQL command will retrieve every column of every row stored within that table.

Figure 5.11 The SELECT * FROM tablename query returns every column for every record stored in the table.

2. Retrieve just the first and last names from users (Figure 5.12).

    SELECT first_name, last_name FROM users;

Instead of showing the data from every column in the users table, you can use the SELECT statement to limit the results to only the fields you need.
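The difference between the asterisk and an explicit column list can be checked with a short script. This is a sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for a MySQL server, a trimmed-down users table, and a hypothetical helper named run that returns the column headings along with the rows.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,
        first_name TEXT NOT NULL,
        last_name TEXT NOT NULL,
        email TEXT NOT NULL
    )
""")
conn.execute("""
    INSERT INTO users (first_name, last_name, email) VALUES
        ('Larry', 'Ullman', 'email@example.com'),
        ('Zoe', 'Isabella', 'email2@example.com')
""")


def run(sql):
    """Return (column headings, rows) for a SELECT query."""
    cursor = conn.execute(sql)
    headings = [column[0] for column in cursor.description]
    return headings, cursor.fetchall()


# The asterisk returns every column, in the order the table defines them.
all_headings, all_rows = run("SELECT * FROM users")

# Naming the columns returns only those, in the order they are listed.
some_headings, some_rows = run("SELECT last_name, first_name FROM users")

print(all_headings)
print(some_headings)
print(some_rows)
```

Listing last_name before first_name in the query is enough to swap the two headings in the result, even though the table stores them the other way around.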
✔ Tips
■ In phpMyAdmin, the Browse tab runs a simple SELECT query.
■ You can actually use SELECT without naming tables or columns. For example, SELECT NOW(); (Figure 5.13).
■ The order in which you list columns in your SELECT statement dictates the order in which the values are presented (compare Figure 5.12 with Figure 5.14).
■ With SELECT queries, you can even retrieve the same column multiple times, a feature that enables you to manipulate the column’s data in many different ways.

Figure 5.12 Only two of the columns for every record in the table are returned by this query.

Figure 5.13 Many queries can be run without specifying a database or table. This query selects the result of calling the NOW() function, which returns the current date and time (according to MySQL).

Figure 5.14 If a SELECT query specifies the columns to be returned, they’ll be displayed in that order.

Using Conditionals

The SELECT query as used thus far will always retrieve every record from a table. But often you’ll want to limit what rows are returned, based upon certain criteria. This can be accomplished by adding conditionals to SELECT queries. These conditionals use the SQL term WHERE and are written much as you’d write a conditional in PHP.

    SELECT which_columns FROM which_table WHERE condition(s)

Table 5.1 lists the most common operators you would use within a conditional. For example, a simple equality check:

    SELECT name FROM people WHERE birth_date = '2008-01-26'

The operators can be used together, along with parentheses, to create more complex expressions:

    SELECT * FROM items WHERE (price BETWEEN 10.00 AND 20.00) AND (quantity > 0)
    SELECT * FROM cities WHERE (zip_code = 90210) OR (zip_code = 90211)

To demonstrate using conditionals, let’s run some more SELECT queries on the sitename database. The examples that follow will be just a few of the nearly limitless possibilities.
Over the course of this chapter and the entire book you will see how conditionals are used in all types of queries.

Table 5.1 MySQL Operators. These MySQL operators are frequently (but not exclusively) used with WHERE expressions.

    Operator          Meaning
    =                 Equals
    <                 Less than
    >                 Greater than
    <=                Less than or equal to
    >=                Greater than or equal to
    != (also <>)      Not equal to
    IS NOT NULL       Has a value
    IS NULL           Does not have a value
    BETWEEN           Within a range
    NOT BETWEEN       Outside of a range
    IN                Found within a list of values
    OR (also ||)      Where one of two conditionals is true
    AND (also &&)     Where both conditionals are true
    NOT (also !)      Where the condition is not true

To use conditionals:

1. Select all of the users whose last name is Simpson (Figure 5.15).

    SELECT * FROM users WHERE last_name = 'Simpson';

This simple query returns every column of every row whose last_name value is Simpson.

Figure 5.15 All of the Simpsons who have registered.

2. Select just the first names of users whose last name is Simpson (Figure 5.16).

    SELECT first_name FROM users WHERE last_name = 'Simpson';

Here only one column (first_name) is being returned for each row. Although it may seem strange, you do not have to select a column on which you are performing a WHERE. The reason for this is that the columns listed after SELECT dictate only what columns to return and the columns listed in a WHERE dictate which rows to return.

Figure 5.16 Just the first names of all of the Simpsons who have registered.

3. Select every column from every record in the users table that does not have an email address (Figure 5.17).

    SELECT * FROM users WHERE email IS NULL;

Figure 5.17 No records are returned by this query because the email column cannot have a NULL value. So this query did work; it just had no matching records.

The IS NULL conditional is the same as saying does not have a value.
Keep in mind that an empty string is different from NULL and therefore would not match this condition. Such a case would, however, match

    SELECT * FROM users WHERE email='';

4. Select the user ID, first name, and last name of all records in which the password is mypass (Figure 5.18).

    SELECT user_id, first_name, last_name FROM users WHERE pass = SHA1('mypass');

Since the stored passwords were hashed with the SHA1() function, you can match one by using that same function in a conditional. SHA1() is case-sensitive, so this query will work only if the passwords (stored vs. queried) match exactly.

Figure 5.18 Conditionals can make use of functions, like SHA1() here.

5. Select the user names whose user ID is less than 10 or greater than 20 (Figure 5.19).

    SELECT first_name, last_name FROM users WHERE (user_id < 10) OR (user_id > 20);

This same query could also be written as

    SELECT first_name, last_name FROM users WHERE user_id NOT BETWEEN 10 AND 20;

or even

    SELECT first_name, last_name FROM users WHERE user_id NOT IN (10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20);

Figure 5.19 This query uses two conditions and the OR operator.

✔ Tip
■ You can perform mathematical calculations within your queries using the standard addition (+), subtraction (-), multiplication (*), and division (/) characters.

Using LIKE and NOT LIKE

Using numbers, dates, and NULLs in conditionals is a straightforward process, but strings can be trickier. You can check for string equality with a query such as

    SELECT * FROM users WHERE last_name = 'Simpson'

However, comparing strings in a more liberal manner requires extra operators and characters. If, for example, you wanted to match a person’s last name that could be Smith or Smiths or Smithson, you would need a more flexible conditional. This is where the LIKE and NOT LIKE terms come in.
These are used, primarily with strings, in conjunction with two wildcard characters: the underscore (_), which matches a single character, and the percentage sign (%), which matches zero or more characters. In the last-name example, the query would be

    SELECT * FROM users WHERE last_name LIKE 'Smith%'

This query will return all rows whose last_name value begins with Smith. Because it’s a case-insensitive search by default, it would also apply to names that begin with smith.

To use LIKE:

1. Select all of the records in which the last name starts with Bank (Figure 5.20).

    SELECT * FROM users WHERE last_name LIKE 'Bank%';

Figure 5.20 The LIKE SQL term adds flexibility to your conditionals. This query matches any record where the last name value begins with Bank.

2. Select the name for every record whose email address is not of the form something@authors.com (Figure 5.21).

    SELECT first_name, last_name FROM users WHERE email NOT LIKE '%@authors.com';

To rule out the presence of values in a string, use NOT LIKE with the wildcard.

✔ Tips
■ Queries with a LIKE conditional are generally slower because they often can’t take advantage of indexes (in particular when the pattern begins with a wildcard), so use this format only if you absolutely have to.
■ The wildcard characters can be used at the front and/or back of a string in your queries.

    SELECT * FROM users WHERE last_name LIKE '_smith%'

■ Although LIKE and NOT LIKE are normally used with strings, they can also be applied to numeric columns.
■ To use either the literal underscore or the percentage sign in a LIKE or NOT LIKE query, you will need to escape it (by preceding the character with a backslash) so that it is not confused with a wildcard.
■ The underscore can be used in combination with itself; as an example, LIKE '__' would find any two-letter combination.
■ In the next chapter you’ll learn about FULLTEXT searches, which can be better than LIKE searches.
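The conditionals and wildcard patterns from the last two sections can be tried together in a few lines. What follows is a sketch, not the book's own setup: it uses Python's built-in sqlite3 module in place of a MySQL server, the five sample people are hypothetical, and names is a hypothetical helper that returns the matching first names. SQLite happens to share the behaviors shown here (inclusive BETWEEN, the % and _ wildcards), though other details of its LIKE, such as escaping, differ from MySQL's.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,
        first_name TEXT,
        last_name TEXT,
        email TEXT
    )
""")
# Hypothetical sample data; user_id values 1 through 5 are assigned in order.
conn.execute("""
    INSERT INTO users (first_name, last_name, email) VALUES
        ('Homer', 'Simpson', 'homer@example.com'),
        ('Marge', 'Simpson', 'marge@example.com'),
        ('Ann', 'Smith', 'ann@authors.com'),
        ('Will', 'Smithson', 'will@example.com'),
        ('Russell', 'Banks', 'russell@authors.com')
""")


def names(where):
    """Return the sorted first names of the rows matching a WHERE clause."""
    rows = conn.execute("SELECT first_name FROM users WHERE " + where)
    return sorted(row[0] for row in rows)


# A simple equality check.
simpsons = names("last_name = 'Simpson'")

# Two ways of writing the same range test (BETWEEN includes both end points).
outside_a = names("(user_id < 2) OR (user_id > 4)")
outside_b = names("user_id NOT BETWEEN 2 AND 4")

# % matches zero or more characters, so Smith and Smithson both qualify.
smiths = names("last_name LIKE 'Smith%'")

# _ matches exactly one character.
one_char = names("last_name LIKE '_anks'")

# NOT LIKE keeps only the rows that fail to match the pattern.
not_authors = names("email NOT LIKE '%@authors.com'")

for result in (simpsons, outside_a, outside_b, smiths, one_char, not_authors):
    print(result)
```

Building the WHERE clause by string concatenation is acceptable here only because every condition is a fixed literal in the script; it is not how user-supplied values should reach a query.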
Figure 5.21 A NOT LIKE conditional returns records based upon what a value does not contain.

Sorting Query Results

By default, a SELECT query’s results will be returned in no guaranteed order. To sort them in a meaningful way, use an ORDER BY clause.

    SELECT * FROM tablename ORDER BY column
    SELECT * FROM orders ORDER BY total

The default order when using ORDER BY is ascending (abbreviated ASC), meaning that numbers increase from small to large, dates go from oldest to most recent, and text is sorted alphabetically. You can reverse this by specifying a descending order (abbreviated DESC).

    SELECT * FROM tablename ORDER BY column DESC

You can even order the returned values by multiple columns:

    SELECT * FROM tablename ORDER BY column1, column2

You can, and frequently will, use ORDER BY with WHERE or other clauses. When doing so, place the ORDER BY after the conditions:

    SELECT * FROM tablename WHERE conditions ORDER BY column

To sort data:

1. Select all of the users in alphabetical order by last name (Figure 5.22).

    SELECT first_name, last_name FROM users ORDER BY last_name;

If you compare these results with those in Figure 5.12, you’ll see the benefits of using ORDER BY.

Figure 5.22 The records in alphabetical order by last name.

2. Display all of the users in alphabetical order by last name and then first name (Figure 5.23).

    SELECT first_name, last_name FROM users ORDER BY last_name ASC, first_name ASC;

In this query, the effect would be that every row is returned, first ordered by the last_name, and then by first_name within the last_names. The effect is most evident among the Simpsons.

Figure 5.23 The records in alphabetical order, first by last name, and then by first name within that.

3. Show all of the non-Simpson users by date registered (Figure 5.24).
    SELECT * FROM users WHERE last_name != 'Simpson' ORDER BY registration_date DESC;

You can use an ORDER BY on any column type, including numbers and dates. The clause can also be used in a query with a conditional, placing the ORDER BY after the WHERE.

Figure 5.24 All of the users not named Simpson, displayed by date registered, with the most recent listed first.

✔ Tips
■ Because MySQL works naturally with any number of languages, the ORDER BY will be based upon the collation being used (see Chapter 14).
■ If the column that you choose to sort on contains NULL values, those will appear first in ascending order and last in descending order.

Limiting Query Results

Another SQL clause that can be added to most queries is LIMIT. In a SELECT query, WHERE dictates which records to return, and ORDER BY decides how those records are sorted, but LIMIT states how many records to return. It is used like so:

    SELECT * FROM tablename LIMIT x

In such queries, only the initial x records from the query result will be returned. To return only three matching records, use:

    SELECT * FROM tablename LIMIT 3

Using this format

    SELECT * FROM tablename LIMIT x, y

you can have y records returned, starting at x. To have records 11 through 20 returned, you would write

    SELECT * FROM tablename LIMIT 10, 10

Like arrays in PHP, result sets begin at 0 when it comes to LIMITs, so 10 is the 11th record.

You can use LIMIT with WHERE and/or ORDER BY clauses, always placing LIMIT last.

    SELECT which_columns FROM tablename WHERE conditions ORDER BY column LIMIT x

To limit the amount of data returned:

1. Select the last five registered users (Figure 5.25).

    SELECT first_name, last_name FROM users ORDER BY registration_date DESC LIMIT 5;

To return the latest of anything, sort the data by date, in descending order. Then, to see just the most recent five, add LIMIT 5 to the query.
Figure 5.25 Using the LIMIT clause, a query can return a specific number of records.

2. Select the second person to register (Figure 5.26).

    SELECT first_name, last_name FROM users ORDER BY registration_date ASC LIMIT 1, 1;

This may look strange, but it’s just a good application of the information learned so far. First, order all of the records by registration_date ascending, so the first people to register would be returned first. Then, limit the returned results to start at 1 (which is the second row) and to return just one record.

Figure 5.26 Thanks to the LIMIT clause, a query can even return records from the middle of a group, using the LIMIT x, y format.

✔ Tips
■ The LIMIT x, y clause is most frequently used when paginating query results (showing them in blocks over multiple pages). You’ll see this in Chapter 9, “Common Programming Techniques.”
■ A LIMIT clause does not necessarily improve the execution speed of a query, since MySQL may still have to assemble the entire result and then truncate the list. But a LIMIT clause will minimize the amount of data to handle when it comes to the mysql client or your PHP scripts.
■ The LIMIT term is not part of the SQL standard and is therefore (sadly) not available on all databases.
■ The LIMIT clause can be used with most types of queries, not just SELECTs.

Updating Data

Once tables contain some data, you have the potential need to edit those existing records. This might be necessary if information was entered incorrectly or if the data changes (such as a last name or email address). The syntax for updating records is

    UPDATE tablename SET column=value

You can alter multiple columns at a single time, separating each from the next by a comma.

    UPDATE tablename SET column1=valueA, column5=valueB…

You will almost always want to use a WHERE clause to specify what rows should be updated; otherwise, the change would be applied to every record.
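How much a missing WHERE clause matters is easy to demonstrate on a throwaway table. The sketch below makes a few assumptions of its own: Python's built-in sqlite3 module stands in for MySQL, the table is cut down to three columns, and the three rows are sample data. The rowcount attribute reports how many rows each UPDATE touched.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (user_id INTEGER PRIMARY KEY, first_name TEXT, email TEXT)"
)
conn.execute("""
    INSERT INTO users (first_name, email) VALUES
        ('Michael', 'michael@example.com'),
        ('Zoe', 'zoe@example.com'),
        ('Larry', 'larry@example.com')
""")

# With a WHERE clause, only the matching row is changed.
targeted = conn.execute(
    "UPDATE users SET email = 'mike@authors.com' WHERE user_id = 1"
).rowcount

# Without a WHERE clause, the same kind of query rewrites every row.
unrestricted = conn.execute(
    "UPDATE users SET email = 'oops@example.com'"
).rowcount

emails = [row[0] for row in conn.execute("SELECT email FROM users")]
print(targeted, unrestricted)
print(emails)
```

After the second statement, every record carries the same email address, including the one that had just been corrected.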
With a WHERE clause, the syntax is

    UPDATE tablename SET column2=value WHERE column5=value

Updates, along with deletions, are one of the most important reasons to use a primary key. This value, which should never change, can be a reference point in WHERE clauses, even if every other field needs to be altered.

To update a record:

1. Find the primary key for the record to be updated (Figure 5.27).

    SELECT user_id FROM users WHERE first_name = 'Michael' AND last_name='Chabon';

In this example, I’ll change the email for this author’s record. To do so, I must first find that record’s primary key, which this query accomplishes.

Figure 5.27 Before updating a record, determine which primary key to use in the UPDATE’s WHERE clause.

2. Update the record (Figure 5.28).

    UPDATE users SET email='mike@authors.com' WHERE user_id = 18;

To change the email address, I use an UPDATE query, using the primary key (user_id) to specify to which record the update should apply. MySQL will report upon the success of the query and how many rows were affected.

Figure 5.28 This query altered the value of one column in just one row.

3. Confirm that the change was made (Figure 5.29).

    SELECT * FROM users WHERE user_id=18;

Although MySQL already indicated the update was successful (see Figure 5.28), it can’t hurt to select the record again to confirm that the proper changes occurred.

✔ Tips
■ Be extra certain to use a WHERE conditional whenever you use UPDATE unless you want the changes to affect every row.
■ If you run an update query that doesn’t actually change any values (like UPDATE users SET first_name='mike' WHERE first_name='mike'), you won’t see any errors but no rows will be affected. More recent versions of MySQL would show that X rows matched the query but that 0 rows were changed.
■ To protect yourself against accidentally updating too many rows, apply a LIMIT clause to your UPDATEs:

    UPDATE users SET email='mike@authors.com' WHERE user_id = 18 LIMIT 1

■ You should never perform an UPDATE on a primary-key column, because this value should never change. Altering the value of a primary key could have serious repercussions.
■ To update a record in phpMyAdmin, you can run an UPDATE query using the SQL window or tab. Alternatively, run a SELECT query to find the record you want to update, and then click the pencil next to the record (Figure 5.30). This will bring up a form similar to Figure 5.10, where you can edit the record’s current values.

Figure 5.29 As a final step, you can confirm the update by selecting the record again.

Figure 5.30 A partial view of browsing records in phpMyAdmin. Click the pencil to edit a record; click the X to delete it.

Deleting Data

Along with updating existing records, another step you might need to take is to entirely remove a record from the database. To do this, you use the DELETE command.

    DELETE FROM tablename

That command as written will delete every record in a table, making it empty again. Once you have deleted a record, there is no way of retrieving it.

In most cases you’ll want to delete individual rows, not all of them. To do so, use a WHERE clause:

    DELETE FROM tablename WHERE condition

To delete a record:

1. Find the primary key for the record to be deleted (Figure 5.31).

    SELECT user_id FROM users WHERE first_name='Peter' AND last_name='Tork';

Just as in the UPDATE example, I first need to determine which primary key to use for the delete.

Figure 5.31 The user_id will be used to refer to this record in a DELETE query.

Figure 5.32 To preview the effect of a DELETE query, first run a syntactically similar SELECT query.

2. Preview what will happen when the delete is made (Figure 5.32).
    SELECT * FROM users
    WHERE user_id = 8;

A really good trick for safeguarding against errant deletions is to first run the query using SELECT * instead of DELETE. The results of this query will represent which row(s) will be affected by the deletion.

3. Delete the record (Figure 5.33).

    DELETE FROM users
    WHERE user_id = 8 LIMIT 1;

As with the update, MySQL will report on the successful execution of the query and how many rows were affected. At this point, there is no way of reinstating the deleted records unless you backed up the database beforehand. Even though the SELECT query (Step 2 and Figure 5.32) only returned the one row, just to be extra careful, a LIMIT 1 clause is added to the DELETE query.

4. Confirm that the change was made (Figure 5.34).

    SELECT user_id, first_name, last_name
    FROM users ORDER BY user_id ASC;

You could also confirm the change by running the query in Step 1.

✔ Tips
■ The preferred way to empty a table is to use TRUNCATE:

    TRUNCATE TABLE tablename

■ To delete all of the data in a table, as well as the table itself, use DROP TABLE:

    DROP TABLE tablename

■ To delete an entire database, including every table therein and all of its data, use

    DROP DATABASE databasename

Figure 5.33 Deleting one record from the table.
Figure 5.34 The record whose user_id was 8 is no longer part of this table.

Using Functions

To wrap up this chapter, you’ll learn about a number of functions that you can use in your MySQL queries. You have already seen two, NOW() and SHA1(), but those are just the tip of the iceberg. Most of the functions you’ll see here are used with SELECT queries to format and alter the returned data, but you may use MySQL functions in other types of queries as well.
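As a rough sketch of that last point, functions are not limited to the list of selected columns: they can also appear in an UPDATE or in a WHERE clause. The statements below reuse the users table and the user_id value from the earlier examples. The particular functions chosen (LOWER(), TRIM(), and YEAR()) and the year 2008 are only illustrations, not queries taken from this chapter.

```sql
-- Illustration only: functions applied within an UPDATE.
-- Assumes the users table and the record (user_id 18) from the UPDATE example.
UPDATE users
SET email = LOWER(TRIM(email))   -- strip stray spaces, then lowercase
WHERE user_id = 18
LIMIT 1;

-- Illustration only: a function used in a WHERE conditional.
-- The year 2008 is an arbitrary value chosen for the example.
SELECT user_id, first_name, last_name
FROM users
WHERE YEAR(registration_date) = 2008;
```

As with any UPDATE, it would be sensible to preview the affected row with a SELECT that uses the same WHERE clause before running the first statement.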
To apply a function to a column’s values, the query would look like

    SELECT FUNCTION(column) FROM tablename

To apply a function to one column’s values while also selecting some other columns, you can write a query like either of these:

◆ SELECT *, FUNCTION(column) FROM tablename
◆ SELECT column1, FUNCTION(column2), column3 FROM tablename

Before getting to the actual functions, make note of a couple more things. First, functions are often applied to stored data (i.e., columns) but can also be applied to literal values. Either of these applications of the UPPER() function (which capitalizes a string) is valid:

    SELECT UPPER(first_name) FROM users
    SELECT UPPER('this string')

Second, while the function names themselves are case-insensitive, I will continue to write them in an all-capitalized format, to help distinguish them from table and column names (as I also capitalize SQL terms). Third, an important rule with functions is that you cannot have spaces between the function name and the opening parenthesis in MySQL, although spaces within the parentheses are acceptable. And finally, when using functions to format returned data, you’ll often want to make use of aliases, a concept discussed in the sidebar.

Aliases

An alias is merely a symbolic renaming of a thing in a query. Normally applied to tables, columns, or function calls, aliases provide a shortcut for referring to something. Aliases are created using the term AS:

    SELECT registration_date AS reg FROM users

Aliases are case-sensitive strings composed of numbers, letters, and the underscore but are normally kept to a very short length. As you’ll see in the following examples, aliases are often reflected in the headings of the returned results. For the preceding sample, the query results returned will contain one column of data, named reg.

If you’ve defined an alias on a table or a column, the entire query must consistently use that same alias rather than the original name.
For example, once a table has been given an alias, its columns must be referred to through that alias:

    SELECT u.first_name FROM users AS u WHERE u.first_name='Sam'

Column aliases are the one exception: like standard SQL, MySQL doesn’t support the use of column aliases in WHERE conditionals, because the WHERE clause is evaluated before the alias is defined. A query such as SELECT first_name AS name FROM users WHERE name='Sam' fails with an unknown column error, so keep using the original column name in the WHERE clause. Column aliases can be used in GROUP BY, ORDER BY, and HAVING clauses.

Text functions

The first group of functions to demonstrate are those meant for manipulating text. The most common of the functions in this category are listed in Table 5.2. CONCAT(), perhaps the most useful of the text functions, deserves special attention. The CONCAT() function accomplishes concatenation, for which PHP uses the period (see Chapter 1, “Introduction to PHP”). The syntax for concatenation requires you to place, within parentheses, the various values you want assembled, in order and separated by commas:

    SELECT CONCAT(t1, t2) FROM tablename

While you can, and normally will, apply CONCAT() to columns, you can also incorporate strings, entered within quotation marks. For example, to format a person’s name as First Last, you would use

    SELECT CONCAT(first_name, ' ', last_name) FROM users

Because concatenation normally returns values in a new format, it’s an excellent time to use an alias (see the sidebar):

    SELECT CONCAT(first_name, ' ', last_name) AS Name FROM users

Table 5.2 Text Functions

Function       Usage                        Returns
CONCAT()       CONCAT(t1, t2, ...)          A new string of the form t1t2.
CONCAT_WS()    CONCAT_WS(S, t1, t2, ...)    A new string of the form t1St2S…
LENGTH()       LENGTH(t)                    The length of t in bytes (the number of characters for single-byte text).
LEFT()         LEFT(t, y)                   The leftmost y characters from t.
RIGHT()        RIGHT(t, x)                  The rightmost x characters from t.
TRIM()         TRIM(t)                      t with excess spaces from the beginning and end removed.
UPPER()        UPPER(t)                     t capitalized.
LOWER()        LOWER(t)                     t in all-lowercase format.
SUBSTRING()    SUBSTRING(t, x, y)           y characters from t beginning with x (indexed from 1).

Table 5.2 Some of MySQL’s functions for working with text. As with most functions, these can be applied to either columns or literal values (both represented by t, t1, t2, etc.).

To format text:

1. Concatenate the names without using an alias (Figure 5.35).

    SELECT CONCAT(last_name, ', ', first_name)
    FROM users;

This query will demonstrate two things. First, the users’ last names, a comma and a space, plus their first names are concatenated together to make one string (in the format of Last, First). Second, as the figure shows, if you don’t use an alias, the returned data’s column heading will be the function call. In the mysql client or phpMyAdmin, this is just unsightly; when using PHP to connect to MySQL, this will likely be a problem.

2. Concatenate the names while using an alias (Figure 5.36).

    SELECT CONCAT(last_name, ', ', first_name)
    AS Name FROM users
    ORDER BY Name;

To use an alias, just add AS aliasname after the item to be renamed. The alias will be the new title for the returned data. To make the query a little more interesting, the same alias is also used in the ORDER BY clause.

Figure 5.35 This simple concatenation returns every registered user’s full name. Notice how the column heading is the use of the CONCAT() function.
Figure 5.36 By using an alias, the returned data is under the column heading of Name (compare with Figure 5.35).

3. Find the longest last name (Figure 5.37).

    SELECT LENGTH(last_name) AS L, last_name
    FROM users
    ORDER BY L DESC LIMIT 1;

To determine which registered user’s last name is the longest (has the most characters in it), use the LENGTH() function. To find the name, select both the last name value and the calculated length, which is given an alias of L. To then find the longest name, order all of the results by L, in descending order, but only return the first record.

✔ Tips
■ A query like that in Step 3 (also Figure 5.37) may be useful for helping to fine-tune your column lengths once your database has some records in it.
■ MySQL has two operators for performing regular expression searches on text: REGEXP and NOT REGEXP.
Chapter 13, “Perl-Compatible Regular Expressions,” introduces regular expressions using PHP.
■ CONCAT() has a corollary function called CONCAT_WS(), which stands for with separator. The syntax is CONCAT_WS(separator, t1, t2, …). The separator will be inserted between each of the listed columns or values. For example, to format a person’s full name as First Middle Last, you would write

    SELECT CONCAT_WS(' ', first, middle, last) AS Name FROM tablename

CONCAT_WS() has an added advantage over CONCAT() in that it will ignore columns with NULL values. So that query might return Joe Banks from one record but Jane Sojourner Adams from another.

Figure 5.37 By using the LENGTH() function, an alias, an ORDER BY clause, and a LIMIT clause, this query returns the length and value of the longest stored name.

Numeric functions

Besides the standard math operators that MySQL uses (for addition, subtraction, multiplication, and division), there are a couple dozen functions for formatting and performing calculations on numeric values. Table 5.3 lists the most common of these, some of which will be demonstrated shortly.

I want to specifically highlight three of these functions: FORMAT(), ROUND(), and RAND(). The first, which is not technically number-specific, turns any number into a more conventionally formatted layout. For example, if you stored the cost of a car as 20198.20, FORMAT(car_cost, 2) would turn that number into the more common 20,198.20.

ROUND() will take one value, presumably from a column, and round that to a specified number of decimal places. If no decimal places are indicated, it will round the number to the nearest integer. If more decimal places are indicated than exist in the original number, the remaining spaces are padded with zeros (to the right of the decimal point).

The RAND() function, as you might infer, is used for returning random numbers (Figure 5.38).
    SELECT RAND()

A further benefit to the RAND() function is that it can be used with your queries to return the results in a random order.

    SELECT * FROM tablename ORDER BY RAND()

Figure 5.38 The RAND() function returns a random number between 0 and 1.0.

Table 5.3 Numeric Functions

Function     Usage             Returns
ABS()        ABS(n)            The absolute value of n.
CEILING()    CEILING(n)        The next-highest integer based upon the value of n.
FLOOR()      FLOOR(n)          The largest integer that is not greater than n.
FORMAT()     FORMAT(n1, n2)    n1 formatted as a number with n2 decimal places and commas inserted every three digits.
MOD()        MOD(n1, n2)       The remainder of dividing n1 by n2.
POW()        POW(n1, n2)       n1 to the n2 power.
RAND()       RAND()            A random number between 0 and 1.0.
ROUND()      ROUND(n1, n2)     n1 rounded to n2 decimal places.
SQRT()       SQRT(n)           The square root of n.

Table 5.3 Some of MySQL’s functions for working with numbers. As with most functions, these can be applied to either columns or literal values (both represented by n, n1, n2, etc.).

To use numeric functions:

1. Display a number, formatting the amount as dollars (Figure 5.39).

    SELECT CONCAT('$', FORMAT(5639.6, 2))
    AS cost;

Using the FORMAT() function, as just described, with CONCAT(), you can turn any number into a currency format as you might display it in a Web page.

2. Retrieve a random email address from the table (Figure 5.40).

    SELECT email FROM users
    ORDER BY RAND() LIMIT 1;

What happens with this query is: All of the email addresses are selected; the order they are in is shuffled (ORDER BY RAND()); and then the first one is returned. Running this same query multiple times will produce different random results. Notice that you do not specify a column to which RAND() is applied.

✔ Tips
■ Along with the mathematical functions listed here, there are several trigonometric, exponential, and other types of numeric functions available.
■ The MOD() function is the same as using the percent sign:

    SELECT MOD(9,2)
    SELECT 9%2

It returns the remainder of a division (1 in these examples).

Figure 5.39 Using an arbitrary example, this query shows how the FORMAT() function works.
Figure 5.40 Subsequent executions of the same query return different random results.

Date and time functions

The date and time column types in MySQL are particularly flexible and useful. But because many database users are not familiar with all of the available date and time functions, these options are frequently underused. Whether you want to make calculations based upon a date or return only the month name from a value, MySQL has a function for that purpose. Table 5.4 lists most of these; see the MySQL manual for a complete list.

MySQL supports two data types that store both a date and a time (DATETIME and TIMESTAMP), one type that stores just the date (DATE), one that stores just the time (TIME), and one that stores just a year (YEAR). Besides allowing for different types of values, each data type also has its own unique behaviors (again, I’d recommend reading the MySQL manual’s pages on this for all of the details).

But MySQL is very flexible as to which functions you can use with which type. You can apply a date function to any value that contains a date (i.e., DATETIME, TIMESTAMP, and DATE), or you can apply an hour function to any value that contains the time (i.e., DATETIME, TIMESTAMP, and TIME). MySQL will use the part of the value that it needs and ignore the rest. What you cannot do, however, is apply a date function to a TIME value or a time function to a DATE or YEAR value.

Table 5.4 Date and Time Functions

Function            Usage                 Returns
HOUR()              HOUR(dt)              The hour value of dt.
MINUTE()            MINUTE(dt)            The minute value of dt.
SECOND()            SECOND(dt)            The second value of dt.
DAYNAME()           DAYNAME(dt)           The name of the day for dt.
DAYOFMONTH()        DAYOFMONTH(dt)        The numerical day value of dt.
MONTHNAME()         MONTHNAME(dt)         The name of the month of dt.
MONTH()             MONTH(dt)             The numerical month value of dt.
YEAR()              YEAR(dt)              The year value of dt.
CURDATE()           CURDATE()             The current date.
CURTIME()           CURTIME()             The current time.
NOW()               NOW()                 The current date and time.
UNIX_TIMESTAMP()    UNIX_TIMESTAMP(dt)    The number of seconds since the epoch until the current moment or until the date specified.

Table 5.4 Some of MySQL’s functions for working with dates and times. As with most functions, these can be applied to either columns or literal values (both represented by dt, short for datetime).

To use date and time functions:

1. Display the date that the last user registered (Figure 5.41).

    SELECT DATE(registration_date) AS Date
    FROM users
    ORDER BY registration_date DESC
    LIMIT 1;

The DATE() function returns the date part of a value. To see the date that the last person registered, an ORDER BY clause lists the users starting with the most recently registered and this result is limited to just one record.

2. Display the day of the week that the first user registered (Figure 5.42).

    SELECT DAYNAME(registration_date) AS Weekday
    FROM users
    ORDER BY registration_date ASC
    LIMIT 1;

This is similar to the query in Step 1 but the results are returned in ascending order and the DAYNAME() function is applied to the registration_date column. This function returns Sunday, Monday, Tuesday, etc., for a given date.

3. Show the current date and time, according to MySQL (Figure 5.43).

    SELECT CURDATE(), CURTIME();

To show what date and time MySQL currently thinks it is, you can select the CURDATE() and CURTIME() functions, which return these values. This is another example of a query that can be run without referring to a particular table name.

Figure 5.41 The date functions can be used to extract information from stored values.
Figure 5.42 This query returns the name of the day that a given date represents.
Figure 5.43 This query, not run on any particular table, returns the current date and time on the MySQL server.

4. Show the last day of the current month (Figure 5.44).

    SELECT LAST_DAY(CURDATE()),
    MONTHNAME(CURDATE());

As the last query showed, CURDATE() returns the current date on the server. This value can be used as an argument to the LAST_DAY() function, which returns the last date in the month for a given date. The MONTHNAME() function returns the name of the current month.

✔ Tips
■ The date and time returned by MySQL’s date and time functions correspond to those on the server, not on the client accessing the database.
■ Not mentioned in this section or in Table 5.4 are ADDDATE(), SUBDATE(), ADDTIME(), and SUBTIME(). Each can be used to perform arithmetic on date and time values. These can be very useful (for example, to find everyone registered within the past week) but their syntax is cumbersome. As always, see the MySQL manual for more information.
■ As of MySQL 5.0.2, the server will also prevent invalid dates (e.g., February 31, 2009) from being inserted into a date or date/time column.

Figure 5.44 Among the many things MySQL can do with date and time types is determine the last date in a month or the name value of a given date.

Formatting the date and time

There are two additional date and time functions that you might find yourself using more than all of the others combined: DATE_FORMAT() and TIME_FORMAT(). There is some overlap between the two and when you would use one or the other.

DATE_FORMAT() can be used to format both the date and time if a value contains both (e.g., YYYY-MM-DD HH:MM:SS). Comparatively, TIME_FORMAT() can format only the time value and must be used if only the time value is being stored (e.g., HH:MM:SS). The syntax is

    SELECT DATE_FORMAT(datetime, formatting)

The formatting relies upon combinations of key codes and the percent sign to indicate what values you want returned.
Table 5.5 lists the available date- and time-formatting parameters. You can use these in any combination, along with literal characters, such as punctuation, to return a date and time in a more presentable form.

Assuming that a column called the_date has the date and time of 1996-04-20 11:07:45 stored in it, common formatting tasks and results would be

◆ Time (11:07:45 AM)
    TIME_FORMAT(the_date, '%r')
◆ Time without seconds (11:07 AM)
    TIME_FORMAT(the_date, '%l:%i %p')
◆ Date (April 20th, 1996)
    DATE_FORMAT(the_date, '%M %D, %Y')

Table 5.5 *_FORMAT() Parameters

Term    Usage                               Example
%e      Day of the month                    1-31
%d      Day of the month, two digit         01-31
%D      Day with suffix                     1st-31st
%W      Weekday name                        Sunday-Saturday
%a      Abbreviated weekday name            Sun-Sat
%c      Month number                        1-12
%m      Month number, two digit             01-12
%M      Month name                          January-December
%b      Month name, abbreviated             Jan-Dec
%Y      Year                                2002
%y      Year                                02
%l      Hour (the term is a lowercase L)    1-12
%h      Hour, two digit                     01-12
%k      Hour, 24-hour clock                 0-23
%H      Hour, 24-hour clock, two digit      00-23
%i      Minutes                             00-59
%S      Seconds                             00-59
%r      Time                                08:17:02 PM
%T      Time, 24-hour clock                 20:17:02
%p      AM or PM                            AM or PM

Table 5.5 Use these parameters with the DATE_FORMAT() and TIME_FORMAT() functions.

To format the date and time:

1. Return the current date and time as Month DD, YYYY - HH:MM (Figure 5.45).

    SELECT DATE_FORMAT(NOW(),'%M %e, %Y - %l:%i');

Using the NOW() function, which returns the current date and time, you can practice formatting to see what results are returned.

2. Display the current time, using 24-hour notation (Figure 5.46).

    SELECT TIME_FORMAT(CURTIME(),'%T');

3. Select the email address and date registered, ordered by date registered, formatting the date as Weekday (abbreviated) Month (abbreviated) Day Year, for the last five registered users (Figure 5.47).
    SELECT email,
    DATE_FORMAT(registration_date, '%a %b %e %Y')
    AS Date FROM users
    ORDER BY registration_date DESC
    LIMIT 5;

This is just one more example of how you can use these formatting functions to alter the output of an SQL query.

✔ Tips
■ In your Web applications, you should almost always use MySQL functions to format any dates coming from the database.
■ The only way to access the date or time on the client (the user’s machine) is to use JavaScript. It cannot be done with PHP or MySQL.

Figure 5.45 The current date and time, formatted.
Figure 5.46 The current time, in a 24-hour format.
Figure 5.47 The DATE_FORMAT() function is used to pre-format the registration date when selecting records from the users table.

6 Advanced SQL and MySQL

This chapter picks up where its predecessor left off, discussing more advanced SQL and MySQL topics. While the basics of both technologies will certainly get you by, it’s these more complex ideas that make sophisticated applications possible.

The chapter begins by discussing database design in greater detail, using a message board as the example. More elaborate databases like a forum require SQL queries called joins, so that subject will follow. From there, the chapter introduces a category of functions that are specifically used when grouping query results.

After that, the subjects turn to advanced MySQL concepts: indexes, changing the structure of existing tables, and table types. The chapter concludes with two more MySQL features: performing full text searches and transactions.

Database Design

Whenever you are working with a relational database management system such as MySQL, the first step in creating and using a database is to establish the database’s structure (also called the database schema). Database design, aka data modeling, is crucial for successful long-term management of information. Using a process called normalization, you carefully eliminate redundancies and other problems that will undermine the integrity of your database.

The techniques you will learn over the next few pages will help to ensure the viability, usefulness, and reliability of your databases. The specific example to be discussed, a forum where users can post messages, will be more explicitly used in Chapter 15, “Example—Message Board,” but the principles of normalization apply to any database you might create. (The sitename example as created in the past two chapters was properly normalized, even though that was never discussed.)

Normalization

Normalization was developed by an IBM researcher named E.F. Codd in the early 1970s (he also invented the relational database). A relational database is merely a collection of data, organized in a particular manner, and Dr. Codd created a series of rules called normal forms that help define that organization. In this chapter I will discuss the first three of the normal forms, which are sufficient for most database designs.

Before you begin normalizing your database, you must define the role of the application being developed. Whether it means that you thoroughly discuss the subject with a client or figure it out for yourself, understanding how the information will be accessed dictates the modeling. Thus, this process will require paper and pen rather than the MySQL software itself (although database design is applicable to any relational database, not just MySQL).

In this example I want to create a message board where users can post messages and other users can reply. I imagine that users will need to register, then log in with a username/password combination, in order to post messages. I also expect that there could be multiple forums for different subjects. I have listed a sample row of data in Table 6.1. The database itself will be called forum.
✔ Tips
■ One of the best ways to determine what information should be stored in a database is to think about what questions will be asked of the database and what data would be included in the answers.
■ Normalization can be hard to learn if you fixate on the little things. Each of the normal forms is defined in a very cryptic way; even when put into layman’s terms, they can still be confounding. My best advice is to focus on the big picture as you follow along. Once you’ve gone through normalization and see the end result, the overall process should be clear enough.

Table 6.1 Sample Forum Data

Item               Example
username           troutster
password           mypass
actual name        Larry Ullman
user email         email@example.com
forum              MySQL
message subject    Question about normalization.
message body       I have a question about…
message date       February 2, 2008 12:20 AM

Table 6.1 Representative data for the kind of information to be stored in the database.

Table 6.2 Sample Forum Data

Item               Example
message ID         325
username           troutster
password           mypass
actual name        Larry Ullman
user email         email@example.com
forum              MySQL
message subject    Question about normalization.
message body       I have a question about…
message date       February 2, 2008 12:20 AM

Keys

As briefly mentioned in Chapter 4, “Introduction to MySQL,” keys are integral to normalized databases. There are two types of keys: primary and foreign. A primary key is a unique identifier that has to abide by certain rules. They must

◆ Always have a value (they cannot be NULL)
◆ Have a value that remains the same (never changes)
◆ Have a unique value for each record in a table

The best real-world example of a primary key is the U.S. Social Security number: each individual has a unique Social Security number, and that number never changes. Just as the Social Security number is an artificial construct used to identify people, you’ll frequently find creating an arbitrary primary key for each table to be the best design practice.
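To make the three rules a little more concrete, here is a minimal sketch of how an arbitrary primary key might be declared in MySQL. The members table and its columns are hypothetical, invented only for this illustration; they are not part of the forum design being developed in this chapter.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE members (
    -- Rule 1: NOT NULL means the key always has a value.
    -- AUTO_INCREMENT supplies an arbitrary integer for each new row.
    member_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    email VARCHAR(80) NOT NULL,
    -- Rule 3: PRIMARY KEY guarantees a unique value for each record.
    PRIMARY KEY (member_id)
);
```

The second rule (the value never changes) is not something this definition enforces. It is a matter of discipline: simply never run an UPDATE on the key column.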
The second type of key is a foreign key. Foreign keys are the representation in Table B of the primary key from Table A. If you have a cinema database with a movies table and a directors table, the primary key from directors would be linked as a foreign key in movies. You’ll see better how this works as the normalization process continues.

The forum database is just a simple table as it stands (Table 6.1), but before beginning the normalization process, identify at least one primary key (the foreign keys will come in later steps).

To assign a primary key:

1. Look for any fields that meet the three tests for a primary key.

In this example (Table 6.1), no column really fits all of the criteria for a primary key. The username and email address will be unique for each forum user but will not be unique for each record in the database (because the same user could post multiple messages). The same subject could be used multiple times as well. The message body will likely be unique for each message but could change (if edited), violating one of the rules of primary keys.

2. If no logical primary key exists, invent one (Table 6.2).

Frequently, you will need to create a primary key because no good solution presents itself. In this example, a message ID is manufactured. When you create a primary key that has no other meaning or purpose, it’s called a surrogate primary key.

✔ Tips
■ As a rule of thumb, I name my primary keys using at least part of the table’s name (e.g., message) and the word id. Some database developers like to add the abbreviation pk to the name as well.
■ MySQL allows for only one primary key per table, although you can base a primary key on multiple columns (this means the combination of those columns must be unique and never change).
■ Ideally, your primary key should always be an integer, which results in better MySQL performance.
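The cinema example mentioned above can be sketched in SQL to show how a primary key in one table reappears as a foreign key in another. This is only an illustration under stated assumptions: the column names, types, and sizes are invented here, and the InnoDB storage engine is specified because MySQL’s MyISAM engine accepts a FOREIGN KEY clause without enforcing it.

```sql
-- Hypothetical schema for the cinema example; names and sizes are assumptions.
CREATE TABLE directors (
    director_id INT UNSIGNED NOT NULL AUTO_INCREMENT,  -- surrogate primary key
    first_name VARCHAR(40) NOT NULL,
    last_name VARCHAR(40) NOT NULL,
    PRIMARY KEY (director_id)
) ENGINE=InnoDB;

CREATE TABLE movies (
    movie_id INT UNSIGNED NOT NULL AUTO_INCREMENT,     -- surrogate primary key
    title VARCHAR(100) NOT NULL,
    -- The primary key from directors, represented here as a foreign key.
    director_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (movie_id),
    FOREIGN KEY (director_id) REFERENCES directors (director_id)
) ENGINE=InnoDB;
```

Both keys follow the naming habit from the tips (part of the table’s name plus id) and are integers. The foreign key column uses exactly the same type as the primary key it refers to, which the constraint requires.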
Table 6.2 A primary key is added to the table as an easy way to reference the records.

Relationships

Database relationships refer to how the data in one table relates to the data in another. There are three types of relationships between any two tables: one-to-one, one-to-many, or many-to-many. (Two tables in a database may also be unrelated.)

A relationship is one-to-one if one and only one item in Table A applies to one and only one item in Table B. For example, each U.S. citizen has only one Social Security number, and each Social Security number applies to only one U.S. citizen; no citizen can have two Social Security numbers, and no Social Security number can refer to two citizens.

A relationship is one-to-many if one item in Table A can apply to multiple items in Table B. The terms female and male will apply to many people, but each person can be only one or the other (in theory). A one-to-many relationship is the most common one between tables in normalized databases.

Finally, a relationship is many-to-many if multiple items in Table A can apply to multiple items in Table B. A record album can contain songs by multiple artists, and artists can make multiple albums. You should try to avoid many-to-many relationships in your design because they lead to data redundancy and integrity problems. Instead of having many-to-many relationships, properly designed databases use intermediary tables that break down one many-to-many relationship into two one-to-many relationships.

Relationships and keys work together in that a key in one table will normally relate to a key in another, as mentioned earlier.

✔ Tips
■ Database modeling uses certain conventions to represent the structure of the database, which I’ll follow through a series of images in this chapter. The symbols for the three types of relationships are shown in Figure 6.1.
■ The process of database design results in an ERD (entity-relationship diagram) or ERM (entity-relationship model). This graphical representation of a database uses boxes for tables, ovals for columns, and the symbols from Figure 6.1 to represent the relationships.
■ There are many programs available to help create a database schema, including MySQL Workbench (www.mysql.com), which is in alpha release at the time of this writing.
■ The term “relational” in RDBMS actually stems from the tables, which are technically called relations.

Figure 6.1 These symbols, or variations on them, are commonly used to represent relationships in database modeling schemes.

First Normal Form

As already stated, normalizing a database is the process of adjusting the database’s structure according to several rules, called forms. Your database should adhere to each rule exactly, and the forms must be followed in order.

Every table in a database must have the following two qualities in order to be in First Normal Form (1NF):

◆ Each column must contain only one value (this is sometimes described as being atomic or indivisible).
◆ No table can have repeating groups of related data.

A table containing one field for a person’s entire address (street, city, state, zip code, country) would not be 1NF compliant, because it has multiple values in one column, violating the first property above. As for the second, a movies table that had columns such as actor1, actor2, actor3, and so on would fail to be 1NF compliant because of the repeating columns all listing the exact same kind of information.

I’ll begin the normalization process by checking the existing structure (Table 6.2) for 1NF compliance. Any columns that are not atomic will be broken into multiple columns. If a table has repeating similar columns, then those will be turned into their own, separate table.

To make a database 1NF compliant:

1. Identify any field that contains multiple pieces of information.

Looking at Table 6.2, one field is not 1NF compliant: actual name. The example record contained both the first name and the last name in this one column.

The message date field contains a day, a month, and a year, plus a time, but subdividing past that level of specificity is really not warranted. And, as the end of the last chapter shows, MySQL can handle dates and times quite nicely using the DATETIME type.

Other examples of problems would be if a table used just one column for multiple phone numbers (mobile, home, work), or stored a person’s multiple interests (cooking, dancing, skiing, etc.) in a single column.

2. Break up any fields found in Step 1 into distinct fields (Table 6.3).

To fix this problem, I’ll create separate first name and last name fields, each of which contains only one value.

Table 6.3 Forum Database, Atomic

Item               Example
message ID         325
username           troutster
password           mypass
first name         Larry
last name          Ullman
user email         email@example.com
forum              MySQL
message subject    Question about normalization.
message body       I have a question about…
message date       February 2, 2008 12:20 AM

Table 6.3 The actual name column has been broken in two to store data more atomically.

3. Turn any repeating column groups into their own table.

The forum database doesn’t have this problem currently, so to demonstrate what would be a violation, consider Table 6.4. The repeating columns (the multiple actor fields) introduce two problems. First of all, there’s no getting around the fact that each movie will be limited to a certain number of actors when stored this way. Even if you add columns actor 1 through actor 100, there will still be that limit (of a hundred). Second, any record that doesn’t have the maximum number of actors will have NULL values in those extra columns. You should generally avoid columns with NULL values in your database schema.
As another concern, the actor and director columns are not atomic.
To fix the problems in the movies table, a second table would be created (Table 6.5). This table uses one row for each actor in a movie, which solves the problems mentioned in the last paragraph. The actor names are also broken up to be atomic. Notice as well that a primary-key column should be added to the new table. The notion that each table has a primary key is implicit in the First Normal Form.

4. Double-check that all new columns and tables created in Steps 2 and 3 pass the 1NF test.

✔ Tips
■ The simplest way to think about 1NF is that this rule analyzes a table horizontally. You inspect all of the columns within a single row to guarantee specificity and avoid repetition of similar data.
■ Various resources will describe the normal forms in somewhat different ways, likely with much more technical jargon. What is most important is the spirit, and the end result, of the normalization process, not the technical wording of the rules.

Movies Table
Column         Value
movie ID       976
movie title    Casablanca
year released  1943
director       Michael Curtiz
actor 1        Humphrey Bogart
actor 2        Ingrid Bergman
actor 3        Peter Lorre

Table 6.4 This movies table violates the 1NF rule for two reasons. First, it has repeating columns of similar data (actor 1 etc.). Second, the actor and director columns are not atomic.

Movies-Actors Table
ID  Movie               Actor First Name  Actor Last Name
1   Casablanca          Humphrey          Bogart
2   Casablanca          Ingrid            Bergman
3   Casablanca          Peter             Lorre
4   The Maltese Falcon  Humphrey          Bogart
5   The Maltese Falcon  Peter             Lorre

Table 6.5 To make the movies table (Table 6.4) 1NF compliant, the association of actors with a movie would be made in this table.

Second Normal Form

For a database to be in Second Normal Form (2NF), the database must first already be in 1NF (you must normalize in order).
Then, every column in the table that is not a key (i.e., a foreign key) must be dependent upon the primary key. You can normally identify a column that violates this rule when it has non-key values that are the same in multiple rows. Such values should be stored in their own table and related back to the original table through a key.

Going back to the cinema example, a movies table (Table 6.4) would have the director Martin Scorsese listed twenty-plus times. This violates the 2NF rule as the column(s) that store the directors’ names would not be keys and would not be dependent upon the primary key (the movie ID). The fix is to create a separate directors table that stores the directors’ information and assigns each director a primary key. To tie the director back to the movies, the director’s primary key would also be a foreign key in the movies table.

Looking at Table 6.5 (for actors in movies), both the movie name and the actor names are also in violation of the 2NF rule (they aren’t keys and they aren’t dependent on the table’s primary key). In the end, the cinema database in this minimal form requires four tables (Figure 6.2). Each director’s name, movie name, and actor’s name will be stored only once, and any non-key column in a table is dependent upon that table’s primary key. In fact, normalization could be summarized as the process of creating more and more tables until potential redundancies have been eliminated.

Figure 6.2 To make the cinema database 2NF compliant (given the information being represented), four tables are necessary. The directors are represented in the movies table through the director ID key; the movies are represented in the movies-actors table through the movie ID key; and the actors are represented in the movies-actors table through the actor ID key.

To make a database 2NF compliant:

1. Identify any non-key columns that aren’t dependent upon the table’s primary key.
Looking at Table 6.3, the username, first name, last name, email, and forum values are all non-keys (message ID is the only key column currently), and none are dependent upon the message ID. Conversely, the message subject, body, and date are also non-keys, but these do depend upon the message ID.

2. Create new tables accordingly (Figure 6.3).
The most logical modification for the forum database is to make three tables: users, forums, and messages.
In a visual representation of the database, create a box for each table, with the table name as a header and all of its columns (also called its attributes) underneath.

Figure 6.3 To make the forum database 2NF compliant, three tables are necessary.

3. Assign or create new primary keys (Figure 6.4).
Using the techniques described earlier in the chapter, ensure that each new table has a primary key. Here I’ve added a user ID field to the users table and a forum ID field to forums. These are both surrogate primary keys. Because the username field in the users table and the name field in the forums table must be unique for each record and must always have a value, you could have them act as the primary keys for their tables. However, this would mean that these values could never change (per the rules of primary keys) and the database will be a little slower, using text-based keys instead of numeric ones.

Figure 6.4 Each table needs its own primary key.

4. Create the requisite foreign keys and indicate the relationships (Figure 6.5).
The final step in achieving 2NF compliance is to incorporate foreign keys to link associated tables. Remember that a primary key in one table will most likely be a foreign key in another.
With this example, the user ID from the users table links to the user ID column in the messages table. Therefore, users has a one-to-many relationship with messages (because each user can post multiple messages but each message can only be posted by one user).
Also, the two forum ID columns are linked, creating a one-to-many relationship between messages and forums (each message can only be in one forum but each forum can have multiple messages). There is no relationship between the users and forums tables.

Figure 6.5 To relate the three tables, two foreign keys are added to the messages table, each key representing one of the other two tables.

✔ Tips
■ Another way to test for 2NF is to look at the relationships between tables. The ideal is to create one-to-many situations. Tables that have a many-to-many relationship may need to be restructured.
■ Looking back at Figure 6.2, the movies-actors table is an intermediary table, which turns the many-to-many relationship between movies and actors into two one-to-many relationships. You can often tell a table is acting as an intermediary when all of its columns are keys. In fact, in this table, no ID column would be required, as the primary key could be the combination of the movie ID and the actor ID.
■ A properly normalized database should never have duplicate rows in the same table (two or more rows in which the values in every non–primary key column match).
■ To simplify how you conceive of the normalization process, remember that 1NF is a matter of inspecting a table horizontally, and 2NF is a vertical analysis (hunting for repeating values over multiple rows).

Third Normal Form

A database is in Third Normal Form (3NF) if it is in 2NF and every non-key column is mutually independent. If you followed the normalization process properly to this point, you may not have 3NF issues. You would know that you have a 3NF violation if changing the value in one column would require changing the value in another.
In the forum example (see Figure 6.5), there aren’t any 3NF problems, but I’ll explain a hypothetical situation where this rule would come into play.

Take, as a common example, a single table that stores the information for a business’ clients: first name, last name, phone number, street address, city, state, zip code, and so on. Such a table would not be 3NF compliant because many of the columns would be interdependent: the street would actually be dependent upon the city; the city would be dependent upon the state; and the zip code would be an issue, too. These values are subservient to each other, not to the person whose record it is. To normalize this database, you would have to create one table for the states, another for the cities (with a foreign key linking to the states table), and another for the zip codes. All of these would then be linked back to the clients table.

If you feel that all that may be overkill, you are correct. To be frank, this higher level of normalization is often unnecessary. The point is that you should strive to normalize your databases but that sometimes you’ll make concessions to keep things simple (see the sidebar “Overruling Normalization”). The needs of your application and the particulars of your database will help dictate just how far into the normalization process you should go.

As I said, the forum example is fine as is, but I’ll outline the 3NF steps just the same, showing how to fix the clients example just mentioned.

To make a database 3NF compliant:

1. Identify any fields in any tables that are interdependent.
As I just stated, what you look for are columns that depend more upon each other (like city and state) than they do on the record as a whole. In the forum database, this isn’t an issue. Just looking at the messages table, each subject will be specific to a message ID, each body will be specific to that message ID, and so forth.

2. Create new tables accordingly.
If you found any problematic columns in Step 1, like city and state in a clients example, you would create separate cities and states tables.

3. Assign or create new primary keys.
Every table must have a primary key, so add city ID and state ID to the new tables.

4. Create the requisite foreign keys that link any of the relationships (Figure 6.6).
Finally, add a state ID to the cities table and a city ID to the clients table. This effectively links each client’s record to the city and state in which they live.

Figure 6.6 Going with a minimal version of a hypothetical clients database, two new tables are created for storing the city and state values.

✔ Tips
■ As a general rule, I would probably not normalize the clients example to this extent. If I left the city and state fields in the clients table, the worst thing that would happen is that a city would change its name and this fact would need to be updated for all of the users living in that city. But cities changing their names is not a common occurrence.
■ Despite there being these set rules for how to normalize a database, two different people could normalize the same example in slightly different ways. Database design does allow for personal preference and interpretations. The important thing is that a database has no clear and obvious NF violations. Any of those will likely lead to problems down the road.

Overruling Normalization

As much as ensuring that a database is in 3NF will help guarantee reliability and viability, you won’t fully normalize every database with which you work. Before undermining the proper methods, though, understand that doing so may have devastating long-term consequences.

The two primary reasons to overrule normalization are convenience and performance. Fewer tables are easier to manipulate and comprehend than more.
Further, because of their more intricate nature, normalized databases will most likely be slower for updating, retrieving data from, and modifying. Normalization, in short, is a trade-off between data integrity/scalability and simplicity/speed. On the other hand, there are ways to improve your database’s performance but few to remedy corrupted data that can result from poor design.

Practice and experience will teach you how best to model your database, but do try to err on the side of abiding by the normal forms, particularly as you are still mastering the concept.

Creating the database

There are three final steps in designing the database:
1. Double-checking that all the requisite information is being stored.
2. Identifying the column types.
3. Naming all database elements.

Table 6.6 shows the final database design. One column has been added to those shown in Figure 6.5. Because one message might be a reply to another, some method of indicating that relationship is required. The solution is to add a parent_id column to messages. If a message is a reply, its parent_id value will be the message_id of the original message (so message_id is acting as a foreign key in this same table). If a message has a parent_id of 0, then it’s a new thread, not a reply.

If you make any changes to the tables, you must run through the normal forms one more time to ensure that the database is still normalized.

In terms of choosing the column types and naming the tables and columns, this is covered in Chapter 4.

Once the schema is fully developed, it can be created in MySQL, using the commands shown in Chapter 5, “Introduction to SQL.”

To create the database:

1. Access MySQL using whatever client you prefer.
Like the preceding chapter, this one will also use the mysql client for all of its examples. You are welcome to use phpMyAdmin or other tools as the interface to MySQL.

2. Create the forum database (Figure 6.7).
CREATE DATABASE forum;
USE forum;
Depending upon your setup, you may not be allowed to create your own databases. If not, just use the provided database and add the following tables to it.

Figure 6.7 The first steps are to create and select the database.

The forum Database with Types
Column Name   Table     Column Type
forum_id      forums    TINYINT
name          forums    VARCHAR(60)
message_id    messages  INT
forum_id      messages  TINYINT
parent_id     messages  INT
user_id       messages  MEDIUMINT
subject       messages  VARCHAR(100)
body          messages  LONGTEXT
date_entered  messages  TIMESTAMP
user_id       users     MEDIUMINT
username      users     VARCHAR(30)
pass          users     CHAR(40)
first_name    users     VARCHAR(20)
last_name     users     VARCHAR(40)
email         users     VARCHAR(80)

Table 6.6 The final plan for the forum database. Note that every integer column is UNSIGNED, the three primary key columns are also designated as AUTO_INCREMENT, and every column is set as NOT NULL.

3. Create the forums table (Figure 6.8).
CREATE TABLE forums (
forum_id TINYINT UNSIGNED NOT NULL AUTO_INCREMENT,
name VARCHAR(60) NOT NULL,
PRIMARY KEY (forum_id)
);
It does not matter in what order you create your tables, but I’ll make the forums table first. Remember that you can enter your SQL queries over multiple lines for convenience.
This table only contains two columns (which will happen frequently in a normalized database). Because I don’t expect there to be a lot of forums, the primary key is a really small type (TINYINT). If you wanted to add descriptions of each forum, a VARCHAR(255) column could be added to this table.

4. Create the messages table (Figure 6.9).
CREATE TABLE messages (
message_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
forum_id TINYINT UNSIGNED NOT NULL,
parent_id INT UNSIGNED NOT NULL,
user_id MEDIUMINT UNSIGNED NOT NULL,
subject VARCHAR(100) NOT NULL,
body LONGTEXT NOT NULL,
date_entered TIMESTAMP NOT NULL,
PRIMARY KEY (message_id)
);
The primary key for this table has to be big, as it could have lots and lots of records. The three foreign key columns (forum_id, parent_id, and user_id) will all be the same size and type as their primary key counterparts. The subject is limited to 100 characters and the body of each message can be a lot of text. The date_entered field is a TIMESTAMP type. It will store both the date and the time that a record is added, and will automatically be set to the current date and time when the record is inserted (this is how TIMESTAMP behaves).

Figure 6.8 Creating the first table.
Figure 6.9 Creating the second table.

5. Create the users table (Figure 6.10).
CREATE TABLE users (
user_id MEDIUMINT UNSIGNED NOT NULL AUTO_INCREMENT,
username VARCHAR(30) NOT NULL,
pass CHAR(40) NOT NULL,
first_name VARCHAR(20) NOT NULL,
last_name VARCHAR(40) NOT NULL,
email VARCHAR(80) NOT NULL,
PRIMARY KEY (user_id)
);
Most of the columns here mimic those in the sitename database’s users table, created in the preceding two chapters. The pass column is defined as CHAR(40), because the SHA1() function will be used and it always returns a string 40 characters long (see Chapter 5).

6. If desired, confirm the database’s structure (Figure 6.11).
SHOW TABLES;
SHOW COLUMNS FROM forums;
SHOW COLUMNS FROM messages;
SHOW COLUMNS FROM users;
This step is optional because MySQL reports on the success of each query as it is entered. But it’s always nice to remind yourself of a database’s structure.
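As a point of comparison with the normalization discussion, the four-table cinema design from Figure 6.2 could be created with the same CREATE TABLE syntax. This is only a sketch: the table names, column names, types, and sizes below are assumptions made for illustration, not definitions taken from the figure.

```sql
-- Hypothetical cinema schema modeled on Figure 6.2.
-- Every name, type, and size here is an assumption for illustration.
CREATE TABLE directors (
director_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
first_name VARCHAR(20) NOT NULL,
last_name VARCHAR(40) NOT NULL,
PRIMARY KEY (director_id)
);

CREATE TABLE movies (
movie_id MEDIUMINT UNSIGNED NOT NULL AUTO_INCREMENT,
title VARCHAR(100) NOT NULL,
year_released SMALLINT UNSIGNED NOT NULL,
-- Foreign key column: same type as directors.director_id.
director_id SMALLINT UNSIGNED NOT NULL,
PRIMARY KEY (movie_id)
);

CREATE TABLE actors (
actor_id MEDIUMINT UNSIGNED NOT NULL AUTO_INCREMENT,
first_name VARCHAR(20) NOT NULL,
last_name VARCHAR(40) NOT NULL,
PRIMARY KEY (actor_id)
);

-- Intermediary table: one row per actor per movie.
CREATE TABLE movies_actors (
movie_id MEDIUMINT UNSIGNED NOT NULL,
actor_id MEDIUMINT UNSIGNED NOT NULL,
PRIMARY KEY (movie_id, actor_id)
);
```

Because movies_actors only pairs keys, the combination of movie_id and actor_id can serve as its primary key, which also prevents the same actor from being attached to the same movie twice.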
✔ Tip
■ When you have a primary key–foreign key link (like forum_id in forums to forum_id in messages), both columns should be of the same type (in this case, TINYINT UNSIGNED NOT NULL).

Figure 6.10 The database’s third and final table.
Figure 6.11 Check the structure of any database or table using SHOW.

Populating the database

In Chapter 15, a Web-based interface to the message board will be written in PHP. That interface will be the standard way to populate the database (i.e., register users and post messages). But there’s still a lot to learn to get to that point, so the database has to be populated using a MySQL client application. You can follow these steps or download the SQL commands from the book’s corresponding Web site (www.DMCInsights.com/phpmysql3/, click Downloads).

To populate the database:

1. Add some new records to the forums table (Figure 6.12).
INSERT INTO forums (name) VALUES
('MySQL'), ('PHP'), ('Sports'),
('HTML'), ('CSS'), ('Kindling');
Since the messages table relies on values retrieved from both the forums and users tables, those two need to be populated first. With this INSERT command, only the name column must be provided a value (the table’s forum_id column will be given an automatically incremented integer by MySQL).

2. Add some records to the users table (Figure 6.13).
INSERT INTO users (username, pass, first_name, last_name, email) VALUES
('troutster', SHA1('mypass'), 'Larry', 'Ullman', 'lu@example.com'),
('funny man', SHA1('monkey'), 'David', 'Brent', 'db@example.com'),
('Gareth', SHA1('asstmgr'), 'Gareth', 'Keenan', 'gk@example.com');
If you have any questions on the INSERT syntax or use of the SHA1() function here, see Chapter 5.

Figure 6.12 Adding records to the forums table.
Figure 6.13 Adding records to the users table.

3. Add new records to the messages table (Figure 6.14).
SELECT * FROM forums;
SELECT user_id, username FROM users;
INSERT INTO messages (forum_id, parent_id, user_id, subject, body) VALUES
(1, 0, 1, 'Question about normalization.', 'I\'m confused about normalization. For the second normal form (2NF), I read...'),
(1, 0, 2, 'Database Design', 'I\'m creating a new database and am having problems with the structure. How many tables should I have?...'),
(1, 2, 1, 'Database Design', 'The number of tables your database includes...'),
(1, 3, 2, 'Database Design', 'Okay, thanks!'),
(2, 0, 3, 'PHP Errors', 'I\'m using the scripts from Chapter 3 and I can\'t get the first calculator example to work. When I submit the form...');
Because two of the fields in the messages table (forum_id and user_id) relate to values in other tables, you need to know those values before inserting new records into this table. For example, when the troutster user creates a new message in the MySQL forum, it will have a forum_id of 1 and a user_id of 1.
This is further complicated by the parent_id column, which should store the message_id to which the new message is a reply. The second message added to the database will have a message_id of 2, so replies to that message need a parent_id of 2.
With your PHP scripts (once you’ve created an interface for this database), this process will be much easier, but it’s important to comprehend the theory in SQL terms first.
You should also notice here that you don’t need to enter a value for the date_entered field. MySQL will automatically insert the current date and time for this TIMESTAMP column.

4. Repeat Steps 1 through 3 to populate the database.
The rest of the examples in this chapter will use the populated database. You’ll probably want to download the SQL commands from the book’s corresponding Web site, although you can populate the tables with your own examples and then just change the queries in the rest of the chapter accordingly.
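When adding a thread and its first reply by hand, one way to avoid looking up the new message_id is MySQL’s LAST_INSERT_ID() function, which returns the AUTO_INCREMENT value generated by the most recent INSERT on the current connection. The following is a minimal sketch, not part of the book’s sample data: the subject and body text are invented, and the forum_id and user_id values assume the records were added in the order shown in Steps 1 and 2 (so Sports is forum 3, funny man is user 2, and Gareth is user 3).

```sql
-- Start a new thread (parent_id of 0) in the Sports forum.
-- The message text is made up for illustration.
INSERT INTO messages (forum_id, parent_id, user_id, subject, body)
VALUES (3, 0, 2, 'Weekend match', 'Did anyone watch the match this weekend?');

-- Reply to that thread: LAST_INSERT_ID() supplies the message_id
-- that the previous INSERT just generated.
INSERT INTO messages (forum_id, parent_id, user_id, subject, body)
VALUES (3, LAST_INSERT_ID(), 3, 'Weekend match', 'I did, but I missed the first half.');
```

Both statements need to run on the same connection, one right after the other, because LAST_INSERT_ID() is tracked per connection and changes with each new AUTO_INCREMENT insert.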
Figure 6.14 Normalized databases will often require you to know values from one table in order to enter records into another. Populating the messages table requires knowing foreign key values from users and forums.

Performing Joins

Because relational databases are more complexly structured, they sometimes require special query statements to retrieve the information you need most. For example, if you wanted to know what messages are in the kindling forum, you would need to first find the forum_id for kindling, and then use that number to retrieve all the records from the messages table that have that forum_id. This one simple (and, in a forum, often necessary) task would require two separate queries. By using a join, you can accomplish all of that in one fell swoop.

A join is an SQL query that uses two or more tables, and produces a virtual table of results. The two main types of joins are inner and outer (there are subtypes within both).

An inner join returns all of the records from the named tables wherever a match is made. For example, to find every message in the kindling forum, the inner join would be written as (Figure 6.15)
SELECT * FROM messages
INNER JOIN forums
ON messages.forum_id = forums.forum_id
WHERE forums.name = 'kindling'
This join is selecting every column from both tables under two conditions. First, the forums.name column must have a value of kindling (this will return the forum_id of 6). Second, the forum_id value in the forums table must match the forum_id value in the messages table. Because of the equality comparison being made across both tables (messages.forum_id = forums.forum_id), this is known as an equijoin.
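For comparison, the two-query approach that the join replaces might look like the following. The forum_id of 6 is an assumption that holds only if the forums were inserted in the order shown earlier.

```sql
-- Query 1: find the forum's primary key.
SELECT forum_id FROM forums WHERE name = 'kindling';

-- Query 2: use the value returned by Query 1 (assumed here to be 6).
SELECT * FROM messages WHERE forum_id = 6;
```

The join performs the same lookup in a single statement and does not require the value to be carried by hand from one query to the next.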
Inner joins can also be written without formally using the term INNER JOIN:
SELECT * FROM messages, forums
WHERE messages.forum_id = forums.forum_id
AND forums.name = 'kindling'
When selecting from multiple tables, you must use the dot syntax (table.column) if the tables named in the query have columns with the same name. This is normally the case when dealing with relational databases because a primary key from one table will have the same name as a foreign key in another. If you are not explicit when referencing your columns, you’ll get an error (Figure 6.16).

Figure 6.15 This join returns every column from both tables where the forum_id values represent the kindling forum (6).

Figure 6.16 Generically referring to a column name present in multiple tables will cause an ambiguity error. In this query, referring to just name instead of forums.name would be fine, but it’s still best to be precise.

An outer join differs from an inner join in that an outer join could return records not matched by a conditional. There are three outer join subtypes: left, right, and full. An example of a left join is
SELECT * FROM forums
LEFT JOIN messages
ON forums.forum_id = messages.forum_id
The most important consideration with left joins is which table gets named first. In this example, all of the forums records will be returned along with all of the messages information, if a match is made. If no messages records match a forums row, then NULL values will be returned instead (Figure 6.17).

In both inner and outer joins, if the column in both tables being used in the equality comparison has the same name, you can simplify your query with USING:
SELECT * FROM messages
INNER JOIN forums
USING (forum_id)
WHERE forums.name = 'kindling'

SELECT * FROM forums
LEFT JOIN messages
USING (forum_id)

Before running through some examples, two last notes.
First, because of the complicated syntax with joins, the SQL concept of an alias (introduced in Chapter 5) will come in handy when writing them. Second, because joins often return so much information, it’s normally best to specify exactly what columns you want returned, instead of selecting them all (Figure 6.17, in its uncropped form, couldn’t even fit within my 22" monitor’s screen!).

Figure 6.17 An outer join returns more records than an inner join because all of the first table’s records will be returned. This join returns every forum name, even if there are no messages in a forum (like Modern Dance at bottom). Also, to make it legible, I’ve cropped this image, omitting the body and date_entered columns from the result.

To use joins:

1. Retrieve the forum name and message subject for every record in the messages table (Figure 6.18).
SELECT f.name, m.subject
FROM forums AS f
INNER JOIN messages AS m
USING (forum_id)
ORDER BY f.name;
This query, which contains an inner join, will effectively replace the forum_id value in the messages table with the corresponding name value from the forums table for each of the records in the messages table. The end result is that it displays the textual version of the forum name for each message subject.
Notice that you can still use ORDER BY clauses in joins.

2. Retrieve the subject and date entered value for every message posted by the user funny man (Figure 6.19).
SELECT m.subject, DATE_FORMAT(m.date_entered, '%M %D, %Y') AS Date
FROM users AS u
INNER JOIN messages AS m
USING (user_id)
WHERE u.username = 'funny man';
This join also uses two tables, users and messages. The linking column for the two tables is user_id, so that’s placed in the USING clause. The WHERE conditional identifies the user being targeted, and the DATE_FORMAT() function will help format the date_entered value.
Figure 6.18 A basic inner join that returns only two columns of values.
Figure 6.19 A slightly more complicated version of an inner join, using the users and messages tables.

3. Retrieve the message ID, subject, and forum name for every message posted by the user troutster (Figure 6.20).
SELECT m.message_id, m.subject, f.name
FROM users AS u
INNER JOIN messages AS m
USING (user_id)
INNER JOIN forums AS f
USING (forum_id)
WHERE u.username = 'troutster';
This join is similar to the one in Step 2, but takes things a step further by incorporating a third table. Take note of how a three-table inner join is written and how the aliases are used for shorthand when referring to the three tables and their columns.

Figure 6.20 An inner join across all three tables.

4. Retrieve the username, message subject, and forum name for every user (Figure 6.21).
SELECT u.username, m.subject, f.name
FROM users AS u
LEFT JOIN messages AS m
USING (user_id)
LEFT JOIN forums AS f
USING (forum_id);
If you were to run an inner join similar to this, a user who had not yet posted a message would not be listed (Figure 6.22). So an outer join is required to be inclusive of all users. Note that the fully included table (here, users) must be the first table listed in a left join.

Figure 6.21 This left join returns, for every user, every posted message subject and every forum name. If a user hasn’t posted a message (like finchy at the bottom), their subject and forum name values will be NULL.

✔ Tips
■ You can even join a table with itself (a self-join)!
■ Joins can be created using conditionals involving any columns, not just the primary and foreign keys, although that’s most common.
■ You can perform joins across multiple databases using the database.table.column syntax, as long as every database is on the same server (you cannot do this across a network) and you’re connected as a user with permission to access every database involved.
■ Joins that include neither a join condition nor a WHERE clause (e.g., SELECT * FROM urls, url_associations) are called full joins here; they are more commonly known as cross joins or Cartesian products. They pair every record in one table with every record in the other, so this construct can have unwieldy results with larger tables.
■ A NULL value in a column referenced in a join condition will never produce a match, because NULL matches no other value, including NULL.

Figure 6.22 This inner join will not return any users who haven’t yet posted messages (see finchy at the bottom of Figure 6.21).

Grouping Selected Results

In the preceding chapter, two different clauses, ORDER BY and LIMIT, were introduced as ways of affecting the returned results. The former dictates the order in which the selected rows are returned; the latter dictates which of the selected rows are actually returned. This next clause, GROUP BY, is different in that it works by grouping the returned data into similar blocks of information. For example, to group all of the messages by forum, you would use
SELECT * FROM messages GROUP BY forum_id
The returned data is altered in that you’ve now aggregated the information instead of returning just the specific itemized records. So where you might have lots of messages in each forum, the GROUP BY would return all those messages as one row. That particular example is not particularly useful, but it demonstrates the concept.

You will often use one of several aggregate functions either with a GROUP BY clause or without. Table 6.7 lists these.

You can apply combinations of WHERE, ORDER BY, and LIMIT conditions to a GROUP BY, normally structuring your query like this:
SELECT what_columns FROM table
WHERE condition GROUP BY column
ORDER BY column LIMIT x, y

To group data:

1.
Count the number of registered users (Figure 6.23). SELECT COUNT(user_id) FROM users; COUNT() is perhaps the most popular group- ing function. With it, you can quickly count records, like the number of records in the users table here. Notice that not all queries using the aggregate functions necessarily have GROUP BY clauses. 178 Chapter 6 Grouping Selected Results Figure 6.23 This grouping query counts the number of user_id values in the users table. Function Returns AVG() The average of the values in the column. COUNT() The number of values in a column. GROUP_CONCAT() The concatenation of a column’s values. MAX() The largest value in a column. MIN() The smallest value in a column. SUM() The sum of all the values in a column. Grouping Functions Table 6.7 MySQL’s grouping functions. 3. Find the top two users that have posted the most (Figure 6.25). SELECT username, COUNT(message_id) AS Number FROM users LEFT JOIN messages AS m USING (user_id) GROUP BY (m.user_id) ORDER BY Number DESC LIMIT 2; With grouping, you can order the results as you would with any other query. Assigning the value of COUNT(*) as the alias Number facilitates this process. ✔ Tips ■ NULL is a peculiar value, and it’s interest- ing to know that GROUP BY will group NULL values together, since they have the same nonvalue. ■ You have to be careful how you apply the COUNT() function, as it only counts non- NULL values. Be certain to use it on either every column (*) or on columns that will not contain NULL values (like the primary key). That being said, if the query in Step 2 and Figure 6.24 applied COUNT() to every column (*) instead of just message_id, then users who did not post would erro- neously show a COUNT(*) of 1, because the whole query returns one row for that user. ■ The GROUP BY clause, and the functions listed here, take some time to figure out, and MySQL will report an error whenever your syntax is inapplicable. 
Experiment within the mysql client to determine the exact wording of any query you might want to run from a PHP script. ■ A related clause is HAVING, which is like a WHERE condition applied to a group. 179 Advanced SQL and MySQL Grouping Selected Results Figure 6.24 This GROUP BY query counts the number of times each user has posted a message. 2. Count the number of times each user has posted a message (Figure 6.24). SELECT username, COUNT(message_id) AS Number FROM users LEFT JOIN messages AS m USING (user_id) GROUP BY (m.user_id); This query is an extension of that in Step 1, but instead of counting users, it counts the number of messages associated with each user. A join allows the query to select information from both tables. An inner join is used so that users who have not yet posted will also be represented. Figure 6.25 An ORDER BY clause is added to sort the most frequent posters by their number of listings. A LIMIT clause cuts the result down to two. Creating Indexes Indexes are a special system that databases use to improve the performance of SELECT queries. Indexes can be placed on one or more columns, of any data type, effectively telling MySQL to pay particular attention to those values. MySQL allows for a minimum of 16 indexes for each table, and each index can incorpo- rate up to 15 columns. While a multicolumn index may not seem obvious, it will come in handy for searches frequently performed on the same combinations of columns (e.g., first and last name, city and state, etc.). Although indexes are an integral part of any table, not everything needs to be indexed. While an index does improve the speed of reading from databases, it slows down queries that alter data in a database (because the changes need to be recorded in the index). 
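As a rough sketch of the multicolumn case mentioned above, the following defines one index on a pair of name columns. The customers table, its columns, and the index name full_name are all hypothetical, made up for this illustration; they are not part of the forum database.

```sql
-- Hypothetical table for illustration only (not part of the forum database).
CREATE TABLE customers (
    customer_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    first_name VARCHAR(20) NOT NULL,
    last_name VARCHAR(40) NOT NULL,
    city VARCHAR(40) NOT NULL,
    PRIMARY KEY (customer_id),
    -- One index covering two columns that are searched together.
    INDEX full_name (last_name, first_name)
);

-- A write: the new row must be recorded in the table and in both indexes.
INSERT INTO customers (first_name, last_name, city)
VALUES ('Jane', 'Smith', 'Berkeley');

-- A read on the indexed combination of columns.
SELECT customer_id FROM customers
WHERE last_name = 'Smith' AND first_name = 'Jane';
```

The SELECT is the kind of query such an index is meant to help, while the INSERT has to update the index as well as the table, which is the read-versus-write trade-off described above.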
Indexes are best used on columns

◆ That are frequently used in the WHERE part of a query
◆ That are frequently used in an ORDER BY part of a query
◆ That are frequently used as the focal point of a join
◆ That have many different values (columns with numerous repeating values ought not to be indexed)

MySQL has four types of indexes: INDEX (the standard), UNIQUE (which requires each row to have a unique value for that column), FULLTEXT (for performing FULLTEXT searches, discussed later in this chapter), and PRIMARY KEY (which is just a particular UNIQUE index and one you’ve already been using). Note that a column should only ever have a single index on it, so choose the index type that’s most appropriate.

With this in mind, let’s modify the forum database tables by adding indexes to them. Table 6.8 lists the indexes to be applied to each column.

Table 6.8 The indexes to be used in the forum database. Not every column will be indexed, and there are two indexes created on a pair of columns: users.pass plus users.username and messages.body plus messages.subject.

    The forum Database with Indexes
    Column Name      Table       Index Type
    forum_id         forums      PRIMARY
    name             forums      UNIQUE
    message_id       messages    PRIMARY
    forum_id         messages    INDEX
    parent_id        messages    INDEX
    user_id          messages    INDEX
    body/subject     messages    FULLTEXT
    date_entered     messages    INDEX
    user_id          users       PRIMARY
    username         users       UNIQUE
    pass/username    users       INDEX
    email            users       UNIQUE

Adding indexes to existing tables requires use of the ALTER command, as described in the sidebar.

Altering Tables

The ALTER SQL term is primarily used to modify the structure of an existing table. Commonly this means adding, deleting, or changing the columns therein, but it also includes the addition of indexes. An ALTER statement can even be used for renaming the table as a whole. While proper database design should give you the structure you need, in the real world, making alterations is commonplace.
The basic syntax of ALTER is

    ALTER TABLE tablename CLAUSE

There are many possible clauses; Table 6.9 lists the most common ones. As always, the MySQL manual covers the topic in exhaustive detail.

Table 6.9 Common variants on the ALTER command (where t represents the table’s name, c a column’s name, and i an index’s name). See the MySQL manual for the full specifications.

    ALTER TABLE Clauses
    Clause           Usage                                    Meaning
    ADD COLUMN       ALTER TABLE t ADD COLUMN c TYPE          Adds a new column to the end of the table.
    CHANGE COLUMN    ALTER TABLE t CHANGE COLUMN c c TYPE     Allows you to change the data type and properties of a column.
    DROP COLUMN      ALTER TABLE t DROP COLUMN c              Removes a column from a table, including all of its data.
    ADD INDEX        ALTER TABLE t ADD INDEX i (c)            Adds a new index on c.
    DROP INDEX       ALTER TABLE t DROP INDEX i               Removes an existing index.
    RENAME AS        ALTER TABLE t RENAME AS new_t            Changes the name of a table.

To add an index to an existing table:

1. Add an index on the name column in the forums table (Figure 6.26).

    ALTER TABLE forums ADD UNIQUE(name);

The forums table already has a primary key index on the forum_id. Since the name may also be a frequently referenced field and since its value should be unique for every row, add a UNIQUE index to the table.

Figure 6.26 A unique index is placed on the name column. This will improve the efficiency of certain queries and protect against redundant entries.

2. Add indexes to the messages table (Figure 6.27).

    ALTER TABLE messages
    ADD INDEX(forum_id),
    ADD INDEX(parent_id),
    ADD INDEX(user_id),
    ADD FULLTEXT(body, subject),
    ADD INDEX(date_entered);

This table contains the most indexes, because it’s the most important table and has three foreign keys (forum_id, parent_id, and user_id), all of which should be indexed. The body and subject columns get a FULLTEXT index, to be used in FULLTEXT searches later in this chapter.
The date_entered column is indexed, as it will be used in ORDER BY clauses (to sort messages by date).

If you get an error message that the table type doesn’t support FULLTEXT indexes (Figure 6.28), omit that one line from this query and then see the next section of the chapter for how to change a table’s type.

Figure 6.27 Several indexes are added to the messages table. MySQL will report on the success of the alteration and how many rows were affected (which should be every row in the table).

Figure 6.28 FULLTEXT indexes cannot be used on all table types. If you see this error message, read “Using Different Table Types” in this chapter for the solution.

3. Add indexes to the users table (Figure 6.29).

    ALTER TABLE users
    ADD UNIQUE(username),
    ADD INDEX(pass, username),
    ADD UNIQUE(email);

The users table has two UNIQUE indexes and one multicolumn index. UNIQUE indexes are important here because you don’t want two people trying to register with the same username (which, among other things, would make it impossible to log in), nor do you want the same user registering multiple times with the same email address.

The index on the combination of the password and username columns will improve the efficiency of login queries, when the combination of those two columns will be used in a WHERE conditional.

4. View the current structure of each table (Figure 6.30).

    DESCRIBE forums;
    DESCRIBE messages;
    DESCRIBE users;

The DESCRIBE SQL term will tell you information about a table’s column names and order, column types, and index types (under Key). It also indicates whether or not a field can be NULL, what default value has been set (if any), and more.

Figure 6.29 The requisite indexes are added to the third and final table.

Figure 6.30 To view the details of a table’s structure, use DESCRIBE. The Key column indicates the indexes.
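To look at the indexes themselves, rather than at the column definitions, MySQL also offers the SHOW INDEX command. This is a minimal sketch, assuming the indexes from Steps 1 through 3 have already been added to the forum database:

```sql
-- One row is returned per indexed column, per index.
SHOW INDEX FROM messages;

-- The multicolumn index on (pass, username) appears as two rows
-- that share the same Key_name, numbered by Seq_in_index.
SHOW INDEX FROM users;
```

In the results, Key_name is the index’s name, Column_name is the column it covers, and Non_unique is 0 for PRIMARY and UNIQUE indexes. As with SHOW TABLE STATUS, the output is wide, so it may wrap over several lines in the mysql client.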
✔ Tips

■ You’ll get an error and the index will not be created if you attempt to add a UNIQUE index to a column that has duplicate values.

■ Indexes can be named when they are created:

    ALTER TABLE tablename ADD INDEX indexname (columnname)

If no name is provided, the index will take the name of the column to which it is applied.

■ The word COLUMN in most ALTER statements is optional.

■ Suppose you define an index on multiple columns, like this:

    ALTER TABLE tablename ADD INDEX (col1, col2, col3)

This effectively creates an index for searches on col1, on col1 and col2 together, or on all three columns together. It does not provide an index for searching just col2 or col3 or those two together.

Using Different Table Types

The MySQL database application supports several different types of tables (a table’s type is also called its storage engine). Each table type supports different features, has its own limits (in terms of how much data it can store), and even performs better or worse under certain situations. Still, how you interact with any table type, in terms of running queries, is generally consistent across them all.

The most important table type is MyISAM, which is the default table type on all operating systems except for Windows. MyISAM tables are great for most applications, handling SELECTs and INSERTs very quickly. The MyISAM storage engine cannot handle transactions, though, which is its main drawback.

After MyISAM, the next most common storage engine is InnoDB, which is also the default table type for Windows installations of MySQL. InnoDB tables can be used for transactions and perform UPDATEs nicely. But the InnoDB storage engine is generally slower than MyISAM and requires more disk space on the server. Also, an InnoDB table does not support FULLTEXT indexes (which is why, if you’re running Windows, you might have seen the error message in Figure 6.28).
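To check which storage engines a particular server actually offers, and which one it treats as the default, you can ask MySQL directly. A minimal sketch; the list returned will vary with the MySQL version and how it was installed:

```sql
-- Lists each storage engine the server knows about.
-- The Support column reads YES, NO, DISABLED, or DEFAULT.
SHOW ENGINES;
```

The engine marked DEFAULT in the Support column is the one MySQL will use for any new table created without an explicit engine.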
To specify the storage engine when you define a table, add a clause to the end of the creation statement:

    CREATE TABLE tablename (
    column1name COLUMNTYPE,
    column2name COLUMNTYPE…
    ) ENGINE = INNODB

If you don’t specify a storage engine when creating tables, MySQL will use the default type for that MySQL server.

To change the type of an existing table (which is perfectly acceptable), use an ALTER command:

    ALTER TABLE tablename ENGINE = MYISAM

Because the next example in this chapter will require a MyISAM table, let’s run through the steps necessary for making sure that the messages table is the correct type. The first couple of steps will show you how to see the current storage engine being used (as you may not need to change the messages table’s type).

To change a table’s type:

1. View the current table information (Figure 6.31).

    SHOW TABLE STATUS;

The SHOW TABLE STATUS command returns all sorts of useful information about a database’s tables. The returned result will be hard to read, though, as it is a wide table displayed over multiple lines. What you’re looking for is this: The first item on each row is the table’s name, and the second item is the table’s engine, or table type. The engine will most likely be either MyISAM (Figure 6.31) or InnoDB (Figure 6.32).

2. If necessary, change the messages table to MyISAM (Figure 6.33).

    ALTER TABLE messages ENGINE=MYISAM;

If the results in Step 1 (Figures 6.31 and 6.32) indicate that the engine is anything other than MyISAM, you’ll need to change it over to MyISAM using this command (capitalization doesn’t matter). For me, using the default MySQL installation and configuration, changing the table’s type wasn’t necessary on Mac OS X but was on Windows.

Figure 6.31 Before altering a table’s type, view its current type with the SHOW TABLE STATUS command. This is a cropped version of the results using MySQL on Mac OS X.
Figure 6.32 The SHOW TABLE STATUS query (using MySQL on Windows) shows that all three tables are, in fact, InnoDB, not MyISAM.

3. If desired, confirm the engine change by rerunning the SHOW TABLE STATUS command.

Figure 6.33 Successfully changing a table’s type (or storage engine) using an ALTER command.

✔ Tips

■ To make any query’s results easier to view in the mysql client, you can add the \G parameter (Figure 6.34):

    SHOW TABLE STATUS \G

This flag states that the table of results should be displayed vertically instead of horizontally. Notice that you don’t need to use a terminating semicolon now, because the \G ends the command.

■ The same database can have tables of different types. This may be true for your forum database now (depending upon your default table type). You may also see this with an e-commerce database that uses MyISAM for customers and products but InnoDB for orders (to allow for transactions).

Figure 6.34 For a more legible version of a query’s results, add the \G option in the mysql client.

Performing FULLTEXT Searches

In Chapter 5, the LIKE keyword was introduced as a way to perform somewhat simple string matches like

    SELECT * FROM users WHERE last_name LIKE 'Smith%'

This type of conditional is effective enough but is still very limiting. For example, it would not allow you to do Google-like searches using multiple words. For those kinds of situations, you need FULLTEXT searches.

FULLTEXT searches require a FULLTEXT index, which itself requires a MyISAM table. These next examples will use the messages table in the forum database. If your messages table is not of the MyISAM type and/or does not have a FULLTEXT index on the body and subject columns, follow the steps in the previous few pages to make that change before proceeding.

✔ Tips

■ Inserting records into tables with FULLTEXT indexes can be much slower because of the complex index that’s required.
■ You can add FULLTEXT indexes on multiple columns, if those columns will all be used in searches.

■ FULLTEXT searches can successfully be used in a simple search engine. But a FULLTEXT index can only be applied to a single table at a time, so more elaborate Web sites, with content stored in multiple tables, would benefit from using more formal search engines.

Performing Basic FULLTEXT Searches

Once you’ve established a FULLTEXT index on a column or columns, you can start querying against it, using MATCH…AGAINST in a WHERE conditional:

    SELECT * FROM tablename WHERE MATCH (columns) AGAINST (terms)

MySQL will return matching rows in order of a mathematically calculated relevance, just like a search engine. When doing so, certain rules apply:

◆ Strings are broken down into their individual keywords.
◆ Keywords fewer than four characters long are ignored.
◆ Very popular words, called stopwords, are ignored.
◆ If more than fifty percent of the records match the keywords, no records are returned.

This last fact is problematic to many users as they begin with FULLTEXT searches and wonder why no results are retrieved. When you have a sparsely populated table, there just won’t be sufficient records for MySQL to return relevant results.

To perform FULLTEXT searches:

1. Thoroughly populate the messages table, focusing on adding lengthy bodies.

Once again, SQL INSERT commands can be downloaded from this book’s corresponding Web site.

2. Run a simple FULLTEXT search on the word database (Figure 6.35).

    SELECT subject, body FROM messages
    WHERE MATCH (body, subject)
    AGAINST('database');

This is a very simple example that will return some results as long as at least one and less than fifty percent of the records in the messages table have the word “database” in their body or subject. Note that the columns referenced in MATCH must be the same as those on which the FULLTEXT index was made. In this case, you could use either body, subject or subject, body, but you could not use just body or just subject (Figure 6.36).

Figure 6.35 A basic FULLTEXT search.

Figure 6.36 A FULLTEXT query can only be run on the same column or combination of columns that the FULLTEXT index was created on. With this query, even though the combination of body and subject has a FULLTEXT index, attempting to run the match on just subject will fail.

3. Run the same FULLTEXT search while also showing the relevance (Figure 6.37).

    SELECT subject, body, MATCH (body, subject)
    AGAINST('database') AS R
    FROM messages WHERE MATCH (body, subject)
    AGAINST('database');

If you use the same MATCH…AGAINST expression as a selected value, the actual relevance will be returned.

4. Run a FULLTEXT search using multiple keywords (Figure 6.38).

    SELECT subject, body FROM messages
    WHERE MATCH (body, subject)
    AGAINST('html xhtml');

With this query, a match will be made if the subject or body contains either word. Any record that contains both words will be ranked higher.

✔ Tips

■ Remember that if a FULLTEXT search returns no records, this means that either no matches were made or that over half of the records match.

■ For sake of simplicity, all of the queries in this section are simple SELECT statements. You can certainly use FULLTEXT searches within joins or more complex queries.

■ MySQL comes with several hundred stopwords already defined. These are part of the application’s source code.

■ The minimum keyword length (four characters by default) is a configuration setting you can change in MySQL.

■ FULLTEXT searches are case-insensitive by default.

Figure 6.37 The relevance of a FULLTEXT search can be selected, too.
In this case, you’ll see that the two records with the word “database” in both the subject and body have higher relevance than the record that contains the word in just the subject.

Figure 6.38 Using the FULLTEXT search, you can easily find messages that contain multiple keywords.

Performing Boolean FULLTEXT Searches

The basic FULLTEXT search is nice, but a more sophisticated FULLTEXT search can be accomplished using its Boolean mode. To do so, add the phrase IN BOOLEAN MODE to the AGAINST clause:

    SELECT * FROM tablename WHERE
    MATCH(columns) AGAINST('terms' IN BOOLEAN MODE)

Boolean mode has a number of operators (Table 6.10) to tweak how each keyword is treated:

    SELECT * FROM tablename WHERE
    MATCH(columns) AGAINST('+database -mysql' IN BOOLEAN MODE)

In that example, a match will be made if the word database is found and mysql is not present. Alternatively, the tilde (~) is used as a milder form of the minus sign, meaning that the keyword can be present in a match, but such matches should be considered less relevant.

The wildcard character (*) matches variations on a word, so cata* matches catalog, catalina, and so on. Two operators explicitly state what keywords are more (>) or less (<) important. Finally, you can use double quotation marks to hunt for exact phrases and parentheses to make subexpressions.

The following query would look for records with the phrase Web develop with the word html being required and the word JavaScript detracting from a match’s relevance:

    SELECT * FROM tablename WHERE
    MATCH(columns) AGAINST('>"Web develop" +html ~JavaScript' IN BOOLEAN MODE)

When using Boolean mode, there are several differences as to how FULLTEXT searches work:

◆ If a keyword is not preceded by an operator, the word is optional but a match will be ranked higher if it is present.
◆ Results will be returned even if more than fifty percent of the records match the search.
◆ The results are not automatically sorted by relevance.
Because of this last fact, you’ll also want to sort the returned records by their relevance, as demonstrated in the next sequence of steps. One important rule that’s the same with Boolean searches is that the minimum word length (four characters by default) still applies. So trying to require a shorter word using a plus sign (+php) still won’t work.

Table 6.10 Use these operators to fine-tune your FULLTEXT searches.

    Boolean Mode Operators
    Operator    Meaning
    +           Must be present in every match
    -           Must not be present in any match
    ~           Lowers a ranking if present
    *           Wildcard
    <           Decrease a word’s importance
    >           Increase a word’s importance
    ""          Must match the exact phrase
    ()          Create subexpressions

To perform FULLTEXT Boolean searches:

1. Run a simple FULLTEXT search that finds HTML, XHTML, or (X)HTML (Figure 6.39).

    SELECT subject, body FROM messages
    WHERE MATCH(body, subject)
    AGAINST('*HTML' IN BOOLEAN MODE)\G

The term HTML may appear in messages in many formats, including HTML, XHTML, or (X)HTML. This Boolean mode query will find all of those, thanks to the wildcard character (*).

To make the results easier to view, I’m using the \G trick mentioned earlier in the chapter, which tells the mysql client to return the results vertically, not horizontally.

2. Find matches involving databases, with an emphasis on normal forms (Figure 6.40).

    SELECT subject, body FROM messages
    WHERE MATCH (body, subject)
    AGAINST('>"normal form"* +database*' IN BOOLEAN MODE)\G

This query first finds all records that have database, databases, etc. and normal form, normal forms, etc. in them. The database* term is required (as indicated by the plus sign), but emphasis is given to the normal form clause (which is preceded by the greater-than sign).

Figure 6.39 A simple Boolean-mode FULLTEXT search.

Figure 6.40 This search looks for variations on two different keywords, ranking the one higher than the other.
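Since Boolean mode does not sort its results by relevance, one way to restore that ordering is to select the relevance under an alias and sort on it. This is only a sketch: the search terms are illustrative, and it assumes the FULLTEXT index on body and subject from earlier in the chapter.

```sql
-- Select the relevance as R, then sort on it explicitly,
-- because Boolean mode does not order the results for you.
SELECT subject,
       MATCH (body, subject)
       AGAINST('+database* normal' IN BOOLEAN MODE) AS R
FROM messages
WHERE MATCH (body, subject)
      AGAINST('+database* normal' IN BOOLEAN MODE)
ORDER BY R DESC;
```

Here the database* term is required, while normal is optional and only raises the ranking of the records that contain it. The same MATCH…AGAINST expression appears twice, once to filter the records and once to report the relevance, just as in the basic search that showed the relevance.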
✔ Tips

■ MySQL 5.1.7 added another FULLTEXT search mode: natural language. This is the default mode, if no other mode (like Boolean) is specified.

■ The WITH QUERY EXPANSION modifier can increase the number of returned results. Such queries perform two searches and return one result set. The second search is based on terms found in the most relevant results of the initial search. While a WITH QUERY EXPANSION search can find results that would not otherwise have been returned, it can also return results that aren’t at all relevant to the original search terms.

Database Optimization

The performance of your database is primarily dependent upon its structure and indexes. When creating databases, try to

◆ Choose the best storage engine
◆ Use the smallest data type possible for each column
◆ Define columns as NOT NULL whenever possible
◆ Use integers as primary keys
◆ Judiciously define indexes, selecting the correct type and applying them to the right column or columns
◆ Limit indexes to a certain number of characters, if applicable

Along with these tips, there are two simple techniques for optimizing databases. One way to improve MySQL’s performance is to run an OPTIMIZE command on your tables. This query will rid a table of any unnecessary overhead, thereby speeding any interactions with it.

    OPTIMIZE TABLE tablename

Running this command is particularly beneficial after changing a table via an ALTER command.

To improve a query’s efficiency, it helps to understand how exactly MySQL will run that query. This can be accomplished using the EXPLAIN SQL keyword. Explaining queries is a very advanced topic, so see the MySQL manual or search the Web for more information.

Performing Transactions

A database transaction is a sequence of queries run during a single session. For example, you might insert a record into one table, insert another record into another table, and maybe run an update.
Without using transactions, each individual query takes effect immediately and cannot be undone. With transactions, you can set start and stop points and then enact or retract all of the queries as needed (for example, if one query failed, all of the queries can be undone).

Commercial interactions commonly require transactions, even something as basic as transferring $100 from my bank account to yours. What seems like a simple process is actually several steps:

◆ Confirm that I have $100 in my account.
◆ Decrease my account by $100.
◆ Increase the amount of money in your account by $100.
◆ Verify that the increase worked.

If any of the steps failed, I would want to undo all of them. For example, if the money couldn’t be deposited in your account, it should be returned to mine until the entire transaction can go through.

To perform transactions with MySQL, you must use the InnoDB table type (or storage engine). To begin a new transaction in the mysql client, type

    START TRANSACTION;

Once your transaction has begun, you can now run your queries. Once you have finished, you can either enter COMMIT to enact all of the queries or ROLLBACK to undo the effect of all of the queries.

After you have either committed or rolled back the queries, the transaction is considered complete, and MySQL returns to an autocommit mode. This means that any queries you execute take immediate effect. To start another transaction, just type START TRANSACTION.

It is important to know that certain types of queries cannot be rolled back. Specifically, those that create, alter, truncate (empty), or delete tables or that create or delete databases cannot be undone. Furthermore, using such a query has the effect of committing and ending the current transaction.

Finally, you should understand that transactions are particular to each connection.
So one user connected through the mysql client has a different transaction than another mysql client user, both of which are different than a connected PHP script.

With this in mind, I’ll run through a very trivial use of transactions within the mysql client here. In Chapter 17, “Example—E-Commerce,” transactions will be run through a PHP script.

To perform transactions:

1. Connect to the mysql client and select the test database.

Since this is just a demonstration, I’ll use the all-purpose test database.

2. Create a new accounts table (Figure 6.41).

    CREATE TABLE accounts (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    name VARCHAR(40) NOT NULL,
    balance DECIMAL(10,2) NOT NULL DEFAULT 0.0,
    PRIMARY KEY (id)
    ) ENGINE=InnoDB;

Obviously this isn’t a complete table or database design. For starters, normalization would require that the user’s name be separated into multiple columns, if not stored in a separate table altogether. But for demonstration purposes, this will be fine.

The most important aspect of the table definition is its engine, InnoDB, which allows for transactions.

Figure 6.41 A new table is created within the test database for the purposes of demonstrating transactions.

3. Populate the table.

    INSERT INTO accounts (name, balance)
    VALUES ('Sarah Vowell', 5460.23),
    ('David Sedaris', 909325.24),
    ('Kojo Nnamdi', 892.00);

You can use whatever names and values here that you want. The important thing to note is that MySQL will automatically commit this query, as no transaction has begun yet.

4. Begin a transaction and show the table’s current contents (Figure 6.42).

    START TRANSACTION;
    SELECT * FROM accounts;

5. Subtract $100 from David Sedaris’ (or any user’s) account.

    UPDATE accounts
    SET balance = (balance-100)
    WHERE id=2;

Using an UPDATE query, a little math, and a WHERE conditional, I can subtract 100 from a balance.
Although MySQL will indicate that one row was affected, the effect is not permanent until the transaction is committed.

6. Add $100 to Sarah Vowell’s account.

    UPDATE accounts
    SET balance = (balance+100)
    WHERE id=1;

This is the opposite of Step 5, as if $100 were being transferred from the one person to the other.

7. Confirm the results (Figure 6.43).

    SELECT * FROM accounts;

As you can see in the figure, the one balance is 100 more and the other is 100 less than they originally were (Figure 6.42).

Figure 6.42 A transaction is begun and the existing table records are shown.

Figure 6.43 Two UPDATE queries are executed and the results are viewed.

8. Roll back the transaction.

    ROLLBACK;

To demonstrate how transactions can be undone, I’ll undo the effects of these queries. The ROLLBACK command returns the database to how it was prior to starting the transaction. The command also terminates the transaction, returning MySQL to its autocommit mode.

9. Confirm the results (Figure 6.44).

    SELECT * FROM accounts;

The query should reveal the contents of the table as they originally were.

10. Repeat Steps 4 through 6.

To see what happens when the transaction is committed, the two UPDATE queries will be run again. Be certain to start the transaction first, though, or the queries will automatically take effect!

11. Commit the transaction and confirm the results (Figure 6.45).

    COMMIT;
    SELECT * FROM accounts;

Once you enter COMMIT, the entire transaction is permanent, meaning that any changes are now in place. COMMIT also ends the transaction, returning MySQL to autocommit mode.

Figure 6.44 Because I used the ROLLBACK command, the potential effects of the UPDATE queries were ignored.

Figure 6.45 Invoking the COMMIT command makes the transaction’s effects permanent.
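A transaction does not have to be undone all at once. MySQL’s SAVEPOINT command marks a spot inside a transaction that you can later roll back to. The following is a minimal sketch using the accounts table from Steps 2 and 3; the savepoint name after_debit is arbitrary, and the mistaken credit to account 3 is contrived for the demonstration.

```sql
START TRANSACTION;

-- Take 100 out of account 2.
UPDATE accounts SET balance = (balance-100) WHERE id=2;

-- Mark this point so that later queries can be undone separately.
SAVEPOINT after_debit;

-- Oops: the money goes to the wrong account.
UPDATE accounts SET balance = (balance+100) WHERE id=3;

-- Undo only the queries run since the savepoint.
ROLLBACK TO SAVEPOINT after_debit;

-- Credit the intended account and make everything permanent.
UPDATE accounts SET balance = (balance+100) WHERE id=1;
COMMIT;

SELECT * FROM accounts;
```

The net effect is that account 2 has 100 less, account 1 has 100 more, and account 3 is unchanged, because the credit to account 3 was rolled back while the earlier debit was kept.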
✔ Tips

■ One of the great features of transactions is that they offer protection should a random event occur, such as a server crash. Either a transaction is executed in its entirety or all of the changes are ignored.

■ To alter MySQL’s autocommit nature, type

    SET AUTOCOMMIT=0;

Then you do not need to type START TRANSACTION and no queries will be permanent until you type COMMIT (or use an ALTER, CREATE, etc., query).

■ You can create savepoints in transactions:

    SAVEPOINT savepoint_name;

Then you can roll back to that point:

    ROLLBACK TO SAVEPOINT savepoint_name;

7 Error Handling and Debugging

If you’re working through this book sequentially (which would be for the best), the next subject to learn is how to use PHP and MySQL together. However, that process will undoubtedly generate errors, errors that can be tricky to debug. So before moving on to new concepts, these next few pages address the bane of the programmer: errors. As you gain experience, you’ll make fewer errors and pick up your own debugging methods, but there are plenty of tools and techniques the beginner can use to help ease the learning process.

This chapter has three main threads. One focus is on learning about the various kinds of errors that can occur when developing dynamic Web sites and what their likely causes are. Second, a multitude of debugging techniques are taught, in a step-by-step format. Finally, you’ll see different techniques for handling the errors that occur in the most graceful manner possible.

Before reading on, a word regarding errors: they happen to the best of us. Even the author of this here book sees more than enough errors in his Web development duties (but rest assured that the code in this book should be bug-free). Thinking that you’ll get to a skill level where errors never occur is a fool’s dream, but there are techniques for minimizing errors, and knowing how to quickly catch, handle, and fix errors is a major skill in its own right. So try not to become frustrated as you make errors; instead, bask in the knowledge that you’re becoming a better debugger!

Error Types and Basic Debugging

When developing Web applications with PHP and MySQL, you end up with potential bugs in one of four or more technologies. You could have HTML issues, PHP problems, SQL errors, or MySQL mistakes. To be able to stop the bugs, you must first find the crack they’re sneaking in through.

HTML problems are often the least disruptive and the easiest to catch. You normally know there’s a problem when your layout is all messed up. Some steps for catching and fixing these, as well as general debugging hints, are discussed in the next section.

PHP errors are the ones you’ll see most often, as this language will be at the heart of your applications. PHP errors fall into three general areas:

◆ Syntactical
◆ Run time
◆ Logical

Syntactical errors are the most common and the easiest to fix. You’ll see them if you merely omit a semicolon. Such errors stop the script from executing, and if display_errors is on in your PHP configuration, PHP will show an error, including the line PHP thinks it’s on (Figure 7.1). If display_errors is off, you’ll see a blank page. (You’ll learn more about display_errors later in this chapter.)

Run-time errors include those things that don’t stop a PHP script from executing (like parse errors do) but do stop the script from doing everything it was supposed to do. Examples include calling a function using the wrong number or types of parameters. With these errors, PHP will normally display a message (Figure 7.2) indicating the exact problem (again, assuming that display_errors is on).
Figure 7.1 Parse errors, which you’ve probably seen many times over by now, are the most common sort of PHP error, particularly for beginning programmers.

Figure 7.2 Misusing a function (calling it with improper parameters) will create errors during the execution of the script.

The final category of error, logical, is actually the worst, because PHP won’t necessarily report it to you. These are out-and-out bugs: problems that aren’t obvious and don’t stop the execution of a script. Tricks for solving all of these PHP errors will be demonstrated in just a few pages.

SQL errors are normally a matter of syntax, and they’ll be reported when you try to run the query on MySQL. For example, I’ve done this many times (Figure 7.3):

    DELETE * FROM tablename

The syntax is just wrong, a confusion with the SELECT syntax (SELECT * FROM tablename). The right syntax is

    DELETE FROM tablename

Again, MySQL will raise a red flag when you have SQL errors, so these aren’t that difficult to find and fix. With dynamic Web sites, the catch is that you don’t always have static queries, but rather ones dynamically generated by PHP. In such cases, if there’s a syntax problem, the issue is probably in your PHP code.

Besides reporting on SQL errors, MySQL has its own errors to consider. An inability to access the database is a common one and a showstopper at that (Figure 7.4). You’ll also see errors when you misuse a MySQL function or ambiguously refer to a column in a join. Again, MySQL will report any such error in specific detail. Keep in mind that when a query doesn’t return the records or otherwise have the result you expect, that’s not a MySQL or SQL error, but rather a logical one. Toward the end of this chapter you’ll see how to solve SQL and MySQL problems.
But as you have to walk before you can run, the next section covers the fundamentals of debugging dynamic Web sites, starting with the basic checks you should make and how to fix HTML problems.

Basic debugging steps

This first sequence of steps may seem obvious, but when it comes to debugging, missing one of these steps leads to an unproductive and extremely frustrating debugging experience. And while I’m at it, I should mention that the best piece of general debugging advice is this:

When you get frustrated, step away from the computer!

I have solved almost all of the most perplexing issues I’ve come across by taking a break, clearing my head, and coming back to the code with fresh eyes. Readers in the book’s supporting forum (www.DMCInsights.com/phorum/) have frequently found this to be true as well. Trying to forge ahead when you’re frustrated tends to make things worse.

Figure 7.3 MySQL will report any errors found in the syntax of an SQL command.

Figure 7.4 An inability to connect to a MySQL server or a specific database is a common MySQL error.

To begin debugging any problem:

◆ Make sure that you are running the right page.
It’s altogether too common that you try to fix a problem and no matter what you do, it never goes away. The reason: you’ve actually been editing a different page than you thought.

◆ Make sure that you have saved your latest changes.
An unsaved document will continue to have the same problems it had before you edited it (because the edits haven’t been enacted).

◆ Make sure that you run all PHP pages through the URL.
Because PHP works through a Web server (Apache, IIS, etc.), running any PHP code requires that you access the page through a URL (http://www.example.com/page.php or http://localhost/page.php). If you double-click a PHP page to open it in a browser (or use the browser’s File > Open option), you’ll see the PHP code, not the executed result.
This also occurs if you load an HTML page without going through a URL (which will work on its own) but then submit the form to a PHP page (Figure 7.5).

◆ Know what versions of PHP and MySQL you are running.
Some problems are specific to a certain version of PHP or MySQL. For example, some functions are added in later versions of PHP, and MySQL added significant new features in versions 4, 4.1, and 5. Run a phpinfo() script (Figure 7.6; see Appendix A, “Installation,” for a script example) and open a mysql client session (Figure 7.7) to determine this information. phpMyAdmin will often report on the versions involved as well (but don’t confuse the version of phpMyAdmin, which will likely be 2.something, with the versions of PHP or MySQL). I consider the versions being used to be such an important, fundamental piece of information that I won’t normally assist people looking for help until they provide this information!

Figure 7.5 PHP code will only be executed if run through a URL. This means that forms that submit to a PHP page must also be loaded through http://.

Figure 7.6 A phpinfo() script is one of your best tools for debugging, informing you of the PHP version and how it’s configured.

Book Errors

If you’ve followed an example in this book and something’s not working right, what should you do?

1. Double-check your code or steps against those in the book.

2. Use the index at the back of the book to see if I reference a script or function in an earlier page (you may have missed an important usage rule or tip).

3. View the PHP manual for a specific function to see if it’s available in your version of PHP and to verify how the function is used.

4. Check out the book’s errata page (through the supporting Web site, www.DMCInsights.com/phpmysql3/) to see if an error in the code does exist and has been reported.
Don’t post your particular problem there yet, though!

5. Triple-check your code and use all the debugging techniques outlined in this chapter.

6. Search the book’s supporting forum to see if others have had this problem and if a solution has already been determined.

7. If all else fails, use the book’s supporting forum to ask for assistance. When you do, make sure you include all the pertinent information (version of PHP, version of MySQL, the debugging steps you took and what the results were, etc.).

Figure 7.7 When you connect to a MySQL server, it should let you know the version number.

◆ Know what Web server you are running.
Similarly, some problems and features are unique to your Web serving application (Apache, IIS, or Abyss). You should know which one you are using, and which version, from when you installed the application.

◆ Try executing pages in a different Web browser.
Every Web developer should have and use at least two Web browsers. If you test your pages in different ones, you’ll see if the problem has to do with your script or a particular browser.

◆ If possible, try executing the page using a different Web server.
PHP and MySQL errors sometimes stem from particular configurations and versions on one server. If something works on one server but not another, then you’ll know that the script isn’t inherently at fault. From there it’s a matter of using phpinfo() scripts to see what server settings may be different.

✔ Tips

■ If taking a break is one thing you should do when you become frustrated, here’s what you shouldn’t do: send off one or multiple panicky and persnickety emails to a writer, to a newsgroup or mailing list, or to anyone else. When it comes to asking for free help from strangers, patience and pleasantries garner much better and faster results.

■ For that matter, I would highly advise against randomly guessing at solutions.
I’ve seen far too many people only complicate matters further by taking stabs at solutions, without a full understanding of what the attempted changes should or should not do.

■ There’s another realm of errors that you could classify as usage errors: what goes wrong when the site’s user doesn’t do what you thought they would. These are very difficult to find on your own because it’s hard for the programmer to use an application in a way other than she intended. As a golden rule, write your code so that it doesn’t break even if the user doesn’t do anything right!

Debugging HTML

Debugging HTML is relatively easy. The source code is very accessible, most problems are overt, and attempts at fixing the HTML don’t normally make things worse (as can happen with PHP). Still, there are some basic steps you should follow to find and fix an HTML problem.

To debug an HTML error:

◆ Check the source code.
If you have an HTML problem, you’ll almost always need to check the source code of the page to find it. How you view the source code depends upon the browser being used, but normally it’s a matter of using something like View > Page Source.

◆ Use a validation tool (Figure 7.8).
Validation tools, like the one at http://validator.w3.org, are great for finding mismatched tags, broken tables, and other problems.

◆ Add borders to your tables.
Frequently layouts are messed up because tables are incomplete. To confirm this, add a prominent border to your table to make it obvious where the different columns and rows are.

✔ Tip

■ The first step toward fixing any kind of problem is understanding what’s causing it. Remember the role each technology (HTML, PHP, SQL, and MySQL) plays as you debug. If your page doesn’t look right, that’s an HTML problem. If your HTML is dynamically generated by PHP, it’s still an HTML problem, but you’ll need to work with the PHP code to make it right.

◆ Use Firefox or Opera.
I’m not trying to start a discussion on which is the best Web browser, and as Internet Explorer is the most used one, you’ll eventually need to test using it, but I personally find that Firefox (available for free from www.mozilla.com) and Opera (available for free from www.opera.com) are the best Web browsers for Web developers. They offer reliability and debugging features not available in other browsers. If you want to stick with IE or Safari for your day-to-day browsing, that’s up to you, but when doing Web development, start with either Firefox or Opera.

◆ Use Firefox’s add-on widgets (Figure 7.9).
Besides being just a great Web browser, the very popular Firefox browser has a ton of features that the Web developer will appreciate. Furthermore, you can expand Firefox’s functionality by installing any of the free widgets that are available. The Web Developer widget in particular provides quick access to great tools, such as showing a table’s borders, revealing the CSS, validating a page, and more. I also frequently use these add-ons: DOM Inspector, Firebug, and HTML Validator, among others.

◆ Test the page in another browser.
PHP code is generally browser-independent, meaning you’ll get consistent results regardless of the client. Not so with HTML. Sometimes a particular browser has a quirk that affects the rendered page. Running the same page in another browser is the easiest way to know if it’s an HTML problem or a browser quirk.

Figure 7.8 Validation tools like the one provided by the W3C (World Wide Web Consortium) are good for finding problems and making sure your HTML conforms to standards.

Figure 7.9 Firefox’s Web Developer widget provides quick access to lots of useful tools.

Displaying PHP Errors

PHP provides remarkably useful and descriptive error messages when things go awry. Unfortunately, PHP doesn’t show these errors when running using its default configuration.
This policy makes sense for live servers, where you don’t want the end users seeing PHP-specific error messages, but it also makes everything that much more confusing for the beginning PHP developer. To be able to see PHP’s errors, you must turn on the display_errors directive, either in an individual script or for the PHP configuration as a whole.

To turn on display_errors in a script, use the ini_set() function. As its arguments, this function takes a directive name and what setting that directive should have:

    ini_set('display_errors', 1);

Including this line in a script will turn on display_errors for that script. The only downside is that if your script has a syntax error that prevents it from running at all, then you’ll still see a blank page. To have PHP display errors for the entire server, you’ll need to edit its configuration, as is discussed in the “Configuring PHP” section of Appendix A.

To turn on display_errors:

1. Create a new PHP document in your text editor or IDE (Script 7.1).
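As a rough sketch of the kind of script these steps build (not necessarily the book’s Script 7.1 line for line), the PHP portion of such a page could turn on display_errors and then deliberately cause a problem. The undefined $var variable and the empty loop are assumptions used for illustration; the exact wording and level of the messages PHP prints depend on the PHP version.

```php
<?php
// Show errors for this script only (intended for development).
ini_set('display_errors', 1);

// Deliberately cause problems: $var was never defined, so PHP reports
// the undefined variable and then complains about the foreach argument.
foreach ($var as $v) {}

// Neither problem is fatal, so this line still runs.
echo "The script kept running after the errors.\n";
```

Changing the 1 to 0 in the ini_set() call hides the messages, although the underlying problems still occur.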

Script 7.1 The ini_set() function can be used to tell a PHP script to reveal any errors that might occur.

5. Save the file as display_errors.php, place it in your Web directory, and test it in your Web browser (Figure 7.10).

6. If you want, change the first line of PHP code to read

    ini_set('display_errors', 0);

and then save and retest the script (Figure 7.11).

✔ Tips

■ There are limits as to what PHP settings the ini_set() function can be used to adjust. See the PHP manual for specifics as to what can and cannot be changed using it.

■ As a reminder, changing the display_errors setting in a script only works so long as that script runs (i.e., it cannot have any parse errors). To be able to always see any errors that occur, you’ll need to enable display_errors in PHP’s configuration file (again, see the appendix).

Figure 7.10 With display_errors turned on (for this script), the page reports the errors when they occur.

Figure 7.11 With display_errors turned off (for this page), the same errors (Script 7.1 and Figure 7.10) are no longer reported. Unfortunately, they still exist.

Adjusting Error Reporting in PHP

Once you have PHP set to display the errors that occur, you might want to adjust the level of error reporting. Your PHP installation as a whole, or individual scripts, can be set to report or ignore different types of errors. Table 7.1 lists most of the levels, but they can generally be one of these three kinds:

◆ Notices, which do not stop the execution of a script and may not necessarily be a problem.

◆ Warnings, which indicate a problem but don’t stop a script’s execution.

◆ Errors, which stop a script from continuing (including the ever-common parse errors, which prevent scripts from running at all).

As a rule of thumb, you’ll want PHP to report on any kind of error while you’re developing a site but report no specific errors once the site goes live.
For security and aesthetic purposes, it’s generally unwise for a public user to see PHP’s detailed error messages.

Error-Reporting Levels

Number  Constant         Report On
1       E_ERROR          Fatal run-time errors (that stop execution of the script)
2       E_WARNING        Run-time warnings (non-fatal errors)
4       E_PARSE          Parse errors
8       E_NOTICE         Notices (things that could or could not be a problem)
256     E_USER_ERROR     User-generated error messages, generated by the trigger_error() function
512     E_USER_WARNING   User-generated warnings, generated by the trigger_error() function
1024    E_USER_NOTICE    User-generated notices, generated by the trigger_error() function
2048    E_STRICT         Recommendations for compatibility and interoperability
8191    E_ALL            All errors, warnings, and recommendations

Table 7.1 PHP’s error-reporting settings, to be used with the error_reporting() function or in the php.ini file. Note that E_ALL’s number value was different in earlier versions of PHP and did not include E_STRICT (it does in PHP 6).

Suppressing Errors with @

Individual errors can be suppressed in PHP using the @ operator. For example, if you don’t want PHP to report if it couldn’t include a file, you would code

    @include ('config.inc.php');

Or if you don’t want to see a “division by zero” error:

    $x = 8;
    $y = 0;
    $num = @($x/$y);

The @ symbol will work only on expressions, like function calls or mathematical operations. You cannot use @ before conditionals, loops, function definitions, and so forth.

As a rule of thumb, I recommend that @ be used on functions whose execution, should they fail, will not affect the functionality of the script as a whole. Or you can suppress PHP’s errors when you will handle them more gracefully yourself (a topic discussed later in this chapter).

Frequently, error messages (particularly those dealing with the database) will reveal certain behind-the-scenes aspects of your Web application that are best not shown.
While you hope all of these will be worked out during the development stages, that may not be the case.

You can universally adjust the level of error reporting following the instructions in Appendix A. Or you can adjust this behavior on a script-by-script basis using the error_reporting() function. This function is used to establish what type of errors PHP should report on within a specific page. The function takes either a number or a constant, using the values in Table 7.1 (the PHP manual lists a few others, related to the core of PHP itself).

    error_reporting(0); // Show no errors.

A setting of 0 turns error reporting off entirely (errors will still occur; you just won’t see them anymore). Conversely, error_reporting (E_ALL) will tell PHP to report on every error that occurs. The numbers can be added up to customize the level of error reporting, or you could use the bitwise operators | (or), ~ (not), and & (and) with the constants. With the following setting, any non-notice error will be shown:

    error_reporting (E_ALL & ~E_NOTICE);

To adjust error reporting:

1. Open display_errors.php (Script 7.1) in your text editor or IDE.
To play around with error reporting levels, use display_errors.php as an example.

2. After adjusting the display_errors setting, add (Script 7.2)

    error_reporting (E_ALL);

For development purposes, have PHP notify you of all errors, notices, warnings, and recommendations. This line will accomplish that. In short, PHP will let you know about anything that is, or may be, a problem. Because E_ALL is a constant, it is not enclosed in quotation marks.

Script 7.2 This script will demonstrate how error reporting can be manipulated in PHP.

3. Save the file as report_errors.php, place it in your Web directory, and run it in your Web browser (Figure 7.12).
I also altered the page’s title and the heading, but both are immaterial to the point of this exercise.

4. Change the level of error reporting to something different and retest (Figures 7.13 and 7.14).

✔ Tips

■ Because you’ll often want to adjust the display_errors and error_reporting settings for every page in a Web site, you might want to place those lines of code in a separate PHP file that can then be included by other PHP scripts.

■ In case you are curious, the scripts in this book were all written with PHP’s error reporting on the highest level (with the intention of catching every possible problem).

Figure 7.12 On the highest level of error reporting, PHP has two warnings and one notice for this page (Script 7.2).

Figure 7.13 The same page (Script 7.2) after disabling the reporting of notices.

Figure 7.14 The same page again (Script 7.2) with error reporting turned off (set to 0). The result is the same as if display_errors was disabled. Of course, the errors still occur; they’re just not being reported.

Creating Custom Error Handlers

Another option for error management with your sites is to alter how PHP handles errors. By default, if display_errors is enabled and an error is caught (that falls under the level of error reporting), PHP will print the error, in a somewhat simplistic form, within some minimal HTML tags (Figure 7.15).

You can override how errors are handled by creating your own function that will be called when errors occur. For example,

    function report_errors (arguments) {
        // Do whatever here.
    }
    set_error_handler ('report_errors');

The set_error_handler() function is used to name the function to be called when an error occurs. The handling function (report_errors, in this case) will, at that time, receive several values that can be used in any possible manner.

This function can be written to take up to five arguments. In order, these arguments are: an error number (corresponding to Table 7.1), a textual error message, the name of the file where the error was found, the specific line number on which it occurred, and the variables that existed at the time of the error. Defining a function that accepts all these arguments might look like

    function report_errors ($num, $msg, $file, $line, $vars) {…

To make use of this concept, the report_errors.php file (Script 7.2) will be rewritten one last time.

Figure 7.15 The HTML source code for the errors shown in Figure 7.12.

To create your own error handler:

1. Open report_errors.php (Script 7.2) in your text editor or IDE.

2. Remove the ini_set() and error_reporting() lines (Script 7.3).
When you establish your own error handling function, the error reporting levels no longer have any meaning, so that line can be removed. Adjusting the display_errors setting is also meaningless, as the error handling function will control whether errors are displayed or not.

3. Before the script creates the errors, add

    define ('LIVE', FALSE);

This constant will be a flag used to indicate whether or not the site is currently live. It’s an important distinction, as how you handle errors and what you reveal in the browser should differ greatly when you’re developing a site and when a site is live.
This constant is being set outside of the function for two reasons. First, I want to treat the function as a black box that does what I need it to do without having to go in and tinker with it.
Second, in many sites, there might be other settings (like the database connectivity information) that are also live versus development-specific. Conditionals could, therefore, also refer to this constant to adjust those settings.

4. Begin defining the error handling function.

    function my_error_handler ($e_number, $e_message, $e_file, $e_line, $e_vars) {

The my_error_handler() function is set to receive the full five arguments that a custom error handler can.

Script 7.3 By defining your own error handling function, you can customize how errors are treated in your PHP scripts.

    [...]
            echo '<pre>' . $message . "\n";
            debug_print_backtrace();
            echo '</pre>';
        } else { // Don't show the error.
            echo 'A system error occurred. We apologize for the inconvenience.';
        }

    } // End of my_error_handler() definition.

    // Use my error handler:
    set_error_handler ('my_error_handler');

    // Create errors:
    foreach ($var as $v) {}
    $result = 1/0;

    ?>

5. Create the error message using the received values.

    $message = "An error occurred in script '$e_file' on line $e_line: $e_message\n";

The error message will begin by referencing the filename and line number where the error occurred. Added to this is the actual error message. All of these values are passed to the function when it is called (when an error occurs).

6. Add any existing variables to the error message.

    $message .= print_r ($e_vars, 1);

The $e_vars variable will receive all of the variables that exist, and their values, when the error happens. Because this might contain useful debugging information, it’s added to the message.
The print_r() function is normally used to print out a variable’s structure and value; it is particularly useful with arrays. If you call the function with a second argument (1 or TRUE), the result is returned instead of printed. So this line adds all of the variable information to $message.

7. Print a message that will vary, depending upon whether or not the site is live.

    if (!LIVE) {
        echo '<pre>' . $message . "\n";
        debug_print_backtrace();
        echo '</pre>';
    } else {
        echo 'A system error occurred. We apologize for the inconvenience.';
    }

If the site is not live (if LIVE is false), which would be the case while the site is being developed, a detailed error message should be printed (Figure 7.16). For ease of viewing, the error message is printed within HTML PRE tags (which aren’t XHTML valid but are very helpful here). Furthermore, a useful debugging function, debug_print_backtrace(), is also called. This function prints a slew of information about what functions have been called, what files have been included, and so forth.
If the site is live, a simple mea culpa will be printed, letting the user know that an error occurred but not what the specific problem is (Figure 7.17). Under this situation, you could also use the error_log() function (see the sidebar) to have the detailed error message emailed or written to a log.

8. Complete the function and tell PHP to use it.

    }
    set_error_handler('my_error_handler');

This second line is the important one, telling PHP to use the custom error handler instead of PHP’s default handler.

9. Save the file as handle_errors.php, place it in your Web directory, and test it in your Web browser (Figure 7.16).

10. Change the value of LIVE to TRUE, save, and retest the script (Figure 7.17).
To see how the error handler behaves with a live site, just change this one value.

Figure 7.16 During the development phase, detailed error messages are printed in the Web browser. (In a more real-world script, with more code, the messages would be more useful.)

✔ Tips

■ If your PHP page uses special HTML formatting (like CSS tags to affect the layout and font treatment), add this information to your error reporting function.
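Pulling the steps above together, the handler might be sketched as follows. This is an illustration under stated assumptions rather than a reproduction of Script 7.3: the fifth parameter is given a default value because PHP 8 and later no longer pass the list of variables to the handler, the apology is printed as plain text with no particular markup, and trigger_error() is used here to raise a test error.

```php
<?php
// Flag: FALSE while developing, TRUE once the site is live.
define('LIVE', false);

// Custom error handler. $e_vars is optional here (an assumption made for
// compatibility): PHP 8+ calls the handler with only four arguments.
function my_error_handler($e_number, $e_message, $e_file, $e_line, $e_vars = array())
{
    // Build the detailed message from the values PHP passes in.
    $message = "An error occurred in script '$e_file' on line $e_line: $e_message\n";

    // Append any variables that were received (may be an empty array).
    $message .= print_r($e_vars, true);

    if (!LIVE) {
        // Development: show the details and a backtrace.
        echo '<pre>' . $message . "\n";
        debug_print_backtrace();
        echo '</pre>';
    } else {
        // Live: apologize without revealing details.
        echo 'A system error occurred. We apologize for the inconvenience.';
    }
}

// Use the custom handler instead of PHP's default one.
set_error_handler('my_error_handler');

// Raise a test error so that the handler runs.
trigger_error('This is a test warning.', E_USER_WARNING);
```

Setting LIVE to TRUE switches the output to the short apology; in that branch you could also call error_log() to record the detailed message instead of showing it.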
■ Obviously in a live site you’ll probably need to do more than apologize for the inconvenience (particularly if the error significantly affects the page’s functionality). Still, this example demonstrates how you can easily adjust error handling to suit the situation.

■ If you don’t want the error handling function to report on every notice, error, or warning, you could check the error number value (the first argument sent to the function). For example, to ignore notices when the site is live, you would change the main conditional to

    if (!LIVE) {
        echo '<pre>' . $message . "\n";
        debug_print_backtrace();
        echo '</pre>';
    } elseif ($e_number != E_NOTICE) {
        echo 'A system error occurred. We apologize for the inconvenience.';
    }

■ You can invoke your error handling function using trigger_error().

Figure 7.17 Once a site has gone live, more user-friendly (and less revealing) errors are printed. Here, one message is printed for each of the three errors in the script.

Logging PHP Errors

In Script 7.3, errors are handled by simply printing them out in detail or not. Another option is to log the errors: make a permanent note of them somehow. For this purpose, the error_log() function instructs PHP how to file an error. Its syntax is

    error_log (message, type, destination, extra headers);

The message value should be the text of the logged error (i.e., $message in Script 7.3). The type dictates how the error is logged. The options are the numbers 0 through 3: use the computer’s default logging method (0), send it in an email (1), send to a remote debugger (2), or write it to a text file (3).

The destination parameter can be either the name of a file (for log type 3) or an email address (for log type 1). The extra headers argument is used only when sending emails (log type 1). Both the destination and extra headers are optional.

PHP Debugging Techniques

When it comes to debugging, what you’ll best learn from experience are the causes of certain types of errors. Understanding the common causes will shorten the time it takes to fix errors. To expedite the learning process, Table 7.2 lists the likely reasons for the most common PHP errors.

The first, and most common, type of error that you’ll run across is syntactical and will prevent your scripts from executing. An error like this will result in messages like the one in Figure 7.18, which every PHP developer has seen too many times. To avoid making this sort of mistake when you program, be sure to:

◆ End every statement (but not language constructs like loops and conditionals) with a semicolon.
◆ Balance all quotation marks, parentheses, curly braces, and square brackets (each opening character must be closed).

◆ Be consistent with your quotation marks (single quotes can be closed only with single quotes and double quotes with double quotes).

◆ Escape, using the backslash, all single- and double-quotation marks within strings, as appropriate.

Common PHP Errors

Error                        Likely Cause
Blank Page                   HTML problem, or PHP error and display_errors or error_reporting is off.
Parse error                  Missing semicolon; unbalanced curly braces, parentheses, or quotation marks; or use of an unescaped quotation mark in a string.
Empty variable value         Forgot the initial $, misspelled or miscapitalized the variable name, or inappropriate variable scope (with functions).
Undefined variable           Reference made to a variable before it is given a value or an empty variable value (see those potential causes).
Call to undefined function   Misspelled function name, PHP is not configured to use that function (like a MySQL function), or document that contains the function definition was not included.
Cannot redeclare function    Two definitions of your own function exist; check within included files.
Headers already sent         White space exists in the script before the PHP tags, data has already been printed, or a file has been included.

Table 7.2 These are some of the most common errors you’ll see in PHP, along with their most probable causes.

Figure 7.18 The parse error prevents a script from running because of invalid PHP syntax. This one was caused by failing to enclose $array['key'] within curly braces when printing its value.

One thing you should also understand about syntactical errors is that just because the PHP error message says the error is occurring on line 12, that doesn’t mean that the mistake is actually on that line. At the very least, it is not uncommon for there to be
a difference between what PHP thinks is line 12 and what your text editor indicates is line 12. So while PHP’s direction is useful in tracking down a problem, treat the line number referenced as more of a starting point than an absolute.

If PHP reports an error on the last line of your document, this is almost always because a mismatched parenthesis, curly brace, or quotation mark was not caught until that moment.

The second type of error you’ll encounter results from misusing a function. This error occurs, for example, when a function is called without the proper arguments. This error is discovered by PHP when attempting to execute the code. In later chapters you’ll probably see such errors when using the header() function, cookies, or sessions.

To fix errors, you’ll need to do a little detective work to see what mistakes were made and where. For starters, though, always thoroughly read and trust the error message PHP offers. Although the referenced line number may not always be correct, a PHP error is very descriptive, normally helpful, and almost always 100 percent correct.

To debug your scripts:

◆ Turn on display_errors.
Use the earlier steps to enable display_errors for a script, or, if possible, the entire server, as you develop your applications.

◆ Use comments.
Just as you can use comments to document your scripts, you can also use them to rule out problematic lines. If PHP is giving you an error on line 12, then commenting out that line should get rid of the error. If not, then you know the error is elsewhere. Just be careful that you don’t introduce more errors by improperly commenting out only a portion of a code block: the syntax of your scripts must be maintained.

◆ Use the print() and echo() functions.
In more complicated scripts, I frequently use echo() statements to leave me notes as to what is happening as the script is executed (Figure 7.19).
When a script has several steps, it may not be easy to know if the problem is occurring in step 2 or step 5. By using an echo() statement, you can narrow the problem down to the specific juncture.

Figure 7.19 More complex debugging can be accomplished by leaving yourself notes as to what the script is doing.

◆ Check what quotation marks are being used for printing variables.
It’s not uncommon for programmers to mistakenly use single quotation marks and then wonder why their variables are not printed properly. Remember that single quotation marks treat text literally and that you must use double quotation marks to print out the values of variables.

◆ Track variables (Figure 7.20).
It is pretty easy for a script not to work because you referred to the wrong variable or the right variable by the wrong name or because the variable does not have the value you would expect. To check for these possibilities, use the print() or echo() statements to print out the values of variables at important points in your scripts. This is simply a matter of

    echo "\$var = $var\n";

The first dollar sign is escaped so that the variable’s name is printed. The second reference of the variable will print its value.

◆ Print array values.
For more complicated variable types (arrays and objects), the print_r() and var_dump() functions will print out their values without the need for loops. Both functions accomplish the same task, although var_dump() is more detailed in its reporting than print_r().

Figure 7.20 Printing the names and values of variables is the easiest way to track them over the course of a script.

✔ Tips

■ Many text editors include utilities to check for balanced parentheses, brackets, and quotation marks.

■ If you cannot find the parse error in a complex script, begin by using the /* */ comments to render the entire PHP code inert. Then uncomment one section at a time (by moving the opening or closing comment characters) and rerun the script until you deduce which lines are causing the error. Watch how you comment out control structures, though, as the curly braces must continue to be matched in order to avoid parse errors. For example:

    if (condition) {
    /* Start comment.
    Inert code.
    End comment. */
    }

■ To make the results of print_r() more readable in the Web browser, wrap it within HTML <pre> (preformatted) tags. This one line is my absolute favorite debugging tool:

    echo '<pre>' . print_r ($var, 1) . '</pre>';

Using die() and exit()

Two functions that are often used with error management are die() and exit() (they’re technically language constructs, not functions, but who cares?). When a die() or exit() is called in your script, the entire script is terminated. Both are useful for stopping a script from continuing should something important (like establishing a database connection) fail to happen. You can also pass die() and exit() a string that will be printed out in the browser.

You’ll commonly see die() or exit() used in an OR conditional. For example:

    (include 'config.inc.php') OR die ('Could not open the file.');

With a line like that, if PHP could not include the configuration file, the die() statement will be executed and the “Could not open the file.” message will be printed. (The parentheses around the include are needed so that OR applies to its return value rather than to the file name.) You’ll see variations on this throughout this book and in the PHP manual, as it’s a quick (but potentially excessive) way to handle errors without using a custom error handler.

SQL and MySQL Debugging Techniques

The most common SQL errors are caused by the following issues:

◆ Unbalanced use of quotation marks or parentheses
◆ Unescaped apostrophes in column values
◆ Misspelling a column name, table name, or function
◆ Ambiguously referring to a column in a join
◆ Placing a query’s clauses (WHERE, GROUP BY, ORDER BY, LIMIT) in the wrong order

Furthermore, when using MySQL you can also run across the following:

◆ Unpredictable or inappropriate query results
◆ Inability to access the database

Since you’ll be running the queries for your dynamic Web sites from PHP, you need a methodology for debugging SQL and MySQL errors within that context (PHP will not report a problem with your SQL).

Debugging SQL problems

To decide if you are experiencing a MySQL (or SQL) problem rather than a PHP one, you need a system for finding and fixing the issue.
Fortunately, the steps you should take to debug MySQL and SQL problems are easy to define and can be followed almost mechanically. If you ever have any MySQL or SQL errors to debug, just abide by this sequence of steps.

To hammer the point home, this next sequence of steps is probably the most useful debugging technique in this chapter and the entire book. You’ll likely need to follow these steps in any PHP-MySQL Web application when you’re not getting the results you expected.

To debug your SQL queries:

1. Print out any applicable queries in your PHP script (Figure 7.21).
As you’ll see in the next chapter, SQL queries will often be assigned to a variable, particularly when you use PHP to dynamically write them. Using the code echo $query (or whatever the query variable is called) in your PHP scripts, you can send to the browser the exact query being run. Sometimes this step alone will help you see what the real problem is.

Figure 7.21 Knowing exactly what query a PHP script is attempting to execute is the most useful first step for solving SQL and MySQL problems.

2. Run the query in the mysql client or other tool (Figure 7.22).
The most foolproof method of debugging an SQL or MySQL problem is to run the query used in your PHP scripts through an independent application: the mysql client, phpMyAdmin, or the like. Doing so will give you the same result as the original PHP script receives but without the overhead and hassle.
If the independent application returns the expected result but you are still not getting the proper behavior in your PHP script, then you will know that the problem lies within the script itself, not your SQL or MySQL database.

Figure 7.22 To understand what result a PHP script is receiving, run the same query through a separate interface. In this case the problem is the reference to the password column, when the table’s column is actually called just pass.

3. If the problem still isn’t evident, rewrite the query in its most basic form, and then keep adding dimensions back in until you discover which clause is causing the problem.
Sometimes it’s difficult to debug a query because there’s too much going on. Like commenting out most of a PHP script, taking a query down to its bare minimum structure and slowly building it back up can be the easiest way to debug complex SQL commands.

✔ Tips

■ Another common MySQL problem is trying to run queries or connect using the mysql client when the MySQL server isn’t even running. Be sure that MySQL is available for querying!

■ As an alternative to printing out the query to the browser, you could print it out as an HTML comment (viewable only in the HTML source), using

    echo "<!-- $query -->";

Debugging access problems

Access denied error messages are the most common problem beginning developers encounter when using PHP to interact with MySQL. These are among the common solutions:

◆ Reload MySQL after altering the privileges so that the changes take effect. Either use the mysqladmin tool or run FLUSH PRIVILEGES in the mysql client. You must be logged in as a user with the appropriate permissions to do this (see Appendix A for more).

◆ Double-check the password used. The error message Access denied for user: ‘user@localhost’ (Using password: YES) frequently indicates that the password is wrong or mistyped. (This is not always the cause but is the first thing to check.)

◆ The error message Can’t connect to… (error number 2002) indicates that MySQL either is not running or is not running on the socket or TCP/IP port tried by the client.

✔ Tips

■ MySQL keeps its own error logs, which are very useful in solving MySQL problems (like why MySQL won’t even start). MySQL’s error log will be located in the data directory and titled hostname.err.

■ The MySQL manual is very detailed, containing SQL examples, function references, and the meanings of error codes. Make the manual your friend and turn to it when confusing errors pop up.

8 Using PHP with MySQL

Now that you have a sufficient amount of PHP, SQL, and MySQL experience under your belt, it’s time to put all of the technologies together. PHP’s strong integration with MySQL is just one reason so many programmers have embraced it; it’s impressive how easily you can use the two together.

This chapter will use the existing sitename database (created in Chapter 5, “Introduction to SQL”) to build a PHP interface for interacting with the users table. The knowledge taught and the examples used here will be the basis for all of your PHP-MySQL Web applications, as the principles involved are the same for any PHP-MySQL interaction.

Before heading into this chapter, you should be comfortable with everything covered in the first six chapters. Also, understanding the error debugging and handling techniques covered in Chapter 7 will make the learning process less frustrating, should you encounter snags. Finally, remember that you need a PHP-enabled Web server and access to a running MySQL server in order to test the following examples.

Modifying the Template

Since all of the pages in this chapter and the next will be part of the same Web application, it’ll be worthwhile to use a common template system. Instead of creating a new template from scratch, the layout from Chapter 3, “Creating Dynamic Web Sites,” will be used again, with only a minor modification to the header file’s navigation links.

To make the header file:

1. Open header.html (Script 3.2) in your text editor.

2. Change the list of links to read (Script 8.1)
  • Home Page
  • Register
  • View Users
  • Change Password
  • link five

All of the examples in this chapter will involve the registration, view users, and change password pages. The date form and calculator links from Chapter 3 can be deleted.

3. Save the file as header.html.

4. Place the new header file in your Web directory, within the includes folder along with footer.html (Script 3.3) and style.css (available for download from the book’s supporting Web site, www.DMCInsights.com/phpmysql3/).

5. Test the new header file by running index.php in your Web browser (Figure 8.1).

Script 8.1 The site’s header file, used for the pages’ template, modified with new navigation links.

Figure 8.1 The dynamically generated home page with new navigation links.

✔ Tips

■ For a preview of this site’s structure, see the sidebar “Organizing Your Documents” in the next section.

■ Remember that you can use any file extension for your template files, including .inc or .php.

■ To refresh your memory on the template-creation process or the specifics of this layout, see the first few pages of Chapter 3.

WARNING: READ THIS!

PHP and MySQL have gone through many changes over the past decade. Of these, the most important for this chapter and one of the most important for the rest of the book involves what PHP functions you use to communicate with MySQL. For years, PHP developers used the standard MySQL functions (called the mysql extension). As of PHP 5 and MySQL 4.1, you can use the newer Improved MySQL functions (called the mysqli extension). These functions provide improved performance and take advantage of added features (among other benefits).

As this book assumes you’re using at least PHP 6 and MySQL 5, all of the examples will only use the Improved MySQL functions. If your server does not support this extension, you will not be able to run these examples as they are written! Most of the examples in the rest of the book will also not work for you.

If the server or home computer you’re using does not support the Improved MySQL functions, you have three options: upgrade PHP and MySQL, read the second edition of this book (which teaches and primarily uses the older functions), or learn how to use the older functions and modify all the examples accordingly. For questions or problems, see the book’s corresponding forum (www.DMCInsights.com/phorum/).
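The warning in this section says that every example depends on the Improved MySQL Extension. As a minimal sketch (not one of the book’s scripts), a short test like the following could confirm that support before you go any further. The name has_improved_mysql() is a hypothetical helper introduced only for illustration; extension_loaded() and function_exists() are standard PHP functions. The return type declaration assumes PHP 7 or later.

```php
<?php
// Hypothetical helper (not part of the book's scripts): returns true when
// this PHP installation can use the Improved MySQL (mysqli) functions.
function has_improved_mysql(): bool
{
    return extension_loaded('mysqli') && function_exists('mysqli_connect');
}

if (has_improved_mysql()) {
    echo "mysqli is available; the examples should run as written.\n";
} else {
    echo "mysqli is not available; see the options listed in the warning.\n";
}
```

Running this once on a new server is cheaper than discovering the problem later as a “Call to undefined function” error.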
Connecting to MySQL

The first step for interacting with MySQL (connecting to the server) requires the appropriately named mysqli_connect() function:

    $dbc = mysqli_connect (hostname, username, password, db_name);

The first three arguments sent to the function (host, username, and password) are based upon the users and privileges set up within MySQL (see Appendix A, “Installation,” for more information). Commonly (but not always), the host value will be localhost.

The fourth argument is the name of the database to use. This is the equivalent of saying USE databasename within the mysql client.

If the connection was made, the $dbc variable, short for database connection, will become a reference point for all of your subsequent database interactions. Most of the PHP functions for working with MySQL will take this variable as their first argument.

Before putting this knowledge to the test, there’s one more function to learn about. If a connection problem occurred, you can call mysqli_connect_error(), which returns the connection error message. It takes no arguments, so it would be called using just

    mysqli_connect_error();

To start using PHP with MySQL, let’s create a special script that makes the connection. Other PHP scripts that require a MySQL connection can then include this file.

To connect to and select a database:

1. Create a new PHP document in your text editor or IDE (Script 8.2).

Script 8.2 The mysqli_connect.php script will be used by every other script in this chapter. It establishes a connection to MySQL and selects the database.

2. Set the MySQL host, username, password, and database name as constants.

    DEFINE ('DB_USER', 'username');
    DEFINE ('DB_PASSWORD', 'password');
    DEFINE ('DB_HOST', 'localhost');
    DEFINE ('DB_NAME', 'sitename');

I prefer to establish these values as constants for security reasons (they cannot be changed this way), but that isn’t required. In general, setting these values as some sort of variable or constant makes sense so that you can separate the configuration parameters from the functions that use them, but again, this is not obligatory.
When writing your script, change these values to ones that will work on your setup. If you have been provided with a MySQL username/password combination and a database (like for a hosted site), use that information here. Or, if possible, follow the steps in Appendix A to create a user that has access to the sitename database, and insert those values here. Whatever you do, don’t just use these values unless you know for certain they will work on your server.

3. Connect to MySQL.

    $dbc = @mysqli_connect (DB_HOST, DB_USER, DB_PASSWORD, DB_NAME) OR die ('Could not connect to MySQL: ' . mysqli_connect_error() );

The mysqli_connect() function, if it successfully connects to MySQL, will return a link (a mysqli object) that corresponds to the open connection. This link will be assigned to the $dbc variable, so that other functions can make use of this connection.
The function call is preceded by the error suppression operator (@). This prevents the PHP error from being displayed in the Web browser. This is preferable, as the error will be handled by the OR die() clause.
If the mysqli_connect() function cannot return a valid link, then the OR die() part of the statement is executed (because the first part of the OR is false, PHP must evaluate the second part). As discussed in the preceding chapter, the die() function terminates the execution of the script. The function can also take as an argument a string that will be printed to the Web browser. In this case, the string is a combination of Could not connect to MySQL: and the specific MySQL error (Figure 8.2). Using this blunt error management system makes debugging much easier as you develop your sites.

Figure 8.2 If there were problems connecting to MySQL, an informative message is displayed and the script is halted.

4. Save the file as mysqli_connect.php.
Since this file contains information (the database access data) that must be kept private, it will use a .php extension. With a .php extension, even if malicious users ran this script in their Web browser, they would not see the page’s actual content.

5. Place the file outside of the Web document directory (Figure 8.3).
Because the file contains sensitive MySQL access information, it ought to be stored securely. If you can, place it in the directory immediately above or otherwise outside of the Web directory. This way the file will not be accessible from a Web browser. See the “Organizing Your Documents” sidebar for more.

Figure 8.3 A visual representation of a server’s Web documents, where mysqli_connect.php is not stored within the main directory (htdocs).

6. Temporarily place a copy of the script within the Web directory and run it in your Web browser (Figure 8.4).
In order to test the script, you’ll want to place a copy on the server so that it’s accessible from the Web browser (which means it must be in the Web directory). If the script works properly, the result should be a blank page (see Figure 8.4). If you see an Access denied… or similar message (see Figure 8.2), it means that the combination of username, password, and host does not have permission to access the particular database.

Figure 8.4 If the MySQL connection script works properly, the end result will be a blank page (no HTML is generated by the script).

7. Remove the temporary copy from the Web directory.

Organizing Your Documents

I introduced the concept of site structure back in Chapter 3 when developing the first Web application. Now that pages will begin using a database connection script, the topic is more important.

Should the database connectivity information (username, password, host, and database) fall into malicious hands, it could be used to steal your information or wreak havoc upon the database as a whole. Therefore, you cannot keep a script like mysqli_connect.php too secure.

The best recommendation for securing such a file is to store it outside of the Web documents directory. If, for example, the htdocs folder in Figure 8.3 is the root of the Web directory (in other words, the URL www.example.com leads there), then not storing mysqli_connect.php anywhere within the htdocs directory means it will never be accessible via the Web browser. Granted, the source code of PHP scripts is not viewable from the Web browser (only the data sent to the browser by the script is), but you can never be too careful. If you aren’t allowed to place documents outside of the Web directory, placing mysqli_connect.php in the Web directory is less secure, but not the end of the world.

Secondarily, I would recommend using a .php extension for your connection scripts. A properly configured and working server will execute rather than display code in such a file. Conversely, if you use just .inc as your extension, that page’s contents would be displayed in the Web browser if accessed directly.

✔ Tips

■ The same values used in Chapter 5 to log in to the mysql client should work from your PHP scripts.

■ If you receive an error that claims mysqli_connect() is an undefined function, it means that PHP has not been compiled with support for the Improved MySQL Extension. See the appendix for installation information.

■ If you see a Can’t connect… error message when running the script (see Figure 8.5), it likely means that MySQL isn’t running.

■ In case you are curious, Figure 8.6 shows what would happen if you didn’t use @ before mysqli_connect() and an error occurred.

■ If you don’t need to select the database when establishing a connection to MySQL, omit that argument from the mysqli_connect() function:

    $dbc = mysqli_connect (hostname, username, password);

Then, when appropriate, you can select the database using

    mysqli_select_db($dbc, db_name);

Figure 8.5 Another reason why PHP might not be able to connect to MySQL (besides using invalid username/password/hostname/database information) is if MySQL isn’t currently running.

Figure 8.6 If you don’t use the error suppression operator (@), you’ll see both the PHP error and the custom OR die() error.

Executing Simple Queries

Once you have successfully connected to and selected a database, you can start performing queries. These queries can be as basic as inserts, updates, and deletions or as involved as complex joins returning numerous rows. In any case, the PHP function for executing a query is mysqli_query():

    result = mysqli_query(dbc, query);

The function takes the database connection as its first argument and the query itself as the second. I normally assign the query to another variable, called $query or just $q. So running a query might look like

    $r = mysqli_query($dbc, $q);

For simple queries like INSERT, UPDATE, DELETE, etc. (which do not return records), the $r variable, short for result, will be either TRUE or FALSE, depending upon whether the query executed successfully. Keep in mind that “executed successfully” means that it ran without error; it doesn’t mean it necessarily had the desired result; you’ll need to test for that.
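To illustrate the difference between a query that ran and a query that did what you wanted, here is a hedged sketch; it is not taken from the book’s scripts. The helper name describe_simple_query() is hypothetical, and it assumes $dbc is an open connection returned by mysqli_connect(). It relies on mysqli_affected_rows(), a standard mysqli function, to report how many rows the query changed. The call to mysqli_report(MYSQLI_REPORT_OFF) is an assumption about your setup: PHP 8.1 and later throw exceptions on query errors by default, and switching that off restores the TRUE/FALSE behavior described here.

```php
<?php
// Hypothetical helper (not from the book). Assumes $dbc is an open
// connection from mysqli_connect() and $q is an INSERT, UPDATE, or DELETE.
function describe_simple_query(mysqli $dbc, string $q): string
{
    // Assumption: make mysqli_query() return FALSE on failure instead of
    // throwing an exception (the default in PHP 8.1 and later).
    mysqli_report(MYSQLI_REPORT_OFF);

    $r = @mysqli_query($dbc, $q);

    if ($r === false) {
        // The query did not execute at all.
        return 'Query failed: ' . mysqli_error($dbc);
    }

    // The query ran without error; now check whether it changed anything.
    $rows = mysqli_affected_rows($dbc);
    if ($rows > 0) {
        return "Query ran and affected $rows row(s).";
    }
    return 'Query ran without error but changed no rows.';
}

// Example use, once mysqli_connect.php has defined $dbc:
// echo describe_simple_query($dbc, "UPDATE users SET first_name='Ann' WHERE user_id=1");
```

An UPDATE whose WHERE clause matches no rows is the typical case where the query “executed successfully” yet accomplished nothing, which is why the row count is checked separately from $r.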
For complex queries that return records (SELECT, SHOW, DESCRIBE, and EXPLAIN), $r will be a link to the results of the query (a result object) if it worked or be FALSE if it did not. Thus, you can use this line of code in a conditional to test if the query successfully ran:

    $r = mysqli_query ($dbc, $q);
    if ($r) { // Worked!
    }

If the query did not successfully run, some sort of MySQL error must have occurred. To find out what that error was, call the mysqli_error() function:

    echo mysqli_error($dbc);

One final, albeit optional, step in your script would be to close the existing MySQL connection once you’re finished with it:

    mysqli_close($dbc);

This function is not required, because PHP will automatically close the connection at the end of a script, but it does make for good programming form to incorporate it.

To demonstrate this process, let’s create a registration script. It will show the form when first accessed (Figure 8.7), handle the form submission, and, after validating all the data, insert the registration information into the users table of the sitename database.

Figure 8.7 The registration form.

To execute simple queries:

1. Create a new PHP script in your text editor or IDE (Script 8.3).

5. Validate the password.

    if (!empty($_POST['pass1'])) {
        if ($_POST['pass1'] != $_POST['pass2']) {
            $errors[] = 'Your password did not match the confirmed password.';
        } else {
            $p = trim($_POST['pass1']);
        }
    } else {
        $errors[] = 'You forgot to enter your password.';
    }

To validate the password, the script needs to check the pass1 input for a value and then confirm that the pass1 value matches the pass2 value (so the password and confirmed password are the same).

6. Check if it’s OK to register the user.

    if (empty($errors)) {

If the submitted data passed all of the conditions, the $errors array will have no values in it (it will be empty), so this condition will be TRUE and it’s safe to add the record to the database. If the $errors array is not empty, then the appropriate error messages should be printed (see Step 10) and the user given another opportunity to register.

    ’ . mysqli_error($dbc) . ‘

    Query: ‘ . $q . ‘

    ’; 66 67 } // End of if ($r) IF. 68 69 mysqli_close($dbc); // Close the database connection. 70 71 // Include the footer and quit the script: 72 include (‘includes/footer.html’); 73 exit(); 74 75 } else { // Report the errors. 76 77 echo ‘

    Error!

    78

    The following error(s) occurred:
    ’; 79 foreach ($errors as $msg) { // Print each error. 80 echo “ - $msg
    \n”; 81 } 82 echo ‘

    Please try again.


    ’; 83 84 } // End of if (empty($errors)) IF. 85 86 } // End of the main Submit conditional. 87 ?> 88

    Register

    89
    90

    First Name: ” />

    91

    Last Name: ” />

    Script 8.3 continued (script continues on next page) continues on next page 7. Add the user to the database. require_once ➝ ('../mysqli_connect.php'); $q = "INSERT INTO users (first_name, ➝ last_name, email, pass, ➝ registration_date) VALUES ('$fn', ➝ '$ln', '$e', SHA1('$p'), NOW() )"; $r = @mysqli_query ($dbc, $q); The first line of code will insert the con- tents of the mysqli_connect.php file into this script, thereby creating a connection to MySQL and selecting the database. You may need to change the reference to the location of the file as it is on your server (as written, this line assumes that mysqli_connect.php is in the parent fold- er of the current folder). The query itself is similar to those demon- strated in Chapter 5. The SHA1() function is used to encrypt the password, and NOW() is used to set the registration date as this moment. After assigning the query to a variable, it is run through the mysqli_query() function, which sends the SQL command to the MySQL database. As in the mysqli_ connect.php script, the mysqli_query() call is preceded by @ in order to suppress any ugly errors. If a problem occurs, the error will be handled more directly in the next step. 234 Chapter 8 Executing Simple Queries 92

    Email Address: ” />

    93

    Password:

    94

    Confirm Password:

    95

8. Report on the success of the registration.

    if ($r) { // If it ran OK.
        echo '
        Thank you!
        You are now registered. In Chapter 11 you will actually be able to log in!
        ';
    } else { // If it did not run OK.
        // Public message:
        echo '
        System Error
        You could not be registered due to a system error. We apologize for any inconvenience.
        ';
        // Debugging message:
        echo '
        ' . mysqli_error($dbc) . '
        Query: ' . $q . '
        ';
    } // End of if ($r) IF.

The $r variable, which is assigned the value returned by mysqli_query(), can be used in a conditional to indicate the successful operation of the query.
If $r is TRUE, then a Thank you! message is displayed (Figure 8.8). If $r is FALSE, error messages are printed. For debugging purposes, the error messages will include both the error spit out by MySQL (thanks to the mysqli_error() function) and the query that was run (Figure 8.9). This information is critical to debugging the problem.

Figure 8.8 If the user could be registered in the database, this message is displayed.

Figure 8.9 Any MySQL errors caused by the query will be printed, as will the query that was being run.

9. Close the database connection and complete the HTML template.

    mysqli_close($dbc);
    include ('includes/footer.html');
    exit();

Closing the connection isn’t required but is a good policy. Then the footer is included and the script terminated (thanks to the exit() function). If those two lines weren’t here, then the registration form would be displayed again (which isn’t necessary after a successful registration).

10. Print out any error messages and close the submit conditional.

    } else { // Report the errors.
        echo '
        Error!
        The following error(s) occurred:
        ';
        foreach ($errors as $msg) { // Print each error.
            echo " - $msg
            \n";
        }
        echo '
        Please try again.
        ';
    } // End of if (empty($errors)) IF.
    } // End of the main Submit conditional.

The else clause is invoked if there were any errors. In that case, all of the errors are displayed using a foreach loop (Figure 8.10).
The final closing curly brace closes the main submit conditional. The main conditional is a simple IF, not an if-else, so that the form can be made sticky (again, see Chapter 3).

Figure 8.10 Each form validation error is reported to the user so that they may try registering again.

11. Close the PHP section and begin the HTML form.

    ?>
    Register
    First Name:
    Last Name:

The form is really simple, with one text input for each field in the users table (except for the user_id column, which will automatically be populated). Each input is made sticky, using the code

    value=""

Also, I would strongly recommend that you use the same name for your form inputs as the corresponding column in the database where that value will be stored. Further, you should set the maximum input length in the form equal to the maximum column length in the database. Both of these habits help to minimize errors.

12. Complete the HTML form.

    Email Address:
    Password:
    Confirm Password:

This is all much like that in Step 11. A submit button and a hidden input are in the form as well. The hidden input trick is discussed in (you guessed it) Chapter 3.
As a side note, I don’t need to follow my maxlength recommendation (from Step 11) with the password inputs, because they will be hashed with SHA1(), which always creates a string 40 characters long. And since there are two of them, they can’t both use the same name as the column in the database.

13. Complete the template.

14. Save the file as register.php, place it in your Web directory, and test it in your Web browser.
Note that if you use an apostrophe in one of the form values, it will likely break the query (Figure 8.11). The section “Ensuring Secure SQL” later in this chapter will show how to protect against this.

✔ Tips

■ After running the script, you can always ensure that it worked by using the mysql client or phpMyAdmin to view the values in the users table.

■ You should not end your queries with a semicolon in PHP, as you did when using the mysql client. When working with MySQL, this is a common, albeit harmless, mistake to make. When working with other database applications (Oracle, for one), doing so will make your queries unusable.

■ As a reminder, the mysqli_query() function returns TRUE if the query could be executed on the database without error. This does not necessarily mean that the result of the query is what you were expecting. Later scripts will demonstrate how to more accurately gauge the success of a query.

■ You are not obligated to create a $q variable as I tend to do (you could directly insert your query text into mysqli_query()). However, as the construction of your queries becomes more complex, using a variable will be the only practical option.

■ Practically any query you would run in the mysql client can also be executed using mysqli_query().
■ Another benefit of the Improved MySQL Extension over the standard extension is that the mysqli_multi_query() function lets you execute multiple queries at one time. The syntax for doing so, particularly if the queries return results, is a bit more complicated, so see the PHP manual if you have this need. 238 Chapter 8 Executing Simple Queries Figure 8.11 Apostrophes in form values (like the last name here) will conflict with the apostrophes used to delineate values in the query. Retrieving Query Results The preceding section of this chapter demon- strates how to execute simple queries on a MySQL database. A simple query, as I’m call- ing it, could be defined as one that begins with INSERT, UPDATE, DELETE, or ALTER. What all four of these have in common is that they return no data, just an indication of their success. Conversely, a SELECT query generates information (i.e., it will return rows of records) that has to be handled by other PHP functions. The primary tool for handling SELECT query results is mysqli_fetch_array(), which uses the query result variable (that I’ve been call- ing $r) and returns one row of data at a time, in an array format. You’ll want to use this function within a loop that will continue to access every returned row as long as there are more to be read. The basic construction for reading every record from a query is while ($row = mysqli_fetch_array($r)) { // Do something with $row. } Youwillalmostalwayswanttouseawhile loop to fetch the results from a SELECT query. The mysqli_fetch_array() function takes an optional second parameter specifying what type of array is returned: associative, indexed, or both. An associative array allows you to refer to column values by name, whereas an indexed array requires you to use only numbers (starting at 0 for the first col- umn returned). Each parameter is defined by a constant listed in Table 8.1. The MYSQLI_NUM setting is marginally faster (and uses less memory) than the other options. 
MYSQLI_ASSOC, in contrast with MYSQLI_NUM, is more overt ($row['column'] rather than $row[3]) and may continue to work even if the query changes.

An optional step you can take when using mysqli_fetch_array() would be to free up the query result resources once you are done using them:

    mysqli_free_result ($r);

This line removes the overhead (memory) taken by $r. It's an optional step, since PHP will automatically free up the resources at the end of a script, but, like using mysqli_close(), it does make for good programming form.

To demonstrate how to handle results returned by a query, let's create a script for viewing all of the currently registered users.

Table 8.1 mysqli_fetch_array() Constants

    Constant        Example
    MYSQLI_ASSOC    $row['column']
    MYSQLI_NUM      $row[0]
    MYSQLI_BOTH     $row[0] or $row['column']

Adding one of these constants as an optional parameter to the mysqli_fetch_array() function dictates how you can access the values returned. The default setting of the function is MYSQLI_BOTH.

To retrieve query results:

1. Create a new PHP document in your text editor or IDE (Script 8.4).

    // (The opening lines of the script are missing from this copy.)
    echo '<h1>Registered Users</h1>';

2. Connect to and query the database.

    require_once ('../mysqli_connect.php'); // Connect to the db.
    $q = "SELECT CONCAT(last_name, ', ', first_name) AS name, DATE_FORMAT(registration_date, '%M %d, %Y') AS dr FROM users ORDER BY registration_date ASC";
    $r = @mysqli_query ($dbc, $q); // Run the query.

    The query here will return two columns (Figure 8.12): the users' names (formatted as Last Name, First Name) and the date they registered (formatted as Month DD, YYYY). Because both columns are formatted using MySQL functions, aliases are given to the returned results (name and dr, respectively). See Chapter 5 if you are confused by any of this syntax.

3. Display the query results.

    if ($r) { // If it ran OK, display the records.
        // Table header.
        echo '<table>
        <tr><td align="left">Name</td><td align="left">Date Registered</td></tr>';
        // Fetch and print all the records:
        while ($row = mysqli_fetch_array($r, MYSQLI_ASSOC)) {
            echo '<tr><td align="left">' . $row['name'] . '</td><td align="left">' . $row['dr'] . '</td></tr>';
        }
        echo '</table>'; // Close the table.

    To display the results, make a table and a header row in HTML. Then loop through the results using mysqli_fetch_array() and print each fetched row. Finally, close the table. Notice that within the while loop, the code refers to each returned value using the proper alias: $row['name'] and $row['dr']. The script could not refer to $row['first_name'] or $row['date_registered'] because no such field name was returned (see Figure 8.12).

4. Free up the query resources.

        mysqli_free_result ($r); // Free up the resources.

    Again, this is an optional step but a good one to take.

5. Complete the main conditional.

    } else { // If it did not run OK.
        // Public message:
        echo '<p>The current users could not be retrieved. We apologize for any inconvenience.</p>';
        // Debugging message:
        echo '<p>' . mysqli_error($dbc) . '<br /><br />Query: ' . $q . '</p>';
    } // End of if ($r) IF.

    As in the register.php example, there are two kinds of error messages here. The first is a generic message, the type you'd show in a live site. The second is much more detailed, printing both the MySQL error and the query, both being critical for debugging purposes.

6. Close the database connection and finish the page.

    mysqli_close($dbc); // Close the database connection.
    include ('includes/footer.html');
    ?>

7. Save the file as view_users.php, place it in your Web directory, and test it in your browser (Figure 8.13).

Script 8.4 The view_users.php script runs a static query on the database and prints all of the returned rows.

Figure 8.12 The query results as run within the mysql client.

Figure 8.13 All of the user records are retrieved from the database and displayed in the Web browser.

✔ Tips

■ The function mysqli_fetch_row() is the equivalent of mysqli_fetch_array ($r, MYSQLI_NUM);
■ The function mysqli_fetch_assoc() is the equivalent of mysqli_fetch_array ($r, MYSQLI_ASSOC);
■ As with any associative array, when you retrieve records from the database, you must refer to the columns exactly as they are defined in the database. This is to say that the keys are case-sensitive.
■ If you are in a situation where you need to run a second query inside of your while loop, be certain to use different variable names for that query. For example, the inner query would use $r2 and $row2 instead of $r and $row. If you don't do this, you'll encounter logical errors.
■ I frequently see beginning PHP developers muddle the process of fetching query results. Remember that you must execute the query using mysqli_query(), and then use mysqli_fetch_array() to retrieve a single row of information. If you have multiple rows to retrieve, use a while loop.

Ensuring Secure SQL

Database security with respect to PHP comes down to three broad issues:

1. Protecting the MySQL access information
2. Not revealing too much about the database
3. Being cautious when running queries, particularly those involving user-submitted data

You can accomplish the first objective by securing the MySQL connection script outside of the Web directory so that it is never viewable through a Web browser (see Figure 8.3). I discuss this in some detail earlier in the chapter. The second objective is attained by not letting the user see PHP's error messages or your queries (in these scripts, that information is printed out for your debugging purposes; you'd never want to do that on a live site).

For the third objective, there are numerous steps you can and should take, all based upon the premise of never trusting user-supplied data. First, validate that some value has been submitted, or that it is of the proper type (number, string, etc.). Second, use regular expressions to make sure that submitted data matches what you would expect it to be (this topic is covered in Chapter 13, "Perl-Compatible Regular Expressions"). Third, you can typecast some values to guarantee that they're numbers (discussed in Chapter 12, "Security Methods").

A fourth recommendation is to run user-submitted data through the mysqli_real_escape_string() function. This function cleans data by escaping what could be problematic characters. It's used like so:

    $clean = mysqli_real_escape_string($dbc, $data);

For security purposes, mysqli_real_escape_string() should be used on every text input in a form. To demonstrate this, let's revamp register.php (Script 8.3).

To use mysqli_real_escape_string():

1. Open register.php (Script 8.3) in your text editor or IDE.

2. Move the inclusion of the mysqli_connect.php file (line 46 in Script 8.3) to just after the main conditional (Script 8.5).
    Because the mysqli_real_escape_string() function requires a database connection, the mysqli_connect.php script must be required earlier in the script.

Script 8.5 (the listing resumes at line 55)

    55      echo '<h1>Thank you!</h1>
    56      <p>You are now registered. In Chapter 11 you will actually be able to log in!</p>';
    57
    58  } else { // If it did not run OK.
    59
    60      // Public message:
    61      echo '<h1>System Error</h1>
    62      <p>You could not be registered due to a system error. We apologize for any inconvenience.</p>';
    63
    64      // Debugging message:
    65      echo '<p>' . mysqli_error($dbc) . '<br /><br />Query: ' . $q . '</p>';
    66
    67  } // End of if ($r) IF.
    68
    69  mysqli_close($dbc); // Close the database connection.
    70
    71  // Include the footer and quit the script:
    72  include ('includes/footer.html');
    73  exit();
    74
    75  } else { // Report the errors.
    76
    77      echo '<h1>Error!</h1>
    78      <p>The following error(s) occurred:<br />';
    79      foreach ($errors as $msg) { // Print each error.
    80          echo " - $msg<br />\n";
    81      }
    82      echo '</p><p>Please try again.</p>';
    83
    84  } // End of if (empty($errors)) IF.
    85
    86  mysqli_close($dbc); // Close the database connection.
    87
    88  } // End of the main Submit conditional.
    89  ?>

4. Add a second call to mysqli_close() before the end of the main conditional.

    mysqli_close($dbc);

    To be consistent, since the database connection is opened as the first step of the main conditional, it should be closed as the last step of this same conditional. It still needs to be closed before including the footer and terminating the script (lines 72 and 73), though.

5. Save the file as register.php, place it in your Web directory, and test it in your Web browser (Figures 8.14 and 8.15).
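As a rough sketch of the idea behind these steps (not the book's Script 8.5, whose validation code is not reproduced here), the escaping could be wrapped in a pair of helpers. Both function names are hypothetical, $dbc is assumed to be an open connection, and the INSERT statement is an assumption that only illustrates where the cleaned values would go, using column names from the users table.

```php
<?php
// Hypothetical helper: trim a submitted value and escape it so that it
// can be placed between single quotes in an SQL query.
// Requires an open database connection.
function escape_input($dbc, $value)
{
    return mysqli_real_escape_string($dbc, trim($value));
}

// Hypothetical helper: build a registration query from raw form values.
// The exact query is an assumption made for this illustration.
function build_register_query($dbc, $first, $last, $email, $pass)
{
    $fn = escape_input($dbc, $first);
    $ln = escape_input($dbc, $last);
    $e  = escape_input($dbc, $email);
    $p  = escape_input($dbc, $pass);

    return "INSERT INTO users (first_name, last_name, email, pass, registration_date) "
         . "VALUES ('$fn', '$ln', '$e', SHA1('$p'), NOW())";
}

// Usage, assuming $dbc is an open connection and the form was submitted:
// $q = build_register_query($dbc, $_POST['first_name'], $_POST['last_name'],
//                           $_POST['email'], $_POST['pass1']);
// $r = @mysqli_query($dbc, $q);
```

With MySQL's default settings, a last name such as O'Toole should arrive in the query as O\'Toole, so the apostrophe no longer ends the quoted value early. The pass1 key in the usage comment is likewise only a placeholder for whatever the password input is named.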

Script 8.5, lines 90-100: the registration form, headed Register, with inputs labeled First Name, Last Name, Email Address, Password, and Confirm Password (the form markup is missing from this copy).

Figure 8.14 Values with apostrophes in them, like a person's last name, will no longer break the INSERT query, thanks to the mysqli_real_escape_string() function.

Figure 8.15 Now the registration process will handle problematic characters and be more secure.

✔ Tips

■ The mysqli_real_escape_string() function escapes a string in accordance with the language (character set) being used, which is an added advantage over alternative solutions.
■ If you see results like those in Figure 8.16, it means that the mysqli_real_escape_string() function cannot access the database (because it has no connection, like $dbc).
■ If Magic Quotes is enabled on your server (which means you're using a version of PHP prior to 6), you'll need to remove any slashes added by Magic Quotes, prior to using the mysqli_real_escape_string() function. The code (cumbersome as it is) would look like:

    $fn = mysqli_real_escape_string ($dbc, trim (stripslashes ($_POST['first_name'])));

  If you don't use stripslashes() and Magic Quotes is enabled, the form values will be doubly escaped.

Figure 8.16 Since the mysqli_real_escape_string() function requires a database connection, using it without that connection (e.g., before including the connection script) can lead to other errors.

Counting Returned Records

The next logical function to discuss is mysqli_num_rows(). This function returns the number of rows retrieved by a SELECT query. It takes one argument, the query result variable:

    $num = mysqli_num_rows($r);

Although simple in purpose, this function is very useful. It's necessary if you want to paginate your query results (an example of this can be found in the next chapter). It's also a good idea to use this function before you attempt to fetch any results using a while loop (because there's no need to fetch the results if there aren't any, and attempting to do so may cause errors).

In this next sequence of steps, let's modify view_users.php to list the total number of registered users. For another example of how you might use mysqli_num_rows(), see the sidebar.

Sidebar: Modifying register.php

The mysqli_num_rows() function could be applied to register.php to prevent someone from registering with the same email address multiple times. Although the UNIQUE index on that column in the database will prevent that from happening, such attempts will create a MySQL error. To prevent this using PHP, run a SELECT query to confirm that the email address isn't currently registered. That query would be simply

    SELECT user_id FROM users WHERE email='$e'

You would run this query (using the mysqli_query() function) and then call mysqli_num_rows(). If mysqli_num_rows() returns 0, you know that the email address hasn't already been registered and it's safe to run the INSERT.

To modify view_users.php:

1. Open view_users.php (refer to Script 8.4) in your text editor or IDE.

2. Before the if ($r) conditional, add this line (Script 8.6):

    // Count the number of returned rows:
    $num = mysqli_num_rows ($r);

    This line will assign the number of rows returned by the query to the $num variable.

3. Change the original $r conditional to

    if ($num > 0) {

    The conditional as it was written before was based upon whether the query did or did not successfully run, not whether or not any records were returned. Now it will be more accurate.

4. Before creating the HTML table, print the number of registered users.

    echo "<p>There are currently $num registered users.</p>\n";

5. Change the else part of the main conditional to read

    echo '<p>There are currently no registered users.</p>';

    The original conditional was based upon whether or not the query worked. Hopefully you've successfully debugged the query so that it is working and the original error messages are no longer needed. Now the error message just indicates if no records were returned.

6. Save the file as view_users.php, place it in your Web directory, and test it in your Web browser (Figure 8.17).

Script 8.6 Now the view_users.php script will display the total number of registered users, thanks to the mysqli_num_rows() function.

Figure 8.17 The number of registered users is now displayed at the top of the page.

Updating Records with PHP

The last technique in this chapter shows how to update database records through a PHP script. Doing so requires an UPDATE query, and its successful execution can be verified with PHP's mysqli_affected_rows() function.

While the mysqli_num_rows() function will return the number of rows generated by a SELECT query, mysqli_affected_rows() returns the number of rows affected by an INSERT, UPDATE, or DELETE query. It's used like so:

    $num = mysqli_affected_rows($dbc);

Unlike mysqli_num_rows(), the one argument the function takes is the database connection ($dbc), not the results of the previous query ($r).

The following example will be a script that allows registered users to change their password. It demonstrates two important ideas:

◆ Checking a submitted username and password against registered values (the key to a login system as well)
◆ Updating database records using the primary key as a reference

As with the registration example, this one PHP script will both display the form (Figure 8.18) and handle it.

Figure 8.18 The form for changing a user's password.

To update records with PHP:

1. Create a new PHP script in your text editor or IDE (Script 8.7).

6. If all the tests are passed, retrieve the user's ID.

    if (empty($errors)) {
        $q = "SELECT user_id FROM users WHERE (email='$e' AND pass=SHA1('$p') )";
        $r = @mysqli_query($dbc, $q);
        $num = @mysqli_num_rows($r);
        if ($num == 1) {
            $row = mysqli_fetch_array($r, MYSQLI_NUM);

    This first query will return just the user_id field for the record that matches the submitted email address and password (Figure 8.19). To compare the submitted password against the stored one, encrypt it again with the SHA1() function. If the user is registered and has correctly entered both the email address and password, exactly one row will be selected (since the email value must be unique across all rows). Finally, this one record is assigned as an array (of one element) to the $row variable.
    If this part of the script doesn't work for you, apply the standard debugging methods: remove the error suppression operators (@) so that you can see what errors, if any, occur; use the mysqli_error() function to report any MySQL errors; and print, then run the query using another interface (as in Figure 8.19).

7. Update the database.

    $q = "UPDATE users SET pass=SHA1('$np') WHERE user_id=$row[0]";
    $r = @mysqli_query($dbc, $q);

    This query will change the password, using the new submitted value, where the user_id column is equal to the number retrieved from the previous query.

8. Check the results of the query.

    if (mysqli_affected_rows($dbc) == 1) {
        echo '<h1>Thank you!</h1>
        <p>Your password has been updated. In Chapter 11 you will actually be able to log in!</p>';
    } else { // If it did not run OK.
        // Public message:
        echo '<h1>System Error</h1>
        <p>Your password could not be changed due to a system error. We apologize for any inconvenience.</p>';
        // Debugging message:
        echo '<p>' . mysqli_error($dbc) . '<br /><br />Query: ' . $q . '</p>';
    }

    This part of the script again works much like register.php. In this case, if mysqli_affected_rows() returns the number 1, the record has been updated, and a success message will be printed. If not, both a public, generic message and a more useful debugging message will be printed.

9. Include the footer and terminate the script.

    include ('includes/footer.html');
    exit();

    At this point in the script, the UPDATE query has been run. It either worked or it did not (because of a system error). In both cases, there's no need to show the form again, so the footer is included (to complete the page) and the script is terminated, using the exit() function.

10. Complete the if ($num == 1) conditional.

    } else { // Invalid email address/password combination.
        echo '<h1>Error!</h1>
        <p>The email address and password do not match those on file.</p>';
    }

    If mysqli_num_rows() does not return a value of 1, then the submitted email address and password do not match those on file and this error is printed. In this case, the form will be displayed again so that the user can enter the correct information.

11. Print any validation error messages.

    } else { // Report the errors.
        echo '<h1>Error!</h1>
        <p>The following error(s) occurred:<br />';
        foreach ($errors as $msg) { // Print each error.
            echo " - $msg<br />\n";
        }
        echo '</p><p>Please try again.</p>';
    } // End of if (empty($errors)) IF.

    This else clause applies if the $errors array is not empty (which means that the form data did not pass all the validation tests). As in the registration page, the errors will be printed.

12. Close the database connection and complete the PHP code.

    mysqli_close($dbc); // Close the database connection.
    } // End of the main Submit conditional.
    ?>

13. Display the form.
    (The form markup is missing from this copy. It consists of the heading Change Your Password and four labeled inputs: Email Address, Current Password, New Password, and Confirm New Password.)
    The form takes three different inputs of type password (the current password, the new one, and a confirmation of the new password) and one text input for the email address. The email address input is sticky (password inputs cannot be).

14. Include the footer file.

    <?php include ('includes/footer.html'); ?>

15. Save the file as password.php, place it in your Web directory, and test it in your Web browser (Figures 8.20 and 8.21).

Figure 8.19 The result when running the SELECT query from the script (the first of two queries it has) within the mysql client.

Figure 8.20 The password was changed in the database.

Figure 8.21 If the entered email address and password don't match those on file, the password will not be updated.

✔ Tips

■ If you delete every record from a table using the command TRUNCATE tablename, mysqli_affected_rows() will return 0, even if the query was successful and every row was removed. This is just a quirk.
■ If an UPDATE query runs but does not actually change the value of any column (for example, a password is replaced with the same password), mysqli_affected_rows() will return 0.
■ The mysqli_affected_rows() conditional used here could (and maybe should) also be applied to the register.php script to confirm that one record was added. That would be a more exacting condition to check than if ($r).

9 Common Programming Techniques

Now that you have a little PHP and MySQL interaction under your belt, it's time to take things up a notch. This chapter is similar to Chapter 3, "Creating Dynamic Web Sites," in that it covers myriad independent topics. But what all of these have in common is that they demonstrate common PHP-MySQL programming techniques. You won't learn new functions here; instead, you'll see how to use the knowledge you already possess to create standard Web functionality.

The examples themselves will broaden the Web application started in the preceding chapter by adding new, popular features. You'll see several tricks for managing database information, in particular editing and deleting records using PHP. At the same time, a couple of new ways of passing data to your PHP pages will be introduced. The final sections of the chapter add features to the view_users.php page.

Sending Values to a Script

In the examples so far, all of the data received in the PHP script came from what the user entered in a form. There are, however, two different ways you can pass variables and values to a PHP script, both worth knowing.

The first method is to make use of HTML's hidden input type:

    <input type="hidden" name="do" value="this" />

As long as this code is anywhere between the form tags, the variable $_POST['do'] will have a value of this in the handling PHP script (assuming that the form uses the POST method). You've already been using this technique in the book with a hidden input named submitted, used to test when a form should be handled.

The second method for sending values to a PHP script is to append it to the URL:

    www.example.com/page.php?do=this

This technique emulates the GET method of an HTML form. With this specific example, page.php receives a variable called $_GET['do'] with a value of this.

To demonstrate this GET method trick, a new version of the view_users.php script, first created in the last chapter, will be written. This one will provide links to pages that will allow you to edit or delete an existing user's record. The links will pass the user's ID to the handling pages, both of which will also be written in this chapter.

To manually send values to a PHP script:

1. Open view_users.php (Script 8.6) in your text editor or IDE.
2. Change the SQL query to read (Script 9.1):

    $q = "SELECT last_name, first_name, DATE_FORMAT(registration_date, '%M %d, %Y') AS dr, user_id FROM users ORDER BY registration_date ASC";

    The query has been changed in a couple of ways. First, the first and last names are selected separately, not concatenated together. Second, the user_id is also now being selected, as that value will be necessary in creating the links.

3. Add three more columns to the main table.

    echo '<table>
    <tr>
        <td align="left">Edit</td>
        <td align="left">Delete</td>
        <td align="left">Last Name</td>
        <td align="left">First Name</td>
        <td align="left">Date Registered</td>
    </tr>';

    In the previous version of the script, there were only two columns: one for the name and another for the date the user registered. The name column has been separated into its two parts and two new columns added: one for the Edit link and another for the Delete link.

4. Change the echo statement within the while loop to match the table's new structure.

    echo '<tr>
        <td align="left"><a href="edit_user.php?id=' . $row['user_id'] . '">Edit</a></td>
        <td align="left"><a href="delete_user.php?id=' . $row['user_id'] . '">Delete</a></td>
        <td align="left">' . $row['last_name'] . '</td>
        <td align="left">' . $row['first_name'] . '</td>
        <td align="left">' . $row['dr'] . '</td>
    </tr>';

    For each record returned from the database, this line will print out a row with five columns. The last three columns are obvious and easy to create: just refer to the returned column name.
    For the first two columns, which provide links to edit or delete the user, the syntax is slightly more complicated. The desired end result is HTML code like <a href="edit_user.php?id=X">Edit</a>, where X is the user's ID. Knowing this, all the PHP code has to do is print $row['user_id'] for X, being mindful of the quotation marks to avoid parse errors.
    Because the HTML attributes use a lot of double quotation marks and this echo() statement requires a lot of variables to be printed, I find it easiest to use single quotes for the HTML and then to concatenate the variables to the printed text.

5. Save the file as view_users.php, place it in your Web directory, and run it in your Web browser (Figure 9.1).

6. If you want, view the HTML source of the page to see each dynamically generated link (Figure 9.2).

Script 9.1 The view_users.php script, started in Chapter 8, "Using PHP with MySQL," now modified so that it presents Edit and Delete links, passing the user's ID number along in each URL.

Figure 9.1 The revised version of the view_users.php page, with new columns and links.

✔ Tips

■ To append multiple variables to a URL, use this syntax: page.php?name1=value1&name2=value2&name3=value3. It's simply a matter of using the ampersand, plus another name=value pair.
■ One trick to adding variables to URLs is that strings should be encoded to ensure that the value is handled properly. For example, the space in the string Elliott Smith would be problematic. The solution then is to use the urlencode() function:

    $url = 'page.php?name=' . urlencode ('Elliott Smith');

  You only need to do this when programmatically adding values to a URL.
When a form uses the GET method, it automati- cally encodes the data. 263 Common Programming Techniques Sending Values to a Script Figure 9.2 Part of the HTML source of the page (see Figure 9.1) shows how the user’s ID is added to each link’s URL. Using Hidden Form Inputs In the preceding example, a new version of the view_users.php script was written. This one now includes links to the edit_user.php and delete_user.php pages, passing each a user’s ID through the URL. This next example, delete_user.php, will take the passed user ID and allow the administrator to delete that user. Although you could have this page simply execute a DELETE query as soon as the page is accessed, for security purposes (and to prevent an inadvertent deletion), there should be multiple steps: 1. The page must check that it received a numeric user ID. 2. A message will confirm that this user should be deleted. 3. The user ID will be stored in a hidden form input. 4. Upon submission of this form, the user will actually be deleted. To use hidden form inputs: 1. Create a new PHP document in your text editor or IDE (Script 9.2). Delete a User'; This document will use the same tem- plate system as the other pages in the application. 264 Chapter 9 Using Hidden Form Inputs 1 Delete a User'; 9 10 // Check for a valid user ID, through GET or POST: 11 if ( (isset($_GET['id'])) && (is_numeric ($_GET['id'])) ) { // From view_users.php 12 $id = $_GET['id']; 13 } elseif ( (isset($_POST['id'])) && (is_numeric($_POST['id'])) ) { // Form submission. 14 $id = $_POST['id']; 15 } else { // No valid ID, kill the script. 16 echo '

    This page has been accessed in error.

    '; 17 include ('includes/footer.html'); 18 exit(); 19 } 20 21 require_once ('../mysqli_connect.php'); 22 23 // Check if the form has been submitted: 24 if (isset($_POST['submitted'])) { 25 26 if ($_POST['sure'] == 'Yes') { // Delete the record. 27 28 // Make the query: Script 9.2 This script expects a user ID to be passed to it through the URL. It then presents a confirmation form and deletes the user upon submission. (script continues on next page) continues on page 266 265 Common Programming Techniques Using Hidden Form Inputs 29 $q = "DELETE FROM users WHERE user_id=$id LIMIT 1"; 30 $r = @mysqli_query ($dbc, $q); 31 if (mysqli_affected_rows($dbc) == 1) { // If it ran OK. 32 33 // Print a message: 34 echo '

    The user has been deleted.

    '; 35 36 } else { // If the query did not run OK. 37 echo '

    The user could not be deleted due to a system error.

    '; // Public message. 38 echo '

    ' . mysqli_error($dbc) . '
    Query: ' . $q . '

    '; // Debugging message. 39 } 40 41 } else { // No confirmation of deletion. 42 echo '

    The user has NOT been deleted.

    '; 43 } 44 45 } else { // Show the form. 46 47 // Retrieve the user's information: 48 $q = "SELECT CONCAT(last_name, ', ', first_name) FROM users WHERE user_id=$id"; 49 $r = @mysqli_query ($dbc, $q); 50 51 if (mysqli_num_rows($r) == 1) { // Valid user ID, show the form. 52 53 // Get the user's information: 54 $row = mysqli_fetch_array ($r, MYSQLI_NUM); 55 (script continues) Script 9.2 continued 56 // Create the form: 57 echo '
    58

    Name: ' . $row[0] . '

    59

    Are you sure you want to delete this user?
    60 Yes 61 No

    62

    63 64 65
    '; 66 67 } else { // Not a valid user ID. 68 echo '

    This page has been accessed in error.

    '; 69 } 70 71 } // End of the main submission conditional. 72 73 mysqli_close($dbc); 74 75 include ('includes/footer.html'); 76 ?> Script 9.2 continued 3. Check for a valid user ID value. if ( (isset($_GET['id'])) && (is_ ➝ numeric($_GET['id'])) ) { $id = $_GET['id']; } elseif ( (isset($_POST['id'])) && ➝ (is_numeric($_POST['id'])) ) { $id = $_POST['id']; } else { echo '

    This page ➝ has been accessed in error. ➝

    ';
        include ('includes/footer.html');
        exit();
    }

This script relies upon having a valid user ID, which will be used in a DELETE query’s WHERE clause. The first time this page is accessed, the user ID should be passed in the URL (the page’s URL will end with delete_user.php?id=X), after clicking the Delete link in the view_users.php page. The first if condition checks for such a value and that the value is numeric.

As you will see, the script will then store the user ID value in a hidden form input. When the form is submitted (back to this same page), the page will receive the ID through $_POST. The second condition checks this and, again, that the ID value is numeric. If neither of these conditions is TRUE, then the page cannot proceed, so an error message is displayed and the script’s execution is terminated (Figure 9.3).

Figure 9.3 If the page does not receive a numeric ID value, this error is shown.

4. Include the MySQL connection script.

    require_once ('../mysqli_connect.php');

Both of this script’s processes, showing the form and handling the form, require a database connection, so this line is outside of the main submit conditional (Step 5).

5. Begin the main submit conditional.

    if (isset($_POST['submitted'])) {

6. Delete the user, if appropriate.

    if ($_POST['sure'] == 'Yes') {
        $q = "DELETE FROM users WHERE user_id=$id LIMIT 1";
        $r = @mysqli_query ($dbc, $q);

The form (Figure 9.4) will make the user click a radio button to confirm the deletion. This little step prevents any accidents. Thus, the handling process first checks that the right radio button was selected. If so, a basic DELETE query is defined, using the user’s ID in the WHERE clause. A LIMIT clause is added to the query as an extra precaution.

Figure 9.4 The page confirms the user deletion using this simple form.

7. Check if the deletion worked and respond accordingly.

    if (mysqli_affected_rows($dbc) == 1) {
        echo '

    The user has been ➝ deleted.

    '; } else { echo '

    The user ➝ could not be deleted due to a ➝ system error.

    '; echo '

    ' . mysqli_error($dbc) ➝ . '
    Query: ' . $q . ➝ '

    '; } The mysqli_affected_rows() function checks that exactly one row was affected by the DELETE query. If so, a happy mes- sage is displayed (Figure 9.5). If not, an error message is sent out. Keep in mind that it’s possible that no rows were affected without a MySQL error occurring. For example, if the query tries to delete the record where the user ID is equal to 42000 (and if that doesn’t exist), no rows will be deleted but no MySQL error will occur. Still, because of the checks made when the form is first loaded, it would take a fair amount of hacking by the user to get to that point. 8. Complete the $_POST['sure'] conditional. } else { echo '

    The user has NOT been deleted.

    ';
    }

If the user did not explicitly check the Yes box, the user will not be deleted and this message is displayed (Figure 9.6).

Figure 9.5 If you select Yes in the form (see Figure 9.4) and click Submit, this should be the result.

Figure 9.6 If you do not select Yes in the form, no database changes are made.

9. Begin the else clause of the main submit conditional.

    } else {

The page will either handle the form or display it. Most of the code prior to this takes effect if the form has been submitted (if $_POST['submitted'] is set). The code from here on takes effect if the form has not yet been submitted, in which case the form should be displayed.

10. Retrieve the information for the user being deleted.

    $q = "SELECT CONCAT(last_name, ', ', first_name) FROM users WHERE user_id=$id";
    $r = @mysqli_query ($dbc, $q);
    if (mysqli_num_rows($r) == 1) {

To confirm that the script received a valid user ID and to state exactly who is being deleted (refer back to Figure 9.4), the to-be-deleted user’s name is retrieved from the database (Figure 9.7). The conditional, which checks that a single row was returned, ensures that a valid user ID was provided.

11. Display the form.

    $row = mysqli_fetch_array ($r, MYSQLI_NUM);
    echo '

    Name: ' . $row[0] . '

    Are you sure you want to delete ➝ this user?
    Yes ➝ No

    ';

First, the database record returned by the SELECT query is retrieved using the mysqli_fetch_array() function. Then the form is printed, showing the name value retrieved from the database at the top. An important step here is that the user ID ($id) is stored as a hidden form input so that the handling process can also access this value (Figure 9.8).

Figure 9.7 Running the same SELECT query in the mysql client.

Figure 9.8 The user ID is stored as a hidden input so that it’s available when the form is submitted.

12. Complete the mysqli_num_rows() conditional.

    } else {
        echo '

    This page has been accessed in error.

    ';
    }

If no record was returned by the SELECT query (because an invalid user ID was submitted), this message is displayed. If you see this message when you test this script but don’t understand why, apply the standard debugging steps outlined at the end of Chapter 7, “Error Handling and Debugging.”

13. Complete the PHP page.

    }
    mysqli_close($dbc);
    include ('includes/footer.html');
    ?>

The closing brace finishes the main submission conditional. Then the MySQL connection is closed and the footer is included.

14. Save the file as delete_user.php and place it in your Web directory (it should be in the same directory as view_users.php).

15. Run the page by first clicking a Delete link in the view_users.php page.

✔ Tips

■ Another way of writing this script would be to have the form use the GET method. Then the validation conditional (lines 10–19) would only have to validate $_GET['id'], as the ID would be passed in the URL whether the page was first being accessed or the form had been submitted.

■ Hidden form elements don’t display in the Web browser but are still present in the HTML source code (Figure 9.8). For this reason, never store anything there that must be kept truly secure.

■ Using hidden form inputs and appending values to a URL are just two ways to make data available to other PHP pages. Two more methods, cookies and sessions, are thoroughly covered in Chapter 11, “Cookies and Sessions.”

Editing Existing Records

A common practice with database-driven Web sites is having a system in place so that you can easily edit existing records. This concept seems daunting to many beginning programmers, but the process is surprisingly straightforward. For the following example (editing registered user records) the process combines skills the book has already taught:

◆ Making sticky forms
◆ Using hidden inputs
◆ Validating registration data
◆ Running simple queries

This next example is generally very similar to delete_user.php and will also be linked from the view_users.php script (when a person clicks Edit). A form will be displayed with the user’s current information, allowing for those values to be changed (Figure 9.9). Upon submitting the form, if the data passes all of the validation routines, an UPDATE query will be run to update the database.

To edit an existing database record:

1. Create a new PHP document in your text editor or IDE (Script 9.3).

    Edit a User';

2. Check for a valid user ID value.

    if ( (isset($_GET['id'])) && (is_numeric($_GET['id'])) ) {
        $id = $_GET['id'];
    } elseif ( (isset($_POST['id'])) && (is_numeric($_POST['id'])) ) {
        $id = $_POST['id'];
    } else {
        echo '

    This ➝ page has been accessed in ➝ error.

    '; include ('includes/ ➝ footer.html'); exit(); } This validation routine is exactly the same as that in delete_user.php, con- firming that a numeric user ID has been received, whether the page has first been accessed from view_users.php (the first condition) or upon submission of the form (the second condition). 270 Chapter 9 Editing Existing Records Figure 9.9 The form for editing a user’s record. continues on page 273 271 Common Programming Techniques Editing Existing Records 1 Edit a User'; 10 11 // Check for a valid user ID, through GET or POST: 12 if ( (isset($_GET['id'])) && (is_numeric ($_GET['id'])) ) { // From view_users.php 13 $id = $_GET['id']; 14 } elseif ( (isset($_POST['id'])) && (is_numeric($_POST['id'])) ) { // Form submission. 15 $id = $_POST['id']; 16 } else { // No valid ID, kill the script. 17 echo '

    This page has been accessed in error.

    '; 18 include ('includes/footer.html'); 19 exit(); 20 } 21 22 require_once ('../mysqli_connect.php'); 23 24 // Check if the form has been submitted: 25 if (isset($_POST['submitted'])) { 26 27 $errors = array(); 28 29 // Check for a first name: 30 if (empty($_POST['first_name'])) { 31 $errors[] = 'You forgot to enter your first name.'; Script 9.3 The edit_user.php page first displays the user’s current information in a form. Upon submission of the form, the record will be updated in the database. (script continues) Script 9.3 continued 32 } else { 33 $fn = mysqli_real_escape_string($dbc, trim($_POST['first_name'])); 34 } 35 36 // Check for a last name: 37 if (empty($_POST['last_name'])) { 38 $errors[] = 'You forgot to enter your last name.'; 39 } else { 40 $ln = mysqli_real_escape_string($dbc, trim($_POST['last_name'])); 41 } 42 43 // Check for an email address: 44 if (empty($_POST['email'])) { 45 $errors[] = 'You forgot to enter your email address.'; 46 } else { 47 $e = mysqli_real_escape_string($dbc, trim($_POST['email'])); 48 } 49 50 if (empty($errors)) { // If everything's OK. 51 52 // Test for unique email address: 53 $q = "SELECT user_id FROM users WHERE email='$e' AND user_id != $id"; 54 $r = @mysqli_query($dbc, $q); 55 if (mysqli_num_rows($r) == 0) { 56 57 // Make the query: 58 $q = "UPDATE users SET first_name= '$fn', last_name='$ln', email='$e' WHERE user_id=$id LIMIT 1"; 59 $r = @mysqli_query ($dbc, $q); 60 if (mysqli_affected_rows($dbc) == 1) { // If it ran OK. 61 (script continues on next page) 272 Chapter 9 Editing Existing Records Script 9.3 continued 62 // Print a message: 63 echo '

    The user has been edited.

    '; 64 65 } else { // If it did not run OK. 66 echo '

    The user could not be edited due to a system error. We apologize for any inconvenience.

    '; // Public message. 67 echo '

    ' . mysqli_error($dbc) . '
    Query: ' . $q . '

    '; // Debugging message. 68 } 69 70 } else { // Already registered. 71 echo '

    The email address has already been registered.

    '; 72 } 73 74 } else { // Report the errors. 75 76 echo '

    The following error(s) occurred:
    '; 77 foreach ($errors as $msg) { // Print each error. 78 echo " - $msg
    \n"; 79 } 80 echo '

    Please try again.

    '; 81 82 } // End of if (empty($errors)) IF. 83 84 } // End of submit conditional. 85 86 // Always show the form... 87 88 // Retrieve the user's information: 89 $q = "SELECT first_name, last_name, email FROM users WHERE user_id=$id"; (script continues) Script 9.3 continued 90 $r = @mysqli_query ($dbc, $q); 91 92 if (mysqli_num_rows($r) == 1) { // Valid user ID, show the form. 93 94 // Get the user's information: 95 $row = mysqli_fetch_array ($r, MYSQLI_NUM); 96 97 // Create the form: 98 echo '
    99

    First Name:

    100

    Last Name:

    101

    Email Address:

    102

    103 104 105
    '; 106 107 } else { // Not a valid user ID. 108 echo '

    This page has been accessed in error.

    ';
109 }
110
111 mysqli_close($dbc);
112
113 include ('includes/footer.html');
114 ?>

3. Include the MySQL connection script and begin the main submit conditional.

    require_once ('../mysqli_connect.php');
    if (isset($_POST['submitted'])) {
        $errors = array();

Like the registration examples you have already done, this script makes use of an array to track errors.

4. Validate the first name.

    if (empty($_POST['first_name'])) {
        $errors[] = 'You forgot to enter your first name.';
    } else {
        $fn = mysqli_real_escape_string($dbc, trim($_POST['first_name']));
    }

The form (Figure 9.9) is like a registration page but without the password fields. The form data can therefore be validated using the same methods used in the registration scripts. As with the registration examples, the validated data is trimmed and then run through mysqli_real_escape_string() for security.

5. Validate the last name and email address.

    if (empty($_POST['last_name'])) {
        $errors[] = 'You forgot to enter your last name.';
    } else {
        $ln = mysqli_real_escape_string($dbc, trim($_POST['last_name']));
    }
    if (empty($_POST['email'])) {
        $errors[] = 'You forgot to enter your email address.';
    } else {
        $e = mysqli_real_escape_string($dbc, trim($_POST['email']));
    }

6. If there were no errors, check that the submitted email address is not already in use.

    if (empty($errors)) {
        $q = "SELECT user_id FROM users WHERE email='$e' AND user_id != $id";
        $r = @mysqli_query($dbc, $q);
        if (mysqli_num_rows($r) == 0) {

The integrity of the database and of the application as a whole partially depends upon having unique email address values in the users table. That requirement guarantees that the login system, which uses a combination of the email address and password (to be developed in Chapter 11), works. Because the form allows for altering the user’s email address (see Figure 9.9), special steps have to be taken to ensure uniqueness. To understand this query, consider two possibilities....
In the first, the user’s email address is being changed. In this case you just need to run a query making sure that that particular email address isn’t already reg- istered (i.e., SELECT user_id FROM users WHERE email='$e'). 273 Common Programming Techniques Editing Existing Records continues on next page In the second possibility, the user’s email address will remain the same. In this case, it’s okay if the email address is already in use, because it’s already in use for this user. To write one query that will work for both possibilities, don’t check to see if the email address is being used, but rather see if it’s being used by anyone else, hence: SELECT user_id FROM users WHERE email='$e' AND user_id != $id 7. Update the database. $q = "UPDATE users SET first_name= ➝ '$fn', last_name='$ln', email='$e' ➝ WHERE user_id=$id LIMIT 1"; $r = @mysqli_query ($dbc, $q); The UPDATE query is similar to examples you may have seen in Chapter 5, “Intro- duction to SQL.” The query updates all three fields—first name, last name, and email address—using the values submit- ted by the form. This system works because the form is preset with the exist- ing values. So, if you edit the first name in the form but nothing else, the first name value in the database is updated using this new value, but the last name and email address values are “updated” using their current values. This system is much easier than trying to determine which form values have changed and updating just those in the database. 8. Report on the results of the update. if (mysqli_affected_rows($dbc) == 1) ➝ { echo '

    The user has been edited.

    '; } else { echo '

    The user ➝ could not be edited due to a ➝ system error. We apologize for ➝ any inconvenience.

    '; echo '

    ' . mysqli_error($dbc) ➝ . '
    Query: ' . $q . ➝ '

    '; } The mysqli_affected_rows() function will return the number of rows in the database affected by the most recent query. If any of the three form values was altered, then this function should return the value 1. This conditional tests for that and prints a message indicating success or failure. Keep in mind that the mysqli_affected_ rows() function will return a value of 0 if an UPDATE command successfully ran but didn’t actually affect any records. So if you submit this form without changing any of the form values, a system error is displayed, which may not technically be correct. Once you have this script effectively working, you could change the error message to indicate that no alterations were made if mysqli_ affected_rows() returns 0. 9. Complete the email conditional. } else { echo '

    The email ➝ address has already been ➝ registered.

    '; } This else completes the conditional that checked if an email address was already being used by another user. If so, that message is printed. 274 Chapter 9 Editing Existing Records 10. Complete the $errors and submission conditionals. } else { // Report the errors. echo '

    The following ➝ error(s) occurred:
    '; foreach ($errors as ➝ $msg) { echo " - $msg
    \n"; } echo '

    Please try ➝ again.

    ';
        } // End of if (empty($errors)) IF.
    } // End of submit conditional.

The first else is used to report any errors in the form (namely, a lack of a first name, last name, or email address). The final closing brace completes the main submit conditional.

In this example, the form will be displayed whenever the page is accessed. So after submitting the form, the database will be updated, and the form will be shown again, now displaying the latest information.

11. Retrieve the information for the user being edited.

    $q = "SELECT first_name, last_name, email FROM users WHERE user_id=$id";
    $r = @mysqli_query ($dbc, $q);
    if (mysqli_num_rows($r) == 1) {

In order to pre-populate the form elements, the current information for the user must be retrieved from the database. This query is similar to the one in delete_user.php. The conditional, which checks that a single row was returned, ensures that a valid user ID was provided.

12. Display the form.

    $row = mysqli_fetch_array ($r, MYSQLI_NUM);
    echo '

    First Name:

    Last Name:

    Email Address:

    '; The form has but three text inputs, each of which is made sticky using the data retrieved from the database. Again, the user ID ($id) is stored as a hidden form input so that the handling process can also access this value. 275 Common Programming Techniques Editing Existing Records continues on next page 13. Complete the mysqli_num_rows() condi- tional. } else { echo '

    This ➝ page has been accessed in ➝ error.

    ';
    }

If no record was returned from the database, because an invalid user ID was submitted, this message is displayed.

14. Complete the PHP page.

    mysqli_close($dbc);
    include ('includes/footer.html');
    ?>

15. Save the file as edit_user.php and place it in your Web directory (in the same folder as view_users.php).

16. Run the page by first clicking an Edit link in the view_users.php page (Figures 9.10 and 9.11).

Figure 9.10 The new values are displayed in the form after successfully updating the database (compare with the form values in Figure 9.9).

Figure 9.11 If you try to change a record to an existing email address or if you omit an input, errors are reported.

✔ Tips

■ As written, the sticky form always shows the values retrieved from the database. This means that if an error occurs, the database values will be used, not the ones the user just entered (if those are different). To change this behavior, the sticky form would have to check for the presence of $_POST variables, using those if they exist, or the database values if not.

■ This edit page does not include the functionality to change the password. That concept was already demonstrated in password.php (Script 8.7). If you would like to incorporate that functionality here, keep in mind that you cannot display the current password, as it is encrypted. Instead, just present two boxes for changing the password (the new password input and a confirmation). If these values are submitted, update the password in the database as well. If these inputs are left blank, do not update the password in the database.

Paginating Query Results

Pagination is a concept you’re familiar with even if you don’t know the term. When you use a search engine like Google, it displays the results as a series of pages and not as one long list. The view_users.php script could benefit from this same feature.
Paginating query results makes extensive use of the LIMIT SQL clause introduced in Chapter 5. LIMIT restricts which subset of the matched records are actually returned. To paginate the returned results of a query, each page will run the same query using different LIMIT parameters. So the first page will request the first X records; the second page, the second group of X records; and so forth. To make this work, an indicator of which records the page should display needs to be passed from page to page in the URL, like the user IDs passed from the view_users.php page. Another, more cosmetic technique will be demonstrated here: displaying each row of the table—each returned record—using an alternating background color (Figure 9.12). This effect will be achieved with ease, using the ternary operator (see the sidebar “The Ternary Operator”). There’s a lot of good, new information here, so be careful as you go through the steps and make sure that your script matches this one exactly. To make it easier to follow along, let’s write this version from scratch instead of trying to modify Script 9.1. 277 Common Programming Techniques Paginating Query Results Figure 9.12 Alternating the table row colors makes this list of users more legible (every other row has a light gray background). To paginate view_users.php: 1. Begin a new PHP document in your text editor or IDE (Script 9.4). Registered Users'; require_once ('../mysqli_ ➝ connect.php'); 2. Set the number of records to display per page. $display = 10; By establishing this value as a variable here, you’ll make it easy to change the number of records displayed on each page at a later date. Also, this value will be used multiple times in this script, so it’s best represented as a single variable. 3. Check if the number of required pages has been determined. 
if (isset($_GET['p']) && is_numeric ➝ ($_GET['p'])) { $pages = $_GET['p']; } else { For this script to display the users over several pages, it will need to determine how many total pages of results will be required. The first time the script is run, this number has to be calculated. For every subsequent call to this page, the total number of pages will be passed to the script in the URL, so it will be available in $_GET['p']. If this variable is set and is numeric, its value will be assigned to the $pages variable. If not, then the number of pages will need to be calculated. 278 Chapter 9 Paginating Query Results 1 Registered Users'; 9 10 require_once ('../mysqli_connect.php'); 11 12 // Number of records to show per page: 13 $display = 10; 14 15 // Determine how many pages there are... 16 if (isset($_GET['p']) && is_numeric($_GET ['p'])) { // Already been determined. 17 18 $pages = $_GET['p']; 19 20 } else { // Need to determine. 21 22 // Count the number of records: 23 $q = "SELECT COUNT(user_id) FROM users"; 24 $r = @mysqli_query ($dbc, $q); 25 $row = @mysqli_fetch_array ($r, MYSQLI_NUM); 26 $records = $row[0]; 27 28 // Calculate the number of pages... 29 if ($records > $display) { // More than 1 page. 30 $pages = ceil ($records/$display); 31 } else { Script 9.4 This new version of view_users.php incorporates pagination so that the users are listed over multiple Web browser pages. (script continues) continues on page 280 279 Common Programming Techniques Paginating Query Results Script 9.4 continued 32 $pages = 1; 33 } 34 35 } // End of p IF. 36 37 // Determine where in the database to start returning results... 
38 if (isset($_GET['s']) && is_numeric ($_GET['s'])) { 39 $start = $_GET['s']; 40 } else { 41 $start = 0; 42 } 43 44 // Make the query: 45 $q = "SELECT last_name, first_name, DATE_ FORMAT(registration_date, '%M %d, %Y') AS dr, user_id FROM users ORDER BY registration_date ASC LIMIT $start, $display"; 46 $r = @mysqli_query ($dbc, $q); 47 48 // Table header: 49 echo ' 50 51 52 53 54 55 56 57 '; 58 59 // Fetch and print all the records.... 60 61 $bg = '#eeeeee'; // Set the initial background color. (script continues) Script 9.4 continued 62 63 while ($row = mysqli_fetch_array($r, MYSQLI_ASSOC)) { 64 65 $bg = ($bg=='#eeeeee' ? '#ffffff' : '#eeeeee'); // Switch the background color. 66 67 echo ' 68 69 70 71 72 73 74 '; 75 76 } // End of WHILE loop. 77 78 echo '
    EditDeleteLast NameFirst NameDate Registered
    EditDelete ' . $row['last_name'] . '' . $row['first_name'] . '' . $row['dr'] . '
    '; 79 mysqli_free_result ($r); 80 mysqli_close($dbc); 81 82 // Make the links to other pages, if necessary. 83 if ($pages > 1) { 84 85 // Add some spacing and start a paragraph: 86 echo '

    ';
87
88 // Determine what page the script is on:
89 $current_page = ($start/$display) + 1;
90
91 // If it's not the first page, make a Previous button:
92 if ($current_page != 1) {
93 echo 'Previous ';
94 }
95
96 // Make all the numbered pages:
97 for ($i = 1; $i <= $pages; $i++) {
98 if ($i != $current_page) {
99 echo '' . $i . ' ';
100 } else {
101 echo $i . ' ';
102 }
103 } // End of FOR loop.
104
105 // If it's not the last page, make a Next button:
106 if ($current_page != $pages) {
107 echo 'Next';
108 }
109
110 echo '

    '; // Close the paragraph.
111
112 } // End of links section.
113
114 include ('includes/footer.html');
115 ?>

4. Count the number of records in the database.

    $q = "SELECT COUNT(user_id) FROM users";
    $r = @mysqli_query ($dbc, $q);
    $row = @mysqli_fetch_array ($r, MYSQLI_NUM);
    $records = $row[0];

Using the COUNT() function, introduced in Chapter 6, “Advanced SQL and MySQL,” you can easily see the number of records in the users table. This query will return a single row with a single column: the number of records (Figure 9.13).

Figure 9.13 The result of running the counting query in the mysql client.

5. Mathematically calculate how many pages are required.

    if ($records > $display) {
        $pages = ceil ($records/$display);
    } else {
        $pages = 1;
    }
    } // End of p IF.

The number of pages required to display all of the records is based upon the total number of records to be shown and the number to display per page (as assigned to the $display variable). If there are more records than can be displayed on one page, multiple pages will be required. To calculate exactly how many pages, divide the two and round the result up to the next integer (the ceil() function rounds up). For example, if there are 25 records returned and 10 are being displayed per page, then 3 pages are required (the first page will display 10, the second page 10, and the third page 5). If $records is not greater than $display, only one page is necessary.

6. Determine the starting point in the database.

    if (isset($_GET['s']) && is_numeric($_GET['s'])) {
        $start = $_GET['s'];
    } else {
        $start = 0;
    }

The second parameter the script will receive, on subsequent viewings of the page, will be the starting record. This corresponds to the first number in a LIMIT x, y clause. Upon initially calling the script, the first ten records should be retrieved (because $display has a value of 10). The second page would show records 11 through 20; the third, 21 through 30; and so forth.

The first time this page is accessed, the $_GET['s'] variable will not be set, and so $start should be 0 (the first record in a LIMIT clause is indexed at 0). Subsequent pages will receive the $_GET['s'] variable from the URL, and it will be assigned to $start.

7. Write the query with a LIMIT clause.

    $q = "SELECT last_name, first_name, DATE_FORMAT(registration_date, '%M %d, %Y') AS dr, user_id FROM users ORDER BY registration_date ASC LIMIT $start, $display";
    $r = @mysqli_query ($dbc, $q);

The LIMIT clause dictates which record to begin retrieving ($start) and how many to return ($display) from that point. The first time the page is run, the query will be SELECT last_name, first_name … LIMIT 0, 10.
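The arithmetic behind the page count and the LIMIT starting point can also be tried out on its own, away from the database. What follows is only a minimal sketch: the function names page_count() and limit_offset() are hypothetical and do not appear in Script 9.4, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helpers for illustration; these names are not part of Script 9.4.

// Number of pages needed to show $records rows, $display per page.
function page_count(int $records, int $display): int
{
    // Same rule as the script: one page at minimum, otherwise round up.
    return ($records > $display) ? (int) ceil($records / $display) : 1;
}

// Starting point (the first LIMIT value) for a page numbered from 1.
function limit_offset(int $page, int $display): int
{
    return ($page - 1) * $display;
}

$display = 10;
echo page_count(25, $display), "\n";   // 3 pages: 10 + 10 + 5 records
echo limit_offset(1, $display), "\n";  // 0, so the query ends with LIMIT 0, 10
echo limit_offset(2, $display), "\n";  // 10, so the query ends with LIMIT 10, 10
```

The offset is always a multiple of $display, which is why the script can later recover the current page number by dividing $start by $display and adding 1.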
Clicking to the next page will result in SELECT last_name, first_name … LIMIT 10, 10. 8. Create the HTML table header. echo ' '; In order to simplify this script a little bit, I’m assuming that there are records to be displayed. To be more formal, this script, prior to creating the table, would invoke the mysqli_num_rows() function and have a conditional that confirms that some records were returned. 281 Common Programming Techniques Paginating Query Results continues on next page 9. Initialize the background color variable. $bg = '#eeeeee'; To make each row have its own back- ground color, a variable will be used to store that color. To start, the $bg vari- able is assigned a value of #eeeeee, a light gray. This color will alternate with white (#ffffff). 10. Begin the while loop that retrieves every record. while ($row = mysqli_fetch_array($r, ➝ MYSQLI_ASSOC)) { $bg = ($bg=='#eeeeee' ? ➝ '#ffffff' : '#eeeeee'); The background color used by each row in the table is assigned to the $bg variable. Because I want this color to alternate, I use this line of code to assign the opposite color to $bg. If it’s equal to #eeeeee, then it will be assigned the value of #ffffff and vice versa (again, see the sidebar for the syntax and explanation of the ternary operator). For the first row, $bg is equal to #eeeeee and will therefore be assigned #ffffff, making a white background. For the second row, $bg is not equal to #eeeeee, so it will be assigned that value, making a gray background. 11. Print the records in a table row. echo ' '; This code only differs in one way from that in the previous version of this script. The initial TR tag now includes the bgcolor attribute, whose value will be the $bg variable (so #eeeeee and #ffffff, alternating). 12. Complete the while loop and the table, free up the query result resources, and close the database connection. } echo '
    Edit ➝ Delete ➝ Last Name ➝ First Name ➝ Date ➝ Registered
    Edit Delete ' . $row['last_ ➝ name'] . ' ' . $row['first_ ➝ name'] . ' ' . $row['dr'] . ➝ '
    '; mysqli_free_result ($r); mysqli_close($dbc); 13. Begin a section for displaying links to other pages, if necessary. if ($pages > 1) { echo '

    '; $current_page = ($start/ ➝ $display) + 1; if ($current_page != 1) { echo 'Previous ➝ '; } 282 Chapter 9 Paginating Query Results If the script requires multiple pages to display all of the records, it needs the appropriate links at the bottom of the page (Figure 9.14). To make these links, first determine the current page. This can be calculated as the start number divided by the display number, plus 1. For example, on the second instance of this script, $start will be 10 (because on the first instance, $start is 0), so the current page is 2 (10/10 + 1 = 2). If the current page is not the first page, it also needs a Previous link to the earli- er result set (Figure 9.15). This isn’t strictly necessary, but is nice. Each link will be made up of the script name, plus the starting point and the number of pages. The starting point for the previous page will be the current starting point minus the number being displayed. These values must be passed in every link, or else the pagination will fail. 283 Common Programming Techniques Paginating Query Results The Ternary Operator This example uses an operator not introduced before, called the ternary operator. Its structure is (condition) ? valueT : valueF The condition in parentheses will be evaluated; if it is TRUE, the first value will be returned (valueT). If the condition is FALSE, the second value (valueF) will be returned. Because the ternary operator returns a value, the entire structure is often used to assign a value to a variable or used as an argument for a function. For example, the line echo (isset($var)) ? 'SET' : 'NOT SET'; will print out SET or NOT SET, depending upon the status of the variable $var. In this version of the view_users.php script, the ternary operator assigns a different value to a variable than its current value. The variable itself will then be used to dictate the back- ground color of each record in the table. 
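To make the sidebar’s description concrete, here is a minimal sketch of the color toggle written as a small function. The function name next_row_color() is hypothetical and is not part of Script 9.4, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper for illustration; this name is not part of Script 9.4.
function next_row_color(string $current): string
{
    // If the current color is gray, return white; otherwise return gray.
    return ($current == '#eeeeee') ? '#ffffff' : '#eeeeee';
}

$bg = '#eeeeee'; // Initial value, as in the script.
$colors = array();
for ($i = 0; $i < 4; $i++) {
    $bg = next_row_color($bg); // Switch the color before each row is printed.
    $colors[] = $bg;
}
echo implode(' ', $colors), "\n"; // #ffffff #eeeeee #ffffff #eeeeee
```

Because the switch happens before each row is printed, the first row comes out white even though the variable starts as gray.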
There are certainly other ways to set this value, but the ternary operator is the most concise.

Figure 9.14 After all of the returned records, links are generated to the other result pages.

Figure 9.15 The Previous link will appear only if the current page is not the first one.

14. Make the numeric links.

    for ($i = 1; $i <= $pages; $i++) {
        if ($i != $current_page) {
            echo '<a href="view_users.php?s=' . ($display * ($i - 1)) . '&p=' . $pages . '">' . $i . '</a> ';
        } else {
            echo $i . ' ';
        }
    }

The bulk of the links will be created by looping from 1 to the total number of pages. Each page will be linked except for the current one.

15. Create a Next link.

    if ($current_page != $pages) {
        echo '<a href="view_users.php?s=' . ($start + $display) . '&p=' . $pages . '">Next</a>';
    }

Finally, a Next page link will be displayed, assuming that this is not the final page (Figure 9.16).

16. Complete the page.

    echo '

    ';
    }
    include ('includes/footer.html');
    ?>

17. Save the file as view_users.php, place it in your Web directory, and test it in your Web browser.

✔ Tips

■ This example paginates a simple query, but if you want to paginate a more complex query, like the results of a search, it’s not that much more complicated. The main difference is that whatever terms are used in the query must be passed from page to page in the links. If the main query is not exactly the same from one viewing of the page to the next, the pagination will fail.

■ If you run this example and the pagination doesn’t match the number of results that should be returned (for example, the counting query indicates there are 150 records but the pagination only creates 3 pages, with 10 records on each), it’s most likely because the main query and the COUNT() query are too different. These two queries will never be the same, but they must perform the same join (if applicable) and have the same WHERE and/or GROUP BY clauses to be accurate.

■ No error handling has been included in this script, as I know the queries function as written. If you have problems, remember your MySQL/SQL debugging steps: print the query, run it using the mysql client or phpMyAdmin to confirm the results, and invoke the mysqli_error() function as needed.

Figure 9.16 The final results page will not display a Next link.

Making Sortable Displays

To wrap up this chapter, there’s one final feature that could be added to view_users.php. In its current state the list of users is displayed in order by the date they registered. It would be nice to be able to view them by name as well.

From a MySQL perspective, accomplishing this task is easy: just change the ORDER BY clause. Therefore, all that needs to be done is to add some functionality in PHP that will change the ORDER BY clause. The logical way to do this is to link the column headings so that clicking them changes the display order.
As you hopefully can guess, this involves using the GET method to pass a parameter back to this page indicating the preferred sort order.

To make sortable links:

1. Open view_users.php (Script 9.4) in your text editor or IDE.

2. After determining the starting point, define a $sort variable (Script 9.5).

    $sort = (isset($_GET['sort'])) ? $_GET['sort'] : 'rd';

The $sort variable will be used to determine how the query results are to be ordered. This line uses the ternary operator (see the sidebar earlier in the chapter) to assign a value to $sort. If $_GET['sort'] is set, which will be the case after the user clicks any link, then $sort should be assigned that value. If $_GET['sort'] is not set, then $sort is assigned a default value of rd (short for registration date).

Script 9.5 This latest version of the view_users.php script creates clickable links out of the table’s column headings.

    Registered Users';

    require_once ('../mysqli_connect.php');

    // Number of records to show per page:
    $display = 10;

    // Determine how many pages there are...
    if (isset($_GET['p']) && is_numeric($_GET['p'])) { // Already been determined.
        $pages = $_GET['p'];
    } else { // Need to determine.
        // Count the number of records:
        $q = "SELECT COUNT(user_id) FROM users";
        $r = @mysqli_query ($dbc, $q);
        $row = @mysqli_fetch_array ($r, MYSQLI_NUM);
        $records = $row[0];
        // Calculate the number of pages...
        if ($records > $display) { // More than 1 page.
            $pages = ceil ($records/$display);
        } else {
            $pages = 1;
        }
    } // End of p IF.

    // Determine where in the database to start returning results...
    if (isset($_GET['s']) && is_numeric ($_GET['s'])) {
        $start = $_GET['s'];
    } else {
        $start = 0;
    }

    // Determine the sort...
    // Default is by registration date.
41 $sort = (isset($_GET['sort'])) ? $_GET ['sort'] : 'rd'; 42 43 // Determine the sorting order: 44 switch ($sort) { 45 case 'ln': 46 $order_by = 'last_name ASC'; 47 break; 48 case 'fn': 49 $order_by = 'first_name ASC'; 50 break; 51 case 'rd': 52 $order_by = 'registration_date ASC'; 53 break; 54 default: 55 $order_by = 'registration_date ASC'; 56 $sort = 'rd'; 57 break; 58 } 59 60 // Make the query: 61 $q = "SELECT last_name, first_name, DATE_ FORMAT(registration_date, '%M %d, %Y') AS dr, user_id FROM users ORDER BY $order_by LIMIT $start, $display"; 62 $r = @mysqli_query ($dbc, $q); // Run the query. 63 (script continues) Script 9.5 continued 64 // Table header: 65 echo ' 66 67 68 69 70 71 72 73 '; 74 75 // Fetch and print all the records.... 76 $bg = '#eeeeee'; 77 while ($row = mysqli_fetch_array($r, MYSQLI_ASSOC)) { 78 $bg = ($bg=='#eeeeee' ? '#ffffff' : '#eeeeee'); 79 echo ' 80 81 82 83 84 85 86 '; 87 } // End of WHILE loop. 88 89 echo '
    EditDeleteLast NameFirst NameDate Registered
    Edit Delete ' . $row['last_name'] . '' . $row['first_name'] . '' . $row['dr'] . '
    ';
    mysqli_free_result ($r);
    mysqli_close($dbc);

3. Determine how the results should be ordered.

    switch ($sort) {
        case 'ln':
            $order_by = 'last_name ASC';
            break;
        case 'fn':
            $order_by = 'first_name ASC';
            break;
        case 'rd':
            $order_by = 'registration_date ASC';
            break;
        default:
            $order_by = 'registration_date ASC';
            $sort = 'rd';
            break;
    }

The switch checks $sort against several expected values. If, for example, it is equal to ln, then the results should be ordered by the last name in ascending order. The assigned $order_by variable will be used in the SQL query.

If $sort has a value of fn, then the results should be in ascending order by first name. If the value is rd, then the results will be in ascending order of registration date. This is also the default case. Having this default case here protects against a malicious user changing the value of $_GET['sort'] to something that could break the query.

    // Make the links to other pages, if necessary.
    if ($pages > 1) {

        echo '

    ';
    $current_page = ($start/$display) + 1;

    // If it's not the first page, make a Previous button:
    if ($current_page != 1) {
        echo '<a href="view_users.php?s=' . ($start - $display) . '&p=' . $pages . '&sort=' . $sort . '">Previous</a> ';
    }

    // Make all the numbered pages:
    for ($i = 1; $i <= $pages; $i++) {
        if ($i != $current_page) {
            echo '<a href="view_users.php?s=' . ($display * ($i - 1)) . '&p=' . $pages . '&sort=' . $sort . '">' . $i . '</a> ';
        } else {
            echo $i . ' ';
        }
    } // End of FOR loop.

    // If it's not the last page, make a Next button:
    if ($current_page != $pages) {
        echo '<a href="view_users.php?s=' . ($start + $display) . '&p=' . $pages . '&sort=' . $sort . '">Next</a>';
    }

    echo '

    '; // Close the paragraph.

    } // End of links section.

    include ('includes/footer.html');
    ?>

4. Modify the query to use the new $order_by variable.

    $q = "SELECT last_name, first_name, DATE_FORMAT(registration_date, '%M %d, %Y') AS dr, user_id FROM users ORDER BY $order_by LIMIT $start, $display";

By this point, the $order_by variable has a value indicating how the returned results should be ordered (for example, registration_date ASC), so it can be easily added to the query. Remember that the ORDER BY clause comes before the LIMIT clause. If the resulting query doesn’t run properly for you, print it out and inspect its syntax.

5. Modify the table header echo() statement to create links out of the column headings.

    echo '<table>
    <tr>
        <td>Edit</td>
        <td>Delete</td>
        <td><a href="view_users.php?sort=ln">Last Name</a></td>
        <td><a href="view_users.php?sort=fn">First Name</a></td>
        <td><a href="view_users.php?sort=rd">Date Registered</a></td>
    </tr>
    ';

To make the column headings clickable links, just surround them with <a> tags. The value of the href attribute for each link corresponds to the acceptable values for $_GET['sort'] (see the switch in Step 3).

6. Modify the echo() statement that creates the Previous link so that the sort value is also passed.

    echo '<a href="view_users.php?s=' . ($start - $display) . '&p=' . $pages . '&sort=' . $sort . '">Previous</a> ';

Add another name=value pair to the Previous link so that the sort order is also sent to each page of results. If you don’t, then the pagination will fail, as the ORDER BY clause will differ from one page to the next.

7. Repeat Step 6 for the numbered pages and the Next link.

    echo '<a href="view_users.php?s=' . ($display * ($i - 1)) . '&p=' . $pages . '&sort=' . $sort . '">' . $i . '</a> ';

    echo '<a href="view_users.php?s=' . ($start + $display) . '&p=' . $pages . '&sort=' . $sort . '">Next</a>';

8. Save the file as view_users.php, place it in your Web directory, and run it in your Web browser (Figures 9.17 and 9.18).

Figure 9.17 The first time viewing the page, the results are shown in ascending order of registration date. After clicking the first name column, the results are shown in ascending order by first name (as seen here).

Figure 9.18 Clicking the Last Name column displays the results in order by last name ascending.

✔ Tip

■ A very important security concept was also demonstrated in this example. Instead of using the value of $_GET['sort'] directly in the query, it’s checked against assumed values in a switch. If, for some reason, $_GET['sort'] has a value other than would be expected, the query uses a default sorting order. The point is this: don’t make assumptions about received data, and don’t use unvalidated data in an SQL query.

10 Web Application Development

The preceding two chapters focus on using PHP and MySQL together (which is, after all, the primary point of this book). But there’s still a lot of PHP-centric material to be covered. Taking a quick break from using PHP with MySQL, this chapter covers a handful of techniques that are often used in more complex Web applications.

The first topic covered in this chapter is sending email using PHP. It’s a very common thing to do and is surprisingly simple (assuming that the server is properly set up). After that, the chapter touches upon some of the date and time functions present in PHP. The third subject demonstrates how to handle file uploads in an HTML form. This in turn leads to a discussion of using PHP and JavaScript together, then how to use the header() function to manipulate the Web browser.

Sending Email

One of my absolute favorite things about PHP is how easy it is to send an email. On a properly configured server, the process is as simple as using the mail() function:

    mail (to, subject, body, [headers]);

The to value should be an email address or a series of addresses, separated by commas. Any of these are allowed:

◆ email@example.com
◆ email1@example.com, email2@example.com
◆ Actual Name <email@example.com>
◆ Actual Name <email@example.com>, This Name <email2@example.com>

The subject value will create the email’s subject line, and body is where you put the contents of the email.
To make things more legible, variables are often assigned values and then used in the mail() function call:

    $to = 'email@example.com';
    $subject = 'This is the subject';
    $body = 'This is the body.
    It goes over multiple lines.';
    mail ($to, $subject, $body);

As you can see in the assignment to the $body variable, you can create an email message that goes over multiple lines by having the text do exactly that within the quotation marks. You can also use the newline character (\n) within double quotation marks to accomplish this:

    $body = "This is the body.\nIt goes over multiple lines.";

This is all very straightforward, and there are only a couple of caveats. First, the subject line cannot contain the newline character (\n). Second, each line of the body should be no longer than 70 characters in length. You can accomplish this using the wordwrap() function. It will insert a newline into a string every X number of characters. To wrap text to 70 characters, use

    $body = wordwrap($body, 70);

The mail() function takes a fourth, optional parameter for additional headers. This is where you could set the From, Reply-To, Cc, Bcc, and similar settings. For example,

    mail ($to, $subject, $body, 'From: reader@example.com');

To use multiple headers of different types in your email, separate each with \r\n:

    $headers = "From: John@example.com\r\n";
    $headers .= "Cc: Jane@example.com, Joe@example.com\r\n";
    mail ($to, $subject, $body, $headers);

Although this fourth argument is optional, it is advised that you always include a From value (although that can also be established in PHP’s configuration file).

To demonstrate this, let’s create a page that shows a contact form (Figure 10.1) and then handles the form submission, validating the data and sending it along in an email. This example will also contain a nice variation on the sticky form technique used in this book.

Figure 10.1 A standard (but not very attractive) contact form.
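Because the headers argument is handed to the mail server largely as written, a value that comes from a form deserves a check before it goes into a From header. The following is a minimal sketch of that idea, not part of the original script: the function names clean_from_address() and build_body() are hypothetical, and the filter_var() call assumes PHP 5.2 or later.

```php
<?php
// Hypothetical helper (an assumption, not from the book): returns the
// address if it looks safe to use in a From header, or null otherwise.
function clean_from_address($email)
{
    $email = trim($email);
    // Reject embedded carriage returns or newlines outright; they would
    // let a visitor add extra headers (Cc, Bcc, and so on) to the message.
    if (preg_match('/[\r\n]/', $email)) {
        return null;
    }
    // filter_var() returns the address on success, false on failure.
    $valid = filter_var($email, FILTER_VALIDATE_EMAIL);
    return ($valid === false) ? null : $valid;
}

// Hypothetical helper: build the message body and wrap it to 70 characters.
function build_body($name, $comments)
{
    $body = "Name: {$name}\n\nComments: {$comments}";
    return wordwrap($body, 70);
}

// Usage sketch (commented out, as mail() needs a working mail server):
// $from = clean_from_address($_POST['email']);
// if ($from !== null) {
//     mail('your_email@example.com', 'Contact Form Submission',
//         build_body($_POST['name'], $_POST['comments']), "From: $from");
// }
```

The design choice here is simply to refuse a suspicious address rather than try to repair it, which keeps the check easy to reason about.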
1 3 4 5 6 Contact Me 7 8 9

    Contact Me

    10 Contact Me

    Contact Me

    Thank you for contacting me. I will reply some day.

    '; 29 30 // Clear $_POST (so that the form's not sticky): 31 $_POST = array(); 32 33 } else { 34 echo '

    Please fill out the form completely.

    '; 35 } 36 37 } // End of main isset() IF. 38 39 // Create the HTML form: 40 ?> 41

    Please fill out this form to contact me.

    42
    43

    Name:

    44

    Email Address:

    45

    Comments:

    46

4. Send the email and print a message in the Web browser.

    mail('your_email@example.com', 'Contact Form Submission', $body, "From: {$_POST['email']}");
    echo '

    Thank you for contacting ➝ me. I will reply some day.

    ';

Assuming the server is properly configured, this one line will send the email. You will need to change the to value to your actual email address. The From value will be the email address from the form. The subject will be a literal string. There’s no way of confirming that the email was successfully sent, let alone received, but a generic message is printed.

5. Clear the $_POST array.

    $_POST = array();

In this example, the form will always be shown, even upon successful submission. The form will be sticky in case the user omitted something (Figure 10.2). However, if the mail was sent, there’s no need to show the values in the form again. To avoid that, the $_POST array can be cleared of its values using the array() function.

6. Complete the conditionals.

    } else {
        echo '

    Please ➝ fill out the form ➝ completely.

    '; } } // End of main isset() IF. ?> The error message contains some inline CSS, so that it’s in red and made bold. 7. Begin the form.

    Please fill out this form to ➝ contact me.

    Name:

    Email Address:

The form will submit back to this same page using the POST method. The first two inputs are of type text; both are made sticky by checking if the corresponding $_POST variable has a value. If so, that value is printed as the current value for that input.

Figure 10.2 The contact form will remember the user-supplied values in case it is not completely filled out.

8. Complete the form.

    Comments:

The comments input is a textarea, which does not use a value attribute. Instead, to be made sticky, the value is printed between the opening and closing textarea tags.

9. Complete the HTML page.

10. Save the file as email.php, place it in your Web directory, and test it in your Web browser (Figure 10.3).

11. Check your email to confirm that you received the message (Figure 10.4).

If you don’t actually get the email, you’ll need to do some debugging work. With this example, you should confirm with your host (if using a hosted site) or yourself (if running PHP on your server) that there’s a working mail server installed. You should also test this using different email addresses (for both the to and from values). Also watch that your spam filter isn’t eating up the message.

Figure 10.3 Successful completion and submission of the form.

Figure 10.4 The resulting email (from the data in Figure 10.1).

✔ Tips

■ On some (primarily Unix) systems, the \r\n characters aren’t handled properly. If you have problems with them, use just \n instead.

■ The mail() function returns a Boolean value (TRUE or FALSE) indicating the success of the function call. This is not the same thing as the email successfully being sent or received. You cannot easily test for either using PHP.

■ While it’s easy to send a simple message with the mail() function, sending HTML emails or emails with attachments involves more work. I discuss how you can do both in my book PHP 5 Advanced: Visual QuickPro Guide (Peachpit Press, 2007).

■ Using a contact form that has PHP send an email is a great way to minimize the spam you receive. With this system, your actual email address is not visible in the Web browser, meaning it can’t be harvested by spambots.

PHP mail() Dependencies

PHP’s mail() function doesn’t actually send the email itself. Instead, it tells the mail server running on the computer to do so.
What this means is that the computer on which PHP is running must have a working mail server in order for this function to work.

If you have a computer running a Unix variant or if you are running your Web site through a professional host, this should not be a problem. But if you are running PHP on your own desktop or laptop computer, you’ll probably need to make adjustments.

If you are running Windows and have an Internet service provider (ISP) that provides you with an SMTP server (like smtp.comcast.net), this information can be set in the php.ini file (see Appendix A, “Installation,” for how to edit this file). Unfortunately, this will only work if your ISP does not require authentication (a username and password combination) to use the SMTP server. Otherwise, you’ll need to install an SMTP server on your computer. There are plenty available, and they’re not that hard to install and use: just search the Internet for free windows smtp server and you’ll see some options. There are also threads on this subject in the book’s corresponding forum (www.DMCInsights.com/phorum/).

If you are running Mac OS X, you’ll need to enable the built-in SMTP server (either sendmail or postfix, depending upon the specific version of Mac OS X you are running). You can find instructions online for doing so (search with enable sendmail “Mac OS X”).

Date and Time Functions

Chapter 5, “Introduction to SQL,” demonstrates a handful of great date and time functions that MySQL supports. Naturally, PHP has its own date and time functions. To start, there’s date_default_timezone_set(). This function is used to establish the default time zone (which can also be set in PHP’s configuration file).

    date_default_timezone_set(tz);

The tz value is a string like America/New_York or Pacific/Auckland. There are too many to list here (Africa alone has over 50), but see the PHP manual for them all.
Note that as of PHP 5.1, the default time zone must be set prior to calling any of the date and time functions, or else you’ll see an error (Figure 10.5).

Next up, the checkdate() function takes a month, a day, and a year and returns a Boolean value indicating whether that date actually exists (or existed). It even takes into account leap years. This function can be used to ensure that a user supplied a valid date (birth date or other):

    if (checkdate(month, day, year)) { // OK!
    }

Perhaps the most frequently used function is the aptly named date(). It returns the date and/or time as a formatted string. It takes two arguments:

    date (format, [timestamp]);

The timestamp is an optional argument representing the number of seconds since the Unix Epoch (midnight on January 1, 1970) for the date in question. It allows you to get information, like the day of the week, for a particular date. If a timestamp is not specified, PHP will just use the current time on the server.

There are myriad formatting parameters available (Table 10.1), and these can be used in conjunction with literal text. For example,

    echo date('F j, Y'); // January 26, 2008
    echo date('H:i'); // 23:14
    echo date('D'); // Sat

You can find the timestamp for a particular date using the mktime() function.

    $stamp = mktime (hour, minute, second, month, day, year);

If called with no arguments, mktime() returns the current timestamp, which is the same as calling the time() function.

Finally, the getdate() function can be used to return an array of values (Table 10.2) for a date and time. For example,

    $today = getdate();
    echo $today['month']; // October

This function also takes an optional timestamp argument. If that argument is not used, getdate() returns information for the current date and time.

These are just a handful of the many date and time functions PHP has. For more, see the PHP manual. To practice working with these functions, let’s modify email.php (Script 10.1) in an admittedly superfluous way.
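The functions just described can be tried together in a few lines. This is only a sketch: the date used (January 26, 2008, at 23:14) is chosen to match the sample output shown earlier, and the variable names are arbitrary.

```php
<?php
// Set the zone first so the output does not depend on the server's setup.
date_default_timezone_set('America/New_York');

// checkdate(month, day, year) takes leap years into account.
$leap = checkdate(2, 29, 2008);     // 2008 is a leap year
$not_leap = checkdate(2, 29, 2007); // 2007 is not

// mktime(hour, minute, second, month, day, year) builds a timestamp...
$stamp = mktime(23, 14, 0, 1, 26, 2008);

// ...which date() can format and getdate() can break apart.
echo date('F j, Y', $stamp) . "\n"; // January 26, 2008
echo date('H:i', $stamp) . "\n";    // 23:14
echo date('D', $stamp) . "\n";      // Sat

$parts = getdate($stamp);
echo $parts['weekday'] . ', ' . $parts['month'] . ' ' . $parts['mday'] . "\n";
```

Passing an explicit timestamp, as done here, is what makes the output repeatable; without it, each function reports on the current server time.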
298 Chapter 10 Date and Time Functions Figure 10.5 If running PHP 5.1 and later and error_reporting is set on its highest level, PHP will generate a notice when a date or time function is used without the time zone being set. To use the date and time functions: 1. Open email.php (Script 10.1) in your text editor or IDE. 2. As the first line of code after the open- ing PHP tag, establish the time zone (Script 10.2). date_default_timezone_set ➝ ('America/New_York'); Before calling any of the date and time functions (and this script will call two different ones, twice each), the time zone has to be established. To find your time zone, see www.php.net/timezones. 299 Web Application Development Date and Time Functions 1 3 4 5 6 Contact Me 7 8 9

    Contact Me

    10 " /> Just to try something interesting, this script will time how long it takes for the user to receive, fill out, and submit the form. Timing this is just a matter of sub- tracting the time the form was sent to the Web browser from the time it was submitted back to the server. The time() function will return a timestamp (the number of seconds since the epoch). This value will be stored in the HTML form so that it can be used in the calculation upon submission (Figure 10.6). 4. Change the form’s action attribute so that it points to this new script.
    This file will be named datetime.php, so the action has to be changed as well. 5. Going back up a few lines in the script to where the form is submitted, change the message so that it includes the current date and time. echo '

    Thank you for contacting ➝ me at ' . date('g:i a (T)') . ' on ' . ➝ date('l F j, Y') .'. I will reply ➝ some day.

    '; Two invocations of the date() function are added to this message. The first will return the current time formatted as HH:MM am/pm (XXX), where XXX rep- resents the time zone identifier. The sec- ond call to date() will return the day of the week, month, day, and year, in the format Day Month D, YYYY. 300 Chapter 10 Date and Time Functions 17 18 // Minimal form validation: 19 if (!empty($_POST['name']) && !empty($_POST['email']) && !empty($_POST['comments']) ) { 20 21 // Create the body: 22 $body = "Name: {$_POST['name']}\n\nComments: {$_POST['comments']}"; 23 $body = wordwrap($body, 70); 24 25 // Send the email: 26 mail('your_email_address@example.com', 'Contact Form Submission', $body, "From: {$_POST['email']}"); 27 28 // Print a message: 29 echo '

    Thank you for contacting me at ' . date('g:i a (T)') . ' on ' . date('l F j, Y') .'. I will reply some day.

    '; 30 31 // How long did it all take? 32 echo '

    It took ' . (time() - $_POST['start']) . ' seconds for you to complete and submit the form.

    '; 33 34 // Clear $_POST (so that the form's not sticky): 35 $_POST = array(); 36 37 } else { 38 echo '

    Please fill out the form completely.

    '; 39 } 40 41 } // End of main isset() IF. 42 43 // Create the HTML form: 44 ?> (script continues on next page) Script 10.2 continued 6. Add another message indicating how long the whole process took. echo '

    It took ' . (time() ➝ - $_POST['start']) . ' seconds for ➝ you to complete and submit the ➝ form.

    ';

This message includes the calculation of the current timestamp (returned by time()) minus the timestamp stored in the HTML form.

7. Save the file as datetime.php, place it in your Web directory, and test it in your Web browser (Figures 10.7 and 10.8).

✔ Tips

■ The date() function has some parameters that are used for informative purposes, not formatting. For example, date('L') returns 1 or 0 indicating if it’s a leap year; date('t') returns the number of days in the current month; and date('I') returns a 1 if it’s currently daylight saving time.

■ PHP’s date functions reflect the time on the server (because PHP runs on the server); you’ll need to use JavaScript if you want to determine the date and time on the user’s computer.

Figure 10.6 The HTML source code of the page reveals the timestamp stored in a hidden input called start.

Figure 10.7 The form itself does not seem to be that much different from the original in email.php (see Figure 10.1).

Figure 10.8 The response message now uses two date and time functions for a more customized reply.
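The informative date() parameters from the first tip, and the subtraction used to time the form, can be seen in isolation in the short sketch below. The fixed date (noon on July 4, 2008) and the 95-second gap are made-up values used only so that the results are repeatable.

```php
<?php
date_default_timezone_set('America/New_York');

// A fixed timestamp keeps the results the same on every run.
$stamp = mktime(12, 0, 0, 7, 4, 2008);

echo date('L', $stamp) . "\n"; // 1, because 2008 is a leap year
echo date('t', $stamp) . "\n"; // 31, the number of days in July
echo date('I', $stamp) . "\n"; // 1, daylight saving time applies in New York in July

// Elapsed time is plain subtraction of two timestamps.
$start = $stamp;       // stands in for the value stored in the form
$finish = $stamp + 95; // pretend the form came back 95 seconds later
echo ($finish - $start) . " seconds\n";
```

In the real script the starting value arrives in $_POST, so it is whatever the browser sends back; treating it as untrusted input (for example, casting it to an integer) would be a reasonable precaution.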

    Please fill out this form to contact me.

    46 47

    Name:

    48

    Email Address:

    49

    Comments:

    50

Handling File Uploads

Chapters 2, “Programming with PHP,” and 3, “Creating Dynamic Web Sites,” go over the basics of handling HTML forms with PHP. For the most part, every type of form element can be handled the same in PHP, with one exception: file uploads. The process of uploading a file has two dimensions. First the HTML form must be displayed, with the proper code to allow for file uploads. Then upon submission of the form, the PHP script must copy the uploaded file to its final destination.

However, for this process to work, several things must be in place:

◆ PHP must run with the right settings.
◆ A temporary storage directory must exist with the correct permissions.
◆ The final storage directory must exist with the correct permissions.

With this in mind, this next section will cover the server setup to allow for file uploads; then a PHP script will be created that actually does the uploading.

Allowing for file uploads

As I said, certain settings must be established in order for PHP to be able to handle file uploads. I’ll first discuss why or when you’d need to make these adjustments before walking you through the steps.

The first issue is PHP itself. There are several settings in PHP’s configuration file (php.ini) that dictate how PHP handles uploads, specifically stating how large of a file can be uploaded and where the upload should temporarily be stored (Table 10.3).
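As a rough sketch, the same settings can also be read from within a script using ini_get(). The helper shorthand_to_bytes() below is hypothetical (it is not a PHP function) and handles only the K, M, and G suffixes.

```php
<?php
// Hypothetical helper: convert php.ini shorthand such as "2M" to bytes.
function shorthand_to_bytes($value)
{
    $value = trim($value);
    $number = (int) $value; // the leading digits, e.g. 2 from "2M"
    $unit = strtoupper(substr($value, -1));
    switch ($unit) {
        case 'G':
            return $number * 1024 * 1024 * 1024;
        case 'M':
            return $number * 1024 * 1024;
        case 'K':
            return $number * 1024;
        default:
            return $number; // a plain byte count
    }
}

// The five settings that affect file uploads.
$names = array('file_uploads', 'max_input_time', 'post_max_size',
    'upload_max_filesize', 'upload_tmp_dir');
$settings = array();
foreach ($names as $name) {
    $settings[$name] = ini_get($name);
    echo $name . ' = ' . var_export($settings[$name], true) . "\n";
}

// When upload_tmp_dir is empty, PHP uses the system's default temporary directory.
$tmp = $settings['upload_tmp_dir'] ? $settings['upload_tmp_dir'] : sys_get_temp_dir();
echo 'Temporary directory: ' . $tmp . "\n";
echo 'Largest upload, in bytes: ' . shorthand_to_bytes($settings['upload_max_filesize']) . "\n";
```

This only reports the values; changing them still means editing php.ini.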
Generally speaking, you’ll need to edit this file if any of these conditions apply:

◆ file_uploads is disabled.
◆ PHP has no temporary directory to use.
◆ You will be uploading very large files (larger than 2 MB).

Table 10.3 These PHP configuration settings each impact file upload capabilities.

File Upload Configurations

    Setting               Value Type   Importance
    file_uploads          Boolean      Enables PHP support for file uploads
    max_input_time        integer      Indicates how long, in seconds, a PHP script is allowed to spend receiving and parsing input data (including uploads)
    post_max_size         integer      Size, in bytes, of the total allowed POST data
    upload_max_filesize   integer      Size, in bytes, of the largest possible file upload allowed
    upload_tmp_dir        string       Indicates where uploaded files should be temporarily stored

If you don’t have access to your php.ini file (if you’re using a hosted site, for example), presumably the host has already configured PHP to allow for file uploads. If you installed PHP on Mac OS X or Unix, you should also be good to go (assuming reasonable-sized files).

The second issue is the location of, and permissions on, the temporary directory. This is where PHP will store the uploaded file until your PHP script moves it to its final destination. If you installed PHP on your own Windows computer, you might need to take steps here (I had no problems with the default PHP 6 installation on Windows XP, but I don’t want to assume that’ll be the same for everyone). Mac OS X and Unix users need not worry about this, as a temporary directory already exists for such purposes.

Finally, the destination folder must be created and have the proper permissions established on it. This is a step that everyone must take for every application that handles file uploads. Because there are important security issues involved in this step, please also make sure that you read and understand the sidebar, “Secure Folder Permissions.”

With all of this in mind, let’s go through the steps.

To prepare the server:

1. Run the phpinfo() function to confirm your server settings (Figure 10.9).

The phpinfo() function prints out a slew of information about your PHP setup. It’s one of the most important functions in PHP, if not the most (in my opinion). Search for the settings listed in Table 10.3 and confirm their values. Make sure that file_uploads has a value of On and that the limit for upload_max_filesize (2MB, by default) and post_max_size (8MB) won’t be a restriction for you. If running PHP on Windows, see if upload_tmp_dir has a value. If it doesn’t, that might be a problem (you’ll know for certain after running the PHP script that handles the file upload). For non-Windows users, if this value says no value, that’s perfectly fine.

Figure 10.9 A phpinfo() script returns all the information regarding your PHP setup, including all the file upload handling stuff.

2. If necessary, open php.ini in your text editor.

If there’s anything you saw in Step 1 that needs to be changed, or if something happens when you actually go to handle a file upload using PHP, you’ll need to edit the php.ini file. To find this file, see the Configuration File (php.ini) path value in the phpinfo() output. This indicates exactly where this file is on your computer (also see Appendix A for more).

If you are not allowed to edit your php.ini file (if, for instance, you’re using a hosted server), then presumably any necessary edits would have already been made to allow for file uploads. If not, you’ll need to request these changes from your hosting company (who may or may not agree to make them).

3. Search the php.ini file for the configuration to be changed and make any edits (Figure 10.10).

For example, in the File Uploads section, you’ll see these three lines:

    file_uploads = On
    ;upload_tmp_dir =
    upload_max_filesize = 2M

The first line dictates whether or not uploads are allowed.
The second states where the uploaded files should be temporarily stored. On most operating systems, including Mac OS X and Unix, this setting can be left commented out (preceded by a semicolon) without any problem.

If you are running Windows and need to create a temporary directory, set this value to C:\tmp, making sure that the line is not preceded by a semicolon. Again, using the most recent version of PHP on Windows XP, I did not need to create a temporary directory, so you may be able to get away without one too.

Finally, a maximum upload file size is set (the M is shorthand for megabytes in configuration settings).

4. Save the php.ini file and restart your Web server.

How you restart your Web server depends upon the operating system and Web serving application being used. See Appendix A for instructions.

Figure 10.10 The File Uploads subsection of the php.ini file.

5. Confirm the changes by rerunning the phpinfo() script.

Before going any further, confirm that the necessary changes have been enacted by repeating Step 1.

6. If you are running Windows and need to create a temporary directory, add a tmp folder within C:\ and make sure that everyone can write to that directory (Figure 10.11).

Figure 10.11 Windows users need to make sure that the C:\tmp (or whatever directory is used) is writable by PHP. On my Windows XP installation, this just meant that it couldn’t be marked private (see the top portion of this image).

PHP, through your Web server, will temporarily store the uploaded file in the upload_tmp_dir. For this to work, the Web user (if your Web server runs as a particular user) must have permission to write to the folder.

In all likelihood, you may not actually have to change the permissions, but to do so, depending upon what version of Windows you are running, you can normally adjust the permissions by right-clicking the folder and selecting Properties.
In the Properties window, there should be a Security tab where permissions are set. They may also be under Sharing. Windows uses a more lax permissions system, so you probably won’t have to change anything unless the folder is deliberately restricted. (Note: I haven’t tested this on Windows Vista, so I’m unsure what, if anything, might have changed in it.) Mac OS X and Unix users can skip this step, as the temporary directory (/tmp) has open permissions already.

7. Create a new directory, called uploads, in a directory outside of the Web root directory.
All of the uploaded files will be permanently stored in the uploads directory. If you’ll be placing your PHP script in the C:\inetpub\wwwroot\ch10 directory, then create a C:\inetpub\uploads directory. Or if the files are going in /Users/~/Sites/ch10, make a /Users/~/uploads folder. Figure 10.12 shows the structure you should establish, and the sidebar discusses why this step is necessary.

Figure 10.12 Assuming that htdocs is the Web root directory (www.example.com or http://localhost points there), then the uploads directory needs to be placed outside of it.

8. Set the permissions on the uploads directory so that the Web server can write to it.
Again, Windows users can use the Properties window to make these changes, although it may not be necessary. Mac OS X users can:
A) Select the folder in the Finder.
B) Press Command+I.
C) Allow everyone to Read & Write, under the Ownership & Permissions panel (Figure 10.13).
If you’re using a hosted site, the host likely provides a control panel through which you can tweak a folder’s settings, or you might be able to do this within your FTP application.
Depending upon your operating system, you may be able to upload files without first taking this step. You can try the following script before altering the permissions, just to see.
If you see messages like those in Figure 10.14, then you will need to make some adjustments.

Figure 10.13 Adjusting the properties on the uploads folder in Mac OS X.

Figure 10.14 If PHP could not move the uploaded image over to the uploads folder because of a permissions issue, you’ll see an error message like this one. Fix the permissions on uploads to correct this.

✔ Tips
■ Unix users can use the chmod command to adjust a folder’s permissions. The proper permissions in Unix terms will be either 755 or 777.
■ Because of the time it may take to upload a large file, you may also need to increase the max_input_time value in the php.ini file. The script’s own execution time limit can be raised with the set_time_limit() function.
■ File and directory permissions can be complicated stuff, particularly if you’ve never dealt with them before. If you have problems with these steps or the next script, search the Web or turn to the book’s corresponding forum (www.DMCInsights.com/phorum/).

Secure Folder Permissions
There’s normally a trade-off between security and convenience. With this example, it’d be more convenient to place the uploads folder within the Web document directory (the convenience arises with respect to how easily the uploaded images can be viewed in the Web browser), but doing that is less secure.
For PHP to be able to place files in the uploads folder, it needs to have write permissions on that directory. On most servers, PHP is running as the same user as the Web server itself. On a hosted server, this means that all X number of sites being hosted are running as the same user. Creating a folder that PHP can write to means creating a folder that everyone can write to. Literally anyone on the server can now move, copy, or write files to the uploads folder (assuming that they know it exists). This even means that a malicious user could write a PHP script to your uploads directory.
However, since the uploads directory in this example is not within the Web directory, such a PHP script cannot be run in a Web browser. It’s less convenient to do things this way, but more secure.
If you must keep the uploads folder publicly accessible, the permissions could be tweaked. For security purposes, you ideally want to allow only the Web server user to read, write, and browse this directory. This means knowing what user the Web server runs as and making that user (and no one else) ruler of the uploads. This isn’t a perfect solution, but it does help a bit. This change also limits your access to that folder, though, as its contents would belong to only the Web server.
Finally, if you’re using Apache, you could limit access to the uploads folder using an .htaccess file. Basically, you would state that only image files in the folder be publicly viewable, meaning that even if a PHP script were to be placed there, it could not be executed. Information on how to use .htaccess files can be found online (search on .htaccess tutorial).
Sometimes even the most conservative programmer will make security concessions. The important point is that you’re aware of the potential concerns and that you do the most you can to minimize the danger.

Uploading files with PHP
Now that the server has (hopefully) been set up to properly allow for file uploads, you can create the PHP script that does the actual file handling. There are two parts to such a script: the HTML form and the PHP code.
The required syntax for a form to handle a file upload has three parts:

    <form enctype="multipart/form-data" action="script.php" method="post">
    <input type="hidden" name="MAX_FILE_SIZE" value="30000" />
    File <input type="file" name="upload" />

The enctype part of the initial form tag indicates that the form should be able to handle multiple types of data, including files. If you want to accept file uploads, you must include this enctype! Also note that the form must use the POST method. The MAX_FILE_SIZE hidden input is a form restriction on how large the chosen file can be, in bytes, and must come before the file input. While it’s easy for a user to circumvent this restriction, it should still be used. Finally, the file input type will create the proper button in the form (Figures 10.15 and 10.16).

Figure 10.15 The file input as it appears in IE 7 on Windows.

Figure 10.16 The file input as it appears in Firefox on Mac OS X.

Upon form submission, the uploaded file can be accessed using the $_FILES superglobal. The variable will be an array of values, listed in Table 10.4.

Table 10.4 The $_FILES Array. The data for an uploaded file will be available through these array elements.

    Index      Meaning
    name       The original name of the file (as it was on the user’s computer).
    type       The MIME type of the file, as provided by the browser.
    size       The size of the uploaded file in bytes.
    tmp_name   The temporary filename of the uploaded file as it was stored on the server.
    error      The error code associated with any problem.

Once the file has been received by the PHP script, the move_uploaded_file() function can transfer it from the temporary directory to its permanent location.

    move_uploaded_file (temporary_filename, /path/to/destination/filename);

This next script will let the user select a file on their computer and will then store it in the uploads directory. The script will check that the file is of an image type. In the next section of this chapter, another script will list, and create links to, the uploaded images.

To handle file uploads in PHP:
1. Create a new PHP document in your text editor or IDE (Script 10.3).
Figure 10.17 This very basic HTML form only takes one input: a file.

4. Copy the file to its new location on the server.

    if (move_uploaded_file($_FILES['upload']['tmp_name'], "../uploads/{$_FILES['upload']['name']}")) {
        echo '<p>The file has been uploaded!</p>';
    }

The move_uploaded_file() function will move the file from its temporary to its permanent location (in the uploads folder). The file will retain its original name. In Chapter 17, “Example—E-Commerce,” you’ll see how to give the file a new name, which is generally a good idea.
As a rule, you should always use a conditional to confirm that a file was successfully moved, instead of just assuming that the move worked.

5. Complete the image type and isset($_FILES['upload']) conditionals.

        } else { // Invalid type.
            echo '<p class="error">Please upload a JPEG or PNG image.</p>';
        }
    } // End of isset($_FILES['upload']) IF.

The first else clause completes the if begun in Step 3. It applies if a file was uploaded but it wasn’t of the right MIME type (Figure 10.18).

Figure 10.18 If the user uploads a file that’s not a JPEG or PNG, this is the result.

6. Check for, and report on, any errors.

    if ($_FILES['upload']['error'] > 0) {
        echo '<p class="error">The file could not be uploaded because: <strong>';

If an error occurred, then $_FILES['upload']['error'] will have a value greater than 0. In such cases, this script will report what the error was.

7. Create a switch that prints a more detailed error.

    switch ($_FILES['upload']['error']) {
        case 1:
            print 'The file exceeds the upload_max_filesize setting in php.ini.';
            break;
        case 2:
            print 'The file exceeds the MAX_FILE_SIZE setting in the HTML form.';
            break;
        case 3:
            print 'The file was only partially uploaded.';
            break;
        case 4:
            print 'No file was uploaded.';
            break;
        case 6:
            print 'No temporary folder was available.';
            break;
        case 7:
            print 'Unable to write to the disk.';
            break;
        case 8:
            print 'File upload stopped.';
            break;
        default:
            print 'A system error occurred.';
            break;
    } // End of switch.

There are several possible reasons a file could not be uploaded and moved. The first and most obvious one is if the permissions are not set properly on the destination directory. In such a case, you’ll see an appropriate error message (refer back to Figure 10.14). PHP will often also store an error number in the $_FILES['upload']['error'] variable. The numbers correspond to specific problems, from 0 to 4, plus 6 through 8 (oddly enough, there is no 5). The switch conditional here prints out the problem according to the error number. The default case is added for future support (if different numbers are added in later versions of PHP).
For the most part, these errors are useful to you, the developer, and not things you’d indicate to the average user.

8. Complete the error if conditional.

        print '</strong></p>';
    } // End of error IF.

9. Delete the temporary file if it still exists and complete the PHP section.

    if (file_exists($_FILES['upload']['tmp_name']) && is_file($_FILES['upload']['tmp_name'])) {
        unlink($_FILES['upload']['tmp_name']);
    }
    } // End of the submitted conditional.
    ?>

If the file was uploaded but it could not be moved to its final destination or some other error occurred, then that file is still sitting on the server in its temporary location. To remove it, use the unlink() function. Just to be safe, prior to applying unlink(), a conditional checks that the file exists and that it is a file (because the file_exists() function will return TRUE if the named item is a directory).

10. Create the HTML form.

    <form enctype="multipart/form-data" action="upload_image.php" method="post">
        <input type="hidden" name="MAX_FILE_SIZE" value="524288" />
        <fieldset><legend>Select a JPEG or PNG image of 512KB or smaller to be uploaded:</legend>
        <p><b>File:</b> <input type="file" name="upload" /></p>
        </fieldset>
        <div align="center"><input type="submit" name="submit" value="Submit" /></div>
        <input type="hidden" name="submitted" value="TRUE" />
    </form>

This form is very simple (Figure 10.17), but it contains the three necessary parts for file uploads: the form’s enctype attribute, the MAX_FILE_SIZE hidden input, and the file input.

11. Complete the HTML page.

    </body>
    </html>

12. Save the file as upload_image.php, place it in your Web directory, and test it in your Web browser (Figures 10.19 and 10.20).
If you want, you can confirm that the script works by checking the contents of the uploads directory.

Figure 10.19 The result upon successfully uploading and moving a file.

Figure 10.20 The result upon attempting to upload a file that is too large.

✔ Tips
■ Omitting the enctype form attribute is a common reason for file uploads to mysteriously fail.
■ The existence of an uploaded file can also be validated with the is_uploaded_file() function.
■ Windows users must use forward slashes or double backslashes to refer to directories (so C:\\ or C:/ but not C:\). This is because the backslash is the escape character in PHP.
■ The move_uploaded_file() function will overwrite an existing file without warning if the new and existing files both have the same name.
■ The MAX_FILE_SIZE is a restriction in the browser as to how large a file can be, although not all browsers abide by this restriction. The PHP configuration file has its own restrictions. You can also validate the uploaded file size within the receiving PHP script.

PHP and JavaScript
Although PHP and JavaScript are fundamentally different technologies, they can be used together to make better Web sites. The most significant difference between the two languages is that JavaScript is client-side (meaning it runs in the Web browser) and PHP is server-side. Therefore, JavaScript can do such things as detect the size of the browser window, create pop-up windows, and make image mouseovers, whereas PHP can do nothing like these things.
But while PHP cannot do certain things that JavaScript can, PHP can be used to create or work with JavaScript (just as PHP can create HTML). In this example, PHP will list all the images uploaded by the upload_image.php script and make clickable links using their names. The links themselves will call a JavaScript function that creates a pop-up window. This example will in no way be a thorough discussion of JavaScript, but it does adequately demonstrate how the two technologies, PHP and JavaScript, can be used together.
Along with the JavaScript, three new PHP functions are used in this example. The first, getimagesize(), returns an array of information for a given image (Table 10.5). The second, scandir(), returns an array listing the files in a given directory (it was added in PHP 5).
The third, filesize(), returns the size of a file in bytes.

Table 10.5 The getimagesize() Array. The getimagesize() function returns this array of data.

    Element   Value                           Example
    0         image’s width in pixels         423
    1         image’s height in pixels        368
    2         image’s type                    2 (representing JPG)
    3         appropriate HTML img tag data   height="368" width="423"
    mime      image’s MIME type               image/png

To create JavaScript with PHP:
1. Begin a new PHP document in your text editor or IDE (Script 10.4).

6. Create the introductory text and begin the table.

    <p>Click on an image to view it in a separate window.</p>
    <table>
        <tr>
            <td><b>Image Name</b></td>
            <td><b>Image Size</b></td>
        </tr>

Not a lot of effort is being put into the appearance of the page. It will be just one table with a caption (Figure 10.21).

Figure 10.21 This PHP page has a caption and a table that lists all the images, along with their file sizes.

7. Start the PHP code and create an array of images by referring to the uploads directory.

9. Get the image information and encode its name.

    $image_size = getimagesize("$dir/$image");
    $file_size = round((filesize("$dir/$image")) / 1024) . "kb";
    $image = urlencode($image);

Three PHP functions are used here that haven’t been used before (for more information, check the PHP manual). The getimagesize() function returns an array of information about an image (Table 10.5). The values returned by this function will be used to set the width and height sent to the create_window() JavaScript function.
The filesize() function returns the size of a file in bytes. To calculate the kilobytes of a file, divide this number by 1,024 (there are that many bytes in a kilobyte) and round it off.
Lastly, the urlencode() function makes a string safe to pass in a URL. Because the image name may contain characters not allowed in a URL (and it will be passed in the URL when invoking show_image.php), the name should be encoded.

10. Print the table row.

    echo "\t<tr>
    \t\t<td><a href=\"javascript:create_window('$image',$image_size[0],$image_size[1])\">$image</a></td>
    \t\t<td>$file_size</td>
    \t</tr>\n";

Finally, the loop creates the HTML table row, consisting of the linked image name and the image size. The caption is linked as a call to the JavaScript create_window() function so that when the link is clicked, that function is executed. To make the HTML source more legible, tabs (\t) and newline characters (\n) are printed as well.

11. Complete the PHP code and the HTML page.

        } // End of the IF.
    } // End of the foreach loop.
    ?>
    </table>
    </body>
    </html>

12. Save the file as images.php, place it in your Web directory (in the same directory as upload_image.php), and test it in your Web browser (Figure 10.21).

13. View the source code to see the dynamically generated links (Figure 10.22).
Notice how the parameters to each function call are appropriate to the specific image.

Figure 10.22 Each image’s name is linked as a call to a JavaScript function. The function call’s parameters were created by PHP.

✔ Tips
■ Some versions of Windows create a Thumbs.db file in a folder of images. You might want to check for this value in the conditional in Step 8 that weeds out some returned items. That code would be

    if ( (substr($image, 0, 1) != '.') && ($image != 'Thumbs.db') ) {

■ Not to belabor the point, but most everything Web developers do with JavaScript (for example, resize or move the browser window) cannot be done using the server-side PHP.
■ There is a little overlap between PHP and JavaScript. Both can set and read cookies, create HTML, and do some browser detection.

Understanding HTTP Headers
This chapter will conclude by discussing how you can use HTTP headers with your PHP scripts. HTTP (Hypertext Transfer Protocol) is the technology at the heart of the World Wide Web and defines the way clients and servers communicate (in layman’s terms). When a browser requests a Web page, it receives a series of HTTP headers in return. This happens behind the scenes, of course; most users aren’t aware of this at all.
PHP’s built-in header() function can be used to take advantage of this protocol. The most common example of this will be demonstrated in the next chapter, when the header() function will be used to redirect the Web browser from the current page to another. Here, you’ll use it to send files to the Web browser.
In theory, the header() function is easy to use.
Its syntax is

    header(header string);

The list of possible header strings is quite long, as headers are used for everything from redirecting the Web browser to sending files to sending cookies to controlling page caching and much, much more. Starting with something simple, to use header() to redirect the Web browser, type

    header ('Location: http://www.example.com/page.php');

That line will send the Web browser from the page it’s on over to that URL.
In this next example, which will send a file to the Web browser, three header calls are used. The first is Content-Type. This is an indication to the Web browser of what kind of data is about to follow. The Content-Type value matches the data’s MIME type. This line lets the browser know it’s about to receive a PDF file:

    header("Content-Type:application/pdf\n");

Next, you can use Content-Disposition, which tells the browser how to treat the data:

    header ("Content-Disposition: attachment; filename=\"somefile.pdf\"\n");

The attachment value will prompt the browser to download the file (Figure 10.23). An alternative is to use inline, which tells the browser to display the data, assuming that the browser can. The filename attribute is just that: it tells the browser the name associated with the data.

Figure 10.23 Firefox prompts the user to download the file because of the attachment Content-Disposition value.

Figure 10.24 The headers already sent error means that the Web browser was sent something—HTML, plain text, even a space—prior to using the header() function.

A third header to use for downloading files is Content-Length. This is a value, in bytes, corresponding to the amount of data to be sent.

    header ("Content-Length: 4096\n");

That’s the basics with respect to using the header() function. Before getting to the example, note that if a script uses multiple header() calls, each should be terminated by a newline (\n) as in the preceding code snippets.
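The three headers just described can be gathered in one place. What follows is only a minimal sketch, not part of the book's scripts: the function name build_download_headers() and its parameter names are hypothetical, and it assumes you already know the file's MIME type and size. It keeps the trailing newline convention used in the snippets above.

```php
<?php
// Hypothetical helper (an assumption for illustration, not from the book):
// returns the three header strings discussed above for a downloadable file.
function build_download_headers($filename, $mime_type, $size_in_bytes) {
    // Keep only the file's base name, and strip quotes and line breaks
    // so the value cannot break out of the header line.
    $safe_name = str_replace(array('"', "\r", "\n"), '', basename($filename));
    return array(
        "Content-Type: $mime_type\n",
        "Content-Disposition: attachment; filename=\"$safe_name\"\n",
        'Content-Length: ' . (int) $size_in_bytes . "\n",
    );
}

// Possible usage, before any other output is sent:
// foreach (build_download_headers('somefile.pdf', 'application/pdf', 4096) as $h) {
//     header($h);
// }
```

Building the strings separately from sending them makes it easy to inspect what will be sent while debugging.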
More importantly, the absolutely critical thing to remember about the header() function is that it must be called before anything is sent to the Web browser. This includes HTML or even blank spaces. If your code has any echo() or print() statements, has blank lines outside of PHP tags, or includes files that do any of these things before calling header(), you’ll see an error message like that in Figure 10.24.

To use the header() function:
1. Begin a new PHP document in your text editor or IDE (Script 10.5).

5. Complete the conditionals begun in Steps 2 and 3.

        } // End of file_exists() IF.
    } // End of isset($_GET['image']) IF.

6. If no valid image was received by this page, use a default image.

    if (!$name) {
        $image = 'images/unavailable.png';
        $name = 'unavailable.png';
    }

If the image doesn’t exist, if it isn’t a file, or if it doesn’t have the proper extension, then the $name variable will still have a value of FALSE. In such cases, a default image will be used instead (Figure 10.25). The image itself can be downloaded from the book’s corresponding Web site (www.DMCInsights.com/phpmysql3/, see the Extras page) and should be placed in an images folder. The images folder should be in the same directory as this script, not in the same directory as the uploads folder.

Figure 10.25 This image will be shown any time there’s a problem with showing the requested image.

7. Retrieve the image information.

    $info = getimagesize($image);
    $fs = filesize($image);

To send the file to the Web browser, the script needs to know the file’s type and size. The file’s type can be found using getimagesize(). The file’s size, in bytes, is found using filesize(). Because the $image variable represents either ../uploads/{$_GET['image']} or images/unavailable.png, these lines will work on both the correct and the unavailable image.

8. Send the file.

    header("Content-Type: {$info['mime']}\n");
    header("Content-Disposition: inline; filename=\"$name\"\n");
    header("Content-Length: $fs\n");
    readfile($image);

These header() calls will send the file data to the Web browser. The first line uses the image’s MIME type for the Content-Type. The second line tells the browser the name of the file and that it should be displayed in the browser (inline). The last header() function indicates how much data is to be expected. The file data itself is sent using the readfile() function, which reads in a file and immediately sends the content to the Web browser.

9. Complete the page.

    ?>

Notice that this page contains no HTML. It only sends an image file to the Web browser.

10. Save the file as show_image.php, place it in your Web directory, in the same folder as images.php, and test it in your Web browser by clicking a link in images.php (Figure 10.26).

✔ Tips
■ I cannot stress strongly enough that nothing can be sent to the Web browser before using the header() function. Even an included file that has a blank line after the closing PHP tag will make the header() function unusable.
■ To avoid problems when using header(), you can call the headers_sent() function first. It returns a Boolean value indicating if something has already been sent to the Web browser:

    if (!headers_sent()) {
        // Use the header() function.
    } else {
        // Do something else.
    }

Output buffering, demonstrated in Chapter 16, “Example—User Registration,” can also prevent problems when using header().
■ Debugging scripts like this, where PHP sends data, not text, to the Web browser, can be challenging. For help, use the Live HTTP Headers plug-in for Firefox (Figure 10.27).
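The headers_sent() check from the tips can be wrapped in a small function. This is a sketch under stated assumptions rather than anything from the book: the name send_header_if_possible() is hypothetical, and logging via error_log() is just one possible choice for the "do something else" branch. It relies on the optional by-reference arguments of headers_sent(), which report where output began.

```php
<?php
// Hypothetical wrapper (an assumption for illustration, not from the book).
// Sends the header only if no output has gone to the browser yet.
function send_header_if_possible($header_string) {
    // headers_sent() fills $file and $line with the place output started.
    if (headers_sent($file, $line)) {
        // Too late for header(); record where the output began instead.
        error_log("Headers already sent in $file on line $line");
        return false;
    }
    header($header_string);
    return true;
}

// Possible usage:
// if (!send_header_if_possible('Location: http://www.example.com/page.php')) {
//     echo 'Please continue to page.php.';
// }
```

The return value lets the calling script decide on a fallback, such as printing a plain link, instead of showing the user a warning.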
Figure 10.26 This image is displayed by having PHP send the file to the Web browser.

Figure 10.27 The Live HTTP Headers extension for Firefox shows what headers were sent by a page and/or server. This can be useful debugging information.

11 Cookies and Sessions
The Hypertext Transfer Protocol (HTTP) is a stateless technology, meaning that each individual HTML page is an unrelated entity. HTTP has no method for tracking users or retaining variables as a person traverses a site. Although your browser tracks the pages you visit, the server keeps no record of who has seen what. Without the server being able to track a user, there can be no shopping carts or custom Web site personalization. Using a server-side technology like PHP, you can overcome the statelessness of the Web. The two best PHP tools for this purpose are cookies and sessions.
As you probably already know, cookies store data in the user’s Web browser. When the user accesses a page on the site from which the cookie came, the server can read the data from that cookie. Sessions store data on the server itself. Sessions are generally more secure than cookies and can store much more information. Both technologies are easy to use with PHP and are worth knowing.
In this chapter you’ll see uses of both cookies and sessions. The examples for demonstrating this information will be a login system, based upon the existing users database.

Making a Login Page
A login process involves just a few components:
◆ A form for submitting the login information
◆ A validation routine that confirms the necessary information was submitted
◆ A database query that compares the submitted information against the stored information
◆ Cookies or sessions to store data that reflects a successful login
Subsequent pages will then contain checks to confirm that the user is logged in (to limit access to that page).
There is also, of course, a logging out process, which involves clearing out the cookies or session data representing a logged-in status.
To start all this, let’s take some of these common elements and place them into separate files. Then, the pages that require this functionality can include the necessary files. Breaking up the logic this way will make some of the following scripts easier to read and write, plus cut down on their redundancies. I’ve designed two includable files. This first one will contain the bulk of a login page, including the header, the error reporting, the form, and the footer (Figure 11.1).

Script 11.1 The login_page.inc.php script creates the complete login page, including the form, and reports any errors. It will be included by other pages that need to show the login page.

Figure 11.1 The login form and page.

To make a login page:
1. Create a new PHP page in your text editor or IDE (Script 11.1).

    if (!empty($errors)) {
        echo '<h1>Error!</h1>
        <p class="error">The following error(s) occurred:<br />';
        foreach ($errors as $msg) {
            echo " - $msg<br />\n";
        }
        echo '</p><p>Please try again.</p>';
    }

This code was also developed back in Chapter 8. If any errors exist (in the $errors array variable), they’ll be printed as an unordered list (Figure 11.2).

Figure 11.2 The login form and page, with error reporting.

4. Display the form.

    ?>
    <h1>Login</h1>
    <form action="login.php" method="post">
        <p>Email Address: <input type="text" name="email" /></p>
        <p>Password: <input type="password" name="pass" /></p>
        <p><input type="submit" name="submit" value="Login" /></p>
        <input type="hidden" name="submitted" value="TRUE" />
    </form>

The HTML form only needs two text inputs: one for an email address and a second for the password. The names of the inputs match those in the users table of the sitename database (which this login system is based upon).
To make it easier to create the HTML form, the PHP section is closed first. The form is not sticky, but you could easily add code to accomplish that (but only for the email address, as passwords can’t be sticky).

5. Complete the page.

    <?php include ('includes/footer.html'); ?>

6. Save the file as login_page.inc.php and place it in your Web directory (in the includes folder, along with the files from Chapter 8: header.html, footer.html, and style.css).
The page will use a .inc.php extension to indicate both that it’s an includable file and that it contains PHP code.

✔ Tip
■ It may seem illogical that this script includes the header and footer file from within the includes directory when this script will also be within that same directory. This code works because this script will be included by pages within the main directory; thus the include references are with respect to the parent file, not this one.

Making the Login Functions
Along with the login page that was stored in login_page.inc.php, there’s a little bit of functionality that will be common to several scripts in this chapter. In this next script, also to be included by other pages in the login/logout system, two functions will be defined.
Many pages will end up redirecting the user from one page to another. For example, upon successfully logging in, the user will be taken to loggedin.php. If a user accesses loggedin.php and they aren’t logged in, they should be taken to index.php. Redirection uses the header() function, introduced in Chapter 10, “Web Application Development.” The syntax for redirection is

    header ('Location: http://www.example.com/page.php');

Because this function will send the browser to page.php, the current script should be terminated using exit() immediately after this:

    header ('Location: http://www.example.com/page.php');
    exit();

If you don’t do this, the current script will continue to run (just not in the Web browser).
The location value in the header() call should be an absolute URL (www.example.com/page.php instead of just page.php). You can hard-code this value or, better yet, dynamically determine it. The first function in this next script will do just that.
The other bit of code that will be used by multiple scripts in this chapter validates the login form. This is a three-step process:
1. Confirm that an email address was provided.
2. Confirm that a password was provided.
3. Confirm that the provided email address and password match those stored in the database (during the registration process).
So this next script will define two different functions. The details of how each function works will be explained in the steps that follow.

To create the login functions:
1. Create a new PHP document in your text editor or IDE (Script 11.2).

... script is being run (as long as the redirection is taking place within that directory).

6. Begin a new function.

    function check_login($dbc, $email = '', $pass = '') {

This function will validate the login information. It takes three arguments: the database connection, which is required; the email address, which is optional; and the password, which is also optional.
Although this function could access $_POST['email'] and $_POST['pass'] directly, it’s better if the function is passed these values, making the function more independent.

7. Validate the email address and password.
    $errors = array();
    if (empty($email)) {
        $errors[] = 'You forgot to enter your email address.';
    } else {
        $e = mysqli_real_escape_string($dbc, trim($email));
    }
    if (empty($pass)) {
        $errors[] = 'You forgot to enter your password.';
    } else {
        $p = mysqli_real_escape_string($dbc, trim($pass));
    }

This validation routine is similar to that used in the registration page. If any problems occur, they’ll be added to the $errors array, which will eventually be used on the login page (see Figure 11.2).

8. If no errors occurred, run the database query.

    if (empty($errors)) {
        $q = "SELECT user_id, first_name FROM users WHERE email='$e' AND pass=SHA1('$p')";
        $r = @mysqli_query ($dbc, $q);

The query selects the user_id and first_name values from the database where the submitted email address (from the form) matches the stored email address and the SHA1() version of the submitted password matches the stored password (Figure 11.3).

9. Check the results of the query.

        if (mysqli_num_rows($r) == 1) {
            $row = mysqli_fetch_array ($r, MYSQLI_ASSOC);
            return array(true, $row);
        } else {
            $errors[] = 'The email address and password entered do not match those on file.';
        }

If the query returned one row, then the login information was correct. The results are then fetched into $row. The final step in a successful login is to return two pieces of information back to the requesting script: the value true, indicating that the login was a success; and the data fetched from MySQL. Using the array() function, both the Boolean value and the $row array can be returned by this function.

If the query did not return one row, then an error message is added to the array. It will end up being displayed on the login page (Figure 11.4).

10. Complete the conditional begun in Step 8 and complete the function.

        } // End of empty($errors) IF.
        return array(false, $errors);
    } // End of check_login() function.
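As a self-contained illustration of the two-element return convention used by check_login(), here is a minimal sketch that can be run without a database. It is not the book’s function: check_login_sketch() and the $users array are hypothetical stand-ins for the MySQL query, and sha1() is used only to mirror the SHA1() call in the query.

```php
<?php
// Hypothetical stand-in for check_login(): same return convention, no database.
// $users maps an email address to a stored password hash and a result row.
function check_login_sketch(array $users, $email = '', $pass = '') {
    $errors = array();
    if (empty($email)) {
        $errors[] = 'You forgot to enter your email address.';
    }
    if (empty($pass)) {
        $errors[] = 'You forgot to enter your password.';
    }
    if (empty($errors)) {
        $email = trim($email);
        // The array lookup plays the role of the SELECT query.
        if (isset($users[$email]) && $users[$email]['pass'] === sha1(trim($pass))) {
            return array(true, $users[$email]['row']); // Success: flag plus data.
        }
        $errors[] = 'The email address and password entered do not match those on file.';
    }
    return array(false, $errors); // Failure: flag plus the reasons.
}

// Illustrative data only.
$users = array(
    'email@example.com' => array(
        'pass' => sha1('secret'),
        'row'  => array('user_id' => 1, 'first_name' => 'Larry'),
    ),
);

// The caller unpacks the flag and the payload in one statement.
list($check, $data) = check_login_sketch($users, 'email@example.com', 'secret');
list($check2, $data2) = check_login_sketch($users, 'email@example.com', '');
```

After the first call, $check is true and $data holds the user’s row; after the second, $check2 is false and $data2 holds the error messages. Returning both values in one array lets the calling script branch on the flag and then use the second element either way.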
The final step is for the function to return a value of false, indicating that login failed, and to return the $errors array, which stores the reason(s) for failure. This return statement can be placed here (at the end of the function instead of within a conditional) because the function will only get to this point if the login failed. If the login succeeded, the return line in Step 9 will stop the function from continuing (a function stops as soon as it executes a return).

11. Complete the page.

    ?>

12. Save the file as login_functions.inc.php and place it in your Web directory (in the includes folder, along with header.html, footer.html, and style.css).

This page will also use a .inc.php extension to indicate both that it’s an includable file and that it contains PHP code.

✔ Tips

■ The scripts in this chapter include no debugging code (like the MySQL error or query). If you have problems with these scripts, apply the debugging techniques outlined in Chapter 7, “Error Handling and Debugging.”

■ You can add name=value pairs to the URL in a header() call to pass values to the target page:

    $url .= '?name=' . urlencode(value);

Figure 11.3 The results of the login query if the user submitted the proper email address/password combination.

Figure 11.4 If the user entered an email address and password, but they don’t match the values stored in the database, this is the result.

Using Cookies

Cookies are a way for a server to store information on the user’s machine. This is one way that a site can remember or track a user over the course of a visit. Think of a cookie as being like a name tag: you tell the server your name and it gives you a sticker to wear. Then it can know who you are by referring back to that name tag.

Some people are suspicious of cookies because they believe that cookies allow a server to know too much about them.
However, a cookie can only be used to store information that the server is given, so it’s no less secure than most anything else online. Unfortunately, many people still have misconceptions about the technology, which is a problem, as those misconceptions can undermine the functionality of your Web application.

In this section you will learn how to set a cookie, retrieve information from a stored cookie, alter a cookie’s settings, and then delete a cookie.

Setting cookies

The most important thing to understand about cookies is that they must be sent from the server to the client prior to any other information. Should the server attempt to send a cookie after the Web browser has already received HTML (even an extraneous white space), an error message will result and the cookie will not be sent (Figure 11.5). This is by far the most common cookie-related error but is easily fixed.

Testing for Cookies

To effectively program using cookies, you need to be able to accurately test for their presence. The best way to do so is to have your Web browser ask what to do when receiving a cookie. In such a case, the browser will prompt you with the cookie information each time PHP attempts to send a cookie.

Different versions of different browsers on different platforms all define their cookie handling policies in different places. I’ll quickly run through a couple of options for popular Web browsers.

To set this up using Internet Explorer on Windows XP, choose Tools > Internet Options. Then click the Privacy tab, followed by the Advanced button under Settings. Click “Override automatic cookie handling” and then choose “Prompt” for both First- and Third-party Cookies.

Using Firefox on Windows, choose Tools > Options > Privacy. In the Cookies section, select “ask me every time” in the “Keep until” drop-down menu. If you are using Firefox on Mac OS X, the steps are the same, but you start by choosing Firefox > Preferences.
Unfortunately, Safari on Mac OS X does not have a cookie prompting option, but it will allow you to view existing cookies, which is still a useful debugging tool. This option can be found under the Security pane of Safari’s Preferences panel.

Figure 11.5 The headers already sent… error message is all too common when creating cookies. Pay attention to what the error message says in order to find and fix the problem.

Cookies are sent via the setcookie() function:

    setcookie (name, value);
    setcookie ('name', 'Nicole');

The second line of code will send a cookie to the browser with a name of name and a value of Nicole (Figure 11.6).

You can continue to send more cookies to the browser with subsequent uses of the setcookie() function:

    setcookie ('ID', 263);
    setcookie ('email', 'email@example.com');

As when using any variable in PHP, when naming your cookies, do not use white spaces or punctuation, but do pay attention to the exact case used.

To send a cookie:

1. Create a new PHP document in your text editor (Script 11.3).

4. Redirect the user to another page.

    $url = absolute_url ('loggedin.php');
    header("Location: $url");
    exit();

Using the steps outlined earlier in the chapter, the redirection URL is first dynamically generated and returned by the absolute_url() function. The specific page to be redirected to is loggedin.php. The absolute URL is then used in the header() function and the script’s execution is terminated with exit().

5. Complete the $check conditional (started in Step 3) and then close the database connection.

    } else {
        $errors = $data;
    }
    mysqli_close($dbc);

If $check has a false value, then the $data variable is storing the errors generated within the check_login() function. If so, they should be assigned to the $errors variable, because that’s what the code in the script that displays the login page (login_page.inc.php) is expecting.

6. Complete the main submit conditional and include the login page.
    }
    include ('includes/login_page.inc.php');
    ?>

This login.php script primarily validates the login form by calling the check_login() function. The login_page.inc.php file contains the login page itself, so it just needs to be included.

7. Save the file as login.php, place it in your Web directory (in the same folder as the files from Chapter 8), and load this page in your Web browser (see Figure 11.2).

✔ Tips

■ Cookies are limited to about 4 KB of total data, and each Web browser can remember a limited number of cookies from any one site. This limit is 50 cookies for most of the current Web browsers (but if you’re sending out 50 different cookies, you may want to rethink how you do things).

■ The setcookie() function is one of the few functions in PHP that could have different results in different browsers, since each browser treats cookies in its own way. Be sure to test your Web sites in multiple browsers on different platforms to ensure consistency.

■ If either of the first two included files sends anything to the Web browser, or even has blank lines or spaces after the closing PHP tag, you’ll see a headers already sent error. If you see such an error, go to the document and line number referenced in the error (after output started at) and fix the problem.

Accessing cookies

To retrieve a value from a cookie, you only need to refer to the $_COOKIE superglobal, using the appropriate cookie name as the key (as you would with any array). For example, to retrieve the value of the cookie established with the line

    setcookie ('username', 'Trout');

you would refer to $_COOKIE['username'].

In the following example, the cookies set by the login.php script will be accessed in two ways. First a check will be made that the user is logged in (otherwise, they shouldn’t be accessing this page). Second, the user will be greeted by their first name, which was stored in a cookie.

To access a cookie:

1.
Create a new PHP document in your text editor (Script 11.4).

Script 11.4 The loggedin.php script prints a greeting to a user based upon a stored cookie.

✔ Tips

■ A cookie is not accessible until the setting page (e.g., login.php) has been reloaded or another page has been accessed (in other words, you cannot set and access a cookie in the same page).

■ If users decline a cookie or have their Web browser set not to accept them, they will automatically be redirected to the home page in this example, even if they successfully logged in. For this reason you may want to let the user know that cookies are required.

If the user is not logged in, they will be automatically redirected to the main page. This is a simple way to limit access to content.

4. Include the page header.

    $page_title = 'Logged In!';
    include ('includes/header.html');

5. Welcome the user, using the cookie.

    echo "

    <h1>Logged In!</h1>
    <p>You are now logged in, {$_COOKIE['first_name']}!</p>
    <p><a href=\"logout.php\">Logout</a></p>
    ";

To greet the user by name, refer to the $_COOKIE['first_name'] variable (enclosed within curly braces to avoid parse errors). A link to the logout page (to be written later in the chapter) is also printed.

6. Complete the HTML page.

    include ('includes/footer.html');
    ?>

7. Save the file as loggedin.php, place it in your Web directory (in the same folder as login.php), and test it in your Web browser by logging in through login.php (Figure 11.7).

Since these examples use the same database as those in Chapter 8, you should be able to log in using the email address and password you registered with at that time.

8. To see the cookies being set (Figures 11.8 and 11.9), change the cookie settings for your browser and test again.

Figure 11.7 If you used the correct email address and password, you’ll be redirected here after logging in.

Figure 11.8 The user_id cookie with a value of 1.

Figure 11.9 The first_name cookie with a value of Larry (yours might be different).

Setting cookie parameters

Although passing just the name and value arguments to the setcookie() function will suffice, you ought to be aware of the other arguments available. The function can take up to five more parameters, each of which will alter the definition of the cookie.

    setcookie (name, value, expiration, path, host, secure, httponly);

The expiration argument is used to set a definitive length of time for a cookie to exist, specified in seconds since the epoch (the epoch is midnight on January 1, 1970). If it is not set or if it’s set to a value of 0, the cookie will continue to be functional until the user closes their browser. These cookies are said to last for the browser session (also indicated in Figures 11.8 and 11.9).

To set a specific expiration time, add a number of minutes or hours to the current moment, retrieved using the time() function.
The following line will set the expiration time of the cookie to be 30 minutes (60 seconds times 30 minutes) from the current moment:

    setcookie (name, value, time()+1800);

The path and host arguments are used to limit a cookie to a specific folder within a Web site (the path) or to a specific host (www.example.com or 192.168.0.1). For example, you could restrict a cookie to exist only while a user is within the admin folder of a domain (and the admin folder’s subfolders):

    setcookie (name, value, expire, '/admin/');

Setting the path to / will make the cookie visible within an entire domain (Web site). Setting the domain to .example.com will make the cookie visible within an entire domain and every subdomain (www.example.com, admin.example.com, pages.example.com, etc.).

The secure value dictates that a cookie should only be sent over a secure HTTPS connection. A 1 indicates that a secure connection must be used, and a 0 says that a standard connection is fine.

    setcookie (name, value, expire, path, host, 1);

If your site is using a secure connection, restricting cookies to HTTPS will be much more secure than not doing so.

Finally, added in PHP 5.2 is the httponly argument. A Boolean value is used to make the cookie only accessible through HTTP (and HTTPS). Enforcing this restriction will make the cookie more secure (preventing some hack attempts) but is not supported by all browsers at the time of this writing.

    setcookie (name, value, expire, path, host, secure, TRUE);

As with all functions that take arguments, you must pass the setcookie() values in order. To skip any parameter, use NULL, 0, or an empty string (don’t use FALSE). The expiration and secure values are both integers and are therefore not quoted.

To demonstrate this information, let’s add an expiration setting to the login cookies so that they last for only one hour.

To set a cookie’s parameters:

1. Open login.php in your text editor (refer to Script 11.3).

2.
Change the two setcookie() lines to include an expiration date that’s 60 minutes away (Script 11.5):

    setcookie ('user_id', $data['user_id'], time()+3600, '/', '', 0, 0);
    setcookie ('first_name', $data['first_name'], time()+3600, '/', '', 0, 0);

With the expiration date set to time() + 3600 (60 minutes times 60 seconds), the cookie will continue to exist for an hour after it is set. While making this change, every other parameter is explicitly addressed.

For the final parameter, which accepts a Boolean value, you can also use 0 to represent false (PHP will handle the conversion for you). Doing so is a good idea, as using false in any of the cookie arguments can cause problems.

Script 11.5 The login.php script now uses every argument the setcookie() function can take.

3. Save the script, place it in your Web directory, and test it in your Web browser by logging in (Figure 11.10).

✔ Tips

■ Some browsers have difficulties with cookies that do not list every argument. Explicitly stating every parameter, even as an empty string, will achieve more reliable results across all browsers.

■ Here are some general guidelines for cookie expirations: If the cookie should last as long as the session, do not set an expiration time; if the cookie should continue to exist after the user has closed and reopened his or her browser, set an expiration time weeks or months ahead; and if the cookie can constitute a security risk, set an expiration time of an hour or fraction thereof so that the cookie does not continue to exist too long after a user has left his or her browser.

■ For security purposes, you could set a five- or ten-minute expiration time on a cookie and have the cookie resent with every new page the user visits (assuming that the cookie exists).
This way, the cookie will continue to persist as long as the user is active but will automatically die five or ten minutes after the user’s last action.

■ E-commerce and other privacy-related Web applications should use an SSL (Secure Sockets Layer) connection for all transactions, including the cookie.

■ Be careful with cookies created by scripts within a directory. If the path isn’t specified, then that cookie will only be available to other scripts within that same directory.

Figure 11.10 Changes to the setcookie() parameters, like an expiration date and time, will be reflected in the cookie sent to the Web browser (compare with Figure 11.9).

Deleting cookies

The final thing to understand about using cookies is how to delete one. While a cookie will automatically expire when the user’s browser is closed or when the expiration date/time is met, sometimes you’ll want to manually delete the cookie instead. For example, in Web sites that have login capabilities, you will want to delete any cookies when the user logs out.

Although the setcookie() function can take up to seven arguments, only one is actually required: the cookie name. If you send a cookie that consists of a name without a value, it will have the same effect as deleting the existing cookie of the same name. For example, to create the cookie first_name, you use this line:

    setcookie('first_name', 'Tyler');

To delete the first_name cookie, you would code:

    setcookie('first_name');

As an added precaution, you can also set an expiration date that’s in the past.

    setcookie('first_name', '', time()-3600);

To demonstrate all of this, let’s add a logout capability to the site. The link to the logout page appears on loggedin.php. As an added feature, the header file will be altered so that a Logout link appears when the user is logged in and a Login link appears when the user is logged out.

To delete a cookie:

1.
Create a new PHP document in your text editor or IDE (Script 11.6).

Script 11.6 The logout.php script deletes the previously established cookies.

2. Check for the existence of a user_id cookie; if it is not present, redirect the user.

    if (!isset($_COOKIE['user_id'])) {
        require_once ('includes/login_functions.inc.php');
        $url = absolute_url();
        header("Location: $url");
        exit();

As with loggedin.php, if the user is not already logged in, this page should redirect the user to the home page. There’s no point in trying to log out a user that isn’t logged in!

3. Delete the cookies, if they exist.

    } else {
        setcookie ('first_name', '', time()-3600, '/', '', 0, 0);
        setcookie ('user_id', '', time()-3600, '/', '', 0, 0);
    }

If the user is logged in, these two cookies will effectively delete the existing ones. Except for the value and the expiration, the other arguments should have the same values as they do when the cookies were created.

4. Make the remainder of the PHP page.

    $page_title = 'Logged Out!';
    include ('includes/header.html');
    echo "

    <h1>Logged Out!</h1>
    <p>You are now logged out, {$_COOKIE['first_name']}!</p>
    ";
    include ('includes/footer.html');
    ?>

The page itself is also much like the loggedin.php page. Although it may seem odd that you can still refer to the first_name cookie (that you just deleted in this script), it makes perfect sense considering the process:

A) This page is requested by the client.
B) The server reads the available cookies from the client’s browser.
C) The page is run and does its thing (including sending new cookies that delete the existing ones).

So, in short, the original first_name cookie data is available to this script when it first runs. The set of cookies sent by this page (the delete cookies) aren’t available to this page, so the original values are still usable.

5. Save the file as logout.php and place it in your Web directory (in the same folder as login.php).

To create the logout link:

1. Open header.html (refer to Script 8.1) in your text editor or IDE.

2. Change the fifth and final link to (Script 11.7)
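    <li><?php
    if ( (isset($_COOKIE['user_id'])) && (!strpos($_SERVER['PHP_SELF'], 'logout.php')) ) {
        echo '<a href="logout.php">Logout</a>';
    } else {
        echo '<a href="login.php">Login</a>';
    }
    ?></li>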
Instead of having a permanent login link in the navigation area, it should display a Logout link if the user is logged in or a Login link if the user is not. The preceding conditional will accomplish just that, depending upon the presence of a cookie.
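The two checks made by this step’s conditional (is the user_id cookie present, and is the current page something other than logout.php) can be tried outside of header.html. What follows is a minimal sketch, not the book’s script: nav_link() is a hypothetical helper, $cookies stands in for $_COOKIE, and $script stands in for $_SERVER['PHP_SELF']. It compares the result of strpos() against false explicitly, which is an assumed, stricter style than the header file itself uses.

```php
<?php
// Hypothetical helper: decide which navigation link to print.
// $cookies stands in for $_COOKIE; $script stands in for $_SERVER['PHP_SELF'].
function nav_link(array $cookies, $script) {
    $logged_in = isset($cookies['user_id']);
    // strpos() returns false when the substring is absent. Comparing with
    // !== false avoids confusing "found at position 0" with "not found".
    $on_logout_page = (strpos($script, 'logout.php') !== false);
    if ($logged_in && !$on_logout_page) {
        return '<a href="logout.php">Logout</a>';
    }
    return '<a href="login.php">Login</a>';
}

// Illustrative calls only.
$on_home   = nav_link(array('user_id' => 1), '/index.php');
$on_logout = nav_link(array('user_id' => 1), '/logout.php');
$anonymous = nav_link(array(), '/index.php');
```

Because PHP_SELF normally begins with a slash, a match on logout.php is not expected at position 0, so the simpler !strpos() test behaves the same way in practice; the explicit comparison just makes the intent visible.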
Because the logout.php script would ordinarily display a logout link (because the cookie exists when the page is first being viewed), the conditional has to check that the current page is not the logout.php script. The strpos() function, which checks if one string is found within another string, is an easy way to accomplish this.

3. Save the file, place it in your Web directory (within the includes folder), and test the login/logout process in your Web browser (Figures 11.11, 11.12, and 11.13).

✔ Tips

■ To see the result of the setcookie() calls in the logout.php script, turn on cookie prompting in your browser (Figure 11.14).

■ Due to a bug in how Internet Explorer on Windows handles cookies, you may need to set the host parameter to false (without quotes) in order to get the logout process to work when developing on your own computer (i.e., through localhost).

■ When deleting a cookie, you should always use the same parameters that were used to set the cookie. If you set the host and path in the creation cookie, use them again in the deletion cookie.

■ To hammer the point home, remember that the deletion of a cookie does not take effect until the page has been reloaded or another page has been accessed. In other words, the cookie will still be available to a page after that page has deleted it.

Figure 11.11 The home page with a Login link.

Figure 11.12 After the user logs in, the page now has a Logout link.

Figure 11.13 The result after logging out.

Figure 11.14 This is how the deletion cookie appears in a Firefox prompt.

Using Sessions

Another method of making data available to multiple pages of a Web site is to use sessions. The premise of a session is that data is stored on the server, not in the Web browser, and a session identifier is used to locate a particular user’s record (the session data).
This session identifier is normally stored in the user’s Web browser via a cookie, but the sensitive data itself (like the user’s ID, name, and so on) always remains on the server.

The question may arise: why use sessions at all when cookies work just fine? First of all, sessions are likely more secure in that all of the recorded information is stored on the server and not continually sent back and forth between the server and the client. Second, you can store more data in a session. Third, some users reject cookies or turn them off completely. Sessions, while designed to work with a cookie, can function without them, too.

To demonstrate sessions, and to compare them with cookies, let’s rewrite the previous set of scripts.

Setting session variables

The most important rule with respect to sessions is that each page that will use them must begin by calling the session_start() function. This function tells PHP to either begin a new session or access an existing one. This function must be called before anything is sent to the Web browser!

The first time this function is used, session_start() will attempt to send a cookie with a name of PHPSESSID (the session name) and a value of something like a61f8670baa8e90a30c878df89a2074b (32 hexadecimal letters, the session ID). Because of this attempt to send a cookie, session_start() must be called before any data is sent to the Web browser, as is the case when using the setcookie() and header() functions.

Once the session has been started, values can be registered to the session using the normal array syntax:

    $_SESSION['key'] = value;
    $_SESSION['name'] = 'Roxanne';
    $_SESSION['id'] = 48;

Let’s update the login.php script with this in mind.

Sessions vs. Cookies

This chapter has examples accomplishing the same tasks (logging in and logging out) using both cookies and sessions. Obviously, both are easy to use in PHP, but the true question is when to use one or the other.
Sessions have the following advantages over cookies:

◆ They are generally more secure (because the data is being retained on the server).
◆ They allow for more data to be stored.
◆ They can be used without cookies.

Whereas cookies have the following advantages over sessions:

◆ They are easier to program.
◆ They require less of the server.

In general, to store and retrieve just a couple of small pieces of information, use cookies. For most of your Web applications, though, you’ll use sessions.

To begin a session:

1. Open login.php (refer to Script 11.5) in your text editor or IDE.

2. Replace the setcookie() lines (12–14) with these lines (Script 11.8):

    session_start();
    $_SESSION['user_id'] = $data['user_id'];
    $_SESSION['first_name'] = $data['first_name'];

The first step is to begin the session. Since there are no echo() statements, inclusions of HTML files, or even blank spaces prior to this point in the script, it will be safe to use session_start() now (although it could be placed at the top of the script as well). Then, two key-value pairs are added to the $_SESSION superglobal array to register the user’s first name and user ID to the session.

3. Save the page as login.php, place it in your Web directory, and test it in your Web browser (Figure 11.15).

Although loggedin.php, the header file, and the logout script will need to be rewritten, you can still test the login script and see the resulting cookie (Figure 11.16). The loggedin.php page should redirect you back to the home page, though, as it’s still checking for the presence of a $_COOKIE variable.

Script 11.8 The login.php script now uses sessions instead of cookies.

✔ Tips

■ Because sessions will normally send and read cookies, you should always try to begin them as early in the script as possible. Doing so will help you avoid the problem of attempting to send a cookie after the headers (HTML or white space) have already been sent.
■ If you want, you can set session.auto_start in the php.ini file to 1, making it unnecessary to use session_start() on each page. This does put a greater toll on the server and, for that reason, shouldn’t be used without some consideration of the circumstances.

■ You can store arrays in sessions (making $_SESSION a multidimensional array), just as you can store strings or numbers.

Accessing session variables

Once a session has been started and variables have been registered to it, you can create other scripts that will access those variables. To do so, each script must first enable sessions, again using session_start().

This function will give the current script access to the previously started session (if it can read the PHPSESSID value stored in the cookie) or create a new session if it cannot. Understand that if the current session ID cannot be found and a new session ID is generated, none of the data stored under the old session ID will be available. I mention this here and now because if you’re having problems with sessions, checking the session ID value to see if it changes from one page to the next is the first debugging step.

Assuming that there was no problem accessing the current session, to then refer to a session variable, use $_SESSION['var'], as you would refer to any other array.

Figure 11.15 The login form remains unchanged to the end user, but the underlying functionality now uses sessions.

Figure 11.16 This cookie, created by PHP’s session_start() function, stores the session ID.

To access session variables:

1. Open loggedin.php (refer to Script 11.4) in your text editor or IDE.

2. Add a call to the session_start() function (Script 11.9).

    session_start();

Every PHP script that either sets or accesses session variables must use the session_start() function. This line must be called before the header.html file is included and before anything is sent to the Web browser.

3.
Replace the references to $_COOKIE with $_SESSION (lines 6 and 22 of the original file).

    if (!isset($_SESSION['user_id'])) {

and

    echo "<h1>Logged In!</h1>
    <p>You are now logged in, {$_SESSION['first_name']}!</p>
    <p><a href=\"logout.php\">Logout</a></p>";

Switching a script from cookies to sessions requires only that you change uses of $_COOKIE to $_SESSION (assuming that the same names were used).

4. Save the file as loggedin.php, place it in your Web directory, and test it in your browser (Figure 11.17).

Script 11.9 The loggedin.php script is updated so that it refers to $_SESSION and not $_COOKIE (changes are required on two lines).

Figure 11.17 After logging in, the user is redirected to loggedin.php, which will welcome the user by name using the stored session value.

5. Replace the reference to $_COOKIE with $_SESSION in header.html (from Script 11.7 to Script 11.10).

    if ( (isset($_SESSION['user_id'])) && (!strpos($_SERVER['PHP_SELF'], 'logout.php')) ) {

For the Login/Logout links to function properly (notice the incorrect link in Figure 11.17), the reference to the cookie variable within the header file must be switched over to sessions. The header file does not need to call the session_start() function, as it’ll be included by pages that do.

6. Save the header file, place it in your Web directory (in the includes folder), and test it in your browser (Figure 11.18).
Figure 11.18 With the header file altered for sessions, the proper Login/Logout links will be displayed (compare with Figure 11.17).

✔ Tips

■ For the Login/Logout links to work on the other pages (register.php, index.php, etc.), you’ll need to add the session_start() command to each of those.

■ As a reminder of what I already said, if you have an application where the session data does not seem to be accessible from one page to the next, it could be because a new session is being created on each page. To check for this, compare the session ID (the last few characters of the value will suffice) to see if it is the same. You can see the session’s ID by viewing the session cookie as it is sent or by invoking the session_id() function:

    echo session_id();

■ Session variables are available as soon as you’ve established them. So, unlike when using cookies, you can assign a value to $_SESSION['var'] and then refer to $_SESSION['var'] later in that same script.

Garbage Collection

Garbage collection with respect to sessions is the process of deleting the session files (where the actual data is stored). Creating a logout system that destroys a session is ideal, but there’s no guarantee all users will formally log out as they should. For this reason, PHP includes a cleanup process.

Whenever the session_start() function is called, PHP’s garbage collection kicks in, checking the last modification date of each session (a session is modified whenever variables are set or retrieved). Two settings dictate garbage collection: session.gc_maxlifetime and session.gc_probability. The first states after how many seconds of inactivity a session is considered idle and will therefore be deleted. The second setting determines the probability that garbage collection is performed, on a scale of 1 to 100.
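To see what these settings are on a given server, they can be read with ini_get(). This is a minimal sketch resting on one detail the sidebar doesn’t mention: PHP also has a session.gc_divisor setting (100 by default), and the chance of cleanup on any one request is session.gc_probability divided by that divisor. The variable names here are illustrative only, and the values printed depend entirely on the server’s php.ini.

```php
<?php
// Read the current garbage-collection settings (values depend on php.ini).
$max_lifetime = (int) ini_get('session.gc_maxlifetime');
$probability  = (int) ini_get('session.gc_probability');
$divisor      = (int) ini_get('session.gc_divisor');

// Chance that a single session_start() call triggers cleanup.
// Guard against a zero divisor so the sketch never divides by zero.
$chance = ($divisor > 0) ? $probability / $divisor : 0.0;

printf(
    "Idle sessions may be removed after %d seconds; cleanup chance per request: %.2f%%\n",
    $max_lifetime,
    $chance * 100
);
```

Reading the values first is a cautious habit: some installations disable PHP’s own cleanup (a probability of 0) and remove old session files by other means, in which case changing these settings with ini_set() would not have the expected effect.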
With the default settings, each call to session_start() has a 1 percent chance of invoking garbage collection. If PHP does start the cleanup, any sessions that have not been used in more than 1,440 seconds will be deleted.

You can change these settings using the ini_set() function, although be careful in doing so. Too frequent or too probable garbage collection can bog down the server and inadvertently end the sessions of slower users.

Deleting session variables

When using sessions, you ought to create a method to delete the session data. In the current example, this would be necessary when the user logs out.

Whereas a cookie system only requires that another cookie be sent to destroy the existing cookie, sessions are slightly more demanding, since there are both the cookie on the client and the data on the server to consider.

To delete an individual session variable, you can use the unset() function (which works with any variable in PHP):

    unset($_SESSION['var']);

To delete every session variable, reset the entire $_SESSION array:

    $_SESSION = array();

Finally, to remove all of the session data from the server, use session_destroy():

    session_destroy();

Note that prior to using any of these methods, the page must begin with session_start() so that the existing session is accessed. Let’s update the logout.php script to clean out the session data.

To delete a session:

1. Open logout.php (Script 11.6) in your text editor or IDE.

2. Immediately after the opening PHP line, start the session (Script 11.11).

    session_start();

Anytime you are using sessions, you must use the session_start() function, preferably at the very beginning of a page. This is true even if you are deleting a session.
Script 11.11 Destroying a session, as you would in a logout page, requires special syntax to delete the session cookie and the session data on the server, as well as to clear out the $_SESSION array.

3. Change the conditional so that it checks for the presence of a session variable.

    if (!isset($_SESSION['user_id'])) {

As with the logout.php script in the cookie examples, if the user is not currently logged in, they will be redirected.

4. Replace the setcookie() lines (that delete the cookies) with

    $_SESSION = array();
    session_destroy();
    setcookie ('PHPSESSID', '', time()-3600, '/', '', 0, 0);

The first line here will reset the entire $_SESSION variable as a new array, erasing its existing values. The second line removes the data from the server, and the third sends a cookie to replace the existing session cookie in the browser.

5. Remove the reference to $_COOKIE in the message.

    echo "<h1>Logged Out!</h1>
    <p>You are now logged out!</p>";

Unlike when using the cookie version of the logout.php script, you cannot refer to the user by their first name anymore, as all of that data has been deleted.

6. Save the file as logout.php, place it in your Web directory, and test it in your browser (Figure 11.19).

Figure 11.19 The logout page (now featuring sessions).

✔ Tips

■ Never set $_SESSION equal to NULL and never use unset($_SESSION). Either could cause problems on some servers.

■ In case it’s not absolutely clear what’s going on, there are three kinds of information with a session: the session identifier (which is stored in a cookie by default), the session data (which is stored in a text file on the server), and the $_SESSION array (which is how a script accesses the session data in the text file). Just deleting the cookie doesn’t remove the text file and vice versa. Clearing out the $_SESSION array would erase the data from the text file, but the file itself would still exist, as would the cookie. The three steps outlined in this logout script effectively remove all traces of the session.

Changing the Session Behavior

As part of PHP’s support for sessions, there are over 20 different configuration options you can set for how PHP handles sessions. For the full list, see the PHP manual, but I’ll highlight a few of the most important ones here. Note two rules about changing the session settings:

1. All changes must be made before calling session_start().

2. The same changes must be made on every page that uses sessions.
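One way to honor both rules is to keep the settings in a single shared include that every page loads before doing anything else. What follows is a minimal sketch under stated assumptions: the helper names start_app_session() and end_app_session() are hypothetical and do not appear in the book’s scripts, and 'YourSession' is just an example name. The second helper repeats the three logout steps from Script 11.11.

```php
<?php
// Hypothetical shared include; the function names are assumptions made
// for illustration and are not part of the original scripts.

function start_app_session()
{
    // Rule 1: change every setting BEFORE calling session_start().
    ini_set('session.use_only_cookies', '1'); // Require a session cookie.
    session_name('YourSession');              // Example session name.
    session_start();
}

function end_app_session()
{
    // The three steps used in the logout script:
    $_SESSION = array();                               // 1. Clear the array.
    session_destroy();                                 // 2. Remove the server data.
    setcookie(session_name(), '', time() - 3600, '/'); // 3. Expire the cookie.
}
```

Rule 2 is then a matter of having each page call start_app_session() instead of session_start() directly, so the same settings are applied everywhere.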
Most of the settings can be set within a PHP script using the ini_set() function (discussed in Chapter 7):

    ini_set (parameter, new_setting);

For example, to require the use of a session cookie (as mentioned, sessions can work without cookies but it’s less secure), use

    ini_set ('session.use_only_cookies', 1);

Another change you can make is to the name of the session (perhaps to use a more user-friendly one). To do so, use the session_name() function.

    session_name('YourSession');

The benefits of creating your own session name are twofold: it’s marginally more secure and it may be better received by the end user (since the session name is the cookie name the end user will see). The session_name() function can also be used when deleting the session cookie:

    setcookie (session_name(), '', time()-3600);

Finally, there’s also the session_set_cookie_params() function. It’s used to tweak the settings of the session cookie.

    session_set_cookie_params(expire, path, host, secure, httponly);

Note that the expiration time of the cookie refers only to the longevity of the cookie in the Web browser, not to how long the session data will be stored on the server.

Improving Session Security

Because important information is normally stored in a session (you should never store sensitive data in a cookie), security becomes more of an issue. With sessions there are two things to pay attention to: the session ID, which is a reference point to the session data, and the session data itself, stored on the server. A malicious person is far more likely to hack into a session through the session ID than the data on the server, so I’ll focus on that side of things here (in the tips at the end of this section I mention two ways to protect the session data).

The session ID is the key to the session data. By default, PHP will store this in a cookie, which is preferable from a security standpoint.
It is possible in PHP to use sessions without cookies, but that leaves the application vulnerable to session hijacking: If I can learn another user’s session ID, I can easily trick a server into thinking that their session ID is my session ID. At that point I have effectively taken over the original user’s entire session and would have access to their data. So storing the session ID in a cookie makes it somewhat harder to steal.

One method of preventing hijacking is to store some sort of user identifier in the session, and then to repeatedly double-check this value. The HTTP_USER_AGENT (a combination of the browser and operating system being used) is a likely candidate for this purpose. This adds a layer of security in that one person could only hijack another user’s session if they are both running the exact same browser and operating system. As a demonstration of this, let’s modify the examples one last time.

Script 11.12 This final version of the login.php script also stores a hashed form of the user’s HTTP_USER_AGENT (the browser and operating system of the client) in a session.

To use sessions more securely:

1. Open login.php (refer to Script 11.8) in your text editor or IDE.

2. After assigning the other session variables, also store the HTTP_USER_AGENT value (Script 11.12).

    $_SESSION['agent'] = md5($_SERVER['HTTP_USER_AGENT']);

The HTTP_USER_AGENT is part of the $_SERVER array (you may recall using it way back in Chapter 1, “Introduction to PHP”). It will have a value like Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322). Instead of storing this value in the session as is, it’ll be run through the md5() function for added security. That function returns a 32-character hexadecimal string (called a hash) based upon a value. Two different strings are extremely unlikely to produce the same md5() result by accident, although MD5 is not considered safe against deliberately crafted collisions.

3. Save the file and place it in your Web directory.

4.
Open loggedin.php (Script 11.9) in your text editor or IDE.

5. Change the !isset($_SESSION['user_id']) conditional to (Script 11.13)

    if (!isset($_SESSION['agent']) OR ($_SESSION['agent'] != md5($_SERVER['HTTP_USER_AGENT'])) ) {

This conditional checks two things. First, it sees if the $_SESSION['agent'] variable is not set (this part is just as it was before, although agent is being used instead of user_id). The second part of the conditional checks if the md5() version of $_SERVER['HTTP_USER_AGENT'] does not equal the value stored in $_SESSION['agent']. If either of these conditions is true, the user will be redirected.

Script 11.13 This loggedin.php script now confirms that the user accessing this page has the same HTTP_USER_AGENT as they did when they logged in.

6. Save this file, place it in your Web directory, and test it in your Web browser by logging in.

✔ Tips

■ For critical uses of sessions, require the use of cookies and transmit them over a secure connection, if at all possible. You can even set PHP to only use cookies by setting session.use_only_cookies to 1 (this is the default in PHP 6).

■ If you are using a server shared with other domains, changing the session.save_path from its default setting (which is accessible by all users) will be more secure.

■ The session data itself can be stored in a database rather than a text file. This is a more secure, but more programming-intensive, option. I teach how to do this in my book PHP 5 Advanced: Visual QuickPro Guide.

■ The user’s IP address (the network address from which the user is connecting) is not a good unique identifier, for two reasons. First, a user’s IP address can, and normally does, change frequently (ISPs dynamically assign them for short periods of time). Second, many users accessing a site from the same network (like a home network or an office) could all have the same IP address.

Preventing Session Fixation

Another specific kind of session attack is known as session fixation. This is where one malicious user specifies the session ID that another user should use. This session ID could be randomly generated or legitimately created. In either case, the real user will go into the site using the fixed session ID and do whatever. Then the malicious user can access that session because they know what the session ID is. You can help protect against these types of attacks by changing the session ID after a user logs in.
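As a rough illustration of that idea, the sketch below pairs PHP’s session_regenerate_id() function with the user-agent check from Script 11.13. The helper names agent_matches() and secure_login_session() are hypothetical, introduced only for this example, and the second helper assumes that session_start() has already been called.

```php
<?php
// Hypothetical helpers; the names are assumptions for illustration only.

// Returns true when the stored hash matches the current user agent,
// mirroring the conditional used in loggedin.php.
function agent_matches(array $session, array $server)
{
    if (!isset($session['agent']) || !isset($server['HTTP_USER_AGENT'])) {
        return false;
    }
    return $session['agent'] === md5($server['HTTP_USER_AGENT']);
}

// Call right after a successful login, once session_start() has run.
function secure_login_session($user_id)
{
    // Issue a new session ID; passing true also deletes the old session data.
    session_regenerate_id(true);

    $agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
    $_SESSION['user_id'] = $user_id;
    $_SESSION['agent'] = md5($agent);
}
```

A protected page could then redirect whenever agent_matches($_SESSION, $_SERVER) returns false.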
The session_regenerate_id() function does just that, providing a new session ID to refer to the current session data. You can use this function on sites for which security is paramount (like e-commerce or online banking) or in situations when it’d be particularly bad if certain users (e.g., administrators) had their sessions manipulated.

12 Security Methods

The security of your Web applications is such an important topic that it really cannot be overstressed. Although security-related issues have been mentioned throughout this book, this chapter will help to fill in certain gaps and finalize other points.

The most important concept to understand about security is that it’s not a binary state: don’t think of a Web site or script as being either secure or not secure. Security isn’t a switch that you turn on and off; it’s a scale that you can move up and down. When you program, think about what you can do to make your site more secure and what you’ve done that makes it less secure. Also, keep in mind that improved security normally comes at a cost of convenience (both to you, the programmer, and to the end user) and performance. Increased security normally means more code, more checks, and more required of the server. When developing Web applications, think about these considerations and make the right decisions, for the particular situation, from the outset.

The topics discussed here include: preventing spam; using typecasting; preventing cross-site scripting (XSS) and SQL injection attacks; and database security. This chapter will use several discrete examples to best demonstrate these concepts. Some other common security issues and best practices will be mentioned in sidebars as well.

Preventing Spam

Spam is nothing short of a plague, cluttering up the Internet and our inboxes. There are steps you can take to avoid receiving spam at your email accounts, but in this book the focus is on preventing spam being sent through your PHP scripts.
Chapter 10, “Web Application Development,” shows how easy it is to send email using PHP’s mail() function. The example there, a contact form, took some information from the user (Figure 12.1) and sent it to an email address. Although it may seem like there’s no harm in this system, there’s actually a big security hole. But first, some background on what an email actually is.

Figure 12.1 A simple, standard HTML contact form.

Regardless of how an email is sent, how it’s formatted, and what it looks like when it’s received, an email contains two parts: a header and a body. The header includes such information as the to and from addresses, the subject, the date, and more (Figure 12.2). Each item in the header is on its own line, in the format Name: value. The body of the email is exactly what you think it is: the body of the email.

In looking at PHP’s mail() function

    mail (to, subject, body, [headers]);

you can see that one of the arguments goes straight to the email’s body and the rest appear in its header. To send spam to your address (as in Chapter 10’s example), all a person would have to do is enter the spam message into the comments section of the form (Figure 12.1). That’s bad enough, but to send spam to anyone else at the same time, all the user would have to do is add Bcc: poorsap@example.org, set off by some sort of line terminator (like a newline or carriage return), to the email’s header. With the example as is, this just means entering into the from value of the contact form me@example.com\nBcc:poorsap@example.org.

You might think that safeguarding everything that goes into an email’s header would be sufficiently safe, but as an email is just one document, bad input in a body can impact the header.

There are a couple of preventive techniques. First, validate any email addresses using regular expressions. Chapter 13, “Perl-Compatible Regular Expressions,” covers this subject.
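To make the header problem concrete, here is a small sketch that builds the From header the way the contact form does, but without calling mail(). The build_from_header() function is a hypothetical stand-in used only for this illustration; it is not part of the Chapter 10 script.

```php
<?php
// Illustration only: no email is sent. build_from_header() is a
// hypothetical stand-in for how the contact form assembles its header.

function build_from_header($email)
{
    return "From: $email";
}

// What an attacker might submit as the email address. The "\n" inside
// double quotes becomes a real newline character.
$malicious = "me@example.com\nBcc:poorsap@example.org";

$headers = build_from_header($malicious);

// Every line in the header block is read as a separate header.
$lines = explode("\n", $headers);

echo count($lines), " header lines:\n";
foreach ($lines as $line) {
    echo "  $line\n";
}
```

A single form value has turned into two header lines, the second of which is a Bcc header that the script never intended to send.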
Figure 12.2 The raw source version of the email sent by the contact form (Figure 12.1).

Second, now that you know what an evildoer must enter to send spam (Table 12.1), watch for those characters in form values. If a value contains anything from that list, don’t use that value.

In this next example, a modification of the email script from Chapter 10, I’ll define a function that scrubs all the potentially dangerous characters from data. Two new PHP functions will be used as well: str_replace() and array_map(). Both will be explained in detail in the steps that follow.

To prevent spam:

1. Open email.php (Script 10.1) in your text editor or IDE.

To complete this spam-purification, the email script needs to be modified.

2. After checking for the form submission, begin defining a function (Script 12.1).

    function spam_scrubber($value) {

This function will take one argument: a string.

3. Create a list of really bad things that wouldn’t be in a legitimate contact form submission.

    $very_bad = array('to:', 'cc:', 'bcc:', 'content-type:', 'mime-version:', 'multipart-mixed:', 'content-transfer-encoding:');

None of these strings should be present in an honest contact form submission (it’s possible someone might legitimately use to: in their comments, but unlikely). If any of these strings are present, then the submission is treated as a spam attempt. To make it easier to test for all these, they’re placed in an array, which will be looped through (Step 4).

Table 12.1 Spam Tip-offs

Characters
content-type:
mime-version:
multipart-mixed:
content-transfer-encoding:
bcc:
cc:
to:
\r
\n
%0a
%0d

The presence of any of these character strings in a form submission is a likely indicator that someone is trying to send spam through your site. The last four are all different ways of creating newlines.
4. Loop through the array. If a very bad thing is found, return an empty string.

    foreach ($very_bad as $v) {
        if (stripos($value, $v) !== false) return '';
    }

The foreach loop will access each item in $very_bad one at a time. Within the loop, the stripos() function will check if the item is in the string provided to this function as $value. The stripos() function performs a case-insensitive search (so it would match bcc:, Bcc:, bCC:, etc.). The first time that any of these items is found in the submitted value, the function will return an empty string and terminate (functions automatically stop executing once they hit a return).

5. Replace any newline characters with spaces.

    $value = str_replace(array("\r", "\n", "%0a", "%0d"), ' ', $value);

Newline characters, which are represented by \r, \n, %0a, and %0d, may or may not be problematic. A newline character is required to send spam (or else you can’t create the proper header) but will also appear if a user just hits Enter while typing in a textarea box. For this reason, any found newlines will just be replaced by a space. This means that the submitted value could lose some of its formatting, but that’s a reasonable price to pay to stop spam.

The str_replace() function looks through the value in the third argument and replaces any occurrences of the characters in the first argument with the character or characters in the second. Or as the PHP manual puts it:

    mixed str_replace (mixed $search, mixed $replace, mixed $subject)

This function is very flexible in that it can take strings or arrays for its three arguments (the mixed means it accepts a mix of argument types). So this line of code in the script assigns to the $value variable its original value, with any newline characters replaced by a single space.

There is a case-insensitive version of this function, but it’s not necessary, as, for example, \r is a carriage return but \R is not.

6.
Return the value and complete the function.

    return trim($value);
    } // End of spam_scrubber() function.

Finally, this function returns the value, trimmed of any leading and ending spaces. Keep in mind that the function will only get to this point if none of the very bad things was found.

7. After the function definition, invoke the spam_scrubber() function.

    $scrubbed = array_map('spam_scrubber', $_POST);

I’ve demonstrated this technique in the book’s supporting forum (www.DMCInsights.com/phorum/), and I think the simplicity of this line confuses many people. The array_map() function has two required arguments. The first is the name of the function to call. In this case, that’s spam_scrubber (without the parentheses, because you’re providing the function’s name, not calling the function). The second argument is an array.

What array_map() does is call the named function, once for each array element, sending each array element’s value to that function. In this script, $_POST has five elements: name, email, comments, submit, and submitted. After this line of code, the $scrubbed array will end up with five elements: $scrubbed['name'] will have the value of $_POST['name'] after running it through spam_scrubber(); $scrubbed['email'] will have the value of $_POST['email'] after running it through spam_scrubber(); and so forth.

This one line of code then takes an entire array of potentially tainted data ($_POST), cleans it using spam_scrubber(), and assigns the result to a new variable. Here’s the most important thing: from here on out, the script will use the $scrubbed array, which is clean, not $_POST, which is still potentially dirty.

8. Change the form validation to use this new array.

    if (!empty($scrubbed['name']) && !empty($scrubbed['email']) && !empty($scrubbed['comments']) ) {

Each of these elements could have an empty value for two reasons. First, if the user left them empty.
Second, if the user entered one of the bad strings in the field, which would be turned into an empty string by the spam_scrubber() function.

9. Change the creation of the $body variable so that it uses the clean values.

    $body = "Name: {$scrubbed['name']}\n\nComments: {$scrubbed['comments']}";

10. Change the invocation of the mail() function to use the clean email address.

    mail('your_email@example.com', 'Contact Form Submission', $body, "From: {$scrubbed['email']}");

11. Save the script as email.php, place it in your Web directory, and test it in your Web browser (Figures 12.3, 12.4, 12.5, and 12.6).

Figure 12.3 The presence of cc: in the email address field will prevent this submission from being sent in an email (see Figure 12.4).

Figure 12.4 The email was not sent because of the very bad characters used in the email address.

✔ Tips

■ Using the array_map() function as I have in this example is convenient but not without its downsides. First, it blindly applies the spam_scrubber() function to the entire $_POST array, even to the submit button and hidden form input. This isn’t harmful but is unnecessary. Second, any multidimensional arrays within $_POST will be lost. In this specific example, that’s not a problem but it is something to be aware of.

■ To prevent automated submissions to any form, you could use a CAPTCHA test. These are prompts that can only be understood by humans (in theory). While this is commonly accomplished using an image of random characters, the same thing can be achieved using a question like What is two plus two? or On what continent is China? Checking for the correct answer to this question would then be part of the validation routine.

■ If you wanted, you could change the sticky form so that it refers to the $scrubbed values, not the original $_POST ones.
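Pulling Steps 2 through 7 together, the function and its invocation can be sketched in one piece as follows. The function body follows the steps above; the $post array is an assumption standing in for $_POST so that the sketch can be run outside of a Web request.

```php
<?php
// Consolidated sketch of the spam_scrubber() steps described above.

function spam_scrubber($value)
{
    // Strings that should not appear in an honest contact form submission:
    $very_bad = array('to:', 'cc:', 'bcc:', 'content-type:', 'mime-version:',
        'multipart-mixed:', 'content-transfer-encoding:');

    // If any of them is present (case-insensitive), reject the whole value:
    foreach ($very_bad as $v) {
        if (stripos($value, $v) !== false) {
            return '';
        }
    }

    // Replace newline characters with spaces:
    $value = str_replace(array("\r", "\n", "%0a", "%0d"), ' ', $value);

    // Return the value, trimmed of leading and trailing spaces:
    return trim($value);
}

// Stand-in for $_POST (an assumption, so the sketch runs on its own):
$post = array(
    'name'     => 'Jane Doe',
    'email'    => "me@example.com\nBcc:poorsap@example.org",
    'comments' => "First line\nSecond line",
);

// Scrub every element in one step:
$scrubbed = array_map('spam_scrubber', $post);
```

Here the name passes through unchanged, the email becomes an empty string (so the validation in Step 8 fails), and the newline in the comments is turned into a space. Note that str_replace() is case-sensitive, so an uppercase %0A would not be replaced; str_ireplace() is one way to cover that case if it matters for your form.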
Figure 12.5 Although the comments field contains newline characters (created by pressing Enter or Return), the email will still be sent (Figure 12.6).

Figure 12.6 The received email, with the newlines in the comments (Figure 12.5) turned into spaces.

More Security Recommendations

This chapter covers many specific techniques for improving your Web security. Here are a handful of other recommendations:

◆ Make it your job to study, follow, and abide by security recommendations. Don’t just rely upon the advice of one chapter, one book, or one author.

◆ Don’t use user-supplied names for uploaded files. You’ll see an alternative to doing that in Chapter 17, “Example—E-Commerce.”

◆ Watch how database references are used. For example, if a person’s user ID is their primary key from the database and this is stored in a cookie (as in Chapter 11, “Cookies and Sessions”), a malicious user just needs to change that cookie value to access another user’s account.

◆ Don’t show detailed error messages (this point was repeated in Chapter 7, “Error Handling and Debugging”).

◆ Use cryptography (this is discussed at the end of the chapter with respect to the database and in my book PHP 5 Advanced: Visual QuickPro Guide (Peachpit Press, 2007) with respect to the server).

◆ Don’t store credit card numbers, social security numbers, banking information, and the like. The only exception to this would be if you have deep enough pockets to pay for the best security and to cover the lawsuits that arise when this data is stolen from your site (which will inevitably happen).

◆ Use SSL, if appropriate. A secure connection is one of the best protections a server can offer a user.

◆ Reliably and consistently protect every page and directory that needs it. Never assume that people won’t find sensitive areas just because there’s no link to them. If access to a page or directory should be limited, make sure it is.
My final recommendation is to be aware of your own limitations. As the programmer, you probably approach a script thinking how it should be used. This is not the same as how it will be used, either accidentally or on purpose. Try to break your site to see what happens. Do bad things, do the wrong thing. Have other people try to break it, too (it’s normally easy to find such volunteers). When you code, if you assume that no one will ever use a page properly, it’ll be much more secure than if you assume people always will.

Validating Data by Type

For the most part, the form validation used in this book thus far has been rather minimal, often just checking if a variable has any value at all. In many situations, this really is the best you can do. For example, there’s no perfect test for what a valid street address is or what a user might enter into a comments field. Still, much of the data you’ll work with can be validated in stricter ways. In the next chapter, the sophisticated concept of regular expressions will demonstrate just that. But here I’ll cover the more approachable ways you can validate some data by type.

PHP supports many types of data: strings, numbers (integers and floats), arrays, and so on. For each of these, there’s a specific function that checks if a variable is of that type (Table 12.2). You’ve probably already seen the is_numeric() function in action in earlier chapters, and is_array() is great for confirming a variable’s type before attempting to use it in a foreach loop.

In PHP, you can even change a variable’s type, after it’s been assigned a value.
Doing so is called typecasting and is accomplished by preceding a variable’s name by the type in parentheses:

    $var = 20.2;
    echo (int) $var; // 20

Depending upon the original and destination types, PHP will convert the variable’s value accordingly:

    $var = 20;
    echo (float) $var; // 20.0 (as a float; echo displays it as 20)

Table 12.2 Type Validation Functions

Function        Checks For
is_array()      Arrays
is_bool()       Booleans (TRUE, FALSE)
is_float()      Floating-point numbers
is_int()        Integers
is_null()       NULLs
is_numeric()    Numeric values, even as a string (e.g., '20')
is_resource()   Resources, like a database connection
is_scalar()     Scalar (single-valued) variables
is_string()     Strings

These functions return TRUE if the submitted variable is of a certain type and FALSE otherwise.

Two Validation Approaches

A large part of security is based upon validation: if data comes from outside of the script (from HTML forms, the URL, cookies, sessions, or even from a database), it can’t be trusted. There are two types of validation: whitelist and blacklist.

In the calculator example, we know that all values must be positive, that they must all be numbers, and that the quantity must be an integer (the other two numbers could be integers or floats, it makes no difference). Typecasting forces the inputs to be numbers, and a check confirms that they are positive. At this point, the assumption is that the input is valid. This is a whitelist approach: these values are good; anything else is bad.

The preventing spam example uses a blacklist approach. That script knows exactly which characters are bad and invalidates input that contains them. All other input is considered to be good.

Many security experts prefer the whitelist approach, but it can’t always be used. The example will dictate which approach will work best, but it’s important to use one or the other. Don’t just assume that data is safe without some sort of validation.
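The whitelist approach described for the calculator can be sketched as a small function, shown here alongside the casts from the text. The widget_total() name and its habit of returning NULL for invalid input are assumptions made for this illustration; they are not part of the calculator script itself.

```php
<?php
// Hypothetical helper; the name and the NULL-on-failure convention are
// assumptions for illustration only.

function widget_total(array $input)
{
    // Typecast first: anything non-numeric becomes 0 (or 0.0).
    $quantity = isset($input['quantity']) ? (int) $input['quantity'] : 0;
    $price    = isset($input['price']) ? (float) $input['price'] : 0.0;
    $tax      = isset($input['tax']) ? (float) $input['tax'] : 0.0;

    // Whitelist: only positive numbers are acceptable.
    if ($quantity > 0 && $price > 0 && $tax > 0) {
        return ($quantity * $price) * (($tax / 100) + 1);
    }

    return null; // Anything else is rejected.
}

// The casts discussed above, shown with var_dump() so the type is visible:
var_dump((int) 20.2);       // int(20)
var_dump((float) 20);       // float(20)
var_dump((int) 'trout');    // int(0)
var_dump(is_numeric('20')); // bool(true)
```

Because the cast turns a value like 'trout' into 0, the positive-number check is all that is needed to reject it.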
With numeric values, the conversion is straightforward, but with other variable types, more complex rules apply:

    $var = 'trout';
    echo (int) $var; // 0

In most circumstances you don’t need to cast a variable from one type to another, as PHP will often automatically do so as needed. But forcibly casting a variable’s type can be a good security measure in your Web applications. To show how you might use this notion, let’s create a calculator script for determining the total purchase price of an item, similar to that defined in earlier chapters.

To use typecasting:

1. Begin a new PHP document in your text editor or IDE (Script 12.2).

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
    <meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
    <title>Widget Cost Calculator</title>
    </head>
    <body>
    <?php # Script 12.2 - calculator.php

2. Check if the form has been submitted.

    if (isset($_POST['submitted'])) {

Like many previous examples, this one script will both display the HTML form and handle its submission. By checking for the presence of a specific $_POST element, you can know if the form has been submitted.

Script 12.2 By typecasting variables, this script more definitively validates that data is of the correct format.

3. Cast all the variables to a specific type.

    $quantity = (int) $_POST['quantity'];
    $price = (float) $_POST['price'];
    $tax = (float) $_POST['tax'];

The form itself has three text boxes (Figure 12.7), into which practically anything could be typed (the XHTML 1.0 used here has no number type of input). But the quantity must be an integer and both price and tax should be floats (they may contain decimal points). To force these issues, cast each one to a specific type.

4. Check if the variables have proper values, and then calculate and print the results.

    if ( ($quantity > 0) && ($price > 0) && ($tax > 0) ) {
        $total = ($quantity * $price) * (($tax/100) + 1);

    Please enter a valid quantity, price, and tax rate.

    '; 30 } 31 32 } // End of main isset() IF. 33 34 // Leave the PHP section and create the HTML form. 35 ?> 36

    Widget Cost Calculator

    37
    38

    Quantity:

    39

    Price:

    40

    Tax (%):

    41

    42 43
    44 45 Script 12.2 continued Figure 12.7 The HTML form takes three inputs: a quantity, a price, and a tax rate. continues on next page echo '

    The total cost of ➝ purchasing ' . $quantity . ' ➝ widget(s) at $' . number_ ➝ format ($price, 2) . ' each, ➝ plus tax, is $' . number_ ➝ format ($total, 2) . '.

    '; For this calculator to work, the three vari- ables must be specific types (see Step 3). More importantly, they must all be posi- tive numbers. This conditional checks for that prior to performing the calculations. Note that, per the rules of typecasting, if the posted values are not numbers, they will be cast to 0 and therefore not pass this conditional. The calculation itself is accomplished in a single line of code, using parentheses to ensure reliable results (thereby sparing you concern for precedence issues). The quantity is multiplied by the price. This is then multiplied by the tax divided by 100 (so 8% becomes .08) plus 1 (1.08). The number_format() function is used to print both the price and total values in the proper format. 5. Complete the conditionals. } else { echo '

    Please ➝ enter a valid quantity, ➝ price, and tax rate.

    '; } } // End of main isset() IF. A little CSS is used to create a bold, red error message, should there be a problem (Figure 12.8). 6. Begin the HTML form. ?>

    Widget Cost Calculator

    Quantity:

    The HTML form is really simple and posts back to this same page. The inputs will have a sticky quality, so the user can see what was previously entered. By referring to $quantity etc. instead of $_POST['quantity'] etc., the form will reflect the value for each input as it was typecast (see the tax value in Figure 12.8).

    Figure 12.8 An error message is printed in bold, red text if any of the three fields does not contain a positive number.

    7. Complete the HTML form.

    Price:

    Tax (%):

    8. Complete the HTML page.

    9. Save the file as calculator.php, place it in your Web directory, and test it in your Web browser (Figures 12.9 and 12.10).

    Figure 12.9 If invalid values are entered, such as floats for the quantity or strings for the tax…

    Figure 12.10 …they’ll be cast into more appropriate formats. The negative price will also keep this calculation from being made (although the casting won’t change that value).

    ✔ Tips

    ■ You should definitely use typecasting when working with numbers within SQL queries. Numbers aren’t quoted in queries, so if a string is somehow used in a number’s place, there will be an SQL syntax error. If you typecast such variables to an integer or float first, the query may not work (in terms of returning a record) but will still be syntactically valid. You’ll frequently see this in the book’s last three chapters.

    ■ As I implied, regular expressions are a more advanced method of data validation and are sometimes your best bet. But using type-based validation, when feasible, will certainly be faster (in terms of processor speed) and less prone to programmer error (did I mention that regular expressions are complex?).

    ■ To repeat myself, the rules of how values are converted from one data type to another are somewhat complicated. If you want to get into the details, see the PHP manual.

    ■ If you wanted to allow for no tax rate, then change that part of the validation conditional to … && ($tax >= 0) ) { ….

    Preventing XSS Attacks

    HTML is simply plain text, like <b>, which is given special meaning by Web browsers (as by making text bold). Because of this fact, your Web site’s user could easily put HTML in their form data, like in the comments field in the email example. What’s wrong with that, you might ask?

    Many dynamically driven Web applications take the information submitted by a user, store it in a database, and then redisplay that information on another page.
Think of a forum, as just one example. At the very least, if a user enters HTML code in their data, such code could throw off the layout and aesthetic of your site.

    Taking this a step further, JavaScript is also just plain text, but text that has special meaning (executable meaning) within a Web browser. If malicious code entered into a form were redisplayed in a Web browser, it could create pop-up windows (Figures 12.11 and 12.12), steal cookies, or redirect the browser to other sites. Such attacks are referred to as cross-site scripting (XSS). As in the email example, where you need to look for and nullify bad strings found in data, prevention of XSS attacks is accomplished by addressing any potentially dangerous PHP, HTML, or JavaScript.

    PHP includes a handful of functions for handling HTML and other code found within strings. These include:

    ◆ htmlspecialchars(), which turns &, ', ", <, and > into an HTML entity format (&amp;, &quot;, etc.)

    ◆ htmlentities(), which turns all applicable characters into their HTML entity format

    ◆ strip_tags(), which removes all HTML and PHP tags

    These three functions are roughly listed in order from least disruptive to most. Which you’ll want to use depends upon the application at hand. To demonstrate how these functions work and differ, let’s just create a simple PHP page that takes some text (see Figure 12.11) and runs it through these functions, printing the results (Figure 12.13).

    Figure 12.11 The malicious and savvy user can enter HTML, CSS, and JavaScript into text inputs.

    Figure 12.12 The JavaScript entered into the comments field (see Figure 12.11) would create this alert window when the comments were displayed in the Web browser.

    Figure 12.13 Thanks to the htmlentities() and strip_tags() functions, malicious code entered into a form field (see Figure 12.11) can be rendered inert.

    To handle HTML:

    1. Create a new PHP document in your text editor or IDE (Script 12.3).
XSS Attacks Original

    {$_ ➝ POST['data']}

    "; To compare and contrast what was origi- nally received with the result after apply- ing the functions, the original value must first be printed. 3. Apply the htmlentities() function, printing the results. echo '

    After htmlentities() ➝

    ' . htmlentities($_POST ➝ ['data']). '

    ';

    To keep submitted information from messing up a page or hacking the Web browser, it’s run through the htmlentities() function. So, any HTML entity will be translated; for instance, < and > will become &lt; and &gt;, respectively.

    Script 12.3 Applying the htmlentities() and strip_tags() functions to submitted text can prevent XSS attacks.

    4. Apply the strip_tags() function, printing the results.

        echo '

    After strip_tags()

    ' . strip_tags($_POST['data']). ➝ '

    '; The strip_tags() function completely takes out any HTML, JavaScript, or PHP tags. It’s therefore the most foolproof function to use on submitted data. 5. Complete the PHP section. } ?> 6. Display the HTML form.

    Do your worst!

    The form (see Figure 12.11) has only one field for the user to complete: a textarea.

    7. Complete the page.

    8. Save the page as xss.php, place it in your Web directory, and test it in your Web browser.

    9. View the source code of the page to see the full effect of these functions (Figure 12.14).

    Figure 12.14 This snippet of the page’s HTML source (see Figure 12.13) shows the original, submitted value, the value after using htmlentities(), and the value after using strip_tags().

    ✔ Tips

    ■ Both htmlspecialchars() and htmlentities() take an optional parameter indicating how quotation marks should be handled. See the PHP manual for specifics.

    ■ The strip_tags() function takes an optional parameter indicating what tags should not be stripped.

        $var = strip_tags ($var, '<p><br>');

    ■ The strip_tags() function will remove even invalid HTML tags, which may cause problems. For example, strip_tags() will yank out all of the code it thinks is an HTML tag, even if it’s improperly formed, such as a tag that was never closed.

    Preventing SQL Injection Attacks

    Another type of attack that malicious users can attempt is the SQL injection attack. As the name implies, these are endeavors to insert bad code into a site’s SQL queries. One aim of such attacks is to create a syntactically invalid query, thereby revealing something about the script or database in the resulting error message (Figure 12.15). An even bigger aspiration is that the injection attack could alter, destroy, or expose the stored data.

    Fortunately SQL injection attacks are rather easy to prevent. Start by validating all data to be used in queries (and perform typecasting, whenever possible). Second, use a function like mysqli_real_escape_string(), which makes data safe to use in queries. This function was introduced in Chapter 8, “Using PHP and MySQL.” Third, don’t show detailed errors on live sites.

    An alternative to using mysqli_real_escape_string() is to use prepared statements. Prepared statements were added to MySQL in version 4.1, and PHP can use them as of version 5 (thanks to the Improved MySQL extension). When not using prepared statements, the entire query, including the SQL syntax and the specific values, is sent to MySQL as one long string. MySQL then parses and executes it. With a prepared query, the SQL syntax is sent to MySQL first, where it is parsed, making sure it’s syntactically valid. Then the specific values are sent separately; MySQL assembles the query using those values, then executes it. The benefits of prepared statements are important: greater security and potentially better performance. I’ll focus on the security aspect here, but see the sidebar for a discussion of performance.
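    To see why validation and typecasting matter, consider what happens when a submitted value is dropped straight into a query. The following is only a minimal sketch, not one of this book’s scripts: build_user_query() is a hypothetical helper invented for the illustration, and the users table and its columns are assumed. No database connection is needed, as the code just prints the SQL that would be sent.

```php
<?php
// Hypothetical helper, for illustration only: builds a "select one user"
// query from a raw submitted value, with or without typecasting it first.
function build_user_query($raw_id, $typecast = true) {
    $id = $typecast ? (int) $raw_id : $raw_id;
    return "SELECT first_name, last_name, email FROM users WHERE user_id=$id";
}

// What a malicious user might submit in place of a number:
$attack = '0 OR 1=1';

// Without typecasting, the injected condition becomes part of the SQL:
echo build_user_query($attack, false) . "\n";

// With typecasting, only the leading integer (0) survives:
echo build_user_query($attack) . "\n";

// A completely non-numeric value is cast to 0, which is still valid SQL:
echo build_user_query('abc') . "\n";
```

    The first query would return every record in the table, because the injected OR 1=1 condition is always true. The typecast versions might not match any record, but they remain syntactically valid and can’t do any harm.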
Prepared statements can be created out of any INSERT, UPDATE, DELETE, or SELECT query. Begin by defining your query, marking placeholders using question marks. As an example, take the SELECT query from edit_user.php (Script 9.3):

        $q = "SELECT first_name, last_name, email FROM users WHERE user_id=$id";

    As a prepared statement, this query becomes

        $q = "SELECT first_name, last_name, email FROM users WHERE user_id=?";

    Next, prepare the statement in MySQL, assigning the results to a PHP variable.

        $stmt = mysqli_prepare($dbc, $q);

    At this point, MySQL will parse the query, but it won’t execute it.

    Figure 12.15 If a site reveals a detailed error message and doesn’t properly handle problematic characters in submitted values, hackers can learn a lot about your server.

    Next, you bind PHP variables to the query’s placeholders. In other words, you state that one variable should be used for one question mark, another variable for the other question mark, and so on. Continuing with the same example, you would code

        mysqli_stmt_bind_param($stmt, 'i', $id);

    The i part of the command indicates what kind of value should be expected, using the characters listed in Table 12.3. In this case, the query expects to receive one integer.

    As another example, here’s how the login query from Chapter 11, “Cookies and Sessions,” would be handled:

        $q = "SELECT user_id, first_name FROM users WHERE email=? AND pass=SHA1(?)";
        $stmt = mysqli_prepare($dbc, $q);
        mysqli_stmt_bind_param($stmt, 'ss', $e, $p);

    In this example, something interesting is also revealed: even though both the email address and password values are strings, they are not placed within quotes in the query. This is another difference between a prepared statement and a standard query.

    Once the statement has been bound, you can assign values to the PHP variables (if that hasn’t happened already) and then execute the statement.
Using the login example, that’d be:

        $e = 'email@example.com';
        $p = 'mypass';
        mysqli_stmt_execute($stmt);

    The values of $e and $p will be used when the prepared statement is executed.

    Table 12.3 Use these characters to tell the mysqli_stmt_bind_param() function what kinds of values to expect.

    Bound Value Types
    Letter    Represents
    d         Decimal
    i         Integer
    b         Blob (binary data)
    s         All other types

    Prepared Statement Performance

    Prepared statements will always be more secure than running queries in the old-fashioned way, but they may also be faster. If a PHP script sends the same query to MySQL multiple times, using different values each time, prepared statements can really speed things up. In such cases, the query itself is only sent to MySQL and parsed once. Then, the values are sent to MySQL separately.

    As a trivial example, the following code would run 100 queries in MySQL:

        $q = 'INSERT INTO counter (num) VALUES (?)';
        $stmt = mysqli_prepare($dbc, $q);
        mysqli_stmt_bind_param($stmt, 'i', $n);
        for ($n = 1; $n <= 100; $n++) {
            mysqli_stmt_execute($stmt);
        }

    Even though the query is being run 100 times, the full text is only being transferred to, and parsed by, MySQL once.

    MySQL versions 5.1.17 and later will include a caching mechanism that may also improve the performance of other uses of prepared statements.

    To see this process in action, let’s write a script that adds a message to the messages table in the forum database (created in Chapter 6, “Advanced SQL and MySQL”). I’ll also use the opportunity to demonstrate a couple of the other prepared statement-related functions.

    To use prepared statements:

    1. Create a new PHP script in your text editor or IDE (Script 12.4).

    For those variables, the subject and body values come straight from the form, after running them through strip_tags() to remove any potentially dangerous code. The forum ID and parent ID (which indicates if the message is a reply to an existing message or not) also come from the form. They’ll be typecast to integers (for added security, you would confirm that they’re positive numbers after typecasting them). The user ID value, in a real script, would come from the session, where it would be stored when the user logged in.

    5. Execute the query.

        mysqli_stmt_execute($stmt);

    Finally, the prepared statement is executed.

    6. Print the results of the execution and complete the conditional.

        if (mysqli_stmt_affected_rows($stmt) == 1) {
            echo '<p>Your message has been posted.</p>';
        } else {
            echo '<p>Your message could not be posted.</p>';
            echo '<p>' . mysqli_stmt_error($stmt) . '</p>';
        }

    The successful insertion of a record can be confirmed using the mysqli_stmt_affected_rows() function, which works as you expect it would (returning the number of affected rows). If a problem occurred, the mysqli_stmt_error() function returns the specific MySQL error message. This is for your debugging purposes, not to be used in a live site.

    7. Close the statement and the database connection.

        mysqli_stmt_close($stmt);
        mysqli_close($dbc);

    The first function closes the prepared statement, freeing up the resources. At this point, $stmt no longer has a value. The second function closes the database connection.

    8. Complete the PHP section.

        } // End of submission IF.
        ?>

    9. Create the form.
    Post a message: ➝

    Subject:

    Body:

    The form contains two fields the user would fill out and two hidden inputs that store values the query needs. In a real version of this script, it would determine the forum_id and parent_id values automatically.

    10. Complete the page.

    11. Save the file as post_message.php, place it in your Web directory, and test it in your Web browser (Figures 12.16, 12.17, and 12.18).

    Figure 12.16 The simple HTML form.

    Figure 12.17 If one record in the database was affected by the query, this will be the result.

    Figure 12.18 Selecting the most recent entry in the messages table confirms that the prepared statement (Script 12.4) worked. Notice that the HTML was stripped out of the post but the quotes are still present.

    ✔ Tip

    ■ There are two kinds of prepared statements. Here I have demonstrated bound parameters, where PHP variables are bound to a query. The other type is bound results, where the results of a query are bound to PHP variables.

    Database Encryption

    As a brief conclusion to this chapter, I’ll go over true encryption in a MySQL database. Up to this point, pseudo-encryption has been accomplished via the SHA1() function. In the registration and login examples, the user’s password has been stored after running it through SHA1(). Although using this function in this way is perfectly fine (and quite common), the function doesn’t provide real encryption: the SHA1() function returns a representation of a value. If you need to store data in a protected way while still being able to view the data in its original form at some later point, other MySQL functions are necessary.

    Encryption

    MySQL has several encryption and decryption functions built into the software. If you require data to be stored in an encrypted form that can be decrypted, you’ll want to use AES_ENCRYPT() and AES_DECRYPT(). These functions take two arguments: the string being encrypted or decrypted and a salt argument.
The salt argument is a string that helps to randomize the encryption (the MySQL manual calls this second argument the key string; this chapter refers to it as the salt). The only trick is that the exact same salt must be used for both encryption and decryption.

    To add a record to a table while encrypting the data, the query might look like

        INSERT INTO tablename (username, pass)
        VALUES ('troutster', AES_ENCRYPT('mypass', 'nacl'))

    The encrypted data returned by the AES_ENCRYPT() function will be in binary format. To store that data in a table, the column must be defined as one of the binary types (e.g., BLOB).

    To run a login query for the record just inserted (matching a submitted username and password against those in the database), you would write

        SELECT * FROM tablename
        WHERE username = 'troutster'
        AND AES_DECRYPT(pass, 'nacl') = 'mypass'

    The AES_ENCRYPT() function is considered to be the most secure encryption option (it’s available as of MySQL version 4.0.2). To demonstrate how you’d use it, let’s run some queries on the test database using a MySQL client.

    To encrypt and decrypt data:

    1. Access MySQL and select the test database (Figure 12.19).

        USE test;

    Figure 12.19 The following examples will all be run in the mysql client, on the test database.

    Follow the steps outlined in Chapter 4, “Introduction to MySQL,” to connect to the mysql client. Alternatively, you can use phpMyAdmin or another interface to run the queries in the following steps.

    2. Create a new encode table (Figure 12.20).

        CREATE TABLE encode (
        id INT UNSIGNED NOT NULL AUTO_INCREMENT,
        card_number TINYBLOB,
        PRIMARY KEY (id)
        );

    This table, encode, will contain fields for just an id and a (credit) card_number. The card_number will be encrypted using AES_ENCRYPT() so that it can be decoded. AES_ENCRYPT() returns a binary value that ought to be stored in a BLOB (or TINYBLOB here) column type.

    3. Insert a new record (Figure 12.21).
        INSERT INTO encode (id, card_number)
        VALUES (NULL, AES_ENCRYPT(1234567890123456, 'eLL10tT'));

    Here I am adding a new record to the table, using the AES_ENCRYPT() function with a salt of eLL10tT to encrypt the card number. Always try to use a unique salt with your encryption functions. Also remember that you cannot have spaces between your function names and their opening parentheses.

    Figure 12.20 The encode table, consisting of only two columns, is added to the database.

    Figure 12.21 A record is inserted, using an encryption function to protect the credit card number.

    4. Retrieve the record in an unencrypted form (Figure 12.22).

        SELECT id, AES_DECRYPT(card_number, 'eLL10tT') AS cc FROM encode;

    This query returns all of the records, decrypting the credit card number in the process. Any value stored using AES_ENCRYPT() can be retrieved (and matched) using AES_DECRYPT(), as long as the same salt is used (here, eLL10tT).

    5. Check out the table’s contents without using decryption (Figure 12.23).

        SELECT * FROM encode;

    As you can see in the figure, the encrypted version of the credit card number is unreadable. This is exactly the kind of security measure required by e-commerce applications.

    ✔ Tips

    ■ As a rule of thumb, use SHA1() for information that will never need to be viewable, such as passwords and perhaps usernames. Use AES_ENCRYPT() for information that needs to be protected but may need to be viewable at a later date, such as credit card information, Social Security numbers, addresses (perhaps), and so forth.

    ■ As a reminder, it’s much more secure to never store credit card numbers and other high-risk data.

    Secure salt storage

    While the preceding sequence of steps demonstrates how you can add a level of security to your Web applications by encrypting and decrypting sensitive data, there’s still room for improvement. The main issue is protecting the encryption salt, which is key to the encryption process.
In order for a PHP script to use a salt in its queries, PHP must have access to it. Most likely, the salt would be placed in the same script that establishes a database connection. But storing this value in a plain text format on the server makes it more vulnerable.

    As an alternative, you can store the salt in a database table. Then, when a query needs to use this value, it can be selected. This process can be simplified thanks to user-defined MySQL variables. I discuss this concept in more detail in my book MySQL: Visual QuickStart Guide, Second Edition (Peachpit Press, 2006), but I’ll provide a quick rundown of that process here.

    To just establish a user-defined variable, use this SQL command:

        SELECT @var:=value

    So, you could write

        SELECT @PI:=3.14

    To define a variable based upon a value stored in a table, the syntax is just an extension of this:

        SELECT @var:=some_column FROM tablename

    Once you’ve established @var, it can be used in other queries:

        SELECT * FROM tablename WHERE col=@var

    This next sequence of steps will demonstrate this approach in action, using the mysql client. Doing the same thing in a PHP script is described in the first tip.

    Figure 12.22 The record has been retrieved, decrypting the credit card number in the process.

    Figure 12.23 Encrypted data is stored in an unreadable format (here, as a binary string of data).

    To use a database-stored salt:

    1. Log in to the mysql client and select the test database, if you haven’t already.

    2. Empty the encode table (Figure 12.24).

        TRUNCATE TABLE encode;

    Because I’m going to be using a different salt, I’ll want to clear out all the existing data before repopulating it. The TRUNCATE command is the best way to do so.

    3. Create and populate an aes_salt table (Figure 12.25).
        CREATE TABLE aes_salt (
        salt VARCHAR(12) NOT NULL
        );
        INSERT INTO aes_salt (salt) VALUES ('0bfuscate');

    This table, aes_salt, will store the encryption salt value in its one column. The INSERT query stores the salt, which will be retrieved and assigned to a user-defined variable as needed.

    Figure 12.24 Run a TRUNCATE query to empty a table.

    Figure 12.25 The aes_salt table has one column and should only ever have one row of data. The INSERT query stores the salt value in this table.

    4. Retrieve the stored salt value and use it to insert a new record into the encode table (Figure 12.26).

        SELECT @salt:=salt FROM aes_salt;
        INSERT INTO encode (card_number)
        VALUES (AES_ENCRYPT(1234567890123456, @salt));

    The first line retrieves the stored salt value from the aes_salt table and assigns this to @salt (the figure shows the results of the SELECT statement). Then a standard INSERT query is run to add a record to the encode table. In this case, @salt is used in the query instead of a hard-coded salt value.

    Figure 12.26 These two queries show how you can retrieve a salt value using one query, assigning the value to a variable, then use that variable in a second query.

    5. Decrypt the stored credit card number (Figure 12.27).

        SELECT @salt:=salt FROM aes_salt;
        SELECT id, AES_DECRYPT(card_number, @salt) AS cc FROM encode;

    The first step retrieves the salt value so that it can be used for decryption purposes. (If you followed these steps without closing the MySQL session, this step wouldn’t actually be necessary, as @salt would already be established.) The @salt variable is then used with the AES_DECRYPT() function.

    ✔ Tips

    ■ The code in these steps (for retrieving and using a salt stored in a table) can easily be used in a PHP script.
Run the first query, then run the second query, and then fetch the results:

        $r = mysqli_query($dbc, 'SELECT @salt:=salt FROM aes_salt');
        $r = mysqli_query($dbc, 'SELECT id, AES_DECRYPT(card_number, @salt) AS cc FROM encode');
        $row = mysqli_fetch_array($r, MYSQLI_ASSOC);

    You can make this more professional by calling the mysqli_num_rows() function prior to running the second query or fetching the results, of course. But notice that you don’t have to fetch the results of the first query into the PHP script. The results of that query will be assigned to the @salt variable, residing in MySQL, associated with this connection.

    ■ User variables are particular to each connection. When one script or one mysql client session connects to MySQL and establishes a variable, only that one script or session has access to that variable.

    ■ Prior to version 5.0 of MySQL, user variable names are case-sensitive.

    ■ Never establish and use a user-defined variable within the same SQL statement.

    ■ Storing the salt in the database, as demonstrated in these steps, adds improved security over storing it in a PHP script. Even better security can be had by using unique and random salts for each stored record.

    Figure 12.27 A similar query (see Figure 12.22) is used to decrypt stored information using a database-stored salt.

    Preventing Brute Force Attacks

    A brute force attack is an attempt to log into a secure system by making lots of attempts in the hopes of eventual success. It’s not a sophisticated type of attack, hence the name “brute force.” For example, if you have a login process that requires a username and password, there is a limit as to the possible number of username/password combinations. That limit may be in the billions or trillions, but still, it’s a finite number. Using algorithms and automated processes, a brute force attack repeatedly tries combinations until one succeeds.
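    To put those numbers in perspective, the size of the search space can be worked out directly. This is just an illustrative sketch: password_combinations() is a made-up helper, and it assumes every password is exactly the given length and drawn from a fixed set of characters.

```php
<?php
// Hypothetical helper, for illustration only: the number of possible
// passwords of an exact length drawn from an alphabet of a given size.
function password_combinations($alphabet_size, $length) {
    return pow($alphabet_size, $length);
}

// Lowercase letters only (26 symbols), eight characters long:
echo number_format(password_combinations(26, 8)) . "\n";

// Upper- and lowercase letters plus digits (62 symbols), eight characters long:
echo number_format(password_combinations(62, 8)) . "\n";
```

    Eight lowercase letters allow for roughly 209 billion passwords; adding uppercase letters and digits raises that to roughly 218 trillion. Both are finite, but the second would take an automated attack about a thousand times longer to exhaust.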
The best way to prevent brute force attacks from succeeding is requiring users to register with good, hard-to-guess passwords: containing letters, numbers, and punctuation; both upper and lowercase; words not in the dictionary; at least eight characters long, etc. Also, don’t give indications as to why a login failed: saying that a username and password combination isn’t correct gives away nothing, but saying that a username isn’t right or that the password isn’t right for that username says too much.

    To stop a brute force attack in its tracks, you could also limit the number of incorrect login attempts by a given IP address. IP addresses do change frequently, but in a brute force attack, the same IP address would be trying to log in multiple times in a matter of minutes. You would have to track incorrect logins by IP address, and then, after X number of invalid attempts, block that IP address for 24 hours (or something). Or, if you didn’t want to go that far, you could use an “incremental delay” defense: each incorrect login from the same IP address creates an added delay in the response (use PHP’s sleep() function to create the delay). Humans might not notice or be bothered by such delays, but automated attacks most certainly would.

    13 Perl-Compatible Regular Expressions

    Regular expressions are an amazingly powerful (but tedious) tool available in most of today’s programming languages and even in many applications. Think of regular expressions as an elaborate system of matching patterns. You first write the pattern and then use one of PHP’s built-in functions to apply the pattern to a value (regular expressions are applied to strings, even if that means a string with a numeric value). Whereas a string function could see if the name John is in some text, a regular expression could just as easily find John, Jon, and Jonathon.

    PHP supports several types of regular expressions, the two most popular being POSIX Extended and Perl-Compatible (PCRE). In previous editions of this book (and in other books), I exclusively used the POSIX version. POSIX regular expressions are somewhat less powerful and potentially slower than PCRE but are far easier to learn. But PCRE is becoming the preferred type to use in PHP, so I’ll provide an introduction to it here instead.

    Because the regular expression syntax is so complex, while the functions that use it are simple, the focus in this chapter will be on mastering the syntax in little bites. The PHP code will be very simple; later chapters will better incorporate regular expressions into real-world scripts.

    Creating a Test Script

    As already stated, regular expressions are a matter of applying patterns to values. The application of the pattern to a value is accomplished using one of a handful of functions, the most important being preg_match(). This function returns a 0 or 1, indicating whether or not the pattern matched the string. Its basic syntax is

        preg_match(pattern, subject);

    The preg_match() function will stop once it finds a single match. If you need to find all the matches, use preg_match_all(). That function will be discussed toward the end of the chapter.

    When providing the pattern to preg_match(), it needs to be placed within quotation marks, as it’ll be a string. Because many escaped characters within double quotation marks have special meaning (like \n), I advocate using single quotation marks to define your patterns.

    Secondarily, within the quotation marks, the pattern needs to be encased within delimiters. The delimiter can be any character that’s not alphanumeric or the backslash, and the same character must be used to mark the beginning and end of the pattern. Commonly you’ll see forward slashes used.
So, to see if the word cat contains the letter a, you would code:

        if (preg_match('/a/', 'cat')) { …

    If you need to match a forward slash in the pattern, use a different delimiter, like the pipe (|) or an exclamation mark (!).

    The bulk of this chapter covers all the rules for defining patterns. In order to best learn by example, let’s start by creating a simple PHP script that takes a pattern and a string (Figure 13.1) and returns the regular expression result (Figure 13.2).

    Figure 13.1 The HTML form, which will be used for practicing regular expressions.

    Figure 13.2 The script will print what values were used in the regular expression and what the result was. The form will also be made sticky to remember previously submitted values.

    To match a pattern:

    1. Create a new PHP document in your text editor or IDE (Script 13.1).

    Script 13.1 The complex regular expression syntax will be best taught and demonstrated using this PHP script.

    2. Check for the form submission.

        if (isset($_POST['submitted'])) {

    3. Treat the incoming values.

        $pattern = trim($_POST['pattern']);
        $subject = trim($_POST['subject']);

    The form will submit two values to this same script. Both should be trimmed, just to make sure the presence of any extraneous spaces doesn’t skew the results. I’ve omitted a check that each input isn’t empty, but you could include that if you wanted.

    4. Print a caption.

        echo "<p>The result of checking<br />$pattern<br />against<br />$subject<br />is ";

    As you can see in Figure 13.2, the form handling part of this script will start by printing the values used.

    5. Run the regular expression.

        if (preg_match ($pattern, $subject) ) {
            print 'TRUE!</p>';
        } else {
            print 'FALSE!</p>';
        }

    To test the pattern against the string, feed both to the preg_match() function. If this function returns 1, that means a match was made, this condition will be true, and the word TRUE will be printed. If no match was made, the condition will be false and that will be stated (Figure 13.3).

    Figure 13.3 If the pattern does not match the string, this will be the result. This image also shows that regular expressions are case-sensitive by default.

    ✔ Tips

    ■ Some text editors, such as BBEdit and emacs, allow you to use regular expressions to match and replace patterns within and throughout several documents.

    ■ Another difference between POSIX and PCRE regular expressions is that the latter can be used on binary data while the former cannot.

    ■ The PCRE functions all use the established locale. A locale, discussed more in Chapter 14, “Making Universal Sites,” reflects a computer’s designated country and language, among other settings.

    6. Complete the PHP code and create the HTML form.

        ?>

    Regular Expression Pattern: ➝ ➝ (include the delimiters)

    Test Subject: " size="30" ➝ />

    The form contains two text boxes, both of which are sticky (using the trimmed version of the values). 7. Complete the HTML page. 8. Save the file as pcre.php, place it in your Web directory, and test it in your Web browser (Figures 13.1, 13.2, and 13.3). Although you don’t know the rules for creating patterns yet, you could use the literal a test (see Figures 13.1 and 13.2) or check any other literal value. Remember to use delimiters around the pattern or else you’ll see an error message (Figure 13.4). 393 Perl-Compatible Regular Expressions Creating a Test Script Figure 13.4 If you fail to wrap the pattern in matching delimiters, you’ll see an error message. Defining Simple Patterns Using one of PHP’s regular expression func- tions is really easy, defining patterns to use is hard. There are lots of rules for creating a pattern. You can use these rules separately or in combination, making your pattern either quite simple or very complex. To start, then, you’ll see what characters are used to define a simple pattern. As a formatting rule, I’ll define patterns in bold and will indicate what the pattern matches in italics. The pat- terns in these explanations won’t be placed within delimiters or quotes (both being needed when used within preg_match()), just to keep things cleaner. The first type of character you will use for defining patterns is a literal. A literal is a value that is written exactly as it is inter- preted. For example, the pattern a will match the letter a, ab will match ab, and so forth. Therefore, assuming a case-insensitive search is performed, rom will match any of the fol- lowing strings, since they all contain rom: ◆ CD-ROM ◆ Rommel crossed the desert. ◆ I’m writing a roman à clef. Along with literals, your patterns will use meta-characters. These are special symbols that have a meaning beyond their literal value (Table 13.1). While a simply means a, the period (.) will match any single character except for a newline (. 
matches a, b, c, the underscore, a space, etc., just not \n). To match any meta-character, you will need to escape it, much as you escape a quotation mark to print it. Hence \. will match the period itself. So 1.99 matches 1.99 or 1B99 or 1299 (a 1 followed by any character followed by 99) but 1\.99 only matches 1.99. 394 Chapter 13 Defining Simple Patterns Character Meaning \ Escape character ^ Indicates the beginning of a string $ Indicates the end of a string . Any single character except newline | Alternatives (or) [ Start of a class ] End of a class ( Start of a subpattern ) End of a subpattern { Start of a quantifier } End of a quantifier Meta-Characters Table 13.1 The meta-characters have unique meanings inside of regular expressions. Two meta-characters specify where certain characters must be found. There is the caret (^), which will match a string that begins with whatever follows the caret. There is also the dollar sign ($), which marks the conclusion of a pattern. Accordingly, ^a will match any string beginning with an a, while a$ will correspond to any string ending with an a. Therefore, ^a$ will only match a (a string that both begins and ends with a). These two meta-characters—the caret and the dollar sign—are crucial to validation, as validation normally requires checking the value of an entire string, not just the presence of one string in another. For example, using an email matching pattern without those two characters will match any string containing an email address. Using an email matching pattern that begins with a caret and ends with a dollar sign will match a string that contains only a valid email address. Regular expressions also make use of the pipe (|) as the equivalent of or. Therefore, a|b will match strings containing either a or b. (Using the pipe within patterns is called alternation or branching). 
So yes|no accepts either of those two words in their entirety (the alternation is not just between the two letters surrounding it: s and n).

Once you comprehend the basic symbols, then you can begin to use parentheses to group characters into more involved patterns. Grouping works as you might expect: (abc) will match abc, (trout) will match trout. Think of parentheses as being used to establish a new literal of a larger size. Because of precedence rules in PCRE, yes|no and (yes)|(no) are equivalent. But (even|heavy) handed will match either even handed or heavy handed.

To use simple patterns:

1. Load pcre.php in your Web browser, if it is not already.

2. Check if a string contains the letters cat (Figure 13.5).
   To do so, use the literal cat as the pattern and any number of strings as the subject. Any of the following would be a match: catalog, catastrophe, my cat left, etc. For the time being, use all lowercase letters, as cat will not match Cat (Figure 13.6).
   Remember to use delimiters around the pattern, as well (see the figures).

   Figure 13.5 Looking for a cat in a string.

   Figure 13.6 Don’t forget that PCRE performs a case-sensitive comparison by default.

3. Check if a string starts with cat (Figure 13.7).
   To have a pattern apply to the start of a string, use the caret as the first character (^cat). The sentence my cat left will not be a match now.

   Figure 13.7 The caret in a pattern means that the match has to be found at the start of the string.

4. Check if a string contains the word color or colour (Figure 13.8).
   The pattern to look for the American or British spelling of this word is col(o|ou)r. The first three letters (col) must be present. This needs to be followed by either an o or ou. Finally, an r is required.

   Figure 13.8 By using the pipe meta-character, the performed search can be more flexible.

✔ Tips
■ If you are looking to match an exact string within another string, use the strstr() function, which is faster than regular expressions. In fact, as a rule of thumb, you should use regular expressions only if the task at hand cannot be accomplished using any other function or technique.
■ You can escape a bunch of characters in a pattern using \Q and \E. Every character within those will be treated literally (so \Q$2.99?\E matches $2.99?).
■ To match a single backslash, you have to use \\\\. The reason is that matching a backslash in a regular expression requires you to escape the backslash, resulting in \\. Then to use a backslash in a PHP string, it also has to be escaped, so escaping both backslashes means a total of four.

Using Quantifiers

You’ve just seen and practiced with a couple of the meta-characters, the most important of which are the caret and the dollar sign. Next, there are three meta-characters that allow for multiple occurrences: a* will match zero or more a’s (no a’s, a, aa, aaa, etc.); a+ matches one or more a’s (a, aa, aaa, etc., but there must be at least one); and a? will match up to one a (a or no a’s match). These meta-characters all act as quantifiers in your patterns, as do the curly braces. Table 13.2 lists all of the quantifiers.

To match a certain quantity of a thing, put the quantity between curly braces ({}), stating a specific number, just a minimum, or both a minimum and a maximum. Thus, a{3} will match aaa; a{3,} will match aaa, aaaa, etc. (three or more a’s); and a{3,5} will match just aaa, aaaa, and aaaaa (between three and five).

Note that a quantifier applies to the thing that came right before it, so a? matches zero or one a’s, ab? matches an a followed by zero or one b’s, but (ab)? matches zero or one ab’s. Therefore, to match color or colour (see Figure 13.8), you could also use colou?r as the pattern.
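To see the quantifiers at work outside of the test form, here is a quick sketch. The matches() function is my own shorthand for this example (it is not a PHP built-in), and I’ve wrapped the patterns in a caret and a dollar sign so that the whole string has to fit the quantity allowed.

```php
<?php
// matches() is a hypothetical helper for this sketch, not a PHP built-in.
// It wraps preg_match() and reports the result as a Boolean.
function matches($pattern, $subject) {
    return preg_match($pattern, $subject) === 1;
}

// ? allows zero or one of the preceding character:
$color   = matches('/^colou?r$/', 'color');   // true
$colour  = matches('/^colou?r$/', 'colour');  // true
$colouur = matches('/^colou?r$/', 'colouur'); // false (two u's)

// * allows zero or more, + requires at least one:
$starEmpty = matches('/^a*$/', ''); // true
$plusEmpty = matches('/^a+$/', ''); // false

// Curly braces set an exact count or a range:
$two  = matches('/^a{3,5}$/', 'aa');     // false (too few)
$four = matches('/^a{3,5}$/', 'aaaa');   // true
$six  = matches('/^a{3,5}$/', 'aaaaaa'); // false (too many)

// A quantifier applies only to what comes right before it:
$optionalB  = matches('/^ab?$/', 'a');   // true: only the b is optional
$optionalAb = matches('/^(ab)?$/', 'a'); // false: ab is optional as a unit
```

Without the caret and the dollar sign, most of these tests would come back true, since the pattern would only need to be found somewhere within the string.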
Table 13.2 The quantifiers allow you to dictate how many times something can or must appear.

    Quantifiers
    Character   Meaning
    ?           0 or 1
    *           0 or more
    +           1 or more
    {x}         Exactly x occurrences
    {x,y}       Between x and y (inclusive)
    {x,}        At least x occurrences

To use quantifiers:

1. Load pcre.php in your Web browser, if it is not already.

2. Check if a string contains the letters c and t, with one or more characters in between (Figure 13.9).
   To do so, use c.+t as the pattern and any number of strings as the subject. Remember that the period matches any character (except for the newline). Each of the following would be a match: cat, count, coefficient, etc. The word doctor would not match, as there are no letters between the c and the t (although doctor would match c.*t).

   Figure 13.9 The plus sign, when used as a quantifier, requires that one or more of a thing be present.

3. Check if a string matches either cat or cats (Figure 13.10).
   To start, if you want to make an exact match, use both the caret and the dollar sign. Then you’d have the literal text cat, followed by an s, followed by a question mark (representing 0 or 1 s’s). The final pattern, ^cats?$, matches cat or cats but not my cat left or I like cats.

   Figure 13.10 You can check for the plural form of many words by adding s? to the pattern.

4. Check if a string ends with .33, .333, or .3333 (Figure 13.11).
   To find a period, escape it with a backslash: \.. To find a three, use a literal 3. To find a range of 3’s, use the curly brackets ({}). Putting this together, the pattern is \.3{2,4}. Because the string should end with this (nothing else can follow), conclude the pattern with a dollar sign: \.3{2,4}$.
   Admittedly, this is kind of a stupid example (not sure when you’d need to do exactly this), but it does demonstrate several things. This pattern will match lots of things (12.333, varmit.3333, .33, look .33) but not 12.3 or 12.334.

   Figure 13.11 The curly braces let you dictate the acceptable range of quantities present.

5. Match a five-digit number (Figure 13.12).
   A digit can be any one of the numbers 0 through 9, so the heart of the pattern is (0|1|2|3|4|5|6|7|8|9). Plainly said, this means: a digit is a 0 or a 1 or a 2 or a 3…. To make it a five-digit number, follow this with a quantifier: (0|1|2|3|4|5|6|7|8|9){5}. Finally, to match this exactly (as opposed to matching a five-digit number within a string), use the caret and the dollar sign: ^(0|1|2|3|4|5|6|7|8|9){5}$.
   This, of course, is one way to match a United States zip code, a very useful pattern.

   Figure 13.12 The proper test for confirming that a number contains five digits.

✔ Tips
■ When using curly braces to specify a number of characters, you must always include the minimum number. The maximum is optional: a{3} and a{3,} are acceptable, but a{,3} is not.
■ Although it demonstrates good dedication to programming to learn how to write and execute your own regular expressions, numerous working examples are available already by searching the Internet.

Using Character Classes

As the last example demonstrated (Figure 13.12), relying solely upon literals in a pattern can be tiresome. Having to write out all those digits to match any number is silly. Imagine if you wanted to match any four-letter word: ^(a|b|c|d…){4}$ (and that doesn’t even take into account uppercase letters)! To make these common references easier, you can use character classes.

Classes are created by placing characters within square brackets ([]). For example, you can match any one vowel with [aeiou]. This is equivalent to (a|e|i|o|u). Or you can use the hyphen to indicate a range of characters: [a-z] is any single lowercase letter and [A-Z] is any uppercase, [A-Za-z] is any letter in general, and [0-9] matches any digit. As an example, [a-z]{3} would match abc, def, oiw, etc.
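As a rough sketch of these classes in code, the following uses a small helper of my own, first_match() (not a PHP built-in), to show which part of a string a class actually catches. It relies on the optional third argument to preg_match(), which is covered in the “Finding All Matches” section later in this chapter.

```php
<?php
// first_match() is a hypothetical helper for this sketch, not a PHP built-in.
// It returns the first piece of $subject that matches $pattern, or NULL.
function first_match($pattern, $subject) {
    if (preg_match($pattern, $subject, $match)) {
        return $match[0];
    }
    return null;
}

$vowel   = first_match('/[aeiou]/', 'crypt');       // NULL: no vowels to find
$three   = first_match('/[a-z]{3}/', 'AB-oiw-9');   // 'oiw'
$digit   = first_match('/[0-9]/', 'Route 66');      // '6'
$letters = first_match('/[A-Za-z]+/', '42 Elm St'); // 'Elm'
```

Note that each class stands for a single character, so it takes a quantifier (like the {3} or the plus sign here) to match a run of them.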
Within classes, most of the meta-characters are treated literally, except for four. The backslash is still the escape, but the caret (^) is a negation operator when used as the first character in the class. So [^aeiou] will match any non-vowel. The only other meta-character within a class is the dash, which indicates a range. (If the dash is used as the last character in a class, it’s a literal dash.) And, of course, the closing bracket (]) still has meaning as the terminator of the class.

Naturally a class can have both ranges and literal characters. A person’s first name, which can contain letters, spaces, apostrophes, and periods, could be represented by [A-Za-z '.] (again, the period doesn’t need to be escaped within the class, as it loses its meta-meaning there). Be sure to write the letters as two ranges: [A-z] would also allow the handful of punctuation characters that sit between Z and a.

Along with creating your own classes, there are six already-defined classes that have their own shortcuts (Table 13.3). The digit and space classes are easy to understand. The word character class doesn’t mean “word” in the language sense but rather as in a string unbroken by spaces or punctuation.

Table 13.3 These character classes are commonly used in regular expressions.

    Character Classes
    Class             Shortcut   Meaning
    [0-9]             \d         Any digit
    [ \f\r\t\n\v]     \s         Any white space
    [A-Za-z0-9_]      \w         Any word character
    [^0-9]            \D         Not a digit
    [^ \f\r\t\n\v]    \S         Not white space
    [^A-Za-z0-9_]     \W         Not a word character

Using this information, the five-digit number (aka, zip code) pattern could more easily be written as ^[0-9]{5}$ or ^\d{5}$. As another example, can\s?not will match both can not and cannot (the word can, followed by zero or one space characters, followed by not).

To use character classes:

1. Load pcre.php in your Web browser, if it is not already.

2. Check if a string is formatted as a valid United States zip code (Figure 13.13).
   A United States zip code always starts with five digits (^\d{5}). But a valid zip code could also have a dash followed by another four digits (-\d{4}$). To make this last part optional, use the question mark (the 0 or 1 quantifier). This complete pattern is then ^(\d{5})(-\d{4})?$. To make it all clearer, the first part of the pattern (matching the five digits) is also grouped in parentheses, although this isn’t required in this case.

   Figure 13.13 The pattern to match a United States zip code, in either the five-digit or five plus four format.

3. Check if a string contains no spaces (Figure 13.14).
   The \S character class shortcut will match non-space characters. To make sure that the entire string contains no spaces, add a quantifier and wrap the pattern in the caret and the dollar sign: ^\S+$. If you don’t use those, then all the pattern is confirming is that the subject contains at least one non-space character.

   Figure 13.14 The no-white-space shortcut can be used to ensure that a submitted string is contiguous.

4. Validate an email address (Figure 13.15).
   The pattern ^[\w.-]+@[\w.-]+\.[A-Za-z]{2,6}$ provides for reasonably good email validation. It’s wrapped in the caret and the dollar sign, so the string must be a valid email address and nothing more.
   An email address starts with letters, numbers, and the underscore (represented by \w), plus a period (.) and a dash. This first block will match larryullman, larry77, larry.ullman, larry-ullman, and so on. Next, all email addresses include one and only one @. After that, there can be any number of letters, numbers, periods, and dashes. This is the domain name: dmcinsights, smith-jones, amazon.co (as in amazon.co.uk), etc. Finally, all email addresses conclude with one period and between two and six letters. This accounts for .com, .edu, .info, .travel, etc.

   Figure 13.15 A pretty good and reliable validation for email addresses.
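Outside of the test form, these two patterns might be put to work along the following lines. This is only a sketch: is_valid() is a helper name I’ve made up for the example (not a PHP built-in), and the sample addresses are placeholders.

```php
<?php
// The patterns from the preceding steps, with delimiters, in single quotes:
$zipPattern   = '/^(\d{5})(-\d{4})?$/';
$emailPattern = '/^[\w.-]+@[\w.-]+\.[A-Za-z]{2,6}$/';

// is_valid() is a hypothetical helper for this sketch, not a PHP built-in.
// It trims the value and reports whether the whole thing fits the pattern.
function is_valid($pattern, $value) {
    return preg_match($pattern, trim($value)) === 1;
}

$zipFive  = is_valid($zipPattern, '94710');      // true
$zipNine  = is_valid($zipPattern, '94710-0001'); // true
$zipShort = is_valid($zipPattern, '9471');       // false: only four digits
$zipBad   = is_valid($zipPattern, '94710-01');   // false: extension too short

$emailGood  = is_valid($emailPattern, 'larry.ullman@example.com');   // true
$emailNoTld = is_valid($emailPattern, 'larry@example');              // false
$emailText  = is_valid($emailPattern, 'Write to larry@example.com'); // false
```

The last test fails only because of the caret and the dollar sign: the string contains a valid address but is not just an address.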
✔ Tips
■ I think that the zip code example is a great demonstration as to how complex and useful regular expressions are. One pattern accurately tests for both formats of the zip code, which is fantastic. But when you put this into your PHP code, with quotes and delimiters, it’s not easily understood:

    if (preg_match ('/^(\d{5})(-\d{4})?$/', $zip)) {…

  That certainly looks like gibberish, right?
■ This email address validation pattern is pretty good, although not perfect. It will allow some invalid addresses to pass through (like ones starting with a period or containing multiple periods together). However, a 100 percent foolproof validation pattern is ridiculously long, and frequently using regular expressions is really a matter of trying to exclude the bulk of invalid entries without inadvertently excluding any valid ones.
■ Regular expressions, particularly PCRE ones, can be extremely complex. When starting out, it’s just as likely that your use of them will break the validation routines instead of improving them. That’s why practicing like this is important.

Using Boundaries

Boundaries are shortcuts for helping to find, um, boundaries. In a way, you’ve already seen this: using the caret and the dollar sign to match the beginning or end of a value. But what if you wanted to match boundaries within a value?

The clearest boundary is between a word and a non-word. A “word” in this case is not cat, month, or zeitgeist, but in the \w shortcut sense: the letters A through Z (both upper- and lowercase), plus the numbers 0 through 9, and the underscore. To match the spot where a word begins or ends, there’s the \b shortcut. To match any spot that is not such a boundary (for instance, between two word characters), there’s \B. So the pattern \bfor\b matches they’ve come for you but doesn’t match force or forebode. By contrast, \bfor\B would match force but not they’ve come for you or informal.
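The boundary examples above can be checked with a few lines of PHP. This is just a sketch, and the variable names are mine.

```php
<?php
$whole  = '/\bfor\b/'; // for as a complete word
$prefix = '/\bfor\B/'; // for at the start of a longer word

$wholeInSentence  = preg_match($whole, "they've come for you");  // 1
$wholeInForce     = preg_match($whole, 'force');                 // 0
$prefixInForce    = preg_match($prefix, 'force');                // 1
$prefixInSentence = preg_match($prefix, "they've come for you"); // 0
$prefixInInformal = preg_match($prefix, 'informal');             // 0
```

The word informal fails the second pattern because its for is preceded by another word character, so there is no boundary in front of it.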
Finding All Matches

Going back to the PHP functions used with Perl-Compatible regular expressions, preg_match() has been used just to see if a pattern matches a value or not. But the script hasn’t been reporting what, exactly, in the value did match the pattern. You can find out this information by using a variable as a third argument to the function:

    preg_match(pattern, subject, $match)

The $match variable will contain the first match found (because this function only returns the first match in a value). To find every match, use preg_match_all(). Its syntax is the same:

    preg_match_all(pattern, subject, $matches)

This function will return the number of matches made (0 if none were found), or FALSE if an error occurred. It will also assign to $matches every match made. Let’s update the PHP script to print the returned matches, and then run a couple more tests.

Script 13.2 To reveal exactly what values in a string match which patterns, this revised version of the script will print out each match. You can retrieve the matches by naming a variable as the third argument in preg_match() or preg_match_all().

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
    <head>
        <meta http-equiv="content-type" content="text/html; charset=iso-8859-1" />
        <title>Testing PCRE</title>
    </head>
    <body>
    <?php // Script 13.2 - matches.php
    // This script takes a submitted string and checks it against a submitted pattern.
    // This version prints every match made.

    if (isset($_POST['submitted'])) {

        // Trim the strings:
        $pattern = trim($_POST['pattern']);
        $subject = trim($_POST['subject']);

        // Print a caption:
        echo "<p>The result of checking<br />$pattern<br />against<br />$subject<br />is ";

        // Test:
        if (preg_match_all ($pattern, $subject, $matches) ) {
            echo 'TRUE!</p>';

            // Print the matches:
            echo '<pre>' . print_r($matches, 1) . '</pre>';

        } else {
            echo 'FALSE!</p>';
        }

    } // End of submission IF.
    // Display the HTML form.
    ?>
    <form action="matches.php" method="post">
        <p>Regular Expression Pattern: <input type="text" name="pattern" value="<?php if (isset($pattern)) echo $pattern; ?>" size="30" /> (include the delimiters)</p>
        <p>Test Subject: <textarea name="subject" rows="5" cols="30"><?php if (isset($subject)) echo $subject; ?></textarea></p>
        <input type="submit" name="submit" value="Test!" />
        <input type="hidden" name="submitted" value="TRUE" />
    </form>
    </body>
    </html>

To report all matches:

1. Open pcre.php (Script 13.1) in your text editor or IDE.

2. Change the invocation of preg_match() to (Script 13.2)

    if (preg_match_all ($pattern, $subject, $matches) ) {

   There are two changes here. First, the actual function being called is different. Second, the third argument is provided a variable name that will be assigned every match.

3. After printing the value TRUE, print the contents of $matches.

    echo '<pre>' . print_r($matches, 1) . '</pre>';

   Using print_r() within PRE tags is the easiest way to know what’s in $matches. As you’ll see when you run this script, this variable will be an array whose first element is an array of matches made.

4. Change the form’s action attribute to matches.php.

    <form action="matches.php" method="post">

   This script will be renamed, so the action attribute must be changed, too.

5. Change the subject input to be a textarea.

    <p>Test Subject: <textarea name="subject" rows="5" cols="30"><?php if (isset($subject)) echo $subject; ?></textarea></p>

   In order to be able to enter in more text for the subject, this element will become a textarea.

6. Save the file as matches.php, place it in your Web directory, and test it in your Web browser (Figures 13.16, 13.17, 13.18, and 13.19).
   For the first test, use for as the pattern and This is a formulaic test for informal matches. as the subject (Figure 13.16). It may not be proper English, but it’s a good test subject.
   For the second test, change the pattern to for.* (Figure 13.17). The result may surprise you, the cause of which is discussed in the sidebar, “Being Less Greedy.” To make this search less greedy, the pattern could be changed to for.*?, whose results would be the same as those in Figure 13.16.
   For the third test, use for[\S]*, or, more simply for\S* (Figure 13.18). This has the effect of making the match stop as soon as a white space character is found (because the pattern wants to match for followed by any number of non-white-space characters).
   For the final test, use \b[a-z]*for[a-z]*\b as the pattern (Figure 13.19). This pattern makes use of boundaries, discussed in the sidebar “Using Boundaries,” earlier in the chapter.

   Figure 13.16 This first test returns three matches, as the literal text for was found three times.

   Figure 13.17 Because regular expressions are greedy by default (see the sidebar), this pattern only finds one match in the string. That match happens to start with the first instance of for and continue until the end of the string.

   Figure 13.18 This revised pattern matches strings that begin with for and end on a word.

   Figure 13.19 Unlike the pattern in Figure 13.18, this one matches entire words that contain for (informal here, formal in Figure 13.18).

✔ Tip
■ The preg_split() function will take a string and break it into an array using a regular expression pattern.
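If you’d rather see these tests as plain code than run them through the form, here is a minimal sketch. The all_matches() function is a helper I’ve made up for the example (not a PHP built-in), and the string given to preg_split() is just sample data.

```php
<?php
$subject = 'This is a formulaic test for informal matches.';

// all_matches() is a hypothetical helper for this sketch, not a PHP built-in.
// It returns every full match of $pattern found in $subject.
function all_matches($pattern, $subject) {
    preg_match_all($pattern, $subject, $matches);
    return $matches[0];
}

$literal = all_matches('/for/', $subject);    // for, three times
$greedy  = all_matches('/for.*/', $subject);  // one match, running to the end
$lazy    = all_matches('/for.*?/', $subject); // for, three times again
$words   = all_matches('/\b[a-z]*for[a-z]*\b/', $subject); // whole words

// preg_split() breaks a string apart wherever the pattern matches,
// here on any run of white space and commas:
$pieces = preg_split('/[\s,]+/', 'cat, dog  bird');
```

The helper returns only the first element of $matches, since that is where preg_match_all() stores the full matches.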
Being Less Greedy

A key component to Perl-Compatible regular expressions is the concept of greediness, which, unlike in POSIX, you can overrule. By default, PCRE will attempt to match as much as possible. For example, the pattern <.+> matches any HTML tag. When tested on a string like <a href="page.php">Link</a>, it will actually match that entire string, from the first < to the final >. This string contains three possible matches, though: the entire string, the opening tag (from <a to ">), and the closing tag (</a>).

To overrule greediness, make the match lazy. A lazy match will contain as little data as possible. Any quantifier can be made lazy by following it with the question mark. For example, the pattern <.+?> would return two matches in the preceding string: the opening tag and the closing tag. It would not return the whole string as a match. (This is one of the confusing aspects of the regular expression syntax: the same character, here the question mark, can have different meanings depending on its context.)

Another way to make patterns less greedy is to use negative classes. The pattern <[^>]+> matches everything between the opening and closing <> except for a closing >. So using this pattern would have the same result as using <.+?>. This pattern would also match strings that contain newline characters, which the period excludes.

Using Modifiers

The majority of the special characters you can use in regular expression patterns are introduced in this chapter. One final type of special character is the pattern modifier. Table 13.4 lists these. Pattern modifiers are different from the other meta-characters in that they are placed after the closing delimiter.

Of these modifiers, the most important is i, which enables case-insensitive searches. All of the examples using variations on for (in the previous sequence of steps) would not match the word For. However, /for.*/i would be a match. Note that I am including the delimiters in that pattern, as the modifier goes after the closing one. Similarly, the last step in that sequence referenced the sidebar “Being Less Greedy” and stated how for.*? would perform a lazy search. So would /for.*/U.

The multiline mode is also interesting in that you can make the caret and the dollar sign behave differently. By default, each applies to the entire value. In multiline mode, the caret matches the beginning of any line and the dollar sign matches the end of any line.

Table 13.4 These characters, when placed after the closing delimiter, alter the behavior of a regular expression.

    Pattern Modifiers
    Character   Result
    A           Anchors the pattern to the beginning of the string
    i           Enables case-insensitive mode
    m           Enables multiline matching
    s           Has the period match every character, including newline
    x           Ignores most white space
    U           Performs a non-greedy match

To use modifiers:

1. Load matches.php in your Web browser, if it is not already.

2. Validate a list of email addresses (Figure 13.20).
   To do so, use /^[\w.-]+@[\w.-]+\.[A-Za-z]{2,6}\r?$/m as the pattern. You’ll see that I’ve added an optional carriage return (\r?) before the dollar sign. This is necessary because some of the lines will contain returns and others won’t. And in multiline mode, the dollar sign matches the end of a line. (To be more flexible, you could use \s? instead.)

   Figure 13.20 A list of email addresses, one per line, can be validated using the multiline mode. Each valid address is stored in $matches.

3. Validate a list of United States zip codes (Figure 13.21).
   Very similar to the example in Step 2, the pattern is now /^(\d{5})(-\d{4})?\s?$/m. You’ll see that I’m using the more flexible \s? instead of \r?.
   You’ll also notice when you try this yourself (or in Figure 13.21) that the $matches variable contains a lot more information now. This will be explained in the next section of the chapter.

   Figure 13.21 Validating a list of zip codes, one per line.

✔ Tip
■ To always match the start or end of a value, regardless of the multiline setting, there are shortcuts you can use. Within the pattern, the shortcut \A will match only the very beginning of the value, \z matches only the very end, and \Z matches the very end or just before a final newline, like $ in single-line mode.

Matching and Replacing Patterns

The last subject to discuss in this chapter is how to match and replace patterns in a value. While preg_match() and preg_match_all() will find things for you, if you want to do a search and replace, you’ll need to use preg_replace(). Its syntax is

    preg_replace(pattern, replacement, subject)

This function takes an optional fourth argument limiting the number of replacements made.

To replace all instances of cat with dog, you would use

    $str = preg_replace('/cat/', 'dog', 'I like my cat.');

This function returns the altered value (or unaltered value if no matches were made), so you’ll likely want to assign it to a variable or use it as an argument to another function (like printing it by calling echo()). Also, as a reminder, this is just an example: you’d never want to replace one literal string with another using regular expressions; use str_replace() instead.

There is a related concept to discuss that is involved with this function: back referencing. In a zip code matching pattern, ^(\d{5})(-\d{4})?$, there are two groups within parentheses: the first five digits and the optional dash plus four-digit extension. Within a regular expression pattern, PHP will automatically number parenthetical groupings beginning at 1. Back referencing allows you to refer to each individual section by using $ plus the corresponding number. For example, if you match the zip code 94710-0001 with this pattern, referring back to $2 will give you -0001. The code $0 refers to the entire match.
This is why Figure 13.21 shows entire zip code matches in $matches[0], the matching first five digits in $matches[1], and any matching dash plus four digits in $matches[2]. To practice with this, let’s modify Script 13.2 to also take a replacement input (Figure 13.22). 409 Perl-Compatible Regular Expressions Matching and Replacing Patterns Figure 13.22 One use of preg_replace() would be to replace variations on inappropriate words with symbols representing their omission. To match and replace patterns: 1. Open matches.php (Script 13.2) in your text editor or IDE. 2. Add a reference to a third incoming vari- able (Script 13.3). $replace = trim($_POST['replace']); As you can see in Figure 13.22, the third form input (added between the existing two) takes the replacement value. That value is also trimmed to get rid of any extraneous spaces. 3. Change the caption. echo "

    The result of replacing
    $pattern
    with
    $replace
    in
    $subject ➝

    "; The caption will print out all of the incoming values, prior to applying preg_replace(). 410 Chapter 13 Matching and Replacing Patterns 1 3 4 5 6 Testing PCRE Replace 7 8 9 The result of replacing
    $pattern
    with
    $replace
    in
    $subject

    "; 23 24 // Check for a match: 25 if (preg_match ($pattern, $subject) ) { 26 echo preg_replace($pattern, $replace, $subject) . '

    '; 27 } else { Script 13.3 To test the preg_replace() function, which replaces a matched pattern in a string with another value, you can use this third version of the PCRE test script. (script continues on next page) 4. Change the regular expression condition- al so that it only calls preg_replace() if a match is made. if (preg_match ($pattern, $subject) ➝ ) { echo preg_replace($pattern, ➝ $replace, $subject) . '

    '; } else { echo 'The pattern was not ➝ found!

    '; } You can call preg_replace() without running preg_match() first. If no match was made, then no replacement will occur. But to make it clear when a match is or is not being made (which is always good to confirm, considering how tricky regular expressions are), the preg_match() function will be applied first. If it returns a true value, then preg_replace() is called, printing the results (Figure 13.23). Otherwise, a message is printed indicating that no match was made (Figure 13.24). 411 Perl-Compatible Regular Expressions Matching and Replacing Patterns Script 13.3 continued 28 echo 'The pattern was not found!

    '; 29 } 30 31 } // End of submission IF. 32 // Display the HTML form. 33 ?> 34 35

    Regular Expression Pattern: (include the delimiters)

    36

    Replacement:

    37

    Test Subject:

    38 39 40
    41 42 Figure 13.23 The resulting text has uses of bleep, bleeps, bleeped, bleeper, and bleeping replaced with *****. Figure 13.24 If the pattern is not found within the subject, the subject will not be changed. The replacement value is hidden here because it uses HTML tags; see the source code for the full effect. 5. Change the form’s action attribute to replace.php.
    This file will be renamed, so this value needs to be changed accordingly. 6. Add a text input for the replacement string.

    Replacement:

7. Save the file as replace.php, place it in your Web directory, and test it in your Web browser (Figure 13.25).
As a good example, you can turn an email address found within some text into its HTML link equivalent: <a href="mailto:email@example.com">email@example.com</a>. The pattern for matching an email address should be familiar by now: ^[\w.-]+@[\w.-]+\.[A-Za-z]{2,6}$. However, because the email address could be found within some text, the caret and dollar sign need to be replaced by the word boundaries shortcut: \b. The final pattern is therefore /\b[\w.-]+@[\w.-]+\.[A-Za-z]{2,6}\b/. To refer to this matched email address, you can use $0 (because $0 refers to the entire match, whether or not parentheses are used). So the replacement value would be <a href="mailto:$0">$0</a>. Because HTML is involved here, look at the HTML source code of the resulting page for the best idea of what happened.

✔ Tips
■ Back references can even be used within the pattern, for example, if a pattern included a grouping (i.e., a subpattern) that would be repeated.
■ I’ve introduced, somewhat quickly, the bulk of the PCRE syntax here, but there’s much more to it. Once you’ve mastered all this, you can consider moving on to anchors, named subpatterns, comments, lookarounds, possessive quantifiers, and more.

Figure 13.25 Another use of preg_replace() is dynamically turning email addresses into clickable links.

The biggest change in version 6 of PHP is support for Unicode. But what is Unicode and why should you care? In this chapter, I’ll answer those questions, and show you how you might change your Web sites using this new information. But as a preview, if you’d like your Web sites to be usable by people who don’t speak the same language as you, or if you don’t feel like always programming in your non-native language, keep reading!

This chapter goes over several subjects, all with the goal of making a more global Web site.
The bulk of these topics involve text: character sets, encodings, collation, transliteration, and Unicode. These topics apply to PHP, MySQL, HTML, and even the application you create your PHP scripts in. I’ll be presenting a book’s worth of information in just a few pages, but it’ll certainly be enough for you to use in real sites.

The other subjects covered here are time zones and locales. Like the language a user reads and writes, these two ideas reflect the different cultures and regions in the world, and therefore ought to be considered in your Web applications.

Understanding all of these subjects, and being able to apply the techniques taught herein, will make your Web sites more reliable, more impressive, and accessible to a larger audience.

14 Making Universal Sites

Character Sets and Encoding

To understand the concepts of character sets and encoding, you have to first realize that, in your computer, there is no such thing as the letter A. The letter A is part of a character set: the symbols used by a language (also called a character repertoire). But the A on my screen as I write this, the A in the text document itself: these aren’t really A’s. At their foundation, computers understand numbers, not letters. This works well for computers, but humans like to see letters. The solution is to have numbers represent letters.

ASCII, which you’ve certainly heard of and is short for American Standard Code for Information Interchange, is a representation of all the letters in the English alphabet (A through Z, both upper- and lowercase), plus the digits 0 through 9, plus all English punctuation. That’s a total of 95 characters. Add to this 33 non-printing characters such as the newline (\n) and a tab (\t), and you have 128 characters, associated with the integers 0 through 127 (Table 14.1). This is a coded character set: each character is represented by a number (the number is also called a code point).
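To make the idea of a coded character set concrete, here is a minimal sketch using PHP’s built-in ord() and chr() functions, which convert between a single-byte character and its integer. The variable names are mine, chosen just for illustration.

```php
<?php
// A few ASCII characters and the integers (code points) that represent them.
$chars = ['A', 'a', '@', '~'];
$codes = array_map('ord', $chars);

foreach ($chars as $i => $char) {
    printf("%s is stored as %d\n", $char, $codes[$i]);
}

// Going the other way: the integer 65 maps back to the letter A.
$letter = chr(65);
echo $letter, "\n";
```

The values printed match the corresponding rows of the ASCII table (65 for A, 97 for a, and so on).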
When computers store data or transfer it from one computer to another, they don’t do so in numbers, they do so in bytes. Encoding is how a coded character set is mapped from integers to bytes. Working backward then, by identifying how text is encoded, a computer can recognize its coded character set, and therefore know what characters should be displayed.

Although ASCII represents the entire English character set, it doesn’t include all the accented characters in related languages, like French and Spanish. Nor does it include characters such as the German ß, or the non-Latin characters present in Greek or Korean. It doesn’t even include things like curly quotes. Other encodings have since been defined, lots and lots of them: different encodings for different languages, even different encodings for different computers (e.g., Windows vs. Mac). Making communication difficult, two encodings would commonly use the same number to represent different characters. From this mess, Unicode was born.

Unicode provides a unique number representing every symbol in every alphabet for any operating system and program. It’s a huge goal and Unicode succeeds rather well. Version 5 of Unicode, the current version at the time of this writing, supports over 99,000 characters, but the upper limit is well over a million. Table 14.2 lists just a sampling of the scripts supported (a script being the collection of symbols used by one or more languages).

Some ASCII Characters
Integer   Key/Character
0         NULL
9         \t
10        \n
27        Escape
32        Space
43        +
54        6
64        @
65        A
97        a
126       ~
127       Delete
Table 14.1 These twelve items are a sampling of the 128 characters defined by the ASCII standard.

Unicode Supported Scripts
Script
Arabic
Cherokee
Cyrillic
Greek
Han
Hebrew
Latin
N’Ko
Runic
Tibetan

When using Unicode, you still have to choose which encoding to go with. UTF-8 is perhaps the most common, in part because ASCII, used so commonly for years, is a nice little subset of UTF-8.
In fact, any ASCII text is also valid UTF-8. There are also UTF-16 and UTF-32, which represent the same Unicode characters using 16-bit and 32-bit units, respectively.

In these paragraphs I’ve introduced the key concepts that will help you comprehend the information in the rest of the chapter. Doing so required the distillation of oodles of technical information, the glossing over of many details, and the abbreviation of decades of computer history. If you want to learn more about these subjects, a search online will turn up volumes, but what you most need to understand is this: the encoding you use dictates what characters can be represented (and therefore, what languages can be used).

✔ Tips
■ Unfortunately, many resources, including HTML and MySQL, use the term charset or character set to refer to the encoding. The two things are technically different, but the terms are used synonymously.
■ Prior to UTF-8, ISO-8859-1 was one of the more commonly used encodings. It represents most Western European languages. It’s still the default encoding for many Web browsers and other applications.
■ Email messages should (but don’t always) indicate the encoding. You can normally see this by viewing the raw source of a message, which will contain a line like
Content-Type: text/plain; charset="UTF-8"
■ Any document (email, Web page, or text file) that contains some junk characters probably wasn’t encoded properly (Figure 14.1).

Table 14.2 A handful of the scripts represented in Unicode. Some scripts, like Latin, are used in many languages (English, Italian, Portuguese, etc.); others, like Hangul, are only used in one (Korean, in this case).

Figure 14.1 This friendly little piece of spam I received didn’t use the right encoding, so junk characters appeared instead (thereby denying me the full joy of the message).
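The difference between a character and the bytes an encoding uses to store it can be seen with a short sketch. This one assumes nothing beyond core PHP: the sample strings are written as explicit UTF-8 byte sequences (my choice, so the result doesn’t depend on how the script file itself is saved), and strlen() and bin2hex() report the raw bytes.

```php
<?php
// Three characters, each written out as its UTF-8 byte sequence.
$samples = [
    'A'       => "A",             // U+0041, also plain ASCII
    'e-acute' => "\xC3\xA9",      // U+00E9
    'euro'    => "\xE2\x82\xAC",  // U+20AC
];

foreach ($samples as $label => $text) {
    // strlen() counts bytes, not characters; bin2hex() shows those bytes.
    printf("%-8s %d byte(s): %s\n", $label, strlen($text), bin2hex($text));
}
```

The ASCII letter occupies a single byte with the same value it has in ASCII, which is why ASCII text is also valid UTF-8; the other two characters need two and three bytes.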
Creating Multilingual Web Pages

Eventually this chapter will go over how to use multiple languages (i.e., multiple characters) in PHP and MySQL, but doing so mandates that you know how to make an HTML page that can display characters from many languages. Of course, what characters you can display is determined by the encoding, but even that topic comes into play more than once in this process.

Say you want to create a Web page that contains text in both English and Japanese. For starters, your computer must be able to enter characters in both languages (it must have the necessary fonts). Normally you can type in one (native) language, but most operating systems offer tools for inserting characters from other languages, too. If your computer supports both languages, then you need to use an encoding for the Web page that supports both, too. That would be UTF-8, in all likelihood. Therefore, the HTML file needs to be written in an application that supports UTF-8 encoding; not all do.

If you have all that, you can now create a document with both English and Japanese characters. This HTML page will be viewable by others in their Web browsers. The Web browsers, then, need to know what encoding the HTML page uses. One way to convey this information is to use a META tag:
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
(To repeat what’s said on a previous page, unfortunately the term charset is used to mean encoding, not character set.)

The last requirement is that the end user’s computer also support both character sets (i.e., they have the necessary fonts). If so, then you’ve successfully created and shared a multilingual Web page.

Before writing another opening PHP tag, let’s make sure you can get all this working.

To create a multilingual Web page:
1. Confirm that your text editor or IDE supports UTF-8 encoding (Figure 14.2). You’ll need to check the Web site, help files, or other documentation for your application.
Getting this step right is necessary, though, as you can’t create a UTF-8-encoded document if your editor doesn’t support UTF-8. Some applications let you set this in their preferences (as in Figure 14.2). Others set the encoding when you save the file (Figure 14.3).

Figure 14.2 My favorite text editor, BBEdit (which sadly only runs on a Mac), has a preferences area where you can set the default encoding for documents.

2. Begin a new HTML document, titled Testing UTF-8 (Script 14.1).
This is mostly standard HTML. To make the resulting page easier to view, an inline CSS style increases the base font size to 18 points. Note that the language declarations in the opening html tag (the two uses of lang="en") are indications of the document’s main language. This is a separate issue from the encoding and the character set.

Script 14.1 This script will be a test to confirm that a UTF-8 Web page can be successfully created and viewed.

Figure 14.3 Notepad on Windows, which isn’t a great text editor but is usable, lets you define a file’s encoding when you save it.

3. Add a META tag that indicates the encoding.
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
This line should be the first one inside of the HEAD tag, as the browser needs to know this information as soon as possible. It should come before the title tags (see Script 14.1) or any other META tags.
4. Add some characters or text to the body of the page.
The first word is a good test of encoding, as it contains many different accents and non-Latin characters. You can also throw in symbols or characters from other languages. In a list, I’ve added the Euro symbol, the schwa, and infinity; then individual characters from the Cyrillic, Arabic, Hebrew, and Hangul scripts (and hopefully I haven’t included anything that will offend anyone!). How you insert characters depends upon your operating system.
Windows has the Character Map utility, which lets you choose characters from installed fonts. Mac OS X has the Character Palette, which displays available scripts, and the Keyboard Viewer, which shows characters by font. Both can be accessed in the menu bar, after checking the right boxes in the International System Preferences pane.
5. Save the file as utf.html and test it in your Web browser (Figures 14.4 and 14.5).
Because this is just an HTML file, it does not need to be run through a URL, like a PHP script.

Figure 14.4 A UTF-8 encoded Web page, successfully showing characters and symbols from all over the world.

Figure 14.5 The same HTML page (as in Figure 14.4), viewed in Windows. This browser and operating system didn’t support the Korean character (the last one), replacing it with a question mark.

✔ Tips
■ If a page’s encoding is different than the encoding it indicates it uses (in the META tag), that will likely lead to problems.
■ Curly quotes often cause problems in improperly encoded documents, as they aren’t part of the ASCII standard.
■ Firefox’s Page Info window (Figure 14.6) will show the document’s encoding. This can be a useful debugging tool.
■ Because there are so many variables when creating multilingual Web pages, they can be tougher to debug. Make sure that you use the proper encoding in your application that creates the HTML or PHP page, that the encoding is indicated within the file itself, and that you test using as many browsers and operating systems as possible.
■ You can also indicate to the Web browser the page’s encoding using PHP and the header() function:
<?php header('Content-Type: text/html; charset=UTF-8'); ?>
This can be more effective than using a META tag, but it does require the page to be a PHP script. If using this, it must be the first line in the page, before any HTML.
■ You can specify the encoding to accept in an HTML form tag, too:
<form accept-charset="utf-8">
By default, a form will use the same encoding as the page itself for any submitted data.
■ You can declare the encoding of an external CSS file by adding @charset "utf-8"; as the first line in the file. If you’re not using UTF-8, change the line accordingly.
■ Another way to use special characters in an HTML page is by using a numeric character reference (NCR). Any Unicode character can be referenced using the format &#XXXX;. For example, the Latin capital A is &#65;. But ideally you should use your computer to add the character itself instead of using an NCR.

Figure 14.6 The informative Page Info window is yet another reason to use Firefox for your Web development.

Unicode in PHP

Now that you know what Unicode is and how to create a properly encoded HTML page, how does this affect PHP, which now supports Unicode? Lacking Unicode support, earlier versions of PHP had only one type of string. PHP 6 has three: Unicode, binary (for other encodings and binary data), and native (for backward compatibility). But because PHP is a weakly typed language, you can work with all three types in more or less the same way.

Figure 14.7 In PHP 6, the output generated by calling the phpinfo() function now has a section for Unicode settings.

Unicode and PHP 5
The most important addition to PHP 6 is support for Unicode, including UTF-8, which I’m advocating using in this chapter. What’s implied is that earlier versions of PHP did not support Unicode. This isn’t just a matter of convenience; it’s actually a problem. If you attempt to work with Unicode text in versions of PHP earlier than 6, the results can range from being unexpected and unpredictable to insecure. The reason is that practically every string function in earlier versions of PHP treated each character as a single byte. This was fine when working with English and many other languages, in which each character was, in fact, a single byte. But the characters in other languages sometimes require multiple bytes apiece.
Applying even a simple function like substr() to such text would give erroneous results. PHP 5 and earlier have two sets of functions for working with multibyte strings (mb_* and iconv_*), but neither is perfect and you really need to know your stuff to use them. Simply said, if you need to handle Unicode data, make sure you’re using PHP 6. If you’re not using PHP 6, don’t accept multibyte characters (i.e., use a different encoding).

To use Unicode with PHP, it first has to be enabled. Doing so requires modifying PHP’s configuration file. The specific setting is unicode.semantics, which must be turned on. If you’re running your own installation of PHP, see Appendix A, “Installation,” for instructions on changing PHP’s configuration. If using a hosted server that’s running PHP 6, you’ll have to ask them to enable Unicode support. You can confirm this setting by calling the phpinfo() function (Figure 14.7).

With Unicode enabled, PHP scripts can properly handle Unicode text that might come from a form, a text file, or a database. Functions like substr() or strlen(), which would not properly work with Unicode data in PHP 5, will now function correctly. You can also now use non-Latin characters for identifiers: the names of variables, functions, and so forth (keywords in PHP will still be in English).

If you’re going to use Unicode characters in identifiers, you need to indicate to PHP what encoding you’re using (aside from encoding the script itself properly using your application). To do so, use
declare(encoding="UTF-8");
This must be the first line in the PHP script (after the opening tag, of course). Also, any included file needs to indicate its encoding (the encoding is not inherited from one script to another).

While I think that being able to use your native language for identifiers is really cool, to demonstrate Unicode in PHP, let’s create a script that highlights some differences between PHP 5 and PHP 6.
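The single-byte assumption described above is easy to see for yourself. The following is a minimal sketch, assuming the mbstring extension is enabled (it provides the mb_* functions mentioned earlier); the sample name is a hypothetical value of mine, written as explicit UTF-8 bytes.

```php
<?php
// "Zoë" in UTF-8: the final letter takes two bytes (0xC3 0xAB).
$name = "Zo\xC3\xAB";

$bytes = strlen($name);               // byte-oriented: counts bytes
$chars = mb_strlen($name, 'UTF-8');   // multibyte-aware: counts characters
echo "strlen: $bytes, mb_strlen: $chars\n";

// substr() works on bytes, so it can cut a character in half.
$broken = substr($name, 0, 3);
$intact = mb_substr($name, 0, 3, 'UTF-8');

// The byte-based result is no longer valid UTF-8; the mb_* result is.
var_dump(mb_check_encoding($broken, 'UTF-8'));
var_dump(mb_check_encoding($intact, 'UTF-8'));
```

The byte-based functions report one more “character” than the name really has, and the byte-based substring ends with half of a character, which is the kind of erroneous result described above.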
421 Making Universal Sites Unicode in PHP To use Unicode in PHP: 1. Begin a new PHP document in your text editor or IDE (Script 14.2). Unicode in PHP

    Names from Around the World

    $name has " . ➝ strlen($name) . " ➝ characters
    \n" . ➝ strtoupper($name) . " in ➝ capital letters

    \n"; } This code should be pretty easy to under- stand, even though it’s going to be applied to strings in multiple languages. It loops through the array, printing out each name as it originally is. Then it also prints out the number of characters in the name and the name in all caps. 4. Complete the page. ?> 5. Save the file as unicode.php, place it in your Web directory, and test it in your Web browser (Figure 14.8). 6. If possible, run the same script using an older version of PHP (Figure 14.9). ✔ Tips ■ You can use casting to forcibly convert a string from one encoding type to another (see Chapter 12, “Security Methods,” for an introduction to typecasting). The casting keywords are (binary), (unicode), and (string). ■ Alternatively, you can use unicode_encode() and unicode_decode() to convert strings from one encoding to another. The unicode_set_error_mode() determines how any conversion problems are handled. 423 Making Universal Sites Unicode in PHP Figure 14.8 The (accurate) results of running the Unicode PHP script using PHP 6. Figure 14.9 The same page (Script 14.2), run under PHP 5.2. Notice how both the character counts and capitalization are incorrect and differ from the results in Figure 14.8. Collation in PHP Collation refers to the rules used for com- paring characters in a set. It’s like alphabeti- zation, but takes into account numbers, spaces, and other characters as well. Collation relates to the character set being used, reflecting both the kinds of characters pres- ent and cultural habits. How text is sorted in English is not the same as it is in Traditional Spanish or in Arabic. For example, are upper- and a lowercase versions of a character con- sidered to be the same or different (i.e., is it a case-sensitive comparison)? Or, how do accented characters get sorted? Is a space counted or ignored? The best way to sort Unicode strings in PHP 6 is to use the Collator class. 
This gets into the subject of object-oriented programming (OOP), not otherwise discussed in this book (a solid introduction provided by my book PHP 5 Advanced: Visual QuickPro Guide (Peachpit Press, 2007) requires over 100 pages), but the syntax is easy enough to follow. Start by creating a new object of type Collator:
$c = new Collator(locale);
When doing this, you need to indicate the locale. I discuss locales at the end of the chapter, but for now, just know that it’ll be a short string indicating a language and geographic reference point. For example, ja_JP is Japanese in Japan; pt_BR is Portuguese in Brazil.
Next, apply the sort() function to an array of strings. Calling functions in a class uses the $object->function() syntax:
$array = $c->sort($array);
Let’s run through an example of this.

To use collation in PHP:
1. Begin a new PHP document in your text editor or IDE (Script 14.3).
Collation in PHP 2 4 5 6 7 Collation in PHP 8 9 10 Using sort()'; 17 sort($words); 18 echo implode('
    ', $words); 19 20 // Sort using the Collator: 21 echo '

    Using Collator

    '; 22 $c = new Collator('fr_FR'); 23 $words = $c->sort($words); 24 echo implode('
    ', $words); 25 26 ?> 27 28 Script 14.3 The collation.php script sorts several French words using the Collator class. Using that code is demonstrably more effective than using PHP’s built-in sort() function. continues on next page 3. Use the sort() function, and then print the results. echo '

    Using sort()

    '; sort($words); echo implode('
    ', $words); PHP’s sort() function is the default sort- ing utility and it works just fine…with standard English. Let’s see how it does with French! The third line here uses the implode() function as a quick way of printing each item in the array on its own line. This function turns an array into a string, using the first argument as the glue. The returned string is then printed by echo(). Figure 14.10 shows the HTML source code resulting from this little shortcut. 4. Use the Collator class, and then print the results. echo '

    Using Collator

    '; $c = new Collator('fr_FR'); $words = $c->sort($words); echo implode('
    ', $words); The syntax for using this Collator class is described before these steps. For the locale value, I use fr_FR, which means French in France. 5. Complete the page. ?> 6. Save the file as collation.php, place it in your Web directory, and test it in your Web browser (Figure 14.11). ✔ Tips ■ Simple comparisons in PHP, using the com- parison operators, do not use collation. ■ The Collator class has a setStrength() function that can be used to adjust the collation rules. For example, you can use this to ignore accents or to change the stress placed on case. 426 Chapter 14 Collation in PHP Figure 14.10 To print each item in the array on its own line, I place HTML breaks in between them using implode(). Figure 14.11 The Collator class does a better job sorting accented and capitalized characters than PHP’s sort() function. Transliteration in PHP Transliteration is the conversion of text from one character set to another. This is not the same thing as translating, which involves a certain amount of interpretation. For exam- ple, in unicode.php (Script 14.2), several names are placed into an array. One of those is Greek: . Transliterated into the Latin alphabet, that would be Gi rgos. The example also used two Asian names— and . Those would be turned into Jié Xi K and Ài Zi, respectively. Because Unicode maps all the characters in every language to numbers, it’s actually very easy to perform transliteration. To achieve this in PHP, use the str_transliterate() function. It takes as its first argument the string to change. The second argument is the script of the original string. The third is the destination script. For both of these, I’m using “script” in the sense of Table 14.2, which lists the scripts supported by Unicode: Latin, Greek, Cyrillic, Arabic, etc. To try this out, let’s see what my (or your) name looks like in other alphabets. 
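For readers who can’t run the str_transliterate() function described here, the same idea can be sketched with the Transliterator class. This is an assumption on my part, not something from the chapter: the class comes from the separate intl extension, and both the helper name to_latin_ascii() and the sample Greek name are mine, used only for illustration.

```php
<?php
// Hypothetical helper: transliterate any script into plain Latin letters.
// Assumes the intl extension, which provides the Transliterator class.
function to_latin_ascii(string $text): string
{
    // "Any-Latin" converts from whatever script the text uses to Latin;
    // "Latin-ASCII" then strips accents and other non-ASCII marks.
    $converter = Transliterator::create('Any-Latin; Latin-ASCII');
    if ($converter === null) {
        throw new RuntimeException('Could not create the transliterator.');
    }
    $result = $converter->transliterate($text);
    if ($result === false) {
        throw new RuntimeException('Transliteration failed.');
    }
    return $result;
}

if (extension_loaded('intl')) {
    // A Greek name as sample input (this file must be saved as UTF-8).
    echo to_latin_ascii('Γιώργος'), "\n";
} else {
    echo "The intl extension is not loaded.\n";
}
```

Chaining the two rules means the output can be displayed even where only ASCII is safe; drop the second rule if you’d rather keep the accents.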
427 Making Universal Sites Transliteration in PHP Unicode Documentation As I’m currently writing this book, PHP 6 has not yet been officially released. However, using available beta versions of the software, I have been able to test all of the code under PHP 6 with only minor hiccups. Unfortunately, what’s not available to me is good, and sometimes any, documentation on many of these new features. In fact, a couple of examples in this chapter use functions that aren’t even in the PHP manual yet! I’m absolutely confident about the examples and content of this book, naturally, but should something change in the official release of PHP 6, you may experience a problem here or there. If so, check out the PHP manual (which will be updated to correspond with the release) and turn to the book’s corresponding Web site (www.DMCInsights.com/phpmysql3/) or its sup- porting book forum for assistance. To use transliteration: 1. Begin a new PHP document in your text editor or IDE (Script 14.4). Transliteration What's my name? 2 4 5 6 7 Transliteration 8 9 10 What's my name? 11 $me is " . str_transliterate($me, 'Latin', $script) . " in $script.

    \n"; 22 } 23 24 ?> 25 26 Script 14.4 This script uses the new str_translitera- tion() function to convert a name from one character set to another. 3. Print the name in each script. foreach ($scripts as $script) { echo "

    $me is " . str_transliterate($me, 'Latin', $script) . " in $script.

    \n"; }
Within the foreach loop, an echo() statement will print the name as it is originally and then transliterated. It will also print the destination script. For the origination script argument, Latin is being used, as the name was written using the Latin alphabet (change this if yours is different).
4. Complete the page.
?>
5. Save the file as trans.php, place it in your Web directory, and test it in your Web browser (Figure 14.12).
If you get an error message like the one in Figure 14.13, that means that a particular script is not available for transliteration (or you misspelled it).

Figure 14.12 My name, transliterated into different alphabets.

Figure 14.13 The attempted conversion into Tibetan failed, as that script isn’t supported by my installation.

Languages and MySQL

Just as an HTML page and PHP script can use different encodings, so can MySQL. To see a list of ones supported by your version of MySQL, run a SHOW CHARACTER SET command (Figure 14.14). Note that the phrase character set is being used in MySQL to mean encoding (which I’ll generally follow in this section to be consistent with MySQL).

Each character set in MySQL has one or more collations. To view those, run this query, replacing charset with the proper value from the result in the last query (Figure 14.15):
SHOW COLLATION LIKE 'charset%'
The results of this query will also indicate the default collation for that character set.

In MySQL, the server as a whole, each database, each table, and even every column can have a character set and collation. To set these values when you create a database, use
CREATE DATABASE name CHARACTER SET charset COLLATE collation
To set these values when you create a table, use
CREATE TABLE name (
column definitions
) CHARACTER SET charset COLLATE collation

Figure 14.14 The list of character sets supported by this MySQL installation.
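When a PHP script creates tables, the character set and collation clauses are just part of the query text. Here is a minimal sketch of assembling that text; the helper charset_clause() is hypothetical (it is not part of PHP or MySQL), and it uses COLLATE, which is the keyword MySQL accepts in these clauses. The table definition is only an example.

```php
<?php
// Hypothetical helper: build "CHARACTER SET x COLLATE y" for a query.
function charset_clause(string $charset, string $collation): string
{
    // These names are written straight into the SQL, so accept only
    // simple identifiers made of letters, digits, and underscores.
    foreach ([$charset, $collation] as $name) {
        if (!preg_match('/^[A-Za-z0-9_]+$/', $name)) {
            throw new InvalidArgumentException("Invalid name: $name");
        }
    }
    return "CHARACTER SET $charset COLLATE $collation";
}

// Example: a table that stores UTF-8 text, sorted by Traditional Spanish rules.
$sql = 'CREATE TABLE test_utf (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    word VARCHAR(20),
    PRIMARY KEY (id)
) ' . charset_clause('utf8', 'utf8_spanish2_ci');

echo $sql, "\n";
```

Validating the names matters if they ever come from a configuration file or a form; the resulting string would then be run like any other query.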
Figure 14.15 The list of collations available in the UTF-8 encoding. The first one, utf8_general_ci, is the default.

To establish the character set and collation for a column, add the right clause to the column’s definition (you’d only use this for text types):
CREATE TABLE name (
something TEXT CHARACTER SET charset COLLATE collation
…)
In each of these cases, both clauses are optional. If omitted, a default character set or collation will be used.

Collations in MySQL can also be specified within a query, to affect the results:
SELECT … ORDER BY column COLLATE collation
SELECT … WHERE column LIKE 'value' COLLATE collation

Establishing the character set and collation when you define a database affects what data can be stored (e.g., you can’t store a character in a column if its encoding doesn’t support that character). A second issue is the encoding used to communicate with MySQL. If you want to store Chinese characters in a table with a Chinese encoding, those characters will need to be transferred using the same encoding. To do so from a PHP script, execute this query prior to executing any others:
SET NAMES charset
If you fail to do this, all data will be transferred using the default character set, which may or may not cause problems.

Within the mysql client, set the encoding using just
CHARSET charset
These last two ideas will be revisited in the next chapter.

I’ve just run through a fair amount of information, so to practice, let’s connect to MySQL and run some queries. For the example, I’ll use Spanish, which has two collations. Using traditional rules, the letter combinations ch and ll are each treated as a singular letter. In modern rules, they are not.

To use character sets and collation:
1. Connect to MySQL using the mysql client.
Recent versions of phpMyAdmin (at the time of this writing) do support setting the character sets and collations, if you’d rather use it.
2. Change the encoding to UTF8 (Figure 14.16).
CHARSET utf8;

Figure 14.16 When communicating with MySQL, to use a non-default encoding, change it upon connecting to the server.

3. Select the test database and create a new table (Figure 14.17).
USE test;
CREATE TABLE test_utf (
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
word VARCHAR(20),
PRIMARY KEY (id)
) CHARSET utf8;
Because this is just practice, create a new table within the test database. This table is rather minimally defined, using just two columns. The character set (which is to say the encoding) for the table is UTF-8.
4. Insert some sample records (Figure 14.18).
INSERT INTO test_utf (word) VALUES
('Calle'), ('cuchillo'), ('cuchara'), ('castillo'), ('cucaracha'), ('castigo'), ('castizo'), ('cuclillo');

Figure 14.17 This table will be used to demonstrate collation and character sets.

Figure 14.18 Populating the table with some sample data.

5. Retrieve the records in alphabetical order (Figure 14.19).
SELECT * FROM test_utf ORDER BY word;
This query will use the established collation for the column. With the table definition in Step 3, that would be the default collation for the UTF-8 character set.
6. Retrieve the records in order using Traditional Spanish rules (Figure 14.20).
SELECT * FROM test_utf ORDER BY word COLLATE utf8_spanish2_ci;
To change the order used in a sort, without making a permanent change in the database, add the COLLATE clause to your query. The utf8_spanish2_ci collation uses the Traditional Spanish rules of order.

✔ Tips
■ It’s recommended that any column using the UTF-8 encoding not be defined as CHAR for performance reasons. Use a text or VARCHAR type instead.
■ The CONVERT() function can convert text from one character set to another.
■ Because different character sets require more space to represent a string, you will likely need to increase the size of a column for UTF-8 characters. Do this before changing a column’s encoding so that no data is lost.

Figure 14.19 The words in order using the default collation.

Figure 14.20 The difference in collations is evident in the new location of the word with an ID of 8 (compare with Figure 14.19).

Time Zones and MySQL

Chapter 10, “Web Application Development,” introduces a couple of PHP’s date and time functions. These include date_default_timezone_set(), which needs to be called prior to using any other date or time function (as of PHP 5.1). I think there’s enough information in that chapter, and in the PHP manual, if you need to work with time zones in PHP. But what about MySQL?

Start by remembering that the date and time in MySQL represents the date and time on the server. Invocations of NOW() and other functions reflect the server’s time. Therefore, values stored in a database using these functions are also storing the server’s time, reflecting that server’s time zone. But say you move your site from one server to another: you export all the data, import it into the other, and everything’s fine…unless the two servers are in different time zones, in which case all of the dates are now off. That won’t be a big deal for some sites, but what if your site features paid memberships? That means some people’s membership might expire a day early and for others, a day late!

The solution is to store dates and times in a time zone–neutral way. Doing so uses something called UTC (Coordinated Universal Time, and, yes, the abbreviation doesn’t exactly match the term). UTC, like Greenwich Mean Time (GMT), provides a common point of origin, from which all times in the world can be expressed as UTC plus or minus some hours and minutes (Table 14.3).
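To see how a time zone is expressed as UTC plus or minus some hours and minutes, here is a short sketch using PHP’s DateTime and DateTimeZone classes. The helper utc_offset() and the sample dates are my own, chosen to show that one zone can have different offsets in winter and summer while others never change.

```php
<?php
// Hypothetical helper: describe a zone's offset from UTC at a given moment.
function utc_offset(string $zone, string $when): string
{
    $moment = new DateTime($when, new DateTimeZone('UTC'));
    // getOffset() returns the difference from UTC in seconds.
    $seconds = (new DateTimeZone($zone))->getOffset($moment);
    $sign = ($seconds < 0) ? '-' : '+';
    $seconds = abs($seconds);
    return sprintf('UTC%s%02d:%02d', $sign, intdiv($seconds, 3600), intdiv($seconds % 3600, 60));
}

foreach (['America/New_York', 'Asia/Kolkata', 'Asia/Kathmandu'] as $zone) {
    // Compare a winter date with a summer date for each zone.
    printf(
        "%-18s %s (January) %s (July)\n",
        $zone,
        utc_offset($zone, '2008-01-15 12:00:00'),
        utc_offset($zone, '2008-07-15 12:00:00')
    );
}
```

New York shifts by an hour between the two dates because of daylight saving time, whereas the 30- and 45-minute offsets for India and Nepal stay fixed.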
Fortunately you don’t have to perform any calculations in order to determine UTC for your server. Instead, the UTC_DATE() function returns the current UTC date; UTC_TIME() returns the current UTC time; and UTC_TIMESTAMP() returns the current UTC date and time.

Once you have stored a UTC time, you’ll likely want to retrieve it adjusted to reflect the server’s or the user’s location. To change a date and time from any one time zone to another, use CONVERT_TZ():

    CONVERT_TZ(dt, from, to)

The first argument is a date and time value, like the result of a function or what’s stored in a column. The second and third arguments are named time zones (see the sidebar).

UTC Offsets

    City                        Time
    New York City, U.S.         UTC–4
    Cape Town, South Africa     UTC+2
    Mumbai, India               UTC+5:30
    Auckland, New Zealand       UTC+13
    Kathmandu, Nepal            UTC+5:45
    Santiago, Chile             UTC–3
    Dublin, Ireland             UTC+1

Table 14.3 A sampling of cities and how their time would be represented, depending upon daylight saving time. Note that not all time zones use hourly offsets. Some use 30- or 45-minute offsets.

To work with UTC:

1. Connect to MySQL.

You can use the mysql client (as I will in the corresponding figures), phpMyAdmin, or something else.

2. Select the test database and create a new table (Figure 14.21).

    USE test;
    CREATE TABLE tz (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    utc DATETIME,
    PRIMARY KEY (id)
    );

Because this is just practice, the new table will again be created within the test database. This table is also rather minimally defined, using just two columns. The second column, of type DATETIME, will be the important one for this example. I haven’t tweaked the character set, as this example won’t be working with text.

3. Insert a sample record.

    INSERT INTO tz (utc) VALUES (UTC_TIMESTAMP());

Using the UTC_TIMESTAMP() function, the record will store the UTC date and time, not the date and time on the server.
Figure 14.21 Creating another example table.

4. View the record as it’s stored (Figure 14.22).

    SELECT * FROM tz;

As you can see in the figure and the table definition, UTC times are stored just the same as non-UTC times. What’s not obvious in the figure is that the record just inserted reflects a time four hours ahead of the server’s local time (because the server is in a time zone four hours behind UTC).

5. Retrieve the record in your time zone (Figure 14.23).

    SELECT CONVERT_TZ(utc, 'UTC', 'America/New_York') FROM tz;

Using the CONVERT_TZ() function, you can convert any date and time to a different time zone. For the from time zone, use UTC. For the to time zone, use yours. The time zone names match those used by PHP (see Chapter 10 or, more directly, www.php.net/timezones).

If you get a NULL result (Figure 14.24), either the name of one of your time zones is wrong or MySQL hasn’t had its time zones loaded yet (see the sidebar).

Using Time Zones in MySQL

MySQL does not install support for time zones by default. In order to use named time zones, there are five tables in the mysql database that have to be populated. While MySQL doesn’t automatically do this for you, it does provide the tools to do this yourself.

This process is just complicated enough that there’s not room to discuss it in this book (not for every possible contingency, such as each operating system). But you can find the instructions by looking up “server time zone support” in the MySQL manual. The manual even has sample queries you can run to confirm that your time zones are accurate.

If you continue to use time zones in MySQL, you also need to keep this information in the mysql database updated. The rules for time zones, in particular when and how they observe daylight saving time, change often. Again, the MySQL manual has instructions for updating your time zones.
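If CONVERT_TZ() returns NULL for you, the following sketch may help narrow down the cause. The first query counts the rows in mysql.time_zone_name, one of the time zone tables in the mysql database; it assumes your MySQL user is allowed to read that database. The second query is a stopgap that uses numeric offsets, which work even when the named time zones aren't loaded. The –4 hour offset (matching New York in Table 14.3) and the local_fixed alias are assumptions for illustration only.

```sql
-- A count of 0 suggests the named time zones have not been loaded.
-- Assumes the current user has SELECT access to the mysql database.
SELECT COUNT(*) FROM mysql.time_zone_name;

-- Stopgap: convert using numeric offsets instead of named time zones.
-- The '-04:00' offset and the local_fixed alias are illustrative;
-- a fixed offset does not adjust for daylight saving time.
SELECT utc, CONVERT_TZ(utc, '+00:00', '-04:00') AS local_fixed FROM tz;
```

The numeric version is fine for a quick test, but because it ignores daylight saving time, named time zones remain the better choice for a live site.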
✔ Tips

■ However you decide to handle dates, the key is to be consistent. If you decide to use UTC, then always use UTC.

■ UTC is also known as Zulu time, represented by the letter Z.

■ Besides being time zone and daylight saving time agnostic, UTC is also more accurate. It has irregular leap seconds that compensate for the inexact movement of the planet.

Figure 14.22 The record that was just inserted, which reflects a time four hours ahead (the server is UTC-4).

Figure 14.23 The UTC-stored date and time converted to my local time.

Figure 14.24 The CONVERT_TZ() function will return NULL if it references an invalid time zone or if the time zones haven’t been installed in MySQL (which is the case here).

Working with Locales

A locale is an interesting concept that most beginner programmers aren’t familiar with. It occupies several realms that overlap with some of the other topics in this chapter. A locale represents the language and formatting habits for a culture. Locales describe:

◆ How dates, times, currencies, and numbers should be written

◆ What unit of measurement is used

◆ How text should be sorted or matched

◆ How characters are capitalized

For example, English is spoken in both the United States and England, but the two countries format dates differently.

Each computer has a default locale. Using PHP, you can change the locale value. You might want to do this if, for example, your server is located in the United States but you have a site targeting the Swiss population.

To change the locale in version 6 of PHP, use the locale_set_default() function (see the sidebar for the PHP 5 alternative). This function takes just one argument, a string in the format [_