.NET 4.0 Generics Beginner's Guide


.NET 4.0 Generics Beginner's Guide Enhance the type safety of your code and create applications easily using Generics in the .NET 4.0 Framework Sudipta Mukherjee BIRMINGHAM - MUMBAI D o wnload from Wow! eBook D o wnload from Wow! eBook .NET 4.0 Generics Beginner's Guide Copyright © 2012 Packt Publishing All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews. Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book. Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information. First published: January 2012 Production Reference: 1190112 Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK. ISBN 978-1-84969-078-2 www.packtpub.com Cover Image by Asher Wishkerman (a.wishkerman@mpic.de) Credits Author Sudipta Mukherjee Reviewers Atul Gupta WEI CHUNG, LOW Antonio Radesca Acquisition Editor David Barnes Lead Technical Editor Meeta Rajani Technical Editors Veronica Fernandes Copy Editor Laxmi Subramanian Project Coordinator Vishal Bodwani Proofreader Joanna McMahon Indexer Monica Ajmera Mehta Rekha Nair Graphics Manu Joseph Production Coordinator Alwin Roy Cover Work Alwin Roy Foreword It is my pleasure to write the foreword to a book which will introduce you to the world of generic programming with C# and other .NET languages. You will be able to learn a lot from this book, as it introduces you to the elegant power of generic programming in C#. Through it, you will become a better C# programmer, and a better programmer in all future languages you might choose to use. It is now almost 10 years since .NET Generics was first described in publications from Microsoft Research, Cambridge, a project I was able to lead and contribute to, and six years since it was released in product form in C# 2.0. In this foreword, I would like to take a moment to review the importance of .NET Generics in the history of programming languages, and the way it continues to inspire a new generation of programmers. When we began the design of C# and .NET Generics, generic programming was not new. However, it was considered to be outside the mainstream, and attempts to change that with C++ templates and proposals for Java Generics were proving highly problematic for practitioners. At Microsoft Research, we pride ourselves on solving problems at their core. The three defining core features of .NET Generics as we designed them were efficient generics over value types with code generation and sharing managed by the virtual machine, reified run-time types, and language neutrality. These technical features are now widely acknowledged to represent the "right" fundamental design choices for programming language infrastructure. They are not easy to design or build, and they are not easy to deliver, and when Microsoft Research embarked on this project, we believe we put the .NET platform many years ahead of its rivals. The entire credit goes to Microsoft and people such as Bill Gates, Eric Rudder, and Anders Hejlsberg for taking the plunge to push this into our range of programming languages. However, without the prototyping, research, engineering, and incessant advocacy by Microsoft Research, C# and .NET Generics would never be in their current form. Let's take some time to examine why this was important. First, .NET Generics represents the moment where strongly typed and functional programming entered the mainstream. .NET Generics enabled C# to become more functional (through LINQ, Lambdas, and generic collections), and it enabled a new class of strongly typed, fully functional .NET languages (such as F#) to thrive. Further, .NET Generics also enabled new key programming techniques, such as Async programming in F# 2.0 and C# 5.0, and Rx programming for reactive systems. Even though you may not realize it, you'll have learned a lot of functional programming by the end of this book. Next, .NET Generics categorically proved that strongly typed object-oriented programming can integrate seamlessly with generic programming. It is hard to describe the extent to which .NET Generics managed to defeat the "object fundamentalists" of the 1990s (who want a world where there is nothing but classes). These people, many still occupying powerful positions in the software industry, seemed satisfied with a world where programmers are less productive, and programs less efficient, in the name of orthodoxy. Today, no practicing programmer or language designer with experience of .NET Generics would design a strongly typed programming language that does not include Generics. Further, almost every .NET API now features the use of .NET Generics, and it has become an essential weapon in the programmer's toolkit for solving many problems. Finally, and for me most importantly, .NET Generics represents the victory of pragmatic beauty over pragmatic ugliness. In the eyes of many, alternative solutions to the problem of generic programming such as Java's "erasure" of Generics are simply unpleasant "hacks". This leads to reduced productivity when using those languages. In contrast, .NET Generics is perhaps the most smoothly integrated advanced programming language feature ever constructed. It integrates with reflection, .NET NGEN pre-compilation, debugging, and run- time code generation. I've had many people e-mail me to say that .NET Generics is their favorite programming language feature. That is what language research is all about. I trust you will learn a great deal from this book, and enjoy the productivity that comes from C#, and .NET languages such as F#. Dr. Don Syme Principal Research, Microsoft Research, Cambridge, U.K. Generic types are more than just lists of T. Functional programmers have known this for a long time. C++ programmers who use templates knew this too. But 10 years ago when Don Syme and I first designed and prototyped the Generics feature of the .NET run-time, most mainstream developers were constrained by the rudimentary type systems of languages such as Visual Basic and Java, writing type-generic code only by resorting to casting tricks or worse. In that space, it's hard to conceive of myriad uses of generic types beyond lists and simple collections, and it's fair to say that there was some resistance to our design! Fortunately, some forward thinkers in Microsoft's .NET run-time team regarded Generics in managed languages as more than an academic indulgence, and committed substantial resources to completing a first-class implementation of Generics that is deeply embedded in the run-time languages and tools. We've come a long way in 10 years! Managed code frameworks make liberal use of generic types, ranging from obvious collection types such as List and Dictionary, through `action' types such as Func and IEnumerable, to more specialized use of Generics such as Lazy initialization. Blogs and online forums are full of discussions on sophisticated topics such as variance and circular constraints. And if it weren't for Generics, it's hard to see how newer language features such as LINQ, or even complete languages such as F#, could have got off the ground. Coming back, Generics really does start with List, and this book sensibly begins from there. It then takes a leisurely tour around the zoo of generic types in the .NET Framework and beyond, to Power Collections and C5. The style is very much one of exploration: the reader is invited to experiment with Generics, prodding and poking a generic type through its methods and properties, and thereby understand the type and solve problems by using it. As someone whose background is in functional programming, in which the initial experience is very much like experimenting with a calculator, I find this very appealing. I hope you like it as much as I do. Andrew Kennedy Microsoft Research, Cambridge, U.K. About the Author Sudipta Mukherjee was born in Kolkata and migrated to Bangalore, the IT capital of India, to assume the position of a Senior Research Engineer in a renowned research lab. He is an Electronics Engineer by education and a Computer Engineer/Scientist by profession and passion. He graduated in 2004 with a degree in Electronics and Communication Engineering. He has been working with .NET Generics since they first appeared in the .NET Framework 2.0. He has a keen interest in data structure, algorithms, text processing, natural language processing ,programming language, tools development, and game development. His first book on data structure using the C programming language has been well received. Parts of the book can be read on Google Books at http://goo.gl/pttSh. The book was also translated into Simplified Chinese available on Amazon at http://goo.gl/lc536. He is an active blogger and an open source enthusiast. He mainly blogs about programming and related concepts at sudipta.posterous.com. Inspired by several string processing methods in other languages, he created an open source string processing framework for .NET, available for free at stringdefs.codeplex.com. He lives in Bangalore with his wife. He can be reached via e-mail at sudipto80@yahoo.com and via Twitter at @samthecoder. Acknowledgement Books like this cannot be brought to life by the author alone. I want to take this opportunity to thank all the people who were involved in this book in any way. First of all, I want to thank Microsoft Research for bringing Generics into the .NET Framework. Great work guys. I have used STL in C++ and Collections in Java. But I can say without being biased that Generics in .NET is the smartest implementation of generic programming paradigm that I have ever come across. Without that, I wouldn't have anything to write about. I owe a big "Thank You" to the Senior Acquisition Editor and Publisher David Barnes at Packt Publishing for offering me this opportunity to write for them. I want to thank Vishal Bodwani and Meeta Rajani, also from Packt Publishing, for their great support. Everytime I missed a deadline, they helped me get back on track. Thanks for bearing with me. Last but not the least, I want to thank my Technical Editors Snehal and Veronica who painstakingly corrected all the mistakes, did all the formatting, without which the book would not have been possible. Thanks a lot. I have no words to express my gratitude towards Don and Andrew for taking time off to read the manuscript and their kind words. Thank you Don. Thank you Andrew. I want to thank all the reviewers of the book. Thanks for all your great feedback. It really made the book better. My wife, Mou, motivated me to write this book. She stood by me when I needed her throughout all these months. Thank you sweetheart. Last but not the least, I can't thank my mom Dipali and dad Subrata enough for finding the love of my life and always being supportive. Thank you mom. Thank you dad. About the Reviewers Atul Gupta, is currently a Principal Technology Architect at Infosys' Microsoft Technology Center. He also has close to 15 years of experience working on Microsoft technologies. His expertise spans user interface technologies, and he currently focuses on Windows Presentation Foundation (WPF) and Silverlight technologies. Other technologies of interest to him are Touch (Windows 7), Deepzoom, Pivot, Surface, and Windows Phone 7. He recently co-authored the book "ASP.NET 4 Social Networking", Packt Publishing (http:// www.packtpub.com/asp-net-4-social-networking/book). His prior interest areas were COM, DCOM, C, VC++, ADO.NET, ASP.NET, AJAX, and ASP.NET MVC. He has also authored papers for industry publications and websites, some of which are available on Infosys' Technology Showcase (http://www.infosys.com/microsoft/ resource-center/pages/technology-showcase.aspx). Along with colleagues from Infosys, Atul is also an active blogger (http://www.infosysblogs.com/microsoft). Being actively involved in professional Microsoft online communities and developer forums, Atul has received Microsoft's Most Valuable Professional award for multiple years in a row. WEI CHUNG, LOW, a Technical Lead in BizTalk and .NET, and a MCT, MCPD, MCITP, MCTS, and MCSD.NET, works with ResMed (NYSE: RMD), at its Kuala Lumpur, Malaysia campus. He is also a member of PMI, certified as a PMP. He started working on Microsoft .NET very early on and has been involved in development, consultation, and corporate training in the areas of business intelligence, system integration, and virtualization. He has been working for the Bursa Malaysia (formerly Kuala Lumpur Stock Exchange) and Shell IT International previously, which prepared him with rich integration experience across different platforms. He strongly believes that great system implementation delivers precious value to the business, and integration of various systems across different platforms shall always be a part of it, just as people from different cultures and diversities are able to live in harmony in most of the major cities. Antonio Radesca has over 15 years of programming experience. He has a degree in Computer Science and is interested in architectures, programming languages, and enterprise development. He has worked at some of the most important Italian companies, especially at Microsoft .NET Framework as a Developer and an Architect. His expertise spans .NET programming to mobile development on iOS, Android, and Windows Phone. D o wnload from Wow! eBook D o wnload from Wow! eBook www.PacktPub.com Support files, eBooks, discount offers and more You might want to visit www.PacktPub.com for support files and downloads related to your book. Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at service@packtpub.com for more details. At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks. http://PacktLib.PacktPub.com Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can access, read and search across Packt's entire library of books.  Why Subscribe? ‹‹ Fully searchable across every book published by Packt ‹‹ Copy and paste, print and bookmark content ‹‹ On demand and accessible via web browser Free Access for Packt account holders If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view nine entirely free books. Simply use your login credentials for immediate access. Table of Contents Preface 1 Chapter 1: Why Generics? 7 An analogy 8 Reason 1: Generics can save you a lot of typing 8 Reason 2: Generics can save you type safety woes, big time 10 What's the problem with this approach? 12 Reason 3: Generics leads to faster code 14 Reason 4: Generics is now ubiquitous in the .NET ecosystem 15 Setting up the environment 15 Summary 17 Chapter 2: Lists 19 Why bother learning about generic lists? 20 Types of generic lists 20 Checking whether a sequence is a palindrome or not 22 Time for action – creating the generic stack as the buffer 24 Time for action – completing the rest of the method 26 Designing a generic anagram finder 28 Time for action – creating the method 29 Life is full of priorities, let's bring some order there 32 Time for action – creating the data structure for the prioritized shopping list 33 Time for action – let's add some gadgets to the list and see them 34 Time for action – let's strike off the gadgets with top-most priority after we have bought them 37 Time for action – let's create an appointment list 40 Live sorting and statistics for online bidding 41 Time for action – let's create a custom class for live sorting 42 Why did we have three LinkedList as part of the data structure? 47 An attempt to answer questions asked by your boss 47 Table of Contents [ ii ] Time for action – associating products with live sorted bid amounts 47 Time for action – finding common values across different bidding amount lists 50 You will win every scrabble game from now on 52 Time for action – creating the method to find the character histogram of a word 52 Time for action – checking whether a word can be formed 53 Time for action – let's see whether it works 54 Trying to fix an appointment with a doctor? 56 Time for action – creating a set of dates of the doctors' availability 57 Time for action – finding out when both doctors shall be present 58 Revisiting the anagram problem 60 Time for action – re-creating the anagram finder 60 Lists under the hood 64 Summary 65 Chapter 3: Dictionaries 67 Types of generic associative structures 68 Creating a tag cloud generator using dictionary 69 Time for action – creating the word histogram 69 Creating a bubble wrap popper game 73 Time for action – creating the game console 74 Look how easy it was! 77 How did we decide we need a dictionary and not a list? 78 Let's build a generic autocomplete service 79 Time for action – creating a custom dictionary for autocomplete 79 Time for action – creating a class for autocomplete 82 The most common pitfall. Don't fall there! 88 Let's play some piano 88 Time for action – creating the keys of the piano 89 How are we recording the key strokes? 94 Time for action – switching on recording and playing recorded keystrokes 95 How it works? 96 C# Dictionaries can help detect cancer. Let's see how! 97 Time for action – creating the KNN API 97 Time for action – getting the patient records 102 Time for action – creating the helper class to read a delimited file 103 Time for action – let's see how to use the predictor 104 Tuples are great for many occasions including games 105 Time for action – putting it all together 106 Why have we used Tuples? 113 How did we figure out whether the game is over or not? 115 Summary 116 Table of Contents [ iii ] Chapter 4: LINQ to Objects 117 What makes LINQ? 118 Extension methods 118 Time for action – creating an Extension method 119 Time for action – consuming our new Extension method 120 Check out these guidelines for when not to use Extension methods 122 Object initializers 122 Collection initializers 123 Implicitly typed local variables 124 Anonymous types 124 Lambda expressions 125 Functors 125 Predicates 127 Actions 127 Putting it all together, LINQ Standard Query Operators 128 Time for action – getting the LINQPad 129 Restriction operators 131 Where() 131 Time for action – finding all names with *am* 131 Time for action – finding all vowels 132 Time for action – finding all running processes matching a Regex 133 Time for action – playing with the indexed version of Where() 134 Time for action – learn how to go about creating a Where() clause 135 Projection operators 136 Select() 137 Time for action – let's say "Hello" to your buddies 137 Making use of the overloaded indexed version of Select() 138 Time for action – radio "Lucky Caller" announcement 138 SelectMany() 140 Time for action – flattening a dictionary 140 Partitioning operators 141 Take() 141 Time for action – leaving the first few elements 142 TakeWhile() 143 Time for action – picking conditionally 143 Skip() 145 Time for action – skipping save looping 145 SkipWhile() 146 Ordering operators 147 Reverse() 147 Time for action – reversing word-by-word 147 Time for action – checking whether a given string is a palindrome or not 148 OrderBy() 149 Table of Contents [ iv ] Time for action – sorting names alphabetically 149 Time for action – sorting 2D points by their co-ordinates 151 OrderByDescending() 152 ThenBy() 152 Time for action – sorting a list of fruits 152 What's the difference between a sequence of OrderBy().OrderBy() and OrderBy().ThenBy()? 154 ThenByDescending() 154 Grouping operator 154 GroupBy() 154 Time for action – indexing an array of strings 154 Time for action – grouping by length 156 Set operators 158 Intersect() 158 Time for action – finding common names from two names' lists 159 Union() 160 Time for action – finding all names from the list, removing duplicates 161 Concat() 162 Time for action – pulling it all together including duplicates 162 Except() 162 Time for action – finding all names that appear mutually exclusively 163 Distinct() 164 Time for action – removing duplicate song IDs from the list 165 Conversion operators 166 ToArray() 167 Time for action – making sure it works! 167 ToList() 168 Time for action – making a list out of IEnumerable 168 ToDictionary() 169 Time for action – tagging names 169 ToLookup() 171 Time for action – one-to-many mapping 171 Element operators 172 First() 172 Time for action – finding the first element that 172 satisfies a condition 172 How First() is different from Single()? 173 FirstOrDefault() 173 Time for action – getting acquainted with FirstOrDefault() 173 Last() 174 LastOrDefault() 174 SequenceEquals() 174 Time for action – checking whether a sequence is palindromic 174 ElementAt() 175 Time for action – understanding ElementAt() 176 ElementAtOrDefault() 177 Table of Contents [ v ] DefaultIfEmpty() 177 Time for action – check out DefaultIfEmpty() 178 Generation operators 178 Range() 178 Time for action – generating arithmetic progression ranges 179 Time for action – running a filter on a range 179 Repeat() 180 Time for action – let's go round and round with Repeat() 180 Quantifier operators 181 Single() 181 Time for action – checking whether there is only one item matching this pattern 182 SingleOrDefault() 183 Time for action – set to default if there is more than one matching elements 183 Any() 184 Time for action – checking Any() 185 All() 186 Time for action – how to check whether all items match 186 a condition 186 Merging operators 187 Zip() 187 Summary 188 Chapter 5: Observable Collections 189 Active change/Statistical change 190 Passive change/Non-statistical change 191 Data sensitive change 191 Time for action – creating a simple math question monitor 193 Time for action – creating the collections to hold questions 194 Time for action – attaching the event to monitor the collections 195 Time for action – dealing with the change as it happens 197 Time for action – dealing with the change as it happens 199 Time for action – putting it all together 200 Time for action – creating a Twitter browser 201 Time for action – creating the interface 202 Time for action – creating the TweetViewer user control design 203 Time for action – gluing the TweetViewer control 205 Time for action – putting everything together 208 Time for action – dealing with the change in the list of names in the first tab 209 Time for action – a few things to beware of at the form load 210 Time for action – things to do when names get added or deleted 211 Time for action – sharing the load and creating a task for each BackgroundWorker 213 Time for action – a sample run of the application 216 Summary 219 Table of Contents [ vi ] Chapter 6: Concurrent Collections 221 Creating and running asynchronous tasks 222 Pattern 1: Creating and starting a new asynchronous task 222 Pattern 2: Creating a task and starting it off a little later 222 Pattern 3: Waiting for all running tasks to complete 222 Pattern 4: Waiting for any particular task 222 Pattern 5: Starting a task with an initial parameter 222 Simulating a survey (which is, of course, simultaneous by nature) 223 Time for action – creating the blocks 223 Devising a data structure for finding the most in-demand item 227 Time for action – creating the concurrent move-to-front list 228 Time for action – simulating a bank queue with multiple tellers 234 Time for action – making our bank queue simulator more useful 239 Be a smart consumer, don't wait till you have it all 241 Exploring data structure mapping 242 Summary 243 Chapter 7: Power Collections 245 Setting up the environment 246 BinarySearch() 248 Time for action – finding a name from a list of names 248 CartesianProduct() 249 Time for action – generating names of all the 52 playing cards 249 RandomShuffle() 250 Time for action – randomly shuffling the deck 250 NCopiesOf() 252 Time for action – creating random numbers of any given length 252 Time for action – creating a custom random number generator 253 ForEach() 256 Time for action – creating a few random numbers of given any length 256 Rotate() and RotateInPlace() 257 Time for action – rotating a word 258 Time for action – creating a word guessing game 258 RandomSubset() 262 Time for action – picking a set of random elements 262 Reverse() 263 Time for action – reversing any collection 263 EqualCollections() 264 Time for action – revisiting the palindrome problem 264 DisjointSets() 265 Time for action – checking for common stuff 265 Table of Contents [ vii ] Time for action – finding anagrams the easiest way 266 Creating an efficient arbitrary floating point representation 267 Time for action – creating a huge number API 267 Creating an API for customizable default values 273 Time for action – creating a default value API 273 Mapping data structure 277 Algorithm conversion strategy 277 Summary 278 Chapter 8: C5 Collections 279 Setting up the environment 281 Time for action – cloning Gender Genie! 281 Time for action – revisiting the anagram problem 287 Time for action – Google Sets idea prototype 288 Time for action – finding the most sought-after item 294 Sorting algorithms 299 Pattern 1: Sorting an array of integers 300 Pattern 2: Partially sorting an array—say, sort first five numbers of a long array 300 Pattern 3: Sorting a list of string objects 301 Summary 302 Chapter 9: Patterns, Practices, and Performance 303 Generic container patterns 304 How these are organized 304 Pattern 1: One-to-one mapping 304 Pattern 2: One-to-many unique value mapping 305 Pattern 3: One-to-many value mapping 306 Pattern 4: Many-to-many mapping 307 A special Tuple<> pattern 308 Time for action – refactoring deeply nested if-else blocks 310 Best practices when using Generics 312 Selecting a generic collection 314 Best practices when creating custom generic collections 315 Performance analysis 317 Lists 317 Dictionaries/associative containers 318 Sets 318 How would we do this investigation? 318 Benchmarking experiment 1 319 Benchmarking experiment 2 324 Benchmarking experiment 3 328 Benchmarking experiment 4 330 Benchmarking experiment 5 334 Table of Contents [ viii ] Benchmarking experiment 6 336 Benchmarking experiment 7 340 Benchmarking experiment 8 344 Benchmarking experiment 9 345 Summary 348 Appendix A: Performance Cheat Sheet 349 Parameters to consider 353 Appendix B: Migration Cheat Sheet 357 Appendix C: Pop Quiz Answers 361 Chapter 2 361 Lists 361 Chapter 3 361 Dictionaries 361 Chapter 4 362 LINQ to Objects 362 Index 363 Preface Thanks for picking up this book. This is an example-driven book. You will learn about several generic containers and generic algorithms available in the .NET Framework and a couple of other majorly accepted APIs such as Power Collections and C5 by building several applications and programs. Towards the end, several benchmarkings have been carried out to identify the best container for the job at hand. What this book covers Chapter 1, Why Generics?, introduces .NET Generics. We will examine the need for the invention of Generics in the .NET Framework. If you start with a feel of "Why should I learn Generics?", you will end with a feeling of "Why didn't I till now?" Chapter 2, Lists, introduces you to several kinds of lists that .NET Generics has to offer. There are simple lists and associative lists. You shall see how simple lists can deliver amazing results avoiding any typecasting woes and boosting performance at the same time. Chapter 3, Dictionaries, explains the need for associative containers and introduces you to the associative containers that .NET has to offer. If you need to keep track of one or multiple dependent variables while one independent variable changes, you need a dictionary. For example, say you want to build a spell check or an autocomplete service, you need a dictionary. This chapter will walk you through this. Along the way, you will pick up some very important concepts. Chapter 4, LINQ to Objects, explains LINQ to objects using extension methods. LINQ or Language Integrated Query is a syntax that allows us to query collections unanimously. In this chapter, we will learn about some standard LINQ Standard Query Operators (LSQO) and then use them in unison to orchestrate an elegant query for any custom need. Preface [ 2 ] Chapter 5, Observable Collections, introduces observable collections. Observing events on collections has been inherently difficult. That's going to change forever, thanks to observable collections. You can now monitor your collections for any change; whether some elements are added to the collection, some of them are deleted, change locations, and so on. In this chapter, you will learn about these collections. Chapter 6, Concurrent Collections, covers concurrent collections that appeared in .NET 4.0. Multi-threaded applications are ubiquitous and that's the new expectation of our generation. We are always busy and impatient, trying to get a lot of things done at once. So concurrency is here and it is here to stay for a long time. Historically, there was no inbuilt support for concurrency in generic collections. Programmers had to ensure concurrency through primitive thread locking. You can still do so, but you now have an option to use the concurrent version of generic collections that support concurrency natively. This greatly simplifies the code. In this chapter, you will learn how to use them to build some useful applications such as simulating a survey engine. Chapter 7, Power Collections, introduces several generic algorithms in PowerCollections and some handy generic containers. This collection API came from Wintellect (www. wintellect.com) at the time when .NET Generics was not big and had some very useful collections. However, now .NET Generics has grown to support all those types and even more. So that makes most of the containers defined in PowerCollections outdated. However, there are a lot of good general purpose generic algorithms that you will need but which are missing from the .NET Generics API. That's the reason this chapter is included. In this chapter, you will see how these generic algorithms can be used with any generic container seamlessly. Chapter 8, C5 Collections, introduces the C5 API. If you come from a Java background and are wondering where your hash and tree-based data structures, are this is the chapter to turn to. However, from the usage perspective, all the containers available in C5 can be augmented with generic containers available in the .NET Framework. You are free to use them. This API is also home to several great generic algorithms that make life a lot easier. In this chapter, you will walk through the different collections and algorithms that C5 offers. Chapter 9, Patterns, Practices, and Performance, covers some best practices when dealing with Generics and introduces the benchmarking strategy. In this chapter, we will use benchmarking code to see how different generic containers perform and then declare a winner in that field. For example, benchmarking shows that if you need a set, then HashSet in the .NET Framework is the fastest you can get. Appendix A, Performance Cheat Sheet, is a cheat sheet with all the performance measures for all containers. Keeping this handy would be extremely useful when you want to decide which container to use for the job at hand. Preface [ 3 ] Appendix B, Migration Cheat Sheet, will show you how to migrate code from STL/JCF/ PowerCollections/.NET 1.0 to the latest .NET Framework-compliant code. Migration will never be easier. Using this cheat sheet, it will be a no brainer. This is great for seasoned C++, Java, or .NET developers who are looking for a quick reference to .NET Generics in the latest framework. What you need for this book You will need the following software to use this book: ‹‹ Visual Studio 2010 (any version will do, I have used the Ultimate Trial version) ‹‹ LINQPad Instructions to download this software are given in the respective chapters where they are introduced. Who this book is for This book is for you, if you want to know what .NET Generics is all about and how it can help solve real-world problems. It is assumed that readers are familiar with C# program constructs such as variable declaration, looping, branching, and so on. No prior knowledge in .NET Generics or generic programming is required. This book also offers handy migration tips from other generic APIs available in other languages, such as STL in C++ or JCF in Java. So if you are trying to migrate your code to the .NET platform from any of these, then this book will be helpful. Last but not the least, this book ends with generic patterns, best practices, and performance analysis for several generic containers. So, if you are an architect or senior software engineer and have to define coding standards, this will be very handy as a showcase of proofs to your design decisions. Conventions In this book, you will find several headings appearing frequently. To give clear instructions of how to complete a procedure or task, we use: Preface [ 4 ] Time for action – heading 1. Ac tion 1 2. Ac tion 2 3. Ac tion 3 Instructions often need some extra explanation so that they make sense, so they are followed with: What just happened? This heading explains the working of tasks or instructions that you have just completed. You will also find some other learning aids in the book, including: Pop quiz – heading These are short, multiple choice questions intended to help you test your own understanding. Have a go hero – heading These set practical challenges and give you ideas for experimenting with what you have learned. You will also find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning. Code words in text are shown as follows: "Suppose, I want to maintain a list of my students, then we can do that by using ArrayList to store a list of such Student objects." A block of code is set as follows: private T[] Sort(T[] inputArray) { //Sort input array in-place //and return the sorted array return inputArray; } Preface [ 5 ] When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold: Enumerable.Range(1, 100).Reverse().ToList() .ForEach(n => nums.AddLast(n)); Any command-line input or output is written as follows: Argument 1: cannot convert from 'int[]' to 'float[]' New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes for example, appear in the text like this: "Then go to the File menu to create a console project." Warnings or important notes appear in a box like this. Tips and tricks appear like this. Reader feedback Feedback from our readers is always welcome. Let us know what you think about this book— what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of. To send us general feedback, simply send an e-mail to feedback@packtpub.com, and mention the book title through the subject of your message. If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors. Customer support Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase. D o wnload from Wow! eBook D o wnload from Wow! eBook Preface [ 6 ] Downloading the example code You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you. Errata Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code— we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/support, selecting your book, clicking on the errata submission form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website, or added to any list of existing errata, under the Errata section of that title. Piracy Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy. Please contact us at copyright@packtpub.com with a link to the suspected pirated material. We appreciate your help in protecting our authors, and our ability to bring you valuable content. Questions You can contact us at questions@packtpub.com if you are having a problem with any aspect of the book, and we will do our best to address it. 1Why Generics? Thanks for picking up the book! This means you care for Generics. This is similar to dropping a plastic bag in favor of our lonely planet. We are living in an interesting era, where more and more applications are data driven. To store these different kinds of data, we need several data structures. Although the actual piece of data is different, that doesn't always necessarily mean that the type of data is different. For example, consider the following situations: Let's say, we have to write an application to pull in tweets and Facebook wall updates for given user IDs. Although these two result sets will have different features, they can be stored in a similar list of items. The list is a generic list that can be programmed to store items of a given type, at compile time, to ensure type safety. This is also known as parametric polymorphism. A cat and a dog shouldn't share a bed. Neither should integers and floats. Why Generics? [ 8 ] In this introductory chapter, I shall give you a few reasons why Generics is important. An analogy Here is an interesting analogy. Assume that there is a model hand pattern: If we fill the pattern with clay, we get a clay-modeled hand. If we fill it with bronze, we get a hand model replica made of bronze. Although the material in these two hand models are very different, they share the same pattern (or they were created using the same algorithm, if you would agree to that term, in a broader sense). Reason 1: Generics can save you a lot of typing Extrapolating the algorithm part, let's say we have to implement some sorting algorithm; however, data types can vary for the input. To solve this, you can use overloading, as follows: //Overloaded sort methods private int[] Sort(int[] inputArray) { //Sort input array in-place //and return the sorted array return inputArray; } private float[] Sort(float[] inputArray) { //Sort input array in-place //and return the sorted array return inputArray; } Chapter 1 [ 9 ] However, you have to write the same code for all numeric data types supported by .NET. That's bad. Wouldn't it be cool if the compiler could somehow be instructed at compile time to yield the right version for the given data type at runtime? That's what Generics is about. Instead of writing the same method for all data types, you can create one single method with a symbolic data type. This will instruct the compiler to yield a specific code for the specific data type at runtime, as follows: private T[] Sort(T[] inputArray) { //Sort input array in-place //and return the sorted array return inputArray; } T is short for Type. If you replace T with anything, it will still compile; because it's the symbolic name for the generic type that will get replaced with a real type in the .NET type system at runtime. So once we have this method, we can call it as follows: int[] inputArray = { 1, 2, 0, 3 }; inputArray = Sort(inputArray); However, if you hover your mouse pointer right after the first brace ((), you can see in the tooltip, the expected type is already int[], as shown in the following screenshot: That's the beauty of Generics. As we had mentioned int inside < and >, the compiler now knows for sure that it should expect only an int[] as the argument to the Sort () method. However, if you change int to float, you will see that the expectation of the compiler also changes. It then expects a float[] as the argument, as shown: Why Generics? [ 10 ] Now if you think you can fool the compiler by passing an integer array while it is asking for a float, you are wrong. That's blocked by compiler-time type checking. If you try something similar to the following: You will get the following compiler error: Argument 1: cannot convert from 'int[]' to 'float[]' This means that Generics ensures strong type safety and is an integral part of the .NET framework, which is type safe. Reason 2: Generics can save you type safety woes, big time The previous example was about a sorting algorithm that doesn't change with data type. There are other things that become easier while dealing with Generics. There are broadly two types of operations that can be performed on a list of elements: 1. Loc ation centric operations 2. Da ta centric operations Adding some elements at the front and deleting elements at an index are a couple of examples of location-centric operations on a list of data. In such operations, the user doesn't need to know about the data. It's just some memory manipulation at best. However, if the request is to delete every odd number from a list of integers, then that's a data-centric operation. To be able to successfully process this request, the method has to know how to determine whether an integer is odd or not. This might sound trivial for an integer; however, the point is the logic of determining whether an element is a candidate for deletion or not, is not readily known to the compiler. It has to be delegated. Before Generics appeared in .NET 2.0, people were using (and unfortunately these are still in heavy use) non-generic collections that are capable of storing a list of objects. As an object sits at the top of the hierarchy in the .NET object model, this opens floodgates. If such a list exists and is exposed, people can put in just about anything in that list and the compiler won't complain a bit, because to the compiler everything is fine as they are all objects. Chapter 1 [ 11 ] So, if a loosely typed collection such as ArrayList is used to store objects of type T, then for any data-centric operation, these must be down-casted to T again. Now, if somehow an entry that is not T, is put into the list, then this down-casting will result in an exception at runtime. Suppose, I want to maintain a list of my students, then we can do that by using ArrayList to store a list of such Student objects: class Student { public char Grade { get; set; } public int Roll { get; set; } public string Name { get; set; } } //List of students ArrayList studentList = new ArrayList(); Student newStudent = new Student(); newStudent.Name = "Dorothy"; newStudent.Roll = 1; newStudent.Grade = 'A'; studentList.Add(newStudent); newStudent = new Student(); newStudent.Name = "Sam"; newStudent.Roll = 2; newStudent.Grade ='B'; studentList.Add(newStudent); foreach (Object s in studentList) { //Type-casting. If s is anything other than a student Why Generics? [ 12 ] //or a derived class, this line will throw an exception. //This is a data centric operation. Student currentStudent = (Student)s; Console.WriteLine("Roll # " + currentStudent.Roll + " " + currentStudent.Name + " Scored a " + curr entStudent.Grade); } What's the problem with this approach? All this might look kind of okay, because we have been taking great care not to put anything else in the list other than Student objects. So, while we de-reference them after boxing, we don't see any problem. However, as the ArrayList can take any object as the argument, we could, by mistake, write something similar to the following: studentList.Add("Generics"); //Fooling the compiler As ArrayList is a loosely typed collection, it doesn't ensure compile-time type checking. So, this code won't generate any compile-time warning, and eventually it will throw the following exception at runtime when we try to de-reference this, to put in a Student object. Then, it will throw an InvalidCastException: What the exception in the preceding screenshot actually tells us is that Generics is a string and it can't cast that to Student, for the obvious reason that the compiler has no clue how to convert a string to a Student object. Unfortunately, this only gets noticed by the compiler during runtime. With Generics, we can catch this sort of error early on at compile time. Chapter 1 [ 13 ] Following is the generic code to maintain that list: //Creating a generic list of type "Student". //This is a strongly-typed-collection of type "Student". //So nothing, except Student or derived class objects from Student //can be put in this list myStudents List myStudents = new List(); //Adding a couple of students to the list Student newStudent = new Student(); newStudent.Name = "Dorothy"; newStudent.Roll = 1; newStudent.Grade = 'A'; myStudents.Add(newStudent); newStudent = new Student(); newStudent.Name = "Sam"; newStudent.Roll = 2; newStudent.Grade = 'B'; myStudents.Add(newStudent); //Looping through the list of students foreach (Student currentStudent in myStudents) { //There is no need to type cast. Because compiler //already knows that everything inside this list //is a Student. Console.WriteLine("Roll # " + currentStudent.Roll + " " + currentStudent.Name + " Scored a " + currentStudent.Grade); } The reasons mentioned earlier are the basic benefits of Generics. Also with Generics, language features such as LINQ and completely new languages such as F# came into existence. So, this is important. I hope you are convinced that Generics is a great programming tool and you are ready to learn it. Why Generics? [ 14 ] Reason 3: Generics leads to faster code In the .NET Framework, everything is an object so it's okay to throw in anything to the non- generic loosely typed collection such as ArrayList, as shown in the previous example. This means we have to box (up-cast to object for storing things in the Arraylist; this process is implicit) and unbox (down-cast the object to the desired object type). This leads to slower code. Here is the result of an experiment. I created two lists, one ArrayList and one List to store integers: And following is the data that drove the preceding graph: ArrayList List 1323 185 1303 169 1327 172 1340 169 1302 172 The previous table mentions the total time taken in milliseconds to add 10,000,000 elements to the list. Clearly, generic collection is about seven times faster. Chapter 1 [ 15 ] Reason 4: Generics is now ubiquitous in the .NET ecosystem Look around. If you care to develop any non-trivial application, you are better off using some of the APIs built for the specific job at hand. Most of the APIs available rely heavily on strong typing and they achieve this through Generics. We shall discuss some of these APIs (LINQ, PowerCollections, C5) that are being predominantly used by the .NET community in this book. So far, I have been giving you reasons to learn Generics. At this point, I am sure, you are ready to experiment with .NET Generics. Please check out the instructions in the next section to install the necessary software if you don't have it already. Setting up the environment If you are already running any 2010 version of Visual Studio that lets you create C# windows and console projects, you don't have to do anything and you can skip this section. You can download and install the Visual Studio Trial from http://www.microsoft.com/ download/en/details.aspx?displaylang=en&id=12752. Once you are done, you should see the following in your program menu: After this, start the program highlighted in the preceding screenshot Microsoft Visual Studio 2010. Why Generics? [ 16 ] Then go to the File menu to create a console project: Now, once that is created, make sure the following namespaces are available: If these are available, you have done the right setup. Congratulations! Chapter 1 [ 17 ] Summary My objective for this chapter was to make sure you get why Generics is important. Following are the points again in bullets: ‹‹ It ensures compile-time type checking, so type safety is ensured. ‹‹ It can yield the right code for the data type thrown at it at runtime, thus saving us a lot of typing. ‹‹ It is very fast (about seven times) compared to its non-generic cousins for value types. ‹‹ It is everywhere in the .NET ecosystem. API/framework developers trust the element of least surprise and they know people are familiar with Generics and their syntax. So they try to make sure their APIs also seem familiar to the users. In the end, we did an initial setup of the environment; so we are ready to build and run applications using .NET Generics. From the next chapter, we shall learn about .NET Generic containers and classes. In the next chapter, we shall discuss the Generic container List that will let you store any type of data in a type safe way. Now that you know that's important, let's go there. 2Lists Lists are everywhere, starting from the laundry list and grocery list to the checklist on your smartphone's task manager. There are two types of lists. Simple lists, which just store some items disregarding the order allowing duplicates; and other types which don't allow duplicates. These second types of lists which don't allow duplicates are called sets in mathematics. The other broad category of list is associative list, where each element in some list gets mapped to another element in another list. In this chapter, we shall learn about these generic containers and related methods. We shall learn when to use which list-based container depending on the situation. Reading this chapter and following the exercises you will learn the following: ‹‹ Why bother learning about generic lists? ‹‹ Types of generic lists and how to use them Don't forget the milk, honey! Lists [ 20 ] ‹‹ Give proper rationale for choosing a generic list such as one container over the other ‹‹ How to create custom generic list-based containers So let's get on with it... Why bother learning about generic lists? Let us see why generic lists are important: ‹‹ It ensures compile-time type checking. So type-unsafe code becomes a thing of the past. ‹‹ It saves us a lot of typing, because compiler can generate type-safe code for generic types at runtime. So the same code works for different data types. ‹‹ It is faster in many cases than its non-generic cousins, because it removes the need to box and unbox (also known as type-casting) from the code. ‹‹ It makes source code much cleaner than its older, non-generic, unsafe cousins. In this chapter, you shall find examples of how generic list-based containers make these possible. Happy listing! Types of generic lists There are mainly three types of lists you can imagine or would ever need: 1. Simple list: It allows duplicates by default. 2. Se ts: It doesn't allow duplicates by default. 3. Associa tive list: It helps to represent a relationship between two types of entities. A simple list is just a collection of items. Say a laundry list, a grocery list, a task list, a list of courses. The elements in a simple container don't associate them with any other element. There are several generic container classes to represent simple lists. They are Stack, Queue, List, LinkedList, and SortedSet. These are generic implementations of classic data structures. So Stack is a Last In First Out (LIFO) list, Queue is a First In First Out (FIFO) list, List is a simple unordered list, and so is LinkedList. But List supports random integer indexing natively, while the others don't. Sets are simple lists that don't allow duplicates. All sets implement the interface ISet. There are two set implementations HashSet and SortedSet. HashSet doesn't maintain the order of elements, while SortedSet keeps the elements sorted, if it knows how to sort the type. For custom types, you can pass a custom IComparer instance to delegate the task of comparing. For HashSet, you can pass IEqualityComparer for custom equality comparison. Chapter 2 [ 21 ] An associative list, as the name suggests, brings in an association between two different entities and represents them in a single list for example, marks obtained by a student in a different semester of a course, share values for a company over the years, and so on. To represent these type of lists, there is an interface IDictionary and SortedList is a derived class that implements this interface. So we can use SortedList to represent any associative list in .NET Generics. SortedList is also a list. This is implemented using an array. So unlike other IDictionary implementations, SortedList supports indexing over keys and values. So the order of insertion in a SortedList determines the order of elements; unlike other IDictionary implementations, as we shall see in the next chapter. If you were wondering what the heck are TKey and TValue, they are short and quite traditional ways of representing the type of key and type of value respectively. As generic lists can support any type, they are represented in this way. So at runtime the compiler generates appropriate code for the specified data type. This greatly helps to reduce code duplication and type mismatch issues. In a nutshell: Stack This is a generic Stack implementation. This is first in last out. Queue This is a generic Queue implementation. This is a first in first out list. List This is a generic implementation of a simple list. This offers native random zero based integer indexing like an array. LinkedList This is a generic implementation of a classic double-linked list. No native random indexing is offered. HashSet This is a generic Set implementation. The elements are not sorted. For custom equality comparison, you can pass the IEqualityComparer instance. SortedSet This is a generic Set implementation. The value type elements remain sorted, by default. For reference types, we have to pass custom-comparer objects for custom element sorting; you can use the IComparer instance. SortedList Sometimes you want an associative container, but you still want to be able to access each key-value pair, like you do in a plain old array using the index. Then all you need is a SortedList. Lists [ 22 ] Now that we have some basic idea about these generic list based containers, we shall see how these can be used to solve some real-world problems. Let's get started with a fun app! A generic collection can contain objects of value and reference types. This means you can create nested types. For example, if you want a list of stacks, you can get that by declaring List>. If you want a SortedList of integer keys and associate a list of string values with each such integer key, you can write SortedList>. Checking whether a sequence is a palindrome or not All the following sequences share a common attribute. They all read the same both ways, save for the punctuations "Madam I'm Adam" "A Man, A Plan, A Canal, Panama!!" 12321 Detroit-Chicago-Detroit (This reads same word by word backwards, unlike others that read the same character by character.) A Toyota These types of sequences are called palindromes. A sequence is palindromic too, if the contents read the same both ways, for example, A-B-A. In this example, we shall see how a generic stack (that is, a Stack instance) can be used to implement this, so that we can detect whether a given sequence is a palindrome or not for any given type that knows how to compare itself (that is, implements the IComparable interface). So the task at hand is to build a generic palindrome checker that can check whether any sequence is a palindrome or not. All we have to do is let it know the type of the sequence at runtime. The algorithm to find whether a sequence is a palindrome or not is really simple. If we arrange the tasks to be done in order to check whether the given sequence is a palindrome or not, it will be like: Chapter 2 [ 23 ] Pick items from Input Put them in reverse order Retrieve elements from Input and Reversed Sequence Compare these two Equals Sequence Ends ? The sequence is not palindrome The sequence is palindrome No No Yes Yes The sequence can be of any type; char, string, integer, float, and so on to name a few. Also, note that the algorithm is not different for all these different types. However, we shall have to write several methods for supporting several data types. However, if we use Generics we can avoid this code duplication for each type. We can write a single method (also known as Generic Method) that can write the data type-specific code at runtime depending on what type of data type we want. The tradition is to use the letter T (which is short for Type) to create the generic method. When we consume this method, we pass concrete Types that conform to the standard expectations from the Type; like in this case, we expect the Type to implement IComparable. The heart of the solution is to reverse the elements of the input sequence. This can be done by a stack. Since we want to make the method generic, we need a generic stack. Lists [ 24 ] Time for action – creating the generic stack as the buffer Follow the given steps: 1. Create a console application project. Call it Palindrome Checker. 2. Cr eate a method called IsPalindromic() with the following signature: public static bool IsPalindromic(IEnumerable inputSequence) where T:IComparable { Stack buffer = new Stack(); return true; } 3. Add the f ollowing code in the Main() method: static void Main(string[] args) { bool status = IsPalindromic(new int[] { 1, 2, 2, 1 }); } 4. T ry to compile the program. What just happened? You will see a compiler error as shown in the following screenshot: Let's see why that compile-time error occurred. Before that, let's learn a little more about generic stacks. Creating a generic stack is very easy. Here is how: Stack myGenericStack = new Stack(); So if you want a generic stack to be used to store a stack of string, then you change T with string as follows: Stack myCDs = new Stack(); Chapter 2 [ 25 ] If you want a stack of integers, you just say: Stack opCodes = new Stack(); If you are dealing with reference types, then you can declare a stack of base types and put derived types in that. Say if D is a class that implements an ID interface, then if you create a stack of ID objects, you can put ID objects in the Stack as follows: Stack somethings = new Stack(); Somethings.Push(new ID()); However, the method needs to know how to compare the elements in the buffer. So now the obvious question you might have is, if we write the code using a generic place holder (T in this case) which gets replaced by the original data type at runtime, how will the method know at compile time how to compare variables of such data type? Well, that's a very good question and the answer lies in the very first line of the method. Let's take a close look at it: public static bool IsPalindromic(IEnumerable inputSequence) where T:IComparable The keyword where is used to constrain the types that can be used as a valid argument in the generic method IsPalindromic(), we just created. where T:IComparable means T has to be a type that implements the IComparable interface. This syntax is known as Constraints in Generics. Now, let's get back to the compilation error part. Generics ensure compile-time type checking as follows: bool status = IsPalindromic(new int[] { 1, 2, 2, 1 }); In this code, IsPalindromic promises that everything that follows in the argument to this method will be of type character. However, we have an integer array as the input sequence. This is causing a type mismatch and thus is caught at compile time. We are passing an array as the parameter, where the method expects something of type IEnumerable. This is possible because IEnumerable inherits from IEnumerable and good old arrays implement IEnumerable too. Thus, IEnumerable serves as a bridge between pre-generic collections and generic collections. Lists [ 26 ] Time for action – completing the rest of the method Follow the given steps: 1. Add the f ollowing code after the initialization of the buffer in the source, as carried out previously in step 2: foreach (T element in inputSequence) { buffer.Push(element); } Count() and ElementAt() are extension methods and are available at System. Linq namespace in .NET Framework 3.5 and above. You can get it with Visual Studio 2008 and above. So please include that namespace: for (int i = 0; i < inputSequence.Count(); i++) { if(buffer.Pop().CompareTo(inputSequence.ElementAt(i))==0) continue; else return false; } Make sure you retain the last return true; statement. 2. R eplace the content of the Main() method with the following: static void Main(string[] args) { //Checking if a string (Which is basically a sequence of //characters) bool status1 = IsPalindromic("LIRIL"); //Checking if an array of string is palindromic bool status2 = IsPalindromic(new string[] {"mango","apple","mango"}); //Checking if an array of int is palindromic bool status3 = IsPalindromic(new int[] { 1,2,3,4,5,6,0,6,5,4,3,2,1 }); //Checking if an array of float is palindromic bool status4 = IsPalindromic(new float[] { 1.34F,2.34F,43.1F}); Console.WriteLine(status1 + " " + status2 + " " + status3 + " " + status4); } 3. Run the program. Chapter 2 [ 27 ] What just happened? The program shall print the following to the console: The following looping construct: foreach (T element in inputSequence) { buffer.Push(element); } pushes all the elements in the input sequence to a stack. The stack is a last in first out list. It keeps the elements in reverse order. So if you want to add an element to the top of the stack, you should use the push method. So when the method is called by bool status1 = IsPalindromic("GENERI CS"); it stores the characters of the input sequence in the internal generic stack buffer as follows: G E N E R I C S Input Sequence G E N E R I C Stack buffer push() buffer.Pop() returns the top-most element from the stack and removes that from the stack. The ElementAt() method returns the ith element from the input sequence. As long as these two are equal, the sequence might be a palindrome. So, we must continue going through the input sequence. However, if they mismatch at any point, it becomes evident immediately that the input sequence is not a palindrome. Thus, we can return False. However, if we can get to the end of the input sequence without encountering a mismatch, the input sequence is a palindrome and thus we have to return True. D o wnload from Wow! eBook D o wnload from Wow! eBook Lists [ 28 ] Let's now try to decode the output. Among four input sequences in the Main() method, the first three are palindromes. However, the last one doesn't read the same both ways and surely isn't a palindrome. Output matches that. We get True for the first three input sequences and False for the last one. Any type that implements IComparable would work. Other types will fail. Designing a generic anagram finder Generics are not limited to lists and you can even mix parameterized types with specialized types as well. For instance, what if you want to store the frequency of occurrence of a string, number, or a symbol. While the type of what you keep the frequency of is unknown or undecided, however you see that the frequency itself is always a number. Let's take a look at how we can implement that using anagrams as an example. Anagrams are as fascinating as palindromes. Two sequences are said to be anagrams of each another, when they share the same number of elements in the same frequency. For example, two strings "The eyes" and "They see" both have the same number of characters, and the frequency of each character in these two strings is the same. Both of them have a single "T", a couple of "e"s, and so on. Thus, they are anagrams of each other. http://www.anagrammy.com/ organizes a competition for the best anagrams each month. Assume your friend wants to participate and is now seeking your help to develop a program that can find anagrams of a given word. Let's get into action and help your buddy. A strategy to check whether two sequences are anagrams of each other or not is really simple. We need a container that would let us store each unique element of the sequence and their frequency of appearance in the sequence. For example, the string "The eyes" will be stored in such a container as follows: Element Frequency T H E Y S 1 1 3 1 1 Chapter 2 [ 29 ] If another string stored in the same way shares the same frequency of characters, then that will be an anagram of "The eyes". So, if we deduct 1 from the frequency as soon as we get the same character in the second string; we shall have all zeros in the frequency column of the preceding table at the end of scanning the second string; in this case, we are dealing with two anagrams. This algorithm is applicable to any type of anagram sequence, not only for strings. So, we can write a generic method to check whether two sequences are anagrams of each other or not, for any type. Time for action – creating the method Follow the given steps: 1. Cr eate a console application called Ganagram (short form for generic anagram). This is a showcase of SortedList, solving a real problem. 2. Add the f ollowing method in the Program.cs file: public static bool IsAnagram(IEnumerable first, IEnumerable second) { SortedList val = new SortedList<(); foreach (T element in first) { if (!val.ContainsKey(element)) val.Add(element, 1); else val[element]++; } foreach (T element in second) { if (val.ContainsKey(element)) val[element]--; else return false; } foreach (T element in val.Keys) { if (val[element] != 0) return false; Lists [ 30 ] } return true; } 3. Call this method from the Main() method of the console application as follows: static void Main(string[] args) { bool status = IsAnagram("dream", "armed"); bool status2 = IsAnagram("yesterday today tomorrow". Split(' '), "tomorrow today yesterday".Split (' ')); bool status3 = IsAnagram(new int[]{1,2,3}, new int[]{ 3,4,1}); Console.WriteLine(status + " " + status2 + " " + status3); } 4. Compile and execute the program. What just happened? As you execute the program, you will see the following output: "dream" and "armed" are anagrams of each other so the first one returned True. So are the sentences where the words "Yesterday", "Today", and "Tomorrow" appear in different locations. However, the last sequence of integers are not anagrams of each other because the second one has a four and the first one doesn't. That's why the generic method returns False for that one. Ok. So the method works well. Let's see how it works. SortedList means a sorted list where the data type of the key is T and that of the value is an integer, there is no duplicate entry, and the entries in the SortedList are sorted. SortedList can be used to store key-value pairs as long as the keys are unique. This is the condition imposed by IDictionary and thus has to be followed by all IDictionary implementations. A sequence can be expressed as a list of key-value pairs as shown earlier. Keys are the individual elements of the sequence and the value for each key is the frequency of that particular element in the sequence. Chapter 2 [ 31 ] At runtime, when we pass two strings, the compiler generates code for SortedList. In the next loop, we are adding all the elements from the first sequence to this SortedList. The ContainsKey() method returns true if the key is present in the sorted list. Otherwise it returns false. Warning! As the SortedList can only store key-value pairs with unique keys, it is good practice to check for the existence of the key before trying to add it. If the key is already present, it will throw the following exception: The following screenshot shows the exception that occurs when we want to add a duplicate key: val.Add(element, 1); adds the element for the first time, when it is not present in the SortedList. However, from the second time onwards the frequency of the element is increased by the code snippet val[element]++;. The SortedList class implements the IDictionary interface. So there is a concrete implementation for SortedList of the interface indexer: TValue this[TKey key] { get; set; } Lists [ 32 ] This is why the corresponding value can be obtained by passing the key in square brackets. So in the statement val[element]++; val[element] refers to the corresponding frequency of the key, whose value is stored in the element. Thus, val[element]++ means increasing the corresponding element by unity. In the third loop, we are iterating over the second sequence. If the sequence contains an element not present in the SortedList already, we don't need to look any further. They can't be anagrams. However, if the current element of the second sequence is present in the SortedList as a key, then we must decrease the frequency by unity; and continue to do so as long as we don't hit the end of the second sequence or encounter an element that is not present as a key in the SortedList already. The last loop checks whether all the values of the SortedList are zero or not. If they are, both the sequences are anagrams to each other, otherwise they aren't. We return the status accordingly. Have a go hero – use this method to find anagrams from a dictionary Now that we have the method that can find whether two given sequences are palindromes to each other or not, can you write a program that uses this method and finds all the anagrams of the given word? You can use any dictionary you want. You can verify your program by several word anagram duos listed at http://sudipta.posterous.com/ found-these-word-beauties-on-the-taxi-ride-ho. Life is full of priorities, let's bring some order there As gadget addicts, there are so many gadgets we desire to buy, but often our budget is limiting. So, let's keep a prioritized list of gadgets we hope to buy. Priority queues come to our help here. Let's see how we can create our own implementation of the generic priority queue. .NET doesn't offer a built-in generic priority queue. As there can be more than one gadget that you want with the same zest, so the shopping list can have more than one element with the same priority. We shall represent the priority queue as a SortedList, where the priority will be represented by any type that can be compared for equality and the values will be represented by a list of values of type TValue. Chapter 2 [ 33 ] Time for action – creating the data structure for the prioritized shopping list Follow the given steps: 1. Cr eate a console application. 2. Add a class called PriorityQueue to the project. Change the signature of the class header as follows: public class PriorityQueue where Tkey : IComparable 3. Th e heart of this PriorityQueue is a SortedList. I want to call it innerDS, add this private variable as follows: private SortedList> innerDS; public PriorityQueue() { innerDS = new SortedList>(); } 4. Go t o the Main() method of the console app and add the following code: PriorityQueue gadgets = new PriorityQueue(); 5. T ry to compile. You will see that it compiles. It will run too. But don't expect any visible output yet. What just happened? When you run the console application, you will see nothing happens. But in the background the compiler created a type-safe copy of the generic priority queue, where the priorities are of type integer and the values are of type string. Now, go and change the priority queue declaration as follows: PriorityQueue set = new PriorityQueue(); and try to compile. You will get the following compile-time error: There is no implicit reference conversion from 'System.TimeZone' to 'System.IComparable'. Lists [ 34 ] This will happen because the TimeZone class doesn't implement the Compare() method of the IComparable interface. Time for action – let's add some gadgets to the list and see them Follow the given steps: 1. Add the f ollowing code, in the same PriorityQueue class: /// /// Adds a new entry to the priority queue /// /// priority of the item. /// the item itself. public void Enque(Tkey key, TValue value) { if (!innerDS.ContainsKey(key))//O(log n) { List vals = new List(); vals.Add(value); innerDS.Add(key, vals); } else if (!innerDS[key].Contains(value)) innerDS[key].Add(value); } 2. Add this method in the PriorityQueue class as follows: /// /// Gets elements of given priority /// /// Given priority /// A list of elements that has this priority public IEnumerable< TValue > GetElemenets(Tkey priority) { try { return innerDS[priority]; } catch (KeyNotFoundException ex) Chapter 2 [ 35 ] { //return empty list. So the caller wouldn't crash. return new List(); } } 3. Go t o the Main() method and replace the existing content with the following: PriorityQueue gadgetShoppingList = new PriorityQueue(); gadgetShoppingList.Enque(1, "iPhone"); gadgetShoppingList.Enque(2, "iPad"); gadgetShoppingList.Enque(1, "MacBook"); gadgetShoppingList.Enque(3, "Amazon Kindle"); var gadgetsIWantMost = gadgetShoppingList.GetElemenets(1); foreach (string gadgetName in gadgetsIWantMost) Console.WriteLine(gadgetName); Console.ReadLine(); What just happened? This will print the following output. Can you explain why? Now, let's see what caused this. The first line in the Main() method: PriorityQueue gadgetShoppingList = new PriorityQueue(); creates a specific copy of a generic priority queue, where the priorities are of type integer, and data types of objects to be stored as values are set to string. I wanted to assign integer priorities to the gadgets I want to buy. This call internally replaces the TKey with int and TValue with string at runtime, in the generic implementation of the PriorityQueue defined previously. Lists [ 36 ] Note that the internal data structure is a sorted list, where each key of the list maps to a list of values depicted by List. So, with the four calls to the Enque() method, the internal SortedList will look as follows: 1 2 3 iPhone iPad Amazon Kindle MacBook In the Enque() method, the elements are entered, if the priority attached to the element is not already present in the innerDS.Keys. However, users might get over-enthusiastic and add the same element twice or more. That will result in duplicate entries in the prioritized list. In order to avoid that innerDS[key].Contains(value) is used to check whether this gadget is already there or not. Now that we know how the elements made their way to the inner list, let's see what happens when we try to retrieve the elements for a given priority. As described earlier, the value for a corresponding key from a SortedList is found by indexing over the key. So in the method GetElements() innerDS[priority]; shall return all the gadgets that were added with the given priority. However, if no element exists with that priority, it will throw a KeyNotFoundException. In such a case, we shall return a blank list of TValue to match the method signature. In order to validate this point, do a little experiment. Add the following code in the Main() method: Console.WriteLine(gadgetShoppingList.GetElemenets(4).Count()); This will print "0" to the console as there is no element in the ShoppingList with priority 4. Chapter 2 [ 37 ] Tips! If we get an exception of type KeyNotFoundException and the caller method is expecting a list of items, it is best practice to return an empty list so that the caller method doesn't have to do anything special with the return value. Now suppose you have enough money to buy a few of the top gadgets you wanted, it's time to strike them off the ShoppingList. Let's see how. Time for action – let's strike off the gadgets with top-most priority after we have bought them Follow the given steps: 1. Add the f ollowing method in the PriorityQueue class: /// /// Removes the first element with top priority /// public void Deque()() { if (innerDS.Count > 0) { int i; for (i = 0; i < innerDS.Count; i++) if (innerDS[innerDS.Keys[i]].Count > 0) break; try { innerDS[innerDS.Keys[i]].RemoveAt(0); } catch (KeyNotFoundException ex) { return; } } } Lists [ 38 ] 2. Add the following property in the PriorityQueue class: /// /// This is a port between other functionalities that are omitted. /// This returns a Queue implementation of the Prioritized Queue. /// public Queue OriginalQueue { get { Queue actual = new Queue(); foreach (Tkey key in innerDS.Keys) { foreach (TValue value in innerDS[key]) { actual.Enqueue(value); } } return actual; } } This OriginalQueue, read-only property, flattens the SortedList instance and arranges all the elements in the PriorityQueue maintaining the order using a Queue instance. 3. Add the f ollowing property in the PriorityQueue class: /// /// The first element with top priority /// public TValue First { get { try { int i; for (i = 0; i < innerDS.Count; i++) if (innerDS[innerDS.Keys[i]].Count > 0) break; return innerDS[innerDS.Keys[i]][0]; } Chapter 2 [ 39 ] catch (Exception ex) { return default(TValue); } } } 4. Add the f ollowing code in the Main() method: static void Main(string[] args) { PriorityQueue gadgetShoppingList = new PriorityQueue(); gadgetShoppingList.Enque(1, "iPhone"); gadgetShoppingList.Enque(2, "iPad"); gadgetShoppingList.Enque(1, "MacBook"); gadgetShoppingList.Enque(3, "Amazon Kindle"); //Returning the first element from the prioritized Shopping List string gadgetIWantTheMost = gadgetShoppingList.First; Console.WriteLine("Buying {0}",gadgetIWantTheMost ); Console.WriteLine("Bought " + gadgetIWantTheMost); //Bought the first gadget. Let's strike it off the list. gadgetShoppingList.Deque(); //Let's see what else I can buy if I get some more dollars ;) Console.WriteLine("Need more cash to buy "); Queue gadgets = gadgetShoppingList. OriginalQueue; foreach (string gadgetName in gadgets) Console.WriteLine(gadgetName); } 5. Compile and run the program. Lists [ 40 ] What just happened? This will print the following output to the console: Let's see how we got there. The property First returns the value of the first element of the PriorityQueue or in this case the gadget I want most. innerDS[innerDS.Keys[i]].count returns the list of gadgets, where the priority is the ith key of the inner SortedList. So first in the list of gadgets that has at least one element are the gadgets of top-most priority. The first item (which is located at index zero of such a list) is the item with top-most priority and entered in the ShoppingList first. So, technically that's the first element with top-most priority. This is referenced by innerDS[innerDS. Keys[i]][0]. The method Deque() just removes this first element from the priority queue. The property OriginalQueue returns a queue of gadgets in order of their priority. Using this generic PriorityQueue, we can deal with other priorities of our life too. Say we want to schedule our meetings and appointments. This is how we can do it. Time for action – let's create an appointment list Follow the given steps: 1. Add the f ollowing code to the Main() method: //Creating a priority queue to store the appointments. PriorityQueue myAppointments = new PriorityQueue(); //Adding appointments Chapter 2 [ 41 ] myAppointments.Enque(DateTime.Today.AddDays(1), "Dental checkup @ Doctor Smiley :)"); myAppointments.Enque(DateTime.Today.AddDays(3), "Weekly grocery shopping"); myAppointments.Enque(DateTime.Today.AddDays(1), "Trip to Art Gallery."); myAppointments.Enque(DateTime.Today.AddDays(2), "Son's football tournament at school"); //Listing all appointments as per they are scheduled foreach (string task in myAppointments.OriginalQueue) Console.WriteLine(task); 2. Compile and run the program. What just happened? This will print the following output: Note that the tasks are scheduled as per the date. Live sorting and statistics for online bidding Imagine you work for an online bidding company such as eBay. Your company deals in several product lines. On the company website, live bidding prices for all the products have to be displayed. In this example, we shall see how LinkedList can be used to offer such functionalities. Lists [ 42 ] There are a lot of hits on the server per second, so sorting the data at regular intervals is not an option. You have to come up with a structure such that the bidding prices get inserted in a list in the sorted order as the user submits them from the site console. This is what I like to call live sorting. Moreover, as two or more people can quote the same bid amount, so the list where you store the bidding values must allow duplicates. Your boss is interested to know the following: ‹‹ What are the actual bid amounts submitted for any product? ‹‹ What is the range of bid amounts (minimum and maximum) for any product? ‹‹ What is/are the common bidding amounts across different product lines? Besides this, your boss would be happy to see whether you can design a system to capture the demographics of the bidders. This will help your company distribute targeted advertisements. For example, which newspaper is mostly read by people who bid for smartphones and MP3 players? Thankfully, .NET can help and we can build a couple of generic structures to solve this problem. First, we shall build a structure that will help us solve the live sorting challenge. Later, we shall use this and some other generic .NET classes to solve the remaining demographics issues. Time for action – let's create a custom class for live sorting Follow the given steps: 1. Cr eate a class library project. Call it LiveSortedListAPI. 2. Ch ange the name of the class from Class1.cs to LiveSortedList.cs. 3. Ch ange the header of the class as follows: public class LiveSortedList where T:IComparable 4. Add the f ollowing private variables and properties: LinkedList innerDS; LinkedList innerDSDesc; LinkedList actualValues; public List LiveSortedValues { get { return new List(innerDS); } Chapter 2 [ 43 ] } public List LiveSortedValuesDesc { get { return new List(innerDSDesc); } } public List ActualValues { get { return new List(actualValues); } } 5. Add the c onstructor as follows: public LiveSortedList() { innerDS = new LinkedList(); innerDSDesc = new LinkedList(); actualValues = new LinkedList(); } /// Add the following method to add elements in the Live Sorted /// List /// /// Adds a new element in the LiveSortedList /// /// The value to be added. public void Add(T val) { //Adding to the actual list actualValues.AddLast(val); //If there is no item in the list, //This is the first item in the list if (innerDS.Count == 0) { //Adding the first item to the ascending values list innerDS.AddFirst(val); //Adding the first item to the descending values list Lists [ 44 ] innerDSDesc.AddFirst(val); } else { //If the current value is less than first value //in the ascending list if (innerDS.First.Value.CompareTo(val) >= 0) { //Add this as the first item in the ascending values list innerDS.AddFirst(val); //Add this as the last item in the descending values list innerDSDesc.AddLast(val); //return the control return; } //If the current value is more than the last value //in the ascending list if (innerDS.Last.Value.CompareTo(val) <= 0) { //Add this as the last item in the ascending values list innerDS.AddLast(val); //Add this as the first item in the descending values list innerDSDesc.AddFirst(val); //return the control return; } //For all other range of values of the current value else { T temp = default(T); //Iterate over the current ascending collection to //find the proper place of insertion. foreach (T t in innerDS) { if (t.CompareTo(val) > 0) { //Temp is the value Just greater than the current value temp = t; break; } } //Add this value before temp in the ascending list innerDS.AddBefore(innerDS.Find(temp), val); Chapter 2 [ 45 ] //Add this value after temp in the descendng list innerDSDesc.AddAfter(innerDSDesc.Find(temp), val); } } } 6. Add a console application project to this solution, where you created the class library. 7. In the Main() method of the console application, add the following code: LiveSortedList bids = new LiveSortedList(); //Say, Adding bidding amount for a iPod bids.Add(12.50); bids.Add(11.50); bids.Add(11.32); bids.Add(13.35); bids.Add(11.50); bids.Add(12.60); bids.Add(18.40); bids.Add(19.50); bids.Add(11.65); foreach (double bidAmount in bids.LiveSortedValues) Console.WriteLine(bidAmount.ToString("$00.00")); 8. Compile and run the program. Don't forget to set the console application as the start-up project. Lists [ 46 ] What just happened? As you execute the program, you will see the following output: LinkedList is a generic linked list that stores elements of type T in LinkedListNode. A linked list offers several methods to insert nodes in several parts of the list. The names of these methods are very straightforward. AddFirst() lets you add an element at the start of the linked list. AddLast() lets you add an element at the end of the linked list. AddBefore() lets you add an element before a particular node in the linked list, whereas AddAfter() lets you add an element after a particular node in the linked list. The properties First and Last represent the first and the last node of a linked list. First. Value returns the value of the first node. Last.Value returns the value of the last node. These nodes are of type LinkedListNode. The Add() method of LiveSortedList class works very simply. If the element to be added is less than the first value, then we have to add it as the first element. If, however, the element to be added is more than the last value, then we have to add it as the last element. However, if the element's value is somewhere in between, we need to find the LinkedListNode whose value is just greater than the element's value. The Find() method of the LinkedList class returns the LinkedListNode that has the value passed as an argument of Find(). So in the call innerDS. AddBefore(innerDS.Find(temp), val); innerDS.Find(temp) returns the LinkedListNode whose value is temp. So this call will insert val before temp in the LinkedList innerDS. D o wnload from Wow! eBook D o wnload from Wow! eBook Chapter 2 [ 47 ] Why did we have three LinkedList as part of the data structure? 1. One f or keeping the bid values/elements to be added as they appear (in order). 2. One f or keeping the bid values/elements in ascending order. 3. One f or keeping the bid values/elements in descending order. Pop quiz 1. Fill in the blank to insert 3 after 1. LinkedList numbers = new LinkedList(); numbers.AddLast(1); numbers.AddLast(11); numbers.AddAfter(___________); An attempt to answer questions asked by your boss For now, assume that your company deals with bidding on iPad, iPhone, MacBook, and iPod. So you shall have to keep an eye on the bidding prices of these products. This means we would need a way of associating product name/ID to the live sorted bid amount values. SortedList as discussed earlier, fits the bill perfectly. Let's see how. Time for action – associating products with live sorted bid amounts Follow the given steps: 1. St ay in the class library project, where we defined LiveSortedList. Add a different class. Call it MultiLiveSortedList.cs. 2. Ch ange the class header as follows: public class MultiLiveSortedList where TKey:IComparable where TValue:IComparable Lists [ 48 ] 3. Add the f ollowing variable and the constructor: SortedList> innerDS; public MultiLiveSortedList() { innerDS = new SortedList>(); } 4. Add the f ollowing method to the class to add elements in the list: public void Add(TKey key, IEnumerable values) { if (!innerDS.ContainsKey(key)) { innerDS.Add(key, new LiveSortedList()); foreach (TValue val in values) innerDS[key].Add(val); } else { foreach (TValue val in values) innerDS[key].Add(val); } } 5. Add the f ollowing method to return bid values for a particular product: public List BidAmountsFor(TKey key) { try { return innerDS[key].LiveSortedValues; } catch { throw new KeyNotFoundException("No such product exist"); } } 6. Create a console app and attach the reference of this class library to that. Add the following code in the Main() method: MultiLiveSortedList categorizedBidValues = new MultiLiveSortedList(); //Adding live sorted bid amounts for each product Chapter 2 [ 49 ] categorizedBidValues.Add("iPod", new List(){12.50,11.50,11.32,13.35,11.50}); categorizedBidValues.Add("iPad", new List() { 212.50, 211.50, 211.32, 213.35, 213.50 }); categorizedBidValues.Add("MacBook", new List() { 212.50, 211.50, 300, 223, 320 }); categorizedBidValues.Add("iPhone", new List() { 333, 242, 302, 301.40, 310 }); //Finding Bid Amounts for the product "iPad" List BidsForIPad = categorizedBidValues. BidAmountsFor("iPad"); //Maximum Bid Amount for "iPad" double MaxBidForIPad = BidsForIPad[BidsForIPad.Count - 1]; //Minimum Bid Amount for "iPad" double MinBidForIPad = BidsForIPad[0]; Console.WriteLine("iPad bid amounts are between " + MinBidForIPad. ToString ("$.00") + " and " + MaxBidForIPad.ToString ("$.00")); 7. Compile and run the program. What just happened? As you execute the program, the following output will be printed to the console: The new class created is basically a SortedList in disguise where the TValue will be a LiveSortedList. So, you saw how two different generic collections are used together to solve a problem. LiveSortedList ensures that the bid amount values being added are always in a sorted order and the SortedList class ensures that these LiveSortedLists are kept indexed by the product name. innerDS[key], in this case, returns the LiveSortedList for the given key. innerDS[key].LiveSortedValues, as shown earlier, returns a list of sorted TValue values. Lists [ 50 ] Now is the time to take a look at the other questions your boss had. How to find common bidding amounts. That's conceptually different. That's actually a set theory problem. We have a list of values representing the bidding amount for several products. Now we need to find common values among these lists. This can be solved using an intersection of these two lists of values. .NET makes breathing easy with the introduction of the SortedSet class in the System.Collections.Generics namespace. Let's see how we can use this. Time for action – finding common values across different bidding amount lists Follow the given steps: 1. St ay in the MultiLiveSortedList class. Add the following methods: public SortedSet CommonAcross { get { SortedSet com; com = new SortedSet(innerDS.Values[0]. LiveSortedValues); for (int i = 1; i < innerDS.Count; i++) com.IntersectWith (new SortedSet(innerDS.Values[i].LiveSortedValues)); return com; } } public SortedSet CommonAmong(TKey key1, TKey key2) { SortedSet com; com = new SortedSet(innerDS[key1].LiveSortedValues); com.IntersectWith(new SortedSet(innerDS[key2]. LiveSortedValues)); return com; } Chapter 2 [ 51 ] 2. Go to the console app previously created. Add the following snippet after the previous code in the Main() method. Then compile and run the program: SortedSet mostCommonBiddingAmounts = categorizedBidValues.CommonAcross; SortedSet commonBidAmountForIPadAndMacBook = categorizedBidValues.CommonAmong("iPad","MacBook"); Console.WriteLine("Common Bid amounts for iPad and MacBook are "); foreach (double commonBidAmount in commonBidAmountForIPadAndMacBook) Console.WriteLine(commonBidAmount.ToString("$.00")); What just happened? As you execute the program, you will see the following output: SortedSet has several methods. The IntersectWith() method accepts another SortedSet as an argument and modifies the caller SortedSet object content with the intersection result. So this is an in-place operation. SortedSet has an overloaded constructor that can take an IEnumerable instance and create a sorted set out of it. It removes the duplicates as a set can't have duplicates. In the method CommonAmong(),innerDS[key1] returns a LiveSortedList instance. Thus, innerDS[key1].LiveSortedValues is a List instance that implements IEnumerable. So, this is a valid argument to the SortedSet constructor. Have a go hero – finding common demographic statistics for the bidders Using these new generic container classes, design a system to capture vital statistics about the demographics of the bidders. For example, you can store the product name/ID and the newspaper bidders as the values in an MultiLiveSortedList instance, as shown in the preceding example. Lists [ 52 ] You will win every scrabble game from now on I was playing the following word-making game with one of my friends. I lost a couple of times: http://www.eastoftheweb.com/games/index.php?p=games/multieight/1 So, I decided to write a program that will help me find all the words from the characters of a given word. This task is not very easy. But with Generics, this becomes lightweight. In this example, we shall create a console app, where the computer finds all the possible words from the T9 dictionary that uses a subset of the alphabet used in the given word. The following is a sample run: So, you see how it works. Let's see how we can build it. Time for action – creating the method to find the character histogram of a word Follow the given step: 1. Cr eate a console application and add the following method in the program: /// /// Finds the histogram of the characters of the word. /// /// The word for which the histogram of /// characters has to be found. Chapter 2 [ 53 ] /// Histogram of charactres private static SortedList GetWordHistogram(string word) { SortedList wordHistogram = new SortedList(); foreach (char c in word.ToArray()) { if (!wordHistogram.ContainsKey(c)) wordHistogram.Add(c, 1); else wordHistogram[c]++; } return wordHistogram; } Time for action – checking whether a word can be formed Follow the given step: 1. Add the f ollowing method in the Program.cs file: /// /// A word can be formed using a set of characters, if one of /// their histogram is a /// subset of the other in terms that they share some of the /// characters and /// occurrence of characters is not more than in the other one. /// /// The first character histogram /// The second histogram /// private static bool CanIFormAWord(SortedList firstOne, SortedList secondOne) { int count = 0; bool status = false; foreach (char c in firstOne.Keys) { //In order to be a subset all the characters //in this first SortedList has to be there //in the second one. if (secondOne.ContainsKey(c)) { Lists [ 54 ] //The frequency of these characters should not be exceeding //that of the other one. if (secondOne[c] >= firstOne[c]) { status = true; continue; } else { status = false; break; } } else { status = false; break; } } return status; } Time for action – let's see whether it works Follow the given step: 1. Add the f ollowing code snippet in the Main() method: static void Main(string[] args) { //Save the T9.txt file in the applications bin/debug directory. StreamReader sr = new StreamReader("T9.txt"); Console.WriteLine("Enter the word or phrase"); string word = Console.ReadLine(); //Stores the histogram of the current given word SortedList soughtWordHistogram = GetWordHistogram(word); //Stores histogram of all words in the SortedList SortedList> allWordHistogram = new SortedList>(); Chapter 2 [ 55 ] string line = string.Empty; while ((line = sr.ReadLine()) != null) { foreach (string w in line.Split(' ')) { string x = w.Trim(); if(x.Length > 0) if(!allWordHistogram.ContainsKey(x)) allWordHistogram.Add(x, GetWordHistogram(w)); } } sr.Close(); foreach (string w in allWordHistogram.Keys) { //Checks If the current word can be formed using the letters //of the given word if (CanIFormAWord(allWordHistogram[w], soughtWordHistogram)) { //If yes, print that Console.WriteLine(count.ToString() + ". " + w); count++; //print 20 words or less per page if (count % 20 == 0) { Console.WriteLine(); Console.WriteLine("---More--"); Console.WriteLine("---Hit to proceed --"); Console.ReadLine(); } } } //Wait for a final keystroke Console.ReadLine(); } Lists [ 56 ] Pop quiz 1. What will be the value of x in the following code snippet: SortedSet set = new SortedSet(new List() { 1, 2, 3, 2 }); int x = set.Count; a. 1 b. 2 c. 3 d. 4 2. Wha t will be the content of set1 after these following calls: SortedSet set1 = new SortedSet(new List() { 1, 2, 3, 2 }); SortedSet set2 = new SortedSet(new List() { 2,4,6 }); set1.IntersectWith(set2); Have a go hero – explain the code! Can you explain how the program works? Everything used in this example is discussed earlier in another example app in the chapter. You can download the T9 dictionary from the book's website. Also, as an extension, can you use whatever you learned in this chapter to create a system that offers predictive text suggestions as the user types. Trying to fix an appointment with a doctor? The last time I went to see a doctor, I wished I could have seen a specialist the same day. We can build a system that can answer very important questions (such as the ones shown in the following list) that patients might have: ‹‹ When is doctor A available? ‹‹ When are both doctor A and doctor B there? ‹‹ Does doctor A work on all the days that doctor B does and vice versa? ‹‹ Which are the days when either of the doctors is available? ‹‹ When either of these two doctors available, but not both? ‹‹ Which doctor is available over the weekends? Chapter 2 [ 57 ] ‹‹ Whether doctor A is available on the given date? ‹‹ Who is more available, doctor A or doctor B? ‹‹ When is doctor B available save the weekends? Time for action – creating a set of dates of the doctors' availability Follow the given steps: 1. Cr eate a console application. Add the following variables: HashSet AvailabilityOfDoctorA = new HashSet() { DateTime.Today, DateTime.Today.AddDays(-1), DateTime.Today.AddDays(1) }; HashSet AvailabilityOfDoctorB = new HashSet() { DateTime.Today.AddDays(-3), DateTime.Today.AddDays(1), DateTime.Today.AddDays(1) }; 2. Add the following code in the Main() method: Console.WriteLine("Doctor A is avilable on these days "); Console.WriteLine(); foreach (DateTime d in AvailabilityOfDoctorA) Console.WriteLine(d.ToLongDateString()); 3. Ex ecute the program. Lists [ 58 ] What just happened? You will see a similar output: We are looping through all the dates in the HashTable that holds the dates Doctor A is available. That's not very interesting. But let's see how we can do a lot more with sets. You see here in this list Sunday appeared before Saturday. If you want to keep your entries sorted, then please resort to a sorted set implementation such as SortedSet. Time for action – finding out when both doctors shall be present Follow the given steps: 1. Go back t o the Main() method and change its content as follows: Console.WriteLine("Doctor A and Doctor B both shall be available on "); Console.WriteLine(); HashSet commonDates = AvailabilityOfDoctorA; commonDates.IntersectWith(AvailabilityOfDoctorB); foreach (DateTime d in commonDates) Console.WriteLine(d.ToLongDateString()); 2. Ex ecute this new program. What just happened? You will see the following output: Chapter 2 [ 59 ] Before we dig deep to see what caused the output, please note that every time you run this snippet, you shall see different results because the example uses DateTime.Today that will change every second if you are right on the 11th hour of a day. The IntersectWith() method returns the set representing the intersection of two sets. Or in other words, sets with common elements in two sets. IntersectWith() is a method that modifies the caller instance of the ISet implementation; in place. So, after the call to IntersectWith(); commonDates will contain only this: Tuesday, May 10, 2011 entry. So, if you don't want to modify the HashSet in place, try copying that to a temporary variable as I did in this case, in the commonDates HashSet instance. All the methods of ISet that end with a With are in place. So be careful while calling them. If you want to use a method that returns a new set without modifying the caller set instance; you have to use the extension method Intersect available on the System. Linq namespace. We shall learn more about extension methods in Chapter 4, LINQ to Objects. For now it would be enough to know that native ISet implementations are in place, while the extension methods are not. Pop quiz 1. Wha t will be the content of Set A after the following calls: HashSet A = new HashSet() { "a", "b", "c", "d" }; HashSet B = new HashSet() { "d","e" }; A.IntersectWith(B); 2. Wha t will be the content of Set B after the following calls: HashSet A = new HashSet() { "a", "b", "c", "d" }; HashSet B = new HashSet() { "d","e" }; B.UnionWith(A); Lists [ 60 ] Revisiting the anagram problem We solved the anagram problem earlier, so let's take a second look. An anagram duo results in the same sorted collection. For example, consider the following two strings: "The Eyes" "They See" When sorted and any special character (including whitespace) is removed; these two strings result in the following string: "eeehsty" or in other words that can be written as follows: "e3h1s1t1y1" where the digits next to each character represents the frequency of that character in the phrase. Time for action – re-creating the anagram finder Follow the given steps: 1. Cr eate a class library project. Call it GenEx. 2. Add the f ollowing class there: /// /// A Set that allows multiple copies of the same element. /// It keeps a count of how many elements are present. /// /// public class MultiSet { private SortedList innerDS; /// /// Creating a multiset instance from an IEnumerable of T /// /// The input from which we want to /// create the multiset instance. public MultiSet(IEnumerable input) { Chapter 2 [ 61 ] innerDS = new SortedList(); foreach (T item in input) { if (!innerDS.ContainsKey(item)) { innerDS.Add(item, 1); } else innerDS[item]++; } } /// /// The flattend value combining all the elements. /// public IEnumerable Value { get { List vals = new List(); foreach (T item in innerDS.Keys) { for (int i = 0; i < innerDS[item]; i++) vals.Add(item); } return vals; } } } 3. Add a c onsole application to this project. Name it GenExTest. Add a reference of the class library project to this one. 4. Add the f ollowing to the Main() method: static void Main(string[] args) { MultiSet firstPhrase = new MultiSet ("theeyes".ToCharArray()); MultiSet secondPhrase = new MultiSet ("theysee".ToCharArray()); bool isAnagram = firstPhrase.Value.SequenceEquals (secondPhrase.Value); Lists [ 62 ] Display("The Eyes", "They See", isAnagram); MultiSet firstSentence = new MultiSet ("nay nada nay".Split(' ')); MultiSet secondSentence = new MultiSet ("nada nay nay".Split(' ')); isAnagram = false; isAnagram = firstSentence.Value.SequenceEquals (secondSentence.Value); Display("nay nada nay", "nada nay nay", isAnagram); Console.ReadKey(); } 5. Add the f ollowing Display() method: private static void Display(string a, string b, bool isAnagram) { Console.WriteLine("\"{0}\" and \"{1}\" are {2}", a, b, isAnagram == true ? "Anagrams" : "Not Anagrams"); } 6. Compile and run the program. What just happened? As you execute the program, you will see the following output: Now, let's see how we got here. The constructor of the MultiSet class creates an element histogram. The public read-only property Value flattens the histogram by returning an enumerable of type T. Chapter 2 [ 63 ] For example, the entry "the eyes" histogram will look like this: Element Frequency e 3 h 1 s 1 t 1 y 1 And the Value property will be eeehsty. So, we get these sorted versions of the input from two sources in the Value property and then compare those using the SequenceEquals() extension method. Warning! The extension method SequenceEquals() and the native method SetEquals() expects inputs to be in order, otherwise they return false. It is kind of misleading for SetEquals(), because ideally SetEquals() should check whether two sets contain the same elements or not, disregarding the order of elements. If you are familiar with multiset in C++, then you shall be able to co-relate with this structure we have just built. Lists [ 64 ] Lists under the hood So far in this chapter, we have learnt about many generic containers. This is the perfect time to know how they fit in to the entire ecosystem of .NET Generics. Different list-based implementations implement different interfaces. These relationships are shown as follows: Generic collection Interfaces it implements What's the significance? Stack IEnumerable ICollection IEnumerable Stack is a generic collection that has to be enumerable. It is also a collection, so any collection can be converted to Stack. For that ICollection is implemented. Queue IEnumerable ICollection IEnumerable From compiler's standpoint Stack and Queue are identical. Only their concrete implementation for these interface methods make them distinct from one another. List IList ICollection IEnumerable IList ICollection IEnumerable It is a list. So IList is implemented. It is also a generic list so IList is implemented. Everything else is common, because it is also a collection. Because of IList and IList it can provide random indexing over the contents. LinkedList ICollection IEnumerable ICollection IEnumerable ISerializable IDeserializationCallback This is a generic implementation of a classic LinkedList data structure. HashSet ISerializable IDeserializationCallback ISet ICollection IEnumerable IEnumerable This is a Hash-based set implementation. The fastest of its kind. As with mathematical sets, no duplicates are allowed. Elements are not sorted. Chapter 2 [ 65 ] Generic collection Interfaces it implements What's the significance? SortedSet ISet ICollection IEnumerable ICollection IEnumerable ISerializable IDeserializationCallback This is a tree-based set implementation. Entries remain sorted for value types, for reference the default comparer implementation is used. SortedList IDictionary ICollection> IEnumerable> IDictionary ICollection IEnumerable This is a list as well as a dictionary that offers indexing over its contents and values can be obtained by key indexing. Summary We learned a lot in this chapter about .NET Generics' list-based containers. We have learnt which generic container to use depending on the task at hand. The main idea of this chapter was not to be yet another API reference; but to learn how to use these generic list-based containers to solve real-world problems. Although some of the applications built in the chapter might look trivial, they share the same pattern of several other problems in other domains. Some concepts such as constraints and extension methods are used, but not explained here. Constraints shall be discussed in the appendix and extension methods in Chapter 4, LINQ to Objects. In this chapter, we have learnt about a list-based associative container SortedList. However, that's not much use for performance issues. In the next chapter, we shall discuss tree-based IDictionary implementations that are much faster than SortedList. Download from Wow! eBook Download from Wow! eBook 3Dictionaries In the last chapter, we have learnt about different generic list-based containers. Those are great for maintaining simple lists. However, they are not the best for storing associative relationships among different entities. For example: ‹‹ You might want to know how many electronic gadgets are available from amazon.com that are below $20 ‹‹ A football fanatic friend of mine wants to keep a count of how many times Real Madrid won over other teams in recent times ‹‹ Your friend at the local electronics store dreams about a system that can auto-complete electronic part names These are some of the examples you can think of, where you would need a mechanism to associate one variable with another. The fun factor is that all these situations can be represented conceptually using the same structure—the same structure where some fields will be accessible very quickly using some other unique variable. This structure is known as Dictionary in C#. Using Generics, you can build a common structure that can be used in different applications. "To one" or "To many" that's the question. Dictionaries [ 68 ] Reading this chapter and following the exercises, you shall learn the following: ‹‹ Types of Generic Associative Structures and how to use them ‹‹ How to create your own custom generic associative structure Types of generic associative structures There are two types of associative containers/structures available in .NET Generics. They are as follows: 1. K ey-value pairs 2. Tuples Key-value pairs can be of two types. One type allows duplicate keys. These are represented by a collection of KeyValuePair objects. The other type doesn't allow duplicate keys. These are represented by Dictionary and SortedDictionary. SortedDictionary keeps the keys in sorted order unlike Dictionary, where the keys are not stored in any particular order. Tuples are a new inclusion to the .NET 4.0 Framework and like KeyValuePair they also allow duplicate keys. However, KeyValuePair can be viewed as a special case of a Tuple. A Tuple allows you to store the relationship between n different variables (they may be of different types). However a Key-value pair is what the name suggests. It's a pair. Only one variable is associated with the other. If you need to store a relationship between three or more variables, you need a Tuple. Chapter 3 [ 69 ] Creating a tag cloud generator using dictionary Tag cloud is a fascinating way to visualize content. Tag clouds, generated from text sources, give an impression about the text. It speaks for the text. Here is a tag cloud that I generated pointing my program to Apple's iPad page. It's almost evident from the tag cloud that the Apple iPad will arrive at the store from April and you can pre-order. All this without even rolling our eyeballs over the page: Time for action – creating the word histogram Follow the given steps: 1. Cr eate a console application. Call it TagCloud. 2. Add the f ollowing in the Main() method of Program.cs: Dictionary wordHistogram = new Dictionary(); wordHistogram.Add("Apple",33); wordHistogram.Add("Orange",85); wordHistogram.Add("Strawberry",20); wordHistogram.Add("Watermelon",150); wordHistogram.Add("Guava", 52); wordHistogram.Add("Grape", 80); ShowCloud(wordHistogram); 3. Cop y the Sample Code from the following location and save it as TagCloudData. txt in the C:\ drive: http://visapi-gadgets.googlecode.com/svn/trunk/termcloud/doc. html Dictionaries [ 70 ] 4. R eplace the highlighted part of this file (as shown in the following screenshot) with a place holder !DATA!: Delete the highlighted lines and replace these with !DATA!. So, after replacement it will look something similar to the following screenshot: Chapter 3 [ 71 ] 5. Add the f ollowing method in Program.cs: private static void ShowCloud(Dictionary wordHistogram) { //Creating the data to replace the placeholder StringBuilder cloudBuilder = new StringBuilder(); cloudBuilder.AppendLine("data.addRows(" + wordHistogram.Keys. Count.ToString() + ")"); int i = 0; foreach (string word in wordHistogram.Keys) { cloudBuilder.AppendLine("data.setValue(" + i.ToString() + ",0,'" + word + "');"); cloudBuilder.AppendLine("data.setValue(" + i.ToString() + ",1," + wordHistogram[word] + ");"); i++; } //Replacing the placeholder //If you can't put the file in C:\ drive. Put the file //anywhere else where you have access StreamReader sr = new StreamReader("C:\\TagCloudData.txt"); string total = sr.ReadToEnd().Replace("!DATA!", cloudBuilder.ToString()); sr.Close(); //Writing the content to a temporary local file StreamWriter sw = new StreamWriter("C:\\Cloud.html"); sw.Write(total); sw.Close(); //Showing the generated cloud. System.Diagnostics.Process.Start("C:\\Cloud.html"); } 6. Compile and run the application. You will need to be connected to the internet while using this application as it uses the Google Visualization API. Dictionaries [ 72 ] What just happened? As you execute the program, it will launch your default browser and will show the following tag cloud: Watermelon is the biggest tag, because it has the highest frequency assigned (150). A word histogram is nothing but a tabular view of a tag cloud. Every word has a weight. In the real world, the weight can be the frequency of occurrence in a text. The declaration: Dictionary wordHistogram = new Dictionary(); creates a dictionary, where the keys are of type string and the value is of type integer. This dictionary is used to store the histogram of the words. As you see, Strawberry has the lowest weight and Watermelon has the largest. So, they appear smallest and largest in the tag cloud respectively. We are using the Google Visualization API for generating the tag cloud. So, we somehow need to replace the hardcoded values from the sample HTML file. That's being done in the method ShowCloud(). All the keys can be obtained by the Keys property of the Dictionary class. It returns a KeyCollection object. Dictionary—similar to SortedList in the previous chapter— implements the IDictionary interface and thus, we can find the value associated with a key by indexing, using the key. Thus, wordHistogram[word] will return the weight associated with the word in the dictionary wordHistogram. Have a go hero We have now created a console application to generate tag clouds from a single source. Can you extend the program, such that it can generate tag clouds from several text sources? Hint: You can use a dictionary of dictionaries as follows: Dictionary> The first string in the preceding dictionary is the type of key of the outer dictionary which represents the source file paths, and the inner dictionary stores the frequency of each word appearing in those files; or in other words histogram for those files. Chapter 3 [ 73 ] Pop quiz 1. Which s tructure would you use when mapping country versus capital? a. Dictionary b. Dictionary> 2. Which s tructure would you use when mapping country versus state-wise per-capita income? Say, we represent per-capita income in decimal. a. Dictionary b. Dictionary> c. Dictionary> 3. How would you refer to the stock price of MSFT in the month of June 2011, if they are stored like this in a dictionary: Dictionary> stockPrices = new Dictionary>(); stockPrices.Add("MSFT",new Dictionary()); stockPrices["MSFT"].Add("May2011",24.22);//Hypothetical stock price. stockPrices["MSFT"].Add("June2011",34.22);//Hypothetical stock price. Creating a bubble wrap popper game Sometimes we all get bored and need an easy break from boredom. Bubble wrap popping is one of the most popular games. It doesn't require any brain power, but it is still a lot of fun. I have seen people popping virtual bubbles for several reasons. It acts very well as a stress buster too. My wife loves this game. Dictionaries [ 74 ] This is a paradise for any bubble popper. It launches with a screen full of bubbles. As you click on a bubble it emits a realistic bubble pop sound and changes the look of the pattern such that it looks as if the bubble you just clicked is punctured. So it looks somewhat similar to the following screenshot. I have minimized it: This is a simple game and we can make it using C# Dictionary. Let's get into action. Time for action – creating the game console Follow the given steps: 1. Cr eate a Windows application. 2. Add the f ollowing code in Form1.cs: Dictionary bubbleStatus = new Dictionary(); private void Form1_Load(object sender, EventArgs e) { Chapter 3 [ 75 ] //Lets find the width of the form int totalWidth = this.Width; //Lets find the height of the form int totalHeight = this.Height; //Lets see how many bubbles we can accommodate per row int perRow = totalWidth / 45; //Lets see how many bubbles we can accommodate per column int perCol = totalHeight / 45; //Bubbles will be images on picturebox controls //This is the first bubble. PictureBox pic = new PictureBox(); pic.Name = "pic0"; pic.Cursor = System.Windows.Forms.Cursors.Hand; pic.Width = 45; pic.Height = 45; //Bubbles have to match the background image //of the form for a realistic look pic.BackgroundImage = Image.FromFile("C:\\bubbleBackGround.jpg"); //Loading the normal bubble image pic.ImageLocation = "C:\\bubble.jpg"; //Attaching a click event to handle the click. //When user clicks a bubble it should play the pop sound //and change the image to a popped bubble image to //create a more realistic special effect pic.Click += new EventHandler(pic_Click); //adding the first bubble to the form's control this.Controls.Add(pic); //Remembering where we painted the first bubble on the form Point lastLocation = pic.Location; //Making a copy, we are going to need this Point firstLocation = pic.Location ; //Adding this bubble to the dictionary. //Right now nobody popped it //So the status is false. bubbleStatus.Add(pic.Name, false); for (int r = 1; r <= perCol; r++) { firstLocation = pic.Location; for (int c = 1; c <= perRow; c++) { Dictionaries [ 76 ] //Creating bubbles on the fly pic = new PictureBox(); pic.BackgroundImage = Image.FromFile("C:\\bubbleBackGround.jpg"); pic.Name = "pic" + (r*10+c).ToString (); pic.Cursor = System.Windows.Forms.Cursors.Hand; pic.Width = 45; pic.Height = 45; pic.ImageLocation = "C:\\bubble.jpg"; pic.Click += new EventHandler(pic_Click); //Checking if There is already a bubble //in the dictionary with this ID if (!bubbleStatus.ContainsKey(pic.Name)) bubbleStatus.Add(pic.Name, false); else { //Change the ID arbitrarily. It doesn't //have to be sequential //We are just going to need a way to //access the status //of a bubble given its ID pic.Name = pic.Name + c.ToString(); //add this new bubble to the dictionary bubbleStatus.Add(pic.Name, false); } //Are we at the edge? //If not we can still go on adding on the same column if (c % perRow != 0) pic.Location = new Point(lastLocation.X + pic.Width,lastLocation.Y); else //OOPs! we fell off the edge, //we ran out of space to render any more //bubbles in the current row. So lets go back to the second //row where we started. pic.Location = new Point(firstLocation.X , firstLocation.Y + pic.Height); //add the current bubble to the controls of the form this.Controls.Add(pic); Chapter 3 [ 77 ] //Remember where you added this last bubble. lastLocation = pic.Location; } } } void pic_Click(object sender, EventArgs e) { PictureBox pic = (PictureBox)sender; if (bubbleStatus[pic.Name] == false) { //This bubble is Popped!! bubbleStatus[pic.Name] = true; //Play bubble wrap pop sound System.Media.SoundPlayer p = new System.Media.SoundPlayer("C:\\BubblePop.wav"); p.Play(); //Change the image to give an impression that it //actually popped. pic.ImageLocation = "C:\\bubblePopped.jpg"; } } 3. No w compile and run the program. What just happened? You should get a screen full of bubbles ready to be popped, as shown in the preceding screenshot. Look how easy it was! See, every bubble can either be normal or already popped. So the status of a bubble in the game board is binary. Thus, it is best to describe their status using a Boolean field. Now, how do we identify one bubble from the other? Well, every bubble has to have an ID. As we don't know how many bubbles we shall have, depending on the screen size the number may vary, so generating the ID at runtime seems an obvious choice. Dictionaries [ 78 ] How did we decide we need a dictionary and not a list? We could also have done it using a list. However, if we did it using a list, identifying which bubble has just popped would have been time consuming and difficult as we would have needed a sequential search. Dictionary, on the other hand, offers very fast access as it keeps the entries indexed by the key. In this case, Key is the Name of the picture boxes that show the bubble images. The dictionary: Dictionary bubbleStatus = new Dictionary(); stores the status of each bubble on the board. When a new bubble is created and gets its Name, then that is naturally not popped. At this point, the bubble is added to the dictionary as follows: bubbleStatus.Add(pic.Name, false); A false status means that the bubble has still not been popped. When the bubble is clicked, a check is made to see whether that bubble has already popped or not—by de-referencing the bubble status by its name—using the following code: if (bubbleStatus[pic.Name] == false) However, if this condition evaluates to be true, then the program changes the status of the bubble from false to true or conceptually from normal to popped if you assign the value true to the status using the following code snippet: bubbleStatus[pic.Name] = true; I thought it would be a good idea to see how the dictionary stores the elements in runtime, here is a snapshot of the middle of a game: Check that the first, third, fifth, and sixth bubbles are popped. Check their status from the dictionary. The first entry is highlighted. Chapter 3 [ 79 ] You can see the game in action at http://sudipta.posterous.com/bubble-wrap- popper-in-c. Let's build a generic autocomplete service We live in an interesting time and remembering everything exactly could be very challenging. For example, imagine yourself in the following scenarios: ‹‹ Coding in C# without intelligence support. ‹‹ Trying to find an Integrated Circuit (IC) from the local electronics store for your school project, but you can't remember the exact number of the IC. ‹‹ You work in a drugstore and a customer only knows first few letters of a drug. There are hundreds that start with those and you can't guess. ‹‹ You work for a travel website and customers are expecting intelligent drop-downs, where city names get completed as they type. All these situations are example of autocomplete feature that completes the entry for the user. It is easy to create a generic structure using the generic C# dictionary that can support autocomplete. The objective of the design is that it has to be user driven. Once written, we should be able to support the autocomplete facility for any custom list of values. Time for action – creating a custom dictionary for autocomplete Follow the given steps: 1. Cr eate a class library project. Call it MyDictionary. 2. Rename Class1.cs to MultiDictionary.cs. 3. Modif y the MultiDictionary class header as follows: public class MultiDictionary where Tkey : IComparable 4. Add the f ollowing variable and the constructor: Dictionary> innerDS; public MultiDictionary() { innerDS = new Dictionary>(); } Dictionaries [ 80 ] 5. Add the f ollowing method to add a single entry to the dictionary: public void Add(Tkey key, TValue val) { if (!innerDS.ContainsKey(key)) { List values = new List(); values.Add(val); innerDS.Add(key, values); } else { innerDS[key].Add(val); } } 6. Add the f ollowing method: /// /// Find all the values for a given key /// /// The given key /// Values associated with this key public IEnumerable Values(Tkey key) { List values = new List(); innerDS.TryGetValue(key, out values); return values; } 7. Add the reference of MyDictionary to this console application project and add the following lines to the Program.cs: using MyDictionary; And in the Main() method, add the following code: static void Main(string[] args) { MultiDictionary authorBookMap = new MultiDictionary(); authorBookMap.Add("Sudipta", "Data Structure using C"); authorBookMap.Add("Sudipta", ".NET Generics Beginners Guide"); var booksBySudipta = authorBookMap.Values("Sudipta"); Chapter 3 [ 81 ] foreach (string bookTitle in booksBySudipta) Console.WriteLine(bookTitle); Console.ReadLine(); } 8. Compile and run the console application project. What just happened? As you execute the program, you will see the following output: Data Structure using C .NET Generics Beginners Guide You might be thinking what's the point in discussing this in the autocomplete app? The reason is, for autocomplete we need to be able to map multiple entries to be associated with a single tag. In other words, we should be able to have duplicate keys in the structure. Till now, we have used dictionaries to map one variable with another single variable. That's one-to-one mapping to be precise. However, for autocomplete, we need a one-to-many mapping capability. So, we should be able to map each key of the dictionary with a list of values. That's exactly what I did in this class MultiDictionary. Pictorially MultiDictionary will look similar to the following diagram: Key1 Key2 Value1 Value2 Value3 Value1 Value2 In the Add() method of the MultiDictionary class, it checks whether the Key is already present. If it is already present, we just add the value to the associated list of that Key. However, if it is not present, we create a blank list, add the new value to that list and then associate that list with the given key. However, the consumers of MultiDictionary get an impression that they can have duplicate keys. Let's create an application showing how MultiDictionary can be used to offer an autocomplete feature. Dictionaries [ 82 ] Time for action – creating a class for autocomplete Follow the given steps: 1. St ay in the class library project where we created the MultiDictionary project. Add a class called AutoComplete.cs. 2. Mark the class as public and add using System.IO; in the using directive list: public class AutoComplete 3. Add the f ollowing variables and the constructor: List values = new List(); MultiDictionary autoCompleteMap; string sourceFile; int minimumLength; public AutoComplete(string file, int min) { autoCompleteMap = new MultiDictionary(); sourceFile = file; minimumLength = min; StreamReader reader = new StreamReader(sourceFile); string line = string.Empty; while ((line = reader.ReadLine()) != null) { if (line.Length > minimumLength) autoCompleteMap.Add(line.Substring(0, minimumLength), line); } reader.Close(); } 4. Add the f ollowing method: public List GetSuggestions(string initial) { if (initial.Length == minimumLength) { values.Clear(); List vals = autoCompleteMap.Values(initial).ToList(); values.AddRange(vals); } Chapter 3 [ 83 ] if (initial.Length > minimumLength) { List currentItems = new List(); foreach (object s in values) currentItems.Add(s.ToString()); values.Clear(); foreach (string k in currentItems) if (k.StartsWith(initial)) values.Add(k); } return values; } 5. No w, create a Windows application to test this. Add a textbox (textBox1) as follows: 6. No w, add these two variables and add a reference to MyDictionary, and add the following using directive also: using MyDictionary; ListBox suggestionBox = new ListBox(); AutoComplete suggester; 7. Add c ode for the form_Load event as follows: private void Form1_Load(object sender, EventArgs e) { suggestionBox.Font = textBox1.Font; suggestionBox.DoubleClick+=new EventHandler (suggestionBox_DoubleClick); Dictionaries [ 84 ] suggestionBox.KeyDown += new KeyEventHandler(suggestionBox_KeyDown); suggester = new AutoComplete("UK_Cities.txt", 2); } 8. Add the f ollowing event: private void textBox1_TextChanged(object sender, EventArgs e) { if (textBox1.Text.Length < 2) { suggestionBox.Visible = false; suggestionBox.Items.Clear(); } else { try { List values = suggester. GetSuggestions(textBox1.Text); suggestionBox.Items.Clear(); foreach (string value in values) suggestionBox.Items.Add(value); ShowListBox(); } catch (Exception ex) { return;//Don't do anything. } } } 9. Add the f ollowing method: private void ShowListBox() { suggestionBox.Location = new Point(textBox1.Location.X, textBox1.Location.Y + textBox1.Height); suggestionBox.Width = textBox1.Width; suggestionBox.Height = 10 * textBox1.Height; suggestionBox.Visible = true; Chapter 3 [ 85 ] this.Controls.Add(suggestionBox); suggestionBox.BringToFront(); } 10. Add the f ollowing events: void suggestionBox_KeyDown(object sender, KeyEventArgs e) { if (e.KeyData == Keys.Enter) { textBox1.Text = suggestionBox.Text; suggestionBox.Visible = false; } } private void suggestionBox_DoubleClick(object sender, EventArgs e) { textBox1.Text = suggestionBox.Text; suggestionBox.Visible = false; } private void textBox1_KeyDown(object sender, KeyEventArgs e) { if (e.KeyData == Keys.Down) { suggestionBox.Focus(); if (suggestionBox.Visible) suggestionBox.SelectedIndex = 0; } } 11. Co mpile and run the program. Dictionaries [ 86 ] What just happened? As you execute the program, you will see the following output: As you start typing, after two characters you shall received autocomplete suggestions, as shown in the following screenshot: You can find a video of this output in my blog at: http://sudipta.posterous.com/ auto-complete-feature-demo-video-written-in-c#. Chapter 3 [ 87 ] The code described in steps 9 and 10 is for making the autocomplete feature more useful. As you press the down arrow key, it will set the focus on the listbox so that you can scroll for the right one in the suggestions. If you press Enter while one entry in the listbox is highlighted, that entry will be copied to the textbox and the same will happen if you double-click any item in the listbox. These make the program more real-world ready. The heart and soul of this program is the following code snippet: suggester = new AutoComplete("UK_Cities.txt", 2); This creates an AutoComplete object. This tells the program that it must populate the entries from the file Uk_Cities.txt and it should offer suggestions as soon as the first two characters are typed in the textbox. In the constructor of the AutoComplete class, the following code snippet: autoCompleteMap.Add(line.Substring(0, minimumLength), line); stores the words in a MultiDictionary where Keys of the MultiDictionary are prefixes of length minimumLength. One entry in this MultiDictionary is as follows: The most interesting method is the GetSuggestion() method as it calculates the list of suggestions with every changing few initial letters. If the length of the initial text is more than the minimum length for offering a suggestion, then the already suggested strings are filtered depending on whether it starts with the same pattern as the input string. For example, as soon as you type Gl in the box, you will see Glasgow and Gloucester in the list as they both start with Gl. However, if you keep typing, then the input string of Gla will only match Glasgow and show that in the suggestion box. While we deal with dictionaries, we should try to avoid the biggest possible error that might crash a program at runtime. Let's see how. D o wnload from Wow! eBook D o wnload from Wow! eBook Dictionaries [ 88 ] The most common pitfall. Don't fall there! The most common pitfall is what is known as a hole problem. It means you are trying to access the value for a key from a dictionary which doesn't exist. This throws a KeyNotFoundException. In order to avoid this, you can use the TryGetValue() method that first checks whether the key is present and then tries to de-reference that. Here is a typical use: Dictionary histogram = new Dictionary(); histogram.Add("a", 1); histogram.Add("b", 2); //The following call will throw KeyNotFoundException int x = histogram["c"];int y; //The following call will set y to default value for int, which is 0 histogram.TryGetValue("c", out y); Let's play some piano The piano is one of the closest analogies of a C# Dictionary (at least conceptually) in the real world. Every key of the piano is associated with a particular note. In this example, we shall create a table-top piano with 12 keys (A, B, C, D, E, F, G, B#, C#, E#, F#, G#) as shown in the following screenshot: Chapter 3 [ 89 ] This would have the following functionalities: ‹‹ Users should be able to play it by clicking on the keys ‹‹ Users should be able to play it by pressing keys on the keyboard ‹‹ Users should be able to customize the key settings to their preference ‹‹ It should be able to record a note and play it back ‹‹ It should show labels for each key on demand (as shown in the preceding screenshot) The preceding screenshot might look very fancy, but it is composed with only winform buttons and labels. I have skipped the details for laying out the controls and their event handlers. However, you can find all that from the website including the background image and the .wav files used. Time for action – creating the keys of the piano Follow the given steps: 1. Cr eate a Windows form. Place the controls as shown in this video: http:// sudipta.posterous.com/private/eweuvoFJwb. Eventually, your layout should look like the one shown in the preceding screenshot. As it is a complex layout, I have created the video to help you. 2. Add the f ollowing variables to Form1.cs: bool startRecording = false; This will be set to true when the app starts recording keystrokes. 3. Add the f ollowing field to the Form1.cs: string alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"; Dictionary pianoNotes = new Dictionary(); Dictionary keySettings = new Dictionary(); List> keyPressRecord = new List>(); Dictionaries [ 90 ] 4. Add the f ollowing methods: /// /// Plays the .WAV file associated with the pressed key /// /// The piano key that is pressed private void PlayPiano(string pianoKey) { try { System.Media.SoundPlayer p = new System.Media.SoundPlayer (pianoNotes[pianoKey]); p.Play(); } catch (KeyNotFoundException ex) { return; } } /// /// This method fills the key settings /// with recommended values by yours truly! /// Every time the program starts these values will be loaded. /// So even if the users make any customization, for now, that /// will not be sticky. /// private void FactoryReset() { keySettings.Add("A", 'A'); keySettings.Add("B", 'B'); keySettings.Add("C", 'C'); keySettings.Add("D", 'D'); keySettings.Add("E", 'E'); keySettings.Add("F", 'F'); keySettings.Add("G", 'G'); keySettings.Add("B#", 'Q'); keySettings.Add("C#", 'W'); keySettings.Add("E#", 'R'); keySettings.Add("F#", 'T'); keySettings.Add("G#", 'H'); Chapter 3 [ 91 ] } /// /// Populates the drop down boxes with the recommended keys /// private void PopulateKeys() { cmbA.Text = alphabet[alphabet.IndexOf(keySettings["A"])].ToString(); cmbB.Text = alphabet[alphabet.IndexOf(keySettings["B"])].ToString(); cmbC.Text = alphabet[alphabet.IndexOf(keySettings["C"])].ToString(); cmbD.Text = alphabet[alphabet.IndexOf(keySettings["D"])].ToString(); cmbE.Text = alphabet[alphabet.IndexOf(keySettings["E"])].ToString(); cmbF.Text = alphabet[alphabet.IndexOf(keySettings["F"])].ToString(); cmbG.Text = alphabet[alphabet.IndexOf(keySettings["G"])].ToString(); cmbBSharp.Text = alphabet[alphabet.IndexOf(keySettings["B#"])].ToString(); cmbCSharp.Text = alphabet[alphabet.IndexOf(keySettings["C#"])].ToString(); cmbESharp.Text = alphabet[alphabet.IndexOf(keySettings["E#"])].ToString(); cmbFSharp.Text = alphabet[alphabet.IndexOf(keySettings["F#"])].ToString(); cmbGSharp.Text = alphabet[alphabet.IndexOf(keySettings["G#"])].ToString(); } 5. Add the f ollowing code in Form_Load: private void Form1_Load(object sender, EventArgs e) { //Associating the .WAV files with the keystrokes pianoNotes.Add("A", "A.wav"); pianoNotes.Add("B", "B.wav"); pianoNotes.Add("B#", "B#.wav"); pianoNotes.Add("C", "C.wav"); pianoNotes.Add("C#", "C#.wav"); pianoNotes.Add("D", "D.wav"); pianoNotes.Add("E", "E.wav"); Dictionaries [ 92 ] pianoNotes.Add("E#", "E#.wav"); pianoNotes.Add("F", "F.wav"); pianoNotes.Add("F#", "F#.wav"); pianoNotes.Add("G", "G.wav"); pianoNotes.Add("G#", "G#.wav"); FactoryReset(); PopulateKeys(); } 6. Add event handlers for all the buttons (representing piano keys) as follows: private void btnA_Click(object sender, EventArgs e) { //Showing what key is pressed. lblKeyStroke.Text = "A"; PlayPiano("A"); //If user intended to start recording, //let's remember this key stroke. if (startRecording) keyPressRecord.Add (new KeyValuePair("A", DateTime.Now)); } 7. Th e helper button's name is btnShowLables and add the following event handler for that: private void btnShowLables_Click(object sender, EventArgs e) { if (btnShowLables.Text == "Help") { keyLabelA.Visible = true; keyLabelB.Visible = true; keyLabelC.Visible = true; keyLabelD.Visible = true; keyLabelE.Visible = true; keyLabelF.Visible = true; keyLabelG.Visible = true; keyLabelBSharp.Visible = true; keyLabelCSharp.Visible = true; keyLabelESharp.Visible = true; keyLableFSharp.Visible = true; keyLabelGSharp.Visible = true; grpKeySettings.Visible = true; btnShowLables.Text = "Helping"; } Chapter 3 [ 93 ] else { keyLabelA.Visible = false; keyLabelB.Visible = false; keyLabelC.Visible = false; keyLabelD.Visible = false; keyLabelE.Visible = false; keyLabelF.Visible = false; keyLabelG.Visible = false; keyLabelBSharp.Visible = false; keyLabelCSharp.Visible = false; keyLabelESharp.Visible = false; keyLableFSharp.Visible = false; keyLabelGSharp.Visible = false; grpKeySettings.Visible = false ; btnShowLables.Text = "Help"; } } What just happened? Once you compile and run the application, it will show the piano interface. You can click on each key to play. Now, let's see what our code can do so far. These two dictionaries: Dictionary pianoNotes = new Dictionary(); and Dictionary keySettings = new Dictionary(); are the heart and soul of this application. The first one, pianoNotes, associates a piano key to a given .wav file. So, as you click on that button, that associated media file (.wav) could be played. The second one, maps the keyboard keys to the piano keys. For example, there is no key on the keyboard to directly show "C#". So the key W is mapped to "C#". All these are being done in the FactoryReset() method. As soon as you click on any button, the PlayPiano() method plays the associated media file. This is done using key-based indexing of Dictionaries. For example, if you click on btnA then pianoNotes[pianoKey] will return the value of the key A in the dictionary pianoNotes, which is A.wav. Now, suppose you want to play the "C#" note. Which keyboard key should you click on? The answer is you should click btnW or press W on the keyboard; which is the associated keyboard key for piano key "C#". Dictionaries [ 94 ] This type of single unique key and value association is known as one-to-one mapping. And the type of dictionary we created for autocomplete is called one-to-many mapping. How are we recording the key strokes? Well, we are using List> keyPressRecord = new List>(); for that. As the same piano note can be played many times during a session, we can't use a dictionary to record keystrokes, because the dictionary can't have duplicate keys. So, the solution is a list of KeyValuePair class objects that allows us to store KeyValuePair with the same key. The value type is DateTime, because we need to remember exactly when which key was pressed, if we need to play it back. The duration between each such key press will be determined by finding the difference between the values of consecutive DateTime items. When I played and recorded my keystrokes, for a while it was stored in the list of key-value pairs as follows: So you see, I waited two seconds after I hit the first key A. So while playing it back, we must play the second note (which is B in this case) after two seconds of playing the first note. Let's see how we can do that! Chapter 3 [ 95 ] Time for action – switching on recording and playing recorded keystrokes Follow the given steps: 1. Add the f ollowing method in Form1.cs: /// /// Switch on recording. /// private void btnRecord_Click(object sender, EventArgs e) { if (btnRecord.Text.Equals("Record")) { //let's remember this key stroke. keyPressRecord.Clear(); //We need to start recording. startRecording = true; btnRecord.Text = "Stop"; } else { startRecording = false ; btnRecord.Text = "Record"; } } 2. Add the f ollowing method and event in Form1.cs: private void SleepInBetween(TimeSpan span) { System.Threading.Thread.Sleep(span.Minutes); System.Threading.Thread.Sleep(span.Seconds); System.Threading.Thread.Sleep(span.Milliseconds); } /// /// Plays the recorded keystrokes /// private void btnPlay_Click(object sender, EventArgs e) { try { Dictionaries [ 96 ] int i; TimeSpan span; for (i = 0; i < keyPressRecord.Count - 1; i++) { PlayPiano(keyPressRecord[i].Key); span = keyPressRecord[i + 1].Value .Subtract(keyPressRecord[i].Value); //Lets wait till the timespan between //these two key-strokes are spent. SleepInBetween(span); } span = keyPressRecord[i].Value.Subtract (keyPressRecord[i - 1].Value); SleepInBetween(span); PlayPiano(keyPressRecord[i].Key); } catch (Exception ex) { //Let's go back return; } } How it works? We are iterating over the recorded keystrokes using this loop: for (i = 0; i < keyPressRecord.Count - 1; i++). keyPressRecord[i].Key gives the ith key pressed by the user from the start. keyPressRecord[i + 1].Value.Subtract(keyPressRecord[i].Value); returns the time difference between the two consecutive recorded keystrokes. The program needs to sleep during this time. Once we reach out of this loop, we need to play the last recorded keystroke. You can access the key and value of a KeyValuePair by public property Key and Value. You can see the piano being played at http://sudipta.posterous.com/my-table- top-piano-12-keys. Chapter 3 [ 97 ] C# Dictionaries can help detect cancer. Let's see how! Now that you are aware of the KeyValuePair, let's see another interesting situation where this structure can be used. Pattern recognition is an emerging science. K Nearest Neighbor (popularly known as KNN) is a supervised learning algorithm that can be used in binary classification problems very efficiently. A binary classification problem is just what it states. It is a classification problem. Typically, classification of several entries are previously known and depending on that a new entry has to be classified in either of the two categories. Thus it is called binary. Suppose we have records of several patients who are diagnosed with either malignant (class 'M' cancerous) or benign (class 'B' not cancerous) cases. Now, we gather details from the tests for a new patient. Depending on what we know previously from past patient records, can we classify the new patient's case as either a malignant case or a benign case? That's where binary classification helps in real life. KNN uses a voting mechanism to solve this problem. It considers test results as a vector. If two vectors, representing two patient's test records, are near then they are probably suffering from the same type of case (either benign or malignant). Now this mechanism can lead to confusing results sometimes, if only the nearest vector is considered. So, K nearest vectors are considered. Thus the name KNN. In this example, we shall see how C# Dictionary-based data structures can be used to solve this problem. Time for action – creating the KNN API Follow the given steps: 1. Cr eate a class library project. 2. Add a class to the project. Call it Entry. Add the following code: public class Entry { public string Id { get; set; } public string Tag { get; set; Dictionaries [ 98 ] } public List ParamValues { get; set; } } 3. Add another class to this project. Call it KNN. 4. Add the f ollowing variable and the constructor: private List>> map ; public KNN() { map = new List>>(); } 5. Add the f ollowing methods: /// /// Adds a patient record to the list /// /// public void Add(Entry entry) { map.Add(new KeyValuePair>(entry.Tag, entry.ParamValues)); } /// /// Adds an entry to the list /// /// Class of the new entry /// Patient Records public void Add(string tag, List entries) { map.Add(new KeyValuePair>(tag, entries)); } /// /// Calculates the Euclidean Distance between two entries /// /// Index of the first entry in the /// list Chapter 3 [ 99 ] /// Index of the second entry in the /// list/// /// To know more about Euclidean Distance please refer /// http://en.wikipedia.org/wiki/Euclidean_distance /// private double distance(int index1,int index2) { double sum = 0; for (int i = 0; i < map[index1].Value.Count; i++) { sum += Math.Pow(Math.Round(map[index1].Value[i] – map[index2].Value[i],2),2); } return Math.Round(Math.Sqrt(sum),2); } 6. Add the following method: /// /// Predicts the suspected class of the input data /// /// Values. In this case value of several /// parameters /// The first of the binary classes this data /// set can belong to. In this example, either "B" or "M" /// /// The second of the binary classes this /// data set can belong to. In this example either "B" or "M" /// /// Number of nearest neighbor for we need to /// consider to conclude a class. /// For more information visit http://en.wikipedia.org/ /// wiki/K-nearest_neighbor_algorithm public string Predict(List entries, string class1, string class2, int k) { //Dictionary to keep the entries indexed and sorted by their //distance from the new values SortedDictionary distanceMap = new SortedDictionary(); int count = 0; this.Add("X", entries);//Right now it is unknown Dictionaries [ 100 ] int lastIndex = map.Count - 1; int firstIndex = 0; //Lets start counting at 0 double minDistance = distance(firstIndex, lastIndex); for (int i = firstIndex + 1; i < lastIndex - 1; i++) { double currentDistance = distance(i, lastIndex); if(!distanceMap.ContainsKey(currentDistance)) distanceMap.Add( currentDistance,map[i].Key); } map.RemoveAt(map.Count - 1);//lets remove the last entry Dictionary kVals = new Dictionary(); foreach (double key in distanceMap.Keys) { if (count == k) break; kVals.Add(key, distanceMap[key]); } //Finding Class of the new entry "class1" int class1Count = 0; int class2Count = 0; foreach (double key in kVals.Keys) { if (kVals[key] == class1) class1Count++; else class2Count++; } return class1Count > class2Count ? class1 : class2; } What just happened? You can compile the project but can't run it yet. It doesn't have any data. We shall get to that in a while. Let's see what we created. The class Entry represents a particular patient's record. Records from different tests are obtained and we are using numeric record values. The class KNN implements the KNN algorithm. The inner data structure of the KNN class is: List>>. Chapter 3 [ 101 ] Let's see why. We have a list of patient records, where the patients might be suffering from either of the two cases and there could be multiple patients suffering from the same class of disease. So a plain dictionary is ruled out as it can't support duplicate keys. We can't use a MultiDictionary—described earlier in the chapter—because we would need integer indexing. So, we need a key-value pair list where the key is the class of disease and the value is the list of values of different test results. So pictorially, we are using the structure shown in the following screenshot: 5 4421 5 4 4 11 15 41 42111M B B The Predict() method is doing all the magic. It predicts the possible class of the new patient entry. It uses a sorted dictionary to keep a list of entries in the ascending order of their distance from the new entry: SortedDictionary distanceMap = new SortedDictionary(); This dictionary keeps the class tags and the distance of the patient's record data entries from the newly-added unidentified one, ordered by the distance because tags will be duplicate, however, distance can't be in most instances. And if they are, we don't need to add that entry to the dictionary as it actually is a duplicate entry. Dictionaries [ 102 ] So the entries in this dictionary will look similar to the following: The index on the left is the value for k. So the first entry k=1 is the closest to an entry which is of type M. Once we have this dictionary in the Predict() method, we need to consider only the first k entries of the SortedDictionary distanceMap. If the number of entries among these first k elements having tag value class1 is more than that of entries having tag value class2, then the new unidentified entry will be marked as class1 else it will be marked as class2. That's how the voting works. Time for action – getting the patient records Follow the given steps: 1. Do wnload patient records from the following website: http://archive.ics.uci.edu/ml/machine-learning-databases/ breast-cancer-wisconsin/wdbc.data. If you are interested to know what are these values check out: http://archive. ics.uci.edu/ml/machine-learning-databases/breast-cancer- wisconsin/wdbc.names. Chapter 3 [ 103 ] 2. Cop y about one-third of the entries to another file and delete them from this one. Save the new file as wdbctest.txt and the left over entries in the original file as wdbc.txt. You can get these files from the book website. Time for action – creating the helper class to read a delimited file Follow the given step: 1. St ay in the class library project and add a class called TextReader.cs: public class TextReadHelper { private string fileName;//Delimited source file that has // patient records private int tagIndex; //Index of the class of disease this // patient is suffering private int idIndex;//Index of the ID of the patient in // the delimited file string delim; //delimiter public TextReadHelper(string fileName,int tagIndex, int idIndex,string delim) { //id,tag,values //tag,values this.fileName = fileName; this.tagIndex = tagIndex; this.idIndex = idIndex; this.delim = delim; } /// /// Returns the list of patient records /// public List GetEntries() { //Include System.IO for this line of code StreamReader sr = new StreamReader(fileName); string line = string.Empty; List entries = new List(); while ((line = sr.ReadLine()) != null) { string[] tokens = line.Split(new string[] {delim}, StringSplitOptions.RemoveEmptyEntries); Entry current = new Entry(); Dictionaries [ 104 ] if(idIndex!=-1) current.Id = tokens[idIndex]; current.Tag = tokens[tagIndex]; current.ParamValues = new List(); for (int i = 2; i < tokens.Length; i++) current.ParamValues.Add(Convert.ToDouble(tokens[i])); entries.Add(current); } sr.Close(); return entries; } } What just happened? We wanted to read records of patient data from delimited notepad files. So, we needed a mechanism to be able to read these files and convert them to a list of Entry objects, which represents patient data. With this, we are now ready to put it all together. Time for action – let's see how to use the predictor 1. Cr eate a console application and add a reference of the class library we built earlier. 2. Add the f ollowing using directive at the beginning: using ClassLibrary1; 3. Add the f ollowing code in the Main() method of the console application: static void Main(string[] args) { TextReadHelper helper = new TextReadHelper("C:\\wdata.txt",1,0,","); KNN knnHelper = new KNN(); List entries = helper.GetEntries(); foreach (Entry current in entries) knnHelper.Add(current); int k = 3; Chapter 3 [ 105 ] string tag = knnHelper.Predict(new List() {21.56, 22.39, 142, 1479, 0.111, 0.1159, 0.2439, 0.1389, 0.1726, 0.05623, 1.176, 1.256, 7.673, 158.7, 0.0103, 0.02891, 0.05198, 0.02454,0.01114, 0.004239, 25.45, 26.4, 166.1, 2027, 0.141, 0.2113, 0.4107, 0.2216, 0.206,0.07115 },"M", "B", k); Console.WriteLine("Suspected diagnoses " + tag); Console.ReadLine(); } 4. Compile and run this console application. What just happened? As you execute the program, it will print the following output: Suspected diagnoses M The call to the Predict() method takes only a new patient's test data record, saves the patient ID and tag. k is set to 3. So in this case, we are considering three nearest neighbors for reaching a conclusion about the possible disease class the new patient is suffering from. I thought it would be a good idea to see the final dictionary of distance mapped entries. So, here it is for the call we made from Main(): Tuples are great for many occasions including games Tuples are a new inclusion in the .NET 4.0 framework and they are a perfect fit for many situations, some of them are as follows: ‹‹ To represent a database table row in the code. ‹‹ To replace all those dummy classes that have no methods and are just acting as placeholders for several types of objects. Dictionaries [ 106 ] ‹‹ To bring association between more than a couple of types. Have you ever found yourself using Dictionary of Dictionaries? Then you shall find Tuples really useful. Tuples can be used to refactor a nested branching statement. In this example, we shall create a number rearranging game that I used to play in my childhood. Time for action – putting it all together Follow the given steps: 1. Cr eate a Windows application. Call it TilO. 2. Add nine button controls. Also add one label control and one pictureBox control, as shown in the following screenshot. Add large borders to buttons ("10"), make them flat, and name them as follows: 3. Add the f ollowing variables in the Form1.cs: bool mute = false; int totalMoves = 0; Chapter 3 [ 107 ] List> board = new List>(); Dictionary buttons = new Dictionary(); 4. Add the f ollowing methods to draw the board properly. You can customize the colors: private void DrawBoard() { int emptyIsAt = board.First(t => t.Item3 == 0).Item1; DrawEmpty(emptyIsAt); //Draw Rest for (int i = 0; i < board.Count; i++) { if (board[i].Item3 != 0) { DrawOthers(board[i].Item1, board[i].Item3); } } } private void DrawOthers(int index,int number) { buttons[index].Text = number.ToString(); buttons[index].BackColor = Color.White; } private void DrawEmpty(int index) { buttons[index].BackColor = Color.Black; buttons[index].Text = string.Empty; } 5. Add the f ollowing methods to check whether the tiles the players want to swap can actually move or not. If so, the second method DoMove() handles the movement: private bool CanMove(int first, int second) { if (first != 0 && second != 0) return false; int firstIndex = board.First(tuple => tuple.Item3 == first).Item1; int secondIndex = board.First(tuple => tuple.Item3 == second).Item1; Dictionaries [ 108 ] int absDiff = Math.Abs(firstIndex - secondIndex); //Handle Edge cases if (absDiff == 1 && Math.Min(firstIndex, secondIndex) % 3 == 0) return false; if (absDiff == 1 || absDiff == 3) return true; else return false; } private void DoMove(string moveCommand, bool silent) { string[] moveThese = moveCommand.Split(new char[] { '=' }, StringSplitOptions.RemoveEmptyEntries); //If some player just clicks the Empty box then it will //generate //an invalid command "0". And in this time moveThese will be //of length 1. if (moveThese.Length == 2) { int first = Convert.ToInt16(moveThese[0]); int second = Convert.ToInt16(moveThese[1]); bool CanTheseBeMoved = CanMove(first, second); if (CanTheseBeMoved) { //Move int firstIndex = board.First(t => t.Item3 == first).Item1; int secondIndex = board.First(t => t.Item3 == second).Item1; int expectedFirst = board.First(t => t.Item3 == first).Item2; int expectedSecond = board.First(t => t.Item3 == second).Item2; Tuple newFirstTuple = new Tuple(firstIndex, expectedFirst, second); Tuple newSecondTuple = new Tuple(secondIndex, expectedSecond, first); board.RemoveAt(firstIndex - 1); Chapter 3 [ 109 ] board.Insert(firstIndex - 1, newFirstTuple); board.RemoveAt(secondIndex - 1); board.Insert(secondIndex - 1, newSecondTuple); if (!mute) { System.Media.SoundPlayer player = new System.Media.SoundPlayer("Blip.wav"); player.Play(); } totalMoves++; lblMoves.Text = String.Format("You have made {0} moves so far.", totalMoves); } else { if (!silent && !mute) { System.Media.SoundPlayer player = new System.Media.SoundPlayer("Error.wav"); player.Play(); } } } } 6. Add the f ollowing methods to initialize board and buttons: private void InitializeStartUpBoard() { board.Add(new Tuple(1, 1, 8)); board.Add(new Tuple(2, 2, 7)); board.Add(new Tuple(3, 3, 6)); board.Add(new Tuple(4, 4, 5)); board.Add(new Tuple(5, 5, 4)); board.Add(new Tuple(6, 6, 3)); board.Add(new Tuple(7, 7, 2)); board.Add(new Tuple(8, 8, 1)); board.Add(new Tuple(9, 0, 0)); } private void InitializeNumberButtonMap() { Dictionaries [ 110 ] buttons.Add(1, btn1); buttons.Add(2, btn2); buttons.Add(3, btn3); buttons.Add(4, btn4); buttons.Add(5, btn5); buttons.Add(6, btn6); buttons.Add(7, btn7); buttons.Add(8, btn8); buttons.Add(9, btn9); } 7. Add these event handlers to handle the events of button Click and volume control pictureBox Click: void but_Click(object sender, EventArgs e) { Button button = (Button)sender; DoMove(String.Format("{0}=0", button.Text), false); DrawBoard(); GameOverYet(); } //This is for volume toggle. You can skip it. private void pictureBox1_Click(object sender, EventArgs e) { if (mute) pictureBox1.ImageLocation = @"Volume_img.jpg"; else pictureBox1.ImageLocation = @"Mute_img.JPG"; //Switch the state mute = !mute; } 8. Add the f ollowing code to randomize the board. Getting the same initial board every time is not what we want: private void RandomizeBoard() { //generate random move commands and move them whenever possible //finally return the modified board string randomMoveCommand = string.Empty; int total = new Random().Next(100); for (int i = 0; i < total; i++) { try Chapter 3 [ 111 ] { randomMoveCommand = new Random().Next(9).ToString() + "=0"; System.Threading.Thread.Sleep(10); DoMove(randomMoveCommand, true); } catch { //This randomly generated move is illegal. Don't worry, just go. continue; } } } 9. Add the f ollowing code for the Form1_Load event: private void Form1_Load(object sender, EventArgs e) { //Attach the event handler for al the buttons this.Controls.OfType