• 1. C++ Network Programming Mastering Complexity with ACE & PatternsDr. Douglas C. Schmidt schmidt@isis-server.isis.vanderbilt.edu www.cs.wustl.edu/~schmidt/tutorials-ace.html Professor of EECS Vanderbilt University Nashville, Tennessee
• 2. Motivation: Challenges of Networked Applications
Observation: Building robust, efficient, & extensible concurrent & networked applications is hard, e.g., we must address many complex topics that are less problematic for non-concurrent, stand-alone applications
Accidental Complexities: Low-level APIs; Poor debugging tools; Algorithmic decomposition; Continuous re-invention/discovery of core concepts & components
Inherent Complexities: Latency; Reliability; Load balancing; Causal ordering; Scheduling & synchronization; Deadlock
(Figure: complexities in networked applications)
  • 3. Presentation OutlinePatterns (25+), which embody reusable software architectures & designs ACE wrapper facades, which encapsulate OS concurrency & network programming APIsCover OO techniques & language features that enhance software qualityPresentation Organization Background Concurrent & network challenges & solution approaches Patterns & wrapper facades in ACE + applicationsOO language features, e.g., classes, dynamic binding & inheritance, parameterized types3
• 4. The Evolution of Information Technologies
CPU & network performance has increased by 3-7 orders of magnitude in the past decade, e.g., networks from 2,400 bits/sec to 1 Gigabit/sec & processors from 10 Megahertz to 1 Gigahertz
These advances stem largely from standardizing hardware & software APIs & protocols, e.g.: Intel x86 & PowerPC chipsets; TCP/IP, ATM; POSIX & JVMs; Middleware & components; Quality-of-service aspects
Extrapolating this trend to 2010 yields ~100 Gigahertz desktops, ~100 Gigabits/sec LANs, ~100 Megabits/sec wireless, & ~10 Terabits/sec Internet backbone
In general, software has not improved as rapidly or as effectively as hardware; increasing software productivity & QoS depends heavily on COTS
  • 5. There are multiple COTS layers & research/ business opportunitiesHistorically, mission-critical apps were built directly atop hardware& OSTedious, error-prone, & costly over lifecyclesStandards-based COTS middleware helps: Control end-to-end resources & QoS Leverage hardware & software technology advances Evolve to new environments & requirements Provide a wide array of reuseable, off-the-shelf developer-oriented services There are layers of middleware, just like there are layers of networking protocolsObject-Oriented Middleware Layers5
• 6. Operating Systems & Protocols
Operating systems & protocols provide mechanisms to manage endsystem resources, e.g.: CPU scheduling & dispatching; Virtual memory management; Secondary storage, persistence, & file systems; Local & remote interprocess communication (IPC)
OS examples: UNIX/Linux, Windows, VxWorks, QNX, etc. Protocol examples: TCP, UDP, IP, SCTP, RTP, etc.
(Figure: the 20th-century internetworking architecture, i.e., protocols such as TCP, UDP, IP, HTTP, FTP, TELNET, TFTP, DNS, & RTP atop Ethernet, ATM, FDDI, & Fibre Channel, vs. the 21st-century middleware architecture, i.e., middleware applications, middleware services, & middleware atop Win2K, Linux, LynxOS, Solaris, & VxWorks)
  • 7. www.cs.wustl.edu/~schmidt/ACE.htmlHost Infrastructure MiddlewareHost infrastructure middleware encapsulates & enhances native OS communication & concurrency mechanisms to create reusable network programming components These components abstract away many tedious & error-prone aspects of programming to low-level operating system APIs Examples Java Virtual Machine (JVM), Microsoft Common Language Runtime (CLR), ADAPTIVE Communication Environment (ACE)SynchronizationMemory ManagementPhysical Memory AccessAsynchronous Event HandlingSchedulingAsynchronous Transfer of Control www.rtj.org7
  • 8. Distribution MiddlewareDistribution middleware defines higher-level distributed programming models whose reusable APIs & components automate & extend the native OS network programming capabilities Examples OMG CORBA, Sun’s Remote Method Invocation (RMI), Microsoft’s Distributed Component Object Model (DCOM)Distribution middleware enables clients to program distributed applications much like stand-alone applications i.e., by invoking operations on target objects without hard-coding dependencies on their location, language, OS, protocols, & hardware 8
  • 9. Common Middleware ServicesCommon middleware services augment distribution middleware by defining higher-level domain-independent services that allow application developers to concentrate on programming business logic Examples CORBA Component Model & Object Services, Sun’s J2EE, Microsoft’s .NETCommon middleware services alleviate need to write the “plumbing” code required to develop distributed applications by using lower-level middleware directly e.g., application developers no longer need to write code that handles transactional behavior, security, database connection pooling or threading9
  • 10. Domain-Specific MiddlewareModalities e.g., MRI, CT, CR, Ultrasound, etc.Siemens MED Common software platform for distributed electronic medical information & imaging systems Used by all ~13 Siemens MED business units worldwide www.syngo.comBoeing Bold Stroke Common software platform for Boeing avionics mission computing systems www.boeing.comDomain-specific middleware services are tailored to the requirements of particular domains, such as telecom, e-commerce, health care, process automation, or aerospace Examples The domain-specific services layer is where system integrators can provide the most value & derive the most benefits10
• 11. Overview of Patterns
Patterns present solutions to common software problems arising within a certain context
They help resolve key design forces: Flexibility; Extensibility; Dependability; Predictability; Scalability; Efficiency
Patterns capture recurring structures & dynamics among software participants to facilitate reuse of successful designs
(Figure: the Proxy pattern, in which a Client uses an AbstractService interface whose service() operation is implemented by both a Proxy & the Service the proxy forwards to)
Patterns generally codify expert knowledge of design constraints & “best practices” www.posa.uci.edu
  • 12. Overview of Pattern LanguagesBenefits of Pattern Languages Define a vocabulary for talking about software development problems Provide a process for the orderly resolution of these problems Help to generate & reuse software architecturesMotivation Individual patterns & pattern catalogs are insufficient Software modeling methods & tools that just illustrate how, not why, systems are designed12
• 13. Taxonomy of Patterns & Idioms
Idioms: Restricted to a particular language, system, or tool. Example: Scoped Locking
Design patterns: Capture the static & dynamic roles & relationships in solutions that occur repeatedly. Examples: Active Object, Bridge, Proxy, Wrapper Façade, & Visitor
Architectural patterns: Express a fundamental structural organization for software systems that provides a set of predefined subsystems, specifies their relationships, & includes the rules & guidelines for organizing the relationships between them. Examples: Half-Sync/Half-Async, Layers, Proactor, Publisher-Subscriber, & Reactor
Optimization principle patterns: Document rules for avoiding common design & implementation mistakes that degrade performance. Examples: optimize for the common case, pass information between layers
  • 14. The Layered Architecture of ACEFeatures Open-source 200,000+ lines of C++ 40+ person-years of effort Ported to many OS platformsLarge open-source user community www.cs.wustl.edu/~schmidt/ACE-users.htmlCommercial support by Riverace www.riverace.com/www.cs.wustl.edu/~schmidt/ACE.html14
  • 15. Sidebar: Platforms Supported by ACEACE runs on a wide range of operating systems, including: PCs, e.g., Windows (all 32/64-bit versions), WinCE; Redhat, Debian, and SuSE Linux; & Macintosh OS X; Most versions of UNIX, e.g., SunOS 4.x and Solaris, SGI IRIX, HP-UX, Digital UNIX (Compaq Tru64), AIX, DG/UX, SCO OpenServer, UnixWare, NetBSD, & FreeBSD; Real-time operating systems, e.g., VxWorks, OS/9, Chorus, LynxOS, Pharlap TNT, QNX Neutrino and RTP, RTEMS, & pSoS; Large enterprise systems, e.g., OpenVMS, MVS OpenEdition, Tandem NonStop-UX, & Cray UNICOS ACE can be used with all of the major C++ compilers on these platforms The ACE Web site at http://ace.ece.uci.edu contains a complete, up-to-date list of platforms, along with instructions for downloading & building ACE15
  • 16. Event Handling & IPCService Access & ControlConcurrencySynchronizationKey Capabilities Provided by ACE16
  • 17. The Pattern Language for ACEPattern Benefits Preserve crucial design information used by applications & middleware frameworks & components Facilitate reuse of proven software designs & architectures Guide design choices for application developers17
  • 18. POSA2 Pattern AbstractsService Access & Configuration Patterns The Wrapper Facade design pattern encapsulates the functions and data provided by existing non-object-oriented APIs within more concise, robust, portable, maintainable, and cohesive object-oriented class interfaces. The Component Configurator design pattern allows an application to link and unlink its component implementations at run-time without having to modify, recompile, or statically relink the application. Component Configurator further supports the reconfiguration of components into different application processes without having to shut down and re-start running processes. The Interceptor architectural pattern allows services to be added transparently to a framework and triggered automatically when certain events occur. The Extension Interface design pattern allows multiple interfaces to be exported by a component, to prevent bloating of interfaces and breaking of client code when developers extend or modify the functionality of the component.Event Handling Patterns The Reactor architectural pattern allows event-driven applications to demultiplex and dispatch service requests that are delivered to an application from one or more clients. The Proactor architectural pattern allows event-driven applications to efficiently demultiplex and dispatch service requests triggered by the completion of asynchronous operations, to achieve the performance benefits of concurrency without incurring certain of its liabilities. The Asynchronous Completion Token design pattern allows an application to demultiplex and process efficiently the responses of asynchronous operations it invokes on services. The Acceptor-Connector design pattern decouples the connection and initialization of cooperating peer services in a networked system from the processing performed by the peer services after they are connected and initialized.18
  • 19. POSA2 Pattern Abstracts (cont’d)Synchronization Patterns The Scoped Locking C++ idiom ensures that a lock is acquired when control enters a scope and released automatically when control leaves the scope, regardless of the return path from the scope. The Strategized Locking design pattern parameterizes synchronization mechanisms that protect a component’s critical sections from concurrent access. The Thread-Safe Interface design pattern minimizes locking overhead and ensures that intra-component method calls do not incur ‘self-deadlock’ by trying to reacquire a lock that is held by the component already. The Double-Checked Locking Optimization design pattern reduces contention and synchronization overhead whenever critical sections of code must acquire locks in a thread-safe manner just once during program execution.Concurrency Patterns The Active Object design pattern decouples method execution from method invocation to enhance concurrency and simplify synchronized access to objects that reside in their own threads of control. The Monitor Object design pattern synchronizes concurrent method execution to ensure that only one method at a time runs within an object. It also allows an object’s methods to cooperatively schedule their execution sequences. The Half-Sync/Half-Async architectural pattern decouples asynchronous and synchronous service processing in concurrent systems, to simplify programming without unduly reducing performance. The pattern introduces two intercommunicating layers, one for asynchronous and one for synchronous service processing. The Leader/Followers architectural pattern provides an efficient concurrency model where multiple threads take turns sharing a set of event sources in order to detect, demultiplex, dispatch, and process service requests that occur on the event sources. 
The Thread-Specific Storage design pattern allows multiple threads to use one ‘logically global’ access point to retrieve an object that is local to a thread, without incurring locking overhead on each object access.19
• 20. The Frameworks in ACE
Each ACE framework applies inversion of control:
Reactor & Proactor: Call back to application-supplied event handlers to perform processing when events occur synchronously & asynchronously
Service Configurator: Calls back to application-supplied service objects to initialize, suspend, resume, & finalize them
Task: Calls back to an application-supplied hook method to perform processing in one or more threads of control
Acceptor-Connector: Calls back to service handlers to initialize them after they are connected
Streams: Calls back to initialize & finalize tasks when they are pushed & popped from a stream
  • 21. Example: Applying ACE in Real-time AvionicsKey System Characteristics Deterministic & statistical deadlines ~20 Hz Low latency & jitter ~250 usecs Periodic & aperiodic processing Complex dependencies Continuous platform upgradesTest flown at China Lake NAWS by Boeing OSAT II ‘98, funded by OS-JTF www.cs.wustl.edu/~schmidt/TAO-boeing.html Also used on SOFIA project by Raytheon sofia.arc.nasa.gov First use of RT CORBA in mission computing Drove Real-time CORBA standardizationKey ResultsGoals Apply COTS & open systems to mission-critical real-time avionics21
  • 22. Time-critical targets require immediate response because: They pose a clear and present danger to friendly forces & Are highly lucrative, fleeting targets of opportunityExample: Applying ACE to Time-Critical TargetsChallenges are also relevant to TBMD & NMDGoals Detect, identify, track, & destroy time-critical targetsReal-time mission-critical sensor-to-shooter needs Highly dynamic QoS requirements & environmental conditions Multi-service & asset coordinationKey System CharacteristicsKey Solution CharacteristicsEfficient & scalable Affordable & flexible COTS-basedAdaptive & reflective High confidence Safety critical22
• 23. Example: Applying ACE to Large-scale Routers
(Figure: router architecture with IOM line cards interconnected by BSE switch elements)
Goal: Switch ATM cells + IP packets at terabit rates
Key System Characteristics: Very high-speed WDM links; 10^2/10^3 line cards; Stringent requirements for availability; Multi-layer load balancing, e.g., Layer 3+4 & Layer 5
Key Software Solution Characteristics: High confidence & scalable computing architecture; Networked embedded processors; Distribution middleware; FT & load sharing; Distributed & layered resource management; Affordable, flexible, & COTS
www.arl.wustl.edu
  • 24. Example: Applying ACE to Hot Rolling MillsGoals Control the processing of molten steel moving through a hot rolling mill in real-timeSystem Characteristics Hard real-time process automation requirements i.e., 250 ms real-time cycles System acquires values representing plant’s current state, tracks material flow, calculates new settings for the rolls & devices, & submits new settings back to plantKey Software Solution CharacteristicsAffordable, flexible, & COTS Product-line architecture Design guided by patterns & frameworksWindows NT/2000 Real-time CORBA (ACE+TAO)www.siroll.de24
  • 25. Example: Applying ACE to Real-time Image ProcessingGoals Examine glass bottles for defects in real-timeSystem Characteristics Process 20 bottles per sec i.e., ~50 msec per bottle Networked configuration ~10 cameras Key Software Solution CharacteristicsAffordable, flexible, & COTS Embedded Linux (Lem) Compact PCI bus + Celeron processorsRemote booted by DHCP/TFTP Real-time CORBA (ACE+TAO)www.krones.com25
• 26. Networked Logging Service Example
Key Participants: Client application processes generate log records; the server logging daemon receives, processes, & stores log records (there's an extra daemon involved)
The logging server example in C++NPv2 is more sophisticated than the one in C++NPv1
C++ code for all logging service examples is in ACE_ROOT/examples/C++NPv1/ & ACE_ROOT/examples/C++NPv2/
• 27. Patterns in the Networked Logging Service
Reactor, Acceptor-Connector, Component Configurator, Monitor Object, Active Object, Proactor, Pipes & Filters, Wrapper Facade, Strategized Locking, Scoped Locking, Thread-Safe Interface, Half-Sync/Half-Async, & Leader/Followers
  • 28. Network Daemon Design DimensionsCommunication dimensions address the rules, form, & level of abstraction that networked applications use to interact Concurrency dimensions address the policies & mechanisms governing the proper use of processes & threads to represent multiple service instances, as well as how each service instance may use multiple threads internally Service dimensions address key properties of a networked application service, such as the duration & structure of each service instance Configuration dimensions address how networked services are identified & the time at which they are bound together to form complete applications28
  • 29. Communication Design DimensionsCommunication is fundamental to networked application design The next three slides present a domain analysis of communication design dimensions, which address the rules, form, and levels of abstraction that networked applications use to interact with each other We cover the following communication design dimensions: Connectionless versus connection-oriented protocols Synchronous versus asynchronous message exchange Message-passing versus shared memory29
• 30. Connectionless vs. Connection-oriented Protocols
A protocol is a set of rules that specify how control & data information is exchanged between communicating entities
Connection-oriented protocols provide a reliable, sequenced, non-duplicated delivery service, which is useful for applications that can't tolerate data loss. Examples include TCP & ATM (Figure: the 3-way handshake in TCP/IP, in which the connector sends SYN, the acceptor replies SYN/ACK, & the connector completes with ACK)
Connectionless protocols provide a message-oriented service in which each message can be routed & delivered independently. Examples include UDP & IP
Connection-oriented applications must address two additional design issues: Data framing strategies, e.g., bytestream vs. message-oriented; Connection multiplexing (muxing) strategies, e.g., multiplexed vs. nonmultiplexed
  • 31. Alternative Connection Muxing StrategiesIn multiplexed connections all client requests emanating from threads in a single process pass through one TCP connection to a server process Pros: Conserves OS communication resources, such as socket handles and connection control blocks Cons: harder to program, less efficient, & less deterministicIn nonmultiplexed connections each client uses a different connection to communicate with a peer service Pros: Finer control of communication priorities & low synchronization overhead since additional locks aren't needed Cons: use more OS resources, & therefore may not scale well in certain environments31
  • 32. Sync vs. Async Message ExchangeSynchronous request/response protocols are the simplest form to implement Requests & responses are exchanged in a lock-step sequence. Each request must receive a response synchronously before the next is sentAsynchronous request/response protocols stream requests from client to server without waiting for responses synchronously Multiple client requests can be transmitted before any responses arrive from a server These protocols therefore often require a strategy for detecting lost or failed requests & resending them later32
• 33. Message Passing vs. Shared Memory
Message passing exchanges data explicitly via IPC mechanisms. Application developers generally define the protocol for exchanging the data, e.g.: Format & content of the data; Number of possible participants in each exchange, e.g., point-to-point (unicast), multicast, or broadcast; How participants begin, conduct, & end a message-passing session
Shared memory allows multiple processes on the same or different hosts to access & exchange data as though it were local to the address space of each process. Applications using native OS shared memory mechanisms must define how to locate & map the shared memory region(s) & the data structures that are placed in shared memory
• 34. Sidebar: C++ Objects & Shared Memory
General responsibilities using the placement new operator: The pointer passed to the placement new operator must point to a region of memory that is big enough & is aligned properly for the object type being created; The placed object must be destroyed by explicitly calling the destructor
Pitfalls initializing C++ objects with virtual functions in shared memory: The shared memory region may reside at a different virtual memory location in each process that maps the shared memory; The C++ compiler/linker need not locate the vtable (the table of function pointers that provides a level of indirection to invoke virtual methods) at the same address in different processes that use the shared memory
ACE wrapper façade classes that can be initialized in shared memory must therefore be concrete data types, i.e., classes with only non-virtual methods
Allocating a C++ object in shared memory:
void *obj_buf = … // Get a pointer to a location in shared memory
ABC *abc = new (obj_buf) ABC; // Use C++ placement new operator
• 35. Overview of the Socket API (1/2)
Sockets are the most common network programming API available on OS platforms. Originally developed in BSD Unix as a C language API to the TCP/IP protocol suite, the Socket API has approximately two dozen functions classified in five categories, the first two being local context management & connection establishment & termination
A socket is a handle created by the OS that is associated with an endpoint of a communication channel. A socket can be bound to a local or remote address. In Unix, socket handles & I/O handles can be used interchangeably in most cases, but this is not the case for Windows
  • 36. Overview of the Socket API (2/2)Options management Network addressing Data transfer mechanisms 36
  • 37. Taxonomy of Socket DimensionsThe Socket API can be decomposed into the following dimensions:Type of communication service e.g., streams versus datagrams versus connected datagrams Communication & connection role e.g., clients often initiate connections actively, whereas servers often accept them passively Communication domain e.g., local host only versus local or remote host 37
  • 38. Limitations with the Socket APIs (1/2)Poorly structured, non-uniform, & non-portable API is linear rather than hierarchical i.e., the API is not structured according to the different phases of connection lifecycle management and the roles played by the participants No consistency among the names Non-portable & error-prone Function names: read() & write() used for any I/O handle on Unix but Windows needs ReadFile() & WriteFile() Function semantics: different behavior of same function on different OS e.g., accept () can take NULL client address parameter on Unix/Windows, but will crash on some operating systems, such as VxWorks Socket handle representations: different platforms represent sockets differently e.g., Unix uses unsigned integers whereas Windows uses pointers Header files: Different platforms use different names for header files for the socket API 38
  • 39. Limitations with the Socket APIs (2/2)Lack of type safety I/O handles are not amenable to strong type checking at compile time e.g., no type distinction between a socket used for passive listening & a socket used for data transfer Steep learning curve due to complex semantics Multiple protocol families and address families Options for infrequently used features such as broadcasting, async I/O, non blocking I/O, urgent data delivery Communication optimizations such as scatter-read & gather-write Different communication and connection roles, such as active & passive connection establishment, & data transferToo many low-level details Forgetting to use the network byte order before data transfer Possibility of missing a function, such as listen() Possibility of mismatch between protocol & address families Forgetting to initialize underlying C structures e.g., sockaddr Using a wrong socket for a given role 39
• 40. Example of Socket API Limitations (1/3)
#include <sys/types.h>   // Possible differences in header file names
#include <sys/socket.h>

const int PORT_NUM = 10000;

int echo_server ()
{
  struct sockaddr_in addr;
  int addr_len;          // Forgot to initialize to sizeof (sockaddr_in)
  char buf[BUFSIZ];
  int n_handle;          // Use of non-portable handle type
  // Create the local endpoint.
• 41. Example of Socket API Limitations (2/3)
  int s_handle = socket (PF_UNIX, SOCK_DGRAM, 0);  // Protocol & address family mismatch
  if (s_handle == -1) return -1;                   // Use of non-portable return value

  // Set up address information where server listens.
  addr.sin_family = AF_INET;         // Unused structure members not zeroed out
  addr.sin_port = PORT_NUM;          // Wrong byte order
  addr.sin_addr.addr = INADDR_ANY;

  if (bind (s_handle, (struct sockaddr *) &addr,
            sizeof addr) == -1)
    return -1;
  // Missed call to listen()
• 42. Example of Socket API Limitations (3/3)
  // Create a new communication endpoint.
  if (n_handle = accept (s_handle, (struct sockaddr *) &addr,  // SOCK_DGRAM handle illegal here
                         &addr_len) != -1) {
    int n;
    while ((n = read (s_handle, buf, sizeof buf)) > 0)  // Reading from wrong handle
      write (n_handle, buf, n);   // No guarantee that "n" bytes will be written

    close (n_handle);
  }
  return 0;
}
  • 43. ACE Socket Wrapper Façade ClassesThese classes are designed in accordance with the Wrapper Facade design patternACE defines a set of C++ classes that address the limitations with the Socket API Enhance type-safety Ensure portability Simplify common use cases Building blocks for higher-level abstractions43
• 44. The Wrapper Façade Pattern
Motivation: The diversity of hardware & operating systems makes it hard to build portable & robust networked application software. Programming directly to low-level OS APIs is tedious, error-prone, & non-portable
This pattern encapsulates data & functions provided by existing non-OO APIs within more concise, robust, portable, maintainable, & cohesive OO class interfaces
(Figure: UML diagram in which an Application calls method1() … methodN() on a Wrapper Facade, which in turn calls the underlying API FunctionA(), FunctionB(), & FunctionC())
Solution: Apply the Wrapper Facade pattern to avoid accessing low-level operating system APIs directly
  • 45. Taxonomy of ACE Socket Wrapper FacadesThe structure of the ACE Socket wrapper facades corresponds to the taxonomy of communication services, connection/communication roles, & communication domains The ACE Socket wrapper façade classes provide the following capabilities: The ACE_SOCK_* classes encapsulate the Internet-domain Socket API functionality The ACE_LSOCK_* classes encapsulate the UNIX-domain Socket API functionalityACE also has wrapper facades for datagrams e.g., unicast, multicast, broadcast45
• 46. Roles in the ACE Socket Wrapper Facade
The active connection role (ACE_SOCK_Connector) is played by a peer application that initiates a connection to a remote peer
The passive connection role (ACE_SOCK_Acceptor) is played by a peer application that accepts a connection from a remote peer
The communication role (ACE_SOCK_Stream) is played by both peer applications to exchange data after they are connected
  • 47. ACE Socket Addressing ClassesMotivation Network addressing is a trouble spot in the Socket API To minimize the complexity of these low-level details, ACE defines a hierarchy of classes that provide a uniform interface for all ACE network addressing objectsClass Capabilities The ACE_Addr class is the root of the ACE network addressing hierarchy The ACE_INET_Addr class represents TCP/IP & UDP/IP addressing information This class eliminates many subtle sources of accidental complexity47
  • 48. ACE I/O Handle ClassesMotivation The low-level C I/O handle data types are tedious & error-prone Even the ACE_HANDLE typedef is still not properly object-oriented & typesafeClass Capabilities ACE_IPC_SAP is the root of the ACE hierarchy of IPC wrapper facades It provides basic I/O handle manipulation capabilities to other ACE IPC wrapper facades ACE_SOCK is the root of the ACE Socket wrapper facades & it provides methods to Create & destroy socket handles Obtain the network addresses of local & remote peers Set/get socket options, such as socket queue sizes, Enable broadcast/multicast communication Disable Nagle‘s algorithm48
• 49. The ACE_SOCK_Connector Class
Motivation: There is a confusing asymmetry in the Socket API between (1) connection roles & (2) socket modes, e.g., an application may accidentally call recv() or send() on a data-mode socket handle before it's connected. This problem can't be detected until run time since C socket handles are weakly typed
Class Capabilities: ACE_SOCK_Connector is a factory that establishes a new endpoint of communication actively & provides capabilities to: Initiate a connection with a peer acceptor & then initialize an ACE_SOCK_Stream object after the connection is established; Initiate connections in a blocking, nonblocking, or timed manner; Use C++ traits to support generic programming techniques that enable wholesale replacement of IPC functionality
• 50. Sidebar: Using Traits for ACE Wrapper Facades
ACE uses the C++ generic programming idiom to define & combine a set of characteristics to alter the behavior of a template class. In C++, the typedef & typename language features are used to define a trait. A trait provides a convenient way to associate related types, values, & functions with a template parameter type without requiring that they be defined as members of the type. Traits are used extensively in the C++ Standard Template Library (STL)
ACE Socket Wrapper Facades use traits to define the following associations: PEER_ADDR, which defines the ACE_INET_Addr class associated with the ACE Socket Wrapper Façade; PEER_STREAM, which defines the ACE_SOCK_Stream data transfer class associated with the ACE_SOCK_Acceptor & ACE_SOCK_Connector factories

class ACE_SOCK_Connector {
public:
  typedef ACE_INET_Addr PEER_ADDR;
  typedef ACE_SOCK_Stream PEER_STREAM;
  // ...
};

class ACE_TLI_Connector {
public:
  typedef ACE_INET_Addr PEER_ADDR;
  typedef ACE_TLI_Stream PEER_STREAM;
  // ...
};
• 51. Using the ACE_SOCK_Connector (1/2)
This example shows how the ACE_SOCK_Connector can be used to connect a client application to a Web server:

int main (int argc, char *argv[]) {
  const char *pathname = argc > 1 ? argv[1] : "index.html";
  const char *server_hostname = argc > 2 ? argv[2] : "ace.ece.uci.edu";

  // Instantiate the connector, data transfer, & address objects.
  ACE_SOCK_Connector connector;
  ACE_SOCK_Stream peer;
  ACE_INET_Addr peer_addr;

  if (peer_addr.set (80, server_hostname) == -1)
    return 1;
  // Block until connection established or connection request fails.
  else if (connector.connect (peer, peer_addr) == -1)
    return 1;
• 52. Using the ACE_SOCK_Connector (2/2)
  // Designate a nonblocking connect.
  if (connector.connect (peer, peer_addr,
                         &ACE_Time_Value::zero) == -1) {
    if (errno == EWOULDBLOCK) {
      // Do some other work ...

      // Now, try to complete connection establishment,
      // but don't block if it isn't complete yet.
      if (connector.complete (peer, 0,
                              &ACE_Time_Value::zero) == -1)
        // ...
    }
  }

  // Designate a timed connect, e.g., 10 seconds in this case.
  ACE_Time_Value timeout (10);
  if (connector.connect (peer, peer_addr, &timeout) == -1) {
    if (errno == ETIME) {
      // Timeout, do something else
    }
  }
  • 53. The ACE_SOCK_Stream ClassMotivation Developers can misuse sockets in ways that can't be detected during compilation An ACE_SOCK_Stream object can't be used in any role other than data transfer without intentionally violating its interfaceClass Capabilities Encapsulates data transfer mechanisms supported by data-mode sockets to provide the following capabilities:Support for sending & receiving up to n bytes or exactly n bytes Support for ``scatter-read'' operations, which populate multiple caller-supplied buffers instead of a single contiguous buffer Support for ``gather-write'' operations, which transmit the contents of multiple noncontiguous data buffers in a single operation Support for blocking, nonblocking, & timed I/O operations Support for generic programming techniques that enable the wholesale replacement of functionality via C++ parameterized types53
  • 54. Using the ACE_SOCK_Stream (1/2) // ...Connection code from example in Section 3.5 omitted... char buf[BUFSIZ]; iovec iov[3]; iov[0].iov_base = (char *) "GET "; iov[0].iov_len = 4; // Length of "GET ". iov[1].iov_base = (char *) pathname; iov[1].iov_len = strlen (pathname); iov[2].iov_base = (char *) " HTTP/1.0\r\n\r\n"; iov[2].iov_len = 13; // Length of " HTTP/1.0\r\n\r\n"; if (peer.sendv_n (iov, 3) == -1) return 1; for (ssize_t n; (n = peer.recv (buf, sizeof buf)) > 0; ) ACE::write_n (ACE_STDOUT, buf, n); return peer.close () == -1 ? 1 : 0; }Initialize the iovec vector for scatter-read and gather-write I/OPerform blocking gather-write on ACE_SOCK_Stream Perform blocking read on ACE_SOCK_Stream This example shows how an ACE_SOCK_Stream can be used to send & receive data to & from a Web server54
  • 55. Using the ACE_SOCK_Stream (2/2)Blocking & non-blocking I/O semantics can be controlled via the ACE_SOCK_Stream enable() & disable() methods, e.g., peer.enable (ACE_NONBLOCK); // enable non-blocking I/O peer.disable (ACE_NONBLOCK); // disable non-blocking I/O If an I/O operation would block, it returns -1 & errno is set to EWOULDBLOCK I/O operations can involve timeouts, e.g., ACE_Time_Value timeout (10); // 10 second timeout if (peer.sendv_n (iov, 3, &timeout) == -1) { // check if errno is set to ETIME, // which indicates a timeout } // similarly use timeout for receiving data55
  • 56. Sidebar: Working with (& Around) Nagle’s AlgorithmNagle’s Algorithm Problem: Need to tackle the small packet problem (also called the send-side silly window syndrome), where application-generated small data payloads, such as a keystroke, result in the transmission of many small packets whose headers dwarf their payloads, thereby wasting network resources & possibly causing congestion Solution: The OS kernel buffers a number of small application-level messages & concatenates them into a larger packet that can then be transmitted Consequences: Although network congestion is minimized, this buffering can lead to higher & less predictable latencies, as well as lower throughput Controlling Nagle’s Algorithm via ACE Use the enable ()/disable () methods of the ACE_IPC_SAP class e.g., peer.enable (TCP_NODELAY); // disable Nagle’s algorithm peer.disable (TCP_NODELAY); // enable Nagle’s algorithm56
  • 57. The ACE_SOCK_Acceptor ClassMotivation The C functions in the Socket API are weakly typed, which makes it easy to apply them incorrectly in ways that can’t be detected until run-time The ACE_SOCK_Acceptor class ensures type errors are detected at compile-timeClass Capabilities This class is a factory that establishes a new endpoint of communication passively & provides the following capabilities: It accepts a connection from a peer connector & then initializes an ACE_SOCK_Stream object after the connection is established Connections can be accepted in either a blocking, nonblocking, or timed manner C++ traits are used to support generic programming techniques that enable the wholesale replacement of functionality via C++ parameterized types57
  • 58. Using the ACE_SOCK_Acceptorextern char *get_url_pathname (ACE_SOCK_Stream *); int main () { ACE_INET_Addr server_addr; ACE_SOCK_Acceptor acceptor; ACE_SOCK_Stream peer; if (server_addr.set (80) == -1) return 1; if (acceptor.open (server_addr) == -1) return 1; for (;;) { if (acceptor.accept (peer) == -1) return 1; peer.disable (ACE_NONBLOCK); // Ensure blocking I/O. ACE_Auto_Array_Ptr<char> pathname (get_url_pathname (&peer)); ACE_Mem_Map mapped_file (pathname.get ()); if (peer.send_n (mapped_file.addr (), mapped_file.size ()) == -1) return 1; peer.close (); } return acceptor.close () == -1 ? 1 : 0; }Instantiate the acceptor, data transfer, & address objectsInitialize a passive-mode endpoint to listen for connections on port 80Accept a new connectionSend the requested dataClose the connection to the senderStop receiving any connectionsThis example shows how an ACE_SOCK_Acceptor & ACE_SOCK_Stream can be used to accept connections & send/receive data to/from a web client58
  • 59. Sidebar: The ACE_Mem_Map ClassMemory Mapped Files Many modern operating systems provide a mechanism for mapping a file’s contents directly into a process’s virtual address space This memory-mapped file mechanism can be read from or written to directly by referencing the virtual memory e.g., via pointers instead of using less efficient I/O functions The file manager defers all read/write operations to the virtual memory manager Contents of memory mapped files can be shared by multiple processes on the same machine It can also be used to provide a persistent backing storeACE_Mem_Map Class A wrapper façade that encapsulates the memory mapped file system mechanisms on different operating systems Relieves application developers from having to manually perform bookkeeping tasks e.g., explicitly opening files or determining their lengths The ACE_Mem_Map class offers multiple constructors with several signature variants 59
  • 60. The ACE_Message_Block Class (1/2)Motivation Many networked applications require a means to manipulate messages efficiently, e.g.: Storing messages in buffers as they are received from the network or from other processes Adding/removing headers/trailers from messages as they pass through a user-level protocol stack Fragmenting/reassembling messages to fit into network MTUs Storing messages in buffers for transmission or retransmission Reordering messages that were received out-of-sequence (Figure: messages in transit, buffered for transmission, & buffered awaiting processing)60
  • 61. The ACE_Message_Block Class (2/2)Class Capabilities This class enables efficient manipulation of messages via the following operations Each ACE_Message_Block contains a pointer to a reference-counted ACE_Data_Block which in turn points to the actual data associated with a messageIt allows multiple messages to be chained together into a composite message It allows multiple messages to be joined together to form an ACE_Message_Queue It treats synchronization & memory management properties as aspects61
  • 62. Two Kinds of Message BlocksComposite messages contain multiple ACE_Message_Blocks These blocks are linked together in accordance with the Composite pattern Composite messages often consist of a control message that contains bookkeeping information e.g., destination addresses, followed by one or more data messages that contain the actual contents of the message ACE_Data_Blocks can be reference counted Simple messages contain a single ACE_Message_Block An ACE_Message_Block points to an ACE_Data_Block An ACE_Data_Block points to the actual data payload62
  • 63. Using the ACE_Message_Block (1/2)int main (int argc, char *argv[]) { ACE_Message_Block *head = new ACE_Message_Block (BUFSIZ); ACE_Message_Block *mblk = head; for (;;) { ssize_t nbytes = ACE::read_n (ACE_STDIN, mblk->wr_ptr (), mblk->size ()); if (nbytes <= 0) break; // Break out at EOF or error. mblk->wr_ptr (nbytes); Allocate an ACE_Message_Block whose payload is of size BUFSIZRead data from standard input into the message block starting at the write pointer (wr_ptr ())Advance the write pointer by the number of bytes readThe following program reads all data from standard input into a singly linked list of dynamically allocated ACE_Message_Blocks These ACE_Message_Blocks are chained together by their continuation pointers63
  • 64. Using the ACE_Message_Block (2/2) mblk->cont (new ACE_Message_Block (BUFSIZ)); mblk = mblk->cont (); } // Print the contents of the list to the standard output. for (mblk = head; mblk != 0; mblk = mblk->cont ()) ACE::write_n (ACE_STDOUT, mblk->rd_ptr (), mblk->length ()); head->release (); // Release all the memory in the chain. return 0; }Allocate a new ACE_Message_Block of size BUFSIZ & chain it to the previous one at the end of the listFor every message block, print mblk->length() amount of contents starting at the read pointer (rd_ptr ())Advance mblk to point to the newly allocated ACE_Message_Block64
  • 65. ACE CDR StreamsMotivation Networked applications that send & receive messages often require support for: Linearization, to handle the conversion of richly typed data (such as arrays or linked lists) to/from raw memory buffers (De)marshaling, to interoperate in environments with heterogeneous compiler alignment constraints & hardware instructions with different byte-ordering rules The ACE_OutputCDR & ACE_InputCDR classes provide a highly optimized, portable, & convenient means to marshal & demarshal data using the standard CORBA Common Data Representation (CDR) ACE_OutputCDR creates a CDR buffer from a data structure (marshaling) ACE_InputCDR extracts data from a CDR buffer (demarshaling)65
  • 66. The ACE_OutputCDR & ACE_InputCDR Classes Class Capabilities These classes support the following features: They provide operations to (de)marshal the following types: Primitive types, e.g., booleans; 16-, 32-, & 64-bit integers; 8-bit octets; single & double precision floating point numbers; characters; & strings Arrays of primitive types The insertion (<<) and extraction (>>) operators can be used to marshal & demarshal primitive types, using the same syntax as the C++ iostream components They use ACE_Message_Block chains internally to minimize memory copies They take advantage of CORBA CDR alignment & byte-ordering rules to avoid memory copying & byte-swapping operations, respectively They provide optimized byte swapping code that uses inline assembly language instructions for common hardware platforms (such as Intel x86) & the standard htons(), htonl(), ntohs(), and ntohl() macros/functions on other platforms They support zero copy marshaling & demarshaling of octet buffers Users can define custom character set translators for platforms that do not use ASCII or Unicode as their native character sets66
  • 67. Sidebar: Log Record Message Structureclass ACE_Log_Record { private: ACE_UINT type_; ACE_UINT pid_; ACE_Time_Value timestamp_; char msg_data_[ACE_MAXLOGMSGLEN]; public: ACE_UINT type () const; ACE_UINT pid () const; const ACE_Time_Value time_stamp () const; const char *msg_data () const; };ACE_Log_Record is a type that ACE uses internally to keep track of the fields in a log record This example uses an 8-byte, CDR-encoded header followed by the payload The header includes the byte order, payload length, & other fields67
  • 68. Using ACE_OutputCDRint operator<< (ACE_OutputCDR &cdr, const ACE_Log_Record &log_record) { size_t msglen = log_record.msg_data_len (); // Insert each field into the output CDR stream. cdr << ACE_CDR::Long (log_record.type ()); cdr << ACE_CDR::Long (log_record.pid ()); cdr << ACE_CDR::Long (log_record.time_stamp ().sec ()); cdr << ACE_CDR::Long (log_record.time_stamp ().usec ()); cdr << ACE_CDR::ULong (msglen); cdr.write_char_array (log_record.msg_data (), msglen); return cdr.good_bit (); } After marshaling all the fields of the log record into the CDR stream, return the success/failure status We show the ACE CDR insertion & extraction operators for the ACE_Log_Record class that's used by the client application & logging server68
  • 69. Using ACE_InputCDRint operator>> (ACE_InputCDR &cdr, ACE_Log_Record &log_record) { ACE_CDR::Long type; ACE_CDR::Long pid; ACE_CDR::Long sec, usec; ACE_CDR::ULong buffer_len; // Extract each field from the input CDR stream into temporaries. if ((cdr >> type) && (cdr >> pid) && (cdr >> sec) && (cdr >> usec) && (cdr >> buffer_len)) { ACE_TCHAR log_msg[ACE_Log_Record::MAXLOGMSGLEN + 1]; log_record.type (type); log_record.pid (pid); log_record.time_stamp (ACE_Time_Value (sec, usec)); cdr.read_char_array (log_msg, buffer_len); log_msg[buffer_len] = '\0'; log_record.msg_data (log_msg); } return cdr.good_bit (); }After demarshaling all the fields of the log record from the CDR stream, return the success/failure status Temporaries used during demarshaling (not always necessary)69
  • 70. Implementing the Client Application (1/6)class Logging_Client { public: // Send to the server. int send (const ACE_Log_Record &log_record); // Accessor method. ACE_SOCK_Stream &peer () { return logging_peer_; } // Close the connection to the server. ~Logging_Client () { logging_peer_.close (); } private: ACE_SOCK_Stream logging_peer_; // Connected to server. };Header file: “Logging_Client.h”The following client application illustrates how to use the ACE Socket wrapper facades & CDR streams to establish connections, marshal log records, & send the data to our logging serverThis example behaves as follows: Reads lines from standard input Sends each line to the logging server in a separate log record, & Stops when it reads EOF from standard input70
  • 71. Implementing the Client Application (2/6) 1 int Logging_Client::send (const ACE_Log_Record &log_record) { 2 const size_t max_payload_size = 3 4 // type() 4 + 8 // timestamp 5 + 4 // process id 6 + 4 // data length 7 + ACE_Log_Record::ACE_MAXLOGMSGLEN // data 8 + ACE_CDR::MAX_ALIGNMENT; // padding; 9 10 ACE_OutputCDR payload (max_payload_size); 11 payload << log_record; 12 ACE_CDR::ULong length = payload.total_length (); 13 The Logging_Client::send() method behaves as follows: Computes the size of the payload (lines 2 – 8) Marshals the header & data into an output CDR (lines 10 – 16) & Sends it to the logging server (lines 18 – 24)First marshal the payload to contain the linearized ACE_Log_Record71
  • 72. Implementing the Client Application (3/6)14 ACE_OutputCDR header (ACE_CDR::MAX_ALIGNMENT + 8); 15 header << ACE_OutputCDR::from_boolean (ACE_CDR_BYTE_ORDER); 16 header << ACE_CDR::ULong (length); 17 18 iovec iov[2]; 19 iov[0].iov_base = header.begin ()->rd_ptr (); 20 iov[0].iov_len = 8; 21 iov[1].iov_base = payload.begin ()->rd_ptr (); 22 iov[1].iov_len = length; 23 24 return logging_peer_.sendv_n (iov, 2); 25 } Then marshal the header info that includes byte order & payload lengthConstruct an iovec of size 2 with header & payload infoSend message to logging serverSince TCP/IP is a byte-stream protocol without any message boundaries, the logging service uses CDR as a message-framing protocol to delimit log records 72
  • 73. Implementing the Client Application (4/6) 1 int main (int argc, char *argv[]) 2 { 3 u_short logger_port = 4 argc > 1 ? atoi (argv[1]) : 0; 5 const char *logger_host = 6 argc > 2 ? argv[2] : ACE_DEFAULT_SERVER_HOST; 7 int result; 8 9 ACE_INET_Addr server_addr; 10 11 if (logger_port != 0) 12 result = server_addr.set (logger_port, logger_host); 13 else 14 result = server_addr.set ("ace_logger", logger_host); 15 if (result == -1) 16 ACE_ERROR_RETURN((LM_ERROR, 17 "lookup %s, %p\n", 18 logger_port == 0 ? "ace_logger" : argv[1], 19 logger_host), 1); 20The Logging_Client main program73
  • 74. Sidebar: ACE Debugging & Error MacrosConsolidates printing of debug and error messages via a printf()-like format e.g., ACE_DEBUG, ACE_ERROR (and their *_RETURN counterparts) that encapsulate the ACE_Log_Msg::log() method Arguments are enclosed in a double set of parentheses to make them appear as one argument to the C++ preprocessor First argument is the severity code; the second is a format string supporting a superset of printf() conversion specifiers: %l displays the line number where the error occurred %N displays the file name where the error occurred %n displays the name of the program %P displays the current process ID %p takes a const char * argument and displays it followed by the error string corresponding to errno (similar to perror()) %T displays the current time %t displays the calling thread’s ID74
  • 75. Implementing the Client Application (5/6)21 ACE_SOCK_Connector connector; 22 Logging_Client logging_client; 23 24 if (connector.connect (logging_client.peer (), 25 server_addr) < 0) 26 ACE_ERROR_RETURN ((LM_ERROR, 27 "%p\n", 28 "connect()"), 29 1); 30 31 // Limit the number of characters read on each record. 32 cin.width (ACE_Log_Record::MAXLOGMSGLEN);Use the ACE_SOCK_Connector wrapper façade to connect to the logging serverContents of the message to be sent to logging server are obtained from standard input75
  • 76. Implementing the Client Application (6/6)33 for (;;) { 34 std::string user_input; 35 getline (cin, user_input, '\n'); 36 37 if (!cin || cin.eof ()) break; 38 39 ACE_Time_Value now (ACE_OS::gettimeofday ()); 40 ACE_Log_Record log_record (LM_INFO, now, 41 ACE_OS::getpid ()); 42 log_record.msg_data (user_input.c_str ()); 43 44 if (logging_client.send (log_record) == -1) 45 ACE_ERROR_RETURN ((LM_ERROR, 46 "%p\n", "logging_client.send()"), 1); 47 } 48 49 return 0; // Logging_Client destructor 50 // closes TCP connection. 51 } 76
  • 77. The Logging_Server ClassesThe figure below illustrates our Logging_Server abstract base class, the Logging_Handler class we'll describe shortly, & the concrete logging server classes that we'll develop in subsequent sections of the tutorial77
  • 78. Implementing the Logging_Server (1/5)// Forward declaration. class ACE_SOCK_Stream; class Logging_Server { public: // Template Method that runs logging server's event loop. virtual int run (int argc, char *argv[]); protected: // The following four methods are ``hooks'' that can be // overridden by subclasses. virtual int open (u_short logger_port = 0); virtual int wait_for_multiple_events () { return 0; } virtual int handle_connections () = 0; virtual int handle_data (ACE_SOCK_Stream * = 0) = 0; Header file “Logging_Server.h”This example uses the ACE_Message_Block & ACE CDR classes in a new base class that will simplify logging server implementations in the book78
  • 79. Implementing the Logging_Server (2/5) // This helper method can be used by the hook methods. int make_log_file (ACE_FILE_IO &, ACE_SOCK_Stream * = 0); // Close the socket endpoint and shutdown ACE. virtual ~Logging_Server () { acceptor_.close (); } // Accessor. ACE_SOCK_Acceptor &acceptor () { return acceptor_; } private: // Socket acceptor endpoint. ACE_SOCK_Acceptor acceptor_; }; Header file “Logging_Server.h” cont’d79
  • 80. Implementing the Logging_Server (3/5)#include "ace/FILE_Addr.h" #include "ace/FILE_Connector.h" #include "ace/FILE_IO.h" #include "ace/INET_Addr.h" #include "ace/SOCK_Stream.h" #include "Logging_Server.h" int Logging_Server::run (int argc, char *argv[]) { if (open (argc > 1 ? atoi (argv[1]) : 0) == -1) return -1; for (;;) { if (wait_for_multiple_events () == -1) return -1; if (handle_connections () == -1) return -1; if (handle_data () == -1) return -1; } return 0; }Implementation file “Logging_Server.cpp”Template method providing the skeleton of the algorithm to use Hook methods will be overridden by subclasses unless default is to be usedThree hook methods that can be overridden in subclasses80
  • 81. Implementing the Logging_Server (4/5)int Logging_Server::open (u_short logger_port) { // Raises the number of available socket handles to // the maximum supported by the OS platform. ACE::set_handle_limit (); ACE_INET_Addr server_addr; int result; if (logger_port != 0) result = server_addr.set (logger_port, INADDR_ANY); else result = server_addr.set ("ace_logger", INADDR_ANY); if (result == -1) return -1; // Start listening and enable reuse of listen address // for quick restarts. return acceptor_.open (server_addr, 1); }Initialize the acceptor allowing it to accept connections from any address81
  • 82. Implementing the Logging_Server (5/5)int Logging_Server::make_log_file (ACE_FILE_IO &logging_file, ACE_SOCK_Stream *logging_peer) { std::string filename (MAXHOSTNAMELEN, '\0'); if (logging_peer != 0) { // Use client host name as file name. ACE_INET_Addr logging_peer_addr; logging_peer->get_remote_addr (logging_peer_addr); logging_peer_addr.get_host_name (&filename[0], filename.size ()); filename += ".log"; } else filename = "logging_server.log"; ACE_FILE_Connector connector; return connector.connect (logging_file, ACE_FILE_Addr (filename.c_str ()), 0, // No time-out. ACE_Addr::sap_any, // Ignored. 0, // Don't try to reuse the addr. O_RDWR|O_CREAT|O_APPEND, ACE_DEFAULT_FILE_PERMS); }Create the log file using the ACE_FILE_Connector factory82
  • 83. Sidebar: The ACE File Wrapper FacadesACE file wrapper facades encapsulate platform mechanisms for unbuffered file operations The design of these wrapper facades is very similar to the ACE IPC wrapper facades The ACE File classes decouple: Initialization factories: e.g., ACE_FILE_Connector, which opens and/or creates files Data transfer classes: e.g., ACE_FILE_IO, which applications use to read/write data from/to files opened using ACE_FILE_Connector This generality in ACE’s design of wrapper facades helps strategize higher-level ACE framework components e.g., ACE_Acceptor, ACE_Connector, & ACE_Svc_Handler83
  • 84. Implementing the Logging_Handler (1/7)#include "ace/FILE_IO.h" #include "ace/SOCK_Stream.h" class ACE_Message_Block; // Forward declaration. class Logging_Handler { protected: // Reference to a log file. ACE_FILE_IO &log_file_; // Connected to the client. ACE_SOCK_Stream logging_peer_; Header file “Logging_Handler.h”This class is used by the logging server to encapsulate the I/O & processing of log records 84
  • 85. Implementing the Logging_Handler (2/7)public: // Initialization and termination methods. Logging_Handler (ACE_FILE_IO &log_file) : log_file_ (log_file) {} Logging_Handler (ACE_FILE_IO &log_file, ACE_HANDLE handle) : log_file_ (log_file) { logging_peer_.set_handle (handle); } Logging_Handler (ACE_FILE_IO &log_file, const ACE_SOCK_Stream &logging_peer) : log_file_ (log_file), logging_peer_ (logging_peer) {} int close () { return logging_peer_.close (); } ACE_SOCK_Stream &peer () { return logging_peer_; } // Accessor. Header file “Logging_Handler.h” cont’d85
  • 86. Implementing the Logging_Handler (3/7) // Receive one log record from a connected client. // mblk contains the hostname; mblk->cont() contains the log // record header (the byte order and the length) and the data. int recv_log_record (ACE_Message_Block *&mblk); // Write one record to the log file. mblk contains the // hostname and mblk->cont() contains the log record. int write_log_record (ACE_Message_Block *mblk); // Log one record by calling recv_log_record() // and write_log_record(). int log_record (); }; Header file “Logging_Handler.h” cont’d86
  • 87. Implementing the Logging_Handler (4/7) 1 int Logging_Handler::recv_log_record (ACE_Message_Block *&mblk) 2 { 3 ACE_INET_Addr peer_addr; 4 logging_peer_.get_remote_addr (peer_addr); 5 mblk = new ACE_Message_Block (MAXHOSTNAMELEN + 1); 6 peer_addr.get_host_name (mblk->wr_ptr (), MAXHOSTNAMELEN); 7 mblk->wr_ptr (strlen (mblk->wr_ptr ()) + 1); // Go past name. 8 9 ACE_Message_Block *payload = 10 new ACE_Message_Block (ACE_DEFAULT_CDR_BUFSIZE); 11 // Align Message Block for a CDR stream. 12 ACE_CDR::mb_align (payload); 13 14 if (logging_peer_.recv_n (payload->wr_ptr (), 8) == 8) { 15 payload->wr_ptr (8); // Reflect addition of 8 bytes. 16 17 ACE_InputCDR cdr (payload); 18 Receive incoming data & use the input CDR class to parse the header, then the payload, based on the framing protocol, & finally save it in an ACE_Message_Block chainForce proper alignmentFirst save the peer hostnameReceive the header info (byte order & length)87
  • 88. Implementing the Logging_Handler (5/7)19 ACE_CDR::Boolean byte_order; 20 // Use helper method to disambiguate booleans from chars. 21 cdr >> ACE_InputCDR::to_boolean (byte_order); 22 cdr.reset_byte_order (byte_order); 23 24 ACE_CDR::ULong length; 25 cdr >> length; 26 27 payload->size (8 + ACE_CDR::MAX_ALIGNMENT + length); 28 29 if (logging_peer_.recv_n (payload->wr_ptr(), length) > 0) { 30 payload->wr_ptr (length); // Reflect additional bytes. 31 mblk->cont (payload); // Chain the header and payload. 32 return length; // Return length of the log record. 33 } 34 } 35 payload->release (); 36 mblk->release (); 37 payload = mblk = 0; 38 return -1; 39 }On error, free up the allocated message blocksResize the message block so it's the right size for the payload & properly aligned Demarshal the header info88
  • 89. Implementing the Logging_Handler (6/7) 1 int Logging_Handler::write_log_record (ACE_Message_Block *mblk) 2 { 3 if (log_file_.send_n (mblk) == -1) return -1; 4 5 if (ACE::debug ()) { 6 ACE_InputCDR cdr (mblk->cont ()); 7 ACE_CDR::Boolean byte_order; 8 ACE_CDR::ULong length; 9 cdr >> ACE_InputCDR::to_boolean (byte_order); 10 cdr.reset_byte_order (byte_order); 11 cdr >> length; 12 ACE_Log_Record log_record; 13 cdr >> log_record; // Extract the log record. 14 log_record.print (mblk->rd_ptr (), 1, cerr); 15 } 16 17 return mblk->total_length (); 18 }Send the message block chain to the log file, which is stored in binary format If the debug flag is set, print the contents of the message89
  • 90. Implementing the Logging_Handler (7/7)int Logging_Handler::log_record () { ACE_Message_Block *mblk = 0; if (recv_log_record (mblk) == -1) return -1; else { int result = write_log_record (mblk); mblk->release (); // Free up the entire contents. return result == -1 ? -1 : 0; } } Receives the message Demarshals it into an ACE_Message_Block & Writes it to the log file90
  • 91. Implementing the Iterative_Logging_Server (1/3)#include "ace/FILE_IO.h" #include "ace/INET_Addr.h" #include "ace/Log_Msg.h" #include "Logging_Server.h" #include "Logging_Handler.h" class Iterative_Logging_Server : public Logging_Server { protected: ACE_FILE_IO log_file_; Logging_Handler logging_handler_; public: Iterative_Logging_Server (): logging_handler_ (log_file_) {} Logging_Handler &logging_handler () { return logging_handler_; } protected: // Other methods shown below... };Header file: Iterative_Logging_Server.h91
  • 92. Implementing the Iterative_Logging_Server (2/3) virtual int open (u_short logger_port) { if (make_log_file (log_file_) == -1) ACE_ERROR_RETURN ((LM_ERROR, "%p\n", "make_log_file()"), -1); return Logging_Server::open (logger_port); } virtual ~Iterative_Logging_Server () { log_file_.close (); } virtual int handle_connections () { ACE_INET_Addr logging_peer_addr; if (acceptor ().accept (logging_handler_.peer (), &logging_peer_addr) == -1) ACE_ERROR_RETURN ((LM_ERROR, "%p\n", "accept()"), -1); ACE_DEBUG ((LM_DEBUG, "Accepted connection from %s\n", logging_peer_addr.get_host_name ())); return 0; }Override the handle_connections() hook method to handle one connection at a timeOverride & “decorate” the Logging_Server::open() method92
  • 93. Implementing the Iterative_Logging_Server (3/3) virtual int handle_data (ACE_SOCK_Stream *) { while (logging_handler_.log_record () != -1) continue; logging_handler_.close (); // Close the socket handle. return 0; } #include "ace/Log_Msg.h" #include "Iterative_Logging_Server.h" int main (int argc, char *argv[]) { Iterative_Logging_Server server; if (server.run (argc, argv) == -1) ACE_ERROR_RETURN ((LM_ERROR, "%p\n", "server.run()"), 1); return 0; }Main program of iterative logging serverDelegate I/O to Logging_Handler93
  • 94. Concurrency Design DimensionsConcurrency is essential to develop scalable & robust networked applications, particularly servers The next group of slides present a domain analysis of concurrency design dimensions that address the policies & mechanisms governing the proper use of processes, threads, & synchronizers We cover the following design dimensions in this chapter: Iterative versus concurrent versus reactive servers Processes versus threads Process/thread spawning strategies User versus kernel versus hybrid threading models Time-shared versus real-time scheduling classes Task- versus message-based architectures94
  • 95. Iterative vs. Concurrent ServersIterative/reactive servers handle each client request in its entirety before servicing subsequent requests Best suited for short-duration or infrequent services Concurrent servers handle multiple requests from clients simultaneously Best suited for I/O-bound services or long-duration services Also good for busy servers95
  • 96. Multiprocessing vs. MultithreadingA process is the OS entity that provides the context for executing program instructions Each process manages certain resources (such as virtual memory, I/O handles, and signal handlers) & is protected from other OS processes via an MMU IPC between processes can be complicated & inefficientA thread is a single sequence of instructions executed in the context of a process’s protection domain Each thread manages certain resources (such as runtime stack, registers, signal masks, priorities, & thread-specific data) Threads are not protected from other threads IPC between threads can be more efficient than IPC between processes96
  • 97. Thread Pool Eager Spawning StrategiesThis strategy prespawns one or more OS processes or threads at server creation time These ``warm-started'' execution resources form a pool that improves response time by incurring service startup overhead before requests are serviced Two general types of eager spawning strategies are shown below:These strategies are based on the Half-Sync/Half-Async and Leader/Followers patterns 97
  • 98. The Half-Sync/Half-Async PatternThe Half-Sync/Half-Async pattern decouples async & sync service processing in concurrent systems This pattern simplifies programming without unduly reducing performanceThis pattern defines two service processing layers, one async & one sync, along with a queueing layer that allows services to exchange messages between the two layers The pattern allows sync services, such as HTTP protocol processing, to run concurrently, relative both to each other & to async services, such as event demultiplexing (Figure: sync services 1–3 in the sync service layer exchange messages via the queueing layer with an async service driven by an external event source; sequence: the event source notifies the async service, which read()s the event & enqueue()s a message that a sync service dequeues & processes via work())98
  • 99. Drawbacks with Half-Sync/Half-Async ArchitectureProblem Although the Half-Sync/Half-Async threading model is more scalable than the purely reactive model, it is not necessarily the most efficient design e.g., passing a request between the Reactor thread & a worker thread incurs: Dynamic memory (de)allocation, Synchronization operations, A context switch, & CPU cache updates This overhead can make server latency unnecessarily high Solution Apply the Leader/Followers architectural pattern to minimize server threading overhead (Figure: HTTP Acceptor & HTTP Handlers in the ACE_Reactor thread pass requests through a Request Queue to Worker Threads 1–3)99
  • 100. The Leader/Followers PatternThe Leader/Followers architectural pattern provides a more efficient concurrency model than Half-Sync/Half-Async In this pattern, multiple threads take turns sharing event sources to detect, demux, dispatch, & process service requests that occur on the event sources This pattern eliminates the need for, & the overhead of, a separate Reactor thread & synchronized request queue used in the Half-Sync/Half-Async pattern Combinations of handles & handle sets: Concurrent handle sets + concurrent handles: UDP sockets + WaitForMultipleObjects() Concurrent handle sets + iterative handles: TCP sockets + WaitForMultipleObjects() Iterative handle sets + concurrent handles: UDP sockets + select()/poll() Iterative handle sets + iterative handles: TCP sockets + select()/poll() (Class diagram: a Thread Pool (join(), promote_new_leader()) synchronizes threads that use a Handle Set (handle_events(), deactivate_handle(), reactivate_handle(), select()), which demultiplexes Handles to Event Handlers (handle_event(), get_handle()) with Concrete Event Handlers A & B)100
  • 101. Leader/Followers Pattern Dynamics [sequence diagram: Thread1 & Thread2 join() the ThreadPool; thread2 sleeps until it becomes the leader; Thread1 calls handle_events() on the HandleSet; when an event arrives, Thread1 calls deactivate_handle() & promote_new_leader(), then processes the current event via handle_event() while thread2 waits for a new event; Thread1 calls reactivate_handle() & join()s the pool again, sleeping until it becomes the leader] The four phases are: 1. Leader thread demuxing, 2. Follower thread promotion, 3. Event handler demuxing & event processing, & 4. Rejoining the thread pool.
  • 102. Thread-per-Request On-demand Spawning Strategy On-demand spawning creates a new process or thread in response to the arrival of client connection and/or data requests. It is typically used to implement the thread-per-request & thread-per-connection models. The primary benefit of on-demand spawning strategies is their reduced consumption of resources. The drawbacks, however, are that these strategies can degrade performance in heavily loaded servers & determinism in real-time systems, due to the costs incurred when spawning processes/threads & starting services.
  • 103. The N:1 & 1:1 Threading Models OS scheduling ensures that applications use host CPU resources appropriately. Modern OS platforms provide various models for scheduling threads. A key difference between the models is the contention scope in which threads compete for system resources, particularly CPU time. The two different contention scopes are: Process contention scope (aka "user threading"), where threads in the same process compete with each other (but not directly with threads in other processes) for scheduled CPU time; & System contention scope (aka "kernel threading"), where threads compete directly with other system-scope threads, regardless of what process they're in.
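For example, a Pthreads application can request a contention scope at spawn time (illustrative sketch; spawn_system_scope_thread() is our own helper name, & the fallback assumes the platform may support only one of the two scopes):

```cpp
#include <pthread.h>

static void *task (void *) { return 0; }

// Request system contention scope ("kernel threading") for a new
// thread, falling back to process scope if the platform refuses.
int spawn_system_scope_thread () {
  pthread_attr_t attr;
  pthread_attr_init (&attr);
  // Compete with all threads in the system for scheduled CPU time.
  if (pthread_attr_setscope (&attr, PTHREAD_SCOPE_SYSTEM) != 0)
    pthread_attr_setscope (&attr, PTHREAD_SCOPE_PROCESS); // fall back
  pthread_t tid;
  int result = pthread_create (&tid, &attr, task, 0);
  pthread_attr_destroy (&attr);
  if (result == 0)
    pthread_join (tid, 0);
  return result;
}
```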
  • 104. The N:M Threading Model Some operating systems (such as Solaris) offer a combination of the N:1 & 1:1 models, referred to as the "N:M" hybrid-threading model, which supports a mix of user threads & kernel threads. When an application spawns a thread, it can indicate in which contention scope the thread should operate. The OS threading library creates a user-space thread, but only creates a kernel thread if needed or if the application explicitly requests the system contention scope. When the OS kernel blocks a lightweight process (LWP), all user threads scheduled onto it by the threads library also block. However, threads scheduled onto other LWPs in the process can continue to make progress.
  • 105. Task- vs. Message-based Concurrency Architectures A concurrency architecture is a binding between: CPUs, which provide the execution context for application code; Data & control messages, which are sent & received from one or more applications & network devices; & Service processing tasks, which perform services upon messages as they arrive & depart. Task-based concurrency architectures structure multiple CPUs according to units of service functionality in an application. Message-based concurrency architectures structure the CPUs according to the messages received from applications & network devices.
  • 106. Overview of OS Concurrency Mechanisms Networked applications, particularly servers, must often process requests concurrently to meet their quality of service requirements. This section presents an overview of the synchronous event demultiplexing, multiprocessing, multithreading, & synchronization mechanisms available to implement those designs. We also discuss common portability & programming problems that arise when networked applications are developed using native C-level concurrency APIs.
  • 107. Synchronous Event Demultiplexing Synchronous event demuxers wait for certain events to occur on a set of event sources, where the caller is returned the thread of control whenever one or more event sources become active, e.g., poll() on System V UNIX, WaitForMultipleObjects() on Win32, & select(). select() is the most common event demultiplexing API for I/O handles:

int select (int width,                // Maximum handle plus 1
            fd_set *read_fds,         // Set of "read" handles
            fd_set *write_fds,        // Set of "write" handles
            fd_set *except_fds,       // Set of "exception" handles
            struct timeval *timeout); // Time to wait for events

fd_set is a structure representing the set of handles to check for I/O events, such as ready-for-reading, ready-for-writing, or exception events. select() modifies each fd_set depending on which handles are active, as follows: If a handle is not set in the fd_set, it is ignored & select() leaves it cleared in the fd_set. If a handle is set, select() determines whether it has pending events; if so, the handle remains set in the appropriate fd_set, else it is cleared in the returned fd_set.
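A typical use of select() looks like this (illustrative sketch; wait_for_readable() is our own helper name):

```cpp
#include <sys/select.h>
#include <unistd.h>

// Wait until <handle> is ready for reading, or until the timeout
// expires. Returns the number of active handles (0 on timeout),
// or -1 on error.
int wait_for_readable (int handle, long timeout_sec) {
  fd_set read_fds;
  FD_ZERO (&read_fds);          // start with an empty handle set
  FD_SET (handle, &read_fds);   // mark <handle> as interesting
  struct timeval timeout;
  timeout.tv_sec = timeout_sec;
  timeout.tv_usec = 0;
  // <width> must be the maximum handle value plus 1.
  int n = select (handle + 1, &read_fds, 0, 0, &timeout);
  // On return, FD_ISSET() reports whether <handle> stayed active.
  if (n > 0 && !FD_ISSET (handle, &read_fds))
    return -1;
  return n;
}
```

Note that because select() modifies its fd_set arguments in place, a server's event loop must rebuild (or copy) the handle set before every call, which is exactly the accidental complexity the ACE_Handle_Set class described later encapsulates.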
  • 108. Multiprocessing Mechanisms Multiprocessing mechanisms include the features provided by the OS for creating & managing the execution of multiple processes, e.g.: Process lifetime operations, such as fork() & exec*() on POSIX & CreateProcess() on Win32, create a new process address space for programs to run. The initiating process can set command-line arguments, environment variables, & working directories for the new process. The new process can terminate voluntarily by reaching the end of its execution, or be involuntarily killed via signals (in POSIX) or TerminateProcess() (in Win32). Process synchronization operations are provided by the OS to retain the identity & exit status of a process & report it to the parent, e.g., POSIX wait() & waitpid(), & Win32 WaitForSingleObject() & WaitForMultipleObjects(). Process property operations are used to get/set process properties, such as default file access permissions, user identification, resource limits, scheduling priority, & current working directory.
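The POSIX lifetime & synchronization operations combine as follows (illustrative sketch; spawn_and_reap() is our own helper name):

```cpp
#include <sys/wait.h>
#include <unistd.h>

// fork() a child that terminates voluntarily with <child_status>;
// the parent synchronizes with it via waitpid() & retrieves the
// status. Returns the child's exit status, or -1 on error.
int spawn_and_reap (int child_status) {
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)                 // child: terminate voluntarily
    _exit (child_status);
  int status;                   // parent: retain identity & exit status
  if (waitpid (pid, &status, 0) != pid)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```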
  • 109. Multithreading Mechanisms (1/2) Multithreading mechanisms are provided by the OS to handle thread lifetime management, synchronization, priorities, & thread-specific storage. Thread lifetime operations include operations to create threads, e.g., pthread_create() (Pthreads) & CreateThread() (Win32). Thread termination is achieved in the following manner: Voluntarily, by reaching the end point of the thread entry function or calling pthread_exit() (Pthreads) or ExitThread() (Win32); Involuntarily, by being killed via a signal or an asynchronous thread cancellation operation, such as pthread_cancel() (Pthreads) or TerminateThread() (Win32). Thread synchronization operations allow created threads to be: Detached, where the OS reclaims storage used for the thread's state & exit status after it exits; or Joinable, where the OS retains the identity & exit status of a terminating thread so other threads can synchronize with it. Other operations allow threads to suspend & resume each other, or send signals to other threads.
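A joinable thread's lifetime looks like this in Pthreads (illustrative sketch; entry_point() & spawn_and_join() are our own names):

```cpp
#include <pthread.h>

// The thread terminates voluntarily by returning from its entry
// function; the returned pointer becomes its exit status.
static void *entry_point (void *arg) {
  return arg;
}

// Spawn a joinable thread, then synchronize with it via
// pthread_join() & read back its exit status.
int spawn_and_join () {
  pthread_t tid;
  int value = 7;
  if (pthread_create (&tid, 0, entry_point, &value) != 0)
    return -1;
  void *exit_status = 0;
  pthread_join (tid, &exit_status);  // blocks until <tid> terminates
  return *static_cast<int *> (exit_status);
}
```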
  • 110. Multithreading Mechanisms (2/2) Thread property operations include operations to set & get thread properties, such as priority & scheduling class. Thread-specific storage (TSS) is similar to global data except that the data is global only in the scope of the executing thread. Each thread has its own copy of the TSS data, e.g., errno. Each TSS item has a key that is global to all threads within a process. A thread uses this key to access its copy of the TSS data. Keys are created by factory functions, such as pthread_key_create() (Pthreads) or TlsAlloc() (Win32). Key/pointer relationships are managed by TSS set/get functions, such as pthread_setspecific() & pthread_getspecific() (Pthreads) or TlsSetValue() & TlsGetValue() (Win32).
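The key-based TSS functions combine as follows (illustrative Pthreads sketch; tss_demo() & thread_fn() are our own names):

```cpp
#include <pthread.h>

// One key, global to all threads in the process...
static pthread_key_t tss_key;
static pthread_once_t key_once = PTHREAD_ONCE_INIT;
static void make_key () { pthread_key_create (&tss_key, 0); }

// ...but each thread stores & reads back its own private value.
static void *thread_fn (void *arg) {
  pthread_once (&key_once, make_key);    // create the key exactly once
  pthread_setspecific (tss_key, arg);    // this thread's private copy
  return pthread_getspecific (tss_key);  // reads back its own value
}

// Returns 0 if each thread saw only its own value under the shared key.
int tss_demo () {
  pthread_once (&key_once, make_key);
  int a = 1, b = 2;
  pthread_t t;
  if (pthread_create (&t, 0, thread_fn, &b) != 0)
    return -1;
  pthread_setspecific (tss_key, &a);     // main thread's copy
  void *other = 0;
  pthread_join (t, &other);
  return (*static_cast<int *> (pthread_getspecific (tss_key)) == 1
          && *static_cast<int *> (other) == 2) ? 0 : -1;
}
```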
  • 111. Synchronization Mechanisms (1/2) Synchronization mechanisms allow processes & threads to coordinate their execution order & the order in which they access shared resources, such as files, network devices, database records, & shared memory. Mutexes serialize the execution of multiple threads by defining a critical section of code that can be executed by only one thread at a time. A thread owning a mutex must release it. There are two kinds of mutexes: Nonrecursive mutexes, which will deadlock or fail if the thread currently owning the mutex tries to reacquire it without first releasing it; & Recursive mutexes, which allow the thread owning the mutex to reacquire it without deadlocking. The owner thread is responsible for releasing a recursive mutex the same number of times it has acquired it.
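The difference between the two mutex kinds can be demonstrated with a Pthreads recursive mutex (illustrative sketch; assumes a platform supporting PTHREAD_MUTEX_RECURSIVE):

```cpp
#include <pthread.h>

// A recursive mutex lets its owner reacquire it, but the owner must
// release it once per acquisition. A nonrecursive mutex would
// deadlock or fail on the second lock below. Returns 0 on success.
int recursive_mutex_demo () {
  pthread_mutexattr_t attr;
  pthread_mutexattr_init (&attr);
  pthread_mutexattr_settype (&attr, PTHREAD_MUTEX_RECURSIVE);
  pthread_mutex_t mutex;
  pthread_mutex_init (&mutex, &attr);
  pthread_mutex_lock (&mutex);
  if (pthread_mutex_lock (&mutex) != 0)  // reacquire: OK for recursive
    return -1;
  pthread_mutex_unlock (&mutex);         // release once per acquisition
  pthread_mutex_unlock (&mutex);
  pthread_mutex_destroy (&mutex);
  pthread_mutexattr_destroy (&attr);
  return 0;
}
```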
  • 112. Synchronization Mechanisms (2/2) Readers/writer locks allow access to a shared resource by either multiple threads simultaneously having read-only access or only one thread at a time having read-write access. They help improve performance for applications where resources are read more often than modified. They can be implemented to give preference to either the readers or the writer. A semaphore is a non-negative integer that can be incremented & decremented atomically. A thread blocks when it tries to decrement a semaphore whose value is 0. A blocked thread makes progress when another thread increments the value of the semaphore. Semaphores are usually implemented using sleep locks, which trigger a context switch. Condition variables allow a thread to coordinate & schedule its own processing. A thread can wait on complex expressions to attain a desired state. Condition variables are used to build higher-level patterns, such as Active Object & Monitor Object.
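A readers/writer lock behaves as follows in Pthreads (illustrative single-thread sketch; rwlock_demo() is our own name):

```cpp
#include <pthread.h>

// Two read locks can be held simultaneously, while a write lock is
// exclusive. Returns 0 if the second concurrent read lock was
// admitted.
int rwlock_demo () {
  pthread_rwlock_t lock;
  pthread_rwlock_init (&lock, 0);
  pthread_rwlock_rdlock (&lock);                 // first reader
  int second = pthread_rwlock_tryrdlock (&lock); // second reader admitted
  pthread_rwlock_unlock (&lock);
  if (second == 0)
    pthread_rwlock_unlock (&lock);
  pthread_rwlock_wrlock (&lock);                 // writer: exclusive access
  pthread_rwlock_unlock (&lock);
  pthread_rwlock_destroy (&lock);
  return second;
}
```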
  • 113. Sidebar: Evaluating Synchronization Mechanisms The performance of synchronization mechanisms depends on the OS implementation & hardware. Some general issues to keep in mind: Condition variables & semaphores generally have higher overhead than mutexes. Native OS implementations usually perform better than emulated behavior. Mutexes versus readers/writer locks: mutexes generally have lower overhead than readers/writer locks; on a multiprocessor platform, however, readers/writer locks scale well, since multiple readers can execute in parallel. Nonrecursive mutexes are more efficient than recursive mutexes. Moreover, subtle errors can arise from using recursive mutexes due to a mismatch in the number of lock & unlock operations.
  • 114. Sidebar: ACE API Error Propagation Strategies Error reporting strategies usually differ across concurrency APIs & OS platforms, e.g., UI threads & Pthreads return 0 on success & a non-zero error number on failure, whereas Win32 returns 0 on failure & conveys the error value via thread-specific storage. This makes code non-portable & filled with accidental complexities. The ACE concurrency wrapper facades solve this problem by returning -1 on error & setting the errno variable in thread-specific storage.
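The ACE strategy can be illustrated with a hypothetical wrapper (not the actual ACE code) that normalizes a Pthreads-style return value into the -1/errno convention:

```cpp
#include <pthread.h>
#include <cerrno>

// Hypothetical wrapper illustrating the ACE error-propagation
// strategy: every failure is reported the same way, by returning -1
// & storing the error code in errno (which lives in thread-specific
// storage on modern platforms).
inline int portable_mutex_lock (pthread_mutex_t *m) {
  int result = pthread_mutex_lock (m); // Pthreads returns the error code
  if (result != 0) {
    errno = result;                    // normalize: error goes in errno...
    return -1;                         // ...& the return value is -1
  }
  return 0;
}
```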
  • 115. The ACE Event Demuxing Wrapper Facades The reactive server model can be thought of as "lightweight multitasking," where a single-threaded server communicates with multiple clients in a round-robin manner without introducing the overhead & complexity of threading & synchronization mechanisms. This server concurrency strategy uses an event loop that examines & reacts to events from its clients continuously. An event loop demultiplexes input from various event sources so they can be processed in an orderly way. Event sources in networked applications are primarily socket handles. The most popular event demultiplexing function is select(), which provides the basis for the ACE classes described below.
  • 116. The ACE_Handle_Set Class Motivation: The fd_set represents another source of accidental complexity in the following areas: The macros supplied to manipulate & scan an fd_set must be used carefully to avoid processing handles that aren't active & to avoid corrupting an fd_set. The code to scan for active handles is often a hot spot since it executes continually in a tight loop. The fd_set is defined in system-supplied header files, whose representation is exposed to programmers. There are subtle nonportable aspects of fd_set when used in conjunction with select(). Class Capabilities: The ACE_Handle_Set class uses the Wrapper Façade pattern to encapsulate fd_sets & provide the following capabilities: It enhances the portability, ease of use, & type safety of event-driven applications by simplifying the use of fd_set & select() across OS platforms. It tracks & adjusts the fd_set size-related values automatically as handles are added & removed. ACE_Handle_Set_Iterator provides an optimized iterator for ACE_Handle_Set.
  • 117. Using the ACE_Handle_Set (1/3)

class Reactive_Logging_Server : public Iterative_Logging_Server {
protected:
  // Keeps track of the acceptor socket handle and all the
  // connected stream socket handles.
  ACE_Handle_Set master_handle_set_;

  // Keep track of handles marked as active by select().
  ACE_Handle_Set active_read_handles_;

  typedef Logging_Server PARENT;

  // Other methods shown below...
};

The association between each connected peer & its log file is maintained in a hash map.
  • 126. Implementing the Reactive Logging Server (3/6)

virtual int open (u_short logger_port) {
  PARENT::open (logger_port);
  master_handle_set_.set_bit (acceptor ().get_handle ());
  acceptor ().enable (ACE_NONBLOCK);
  return 0;
}

virtual int wait_for_multiple_events () {
  active_read_handles_ = master_handle_set_;
  int width = (int) active_read_handles_.max_set () + 1;
  return ACE::select (width, active_read_handles_);
}

The open() method is similar to the previous one. Note the use of ACE::select(), which calls active_read_handles_.sync() automatically.
  • 127. Implementing the Reactive Logging Server (4/6)

virtual int handle_connections () {
  ACE_SOCK_Stream logging_peer;
  while (acceptor ().accept (logging_peer) != -1) {
    ACE_FILE_IO *log_file = new ACE_FILE_IO;
    // Use the client's hostname as the logfile name.
    make_log_file (*log_file, &logging_peer);
    // Add the new logging_peer's handle to the map and
    // to the set of handles we