WL#7755: mysqlpump: Extend mysqldump functionalities

Affects: Server-5.7   —   Status: Complete

"mysqlpump" is a new open source database utility providing built-from-scratch reimplementation of mysqldump functionality. This utilities main feature is speed and extendability. Parallelizing the operation of dumping multiple databases, objects inside databases, would speedup the dump process. This tool is not based on mysqldump, this tool is developed from scratch and is not compatible with mysqldump. It does not implement all the functionalities provided by mysqldump.

The new features and optimizations mysqlpump will implement are:

  1. get the basic functionality of mysqldump,
  2. allow compression of the output,
  3. make creating secondary indexes in InnoDB faster by adding them after rows are inserted,
  4. allow parallel processing of schemas to speed up the dump process,
  5. allow parallel processing of objects within schemas to speed up the dump process,
  6. allow more flexibility in which data to dump (views, procedures, table DDL, table rows, privileges),
  7. allow watching progress,
  8. allow use of the dump functionality as a library by other tools, both MySQL client tools and tools created by external clients.

This task will partially solve 2 other WLs and fix 4 bugs related to mysqldump.

By DB object we mean any of: database, table, view, trigger, privilege, Event Scheduler event, stored procedure or function, user (with user privileges), or row.

Functional Requirements:

  1. The new tool should be backward-compatible with all existing command line options.
  2. The user can change multi-threading behaviour with the following options:
    • --parallel-schemas=[N:]<list of schema names separated with ','> - Process tables in the specified schemas using a separate queue handled by --default-parallelism threads, or by N threads if N is specified. N must be a positive number. Can be used multiple times to specify more queues.
    • --default-parallelism=<N> - Specifies the number of threads to process each parallel queue for values N > 0. If N is 0 then no queue will be used. The default value is to be set after performance tests. If N > 1 then objects in the dump file can have their lines interleaved. Use of values greater than 1 is mutually exclusive with --single-transaction.
  3. In addition to the filtering possibilities of the existing mysqldump, in the new version the user can specify which DB object types to dump and, within each type, which objects to include or exclude:
    • --include-<object type>=<list> - Comma-separated list of DB objects to include in the dump. If no --exclude-<object type> is used then all other objects will not be dumped. If --exclude-<object type> is used, then all objects that are on the exclude list and not on the include list will be ignored. Names can contain the _%* pattern characters. Valid <object type>s: tables, databases, views, triggers, events, procedures, users.
    • --exclude-<object type>=<list> - Comma-separated list of DB objects to exclude from the dump. All other objects will be dumped. Names can contain the _%* pattern characters. All objects in all excluded databases will be ignored. Valid <object type>s: tables, databases, views, triggers, events, procedures, users.
    • --users - Dump all users in CREATE USER syntax and their privileges in GRANT format.
  4. Allow saving results in multiple files: dump all DB objects to one main file and DML to a pool of other files, and allow compressing all output files.
    • --result-file=file - Direct all output generated for all objects to a given file.
    • --compress-output=algorithm - Compresses all output files with the given compression algorithm. Available algorithms are "lz4" and "zlib".
  5. Allow watching dump progress:
    • --watch-progress - Periodically shows progress information on the error output. Progress information includes both completed and total numbers of tables, rows and other objects collected.
  6. New mysqlpump options:
    • --add-drop-user - Add a DROP USER before each CREATE USER.
    • --users - Dump users with their privileges in GRANT format.
    • --compress-output - Compresses output files.
    • --defer-table-indexes - Create indexes for tables after all rows are dumped.
    • --exclude-databases - Specifies comma-separated list of databases to exclude.
    • --exclude-events - Specifies comma-separated list of events to exclude.
    • --exclude-routines - Specifies comma-separated list of routines to exclude.
    • --exclude-tables - Specifies comma-separated list of tables to exclude.
    • --exclude-triggers - Specifies comma-separated list of triggers to exclude.
    • --exclude-users - Specifies comma-separated list of users to exclude.
    • --include-databases - Specifies comma-separated list of databases to include.
    • --include-events - Specifies comma-separated list of events to include.
    • --include-routines - Specifies comma-separated list of routines to include.
    • --include-tables - Specifies comma-separated list of tables to include.
    • --include-triggers - Specifies comma-separated list of triggers to include.
    • --include-users - Specifies comma-separated list of users to include.
    • --default-parallelism - mysqlpump divides dump steps into several tasks and adds these tasks to a thread-safe queue. This queue can be processed by several threads to improve performance. --default-parallelism specifies the number of threads to be created for this queue.
    • --parallel-schemas - This option allows specifying databases to be processed in a separate queue, along with how many threads should be created for that queue.
    • --watch-progress - Displays progress information of the dump process.
    • --skip-definer - Skip DEFINER and SQL SECURITY clauses for views and stored routines (procedures and functions).
    • --skip-dump-rows - Skips rows for each table being dumped. This option is the same as mysqldump's --no-data.

Non-functional Requirements:

  1. A single-threaded configuration must not run at less than 90% of the speed of the current implementation.
  2. When using --default-parallelism all DB objects are processed by the specified number of threads. Each thread should have its own connection to MySQL, independent of which chain element is using it.
  3. With --single-transaction exactly one connection to the MySQL server can be created in the dump process. In other cases more than one connection can be created, in contrast to the existing mysqldump.

Basic division

Mysqlpump will be divided into two parts:

  • libmysqlpump - the library with the module implementations,
  • mysqlpump - the client application; it will parse options, create module chains using libmysqlpump, run the dump operation and watch progress.

Dump stages

By DB object we mean any of: database, table, view, trigger, privilege, Event Scheduler event, stored procedure or function, user (with user privileges), or row.

This worklog divides the dump process into several stages. Each stage will have an API interface and will allow the use of exchangeable modules so that users can achieve the desired effects.

The dump process will be divided into the following stages:

  • Crawler - lists all database objects (tables, views, procedures, and so on).
  • Chain Maker - builds the chain of modules to use for a specific object found by the Crawler; this includes object filtering, generating no chain for filtered-out items.
  • Object Reader - gets abstract information about the DB object to dump.
  • Formatter - produces the string representation of the object to dump based on the abstract information.
  • Output Writer - writes string representations to the output destination.

We say that module A is a wrapper on module B if module A's main work is to process data and redirect the results to module B. All modules in all stages but the last (Output Writer), unless stated otherwise, are wrappers on a module from the next stage. Within a stage it is also possible to make wrappers on a module from the same stage, to allow, for example, queueing or filtering. A list of module instances, each wrapping the next one on the list, results in a chain of tools, as sketched below. The possibility to build customized chains makes the whole dump process more flexible.
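
To make the wrapper idea concrete, here is a minimal C++ sketch (all names are illustrative and not part of the proposed API): a wrapper implements the same interface as the module it wraps, does its own processing, and forwards the result to the wrapped module.

#include <string>

// Toy stand-in for any stage interface (illustrative only).
class I_module {
public:
  virtual ~I_module() = default;
  virtual void process(const std::string &data) = 0;
};

// Last element of a chain: does not wrap anything else.
class Sink : public I_module {
public:
  void process(const std::string & /*data*/) override { /* write somewhere */ }
};

// Wrapper: its main work is to process data and redirect the result
// to the wrapped module (here, a module of the same type).
class Wrapper : public I_module {
public:
  explicit Wrapper(I_module *next) : m_next(next) {}
  void process(const std::string &data) override {
    // ... this module's own work on data would happen here ...
    m_next->process(data);  // forward to the wrapped module
  }
private:
  I_module *m_next;  // next element of the chain
};

A chain is then just a list of such instances, each constructed with a pointer to the next one.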

Modules

Modules are independent parts of code that expose an interface. The stages naturally define the module types, and each module type has its own API. We define one more class of modules, which is optional in the dump process:

  • Progress Watcher - gathers information from the Crawler and Object Readers to provide a string representation of the progress of the dump process.

Modules: Implementations

We introduce the following modules:

Crawlers:

  • MySQL Crawler - searches and enumerates DB objects using a connection to MySQL.

Chain Makers:

  • Crawler Filter - a wrapper on another Chain Maker; filters out unwanted objects by name and/or type, as configured.
  • Mysqldump Tool Chain Maker - the chain maker implemented in the client application; constructs chains based on command-line options that are compatible with those available in the current implementation.

Object Readers:

  • MySQL Object Reader - retrieves and parses data of any DB object using a connection to the MySQL server.
  • Object Queue - a wrapper on another Object Reader; adds all objects to be read to a queue. Allows a specified number of threads to dequeue and process objects.

Formatters:

  • SQL Formatter - prints object data in SQL format.
  • XML Formatter - prints object data in XML format.
  • Object Formatting Queue - a wrapper on another Formatter; adds all objects to be formatted to a queue. Allows a specified number of threads to dequeue and process rows.

Output Writers:

  • File Writer - writes formatted data to specified file.
  • File Pool Writer - writes formatted data to specified group of files, randomly choosing one for each write.
  • Standard Writer - writes formatted data to standard output or error stream.
  • Compression Writer - a wrapper on another Output Writer; compresses the formatted data stream.

Progress Watchers:

  • Standard Progress Watcher - gathers information about the progress of the current dump and formats progress messages. It should also expose an API for receiving the processed progress information: collected object and row counts along with time elapsed and ETA.

Modules: File formats

We introduce the following file formats:

  • SQL format - text format with SQL queries. This is the current output format of the mysqldump tool. It can be executed directly by the MySQL client.
  • XML format - XML-compatible format for exporting data. This is currently an optional output format of the mysqldump tool.

Option changes compared to existing tool

MyISAM-related optimization options will be turned on by default for all MyISAM tables and disabled for all other tables. These options include:

  • --disable-keys
  • --order-by-primary

It should be possible for users to create their own tools that use the mysqlpump library. Such tools can then create additional module implementations and use them in chains.

Example use case explained

Suppose a user has a MySQL Server instance with 3 schemas, db1, db2, db3, each containing multiple objects. Using the command-line options --parallel-schemas=3:db1 --parallel-schemas=db2,db3 --default-parallelism=2 would create 3 threads for db1, 2 threads for the queue processing db2 and db3, and 2 threads for the default queue handling remaining objects: a total of 7 additional threads, along with the main thread. Unless modified with --result-file, all output goes to the console output, independently of the number of threads.

Bugs and WLs solved by this task

Following bugs and WLs will be solved by this task:

  • BUG#11744834 (MYSQLDUMP SHOULD HAVE A EXCLUDE),
  • BUG#11746833 (ADD MYSQLDUMP OPTION TO DUMP PRIVILEGES IN GRANT FORMAT),
  • BUG#11757117 (MYSQLDUMP SHOULD HAVE FLAG TO DELAY CREATING INDEXES FOR INNODB PLUGIN RE),
  • BUG#13691768 (CREATE SECONDARY INDEXES AFTER INSERTING ROWS STATEMENTS IN MYSQLDUMP).

Following bugs and WLs will be partially solved by this task:

  • WL#390 (Options for better dump control for MySQLDUMP),
  • WL#1627 (Remove mysql_refresh from mysql clients).


Milestones Plans

Milestone 0:

  • Build a prototype to test the performance of different threading approaches.
  • Create module interfaces.
  • Divide mysqlpump into a client and a library.

Milestone 1 - working basic version:

  • Implement:
    • MySQL Crawler for tables,
    • simple static configuration Chain Maker for SQL Format,
    • MySQL Object Reader for tables,
    • Standard Writer,
    • SQL Formatter for tables and rows.
  • Test set of implemented modules to generate valid SQL tables dump.

Milestone 2 - full SQL Format version:

  • Implement:
    • full MySQL Crawler,
    • full MySQL Object Reader,
    • full SQL Formatter,
    • File Writer,
    • Simple Dependency Sorting Wrapper (may be needed).
  • Test set of implemented modules to generate valid SQL for all objects in DB.

Milestones Plans: Backward-Compatible plan

Milestone a3.1 - compatibility:

  • Implement:
    • XML Formatter,
    • CSV Formatter,
    • Mysqldump CommandLine Based Tool Chain Maker,
    • basic Crawler Filter.
  • Make sure to implement as many of the current mysqldump command-line options as possible - this will require adding parameters to existing modules.

Milestones Plans: Features

Milestone 4.1 - Multi-threading:

  • Implement:
    • Simple Thread-safe Queue,
    • Object Queue,
    • Object Formatting Queue.
  • Profile queue implementation for most obvious problems.
  • Test real life performance against old mysqldump.
    • Determine fastest default configuration.

Milestone 5 - Simple features:

  • Implement:
    • full Crawler Filter,
    • Compression Writer,
    • Standard Progress Watcher.
  • Make changes so that secondary indexes are created after row insertion.
  • Use Standard Progress Watcher in the mysqlpump client application.

For now the sequence of milestones to proceed with is: 0, 1, 2, a3.1, 4.1, 5.

Milestone for RC2

The scope of mysqlpump is to achieve the basic functionality of mysqldump and improve the performance of taking dumps for a default configuration. This new tool will provide the following:

  1. Object filtering options to include or exclude objects in the dump. Object types include databases, tables, views, triggers, events and procedures. The option to use is --exclude-<object_type>=name or --include-<object_type>=name.
  2. Concurrency at schema level, where objects inside each schema are added to a queue associated with that schema. This queue is then processed by multiple threads defined by the default configuration. This can be specified with --parallel-schemas.
  3. --default-parallelism=N - This option specifies the number of threads to process each queue for values N > 0. If N is 0 then no queue will be used. The default value is 2.
  4. Compression of output files with the LZ4 or zlib compression algorithm.
  5. Deferred index creation after data insertion. This greatly improves performance when the dump file is applied back. This can be specified with --defer-table-indexes.
  6. Progress watching. Progress information includes both completed and total numbers of tables, rows and other objects collected.
  7. Dump of user privileges.

Note: Achieving the basic functionality of mysqldump does not mean mimicking all of mysqldump's functionality. Existing options which will not be implemented as part of mysqlpump include: comments, force, verbose, compact, fields-terminated-by, fields-enclosed-by, fields-optionally-enclosed-by, fields-escaped-by, lines-terminated-by, xml, disable-keys, opt, quick, flush-logs, flush-privileges, lock-tables, no-autocommit, order-by-primary, where, ignore-error, tab, apply-slave-statements, delete-master-logs, dump-slave, include-master-host-port, master-data, set-gtid-purged, all-tablespaces, no-create-db, no-create-info, no-tablespaces, add-drop-trigger.

Modules

Modules: APIs

Modules: APIs: Interfaces of Objects

I_data_object: - interface for all DB objects.
  • uint64 get_id() - Returns a unique ID of this DB object. This helps progress watching across multiple parts of a chain during object processing (including queueing).
  • string get_schema() - Returns schema in which object is contained.
  • string get_name() - Returns name of object in schema.
I_dump_task: - interface for all individual dump process tasks.
  • I_data_object* get_related_db_object() - Returns the DB object this task was created for.
  • vector<I_dump_task*> get_dependencies() - Returns a list of all task objects that must complete within the dump process before execution of the current one can start.
Table: - class to represent single DB table.
  • Extends I_data_object.
  • uint64 get_row_count() - Returns number of rows contained in this table. This enables Progress Watchers to produce much more accurate predictions.

A single table object will not be dumped as a whole; it will be divided into 3 dump tasks to ease configuration.

Abstract_table_task: - abstract class for defining single DB table dump task.
  • Extends I_dump_task.
  • Table* get_related_table() - Returns the table the current task was created for, i.e. get_related_db_object() cast to the Table object type.
Table_definition_task: - class to represent single DB table creation task.
  • Extends Abstract_table_task.
Table_rows_task: - class to represent single DB table rows extraction task.
  • Extends Abstract_table_task.
Tables_row_completed_barrier_task: - class to represent a barrier placed after all table rows were successfully dumped. Its dependencies should include all Table_rows_tasks created in all schemas.
  • Extends I_dump_task.
Table_deferred_indexes_task: - class to represent creation task of deferred indexes of single DB table.
  • Extends Abstract_table_task.
Row: - class to represent a single row of a table and its data.
  • Extends I_data_object.
  • Table* get_source_table() - Returns a table this row is contained in.
  • vector<string> get_row_data_formatted() - Returns the string representations of the data of all fields.
  • vector<shared_ptr<void>> - Returns the raw data of all fields. This is not to be implemented at the start, as it requires effort to create converters for all data types.
Abstract_plain_sql_object: - abstract object carrying its definition in SQL formatted string only.
  • Extends I_data_object.
  • string get_sql_formatted_definition() - Returns SQL formatted object definition.
Abstract_plain_sql_object_task: - abstract task for dumping object carrying its definition in SQL formatted string only.
  • Extends Abstract_plain_sql_object.
  • Extends I_dump_task.
Database:
  • Extends Abstract_plain_sql_object
Abstract_database_task:
  • Extends I_dump_task
  • Database* get_related_database() - Returns the database the current task was created for, i.e. get_related_db_object() cast to the Database object type.
Database_start_task:
  • Extends Abstract_database_task
Database_end_task:
  • Extends Abstract_database_task
View:
  • Extends Abstract_plain_sql_object_task.
Privilege:
  • Extends Abstract_plain_sql_object_task.
Event_scheduler_event:
  • Extends Abstract_plain_sql_object_task.
Stored_procedure:
  • Extends Abstract_plain_sql_object_task.
struct Chain_processing_item:
  • uint64 get_chain_id() - Returns ID of chain being processed.
  • I_dump_task* get_object() - Returns the instance of the DB object task that is being processed.
  • I_chain_element* get_chain_element() - Returns the instance of the chain element that is processing the specified element.
struct Chain_data:
  • uint64 get_chain_id() - Returns ID of chain being processed.
  • I_callable<void, Chain_processing_item> get_completion_callback() - Returns callback to call after element is fully processed to the output. Can be NULL. I_callable is defined in WL#7308.
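
A rough C++ rendering of the core object and task interfaces listed above may help visualize the hierarchy. Method names follow the list; return types, const-ness and everything else are assumptions of this sketch, not the final API.

#include <cstdint>
#include <string>
#include <vector>

// Interface for all DB objects.
class I_data_object {
public:
  virtual ~I_data_object() = default;
  virtual uint64_t get_id() const = 0;         // unique ID used for progress watching
  virtual std::string get_schema() const = 0;  // schema containing the object
  virtual std::string get_name() const = 0;    // object name within the schema
};

// Interface for all individual dump tasks.
class I_dump_task {
public:
  virtual ~I_dump_task() = default;
  virtual I_data_object *get_related_db_object() const = 0;
  // Tasks that must complete before this one may start.
  virtual std::vector<I_dump_task *> get_dependencies() const = 0;
};

// Single DB table; row count lets Progress Watchers make better estimates.
class Table : public I_data_object {
public:
  virtual uint64_t get_row_count() const = 0;
};

// Abstract task bound to a single table.
class Abstract_table_task : public I_dump_task {
public:
  Table *get_related_table() const {
    // By convention, a table task's related DB object is a Table.
    return static_cast<Table *>(get_related_db_object());
  }
};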

Modules: APIs: Interfaces of Modules

I_chain_element:
  • uint64 get_id() - Returns an application-unique ID of this chain element object. This helps progress watching across multiple parts of a chain during the processing of all objects.
  • void abort(uint64 chain_id) - Aborts execution of the specified chain at the first opportunity. This should abort the specified chain's execution in all child chain elements.
  • void task_completion_in_child_callback(uint64 chain_id, I_dump_task* task_processed) - A child can request this callback to be called for any object it processes. It will be called when the task object's processing has completed. Note that this function may be called from multiple threads, so all implementations must be thread-safe.
I_progress_reporter:
  • void register_progress_watcher(I_progress_watcher* new_progress_watcher) - Add new Progress Watcher to report to.
I_crawler:
  • Extends I_chain_element.
  • Extends I_progress_reporter.
  • void enumerate_objects(vector<I_callable<bool, Message_data>*> message_handlers) - Enumerates all objects it can access, gets chains from all registered Chain Makers for each object and then executes each chain.
  • void register_chain_maker(I_chain_maker* new_chain_maker) - Adds new Chain Maker to ask for chains for found objects.
I_chain_maker:
  • Extends I_chain_element.
  • I_object_reader* create_chain(uint64 chain_id, I_dump_task* task_object) - Creates a new chain for the specified DB object dump task. May return null if it does not want to process the specified object.
I_object_reader:
  • Extends I_chain_element.
  • Extends I_progress_reporter.
  • void read_object(Chain_data chain_data, I_dump_task* object_to_read) - Reads all information on specified DB object.
  • void register_formatter(I_data_formatter* new_data_formatter) - Add new Data Formatter to supply acquired data of objects to.
I_data_formatter:
  • Extends I_chain_element.
  • Extends I_progress_reporter.
  • void format_object(Chain_data chain_data, I_dump_task* object_to_format, bool begin_declaration) - Creates the string representation of the DB object specified by the dump task. It will be called twice for each object: first with begin_declaration=true, then all child items will be processed, and then again with begin_declaration=false.
  • void register_output_writer(I_output_writer* new_output_writer) - Add new Output Writer to supply formatted strings to.
I_output_writer:
  • Extends I_chain_element.
  • Extends I_progress_reporter.
  • void append(string data_to_append) - Adds a new block of data atomically to the output. Atomicity assures that the specified block will be added to the output as one piece and will not be divided or interleaved with other data.
I_progress_watcher:
  • Extends I_chain_element.
  • void new_object_created(Chain_processing_item new_object_creator) - Reports a new non-empty chain being created by a Chain Maker, or a new row fetched from a table by a Table Reader. Called from the Crawler or a Table Reader.
  • void object_processing_started(Chain_processing_item process_data) - Reports that a new object (table, row or any other) started being processed by the specified Object Reader, Table Reader, Formatter or Row Formatter. Reported by these types. Queues report this on dequeue, not on enqueue.
  • void object_processing_ended(Chain_processing_item finished_process_data) - Reports that an object (table, row or any other) finished being processed. In the case of a table, this does not necessarily mean that all rows were processed, nor that the object was successfully written by the Output Writers.
  • void crawler_completed(I_crawler* crawler) - Reports that the crawler finished enumerating objects and creating chains for them.
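
The sketch below shows how a few of these module interfaces could fit together in C++, using a hypothetical File Writer to illustrate the atomicity requirement of append(). Only a subset of the methods listed above is included, and the mutex is just one possible way to meet the requirement, not a mandated implementation.

#include <cstdint>
#include <fstream>
#include <mutex>
#include <string>

class I_progress_watcher;  // declared in the interface list above

// Common base of all chain elements (subset of the listed methods).
class I_chain_element {
public:
  virtual ~I_chain_element() = default;
  virtual uint64_t get_id() const = 0;        // application-unique element ID
  virtual void abort(uint64_t chain_id) = 0;  // abort one chain in this subtree
};

// Implemented by every element that reports progress.
class I_progress_reporter {
public:
  virtual ~I_progress_reporter() = default;
  virtual void register_progress_watcher(I_progress_watcher *watcher) = 0;
};

// Last stage of a chain.
class I_output_writer : public I_chain_element, public I_progress_reporter {
public:
  // Must append the whole block atomically (no interleaving between threads).
  virtual void append(const std::string &data_to_append) = 0;
};

// Illustrative File Writer: a mutex guarantees that blocks coming from
// several formatter threads are never divided or interleaved.
class File_writer_sketch : public I_output_writer {
public:
  explicit File_writer_sketch(const std::string &path)
      : m_out(path, std::ios::binary) {}

  void append(const std::string &data_to_append) override {
    std::lock_guard<std::mutex> guard(m_mutex);
    // Using size() explicitly keeps embedded '\0' bytes intact.
    m_out.write(data_to_append.data(),
                static_cast<std::streamsize>(data_to_append.size()));
  }

  uint64_t get_id() const override { return 1; }  // placeholder ID
  void abort(uint64_t /*chain_id*/) override {}   // nothing chain-specific to abort
  void register_progress_watcher(I_progress_watcher * /*watcher*/) override {}

private:
  std::mutex m_mutex;
  std::ofstream m_out;
};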

Modules: APIs: Error handling

Each module should be able to acquire a list of error (message) handlers at construction.

struct Message_data:
  • uint64 code - Arbitrary message code, in case of MySQL error it can be error code from MySQL server.
  • string message - Formatted user-readable message.
  • Message_type message_type - Type of message.

Message_type is an enum of: Info, Note, Warning, Error. A message handler is of type I_callable<bool, Message_data>. The handler returns a bool: true means the message was consumed, i.e. it will not be passed to the remaining handlers. Handlers are called from the most recently registered to the oldest, i.e. using a reverse iterator over the vector<>.
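
The handler dispatch could be shaped roughly as below; std::function stands in for the I_callable type referenced above, and the helper name is illustrative.

#include <cstdint>
#include <functional>
#include <string>
#include <vector>

enum class Message_type { Info, Note, Warning, Error };

struct Message_data {
  uint64_t code;             // e.g. an error code from the MySQL server
  std::string message;       // formatted, user-readable text
  Message_type message_type;
};

// A handler returns true when it consumes the message.
using Message_handler = std::function<bool(const Message_data &)>;

// Call handlers from the most recently registered to the oldest and stop
// as soon as one of them consumes the message.
inline void pass_message(const std::vector<Message_handler> &handlers,
                         const Message_data &message) {
  for (auto it = handlers.rbegin(); it != handlers.rend(); ++it) {
    if ((*it)(message)) return;  // consumed; do not pass to older handlers
  }
}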

Modules: MySQL connection management

While creating a chain it is not known which object-processing thread will execute that chain, so all chain elements that use a connection to the MySQL server must not acquire it while the chain is being constructed. It is also possible (and optimal) to reuse chain elements in different chains, which leads to parallel usage of a given chain element in several threads. Providing a new connection to each new element would be inefficient and would prevent use of the --single-transaction option. To address this issue, all modules that need a connection to the MySQL server will acquire an instance of I_connection_provider, which is similar to I_connection_factory but does not have to create a new connection on each call. A connection acquired from an I_connection_provider must not be reused while processing an element other than the one for which it was acquired. Two implementations of I_connection_provider are foreseen:

  • Single_transaction_connection_provider - will create only one singleton connection to the MySQL server and will check that all acquisitions are done from the same thread.
  • Thread_specific_connection_provider - will create one connection to the MySQL server per thread that acquires a connection.
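
A minimal sketch of the thread-specific provider is given below. The Mysql_connection class and the factory callback are assumptions made for the example; they stand in for whatever connection abstraction the library ends up using.

#include <functional>
#include <memory>

// Hypothetical stand-in for the real connection object (e.g. a MYSQL* wrapper).
class Mysql_connection {};

// Modules ask a provider for a connection instead of owning one themselves.
class I_connection_provider {
public:
  virtual ~I_connection_provider() = default;
  virtual Mysql_connection *get_connection() = 0;
};

// One connection per thread: repeated calls from the same thread reuse that
// thread's connection, different threads get different connections.
class Thread_specific_connection_provider : public I_connection_provider {
public:
  explicit Thread_specific_connection_provider(
      std::function<std::unique_ptr<Mysql_connection>()> factory)
      : m_factory(std::move(factory)) {}

  Mysql_connection *get_connection() override {
    // Note: thread_local here is shared across provider instances; a real
    // implementation would key the storage per provider as well.
    thread_local std::unique_ptr<Mysql_connection> connection;
    if (!connection) connection = m_factory();  // lazily create per thread
    return connection.get();
  }

private:
  std::function<std::unique_ptr<Mysql_connection>()> m_factory;
};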

Modules: Implementation details

Below are additional development notes for the individual module implementations. They should contain all non-obvious functional and implementation details to keep in mind during implementation:

  • MySQL Crawler:
    • WL#1627: Do not use mysql_refresh(), issue "FLUSH ..." queries.
    • WL#2386, WL#1587: Use UTF-8 encoding when operating on object names.
  • Crawler Filter:
    • WL#390: It should allow configuring the default behaviour for each type of DB object (process or filter out all objects) and a list of exceptions. The list of exceptions can be set to be interpreted as a set of regular expressions. Filtering out a whole database filters out all objects contained in it.
  • Mysqldump Tool Chain Maker
    • BUG#11744834: Allow usage of --exclude-databases when --all-databases is specified.
    • Remember to compress first and then encrypt if both options are specified.
  • Dependency Sorting Wrapper:
    • It should make use of WL#6359 to implement correct ordering of objects to process.
  • Simple Dependency Sorting Wrapper:
    • It should parse SQL data or use parsed data from I_data_objects. At the start it will probably not support all possible dependencies, as that would require significant effort to parse all DB objects deeply enough.
    • It must delay execution of all objects that have dependencies until all of those dependencies have finished processing, using the completion_callback in Chain_data.
  • MySQL Object Reader:
    • We will need a parser generator to handle parsing the parts of SQL that are needed to build instances of I_data_object. Do not write the parser by hand.
    • WL#2386, WL#1587: Use UTF-8 encoding when operating on object data.
    • Allow usage of option --set-storage-engine.
    • Handle auto-increment value correctly.
  • Object Queue, Object Formatting Queue:
    • These modules will use one of the Thread-safe Queue implementations to process elements. It should be possible to instruct instances to use non-synchronized versions of enqueue() and dequeue() in cases where the Chain Maker is sure there will be only one writer or reader, to reduce overhead.
  • SQL Formatter:
    • BUG#11757117, BUG#13691768: creation of indexes should be delayed to the point where all rows have already been added to the table.
    • BUG#11746833: Dump privileges in "GRANT" format instead of dumping privileges tables.
    • WL#2386, WL#1587: Make sure that the different encodings of strings in row fields are exported properly.
  • XML Formatter
  • CSV Formatter
  • File Writer:
    • This module must handle binary data (which may include null characters) when operating on strings - the value of string.size() is significant.
  • Compression Writer:
    • This module must handle binary data (which may include null characters) when operating on strings - the value of string.size() is significant.
    • This should use a very fast compression algorithm; LZ4 has been chosen to perform the compression.
  • Standard Progress Watcher:
    enum Data_object_type - enumeration of types of DB objects.
    struct Data_object_type_progress: - structure to represent progress of single DB object type processing.
    • Data_object_type object_type - Indicates type of objects this structure refer to.
    • uint64 objects_found - Indicates number of objects found so far.
    • bool all_objects_found - Indicates whether all objects have already been found, that is, whether objects_found will no longer change.
    • uint64 objects_started_processing - Indicates the number of objects that have started being processed. This includes objects that have already finished processing. We have: objects_found >= objects_started_processing >= objects_finished_processing.
    • uint64 objects_finished_processing - Indicates number of objects that finished processing.
    Abstract_aggregating_progress_watcher: - Aggregates incoming progress data into a more processed form.
    • It must be thread-safe. It shouldn't use any locks in the methods that implement I_progress_watcher, as that would degrade performance greatly.
    • Estimated time to complete and percentage of progress should be based on the number of elements to process and how fast each element type is processed. Evaluating how fast elements are processed must not be done by measuring each item's processing time. It must be derived from the number of items processed between consecutive calls to the methods that use this data. The current processing speed should be calculated as an SMA (Simple Moving Average) with a window size expressed in the number of calls to these methods (or preferably expressed in seconds, but this can be more challenging), configurable via a constructor parameter; see the sketch after this list.
    Standard_progress_watcher: - Formats aggregated data and passes them to registered Output Writers.
    • Extends Abstract_aggregating_progress_watcher.
    • void register_output_writer(I_output_writer* new_output_writer) - Add new Output Writer to supply formatted progress information strings to.
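
As an illustration of the SMA-based speed estimate described for Abstract_aggregating_progress_watcher, the sketch below keeps a moving window over the number of items processed between consecutive calls. Class and member names are assumptions, not part of the specification.

#include <cstddef>
#include <cstdint>
#include <deque>

// Tracks how many items were processed between consecutive calls and keeps a
// Simple Moving Average over a fixed number of recent calls.
class Processing_speed_estimator {
public:
  explicit Processing_speed_estimator(std::size_t window_size)
      : m_window_size(window_size) {}

  // Call periodically with the cumulative number of items processed so far.
  void update(uint64_t total_items_processed) {
    uint64_t delta = total_items_processed - m_last_total;
    m_last_total = total_items_processed;

    m_window.push_back(delta);
    m_window_sum += delta;
    if (m_window.size() > m_window_size) {  // keep only the most recent calls
      m_window_sum -= m_window.front();
      m_window.pop_front();
    }
  }

  // Average number of items processed per call within the window; combined
  // with the number of items left, this gives a rough ETA.
  double items_per_call() const {
    return m_window.empty()
               ? 0.0
               : static_cast<double>(m_window_sum) / m_window.size();
  }

private:
  std::size_t m_window_size;
  uint64_t m_last_total = 0;
  uint64_t m_window_sum = 0;
  std::deque<uint64_t> m_window;
};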

Thread-safe Queue implementation

This thread-safe queue implementation will be used internally in the Object Queue and the Object Formatting Queue. It should be strictly typed, so it will introduce a template parameter T_type. It should expose a simple interface, I_queue, with the following methods (sketched in C++ after the list):

  • void enqueue(T_type new_value) - adds a new item to the end of the queue.
  • T_type dequeue() - removes the item at the start of the queue and returns it.
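
In C++ terms the interface could look roughly like this (a sketch only):

// Strictly typed thread-safe queue interface; T_type is the element type.
template <typename T_type>
class I_queue {
public:
  virtual ~I_queue() = default;
  virtual void enqueue(T_type new_value) = 0;  // add to the end of the queue
  virtual T_type dequeue() = 0;                // remove and return the front item
};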

Blocking_queue

The main implementations will be based on a circular-buffer queue with a fixed size. This is enough for our use cases, but future queue implementations are not restricted to this algorithm. The basic blocking implementation will be Blocking_queue. It will have the following private members:

  • int m_queue_start; - index of the first (start) element of the queue. This is the element that will be dequeued next.
  • int m_queue_end; - index one past the last element of the queue. This is where the next element will be enqueued.
  • int m_queue_size; - allocated size of the queue. The maximum usable size of the queue is m_queue_size - 1, so that m_queue_start == m_queue_end is a valid empty condition.
  • vector<T_type> m_elements - array that serves as the buffer for the circular queue's elements.

Blocking_queue::enqueue(T_type new_value) should perform the following steps:

  • Check that the queue is not full (it is full when the next wrapped m_queue_end equals m_queue_start). Check this in a spin-loop, returning control to the OS scheduler each time the condition is not met.
  • Acquire the lock on the mutex.
  • Check again that the queue is not full. If this check fails, release the lock on the mutex and go back to the first step.
  • Fill m_elements[m_queue_end] with new_value.
  • Increase m_queue_end.
  • Wrap m_queue_end if needed: if m_queue_end is equal to m_queue_size then set m_queue_end to 0.
  • Release the lock on the mutex.

Blocking_queue::dequeue() should perform the following steps:

  • Check that the queue is not empty. Check this in a spin-loop, returning control to the OS scheduler each time the condition is not met.
  • Acquire the lock on the mutex.
  • Check again that the queue is not empty. If this check fails, release the lock on the mutex and go back to the first step.
  • Get the value of m_elements[m_queue_start].
  • Increase m_queue_start.
  • Wrap m_queue_start if needed: if m_queue_start is equal to m_queue_size then set m_queue_start to 0.
  • Release the lock on the mutex.
  • Return the value of the removed element.

The constructor for Blocking_queue will take the maximum size of the queue (not the allocated size) and will initialize the vector appropriately.
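
The following sketch follows the steps above (fixed-size circular buffer, unsynchronized spin followed by a re-check under the mutex). It is illustrative only; inheritance from I_queue is omitted, and a production version would make the unsynchronized pre-checks use atomics.

#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

template <typename T_type>
class Blocking_queue {
public:
  // maximum_size is the usable capacity; one extra slot keeps
  // m_queue_start == m_queue_end as a valid "empty" condition.
  explicit Blocking_queue(int maximum_size)
      : m_queue_size(maximum_size + 1),
        m_elements(static_cast<std::size_t>(maximum_size + 1)) {}

  void enqueue(T_type new_value) {
    for (;;) {
      // Spin (yielding to the OS scheduler) while the queue looks full.
      while (next(m_queue_end) == m_queue_start) std::this_thread::yield();

      std::lock_guard<std::mutex> guard(m_mutex);
      if (next(m_queue_end) == m_queue_start) continue;  // re-check under the lock

      m_elements[static_cast<std::size_t>(m_queue_end)] = new_value;
      m_queue_end = next(m_queue_end);  // increase and wrap in one step
      return;
    }
  }

  T_type dequeue() {
    for (;;) {
      // Spin (yielding to the OS scheduler) while the queue looks empty.
      while (m_queue_start == m_queue_end) std::this_thread::yield();

      std::lock_guard<std::mutex> guard(m_mutex);
      if (m_queue_start == m_queue_end) continue;  // re-check under the lock

      T_type value = m_elements[static_cast<std::size_t>(m_queue_start)];
      m_queue_start = next(m_queue_start);  // increase and wrap in one step
      return value;
    }
  }

private:
  int next(int index) const { return index + 1 == m_queue_size ? 0 : index + 1; }

  int m_queue_start = 0;  // index of the element dequeued next
  int m_queue_end = 0;    // index where the next element will be stored
  int m_queue_size;       // allocated size (usable capacity + 1)
  std::vector<T_type> m_elements;
  std::mutex m_mutex;
};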

Non_locking_queue

The CAS-based non-locking implementation will be Non_locking_queue. It is more complicated and may not be implemented in the first milestones. It is very optimistic about acquiring resources, which is not a problem: acquisition is not attempted every few CPU cycles but much less often, so the probability of collision is low.

It will have private members:

  • int m_queue_start; - index of the first (start) element of the queue. This is the element that will be dequeued next.
  • int m_queue_end_readers; - index one past the last element of the queue from the reader threads' perspective. Elements at or after this index are not yet available for reading.
  • int m_queue_end_writers; - index one past the last element of the queue from the writer threads' perspective. This is where the next element will be enqueued.
  • int m_queue_size; - allocated size of the queue. The maximum usable size of the queue is m_queue_size - 1, so that m_queue_start == m_queue_end_writers and m_queue_start == m_queue_end_readers are valid empty conditions.
  • vector<T_type> m_elements - array that serves as the buffer for the circular queue's elements.

Non_locking_queue::enqueue(T_type new_value) should perform the following steps:

  • Get m_queue_end_writers (store it as end).
  • Check that the queue is not full (it is full when the next wrapped end equals m_queue_start). If this check fails, return control to the OS scheduler and go back to the first step.
  • Set m_queue_end_writers to the next wrapped value next_end (m_queue_end_writers+1 != m_queue_size ? m_queue_end_writers+1 : 0) using CAS. This may fail if another thread has already reserved this entry; in that case go back to the first step.
  • Fill m_elements[end] with new_value.
  • Change m_queue_end_readers from end to next_end using CAS. This may fail; in that case repeat this step from the beginning. This thread is waiting for the threads that increased m_queue_end_writers before it did. To avoid starving other threads it should yield to the OS scheduler periodically (for example every 100 iterations).

Non_locking_queue::dequeue() should perform the following steps:

  • Get m_queue_start (store it as start).
  • Check that the queue is not empty. If this check fails, return control to the OS scheduler and go back to the first step.
  • Get m_elements[start] (store it as start_element).
  • Change m_queue_start from start to the next wrapped start using CAS. This may fail, which means another thread has already taken this element; in that case go back to the first step.
  • Return start_element.

The constructor for Non_locking_queue will take the maximum size of the queue (not the allocated size) and will initialize the vector appropriately.
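
A sketch of the CAS-based variant, mirroring the steps above with std::atomic, is shown below. Like the description, it is a simplified illustration rather than a rigorously verified lock-free queue, and the names are only assumptions.

#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

template <typename T_type>
class Non_locking_queue {
public:
  explicit Non_locking_queue(int maximum_size)
      : m_queue_size(maximum_size + 1),
        m_elements(static_cast<std::size_t>(maximum_size + 1)) {}

  void enqueue(T_type new_value) {
    for (;;) {
      int end = m_queue_end_writers.load();
      if (next(end) == m_queue_start.load()) {  // queue looks full
        std::this_thread::yield();
        continue;
      }
      // Reserve the slot: only one writer can advance end -> next(end).
      if (!m_queue_end_writers.compare_exchange_weak(end, next(end))) continue;

      m_elements[static_cast<std::size_t>(end)] = new_value;

      // Publish the slot to readers; wait for earlier writers to publish first.
      int expected = end;
      while (!m_queue_end_readers.compare_exchange_weak(expected, next(end))) {
        expected = end;
        std::this_thread::yield();  // avoid starving the writers ahead of us
      }
      return;
    }
  }

  T_type dequeue() {
    for (;;) {
      int start = m_queue_start.load();
      if (start == m_queue_end_readers.load()) {  // queue looks empty
        std::this_thread::yield();
        continue;
      }
      T_type value = m_elements[static_cast<std::size_t>(start)];
      // Claim the element; failure means another reader took it first.
      if (m_queue_start.compare_exchange_weak(start, next(start))) return value;
    }
  }

private:
  int next(int index) const { return index + 1 == m_queue_size ? 0 : index + 1; }

  std::atomic<int> m_queue_start{0};        // next element to dequeue
  std::atomic<int> m_queue_end_readers{0};  // slots published to readers end here
  std::atomic<int> m_queue_end_writers{0};  // next slot a writer will reserve
  int m_queue_size;                         // allocated size (usable capacity + 1)
  std::vector<T_type> m_elements;
};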