• 1. Greenplum Database Fundamentals Training. Wei Li, Senior Data Architect, Alpine Solution, 2010/05
• 2. Greenplum Database: a Massively Parallel Processing (MPP) DBMS based on PostgreSQL 8.2 (GP 3.3.6.1 is based on 8.2.13). It keeps the same client-side functionality, adds the technology for parallel processing, and adds features for data warehousing and BI: external tables / parallel loading, resource management, query optimizer enhancements, table partitioning, compression and column storage.
  • 3. S1002 Network Configuration
• 4. Greenplum Basic System Architecture
• 5. Client interfaces and programs: psql, pgAdmin III, ODBC/DataDirect, JDBC, Perl DBI, Python, libpq, OLE DB
• 6. Master Host: the entry point to the system. Its database listener process (postgres) handles all user connections, builds query plans, and coordinates query execution; the management tools run here. The Master holds the system catalog tables and metadata (the data dictionary) but stores no user data.
• 7. Segment: each segment stores a portion of the user data, and a system can have many segments. Users cannot access segments directly; all access to a segment goes through the Master. On each segment a database listener process (postgres) accepts connections from the Master.
• 8. Interconnect: the connection layer between the Greenplum database instances, used for inter-process coordination and management. It is based on Gigabit Ethernet, configured as an internal private network, and supports two protocols: TCP or UDP.
• 9. Greenplum High-Availability Architecture
• 10. Master/Standby mirroring protection: the Standby node takes over Master duties when the Master node fails. The Standby keeps its catalog and transaction logs synchronized with the Master in real time.
• 11. Data redundancy: segment mirroring protection. Each segment's data is stored redundantly on another segment and synchronized in real time. When a primary segment fails, its mirror segment automatically takes over. After the primary segment recovers, run gprecoverseg -F to resynchronize its data. Since 4.0, recovery is automatic and synchronization works block by block, which performs better.
• 12. Table distribution strategy: the foundation of parallel computing. Hash distribution: CREATE TABLE … DISTRIBUTED BY (column [, …]); rows with the same key values are assigned to the same segment. Random distribution: CREATE TABLE … DISTRIBUTED RANDOMLY; rows with the same values do not necessarily land on the same segment. A sketch of both follows.
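A minimal sketch of the two policies (the table and column names are hypothetical, not from the slides):
    -- hash distribution: rows with the same cust_id always land on the same segment
    CREATE TABLE sales_by_cust (cust_id integer, amount numeric) DISTRIBUTED BY (cust_id);
    -- random distribution: rows are spread round-robin, with no co-location guarantee
    CREATE TABLE audit_log (event_time timestamp, detail text) DISTRIBUTED RANDOMLY;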
• 13. Query execution, illustrated with an example: the pros and cons of distributing by card number, customer number, or branch. Distributed storage.
• 14. Query execution
• 15. SQL query processing mechanism
• 16. Parallel query plan: SELECT customer, amount FROM sales JOIN customer USING (cust_id) WHERE date='2008-04-30'; (since 4.0, single-row operations are also supported)
• 17. The concept of table partitioning: a large table is logically divided into multiple parts, so a query that filters on the partition condition scans less data and runs faster. Partitioning speeds up queries over specific slices of the data and also simplifies database maintenance and updates. Two types: Range partitioning (date or numeric ranges, e.g. dates or prices) and List partitioning (e.g. regions or products). In Greenplum, partitioned tables use inheritance from the parent table, with CHECK constraints identifying each child table. The child partitions are still distributed across the segments according to the distribution policy. Partitioning is a very useful optimization: for example, with a year of transactions partitioned by trade date, a query over a single day can run up to 365 times faster. A sketch of the Range syntax follows.
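A sketch of the Range syntax in the Greenplum dialect (the table name and dates are hypothetical), one year of transactions partitioned by month:
    CREATE TABLE txn (
        txn_id   bigint,
        txn_date date,
        amount   numeric
    ) DISTRIBUTED BY (txn_id)
    PARTITION BY RANGE (txn_date)
    ( START (date '2009-01-01') INCLUSIVE
      END   (date '2010-01-01') EXCLUSIVE
      EVERY (INTERVAL '1 month') );  -- 12 child tables, each with a CHECK constraint on txn_date
A query such as SELECT sum(amount) FROM txn WHERE txn_date = '2009-03-15' then scans only the March partition.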
• 18. Data Distribution & Partitioning. (Diagram: monthly partitions Jan 2005 through Dec 2005 spread across segments 1A-3D.) Each partition's data is distributed evenly across all nodes; table partitioning narrows the data search range and improves query performance.
• 19. Full Table Scan vs. Partition Pruning. (Diagram: hash distribution alone vs. hash distribution plus table partitioning, for the query SELECT COUNT(*) FROM orders WHERE order_date >= 'Oct 20 2005' AND order_date < 'Oct 27 2005'. With hash distribution only, every segment scans the whole table; with partitioning added, each segment scans only the matching partitions.)
• 20. Table partitioning diagram
• 21. Compressed storage: ZLIB and QUICKLZ compression are supported, with compression ratios of up to 10:1. Compressed tables must be append-only. Compression does not necessarily hurt performance: compressed tables consume more CPU but less I/O. Syntax: CREATE TABLE foo (a int, b text) WITH (appendonly=true, compresstype=zlib, compresslevel=5);
• 22. Row and column storage: Greenplum supports row-oriented and column-oriented storage modes. Column mode currently requires append-only tables. If typical queries read only a few columns of a table, column mode is more efficient; if they read most of the columns, row mode is more efficient. Syntax: CREATE TABLE sales2 (LIKE sales) WITH (appendonly=true, orientation=column); Performance comparison. Test 1, reading all columns, where row storage is faster: select * from dw_ods.s1_sema_scmcaccp_row where crdacct_nbr='4033930000166380411'; 41 seconds; select * from dw_ods.s1_sema_scmcaccp_col where crdacct_nbr='4033930000166380411'; 116 seconds. Test 2, reading only a few columns, where column storage is faster: select crdacct_status from dw_ods.s1_sema_scmcaccp_row where crdacct_nbr='4033930000166380411'; 35 seconds; select crdacct_status from dw_ods.s1_sema_scmcaccp_col where crdacct_nbr='4033930000166380411'; 3 seconds.
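For reference, the two test tables above could have been created along these lines (a sketch; the source table name is hypothetical, and the real DDL is not shown in the slides):
    CREATE TABLE s1_sema_scmcaccp_row (LIKE s1_sema_scmcaccp_src) WITH (appendonly=true);                      -- row-oriented
    CREATE TABLE s1_sema_scmcaccp_col (LIKE s1_sema_scmcaccp_src) WITH (appendonly=true, orientation=column);  -- column-oriented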
• 23. Loading through external tables. Characteristics: read-only (4.0 adds writable external tables); the data lives outside the database; SELECT, JOIN, SORT and similar operations work as on regular tables. Advantages: parallel loading, ETL flexibility, error tolerance for badly formatted rows, support for multiple data sources. Two kinds: External Tables, based on files, and Web Tables, based on URLs or commands.
• 24. High-speed data loading through external tables: using its parallel dataflow engine, Greenplum can operate on external tables directly with SQL, and loading is fully parallel. (Diagram: gpfdist processes on ETL servers serve external table files to the segment hosts over the internal Gigabit Ethernet interconnect, bypassing the Master.)
• 25. Characteristics of external table loading: parallel loading gives the best performance; remotely stored files can be processed; transport uses the HTTP protocol, at up to 200 MB/s of data distribution per gpfdist. Starting the gpfdist file distribution daemon: gpfdist -d /var/load_files/expenses -p 8080 -l /home/gpadmin/log & External table definition: CREATE EXTERNAL TABLE ext_expenses ( name text, date date, amount float4, description text ) LOCATION ('gpfdist://etlhost:8081/*','gpfdist://etlhost:8082/*') FORMAT 'TEXT' (DELIMITER '|') ENCODING 'UTF-8' LOG ERRORS INTO ext_expenses_loaderrors SEGMENT REJECT LIMIT 10000 ROWS;
• 26. External table load error handling. Load good rows and catch poorly formatted rows, such as: rows with missing or extra attributes, rows with attributes of the wrong data type, rows with invalid client encoding sequences. Does not apply to constraint errors: PRIMARY KEY, NOT NULL, CHECK or UNIQUE constraints. Optional error handling clause for external tables: [LOG ERRORS INTO error_table] SEGMENT REJECT LIMIT count [ROWS | PERCENT] (PERCENT is based on the gp_reject_percent_threshold parameter). Example: CREATE EXTERNAL TABLE ext_customer (id int, name text, sponsor text) LOCATION ( 'gpfdist://filehost:8081/*.txt' ) FORMAT 'TEXT' ( DELIMITER '|' NULL ' ') LOG ERRORS INTO err_customer SEGMENT REJECT LIMIT 5 ROWS;
• 27. Static statistics for external tables. The data resides outside the database, so there are no database statistics for external table data, and external tables are not meant for frequent or ad-hoc access. Rough statistics can be set manually in pg_class: UPDATE pg_class SET reltuples=400000, relpages=400 WHERE relname='myexttable';
• 28. The COPY SQL command: a PostgreSQL command that supports both loading and unloading. It is optimized for loading a large number of rows, loads all rows in one command (not parallel), loads data from a file or from standard input, and supports error handling as external tables do. Examples: COPY mytable FROM '/data/myfile.csv' WITH CSV HEADER; (the file resides on the Master) \COPY mytable FROM '/data/myfile.csv' WITH CSV HEADER; (the file resides on the local client) COPY country FROM '/data/gpdb/country_data' WITH DELIMITER '|' LOG ERRORS INTO err_country SEGMENT REJECT LIMIT 10 ROWS;
• 29. Data loading performance tips: drop indexes before the load and recreate them afterwards; increase the maintenance_work_mem parameter to speed up CREATE INDEX; run ANALYZE after the load; run VACUUM after load errors, deletes, updates, and so on; do not use ODBC INSERT to load large volumes of data.
• 30. Workload Management: limits the number of active queries in a queue to prevent system overload (CPU, disk I/O, memory). Enhanced in 4.0.
• 31. Two typical ways to manage resource queues: Resource Queue Limits. ACTIVE THRESHOLD example: CREATE RESOURCE QUEUE adhoc ACTIVE THRESHOLD 10 IGNORE THRESHOLD 1000.0; COST THRESHOLD examples: CREATE RESOURCE QUEUE batch1 COST THRESHOLD 1000000.0 NOOVERCOMMIT; CREATE RESOURCE QUEUE batch1 COST THRESHOLD 1e+6;
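A queue only takes effect once roles are attached to it; a minimal sketch using the queues above (the role name is hypothetical):
    CREATE ROLE analyst LOGIN RESOURCE QUEUE adhoc;  -- the new role's queries count against the adhoc queue
    ALTER ROLE analyst RESOURCE QUEUE batch1;        -- move the role to the batch1 queue later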
• 32. Greenplum Performance Monitor
• 33. Greenplum Performance Monitor: Dashboard. Highly interactive, web-based performance monitoring, with real-time and historic views of resource utilization and of queries and query internals.
• 34. Greenplum Performance Monitor: System Metrics
• 35. Greenplum Performance Monitor: Query Monitor
• 36. Backups and Restores: parallel backups (gp_dump), parallel restores (gp_restore), automated dumps and restores (gpcrondump, gpdbrestore), and non-parallel backups and restores (pg_dump/pg_restore/psql).
• 37. Parallel backup and restore (gp_dump/gp_restore): used to migrate schemas, data, and functions between homogeneous environments.
• 38. Serial backup and restore (pg_dump/pg_restore): used to migrate schemas, data, and functions between heterogeneous environments.
• 39. Running a Parallel Backup (gp_dump). Each active segment is dumped in parallel; dump files are created in the segment data directory by default; compression (gzip) is supported; ensure sufficient disk space for the dump files; a backup set is identified by a timestamp key. Examples. Back up a database: gp_dump gpdb. Back up a database and create dump files in a centralized location on all hosts: gp_dump --gp-d=/home/gpadmin/backups gpdb. Back up a particular schema only: gp_dump -n myschema mydatabase. Back up a single segment instance only (by noting the dbid of the segment instance): gp_dump --gp-s=i[5] gpdb
• 40. Dump files created by gp_dump. On the master host: gp_catalog_1_<dbid>_<timestamp>, gp_cdatabase_1_<dbid>_<timestamp>, gp_dump_1_<dbid>_<timestamp>, gp_dump_status_1_<dbid>_<timestamp>. On the segment hosts: gp_dump_0_<dbid>_<timestamp>, gp_dump_status_0_<dbid>_<timestamp>.
• 41. Running a Parallel Restore (gp_restore). Use the gp_restore command; it needs the timestamp key from gp_dump. Make sure the dump files are placed on the correct segment hosts and that the database exists before restoring. Database-level server configuration settings are not restored. Examples. Restore a Greenplum database using backup files created by gp_dump: gp_restore --gp-k=2005103112453 -d gpdb. Restore a single segment instance only (by noting the dbid of the segment instance): gp_restore --gp-k=2005103112453 -d gpdb --gp-s=i[5]
• 42. Scheduling Routine Backups. gpcrondump: calls gp_dump; can be run directly or scheduled using CRON; sends email notifications; flexible dump options; can copy configuration files, dump system catalogs, dump global objects, and include a post-dump script. gpdbrestore: restores from gpcrondump files; can restore from an archive host, with no need to pre-place dump files on the segments.
• 43. Non-Parallel Backups and Restores. Greenplum also supports pg_dump and pg_restore, which are useful for migrating data to or from other DBMSs. pg_dump creates a single dump file and can be slow on very large databases, so run it at low-usage times. It supports compression, can dump data as INSERT or COPY commands, and its --gp-syntax option includes DISTRIBUTED BY clauses in the DDL.
• 44. Non-Parallel Backup and Restore Examples. Dump a database called mydb into a SQL-script file: pg_dump mydb > db.sql. To reload such a script into a (freshly created) database named newdb: psql -d newdb -f db.sql. Dump a Greenplum database in tar file format and include distribution policy information: pg_dump -Ft --gp-syntax mydb > db.tar. To dump a database into a custom-format archive file: pg_dump -Fc mydb > db.dump. To reload an archive file into a (freshly created) database named newdb: pg_restore -d newdb db.dump. To dump a single table named mytab: pg_dump -t mytab mydb > db.sql. To specify an upper-case or mixed-case name with -t and related switches, double-quote the name, or it will be folded to lower case; and since double quotes are special to the shell, they in turn must be quoted. So to dump a single table with a mixed-case name you need something like: pg_dump -t '"MixedCaseName"' mydb > mytab.sql
• 45. Client tools: pgAdmin3, a graphical tool for administration and for SQL execution, analysis, and monitoring; and psql, a command-line tool for operations and administration.
• 46. pgAdmin3 for GPDB. pgAdmin3 is the leading graphical open-source management, development, and administration tool for PostgreSQL. Greenplum has contributed extensive GPDB-specific enhancements, and as of GPDB 3.3 Greenplum ships and supports this tool.
  • 47. pgAdmin3 for GPDB
• 48. pgAdmin3 for GPDB: monitors active sessions (equivalent to the SQL: select * from pg_stat_activity;), monitors locks using information from pg_locks, and can cancel a running SQL statement.
• 49. psql. Connect through the master. Connection information: database name (-d | PGDATABASE), master host name (-h | PGHOST), master port (-p | PGPORT), user name (-U | PGUSER). First-time connections: the template1 database and the default superuser account (gpadmin).
• 50. Issuing SQL statements. Interactive mode: psql mydatabase, then mydatabase=# SELECT * FROM foo; Non-interactive mode (single command): psql mydatabase -ac "SELECT * FROM foo;" Non-interactive mode (multiple commands): psql mydatabase -af /home/lab1/sql/createdb.sql (use a semicolon (;) to denote the end of a statement)
• 51. Common psql meta-commands: \? (help on psql meta-commands), \h (help on SQL command syntax), \dt (show tables), \dtS (show system tables), \dg or \du (show roles), \l (show databases), \c db_name (connect to this database), \q (quit psql), \! (enter shell mode), \df (show functions), \dn (show schemas), SET search_path=…;, \timing
• 52. postgresql.conf: Local, Global, and Master-Only parameters. See the Admin Guide for the parameter reference. Important parameters: max_connections, shared_buffers, work_mem… Changes to Local parameters, such as max_stack_depth, must be made on every segment.
• 53. The postgresql.conf file is located in the master or segment instance's data directory and is used to set configuration parameters at the system level. Parameters that are using the default setting are commented out (#). Changes require a restart (or a reload using gpstop -u) to take effect.
• 54. Viewing parameter settings. View a specific parameter setting, for example: SHOW search_path; View all parameter settings: SHOW ALL; Set a parameter, for example: SET search_path=public; SET client_encoding=gb18030;
• 55. Configuring host-based authentication, i.e. client authorization: whether a connection from a given client is allowed, and whether the user may connect to the requested database. The pg_hba.conf file controls access based on host address, database, and/or DB user account. It lives in the data directory of the master and each segment instance and receives a default configuration at system initialization.
• 56. Default master host pg_hba.conf: local connections are allowed for the Greenplum superuser; remote connections are not allowed. EXAMPLE: # TYPE DATABASE USER CIDR-ADDRESS METHOD local all gpadmin ident sameuser local all all ident sameuser host all gpadmin 127.0.0.0/0 trust
• 57. Configuring pg_hba.conf. EXAMPLE: # TYPE DATABASE USER CIDR-ADDRESS METHOD local all all trust host all all 127.0.0.1/32 trust host all all ::1/128 trust host all all 10.0.0.206/32 trust host carddw etl 21.104.138.12/32 md5 host gpadmin all 21.104.138.0/24 md5. Running gpstop -u applies the settings without restarting the database.
• 58. SQL syntax: see the Greenplum SQL Language reference (《Gp sql language》). Caveats: when DELETE or UPDATE joins two tables, both tables must have the same distribution. For example: delete from tablea using tableb where tablea.id=tableb.id; update tablea as a set desc=b.desc from tableb as b where a.id=b.id; in both statements tablea and tableb must use the same distribution; if necessary, change the distribution with ALTER TABLE … SET DISTRIBUTED BY, as sketched below.
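A sketch of aligning the distributions before such a join, using the hypothetical tables above (the slide states that the distribution can be changed this way):
    ALTER TABLE tableb SET DISTRIBUTED BY (id);                    -- redistribute tableb to match tablea's key
    DELETE FROM tablea USING tableb WHERE tablea.id = tableb.id;   -- now a co-located, segment-local join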
• 59. Data types. Common data types: CHAR, VARCHAR, TEXT; smallint, integer, bigint; numeric, real, double precision; timestamp, date, time; boolean; array types, such as integer[]. For other data types, see the reference.
• 60. System Catalog Tables and Views. All system catalogs live in the pg_catalog schema: the standard PostgreSQL system catalogs (pg_*) plus Greenplum-specific catalogs: gp_configuration, gp_distribution_policy, gp_id, gp_version_at_initdb, pg_resqueue, pg_exttable; also useful are pg_tables, pg_class, pg_stat_activity. To list all system catalog tables in psql: \dtS. To list all system views in psql: \dvS. See the reference for the remaining catalogs.
• 61. Functions: date functions. extract(day|month|year … from date); select date + '1 day'::interval, date + '1 month'::interval; SELECT date_part('day', TIMESTAMP '2001-02-16 20:38:40'); Result: 16. SELECT date_trunc('hour', TIMESTAMP '2001-02-16 20:38:40'); Result: 2001-02-16 20:00:00. pg_sleep(seconds). System date variables: current_date, current_time, current_timestamp, now(), timeofday(). timeofday() advances within a transaction; the preceding functions stay constant for the duration of the transaction.
• 62. Functions: string functions. substr/length/lower/upper/trim/replace/position, rpad/lpad, to_char, || (string concatenation), substring(string from pattern), ~ / ~*, LIKE, SIMILAR TO (pattern matching). Miscellaneous: CASE … WHEN, coalesce, nullif, generate_series, IN / NOT IN / EXISTS / ANY / ALL. See the reference for the full list of functions.
• 63. Stored procedures. Greenplum supports functions written in SQL, Python, Perl, and C; the following focuses on PL/pgSQL stored procedures. A stored procedure is a single transaction, including any calls it makes to sub-procedures. Structure: CREATE FUNCTION somefunc() RETURNS integer AS $$ DECLARE quantity integer := 30; BEGIN RETURN ....; END; $$ LANGUAGE plpgsql; Assignment to a variable or row/record: identifier := expression, for example: user_id := 20; Executing a query with no result: PERFORM query; for example: PERFORM create_mv('cs_session_page_requests_mv', my_query); See the stored procedure reference for details. A complete example follows.
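Putting the structure above together, a minimal runnable PL/pgSQL function (the function name and rate are purely illustrative, not from the slides):
    CREATE OR REPLACE FUNCTION add_tax(amount numeric) RETURNS numeric AS $$
    DECLARE
        rate numeric := 0.17;  -- assumed rate, for illustration only
    BEGIN
        RETURN amount * (1 + rate);
    END;
    $$ LANGUAGE plpgsql;

    SELECT add_tax(100);  -- returns 117.00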
• 64. Stored procedures, continued. Dynamic SQL: EXECUTE command-string [INTO [STRICT] target]; SELECT INTO, for example: SELECT id INTO var_id FROM tablea; Obtaining the result status: GET DIAGNOSTICS variable = item [, ...]; for example: GET DIAGNOSTICS integer_var = ROW_COUNT; SQL status variables: SQLERRM, SQLSTATE. Control structures: IF ... THEN ... ELSEIF ... THEN ... ELSE; LOOP, EXIT, CONTINUE, WHILE, FOR. Returning from a function: two commands return data from a function, RETURN and RETURN NEXT; syntax: RETURN expression; Setting a callback: EXEC SQL WHENEVER condition action; where condition is one of SQLERROR, SQLWARNING, NOT FOUND. A sketch of dynamic SQL follows.
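A sketch combining EXECUTE … INTO with a computed command string (the function and table names are hypothetical):
    CREATE OR REPLACE FUNCTION count_rows(tab text) RETURNS bigint AS $$
    DECLARE
        cnt bigint;
    BEGIN
        EXECUTE 'SELECT count(*) FROM ' || tab INTO cnt;  -- the table name is resolved at run time
        RETURN cnt;
    END;
    $$ LANGUAGE plpgsql;

    SELECT count_rows('sale');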
• 65. Stored procedures: exception handling. EXCEPTION WHEN unique_violation THEN -- do nothing END; Ignoring errors: EXCEPTION WHEN OTHERS THEN RAISE NOTICE 'an EXCEPTION is about to be raised'; RAISE EXCEPTION 'NUM:%, DETAILS:%', SQLSTATE, SQLERRM; END; Errors and messages: RAISE level 'format' [, expression [, ...]]; levels: INFO (informational output), NOTICE (a hint message), EXCEPTION (raises an error and exits the procedure). Example: RAISE NOTICE 'Calling cs_create_job(%)', v_job_id; A complete sketch follows.
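A complete sketch of a handler that traps duplicate-key errors and reports them with RAISE (the table and function names are hypothetical; assumes a unique constraint on uniq_t):
    CREATE OR REPLACE FUNCTION safe_insert(v integer) RETURNS void AS $$
    BEGIN
        INSERT INTO uniq_t VALUES (v);
    EXCEPTION
        WHEN unique_violation THEN
            RAISE NOTICE 'duplicate key % ignored (SQLSTATE %, %)', v, SQLSTATE, SQLERRM;
    END;
    $$ LANGUAGE plpgsql;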
• 66. Performance tuning: introduction to performance tuning, common performance problems, tracking down a performance problem, query profiling (EXPLAIN, EXPLAIN ANALYZE), query tuning, plus notes from my own experience.
• 67. Approaching a performance tuning initiative: set performance expectations (acceptable response times, queries per minute, etc.) and benchmarks; know your baseline hardware performance (throughput / capacity); know your workload (heavy usage times, resource contention, data contention); and focus your optimizations.
  • 68. Hardware Issues / Failed Segments Resource Allocation Contention Between Concurrent Workloads Inaccurate Database Statistics Uneven Data Distribution SQL Formulation Database Design Common Causes of Performance Issues
• 69. Hardware issues: disk failures, host failures, network failures, an OS not tuned for Greenplum. Disk capacity: 70% maximum recommended; VACUUM after updates, deletes and loads. VACUUM configuration parameters: max_fsm_relations = tables + indexes + system tables; max_fsm_pages = 16 * max_fsm_relations
• 70. Resource allocation and contention. Greenplum resource queues limit the active queries in the system and the size of a query a particular user can run. Perform admin tasks at low-usage times: data loading and ETL, VACUUM and ANALYZE, backups. Design applications to prevent lock conflicts (concurrent sessions should not update the same data at the same time). Resource-related configuration parameters: work_mem = 32MB, maintenance_work_mem = 64MB, shared_buffers = 125MB
• 71. Database Statistics (ANALYZE). Database statistics are used by the query planner. Run ANALYZE after data loads, restores from backups, changes to the schema (adding indexes, etc.), and inserts, updates, or deletes. Statistics collection can be configured globally (default_statistics_target = 25, gp_analyze_relative_error = .25) or on specific table columns: ALTER TABLE name ALTER column SET STATISTICS #
• 72. Greenplum data distribution. Table distribution key considerations: even data distribution, local vs. distributed operations, even query processing. Checking for data skew: gpskew -t schema.table, or Unix system utilities via gpssh (gpssh -f seg_host, then sar 1 100); a plain-SQL check is sketched below. Rebalancing a table: CREATE TABLE sales_temp (LIKE sales) DISTRIBUTED BY (date, total, customer); INSERT INTO sales_temp SELECT * FROM sales; DROP TABLE sales; ALTER TABLE sales_temp RENAME TO sales;
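Skew can also be checked with ordinary SQL through Greenplum's gp_segment_id pseudo-column; row counts per segment should come out roughly equal:
    SELECT gp_segment_id, count(*) AS rows_on_segment
    FROM sales
    GROUP BY gp_segment_id
    ORDER BY gp_segment_id;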
• 73. Database design. Data type selection: the smallest size that fits the data (INTEGER, not NUMERIC(11,2), for integer fields); the same data type across tables for join columns; consider hashing wide join keys, using BYTEA instead of CHAR(100); varchar or text for character data. Denormalization (star schema). Table partitioning.
• 74. Database design: indexes. Use them sparingly in Greenplum Database; try the workload without indexes first. They can improve the performance of OLTP-type workloads. Other index considerations: avoid indexes on frequently updated columns; avoid overlapping indexes; use bitmap indexes where applicable instead of B-tree; drop indexes for loads; consider a clustered index. Configuring index usage: enable_indexscan = on | off
• 75. SQL formulation. General considerations: know your data; minimize returned rows; avoid unnecessary columns and tables in the result set; avoid large sorts if possible; match data types in joins. Greenplum-specific considerations: join on common distribution key columns when possible; consider the data distribution policy and the query predicates.
• 76. Tracking down a performance problem: system catalog tables and views (pg_stat_activity, pg_locks / pg_class); the database logs, located in the master (and segment) data directories; UNIX system utilities (gpssh).
• 77. Query profiling. Examine query plans to identify tuning opportunities. What to look for: plan operations that are taking exceptionally long; whether the planner's estimates are close to reality (EXPLAIN ANALYZE); whether the planner applies selective predicates early, chooses the best join order, selectively scans partitioned tables, and chooses hash aggregate and hash join operations where applicable; and whether there is sufficient work memory.
  • 78. The Query Process
• 79. Viewing the query plan. To see the plan for a query: EXPLAIN or EXPLAIN ANALYZE. Query plans are read from bottom to top: table scans, then joins, sorts, and aggregations, then motions (Gather, Redistribute, Broadcast). The following metrics are given for each operation: cost (units of disk page fetches), rows (rows output by this node), and width (bytes of the rows produced by this node).
• 80. Reading EXPLAIN output. EXPLAIN SELECT * FROM names WHERE name = 'Joelle'; --------------------------- QUERY PLAN ------------------------- Gather Motion 2:1 (slice1) (cost=0.00..20.88 rows=1 width=13) -> Seq Scan on 'names' (cost=0.00..20.88 rows=1 width=13) Filter: name = 'Joelle'::text
• 81. Reading EXPLAIN ANALYZE output. EXPLAIN ANALYZE SELECT * FROM names WHERE name = 'Joelle'; ---------------------- QUERY PLAN ------------------------ Gather Motion 2:1 (slice1) (cost=0.00..20.88 rows=1 width=13) recv: Total 1 rows with 0.305 ms to first row, 0.537 ms to end. -> Seq Scan on 'names' (cost=0.00..20.88 rows=1 width=13) Total 1 rows (seg0) with 0.255 ms to first row, 0.486 ms to end. Filter: name = 'Joelle'::text 22.548 ms elapsed
  • 82. MVCC (multi-version concurrency control) Greenplum supports all transaction isolation levels defined in the SQL standard INSERT/COPY acquire locks at the row-level UPDATE/DELETE acquire locks at the table-level Can use LOCK command to acquire specific locks Transaction Concurrency Control
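A sketch of taking an explicit table lock inside a transaction (the lock is released automatically at COMMIT or ROLLBACK):
    BEGIN;
    LOCK TABLE sales IN EXCLUSIVE MODE;  -- block concurrent UPDATE/DELETE while this transaction works
    -- ... do the work ...
    COMMIT;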
  • 83. ACCESS SHARE (SELECT, ANALYZE) ROW SHARE (SELECT FOR UPDATE, SELECT FOR SHARE) ROW EXCLUSIVE (INSERT, COPY) SHARE UPDATE EXCLUSIVE (VACUUM) SHARE (CREATE INDEX) SHARE ROW EXCLUSIVE EXCLUSIVE (UPDATE/DELETE) ACCESS EXCLUSIVE (ALTER TABLE, DROP TABLE, REINDEX, CLUSTER, and VACUUM FULL)Table-level Lock Modes
• 84. Checking for lock conflicts. Lock conflicts are caused by: concurrent transactions accessing the same object, resource queue locks, or transaction deadlocks between segments (rare). Query the pg_locks system table to see current locks. EXAMPLE: SELECT locktype, database, c.relname, l.relation, l.transactionid, l.transaction, l.pid, l.mode, l.granted, a.current_query FROM pg_locks l, pg_class c, pg_stat_activity a WHERE l.relation=c.oid AND l.pid=a.procpid ORDER BY c.relname;
• 85. About transactions in Greenplum Database. Transactions bundle multiple statements into one 'all-or-nothing' operation. Transaction commands: BEGIN or START TRANSACTION; END or COMMIT; ROLLBACK; SAVEPOINT and ROLLBACK TO SAVEPOINT. Autocommit mode in psql: \set autocommit on|off. Two-phase commit transactions are not supported: PREPARE TRANSACTION, COMMIT PREPARED, ROLLBACK PREPARED.
• 86. Database internals: physical storage and server processes.
• 87. Statistics collector: collects information about database activity. Server configuration parameters: stats_start_collector = on, stats_block_level = off, stats_row_level = off, stats_queue_level = off, stats_command_string = on. To see the statistics views and tables in the catalog: \dtvS pg_stat*. A sample query follows.
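With the collector enabled, current activity can be read with ordinary SQL; on the PostgreSQL 8.2 base the columns are named procpid and current_query (a sketch):
    SELECT datname, procpid, usename, query_start, current_query
    FROM pg_stat_activity
    WHERE current_query <> '<IDLE>';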
  • 88. Physical Storage - Data Directory File Structure
• 89. Server processes. Linux: ps ax | grep postgres. Solaris: ps -ef | grep postgres, plus pargs for process arguments. Greenplum Master instance: the postgres database listener process and its postgres worker processes, plus the seqserver process. Greenplum Segment instance: the postgres database listener process and its postgres worker processes.
• 90. Appendix
• 91. OLAP Windowing Extensions: about window functions; constructing a window specification (the OVER clause, the WINDOW clause); built-in window functions.
• 92. About window functions: a new class of function allowed only in the SELECT list. A window function returns a value per row (unlike aggregate functions), and its results are interpreted in terms of the current row and its corresponding window partition or frame. Window functions are characterized by the use of the OVER clause, which defines the window partitions (groups of rows) the function applies to, the ordering of data within a window, and the positional or logical framing of a row with respect to its window.
• 93. Defining window specifications (the OVER clause). All window functions have an OVER() clause, which specifies the 'window' of data to which the function applies and defines: window partitions (PARTITION BY clause), ordering within a window partition (ORDER BY clause), and framing within a window partition (ROWS/RANGE clauses).
• 94. OVER (PARTITION BY …) example.
SELECT *, row_number() OVER() FROM sale ORDER BY cn;
row_number | cn | vn | pn | dt | qty | prc
1 | 1 | 10 | 200 | 1401-03-01 | 1 | 0
2 | 1 | 30 | 300 | 1401-05-02 | 1 | 0
3 | 1 | 50 | 400 | 1401-06-01 | 1 | 0
4 | 1 | 30 | 500 | 1401-06-01 | 12 | 5
5 | 1 | 20 | 100 | 1401-05-01 | 1 | 0
6 | 2 | 50 | 400 | 1401-06-01 | 1 | 0
7 | 2 | 40 | 100 | 1401-01-01 | 1100 | 2400
8 | 3 | 40 | 200 | 1401-04-01 | 1 | 0
(8 rows)
SELECT *, row_number() OVER(PARTITION BY cn) FROM sale ORDER BY cn; (the numbering restarts in each cn partition)
row_number | cn | vn | pn | dt | qty | prc
1 | 1 | 10 | 200 | 1401-03-01 | 1 | 0
2 | 1 | 30 | 300 | 1401-05-02 | 1 | 0
3 | 1 | 50 | 400 | 1401-06-01 | 1 | 0
4 | 1 | 30 | 500 | 1401-06-01 | 12 | 5
5 | 1 | 20 | 100 | 1401-05-01 | 1 | 0
1 | 2 | 50 | 400 | 1401-06-01 | 1 | 0
2 | 2 | 40 | 100 | 1401-01-01 | 1100 | 2400
1 | 3 | 40 | 200 | 1401-04-01 | 1 | 0
(8 rows)
• 95. OVER (ORDER BY …) example.
SELECT vn, sum(prc*qty) FROM sale GROUP BY vn ORDER BY 2 DESC;
vn | sum
40 | 2640002
30 | 180
50 | 0
20 | 0
10 | 0
(5 rows)
SELECT vn, sum(prc*qty), rank() OVER (ORDER BY sum(prc*qty) DESC) FROM sale GROUP BY vn ORDER BY 2 DESC;
vn | sum | rank
40 | 2640002 | 1
30 | 180 | 2
50 | 0 | 3
20 | 0 | 3
10 | 0 | 3
(5 rows)
• 96. OVER (… ROWS …) example: window framing, a "box car" (moving) average.
SELECT vn, dt, AVG(prc*qty) OVER ( PARTITION BY vn ORDER BY dt ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) FROM sale;
vn | dt | avg
10 | 03012008 | 30
20 | 05012008 | 20
30 | 05022008 | 0
30 | 06012008 | 60
30 | 06012008 | 60
30 | 06012008 | 60
40 | 06012008 | 140
40 | 06042008 | 90
40 | 06052008 | 120
40 | 06052008 | 100
50 | 06012008 | 30
50 | 06012008 | 10
(12 rows)
(The repeated highlighting of the vn=30 rows on the original slide illustrates the 5-row frame sliding through the partition.)
• 97. Global window specifications (the WINDOW clause): useful for queries with multiple window functions; define and name a window specification once, then reuse it throughout the query. EXAMPLE: SELECT RANK() OVER (ORDER BY pn), SUM(prc*qty) OVER (ORDER BY pn), AVG(prc*qty) OVER (ORDER BY pn) FROM sale; becomes SELECT RANK() OVER (w1), SUM(prc*qty) OVER (w1), AVG(prc*qty) OVER (w1) FROM sale WINDOW w1 AS (ORDER BY pn);
  • 98. cume_dist() dense_rank() first_value(expr) lag(expr [,offset] [,default]) last_value(expr) lead(expr [,offset] [,default]) ntile(expr) percent_rank() rank() row_number() * Any aggregate function (used with the OVER clause) can also be used as a window functionBuilt-In Window Functions
• 99. New features in version 4.0: enhanced workload management with dynamic query prioritization; a self-healing fault tolerance model with differential online recovery; direct-dispatch performance optimization for single-row operations; MPP tablespace support for non-uniform and SSD segment storage; B-tree and bitmap indexes on column-oriented and append-only tables; health monitoring infrastructure with email and SNMP alerting; writable external tables for parallel data output; object-level metadata-management tracking and querying; enhanced global statistics collection; MapReduce support for C-language functions.
• 100. Q&A: Questions and Answers