Apache Hive v2.0.1 Released

jopen, 8 years ago
   <p style="text-align: center;"><img alt="" src="https://simg.open-open.com/show/988125c04f1b57cf3a5f7ea76d4aa4b2.png" /></p>    <p>Hive is an open-source data warehouse tool built on Hadoop for storing and processing massive amounts of structured data. It is a data warehouse framework that Facebook open-sourced in August 2008, and it provides HQL, a SQL-like query language, as its data access interface. Hive has the following advantages and disadvantages:<br /> Advantages:</p>    <ul>     <li>Hive uses a SQL-like query syntax that stays as compatible with the SQL standard as possible, which greatly flattens the learning curve for traditional data analysts;</li>     <li>JDBC and ODBC interfaces make it easier for developers to build applications;</li>     <li>With MapReduce (MR) as the compute engine and HDFS as the storage system, it offers compute and scaling capacity designed for very large datasets;</li>     <li>Unified metadata management (Derby, MySQL, etc.), which can be shared with Pig, Presto, and others;</li>    </ul>    <p>Disadvantages:</p>    <ul>     <li>HQL has limited expressive power, and some complex computations are hard to express in it;</li>     <li>Because Hive generates MapReduce jobs automatically, HQL is hard to tune;</li>     <li>Granularity is coarse, so controllability is poor</li>    </ul>    <p style="text-align: center;"><br /> Hive runtime architecture<br /> <img src="https://simg.open-open.com/show/ef4c85fc0e2ee7dd847b329ae738878e.jpg" /></p>    <h2>Changelog</h2>    <h3>Sub-task</h3>    <ul>     <li>[<a href="/misc/goto?guid=4958990847841605192">HIVE-13362</a>] - Commit binary file required for HIVE-13361</li>    </ul>    <h3>Bug Fixes</h3>    <ul>     <li>[<a href="/misc/goto?guid=4958990847963109386">HIVE-9499</a>] - hive.limit.query.max.table.partition makes queries fail on non-partitioned tables</li>     <li>[<a href="/misc/goto?guid=4958990848091035255">HIVE-9862</a>] - Vectorized execution corrupts timestamp values</li>     <li>[<a href="/misc/goto?guid=4958990848206079659">HIVE-10729</a>] - Query failed when select complex columns from joinned table (tez map join only)</li>     <li>[<a href="/misc/goto?guid=4958990848331010936">HIVE-12064</a>] - prevent transactional=false</li>     <li>[<a href="/misc/goto?guid=4958990848446625617">HIVE-12165</a>] - wrong result when hive.optimize.sampling.orderby=true with some aggregate functions</li>     <li>[<a href="/misc/goto?guid=4958990848564544069">HIVE-12552</a>] - Wrong number of reducer estimation causing job to fail</li>     <li>[<a href="/misc/goto?guid=4958990848688626194">HIVE-12749</a>] - Constant propagate returns string values in incorrect format</li>     <li>[<a 
href="/misc/goto?guid=4958990848798684660">HIVE-12799</a>] - Always use Schema Evolution for ACID</li>     <li>[<a href="/misc/goto?guid=4958990848933677694">HIVE-12887</a>] - Handle ORC schema on read with fewer columns than file schema (after Schema Evolution changes)</li>     <li>[<a href="/misc/goto?guid=4958990849064428722">HIVE-12894</a>] - Detect whether ORC is reading from ACID table correctly for Schema Evolution</li>     <li>[<a href="/misc/goto?guid=4958990849198662551">HIVE-12937</a>] - DbNotificationListener unable to clean up old notification events</li>     <li>[<a href="/misc/goto?guid=4958990849328120010">HIVE-12990</a>] - LLAP: ORC cache NPE without FileID support</li>     <li>[<a href="/misc/goto?guid=4958990849454130254">HIVE-12992</a>] - Hive on tez: Bucket map join plan is incorrect</li>     <li>[<a href="/misc/goto?guid=4958990849588667361">HIVE-13036</a>] - Split hive.root.logger separately to make it compatible with log4j1.x (for remaining services)</li>     <li>[<a href="/misc/goto?guid=4958990849715477700">HIVE-13051</a>] - Deadline class has numerous issues</li>     <li>[<a href="/misc/goto?guid=4958990849831590735">HIVE-13056</a>] - delegation tokens do not work with HS2 when used with http transport and kerberos</li>     <li>[<a href="/misc/goto?guid=4958990849941373852">HIVE-13079</a>] - LLAP: Allow reading log4j properties from default JAR resources</li>     <li>[<a href="/misc/goto?guid=4958990850042125182">HIVE-13083</a>] - Writing HiveDecimal to ORC can wrongly suppress present stream</li>     <li>[<a href="/misc/goto?guid=4958990850141396113">HIVE-13086</a>] - LLAP: Programmatically initialize log4j2 to print out the properties location</li>     <li>[<a href="/misc/goto?guid=4958990850248913015">HIVE-13090</a>] - Hive metastore crashes on NPE with ZooKeeperTokenStore</li>     <li>[<a href="/misc/goto?guid=4958990850341464895">HIVE-13093</a>] - hive metastore does not exit on start failure</li>     <li>[<a 
href="/misc/goto?guid=4958990850448191641">HIVE-13105</a>] - LLAP token hashCode and equals methods are incorrect</li>     <li>[<a href="/misc/goto?guid=4958990850574256966">HIVE-13108</a>] - Operators: SORT BY randomness is not safe with network partitions</li>     <li>[<a href="/misc/goto?guid=4958990850698014767">HIVE-13110</a>] - LLAP: Package log4j2 jars into Slider pkg</li>     <li>[<a href="/misc/goto?guid=4958990850820980590">HIVE-13111</a>] - Fix timestamp / interval_day_time wrong results with HIVE-9862</li>     <li>[<a href="/misc/goto?guid=4958990850955172246">HIVE-13115</a>] - MetaStore Direct SQL getPartitions call fail when the columns schemas for a partition are null</li>     <li>[<a href="/misc/goto?guid=4958990851073271338">HIVE-13126</a>] - Clean up MapJoinOperator properly to avoid object cache reuse with unintentional states</li>     <li>[<a href="/misc/goto?guid=4958990851200411365">HIVE-13134</a>] - JDBC: JDBC Standalone should not be in the lib dir by default</li>     <li>[<a href="/misc/goto?guid=4958990851323834493">HIVE-13144</a>] - HS2 can leak ZK ACL objects when curator retries to create the persistent ephemeral node</li>     <li>[<a href="/misc/goto?guid=4958990851449062851">HIVE-13151</a>] - Clean up UGI objects in FileSystem cache for transactions</li>     <li>[<a href="/misc/goto?guid=4958990851580044245">HIVE-13153</a>] - SessionID is appended to thread name twice</li>     <li>[<a href="/misc/goto?guid=4958990851706137600">HIVE-13199</a>] - NDC stopped working in LLAP logging</li>     <li>[<a href="/misc/goto?guid=4958990851819928533">HIVE-13200</a>] - Aggregation functions returning empty rows on partitioned columns</li>     <li>[<a href="/misc/goto?guid=4958990851956485630">HIVE-13232</a>] - Aggressively drop compression buffers in ORC OutStreams</li>     <li>[<a href="/misc/goto?guid=4958990852071867879">HIVE-13236</a>] - LLAP: token renewal interval needs to be set</li>     <li>[<a 
href="/misc/goto?guid=4958990852201696437">HIVE-13240</a>] - GroupByOperator: Drop the hash aggregates when closing operator</li>     <li>[<a href="/misc/goto?guid=4958990852328068891">HIVE-13242</a>] - DISTINCT keyword is dropped by the parser for windowing</li>     <li>[<a href="/misc/goto?guid=4958990852444599341">HIVE-13243</a>] - Hive drop table on encyption zone fails for external tables</li>     <li>[<a href="/misc/goto?guid=4958990852570840588">HIVE-13255</a>] - FloatTreeReader.nextVector is expensive</li>     <li>[<a href="/misc/goto?guid=4958990852703515389">HIVE-13263</a>] - Vectorization: Unable to vectorize regexp_extract/regexp_replace " Udf: GenericUDFBridge, is not supported"</li>     <li>[<a href="/misc/goto?guid=4958990852824260868">HIVE-13285</a>] - Orc concatenation may drop old files from moving to final path</li>     <li>[<a href="/misc/goto?guid=4958990852942259355">HIVE-13286</a>] - Query ID is being reused across queries</li>     <li>[<a href="/misc/goto?guid=4958990853065619384">HIVE-13294</a>] - AvroSerde leaks the connection in a case when reading schema from a url</li>     <li>[<a href="/misc/goto?guid=4958990853181980149">HIVE-13296</a>] - Add vectorized Q test with complex types showing count(*) etc work correctly</li>     <li>[<a href="/misc/goto?guid=4958990853298797367">HIVE-13299</a>] - Column Names trimmed of leading and trailing spaces</li>     <li>[<a href="/misc/goto?guid=4958990853411561590">HIVE-13310</a>] - Vectorized Projection Comparison Number Column to Scalar broken for !noNulls and selectedInUse</li>     <li>[<a href="/misc/goto?guid=4958990853540436230">HIVE-13313</a>] - TABLESAMPLE ROWS feature broken for Vectorization</li>     <li>[<a href="/misc/goto?guid=4958990853661454939">HIVE-13324</a>] - LLAP: history log for FRAGMENT_START doesn't log DagId correctly</li>     <li>[<a href="/misc/goto?guid=4958990853779331818">HIVE-13327</a>] - SessionID added to HS2 threadname does not trim spaces</li>     <li>[<a 
href="/misc/goto?guid=4958990853891947399">HIVE-13330</a>] - ORC vectorized string dictionary reader does not differentiate null vs empty string dictionary</li>     <li>[<a href="/misc/goto?guid=4958990854012042932">HIVE-13346</a>] - LLAP doesn't update metadata priority when reusing from cache; some tweaks in LRFU policy</li>     <li>[<a href="/misc/goto?guid=4958990854129649528">HIVE-13361</a>] - Orc concatenation should enforce the compression buffer size</li>     <li>[<a href="/misc/goto?guid=4958990854254179050">HIVE-13379</a>] - HIVE-12851 args do not work (slider-keytab-dir, etc.)</li>     <li>[<a href="/misc/goto?guid=4958990854368043464">HIVE-13390</a>] - HiveServer2: Add more test to ZK service discovery using MiniHS2</li>     <li>[<a href="/misc/goto?guid=4958990854489659625">HIVE-13394</a>] - Analyze table fails in tez on empty partitions/files/tables</li>     <li>[<a href="/misc/goto?guid=4958990854608575929">HIVE-13396</a>] - LLAP: Include hadoop-metrics2.properties file LlapServiceDriver</li>     <li>[<a href="/misc/goto?guid=4958990854721274481">HIVE-13405</a>] - Fix Connection Leak in OrcRawRecordMerger</li>     <li>[<a href="/misc/goto?guid=4958990854838623102">HIVE-13428</a>] - ZK SM in LLAP should have unique paths per cluster</li>     <li>[<a href="/misc/goto?guid=4958990854962993819">HIVE-13463</a>] - Fix ImportSemanticAnalyzer to allow for different src/dst filesystems</li>     <li>[<a href="/misc/goto?guid=4958990855086580365">HIVE-13468</a>] - branch-2 build is broken</li>     <li>[<a href="/misc/goto?guid=4958990855201602689">HIVE-13523</a>] - Fix connection leak in ORC RecordReader and refactor for unit testing</li>     <li>[<a href="/misc/goto?guid=4958990855322573252">HIVE-13630</a>] - missing license headers</li>     <li>[<a href="/misc/goto?guid=4958990855438907634">HIVE-13645</a>] - Beeline needs null-guard around hiveVars and hiveConfVars read</li>    </ul>    <h3>Improvements</h3>    <ul>     <li>[<a 
href="/misc/goto?guid=4958990855553025641">HIVE-10115</a>] - HS2 running on a Kerberized cluster should offer Kerberos(GSSAPI) and Delegation token(DIGEST) when alternate authentication is enabled</li>     <li>[<a href="/misc/goto?guid=4958990855671897852">HIVE-13120</a>] - propagate doAs when generating ORC splits</li>     <li>[<a href="/misc/goto?guid=4958990855795233084">HIVE-13782</a>] - Compile async query asynchronously</li>    </ul>    <h2>Download</h2>    <ul>     <li><a href="/misc/goto?guid=4958990855918421936" rel="nofollow"><strong>Source code</strong> (zip)</a></li>     <li><a href="/misc/goto?guid=4958990856033655595" rel="nofollow"><strong>Source code</strong> (tar.gz)</a></li>     <li><a href="/misc/goto?guid=4958990856142858539">Download from the official website</a></li>    </ul>