Apache Sqoop 1.4.5 released: a tool for transferring data between Hadoop and relational databases

jopen, 10 years ago

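Sqoop is a tool for transferring data between Hadoop and relational databases. It can import data from a relational database (for example MySQL, Oracle, or Postgres) into Hadoop's HDFS, and it can export data from HDFS back into a relational database.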
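The two directions described above map onto Sqoop's `import` and `export` tools. The following is a minimal Python sketch, not taken from the release notes, that only assembles the two command lines and prints them. The JDBC URL, table name, HDFS paths, user name, and the helper function names are all hypothetical placeholders; the flags used (`--connect`, `--username`, `--password-file`, `--table`, `--target-dir`, `--export-dir`) are standard Sqoop 1.4.x arguments.

```python
import shlex
from typing import List


def build_import_command(jdbc_url: str, table: str, target_dir: str,
                         username: str, password_file: str) -> List[str]:
    """Build the argv for `sqoop import` (database table -> HDFS directory)."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,
        "--target-dir", target_dir,
    ]


def build_export_command(jdbc_url: str, table: str, export_dir: str,
                         username: str, password_file: str) -> List[str]:
    """Build the argv for `sqoop export` (HDFS directory -> database table)."""
    return [
        "sqoop", "export",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,
        "--export-dir", export_dir,
    ]


if __name__ == "__main__":
    # All values below are illustrative assumptions, not real endpoints.
    url = "jdbc:mysql://db.example.com/sales"
    pw_file = "/user/etl/.db-password"

    import_cmd = build_import_command(url, "orders", "/user/etl/orders",
                                      "etl", pw_file)
    export_cmd = build_export_command(url, "orders_summary",
                                      "/user/etl/orders_summary",
                                      "etl", pw_file)

    # Print shell-quoted command lines instead of executing them.
    print(shlex.join(import_cmd))
    print(shlex.join(export_cmd))
```

Passing either list to `subprocess.run(cmd, check=True)` would launch the job, assuming `sqoop` is on the `PATH` and the cluster and database are reachable. Building the command as a list rather than a single string avoids shell-quoting problems with JDBC URLs.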

Apache Sqoop 1.4.5 has been released. This is the fourth release since Sqoop became an Apache top-level project (TLP).

Sub-tasks

  • [SQOOP-1194] - Make changes to Sqoop build file to enable Netezza third party tests

  • [SQOOP-1323] - Update HCatalog version to 0.13 in Sqoop builds

  • [SQOOP-1324] - Support new hive datatypes in Sqoop hcatalog integration

  • [SQOOP-1325] - Make hcatalog object names escaped during creation so that reserved words are properly processed

  • [SQOOP-1326] - Support multiple static partition keys for better integration support

  • [SQOOP-1357] - QA testing of Data Connector for Oracle and Hadoop

  • [SQOOP-1363] - Document Hcatalog integration enhancements introduced in SQOOP-1322

Bug fixes

  • [SQOOP-585] - Bug when sqoop a join of two tables with the same column name with mysql backend

  • [SQOOP-832] - Document --columns argument usage in export tool

  • [SQOOP-1032] - Add the --bulk-load-dir option to support the HBase doBulkLoad function

  • [SQOOP-1107] - Further improve error reporting when exporting malformed data

  • [SQOOP-1117] - when failed to import a non-existing table, the failure information includes NullPointerException

  • [SQOOP-1138] - incremental lastmodified should re-use output directory

  • [SQOOP-1167] - Enhance HCatalog support to allow direct mode connection manager implementations

  • [SQOOP-1170] - Can't import columns with name "public"

  • [SQOOP-1179] - Incorrect warning saying --hive-import was not specified when it was specified

  • [SQOOP-1185] - LobAvroImportTestCase is sensitive to test method order execution

  • [SQOOP-1190] - Class HCatHadoopShims will be removed in HCatalog 0.12

  • [SQOOP-1192] - Add option "--skip-dist-cache" to allow Sqoop not copying jars in %SQOOP_HOME%\lib folder when launched by Oozie and use Oozie share lib

  • [SQOOP-1209] - DirectNetezzaManager fails to find tables from older Netezza system catalogs

  • [SQOOP-1216] - Improve error message on corrupted input while doing export

  • [SQOOP-1224] - Enable use of Oracle Wallets with Oracle Manager

  • [SQOOP-1226] - --password-file option triggers FileSystemClosed exception at end of Oozie action

  • [SQOOP-1227] - Sqoop fails to compile against commons-io higher then 1.4

  • [SQOOP-1228] - Method Configuration#unset is not available on Hadoop < 1.2.0

  • [SQOOP-1239] - Sqoop import code too large error

  • [SQOOP-1246] - HBaseImportJob should add job authtoken only if HBase is secured

  • [SQOOP-1249] - Sqoop HCatalog Import fails with -queries because of validation issues

  • [SQOOP-1250] - Oracle connector is not disabling autoCommit on created connections

  • [SQOOP-1259] - Sqoop on Windows can't run HCatalog/HBase multinode jobs

  • [SQOOP-1260] - HADOOP_MAPRED_HOME should be defaulted correctly

  • [SQOOP-1261] - CompilationManager should add Hadoop 2.x libraries to the classpath under Hadoop 2.x

  • [SQOOP-1268] - Sqoop tarballs do not contain .gitignore and .gitattribute files

  • [SQOOP-1271] - Sqoop hcatalog location should support older bigtop default location also

  • [SQOOP-1273] - Multiple append jobs can easily end up sharing directories

  • [SQOOP-1278] - Allow use of uncommitted isolation for databases that support it as an import option

  • [SQOOP-1279] - Sqoop connection resiliency option breaks older Mysql versions that don't have JDBC 4 methods

  • [SQOOP-1302] - Doesn't run the mapper for remaining splits, when split-by ROWNUM

  • [SQOOP-1303] - Can only write to default file system on incremental import

  • [SQOOP-1316] - Example for use of password file in docs is incorrect

  • [SQOOP-1322] - Enhance Sqoop HCatalog Integration to cover features introduced in newer Hive versions

  • [SQOOP-1329] - JDBC connection to Oracle timeout after data import but before hive metadata import

  • [SQOOP-1339] - Synchronize .gitignore files

  • [SQOOP-1353] - Sqoop 1.4.5 release preparation

  • [SQOOP-1358] - Add wallet support for Oracle High performance connector

  • [SQOOP-1359] - Fix avro versions in Sqoop to stop shipping hadoop1 jars with hadoop2

  • [SQOOP-1362] - TestImportJob getContent method doesn't work

  • [SQOOP-1365] - Do not print stack trace when we can't move generated .java file to CWD

  • [SQOOP-1370] - AccumuloUtils can throw NPE when zookeeper or accumulo home is null

  • [SQOOP-1372] - configure-sqoop does not export ZOOKEEPER_HOME

  • [SQOOP-1398] - Upgrade ivy version used to the latest release version

  • [SQOOP-1399] - Fix TestOraOopJdbcUrl test case

  • [SQOOP-1406] - Add license headers

  • [SQOOP-1410] - Update change log for 1.4.5

Improvements

  • [SQOOP-435] - Avro import should write the Schema to a file

  • [SQOOP-1056] - Implement connection resiliency in Sqoop using pluggable failure handlers

  • [SQOOP-1132] - Print out Sqoop version into log during execution

  • [SQOOP-1137] - Put a stress in the user guide that eval tool is meant for evaluation purpose only

  • [SQOOP-1161] - Generated Delimiter Set Field Should be Static

  • [SQOOP-1172] - Make Sqoop compatible with HBase 0.95+

  • [SQOOP-1203] - Add another default case for finding *_HOME when not explicitly defined

  • [SQOOP-1212] - Do not print usage on wrong command line

  • [SQOOP-1213] - Support reading password files from Amazon S3

  • [SQOOP-1223] - Enhance the password file capability to enable plugging-in custom loaders

  • [SQOOP-1282] - Consider avro files even if they carry no extension

  • [SQOOP-1321] - Add ability to serialize SqoopOption into JobConf

  • [SQOOP-1337] - Doc refactoring - Consolidate documentation of --direct

  • [SQOOP-1341] - Sqoop Export Upsert for MySQL lacks batch support

  • [SQOOP-1373] - Sqoop import schema is locked shows NullPointerException

New features

  • [SQOOP-767] - Add support for Accumulo

  • [SQOOP-1051] - Support direct mode connection managers in a generalized fashion

  • [SQOOP-1197] - Enable Sqoop to build against Hadoop-2.1.0-beta jar files

  • [SQOOP-1287] - Add high performance Oracle connector into Sqoop

Tasks

  • [SQOOP-1207] - Allow user to override java source version

  • [SQOOP-1344] - Add documentation for Oracle connector

  • [SQOOP-1408] - Document SQL Server's --non-resilient arg

Tests

  • [SQOOP-1057] - Introduce fault injection framework to test connection resiliency

Source: http://www.oschina.net/news/54434/apache-sqoop-1-4-5-released