Importing a table from Teradata into Hadoop HDFS/Hive using the TDCH command-line interface (in two different environments)

Explanation of importing a table from Teradata into Hive/HDFS

-----------------------------------------
a) The Teradata database is at 192.168.229.130, and the Cloudera/Hortonworks cluster runs in a different operating system.
b) The Teradata Connector for Hadoop (TDCH) is installed on Cloudera; its library files are stored in the lib folder, and its configuration files (*.xml) are stored in the Hadoop home ($HADOOP_HOME) conf directory.

c) teradata-export-properties.xml.template and
teradata-import-properties.xml.template (the property file templates from b)

d) tdgssconfig.jar, teradata-connector-1.4.1.jar, and terajdbc4.jar in the lib folder of TDCH (these jars must be visible to the job; see the classpath exports in Step 1).

e) ConnectorImportTool is used for import jobs.
f) ConnectorExportTool is used for export jobs.
g) ConnectorPluginTool is another way to run the desired import/export, where the job is described by the command-line parameters sourceplugin/targetplugin and plugin-specific parameters passed with the -D<...> option, as sketched below.
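
For illustration, a hedged sketch of a ConnectorPluginTool invocation follows (using the TDCH_JAR variable defined in Step 1 below). The -sourceplugin/-targetplugin values and the -D property key shown here are assumptions for illustration only; the exact plugin identifiers and property names differ between TDCH versions, so check the plugin list shipped with your TDCH build.

cli>
# Sketch only: the plugin names (teradata, hdfs-textfile) and the -D
# property key are assumed values, not confirmed TDCH identifiers.
hadoop jar $TDCH_JAR com.teradata.connector.common.tool.ConnectorPluginTool \
-sourceplugin teradata \
-targetplugin hdfs-textfile \
-D tdch.input.teradata.jdbc.url=jdbc:teradata://192.168.229.130/DATABASE=dbc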
 
Step 1:
cli> export TDCH_JAR=/usr/lib/tdch/1.4/lib/teradata-connector-1.4.1.jar
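
Before running the import, the Teradata JDBC jars from (d) usually have to be visible to the job. A minimal sketch, assuming the jars sit in /usr/lib/tdch/1.4/lib (adjust the paths to your install):

cli> export LIB_JARS=/usr/lib/tdch/1.4/lib/terajdbc4.jar,/usr/lib/tdch/1.4/lib/tdgssconfig.jar
cli> export HADOOP_CLASSPATH=/usr/lib/tdch/1.4/lib/terajdbc4.jar:/usr/lib/tdch/1.4/lib/tdgssconfig.jar

If the driver jars are not already distributed across the cluster, they can be shipped with the job via Hadoop's generic -libjars option (e.g. -libjars $LIB_JARS placed right after the tool class name).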

cli>
hadoop jar $TDCH_JAR com.teradata.connector.common.tool.ConnectorImportTool \
-classname com.teradata.jdbc.TeraDriver \
-url jdbc:teradata://192.168.229.130/DATABASE=dbc \
-sourcetable tables \
-username dbc -password dbc \
-jobtype hdfs -fileformat textfile \
-method split.by.hash \
-separator "," \
-splitbycolumn DatabaseName \
-targetpaths /user/hadoop/sales_transaction

NOTES:
Database in Teradata: DBC
Table name in Teradata (sourcetable): tables
Username for Teradata: dbc
Password for Teradata: dbc
jobtype: hdfs (the table is imported into HDFS, which is why the job type is hdfs)
targetpaths: the HDFS path where the imported file is stored.
separator: the fields of the imported data are separated by ',' in HDFS.
fileformat: the imported file is stored as a textfile.
method: split.by.hash
splitbycolumn: the table column on which the data is split across the mappers/reducers (used as the key).
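
The title also mentions Hive. A hedged variant of the same job targeting a Hive table is sketched below: -jobtype hive and -targettable are TDCH options, but the target table name default.sales_transaction is a placeholder for illustration, and a Hive job may additionally need the Hive configuration and libraries on the classpath.

cli>
# Sketch only: default.sales_transaction is an assumed Hive table name.
hadoop jar $TDCH_JAR com.teradata.connector.common.tool.ConnectorImportTool \
-classname com.teradata.jdbc.TeraDriver \
-url jdbc:teradata://192.168.229.130/DATABASE=dbc \
-sourcetable tables \
-username dbc -password dbc \
-jobtype hive -fileformat textfile \
-method split.by.hash \
-splitbycolumn DatabaseName \
-targettable default.sales_transaction

After the HDFS job finishes, the imported files can be checked in place (the part-file name below is an assumption; mapper output files are usually named part-m-*):

cli> hadoop fs -ls /user/hadoop/sales_transaction
cli> hadoop fs -cat /user/hadoop/sales_transaction/part-m-00000 | head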
