Integration of Hive and Elasticsearch on Cloudera Hadoop (Hive 1.1.0)



I am trying to integrate Hive with Elasticsearch and am running into the issue described below.

The Hive version on my CentOS machine is:
Hive 1.1.0-cdh5.8.0

The Elasticsearch version is elasticsearch-2.3.5.
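
Since the connector jar I add in the Hive shell is 2.1.1 while the server is 2.3.5, it may be worth confirming the exact version the running node reports. The following is only a minimal sketch, assuming Elasticsearch is reachable at http://localhost:9200/ (the same address used in the table properties); the function names es_version and fetch_es_version are illustrative and not part of any library.

```python
import json
import urllib.request


def es_version(root_response: str) -> str:
    """Extract version.number from the JSON served at the Elasticsearch root URL."""
    return json.loads(root_response)["version"]["number"]


def fetch_es_version(base_url: str = "http://localhost:9200/") -> str:
    """Fetch the root document of the node and return its reported version."""
    # Assumption: the node answers plain HTTP without authentication.
    with urllib.request.urlopen(base_url, timeout=5) as resp:
        return es_version(resp.read().decode("utf-8"))


if __name__ == "__main__":
    try:
        print("Elasticsearch reports version", fetch_es_version())
    except OSError as exc:
        # URLError is a subclass of OSError; reached when the node is down.
        print("Could not reach Elasticsearch:", exc)
```

The Hive side can be checked separately with `hive --version` on the command line.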

In the Hive shell, I run the following commands:

hive> ADD JAR /home/cloudera/Downloads/elasticsearch-hadoop-2.1.1.jar;
Added [/home/cloudera/Downloads/elasticsearch-hadoop-2.1.1.jar] to class path
Added resources: [/home/cloudera/Downloads/elasticsearch-hadoop-2.1.1.jar]

hive> CREATE EXTERNAL TABLE mcollect(Product string)
ROW FORMAT SERDE 'org.elasticsearch.hadoop.hive.EsSerDe'
STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
TBLPROPERTIES('es.nodes' = 'http://localhost:9200/', 'es.index.auto.create' = 'true', 'es.resource' = 'ecommerce/Product', 'es.mapping.Product' = 'Product:esproduct');
OK
Time taken: 0.638 seconds

When I then run the following insert, I get the error shown below:

hive> INSERT OVERWRITE TABLE mcollect select Product from sales2;

Query ID = cloudera_20160831013434_9d2882a6-67fa-40e4-9902-8b59d01428ab
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1472630763866_0002, Tracking URL = http://quickstart.cloudera:8088/proxy/application_1472630763866_0002/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1472630763866_0002
Hadoop job information for Stage-0: number of mappers: 1; number of reducers: 0
2016-08-31 01:34:14,895 Stage-0 map = 0%, reduce = 0%
2016-08-31 01:34:49,589 Stage-0 map = 100%, reduce = 0%
Ended Job = job_1472630763866_0002 with errors
Error during job, obtaining debugging information...
Job Tracking URL: http://quickstart.cloudera:8088/proxy/application_1472630763866_0002/
Examining task ID: task_1472630763866_0002_m_000000 (and more) from job job_1472630763866_0002

Task with the most failures(4):
Task ID:
task_1472630763866_0002_m_000000

URL:
http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1472630763866_0002&tipid=task_1472630763866_0002_m_000000

Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"product":"prodq"}
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:179)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row {"product":"prodq"}
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:507)
at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:170)
... 8 more
Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: -11
at java.lang.String.substring(String.java:1911)
at org.elasticsearch.hadoop.rest.RestClient.discoverNodes(RestClient.java:110)
at org.elasticsearch.hadoop.rest.InitializationUtils.discoverNodesIfNeeded(InitializationUtils.java:58)
at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:374)
at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.init(EsOutputFormat.java:173)
at org.elasticsearch.hadoop.hive.EsHiveOutputFormat$EsHiveRecordWriter.write(EsHiveOutputFormat.java:58)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:697)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:497)
... 9 more

FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
MapReduce Jobs Launched:
Stage-Stage-0: Map: 1 HDFS Read: 0 HDFS Write: 0 FAIL
Total MapReduce CPU Time Spent: 0 msec.

The sales2 table is defined as follows:
create external table sales2(Product string) row format delimited fields terminated by ',' location '/user/cloudera/dir1';

Any help with this issue would be appreciated.

Thanks
Prosenjit