Merge branch '3.3.0-post' into 3.3.1-post (40d6f17f) · Commits · REDMIC Project / Confluent / Kafka Connect Elasticsearch

.gitignore

+1 −0

Original line number	Diff line number	Diff line
		@@ -2,3 +2,4 @@ target
		docs/_build
		.idea
		*.iml
		*.DS_Store

docs/elasticsearch_connector.rst

+126 −120

		.. _elasticsearch-overview:

		Elasticsearch Connector
		========================
		=======================
		The Elasticsearch connector allows moving data from Kafka to Elasticsearch. It writes data from
		a topic in Kafka to an `index <https://www.elastic.co/guide/en/elasticsearch/reference/current/_basic_concepts.html#_index>`_
		in Elasticsearch and all data for a topic have the same
		@@ -26,44 +26,30 @@ connector provides a feature to infer mapping from the schemas of Kafka messages

		.. _elasticsearch-quickstart:

		Quick start
		Quick Start
		-----------
		This quick start uses the Elasticsearch connector to export data produced by the Avro console
		producer to Elasticsearch.

		First, start all the necessary services using Confluent CLI:
		Prerequisites:

		.. tip::
		- :ref:`Confluent Platform <installation>` is installed and its services are running. This quick start assumes that you start the services with the Confluent CLI, but standalone installations are also supported. By default, the ``confluent start`` command starts ZooKeeper, Kafka, Schema Registry, Kafka REST Proxy, and Kafka Connect. For more information, see :ref:`installation_archive`.
		- Elasticsearch 5.x is installed and running.

		If not already in your PATH, add Confluent's ``bin`` directory by running: ``export PATH=<path-to-confluent>/bin:$PATH``
		.. important:: Elasticsearch 6.x is not supported at this time due to a known issue.

		.. sourcecode:: bash

		$ confluent start
		----------------------------
		Add a Record to the Consumer
		----------------------------

		Every service will start in order, printing a message with its status:
		Start the Avro console producer to import a few records to Kafka:

		.. sourcecode:: bash

		Starting zookeeper
		zookeeper is [UP]
		Starting kafka
		kafka is [UP]
		Starting schema-registry
		schema-registry is [UP]
		Starting kafka-rest
		kafka-rest is [UP]
		Starting connect
		connect is [UP]

		Next, start the Avro console producer to import a few records to Kafka:

		.. sourcecode:: bash

		$ ./bin/kafka-avro-console-producer --broker-list localhost:9092 --topic test-elasticsearch-sink \
		<path-to-confluent>/bin/kafka-avro-console-producer --broker-list localhost:9092 --topic test-elasticsearch-sink \
		--property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}'

		Then in the console producer, type in:
		Then in the console producer, enter:

		.. sourcecode:: bash

		@@ -73,19 +59,43 @@ Then in the console producer, type in:

		The three records entered are published to the Kafka topic ``test-elasticsearch-sink`` in Avro format.
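As a rough illustration of what the producer expects, the following Python sketch checks JSON lines against the value schema passed to ``kafka-avro-console-producer``. The schema string is copied from the command above; the three record values are an assumption inferred from the search output later in this quick start, and ``matches_schema`` is a hypothetical helper, not part of any Confluent tool. It handles only flat records whose fields are all of type ``string``.

```python
import json

# Schema string copied from the kafka-avro-console-producer command.
VALUE_SCHEMA = '{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}'

# Assumption: record values inferred from the search output in this quick start.
RECORDS = ['{"f1": "value1"}', '{"f1": "value2"}', '{"f1": "value3"}']


def matches_schema(line, schema):
    """Return True if the JSON line has exactly the schema's fields, all strings."""
    record = json.loads(line)
    fields = schema["fields"]
    if set(record) != {f["name"] for f in fields}:
        return False
    # Only the "string" type is handled, since it is the only type this schema uses.
    return all(
        f["type"] == "string" and isinstance(record[f["name"]], str)
        for f in fields
    )


schema = json.loads(VALUE_SCHEMA)
for line in RECORDS:
    print(line, matches_schema(line, schema))
```

A line with a missing field, an extra field, or a non-string ``f1`` value fails this check; the real producer performs its own Avro validation, which is stricter than this sketch.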

		Before starting the connector, please make sure that the configurations in
		``etc/kafka-connect-elasticsearch/quickstart-elasticsearch.properties`` are properly set to your
		configurations of Elasticsearch, e.g. ``connection.url`` points to the correct http address.
		Then start the Elasticsearch connector by loading its configuration with the following command:
		--------------------------------
		Load the Elasticsearch Connector
		--------------------------------

		Load the predefined Elasticsearch connector.

		.. tip:: Before starting the connector, you can verify that the configurations in ``etc/kafka-connect-elasticsearch/quickstart-elasticsearch.properties`` are properly set (e.g. ``connection.url`` points to the correct HTTP address).
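The check described in the tip can be sketched in Python. This is a minimal sketch, not a Confluent utility: the sample properties text is hypothetical (the real ``quickstart-elasticsearch.properties`` may contain different keys), ``parse_properties`` and ``has_http_connection_url`` are illustrative names, and the parser covers only simple ``key=value`` lines with ``#`` comments rather than the full Java properties syntax.

```python
from urllib.parse import urlparse

# Hypothetical file contents; the real quickstart file may differ.
SAMPLE_PROPERTIES = """\
# comment lines and blank lines are skipped
name=elasticsearch-sink
connection.url=http://localhost:9200
type.name=kafka-connect
"""


def parse_properties(text):
    """Parse simple key=value lines into a dict."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def has_http_connection_url(props):
    """Return True if connection.url is an http(s) URL with a host part."""
    url = urlparse(props.get("connection.url", ""))
    return url.scheme in ("http", "https") and bool(url.netloc)


props = parse_properties(SAMPLE_PROPERTIES)
print(has_http_connection_url(props))
```

To check a real file, read it with ``open(path).read()`` and pass the text to ``parse_properties``. The check only confirms that the value looks like an HTTP address; it does not contact Elasticsearch.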

		#. Optional: View the available predefined connectors with this command:

		.. sourcecode:: bash

		$ ./bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties \
		etc/kafka-connect-elasticsearch/quickstart-elasticsearch.properties
		confluent list connectors

		Your output should resemble:

		.. sourcecode:: bash

		Bundled Predefined Connectors (edit configuration under etc/):
		elasticsearch-sink
		file-source
		file-sink
		jdbc-source
		jdbc-sink
		hdfs-sink
		s3-sink

		#. Load the ``elasticsearch-sink`` connector:

		.. sourcecode:: bash

		confluent load elasticsearch-sink

		Your output should resemble:

		.. sourcecode:: bash

		$ confluent load elasticsearch-sink
		{
		"name": "elasticsearch-sink",
		"config": {
		@@ -97,34 +107,43 @@ Then start the Elasticsearch connector by loading its configuration with the fol
		"type.name": "kafka-connect",
		"name": "elasticsearch-sink"
		},
		"tasks": []
		"tasks": [],
		"type": null
		}

		To check that the connector started successfully, view the Connect worker's log by running:
		.. tip:: If you are not using the Confluent CLI, you can load the Elasticsearch connector by running Connect in standalone mode with this command:

		.. sourcecode:: bash

		$ confluent log connect
		$ ./bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties \
		etc/kafka-connect-elasticsearch/quickstart-elasticsearch.properties


		Towards the end of the log you should see that the connector starts, logs a few messages, and then exports
		data from Kafka to Elasticsearch.
		Once the connector finishes ingesting data to Elasticsearch, check that the data is available in Elasticsearch:
		#. After the connector finishes ingesting data to Elasticsearch, check that the data is available in Elasticsearch:

		.. sourcecode:: bash

		$ curl -XGET 'http://localhost:9200/test-elasticsearch-sink/_search?pretty'


		Your output should resemble:

		.. sourcecode:: bash

		{
		"took" : 2,
		"took" : 39,
		"timed_out" : false,
		"_shards" : {
		"total" : 5,
		"successful" : 5,
		"skipped" : 0,
		"failed" : 0
		},
		"hits" : {
		"total" : 1,
		"total" : 3,
		"max_score" : 1.0,
		"hits" : [ {
		"hits" : [
		{
		"_index" : "test-elasticsearch-sink",
		"_type" : "kafka-connect",
		"_id" : "test-elasticsearch-sink+0+0",
		@@ -132,42 +151,29 @@ Once the connector finishes ingesting data to Elasticsearch, check that the data
		"_source" : {
		"f1" : "value1"
		}
		}]
		},
		{
		"_index" : "test-elasticsearch-sink",
		"_type" : "kafka-connect",
		"_id" : "test-elasticsearch-sink+0+2",
		"_score" : 1.0,
		"_source" : {
		"f1" : "value3"
		}
		},
		{
		"_index" : "test-elasticsearch-sink",
		"_type" : "kafka-connect",
		"_id" : "test-elasticsearch-sink+0+1",
		"_score" : 1.0,
		"_source" : {
		"f1" : "value2"
		}
		}
		]
		}
		}
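The hits in the response are not returned in the order the records were produced. The following Python sketch reads a trimmed copy of the response above and lists the ``f1`` values in offset order. It assumes that each ``_id`` has the form ``topic+partition+offset``, as the sample IDs suggest; ``split_id`` and ``values_in_offset_order`` are hypothetical helper names.

```python
import json

# Trimmed copy of the search response (only the fields used here).
RESPONSE = """
{"hits": {"total": 3, "hits": [
  {"_id": "test-elasticsearch-sink+0+0", "_source": {"f1": "value1"}},
  {"_id": "test-elasticsearch-sink+0+2", "_source": {"f1": "value3"}},
  {"_id": "test-elasticsearch-sink+0+1", "_source": {"f1": "value2"}}
]}}
"""


def split_id(doc_id):
    """Split an ID into (topic, partition, offset)."""
    # Assumption: the ID has the form topic+partition+offset.
    # rsplit keeps any "+" inside the topic name intact.
    topic, partition, offset = doc_id.rsplit("+", 2)
    return topic, int(partition), int(offset)


def values_in_offset_order(response):
    """Return the f1 values sorted by (partition, offset)."""
    hits = response["hits"]["hits"]
    ordered = sorted(hits, key=lambda h: split_id(h["_id"])[1:])
    return [h["_source"]["f1"] for h in ordered]


print(values_in_offset_order(json.loads(RESPONSE)))
```

Sorting on the numeric partition and offset, rather than on the raw ID string, avoids placing offset 10 before offset 2.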

		Finally, stop the Connect worker as well as all the rest of the Confluent services by running:

		.. sourcecode:: bash

		$ confluent stop
		Stopping connect
		connect is [DOWN]
		Stopping kafka-rest
		kafka-rest is [DOWN]
		Stopping schema-registry
		schema-registry is [DOWN]
		Stopping kafka
		kafka is [DOWN]
		Stopping zookeeper
		zookeeper is [DOWN]

		or stop all the services and additionally wipe out any data generated during this quick start by running:

		.. sourcecode:: bash

		$ confluent destroy
		Stopping connect
		connect is [DOWN]
		Stopping kafka-rest
		kafka-rest is [DOWN]
		Stopping schema-registry
		schema-registry is [DOWN]
		Stopping kafka
		kafka is [DOWN]
		Stopping zookeeper
		zookeeper is [DOWN]
		Deleting: /tmp/confluent.w1CpYsaI

		Features
		--------