Commit 40d6f17f authored by Joel Hamill

Merge branch '3.3.0-post' into 3.3.1-post

parents 28a30e0f f02ec4db
+1 −0
@@ -2,3 +2,4 @@ target
docs/_build
.idea
*.iml
*.DS_Store
+126 −120
.. _elasticsearch-overview:

Elasticsearch Connector
========================
=======================
The Elasticsearch connector allows moving data from Kafka to Elasticsearch. It writes data from
a topic in Kafka to an `index <https://www.elastic.co/guide/en/elasticsearch/reference/current/_basic_concepts.html#_index>`_
in Elasticsearch and all data for a topic have the same
@@ -26,44 +26,30 @@ connector provides a feature to infer mapping from the schemas of Kafka messages

.. _elasticsearch-quickstart:

Quick start
Quick Start
-----------
This quick start uses the Elasticsearch connector to export data produced by the Avro console
producer to Elasticsearch.

First, start all the necessary services using Confluent CLI:
**Prerequisites:**

.. tip::
- :ref:`Confluent Platform <installation>` is installed and its services are running. This quick start assumes that you started the services with the Confluent CLI, but standalone installations are also supported. By default, the ``confluent start`` command starts ZooKeeper, Kafka, Schema Registry, Kafka Connect REST API, and Kafka Connect. For more information, see :ref:`installation_archive`.
- Elasticsearch 5.x is installed and running.

   If not already in your PATH, add Confluent's ``bin`` directory by running: ``export PATH=<path-to-confluent>/bin:$PATH``
  .. important:: Elasticsearch 6.x is not supported at this time due to a known issue.

.. sourcecode:: bash

   $ confluent start
----------------------------
Add a Record to the Consumer
----------------------------

Every service will start in order, printing a message with its status:
Start the Avro console producer to import a few records to Kafka:

.. sourcecode:: bash

    Starting zookeeper
    zookeeper is [UP]
    Starting kafka
    kafka is [UP]
    Starting schema-registry
    schema-registry is [UP]
    Starting kafka-rest
    kafka-rest is [UP]
    Starting connect
    connect is [UP]

Next, start the Avro console producer to import a few records to Kafka:

.. sourcecode:: bash

  $ ./bin/kafka-avro-console-producer --broker-list localhost:9092 --topic test-elasticsearch-sink \
    <path-to-confluent>/bin/kafka-avro-console-producer --broker-list localhost:9092 --topic test-elasticsearch-sink \
    --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}'

Then in the console producer, type in:
Then in the console producer, enter:

.. sourcecode:: bash

@@ -73,19 +59,43 @@ Then in the console producer, type in:

The three records entered are published to the Kafka topic ``test-elasticsearch-sink`` in Avro format.

Before starting the connector, please make sure that the configurations in
``etc/kafka-connect-elasticsearch/quickstart-elasticsearch.properties`` are properly set to your
configurations of Elasticsearch, e.g. ``connection.url`` points to the correct http address.
Then start the Elasticsearch connector by loading its configuration with the following command:
--------------------------------
Load the Elasticsearch Connector
--------------------------------

Load the predefined Elasticsearch connector.

.. tip:: Before starting the connector, you can verify that the configurations in ``etc/kafka-connect-elasticsearch/quickstart-elasticsearch.properties`` are properly set (e.g. ``connection.url`` points to the correct HTTP address).
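
The check suggested in the tip can be sketched in a few lines of Python. This is a minimal illustration, not part of the connector: the helper names ``parse_properties`` and ``check_connection_url`` are hypothetical, and the ``SAMPLE`` text is an assumed stand-in for the quickstart properties file, whose actual contents may differ.

.. sourcecode:: python

    from urllib.parse import urlparse

    def parse_properties(text):
        """Parse simple key=value lines, skipping blank lines and # comments."""
        props = {}
        for raw in text.splitlines():
            line = raw.strip()
            if not line or line.startswith("#"):
                continue
            key, sep, value = line.partition("=")
            if sep:
                props[key.strip()] = value.strip()
        return props

    def check_connection_url(props):
        """Return True if connection.url is an http(s) URL that names a host."""
        url = urlparse(props.get("connection.url", ""))
        return url.scheme in ("http", "https") and bool(url.netloc)

    # Assumed sample contents; replace with the text of your own properties file.
    SAMPLE = """
    # Hypothetical quickstart settings
    name=elasticsearch-sink
    connection.url=http://localhost:9200
    type.name=kafka-connect
    """

    props = parse_properties(SAMPLE)
    print(check_connection_url(props))

The sketch only validates the shape of the URL. It does not confirm that Elasticsearch is reachable at that address.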

#.  Optional: View the available predefined connectors with this command:

    .. sourcecode:: bash

  $ ./bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties \
  etc/kafka-connect-elasticsearch/quickstart-elasticsearch.properties
        confluent list connectors

    Your output should resemble:

    .. sourcecode:: bash

        Bundled Predefined Connectors (edit configuration under etc/):
          elasticsearch-sink
          file-source
          file-sink
          jdbc-source
          jdbc-sink
          hdfs-sink
          s3-sink

#.  Load the ``elasticsearch-sink`` connector:

    .. sourcecode:: bash

        confluent load elasticsearch-sink

    Your output should resemble:

    .. sourcecode:: bash

   $ confluent load elasticsearch-sink
        {
          "name": "elasticsearch-sink",
          "config": {
@@ -97,34 +107,43 @@ Then start the Elasticsearch connector by loading its configuration with the fol
            "type.name": "kafka-connect",
            "name": "elasticsearch-sink"
          },
     "tasks": []
          "tasks": [],
          "type": null
        }

To check that the connector started successfully view the Connect worker's log by running:
    .. tip:: For non-CLI users, you can load the Elasticsearch connector by running Connect in standalone mode with this command:

        .. sourcecode:: bash

  $ confluent log connect
            $ ./bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties \
            etc/kafka-connect-elasticsearch/quickstart-elasticsearch.properties


Towards the end of the log you should see that the connector starts, logs a few messages, and then exports
data from Kafka to Elasticsearch.
Once the connector finishes ingesting data to Elasticsearch, check that the data is available in Elasticsearch:
#.  After the connector finishes ingesting data to Elasticsearch, check that the data is available in Elasticsearch:

    .. sourcecode:: bash

      $ curl -XGET 'http://localhost:9200/test-elasticsearch-sink/_search?pretty'


    Your output should resemble:

    .. sourcecode:: bash

        {
   "took" : 2,
          "took" : 39,
          "timed_out" : false,
          "_shards" : {
            "total" : 5,
            "successful" : 5,
            "skipped" : 0,
            "failed" : 0
          },
          "hits" : {
     "total" : 1,
            "total" : 3,
            "max_score" : 1.0,
     "hits" : [ {
            "hits" : [
              {
                "_index" : "test-elasticsearch-sink",
                "_type" : "kafka-connect",
                "_id" : "test-elasticsearch-sink+0+0",
@@ -132,42 +151,29 @@ Once the connector finishes ingesting data to Elasticsearch, check that the data
                "_source" : {
                  "f1" : "value1"
                }
     }]
              },
              {
                "_index" : "test-elasticsearch-sink",
                "_type" : "kafka-connect",
                "_id" : "test-elasticsearch-sink+0+2",
                "_score" : 1.0,
                "_source" : {
                  "f1" : "value3"
                }
              },
              {
                "_index" : "test-elasticsearch-sink",
                "_type" : "kafka-connect",
                "_id" : "test-elasticsearch-sink+0+1",
                "_score" : 1.0,
                "_source" : {
                  "f1" : "value2"
                }
              }
            ]
          }
        }
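
To confirm programmatically that all three records arrived, the search response can be parsed as in the following sketch. It is illustrative only: ``summarize_hits`` is a hypothetical helper, ``RESPONSE`` is an abridged copy of the sample output above, and reading ``_id`` as ``<topic>+<partition>+<offset>`` is an assumption inferred from that sample.

.. sourcecode:: python

    import json

    # Abridged copy of the sample search response shown above.
    RESPONSE = """
    {
      "hits": {
        "total": 3,
        "hits": [
          {"_id": "test-elasticsearch-sink+0+0", "_source": {"f1": "value1"}},
          {"_id": "test-elasticsearch-sink+0+2", "_source": {"f1": "value3"}},
          {"_id": "test-elasticsearch-sink+0+1", "_source": {"f1": "value2"}}
        ]
      }
    }
    """

    def summarize_hits(response_text):
        """Return (offset, f1) pairs sorted by the offset encoded in _id."""
        body = json.loads(response_text)
        rows = []
        for hit in body["hits"]["hits"]:
            # Assumption: _id has the form "<topic>+<partition>+<offset>".
            _topic, _partition, offset = hit["_id"].rsplit("+", 2)
            rows.append((int(offset), hit["_source"]["f1"]))
        return sorted(rows)

    rows = summarize_hits(RESPONSE)
    print(rows)

Sorting by offset is needed because Elasticsearch returns hits in score order, which, as the sample shows, does not have to match the order in which the records were produced.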

Finally, stop the Connect worker as well as all the rest of the Confluent services by running:

.. sourcecode:: bash

      $ confluent stop
      Stopping connect
      connect is [DOWN]
      Stopping kafka-rest
      kafka-rest is [DOWN]
      Stopping schema-registry
      schema-registry is [DOWN]
      Stopping kafka
      kafka is [DOWN]
      Stopping zookeeper
      zookeeper is [DOWN]

or stop all the services and additionally wipe out any data generated during this quick start by running:

.. sourcecode:: bash

      $ confluent destroy
      Stopping connect
      connect is [DOWN]
      Stopping kafka-rest
      kafka-rest is [DOWN]
      Stopping schema-registry
      schema-registry is [DOWN]
      Stopping kafka
      kafka is [DOWN]
      Stopping zookeeper
      zookeeper is [DOWN]
      Deleting: /tmp/confluent.w1CpYsaI

Features
--------