Kafka Connect, introduced in Kafka 0.9, makes it easy to import data from external sources into a Kafka data stream (a designated topic) and to export data from a topic back out to external sinks.
Following the official quickstart, this section walks through using the File Connector to import data into a Kafka topic and then export it to a file:
(1) Create a text file, test.txt, to act as the external data source.
[root@localhost home]# echo -e "connector\ntest" > test.txt
(2) Start the standalone Connect script shipped with Kafka; its sample configuration uses the File Connectors by default.
[root@localhost kafka_2.12-0.10.2.0]# ./bin/connect-standalone.sh config/connect-standalone.properties config/connect-file-source.properties config/connect-file-sink.properties
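The behavior of this command is driven by the three properties files passed on the command line. In the 0.10.x distribution the source and sink connector configs look roughly like the following (verify against the copies in your own config/ directory); note that file=test.txt is a relative path, which is why the file's location matters in the next step:

```properties
# config/connect-file-source.properties (reads test.txt into topic connect-test)
name=local-file-source
connector.class=FileStreamSource
tasks.max=1
file=test.txt
topic=connect-test

# config/connect-file-sink.properties (writes topic connect-test to test.sink.txt)
name=local-file-sink
connector.class=FileStreamSink
tasks.max=1
file=test.sink.txt
topics=connect-test
```

The third file, connect-standalone.properties, sets bootstrap.servers and the JSON converters used for the messages shown later.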
If the error below appears, the file is in the wrong place: by default, test.txt must be created in the Kafka installation directory, at the same level as the bin directory.
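The fix can be verified by recreating the file in the directory Connect is started from (the Kafka root; the exact path from the original session is illustrative, so adjust to your installation):

```shell
# Run from the Kafka root directory, i.e. the one containing bin/.
# printf is used instead of `echo -e` for portability across shells.
printf 'connector\ntest\n' > test.txt
cat test.txt
```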
[2017-03-20 13:36:14,879] WARN Couldn't find file test.txt for FileStreamSourceTask, sleeping to wait for it to be created (org.apache.kafka.connect.file.FileStreamSourceTask:106)
The errors below usually mean Connect cannot reach the Kafka cluster; in this case ZooKeeper had stopped while running in standalone mode. Restart the ZooKeeper server, and if the errors persist, restart the Kafka broker as well.
[2017-03-20 13:38:07,832] ERROR Failed to commit offsets for WorkerSourceTask{id=local-file-source-0} (org.apache.kafka.connect.runtime.SourceTaskOffsetCommitter:112)
[2017-03-20 13:38:22,833] ERROR Failed to flush WorkerSourceTask{id=local-file-source-0}, timed out while waiting for producer to flush outstanding 1 messages (org.apache.kafka.connect.runtime.WorkerSourceTask:304)
(3) Inspect the exported file, test.sink.txt, to see the consumed messages.
[root@localhost kafka_2.12-0.10.2.0]# cat test.sink.txt
connector
test
(4) The messages have also been stored in the topic connect-test, so a consumer can be started to read them.
[root@localhost kafka_2.12-0.10.2.0]# ./bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic connect-test --from-beginning &
Messages read by the consumer:
{"schema":{"type":"string","optional":false},"payload":"connector"}
{"schema":{"type":"string","optional":false},"payload":"test"}
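The schema/payload wrapper around each line comes from the JsonConverter configured in connect-standalone.properties (with schemas.enable=true). To recover just the raw line from one of these records, a quick sed works; this is illustrative only, and a real consumer would use a proper JSON parser:

```shell
# Strip the JsonConverter envelope and print only the payload field.
echo '{"schema":{"type":"string","optional":false},"payload":"connector"}' \
  | sed 's/.*"payload":"\([^"]*\)".*/\1/'
# -> connector
```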
(5) Edit test.txt and append another line. Because the connector is already running, the new message appears at the consumer in near real time.
[root@localhost kafka_2.12-0.10.2.0]# echo "Another line" >> test.txt
The new message is consumed in real time:
{"schema":{"type":"string","optional":false},"payload":"connector"}
{"schema":{"type":"string","optional":false},"payload":"test"}
{"schema":{"type":"string","optional":false},"payload":"Another line"}
本文属作者原创,转贴请声明!