Spring XD Reference Guide
Introduction
Overview
Spring XD is a unified, distributed, and extensible service for data ingestion, real time analytics, batch processing, and data export.
Spring XD is an open source project, licensed under the Apache 2 License, whose goal is to tackle big data complexity by simplifying the development of big data applications.
Much of the complexity in building real-world big data applications is related to integrating many disparate systems into one cohesive solution across a range of use-cases.
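Common use-cases encountered when creating a comprehensive big data solution are:
High throughput distributed data ingestion from a variety of input sources into a big data store such as HDFS or Splunk.
Real-time analytics at ingestion time, e.g. gathering metrics and counting values.
Workflow management via batch jobs. The jobs combine interactions with standard enterprise systems (e.g. RDBMS) and Hadoop operations (e.g. MapReduce, HDFS, Pig, Hive or Cascading).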
High throughput data export, e.g. from HDFS to a RDBMS or NoSQL database.
The Spring XD project aims to provide a one stop shop solution for these use-cases.
Getting Started
Requirements
To get started, make sure your system has Java JDK 6 or newer installed. Java JDK 7 is recommended.
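The Java requirement above can be checked from a script. The following is only a minimal sketch: the function names java_major, installed_java_version and java_ok are hypothetical helpers, and it assumes the JVM prints a line of the form version "1.7.0_45" on stderr, which common JDKs do.

```shell
# Extract the major Java version from a version string.
# Legacy strings such as "1.7.0_45" yield 7; newer ones such as "11.0.2" yield 11.
java_major() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;          # drop the legacy "1." prefix
  esac
  printf '%s\n' "${v%%[._-]*}"   # keep everything before the first separator
}

# Read the version string reported by the installed JVM (written to stderr).
installed_java_version() {
  java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Succeed when the given version string meets the Java 6 minimum.
java_ok() {
  [ "$(java_major "$1")" -ge 6 ] 2>/dev/null
}
```

A typical call would be java_ok "$(installed_java_version)" || echo "Java 6 or newer is required".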
Download Spring XD
M4/spring-xd-1.0.0.M4-dist.zip
Unzip the archive. This yields the installation directory spring-xd-1.0.0.M4.
All the commands below are executed from this directory, so change into it before proceeding.
cp spring-xd-1.0.0.M4-dist.zip /opt/
cd /opt/
unzip spring-xd-1.0.0.M4-dist.zip
drwxr-xr-x 7 root root 4096 Nov 12 13:39 spring-xd-1.0.0.M4/
$ cd spring-xd-1.0.0.M4
Set the Environment Variable
Set the environment variable XD_HOME to the installation directory <root-install-dir>\spring-xd\xd
vi /etc/profile
export XD_HOME=/opt/spring-xd-1.0.0.M4/xd
source /etc/profile
root@Master:/etc# echo $XD_HOME
/opt/spring-xd-1.0.0.M4/xd
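Before starting the runtime it can help to confirm that the home directory set above really points at the xd folder of the distribution. The sketch below uses a hypothetical helper, check_xd_home, and assumes only that the xd directory contains bin/xd-singlenode, the launcher used later in this guide.

```shell
# Succeed when the given directory looks like a Spring XD home,
# i.e. it is non-empty and contains the bin/xd-singlenode launcher.
check_xd_home() {
  dir="$1"
  if [ -z "$dir" ]; then
    echo "home directory is not set" >&2
    return 1
  fi
  if [ ! -f "$dir/bin/xd-singlenode" ]; then
    echo "no bin/xd-singlenode under $dir" >&2
    return 1
  fi
  echo "home directory looks valid: $dir"
}
```

For the installation shown here the call would be check_xd_home /opt/spring-xd-1.0.0.M4/xd.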
Install Spring XD
Spring XD can be run in two different modes. There is a single-node runtime option for testing and development, and a distributed runtime that supports distribution of processing tasks across multiple nodes.
This document will get you up and running quickly with a single-node runtime.
See Running Distributed Mode for details on setting up a distributed runtime.
Start the Runtime and the XD Shell
The single node option is the easiest to get started with.
It runs everything you need in a single process. To start it, cd to the xd directory and run the following command:
Startup commands
chmod -R 777 spring-xd-1.0.0.M4
xd/bin>$ ./xd-singlenode
After startup you will see:
INFO: Starting Servlet Engine: Apache Tomcat/7.0.35
XD Configuration:
XD_HOME=/opt/spring-xd-1.0.0.M4/xd
XD_TRANSPORT=local
XD_STORE=memory
XD_ANALYTICS=memory
XD_HADOOP_DISTRO=hadoop12
In a separate terminal, cd into the shell directory and start the XD shell, which you can use to issue commands.
cd /opt/spring-xd-1.0.0.M4/shell/bin
shell/bin>$ ./xd-shell
The shell is a more user-friendly front end to the REST API which Spring XD exposes to clients. The URL of the currently targeted Spring XD server is shown at startup.
You should now be able to start using Spring XD.
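Because the shell is a front end to a REST API, the server can also be queried directly with curl. The sketch below rests on assumptions: the port 9393 is taken from the netstat listing later in this document, the /streams path is assumed to be the stream resource of the REST API, and xd_url, XD_HOST and XD_PORT are hypothetical names introduced only for illustration.

```shell
# Build a URL for the Spring XD REST API.
# Defaults (assumed): host localhost, port 9393.
xd_url() {
  host="${XD_HOST:-localhost}"
  port="${XD_PORT:-9393}"
  # Strip one leading slash so "streams" and "/streams" give the same result.
  printf 'http://%s:%s/%s\n' "$host" "$port" "${1#/}"
}

# Example (requires a running server):
#   curl -s "$(xd_url streams)"
```

Keeping the host and port in variables makes the same helper usable against a remote server without editing the script.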
Create a Stream
In Spring XD, a basic stream defines the ingestion of event-driven data from a source to a sink, passing through any number of processors.
You can create a new stream by issuing a stream create command from the XD shell. Stream definitions are built from a simple DSL. For example, execute:
xd:>stream create --definition "time | log" --name ticktock
Created new stream 'ticktock'
This defines a stream named ticktock based off the DSL expression time | log. The DSL uses the "pipe" symbol |, to connect a source to a sink.
Output in the XD window:
01:47:30,823 WARN task-scheduler-6 logger.ticktock:145 - 2013-12-27 01:47:30
01:47:31,825 WARN task-scheduler-9 logger.ticktock:145 - 2013-12-27 01:47:31
01:47:32,827 WARN task-scheduler-6 logger.ticktock:145 - 2013-12-27 01:47:32
01:47:33,830 WARN task-scheduler-9 logger.ticktock:145 - 2013-12-27 01:47:33
01:47:34,845 WARN task-scheduler-6 logger.ticktock:145 - 2013-12-27 01:47:34
01:47:35,849 WARN task-scheduler-4 logger.ticktock:145 - 2013-12-27 01:47:35
01:47:36,852 WARN task-scheduler-7 logger.ticktock:145 - 2013-12-27 01:47:36
01:47:37,854 WARN task-scheduler-4 logger.ticktock:145 - 2013-12-27 01:47:37
01:47:38,856 WARN task-scheduler-7 logger.ticktock:145 - 2013-12-27 01:47:38
01:47:39,858 WARN task-scheduler-4 logger.ticktock:145 - 2013-12-27 01:47:39
01:47:40,881 WARN task-scheduler-7 logger.ticktock:145 - 2013-12-27 01:47:40
time | log
In this simple example, the time source simply sends the current time as a message each second, and the log sink outputs it using the logging framework at the WARN logging level.
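To make the source-to-sink reading of a definition concrete, the sketch below splits a DSL string on the pipe symbol and prints one module per line, so the first line is the source and the last is the sink. It is a naive illustration, not the parser Spring XD uses: stream_modules is a hypothetical helper, and it assumes no pipe characters appear inside quoted option values.

```shell
# Print the modules of a stream definition, one per line, in pipeline order.
# Naive: splits on every "|" and trims surrounding whitespace.
stream_modules() {
  printf '%s\n' "$1" | tr '|' '\n' | sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
}

# Example:
#   stream_modules "time | log" | head -n 1   # the source
#   stream_modules "time | log" | tail -n 1   # the sink
```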
To stop the stream, and remove the definition completely, you can use the stream destroy command:
xd:>stream destroy --name ticktock
Destroyed stream 'ticktock'
It is also possible to stop and restart the stream instead, using the undeploy and deploy commands. The shell supports command completion, so you can hit the TAB key to see which commands and options are available.
Command 'tab' not found (for assistance press TAB)
xd:>
! // admin
aggregatecounter cls counter
date exit fieldvaluecounter
gauge hadoop help
http job module
richgauge runtime script
stream system version
xd:>
Explore Spring XD
Learn about the modules available in Spring XD in the Sources, Processors, and Sinks sections of the documentation.
Running in Distributed Mode
Introduction
The Spring XD distributed runtime (DIRT) supports distribution of processing tasks across multiple nodes.
Spring XD can use one of several middleware options when running in distributed mode.
At the time of writing, Redis and RabbitMQ are the available options.
curl -d "multihttp --port=9001 --rulepath=passport | file --dir=/home/focusstat/log/passport --name=passport.log"
root@Master:/opt/spring-xd-1.0.0.M4/shell/bin# netstat -antup
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      673/sshd
tcp        0     52 10.1.78.49:22           10.1.77.40:57969        ESTABLISHED 1586/1
tcp        0      0 10.1.78.49:22           10.1.77.40:56054        ESTABLISHED 1025/0
tcp6       0      0 :::22                   :::*                    LISTEN      673/sshd
tcp6       0      0 :::9101                 :::*                    LISTEN      1677/java
tcp6       0      0 :::9393                 :::*                    LISTEN      1677/java
tcp6       0      0 127.0.0.1:9101          127.0.0.1:45686         ESTABLISHED 1677/java
tcp6       0      0 127.0.0.1:9101          127.0.0.1:45687         ESTABLISHED 1677/java
tcp6       0      0 127.0.0.1:45686         127.0.0.1:9101          ESTABLISHED 1677/java
tcp6       0      0 127.0.0.1:45687         127.0.0.1:9101          ESTABLISHED 1677/java
udp        0      0 0.0.0.0:68              0.0.0.0:*                           640/dhclient3
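Rather than reading the listing by eye, a script can check whether a given port is in LISTEN state. The sketch below uses a hypothetical helper, port_listening, and assumes the column layout that netstat -antup prints for TCP sockets: local address in the fourth column and state in the sixth.

```shell
# Read netstat -antup style output on stdin and succeed if some socket
# in LISTEN state is bound to the given port.
port_listening() {
  awk -v port="$1" '
    $6 == "LISTEN" {
      n = split($4, parts, ":")   # local address; the port is the last piece
      if (parts[n] == port) found = 1
    }
    END { exit !found }           # exit status 0 only when a match was seen
  '
}

# Example:
#   netstat -antup | port_listening 9393 && echo "port 9393 is listening"
```

Splitting the local address on ":" and taking the last piece handles both IPv4 entries such as 0.0.0.0:22 and IPv6 entries such as :::9393.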
xd:>stream create --name httptest --definition "http | file"
Created new stream 'httptest'
xd:> http post --target http://localhost:9000 --data "hello world"
> POST (text/plain;Charset=UTF-8) http://localhost:9000 hello world
> 200 OK
root@Master:/tmp/xd/output# tail -f
hello world
root@Master:/tmp/xd/output# curl -d "test" http://localhost:9000
root@Master:/tmp/xd/output# tail -f