Deploying an Apache Spark Cluster on Ubuntu

jopen, 8 years ago

1. Software environment

This article describes how to deploy an Apache Spark standalone cluster on Ubuntu. The following software is required:

  • Ubuntu 15.10 x64
  • Apache Spark 1.5.1

2. Install everything required

# sudo apt-get install git -y
# sudo apt-add-repository ppa:webupd8team/java -y
# sudo apt-get update -y
# sudo apt-get install oracle-java8-installer -y
# sudo apt-get install oracle-java8-set-default
# sudo apt-get install maven gradle -y
# sudo apt-get install sbt -y
# sudo wget http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz
# sudo tar -xvf spark*.tgz
# sudo chmod 755 spark*
# sudo apt-get update
# sudo apt-get install -y openjdk-7-jdk
# sudo apt-get install -y autoconf libtool
# sudo apt-get -y install build-essential python-dev python-boto libcurl4-nss-dev libsasl2-dev maven libapr1-dev libsvn-dev
# sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E56151BF
# DISTRO=$(lsb_release -is | tr '[:upper:]' '[:lower:]')
# CODENAME=$(lsb_release -cs)
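After a long install run like this, it can help to confirm that the tools actually ended up on the PATH. The following is a minimal sketch, not part of the original procedure; the function name `missing_tools` is hypothetical, and the binary names in the usage comment (`java`, `mvn`, and so on) are assumptions about what the packages above provide.

```shell
# Print every command from the argument list that is not found on PATH.
# Returns 0 when nothing is missing, 1 otherwise.
missing_tools() {
  local tool
  local status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

# Example (binary names are assumptions, adjust to your setup):
# missing_tools git java mvn gradle sbt
```

An empty output together with a zero exit status would indicate that every listed tool was found.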

Add the Mesosphere repository to the software sources and install Mesos:

# echo "deb http://repos.mesosphere.io/${DISTRO} ${CODENAME} main" | \
    sudo tee /etc/apt/sources.list.d/mesosphere.list
# sudo apt-get -y update
# sudo apt-get -y install mesos
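To make clear what the DISTRO and CODENAME variables contribute, the repository line can be built by a small helper. This is only an illustrative sketch; the function name `mesosphere_repo_line` is hypothetical, and it assumes the same inputs that `lsb_release -is` and `lsb_release -cs` print.

```shell
# Build the apt source line for the Mesosphere repository.
# $1: distributor ID as printed by `lsb_release -is` (e.g. Ubuntu)
# $2: release codename as printed by `lsb_release -cs` (e.g. wily)
mesosphere_repo_line() {
  local distro
  distro=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  printf 'deb http://repos.mesosphere.io/%s %s main\n' "$distro" "$2"
}

# Example:
# mesosphere_repo_line "$(lsb_release -is)" "$(lsb_release -cs)" |
#   sudo tee /etc/apt/sources.list.d/mesosphere.list
```

On Ubuntu 15.10, whose codename is wily, this would yield `deb http://repos.mesosphere.io/ubuntu wily main`.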

Apache Mesos is also installed so that the Spark cluster can later be upgraded from standalone mode.

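For the Spark standalone cluster, the spark-1.5.1-bin-hadoop2.6 build is used. Set the node's IP address in conf/spark-env.sh (replace MYIP with the actual address):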

conf/spark-env.sh:

#!/usr/bin/env bash
export SPARK_LOCAL_IP=MYIP
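When several nodes need the same file with different addresses, the configuration can be generated rather than edited by hand. The sketch below is one possible way to do that; the function name `write_spark_env` and the example address are hypothetical.

```shell
# Write a minimal spark-env.sh into the given conf directory.
# $1: path to Spark's conf directory
# $2: IP address this node should bind to
write_spark_env() {
  local conf_dir="$1"
  local ip="$2"
  cat > "$conf_dir/spark-env.sh" <<EOF
#!/usr/bin/env bash
export SPARK_LOCAL_IP=$ip
EOF
  chmod +x "$conf_dir/spark-env.sh"
}

# Example (path and address are placeholders):
# write_spark_env spark-1.5.1-bin-hadoop2.6/conf 192.168.1.10
```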

3. Start a node

# sbin/start-slave.sh spark://masterIP:7077
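A worker registers with the master through a URL of the form spark://HOST:PORT, where 7077 is the default master port. The helper below is a small sketch for building that URL; the name `spark_master_url` is hypothetical, and the usage comments assume the master was started with Spark's `sbin/start-master.sh` script.

```shell
# Build a standalone master URL of the form spark://HOST:PORT.
# $1: master host name or IP
# $2: master port (defaults to 7077)
spark_master_url() {
  local host="$1"
  local port="${2:-7077}"
  printf 'spark://%s:%s\n' "$host" "$port"
}

# On the master:  sbin/start-master.sh
# On each worker: sbin/start-slave.sh "$(spark_master_url masterIP)"
```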

For details, see the official Spark documentation.

4. Install other tools and servers

1) Install MongoDB 3.0.4

# sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
# echo "deb http://repo.mongodb.org/apt/ubuntu "$(lsb_release -sc)"/mongodb-org/3.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.0.list
# sudo apt-get update
# sudo apt-get install -y mongodb-org
# sudo apt-get install -y mongodb-org=3.0.4 mongodb-org-server=3.0.4 mongodb-org-shell=3.0.4 mongodb-org-mongos=3.0.4 mongodb-org-tools=3.0.4
# sudo service mongod start
# sudo tail -5000 /var/log/mongodb/mongod.log
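The version-pinned install command repeats the same version for five packages, which is easy to get wrong when changing versions. As a sketch only, the package list can be generated from a single version argument; the function name `mongodb_pinned_packages` is hypothetical.

```shell
# Print the mongodb-org meta package and its components pinned to one version.
# $1: version, e.g. 3.0.4
mongodb_pinned_packages() {
  local version="$1"
  local pkg
  local out=""
  for pkg in mongodb-org mongodb-org-server mongodb-org-shell \
             mongodb-org-mongos mongodb-org-tools; do
    # Separate entries with a single space, without a leading space.
    out="$out${out:+ }$pkg=$version"
  done
  printf '%s\n' "$out"
}

# Example (left unquoted on purpose so each package becomes its own argument):
# sudo apt-get install -y $(mongodb_pinned_packages 3.0.4)
```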

2) Install PostgreSQL

Reference:
https://www.digitalocean.com/community/tutorials/how-to-install-and-use-postgresql-on-ubuntu-14-04

# sudo apt-get update
# sudo apt-get install postgresql postgresql-contrib

3) Install Redis

Reference:
https://www.digitalocean.com/community/tutorials/how-to-install-and-use-redis

# sudo apt-get install build-essential
# sudo apt-get install tcl8.5
# sudo wget http://download.redis.io/releases/redis-stable.tar.gz
# sudo tar xzf redis-stable.tar.gz
# cd redis-stable
# make
# make test
# sudo make install
# cd utils
# sudo ./install_server.sh
# sudo service redis_6379 start
# redis-cli

4) Install Scala 2.11.7

Run the following commands:

# sudo wget http://downloads.typesafe.com/scala/2.11.7/scala-2.11.7.deb
# sudo dpkg -i scala-2.11.7.deb

Reference for installing sbt:
http://www.scala-sbt.org/0.13/tutorial/Installing-sbt-on-Linux.html

# echo "deb http://dl.bintray.com/sbt/debian /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
# sudo apt-get update
# sudo apt-get install sbt
# sudo apt-get install unzip
# curl -s get.gvmtool.net | bash
# source "/root/.gvm/bin/gvm-init.sh"
# gvm install gradle

Source: http://blog.csdn.net/chszs/article/details/50166991