lichengcom2009 posted on 2019-2-2 09:15:24

Detailed Ceph installation and deployment tutorial (single monitor node)

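  I. Preparation: installing the ceph-deploy tool
  All servers are accessed as the root user.
  1. Installation environment
  OS: CentOS 6.5
  Machines: 1 admin-node (ceph-deploy), 1 monitor, 2 OSD nodes
  2. Stop the firewall and disable SELinux on all nodes, then reboot the machines.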
  service iptables stop
  sed -i '/SELINUX/s/enforcing/disabled/' /etc/selinux/config
  chkconfig iptables off
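The sed edit above can be tried on a scratch copy before touching the real file. This is a minimal sketch; the helper name `disable_selinux_in` and the scratch file are assumptions made for illustration. It applies the same expression and prints the resulting SELINUX= line.

```shell
#!/bin/sh
# Hypothetical helper: apply the tutorial's sed expression to the file given
# as $1 and print the resulting SELINUX= line so the change can be verified.
disable_selinux_in() {
    cfg="$1"
    sed -i '/SELINUX/s/enforcing/disabled/' "$cfg"
    grep '^SELINUX=' "$cfg"
}

# Try it on a scratch copy rather than the real /etc/selinux/config.
scratch="$(mktemp)"
printf 'SELINUX=enforcing\nSELINUXTYPE=targeted\n' > "$scratch"
disable_selinux_in "$scratch"
rm -f "$scratch"
```

The config file is only read at boot, which is why the step asks for a reboot afterwards.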
  3. Edit the Ceph yum repository file on admin-node
  vi /etc/yum.repos.d/ceph.repo
  [ceph-noarch]
  name=Ceph noarch packages
  baseurl=http://ceph.com/rpm/el6/noarch/
  enabled=1
  gpgcheck=1
  type=rpm-md
  gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc
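The repository file can also be written non-interactively instead of through vi. The sketch below assumes a hypothetical helper `write_ceph_repo`; the section name `[ceph-noarch]` is likewise an assumption, since yum only requires that some `[section]` header precede the key=value lines.

```shell
#!/bin/sh
# Hypothetical helper: write the repo definition to the path given as $1.
# The [ceph-noarch] section name is an assumption; the other lines are the
# ones listed in this tutorial.
write_ceph_repo() {
    cat > "$1" <<'EOF'
[ceph-noarch]
name=Ceph noarch packages
baseurl=http://ceph.com/rpm/el6/noarch/
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc
EOF
}

# Usage (as root): write_ceph_repo /etc/yum.repos.d/ceph.repo
```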
  4. Install the EPEL repository from the Sohu mirror
  rpm -ivh http://mirrors.sohu.com/fedora-epel/6/x86_64/epel-release-6-8.noarch.rpm
  5. Update the yum sources on admin-node
  yum clean all
  yum update -y
  6. Create a Ceph cluster directory on admin-node
  mkdir /ceph
  cd /ceph
  7. Install the Ceph deployment tool on admin-node
  yum install ceph-deploy -y
  8. Configure the hosts file on admin-node
  vi /etc/hosts
  10.240.240.210 admin-node
  10.240.240.211 node1
  10.240.240.212 node2
  10.240.240.213 node3
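If the hosts entries are added by script rather than by hand, it helps to make the step safe to repeat. A small sketch, assuming a hypothetical helper `add_host_entry` that appends a line only when the name is not already present:

```shell
#!/bin/sh
# Hypothetical helper: append "IP NAME" to the hosts file ($1) unless a line
# for that name already exists, so the step can be re-run safely.
add_host_entry() {
    file="$1"; ip="$2"; name="$3"
    if ! grep -qE "[[:space:]]${name}([[:space:]]|\$)" "$file" 2>/dev/null; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Usage (as root), with the addresses from this tutorial:
# add_host_entry /etc/hosts 10.240.240.210 admin-node
# add_host_entry /etc/hosts 10.240.240.211 node1
```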
  

  II. Setting up passwordless login from ceph-deploy to every Ceph node
  1. Install an SSH server on every Ceph node
  # yum install openssh-server -y
  2. Configure passwordless SSH access from admin-node to every Ceph node.
  # ssh-keygen
  Generating public/private rsa key pair.
  Enter file in which to save the key (/root/.ssh/id_rsa):
  Enter passphrase (empty for no passphrase):
  Enter same passphrase again:
  Your identification has been saved in /root/.ssh/id_rsa.
  Your public key has been saved in /root/.ssh/id_rsa.pub.
  

  3. Copy the admin-node public key to every Ceph node
  ssh-copy-id root@admin-node
  ssh-copy-id root@node1
  ssh-copy-id root@node2
  ssh-copy-id root@node3
  4. Test that every Ceph node can be logged into without a password
  ssh root@node1
  ssh root@node2
  ssh root@node3
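The manual logins above can be replaced by a non-interactive check. The sketch below assumes a hypothetical helper `check_passwordless`; it relies on the ssh option `BatchMode=yes`, which makes ssh fail instead of prompting when key authentication does not work. The `SSH_CMD` variable is an illustration-only hook for dry runs.

```shell
#!/bin/sh
# Hypothetical helper: for each node name given, run a no-op command over SSH
# with BatchMode=yes so ssh fails instead of prompting for a password.
# SSH_CMD defaults to ssh; set it to "echo ssh" for a dry run. It is left
# unquoted on purpose so that "echo ssh" splits into two words.
check_passwordless() {
    rc=0
    for node in "$@"; do
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "root@${node}" true; then
            echo "${node}: ok"
        else
            echo "${node}: FAILED"
            rc=1
        fi
    done
    return "$rc"
}

# Usage: check_passwordless node1 node2 node3
```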
  5. Edit ~/.ssh/config on admin-node so that ceph-deploy logs in to the Ceph nodes as the intended user
  Host admin-node
  Hostname admin-node
  User root
  Host node1
  Hostname node1
  User root
  Host node2
  Hostname node2
  User root
  Host node3
  Hostname node3
  User root
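Since every stanza has the same shape, the file can be generated from the list of node names. A minimal sketch, assuming a hypothetical helper `gen_ssh_config`; the indentation it emits is cosmetic.

```shell
#!/bin/sh
# Hypothetical helper: print one Host stanza per node name, all using the
# same login user ($1).
gen_ssh_config() {
    user="$1"; shift
    for node in "$@"; do
        printf 'Host %s\n    Hostname %s\n    User %s\n' "$node" "$node" "$user"
    done
}

# Usage: gen_ssh_config root admin-node node1 node2 node3 >> ~/.ssh/config
```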
  III. Deploying the Ceph cluster with ceph-deploy
  1. Create a new Ceph cluster from admin-node
  # ceph-deploy new node1
   found configuration file at: /root/.cephdeploy.conf
   Invoked (1.5.3): /usr/bin/ceph-deploy new node1
   Creating new cluster named ceph
   Resolving host node1
   Monitor node1 at 10.240.240.211
   making sure passwordless SSH succeeds
   connected to host: admin-node
   Running command: ssh -CT -o BatchMode=yes node1
   Monitor initial members are ['node1']
   Monitor addrs are ['10.240.240.211']
   Creating a random mon key...
   Writing initial config to ceph.conf...
   Writing monitor keyring to ceph.mon.keyring...
  

  List the generated files
  # ls
  ceph.conf  ceph.log  ceph.mon.keyring
  

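  2. Before deploying, make sure no node holds leftover Ceph packages or data. On a fresh installation this step can be skipped; when redeploying, run the commands below to clear everything first.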
  # ceph-deploy purgedata admin-node node1 node2 node3
  # ceph-deploy forgetkeys
  # ceph-deploy purge admin-node node1 node2 node3
  

  A fresh installation has no data to purge.
  

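  3. Edit the Ceph configuration file on admin-node and add the following setting to ceph.conf (two replicas per object, matching the two OSD nodes)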
  osd pool default size = 2
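The setting can be appended by script so that repeated runs do not duplicate it. This is a sketch under stated assumptions: the helper `set_conf_default` is hypothetical, it ignores section headers (so it assumes the file holds a single [global] section), and it matches the key only in the exact spelling passed in.

```shell
#!/bin/sh
# Hypothetical helper: append "key = value" to the config file ($1) only if
# the key is not already set there. Section headers are not inspected.
set_conf_default() {
    conf="$1"; key="$2"; value="$3"
    if ! grep -q "^[[:space:]]*${key}[[:space:]]*=" "$conf" 2>/dev/null; then
        printf '%s = %s\n' "$key" "$value" >> "$conf"
    fi
}

# Usage, from the cluster directory on admin-node:
# set_conf_default ceph.conf "osd pool default size" 2
```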
  

  4. From admin-node, use ceph-deploy to install Ceph on every node
  # ceph-deploy install admin-node node1 node2 node3
  If the install fails on a node with "ImportError: No module named argparse", run the following command on that node and then re-run the install for it.
  # yum install '*argparse*' -y
  

  5. Add the initial monitor node and gather the keys (ceph-deploy versions after v1.1.3).
  # ceph-deploy mon create-initial
   found configuration file at: /root/.cephdeploy.conf
   Invoked (1.5.3): /usr/bin/ceph-deploy mon create-initial
   Deploying mon, cluster ceph hosts node1
   detecting platform for host node1 ...
   connected to host: node1
   detect platform information from remote host
   detect machine type
   distro info: CentOS 6.4 Final
   determining if provided host has same hostname in remote
   get remote short hostname
   deploying mon to node1
   get remote short hostname
   remote hostname: node1
   write cluster configuration to /etc/ceph/{cluster}.conf
   create the mon path if it does not exist
   checking for done path: /var/lib/ceph/mon/ceph-node1/done
   done path does not exist: /var/lib/ceph/mon/ceph-node1/done
   creating keyring file: /var/lib/ceph/tmp/ceph-node1.mon.keyring
   create the monitor keyring file
   Running command: ceph-mon --cluster ceph --mkfs -i node1 --keyring /var/lib/ceph/tmp/ceph-node1.mon.keyring
   ceph-mon: mon.noname-a 10.240.240.211:6789/0 is local, renaming to mon.node1
   ceph-mon: set fsid to 369daf5a-e844-4e09-a9b1-46bb985aec79
   ceph-mon: created monfs at /var/lib/ceph/mon/ceph-node1 for mon.node1
   unlinking keyring file /var/lib/ceph/tmp/ceph-node1.mon.keyring
   create a done file to avoid re-doing the mon deployment
   create the init path if it does not exist
   locating the `service` executable...
   Running command: /sbin/service ceph -c /etc/ceph/ceph.conf start mon.node1
   /etc/init.d/ceph: line 15: /lib/lsb/init-functions: No such file or directory
   RuntimeError: command returned non-zero exit status: 1
   Failed to execute command: /sbin/service ceph -c /etc/ceph/ceph.conf start mon.node1
   GenericError: Failed to create 1 monitors
  

  To fix the error above:
  run the following command manually on the mon node (node1), then run ceph-deploy mon create-initial again
  # yum install redhat-lsb -y
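Errors of this kind come from a file that is simply absent on the target node, so they can be caught before ceph-deploy runs. A minimal preflight sketch, assuming a hypothetical helper `missing_paths`:

```shell
#!/bin/sh
# Hypothetical helper: print every path from the argument list that does not
# exist, and return non-zero if any were missing.
missing_paths() {
    rc=0
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "missing: $p"
            rc=1
        fi
    done
    return "$rc"
}

# Usage on a node, before running ceph-deploy against it:
# missing_paths /lib/lsb/init-functions || yum install redhat-lsb -y
```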
  

  The Ceph cluster directory now contains these additional files
  ceph.bootstrap-mds.keyring
  ceph.bootstrap-osd.keyring
  ceph.client.admin.keyring
  

  6. Add the OSD nodes
  Add node2 first: log in to node2 and look for unallocated partitions
  # ssh node2
  # fdisk -l
  

  Disk /dev/sda: 53.7 GB, 53687091200 bytes
  255 heads, 63 sectors/track, 6527 cylinders
  Units = cylinders of 16065 * 512 = 8225280 bytes
  Sector size (logical/physical): 512 bytes / 512 bytes
  I/O size (minimum/optimal): 512 bytes / 512 bytes
  Disk identifier: 0x000d6653
  

  Device Boot      Start         End      Blocks   Id  System
  /dev/sda1   *           1          39      307200   83  Linux
  Partition 1 does not end on cylinder boundary.
  /dev/sda2              39        6401    51104768   83  Linux
  /dev/sda3            6401        6528     1015808   82  Linux swap / Solaris
  

  Disk /dev/sdb: 21.5 GB, 21474836480 bytes
  255 heads, 63 sectors/track, 2610 cylinders
  Units = cylinders of 16065 * 512 = 8225280 bytes
  Sector size (logical/physical): 512 bytes / 512 bytes
  I/O size (minimum/optimal): 512 bytes / 512 bytes
  Disk identifier: 0x843e46d0
  

  Device Boot      Start         End      Blocks   Id  System
  /dev/sdb1               1        2610    20964793+   5  Extended
  /dev/sdb5               1        2610    20964762   83  Linux
  

  # df -h
  Filesystem            Size  Used Avail Use% Mounted on
  /dev/sda2              48G  2.5G   44G   6% /
  tmpfs                 242M   68K  242M   1% /dev/shm
  /dev/sda1             291M   33M  243M  12% /boot
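Reading the df listing by eye can be replaced with a small check that a candidate partition does not appear as a mounted filesystem. The sketch assumes a hypothetical helper `is_mounted` that reads df-style text on stdin.

```shell
#!/bin/sh
# Hypothetical helper: read "df"-style output on stdin and succeed if the
# device given as $1 appears in the first column, i.e. it is mounted.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit found ? 0 : 1 }'
}

# Usage on the OSD node: df -P | is_mounted /dev/sdb5 && echo "in use"
```

Here `df -P` selects the POSIX output format, which keeps each filesystem on a single line so the first-column comparison is reliable.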
  

  The output shows that the second disk is unused; its sdb5 partition will serve as the OSD disk.
  Add the OSD device from admin-node
  # ceph-deploy osd prepare node2:/dev/sdb5
   found configuration file at: /root/.cephdeploy.conf
   Invoked (1.5.3): /usr/bin/ceph-deploy osd prepare node2:/var/local/osd0 node3:/var/local/osd1
   Preparing cluster ceph disks node2:/var/local/osd0: node3:/var/local/osd1:
   connected to host: node2
   detect platform information from remote host
   detect machine type
   Distro info: CentOS 6.5 Final
   Deploying osd to node2
   write cluster configuration to /etc/ceph/{cluster}.conf
   osd keyring does not exist yet, creating one
   create a keyring file
   IOError: No such file or directory: '/var/lib/ceph/bootstrap-osd/ceph.keyring'
   connected to host: node3
   detect platform information from remote host
   detect machine type
   Distro info: CentOS 6.5 Final
   Deploying osd to node3
   write cluster configuration to /etc/ceph/{cluster}.conf
   osd keyring does not exist yet, creating one
   create a keyring file
   IOError: No such file or directory: '/var/lib/ceph/bootstrap-osd/ceph.keyring'
   GenericError: Failed to create 2 OSDs
  

  To fix the error above:
  The error means that /var/lib/ceph/bootstrap-osd/ceph.keyring is missing on the OSD node when the OSD is created. The monitor node node1 does have this file, so copying it from node1 to node2 resolves the problem.
  On node2, create the directory: mkdir /var/lib/ceph/bootstrap-osd/
  Log in to node1:
  # ssh node1
  # scp /var/lib/ceph/bootstrap-osd/ceph.keyring root@node2:/var/lib/ceph/bootstrap-osd/
  

  Run the OSD prepare command again
  # ceph-deploy osd prepare node2:/dev/sdb5
   found configuration file at: /root/.cephdeploy.conf
   Invoked (1.5.3): /usr/bin/ceph-deploy osd prepare node2:/dev/sdb5
   Preparing cluster ceph disks node2:/dev/sdb5:
   connected to host: node2
   detect platform information from remote host
   detect machine type
   Distro info: CentOS 6.4 Final
   Deploying osd to node2
   write cluster configuration to /etc/ceph/{cluster}.conf
   Running command: udevadm trigger --subsystem-match=block --action=add
   Preparing host node2 disk /dev/sdb5 journal None activate False
   Running command: ceph-disk-prepare --fs-type xfs --cluster ceph -- /dev/sdb5
   mkfs.xfs: No such file or directory
   ceph-disk: Error: Command '['/sbin/mkfs', '-t', 'xfs', '-f', '-i', 'size=2048', '--', '/dev/sdb5']' returned non-zero exit status 1
   RuntimeError: command returned non-zero exit status: 1
   Failed to execute command: ceph-disk-prepare --fs-type xfs --cluster ceph -- /dev/sdb5
   GenericError: Failed to create 1 OSDs
  

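  The error above shows that node2 has no mkfs.xfs binary, so the XFS tools must be installed on node2.
  # ssh node2
  # yum install 'xfs*' -y
  Running the OSD prepare command again now initializes the new OSD successfully.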
  

  Activate the OSD device from admin-node
  # ceph-deploy osd activate node2:/dev/sdb5
   found configuration file at: /root/.cephdeploy.conf
   Invoked (1.5.3): /usr/bin/ceph-deploy osd activate node2:/dev/sdb5
   Activating cluster ceph disks node2:/dev/sdb5:
   connected to host: node2
   detect platform information from remote host
   detect machine type
   Distro info: CentOS 6.4 Final
   activating host node2 disk /dev/sdb5
   will use init type: sysvinit
   Running command: ceph-disk-activate --mark-init sysvinit --mount /dev/sdb5
   got monmap epoch 1
   2014-06-06 07:34:59.766494 7f4e0c2717a0 -1 journal FileJournal::_open: disabling aio for non-block journal.Use journal_force_aio to force use of aio anyway
   2014-06-06 07:34:59.931782 7f4e0c2717a0 -1 journal FileJournal::_open: disabling aio for non-block journal.Use journal_force_aio to force use of aio anyway
   2014-06-06 07:34:59.949677 7f4e0c2717a0 -1 filestore(/var/lib/ceph/tmp/mnt.5jdyKF) could not find 23c2fcde/osd_superblock/0//-1 in index: (2) No such file or directory
   2014-06-06 07:35:00.004350 7f4e0c2717a0 -1 created object store /var/lib/ceph/tmp/mnt.5jdyKF journal /var/lib/ceph/tmp/mnt.5jdyKF/journal for osd.0 fsid 591ef1f4-69f7-442f-ba7b-49cdf6695656
   2014-06-06 07:35:00.004630 7f4e0c2717a0 -1 auth: error reading file: /var/lib/ceph/tmp/mnt.5jdyKF/keyring: can't open /var/lib/ceph/tmp/mnt.5jdyKF/keyring: (2) No such file or directory
   2014-06-06 07:35:00.004951 7f4e0c2717a0 -1 created new key in keyring /var/lib/ceph/tmp/mnt.5jdyKF/keyring
   added key for osd.0
   ERROR:ceph-disk:Failed to activate
   Traceback (most recent call last):
     File "/usr/sbin/ceph-disk", line 2579, in <module>
     main()
     File "/usr/sbin/ceph-disk", line 2557, in main
     args.func(args)
     File "/usr/sbin/ceph-disk", line 1910, in main_activate
     init=args.mark_init,
     File "/usr/sbin/ceph-disk", line 1724, in mount_activate
     mount_options=mount_options,
     File "/usr/sbin/ceph-disk", line 1544, in move_mount
     maybe_mkdir(osd_data)
     File "/usr/sbin/ceph-disk", line 220, in maybe_mkdir
     os.mkdir(*a, **kw)
   OSError: No such file or directory: '/var/lib/ceph/osd/ceph-0'
   RuntimeError: command returned non-zero exit status: 1
   RuntimeError: Failed to execute command: ceph-disk-activate --mark-init sysvinit --mount /dev/sdb5
  

  The error means that the directory /var/lib/ceph/osd/ceph-0 does not exist on node2 and has to be created there.
  # ssh node2
  # mkdir /var/lib/ceph/osd/
  # mkdir /var/lib/ceph/osd/ceph-0
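The OSD-related directories that were missing in the steps above can be created in one pass with `mkdir -p`, which creates parent directories and does not fail when a directory already exists. A sketch, assuming a hypothetical helper `make_ceph_dirs` with a root prefix argument so it can be tried in a scratch directory first:

```shell
#!/bin/sh
# Hypothetical helper: create the OSD-related Ceph state directories under
# the root prefix $1 (pass "" on a real node, or a scratch directory for a
# trial run).
make_ceph_dirs() {
    root="$1"
    mkdir -p "${root}/var/lib/ceph/bootstrap-osd" \
             "${root}/var/lib/ceph/osd/ceph-0"
}

# Usage on node2 (as root): make_ceph_dirs ""
```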
  

  Run the OSD activate command again
  # ceph-deploy osd activate node2:/dev/sdb5
   found configuration file at: /root/.cephdeploy.conf
   Invoked (1.5.3): /usr/bin/ceph-deploy osd activate node2:/dev/sdb5
   Activating cluster ceph disks node2:/dev/sdb5:
   connected to host: node2
   detect platform information from remote host
   detect machine type
   Distro info: CentOS 6.4 Final
   activating host node2 disk /dev/sdb5
   will use init type: sysvinit
   Running command: ceph-disk-activate --mark-init sysvinit --mount /dev/sdb5
   /etc/init.d/ceph: line 15: /lib/lsb/init-functions: No such file or directory
   ceph-disk: Error: ceph osd start failed: Command '['/sbin/service', 'ceph', 'start', 'osd.0']' returned non-zero exit status 1
   RuntimeError: command returned non-zero exit status: 1
   RuntimeError: Failed to execute command: ceph-disk-activate --mark-init sysvinit --mount /dev/sdb5
  

  To fix the error above:
  # ssh node2
  # yum install redhat-lsb -y
  

  Run the activate command once more; the OSD now starts normally
  # ceph-deploy osd activate node2:/dev/sdb5
   found configuration file at: /root/.cephdeploy.conf
   Invoked (1.5.3): /usr/bin/ceph-deploy osd activate node2:/dev/sdb5
   Activating cluster ceph disks node2:/dev/sdb5:
   connected to host: node2
   detect platform information from remote host
   detect machine type
   Distro info: CentOS 6.4 Final
   activating host node2 disk /dev/sdb5
   will use init type: sysvinit
   Running command: ceph-disk-activate --mark-init sysvinit --mount /dev/sdb5
   === osd.0 ===
   Starting Ceph osd.0 on node2...
   starting osd.0 at :/0 osd_data /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
   INFO:ceph-disk:ceph osd.0 already mounted in position; unmounting ours.
   create-or-move updating item name 'osd.0' weight 0.02 at location {host=node2,root=default} to crush map
   checking OSD status...
   Running command: ceph --cluster=ceph osd stat --format=json
  Unhandled exception in thread started by
  Error in sys.excepthook:
  

  Original exception was:
  #
  

  

  7. Copy the Ceph configuration file and keys to the mon and OSD nodes
  # ceph-deploy admin admin-node node1 node2 node3
   found configuration file at: /root/.cephdeploy.conf
   Invoked (1.5.3): /usr/bin/ceph-deploy admin admin-node node1 node2 node3
   Pushing admin keys and conf to admin-node
   connected to host: admin-node
   detect platform information from remote host
   detect machine type
   get remote short hostname
   write cluster configuration to /etc/ceph/{cluster}.conf
   Pushing admin keys and conf to node1
   connected to host: node1
   detect platform information from remote host
   detect machine type
   get remote short hostname
   write cluster configuration to /etc/ceph/{cluster}.conf
   Pushing admin keys and conf to node2
   connected to host: node2
   detect platform information from remote host
   detect machine type
   get remote short hostname
   write cluster configuration to /etc/ceph/{cluster}.conf
   Pushing admin keys and conf to node3
   connected to host: node3
   detect platform information from remote host
   detect machine type
   get remote short hostname
   write cluster configuration to /etc/ceph/{cluster}.conf
  Unhandled exception in thread started by
  Error in sys.excepthook:
  

  Original exception was:
  

  8. Make sure ceph.client.admin.keyring has the correct permissions
  # chmod +r /etc/ceph/ceph.client.admin.keyring
  

  9. Check the cluster status
  # ceph health
  HEALTH_OK
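For scripted checks, the health string can be mapped to an exit code. A minimal sketch, assuming a hypothetical helper `health_rc` that receives the text printed by `ceph health`:

```shell
#!/bin/sh
# Hypothetical helper: map the start of "ceph health" output ($1) to an exit
# code: 0 for HEALTH_OK, 1 for HEALTH_WARN, 2 for anything else.
health_rc() {
    case "$1" in
        HEALTH_OK*)   return 0 ;;
        HEALTH_WARN*) return 1 ;;
        *)            return 2 ;;
    esac
}

# Usage: health_rc "$(ceph health)" || echo "cluster needs attention"
```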
  

  10. Add a metadata server
  # ceph-deploy mds create node1
   found configuration file at: /root/.cephdeploy.conf
   Invoked (1.5.3): /usr/bin/ceph-deploy mds create node1
   Deploying mds, cluster ceph hosts node1:node1
   connected to host: node1
   detect platform information from remote host
   detect machine type
   Distro info: CentOS 6.4 Final
   remote host will use sysvinit
   deploying mds bootstrap to node1
   write cluster configuration to /etc/ceph/{cluster}.conf
   create path if it doesn't exist
   OSError: No such file or directory: '/var/lib/ceph/mds/ceph-node1'
   GenericError: Failed to create 1 MDSs
  To fix the error above:
  # ssh node1
  Last login: Fri Jun  6 06:41:25 2014 from 10.241.10.2
  # mkdir /var/lib/ceph/mds/
  # mkdir /var/lib/ceph/mds/ceph-node1
  

  Run the command again; the metadata server is now created
  # ceph-deploy mds create node1
   found configuration file at: /root/.cephdeploy.conf
   Invoked (1.5.3): /usr/bin/ceph-deploy mds create node1
   Deploying mds, cluster ceph hosts node1:node1
   connected to host: node1
   detect platform information from remote host
   detect machine type
   Distro info: CentOS 6.4 Final
   remote host will use sysvinit
   deploying mds bootstrap to node1
   write cluster configuration to /etc/ceph/{cluster}.conf
   create path if it doesn't exist
   Running command: ceph --cluster ceph --name client.bootstrap-mds --keyring /var/lib/ceph/bootstrap-mds/ceph.keyring auth get-or-create mds.node1 osd allow rwx mds allow mon allow profile mds -o /var/lib/ceph/mds/ceph-node1/keyring
   Running command: service ceph start mds.node1
   === mds.node1 ===
   Starting Ceph mds.node1 on node1...
   starting mds.node1 at :/0
  

  Check the running status again
  # ceph -w
  cluster 591ef1f4-69f7-442f-ba7b-49cdf6695656
  health HEALTH_OK
  monmap e1: 1 mons at {node1=10.240.240.211:6789/0}, election epoch 2, quorum 0 node1
  mdsmap e4: 1/1/1 up {0=node1=up:active}
  osdmap e9: 2 osds: 2 up, 2 in
  pgmap v22: 192 pgs, 3 pools, 1884 bytes data, 20 objects
  10310 MB used, 30616 MB / 40926 MB avail
  192 active+clean
  

  2014-06-06 08:12:49.021472 mon.0 pgmap v22: 192 pgs: 192 active+clean; 1884 bytes data, 10310 MB used, 30616 MB / 40926 MB avail; 10 B/s wr, 0 op/s
  2014-06-06 08:14:47.932311 mon.0 pgmap v23: 192 pgs: 192 active+clean; 1884 bytes data, 10310 MB used, 30615 MB / 40926 MB avail
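The OSD summary in the status output can be pulled out with a short filter. This sketch assumes a hypothetical helper `osd_counts`; its pattern is taken from the sample output in this post, and other Ceph releases may print the OSD summary in a different shape.

```shell
#!/bin/sh
# Hypothetical helper: read status text on stdin and print "TOTAL UP IN" from
# a line shaped like "osdmap e9: 2 osds: 2 up, 2 in". Prints nothing if no
# such line is found.
osd_counts() {
    sed -n 's/.*osdmap e[0-9]*: \([0-9]*\) osds: \([0-9]*\) up, \([0-9]*\) in.*/\1 \2 \3/p'
}

# Usage: ceph -s | osd_counts
```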
  



