
OpenStack troubleshooting notes (well written, worth sharing)

November 14, 2017

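     Errors are the most frequent thing you meet during the whole installation. Many parts of OpenStack are still immature, so calm, methodical troubleshooting is the way to go. OpenStack throws error after error, but in the end it does run.

     When something broke, I followed the rule of asking Baidu for domestic matters and Google for foreign ones, and I also got help from fellow practitioners online. The errors come in every variety, so I can only list a few common ones here. For anything not covered, troubleshoot in this order: check your own steps, check the logs, check the bug list on Launchpad, ask peers in the Chinese community, then ask on the international mailing lists.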

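1 Errors when a component syncs its database:

These errors are almost always the database refusing access or the database not being found. Troubleshoot as follows:

1. Check that the corresponding database has actually been created.

2. Check whether the database privileges are the problem. With MySQL on Ubuntu 12.04 you need to grant a privilege to the local host,

for example: grant all on keystone.* to 'keystone'@'controller' identified by 'ruijie';

3. Check that the MySQL address, user name, and password in the configuration file are correct.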
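
Step 3 of the checklist above can be partly mechanized. The following is a minimal sketch, assuming the component stores its database settings as a SQLAlchemy-style URL; the sample URL is hypothetical and is simply assembled from the values in the grant statement above. `parse_db_url` is an illustrative helper, not part of any OpenStack tool.

```python
from urllib.parse import urlsplit


def parse_db_url(url):
    """Split a SQLAlchemy-style URL into the parts worth double-checking."""
    parts = urlsplit(url)
    return {
        "driver": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "database": parts.path.lstrip("/"),
    }


# Hypothetical URL built from the grant statement (user, password, host, db).
info = parse_db_url("mysql://keystone:ruijie@controller/keystone")
for key, value in info.items():
    print(key, "=", value)
```

Comparing the printed user, host, and database against the grant statement makes a mismatched host name or a mistyped password easier to spot.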

2 Details: [Errno 111] Connection refused

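This kind of error is clearly a connection problem between Keystone and the module in question. Check the Keystone log. If it has no error record for that service, the auth address configured in the other component is probably wrong. If there is a record, it is usually an authorization failure: check that the auth token in each component's configuration file is correct, and check that the Keystone data is correct, i.e. that the user, service, and endpoint exist. If none of that helps, restart the services, or simply run apt-get remove --purge on the component and reinstall it.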
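
Before digging through configuration files, it can help to confirm whether anything is listening on the port at all, since [Errno 111] means the TCP connection itself was refused. A minimal sketch follows; the host and the port numbers (5000 for Keystone, 9292 for Glance, 5672 for RabbitMQ) are assumptions based on common defaults and should be adjusted to the actual deployment.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed default ports; adjust to your own endpoints.
    for name, port in [("keystone", 5000), ("glance-api", 9292), ("rabbitmq", 5672)]:
        state = "open" if port_open("127.0.0.1", port) else "refused or unreachable"
        print(name, port, state)
```

If the port is closed, the service is down or bound to a different address; if it is open, the problem is more likely in tokens, users, or endpoints.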

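3 Dependency package errors

Example: ImportError: No module named keystone.middleware.auth_token

Swift started out as an independent project, so after installation it still needs the relevant Keystone modules; without them it reports this error. The fix is simple: apt-get install python-keystone python-keystoneclient

This kind of error is common with manual installs, but a Google search usually resolves it.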
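
A quick way to test for a missing dependency before starting a service is to ask Python whether the module can be located. This is a minimal sketch using the standard library; `module_available` is an illustrative helper name, and it must be run with the same Python interpreter the service uses.

```python
import importlib.util


def module_available(name):
    """Return True if the dotted module name can be located."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package raises ModuleNotFoundError,
        # which is a subclass of ImportError.
        return False


for name in ("keystone.middleware.auth_token", "json"):
    print(name, "->", "found" if module_available(name) else "MISSING")
```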

4 Bugs in OpenStack itself

ValueError: invalid literal for int() with base 10: 'true'

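This came up when starting the Swift proxy node. Frustratingly, I had changed only a few things, so why would a value be wrong? Setting delay_auth_decision = 1 in the configuration file fixes it. For a widespread bug like this, a quick Baidu search gives the answer. If you have not made any fatal change yourself, you can reasonably suspect OpenStack.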
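
The error is plain Python behavior: int() cannot convert the string 'true'. The sketch below reproduces the failure in isolation and shows one tolerant way such a flag could be parsed. `parse_flag` is a hypothetical helper for illustration, not the code Keystone uses.

```python
def parse_flag(value, default=0):
    """Accept 1/0 as well as true/false, yes/no, on/off; return 1 or 0."""
    if value is None:
        return default
    text = str(value).strip().lower()
    if text in ("1", "true", "yes", "on"):
        return 1
    if text in ("0", "false", "no", "off"):
        return 0
    raise ValueError("not a boolean-like value: %r" % (value,))


# The failing conversion from the traceback, reproduced on its own.
try:
    int("true")
except ValueError as exc:
    print("int('true') fails:", exc)

print(parse_flag("true"), parse_flag("1"), parse_flag(None))
```

Since the installed middleware calls int() directly, writing the value as 1 in the configuration file is the practical workaround.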

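5 Instance errors

These are the most common and the most painful, with all kinds of causes. The symptom is that every service looks healthy in nova-manage service list, yet launching an instance fails. Pay special attention to the stage at which the dashboard reports the failure: networking, spawning, or scheduling? Check the logs that match the stage. The most common causes are:

1. A network configuration error prevents an IP address from being allocated. If the instance has no IP address, this is the likely cause.

Fix: compare against a reference configuration file and correct the network settings.

2. RabbitMQ or some other component is not running.

The logs will show this. The causes vary, but these components do not normally stop on their own; it usually happens after a bad configuration change. Recall what you changed before the failure and revert it.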
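
Because the right log to read depends on the failing stage, it is handy to pull only the ERROR lines out of each nova log. The following is a minimal sketch that assumes the "date time LEVEL logger message" layout seen in nova logs of this era; the sample lines are shortened copies of typical nova.rpc.common entries.

```python
import re

# Assumed layout: "YYYY-MM-DD HH:MM:SS LEVEL logger message"
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\w+) (\S+) (.*)$")


def error_lines(lines):
    """Yield (timestamp, logger, message) for every ERROR-level line."""
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match and match.group(2) == "ERROR":
            yield match.group(1), match.group(3), match.group(4)


sample = [
    "2012-08-06 14:03:25 INFO nova.rpc.common [-] Connected to AMQP server on 172.18.32.7:5672",
    "2012-08-07 18:44:25 ERROR nova.rpc.common [-] Timed out waiting for RPC response: timed out",
    "2012-08-07 18:44:25 TRACE nova.rpc.common timeout: timed out",
]

for stamp, logger, message in error_lines(sample):
    print(stamp, logger, message)
```

To scan a real file, pass an open file object instead of the sample list, for example open("/var/log/nova/nova-network.log") if that is where your logs live (the path is an assumption).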

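6 Zombie instances

Zombie instances usually come from shutting down nova or the underlying virtual machine improperly, or from an instance in an error state that cannot be deleted. Use virsh list to check whether the underlying virtual machine is still running; if it is, stop it, then delete the record directly in the database.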

# mysql -u root -p

Enter password:

mysql> use nova;

mysql> SET FOREIGN_KEY_CHECKS=0;

Query OK, 0 rows affected (0.00 sec)

mysql> delete from instances where id = '29';

Query OK, 1 row affected (0.04 sec)

mysql> delete from instances where id = '30';

Query OK, 1 row affected (0.04 sec)

mysql> SET FOREIGN_KEY_CHECKS=1;

Query OK, 0 rows affected (0.00 sec)
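
When several zombie instances need removing, typing the statements by hand invites mistakes. The sketch below only generates the same statements as the session above for a list of numeric ids, so they can be reviewed before being pasted into the mysql prompt. `zombie_cleanup_sql` is a hypothetical helper; it does not connect to any database.

```python
def zombie_cleanup_sql(instance_ids):
    """Build the cleanup statements for the given numeric instance ids."""
    # int() rejects anything non-numeric, which keeps stray text out of the SQL.
    ids = [int(i) for i in instance_ids]
    statements = ["USE nova;", "SET FOREIGN_KEY_CHECKS=0;"]
    statements += ["DELETE FROM instances WHERE id = %d;" % i for i in ids]
    statements.append("SET FOREIGN_KEY_CHECKS=1;")
    return "\n".join(statements)


print(zombie_cleanup_sql([29, 30]))
```

Deleting rows with foreign key checks disabled can leave orphaned records in related tables, so backing up the nova database first is a sensible precaution.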

 

 

 

August 3

root@openstack-controller:~# keystone user-list
Expecting authentication method via
  either a service token, --token or env[SERVICE_TOKEN],
  or credentials, --os_username or env[OS_USERNAME].
Fix: export the credentials as environment variables:
export OS_TENANT_NAME=admin
export OS_USERNAME=admin
export OS_PASSWORD=admin
export OS_AUTH_URL=http://localhost:5000/v2.0/
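
Before running any client command, it is easy to confirm that the four variables are actually set in the current shell. This is a minimal sketch; the variable names are the ones exported above, and `missing_credentials` is an illustrative helper.

```python
import os

REQUIRED = ("OS_TENANT_NAME", "OS_USERNAME", "OS_PASSWORD", "OS_AUTH_URL")


def missing_credentials(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


missing = missing_credentials()
if missing:
    print("missing:", ", ".join(missing))
else:
    print("all credential variables are set")
```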

root@openstack-controller:~# glance index
Failed to show index. Got error:
There was an error connecting to a server
Details: [Errno 111] Connection refused

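First guess: a Keystone problem, but the Keystone log is empty...
Deleted the service and endpoint and retried.

It turned out to be a connection problem between Glance and Keystone.
Keystone should be installed first, then Glance.
Here I removed Glance with apt-get remove --purge and configured it again.
Remember to drop its database as well.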

Starting proxy-server...(/etc/swift/proxy-server.conf)
Traceback (most recent call last):
  File "/usr/bin/swift-proxy-server", line 22, in <module>
    run_wsgi(conf_file, 'proxy-server', default_port=8080, **options)
  File "/usr/lib/python2.7/dist-packages/swift/common/wsgi.py", line 122, in run_wsgi
    loadapp('config:%s' % conf_file, global_conf={'log_name': log_name})
  File "/usr/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 247, in loadapp
    return loadobj(APP, uri, name=name, **kw)
  File "/usr/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 271, in loadobj
    global_conf=global_conf)
  File "/usr/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 296, in loadcontext
    global_conf=global_conf)
  File "/usr/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 320, in _loadconfig
    return loader.get_context(object_type, name, global_conf)
  File "/usr/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 450, in get_context
    global_additions=global_additions)
  File "/usr/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 562, in _pipeline_app_context
    for name in pipeline[:-1]]
  File "/usr/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 458, in get_context
    section)
  File "/usr/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 517, in _context_from_explicit
    value = import_string(found_expr)
  File "/usr/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 22, in import_string
    return pkg_resources.EntryPoint.parse("x=" + s).load(False)
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 1989, in load
    entry = __import__(self.module_name, globals(),globals(), ['__name__'])
ImportError: No module named keystone.middleware.auth_token

The swift-proxy node also needs Keystone installed: it uses the swift_auth and auth_token features that Keystone provides and that Swift's own code does not include. So: apt-get install python-keystone python-keystoneclient

root@swift-proxy:/etc/swift# swift-init proxy start
Starting proxy-server...(/etc/swift/proxy-server.conf)
Traceback (most recent call last):
  File "/usr/bin/swift-proxy-server", line 22, in <module>
    run_wsgi(conf_file, 'proxy-server', default_port=8080, **options)
  File "/usr/lib/python2.7/dist-packages/swift/common/wsgi.py", line 122, in run_wsgi
    loadapp('config:%s' % conf_file, global_conf={'log_name': log_name})
  File "/usr/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 247, in loadapp
    return loadobj(APP, uri, name=name, **kw)
  File "/usr/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 272, in loadobj
    return context.create()
  File "/usr/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 710, in create
    return self.object_type.invoke(self)
  File "/usr/lib/python2.7/dist-packages/paste/deploy/loadwsgi.py", line 207, in invoke
    app = filter(app)
  File "/usr/lib/python2.7/dist-packages/keystone/middleware/auth_token.py", line 524, in auth_filter
    return AuthProtocol(app, conf)
  File "/usr/lib/python2.7/dist-packages/keystone/middleware/auth_token.py", line 123, in __init__
    self.delay_auth_decision = int(conf.get('delay_auth_decision', 0))
ValueError: invalid literal for int() with base 10: 'true'
This is a bug.
Change the setting to:
delay_auth_decision = 1

root@swift-proxy:/var/log# swift -V 2 -Ahttp://172.18.32.7:5000/v2.0 -U admin:admin -K admin stat
[Errno 111] ECONNREFUSED

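Looking into what the -U admin:admin option means: it is actually Keystone's tenant:user, so here it should be service:swift.
Also check the proxy configuration file,
and whether Keystone actually has this user and tenant.

When restarting Swift:
Unable to locate config for object-expirer
The cause still needs to be investigated.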
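
To make the tenant:user convention concrete, the sketch below splits a -U value into its two parts and rejects values that lack either one. `split_account` is a hypothetical helper written for illustration; it is not the swift client's own parsing code.

```python
def split_account(value):
    """Split a 'tenant:user' string into (tenant, user)."""
    tenant, sep, user = value.partition(":")
    if not sep or not tenant or not user:
        raise ValueError("expected tenant:user, got %r" % (value,))
    return tenant, user


for candidate in ("admin:admin", "service:swift"):
    tenant, user = split_account(candidate)
    print("tenant=%s user=%s" % (tenant, user))
```

Both parts then have to exist in Keystone, with the user belonging to that tenant.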

root@openstack-controller:~# nova list
+--------------------------------------+-------+--------+----------+
|                  ID                  |  Name | Status | Networks |
+--------------------------------------+-------+--------+----------+
| e399f8f0-5d3e-4248-bcab-60ec52a3415c | test1 | ERROR  |          |
+--------------------------------------+-------+--------+----------+
nova-network log:
2012-08-06 14:03:25 INFO nova.rpc.common [-] Connected to AMQP server on 172.18.32.7:5672
2012-08-07 18:43:05 INFO nova.rpc.common [req-73ad3278-3416-4ed6-a22b-6ef6051a4a26 6ed9496e07724c969e1e470e9ea4621e f4c42d279f124477a487c57f2a96d2df] Connected to AMQP server on 172.18.32.7:5672
2012-08-07 18:44:02 INFO nova.rpc.common [req-01699302-b01e-4b01-81b7-8414400ef471 None None] Connected to AMQP server on 172.18.32.7:5672
2012-08-07 18:44:25 ERROR nova.rpc.common [-] Timed out waiting for RPC response: timed out
2012-08-07 18:44:25 TRACE nova.rpc.common Traceback (most recent call last):
2012-08-07 18:44:25 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 490, in ensure
2012-08-07 18:44:25 TRACE nova.rpc.common     return method(*args, **kwargs)
2012-08-07 18:44:25 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 567, in _consume
2012-08-07 18:44:25 TRACE nova.rpc.common     return self.connection.drain_events(timeout=timeout)
2012-08-07 18:44:25 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 175, in drain_events
2012-08-07 18:44:25 TRACE nova.rpc.common     return self.transport.drain_events(self.connection, **kwargs)
2012-08-07 18:44:25 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 238, in drain_events
2012-08-07 18:44:25 TRACE nova.rpc.common     return connection.drain_events(**kwargs)
2012-08-07 18:44:25 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 57, in drain_events
2012-08-07 18:44:25 TRACE nova.rpc.common     return self.wait_multi(self.channels.values(), timeout=timeout)
2012-08-07 18:44:25 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 63, in wait_multi
2012-08-07 18:44:25 TRACE nova.rpc.common     chanmap.keys(), allowed_methods, timeout=timeout)
2012-08-07 18:44:25 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 120, in _wait_multiple
2012-08-07 18:44:25 TRACE nova.rpc.common     channel, method_sig, args, content = read_timeout(timeout)
2012-08-07 18:44:25 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 94, in read_timeout
2012-08-07 18:44:25 TRACE nova.rpc.common     return self.method_reader.read_method()
2012-08-07 18:44:25 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/method_framing.py", line 221, in read_method
2012-08-07 18:44:25 TRACE nova.rpc.common     raise m
2012-08-07 18:44:25 TRACE nova.rpc.common timeout: timed out
2012-08-07 18:44:25 TRACE nova.rpc.common

nova-scheduler log:
2012-08-06 14:03:21 TRACE nova.rpc.common IOError: Socket closed
2012-08-06 14:03:21 TRACE nova.rpc.common
2012-08-06 14:03:21 INFO nova.rpc.common [-] Reconnecting to AMQP server on 172.18.32.7:5672
2012-08-06 14:03:21 ERROR nova.rpc.common [-] AMQP server on 172.18.32.7:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 1 seconds.
2012-08-06 14:03:21 TRACE nova.rpc.common Traceback (most recent call last):
2012-08-06 14:03:21 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 446, in reconnect
2012-08-06 14:03:21 TRACE nova.rpc.common     self._connect()
2012-08-06 14:03:21 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 423, in _connect
2012-08-06 14:03:21 TRACE nova.rpc.common     self.connection.connect()
2012-08-06 14:03:21 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 154, in connect
2012-08-06 14:03:21 TRACE nova.rpc.common     return self.connection
2012-08-06 14:03:21 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 560, in connection
2012-08-06 14:03:21 TRACE nova.rpc.common     self._connection = self._establish_connection()
2012-08-06 14:03:21 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/connection.py", line 521, in _establish_connection
2012-08-06 14:03:21 TRACE nova.rpc.common     conn = self.transport.establish_connection()
2012-08-06 14:03:21 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 255, in establish_connection
2012-08-06 14:03:21 TRACE nova.rpc.common     connect_timeout=conninfo.connect_timeout)
2012-08-06 14:03:21 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/kombu/transport/pyamqplib.py", line 52, in __init__
2012-08-06 14:03:21 TRACE nova.rpc.common     super(Connection, self).__init__(*args, **kwargs)
2012-08-06 14:03:21 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/connection.py", line 129, in __init__
2012-08-06 14:03:21 TRACE nova.rpc.common     self.transport = create_transport(host, connect_timeout, ssl)
2012-08-06 14:03:21 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/transport.py", line 281, in create_transport
2012-08-06 14:03:21 TRACE nova.rpc.common     return TCPTransport(host, connect_timeout)
2012-08-06 14:03:21 TRACE nova.rpc.common   File "/usr/lib/python2.7/dist-packages/amqplib/client_0_8/transport.py", line 85, in __init__
2012-08-06 14:03:21 TRACE nova.rpc.common     raise socket.error, msg
2012-08-06 14:03:21 TRACE nova.rpc.common error: [Errno 111] ECONNREFUSED
2012-08-06 14:03:21 TRACE nova.rpc.common
2012-08-06 14:03:24 AUDIT nova.service [-] Starting scheduler node (version 2012.1-LOCALBRANCH:LOCALREVISION)
2012-08-06 14:03:25 INFO nova.rpc.common [req-883fdae1-bf4e-4707-b60e-d768f4ddf0f2 None None] Connected to AMQP server on 172.18.32.7:5672
2012-08-07 18:43:23 INFO nova.rpc.common [req-72804b2a-3d1d-4d38-977d-6e70e491559f 6ed9496e07724c969e1e470e9ea4621e f4c42d279f124477a487c57f2a96d2df] Connected to AMQP server on 172.18.32.7:5672
RabbitMQ had crashed. After restarting all services,
everything recovered.

2012-08-08 10:43:37 ERROR nova.rpc.amqp [req-b4675e0f-c682-4621-8530-fdc4ea316295 6ed9496e07724c969e1e470e9ea4621e f4c42d279f124477a487c57f2a96d2df] Exception during message handling
2012-08-08 10:43:37 TRACE nova.rpc.amqp Traceback (most recent call last):
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/rpc/amqp.py", line 253, in _process_data
2012-08-08 10:43:37 TRACE nova.rpc.amqp     rval = node_func(context=ctxt, **node_args)
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 114, in wrapped
2012-08-08 10:43:37 TRACE nova.rpc.amqp     return f(*args, **kw)
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 183, in decorated_function
2012-08-08 10:43:37 TRACE nova.rpc.amqp     sys.exc_info())
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2012-08-08 10:43:37 TRACE nova.rpc.amqp     self.gen.next()
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 177, in decorated_function
2012-08-08 10:43:37 TRACE nova.rpc.amqp     return function(self, context, instance_uuid, *args, **kwargs)
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 657, in run_instance
2012-08-08 10:43:37 TRACE nova.rpc.amqp     do_run_instance()
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/utils.py", line 945, in inner
2012-08-08 10:43:37 TRACE nova.rpc.amqp     retval = f(*args, **kwargs)
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 656, in do_run_instance
2012-08-08 10:43:37 TRACE nova.rpc.amqp     self._run_instance(context, instance_uuid, **kwargs)
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 457, in _run_instance
2012-08-08 10:43:37 TRACE nova.rpc.amqp     self._set_instance_error_state(context, instance_uuid)
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2012-08-08 10:43:37 TRACE nova.rpc.amqp     self.gen.next()
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 430, in _run_instance
2012-08-08 10:43:37 TRACE nova.rpc.amqp     requested_networks)
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 565, in _allocate_network
2012-08-08 10:43:37 TRACE nova.rpc.amqp     requested_networks=requested_networks)
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/network/api.py", line 170, in allocate_for_instance
2012-08-08 10:43:37 TRACE nova.rpc.amqp     'args': args})
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/rpc/__init__.py", line 68, in call
2012-08-08 10:43:37 TRACE nova.rpc.amqp     return _get_impl().call(context, topic, msg, timeout)
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 674, in call
2012-08-08 10:43:37 TRACE nova.rpc.amqp     return rpc_amqp.call(context, topic, msg, timeout, Connection.pool)
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/rpc/amqp.py", line 343, in call
2012-08-08 10:43:37 TRACE nova.rpc.amqp     rv = list(rv)
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/rpc/amqp.py", line 304, in __iter__
2012-08-08 10:43:37 TRACE nova.rpc.amqp     self.done()
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2012-08-08 10:43:37 TRACE nova.rpc.amqp     self.gen.next()
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/rpc/amqp.py", line 301, in __iter__
2012-08-08 10:43:37 TRACE nova.rpc.amqp     self._iterator.next()
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 572, in iterconsume
2012-08-08 10:43:37 TRACE nova.rpc.amqp     yield self.ensure(_error_callback, _consume)
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 503, in ensure
2012-08-08 10:43:37 TRACE nova.rpc.amqp     error_callback(e)
2012-08-08 10:43:37 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/rpc/impl_kombu.py", line 553, in _error_callback
2012-08-08 10:43:37 TRACE nova.rpc.amqp     raise rpc_common.Timeout()
2012-08-08 10:43:37 TRACE nova.rpc.amqp Timeout: Timeout while waiting on RPC response.
2012-08-08 10:43:37 TRACE nova.rpc.amqp

A network problem:
the dual-NIC configuration.
2012-08-08 15:17:13 TRACE nova.rpc.amqp Traceback (most recent call last):
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/rpc/amqp.py", line 253, in _process_data
2012-08-08 15:17:13 TRACE nova.rpc.amqp     rval = node_func(context=ctxt, **node_args)
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 114, in wrapped
2012-08-08 15:17:13 TRACE nova.rpc.amqp     return f(*args, **kw)
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 183, in decorated_function
2012-08-08 15:17:13 TRACE nova.rpc.amqp     sys.exc_info())
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2012-08-08 15:17:13 TRACE nova.rpc.amqp     self.gen.next()
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 177, in decorated_function
2012-08-08 15:17:13 TRACE nova.rpc.amqp     return function(self, context, instance_uuid, *args, **kwargs)
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 657, in run_instance
2012-08-08 15:17:13 TRACE nova.rpc.amqp     do_run_instance()
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/utils.py", line 945, in inner
2012-08-08 15:17:13 TRACE nova.rpc.amqp     retval = f(*args, **kwargs)
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 656, in do_run_instance
2012-08-08 15:17:13 TRACE nova.rpc.amqp     self._run_instance(context, instance_uuid, **kwargs)
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 457, in _run_instance
2012-08-08 15:17:13 TRACE nova.rpc.amqp     self._set_instance_error_state(context, instance_uuid)
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2012-08-08 15:17:13 TRACE nova.rpc.amqp     self.gen.next()
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 438, in _run_instance
2012-08-08 15:17:13 TRACE nova.rpc.amqp     self._deallocate_network(context, instance)
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2012-08-08 15:17:13 TRACE nova.rpc.amqp     self.gen.next()
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 435, in _run_instance
2012-08-08 15:17:13 TRACE nova.rpc.amqp     injected_files, admin_password)
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 598, in _spawn
2012-08-08 15:17:13 TRACE nova.rpc.amqp     self._legacy_nw_info(network_info), block_device_info)
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 114, in wrapped
2012-08-08 15:17:13 TRACE nova.rpc.amqp     return f(*args, **kw)
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 919, in spawn
2012-08-08 15:17:13 TRACE nova.rpc.amqp     block_device_info=block_device_info)
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 1539, in to_xml
2012-08-08 15:17:13 TRACE nova.rpc.amqp     rescue, block_device_info)
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 1422, in _prepare_xml_info
2012-08-08 15:17:13 TRACE nova.rpc.amqp     nics.append(self.vif_driver.plug(instance, network, mapping))
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/vif.py", line 99, in plug
2012-08-08 15:17:13 TRACE nova.rpc.amqp     return self._get_configurations(network, mapping)
2012-08-08 15:17:13 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/vif.py", line 69, in _get_configurations
2012-08-08 15:17:13 TRACE nova.rpc.amqp     'ip_address': mapping['ips'][0]['ip'],
2012-08-08 15:17:13 TRACE nova.rpc.amqp IndexError: list index out of range

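A dependency package was not installed.
Reinstalled following the guide on 沙克's website.
That works,
but the result is a layout with multiple network nodes.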

Migration errors:
2012-08-09 09:13:28 ERROR nova.scheduler.driver [req-0ed136e0-7694-4320-9f95-7a129bfa459a 6ed9496e07724c969e1e470e9ea4621e f4c42d279f124477a487c57f2a96d2df] Cannot confirm tmpfile at /var/lib/nova/instances is on same shared storage between worker1 and worker2.
2012-08-09 09:13:28 WARNING nova.scheduler.manager [req-0ed136e0-7694-4320-9f95-7a129bfa459a 6ed9496e07724c969e1e470e9ea4621e f4c42d279f124477a487c57f2a96d2df] Failed to schedule_live_migration: File tmpqUEm34 could not be found.
2012-08-09 09:13:28 ERROR nova.rpc.amqp [req-0ed136e0-7694-4320-9f95-7a129bfa459a 6ed9496e07724c969e1e470e9ea4621e f4c42d279f124477a487c57f2a96d2df] Exception during message handling
2012-08-09 09:13:28 TRACE nova.rpc.amqp Traceback (most recent call last):
2012-08-09 09:13:28 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/rpc/amqp.py", line 253, in _process_data
2012-08-09 09:13:28 TRACE nova.rpc.amqp     rval = node_func(context=ctxt, **node_args)
2012-08-09 09:13:28 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/scheduler/manager.py", line 97, in _schedule
2012-08-09 09:13:28 TRACE nova.rpc.amqp     context, ex, *args, **kwargs)
2012-08-09 09:13:28 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2012-08-09 09:13:28 TRACE nova.rpc.amqp     self.gen.next()
2012-08-09 09:13:28 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/scheduler/manager.py", line 92, in _schedule
2012-08-09 09:13:28 TRACE nova.rpc.amqp     return driver_method(*args, **kwargs)
2012-08-09 09:13:28 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/scheduler/driver.py", line 222, in schedule_live_migration
2012-08-09 09:13:28 TRACE nova.rpc.amqp     disk_over_commit)
2012-08-09 09:13:28 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/scheduler/driver.py", line 322, in _live_migration_common_check
2012-08-09 09:13:28 TRACE nova.rpc.amqp     self.mounted_on_same_shared_storage(context, instance_ref, dest)
2012-08-09 09:13:28 TRACE nova.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/scheduler/driver.py", line 521, in mounted_on_same_shared_storage
2012-08-09 09:13:28 TRACE nova.rpc.amqp     raise exception.FileNotFound(file_path=filename)
2012-08-09 09:13:28 TRACE nova.rpc.amqp FileNotFound: File tmpqUEm34 could not be found.
2012-08-09 09:13:28 TRACE nova.rpc.amqp
2012-08-09 09:13:28 ERROR nova.rpc.amqp [req-0ed136e0-7694-4320-9f95-7a129bfa459a 6ed9496e07724c969e1e470e9ea4621e f4c42d279f124477a487c57f2a96d2df] Returning exception File tmpqUEm34 could not be found. to caller
2012-08-09 09:13:28 ERROR nova.rpc.amqp [req-0ed136e0-7694-4320-9f95-7a129bfa459a 6ed9496e07724c969e1e470e9ea4621e f4c42d279f124477a487c57f2a96d2df] ['Traceback (most recent call last):\n', '  File "/usr/lib/python2.7/dist-packages/nova/rpc/amqp.py", line 253,
in _process_data\n    rval = node_func(context=ctxt, **node_args)\n', '  File "/usr/lib/python2.7/dist-packages/nova/scheduler/manager.py", line 97, in _schedule\n    context, ex, *args, **kwargs)\n', '  File "/usr/lib/python2.7/contextlib.py", line 24, in
__exit__\n    self.gen.next()\n', '  File "/usr/lib/python2.7/dist-packages/nova/scheduler/manager.py", line 92, in _schedule\n    return driver_method(*args, **kwargs)\n', '  File "/usr/lib/python2.7/dist-packages/nova/scheduler/driver.py", line 222, in schedule_live_migration\n   
disk_over_commit)\n', '  File "/usr/lib/python2.7/dist-packages/nova/scheduler/driver.py", line 322, in _live_migration_common_check\n    self.mounted_on_same_shared_storage(context, instance_ref, dest)\n', '  File "/usr/lib/python2.7/dist-packages/nova/scheduler/driver.py",
line 521, in mounted_on_same_shared_storage\n    raise exception.FileNotFound(file_path=filename)\n', 'FileNotFound: File tmpqUEm34 could not be found.\n']
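
The log message describes the idea behind the check: a temporary file is created in the instances directory on one side and then looked for on the other, and FileNotFound means the two hosts do not see the same storage. The sketch below is a local analogue of that idea for two directories on a single machine. It is an illustration under that assumption, not nova's implementation, which performs the two halves on different hosts.

```python
import os
import tempfile


def same_storage(dir_a, dir_b):
    """Create a temp file in dir_a and report whether dir_b shows it too."""
    fd, path = tempfile.mkstemp(dir=dir_a)
    os.close(fd)
    try:
        return os.path.exists(os.path.join(dir_b, os.path.basename(path)))
    finally:
        # Always remove the probe file, whatever the outcome.
        os.unlink(path)


with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    print("same directory:", same_storage(first, first))
    print("different directories:", same_storage(first, second))
```

For a real pair of compute nodes, the equivalent manual test is to touch a file under /var/lib/nova/instances on one node and list the directory on the other.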

libvirtError: internal error Process exited while reading console log output: chardev: opening backend "file" failed: Permission denied
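The NFS-shared directory does not have sufficient permissions,
even though it is already set to 777.

NFSv4 is configured differently from the other machines.
/etc/fstab on the compute nodes:
<nfs-hostname>:/ /var/lib/nova/instances nfs4 defaults 0 0
/etc/exports on the NFS server:
/var/lib/nova/instances *(rw,sync,no_root_squash,no_subtree_check,fsid=0)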

3. Configuration changes before migration. On every node, change the libvirt settings as follows.

Edit /etc/libvirt/libvirtd.conf:
before: #listen_tls = 0
after: listen_tls = 0
before: #listen_tcp = 1
after: listen_tcp = 1
add: auth_tcp = "none"

Edit /etc/init/libvirt-bin.conf:
before: exec /usr/sbin/libvirtd $libvirtd_opts
after: exec /usr/sbin/libvirtd -d -l

Edit /etc/default/libvirt-bin:
before: libvirtd_opts=" -d"
after: libvirtd_opts=" -d -l"

Edit /etc/libvirt/qemu.conf and uncomment the following three lines (changing vnc_listen is optional):
vnc_listen = "0.0.0.0"
user = "root"
group = "root"

Then restart libvirt:
sudo /etc/init.d/libvirt-bin restart
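
With several nodes to edit, it is easy to leave one of the libvirtd.conf lines commented out. The following is a minimal sketch that reads the text of a libvirtd.conf and reports which of the three settings named above are missing or different. `read_settings` and `wrong_settings` are illustrative helpers, and the simple key = value parsing is an assumption that holds for these three options.

```python
EXPECTED = {"listen_tls": "0", "listen_tcp": "1", "auth_tcp": '"none"'}


def read_settings(text):
    """Collect active (uncommented) key = value pairs from the file text."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def wrong_settings(text, expected=EXPECTED):
    """Return {option: actual value or None} for every mismatch."""
    actual = read_settings(text)
    return {k: actual.get(k) for k, v in expected.items() if actual.get(k) != v}


sample = '#listen_tls = 0\nlisten_tcp = 1\nauth_tcp = "none"\n'
print(wrong_settings(sample))
```

On a real node, pass the contents of /etc/libvirt/libvirtd.conf instead of the sample string; an empty result means all three settings are in place.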

nova-compute.log:
2012-08-09 13:13:03 ERROR nova.manager [-] Error during ComputeManager.update_available_resource: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
2012-08-09 13:13:03 TRACE nova.manager Traceback (most recent call last):
2012-08-09 13:13:03 TRACE nova.manager   File "/usr/lib/python2.7/dist-packages/nova/manager.py", line 155, in periodic_tasks
2012-08-09 13:13:03 TRACE nova.manager     task(self, context)
2012-08-09 13:13:03 TRACE nova.manager   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 2409, in update_available_resource
2012-08-09 13:13:03 TRACE nova.manager     self.driver.update_available_resource(context, self.host)
2012-08-09 13:13:03 TRACE nova.manager   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 1936, in update_available_resource
2012-08-09 13:13:03 TRACE nova.manager     'vcpus_used': self.get_vcpu_used(),
2012-08-09 13:13:03 TRACE nova.manager   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 1742, in get_vcpu_used
2012-08-09 13:13:03 TRACE nova.manager     for dom_id in self._conn.listDomainsID():
2012-08-09 13:13:03 TRACE nova.manager   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 298, in _get_connection
2012-08-09 13:13:03 TRACE nova.manager     self.read_only)
2012-08-09 13:13:03 TRACE nova.manager   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 341, in _connect
2012-08-09 13:13:03 TRACE nova.manager     return libvirt.openAuth(uri, auth, 0)
2012-08-09 13:13:03 TRACE nova.manager   File "/usr/lib/python2.7/dist-packages/libvirt.py", line 102, in openAuth
2012-08-09 13:13:03 TRACE nova.manager     if ret is None:raise libvirtError('virConnectOpenAuth() failed')
2012-08-09 13:13:03 TRACE nova.manager libvirtError: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory
2012-08-09 13:13:03 TRACE nova.manager

libvirt.log:
2012-08-09 08:29:38.318+0000: 346: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:29:38.318+0000: 346: warning : qemuCapsInit:856 : Failed to get host power management capabilities
2012-08-09 08:29:39.271+0000: 469: info : libvirt version: 0.9.8
2012-08-09 08:29:39.271+0000: 469: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:29:39.271+0000: 469: warning : qemuCapsInit:856 : Failed to get host power management capabilities
2012-08-09 08:29:40.064+0000: 591: info : libvirt version: 0.9.8
2012-08-09 08:29:40.064+0000: 591: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:29:40.064+0000: 591: warning : qemuCapsInit:856 : Failed to get host power management capabilities
2012-08-09 08:29:40.745+0000: 706: info : libvirt version: 0.9.8
2012-08-09 08:29:40.745+0000: 706: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:29:40.745+0000: 706: warning : qemuCapsInit:856 : Failed to get host power management capabilities
2012-08-09 08:29:41.395+0000: 820: info : libvirt version: 0.9.8
2012-08-09 08:29:41.395+0000: 820: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:29:41.395+0000: 820: warning : qemuCapsInit:856 : Failed to get host power management capabilities
2012-08-09 08:29:42.092+0000: 942: info : libvirt version: 0.9.8
2012-08-09 08:29:42.092+0000: 942: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:29:42.092+0000: 942: warning : qemuCapsInit:856 : Failed to get host power management capabilities
2012-08-09 08:29:42.745+0000: 1058: info : libvirt version: 0.9.8
2012-08-09 08:29:42.745+0000: 1058: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:29:42.745+0000: 1058: warning : qemuCapsInit:856 : Failed to get host power management capabilities
2012-08-09 08:29:43.968+0000: 1058: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:29:43.969+0000: 1058: warning : lxcCapsInit:77 : Failed to get host power management capabilities
2012-08-09 08:29:43.969+0000: 1058: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:29:43.969+0000: 1058: warning : umlCapsInit:87 : Failed to get host power management capabilities
2012-08-09 08:29:44.068+0000: 1048: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:29:44.068+0000: 1048: warning : qemuCapsInit:856 : Failed to get host power management capabilities
2012-08-09 08:29:45.018+0000: 1045: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:29:45.018+0000: 1045: warning : qemuCapsInit:856 : Failed to get host power management capabilities
2012-08-09 08:29:50.838+0000: 1211: info : libvirt version: 0.9.8
2012-08-09 08:29:50.838+0000: 1211: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:29:50.838+0000: 1211: warning : qemuCapsInit:856 : Failed to get host power management capabilities
2012-08-09 08:29:52.072+0000: 1211: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:29:52.073+0000: 1211: warning : lxcCapsInit:77 : Failed to get host power management capabilities
2012-08-09 08:29:52.073+0000: 1211: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:29:52.073+0000: 1211: warning : umlCapsInit:87 : Failed to get host power management capabilities
2012-08-09 08:29:52.150+0000: 1201: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:29:52.150+0000: 1201: warning : qemuCapsInit:856 : Failed to get host power management capabilities
2012-08-09 08:29:53.193+0000: 1197: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:29:53.193+0000: 1197: warning : qemuCapsInit:856 : Failed to get host power management capabilities
2012-08-09 08:29:54.057+0000: 1196: error : virNetSocketReadWire:996 : End of file while reading data: Input/output error
2012-08-09 08:40:46.124+0000: 1581: info : libvirt version: 0.9.8
2012-08-09 08:40:46.124+0000: 1581: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:40:46.124+0000: 1581: warning : qemuCapsInit:856 : Failed to get host power management capabilities
2012-08-09 08:40:47.364+0000: 1581: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:40:47.365+0000: 1581: warning : lxcCapsInit:77 : Failed to get host power management capabilities
2012-08-09 08:40:47.365+0000: 1581: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:40:47.365+0000: 1581: warning : umlCapsInit:87 : Failed to get host power management capabilities
2012-08-09 08:40:47.460+0000: 1571: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:40:47.460+0000: 1571: warning : qemuCapsInit:856 : Failed to get host power management capabilities
2012-08-09 08:40:48.312+0000: 1570: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-09 08:40:48.312+0000: 1570: warning : qemuCapsInit:856 : Failed to get host power management capabilities
2012-08-09 08:40:49.094+0000: 1567: error : virNetSocketReadWire:996 : End of file while reading data: Input/output error

Reinstalled to get past this error.
The worker1 storage node has failed; shelved for now.

2012-08-12 23:36:37.652+0000: 24256: error : virExecWithHook:328 : Cannot find 'pm-is-supported' in path: No such file or directory
2012-08-12 23:36:37.652+0000: 24256: warning : qemuCapsInit:856 : Failed to get host power management capabilities
This error means the power-management package is missing. It can be ignored, or fixed by installing the package:
sudo apt-get -y install pm-utils

Migrating from worker1 to worker2: no error was reported, but the virtual machine stayed on worker1.
libvirt.log:
error : virNetClientProgramDispatchError:174 : Unable to read from monitor: Connection reset by peer
Changed the VNC listen address in nova.conf to 0.0.0.0 (all interfaces). I do not understand why this is related to VNC.

 

 

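Strange problems that are still unresolved:

1. After rebooting the All-In-One control node while it hosts instances, nova-compute on the control node does not start; the log says the connection to RabbitMQ timed out.

The other compute nodes work fine, and manually restarting nova-compute on the control node succeeds.

After deleting the instances, rebooting the system works.

After shutting down the compute nodes, rebooting works.

 

2. After allocating a floating IP and associating it with a virtual machine, ping works at first and so does ssh. After deploying an HTTP service,

a while later ssh fails, ping fails, and remote access fails. The only way in is through the dashboard, and from there access to the external network fails.

And yet the deployed HTTP service keeps serving requests normally.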

 

 
