Installing Thrift + Scribe + Hadoop

December 8, 2016 ⁄ General

Prerequisites:

libevent, libevent-devel, boost, python, python-devel

Boost must be version 1.45.
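
Since the wrong Boost version causes build failures later on, it can help to check which version is installed before starting. The following is a minimal sketch, assuming the Boost headers are installed and that version.hpp lives under /usr/include/boost (that default path and the function name boost_lib_version are my assumptions, not part of the original setup):

```shell
# Print the Boost version recorded in a version.hpp header.
# Usage: boost_lib_version [path/to/boost/version.hpp]
boost_lib_version() {
    # The default header location is an assumption; pass the real path if it differs.
    header="${1:-/usr/include/boost/version.hpp}"
    # version.hpp contains a line of the form: #define BOOST_LIB_VERSION "1_45"
    sed -n 's/^#define BOOST_LIB_VERSION "\([^"]*\)".*/\1/p' "$header"
}
```

Calling boost_lib_version /usr/local/include/boost/version.hpp on a Boost 1.45 installation should print 1_45; any other value means the Boost directory needs replacing.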

Download Thrift 0.7.0. Scribe 2.2 would not install for me, so download the latest Scribe source code instead.

Install Thrift first

./configure  --enable-gen-php --with-cpp  --with-php    --with-boost  --with-libevent --enable-gen-cpp
make && make install

If the build fails with a zend_std_get_constructor error, the cause is the PHP version. Switching to PHP 5.3.8 fixed it for me.

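If the build reports that php.h cannot be found, the PHP headers are not installed under /usr/include/php. A symlink to the real location is enough:

ln -s /usr/web_soft/php/include/php /usr/include/php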

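Next install fb303, which is in the contrib directory of the Thrift source tree.

If ./bootstrap.sh is not executable, add the permission first:

chmod +x bootstrap.sh

Running ./bootstrap.sh generates the configure script.

./configure --with-thriftpath=/usr/local/ 
make && make install

The Thrift install directory has to be given explicitly, even when Thrift was installed to the default /usr/local.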

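Also build the PHP C extension, which makes processing more efficient. Its source is in

thrift-0.7.0/lib/php/src/ext/thrift_protocol/

and it should build like any other PHP extension (phpize, ./configure, make, make install).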

Install Scribe

The boostlib error messages printed by ./bootstrap.sh can be ignored. All that matters is that it generates the configure script.

If make fails with

error:   overriding ‘virtual scribe::thrift::ResultCode::type scribe::thrift::scribeIf::Log(

the Boost version is wrong. It has to be 1.45: delete the existing boost directory and reinstall.

Alternatively, use the pcting fork:

https://github.com/pcting/scribe

./configure --with-thriftpath=/usr/local/ --with-fb303path=/usr/local/
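The paths can only be given down to this prefix; configure locates /usr/local/bin/thrift by itself.
If the build fails because -lthriftnb cannot be found, reinstalling Thrift fixes it. That is what happened in my case.
To use HDFS, add these options as well: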
--enable-hdfs --with-hadooppath=/usr/web_soft/hadoop/ CPPFLAGS="-I/usr/web_soft/hadoop/src/c++/libhdfs -I/usr/java/jdk1.6.0_25/include/ -I/usr/java/jdk1.6.0_25/include/linux/"
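
To make it clearer how the basic options and the HDFS options combine into one command, here is a small sketch. It only uses the flags quoted above; the function name scribe_configure_hdfs and its two parameters are hypothetical, and it assumes it is run from the Scribe source directory after ./bootstrap.sh has produced configure:

```shell
# Run Scribe's ./configure with HDFS support enabled.
# Usage: scribe_configure_hdfs <hadoop_dir> <jdk_dir>
scribe_configure_hdfs() {
    hadoop_dir="$1"   # Hadoop install directory
    jdk_dir="$2"      # JDK install directory (provides jni.h)
    # CPPFLAGS adds the libhdfs and JNI header directories to the include path.
    ./configure --with-thriftpath=/usr/local/ --with-fb303path=/usr/local/ \
        --enable-hdfs --with-hadooppath="$hadoop_dir" \
        CPPFLAGS="-I$hadoop_dir/src/c++/libhdfs -I$jdk_dir/include -I$jdk_dir/include/linux"
}
```

With the paths used in this post, the call would be scribe_configure_hdfs /usr/web_soft/hadoop /usr/java/jdk1.6.0_25.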

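If startup fails with

error while loading shared libraries xxx.so

the dynamic loader cannot find that .so file.

Find the directory that contains the file,

add that directory to /etc/ld.so.conf.d/xxx.conf,

and run /sbin/ldconfig -v to apply the change.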
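
The fix above can be sketched as a small helper that adds the directory only once, so repeating it does not leave duplicate entries. The function name register_lib_dir is hypothetical, and the paths in the usage note are examples rather than values from this setup:

```shell
# Append a library directory to an ld.so.conf.d file unless it is already listed.
# Usage: register_lib_dir <lib_dir> <conf_file>
register_lib_dir() {
    lib_dir="$1"
    conf_file="$2"
    # -x matches whole lines only, -F treats the path as a literal string.
    if ! grep -qxF -- "$lib_dir" "$conf_file" 2>/dev/null; then
        printf '%s\n' "$lib_dir" >> "$conf_file"
    fi
}
```

As root, something like register_lib_dir /usr/local/lib /etc/ld.so.conf.d/scribe.conf followed by /sbin/ldconfig would apply it. Running /sbin/ldconfig -p afterwards lists the libraries the loader now knows about.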

 

Using it from PHP:

Generate the PHP files:

/usr/local/bin/thrift -o ./php_code -I /usr/local/share/ --gen php /usr/local/share/fb303/if/fb303.thrift 
/usr/local/bin/thrift -o  ./php_code -I /usr/local/share/ --gen php /source/to/scribe/if/scribe.thrift

Copy Thrift's PHP library:

cp -r /source/to/thrift/lib/php/src ./php_code


Arrange the directory like this:

Lib/Thrift
├── packages
│   ├── fb303
│   │   ├── FacebookService.php
│   │   └── fb303_types.php
│   └── scribe
│       └── scribe_types.php
├── protocol
│   ├── TBinaryProtocol.php
│   ├── TBinarySerializer.php
│   └── TProtocol.php
├── scribe.php
├── server
│   ├── TServer.php
│   └── TSimpleServer.php
├── Thrift.php
└── transport
    ├── TBufferedTransport.php
    ├── TFramedTransport.php
    ├── THttpClient.php
    ├── TMemoryBuffer.php
    ├── TNullTransport.php
    ├── TPhpStream.php
    ├── TServerSocket.php
    ├── TServerTransport.php
    ├── TSocket.php
    ├── TSocketPool.php
    ├── TTransportFactory.php
    └── TTransport.php

Create the packages directory and put fb303 and scribe/scribe_types.php in it.
Then:
<?php
// Root directory of the Thrift PHP library (adjust to where it was copied).
$GLOBALS['THRIFT_ROOT'] = './includes';

include_once $GLOBALS['THRIFT_ROOT'] . '/scribe.php';
include_once $GLOBALS['THRIFT_ROOT'] . '/transport/TSocket.php';
include_once $GLOBALS['THRIFT_ROOT'] . '/transport/TFramedTransport.php';
include_once $GLOBALS['THRIFT_ROOT'] . '/protocol/TBinaryProtocol.php';

// Each log entry carries a category and a newline-terminated message.
$msg1['category'] = 'keyword';
$msg1['message'] = "This is some message for the category\n";
$msg2['category'] = 'keyword';
$msg2['message'] = "Some other message for the category\n";
$entry1 = new LogEntry($msg1);
$entry2 = new LogEntry($msg2);
$messages = array($entry1, $entry2);

// Connect to Scribe on localhost:1463; the third argument asks for a persistent socket.
$socket = new TSocket('localhost', 1463, true);
$transport = new TFramedTransport($socket);
// Binary protocol with strict read and strict write both disabled.
$protocol = new TBinaryProtocol($transport, false, false);
$scribe_client = new scribeClient($protocol, $protocol);

// Send both entries in a single Log() call.
$transport->open();
$scribe_client->Log($messages);
$transport->close();

Logs from multiple Scribe clients can now be collected and stored on one central Scribe server.
Client configuration:
port=1464
max_msg_per_second=2000000
check_interval=3


# DEFAULT - forward all messages to Scribe on port 1463
<store>
category=default
type=buffer

target_write_size=20480
max_write_interval=1
buffer_send_rate=1
retry_interval=30
retry_interval_range=10

<primary>
type=network
remote_host=xxx.xxx.xxx.xxx
remote_port=1463
</primary>

<secondary>
type=file
fs_type=std
file_path=/path/scribe
base_filename=scribe_tmp_file
max_size=1073741824
</secondary>
</store>

Central server configuration:

port=1463
max_msg_per_second=2000000
check_interval=3

<store>
category=default
type=buffer

target_write_size=20480
max_write_interval=1
buffer_send_rate=2
retry_interval=30
retry_interval_range=10

<primary>
type=file
fs_type=std
file_path=/tmp/scribetest
base_filename=thisisoverwritten
max_size=100000000
</primary>

<secondary>
type=file
fs_type=std
file_path=/tmp
base_filename=thisisoverwritten
max_size=3000000
</secondary>
</store>

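Install Hadoop
Download and unpack it, then configure it by following one of the tutorials available online. The configuration files below are for reference.
1. conf/core-site.xml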
<configuration>
<property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:8000</value>
  </property>
</configuration>
2. conf/hdfs-site.xml
<configuration>
<property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
3. conf/mapred-site.xml
<configuration>
<property>
    <name>mapred.job.tracker</name>
    <value>localhost:8001</value>
  </property>
</configuration>
Start it:
bin/start-all.sh

If you get a permission error such as

error: org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=WRITE, inode="":hadoop:supergroup:rwxr-xr-x

add the following to conf/hdfs-site.xml (this turns off HDFS permission checking):

<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
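Operational commands:
Reposted from elsewhere:
Sometimes a datanode or tasktracker crashes, or new machines have to be added to the cluster, and restarting the cluster is not an option. The following procedure may help.
1. Add the new machine to conf/slaves (skip this step if a datanode or tasktracker merely crashed).
2. On the new machine, change to the Hadoop install directory and run:
  $ bin/hadoop-daemon.sh start datanode
  $ bin/hadoop-daemon.sh start tasktracker

3. On the namenode, run:
  $ bin/hadoop balancer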
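
Step 2 of the procedure above can be wrapped in a small function so that it is easy to repeat on each node. This is only a sketch: the function name start_worker_daemons is hypothetical, and it assumes the Hadoop layout used in this post, with hadoop-daemon.sh under bin/ of the install directory.

```shell
# Start the worker daemons on a newly added (or crashed) node.
# Usage: start_worker_daemons <hadoop_install_dir>
start_worker_daemons() {
    hadoop_home="$1"
    # Run in a subshell so the caller's working directory is left unchanged.
    (
        cd "$hadoop_home" || exit 1
        # Start the tasktracker only if the datanode started successfully.
        bin/hadoop-daemon.sh start datanode &&
        bin/hadoop-daemon.sh start tasktracker
    )
}
```

On a node where Hadoop is installed under /usr/web_soft/hadoop, the call would be start_worker_daemons /usr/web_soft/hadoop. The balancer in step 3 still has to be run separately on the namenode.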

 
