学步园

2010-11-20, Saturday, overcast and foggy

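## Derive the local working directory from an SVN URL.
## e.g. http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-biz/escrow/trunk/ ==> /home/$USER/work/intl-biz/escrow/
##      http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-biz/wsproductbase/client/branches/20101030_7849_1 ==>  /home/$USER/work/intl-biz/wsproductbase/client/
get_path_from_svnurl()
{
	local svnurl=$1
	local basedir=$2	# e.g. /home/$USER/work/ (with the trailing slash)
	# Double quotes around the first expression so the shell expands $basedir;
	# '#' is the s-command delimiter, so basedir must not contain '#'.
	printf '%s\n' "$svnurl" | sed -e "s#http://svn.alibaba-inc.com/repos/ali_intl/apps/#$basedir#" -e 's#trunk/.*##' -e 's#branches/.*##'
}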
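
The quoting around the sed expression matters in a function like this one. The following is a minimal sketch of the difference, assuming a made-up base directory /home/demo/work/ (not from the original notes): inside single quotes sed receives the literal text $basedir, while inside double quotes the shell substitutes the value first.

```shell
basedir=/home/demo/work/   # hypothetical base directory, for illustration only
url=http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-biz/escrow/trunk/

# Single quotes: the shell does not expand $basedir, so sed inserts it literally.
single=$(printf '%s\n' "$url" | sed 's#http://svn.alibaba-inc.com/repos/ali_intl/apps/#$basedir#')

# Double quotes: the shell replaces $basedir with its value before sed runs.
double=$(printf '%s\n' "$url" | sed "s#http://svn.alibaba-inc.com/repos/ali_intl/apps/#$basedir#")

printf '%s\n' "$single" "$double"
```

The first line printed still contains the text $basedir; the second starts with /home/demo/work/.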

As the function shows, # can be used in place of / as the delimiter:

The slash as a delimiter

The character after the s is the delimiter. It is conventionally a slash, because this is what ed, more, and vi use. It can be anything you want, however. If you want to change a pathname that contains a slash - say /usr/local/bin to /common/bin - you could use the backslash to quote the slash:

sed 's/\/usr\/local\/bin/\/common\/bin/' <old >new
Gulp. Some call this a 'Picket Fence' and it's ugly. It is easier to read if you use an underline instead of a slash as a delimiter:

sed 's_/usr/local/bin_/common/bin_' <old >new
Some people use colons:

sed 's:/usr/local/bin:/common/bin:' <old >new
Others use the "|" character.

sed 's|/usr/local/bin|/common/bin|' <old >new
Pick one you like. As long as it's not in the string you are looking for, anything goes. And remember that you need three delimiters. If you get a "Unterminated `s' command" it's because you are missing one of them.
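
As a quick check of the passage above, this sketch runs the same substitution with four different delimiters. The input path /usr/local/bin/tool is a made-up example; all four commands should produce the same result.

```shell
path=/usr/local/bin/tool   # hypothetical input, for illustration only

# Default delimiter: every slash in the pattern and replacement must be escaped.
a=$(printf '%s\n' "$path" | sed 's/\/usr\/local\/bin/\/common\/bin/')
# Alternative delimiters: no escaping needed, because they do not occur in the text.
b=$(printf '%s\n' "$path" | sed 's_/usr/local/bin_/common/bin_')
c=$(printf '%s\n' "$path" | sed 's:/usr/local/bin:/common/bin:')
d=$(printf '%s\n' "$path" | sed 's|/usr/local/bin|/common/bin|')

printf '%s\n' "$a" "$b" "$c" "$d"
```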

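However, outside the substitute command this does not work in quite the same way. In an address the default delimiter is the slash, so the slashes in the pattern have to be escaped:

forrest@ubuntu:~$ sed '/http:\/\/svn.alibaba-inc.com\/repos/!d' /home/forrest/Desktop/cnfm_branches_20101119.txt

Writing it like this does not work:

forrest@ubuntu:~$ sed '#http://svn.alibaba-inc.com/repos#d#' /home/forrest/Desktop/cnfm_branches_20101119.txt

Here sed treats the leading # as the start of a comment, so no line is deleted. To use a different delimiter in an address, put a backslash before the first one, as in \#http://svn.alibaba-inc.com/repos#!d.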
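
sed does accept a custom delimiter in an address when the first delimiter is preceded by a backslash. The sketch below compares the escaped-slash form with the \#...# form on a small made-up input that stands in for cnfm_branches_20101119.txt; the note lines are invented for illustration.

```shell
# Hypothetical sample input; only the URL line should survive the filter.
input='some note
http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/branches/20101119_26643_1
another note'

# Default delimiter: slashes inside the pattern are escaped.
escaped=$(printf '%s\n' "$input" | sed '/http:\/\/svn.alibaba-inc.com\/repos/!d')

# \# makes # the address delimiter, so the slashes need no escaping.
custom=$(printf '%s\n' "$input" | sed '\#http://svn.alibaba-inc.com/repos#!d')

printf '%s\n' "$escaped" "$custom"
```

Both commands delete every line that does not match the pattern, so each prints only the URL line.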

In addition, a simple example is changing "day" in the "old" file to "night" in the "new" file:
sed s/day/night/ <old >new
Or another way (for Unix beginners),

sed s/day/night/ old >new
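old and new must not be the same file, or the result is an empty file, because the shell truncates the output file before sed reads the input.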
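
One common workaround, sketched here under the assumption that a scratch directory from mktemp is acceptable, is to write to a temporary file and rename it over the original only if sed succeeded. The file names are illustrative.

```shell
workdir=$(mktemp -d)                          # scratch directory for the demo
printf 'day one\nday two\n' > "$workdir/old"  # sample content

# Write to a temporary file first; replace the original only when sed exits successfully.
sed 's/day/night/' "$workdir/old" > "$workdir/old.tmp" && mv "$workdir/old.tmp" "$workdir/old"

result=$(cat "$workdir/old")
printf '%s\n' "$result"
rm -rf "$workdir"
```

GNU sed also provides an -i option for in-place editing; its availability and exact syntax differ between sed implementations.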
To avoid saving the result of every substitution to a new temporary file, the substitutions can be chained together as follows, much like a pipe:

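Multiple edits

If you need to make several changes to the same file or line, there are three ways to do it. The first is the "-e" option, which tells the program that multiple editing commands are being used. For example: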

$ echo The tiger cubs will meet on Tuesday after school | sed -e 's/tiger/wolf/' -e 's/after/before/'
The wolf cubs will meet on Tuesday before school
$
This is a rather cumbersome way to do it, so the "-e" option is not used that widely. A better way is to separate the commands with semicolons:

$ echo The tiger cubs will meet on Tuesday after school | sed 's/tiger/wolf/; s/after/before/'
The wolf cubs will meet on Tuesday before school
$
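Note that the semicolon should be the very next character after the slash. According to the source of this example, a space between the two makes the operation fail with an error message; this depends on the sed implementation (GNU sed accepts whitespace before the semicolon). Both methods work well, but many administrators prefer a third.

The key point is that everything between the two apostrophes (' ') is interpreted as sed commands. The shell reading the commands does not consider your input complete until you type the second apostrophe. This means the commands can be entered on multiple lines, with Linux changing the prompt from PS1 to a continuation prompt (usually ">"), until the second apostrophe is typed. Once it is typed and Enter is pressed, processing proceeds and produces the same result, as shown below: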

$ echo The tiger cubs will meet on Tuesday after school | sed '
> s/tiger/wolf/
> s/after/before/'
The wolf cubs will meet on Tuesday before school
$

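I tried these out and found that the second method works. The first method seemed to have some problem, but I did not look closely, so the exact cause is unknown.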
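
For comparison, the three forms can be run side by side in a script. This is a small sketch using the same sentence as above; the variable names are only for illustration. With a POSIX-style sed all three should give the same output.

```shell
text='The tiger cubs will meet on Tuesday after school'

# 1. One -e option per editing command.
with_e=$(printf '%s\n' "$text" | sed -e 's/tiger/wolf/' -e 's/after/before/')

# 2. Commands separated by a semicolon.
with_semicolon=$(printf '%s\n' "$text" | sed 's/tiger/wolf/;s/after/before/')

# 3. Commands separated by a newline inside the quoted script.
with_newline=$(printf '%s\n' "$text" | sed 's/tiger/wolf/
s/after/before/')

printf '%s\n' "$with_e" "$with_semicolon" "$with_newline"
```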

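In practice

2010-11-23, Tuesday, sunny

This morning I came in to merge code. Because the branches scheduled for release today had to be merged into our code first, I looked up today's release list on Aone. The information copied from Aone has the following format: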

1 intl-aisn tradeManager根据ip展示中文页面 麦俊生  http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/branches/20101119_26643_1 http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/tags/20101123_r_release1 395958 395958 
UK站首页help us挖成天窗 李栋  http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/branches/20101122_26769_1 
Sourcing Detail底部wholesale产品推荐优化 顾士元  http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/branches/20101122_25291_2 
招商频道日常发布11.23 黄健  http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/branches/20101122_26775_1 
深度认证-atm tab页url修改 刘亳  http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/branches/20101119_26397_1 
2 intl-atmgateway cookielog配置修改 麦俊生  http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-atmgateway/branches/20101122_26641_1 http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-atmgateway/tags/20101123_r_release1 380286 380286 
3 intl-atmlogin cookielog配置修改 麦俊生  http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-atmlogin/branches/20101122_26641_1 http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-atmlogin/tags/20101123_r_release1 395965 395965 
。。。

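There are 83 branches in total. Picking out the pre-release branches by hand is painful. What we want is the URL of every branch marked with tags; that is, for the first release entry we want to extract the following: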

http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/tags/20101123_r_release1

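sed's substitution feature is a good fit for this.
The information we want to extract has this format: http://svn.alibaba-inc.com/repos/ali_intl/apps/<application name>/tags/<branch info>
First, extract all the SVN URLs so they can be processed further.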

forrest@ubuntu:~/Desktop$ sed 's#.*http://svn.alibaba-inc.com/repos/ali_intl/apps/\(.*\)#http://svn.alibaba-inc.com/repos/ali_intl/apps/\1#; /^http/ !d' < aone_release_20101123.txt > sed_study_1.txt

This yields data like the following:

http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/tags/20101123_r_release1 395958 395958 
http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/branches/20101122_26769_1 
http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/branches/20101122_25291_2 
http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/branches/20101122_26775_1 
http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/branches/20101119_26397_1 
http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-atmgateway/tags/20101123_r_release1 380286 380286 
。。。
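
The extraction step can be tried on a few records without the real Aone file. The sketch below uses sample lines shaped like the list above; the words "desc" and "owner" are placeholders for the description and name columns, and the last line is invented to show that lines without a URL are dropped. Because .* is greedy, a line with two URLs keeps only the last one.

```shell
# Sample records shaped like the Aone list; "desc" and "owner" are placeholders.
input='1 intl-aisn desc owner  http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/branches/20101119_26643_1 http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/tags/20101123_r_release1 395958 395958
desc owner  http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/branches/20101122_26769_1
a line without any URL'

# The leading .* is greedy, so everything up to the LAST URL on the line is removed.
# Lines that still do not start with http afterwards are deleted.
urls=$(printf '%s\n' "$input" | sed 's#.*http://svn.alibaba-inc.com/repos/ali_intl/apps/\(.*\)#http://svn.alibaba-inc.com/repos/ali_intl/apps/\1#; /^http/!d')

printf '%s\n' "$urls"
```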

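Two things remain to be handled:
1. Remove the branches that are not under /tags/
2. Remove the revision numbers that follow the tags branches

The first is easy to do.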

forrest@ubuntu:~/Desktop$ sed '/tags/ !d' < sed_study_1.txt > sed_study_2.txt
http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/tags/20101123_r_release1 395958 395958 
http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-atmgateway/tags/20101123_r_release1 380286 380286 
http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-atmlogin/tags/20101123_r_release1 395965 395965 
http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-billing/tags/20101123_r_release1 380290 380290
。。。

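The second is more awkward in sed unless the pattern matches the preceding release1.
For picking a field out of table-like records such as these, however, awk is the most convenient: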

forrest@ubuntu:~$ awk '{print $1}'  ~/Desktop/sed_study_1.txt > /home/forrest/Desktop/sed_study_2.txt
http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/tags/20101123_r_release1
http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/branches/20101122_26769_1
http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/branches/20101122_25291_2
。。。
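
Both remaining steps (keep only /tags/ lines, drop the trailing revision numbers) can also be combined into a single command. The following is a sketch on three sample lines taken from the intermediate output above, not the author's original pipeline: one awk version, and one sed version that cuts everything from the first space onward.

```shell
input='http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/tags/20101123_r_release1 395958 395958
http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-aisn/branches/20101122_26769_1
http://svn.alibaba-inc.com/repos/ali_intl/apps/intl-atmgateway/tags/20101123_r_release1 380286 380286'

# awk: select lines containing /tags/ and print only the first whitespace-separated field.
with_awk=$(printf '%s\n' "$input" | awk '/\/tags\// {print $1}')

# sed: -n suppresses default output; on /tags/ lines, delete from the first space and print.
with_sed=$(printf '%s\n' "$input" | sed -n '\#/tags/#{s/ .*//;p;}')

printf '%s\n' "$with_awk"
```

The sed version relies on the URL containing no spaces, which holds for the sample data here.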

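Summary:
sed is a powerful yet simple line-oriented stream editor that can be used for simple text processing, for example:
The result is that nowadays, sed is most commonly used in just two kinds of applications: simple text substitutions (that don't involve fields!), and extractions of lines by number.
In other cases awk is much more convenient than sed, which is why it pays to know several languages and the situations each one suits.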


sed study notes


Author: echoing
