
Working with large lists in MOSS2007 (Part 1)

November 12, 2012 ⁄ General
   
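How much data can a list in WSS/MOSS actually hold? Is there any way past those limits? Which data access method is the most efficient in which scenario? I expect everyone who uses, or is about to use, MOSS2007 has asked these questions. Steve Peschka answers them well in the technical white paper "White Paper: Working with large lists in Office SharePoint Server 2007".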

   
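I translated the paper for readers who, like me, are not fluent in English. As I said, my own English is limited, so the original text is reproduced here; if you spot any mistakes, please point them out. The translation is split into roughly four parts: part one covers the goals of the tests and the test environment; part two covers the tests run with each data access method; part three describes the test harness and the results; part four analyzes the results and draws conclusions. The disclaimer section is not translated.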


Goals

The test results in this white paper are intended to demonstrate the difference in the performance characteristics of SharePoint lists containing large numbers of items when different data access types are used to present list contents. Test results in this white paper show how to optimize list performance through limits on the number of items that appear in a list, and by choosing the most appropriate method of retrieving list contents.

The tests upon which the results in this white paper are based were conducted by using artificially created test data and simulated users. Real-world results may vary depending on hardware, number of concurrent users, farm configuration, and user operations being performed.

Test results and findings

There is documented guidance for Microsoft Office SharePoint Server 2007 regarding the maximum size of lists and list containers. For typical customer scenarios in which the standard Office SharePoint Server 2007 browser-based user interface is used, the recommendation is that a single list should not have more than 2,000 items per list container. A container in this case means the root of the list, as well as any folders in the list; a folder is a container because other list items are stored within it. A folder can contain items from the list as well as other folders, and each subfolder can contain more of each, and so on. For example, that means that you could have a list with 1,990 items in the root of the list, 10 folders that each contain 2,000 items, and so on. The maximum number of items supported in a list with recursive folders is 5 million items.
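The container arithmetic above can be checked with a small sketch. The `Folder` class and `oversized` helper below are hypothetical names used only for illustration; they model a list in memory and do not call any SharePoint API. The sketch also assumes that subfolders count toward the 2,000-entry recommendation, which is how the 1,990 items plus 10 folders example reads.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

# Recommended maximum number of entries per container, as stated in the white paper.
CONTAINER_LIMIT = 2000


@dataclass
class Folder:
    """Hypothetical in-memory model of a list container (list root or folder)."""
    name: str
    item_count: int = 0  # plain list items stored directly in this container
    subfolders: List["Folder"] = field(default_factory=list)

    def direct_entries(self) -> int:
        # Assumption: items and subfolders both occupy slots in the container.
        return self.item_count + len(self.subfolders)

    def total_items(self) -> int:
        # Items stored here plus items in every nested folder.
        return self.item_count + sum(f.total_items() for f in self.subfolders)


def oversized(container: Folder, path: str = "") -> Iterator[Tuple[str, int]]:
    """Yield (path, entry count) for every container above the recommended limit."""
    here = f"{path}/{container.name}"
    if container.direct_entries() > CONTAINER_LIMIT:
        yield here, container.direct_entries()
    for sub in container.subfolders:
        yield from oversized(sub, here)


# The layout from the text: 1,990 items in the root and 10 folders of 2,000 items each.
root = Folder("root", 1990, [Folder(f"folder{i}", 2000) for i in range(10)])
print(root.direct_entries(), root.total_items(), list(oversized(root)))
```

With this layout every container stays at or below 2,000 entries, while the list as a whole holds 21,990 items.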

In Office SharePoint Server 2007, virtually all end-user data is stored in a list. A document library, for example, is just a specialized list. The same is true for calendars, contacts, and other interfaces; they are all just customized versions of the basic SharePoint list, also referred to as an SPList. The individual items in the list are referred to as list items generally, or an SPListItem in an SPListItemCollection in the Office SharePoint Server 2007 object model. The findings in this article are equally important across all of the ways in which you store and work with data in an Office SharePoint Server 2007 site.

There are some scenarios in which you want to take advantage of the features of Office SharePoint Server 2007, but need to exceed the limit of 2,000 items per container. If you write your own interface for managing and retrieving the data, it's quite possible that you can go past this limit without an adverse impact on farm performance. You may be able to manage larger lists to some extent by using views within Office SharePoint Server 2007 that are filtered such that there are never more than 2,000 items returned. Filtered views provide better performance than just trying to view one large flat list, but are not as efficient as breaking down the list into different containers if you are using the predefined browser-based Office SharePoint Server 2007 interface.

If you develop your own interface, there are several different ways to retrieve list data, each with different performance characteristics. Some data access methods perform very well, but are only useful in a limited number of scenarios. Finally, there are also performance tradeoffs that need to be made with other data maintenance tasks in addition to data retrieval.

Test characteristics

The tests in this white paper were conducted on a relatively underpowered Microsoft Virtual Server 2005 R2 image to show a comparison of farm performance characteristics when different data access types are used to manipulate list data. The goal of these tests was not to establish a new arbitrary limit, or to deliver a "requests per second" type number that is typically used in a load style test to show raw throughput capacity. The virtual server image was running Office SharePoint Server 2007 Enterprise Edition and had 1 gigabyte (GB) of allocated RAM. Virtual Server was running on a host machine with a 2 gigahertz (GHz) dual-core processor and 2 GB of RAM.

Baseline tests were done first with a list containing 1,500 items. The list schema looked like this:

Title: Single line of text

Expense Category: Choice (Meals, Travel, Hotel, Supplies)

Amount: Currency

Deductible: Yes/No

Created By: Person or Group

Modified By: Person or Group

In the baseline tests, no columns were indexed; measurements were taken just to provide a relative value that could be used after the number of items in the list exceeded recommended boundaries. In the tests against a very large list, one set was done with no columns being indexed and a second round was done after configuring the Expense Category column to be indexed. The query that was executed in each one of the tests used a WHERE clause against the Expense Category field looking for the first 100 items that contained "Supplies."
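The white paper does not show the query text at this point. As a rough illustration, a filter of this kind is usually written in CAML, the XML query language that SharePoint lists accept. The sketch below only builds the XML string. The helper name `build_choice_query` is hypothetical, and the internal field name `Expense_x0020_Category` is an assumption based on how SharePoint commonly encodes a space in a column name; the paper does not give it.

```python
import xml.etree.ElementTree as ET


def build_choice_query(field_name: str, value: str) -> str:
    """Return a CAML <Where> fragment matching items whose choice field equals value."""
    where = ET.Element("Where")
    eq = ET.SubElement(where, "Eq")
    # FieldRef names the column to test; field_name is an assumed internal name.
    ET.SubElement(eq, "FieldRef", Name=field_name)
    val = ET.SubElement(eq, "Value", Type="Choice")
    val.text = value
    return ET.tostring(where, encoding="unicode")


# The "first 100 items" cap is not part of the <Where> fragment; it is passed
# separately as a row limit by whichever data access method runs the query.
ROW_LIMIT = 100

query = build_choice_query("Expense_x0020_Category", "Supplies")
print(query)
print("row limit:", ROW_LIMIT)
```

In the server object model, for example, the fragment and the limit would typically go into the `Query` and `RowLimit` properties of an `SPQuery`.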

To provide another point of comparison, the data being selected was based on ID value in the tests against the very large list. The ID is a built-in numeric indexed field in all SharePoint lists that is well suited to queries. The query in this case was constructed with a WHERE clause that retrieved items where the ID ranged from 44,500 through 44,599.
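A range filter like this one could be expressed in CAML roughly as follows. This is a sketch, not the query used in the tests; `build_id_range_query` is a hypothetical helper that only produces the XML string.

```python
import xml.etree.ElementTree as ET


def build_id_range_query(first_id: int, last_id: int) -> str:
    """Return a CAML <Where> fragment selecting first_id <= ID <= last_id."""
    where = ET.Element("Where")
    both = ET.SubElement(where, "And")
    # Geq = greater than or equal to, Leq = less than or equal to.
    for operator, bound in (("Geq", first_id), ("Leq", last_id)):
        comparison = ET.SubElement(both, operator)
        ET.SubElement(comparison, "FieldRef", Name="ID")
        value = ET.SubElement(comparison, "Value", Type="Counter")
        value.text = str(bound)
    return ET.tostring(where, encoding="unicode")


id_query = build_id_range_query(44500, 44599)
print(id_query)

# The range is inclusive at both ends, so it covers 100 IDs.
print(44599 - 44500 + 1)
```

Assuming no items in that range were deleted, the query returns 100 items, the same result size as the Expense Category query, which keeps the two measurements comparable.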

Some tests were also run with the site under load. To create the load during the testing process, a LoadTest was created in the Microsoft Visual Studio 2005 development system to stress test the site. Instead of a specific number of users in the test, it was configured as a goal-based test, or a test in which a target value is defined for a particular measurement, and the test determines the number of requests required to achieve the target. In this case, the goal that was configured for the test was to achieve a consistent target CPU utilization on the Office SharePoint Server 2007 computer of 60 through 80 percent.
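The idea behind a goal-based test can be sketched as a simple feedback loop: measure the target value, then add or remove simulated users until the measurement falls inside the goal range. The code below is an illustration only and is not how Visual Studio implements it. The function names, the step size, and the linear CPU model in `simulated_cpu` are all assumptions made for the example.

```python
def adjust_users(users: int, cpu_percent: float,
                 low: float = 60.0, high: float = 80.0, step: int = 5) -> int:
    """Return the next user count given the last CPU measurement."""
    if cpu_percent < low:
        return users + step          # below the goal range: add load
    if cpu_percent > high:
        return max(1, users - step)  # above the goal range: remove load
    return users                     # inside the goal range: hold steady


def simulated_cpu(users: int, cost_per_user: float = 1.5) -> float:
    """Stand-in for a real CPU counter: each user costs a fixed share of CPU."""
    return min(100.0, users * cost_per_user)


def run_to_goal(start_users: int = 10, rounds: int = 50):
    """Adjust the user count until the simulated CPU settles in the goal range."""
    users = start_users
    for _ in range(rounds):
        next_users = adjust_users(users, simulated_cpu(users))
        if next_users == users:
            break
        users = next_users
    return users, simulated_cpu(users)


users, cpu = run_to_goal()
print(users, cpu)
```

Under this toy model the loop settles once the simulated CPU reading is between 60 and 80 percent; a real load test would read the counter from the server under test instead.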
