
Types of Data

December 7, 2013
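How is enterprise data classified?
A coarse classification
At a coarse level, enterprise data falls into two categories: master data and transactional data.
Master Data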
“Master Data is your business critical data that is stored in disparate systems spread across your Enterprise.”
Master data describe the people, places, and things that are involved in an organization’s business.

Because these data tend to be used by multiple business processes and IT systems, standardizing master data formats and synchronizing values are critical for successful system integration.

Master data is usually divided into four categories:
  • Parties: represents all parties the enterprise conducts business with such as customers, prospects, individuals, suppliers, partners, etc.
  • Places: represents the physical places and their segmentations such as geographies, locations, subsidiaries, sites, areas, zones, etc.
  • Things: usually represents what the enterprise actually sells such as products, services, packages, items, financial services, etc.
  • Financial and Organizational: represents all roll-up hierarchies used in many places for reporting and accounting purposes such as organization structures, sales territories, chart of accounts, cost centers, business units, profit centers, price lists, etc.
Transactional Data
Transactional data, such as purchase orders, invoices, or financial statements, is not usually considered master data, since it registers a “fact” that happened at a certain point in time.
Transactional Data is really what drives the business indicators of the enterprise and it relies entirely on Master Data.

Examples include sales orders, invoices, purchase orders, shipping documents, passport applications, credit card payments, and insurance claims.
These data are typically grouped into transactional records, which include associated master data.
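The way a transactional record carries references to master data can be sketched as follows. This is only a minimal illustration: the class names (`Customer`, `Product`, `SalesOrder`), the sample IDs, and the `order_total` helper are all hypothetical and not taken from any particular system.

```python
from dataclasses import dataclass
from datetime import date

# Master data: relatively stable descriptions of parties and things.
@dataclass(frozen=True)
class Customer:
    customer_id: str
    name: str

@dataclass(frozen=True)
class Product:
    product_id: str
    name: str
    unit_price: float

# Transactional data: a fact recorded at a point in time,
# pointing at master data by key (hypothetical layout).
@dataclass(frozen=True)
class SalesOrder:
    order_id: str
    order_date: date
    customer_id: str
    product_id: str
    quantity: int

# Illustrative master data stores keyed by ID.
customers = {"C001": Customer("C001", "Acme Ltd")}
products = {"P100": Product("P100", "Widget", 2.5)}

def order_total(order: SalesOrder) -> float:
    """Look up the product master record and compute the order value."""
    product = products[order.product_id]
    return product.unit_price * order.quantity

order = SalesOrder("SO-1", date(2013, 12, 7), "C001", "P100", 4)
```

The order itself stores only keys and the measured fact (date, quantity); everything descriptive is resolved from the master data, which is why the transactional record depends on it.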

The relationship between the two types of data

--------------------------------------------------------------------------------------------------------------------------------------------
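A more detailed classification
Some people classify data more finely, into the six categories shown in the figure above. In the data model, a deeper blue means stronger semantic content and greater importance of data quality; a deeper yellow means a larger data volume, a faster update rate, faster real-time capture, and a shorter data lifespan.
From this we can see that metadata has the strongest semantics, is almost never updated, has the smallest volume, and has the longest life cycle.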
Metadata
This is data that describes the data held in the enterprise information architecture,
e.g. definitions of tables and columns in the system catalog of a database, or entities and attributes in a data model. 
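As a small illustration of metadata held in a database's system catalog, the sketch below uses Python's built-in `sqlite3` module and SQLite's `PRAGMA table_info` to read back column definitions. The `customer` table is a made-up example; other database systems expose their catalogs differently.

```python
import sqlite3

# In-memory database with one illustrative table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = conn.execute("PRAGMA table_info(customer)").fetchall()

# The column names and declared types are metadata: data about the data.
column_names = [row[1] for row in columns]
column_types = [row[2] for row in columns]
```

Nothing here touches customer rows; the query describes only the shape of the table.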
Reference Data
Tables in databases that are also called "domains", or "lookup tables". These are used to hold information about entities the enterprise does manage in its business (e.g. countries and currencies), or hold information that categorizes the enterprise's information. We define reference data this way: Reference data is any kind of data that is used solely to categorize other data found in a database, or solely for relating data in a database to information beyond the boundaries of the enterprise.
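A lookup table used purely to categorize other data might look like the following sketch. The currency codes are ordinary ISO 4217 codes used for illustration; the `payments` rows, the deliberately unknown code `ZZZ`, and both helper functions are hypothetical.

```python
# Reference data: a small code table (lookup table / domain).
CURRENCIES = {"USD": "US Dollar", "EUR": "Euro", "JPY": "Japanese Yen"}

# Other data that is categorized by the reference data.
payments = [
    {"payment_id": 1, "amount": 100.0, "currency": "USD"},
    {"payment_id": 2, "amount": 250.0, "currency": "EUR"},
    {"payment_id": 3, "amount": 80.0, "currency": "ZZZ"},  # not in the lookup
]

def invalid_currency_rows(rows, lookup):
    """Return the IDs of rows whose currency code is not in the lookup table."""
    return [r["payment_id"] for r in rows if r["currency"] not in lookup]

def totals_by_currency(rows, lookup):
    """Sum amounts per currency name, skipping rows with unknown codes."""
    totals = {}
    for r in rows:
        if r["currency"] in lookup:
            name = lookup[r["currency"]]
            totals[name] = totals.get(name, 0.0) + r["amount"]
    return totals
```

The lookup table holds no business activity of its own; it exists only to validate and group the payment rows.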
Master Data
See the detailed explanation above.
Enterprise Structure Data
Data that describes the structure of the enterprise, e.g. organizational structure or chart of accounts. This information is used to track business activities by responsibility. Formal definition: Data that permits business activity to be reported or analyzed by business responsibility.
Transaction Activity Data
This is the traditional focus of IT. It is the data that forms the transactions processed by the operational systems of the enterprise, e.g. sales, trades, etc.
Transaction Audit Data
An individual transaction may pass through several steps, and in each step its state may change. Audit information tracks these state changes. Web logs and database logs also track this kind of data.
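One simple way to picture audit data is a transaction object that appends an entry every time its state changes. The sketch below is an assumption-laden illustration: the `Order` class, the state names, and the shape of the log entries are all invented for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Order:
    order_id: str
    state: str = "created"
    audit_log: list = field(default_factory=list)

    def transition(self, new_state: str) -> None:
        """Record the state change in the audit log, then apply it."""
        self.audit_log.append({
            "order_id": self.order_id,
            "from": self.state,
            "to": new_state,
            "at": datetime.now(timezone.utc),
        })
        self.state = new_state

order = Order("SO-1")
order.transition("paid")
order.transition("shipped")

# The activity data is the order's current state;
# the audit data is the history of how it got there.
history = [(entry["from"], entry["to"]) for entry in order.audit_log]
```

In a real system the log would usually be written to a separate, append-only store rather than kept on the object itself.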

--------------------------------------------------------------------------------------------------------------------------------------------
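Classification by storage form
Structured data: data stored in a database.
Unstructured data: as the name suggests, information stored in file systems rather than in databases, such as documents, email, and social media. According to an IDC survey report, 80% of enterprise data is unstructured, and this data grows exponentially at 60% per year.
Structured data: structure first, then data.
Unstructured data: data without structure.
The biggest challenge of the big data era also comes from processing unstructured data. Moreover, structured data is often not the most critical input to a decision. Traditional BI (business intelligence) analysis software is still based mainly on structured data and answers only the questions Who, What, When, and Where, not Why or How. Answering Why and How may in the future depend on analysis of unstructured data.
For example, traditional BI can show you that a product is selling poorly, but it may be hard to learn the reason. BI over unstructured data can analyze negative keywords about the product on social networks and ultimately identify the root cause of the poor sales.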
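The negative-keyword idea can be sketched very roughly as counting how often a fixed list of negative words appears in free-text posts. Everything here is an assumption made for illustration: the keyword list, the sample posts, and the `negative_keyword_counts` helper. A real analysis would need proper text processing rather than plain word matching.

```python
from collections import Counter
import re

# Hypothetical list of negative keywords to look for.
NEGATIVE_KEYWORDS = {"broken", "slow", "expensive", "refund"}

# Hypothetical social media posts mentioning a product.
posts = [
    "The new phone is so slow, I want a refund",
    "Battery is great but the app is slow",
    "Too expensive for what it does",
]

def negative_keyword_counts(texts, keywords):
    """Count occurrences of each keyword across all texts (case-insensitive)."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w in keywords)
    return counts

counts = negative_keyword_counts(posts, NEGATIVE_KEYWORDS)
top_keyword, top_count = counts.most_common(1)[0]
```

With these sample posts the most frequent complaint is "slow", which hints at a possible Why behind weak sales that the structured sales figures alone would not show.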
