Information about Data Modeling
In computer science, data modeling is the process of creating a data model instance by applying a data model theory. A data model theory is a formal description of how data may be structured and used. See database model for a list of current data model theories.
Data modeling structures and organizes data. These data structures are then typically implemented in a database management system. In addition to defining and organizing the data, data modeling imposes constraints or limitations, implicitly or explicitly, on the data placed within the structure.
Managing large quantities of structured and unstructured data is a primary function of information systems. Data models describe structured data for storage in data management systems such as relational databases. They typically do not describe unstructured data, such as word processing documents, email messages, pictures, digital audio, and video.
Data model
A data model instance may be one of three kinds (according to ANSI in 1975 [1]):
- a conceptual schema (data model) describes the semantics of an organization. This consists of entity classes (representing things of significance to the organization) and relationships (assertions about associations between pairs of entity classes).
- a logical schema (data model) describes the semantics, as represented by a particular data manipulation technology. This consists of descriptions of tables and columns, object oriented classes, and XML tags, among other things.
- a physical schema (data model) describes the physical means by which data are stored. This is concerned with partitions, CPUs, tablespaces, and the like.
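As a rough illustration of the logical level, the sketch below expresses a small conceptual model (an order and its line items) as relational tables and columns, using Python's standard sqlite3 module. The table names, column names, and sample rows are hypothetical and chosen only for this example; they do not come from the ANSI report.

```python
import sqlite3

# Hypothetical logical schema for the conceptual entity classes
# ORDER and LINE ITEM. All names here are illustrative assumptions.
DDL = """
CREATE TABLE customer_order (
    order_id  INTEGER PRIMARY KEY,
    placed_on TEXT NOT NULL
);
CREATE TABLE line_item (
    line_item_id INTEGER PRIMARY KEY,
    order_id     INTEGER NOT NULL REFERENCES customer_order(order_id),
    description  TEXT NOT NULL,
    quantity     INTEGER NOT NULL CHECK (quantity > 0)
);
"""


def build_schema():
    """Create an in-memory database holding the two tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(DDL)
    return conn


def try_insert(conn, sql):
    """Return True if the statement is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_schema()
conn.execute("INSERT INTO customer_order VALUES (1, '2007-01-15')")
conn.execute("INSERT INTO line_item VALUES (1, 1, 'widget', 3)")
```

The schema makes some constraints explicit: a line item cannot refer to a missing order, and a quantity must be positive. It does not express the rule that an order needs at least one line item, which would have to be enforced elsewhere.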
In an alternative framework, called the Zachman Framework, a data model instance may be one of six kinds (according to John Zachman, 1987):
- a conceptual data model (schema) consists of entity classes (representing things of significance to the organization).
- a contextual data model (schema) describes the semantics of an organization. This consists of relationships (assertions about associations between pairs of entity classes).
- a logical data model (schema) describes the semantics, as represented by a particular data manipulation technology. This consists of descriptions of tables and columns, object oriented classes, and XML tags, among other things.
- a physical data model (schema) describes the physical means by which data are stored. This is concerned with partitions, CPUs, tablespaces, and the like.
- a data definition is the actual coding of the database schema in the chosen development platform.
- a data manipulation describes the operations applied to the data in the schema.
Data structure
A data model describes the structure of the data within a given domain and, by implication, the underlying structure of that domain itself. This means that a data model in fact specifies a dedicated grammar for a dedicated artificial language for that domain.
A data model represents classes of entities (kinds of things) about which a company wishes to hold information, the attributes of that information, and relationships among those entities and (often implicit) relationships among those attributes. The model describes the organization of the data to some extent irrespective of how data might be represented in a computer system.
The entities represented by a data model can be tangible entities, but models that include such concrete entity classes tend to change over time. Robust data models often identify abstractions of such entities. For example, a data model might include an entity class called "Person", representing all the people who interact with an organization. Such an abstract entity class is typically more appropriate than ones called "Vendor" or "Employee", which identify specific roles played by those people.
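One possible way to picture the "Person" abstraction is sketched below: a single entity class holds the person, and "Vendor" or "Employee" become roles attached to it rather than separate entity classes. The class, field, and role names are assumptions made for this example only.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    """Abstract entity class: anyone who interacts with the organization."""
    person_id: int
    name: str
    # Roles are data about the person, not separate entity classes.
    roles: set = field(default_factory=set)

    def add_role(self, role):
        self.roles.add(role)


# The same person can play several roles without being stored twice.
alice = Person(1, "Alice")
alice.add_role("Employee")
alice.add_role("Vendor")
```

If the organization later introduces a new role, only the role data changes; the entity class itself stays the same.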
A proper conceptual data model describes the semantics of a subject area. It is a collection of assertions about the nature of the information that is used by one or more organizations. Proper entity classes are named with natural language words instead of technical jargon. Likewise, properly named relationships form concrete assertions about the subject area.
There are several conventions for this. For example, a relationship called "is composed of" that is defined to operate on entity classes ORDER and LINE ITEM forms the following concrete assertion: "Each ORDER is composed of one or more LINE ITEMS." A more rigorous approach is to force all relationship names to be prepositions, gerunds, or participles, with verbs being simply "must be" or "may be". This way, both cardinality and optionality can be handled semantically. The relationship just cited would then read in one direction, "Each ORDER may be composed of one or more LINE ITEMS" and in the other, "Each LINE ITEM must be part of one and only one ORDER."
This illustrates that generic terms, such as 'is composed of', are often defined to be limited to a relationship between specific kinds of things, such as an order and an order line. This constraint is eliminated in the generic data modeling methodologies.
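The stricter naming convention described above can be sketched as a small sentence generator, in which optionality selects "may be" or "must be" and cardinality selects "one or more" or "one and only one". The class and field names are hypothetical; only the two resulting sentences are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationshipEnd:
    """One reading direction of a relationship (illustrative names only)."""
    subject: str        # entity class the sentence starts from
    optional: bool      # True -> "may be", False -> "must be"
    phrase: str         # participle or preposition, e.g. "composed of"
    many: bool          # True -> "one or more", False -> "one and only one"
    target: str         # singular name of the other entity class
    target_plural: str  # plural name of the other entity class

    def sentence(self):
        verb = "may be" if self.optional else "must be"
        if self.many:
            return (f"Each {self.subject} {verb} {self.phrase} "
                    f"one or more {self.target_plural}.")
        return (f"Each {self.subject} {verb} {self.phrase} "
                f"one and only one {self.target}.")


order_to_items = RelationshipEnd(
    "ORDER", True, "composed of", True, "LINE ITEM", "LINE ITEMS")
item_to_order = RelationshipEnd(
    "LINE ITEM", False, "part of", False, "ORDER", "ORDERS")
```

Calling `order_to_items.sentence()` and `item_to_order.sentence()` yields the two assertions quoted in the paragraph above.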
Generic data modeling
Different modelers may well produce different models of the same domain. This can lead to difficulty in bringing the models of different people together. Invariably, however, this difference is attributable to different levels of abstraction in the models. If the modelers agree on certain elements which are to be rendered more concretely, then the differences become less significant.
There are generic patterns that can be used to advantage for modeling business. These include the concepts PARTY (with included PERSON and ORGANIZATION), PRODUCT TYPE, PRODUCT INSTANCE, ACTIVITY TYPE, ACTIVITY INSTANCE, CONTRACT, GEOGRAPHIC AREA, and SITE. A model which explicitly includes versions of these entity classes will be both reasonably robust and reasonably easy to understand.
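A minimal sketch of the PARTY pattern follows, assuming that PERSON and ORGANIZATION are modeled as subtypes of PARTY and that a CONTRACT refers to parties rather than to either subtype directly. The attribute names and the buyer/seller roles are illustrative assumptions, not part of the pattern as stated above.

```python
from dataclasses import dataclass


@dataclass
class Party:
    """Generic PARTY: anything that can take part in a contract."""
    party_id: int
    name: str


@dataclass
class Person(Party):
    """PERSON as a kind of PARTY."""


@dataclass
class Organization(Party):
    """ORGANIZATION as a kind of PARTY."""


@dataclass
class Contract:
    """CONTRACT between two parties; either side may be a person or an organization."""
    contract_id: int
    buyer: Party
    seller: Party


acme = Organization(1, "Acme Ltd")
jane = Person(2, "Jane Doe")
contract = Contract(100, buyer=jane, seller=acme)
```

Because the contract refers to PARTY, the same structure covers person-to-organization, organization-to-organization, and person-to-person agreements without change.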
More abstract models are suitable for general purpose tools, and consist of variations on THING and THING TYPE, with all actual data being instances of these. Such abstract models are significantly more difficult to manage, since they are not very expressive of real world things. More concrete and specific data models will risk having to change as the environment changes.
One approach to generic data modeling has the following characteristics:
- A generic data model shall consist of generic entity types, such as 'individual thing', 'class', 'relationship', and possibly a number of their subtypes.
- Every individual thing is an instance of a generic entity called 'individual thing' or one of its subtypes.
- Every individual thing is explicitly classified by a kind of thing ('class') using an explicit classification relationship.
- The classes used for that classification are separately defined as standard instances of the entity 'class' or one of its subtypes, such as 'class of relationship'. These standard classes are usually called 'reference data'. This means that domain specific knowledge is captured in those standard instances and not as entity types. For example, concepts such as car, wheel, building, ship, and also temperature, length, etc. are standard instances. But also standard types of relationship, such as 'is composed of' and 'is involved in' can be defined as standard instances.
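The characteristics listed above can be sketched roughly as follows. Domain concepts such as car and wheel appear as instances of a generic 'class' entity (the reference data), and each individual thing is tied to its class by an explicit classification record. All Python names are hypothetical and chosen for this illustration; this is not the structure of any particular standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThingClass:
    """Generic entity type 'class': a kind of thing."""
    name: str


@dataclass(frozen=True)
class IndividualThing:
    """Generic entity type 'individual thing'."""
    identifier: int  # artificial local identifier
    label: str


@dataclass(frozen=True)
class Classification:
    """Explicit classification relationship: individual -> class."""
    individual: IndividualThing
    classifier: ThingClass


# Reference data: domain knowledge lives in instances, not in entity types.
CAR = ThingClass("car")
WHEEL = ThingClass("wheel")

my_car = IndividualThing(1, "my car")
front_left = IndividualThing(2, "front left wheel")

classifications = [
    Classification(my_car, CAR),
    Classification(front_left, WHEEL),
]


def classes_of(thing, records):
    """Return the names of all classes that classify the given individual."""
    return [r.classifier.name for r in records if r.individual == thing]
```

Adding a new concept such as "ship" then means adding one more `ThingClass` instance rather than a new entity type.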
A generic data model obeys the following rules:
- Candidate attributes are treated as representing relationships to other entity types.
- Entity types are represented, and are named after, the underlying nature of a thing, not the role it plays in a particular context. Entity types are chosen.
- Entities have a local identifier within a database or exchange file. These should be artificial and managed to be unique. Relationships are not used as part of the local identifier.
- Activities, relationships and event-effects are represented by entity types (not attributes).
- Entity types are part of a sub-type/super-type hierarchy of entity types, in order to define a universal context for the model. As types of relationships are also entity types, they are also arranged in a sub-type/super-type hierarchy of types of relationship.
- Types of relationships are defined on a high (generic) level, being the highest level where the type of relationship is still valid. For example, a composition relationship (indicated by the phrase: 'is composed of') is defined as a relationship between an 'individual thing' and another 'individual thing' (and not just between e.g. an order and an order line). This generic level means that the type of relation may in principle be applied between any individual thing and any other individual thing. Additional constraints are defined in the 'reference data', being standard instances of relationships between kinds of things.
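The last rule can be illustrated with a small sketch. The relationship type 'is composed of' is defined once between any two individual things, and the narrower constraints are held as reference data, here a set of permitted kind pairs. The names and the sample reference data are assumptions for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Individual:
    """An individual thing with an artificial identifier and its kind."""
    identifier: int
    kind: str


@dataclass(frozen=True)
class Relationship:
    """A relationship is itself an entity, not an attribute."""
    rel_type: str
    left: Individual
    right: Individual


# Hypothetical reference data: which kinds of things a relationship type
# may connect. The relationship type itself stays fully generic.
ALLOWED = {
    ("is composed of", "order", "order line"),
    ("is composed of", "car", "wheel"),
}


def is_allowed(rel):
    """Check a relationship instance against the reference data."""
    return (rel.rel_type, rel.left.kind, rel.right.kind) in ALLOWED


order = Individual(1, "order")
line = Individual(2, "order line")
wheel = Individual(3, "wheel")

valid = Relationship("is composed of", order, line)
invalid = Relationship("is composed of", wheel, order)
```

Here `is_allowed(valid)` holds while `is_allowed(invalid)` does not, although both use the same generic relationship type.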
Examples of generic data models are ISO 10303-221, ISO 15926, and Gellish.
Data organization
Another kind of data model describes how to organize data using a database management system or other data management technology. It describes, for example, relational tables and columns or object-oriented classes and attributes. Such a data model is sometimes referred to as the physical data model, but in the original ANSI three-schema architecture, it is called "logical". In that architecture, the physical model describes the storage media (cylinders, tracks, and tablespaces). Ideally, this model is derived from the more conceptual data model described above. It may differ, however, to account for constraints like processing capacity and usage patterns.
While data analysis is a common term for data modeling, the activity actually has more in common with the ideas and methods of synthesis (inferring general concepts from particular instances) than it does with analysis (identifying component concepts from more general ones). (Presumably we call ourselves systems analysts because no one can say systems synthesists.) Data modeling strives to bring the data structures of interest together into a cohesive, inseparable whole by eliminating unnecessary data redundancies and by relating data structures with relationships.
A different approach is through the use of adaptive systems such as artificial neural networks that can autonomously create implicit models of data.
Techniques
Several techniques have been developed for the design of data models. While these methodologies guide data modelers in their work, two different people using the same methodology will often come up with very different results. Most notable are:
- Entity-relationship model
- IDEF
- Object Role Modeling (ORM) or Nijssen's Information Analysis Method (NIAM)
- Business rules or business rules approach
- RM/T
- Bachman diagrams
- Object-relational mapping
- Barker's Notation
- EBNF Grammars
External links
- Article Database Modelling in UML from Methods & Tools
- Data Modelling Dictionary
- Data modeling articles
- Notes on System Development, Methodologies and Modeling by Tony Drewry
- Request For Proposal - Information Management Metamodel (IMM) of the Object Management Group, which includes, e.g., a UML2 Profile for Relational Data Modeling
- Data Modeling 101
- Agile/Evolutionary Data Modeling
- David C. Hay. 1996. Data Model Patterns: Conventions of Thought. New York:Dorset House Publishers, Inc.
- Developing High Quality Data Models
- Generic Data Modeling
- The Gellish Dictionary and documents about Gellish: http://sourceforge.net/projects/gellish
- "A Practical Approach to Database Design" in Relational Database: Selected Writings by C. J. Date (1986)
- American National Standards Institute. 1975. “ANSI/X3/SPARC Study Group on Data Base Management Systems; Interim Report”. FDT (Bulletin of ACM SIGMOD) 7:2.
This article is copied from an article on Wikipedia.org - the free encyclopedia created and edited by online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of the wikipedia encyclopedia articles provide accurate and timely information please do not assume the accuracy of any particular article. This article is distributed under the terms of GNU Free Documentation License.
