Assignment:
1. Describe three traditional techniques for collecting information during analysis. When might one be better than another?
2. What are the general guidelines for collecting data through observing workers?
3. What is the degree of a relationship? Give an example of each of the relationship degrees illustrated in this chapter.
Please make sure the assignment follows APA format. Citations and references are also important factors in getting a good grade on the assignment. The deadline for this assignment is Thursday, no later than 11:59 PM IST.
For any questions and concerns please do not hesitate to contact me.
Chapter Objectives
After studying this chapter, you should be able to:
• Concisely define each of the following key data-modeling terms: conceptual data model, entity-relationship diagram, entity type, entity instance, attribute, candidate key, multivalued attribute, relationship, degree, cardinality, and associative entity.
• Ask the right kinds of questions to determine data requirements for an information system.
• Draw an entity-relationship (E-R) diagram to represent common business situations.
• Explain the role of conceptual data modeling in the overall analysis and design of an information system.
• Distinguish between unary, binary and ternary relationships, and give an example of each.
• Distinguish between a relationship and an associative entity, and use associative entities in a data model when appropriate.
• Relate data modeling to process and logic modeling as different ways of describing an information system.
• Generate at least three alternative design strategies for an information system.
• Select the best design strategy using both qualitative and quantitative methods.
Chapter Preview …
In Chapter 6 you learned how to model and analyze the flow of data (data in motion) between manual or automated steps and how to show data stores (data at rest) in a data-flow diagram. Data-flow diagrams show how, where, and when data are used or changed in an information system, but they do not show the definition, structure, and relationships within the data. Data modeling, the subject of this chapter, develops this missing, and crucial, piece of the description of an information system. Systems analysts perform data modeling during the systems analysis phase, as highlighted in Figure 7-1.
Data modeling is typically done at the same time as other requirements structuring steps. Many systems developers believe that a data model is the most important part of the information system requirements statement for four reasons. First, the characteristics of data captured during data modeling are crucial in the design of databases, programs, computer screens, and printed reports. For example, facts such as these—a data element is numeric, a product can be in only one product line at a time, a line item on a customer order can never be moved to another customer order—are all essential in ensuring an information system’s data integrity.
FIGURE 7-1 Systems analysts perform data modeling during the systems analysis phase. Data modeling typically occurs in parallel with other requirements structuring steps.
Second, data rather than processes are the most complex aspects of many modern information systems. For example, transaction processing systems can have considerable complexity in validating data, reconciling errors, and coordinating the movement of data to various databases.
Management information systems (such as sales tracking), decision support systems (such as short-term cash investment), and executive support systems (such as product planning) are data intensive and require extracting data from various data sources. Third, the characteristics about data (such as format and relationships with other data) are rather permanent. In contrast, who receives which data, the format of reports, and what reports are used change constantly over time. A data model explains the inherent nature of the organization, not its transient form.
So, an information system design based on data, rather than processes or logic, should have a longer useful life. Finally, structural information about data is essential to generate programs automatically. For example, the fact that a customer order has many line items as opposed to just one affects the automatic design of a computer form in Microsoft Access for entry of customer orders. In this chapter, we discuss the key concepts of data modeling, including the most common format used for data modeling—entity-relationship (E-R) diagramming.
During the systems analysis phase of the SDLC, you use data-flow diagrams to show data in motion and E-R diagrams to show the relationships among data objects. We also illustrate E-R diagrams drawn using Microsoft’s Visio tool, highlighting this tool’s capabilities and limitations. You have now reached the point in the analysis phase where you are ready to transform all of the information you have gathered and structured into some concrete ideas about the design for the new or replacement information system. This aspect is called the design strategy. From requirements determination, you know what the current system does.
You also know what the users would like the replacement system to do. From requirements structuring, you know what forms the replacement system’s process flow and data should take, at a logical level independent of any physical implementation. To bring analysis to a conclusion, your job is to take these structured requirements and transform them into several alternative design strategies. One of these strategies will be pursued in the design phase of the life cycle. In this chapter, you learn why you need to come up with alternative design strategies and about guidelines for generating alternatives.
You then learn the different issues that must be addressed for each alternative. Once you have generated your alternatives, you will have to choose the best design strategy to pursue. We include a discussion of one technique that analysts and users often use to help them agree on the best approach for the new information system.
Conceptual Data Modeling
A conceptual data model is a representation of organizational data. The purpose of a conceptual data model is to show as many rules about the meaning and interrelationships among data as possible.
Conceptual data model: A detailed model that shows the overall structure of organizational data while being independent of any database management system or other implementation considerations.
Entity-relationship (E-R) data models are commonly used diagrams that show how data are organized in an information system. The main goal of conceptual data modeling is to create accurate E-R diagrams. As a systems analyst, you typically do conceptual data modeling at the same time as other requirements analysis and structuring steps during systems analysis.
You can use methods such as interviewing, questionnaires, and JAD sessions to collect information for conceptual data modeling. On larger systems development teams, a subset of the project team concentrates on data modeling while other team members focus attention on process or logic modeling. You develop (or use from prior systems development) a conceptual data model for the current system and build a conceptual data model that supports the scope and requirements for the proposed or enhanced system. The work of all team members is coordinated and shared through the project dictionary or repository.
As discussed in Chapter 3, this repository and associated diagrams may be maintained by a CASE tool or a specialized tool such as Microsoft’s Visio. Whether automated or manual, the process flow, decision logic, and data-model descriptions of a system must be consistent and complete, because each describes different but complementary views of the same information system. For example, the names of data stores on primitive-level DFDs often correspond to the names of data entities in entity-relationship diagrams, and the data elements in data flows on DFDs must be attributes of entities and relationships in entity-relationship diagrams.
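The consistency rule in the preceding paragraph can be pictured as a simple cross-check between the two models. The following is a minimal sketch, assuming the repository can be read as plain Python dictionaries; the entity, data-store, and data-flow names are hypothetical, and for brevity the sketch checks only attributes of entities, not attributes of relationships. Because data-store names only often (not always) match entity names, a mismatch is a prompt for review rather than proof of an error.

```python
# Hypothetical repository contents; all names are illustrative assumptions.
entities = {
    "CUSTOMER": {"Customer_No", "Name", "Address"},
    "ORDER": {"Order_No", "Order_Date", "Promised_Date"},
}
data_stores = ["CUSTOMER", "ORDER", "INVOICE"]
data_flows = {"Order details": {"Order_No", "Order_Date", "Ship_Method"}}


def unmatched_data_stores(data_stores, entities):
    """Return data-store names that have no same-named entity in the data model."""
    return [name for name in data_stores if name not in entities]


def unmatched_flow_elements(data_flows, entities):
    """Return, per data flow, the elements that are not attributes of any entity."""
    known = set()
    for attributes in entities.values():
        known |= attributes
    report = {}
    for flow, elements in data_flows.items():
        missing = sorted(elements - known)
        if missing:
            report[flow] = missing
    return report


print(unmatched_data_stores(data_stores, entities))
print(unmatched_flow_elements(data_flows, entities))
```

In this sample, the INVOICE data store and the Ship_Method data element would be flagged for the analyst to reconcile.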
The Process of Conceptual Data Modeling
You typically begin conceptual data modeling by developing a data model for the system being replaced, if a system exists. This phase is essential for planning the conversion of the current files or database into the database of the new system. Further, it is a good, but not a perfect, starting point for your understanding of the new system’s data requirements. Then, you build a new conceptual data model that includes all of the data requirements for the new system. You discovered these requirements from the fact-finding methods used during requirements determination.
Today, given the popularity of prototyping and other rapid development methodologies, these requirements often evolve through various iterations of a prototype, so the data model is constantly changing. Conceptual data modeling is only one kind of data modeling and database design activity done throughout the systems development process. Figure 7-2 shows the different kinds of data modeling and database design that occur during the systems development life cycle. The conceptual data-modeling methods we discuss in this chapter are suitable for various tasks in the planning and analysis phases.
These phases of the SDLC address issues of system scope, general requirements, and content. An E-R data model evolves from project identification and selection through analysis as it becomes more specific and is validated by more detailed analysis of system needs. In the design phase, the final E-R model developed in analysis is matched with designs for systems inputs and outputs and is translated into a format that enables physical data storage decisions. During physical design, specific data storage architectures are selected, and then, in implementation, files and databases are defined as the system is coded.
Through the use of the project repository, a field in a physical data record can, for example, be traced back to the conceptual data attribute that represents it on an E-R diagram. Thus, the data modeling and design steps in each of the SDLC phases are linked through the project repository.
Deliverables and Outcomes
Most organizations today do conceptual data modeling using entity-relationship modeling, which uses a special notation of rectangles, diamonds, and lines to represent as much meaning about data as possible.
Thus, the primary deliverable from the conceptual data-modeling step within the analysis phase is an entity-relationship (E-R) diagram. A sample E-R diagram appears in Figure 7-3(A). This figure shows the major categories of data (rectangles in the diagram) and the business relationships between them (lines connecting rectangles). For example, Figure 7-3(A) describes that, for the business represented, a SUPPLIER sometimes supplies ITEMs to the company, and an ITEM is always supplied by one to four SUPPLIERS.
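The cardinality rules just read from Figure 7-3(A) can also be written as a small check. The following is a minimal sketch, assuming each instance of the Supplies relationship is stored as a (supplier, item) pair; the supplier names, item names, and function name are hypothetical.

```python
# Cardinality limits read from Figure 7-3(A): each ITEM has one to four SUPPLIERs.
MIN_SUPPLIERS_PER_ITEM = 1
MAX_SUPPLIERS_PER_ITEM = 4


def items_violating_cardinality(items, supplies):
    """Return the items whose supplier count falls outside the allowed range.

    items    -- iterable of item names (hypothetical sample data)
    supplies -- iterable of (supplier, item) pairs, one per Supplies instance
    """
    counts = {item: 0 for item in items}
    for _supplier, item in set(supplies):  # set() ignores duplicate pairs
        counts[item] = counts.get(item, 0) + 1
    return sorted(
        item
        for item, n in counts.items()
        if not MIN_SUPPLIERS_PER_ITEM <= n <= MAX_SUPPLIERS_PER_ITEM
    )


# Hypothetical instances. Supplier S3 supplies nothing, which the model allows.
suppliers = ["S1", "S2", "S3"]
items = ["bolt", "nut"]
supplies = [("S1", "bolt"), ("S2", "bolt")]

print(items_violating_cardinality(items, supplies))  # "nut" has no supplier yet
```

Note that the check runs only in the ITEM direction: no rule is applied to suppliers, because a SUPPLIER may exist without supplying anything.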
The fact that a supplier only sometimes supplies items implies that the business wants to keep track of some suppliers without designating what they can supply. This diagram includes two names on each line, giving you explicit language to read a relationship in each direction. For simplicity, we will not typically include two names on lines in E-R diagrams in this book; however, many organizations use this standard.
FIGURE 7-2 Relationship between Data Modeling and the Systems Development Life Cycle
E-R diagrams are commonly developed using CASE tools or other smart drawing packages. These tools provide functions to facilitate consistency of data models across different systems development phases, reverse engineer an existing database definition into an E-R diagram, and provide documentation of objects on a diagram. One popular tool is Microsoft Visio. Figure 7-3(B) shows the equivalent of Figure 7-3(A) using Visio. This diagram is developed using the Database Model Diagram tool. The Database|Options|Document settings are specified as relational symbol set, conceptual names on the diagram, optionality is shown, and relationships are shown using the crow’s foot notation with forward and inverse relationship names.
These settings cause Visio to draw an E-R diagram that most closely resembles the standards used in this text.
FIGURE 7-3 Sample Conceptual Data Model Diagrams (A) Standard E-R Notation
Some key differences distinguish the standard E-R notation illustrated in Figure 7-3(A) from the notation used in Visio, including:
• Relationships such as Supplies/Supplied by between SUPPLIER and ITEM in Figure 7-3(A) require an intermediate category of data (called SUPPLIED ITEM in Figure 7-3(B)) because Visio does not support representing these so-called many-to-many relationships. Relationships may be named in both directions, but these names appear on a text box on the relationship line, separated by a forward slash.
• Limitations, such as an ITEM is always supplied by at most four SUPPLIERS, are not shown on the diagram but rather are documented in the Miscellaneous set of Database Properties of the relationship, which are part of Visio’s version of a CASE repository.
• The symbol for each category of data (e.g., SHIPMENT) includes space for listing other properties of each data category (such as all the attributes or columns of data we know about that data category); we will illustrate these components later in this chapter.
We concentrate on the traditional E-R diagramming notation in this chapter; however, we will include the equivalent Visio version on several occasions so you can see how to show data-modeling concepts in this popular database design tool.
FIGURE 7-3 Sample Conceptual Data Model Diagrams (B) Visio E-R Notation
As many as four E-R diagrams may be produced and analyzed during conceptual data modeling:
1. An E-R diagram that covers just the data needed in the project’s application. (This first diagram allows you to concentrate on the data requirements without being constrained or confused by unnecessary details.)
2. An E-R diagram for the application system being replaced. (Differences between this diagram and the first show what changes you have to make to convert databases to the new application.) This version is, of course, not produced if the proposed system supports a completely new business function.
3. An E-R diagram for the whole database from which the new application’s data are extracted. (Because many applications share the same database or even several databases, this and the first diagram show how the new application shares the contents of more widely used databases.)
4. An E-R diagram for the whole database from which data for the application system being replaced is drawn. (Again, differences between this diagram and the third show what global database changes you have to make to implement the new application.)
Even if no system is being replaced, an understanding of the existing data systems is necessary to see where the new data will fit in or if existing data structures must change to accommodate new data.
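The comparisons between diagrams described in the list above amount to finding the differences between two data models. The following is a rough sketch of that idea, assuming each model is reduced to a dictionary of entity names and attribute sets (relationships are left out for brevity); the attribute names such as Rating and Shipment_No are hypothetical.

```python
def model_differences(current, proposed):
    """Compare two data models given as {entity_name: set_of_attribute_names}."""
    changed = {}
    for name in sorted(set(current) & set(proposed)):
        added = sorted(proposed[name] - current[name])
        removed = sorted(current[name] - proposed[name])
        if added or removed:
            changed[name] = {"added": added, "removed": removed}
    return {
        "added_entities": sorted(set(proposed) - set(current)),
        "dropped_entities": sorted(set(current) - set(proposed)),
        "changed_entities": changed,
    }


# Hypothetical models for the system being replaced and for the new application.
current_model = {
    "SUPPLIER": {"Supplier_ID", "Name"},
    "ITEM": {"Item_No", "Description"},
}
proposed_model = {
    "SUPPLIER": {"Supplier_ID", "Name", "Rating"},
    "ITEM": {"Item_No", "Description"},
    "SHIPMENT": {"Shipment_No"},
}

differences = model_differences(current_model, proposed_model)
print(differences)
```

A report like this only lists candidate conversion work; deciding how existing data are migrated into the new structures remains an analysis task.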
The other deliverable from conceptual data modeling is a set of entries about data objects to be stored in the project dictionary or repository. The repository is the mechanism to link data, process, and logic models of an information system. For example, explicit links can be shown between a data model and a data-flow diagram. Some important links are briefly explained here.
• Data elements included in data flows also appear in the data model, and vice versa. You must include in the data model any raw data captured and retained in a data store. The data model can include only data that have been captured or are computed from captured data. Because a data model is a general business picture of data, both manual and automated data stores will be included.
• Each data store in a process model must relate to business objects (what we call data entities) represented in the data model. For example, in Figure 6-5, the Inventory File data store must correspond to one or several data objects on a data model.
Gathering Information for Conceptual Data Modeling
Requirements determination methods must include questions and investigations that take a data focus rather than only a process and logic focus. For example, during interviews with potential system users, you must ask specific questions to gain the perspective on data needed to develop a data model. In later sections of this chapter, we introduce some specific terminology and constructs used in data modeling. Even without this specific data-modeling language, you can begin to understand the kinds of questions that must be answered during requirements determination. These questions relate to understanding the rules and policies by which the area supported by the new information system operates.
That is, a data model explains what the organization does and what rules govern how work is performed in the organization. You do not, however, need to know how or when data are processed or used to do data modeling. You typically do data modeling from a combination of perspectives. The first perspective is called the top-down approach. It derives the data model from an intimate understanding of the nature of the business, rather than from any specific information requirements in computer displays, reports, or business forms.
Table 7-1 summarizes key questions to ask system users and business managers so that you can develop an accurate and complete data model. The questions are purposely posed in business terms. Of course, technical terms do not mean much to a business manager, so you must learn how to frame your questions in business terms. Alternatively, you can gather the information for data modeling by reviewing specific business documents—computer displays, reports, and business forms—handled within the system. This second perspective of gaining an understanding of data is often called a bottom-up approach.
These business documents will appear as data flows on DFDs and will show the data processed by the system, which probably are the data that must be maintained in the system’s database. Consider, for example, Figure 7-4, which shows a customer order form used at Pine Valley Furniture. From the form in Figure 7-4, we determine that the following data must be kept in the database:
ORDER NO, ORDER DATE, PROMISED DATE, PRODUCT NO, DESCRIPTION, QUANTITY ORDERED, UNIT PRICE, CUSTOMER NO, NAME, ADDRESS, CITY-STATE-ZIP
TABLE 7-1: Questions to Ask to Develop Accurate and Complete Data Models
Category of Questions | Questions to Ask System Users and Business Managers
1. Data entities and their descriptions | What are the subjects/objects of the business? What types of people, places, things, and materials are used or interact in this business about which data must be maintained? How many instances of each object might exist?
2. Candidate key | What unique characteristic(s) distinguishes each object from other objects of the same type? Could this distinguishing feature change over time or is it permanent? Could this characteristic of an object be missing even though we know the object exists?
3. Attributes and secondary keys | What characteristic describes each object? On what basis are objects referenced, selected, qualified, sorted, and categorized? What must we know about each object in order to run the business?
4. Security controls and understanding who really knows the meaning of data | How do you use these data? That is, are you the source of the data for the organization, do you refer to the data, do you modify them, and do you destroy them? Who is not permitted to use these data? Who is responsible for establishing legitimate values for these data?
5. Cardinality and time dimensions of data | Over what period of time are you interested in these data? Do you need historical trends, current “snapshot” values, and/or estimates or projections? If a characteristic of an object changes over time, must you know the obsolete values?
6. Relationships and their cardinality and degrees | What events occur that imply associations between various objects? What natural activities or transactions of the business involve handling data about several objects of the same or different type?
7. Integrity rules, minimum and maximum cardinality, time dimensions of data | Is each activity or event always handled the same way, or are there special circumstances? Can an event occur with only some of the associated objects, or must all objects be involved? Can the associations between objects change over time (e.g., employees change departments)? Are values for data characteristics limited in any way?
FIGURE 7-4 Customer Order Form Used at Pine Valley Furniture
We also see that each order is from one customer, and an order can have multiple line items, each for one product. We use this kind of understanding of an organization’s operation to develop data models.
WWW NET SEARCH Investigate the origins and variations of the entity-relationship notation.
Visit http://www.pearsonhighered.com/valacich to complete an exercise related to this topic.
Introduction to Entity-Relationship Modeling
The basic entity-relationship modeling notation uses three main constructs: data entities, relationships, and their associated attributes. Several different E-R notations exist, and many CASE tools support multiple notations. For simplicity, we have adopted one common notation for this book, the so-called crow’s foot notation. If you use another notation in courses or work, you should be able to easily translate between notations.
An entity-relationship diagram (or E-R diagram) is a detailed, logical, and graphical representation of the data for an organization or business area. The E-R diagram is a model of entities in the business environment, the relationships or associations among those entities, and the attributes or properties of both the entities and their relationships. A rectangle is used to represent an entity, and lines are used to represent the relationship between two or more entities. The notation for E-R diagrams appears in Figure 7-5.
Entity-relationship diagram (E-R diagram): A detailed, logical, and graphical representation of the entities, associations, and data elements for an organization or business area.
Entities
FIGURE 7-5 Entity-Relationship Diagram Notations: Basic Symbols, Relationship Degree, and Relationship Cardinality
Entity: A person, place, object, event, or concept in the user environment about which the organization wishes to maintain data.
An entity is a person, place, object, event, or concept in the user environment about which the organization wishes to maintain data. As noted in Table 7-1, the first requirements determination question an analyst should ask concerns data entities. An entity has its own identity, which distinguishes it from every other entity.
Some examples of entities follow:
• Person: EMPLOYEE, STUDENT, PATIENT
• Place: STATE, REGION, COUNTRY, BRANCH
• Object: MACHINE, BUILDING, AUTOMOBILE, PRODUCT
• Event: SALE, REGISTRATION, RENEWAL
• Concept: ACCOUNT, COURSE, WORK CENTER
You need to recognize an important distinction between entity types and entity instances. An entity type is a collection of entities that share common properties or characteristics. Each entity type in an E-R model is given a name. Because the name represents a set of entities, it is singular. Also, because an entity is an object, we use a simple noun to name an entity type.
We use capital letters in naming an entity type, and in an E-R diagram, the name is placed inside a rectangle representing the entity.
Entity type: A collection of entities that share common properties or characteristics.
An entity instance (or instance) is a single occurrence of an entity type. An entity type is described just once in a data model, whereas many instances of that entity type may be represented by data stored in the database. For example, most organizations have one EMPLOYEE entity type, but hundreds (or even thousands) of instances of this entity type may be stored in the database.
Entity instance (instance): A single occurrence of an entity type.
A common mistake made in learning to draw E-R diagrams, especially if you already know how to do data-flow diagramming, is to confuse data entities with sources/sinks, system outputs, or system users, and to confuse relationships with data flows. A simple rule to avoid such confusion is that a true data entity will have many possible instances, each with a distinguishing characteristic, as well as one or more other descriptive pieces of data. Consider the following entity types that might be associated with a church expense system:
In this situation, the church treasurer manages accounts and records expense transactions against each account. However, do we need to keep track of data about the treasurer and her supervision of accounts as part of this accounting system? The treasurer is the person entering data about accounts and expenses and making inquiries about account balances and expense transactions by category. Because the system includes only one treasurer, TREASURER data do not need to be kept. On the other hand, if each account has an account manager (e.g., a church committee chair) who is responsible for assigned accounts, then we may wish to have an ACCOUNT MANAGER entity type, with pertinent attributes as well as relationships to other entity types. In this same situation, is an expense report an entity type? Because an expense report is computed from expense transactions and account balances, it is a data flow, not an entity type. Even though multiple instances of expense reports will occur over time, the report contents are already represented by the ACCOUNT and EXPENSE entity types. Often when we refer to entity types in subsequent sections, we simply say entity.
This shorthand reference is common among data modelers. When we mean a single occurrence rather than the type, we will say so explicitly by using the term entity instance.
Attributes
Each entity type has a set of attributes associated with it. An attribute is a property or characteristic of an entity that is of interest to the organization (relationships may also have attributes, as we see in the section on relationships). Asking about attributes is the third question noted in Table 7-1. Following are some typical entity types and associated attributes:
STUDENT: Student_ID, Student_Name, Address, Phone_Number, Major
AUTOMOBILE: Vehicle_ID, Color, Weight, Horsepower
EMPLOYEE: Employee_ID, Employee_Name, Address, Skill
Attribute: A named property or characteristic of an entity that is of interest to the organization.
We use nouns with an initial capital letter followed by lowercase letters in naming an attribute. In E-R diagrams, we represent an attribute by placing its name inside the rectangle that represents the associated entity. In many E-R drawing tools, such as Microsoft Visio, attributes are listed within the entity rectangle under the entity name.
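The distinction between an entity type, its attributes, and its instances can be pictured in code. The following is a minimal sketch, assuming a Python dataclass stands in for the STUDENT entity type; the field names are lowercase renderings of the attributes listed above, and the two student records are invented sample data.

```python
from dataclasses import dataclass, fields


@dataclass
class Student:
    """Stands in for the STUDENT entity type: described once, with its attributes."""
    student_id: str
    student_name: str
    address: str
    phone_number: str
    major: str


# Entity instances: many occurrences of the one entity type (invented sample data).
students = [
    Student("S001", "Ana Lopez", "12 Elm St", "555-0101", "MIS"),
    Student("S002", "Raj Patel", "48 Oak Ave", "555-0102", "Finance"),
]

attribute_names = [field.name for field in fields(Student)]
print(attribute_names)      # the attributes belong to the entity type
print(len(students))        # the instances are the stored occurrences
```

The class is defined once, in the same way that an entity type appears once in a data model, while the list can grow to hold any number of instances.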
Candidate Keys and Identifiers
Every entity type must have an attribute or set of attributes that distinguishes one instance from other instances of the same type. A candidate key is an attribute (or combination of attributes) that uniquely identifies each instance of an entity type. A candidate key for a STUDENT entity type might be Student_ID.
Candidate key: An attribute (or combination of attributes) that uniquely identifies each instance of an entity type.
Sometimes more than one attribute is required to identify a unique entity.
For example, consider the entity type GAME for a basketball league. The attribute Team_Name is clearly not a candidate key, because each team plays several games. If each team plays exactly one home game against every other team, then the combination of the attributes Home_Team and Visiting_Team is a candidate key for GAME. Some entities may have more than one candidate key. One candidate key for EMPLOYEE is Employee_ID; a second is the combination of Employee_Name and Address (assuming that no two employees with the same name live at the same address).
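The GAME example can be tested against sample data. The following is a minimal sketch, assuming instances are held as dictionaries; the team names and the function name are hypothetical. A sample can only rule a combination out: passing the check does not prove that a combination is a candidate key, because that depends on the business rule (here, one home game against every other team) holding for all possible instances.

```python
def is_unique_in_sample(instances, attributes):
    """Return True if the attribute combination is present and unique in the sample."""
    seen = set()
    for row in instances:
        key = tuple(row.get(name) for name in attributes)
        if None in key or key in seen:  # a missing or repeated value rules the key out
            return False
        seen.add(key)
    return True


# Hypothetical GAME instances for a small league.
games = [
    {"Home_Team": "Hawks", "Visiting_Team": "Owls"},
    {"Home_Team": "Owls", "Visiting_Team": "Hawks"},
    {"Home_Team": "Hawks", "Visiting_Team": "Bears"},
]

print(is_unique_in_sample(games, ["Home_Team"]))                   # Hawks appears twice
print(is_unique_in_sample(games, ["Home_Team", "Visiting_Team"]))  # every pair is distinct
```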
If more than one candidate key is involved, the designer must choose one of the candidate keys as the identifier. An identifier is a candidate key that has been selected to be used as the unique characteristic for an entity type.
Identifier: A candidate key that has been selected as the unique, identifying characteristic for an entity type.
Identifiers should be selected carefully because they are critical for the integrity of data. You should apply the following identifier selection rules:
1. Choose a candidate key that will not change its value over the life of each instance of the entity type. For example, the combination of Employee_Name and Address would probably be a poor choice as a primary key for EMPLOYEE because the values of Employee_Name and Address could easily change during an employee’s term of employment.
2. Choose a candidate key such that, for each instance of the entity, the attribute is guaranteed to have valid values and not be null. To ensure valid values, you may have to include special controls in data entry and maintenance routines to eliminate the possibility of errors. If the candidate key is a combination of two or more attributes, make sure that all parts of the key have valid values.
3. Avoid the use of so-called intelligent keys, whose structure indicates classifications, locations, and other entity properties. For example, the first two digits of a key for a PART entity may indicate the warehouse location. Such codes are often modified as conditions change, which renders the primary key values invalid.
4. Consider substituting single-attribute surrogate keys for large composite keys. For example, an attribute called Game_ID could be used for the entity GAME instead of the combination of Home_Team and Visiting_Team.
For each entity, the name of the identifier is underlined on an E-R diagram.
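Rule 4 above, substituting a surrogate key for a composite key, can be sketched as follows. This is only an illustration under the assumption that a sequential number is an acceptable surrogate; the function name and team names are hypothetical, and a real system would normally let the database generate such values.

```python
from itertools import count


def assign_surrogate_ids(instances, id_name, start=1):
    """Return copies of the instances with a sequential surrogate identifier added."""
    next_id = count(start)
    return [{id_name: next(next_id), **row} for row in instances]


# Hypothetical GAME instances identified so far by (Home_Team, Visiting_Team).
games = [
    {"Home_Team": "Hawks", "Visiting_Team": "Owls"},
    {"Home_Team": "Owls", "Visiting_Team": "Hawks"},
]

games_with_ids = assign_surrogate_ids(games, "Game_ID")
print(games_with_ids)
```

The surrogate carries no meaning of its own, which also keeps it clear of the intelligent-key problem described in rule 3.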
The following diagram shows the representation for a STUDENT entity type using E-R notation:
The equivalent representation using Microsoft Visio is the following:
In the Visio notation, the primary key is listed immediately below the entity name with the notation PK, and the primary key is underlined. All required attributes (that is, an instance of STUDENT must have values for Student_ID and Name) are in bold.
Multivalued Attributes
A multivalued attribute may take on more than one value for each entity instance. Suppose that Skill is one of the attributes of EMPLOYEE.
If each employee can have more than one Skill, then it is a multivalued attribute. During conceptual design, two common special symbols or notations are used to highlight multivalued attributes. The first is to use curly brackets around the name of the multivalued attribute, so that the EMPLOYEE entity with its attributes is diagrammed as follows:
Multivalued attribute: An attribute that may take on more than one value for each entity instance.
Many E-R drawing tools, such as Microsoft Visio, do not support multivalued attributes within an entity.
Thus, a second approach is to separate the repeating data into another entity, called a weak (or attributive) entity, and then, using a relationship (relationships are discussed in the next section), link the weak entity to its associated regular entity. The approach also easily handles several attributes that repeat together, called a repeating group. Consider an EMPLOYEE and his or her dependents. Dependent name, age, and relation to employee (spouse, child, parent, etc.) are multivalued attributes about an employee, and these attributes repeat together.
We can show this repetition using an attributive entity, DEPENDENT, and a relationship, shown here simply by a line between DEPENDENT and EMPLOYEE. The crow’s foot next to DEPENDENT means that many DEPENDENTs may be associated with the same EMPLOYEE.
Repeating group: A set of two or more multivalued attributes that are logically related.
Relationships
Relationships are the glue that holds together the various components of an E-R model. In Table 7-1, questions 5, 6, and 7 deal with relationships. A relationship is an association between the instances of one or more entity types that is of interest to the organization.
An association usually means that an event has occurred or that some natural linkage exists between entity instances. For this reason, relationships are labeled with verb phrases. For example, a training department in a company is interested in tracking which training courses each of its employees has completed. This information leads to a relationship (called Completes) between the EMPLOYEE and COURSE entity types that we diagram as follows: Relationship An association between the instances of one or more entity types that is of interest to the organization.
As indicated by the lines, this relationship is considered a many-to-many relationship: Each employee may complete more than one course, and each course may be completed by more than one employee. More significantly, we can use the Completes relationship to determine the specific courses that a given employee has completed. Conversely, we can determine the identity of each employee who has completed a particular course. Conceptual Data Modeling and the E-R Model The last section introduced the fundamentals of the E-R data modeling notation—entities, attributes, and relationships.
The goal of conceptual data modeling is to capture as much of the meaning of data as possible. The more details (or what some systems analysts call business rules) about data that we can model, the better the system we can design and build. Further, if we can include all these details in an automated repository, such as a CASE tool, and if a CASE tool can generate code for data definitions and programs, then the more we know about data, the more code can be generated automatically, making the system building more accurate and faster.
More importantly, if we can keep a thorough repository of data descriptions, we can regenerate the system as needed as the business rules change. Because maintenance is the largest expense with any information system, the efficiencies gained by maintaining systems at the rule, rather than code, level drastically reduce the cost. In this section, we explore more advanced concepts needed to more thoroughly model data and learn how the E-R notation represents these concepts. WWW NET SEARCH Investigate the entity-relationship diagramming capabilities of several CASE tools. Visit http://www.pearsonhighered.com/valacich to complete an exercise related to this topic. Degree of a Relationship The degree of a relationship, question 6 in Table 7-1, is the number of entity types that participate in that relationship. Thus, the relationship Completes, illustrated previously, is of degree two because it involves two entity types: EMPLOYEE and COURSE. The three most common relationships in E-R diagrams are unary (degree one), binary (degree two), and ternary (degree three). Higher-degree relationships are possible, but they are rarely encountered in practice, so we restrict our discussion to these three cases.
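The definition of degree can be made concrete with a short Python sketch. It is an illustration only, not part of the chapter's notation: the `Relationship` class and its fields are hypothetical, and the relationship names are borrowed from the chapter's own examples (Manages, Completes, Supplies).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """Hypothetical record of which entity type fills each role."""
    name: str
    participants: tuple[str, ...]

    @property
    def degree(self) -> int:
        # Degree counts distinct entity types, so a recursive relationship
        # whose two roles are both EMPLOYEE still has degree one.
        return len(set(self.participants))


DEGREE_NAMES = {1: "unary", 2: "binary", 3: "ternary"}

relationships = [
    Relationship("Manages", ("EMPLOYEE", "EMPLOYEE")),
    Relationship("Completes", ("EMPLOYEE", "COURSE")),
    Relationship("Supplies", ("PART", "VENDOR", "WAREHOUSE")),
]

for rel in relationships:
    label = DEGREE_NAMES.get(rel.degree, "higher-degree")
    print(f"{rel.name}: degree {rel.degree} ({label})")
```

Counting distinct entity types rather than roles is the design choice that matters here: it is what keeps a recursive relationship at degree one.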
Examples of unary, binary, and ternary relationships appear in Figure 7-6. Degree The number of entity types that participate in a relationship. Unary Relationship Unary relationship (recursive relationship) A relationship between the instances of one entity type. FIGURE 7-6 Examples of the Three Most Common Relationships in E-R Diagrams: Unary, Binary, and Ternary Also called a recursive relationship, a unary relationship is a relationship between the instances of one entity type. Two examples are shown in Figure 7-6. In the first example, Is_married_to is shown as a one-to-one relationship between instances of the PERSON entity type.
That is, each person may be currently married to one other person. In the second example, Manages is shown as a one-to-many relationship between instances of the EMPLOYEE entity type. Using this relationship, we could identify (for example) the employees who report to a particular manager or, reading the Manages relationship in the opposite direction, who the manager is for a given employee. Binary Relationship A binary relationship is a relationship between instances of two entity types and is the most common type of relationship encountered in data modeling. Figure 7-6 shows three examples.
The first (one-to-one) indicates that an employee is assigned one parking place, and each parking place is assigned to one employee. The second (one-to-many) indicates that a product line may contain several products, and each product belongs to only one product line. The third (many-to-many) shows that a student may register for more than one course and that each course may have many student registrants. Binary relationship A relationship between instances of two entity types. Ternary Relationship A ternary relationship is a simultaneous relationship among instances of three entity types.
In the example shown in Figure 7-6, the relationship Supplies tracks the quantity of a given part that is shipped by a particular vendor to a selected warehouse. Each entity may be a one or a many participant in a ternary relationship (in Figure 7-6, all three entities are many participants). Ternary relationship A simultaneous relationship among instances of three entity types. Note that a ternary relationship is not the same as three binary relationships. For example, Unit_Cost is an attribute of the Supplies relationship in Figure 7-6.
Unit_Cost cannot be properly associated with any of the three possible binary relationships among the three entity types (such as that between PART and VENDOR) because Unit_Cost is the cost of a particular PART shipped from a particular VENDOR to a particular WAREHOUSE. Cardinalities in Relationships Suppose that two entity types, A and B, are connected by a relationship. The cardinality of a relationship (see the fifth, sixth, and seventh questions in Table 7-1) is the number of instances of entity B that can (or must) be associated with each instance of entity A.
For example, consider the following relationship for DVDs and movies: Cardinality The number of instances of entity B that can (or must) be associated with each instance of entity A. Clearly, a video store may stock more than one DVD of a given movie. In the terminology we have used so far, this example is intuitively a “many” relationship. Yet, it is also true that the store may not have a single DVD of a particular movie in stock. We need a more precise notation to indicate the range of cardinalities for a relationship. This notation of relationship cardinality was introduced in Figure 7-5, which you may want to review at this point.
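One way to picture a "range of cardinalities" is as a lower and an upper bound on how many instances of B may be associated with one instance of A. The Python sketch below is an illustration under assumptions: the `Cardinality` class is hypothetical, `None` is used to stand for an unbounded "many", and the lower bound chosen for the four-supplier example is assumed, because the text states only the upper bound.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cardinality:
    """Hypothetical range of B instances allowed per instance of A."""
    minimum: int            # 0 = B is optional, 1 = B is mandatory
    maximum: Optional[int]  # None stands for an unbounded "many"

    def allows(self, count: int) -> bool:
        if count < self.minimum:
            return False
        return self.maximum is None or count <= self.maximum


# A movie may be stocked as zero, one, or many DVDs.
dvds_per_movie = Cardinality(minimum=0, maximum=None)

# A fixed upper bound, as in "at most four suppliers".
# The minimum of 1 is an assumption made for this illustration.
suppliers_per_item = Cardinality(minimum=1, maximum=4)

print(dvds_per_movie.allows(0))       # a movie with no DVD in stock
print(suppliers_per_item.allows(5))   # five suppliers exceeds the bound
```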
Minimum and Maximum Cardinalities WWW NET SEARCH Investigate the concept of business rules (cardinality is one type of business rule). Visit http://www.pearsonhighered.com/valacich to complete an exercise related to this topic. The minimum cardinality of a relationship is the minimum number of instances of entity B that may be associated with each instance of entity A. In the preceding example, the minimum number of DVDs available for a movie is zero, in which case we say that DVD is an optional participant in the Is_stocked_as relationship.
When the minimum cardinality of a relationship is one, then we say entity B is a mandatory participant in the relationship. The maximum cardinality is the maximum number of instances. For our example, this maximum is “many” (an unspecified number greater than one). Using the notation from Figure 7-5, we diagram this relationship as follows: The zero through the line near the DVD entity means a minimum cardinality of zero, whereas the crow’s foot notation means a “many” maximum cardinality. It is possible for the maximum cardinality to be a fixed number, not an arbitrary “many” value.
For example, see the Supplies relationship in Figure 7-3(A), which indicates that each item involves at most four suppliers. Associative Entities As seen in the examples of the Supplies ternary relationship in Figure 7-6, attributes may be associated with a many-to-many relationship as well as with an entity. For example, suppose that the organization wishes to record the date (month and year) when an employee completes each course. Some sample data follow:

Employee_ID    Course_Name         Date_Completed
549-23-1948    Basic Algebra       March 2009
29-16-8407     Software Quality    June 2009
816-30-0458    Software Quality    Feb 2009
549-23-1948    C Programming       May 2009

From this limited data, you can conclude that the attribute Date_Completed is not a property of the entity EMPLOYEE (because a given employee, such as 549-23-1948, has completed courses on different dates). Nor is Date_Completed a property of COURSE, because a particular course (such as Software Quality) may be completed on different dates. Instead, Date_Completed is a property of the relationship between EMPLOYEE and COURSE.
The attribute is associated with the relationship and diagrammed as follows: Associative entity An entity type that associates the instances of one or more entity types and contains attributes that are peculiar to the relationship between those entity instances. Because many-to-many and one-to-one relationships may have associated attributes, the E-R diagram poses an interesting dilemma: Is a many-to-many relationship actually an entity in disguise? Often the distinction between entity and relationship is simply a matter of how you view the data.
An associative entity is a relationship that the data modeler chooses to model as an entity type. Figure 7-7 shows the E-R notation for representing the Completes relationship as an associative entity. The lines from CERTIFICATE to the two entities are not two separate binary relationships, so they do not have labels. Note that EMPLOYEE and COURSE have mandatory one cardinality, because an instance of Completes must have an associated EMPLOYEE and COURSE. The implicit identifier of Completes is the combination of the identifiers of EMPLOYEE and COURSE, Employee_ID and Course_ID, respectively.
The explicit identifier is Certificate_Number, as shown in Figure 7-7. FIGURE 7-7 Example of an Associative Entity E-R drawing tools that do not support many-to-many relationships require that any such relationship be converted into an associative entity, whether it has attributes or not. You have already seen an example of this in Figure 7-3 for Microsoft Visio, in which the Supplies/Is supplied by relationship from Figure 7-3(A) was converted in Figure 7-3(B) into the SUPPLIED ITEM entity (actually, associative entity) and two mandatory one-to-many relationships.
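The reasoning about Date_Completed can be checked against the sample data with a small Python sketch. This is an illustration, not the book's notation: the `Completes` class and helper functions are hypothetical, the employee identifiers are copied as printed, and the course name stands in for the course identifier because the sample data list names rather than Course_ID values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Completes:
    """Hypothetical associative-entity row linking an employee to a course."""
    employee_id: str
    course_name: str      # stands in for Course_ID in this sketch
    date_completed: str


rows = [
    Completes("549-23-1948", "Basic Algebra", "March 2009"),
    Completes("29-16-8407", "Software Quality", "June 2009"),
    Completes("816-30-0458", "Software Quality", "Feb 2009"),
    Completes("549-23-1948", "C Programming", "May 2009"),
]

# Implicit identifier: the combination of the two participating identifiers.
by_pair = {(r.employee_id, r.course_name): r for r in rows}


def dates_for_employee(employee_id: str) -> set[str]:
    return {r.date_completed for r in rows if r.employee_id == employee_id}


def dates_for_course(course_name: str) -> set[str]:
    return {r.date_completed for r in rows if r.course_name == course_name}


# More than one date per employee and per course, so the date belongs
# to neither entity on its own.
print(dates_for_employee("549-23-1948"))
print(dates_for_course("Software Quality"))

# Exactly one date per (employee, course) pair.
print(by_pair[("549-23-1948", "C Programming")].date_completed)
```

Keying the rows by the pair mirrors the implicit identifier described above. A tool that cannot draw many-to-many relationships would hold the same rows in an entity of its own.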
One situation in which a relationship must be turned into an associative entity is when the associative entity has other relationships with entities besides the relationship that caused its creation. For example, consider the E-R model, which represents price quotes from different vendors for purchased parts stocked by Pine Valley Furniture, shown in Figure 7-8(A). Now, suppose that we also need to know which price quote is in effect for each part shipment received. This additional data requirement necessitates that the relationship between VENDOR and PART be transformed into an associative entity.
This new relationship is represented in Figure 7-8(B). FIGURE 7-8 An E-R Model That Represents Each Price Quote for Each Part Shipment Received by Pine Valley Furniture In this case, PRICE QUOTE is not a ternary relationship. Rather, PRICE QUOTE is a binary many-to-many relationship (associative entity) between VENDOR and PART. In addition, each PART RECEIPT, based on Amount, has an applicable, negotiated Price. Each PART RECEIPT is for a given PART from a specific VENDOR, and the Amount of the receipt dictates the purchase price in effect by matching with the Quantity attribute.
Because the PRICE QUOTE pertains to a given PART and given VENDOR, PART RECEIPT does not need direct relationships with these entities. An Example of Conceptual Data Modeling at Hoosier Burger Chapter 6 structured the process and data-flow requirements for a food ordering system for Hoosier Burger. Figure 7-9 describes requirements for a new system using Microsoft Visio. The purpose of this system is to monitor and report changes in raw material inventory levels and to issue material orders and payments to suppliers. Thus, the central data entity for this system will be an INVENTORY ITEM, shown in Figure 7-10, corresponding to data store D1 in Figure 7-9. FIGURE 7-9 Level-0 Data-Flow Diagram for Hoosier Burger’s New Logical Inventory Control System Changes in inventory levels are due to two types of transactions: receipt of new items from suppliers and consumption of items from sales of products. Inventory is added upon receipt of new raw materials, for which Hoosier Burger receives a supplier INVOICE (see Process 1.0 in Figure 7-9). Figure 7-10 shows that each INVOICE indicates that the supplier has sent a specific quantity of one or more INVOICE ITEMs, which correspond to Hoosier’s INVENTORY ITEMs.
Inventory is used when customers order and pay for PRODUCTs. That is, Hoosier makes a SALE for one or more ITEM SALEs, each of which corresponds to a food PRODUCT. Because the real-time customer-order processing system is separate from the inventory control system, a source, STOCK-ON-HAND in Figure 7-9, represents how data flow from the order processing to the inventory control system. Finally, because food PRODUCTs are made up of various INVENTORY ITEMs (and vice versa), Hoosier maintains a RECIPE to indicate how much of each INVENTORY ITEM goes into making one PRODUCT.
From this discussion, we have identified the data entities required in a data model for the new Hoosier Burger inventory control system: INVENTORY ITEM, INVOICE, INVOICE ITEM, PRODUCT, SALE, ITEM SALE, and RECIPE. To complete the E-R diagram, we must determine necessary relationships among these entities as well as attributes for each entity. FIGURE 7-10 Preliminary E-R Diagram for Hoosier Burger’s Inventory Control System The wording in the previous description tells us much of what we need to know to determine relationships:
¦An INVOICE includes one or more INVOICE ITEMs, each of which corresponds to an INVENTORY ITEM. Obviously, an INVOICE ITEM cannot exist without an associated INVOICE, and over time the result will be zero-to-many receipts, or INVOICE ITEMs, for an INVENTORY ITEM.
¦Each PRODUCT is associated with INVENTORY ITEMs.
¦A SALE indicates that Hoosier sells one or more ITEM SALEs, each of which corresponds to a PRODUCT. An ITEM SALE cannot exist without an associated SALE, and over time the result will be zero-to-many ITEM SALEs for a PRODUCT.
Figure 7-10 shows an E-R diagram with the entities and relationships previously described. We include on this diagram two labels for each relationship, one to be read in each relationship direction (e.g., an INVOICE Includes one-to-many INVOICE ITEMs, and an INVOICE ITEM Is_included_on exactly one INVOICE). Now that we understand the entities and relationships, we must decide which data elements are associated with the entities and associative entities in this diagram. You may wonder at this point why only the INVENTORY data store is shown in Figure 7-9 when seven entities and associative entities are on the E-R diagram. The INVENTORY data store corresponds to the INVENTORY ITEM entity in Figure 7-10. The other entities are hidden inside other processes for which we have not shown lower-level diagrams.
In actual requirements structuring steps, you would have to match all entities with data stores: Each data store represents some subset of an E-R diagram, and each entity is included in one or more data stores. Ideally, each data store on a primitive DFD will be an individual entity. To determine data elements for an entity, we investigate data flows in and out of data stores that correspond to the data entity and supplement this information with a study of decision logic that uses or changes data about the entity. Six data flows are associated with the INVENTORY data store in Figure 7-9.
The description of each data flow in the project dictionary or repository would include the data flow’s composition, which then tells us what data are flowing in or out of the data store. For example, the Amounts Used data flow coming from Process 2.0 indicates how much to decrease an attribute STOCK_ON_HAND due to use of the INVENTORY ITEM to fulfill a customer sale. Thus, the Amounts Used data flow implies that Process 2.0 will first read the relevant INVENTORY ITEM record, then update its STOCK_ON_HAND attribute, and finally store the updated value in the record.
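The read, update, and store sequence implied by the Amounts Used data flow can be sketched in a few lines of Python. This is a simplified illustration, not Hoosier Burger's actual design: the in-memory dictionary stands in for the INVENTORY data store, and the item number, starting quantity, and function name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class InventoryItem:
    item_number: str       # hypothetical identifier
    stock_on_hand: float


# Stand-in for the INVENTORY data store; the contents are made up.
inventory_store = {"BUN-01": InventoryItem("BUN-01", 120.0)}


def apply_amount_used(store: dict, item_number: str, amount_used: float) -> float:
    record = store[item_number]            # 1. read the INVENTORY ITEM record
    record.stock_on_hand -= amount_used    # 2. update STOCK_ON_HAND
    store[item_number] = record            # 3. store the updated record
    return record.stock_on_hand


remaining = apply_amount_used(inventory_store, "BUN-01", 20.0)
print(remaining)
```

Tracing each data flow this way shows which attributes an entity must carry: here the flow touches only STOCK_ON_HAND and the item's identifier.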
Each data flow would be analyzed similarly (space does not permit us to show the analysis for each data flow). After having considered all data flows in and out of data stores related to data entities, plus all decision logic related to inventory control, we derive the full E-R diagram, with attributes, shown in Figure 7-11. In Visio, the ITEM SALE, RECIPE, and INVOICE ITEM entities participate in what are called identifying relationships. Thus, Visio treats them as associative entities, not just the RECIPE entity. Visio automatically includes the primary keys of the identifying entities as primary keys in the identified (associative) entities. Also note that in Visio, because it cannot represent many-to-many relationships, there are two mandatory relationships on either side of RECIPE. FIGURE 7-11 Final E-R Diagram for Hoosier Burger’s Inventory Control System PVF WebStore: Conceptual Data Modeling Conceptual data modeling for an Internet-based electronic commerce application is no different from the process followed when analyzing the data needs for other types of applications. In the last chapter, you read how Jim Woo analyzed the flow of information within the WebStore and developed a data-flow diagram.
In this section, we examine the process he followed when developing the WebStore’s conceptual data model. Conceptual Data Modeling for Pine Valley Furniture’s WebStore To better understand what data would be needed within the WebStore, Jim Woo carefully reviewed the information from the JAD session and his previously developed data-flow diagram. Table 7-2 summarizes the customer and inventory information identified during the JAD session. Jim wasn’t sure whether this information was complete but knew that it was a good starting place for identifying what information the WebStore needed to capture, store, and process.
To identify additional information, he carefully studied the level-0 DFD shown in Figure 7-12. In this diagram, two data stores, Inventory and Shopping Cart, are clearly identified; both were strong candidates to become entities within the conceptual data model. Finally, Jim examined the data flows from the DFD as additional possible sources for entities. Hence, he identified five general categories of information to consider:
¦Customer
¦Inventory
¦Order
¦Shopping Cart
¦Temporary User/System Messages

TABLE 7-2: Customer and Inventory Information for WebStore
Corporate Customer: Company name; Company address; Company phone; Company fax; Company preferred shipping method; Buyer name; Buyer phone; Buyer e-mail
Home Office Customer: Name; Doing business as (company’s name); Address; Phone; Fax; E-mail
Student Customer: Name; School; Address; Phone; E-mail
Inventory Information: SKU; Name; Description; Finished product size; Finished product weight; Available materials; Available colors; Price; Lead time

After identifying these multiple categories of data, his next step was to define each item carefully. He again examined all data flows within the DFD and recorded each one’s source and destination.
By carefully listing these flows, he could move more easily through the DFD and understand more thoroughly what information needed to move from point to point. This activity resulted in the creation of two tables that documented Jim’s growing understanding of the WebStore’s requirements. The first, Table 7-3, lists each of the data flows within each data category and its corresponding description. The second, Table 7-4, lists each of the unique data flows within each data category. Jim then felt ready to construct an entity-relationship diagram for the WebStore. FIGURE 7-12 Level-0 Data-Flow Diagram for the WebStore
He concluded that Customer, Inventory, and Order were all unique entities and would be part of his E-R diagram. Recall that an entity is a person, place, or object; all three of these items meet this criterion. Because the Temporary User/System Messages data were not permanently stored items (nor were they a person, place, or object), he concluded that this should not be an entity in the conceptual data model. In contrast, although the shopping cart was also a temporarily stored item, its contents needed to be stored for at least the duration of a customer’s visit to the WebStore and should be considered an object.
As shown in Figure 7-12, Process 4, Check Out Process Order, moves the Shopping Cart contents to the Purchasing Fulfillment System, where the order details are stored. Thus, he concluded that Shopping Cart, along with Customer, Inventory, and Order, would be entities in his E-R diagram.

TABLE 7-3: Data Category, Data Flow, and Data-Flow Descriptions for the WebStore DFD
Customer Related
  Customer ID: Unique identifier for each customer (generated by Customer Tracking System)
  Customer Information: Detailed customer information (stored in Customer Tracking System)
Inventory Related
  Product Item: Unique identifier for each product item (stored in Inventory Database)
  Item Profile: Detailed product information (stored in Inventory Database)
Order Related
  Order Number: Unique identifier for an order (generated by Purchasing Fulfillment System)
  Order: Detailed order information (stored in Purchasing Fulfillment System)
  Return Code: Unique code for processing customer returns (generated by/stored in Purchasing Fulfillment System)
  Invoice: Detailed order summary statement (generated from order information stored in Purchasing Fulfillment System)
  Order Status Information: Detailed summary information on order status (stored/generated by Purchasing Fulfillment System)
Shopping Cart
  Cart ID: Unique identifier for shopping cart
Temporary User/System Messages
  Product Item Request: Request to view information on a catalog item
  Purchase Request: Request to move an item into the shopping cart
  View Cart: Request to view the contents of the shopping cart
  Items in Cart: Summary report of all shopping cart items
  Remove Item: Request to remove item from shopping cart
  Check Out: Request to check out and process order

The final step was to identify the interrelationships between these four entities. After carefully studying all the related information, he concluded the following:
1. Each Customer owns zero-to-many Shopping Cart Instances; each Shopping Cart Instance is-owned-by one-and-only-one Customer.
2. Each Shopping Cart Instance contains one-and-only-one Inventory item; each Inventory item is-contained-in zero-to-many Shopping Cart Instances.
3. Each Customer places zero-to-many Orders; each Order is-placed-by one-and-only-one Customer.
4. Each Order contains one-to-many Shopping Cart Instances; each Shopping Cart Instance is-contained-in one-and-only-one Order.
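The four relationships listed above can also be written down as data, which makes the minimum and maximum cardinality on each side explicit. The Python sketch below is an illustration only: the tuple layout and helper functions are hypothetical, `None` stands for an unbounded "many", and the phrasing helper handles only the cardinalities that appear in Jim's list.

```python
# Each entry: (entity A, verb A->B, cardinality of B per A,
#              entity B, verb B->A, cardinality of A per B)
# A cardinality is (minimum, maximum); None means "many".
RELATIONSHIPS = [
    ("Customer", "owns", (0, None),
     "Shopping Cart Instance", "is-owned-by", (1, 1)),
    ("Shopping Cart Instance", "contains", (1, 1),
     "Inventory item", "is-contained-in", (0, None)),
    ("Customer", "places", (0, None),
     "Order", "is-placed-by", (1, 1)),
    ("Order", "contains", (1, None),
     "Shopping Cart Instance", "is-contained-in", (1, 1)),
]


def phrase(minimum, maximum):
    # Covers only the three phrasings used in the list above.
    if maximum is None:
        return "zero-to-many" if minimum == 0 else "one-to-many"
    if (minimum, maximum) == (1, 1):
        return "one-and-only-one"
    return f"{minimum}-to-{maximum}"


def half(subject, verb, card, obj):
    noun = obj if card[1] == 1 else obj + "s"   # naive plural for "many"
    return f"{subject} {verb} {phrase(*card)} {noun}"


def sentence(rel):
    a, verb_ab, card_ab, b, verb_ba, card_ba = rel
    return (f"Each {half(a, verb_ab, card_ab, b)}; "
            f"each {half(b, verb_ba, card_ba, a)}.")


sentences = [sentence(rel) for rel in RELATIONSHIPS]
for number, text in enumerate(sentences, start=1):
    print(f"{number}. {text}")
```

Reading the tuples back as sentences is a quick check that each relationship has been captured in both directions before it is drawn.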
With these relationships defined, Jim drew the E-R diagram shown in Figure 7-13. Through it, he demonstrated his understanding of the requirements, the flow of information within the WebStore, the flow of information between the WebStore and existing PVF systems, and now the conceptual data model. Over the next few hours, Jim planned to refine his understanding further by listing the specific attributes for each entity and then compare these lists with the existing inventory, customer, and order database tables. He had to make sure that all attributes were accounted for before determining a final design strategy.
TABLE 7-4: Data Category, Data Flow, and the Source/Destination of Data Flows within the WebStore DFD
Customer Related
  Customer ID: From Customer to Process 4.0; From Process 4.0 to Customer Tracking System; From Process 5.0 to Customer
  Customer Information: From Customer to Process 5.0; From Process 5.0 to Customer; From Process 5.0 to Customer Tracking System; From Customer Tracking System to Process 4.0
Inventory Related
  Product Item: From Process 1.0 to Data Store D1; From Process 3.0 to Data Store D2
  Item Profile: From Data Store D1 to Process 1.0; From Process 1.0 to Process 2.0; From Process 2.0 to Data Store D2; From Data Store D2 to Process 3.0; From Data Store D2 to Process 4.0
Order Related
  Order Number: From Purchasing Fulfillment System to Process 4.0; From Customer to Process 6.0; From Process 6.0 to Purchasing Fulfillment System
  Order: From Process 4.0 to Purchasing Fulfillment System
  Return Code: From Purchasing Fulfillment System to Process 4.0
  Invoice: From Process 4.0 to Customer
  Order Status Information: From Process 6.0 to Customer; From Purchasing Fulfillment System to Process 6.0
Shopping Cart
  Cart ID: From Data Store D2 to Process 3.0; From Data Store D2 to Process 4.0
Temporary User/System Messages
  Product Item Request: From Customer to Process 1.0
  Purchase Request: From Customer to Process 2.0
  View Cart: From Customer to Process 3.0
  Items in Cart: From Process 3.0 to Customer
  Remove Item: From Customer to Process 3.0; From Process 3.0 to Data Store D2
  Check Out: From Customer to Process 4.0

FIGURE 7-13 Entity-Relationship Diagram for the WebStore System Selecting the Best Alternative Design Strategy
Selecting the best alternative system involves at least two basic steps: (1) generating a comprehensive set of alternative design strategies and (2) selecting the one that is most likely to result in the desired information system, given all of the organizational, economic, and technical constraints that limit what can be done. A system design strategy represents a particular approach to developing the system. Selecting a strategy requires you to answer questions about the system’s functionality, hardware and system software platform, and method for acquisition.
We use the term design strategy in this chapter rather than alternative system because, at the end of analysis, we are still quite a long way from specifying an actual system. This delay is purposeful because we do not want to invest in design efforts until some agreement is reached on which direction to take the project and the new system. The best we can do at this point is to outline, rather broadly, the approach we can take in moving from logical system specifications to a working physical system. The overall process of selecting the best system strategy and the deliverables from this step in the analysis process are discussed next.
Design strategy A particular approach to developing an information system. It includes statements on the system’s functionality, hardware and system software platform, and method for acquisition. The Process of Selecting the Best Alternative Design Strategy Systems analysis involves determining requirements and structuring requirements. After the system requirements have been structured in terms of process flow and data, analysts again work with users to package the requirements into different system configurations.
Shaping alternative system design strategies involves the following processes:
¦Dividing requirements into different sets of capabilities, ranging from the bare minimum that users would accept (the required features) to the most elaborate and advanced system the company could afford to develop (which includes all the features desired across all users). Alternatively, different sets of capabilities may represent the position of different organizational units with conflicting notions about what the system should do.
¦Enumerating different potential implementation environments (hardware, system software, and network platforms) that could be used to deliver the different sets of capabilities. (Choices on the implementation environment may place technical limitations on the subsequent design phase activities.)
¦Proposing different ways to source or acquire the various sets of capabilities for the different implementation environments.
In theory, if the system includes three sets of requirements, two implementation environments, and four sources of application software, twenty-four design strategies would be possible.
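The count of twenty-four is simply the product 3 × 2 × 4, one choice from each of the three dimensions. A minimal Python sketch makes the enumeration explicit; the labels for the capability sets, environments, and sources are placeholders invented for this illustration, not options named in the chapter.

```python
from itertools import product

# Placeholder labels; the chapter gives only the counts (3, 2, and 4).
capability_sets = ["minimum", "midrange", "full"]
environments = ["environment_A", "environment_B"]
sources = ["source_1", "source_2", "source_3", "source_4"]

# Every combination of one capability set, one environment, one source.
strategies = list(product(capability_sets, environments, sources))

print(len(strategies))   # 3 * 2 * 4
print(strategies[0])
```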
In practice, some combinations are usually infeasible, and only a small number (typically three) can be easily considered. Selecting the best alternative is usually done with the help of a quantitative procedure, an example of which comes later in the chapter. Analysts will recommend what they believe to be the best alternative, but management (a combination of the steering committee and those who will fund the rest of the project) will make the ultimate decision about which system design strategy to follow.
At this point in the life cycle, it is also certainly possible for management to end a project before the more expensive phases of system design or system implementation and operation are begun. Reasons for ending a project might include the costs or risks outweighing the benefits, the needs of the organization having changed since the project began, or other competing projects having become more important while development resources remain limited. Generating Alternative Design Strategies The solution to an organizational problem may seem obvious to an analyst.
Typically, the analyst is familiar with the problem, having conducted an extensive analysis of it and how it has been solved in the past. On the other hand, the analyst may be mo