DAT 102: Module 02 Entity Relationship Diagram Gathering Database Requirements: This is part of the analysis phase of a software development project which focuses on the data items that must be stored. Redundancy in the date is expected the broader the scope of the project. E.g. Product ID/Product #). Appropriate names for those items must be stored (particularly where data is used in more than one place or department). Database requirements can be found from user interviews, current spreadsheets, forms, invoices, existing apps & systems, etc. Spreadstreets are usually the best sources as data is already organized in rows & columns. it is also easy (relatively) to identify primary data (stored data) & derived data (data is calculated when it is displayed). They are also slightly difficult to work with as they are often modified. This can cause the introduction of new data items & renamed data items. When data items are renamed, a consensus to determine the singular names of the data items is needed. With reports, similar to spreadsheets, the data is often organized in rows & columns. The source of the data is often easily determined. Manual reports allow you to discuss with the individual(s) that created the report while Automated reports, the Information Systems department provides details on data origin. Reports are also less likely to have redundant data as they are better controlled than spreadsheets. Data to Store (at first glance): • invoice number • date • customer information • invoice details Data to Store (upon closer analysis): • invoice number • date • customer name • " address • " contact • quantity • description repeated • unit price • price } Entity Relationship Diagram provides a simple way to convey the requirements of a database. It is a common tool for both developers & users as basis for discussions. It is often more descriptive than an invoice in regards to what their database developer must have in the end. The diagram identifies what we need to collect data about & the relationships between them.
Entity Definitions: • Entity: the "thing" an organization wishes to maintain data regarding e.g client/ customer, place, event, object, etc. • Entity Type: collection of entities that share common properties or characteristics • Entity Instance: a single occurrence of an entity type (e.g a specific customer & all related data). Entity Description: SHOULD BE: • an object with many instances in the database (e.g customer, products invoice, etc.) • an object with multiple attributes (e.g. Customer attributes like name, address, contact, phone, etc.) • an object we are trying to model (e.g needs to be relevant to the system) SHOULD NOT BE: • a user of the database system • an output of the database system (e.g reports because they are generated from existing data). Relationships: Relationships describe how entities are related (e.g a customer having invoices) They also include the cardinality (numerical limits). E.g customer may have zero to many invoices, an invoice has one to many invoice details, an employee has zero or one supervisors, etc. Attributes: Attributes are the characteristics of an entity or a relationship type. Attributes of a relationship are intuitive (e.g height, color, address, d.o.b, etc.). Attributes of a relationship type is a little more difficult. For instance, a relationship may exist between a student & a course; attributes of this relationship could be "completion date" & “final mark". Another instance would be a relationship between a contractor & a project; attributes of this would be "contract fee" & "date of hire". If a relationship has attributes, that relationship with its attributes becomes an associative entity & is treated like a regular entity. Symbols used in ERDs: • emp name & i.d: single attribute • age: derived attribute • dependents: multivalued attribute
Entity & Attributes Single Attributes: • single data items (fields) that are descriptors for the entity (eg emp id) Derived Attributes: • an item that is not stored as is (eg age is not stared but calculated by Date of Birth) Multivalued Attributes: • items that have multiple instances (eg an employee may have multiple dependents) The attribute with an underlined name are the identifiers. (Eg SIN, product #, etc.). When more than one attribute is used as key, it is referred to as a composite key. All composite key attributes are underlined. Good identifiers will not change in value over time and will not be null. Intelligent identifiers i.e identifiers that contain data that helps describe the instance should be avoided. (eg using an employees phone # as the key would lose value should the employee ever change their #) Similar to/Using our previous example… cardinality index In our example: • customer has zero or many invoices • an invoice has one or many invoice details Interpreted as: • customer may have no invoices or many invoices • an invoice can have only one customer • an invoice has at least one invoice detail • an invoice detail can only be on one invoice
A More Advanced Illustration In our example, an ERD for the ETS of a company will be created. The following entities are part of the Employee Training System (ETS): – Course – Employee – Instructor The ERD: The ERD with an associative entity: Here an employee takes zero or many courses which are taught by one and only one professor who can teach zero or many courses. Here, we see that between the employee & a course, a completion date & mark must be tracked. In this particular case, an employee may have zero or many course(s) completed. A course can have zero or many completions by the employee(s). The associative entity labelled in our diagram can only be related to one employee & one course as it’s role is matching the two together. Hence, the symbol change from our first example.