Databases

Outline

Learning Objectives

The Advantages of Databases
Types of Database Software

File Management Programs
Database Management Systems

Hierarchical Databases
Network Databases
Relational Databases
Object-Oriented Databases

The Parts of DBMS Software

The Data Definition Language
The Data Manipulation Language
The Query Language
The Report Generator

Creating and Using a Database
DBMS Tips
Database Security

Learning Objectives

When you have finished reading this lesson, you will be able to

Recognize potential uses of a database program

Differentiate between a file management program and a database management system

Describe four popular database structures

Identify the parts of DBMS software

Understand how a database is created and used

The true power of the computer lies in its capability to store, organize, and retrieve large quantities of data. Programs designed to accomplish these tasks are known as database programs. Some people associate the need to maintain large amounts of data with corporations and government offices. However, small businesses and even individuals can improve their efficiency by using database software. Both large and small companies use databases to maintain personnel files, inventory files, customer information, employee information, and accounting records.

For example, the Bulletin Boardroom maintains databases that enable stockholders to exchange views, communicate with company directors, and communicate with company representatives about issues concerning approximately 1,000 companies. (The Bulletin Boardroom was founded by Ross Kaplin in 1993, after the Securities and Exchange Commission changed the rules governing the way stockholders communicate with each other. Before this ruling, restrictions made it difficult for more than ten stockholders to confer about proxy votes.) The service, which can be accessed by modem, costs $2,000 per year. Officials anticipate 18,000 subscribers by 1997.

Imagine cataloging by hand the contents of the Library of Congress, the Smithsonian Museum, or the National Art Museum; then imagine trying to find in your catalog all references to Leonardo da Vinci. Now imagine doing the same tasks with a computer--storing all the information in a database program and then having the computer perform the search for you.

In the movie Mission: Impossible, the safety of all American undercover agents on assignment in Europe is endangered when a copy of a database file containing true names, as well as code names, falls into the wrong hands.

Although large organizations are major users of database management software, small businesses and individuals find them useful as well. You could use a database program to keep track of the addresses and phone numbers of friends. When they move, it's a simple matter to update the information in the database. Compare this method to using an address book, where you must cross out old addresses and telephone numbers and try to squeeze in new ones.

You can also use a database program to catalog collections, such as a baseball card or stamp collection, or a large assortment of tapes or compact discs. Another common personal use of database software is to maintain a household inventory. Think how helpful it would be to have a listing of all your valuable possessions if you became the victim of a burglary or a fire.

Even if you never develop a personal database, learning about database software is an essential component of your computer literacy. Many of the most troubling issues concerning computer use involve huge databases containing information about private citizens. Understanding how databases work can help you grasp why some people are concerned about privacy issues.

The Advantages of Databases

A database is a collection of data files that can be combined and treated as a unit for information retrieval purposes. As you recall from the discussion in Lesson 2B, computers use a hierarchy to store and retrieve data. Characters are combined to form a field, such as an item description. Related fields are combined to form a record.

An inventory record, for instance, might include fields for the stock number, the description, the number on hand, and the cost. Records that relate to the same application are then combined to form a file, such as a company's inventory file. Files can then be combined to form a database. For example, the inventory, accounts payable, accounts receivable, and general ledger files can be combined to form a company's accounting database. An organization may have more than one database, each consisting of related files.

Traditionally, data or information was organized into separate files that were not related or combined in any way. However, this approach has three drawbacks: It produces data dependence, creates data redundancy, and does not guarantee data integrity.

Data dependence occurs when data files from different applications are incompatible and cannot be linked; the data is dependent on the application. A bank may have credit information about a customer in one application's file and have current account data in another application's file. For this reason, the credit and account data cannot be directly linked in a report.

When data is kept in a series of unrelated files, the inevitable result is data redundancy, or repetition of the same data items in more than one file. Repetition of data items wastes valuable storage space.

The traditional approach also endangers data integrity. Data integrity is the consistency of the data in all applications. When a field is repeated a number of times, all the occurrences may not be identical. Suppose that Ken moves to a new address in the same city. Soon after moving, he has new checks printed. If the bank maintains separate files for checking, savings, and loans, his address might be changed only in his checking file and not in his loan and savings files. As a result, Ken doesn't receive his car loan payment forms and doesn't pay on time.

Computerized databases reduce data redundancy and help ensure data integrity because they can link related files using a common field. In the preceding example, the bank could include a customer identification field in all checking, savings, and loan files. Then each customer would be given a unique customer identification number. Changing a particular customer's address in one file in the database would automatically change it for all the transactions involving that customer. (Ken would receive the car loan payment form at his new address.) To put this point another way, computer databases can integrate data from separate files. This capability to integrate files increases the flexibility of the data, reduces data dependence, and makes the tasks of modifying reports and adding new data items easier.

Database packages may also include data security features to protect the data from individuals not authorized to use it. Specified records or fields, as well as the entire database, can be restricted to prevent modification or access.

Types of Database Software

Two types of applications software have been developed to work with database files. File management programs can work with only one file at a time. Database management systems can work with several separate files at a time. This section describes both types of software.

File Management Programs

A file management program enables users to create customized databases and to store and retrieve data from these databases. File management programs come in handy when an individual or small business needs to set up a computerized information storage and retrieval system. A baseball card store, for example, could create a database of available baseball cards for customer reference.

Because file management programs are less complex than database management systems, they are inexpensive and usually easy to use. The ease of use comes at a price, though. File management programs create flat files. Flat files can be accessed sequentially when most of the records need to be processed, accessed randomly to retrieve a specific record, or sorted (so that the records can be accessed sequentially in a different order). The information stored in a flat file, however, cannot be linked to data in other files.

Similar to flat file systems that don't have subdirectories, flat database systems are not capable of accessing related tables of data or of relating information stored in two or more files.

Database Management Systems

A database management system (DBMS) can link the data from several files. A DBMS is usually more expensive and more difficult to learn than a file management program.

DBMS programs commonly use one of four database structures to link files: hierarchical databases, network databases, relational databases, and object-oriented databases.

Hierarchical Databases

Hierarchical databases, the first of the four database structures, were developed by IBM in 1968. A hierarchical database links data using a hierarchical relationship. In a hierarchical DBMS, a group of fields is called a segment rather than a record. The data element at the top of the hierarchy is known as the parent element. There may be several child elements beneath the parent element. Each of these children may, in turn, become a parent to several lower-level child elements. However, note that in a hierarchical structure, each segment can have only one parent. The structure that is created resembles a pyramid or an organizational chart.

The problem with a hierarchical database is that the data can be accessed only by following a path down the structure--access is not flexible. All the relationships among the data elements must be determined when the database is first designed. For example, suppose that a hospital maintains a hierarchical database file of all employees. Employees are categorized by the department in which they work (emergency room, intensive care unit, and so on). Within the department, employees are categorized by job function, such as nurses, doctors, and technicians. If the emergency room has a shortage of nurses one evening, producing a list of all nurses on staff would not be possible. Instead, the nurses assigned to each department would have to be determined department by department.

Many companies continue to use hierarchical databases even though the other database structures are superior. In addition, IBM still supports a hierarchical database on many of its mainframe computers.

Network Databases

The first CODASYL meeting gave birth to the high-level language COBOL. CODASYL meetings have taken place periodically since that first meeting in 1959.

The network database structure was developed by a Conference on Data Systems Languages (CODASYL). As in a hierarchical database, a network database organizes data in a parent-child relationship, and all the relationships among the data items must be determined during the design phase. In a network structure, however, a child can have more than one parent or no parent at all.

The relationship of students to college classes can be shown with a network structure. Each student may be enrolled in several courses, and each course includes a number of students. With a network database structure, you can produce both a student schedule (showing all classes that the student is enrolled in) and a class roster (showing all students enrolled in a class).

Relational Databases

Until recently, relational databases were considered the most flexible, and therefore most desirable, database structure. In a relational database, data in several files is related through the use of a common key field. The contents of a key field are unique to one record in the file, enabling the field to be used to identify a record. The computer uses this key field as an index to locate the records directly without having to read all the records in the files.

A relational DBMS is best envisioned as a two-dimensional table. Each row in the table corresponds to a record, and each column corresponds to a field. A relational database structure can link a customer file and an accounts payable file, for example, through the use of a common field, such as a customer account number field. The user can then request a report consisting of fields from both the customer record and the accounts payable record.

Object-Oriented Databases

Object-oriented databases are the newest type of database structure and are likely to gain in popularity. In an object-oriented database, the result of a retrieval operation is an object of some kind, such as a document. Within this object are miniprograms that enable the object to perform tasks, such as displaying a graphic. Object-oriented databases can incorporate sound, video, text, and graphics into a single database record. This type of database is well suited for multimedia applications. A search of a health-related database, for example, could display a document that included pictures of healthful foods, videos of exercise techniques, and recorded lectures from health professionals. Object-oriented databases can have their data linked to different programs while using a hierarchical, network, or relational database structure.

Aldus Fetch 1.2 is a popular multimedia database program that supports several file formats for graphics, animation and video, sound, and text.

The Parts of DBMS Software

When properly prepared, database management systems enable people to access complex file systems with a minimum of difficulty. Most people are able to use a database management system without knowing about its underlying structure (its fields, records, and files). This ease of access is made possible by the four components of DBMS software: data definition language, data manipulation language, query language, and report generator.

The Data Definition Language

The data definition language (DDL) is used to define the structure of the database. The structure, or schema, outlines the data to be included in the database. In the schema, each field in a record must be defined with a name, a field length, and a type. Common field types are numeric, alphanumeric, date, logical, and memo. Numeric fields can contain only the digits 0 through 9, a decimal point, and a sign. Alphanumeric fields can contain a combination of alphabetic characters, special symbols, and digits. A date field holds a single date. A logical field contains one of two possible values: Yes/No or T (true)/F (false). The value is placed there by the program as the result of a comparison or decision requested by the user. A memo field can be used to hold any type of reminder that the user might like to type.

A DDL can also define subschemas. A subschema outlines the fields that a user will be allowed to access. Different users can access different subschemas. The use of subschemas is an excellent way to protect the privacy of sensitive data items. For example, a payroll clerk should have access to salary data, but a salesperson should not have access to that data.

A drop-down list displays the data field types available in Microsoft Access.

A logical field assumes one of two values. For example, in a certain scheme, a logical field might assume the value of Yes or the value of No.

The Data Manipulation Language

The data manipulation language (DML) includes all the commands that enable the user to manipulate and use the database. With these commands, the user can view the data, add new records, delete records, sort the records, and modify selected fields in a record. The data definition language and the data manipulation language are combined in some microcomputer database programs designed for use on only one computer because these programs have less need for security than mainframe database programs.

The Query Language

The query language enables users to ask specific questions of the database. A marketing vice president trying to decide which items to sell at a discount, for example, might ask a database program to list all inventory items with a profit margin greater than 30 percent. The most popular query language is Structured Query Language (SQL). SQL has been an industry-wide standard since it was first used on IBM mainframes in the 1980s. Today, SQL is used in database management systems on many platforms.

The Report Generator

The report generator helps the user to design and generate reports and graphs in printed form. Report headings, column headings, page numbers, and totals are just some of the features that are easy to include with the report generator.

Creating and Using a Database

Most databases for large computer systems are created (and maintained) by professional programmers. The users of these large-system databases are typically nonprogrammers who can access the data and produce reports but cannot modify the database program. Many database programs designed for microcomputers, however, enable nonprogrammers to create their own databases.

Creating a database involves two steps: defining the structure and entering the data. A good way to begin is to sketch the record on paper before using the database software. Include all the fields that you will need; then determine the name, type, and size for each field.

As you refine your design, consider possible future needs and the needs of other users. For example, if you are designing a library database, you would include a logical field to indicate whether a book has been checked out. Although you can modify the database structure at any time, making modifications is a time-consuming process. It is better to design carefully and minimize the need to modify the database structure.

A database for library books might include field names such as those shown in this Microsoft Access table. Because each book has a unique Library of Congress number, this field is the primary key.

After you enter the database structure into the computer, the DBMS displays a form for each record. The form includes the appropriate amount of space available for each field. Now the user enters into the database the data values for each field in the record. Records can be added to the database as needed.

Records can also be deleted. Deletion is usually a two-step process: You must mark the record for deletion, and you must reorganize the file to eliminate all the marked records.

Once the database is established, users can view the database records on-screen. Users can also print a report or use a query to print only selected records. In addition, they can use a query to select certain records for viewing or printing. For example, if an individual is a fan of John Grisham, the person can request a report listing all the books by that author.

This report shows books written by John Grisham.

To make the database easier to use, you can create a custom form. A custom form provides a user-friendly way to view the data stored in each record. In a well-designed form, for instance, you can choose frequently accessed options by clicking option buttons or check boxes. You can also create menus that automatically produce charts and reports.

DBMS Tips

Designing a database can be a series of frustrating and time-consuming mistakes unless care is taken in selecting the database program and designing the files. The following tips should help to minimize problems and make good use of your time: 1. DBMS programs range from easy to learn and intuitive to extremely difficult. The simplest is the flat-file address book; the most difficult require the use of a programming language, creating the database nearly from scratch. Choose your DBMS carefully; study comparison reviews before you buy.

2. Design your database carefully and thoroughly on paper before you begin using the application. Study the design frequently to determine whether you've left out anything and to ensure that the structure is logical.

3. Define data that you'll want to search for. If you want to be able to search for an area code or ZIP code, be sure that you distinguish them as such. Otherwise, the DBMS will stop at every sequence of numbers. And if you want to sort alphabetically by last name, type the last name first.

4. If you have large quantities of existing information in another data format, you can often convert it into a format the DBMS can read. You can find the specifics for performing this task in the DBMS user manual.

5. Use forms, or templates, to enter repetitive data and automate its collection.

6. Learn the DBMS programming language to create the command files that automate your work and simplify ways to retrieve data.

7. Use the DBMS query language instead of the programming language to learn easier, quicker ways to retrieve data.

8. Explore the features of the report generator--there are many types of reports and ways to use reports. Master the various methods of formatting reports, from the use of type fonts to formats.

9. Learn to use the DBMS's relational characteristics; there are many more ways to define relationships between files than are immediately apparent.

10. Find out whether there are add-on utility programs that will assist in automating tasks you commonly perform with the DBMS.

Database Security

The data that a business stores in a database is usually essential to the operation of the organization. At least some of the data is also confidential and should not be accessed by unauthorized individuals. The most common forms of database security are elaborate passwords and user call-back systems, in which the user calls the computer, hangs up, and the computer calls the user back at a predefined phone number.

Database security can be protected also by making backups routinely. If suspected discrepancies are detected in the data, the database administrator can restore the database from the most recent backup.

Lesson Summary

Database programs can store, organize, and retrieve large quantities of data.

Two types of database programs are in widespread use: file management programs, which work with only one file at a time; and database management systems, which can work with more than one file.

Database programs help users deal with the problems of data dependence, data redundancy, and lack of data integrity.

A hierarchical database must access data following a path down through the levels of the hierarchy.

A network database has a more flexible structure that allows the path to flow up or down.

A relational database uses a key field as an index to locate records.

An object-oriented database can incorporate sound, video, text, and graphics into a record.

The data definition language (DDL) is used to define the structure of the database.

The data manipulation language includes all the user commands.

A query language enables users to ask specific questions.

A report generator facilitates the printing of reports.

Creating the database involves defining the structure and then entering the data.