Quia - Class Page - clientservervblssn2

SQL Definition and History

Every business has data, which requires some organized method or mechanism for maintaining the data. This mechanism is referred to as a database management system (DBMS). Database management systems have been around for years, many of which started out as flat-file systems on a mainframe. With today's technologies, the accepted use of database management systems has begun to flow in other directions, driven by the demands of growing businesses, increased volumes of corporate data, and of course, Internet technologies.

The modern wave of information management is primarily carried out through the use of a relational database management (RDBMS), derived from the traditional DBMS. Relational databases and client/server technologies are typical combinations used by current businesses to successfully manage their data and stay competitive in their appropriate markets. The next few sections discuss the relational database and client/server technology to provide you with a stronger foundation of knowledge for the standard relational database language—SQL.

What Is SQL?

SQL, Structured Query Language, is the standard language used to communicate with a relational database. The prototype was originally developed by IBM using Dr. E.F. Codd's paper ("A Relational Model of Data for Large Shared Data Banks") as a model. In 1979, not long after IBM's prototype, the first SQL product, ORACLE, was released by Relational Software, Incorporated (it was later renamed Oracle Corporation). It is, today, one of the distinguished leaders in relational database technologies. SQL is pronounced either of two ways: as the letters S-Q-L, or as "sequel"; both pronunciations are acceptable.

If you travel to a foreign country, you may be required to know that country's language to get around. For example, you may have trouble ordering from a menu via your native tongue if the waiter speaks only his country's language. Look at a database as a foreign land in which you seek information. SQL is the language you use to express your needs to the database. Just as you would order a meal from a menu in another country, you can request specific information from within a database in the form of a query using SQL.

What Is ANSI SQL?

The American National Standards Institute (ANSI) is an organization that approves certain standards in many different industries. SQL has been deemed the standard language in relational database communication, originally approved in 1986 based on IBM's implementation. In 1987, the ANSI SQL standard was accepted as the international standard by the International Standards Organization (ISO). The standard was revised again in 1992 and was called SQL/92. The newest standard is now called SQL3 or is sometimes referred to as SQL/99.

The New Standard: SQL3

SQL3 has five interrelated documents and other documents may be added in the near future. The five interrelated parts are as follows:

· Part 1—SQL/Framework— Specifies the general requirements for conformance and defines the fundamental concepts of SQL.

· Part 2—SQL/Foundation— Defines the syntax and operations of SQL.

· Part 3—SQL/Call-Level Interface— Defines the interface for application programming to SQL.

· Part 4—SQL/Persistent Stored Modules— Defines the control structures that then define SQL routines. Part 4 also defines the modules that contain SQL routines .

· Part 5—SQL/Host Language Bindings— Defines how to embed SQL statements in application programs that are written in a standard programming language .

The new ANSI standard (SQL3) has two levels of minimal conference that a DBMS may claim: Core SQL Support and Enhanced SQL Support .

ANSI stands for American National Standards Institute, an organization that is responsible for devising standards for various products and concepts.

With any standard come numerous, obvious advantages, as well as some disadvantages. Foremost, a standard steers vendors in the appropriate industry direction for development; in the case of SQL, providing a basic skeleton of necessary fundamentals which, as an end result, allows consistency between various implementations and better serves increased portability (not only for database programs, but databases in general and individuals who manage databases).

Some may argue that a standard is not so good, limiting the flexibility and possible capabilities of a particular implementation. However, most vendors who comply with the standard have added product-specific enhancements to standard SQL to fill in these gaps.

A standard is good, considering the advantages and disadvantages. The expected standard demands features that should be available in any complete SQL implementation and outlines basic concepts that not only force consistency between all competitive SQL implementations, but increase the value of a SQL programmer or relational database user in today's database market.

A SQL implementation is a particular vendor's SQL product.

What Is a Database?

In very simple terms, a database is a collection of data. Some like to think of a database as an organized mechanism that has the capability of storing information, through which a user can retrieve stored information in an effective and efficient manner.

People use databases every day without realizing it. A phone book is a database. The data contained consists of individuals' names, addresses, and telephone numbers. The listings are alphabetized or indexed, which allows the user to reference a particular local resident with ease. Ultimately, this data is stored in a database somewhere on a computer. After all, each page of a phone book is not manually typed each year a new edition is released.

The database has to be maintained. As people move to different cities or states, entries may have to be added or removed from the phone book. Likewise, entries will have to be modified for people changing names, addresses, or telephone numbers, and so on.

An Introduction to the Relational Database

A relational database is a database divided into logical units called tables, where tables are related to one another within the database. A relational database allows data to be broken down into logical, smaller, more manageable units, allowing for easier maintenance and providing more optimal database performance according to the level of organization

Again, tables are related in a relational database, allowing adequate data to be retrieved in a single query (although the desired data may exist in more than one table). By having common keys, or fields, among relational database tables, data from multiple tables can be joined to form one large result set. As you venture deeper into this book, you see more of a relational database's advantages, including overall performance and easy data access .

A relational database is a database composed of related objects, primarily tables. A table is the most basic means of storage for data in a database .

An Introduction to Client/Server Technology

In the past, the computer industry was predominately ruled by mainframe computers; large, powerful systems capable of high storage capacity and high data processing capabilities. Users communicated with the mainframe through dumb terminals—terminals that did not think on their own, but relied solely on the mainframe's CPU, storage, and memory. Each terminal had a data line attached to the mainframe. The mainframe environment definitely served its purpose, and does today in many businesses, but a greater technology was soon to be introduced: the client/server model.

In the client/server system, the main computer, called the server, is accessible from a network—typically a local area network (LAN) or a wide area network (WAN). The server is normally accessed by personal computers (PCs) or by other servers, instead of dumb terminals. Each PC, called a client, is provided access to the network, allowing communication between the client and the server, thus explaining the name client/server. The main difference between client/server and mainframe environments is that the user's PC in a client/server environment is capable of thinking on its own, capable of running its own processes using its own CPU and memory, but readily accessible to a server computer through a network. In most cases, a client/server system is much more flexible for today's overall business needs and is much preferred.

Relational database systems reside on both mainframes and on client/server platforms. Although a client/server system is preferred, the continued use of mainframes can certainly be justified according to a company's needs. A high percentage of companies have recently been leaving their mainframe systems behind and moving their data to a client/ server system, motivated by the urge to stay current with new technologies, provide more flexibility to better suit their business needs, and make old systems Year 2000-compliant.

The switch to a client/server system has proven beneficial for some companies, while others have failed in the client/server implementation and have, as a result, wasted millions of dollars, causing some to return to their mainframes; others still hesitate to make a change. The lack of appropriate expertise—a result of new technology combined with a lack of training—is the main reason for failed implementations. Nevertheless, an understanding of the client/server model is imperative with the rising (and sometimes unreasonable) demands placed on today's businesses as well as the development of Internet technologies and network computing.

Some Popular Relational Database Vendors

Some of the most predominant database vendors include Oracle, Microsoft, Informix, Sybase, and IBM. Although there are many more, this list includes names that you may have recognized on the bookshelf, in the newspaper, magazines, the stock market, or on the World Wide Web.

Differences Between Implementations

As each individual in this world is unique in both features and nature, so is each vendor-specific implementation of SQL. A database server is a product, like any other product on the market, manufactured by a widespread number of vendors. It is to the benefit of the vendor to ensure that its implementation is compliant with the current ANSI standard for portability and user convenience. For instance, if a company is migrating from one database server to another, it would be rather discouraging for the database users to have to learn another language to maintain functionality with the new system.

With each vendor's SQL implementation, however, you find that there are enhancements that serve the purpose for each database server. These enhancements, or extensions, are additional commands and options that are simply a bonus to the standard SQL package and available with a specific implementation.

SQL Sessions

An SQL session is an occurrence of a user interacting with a relational database through the use of SQL commands. When a user initially connects to the database, a session is established. Within the scope of an SQL session, valid SQL commands can be entered to query the database, manipulate data in the database, and define database structures, such as tables.

`CONNECT`

When a user connects to a database, the SQL session is initialized. The CONNECT command is used to establish a database connection . With the CONNECT command, you can either invoke a connection or change connections to the database. For example, if you are connected as USER1, you can use the CONNECT command to connect to the database as USER2. When this happens, the SQL session for USER1 is implicitly disconnected.

CONNECT user@database

When you attempt to connect to a database, you are automatically prompted for a password that corresponds with your current username.

`DISCONNECT`

When a user disconnects from a database, the SQL session is terminated. The DISCONNECT command is used to disconnect a user from the database. When you disconnect from the database, you may still appear to be in the tool that allows you to communicate with the database, but you have lost your connection. When you use EXIT to leave the database, your SQL session is terminated and the tool that you are using to access the database is normally closed.

CONNECT

Types of SQL Commands

The following sections discuss the basic categories of commands used in SQL to perform various functions. These functions include building database objects, manipulating objects, populating database tables with data, updating existing data in tables, deleting data, performing database queries, controlling database access, and overall database administration.

The main categories are

· DDL (Data Definition Language)

· DML(Data Manipulation Language)

· DQL (Data Query Language)

· DCL (Data Control Language)

· Data administration commands

· Transactional control commands

Defining Database Structures (DDL)

Data Definition Language, DDL, is the part of SQL that allows a database user to create and restructure database objects, such as the creation or the deletion of a table.

The main DDL commands discussed during this lesson include the following:

CREATE TABLE

ALTER TABLE

DROP TABLE

CREATE INDEX

ALTER INDEX

DROP INDEX

Manipulating Data (DML)

Data Manipulation Language, DML, is the part of SQL used to manipulate data within objects of a relational database.

There are three basic DML commands:

INSERT

UPDATE

DELETE

Selecting Data (DQL)

Though comprised of only one command, Data Query Language (DQL) is the most concentrated focus of SQL for a relational database user. The command is as follows:

SELECT

This command, accompanied by many options and clauses, is used to compose queries against a relational database. Queries, from simple to complex, from vague to specific, can be easily created.

A query is an inquiry to the database for information.

Data Control Language (DCL)

Data control commands in SQL allow you to control access to data within the database. These DCL commands are normally used to create objects related to user access and also control the distribution of privileges among users. Some data control commands are as follows:

ALTER PASSWORD

GRANT

REVOKE

CREATE SYNONYM

You find that these commands are often grouped with other commands and may appear in a number of different lessons.

Data Administration Commands

Data administration commands allow the user to perform audits and perform analyses on operations within the database. They can also be used to help analyze system performance. Two general data administration commands are as follows:

START AUDIT

STOP AUDIT

Do not get data administration confused with database administration. Database administration is the overall administration of a database, which envelops the use of all levels of commands.

Transactional Control Commands

In addition to the previously introduced categories of commands, there are commands that allow the user to manage database transactions.

· COMMIT Used to save database transactions

· ROLLBACK Used to undo database transactions

· SAVEPOINT Creates points within groups of transactions in which to ROLLBACK

· SET TRANSACTION Places a name on a transaction

A Closer Look at What Composes a Table

The storage and maintenance of valuable data is the reason for any database's existence. You have just viewed the data that is used to explain SQL concepts in this book. The following sections take a closer look at the elements within a table. Remember, a table is the most common and simplest form of data storage in a relational database.

A Field

Every table is broken up into smaller entities called fields. The fields in the PRODUCTS_TBL table consist of PROD_ID, PROD_DESC, and COST. These fields categorize the specific information that is maintained in a given table. A field is a column in a table that is designed to maintain specific information about every record in the table.

A Record, or Row, of Data

A record, also called a row of data, is each individual entry that exists in a table. Looking at the last table, PRODUCTS_TBL, considers the following first record in that table :

<C1>11235      WITCHES COSTUME                29.99

The record is obviously composed of a product identification, product description, and unit cost. For every distinct product, there should be a corresponding record in the PRODUCTS_TBL table. A record is a horizontal entity in a table.

A row of data is an entire record in a relational database table.

A Column

A column is a vertical entity in a table that contains all information associated with a specific field in a table. For example, a column in the PRODUCTS_TBL having to do with the product description would consist of the following :

WITCHES COSTUME

PLASTIC PUMPKIN 18 INCH

FALSE PARAFFIN TEETH

LIGHTED LANTERNS

ASSORTED COSTUMES

CANDY CORN

PUMPKIN CANDY

PLASTIC SPIDERS

ASSORTED MASKS

This column is based on the field PROD_DESC, the product description. A column pulls information about a certain field from every record within a table.

The Primary Key

A primary key is a column that makes each row of data in the table unique in a relational database. The primary key in the PRODUCTS_TBL table is PROD_ID, which is typically initialized during the table creation process. The nature of the primary key is to ensure that all product identifications are unique, so that each record in the PRODUCTS_TBL table has its own PROD_ID. Primary keys alleviate the possibility of a duplicate record in a table and are used in other ways.

A NULL Value

NULL is the term used to represent a missing value. A NULL value in a table is a value in a field that appears to be blank. A field with a NULL value is a field with no value. It is very important to understand that a NULL value is different than a zero value or a field that contains spaces. A field with a NULL value is one that has been left blank during record creation. Notice that in the EMPLOYEE_TBL table, not every employee has a middle initial. Those records for employees who do not have an entry for middle initial signify a NULL value.

Basic Data Types

The following sections discuss the basic data types supported by ANSI SQL. Data types are characteristics of the data itself, whose attributes are placed on fields within a table. For example, you can specify that a field must contain numeric values, disallowing the entering of alphanumeric strings. After all, you would not want to enter alphabetic characters in a field for a dollar amount .

Note

Every implementation of SQL seems to have its own specific data types. The use of implementation-specific data types is necessary to support the philosophy of each implementation on how to handle the storage of data. However, the basics are the same among all implementations.

The very basic data types, as with most other languages, are

· Character strings

· Numeric strings

· Date and time values

Fixed-Length Characters

Constant characters, those strings that always have the same length, are stored using a fixed-length data type. The following is the standard for an SQL fixed-length character:

CHARACTER(n)

n represents a number identifying the allocated, or maximum length, of the particular field with this definition.

Some implementations of SQL use the CHAR data type to store fixed-length data. Alphanumeric data can be stored in this data type. An example of a constant length data type would be for a state abbreviation because all state abbreviations are two characters.

Spaces are normally used to fill extra spots when using a fixed-length data type; if a field's length was set to 10 and data entered filled only five places, the remaining five spaces are recorded as spaces. The padding of spaces ensures that each value in a field is a fixed length.

Caution

Be careful not to use a fixed-length data type for fields that may contain varying-length values, such as an individual's name. If you use the fixed-length data type inappropriately, problems such as the waste of available space and the inability to make accurate comparisons between data will eventually be encountered.

Variable Characters

SQL supports the use of varying-length strings, strings whose length is not constant for all data. The following is the standard for an SQL varying-length character:

CHARACTER VARYING(n)

n represents a number identifying the allocated, or maximum length, of the particular field with this definition .

Common data types for variable-length character values are the VARCHAR and VARCHAR2 data types. VARCHAR is the ANSI standard, which Microsoft SQL Server uses; VARCHAR2 is used by Oracle and should be used in Oracle, because VARCHAR's usage in the future may change. The data stored can be alphanumeric.

Remember that fixed-length data types typically pad spaces to fill in allocated places not used by the field. The varying-length data type does not work this way. For instance, if the allocated length of a varying-length field is 10, and a string of five characters is entered, the total length of that particular value is only 5. Spaces are not used to fill unused places in a column.

Tip

Always use the varying-length data type for non-constant character strings to save database space .

Numeric Values

Numeric values are stored in fields that are defined as some type of number, typically referred to as NUMBER, INTEGER, REAL, DECIMAL, and so on.

The following are the standards for SQL numeric values:

BIT (n)

BIT VARYING (n)

DECIMAL (p,s)

INTEGER

SMALLINT

FLOAT(p)

REAL(s)

DOUBLE PRECISION(P)

p represents a number identifying the allocated, or maximum length, of the particular field for each appropriate definition.

s is a number to the right of the decimal point, such as 34.ss.

A common numeric data type in SQL implementations is NUMBER, which accommodates the direction for numeric values provided by ANSI. Numeric values can be stored as zero, positive, negative, fixed, and floating-point numbers. The following is an example using NUMBER:

NUMBER(5)

This example restricts the maximum value entered in a particular field to 99999.

Decimal Values

Decimal values are numeric values that include the use of a decimal point. The standard for a decimal in SQL follows, where the p is the precision and the s is the decimal's scale :

DECIMAL(p,s)

The precision is the total length of the numeric value. In a numeric defined DECIMAL(4,2), the precision is 4, which is the total length allocated for a numeric value.

The scale is the number of digits to the right of the decimal point. The scale is 2 in the previous DECIMAL(4,2) example .

34.33 inserted into a DECIMAL(3,1) is typically rounded to 34.3.

If a numeric value was defined as the following data type, the maximum value allowed would be 99.99:

DECIMAL(4,2)

The precision is 4, which represents the total length allocated for an associated value. The scale is 2, which represents the number of places, or bytes, reserved to the right side of the decimal point. The decimal point itself does not count as a character .

Allowed values for a column defined as DECIMAL(4,2) include the following:

12

12.4

12.44

12.449

The last numeric value, 12.449, is rounded off to 12.45 upon input into the column.

Integers

An integer is a numeric value that does not contain a decimal, only whole numbers (both positive and negative).

Valid integers include the following:

1

0

-1

99

-99

199

Floating-Point Decimals

Floating-point decimals are decimal values whose precision and scale are variable lengths and virtually without limit. Any precision and scale is acceptable. The REAL data type designates a column with single-precision, floating-point numbers. The DOUBLE PRECISION data type designates a column that contains double-precision, floating-point numbers. To be considered a single-precision floating point, the precision must be between 1 and 21 inclusive. To be considered a double-precision floating point, the precision must be between 22 and 53 inclusive. The following are examples of the FLOAT data type:

FLOAT

FLOAT(15)

FLOAT(50)

Dates and Time

Date and time data types are quite obviously used to keep track of information concerning dates and time. Standard SQL supports what are called DATETIME data types, which include the following specific data types:

DATE

TIME

INTERVAL

TIMESTAMP

The elements of a DATETIME data type consist of the following:

YEAR

MONTH

DAY

HOUR

MINUTE

SECOND

Note

The SECOND element can also be broken down to fractions of a second. The range is from 00.000 to 61.999, although some implementations of SQL may not support this range.

Be aware that each implementation of SQL may have its own customized data type for dates and times. The previous data types and elements are standards to which each SQL vendor should adhere, but be advised that most implementations have their own data type for date values, varying in both appearance and the way date information is actually stored internally.

A length is not normally specified for a date data type. Later in this lesson, you learn more about dates, how date information is stored in some implementations, how to manipulate dates and times using conversion functions, and study practical examples of how dates and time are used in the real world.

Literal Strings

A literal string is a series of characters, such as a name or a phone number that is explicitly specified by a user or program. Literal strings consist of data with the same attributes as the previously discussed data types, but the value of the string is known; the value of a column itself is usually unknown, because there is typically a different value for a column associated with each row of data in a table.

You do not actually specify data types with literal strings—you simply specify the string. Some examples of literal strings follow:

'Hello'

45000

"45000"

3.14

'November 1, 1997'

The alphanumeric strings are enclosed by single quotation marks, whereas the number value 45000 is not. Also notice that the second numeric value of 45000 is enclosed by quotation marks. Generally speaking, character strings require quotation marks, whereas numeric strings don't. You see later how literal strings are used with database queries .

`NULL` Data Types

A NULL value is a missing value or a column in a row of data that has not been assigned a value. NULL values are used in nearly all parts of SQL, including the creation of tables, search conditions for queries, and even in literal strings.

The following are two methods for referencing a NULL value :

· NULL (the keyword NULL itself)

· '' (single quotation marks with nothing in between)

The following does not represent a NULL value, but a literal string containing the characters N-U-L-L:

'NULL'

`BOOLEAN` Values

A BOOLEAN value is a value of either TRUE, FALSE, or NULL. BOOLEAN values are used to make data comparisons. For example, when criteria are specified for a query, each condition evaluates to either a TRUE, FALSE, or NULL. If the BOOLEAN value of TRUE is returned by all conditions in a query, data is returned. If a BOOLEAN value of FALSE or NULL is returned, data may not be returned.

Consider the following example:

WHERE NAME = 'SMITH'

This line might be a condition found in a query. The condition is evaluated for every row of data in the table that is being queried. If the value of NAME is SMITH for a row of data in the table, the condition returns the value TRUE, thereby returning the data associated with that record .

User-Defined Types

A user-defined type is a data type that is defined by the user. User-defined types allow users to customize their own data types based on existing data types. The CREATE TYPE statement is used to create a user-defined type .

For example, you can create a type as follows:

CREATE TYPE PERSON AS OBJECT

(NAME       VARCHAR2(30),

 SSN    VARCHAR2(9));

You can reference your user-defined type as follows :

CREATE TABLE EMP_PAY

(EMPLOYEE   PERSON,

 SALARY     NUMBER(10,2),

 HIRE_DATE      DATE);

Notice that the data type referenced for the first column EMPLOYEE is PERSON. PERSON is the user-defined type you created in the first example.

Domains

A domain is a set of valid data types that can be used. A domain is associated with a data type, so that only certain data is accepted. After a domain is created, you can add constraints to the domain. The domain is used like the user-defined type.

You can create a domain as follows :

CREATE DOMAIN MONEY_D AS NUMBER(8,2);

You can add constraints to your domain as follows :

ALTER DOMAIN MONEY_D

ADD CONSTRAINT MONEY_CON1

CHECK (VALUE > 5);

You can reference the domain as follows :

CREATE TABLE EMP_PAY

(EMP_ID        NUMBER(9),

 EMP_NAME      VARCHAR2(30),

 PAY_RATE      MONEY_D);

Note

Note that some of the data types mentioned during this lesson may not be available by name in the implementation of SQL that you are using. Data types are often named differently among implementations of SQL, but the concept behind each data type remains. Most, if not all, data types are supported by most relational databases.

What Are Database Objects?

A database object is any defined object in a database that is used to store or reference data. Some examples of database objects include tables, views, clusters, sequences, indexes, and synonyms. The table is this lesson's focus, because it is the simplest form of data storage in a relational database.

What Is a Schema?

A schema is a collection of database objects (as far as this lesson is concerned—tables) associated with one particular database username. This username is called the schema owner, or the owner of the related group of objects. You may have one or multiple schemas in a database. Basically, any user who creates an object has just created his or her own schema. A schema can consist of a single table and has no limits to the number of objects that it may contain, unless restricted by a specific database implementation.

Say you have been issued a database username and password by the database administrator. Your username is USER1. Suppose you log on to the database and then create a table called EMPLOYEE_TBL. Your table's actual name is USER1.EMPLOYEE_TBL. The schema name for that table is USER1, which is also the owner of that table. You have just created the first table of a schema.

The good thing about schemas is that when you access a table that you own (in your own schema), you do not have to refer to the schema name. For instance, you could refer to your table as either one of the following:

EMPLOYEE_TBL

USER1.EMPLOYEE_TBL

The first option is preferred because it requires fewer keystrokes. If another user were to query one of your tables, the user would have to specify the schema, as follows:

USER1.EMPLOYEE_TBL

There are, two user accounts in the database that own tables: USER1 and USER2. Each user account has its own schema. Some examples for how the two users can access their own tables and tables owned by the other user follow :

USER1 accesses own `table1:`	`TABLE1`
USER1 accesses own test:	`TEST`
USER1 accesses USER2's `table10:`	`USER2.TABLE10`
USER1 accesses USER2's test:	`USER2.TEST`

Both users have a table called TEST. Tables can have the same names in a database as long as they belong to different schemas. If you look at it this way, table names are always unique in a database, because the schema owner is actually part of the table name. For instance, USER1.TEST is different than USER2.TEST. If you do not specify a schema with the table name when accessing tables in a database, the database server looks for a table that you own by default. That is, if USER1 tries to access TEST, the database server looks for a USER1-owned table named TEST before it looks for other objects owned by USER1, such as synonyms to tables in another schema.

Note

Every database server has rules concerning how you can name objects and elements of objects, such as field names. You must check your particular implementation for the exact naming conventions or rules.

A Table: The Primary Storage for Data

The table is the primary storage object for data in a relational database. A table consists of row(s) and column(s), both of which hold the data. A table takes up physical space in a database and can be permanent or temporary.

Fields and Columns

A field, also called a column in a relational database, is part of a table that is assigned a specific data type; a field should be named to correspond with the type of data that will be entered into that column. Columns can be specified as NULL or NOT NULL, meaning that if a column is NOT NULL, something must be entered. If a column is specified as NULL, nothing has to be entered.

Every database table must consist of at least one column. Columns are those elements within a table that hold specific types of data, such as a person's name or address. For example, a valid column in a customer table may be the customer's name.

Generally, a name must be one continuous string. An object name must typically be one continuous string and can be limited to the number of characters used according to each implementation of SQL. It is typical to use underscores with names to provide separations between characters. For example, a column for the customer's name can be named CUSTOMER_NAME instead of CUSTOMERNAME.

Note

Be sure to check your implementation for rules when naming objects and other database elements.

Rows

A row is a record of data in a database table. For example, a row of data in a customer table might consist of a particular customer's identification number, name, address, phone number, fax number, and so on . A row is comprised of fields that contain data from one record in a table. A table can contain as little as one row of data and up to as many as millions of rows of data or records.

The `CREATE TABLE` Statement

The CREATE TABLE statement is obviously used to create a table. Although the very act of creating a table is quite simple, much time and effort should be put into planning table structures before the actual execution of the CREATE TABLE statement.

Some elementary questions need to be answered when creating a table:

· What type of data will be entered into the table?

· What will be the table's name?

· What column(s) will compose the primary key?

· What names shall be given to the columns (fields)?

· What data type will be assigned to each column?

· What will be the allocated length for each column?

· Which columns in a table require data?

After these questions are answered, the actual CREATE TABLE statement is simple .

The basic syntax to create a table is as follows:

CREATE TABLE TABLE_NAME

( FIELD1  DATA TYPE  [ NOT NULL ],

  FIELD2  DATA TYPE  [ NOT NULL ],

  FIELD3  DATA TYPE  [ NOT NULL ],

  FIELD4  DATA TYPE  [ NOT NULL ],

  FIELD5  DATA TYPE  [ NOT NULL ] );

Note

In this lesson's examples, you use the popular data types CHAR (constant-length character), VARCHAR (variable-length character), NUMBER (numeric values, decimal and non-decimal), and DATE (date and time values).

Create a table called EMPLOYEE_TBL in the following example :

CREATE TABLE EMPLOYEE_TBL

(EMP_ID        CHAR(9)        NOT NULL,

EMP_NAME       VARCHAR2(40)   NOT NULL,

EMP_ST_ADDR    VARCHAR2(20)   NOT NULL,

EMP_CITY       VARCHAR2(15)   NOT NULL,

EMP_ST         CHAR(2)        NOT NULL,

EMP_ZIP        NUMBER(5)      NOT NULL,

EMP_PHONE      NUMBER(10)     NULL,

EMP_PAGER      NUMBER(10)     NULL);

Eight different columns make up this table. Notice the use of the underscore character to break the column names up into what appears to be separate words (EMPLOYEE ID is stored as EMP_ID). Each column has been assigned a specific data type and length, and by using the NULL/NOT NULL constraint, you have specified which columns require values for every row of data in the table. The EMP_PHONE is defined as NULL, meaning that NULL values are allowed in this column because there may be individuals without a telephone number. The information concerning each column is separated by a comma, with parentheses surrounding all columns (a left parenthesis before the first column and a right parenthesis following the information on the last column).

A semicolon is the last character in the previous statement. Most SQL implementations have some character that terminates a statement or submits a statement to the database server. Oracle uses the semicolon. Transact-SQL uses the GO statement. This book uses the semicolon.

Each record, or row of data, in this table would consist of the following:

EMP_ID, EMP_NAME, EMP_ST_ADDR, EMP_CITY, EMP_ST, EMP_ZIP, EMP_PHONE, EMP_PAGER

In this table, each field is a column. The column EMP_ID could consist of one employee's identification number or many employees' identification numbers, depending on the requirements of a database query or transactions. The column is a vertical entity in a table, whereas a row of data is a horizontal entity .

Note

NULL is the default value for a column; therefore, it does not have to be entered in the CREATE TABLE statement.

`STORAGE` Clause

Some form of a STORAGE clause is available in many relational database implementations of SQL. The STORAGE clause in a CREATE TABLE statement is used for initial table sizing and is usually done at table creation. The syntax of a STORAGE clause as used in one implementation is shown in the following example :

CREATE TABLE EMPLOYEE_TBL

(EMP_ID        CHAR(9)         NOT NULL,

EMP_NAME       VARCHAR(40)     NOT NULL,

EMP_ST_ADDR    VARCHAR(20)     NOT NULL,

EMP_CITY       VARCHAR(15)     NOT NULL,

EMP_ST         CHAR(2)         NOT NULL,

EMP_ZIP        NUMBER(5)       NOT NULL,

EMP_PHONE      NUMBER(10)      NULL,

EMP_PAGER      NUMBER(10)      NULL)

STORAGE

    (INITIAL     3K

     NEXT        2K );

In some implementations, there are several options available in the STORAGE clause. INITIAL allocates a set amount of space in bytes, kilobytes, and so on, for the initial amount of space to be used by a table. The NEXT part of the STORAGE identifies the amount of additional space that should be allocated to the table if it should grow beyond the space allocated for the initial allocation. You find that there are other options available with the STORAGE clause, and remember that these options vary from implementation to implementation. If the STORAGE clause is omitted from most major implementations, there are default storage parameters invoked, which may not be the best for the application.

Notice the neatness of the CREATE TABLE statement. This is for ease of reading and error resolution. Indentation has been used to help .

Note

The STORAGE clause differs between relational database implementations of SQL. The previous example used Oracle's STORAGE clause, which was added to the CREATE TABLE statement. Remember that the ANSI standard for SQL is just that, a standard. The standard is not a language itself, but guidelines on how vendors should develop their SQL implementation. You also find that data types differ between implementations. Most issues concerning the actual storage and processing of data are implementation-specific.

Naming Conventions

When selecting names for objects, specifically tables and columns, the name should reflect the data that is to be stored. For example, the name for a table pertaining to employee information could be named EMPLOYEE_TBL. Names for columns should follow the same logic. When storing an employee's phone number, an obvious name for that column would be PHONE_NUMBER.

Note

Check your particular implementation for name length limits and characters that are allowed; they could differ from implementation to implementation.

The `ALTER TABLE` Command

A table can be modified through the use of the ALTER TABLE command after that table's creation. You can add column(s), drop column(s), change column definitions, add and drop constraints, and, in some implementations, modify table STORAGE values. The standard syntax for the ALTER TABLE command follows:

ALTER TABLE TABLE NAME [MODIFY] [COLUMN COLUMN_NAME][DATATYPE|NULL NOT NULL]

[RESTRICT|CASCADE]

                       [DROP]   [CONSTRAINT CONSTRAINT_NAME]

                       [ADD]    [COLUMN] COLUMN DEFINITION

Modifying Elements of a Table

The attributes of a column refer to the rules and behavior of data in a column. You can modify the attributes of a column with the ALTER TABLE command. The word attributes here refers to the following:

· The data type of a column

· The length, precision, or scale of a column

· Whether the column can contain NULL values

The following example uses the ALTER TABLE command on EMPLOYEE_TBL to modify the attributes of the column EMP_ID:

ALTER TABLE EMPLOYEE_TBL MODIFY (EMP_ID VARCHAR2(10));

Table altered.

The column was already defined as data type VARCHAR2 (a varying-length character), but you increased the maximum length from 9 to 10.

Adding Mandatory Columns to a Table

One of the basic rules for adding columns to an existing table is that the column you are adding cannot be defined as NOT NULL if data currently exists in the table. NOT NULL means that a column must contain some value for every row of data in the table, so if you are adding a column defined as NOT NULL, you are contradicting the NOT NULL constraint right off the bat if the preexisting rows of data in the table do not have values for the new column.

There is, however, a way to add a mandatory column to a table:

1. Add the column and define it as NULL (the column does not have to contain a value).

2. Insert a value into the new column for every row of data in the table.

3. After ensuring that the column contains a value for every row of data in the table, you can alter the table to change the column's attribute to NOT NULL.

Modifying Columns

There are many things to take into consideration when modifying existing columns of a table .

Common rules for modifying columns:

· The length of a column can be increased to the maximum length of the given data type.

· The length of a column can be decreased only if the largest value for that column in the table is less than or equal to the new length of the column.

· The number of digits for a number data type can always be increased.

· The number of digits for a number data type can be decreased only if the value with the most number of digits for that column is less than or equal to the new number of digits specified for the column.

· The number of decimal places for a number data type can either be increased or decreased.

· The data type of a column can normally be changed.

Some implementations may actually restrict you from using certain ALTER TABLE options. For example, you may not be allowed to drop columns from a table . To do this, you would have to drop the table itself, and then rebuild the table with the desired columns. You could run into problems by dropping a column in one table that is dependent on a column in another table, or a column that is referenced by a column in another table. Be sure to refer to your specific implementation documentation .

Creating a Table from an Existing Table

A copy of an existing table can be created using a combination of the CREATE TABLE statement and the SELECT statement. The new table has the same column definitions. All columns or specific columns can be selected. New columns that are created via functions or a combination of columns automatically assume the size necessary to hold the data. The basic syntax for creating a table from another table is as follows:

CREATE TABLE NEW_TABLE_NAME AS

SELECT [ *|COLUMN1, COLUMN2 ]

FROM TABLE NAME

[ WHERE ]

Notice some new keywords in the syntax, particularly the SELECT keyword. SELECT is a database query, and is discussed in more detail later. However, it is important to know that you can create a table based on the results from a query.

First, do a simple query to view the data in the PRODUCTS_TBL table .

Select * From Products_TBL;

PROD_ID    PROD_DESC                      COST

---------- ----------------------------- ------

11235      WITCHES COSTUME               29.99

222        PLASTIC PUMPKIN 18 INCH        7.75

13         FALSE PARAFFIN TEETH           1.1

90         LIGHTED LANTERNS              14.5

15         ASSORTED COSTUMES             10

9          CANDY CORN                     1.35

6          PUMPKIN CANDY                  1.45

87         PLASTIC SPIDERS                1.05

119        ASSORTED MASKS                 4.95

Note

SELECT * selects data from all fields in the given table. The * represents a complete row of data, or record, in the table.

Next, create a table called PRODUCTS_TMP based on the previous query :

create table products_tmp as

select * from products_tbl;

Table created.

Now, if you run a query on the PRODUCTS_TMP table, your results appear the same as if you had selected data from the original table .

SELECT *

FROM PRODUCTS_TMP;

PROD_ID    PROD_DESC                      COST

---------- ----------------------------- ------

11235      WITCHES COSTUME               29.99

222        PLASTIC PUMPKIN 18 INCH        7.75

13         FALSE PARAFFIN TEETH           1.1

90         LIGHTED LANTERNS              14.5

15         ASSORTED COSTUMES             10

9          CANDY CORN                     1.35

6          PUMPKIN CANDY                  1.45

87         PLASTIC SPIDERS                1.05

119        ASSORTED MASKS                 4.95

Note

When creating a table from an existing table, the new table takes on the same STORAGE attributes as the original table .

Dropping Tables

Dropping a table is actually one of the easiest things to do. When the RESTRICT option is used and the table is referenced by a view or constraint, the DROP statement returns an error. When the CASCADE option is used, the drop succeeds and all referencing views and constraints are dropped. The syntax to drop a table follows :

DROP TABLE TABLE_NAME [ RESTRICT|CASCADE ]

In the following example, you drop the table that you just created:

DROP TABLE PRODUCTS_USER1.TMP;

Table dropped.

Caution

Whenever dropping a table, be sure to specify the schema name or owner of the table before submitting your command. You could drop the incorrect table. If you have access to multiple user accounts, ensure that you are connected to the database through the correct user account before dropping tables.

Integrity Constraints

Integrity constraints are used to ensure accuracy and consistency of data in a relational database. Data integrity is handled in a relational database through the concept of referential integrity. There are many types of integrity constraints that play a role in referential integrity (RI).

Primary Key Constraints

Primary key is the term used to identify one or more columns in a table that make a row of data unique. Although the primary key typically consists of one column in a table, more than one column can comprise the primary key. For example, either the employee's Social Security number or an assigned employee identification number is the logical primary key for an employee table. The objective is for every record to have a unique primary key or value for the employee's identification number. Because there is probably no need to have more than one record for each employee in an employee table, the employee identification number makes a logical primary key. The primary key is assigned at table creation.

The following example identifies the EMP_ID column as the PRIMARY KEY for the EMPLOYEES table :

CREATE TABLE EMPLOYEE_TBL

(EMP_ID        CHAR(9)        NOT NULL PRIMARY KEY,

EMP_NAME       VARCHAR2(40)   NOT NULL,

EMP_ST_ADDR    VARCHAR2(20)   NOT NULL,

EMP_CITY       VARCHAR2(15)   NOT NULL,

EMP_ST         CHAR(2)        NOT NULL,

EMP_ZIP        NUMBER(5)      NOT NULL,

EMP_PHONE      NUMBER(10)     NULL,

EMP_PAGER      NUMBER(10)     NULL);

This method of defining a primary key is accomplished during table creation. The primary key in this case is an implied constraint. You can also specify a primary key explicitly as a constraint when setting up a table, as follows :

CREATE TABLE EMPLOYEE_TBL

(EMP_ID        CHAR(9)        NOT NULL,

EMP_NAME       VARCHAR2(40)   NOT NULL,

EMP_ST_ADDR    VARCHAR2(20)   NOT NULL,

EMP_CITY       VARCHAR2(15)   NOT NULL,

EMP_ST         CHAR(2)        NOT NULL,

EMP_ZIP        NUMBER(5)      NOT NULL,

EMP_PHONE      NUMBER(10)     NULL,

EMP_PAGER      NUMBER(10)     NULL,

PRIMARY KEY (EMP_ID));

The primary key constraint in this example is defined after the column comma list in the CREATE TABLE statement.

A primary key that consists of more than one column can be defined by either of the following methods :

CREATE TABLE PRODUCTS

(PROD_ID       VARCHAR2(10)     NOT NULL,

 VEND_ID       VARCHAR2(10)     NOT NULL,

 PRODUCT       VARCHAR2(30)     NOT NULL,

 COST              NUMBER(8,2)  NOT NULL,

PRIMARY KEY (PROD_ID, VEND_ID));

ALTER TABLE PRODUCTS

ADD CONSTRAINT PRODUCTS_PK PRIMARY KEY (PROD_ID, VEND_ID);

Unique Constraints

A unique column constraint in a table is similar to a primary key in that the value in that column for every row of data in the table must have a unique value. While a primary key constraint is placed on one column, you can place a unique constraint on another column even though it is not actually for use as the primary key.

Study the following example:

CREATE TABLE EMPLOYEE_TBL

(EMP_ID        CHAR(9)        NOT NULL     PRIMARY KEY,

EMP_NAME       VARCHAR2(40)   NOT NULL,

EMP_ST_ADDR    VARCHAR2(20)   NOT NULL,

EMP_CITY       VARCHAR2(15)   NOT NULL,

EMP_ST         CHAR(2)        NOT NULL,

EMP_ZIP        NUMBER(5)      NOT NULL,

EMP_PHONE      NUMBER(10)     NULL         UNIQUE,

EMP_PAGER      NUMBER(10)     NULL);

The primary key in this example is EMP_ID, meaning that the employee identification number is the column that is used to ensure that every record in the table is unique. The primary key is a column that is normally referenced in queries, particularly to join tables. The column EMP_PHONE has been designated as a UNIQUE value, meaning that no two employees can have the same telephone number. There is not a lot of difference between the two, except that the primary key is used to provide an order to data in a table and, in the same respect, join related tables .

Foreign Key Constraints

A foreign key is a column in a child table that references a primary key in the parent table. A foreign key constraint is the main mechanism used to enforce referential integrity between tables in a relational database. A column defined as a foreign key is used to reference a column defined as a primary key in another table.

Study the creation of the foreign key in the following example:

CREATE TABLE EMPLOYEE_PAY_TBL

(EMP_ID           CHAR(9)        NOT NULL,

POSITION           VARCHAR2(15)   NOT NULL,

DATE_HIRE          DATE           NULL,

PAY_RATE           NUMBER(4,2)    NOT NULL,

DATE_LAST_RAISE    DATE           NULL,

CONSTRAINT EMP_ID_FK FOREIGN KEY (EMP_ID) REFERENCES EMPLOYEE_TBL (EMP_ID));

The EMP_ID column in this example has been designated as the foreign key for the EMPLOYEE_PAY_TBL table. This foreign key, as you can see, references the EMP_ID column in the EMPLOYEE_TBL table. This foreign key ensures that for every EMP_ID in the EMPLOYEE_PAY_TBL, there is a corresponding EMP_ID in the EMPLOYEE_TBL. This is called a parent/child relationship. The parent table is the EMPLOYEE_TBL table, and the child table is the EMPLOYEE_PAY_TBL table.

In this figure, the EMP_ID column in the child table references the EMP_ID column in the parent table. In order for a value to be inserted for EMP_ID in the child table, there must first exist a value for EMP_ID in the parent table. Likewise, for a value to be removed for EMP_ID in the parent table, all corresponding values for EMP_ID must first be removed from the child table. This is how referential integrity works. End of the Block Missing.

A foreign key can be added to a table using the ALTER TABLE command, as shown in the following example:

ALTER TABLE EMPLOYEE_PAY_TBL

ADD CONSTRAINT ID_FK FOREIGN KEY (EMP_ID)

REFERENCES EMPLOYEE TBL (EMP_ID);

Note

The options available with the ALTER TABLE command differ among different implementations of SQL, particularly when dealing with constraints. In addition, the actual use and definitions of constraints also vary, but the concept of referential integrity should be the same with all relational databases.

`NOT NULL` Constraints

Previous examples use the keywords NULL and NOT NULL listed on the same line as each column and after the data type. NOT NULL is a constraint that you can place on a table's column. This constraint disallows the entrance of NULL values into a column; in other words, data is required in a NOT NULL column for each row of data in the table. NULL is generally the default for a column if NOT NULL is not specified, allowing NULL values in a column .

Using Check (`CHK`) Constraints

Check constraints can be utilized to check the validity of data entered into particular table columns. Check constraints are used to provide back-end database edits, although edits are commonly found in the front-end application as well. General edits restrict values that can be entered into columns or objects, whether within the database itself or on a front-end application. The check constraint is a way of providing another protective layer for the data .

The following example illustrates the use of a check constraint:

CREATE TABLE EMPLOYEE_TBL

(EMP_ID        CHAR(9)        NOT NULL,

EMP_NAME       VARCHAR2(40)   NOT NULL,

EMP_ST_ADDR    VARCHAR2(20)   NOT NULL,

EMP_CITY       VARCHAR2(15)   NOT NULL,

EMP_ST         CHAR(2)        NOT NULL,

EMP_ZIP        NUMBER(5)      NOT NULL,

EMP_PHONE      NUMBER(10)     NULL,

EMP_PAGER      NUMBER(10)     NULL),

PRIMARY KEY (EMP_ID),

CONSTRAINT CHK_EMP_ZIP CHECK ( EMP_ZIP = '46234');

The check constraint in this table has been placed on the EMP_ZIP column, ensuring that all employees entered into this table have a ZIP code of '46234'. Perhaps that is a little restricting. Nevertheless, you can see how it works.

If you wanted to use a check constraint to verify that the ZIP code is within a list of values, your constraint definition could look like the following :

CONSTRAINT CHK_EMP_ZIP CHECK ( EMP_ZIP in ('46234','46227','46745') );

If there is a minimum pay rate that can be designated for an employee, you could have a constraint that looks like the following:

CREATE TABLE EMPLOYEE_PAY_TBL

(EMP_ID            CHAR(9)        NOT NULL,

POSITION           VARCHAR2(15)   NOT NULL,

DATE_HIRE          DATE           NULL,

PAY_RATE           NUMBER(4,2)    NOT NULL,

DATE_LAST_RAISE    DATE           NULL,

CONSTRAINT  EMP_ID_FK FOREIGN KEY (EMP_ID) REFERENCES EMPLOYEE_TBL (EMP_ID),

CONSTRAINT CHK_PAY CHECK ( PAY_RATE > 12.50 ) );

In this example, any employee entered in this table must be paid more than $12.50 an hour. You can use just about any condition in a check constraint, as you can with an SQL query. You will learn more about these conditions later.

Dropping Constraints

Any constraint that you have defined can be dropped using the ALTER TABLE command with the DROP CONSTRAINT option. For example, to drop the primary key constraint in the EMPLOYEES table, you can use the following command:

ALTER TABLE EMPLOYEES DROP CONSTRAINT EMPLOYEES_PK;

Table altered.

Some implementations may provide shortcuts for dropping certain constraints. For example, to drop the primary key constraint for a table in Oracle, you can use the following command :

ALTER TABLE EMPLOYEES DROP PRIMARY KEY;

Table altered.

Note

Some implementations allow you to disable constraints. Instead of permanently dropping a constraint from the database, you may want to temporarily disable the constraint, and then enable it later .

Overview of Data Manipulation

Data Manipulation Language (DML) is the part of SQL that allows a database user to actually propagate changes among data in a relational database. With DML, the user can populate tables with new data, update existing data in tables, and delete data from tables. Simple database queries can also be performed within a DML command.

There are three basic DML commands in SQL:

INSERT

UPDATE

DELETE

Populating Tables with New Data

Populating a table with data is simply the process of entering new data into a table, whether through a manual process using individual commands or through batch processes using programs or other related software .

Many factors can affect what data and how much data can be put into a table when populating tables with data. Some major factors include existing table constraints, the physical table size, column data types, the length of columns, and other integrity constraints, such as primary and foreign keys. The following sections help you learn the basics of inserting new data into a table, in addition to offering some Dos and Don'ts.

Note

Do not forget that SQL statements can be in upper- or lowercase. The data, depending on how it is stored in the database, is not case-sensitive. These examples use both lower- and uppercases just to show that it does not affect the outcome.

Inserting Data into a Table

Use the INSERT statement to insert new data into a table. There are a few options with the INSERT statement; look at the following basic syntax to begin :

insert into schema.table_name

VALUES ('value1', 'value2', [ NULL ] );

Using this INSERT statement syntax, you must include every column in the specified table in the VALUES list. Notice that each value in this list is separated by a comma. The values inserted into the table must be enclosed by quotation marks for character and date data types. Quotation marks are not required for numeric data types or NULL values using the NULL keyword. A value should be present for each column in the table .

In the following example, you insert a new record into the PRODUCTS_TBL table.

Table structure:

products_tbl

COLUMN Name                     Null?    DATA Type

------------------------------ -------- -------------

PROD_ID                         NOT NULL VARCHAR2(10)

PROD_DESC                       NOT NULL VARCHAR2(25)

COST                            NOT NULL NUMBER(6,2)

Sample INSERT statement :

INSERT INTO PRODUCTS_TBL

VALUES ('7725','LEATHER GLOVES',24.99);

001 1 row created.

In this example, you insert three values into a table with three columns. The inserted values are in the same order as the columns listed in the table. The first two values are inserted using quotation marks, because the data types of the corresponding columns are of character type. The third value's associated column, COST, is a numeric data type and does not require quotation marks, although they can be used .

Note

The schema name, or table owner, has not been specified as part of the table name, as it was shown in the syntax. The schema name is not required if you are connected to the database as the user who owns the table.

Inserting Data into Limited Columns of a Table

There is a way you can insert data into a table's limited columns. For instance, suppose you want to insert all values for an employee except a pager number. You must, in this case, specify a column list as well as a VALUES list in your INSERT statement.

INSERT INTO EMPLOYEE_TBL

(EMP_ID, LAST_NAME, FIRST_NAME, MIDDLE_NAME, ADDRESS, CITY, STATE, ZIP,

PHONE)

VALUES

('123456789', 'SMITH', 'JOHN', 'JAY', '12 BEACON CT',

'INDIANAPOLIS', 'IN', '46222', '3172996868');

001 1 row created.

The syntax for inserting values into a limited number of columns in a table is as follows :

INSERT INTO SCHEMA TABLE_NAME ('COLUMN1', 'COLUMN2')

VALUES ('VALUE1', 'VALUE2');

You use ORDERS_TBL and insert values into only specified columns in the following example.

Table structure :

ORDERS_TBL

COLUMN NAME                     Null?    DATA TYPE

------------------------------ --------- ------------

ORD_NUM                         NOT NULL VARCHAR2(10)

CUST_ID                         NOT NULL VARCHAR2(10)

PROD_ID                         NOT NULL VARCHAR2(10)

QTY                             NOT NULL NUMBER(4)

ORD_DATE                                 DATE

Sample INSERT statement:

insert into orders_tbl (ord_num,cust_id,prod_id,qty)

values ('23A16','109','7725',2);

001 1 row created.

You have specified a column list enclosed by parentheses after the table name in the INSERT statement . You have listed all columns into which you want to insert data. ORD_DATE is the only excluded column. You can see, if you look at the table definition, that ORD_DATE does not require data for every record in the table. You know that ORD_DATE does not require data because NOT NULL is not specified in the table definition. NOT NULL tells us that NULL values are not allowed in the column. Furthermore, the list of values must appear in the order in which you want to insert them according to the column list .

Note

The column list in the INSERT statement does not have to reflect the same order of columns as in the definition of the associated table, but the list of values must be in the order of the associated columns in the column list.

Inserting Data from Another Table

You can insert data into a table based on the results of a query from another table using a combination of the INSERT statement and the SELECT statement. Briefly, a query is an inquiry to the database that expects data to be returned. A query is a question that the user asks the database, and the data returned is the answer. In the case of combining the INSERT statement with the SELECT statement, you are able to insert the data retrieved from a query into a table.

The syntax for inserting data from another table is

insert into schema.table_name [('column1', 'column2')]

select [*|('column1', 'column2')]

from table_name

[where condition(s)];

You see three new keywords in this syntax, which are covered here briefly. These keywords are SELECT, FROM, and WHERE. SELECT is the main command used to initiate a query in SQL. FROM is a clause in the query that specifies the names of tables in which the target data should be found. The WHERE clause, also part of the query, is used to place conditions on the query itself. An example condition may state: WHERE NAME = 'SMITH'. A condition is a way of placing criteria on data affected by a SQL statement.

The following example uses a simple query to view all data in the PRODUCTS_TBL table. SELECT * tells the database server that you want information on all columns of the table. Because there is no WHERE clause, you want to see all records in the table as well .

select * from products_tbl;

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ -----

11235      WITCHES COSTUME                29.99

222        PLASTIC PUMPKIN 18 INCH         7.75

13         FALSE PARAFFIN TEETH            1.1

90         LIGHTED LANTERNS               14.5

15         ASSORTED COSTUMES              10

9          CANDY CORN                      1.35

6          PUMPKIN CANDY                   1.45

87         PLASTIC SPIDERS                 1.05

119        ASSORTED MASKS                  4.95

1234       KEY CHAIN                       5.95

2345       OAK BOOKSHELF                  59.99

11 rows selected.

Now, insert values into the PRODUCTS_TMP table based on the preceding query. You can see that 11 rows are created in the temporary table .

INSERT INTO PRODUCTS_TMP

SELECT * FROM PRODUCTS_TBL;

001 11 rows created.

The following query shows all data in the PRODUCTS_TMP table that you just inserted :

SELECT * FROM PRODUCTS_TMP;

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ -----

11235      WITCHES COSTUME                29.99

222        PLASTIC PUMPKIN 18 INCH         7.75

13         FALSE PARAFFIN TEETH            1.1

90         LIGHTED LANTERNS               14.5

15         ASSORTED COSTUMES              10

9          CANDY CORN                      1.35

6          PUMPKIN CANDY                   1.45

87         PLASTIC SPIDERS                 1.05

119        ASSORTED MASKS                  4.95

1234       KEY CHAIN                       5.95

2345       OAK BOOKSHELF                  59.99

11 rows selected.

Inserting `NULL` Values

Inserting a NULL value into a column of a table is a simple matter. You might want to insert a NULL value into a column if the value of the column in question is unknown. For instance, not every person carries a pager, so it would be inaccurate to enter an erroneous pager number—not to mention, you would not be budgeting space. A NULL value can be inserted into a column of a table using the keyword NULL.

The syntax for inserting a NULL value follows:

insert into schema.table_name values

('column1', NULL, 'column3');

The NULL keyword should be used in the associated column that exists in the table. That column will not have data in it for that row if you enter NULL. In the syntax, a NULL value is being entered in the place of COLUMN2.

Study the two following examples:

INSERT INTO ORDERS_TBL (ORD_NUM,CUST_ID,PROD_ID,QTY,ORD_DATE)

VALUES ('23A16','109','7725',2,NULL);

001 1 row created.

In the first example, all columns in which to insert values are listed, which also happen to be every column in the ORDERS_TBL table. You insert a NULL value for the ORD_DATE column, meaning that you either do not know the order date, or there is no order date at this time.

INSERT INTO ORDERS_TBL

VALUES ('23A16','109','7725',2, '');

001 1 row created.

There are two differences from the first statement in the second example, but the results are the same. First, there is not a column list. Remember that a column list is not required if you are inserting data into all columns of a table. Second, instead of inserting the value NULL into the ORD_DATE column, you insert '' (two single quotation marks together), which also symbolizes a NULL value (because there is nothing between them) .

Updating Existing Data

Pre-existing data in a table can be modified using the UPDATE command. The UPDATE command does not add new records to a table, nor does it remove records—it simply updates existing data. The update is generally used to update one table at a time in a database, but can be used to update multiple columns of a table at the same time. An individual row of data in a table can be updated, or numerous rows of data can be updated in a single statement, depending on what's needed.

Updating the Value of a Single Column

The most simple form of the UPDATE statement is its use to update a single column in a table. Either a single row of data or numerous records can be updated when updating a single column in a table.

The syntax for updating a single column follows:

update table_name

set column_name = 'value'

[where condition];

The following example updates the QTY column in the ORDERS table to the new value 1 for the ORD_NUM 23A16, which you have specified using the WHERE clause .

UPDATE ORDERS_TBL

SET QTY = 1

WHERE ORD_NUM = '23A16';

001 1 row updated.

The following example is identical to the previous example, except for the absence of the WHERE clause :

UPDATE ORDERS_TBL

SET QTY = 1;

001 11 rows updated.

Notice that in this example, 11 rows of data were updated. You set the QTY to 1, which updated the quantity column in the ORDERS_TBL table for all rows of data. Is this really what you wanted to do? Perhaps in some cases, but rarely will you issue an UPDATE statement without a WHERE clause .

Caution

Extreme caution must be used when using the UPDATE statement without a WHERE clause. The target column is updated for all rows of data in the table if conditions are not designated using the WHERE clause.

Updating Multiple Columns in One or More Records

Next, you see how to update multiple columns with a single UPDATE statement. Study the following syntax :

update table_name

set column1 = 'value',

   [column2 = 'value',]

   [column3 = 'value']

[where condition];

Notice the use of the SET in this syntax—there is only one SET, but multiple columns. Each column is separated by a comma. You should start to see a trend in SQL. The comma is usually used to separate different types of arguments in SQL statements .

UPDATE ORDERS_TBL

SET QTY = 1,

    CUST_ID = '221'

WHERE ORD_NUM = '23A16';

001 1 row updated.

A comma is used to separate the two columns being updated. Again, the WHERE clause is optional, but usually necessary.

Note

The SET keyword is used only once for each UPDATE statement. If more than one column is to be updated, a comma is used to separate the columns to be updated .

Deleting Data from Tables

The DELETE command is used to remove entire rows of data from a table. The DELETE command is not used to remove values from specific columns; a full record, including all columns, is removed. The DELETE statement must be used with caution—it works all too well. The next section discusses methods for deleting data from tables.

To delete a single record or selected records from a table, the DELETE statement must be used with the following syntax :

delete from schema.table_name

[where condition];

DELETE FROM ORDERS_TBL

WHERE ORD_NUM = '23A16';

001 1 row deleted.

Notice the use of the WHERE clause. The WHERE clause is an essential part of the DELETE statement if you are attempting to remove selected rows of data from a table. You rarely issue a DELETE statement without the use of the WHERE clause. If you do, your results are similar to the following example :

DELETE FROM ORDERS_TBL;

11 rows deleted.

Caution

If the WHERE clause is omitted from the DELETE statement, all rows of data are deleted from the table. As a general rule, always use a WHERE clause with the DELETE statement .

Note

The temporary table that was populated from the original table earlier in this lesson can be very useful for testing the DELETE and UPDATE commands before issuing them against the original table .

What Is a Transaction?

A transaction is a unit of work that is performed against a database. Transactions are units or sequences of work accomplished in a logical order, whether in a manual fashion by a user or automatically by some sort of a database program. In a relational database using SQL, transactions are accomplished using the DML commands (INSERT, UPDATE, and DELETE. ) A transaction is the propagation of one or more changes to the database. For instance, you are performing a transaction if you performed an UPDATE statement on a table to change an individual's name.

A transaction can either be one DML statement or a group of statements. When managing groups of transactions, each designated group of transactions must be successful as one entity or none of them will be successful.

The following list describes the nature of transactions:

· All transactions have a beginning and an end.

· A transaction can be saved or undone.

· If a transaction fails in the middle, no part of the transaction can be saved to the database.

Note

To start or execute transactions is implementation-specific. You must check your particular implementation for how to begin transactions. There is no explicit start or begin transaction in the ANSI standard . What Is Transactional Control?

Transactional control is the ability to manage various transactions that may occur within a relational database management system. When you speak of transactions, you are referring to the INSERT, UPDATE, and DELETE commands, which were covered during this lesson.

When a transaction is executed and completes successfully, the target table is not immediately changed, although it may appear so according to the output. When a transaction successfully completes, there are transactional control commands that are used to finalize the transaction, either saving the changes made by the transaction to the database or reversing the changes made by the transaction.

There are three commands used to control transactions :

· COMMIT

· ROLLBACK

· SAVEPOINT

Each of these is discussed in detail in the following sections.

Note

Transactional control commands are only used with the DML commands INSERT, UPDATE, and DELETE. For example, you do not issue a COMMIT statement after creating a table. When the table is created, it is automatically committed to the database. Likewise, you cannot issue a ROLLBACK to replenish a table that was just dropped.

When a transaction has completed, the transactional information is stored either in an allocated area or in a temporary rollback area in the database. All changes are held in this temporary rollback area until a transactional control command is issued. When a transactional control command is issued, changes are either made to the database or discarded; then, the temporary rollback area is emptied.

The `COMMIT` Command

The COMMIT command is the transactional command used to save changes invoked by a transaction to the database. The COMMIT command saves all transactions to the database since the last COMMIT or ROLLBACK command .

The syntax for this command is

commit [ work ];

The keyword COMMIT is the only mandatory part of the syntax, along with the character or command used to terminate a statement according to each implementation. WORK is a keyword that is completely optional; its only purpose is to make the command more user-friendly.

In the following example, you begin by selecting all data from the PRODUCT_TMP table :

SELECT * FROM PRODUCTS_TMP;

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

11235      WITCHES COSTUME                29.99

222        PLASTIC PUMPKIN 18 INCH         7.75

13         FALSE PARAFFIN TEETH             1.1

90         LIGHTED LANTERNS                14.5

15         ASSORTED COSTUMES                 10

9          CANDY CORN                      1.35

6          PUMPKIN CANDY                   1.45

87         PLASTIC SPIDERS                 1.05

119        ASSORTED MASKS                  4.95

1234       KEY CHAIN                       5.95

2345       OAK BOOKSHELF                  59.99

11 rows selected.

Next, you delete all records from the table where the product cost is less than $14.00 .

DELETE FROM PRODUCTS_TMP

WHERE COST < 14;

001 8 rows deleted.

A COMMIT statement is issued to save the changes to the database, completing the transaction .

COMMIT;

Commit complete.

Caution

Frequent COMMITs in large loads or unloads of the database are highly recommended; however, too many COMMITs cause the job running to take a lot of extra time to complete. Remember that all changes are sent to the temporary rollback area first. If this temporary rollback area runs out of space and cannot store information about changes made to the database, the database will probably halt, disallowing further transactional activity .

Note

In some implementations, transactions are committed without issuing the COMMIT command—instead, merely signing out of the database causes a commit to occur.

The `ROLLBACK` Command

The ROLLBACK command is the transactional control command used to undo transactions that have not already been saved to the database. The ROLLBACK command can only be used to undo transactions since the last COMMIT or ROLLBACK command was issued .

The syntax for the ROLLBACK command is as follows:

rollback [ work ];

Once again, as in the COMMIT statement, the WORK keyword is an optional part of the ROLLBACK syntax.

In the following example, you begin by selecting all records from the PRODUCTS_TMP table since the previous deletion of 14 records:

SELECT * FROM PRODUCTS_TMP;

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

11235      WITCHES COSTUME                29.99

90         LIGHTED LANTERNS                14.5

2345       OAK BOOKSHELF                  59.99

3 rows selected.

Next, you update the table, changing the product cost to $39.99 for the product identification number 11235:

UPDATE PRODUCT_TMP;

SET COST = 39.99

WHERE PROD_ID = '11235';

001 1 row updated.

If you perform a quick query on the table, the change appears to have occurred :

SELECT * FROM PRODUCTS_TMP;

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

11235      WITCHES COSTUME                39.99

90         LIGHTED LANTERNS                14.5

2345       OAK BOOKSHELF                  59.99

3 rows selected.

Now, issue the ROLLBACK statement to undo the last change :

rollback;

Rollback complete.

Finally, verify that the change was not committed to the database :

SELECT * FROM PRODUCTS_TMP;

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

11235      WITCHES COSTUME                29.99

90         LIGHTED LANTERNS                14.5

2345       OAK BOOKSHELF                  59.99

3 rows selected

The `SAVEPOINT` Command

A SAVEPOINT is a point in a transaction when you can roll the transaction back to a certain point without rolling back the entire transaction .

The syntax for the SAVEPOINT command is

SAVEPOINT SAVEPOINT_NAME

This command serves only in the creation of a SAVEPOINT among transactional statements. The ROLLBACK command is used to undo a group of transactions. The SAVEPOINT is a way of managing transactions by breaking large numbers of transactions into smaller, more manageable groups.

Note

The SAVEPOINT name must be unique to the associated group of transactions. However, the SAVEPOINT can have the same name as a table or other object. Refer to specific implementation documentation for more details on naming conventions .

The `ROLLBACK TO SAVEPOINT` Command

The syntax for rolling back to a SAVEPOINT is as follows:

ROLLBACK TO SAVEPOINT_NAME;

In this example, you plan to delete the remaining three records from the PRODUCTS_TMP table. You want to create a SAVEPOINT before each delete, so that you can ROLLBACK to any SAVEPOINT at any time to return the appropriate data to its original state :

SAVEPOINT sp1;

Savepoint created.

DELETE FROM PRODUCTS_TMP WHERE PROD_ID = '11235';

001 1 row deleted.

SAVEPOINT sp2;

Savepoint created.

DELETE FROM PRODUCTS_TMP WHERE PROD_ID = '90';

001 1 row deleted.

SAVEPOINT sp3;

Savepoint created.

DELETE FROM PRODUCTS_TMP WHERE PROD_ID = '2345';

001 1 row deleted.

Now that the three deletions have taken place, say you have changed your mind and decided to ROLLBACK to the SAVEPOINT that you identified as SP2. Because SP2 was created after the first deletion, the last two deletions are undone :

ROLLBACK TO sp2;

Rollback complete.

Notice that only the first deletion took place since you rolled back to SP2:

SELECT * FROM PRODUCTS_TMP;

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ -----

90         LIGHTED LANTERNS                14.5

2345       OAK BOOKSHELF                  59.99

2 rows selected.

Remember, the ROLLBACK command by itself will roll back to the last COMMIT or ROLLBACK. You have not yet issued a COMMIT, so all deletions are undone, as in the following example :

ROLL BACK;

Rollback complete.

SELECT * FROM PRODUCTS_TMP;

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

11235      WITCHES COSTUME                29.99

90         LIGHTED LANTERNS                14.5

2345       OAK BOOKSHELF                  59.99

3 rows selected.

The `RELEASE SAVEPOINT` Command

The RELEASE SAVEPOINT command is used to remove a SAVEPOINT that you have created. Once a SAVEPOINT has been released, you can no longer use the ROLLBACK command to undo transactions performed since the SAVEPOINT.

RELEASE SAVEPOINT SAVEPOINT_NAME;

The `SET TRANSACTION` Command

The SET TRANSACTION command can be used to initiate a database transaction. This command is used to specify characteristics for the transaction that follows. For example, you can specify a transaction to be read only, or read write. For example ,

SET TRANSACTION READ WRITE;

SET TRANSACTION READ ONLY;

There are other characteristics that can be set for a transaction which are out of the scope of this book. For more information, see the documentation for your implementation of SQL.

Transactional Control and Database Performance

Poor transactional control can hurt database performance and even bring the database to a halt. Repeatedly poor database performance may be due to a lack of transactional control during large inserts, updates, or deletes. Not only are large batch processes, such as these, demanding on the CPU and memory themselves, but the temporary storage for rollback information continues to grow until either a COMMIT or ROLLBACK command is issued .

When a COMMIT is issued, rollback transactional information is written to the target table and the rollback information in temporary storage is cleared. When a ROLLBACK is issued, no changes are made to the database and the rollback information in the temporary storage is cleared. If neither a COMMIT or ROLLBACK is issued, the temporary storage for rollback information continues to grow until there is no more space left, thus forcing the database to stop all processes until space is freed.

What Is a Query?

A query is an inquiry into the database using the SELECT statement. A query is used to extract data from the database in a readable format according to the user's request. For instance, if you have an employee table, you might issue a SQL statement that returns the employee who is paid the most. This request to the database for usable employee information is a typical query that can be performed in a relational database.

Introduction to the `SELECT` Statement

The SELECT statement, the command that represents Data Query Language (DQL) in SQL, is the statement used to construct database queries. The SELECT statement is not a standalone statement, which means that clauses are required. In addition to the required clauses, there are optional clauses that increase the overall functionality of the SELECT statement. The SELECT statement is by far one of the most powerful statements in SQL. The FROM clause is the mandatory clause and must always be used in conjunction with the SELECT statement.

There are four keywords, or clauses, that are valuable parts of a SELECT statement. These keywords are as follows:

· SELECT

· FROM

· WHERE

· ORDER BY

Each of these keywords is covered in detail during the following sections.

The `SELECT` Statement

The SELECT statement is used in conjunction with the FROM clause to extract data from the database in an organized, readable format. The SELECT part of the query is for selecting the data you want to see according to the columns in which they are stored in a table .

The syntax for a simple SELECT statement is as follows:

SELECT [ * | ALL | DISTINCT COLUMN1, COLUMN2 ]

FROM TABLE1 [ , TABLE2 ];

The SELECT keyword in a query is followed by a list of columns that you want displayed as part of the query output. The FROM keyword is followed by a list of one or more tables from which you want to select data. The asterisk (*) is used to denote that all columns in a table should be displayed as part of the output. Check your particular implementation for its usage. The ALL option is used to display all values for a column, including duplicates. The DISTINCT option is used to eliminate duplicate rows. The default between DISTINCT and ALL is ALL, which does not have to be specified. Notice that the columns following the SELECT are separated by commas, as is the table list following the FROM.

Note

Commas are used to separate arguments in a list in SQL statements. Some common lists include lists of columns in a query, lists of tables to be selected from in a query, values to be inserted into a table, and values grouped as a condition in a query's WHERE clause .

Arguments are values that are either required or optional to the syntax of a SQL statement or command.

Explore the basic capabilities of the SELECT statement by studying the following examples. First, perform a simple query from the PRODUCTS_TBL table :

SELECT * FROM PRODUCTS_TBL;

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

11235      WITCHES COSTUME                29.99

222        PLASTIC PUMPKIN 18 INCH         7.75

13         FALSE PARAFFIN TEETH            1.1

90         LIGHTED LANTERNS               14.5

15         ASSORTED COSTUMES              10

9          CANDY CORN                      1.35

6          PUMPKIN CANDY                   1.45

87         PLASTIC SPIDERS                 1.05

119        ASSORTED MASKS                  4.95

1234       KEY CHAIN                       5.95

2345       OAK BOOKSHELF                  59.99

11 rows selected.

The asterisk represents all columns in the table, which, as you can see, are displayed in the form PROD_ID, PROD_DESC, and COST. Each column in the output is displayed in the order that it appears in the table. There are 11 records in this table, identified by the feedback 11 rows selected. This feedback differs among implementations; for example, another feedback for the same query would be 11 rows affected.

Now select data from another table, CANDY_TBL. Create this table in the image of the PRODUCTS_TBL table for the following examples. List the column name after the SELECT keyword to display only one column in the table :

SELECT PROD_DESC FROM CANDY_TBL;

PROD_DESC

------------------

CANDY CORN

CANDY CORN

HERSHEYS KISS

SMARTIES

4 rows selected.

Four records exist in the CANDY_TBL table. You have used the ALL option in the next statement to show you that the ALL is optional and redundant. There is never a need to specify ALL; it is a default option .

SELECT ALL PROD_DESC

FROM CANDY_TBL;

PROD_DESC

-------------------

CANDY CORN

CANDY CORN

HERSHEYS KISS

SMARTIES

4 rows selected.

The DISTINCT option is used in the following statement to suppress the display of duplicate records. Notice that the value CANDY CORN is only printed once in this example .

SELECT DISTINCT PROD_DESC

FROM CANDY_TBL;

PROD_DESC

------------------

CANDY CORN

HERSHEYS KISS

SMARTIES

3 rows selected.

DISTINCT and ALL can also be used with parentheses enclosing the associated column. The use of parentheses is often used in SQL—as well as many other languages—to improve readability .

SELECT DISTINCT(PROD_DESC)

FROM CANDY_TBL;

PROD_DESC

------------------

CANDY CORN

HERSHEYS KISS

SMARTIES

3 rows selected.

The `FROM` Clause

The FROM clause is always used in conjunction with the SELECT statement. It is a required element for any query. The FROM clause's purpose is to tell the database what table(s) to access to retrieve the desired data for the query. The FROM clause can contain one or more tables.

The syntax for the FROM clause is as follows:

FROM TABLE1 [ , TABLE2 ]

Using Conditions to Distinguish Data

A condition is part of a query that is used to display selective information as specified by the user. The value of a condition is either TRUE or FALSE, thereby limiting the data received from the query. The WHERE clause is used to place conditions on a query by eliminating rows that would normally be returned by a query without conditions.

There can be more than one condition in the WHERE clause. If there is more than one condition, they are connected by the AND and OR operators . As you also learn during the next lesson, there are several conditional operators that can be used to specify conditions in a query. This lesson only deals with a single condition for each query.

An operator is a character or keyword in SQL that is used to combine elements in a SQL statement.

The syntax for the WHERE clause is as follows :

SELECT [ ALL | * | DISTINCT COLUMN1, COLUMN2 ]

FROM TABLE1 [ , TABLE2 ]

WHERE [ CONDITION1 | EXPRESSION1 ]

[ AND CONDITION2 | EXPRESSION2 ]

The following is a simple SELECT without conditions specified by the WHERE clause :

SELECT *

FROM PRODUCTS_TBL;

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

11235      WITCHES COSTUME                29.99

222        PLASTIC PUMPKIN 18 INCH         7.75

13         FALSE PARAFFIN TEETH            1.1

90         LIGHTED LANTERNS               14.5

15         ASSORTED COSTUMES              10

9          CANDY CORN                      1.35

6          PUMPKIN CANDY                   1.45

87         PLASTIC SPIDERS                 1.05

119        ASSORTED MASKS                  4.95

1234       KEY CHAIN                       5.95

2345       OAK BOOKSHELF                  59.99

11 rows selected.

Now add a condition for the same query .

SELECT * FROM PRODUCTS_TBL

WHERE COST < 5;

PROD_ID    PROD_DESC                       COST

---------- ------------------------------- -----

13         FALSE PARAFFIN TEETH            1.1

9          CANDY CORN                      1.35

6          PUMPKIN CANDY                   1.45

87         PLASTIC SPIDERS                 1.05

119        ASSORTED MASKS                  4.95

5 rows selected.

The only records displayed are those that cost less than $5.

In the following query, you want to display the product description and cost that matches the product identification 119.

SELECT PROD_DESC, COST

FROM PRODUCTS_TBL

WHERE PROD_ID = '119';

PROD_DESC                       COST

------------------------------- -----

ASSORTED MASKS                  4.95

1 row selected.

Sorting Your Output

You usually want your output to have some kind of order. Data can be sorted by using the ORDER BY clause. The ORDER BY clause arranges the results of a query in a listing format you specify. The default ordering of the ORDER BY clause is an ascending order; the sort displays in the order A–Z if it's sorting output names alphabetically. A descending order for alphabetical output would be displayed in the order Z–A. Ascending order for output for numeric values between 1 and 9 would be displayed 1–9; descending order is displayed as 9–1.

The syntax for the ORDER BY is as follows :

SELECT [ ALL | * | DISTINCT COLUMN1, COLUMN2 ]

FROM TABLE1 [ , TABLE2 ]

WHERE [ CONDITION1 | EXPRESSION1 ]

[ AND CONDITION2 | EXPRESSION2 ]

ORDER BY COLUMN1|INTEGER [ ASC|DESC ]

Begin your exploration of the ORDER BY clause with an extension of one of the previous statements. Order by the product description in ascending order or alphabetical order. Note the use of the ASC option. ASC can be specified after any column in the ORDER BY clause .

SELECT PROD_DESC, PROD_ID, COST

FROM PRODUCTS_TBL

WHERE COST < 20

ORDER BY PROD_DESC DESC;

PROD_DESC                 PROD_ID          COST

------------------------- --------------- ------

ASSORTED COSTUMES         15              10

ASSORTED MASKS            119              4.95

CANDY CORN                9                1.35

FALSE PARAFFIN TEETH      13               1.1

LIGHTED LANTERNS          90              14.5

PLASTIC PUMPKIN 18 INCH   222              7.75

PLASTIC SPIDERS           87               1.05

PUMPKIN CANDY             6                1.45

8 rows selected.

Tip

Because ascending order for output is the default, ASC does not have to be specified .

You can use DESC, as in the following statement, if you want the same output to be sorted in reverse alphabetical order .

SELECT PROD_DESC, PROD_ID, COST

FROM PRODUCTS_TBL

WHERE COST < 20

ORDER BY PROD_DESC DESC;

PROD_DESC                 PROD_ID          COST

------------------------- --------------- ------

PUMPKIN CANDY             6                1.45

PLASTIC SPIDERS           87               1.05

PLASTIC PUMPKIN 18 INCH   222              7.75

LIGHTED LANTERNS          90              14.5

FALSE PARAFFIN TEETH      13               1.1

CANDY CORN                9                1.35

ASSORTED MASKS            119              4.95

ASSORTED COSTUMES         15              10

8 rows selected.

There are shortcuts in SQL. A column listed in the ORDER BY clause can be abbreviated with an integer. The INTEGER is a substitution for the actual column name, identifying the position of the column after the SELECT keyword .

An example of using an integer as an identifier in the ORDER BY clause follows:

SELECT PROD_DESC, PROD_ID, COST

FROM PRODUCTS_TBL

WHERE COST < 20

ORDER BY 1;

PROD_DESC                 PROD_ID          COST

------------------------- --------------- ------

ASSORTED COSTUMES         15              10

ASSORTED MASKS            119              4.95

CANDY CORN                9                1.35

FALSE PARAFFIN TEETH      13               1.1

LIGHTED LANTERNS          90              14.5

PLASTIC PUMPKIN 18 INCH   222              7.75

PLASTIC SPIDERS           87               1.05

PUMPKIN CANDY             6                1.45

8 rows selected.

In this query, the integer 1 represents the column PROD_DESC. The integer 2 represents the PROD_ID column, 3 represents the COST column, and so on.

You can order by multiple columns in a query, using either the column name itself or the associated number of the column in the SELECT:

ORDER BY 1,2,3

Columns in an ORDER BY clause are not required to appear in the same order as the associated columns following the SELECT, as shown by the following example :

ORDER BY 1,3,2

Case Sensitivity

Case sensitivity is a very important concept to understand when coding with SQL. Typically, SQL commands and keywords are not case-sensitive, which allows you to enter your commands and keywords in either upper- or lowercase—whatever you prefer. The case may be mixed (both upper- and lowercase for a single word or statement).

Case sensitivity is, however, a factor when dealing with data in SQL. In most situations, data seems to be stored exclusively in uppercase in a relational database to provide data consistency.

For instance, your data would not be consistent if you arbitrarily entered your data using random case:

SMITH

Smith

smith

If the last name was stored as smith and you issued a query as follows, no rows would be returned.

SELECT *

FROM EMPLOYEE_TBL

WHERE LAST_NAME = 'SMITH';

Note

You must use the same case in your query as the data is stored when referencing data in the database. When entering data, consult the rules set forth by your company for the appropriate case to be used .

Examples of Simple Queries

This section provides several examples of queries based on the concepts that have been discussed. The lesson begins with the simplest query you can issue, and builds upon the initial query progressively. You use the EMPLOYEE_TBL table .

Selecting all records from a table and displaying all columns:

SELECT * FROM EMPLOYEE_TBL;

Selecting all records from a table and displaying a specified column:

SELECT EMP_ID

FROM EMPLOYEE_TBL;

Selecting all records from a table and displaying a specified column. You can enter code on one line or use a carriage return as desired :

SELECT EMP_ID FROM EMPLOYEE_TBL;

Selecting all records from a table and displaying multiple columns separated by commas:

SELECT EMP_ID, LAST_NAME

FROM EMPLOYEE_TBL;

Displaying data for a given condition :

SELECT EMP_ID, LAST_NAME

FROM EMPLOYEE_TBL

WHERE EMP_ID = '333333333';

Displaying data for a given condition and sorting the output:

SELECT EMP_ID, LAST_NAME

FROM EMPLOYEE_TBL

WHERE CITY = 'INDIANAPOLIS'

ORDER BY EMP_ID;

Displaying data for a given condition and sorting the output on multiple columns, one column sorted in reverse order :

SELECT EMP_ID, LAST_NAME

FROM EMPLOYEE_TBL

WHERE CITY = 'INDIANAPOLIS'

ORDER BY EMP_ID, LAST_NAME DESC;

Displaying data for a given condition and sorting the output using an integer in the place of the spelled-out column name:

SELECT EMP_ID, LAST_NAME

FROM EMPLOYEE_TBL

WHERE CITY = 'INDIANAPOLIS'

ORDER BY 1;

Displaying data for a given condition and sorting the output by multiple columns using integers, the order of the columns in the sort is different than their corresponding order after the SELECT keyword :

SELECT EMP_ID, LAST_NAME

FROM EMPLOYEE_TBL

WHERE CITY = 'INDIANAPOLIS'

ORDER BY 2, 1;

Note

When selecting all rows of data from a large table, the results could render a substantial amount of data returned .

Counting the Records in a Table

A simple query can be issued on a table to get a quick count on the number of records in the table or on the number of values for a column in the table. A count is accomplished by the function COUNT. Although functions are not discussed until later in this book, this function should be introduced here because it is often a part of one of the simplest queries that you can create .

The syntax of the COUNT function is as follows:

SELECT COUNT(*)

FROM TABLE_NAME;

The COUNT function is used with parentheses, which are used to enclose the target column to count or the asterisk to count all rows of data in the table.

Counting the number of records in the PRODUCTS_TBL table :

SELECT COUNT(*) FROM PRODUCTS_TBL;

COUNT(*)

----------

1 row selected.

Counting the number of values for PROD_ID in the PRODUCTS_TBL table :

SELECT COUNT(PROD_ID) FROM PRODUCTS_TBL;

COUNT(PROD_ID)

---------------

1 row selected.

Note

Counting the number of values for a column is the same as counting the number of records in a table, if the column being counted is NOT NULL (a required column).

Selecting Data from Another User's Table

Permission must be granted to a user to access another user's table. If no permission has been granted, access is not allowed by users that do not own the table. You can select data from another user's table after access has been granted (the GRANT command is used to select from another user's table. To access another user's table in a SELECT statement, you must precede the table name with the schema name or the username that owns the table, as in the following example:

SELECT EMP_ID

FROM SCHEMA.EMPLOYEE_TBL;

Note

If a synonym exists in the database for the table to which you desire access, you do not have to specify the schema name for the table. Synonyms are alternate names for tables.

Column Aliases

Column aliases are used to rename a table's columns for the purpose of a particular query. The PRODUCTS_TBL illustrates the use of column aliases.

SELECT COLUMN_NAME ALIAS_NAME

FROM TABLE_NAME;

The following example displays the product description twice, giving the second column an alias named PRODUCT. Notice the column headers in the output .

SELECT PROD_DESC,

       PROD_DESC PRODUCT

FROM PRODUCTS_TBL;

PROD_DESC                 PRODUCT

------------------------- ------------------------

WITCHES COSTUME           WITCHES COSTUME

PLASTIC PUMPKIN 18 INCH   PLASTIC PUMPKIN 18 INCH

FALSE PARAFFIN TEETH      FALSE PARAFFIN TEETH

LIGHTED LANTERNS          LIGHTED LANTERNS

ASSORTED COSTUMES         ASSORTED COSTUMES

CANDY CORN                CANDY CORN

PUMPKIN CANDY             PUMPKIN CANDY

PLASTIC SPIDERS           PLASTIC SPIDERS

ASSORTED MASKS            ASSORTED MASKS

1234                      KEY CHAIN

2345                      OAK BOOKSHELF

11 rows selected.

Column aliases can be used to customize names for column headers, and can also be used to reference a column with a shorter name in some SQL implementations .

Note

When a column is renamed in a SELECT statement, the name is not a permanent change. The change is for that particular SELECT statement .

What Is an Operator in SQL?

An operator is a reserved word or a character used primarily in an SQL statement's WHERE clause to perform operation(s), such as comparisons and arithmetic operations. Operators are used to specify conditions in an SQL statement and to serve as conjunctions for multiple conditions in a statement.

The operators discussed during this lesson are

· Comparison operators

· Logical operators

· Operators used to negate conditions

· Arithmetic operators

Comparison Operators

Comparison operators are used to test single values in an SQL statement. The comparison operators discussed consist of =, <>, <, and >.

These operators are used to test

· Equality

· Non-equality

· Less-than values

· Greater-than values

Examples and the meanings of comparison operators are covered in the following sections.

Equality

The equal operator compares single values to one another in an SQL statement. The equal sign (=) symbolizes equality. When testing for equality, the compared values must match exactly or no data is returned. If two values are equal during a comparison for equality, the returned value for the comparison is TRUE; the returned value is FALSE if equality is not found. This Boolean value (TRUE/FALSE) is used to determine whether data is returned according to the condition.

The = operator can be used by itself or combined with other operators. An example and the meaning of the equality operator follows:

Example	Meaning
`WHERE SALARY = '20000'`	Salary equals 20000

The following query returns all rows of data where the PROD_ID is equal to 2345:

SELECT *

FROM PRODUCTS_TBL

WHERE PROD_ID = '2345';

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

2345       OAK BOOKSHELF                  59.99

1 row selected.

Non-Equality

For every equality, there is a non-equality. In SQL, the operator used to measure non-equality is <> (the less-than sign combined with the greater-than sign). The condition returns TRUE if the condition finds non-equality; FALSE is returned if equality is found .

Note

Another option comparable to <> is !=. Many of the major implementations have adopted != to represent not-equal. Check your particular implementation for the usage.

Example	Meaning
`WHERE SALARY <> '20000'`	Salary does not equal 20000

SELECT *

FROM PRODUCTS_TBL

WHERE PROD_ID <> '2345';

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

11235     WITCHES COSTUME                29.99

222        PLASTIC PUMPKIN 18 INCH         7.75

13         FALSE PARAFFIN TEETH            1.1

90         LIGHTED LANTERNS               14.5

15         ASSORTED COSTUMES              10

9          CANDY CORN                      1.35

6          PUMPKIN CANDY                   1.45

87         PLASTIC SPIDERS                 1.05

119        ASSORTED MASKS                  4.95

1234       KEY CHAIN                       5.95

2345       OAK BOOKSHELF                  59.99

11 rows selected.

Less-Than, Greater-Than

The symbols < (less-than) and > (greater-than) can be used by themselves, or in combination with each other or other operators.

Example	Meaning
`WHERE SALARY <'20000'`	Salary is less than 20000
`WHERE SALARY > '20000'`	Salary is greater than 20000

In the first example, anything less-than and not equal to 20000 returns TRUE. Any value of 20000 or more returns FALSE. Greater-than works the opposite of less-than.

SELECT *

FROM PRODUCTS_TBL

WHERE COST > 20;

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

11235      WITCHES COSTUME                29.99

2345       OAK BOOKSHELF                  59.99

2 rows selected.

In the next example, notice that the value 24.99 was not included in the query's result set. The less-than operator is not inclusive.

SELECT *

FROM PRODUCTS_TBL

WHERE COST < 24.99;

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

222        PLASTIC PUMPKIN 18 INCH         7.75

13         FALSE PARAFFIN TEETH            1.1

90         LIGHTED LANTERNS               14.5

15         ASSORTED COSTUMES              10

9          CANDY CORN                      1.35

6          PUMPKIN CANDY                   1.45

87         PLASTIC SPIDERS                 1.05

119        ASSORTED MASKS                  4.95

1234       KEY CHAIN                       5.95

9 rows selected.

Combination Examples of Comparison Operators

The equal operator can be combined with the less-than and greater-than operators, as in the following examples):

Example	Meaning
`WHERE SALARY <= '20000'`	Salary less-than or equal-to
`WHERE SALARY >= '20000'`	Salary greater-than or equal-to

Less-than or equal-to 20000 includes 20000 and all values less than 20000. Any value in that range returns TRUE; any value greater than 20000 returns FALSE. Greater-than or equal-to also includes the value 20000 in this case and works the same as the less-than or equal-to .

SELECT *

FROM PRODUCTS_TBL

WHERE COST <= 24.99;

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

222        PLASTIC PUMPKIN 18 INCH         7.75

13         FALSE PARAFFIN TEETH            1.1

90         LIGHTED LANTERNS               14.5

15         ASSORTED COSTUMES              10

9          CANDY CORN                      1.35

6          PUMPKIN CANDY                   1.45

87         PLASTIC SPIDERS                 1.05

119        ASSORTED MASKS                  4.95

1234       KEY CHAIN                       5.95

9 rows selected.

Logical Operators

Logical operators are those operators that use SQL keywords to make comparisons instead of symbols. The logical operators covered in the following subsections are

· IS NULL

· BETWEEN

· IN

· LIKE

· EXISTS

· UNIQUE

· ALL and ANY

`IS NULL`

The NULL operator is used to compare a value with a NULL value. For example, you might look for employees who do not have a pager by searching for NULL values in the PAGER column of the EMPLOYEE_TBL table .

The following example shows comparing a value to a NULL value:

Example	Meaning
`WHERE SALARY IS NULL`	Salary has no value

The following example does not find a NULL value:

Example	Meaning
`WHERE SALARY = NULL`	Salary has a value containing the letters N-U-L-L

SELECT EMP_ID, LAST_NAME, FIRST_NAME, PAGER

FROM EMPLOYEE_TBL

WHERE PAGER IS NULL;

EMP_ID    LAST_NAM FIRST_NA PAGER

--------- -------- -------- -----

311549902 STEPHENS TINA

442346889 PLEW     LINDA

220984332 WALLACE  MARIAH

443679012 SPURGEON TIFFANY

4 rows selected.

Understand that the literal word "null" is different than a NULL value. Examine the following example:

SELECT EMP_ID, LAST_NAME, FIRST_NAME, PAGER

FROM EMPLOYEE_TBL

WHERE PAGER = NULL;

no rows selected.

`BETWEEN`

The BETWEEN operator is used to search for values that are within a set of values, given the minimum value and the maximum value. The minimum and maximum values are included as part of the conditional set .

Example	Meaning
`WHERE SALARY BETWEEN '20000' AND '30000'`	The salary must fall between 20000 and 30000, including the values `20000` and `30000`

SELECT *

FROM PRODUCTS_TBL

WHERE COST BETWEEN 5.95 AND 14.5;

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

222        PLASTIC PUMPKIN 18 INCH         7.75

90         LIGHTED LANTERNS               14.5

15         ASSORTED COSTUMES              10

1234       KEY CHAIN                       5.95

4 rows selected.

Notice that the values 5.95 and 14.5 are included in the output.

Note

BETWEEN is inclusive and therefore includes the minimum and maximum values in the query results .

`IN`

The IN operator is used to compare a value to a list of literal values that have been specified. For TRUE to be returned, the compared value must match at least one of the values in the list .

Examples	Meaning
`WHERE SALARY IN ('20000', '30000', '40000')`	The salary must match one of the values `20000, 30000,` or `40000`

SELECT *

FROM PRODUCTS_TBL

WHERE PROD_ID IN ('13','9','87','119');

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

119        ASSORTED MASKS                  4.95

87         PLASTIC SPIDERS                 1.05

9          CANDY CORN                       1.35

13         FALSE PARAFFIN TEETH           1.1

4 rows selected.

Using the IN operator can achieve the same results as using the OR operator and can return the results more quickly .

`LIKE`

The LIKE operator is used to compare a value to similar values using wildcard operators. There are two wildcards used in conjunction with the LIKE operator :

· The percent sign (%)

· The underscore (_)

The percent sign represents zero, one, or multiple characters. The underscore represents a single number or character. The symbols can be used in combinations.

Examples are

`WHERE SALARY LIKE '200%'`	Finds any values that start with `200`
`WHERE SALARY LIKE '%200%'`	Finds any values that have `200` in any position
`WHERE SALARY LIKE '_00%'`	Finds any values that have `00` in the second and third positions
`WHERE SALARY LIKE '2_%_%'`	Finds any values that start with `2` and are at least 3 characters in length
`WHERE SALARY LIKE '%2'`	Finds any values that end with `2`
`WHERE SALARY LIKE '_2%3'`	Finds any values that have a `2` in the second position and end with a `3`
`WHERE SALARY LIKE '2___3'`	Finds any values in a five-digit number that start with `2` and end with `3`

The following example shows all product descriptions that end with the letter S:

SELECT PROD_DESC

FROM PRODUCTS_TBL

WHERE PROD_DESC LIKE '%S';

PROD_DESC

------------------

LIGHTED LANTERNS

ASSORTED COSTUMES

PLASTIC SPIDERS

ASSORTED MASKS

4 rows selected.

The following example shows all product descriptions whose second character is the letter S :

SELECT PROD_DESC

FROM PRODUCTS_TBL

WHERE PROD_DESC LIKE '_S%';

PROD_DESC

------------------

ASSORTED COSTUMES

ASSORTED MASKS

2 rows selected.

`EXISTS`

The EXISTS operator is used to search for the presence of a row in a specified table that meets certain criteria .

Example	Meaning
`WHERE EXISTS (SELECT EMP_ID FROM EMPLOYEE_TBL WHERE EMPLOYEE_ID = '333333333')`	Searching to see whether the `EMP_ID 3333333333` is in the `EMPLOYEE_TBL`

The following example is a form of a sub query.

SELECT COST

FROM PRODUCTS_TBL

WHERE EXISTS ( SELECT COST

               FROM PRODUCTS_TBL

               WHERE COST > 100 );

No rows selected.

----------

There were no rows selected because no records existed where the cost was greater than 100 .

Consider the following example:

SELECT COST

FROM PRODUCTS_TBL

WHERE EXISTS ( SELECT COST

               FROM PRODUCTS_TBL

               WHERE COST < 100 );

COST

----------

     29.99

      7.75

1.1

      14.5

      1.35

      1.45

      1.05

      4.95

      5.95

     59.99

11 rows selected.

The cost was displayed for records in the table because records existed where the product cost was less than 100 .

`UNIQUE`

The UNIQUE operator searches every row of a specified table for uniqueness (no duplicates).

Example	Meaning
`WHERE UNIQUE (SELECT SALARY FROM EMPLOYEE_TBL WHERE EMPLOYEE_ID = '333333333')`	Testing `salary` to see whether there are duplicates

`ALL` and `ANY` OPERATORS

The ALL operator is used to compare a value to all values in another value set .

Example	Meaning
`WHERE SALARY > ALL (SELECT SALARY FROM EMPLOYEE TBL WHERE CITY =' INDIANAPOLIS')`	Testing `salary` to see whether it is greater than all salaries of the employees living in Indianapolis

SELECT *

FROM PRODUCTS_TBL

WHERE COST > ALL ( SELECT COST

                   FROM PRODUCTS_TBL

                   WHERE COST < 10 );

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

11235      WITCHES COSTUME                29.99

90         LIGHTED LANTERNS               14.5

15         ASSORTED COSTUMES              10

2345       OAK BOOKSHELF                  59.99

4 rows selected.

In this output, there were five records that had a cost greater than the cost of all records having a cost less than 10 .

The ANY operator is used to compare a value to any applicable value in the list according to the condition .

Example	Meaning
`WHERE SALARY > ANY (SELECT SALARY FROM EMPLOYEE_TBL WHERE CITY = 'INDIANAPOLIS')`	Testing `salary` to see whether it is greater than any of the salaries of employees living in Indianapolis

SELECT *

FROM PRODUCTS_TBL

WHERE COST > ANY ( SELECT COST

                   FROM PRODUCTS_TBL

                   WHERE COST < 10 );

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

11235      WITCHES COSTUME                29.99

222        PLASTIC PUMPKIN 18 INCH         7.75

13         FALSE PARAFFIN TEETH            1.1

90         LIGHTED LANTERNS               14.5

15         ASSORTED COSTUMES              10

9          CANDY CORN                      1.35

6          PUMPKIN CANDY                   1.45

119        ASSORTED MASKS                  4.95

1234       KEY CHAIN                       5.95

2345       OAK BOOKSHELF                  59.99

10 rows selected.

In this output, more records were returned than when using ALL, because the cost only had to be greater than any of the costs that were less than 10. The one record that was not displayed had a cost of 1.05, which was not greater than any of the values less than 10 (which was, in fact, 1.05).

Conjunctive Operators

What if you want to use multiple conditions to narrow data in an SQL statement? You must be able to combine the conditions, and you do this with what is call conjunctive operators. These operators are

· AND

· OR

These operators provide a means to make multiple comparisons with different operators in the same SQL statement. The following sections describe each operator's behavior.

`AND`

The AND operator allows the existence of multiple conditions in an SQL statement's WHERE clause. For an action to be taken by the SQL statement, whether it be a transaction or query, all conditions separated by the AND must be TRUE.

Example	Meaning
`WHERE EMPLOYEE_ID = '333333333' AND SALARY = '20000'`	The `EMPLOYEE_ID` must match `333333333` and the `salary` must equal `20000`

SELECT *

FROM PRODUCTS_TBL

WHERE COST > 10

  AND COST < 30;

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

11235      WITCHES COSTUME                29.99

90         LIGHTED LANTERNS                14.5

2 rows selected.

In this output, the value for cost had to be both greater than 10 and less than 30 for data to be retrieved .

SELECT *

FROM PRODUCTS_TBL

WHERE PROD_ID = '7725'

  AND PROD_ID = '2345';

no rows selected

This output retrieved no data because each row of data has only one product identification .

`OR`

The OR operator is used to combine multiple conditions in an SQL statement's WHERE clause. For an action to be taken by the SQL statement, whether it be a transaction or query, at least one of the conditions that are separated by OR must be TRUE.

Example	Meaning
`WHERE SALARY = '20000' OR SALARY = '30000'`	The `salary` must match either `20000` or `30000`

Note

Each of the comparison and logical operators can be used singularly or in combination with each other .

SELECT *

FROM PRODUCTS_TBL

WHERE PROD_ID = '7725'

   OR PROD_ID = '2345'

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

2345       OAK BOOKSHELF                  59.99

1 row selected.

In this output, either one of the conditions had to be TRUE for data to be retrieved. Two records that met either one or the other condition were found.

Tip

When using multiple conditions and operators in an SQL statement, you may find that it improves overall readability if parentheses are used to separate statements into logical groups. However, be aware that the misuse of parentheses could adversely affect your output results .

In the next example, notice the use of the AND and two OR operators. In addition, notice the logical placement of the parentheses to make the statement more readable .

SELECT *

FROM PRODUCTS_TBL

WHERE COST > 10

  AND ( PROD_ID = '222'

   OR   PROD_ID = '90'

   OR   PROD_ID = '11235' );

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

11235      WITCHES COSTUME                29.99

90         LIGHTED LANTERNS               14.5

2 rows selected.

The cost in this output had to be greater than 10, and the product identification had to be any one of the three listed. A row was not returned for PROD_ID 222, because the cost for this identification was not greater than 10 .

Negating Conditions with the `NOT` Operator

Of all the conditions tested by the logical operators discussed here, there is a way to negate each one of these operators to change the condition's viewpoint .

The NOT operator reverses the meaning of the logical operator with which it is used. The NOT can be used with the following operators in the following methods:

· NOT EQUAL

· NOT BETWEEN

· NOT IN

· NOT LIKE

· IS NOT NULL

· NOT EXISTS

· NOT UNIQUE

Each method is discussed in the following sections. First, let's look at how to test for inequality.

Not Equal

You have learned how to test for inequality using the <> operator. Inequality is worth mentioning in this section because to test for it, you are actually negating the equality operator. The following is a second method for testing inequality available in some SQL implementations :

Example	Meaning
`WHERE SALARY <>'20000'`	`Salary` does not equal 20000
`WHERE SALARY != '20000'`	`Salary` does not equal 20000

In the second example, you can see that the exclamation mark is used to negate the equality comparison. The use of the exclamation mark is allowed in addition to the standard operator for inequality <> in some implementations.

Note

Check your particular implementation for the use of the exclamation mark to negate the inequality operator.

`NOT BETWEEN`

The BETWEEN operator is negated as follows :

Example	Meaning
`WHERE Salary NOT BETWEEN '20000' AND '30000'`	The value for `salary` cannot fall between `20000` and `30000,` to include the values `20000` and `30000`

SELECT *

FROM PRODUCTS_TBL

WHERE COST NOT BETWEEN 5.95 AND 14.5;

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

11235      WITCHES COSTUME                29.99

13         FALSE PARAFFIN TEETH            1.1

9          CANDY CORN                      1.35

6          PUMPKIN CANDY                   1.45

87         PLASTIC SPIDERS                 1.05

119        ASSORTED MASKS                  4.95

2345       OAK BOOKSHELF                  59.99

7 rows selected.

Note

Remember that BETWEEN is inclusive; therefore, in the previous example, any rows that equal 5.95 or 14.50 are not included in the query results.

`NOT IN`

The IN operator is negated as NOT IN. All salaries in the following example that are not in the listed values, if any, are returned :

Example	Meaning
`WHERE SALARY NOT IN ('20000', '30000', '40000')`	The `salary` cannot be equal to any of the given values for action to be taken

SELECT *

FROM PRODUCTS_TBL

WHERE PROD_ID NOT IN ('13','9','87','119');

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

11235      WITCHES COSTUME                29.99

222        PLASTIC PUMPKIN 18 INCH         7.75

90         LIGHTED LANTERNS               14.5

15         ASSORTED COSTUMES              10

6          PUMPKIN CANDY                   1.45

1234       KEY CHAIN                       5.95

2345       OAK BOOKSHELF                  59.99

7 rows selected.

In this output, records were not displayed for the listed identifications after the NOT IN operator .

`NOT LIKE`

The LIKE, or wildcard, operator is negated as NOT LIKE. When NOT LIKE is used, only values that are not similar are returned. Examples include :

Example	Meaning
`WHERE SALARY NOT LIKE '200%'`	Finds any values that do not start with `200`
`WHERE SALARY NOT LIKE '%200%'`	Finds any values that do not have `200` in any position
`WHERE SALARY NOT LIKE '_00%'`	Finds any values that have `00` starting in the second position
`WHERE SALARY NOT LIKE '2_%_%'`	Does not find any values that start with `2` and have a length of `3` or greater

SELECT PROD_DESC

FROM PRODUCTS_TBL

WHERE PROD_DESC NOT LIKE 'L%';

PROD_DESC

------------------------

WITCHES COSTUME

PLASTIC PUMPKIN 18 INCH

FALSE PARAFFIN TEETH

ASSORTED COSTUMES

CANDY CORN

PUMPKIN CANDY

PLASTIC SPIDERS

ASSORTED MASKS

KEY CHAIN

OAK BOOKSHELF

10 rows selected.

In this output, the product descriptions starting with the letter L were not displayed .

`IS NOT NULL`

The IS NULL operator is negated as IS NOT NULL to test for values that are not NULL.

Example	Meaning
`WHERE SALARY IS NOT NULL`	Only `NOT NULL` rows are returned

SELECT EMP_ID, LAST_NAME, FIRST_NAME, PAGER

FROM EMPLOYEE_TBL

WHERE PAGER IS NOT NULL;

EMP_ID    LAST_NAM FIRST_NA PAGER

--------- -------- -------- ----------

213764555 GLASS    BRANDON 3175709980

313782439 GLASS    JACOB    8887345678

2 rows selected.

`NOT EXISTS`

EXISTS is negated as NOT EXISTS.

Example	Meaning
`WHERE NOT EXISTS (SELECT EMP_ID EMP_ID FROM EMPLOYEE_TBL WHERE EMP_ID = '333333333'`	Searching to see whether the `EMP_ID 3333333333` is not in the `EMPLOYEE_TBL`

SELECT MAX(COST)

FROM PRODUCTS_TBL

WHERE NOT EXISTS (SELECT COST

                   FROM PRODUCTS_TBL

                   WHERE COST > 100);

 MAX(COST)

-----------

     59.99

The maximum cost for the table is displayed in this output because there were not any records that existed where the cost was greater than 100 .

`NOT UNIQUE`

The UNIQUE operator is negated as NOT UNIQUE.

Example	Meaning
`WHERE NOT UNIQUE (SELECT SALARY FROM EMPLOYEE_TBL)`	Testing to see whether there are salaries in the table that are not `unique`

Arithmetic Operators

Arithmetic operators are used to perform mathematical functions in SQL—the same as in most other languages. There are four conventional operators for mathematical functions .

+ (addition)

- (subtraction)

* (multiplication)

/ (division)

Addition

Addition is performed through the use of the plus (+) symbol .

Example	Meaning
`SELECT SALARY + BONUS FROM EMPLOYEE_PAY_TBL;`	The `SALARY` column is added with the `BONUS` column for a total for each row of data
`SELECT SALARY FROM EMPLOYEE_PAY_TBL WHERE SALARY + BONUS > '40000';`	Returns all rows that are greater than the total of the `SALARY` and `bonus columns`

Subtraction

Subtraction is performed using the minus (-) symbol .

Example	Meaning
`SELECT SALARY - BONUS FROM EMPLOYEE_PAY_TBL;`	The `BONUS` column is subtracted from the `SALARY` column for the difference
`SELECT SALARY FROM EMPLOYEE_PAY_TBL WHERE SALARY - BONUS > '40000';`	Returns all rows where the `SALARY` minus the `BONUS` is greater than `40000`

Multiplication

Multiplication is performed by using the asterisk (*) symbol .

Example	Meaning
`SELECT SALARY * 10 FROM EMPLOYEE_PAY_TBL;`	The `SALARY` column is multiplied by 10
`SELECT SALARY FROM EMPLOYEE_PAY_TBL WHERE SALARY * 10 > '40000';`	Returns all rows where the product of the `SALARY` multiplied by 10 is greater than `40000`

The pay rate in the following example is multiplied by 1.1, which increases the current pay rate by 10 percent :

SELECT EMP_ID, PAY_RATE, PAY_RATE * 1.1

FROM EMPLOYEE_PAY_TBL

WHERE PAY_RATE IS NOT NULL;

EMP_ID      PAY_RATE PAY_RATE*1.1

----------- -------- ------------

442346889      14.75       16.225

220984332         11         12.1

443679012         15         16.5

3 rows selected.

Division

Division is performed through the use of the slash (/) symbol .

Example	Meaning
`SELECT SALARY / 10 FROM EMPLOYEE_PAY_TBL;`	The `SALARY` column is> divided by 10
`SELECT SALARY FROM EMPLOYEE_PAY_TBL WHERE SALARY / 10 > '40000';`	Returns all rows that are greater than the `SALARY`
`SELECT SALARY FROM EMPLOYEE_PAY_TBL WHERE SALARY / 10 >'40000'`	Returns all rows where the salary divided by 10 is greater than `40000`

Arithmetic Operator Combinations

The arithmetic operators can be used in combinations with one another. Remember the rules of precedence in basic mathematics. Multiplication and division operations are performed first, and then addition and subtraction operations. The only way the user has control over the order of the mathematical operations is through the use of parentheses. Parentheses surrounding an expression cause that expression to be evaluated as a block .

Precedence is the order in which expressions are resolved in a mathematical expression or with embedded functions in SQL.

Expression	Result
`1 + 1 * 5`	`6`
`(1 + 1) * 5`	`10`
`10 - 4 / 2 + 1`	`9`
`(10 - 4) / (2 + 1)`	`2`

In the following examples, notice that the placement of parentheses in an expression does not affect the outcome if only multiplication and division are involved. Precedence is not a factor in these cases. Although it may not appear to make sense, it is possible that some implementations of SQL do not follow the ANSI standard in cases like this, however unlikely.

Expression	Result
`4 * 6 / 2`	`12`
`(4 * 6) / 2`	`12`
`4 * (6 / 3)`	`12`

The following are some more examples:

SELECT SALARY * 10 + 1000

FROM EMPLOYEE_PAY_TBL

WHERE SALARY > 20000;

SELECT SALARY / 52 + BONUS

FROM EMPLOYEE_PAY_TBL;

SELECT (SALARY - 1000 + BONUS) / 52 * 1.1

FROM EMPLOYEE_PAY_TBL;

The following is a rather wild example :

SELECT SALARY

FROM EMPLOYEE_PAY_TBL

WHERE SALARY < BONUS * 3 + 10 / 2 - 50;

Because parentheses are not used, mathematical precedence takes effect, altering the value for BONUS tremendously for the condition.

Caution

When combining arithmetic operators, remember to consider the rules of precedence. The absence of parentheses in a statement could render inaccurate results .

What Are Aggregate Functions?

Functions are keywords in SQL used to manipulate values within columns for output purposes. A function is a command always used in conjunction with a column name or expression. There are several types of functions in SQL. This lesson covers aggregate functions. An aggregate function is used to provide summarization information for an SQL statement, such as counts, totals, and averages.

The aggregate functions discussed in this lesson are

· COUNT

· SUM

· MAX

· MIN

· AVG

The following queries show the data used for most of this lesson's examples :

SELECT *

FROM PRODUCTS_TBL;

PROD_ID    PROD_DESC                       COST

---------- ------------------------------ ------

11235      WITCHES COSTUME                29.99

222        PLASTIC PUMPKIN 18 INCH         7.75

13         FALSE PARAFFIN TEETH            1.1

90         LIGHTED LANTERNS               14.5

15         ASSORTED COSTUMES              10

9          CANDY CORN                      1.35

6          PUMPKIN CANDY                   1.45

87         PLASTIC SPIDERS                 1.05

119        ASSORTED MASKS                  4.95

1234       KEY CHAIN                       5.95

2345       OAK BOOKSHELF                  59.99

11 rows selected.

Some employees do not have a pager number in the results of the following query :

SELECT EMP_ID, LAST_NAME, FIRST_NAME, PAGER

FROM EMPLOYEE_TBL;

EMP_ID    LAST_NAM FIRST_NA PAGER

--------- -------- -------- ----------

311549902 STEPHENS TINA

442346889 PLEW     LINDA

213764555 GLASS    BRANDON 3175709980

313782439 GLASS    JACOB    8887345678

220984332 WALLACE MARIAH

443679012 SPURGEON TIFFANY

6 rows selected.

The `COUNT` Function

The COUNT function is used to count rows or values of a column that do not contain a NULL value. When used with a query, the COUNT function returns a numeric value. When the COUNT function is used with the DISTINCT command , only the distinct rows are counted. ALL (opposite of DISTINCT) is the default; it is not necessary to include ALL in the syntax. Duplicate rows are counted if DISTINCT is not specified. One other option with the COUNT function is to use COUNT with an asterisk. COUNT, when used with an asterisk, counts all the rows of a table including duplicates, whether a NULL value is contained in a column or not.

The syntax for the COUNT function is as follows :

COUNT [ (*) | (DISTINCT | ALL) ] (COLUMN NAME)

Note

The DISTINCT command cannot be used with COUNT(*), only with the COUNT(column_name).

Example	Meaning
`SELECT COUNT(EMPLOYEE_ID)`	Counts all employee IDs
`FROM EMPLOYEE_PAY_ID`
`SELECT COUNT(DISTINCT SALARY)`	Counts only the distinct rows
`FROM EMPLOYEE_PAY_TBL`
`SELECT COUNT(ALL SALARY)`	Counts all rows for `SALARY`
`FROM EMPLOYEE_PAY_TBL`
`SELECT COUNT (*)`	Counts all rows of the
`FROM EMPLOYEE_TBL`	`EMPLOYEE` table

COUNT(*) is used in the following example to get a count of all records in the EMPLOYEE_TBL table. There are six employees

SELECT COUNT(*)

FROM EMPLOYEE_TBL;

COUNT(*)

----------

COUNT(EMP_ID) is used in the next example to get a count of all of the employee identifications that exist in the table. The returned count is the same as the last query because all employees have an identification number .

SELECT COUNT(EMP_ID)

FROM EMPLOYEE_TBL;

COUNT(EMP_ID)

-------------

COUNT(PAGER) is used in the following example to get a count of all of the employee records that have a pager number. Only two employees had pager numbers .

SELECT COUNT(PAGER)

FROM EMPLOYEE_TBL;

COUNT(PAGER)

------------

The ORDERS_TBL table, shown next, is used in the following COUNT example :

SELECT *

FROM ORDERS_TBL;

ORD_NUM    CUST_ID    PROD_ID           QTY ORD_DATE_

---------- ---------- ----------------- -------------

56A901     232        11235               1 22-OCT-99

56A917     12         907               100 30-SEP-99

32A132     43         222                25 10-OCT-99

16C17      090        222                 2 17-OCT-99

18D778     287        90                 10 17-OCT-99

23E934     432        13                 20 15-OCT-99

90C461     560        1234                2

7 rows selected.

This last example obtains a count of all distinct product identifications in the ORDERS_TBL table .

SELECT COUNT(DISTINCT(PROD_ID))

FROM ORDERS_TBL;

COUNT(DISTINCT(PROD_ID))

------------------------

The PROD_ID 222 has two entries in the table, thus reducing the distinct values from 7 to 6.

Note

Because the COUNT function counts the rows, data types do not play a part. The rows can contain columns with any data type.

The `SUM` Function

The SUM function is used to return a total on the values of a column for a group of rows. The SUM function can also be used in conjunction with DISTINCT. When SUM is used with DISTINCT, only the distinct rows are totaled, which may not have much purpose. Your total is not accurate in that case, because rows of data are omitted.

The syntax for the SUM function is as follows:

SUM ([ DISTINCT ] COLUMN NAME)

Note

The value of an argument must be numeric to use the SUM function . The SUM function cannot be used on columns having a data type other than numeric, such as character or date

Example	Meaning
`SELECT SUM(SALARY) FROM EMPLOYEE_PAY_TBL`	Totals the salaries
`SELECT SUM(DISTINCT SALARY) FROM EMPLOYEE_PAY_TBL`	Totals the distinct

The sum, or total amount of all cost values, is being retrieved from the PRODUCTS_TBL table.

SELECT SUM(COST)

FROM PRODUCTS_TBL;

 SUM(COST)

----------

    163.07

The `AVG` Function

The AVG function is used to find averages for a group of rows. When used with the DISTINCT command , the AVG function returns the average of the distinct rows. The syntax for the AVG function is as follows:

AVG ([ DISTINCT ] COLUMN NAME)

Note

The value of the argument must be numeric for the AVG function to work .

Example	Meaning
`SELECT AVG(SALARY) FROM EMPLOYEE_PAY_TBL`	Returns the average salary
`SELECT AVG(DISTINCT SALARY) FROM EMPLOYEE_PAY_TBL`	Returns the distinct average salary

The average value for all values in the PRODUCTS_TBL table's COST column is being retrieved in the following example .

SELECT AVG(COST)

FROM PRODUCTS_TBL;

001  AVG(COST)

002 ----------

003 13.5891667

Note

In some implementations, the results of your query may be truncated to the precision of the data type .

The next example uses two aggregate functions in the same query. Because some employees are paid hourly and others paid salary, you want to retrieve the average value for both PAY_RATE and SALARY.

SELECT AVG(PAY_RATE), AVG(SALARY)

FROM EMPLOYEE_PAY_TBL;

AVG(PAY_RATE) AVG(SALARY)

------------- -----------

   13.5833333       30000

The `MAX` Function

The MAX function is used to return the maximum value for the values of a column in a group of rows. NULL values are ignored when using the MAX function. The DISTINCT command is an option. However, because the maximum value for all the rows is the same as the distinct maximum value, it is useless.

MAX([ DISTINCT ] COLUMN NAME)

Example	Meaning
`SELECT MAX(SALARY) FROM EMPLOYEE_PAY_TBL`	Returns the highest salary
`SELECT MAX(DISTINCT SALARY) FROM EMPLOYEE_PAY_TBL`	Returns the highest distinct salary

The following example returns the maximum value for the COST column in the PRODUCTS_TBL table :

SELECT MAX(COST)

FROM PRODUCTS_TBL;

 MAX(COST)

----------

     59.99

The `MIN` Function

The MIN function returns the minimum value of a column for a group of rows. NULL values are ignored when using the MIN function. The DISTINCT command is an option. However, because the minimum value for all rows is the same as the minimum value for distinct rows, it is useless.

MIN([ DISTINCT ] COLUMN NAME)

Example	Meaning
`SELECT MIN(SALARY) FROM EMPLOYEE_PAY_TBL`	Returns the lowest salary
`SELECT MIN(DISTINCT SALARY) FROM EMPLOYEE_PAY_TBL`	Returns the lowest distinct salary

The following example returns the minimum value for the COST column in the PRODUCTS_TBL table:

SELECT MIN(COST)

FROM PRODUCTS_TBL;

 MIN(COST)

----------

      1.05

Caution

One very important thing to keep in mind when using aggregate functions with the DISTINCT command is that your query may not return the desired results. The purpose of aggregate functions is to return summarized data based on all rows of data in a table .

The final example combines aggregate functions with the use of arithmetic operators:

SELECT COUNT(ORD_NUM), SUM(QTY),

       SUM(QTY) / COUNT(ORD_NUM) AVG_QTY

FROM ORDERS_TBL;

COUNT(ORD_NUM)   SUM(QTY)    AVG_QTY

-------------- ---------- ----------

             7        160 22.857143

You have performed a count on all order numbers, figured the sum of all quantities ordered, and, by dividing the two figures, have derived the average quantity of an item per order. You also created a column alias for the computation—AVG_QTY.

Sorting and Grouping Data

You have learned how to query the database and return data in an organized fashion. You have learned how to sort data from a query. During this lesson, you learn how to break returned data from a query into groups for improved readability .

The highlights of this lesson include:

· Why you would want to group data

· The GROUP BY clause

· Group value functions

· The how and why of group functions

· Grouping by columns

· GROUP BY versus ORDER BY

· The HAVING clause

Why Group Data?

Grouping data is the process of combining columns with duplicate values in a logical order. For example, a database may contain information about employees; many employees live in different cities, while some employees live in the same city. You may want to execute a query that shows employee information for each particular city. You are grouping employee information by city, and a summarized report is created .

Suppose that you wanted to figure the average salary paid to employees according to each city. You would do this by using the aggregate function AVG on the SALARY column, as you learned last couple of minutes, and by using the GROUP BY clause to group the output by city .

Grouping data is accomplished through the use of the GROUP BY clause of a SELECT statement (query). Last lesson, you learned how to use aggregate functions. During this lesson, you see how aggregate functions are used in conjunction with the GROUP BY clause for the database to display results more effectively.

The `GROUP BY` Clause

The GROUP BY clause is used in collaboration with the SELECT statement to arrange identical data into groups. The GROUP BY clause follows the WHERE clause in a SELECT statement and precedes the ORDER BY clause.

The position of the GROUP BY clause in a query is as follows :

SELECT

FROM

WHERE

GROUP BY

ORDER BY

The GROUP BY clause must follow the conditions in the WHERE clause and must precede the ORDER BY clause if one is used.

The following is the SELECT statement's syntax, including the GROUP BY clause:

SELECT COLUMN1, COLUMN2

FROM TABLE1, TABLE2

WHERE CONDITIONS

GROUP BY COLUMN1, COLUMN2

ORDER BY COLUMN1, COLUMN2

The following sections give examples and explanations of the GROUP BY clause's use in a variety of situations .

Grouping Selected Data

Grouping data is a simple process. The selected columns (the column list following the SELECT keyword in a query) are the columns that can be referenced in the GROUP BY clause. If a column is not found in the SELECT statement, it cannot be used in the GROUP BY clause. This is logical if you think about it—how can you group data on a report if the data is not displayed?

If the column name has been qualified, the qualified name must go into the GROUP BY clause. The column name can also be represented by a number, which is discussed later in this lesson. When grouping the data, the order of columns grouped does not have to match the column order in the SELECT clause .

Group Functions

Typical group functions—those that are used with the GROUP BY clause to arrange data in groups—include AVG, MAX, MIN, SUM, and COUNT.

Creating Groups and Using Aggregate Functions

There are conditions that the SELECT clause has that must be met when using GROUP BY. Specifically, whatever columns are selected must appear in the GROUP BY clause, except for any aggregate values. The columns in the GROUP BY clause do not necessarily have to be in the same order as they appear in the SELECT clause. Should the columns in the SELECT clause be qualified, the qualified names of the columns must be used in the GROUP BY clause. The following are some examples of syntax for the GROUP BY clause:

Example

SELECT EMP_ID, CITY

FROM EMPLOYEE_TBL

GROUP BY CITY, EMP_ID;

The SQL statement selects the EMP_ID and the CITY from the EMPLOYEE_TBL and groups the data returned by the CITY and then RMP_ID.

Note

Note the order of the columns selected, versus the order of the columns in the GROUP BY clause.

Example

SELECT EMP_ID, SUM(SALARY)

FROM EMPLOYEE_PAY_TBL

GROUP BY SALARY, EMP_ID;

This SQL statement returns the EMP_ID and the total of the salary groups, as well as groups both the salaries and employee IDs .

Example

SELECT SUM(SALARY)

FROM EMPLOYEE_PAY_TBL;

This SQL statement returns the total of all the salaries from the EMPLOYEE_PAY_TBL.

Example

SELECT SUM(SALARY)

FROM EMPLOYEE_PAY_TBL

GROUP BY SALARY;

This SQL statement returns the totals for the different groups of salaries.

Practical examples using real data follow. In this first example, you can see that there are three distinct cities in the EMPLOYEE_TBL table .

SELECT CITY

FROM EMPLOYEE_TBL;

CITY

-------------

GREENWOOD

INDIANAPOLIS

WHITELAND

INDIANAPOLIS

INDIANAPOLIS

INDIANAPOLIS

6 rows selected.

In the following example, you select the city and a count of all records for each city. You see a count on each of the three distinct cities because you are using a GROUP BY clause .

SELECT CITY, COUNT(*)

FROM EMPLOYEE_TBL

GROUP BY CITY;

CITY           COUNT(*)

-------------- --------

GREENWOOD             1

INDIANAPOLIS          4

WHITELAND             1

3 rows selected.

The following is a query from a temporary table created based on EMPLOYEE_TBL and EMPLOYEE_PAY_TBL. You will soon learn how to join two tables for a query .

SELECT *

FROM EMP_PAY_TMP;

CITY         LAST_NAM FIRST_NA   PAY_RATE     SALARY

------------ -------- ---------- ------------ ------

GREENWOOD    STEPHENS TINA                     30000

INDIANAPOLIS PLEW     LINDA         14.75

WHITELAND    GLASS    BRANDON                  40000

INDIANAPOLIS GLASS    JACOB                    20000

INDIANAPOLIS WALLACE  MARIAH           11

INDIANAPOLIS SPURGEON TIFFANY          15

6 rows selected.

In the following example, you retrieve the average pay rate and salary on each distinct city using the aggregate function AVG. There is no average pay rate for GREENWOOD or WHITELAND, because no employees living in those cities are paid hourly .

SELECT CITY, AVG(PAY_RATE), AVG(SALARY)

FROM EMP_PAY_TMP

GROUP BY CITY;

CITY         AVG(PAY_RATE) AVG(SALARY)

------------ ------------- -----------

GREENWOOD                         30000

INDIANAPOLIS    13.5833333       20000

WHITELAND                         40000

3 rows selected.

In the next example, you combine the use of multiple components in a query to return grouped data. You still want to see the average pay rate and salary, but only for INDIANAPOLIS and WHITELAND. You group the data by CITY, of which you have no choice because you are using aggregate functions on the other columns. Lastly, you want to order the report by 2, and then 3, which is the average pay rate, and then average salary. Study the following details and output .

SELECT CITY, AVG(PAY_RATE), AVG(SALARY)

FROM EMP_PAY_TMP

WHERE CITY IN ('INDIANAPOLIS','WHITELAND')

GROUP BY CITY

ORDER BY 2,3;

CITY         AVG(PAY_RATE) AVG(SALARY)

------------ ------------- -----------

INDIANAPOLIS    13.5833333       20000

WHITELAND                        40000

Values are sorted before NULL values; therefore, the record for INDIANAPOLIS was displayed first. GREENWOOD was not selected, but if it were, its record would have been displayed before WHITELAND's record because GREENWOOD's average salary is $30,000 (the second sort in the ORDER BY clause was on average salary).

The last example in this section shows the use of the MAX and MIN aggregate functions with the GROUP BY clause .

SELECT CITY, MAX(PAY_RATE), MIN(SALARY)

FROM EMP_PAY_TMP

GROUP BY CITY;

CITY         MAX(PAY_RATE) MIN(SALARY)

------------ ------------- -----------

GREENWOOD                       30000

INDIANAPOLIS            15 20000

WHITELAND                     40000

3 rows selected.

Representing Column Names with Numbers

Unlike the ORDER BY clause the GROUP BY clause cannot be ordered by using an integer to represent the column name—except when using a UNION and the column names are different. The following is an example of representing column names with numbers :

SELECT EMP_ID, SUM(SALARY)

FROM EMPLOYEE_PAY_TBL

UNION

SELECT EMP_ID, SUM(PAY_RATE)

FROM EMPLOYEE_PAY_TBL

GROUP BY 2, 1;

This SQL statement returns the employee ID and the group totals for the salaries. When using the UNION operator, the results of the two SELECT statements are merged into one result set. The GROUP BY is performed on the entire result set. The order for the groupings is 2 representing salary, and 1 representing EMP_ID.

`GROUP BY` Versus `ORDER BY`

You should understand that the GROUP BY clause works the same as the ORDER BY clause in that both are used to sort data. The ORDER BY clause is specifically used to sort data from a query; the GROUP BY clause also sorts data from a query to properly group the data. Therefore, the GROUP BY clause can be used to sort data the same as ORDER BY.

There are some differences and disadvantages of using GROUP BY for sorting operations:

· All non-aggregate columns selected must be listed in the GROUP BY clause.

· Integers cannot be used in the GROUP BY to represent columns after the SELECT keyword, similar to using the ORDER BY clause.

· The GROUP BY clause is generally not necessary unless using aggregate functions.

An example of performing sort operations utilizing the GROUP BY clause in place of the ORDER BY clause is shown next :

SELECT LAST_NAME, FIRST_NAME, CITY

FROM EMPLOYEE_TBL

GROUP BY LAST_NAME;

SELECT LAST_NAME, FIRST_NAME, CITY

ERROR at line 1:

ORA-00979: not a GROUP BY expression

In this example, an error was received from the database server stating that FIRST_NAME is not a GROUP BY expression. Remember that all columns and expressions in the SELECT must be listed in the GROUP BY clause, with the exception of aggregate columns (those columns targeted by an aggregate function).

In the next example, the previous problem is solved by adding all expressions in the SELECT to the GROUP BY clause :

SELECT LAST_NAME, FIRST_NAME, CITY

FROM EMPLOYEE_TBL

GROUP BY LAST_NAME, FIRST_NAME, CITY;

LAST_NAM FIRST_NA CITY

-------- -------- ------------

GLASS    BRANDON WHITELAND

GLASS    JACOB    INDIANAPOLIS

PLEW     LINDA    INDIANAPOLIS

SPURGEON TIFFANY INDIANAPOLIS

STEPHENS TINA     GREENWOOD

WALLACE MARIAH   INDIANAPOLIS

6 rows selected.

In this example, the same columns were selected from the same table, but all columns in the GROUP BY clause are listed as they appeared after the SELECT keyword. The results were ordered by LAST_NAME first, FIRST_NAME second, and CITY third. These results could have been accomplished easier with the ORDER BY clause; however, it may help you better understand how the GROUP BY works if you can visualize how it must first sort data to group data results.

The following example shows a SELECT from EMPLOYEE_TBL and uses the GROUP BY to order by CITY, which leads into the next example .

SELECT CITY, LAST_NAME

FROM EMPLOYEE_TBL

GROUP BY CITY, LAST_NAME;

CITY         LAST_NAM

------------ ---------

GREENWOOD    STEPHENS

INDIANAPOLIS GLASS

INDIANAPOLIS PLEW

INDIANAPOLIS SPURGEON

INDIANAPOLIS WALLACE

WHITELAND    GLASS

6 rows selected.

Notice the order of data in the previous results, as well as the LAST_NAME of the individual for each CITY. All employee records in the EMPLOYEE_TBL table are now counted, and the results are grouped by CITY but ordered by the count on each city first .

SELECT CITY, COUNT(*)

FROM EMPLOYEE_TBL

GROUP BY CITY

ORDER BY 2,1;

CITY           COUNT(*)

-------------- --------

GREENWOOD             1

WHITELAND             1

INDIANAPOLIS          4

Notice the order of the results. The results were first sorted by the count on each city (1–4), and then by city. The count for the first two cities in the output is 1. Because the count is the same, which is the first expression in the ORDER BY clause, the city is then sorted; GREENWOOD is placed before WHITELAND.

Although GROUP BY and ORDER BY perform a similar function, there is one major difference. The GROUP BY is designed to group identical data, while the ORDER BY is designed merely to put data into a specific order. GROUP BY and ORDER BY can be used in the same SELECT statement, but must follow a specific order. The GROUP BY clause is always placed before the ORDER BY clause in the SELECT statement.

Tip

The GROUP BY clause can be used in the CREATE VIEW statement to sort data, but the ORDER BY clause is not allowed in the CREATE VIEW statement.

The `HAVING` Clause

The HAVING clause, when used in conjunction with the GROUP BY in a SELECT statement, tells GROUP BY which groups to include in the output. HAVING is to GROUP BY as WHERE is to SELECT. In other words, the WHERE clause places conditions on the selected columns, whereas the HAVING clause places conditions on groups created by the GROUP BY clause .

The following is the position of the HAVING clause in a query:

SELECT

FROM

WHERE

GROUP BY

HAVING

ORDER BY

The HAVING clause must follow the GROUP BY clause in a query and must also precede the ORDER BY clause if used.

The following is the syntax of the SELECT statement, including the HAVING clause:

SELECT COLUMN1, COLUMN2

FROM TABLE1, TABLE2

WHERE CONDITIONS

GROUP BY COLUMN1, COLUMN2

HAVING CONDITIONS

ORDER BY COLUMN1, COLUMN2

In the following example, you select the average pay rate and salary for all cities except GREENWOOD. You group the output by CITY, but only want to display those groups (cities) that have an average salary greater than $20,000. You sort the results by average salary for each city .

SELECT CITY, AVG(PAY_RATE), AVG(SALARY)

FROM EMP_PAY_TMP

WHERE CITY <> 'GREENWOOD'

GROUP BY CITY

HAVING AVG(SALARY) > 20000

ORDER BY 3;

CITY         AVG(PAY_RATE) AVG(SALARY)

------------ ------------- -----------

WHITELAND                        40000

1 row selected.

Why was only one row returned by this query?

· The city GREENWOOD was eliminated from the WHERE clause.

· INDIANAPOLIS was deducted from the output because the average salary was 20000, which is not greater than 20000

Restructuring the Appearance of Data

During this lesson, you learned how to restructure the appearance of output results using a wide array of functions, some ANSI standard functions, and other functions based on the standard and several variations used by some major SQL implementations.

This lesson's highlights include

· Introduction to character functions

· How and when to use character functions

· Examples of ANSI SQL functions

· Examples of common implementation-specific functions

· Overview of conversion functions

· How and when to use conversion functions

The Concepts of ANSI Character Functions

Character functions are functions used to represent strings in SQL in formats alternate to how they are stored in the table. The first part of this lesson discusses the concepts for character functions as covered by ANSI. The second part of this lesson shows real-world examples using functions that are specific to various SQL implementations. ANSI functions discussed in this lesson include concatenation, substring, TRANSLATE, REPLACE, UPPER, and LOWER.

Concatenation

Concatenation is the process of combining two separate strings into one string. For example, you may want to concatenate an individual's first and last names into a single string for the complete name.

JOHN concatenated with SMITH = JOHN SMITH

Substring

The concept of substring is the capability to extract part of a string, or a "sub" of the string. For example, the following values are substrings of JOHNSON:

J     JOHN    JO    ON    SON    …

`TRANSLATE`

The TRANSLATE function is used to translate a string, character by character, into another string. There are normally three arguments with the TRANSLATE function: the string to be converted, a list of the characters to convert, and a list of the substitution characters. Implementation examples are shown in the next part of this lesson.

Various Common Character Functions

Character functions are used mainly to compare, join, search, and extract a segment of a string or a value in a column. There are several character functions available to the SQL programmer.

The following sections illustrate the application of ANSI concepts in some of the leading implementations of SQL, such as in Oracle, Sybase, SQLBase, Informix, and SQL Server.

Note

The ANSI concepts discussed in this book are just that—concepts. Standards provided by ANSI are simply guidelines for how the use of SQL in a relational database should be implemented. With that thought, keep in mind that the specific functions discussed in this lesson are not necessarily the exact functions that you may use in your particular implementation. Yes, the concepts are the same, and the way the functions work are generally the same, but function names and actual syntax may differ.

Concatenation

Concatenation, along with most other functions, is represented slightly differently among various implementations. The following examples show the use of concatenation in Oracle and SQL Server .

In Oracle
`SELECT 'JOHN' \|\| 'SON'` returns `JOHNSON`

In SQL Server
`SELECT 'JOHN' \|\| 'SON'` returns `JOHNSON`

The syntax for Oracle is

COLUMN_NAME || [ '' || ] COLUMN_NAME [ COLUMN_NAME ]

The syntax for SQL Server is

COLUMN_NAME + [ '' + ] COLUMN_sNAME [ COLUMN_NAME ]

Example	Meaning
SELECT CITY + STATE FROM EMPLOYEE_TBL;	This SQL Server statement concatenates the values for city and state into one value.
SELECT CITY \|\|', '\|\| STATE FROM EMPLOYEE_TBL;	This Oracle statement concatenates the values for city and state into one value, placing a comma between the values for city and state.
SELECT CITY + ' ' + STATE FROM EMPLOYEE_TBL;	This SQL Server statement concatenates the values for city and state into one value, placing a space between the two original values.

Example:

SELECT LAST_NAME || ', ' || FIRST_NAME NAME

FROM EMPLOYEE_TBL;

NAME

-----------------

STEPHENS, TINA

PLEW, LINDA

GLASS, BRANDON

GLASS, JACOB

WALLACE, MARIAH

SPURGEON, TIFFANY

6 rows selected.

Note

Notice the use of single quotation marks and a comma in the preceding SQL statement. Most characters and symbols are allowed if enclosed by single quotations marks. Some implementations may use double quotation marks for literal string values .

`TRANSLATE`

The TRANSLATE function searches a string of characters and checks for a specific character, makes note of the position found, searches the replacement string at the same position, and then replaces that character with the new value. The syntax is

TRANSLATE(CHARACTER SET, VALUE1, VALUE2)

Example

Meaning

SELECT TRANSLATE

(CITY,'IND','ABC'

FROM EMPLOYEE_TBL);

This SQL statement substitutes every occurrence of I in the string with A, replaces all occurrences of N with B, and D with C.

The following example illustrates the use of TRANSLATE with real data:

SELECT CITY, TRANSLATE(CITY,'IND','ABC')

FROM EMPLOYEE_TBL;

CITY         TRANSLATE(CI

------------ ------------

GREENWOOD    GREEBWOOC

INDIANAPOLIS ABCAABAPOLAS

WHITELAND    WHATELABC

INDIANAPOLIS ABCAABAPOLAS

INDIANAPOLIS ABCAABAPOLAS

INDIANAPOLIS ABCAABAPOLAS

6 rows selected.

Notice in this example that all occurrences of I were replaced with A, N with B, and D with C. In the city INDIANAPOLIS, IND was replaced with ABC, but in GREENWOOD, D was replaced with C. Also notice how the value WHITELAND was translated .

`REPLACE`

The REPLACE function is used to replace every occurrence of a character(s) with a specified character(s). The use of this function is similar to the TRANSLATE function; only one specific character or string is replaced within another string. The syntax is

REPLACE('VALUE', 'VALUE', [ NULL ] 'VALUE')

Example

Meaning

SELECT REPLACE(FIRST_

NAME 'T', 'B')

FROM EMPLOYEE_TBL

This statement returns all the first names and changes any occurrence of T to a B.

SELECT CITY, REPLACE(CITY,'I','Z')

FROM EMPLOYEE_TBL;

CITY         REPLACE(CITY)

------------ -------------

GREENWOOD    GREENWOOD

INDIANAPOLIS ZNDZANAPOLZS

WHITELAND    WHZTELAND

INDIANAPOLIS ZNDZANAPOLZS

INDIANAPOLIS ZNDZANAPOLZS

INDIANAPOLIS ZNDZANAPOLZS

6 rows selected.

`UPPER`

Most implementations have a way to control the case of data by using functions. The UPPER function is used to convert lowercase letters to uppercase letters for a specific string .

The syntax is as follows:

UPPER(character string)

Example

Meaning

SELECT UPPER(LAST_NAME)

FROM EMPLOYEE_TBL;

LAST_NAME

This SQL statement converts all characters in the column to uppercase.

SELECT UPPER(CITY)

FROM EMPLOYEE_TBL;

UPPER(CITY)

-------------

GREENWOOD

INDIANAPOLIS

WHITELAND

INDIANAPOLIS

INDIANAPOLIS

INDIANAPOLIS

6 rows selected.

`LOWER`

Converse of the UPPER function, the LOWER function is used to convert uppercase letters to lowercase letters for a specific string .

The syntax is as follows:

LOWER(character string)

Example

Meaning

SELECT LOWER(LAST_NAME)

FROM EMPLOYEE_TBL;

LAST_NAME

This SQL statement converts all characters in the column to lowercase.

SELECT LOWER(CITY)

FROM EMPLOYEE_TBL;

LOWER(CITY)

-------------

greenwood

indianapolis

whiteland

indianapolis

indianapolis

indianapolis

6 rows selected.

`SUBSTR`

Taking an expression's substring is common in most implementations of SQL, but the function name may differ, as shown in the following Oracle and SQL Server examples .

The syntax for Oracle is

SUBSTR(COLUMN NAME, STARTING POSITION, LENGTH)

The syntax for SQL Server is

SUBSTRING(COLUMN NAME, STARTING POSITION, LENGTH)

The only difference between the two implementations is the spelling of the function name.

Example	Meaning
SELECT SUBSTRING (EMP_ID,1,3) FROM EMPLOYEE_TBL	This SQL statement returns the first three characters of `Emp_Id.`
SELECT SUBSTRING (EMP_ID,4,2) FROM EMPLOYEE_TBL	This SQL statement returns the fourth and fifth characters of `EMP_ID.`
SELECT SUBSTRING (EMP_ID,6,4) FROM EMPLOYEE_TBL	This SQL statement returns the sixth through the ninth characters of `EMP_ID.`

The following is an example using Microsoft SQL Server:

SELECT EMP_ID, SUBSTRING(EMP_ID,1,3)

FROM EMPLOYEE_TBL;

EMP_ID    SUB

--------- ---

311549902 311

442346889 442

213764555 213

313782439 313

220984332 220

443679012 443

6 rows affected.

The following is an example using Oracle8:

SELECT EMP_ID, SUBSTR(EMP_ID,1,3)

FROM EMPLOYEE_TBL;

EMP_ID    SUB

--------- ---

311549902 311

442346889 442

213764555 213

313782439 313

220984332 220

443679012 443

6 rows selected.

Note

Notice the difference between the feedback of the two queries. The first example returns the feedback 6 rows affected and the second returns 6 rows selected. You see differences such as this between implementations .

`INSTR`

The INSTR function is a variation of the POSITION function; it is used to search a string of characters for a specific set of characters and report the position of those characters. The syntax is as follows:

INSTR(COLUMN NAME, 'SET',

[ START POSITION [ , OCCURRENCE ] ]);

Example

Meaning

SELECT INSTR(STATE

,'I',1,1)

FROM EMPLOYEE_TBL;

This SQL statement returns the position of the first occurrence of the letter I for each state in EMPLOYEE_TBL.

SELECT PROD_DESC,

       INSTR(PROD_DESC,'A',1,1)

FROM PRODUCTS_TBL;

PROD_DESC                 INSTR(PROD_DESC,'A',1,1)

------------------------- --------------------------

WITCHES COSTUME                                  0

PLASTIC PUMPKIN 18 INCH                          3

FALSE PARAFFIN TEETH                             2

LIGHTED LANTERNS                                10

ASSORTED COSTUMES                                1

CANDY CORN                                       2

PUMPKIN CANDY                                   10

PLASTIC SPIDERS                                  3

ASSORTED MASKS                                   1

KEY CHAIN                                        7

OAK BOOKSHELF                                    2

11 rows selected.

Notice that if the searched character A was not found in a string, the value 0 was returned for the position .

`LTRIM`

The LTRIM function is another way of clipping part of a string. This function and SUBSTRING are in the same family. LTRIM is used to trim characters from the left of a string. The syntax is

LTRIM(CHARACTER STRING [ ,'set' ])

Example

Meaning

SELECT LTRIM(FIRST_

NAME,'LES')

FROM CUSTOMER_TBL

FIRST_NAME =

'LESLIE';

This SQL statement trims the characters LES from the left of all names that are WHERE LESLIE.

SELECT POSITION, LTRIM(POSITION,'SALES')

FROM EMPLOYEE_PAY_TBL;

POSITION        LTRIM(POSITION,

--------------- ---------------

MARKETING       MARKETING

TEAM LEADER     TEAM LEADER

SALES MANAGER   MANAGER

SALESMAN        MAN

SHIPPER         HIPPER

SHIPPER         HIPPER

6 rows selected.

The S in SHIPPER was trimmed off, even though SHIPPER does not contain the string SALES. The first four characters of SALES were ignored. The searched characters must appear in the same order of the search string and must be on the far left of the string. In other words, LTRIM will trim off all characters to the left of the last occurrence in the search string .

`RTRIM`

Like the LTRIM, the RTRIM function is used to trim characters from the right of a string. The syntax is

RTRIM(CHARACTER STRING [ ,'set' ])

Example

Meaning

SELECT RTRIM(FIRST_

NAME, 'ON')

FROM EMPLOYEE_TBL

WHERE FIRST_NAME=

'BRANDON';

This SQL statement returns the first name BRANDON and trims the ON, leaving BRAND as a result.

SELECT POSITION, RTRIM(POSITION,'ER')

FROM EMPLOYEE_PAY_TBL;

POSITION        RTRIM(POSITION,

--------------- ---------------

MARKETING       MARKETING

TEAM LEADER     TEAM LEAD

SALES MANAGER   SALES MANAG

SALESMAN        SALESMAN

SHIPPER         SHIPP

SHIPPER         SHIPP

6 rows selected.

The string ER was trimmed from the right of all applicable strings .

`DECODE`

The DECODE function is not ANSI—at least not at the time of this writing—but its use is shown here because of its great power. This function is used in SQLBase, Oracle, and possibly other implementations. DECODE is used to search a string for a value or string, and if the string is found, an alternate string is displayed as part of the query results .

The syntax is

DECODE(COLUMN NAME, 'SEARCH1', 'RETURN1',[ 'SEARCH2', 'RETURN2' ,'DEFAULT åVALUE'])

Example

Meaning

SELECT DECODE(LAST_NAME,

'SMITH', 'JONES',

OTHER') FROM EMPLOYEE_

TBL;

This query searches the value of all last names in EMPLOYEE_TBL; if the value SMITH is found, JONES is displayed in its place. Any other names are displayed as OTHER, which is called the default value.

In the following example, DECODE is used on the values for CITY in EMPLOYEE_TBL:

SELECT CITY,

       DECODE(CITY,'INDIANAPOLIS','INDY',

                   'GREENWOOD','GREEN', 'OTHER')

FROM EMPLOYEE_TBL;

CITY         DECOD

------------ -----

GREENWOOD    GREEN

INDIANAPOLIS INDY

WHITELAND    OTHER

INDIANAPOLIS INDY

INDIANAPOLIS INDY

INDIANAPOLIS INDY

6 rows selected.

The output shows the value INDIANAPOLIS displayed as INDY, GREENWOOD displayed as GREEN, and all other cities displayed as OTHER.

Miscellaneous Character Functions

The following sections show a few other character functions worth mentioning. Once again, these are functions that are fairly common among major implementations.

Finding a Value's Length

The LENGTH function is a common function used to find the length of a string, number, date, or expression in bytes. The syntax is

LENGTH(CHARACTER STRING)

Example

Meaning

SELECT LENGTH

FROM EMPLOYEE_TBL;

This SQL statement (LAST_NAME) returns the length of the last name for each employee.

SELECT PROD_DESC, LENGTH(PROD_DESC)

FROM PRODUCTS_TBL;

PROD_DESC                                LENGTH(PROD_DESC)

------------------------                 -----------------

WITCHES COSTUME                          15

PLASTIC PUMPKIN 18 INCH                  23

FALSE PARAFFIN TEETH                     19

LIGHTED LANTERNS                         16

ASSORTED COSTUMES                        17

CANDY CORN                               10

PUMPKIN CANDY                            13

PLASTIC SPIDERS                          15

ASSORTED MASKS                           14

KEY CHAIN                                 9

OAK BOOKSHELF                            13

11 rows selected.

`NVL` (`NULL` Value)

The NVL function is used to return data from one expression if another expression is NULL. NVL can be used with most data types; however, the value and the substitute must be the same data type. The syntax is

NVL('VALUE', 'SUBSTITUTION')

Example

Meaning

SELECT NVL(SALARY,

'00000' FROM

EMPLOYEE_PAY_TBL;

This SQL statement) finds NULL values and substitutes 00000 for any NULL values.

SELECT PAGER, NVL(PAGER,9999999999)

FROM EMPLOYEE_TBL;

PAGER      NVL(PAGER,

---------- ----------

9999999999

9999999999

3175709980 3175709980

8887345678 8887345678

9999999999

9999999999

6 rows selected.

Only NULL values were represented as 9999999999.

`LPAD`

LPAD (left pad) is used to add characters or spaces to the left of a string. The syntax is

LPAD(CHARACTER SET)

The following example pads periods to the left of each product description, totaling 30 characters between the actual value and padded periods .

SELECT LPAD(PROD_DESC,30,'.') PRODUCT

FROM PRODUCTS_TBL;

PRODUCT

------------------------------

……………WITCHES COSTUME

…….PLASTIC PUMPKIN 18 INCH

……….FALSE PARAFFIN TEETH

…………..LIGHTED LANTERNS

………….ASSORTED COSTUMES

………………..CANDY CORN

……………..PUMPKIN CANDY

……………PLASTIC SPIDERS

…………….ASSORTED MASKS

…………………KEY CHAIN

…….. ……..OAK BOOKSHELF

11 rows selected.

`RPAD`

The RPAD (right pad) is used to add characters or spaces to the right of a string. The syntax is

RPAD(CHARACTER SET)

The following example pads periods to the right of each product description, totaling 30 characters between the actual value and padded periods .

SELECT RPAD(PROD_DESC,30,'.') PRODUCT

FROM PRODUCTS_TBL;

PRODUCT

------------------------------

WITCHES COSTUME……………

PLASTIC PUMPKIN 18 INCH…….

FALSE PARAFFIN TEETH……….

LIGHTED LANTERNS…………..

ASSORTED COSTUMES………….

CANDY CORN………………..

PUMPKIN CANDY……………..

PLASTIC SPIDERS……………

ASSORTED MASKS…………….

KEY CHAIN…….. …….. …

OAK BOOKSHELF…….. ……..

11 rows selected.

`ASCII`

The ASCII function is used to return the ASCII (American Standard Code for Information Interchange) representation of the leftmost character of a string. The syntax is

ASCII(CHARACTER SET)

Examples:

`ASCII('A')`returns `65`
`ASCII('B')`returns `66`
`ASCII('C')` returns `67`

Mathematical Functions

Mathematical functions are fairly standard across implementations. These are functions that allow you to manipulate numeric values in a database according to mathematical rules .

The most common functions include the following:

Absolute value	(`ABS`)
Rounding	(`ROUND`)
Square root	(`SQRT`)
Sign values	(`SIGN`)
Power	(`POWER`)
Ceiling and floor values	(`CEIL, FLOOR`)
Exponential values `SIN, COS, TAN`	(`EXP`)

The general syntax of most mathematical functions is

FUNCTION(EXPRESSION)

Conversion Functions

Conversion functions are used to convert a data type into another data type. For example, there may be times when you want to convert character data into numeric data. You may have data that is normally stored in character format, but occasionally you want to convert the character format to numeric for the purpose of making calculations. Mathematical functions and computations are not allowed on data that is represented in character format.

The following are general types of data conversions:

· Character to numeric

· Numeric to character

· Character to date

· Date to character

The first two types of conversions are discussed in this lesson.

Note

Some implementations may implicitly convert data types when necessary.

Converting Character Strings to Numbers

There are two things you should notice regarding the differences between numeric data types and character string data types :

1. Arithmetic expressions and functions can be used on numeric values.

2. Numeric values are right-justified, whereas character string data types are left- justified in output results.

When a character string is converted to a numeric value, the value takes on the two attributes just mentioned.

Some implementations may not have functions to convert character strings to numbers, while some have conversion functions such as this. In either case, consult your implementation documentation for specific syntax and rules for conversions.

Note

Characters in a character string being converted to a number must typically be 0 through 9. The addition symbol, minus symbol, and period can also be used to represent positive numbers, negative numbers, and decimals. For example, the string STEVE cannot be converted to a number, whereas an individual's Social Security number could be stored as a character string, but could easily be converted to a numeric value via use of a conversion function.

The following is an example of a numeric conversion using an Oracle conversion function :

SELECT EMP_ID, TO_NUMBER(EMP_ID)

FROM EMPLOYEE_TBL;

EMP_ID            TO_NUMBER(EMP_ID)

---------         -----------------

311549902                 311549902

442346889                 442346889

213764555                 213764555

313782439                 313782439

220984332                 220984332

443679012                 443679012

6 rows selected.

The employee identification is right-justified following the conversion.

Tip

The justification of data is the simplest way to identify a column's data type .

Converting Numbers to Strings

The conversion of numeric values to character strings is precisely the opposite of converting characters to numbers.

The following is an example of converting a numeric value to a character string using a Transact-SQL conversion function for Microsoft SQL Server :

SELECT PAY = PAY_RATE, NEW_PAY = STR(PAY_RATE)

FROM EMPLOYEE_PAY_TBL

WHERE PAY_RATE IS NOT NULL;

PAY NEW_PAY

---------- -------

      17.5 17.5

     14.75 14.75

     18.25 18.25

12.8 12.8

        11 11

        15 15

6 rows affected.

The following is the same example using an Oracle conversion function:

SELECT PAY_RATE, TO_CHAR(PAY_RATE)

FROM EMPLOYEE_PAY_TBL

WHERE PAY_RATE IS NOT NULL;

  PAY_RATE TO_CHAR(PAY_RATE)

---------- -----------------

      17.5 17.5

     14.75 14.75

     18.25 18.25

      12.8 12.8

        11 11

        15 15

6 rows selected.

The Concept of Combining Character Functions

Most functions can be combined in a single SQL statement. SQL would be far too limited if function combinations were not allowed. The following examples show how some functions can be combined with one another in a query :

SELECT LAST_NAME || ', ' || FIRST_NAME NAME,

       SUBSTR(EMP_ID,1,3) || '-' ||

       SUBSTR(EMP_ID,4,2) || '-' ||

       SUBSTR(EMP_ID,6,4) ID

FROM EMPLOYEE_TBL;

NAME               ID

------------------ -----------

STEPHENS, TINA     311-54-9902

PLEW, LINDA        442-34-6889

GLASS, BRANDON     213-76-4555

GLASS, JACOB       313-78-2439

WALLACE, MARIAH    220-98-4332

SPURGEON, TIFFANY  443-67-9012

6 rows selected.

The following example combines two functions in the query (concatenation with substring). By pulling the EMP_ID column apart into three pieces, you can concatenate those pieces with dashes to render a readable Social Security number .

SELECT SUM(LENGTH(LAST_NAME) + LENGTH(FIRST_NAME)) TOTAL

FROM EMPLOYEE_TBL;

     TOTAL

----------

1 row selected.

This example uses the LENGTH function and the arithmetic operator (+) to add the length of the first name to the length of the last name for each column; the SUM function then finds the total length of all first and last names.

Note

When embedding functions within functions in an SQL statement, remember that the innermost function is resolved first, and then each function is subsequently resolved from the inside out .

How Is a Date Stored?

Each implementation has a default storage format for the date and time. This default storage often varies among different implementations, as do other data types for each implementation. The following sections begin by reviewing the standard format of the DATETIME data type and its elements. Then you see the data types for date and time in some popular implementations of SQL, including Oracle, Sybase, and Microsoft SQL Server.

Standard Data Types for Date and Time

There are three standard SQL data types for date and time (DATETIME) storage :

Data Type	Usage
`DATE`	Stores date literals
`TIME`	Stores time literals
`TIMESTAMP`	Stores date and time literals

Format and range of valid values for each data type:

`DATE`
Format: `YYYY-MM-DD`
Range: `0001-01-01` to `9999-12-31`
`TIME`
Format: `HH:MI:SS.nn…`
Range: `00:00:00…` to `23:59:61.999…`
`TIMESTAMP`
Format: `YYYY-MM-DD HH:MI:SS.` `nn…`
Range: `0001-01-01 00:00:00…` to `9999-12-31 23:59:61.999…`

`DATETIME` Elements

DATETIME elements are those elements pertaining to date and time that are included as part of a DATETIME definition. The following is a list of the constrained DATETIME elements and a valid range of values for each element:

`YEAR`	`0001`to `9999`
`MONTH`	`01`to `12`
`DAY`	`01`to `31`
`HOUR`	`00`to `23`
`MINUTE`	`00`to `59`
`SECOND`	`00.000…` to `61.999…`

Seconds can be represented as a decimal, allowing the expression of tenths of a second, hundredths of a second, milliseconds, and so on. Each of these elements, except for the last, is self explanatory; they are elements of time that we deal with on a daily basis. You may question the fact that a minute can contain more than 60 seconds. According to the ANSI standard, this 61.999 seconds is due to the possible insertion or omission of a leap second in a minute, which in itself is a rare occurrence. Refer to your implementation on the allowed values because date and time storage may vary widely .

Implementation Specific Data Types

As with other data types, each implementation provides its own representation and syntax. This section shows how three products (Oracle, Sybase, and SQLBase) have been implemented with date and time .

Product	Data Type	Use
Oracle	`DATE`	Stores both date and time information
Sybase	`DATETIME`	Stores both date and time information
	`SMALLDATETIME`	Stores both date and time information, but includes a smaller date range than `DATETIME`
SQLBase	`DATETIME`	Stores both date and time information
	`TIMESTAMP`	Stores both date and time information
	`DATE`	Stores a date value
	`TIME`	Stores a time value

Note

Each implementation has its own specific data type(s) for date and time information. However, most implementations comply with the ANSI standard in the fact that all elements of the date and time are included in their associated data types. The way the date is internally stored is implementation-dependent .

Date Functions

Date functions are available in SQL depending on the options with each specific implementation. Date functions, similar to character string functions, are used to manipulate the representation of date and time data. Available date functions are often used to format the output of dates and time in an appealing format, compare date values with one another, compute intervals between dates, and so on.

The Current Date

You may have already raised the question: How do I get the current date from the database? The need to retrieve the current date from the database may originate from several situations, but the current date is normally returned either to compare to a stored date or to return the value of the current date as some sort of timestamp.

The current date is ultimately stored on the host computer for the database, and is called the system date . The database, which interfaces with the appropriate operating system, has the capability to retrieve the system date for its own purpose or to resolve database requests, such as queries.

Take a look at a couple of methods of attaining the system date based on commands from two different implementations.

Sybase uses a function called GETDATE() to return the system date. This function is used in a query as follows. The output is what would return if today's current date was New Year's Eve for 1999.

SELECT GETDATE()

Dec 31, 1999

Note

Most options discussed in this book for Sybase's and Microsoft's implementations are applicable to both implementations, because both use SQL Server for their database server. Both implementations also use an extension to standard SQL known as Transact-SQL .

Oracle uses what is calls a pseudo column, SYSDATE, to retrieve the current date. SYSDATE acts as any other column in a table and can be selected from any table in the database, although it is not actually part of the table's definition .

To return the system date in Oracle, the following statement returns the output if today was New Year's Eve before 2000:

SELECT SYSDATE FROM TABLE_NAME

31-DEC-99

Time Zones

The use of time zones may be a factor when dealing with date and time information. For instance, a time of 6:00 p.m. in central United States does not equate to the same time in Australia, although the actual point in time is the same. Some of us who live within the daylight savings time zone are used to adjusting our clocks twice a year. If time zones are considerations when maintaining data in your case, you may find it necessary to consider time zones and perform time conversions, if available with your SQL implementation .

The following are some common time zones and their abbreviations :

Abbreviation	Definition
AST, ADT	Atlantic standard, daylight time
BST, BDT	Bering standard, daylight time
CST, CDT	Central standard, daylight time
EST, EDT	Eastern standard, daylight time
GMT	Greenwich mean time
HST, HDT	Alaska/Hawaii standard, daylight time
MST, MDT	Mountain standard, daylight time
NST	Newfoundland standard, daylight time
PST, PDT	Pacific standard, daylight time
YST, YDT	Yukon standard, daylight time

Note

Some implementations have functions that allow you to deal with different time zones. However, not all implementations may support the use of time zones. Be sure to verify the use of time zones in your particular implementation, as well as the need in the case of your database.

Adding Time to Dates

Days, months, and other parts of time can be added to dates for the purpose of comparing dates to one another, or to provide more specific conditions in the WHERE clause of a query .

Intervals can be used to add periods of time to a DATETIME value. As defined by the standard, intervals are used to manipulate the value of a DATETIME value, as in the following examples:

DATE '1999-12-31' + INTERVAL '1' DAY

'2000-01-01'

DATE '1999-12-31' + INTERVAL '1' MONTH

'2000-01-31'

The following is an example using the SQL Server function DATEADD:

SELECT DATEADD(MONTH, 1, DATE_HIRE)

FROM EMPLOYEE_PAY_TBL

DATE_HIRE ADD_MONTH

---------- ----------

23-MAY-89 23-JUN-89

17-JUN-90 17-JUL-90

14-AUG-94 14-SEP-94

28-JUN-97 28-JUL-97

22-JUL-96 22-AUG-96

14-JAN-91 14-FEB-91

6 rows affected.

The following example uses the Oracle function ADD_MONTHS:

SELECT DATE_HIRE, ADD_MONTHS(DATE_HIRE,1)

FROM EMPLOYEE_PAY_TBL;

DATE_HIRE ADD_MONTH

----- ------

23-MAY-89 23-JUN-89

17-JUN-90 17-JUL-90

14-AUG-94 14-SEP-94

28-JUN-97 28-JUL-97

22-JUL-96 22-AUG-96

14-JAN-91 14-FEB-91

6 rows selected.

To add one day to a date in Oracle, use the following :

SELECT DATE_HIRE, DATE_HIRE + 1

FROM EMPLOYEE_PAY_TBL

WHERE EMP_ID = '311549902';

DATE_HIRE DATE_HIRE

---------- ----------

23-MAY-89 24-MAY-89

1 row selected.

Notice that these examples in SQL Server and Oracle, though they differ syntactically from the ANSI examples, derive their results based on the same concept as described by the SQL standard .

Comparing Dates and Time Periods

OVERLAPS is a powerful standard SQL conditional operator for DATETIME values. The OVERLAPS operator is used to compare two timeframes and return the Boolean value TRUE or FALSE, depending on whether the two timeframes overlap. The following comparison returns the value TRUE:

(TIME '01:00:00' , TIME '05:59:00')

OVERLAPS

(TIME '05:00:00' , TIME '07:00:00')

The following comparison returns the value FALSE:

(TIME '01:00:00' , TIME '05:59:00')

OVERLAPS

(TIME '06:00:00 , TIME '07:00:00')

Miscellaneous Date Functions

The following list shows some powerful date functions that exist in the implementations for SQL Server and Oracle.

SQL SERVER
`DATEPART`	Returns the integer value of a `DATEPART` for a date
`DATENAME`	Returns the text value of a `DATEPART` for a date
`GETDATE()`	Returns the system date
`DATEDIFF`	Returns the difference between two dates for specified date parts, such as days, minutes, and seconds

ORACLE
`NEXT_DAY`	Returns the next day of the week as specified (for example, `FRIDAY`) since a given date
`MONTHS_BETWEEN`	Returns the number of months between two given dates

Date Conversions

The conversion of dates takes place for any number of reasons. Conversions are mainly used to alter the data type of values defined as a DATETIME value or any other valid data type of a particular implementation .

Typical reasons for date conversions are as follows:

· To compare date values of different data types

· To format a date value as a character string

· To convert a character string into a date format

The ANSI CAST operator is used to convert data types into other data types .

The basic syntax is as follows:

CAST ( EXPRESSION AS NEW_DATA_TYPE )

Specific examples according to the syntax of some implementations are illustrated in the following subsections, covering:

· The representation of parts of a DATETIME value

· Conversions of dates to character strings

· Conversions of character strings to dates

Date Pictures

A date picture is composed of formatting elements used to extract date and time information from the database in a desired format. Date pictures may not be available in all SQL implementations.

Without the use of a date picture and some type of conversion function, the date and time information is retrieved from the database in a default format, such as:

1999-12-31

31-DEC-99

1999-12-31 23:59:01.11

…

What if you wanted the date displayed as the following? You have to convert the date from a DATETIME format into a character string format :

December 31, 1997

This is accomplished by implementation-specific functions for this very purpose, further illustrated in the following sections.

Sybase date pictures :
`yy`	year
`qq`	quarter
`mm`	month
`dy`	day of year
`wk`	week
`dw`	weekday
`hh`	hour
`mi`	minute
`ss`	second
`ms`	millisecond

Oracle date pictures :
`AD`	anno Domini
`AM`	ante meridian
`BC`	Before Christ
`CC`	Century
`D`	Number of the day in the week
`DD`	Number of the day in the month
`DDD`	Number of the day in the year
`DAY`	The day spelled out (MONDAY)
`Day`	The day spelled out (Monday)
`day`	The day spelled out (monday)
`DY`	The three-letter abbreviation of day (MON)
`Dy`	The three-letter abbreviation of day (Mon)
`dy`	The three-letter abbreviation of day (mon)
`HH`	Hour of the day
`HH12`	Hour of the day
`HH24`	Hour of the day for a 24-hour clock
`J`	Julian days since 12-31-4713 b.c.
`MI`	Minute of the hour
`MM`	The number of the month
`MON`	The three-letter abbreviation of the month (JAN)
`Mon`	The three-letter abbreviation of the month (Jan)
`mon`	The three-letter abbreviation of the month (jan)
`MONTH`	The month spelled out (JANUARY)
`Month`	The month spelled out (January)
`month`	The month spelled out (january)
`PM`	post meridian
`Q`	The number of the quarter
`RM`	The Roman numeral for the month
`RR`	The two digits of the year
`SS`	The second of a minute
`SSSSS`	The seconds since midnight
`SYYYY`	The signed year; if b.c. 500, b.c. = -500
`W`	The number of the week in a month
`WW`	The number of the week in a year
`Y`	The last digit of the year
`YY`	The last two digits of the year
`YYY`	The last three digits of the year
`YYYY`	The year
`YEAR`	The year spelled out (NINETEEN-NINETY-NINE)
`Year`	The year spelled out (Nineteen-Ninety-Nine)
`year`	The year spelled out (nineteen-ninety-nine)

Converting Dates to Character Strings

DATETIME values are converted to character strings to alter the appearance of output from a query. A conversion function is used to achieve this. Two examples, the first using SQL Server, of converting date and time data into a character string as designated by a query follow :

SELECT DATE_HIRE = DATENAME(MONTH, DATE_HIRE)

FROM EMPLOYEE_PAY_TBL

DATE_HIRE

-----

May

June

August

June

July

Jan

6 rows affected.

The following is an Oracle date conversion using the TO_CHAR function :

SELECT DATE_HIRE, TO_CHAR(DATE_HIRE,'Month dd, yyyy') HIRE

FROM EMPLOYEE_PAY_TBL;

DATE_HIRE HIRE

------ ---------

23-MAY-89 May       23, 1989

17-JUN-90 June      17, 1990

14-AUG-94 August    14, 1994

28-JUN-97 June      28, 1997

22-JUL-96 July      22, 1996

14-JAN-91 January   14, 1991

6 rows selected.

Converting Character Strings to Dates

The following example illustrates a method from one implementation of converting a character string into a date format. When the conversion is complete, the data can be stored in a column defined as having some form of a DATETIME data type.

SELECT TO_DATE('JANUARY 01 1998','MONTH DD YYYY')

FROM EMPLOYEE_PAY_TBL;

TO_DATE('

-----

01-JAN-99

01-JAN-99

01-JAN-99

01-JAN-99

01-JAN-99

01-JAN-99

6 rows selected.

You may be wondering why six rows were selected from this query when only one date value was provided. The reason is because the conversion of the literal string was selected from the EMPLOYEE_PAY_TBL, which has six rows of data. Hence, the conversion of the literal string was selected against each record in the table .

Selecting Data from Multiple Tables

Having the capability to select data from multiple tables is one of SQL's most powerful features. Without this capability, the entire relational database concept would not be feasible. Single-table queries are sometimes quite informative, but in the real world, the most practical queries are those whose data is acquired from multiple tables within the database.

A join combines two or more tables to retrieve data from multiple tables.

You might ask why you should normalize tables if, in the end, you are only going to rejoin the tables to retrieve the data you want. You rarely select all data from all tables, so it is better to pick and choose according to the needs of each individual query. Although performance may suffer slightly due to a normalized database, overall coding and maintenance are much simpler.

Types of Joins

While different implementations have many ways of joining tables, you concentrate on the most common joins in this lesson. The types of joins that you learn are

EQUIJOINS

NATURAL JOINS

NON-EQUIJOINS

OUTER JOINS

SELF JOINS

Component Locations of a Join Condition

As you have learned from previous lessons, the SELECT and FROM clauses are both required SQL statement elements; the WHERE clause is a required element of an SQL statement when joining tables. The tables being joined are listed in the FROM clause. The join is performed in the WHERE clause. Several operators can be used to join tables, such as =, <, >, <>, <=, >=,!=, BETWEEN, LIKE, and NOT; they can all be used to join tables. However, the most common operator is the equal symbol.

Joins of Equality

Perhaps the most used and important of the joins is the EQUIJOIN, also referred to as an INNER JOIN. The EQUIJOIN joins two tables with a common column in which each is usually the primary key.

The syntax for an EQUIJOIN is

SELECT TABLE1.COLUMN1,TABLE2.COLUMN2…

FROM TABLE1, TABLE2 [, TABLE3 ]

WHERE TABLE1.COLUMN_NAME= TABLE2.COLUMN_NAME

[ AND TABLE1.COLUMN_NAME = TABLE3.COLUMN_NAME]

Note

Take note of the example SQL statements. Indentation is used in the SQL statements to improve overall readability. Indentation is not required, but is recommended.

Look at the following example :

SELECT EMPLOYEE_TBL.EMP_ID,

       EMPLOYEE_PAY_TBL.DATE_HIRE

FROM EMPLOYEE_TBL,

       EMPLOYEE_PAY_TBL

WHERE EMPLOYEE_TBL.EMP_ID = EMPLOYEE_PAY_TBL.EMP_ID;

This SQL statement returns the employee identification and the employee's date of hire. The employee identification is selected from the EMPLOYEE_TBL (although it exists in both tables, you must specify one table), while the hire date is selected from the EMPLOYEE_PAY_TBL. Because the employee identification exists in both tables, both columns must be justified with the table name. By justifying the columns with the table names, you tell the database server where to get the data.

Data in the following example is selected from tables EMPLOYEE_TBL and EMPLOYEE_PAY_TBL tables because desired data resides in each of the two tables. An equality join is used.

SELECT EMPLOYEE_TBL.EMP_ID, EMPLOYEE_TBL.LAST_NAME,

       EMPLOYEE_PAY_TBL.POSITION

FROM EMPLOYEE_TBL, EMPLOYEE_PAY_TBL

WHERE EMPLOYEE_TBL.EMP_ID = EMPLOYEE_PAY_TBL.EMP_ID;

EMP_ID    LAST_NAM POSITION

--------- -------- -------------

311549902 STEPHENS MARKETING

442346889 PLEW     TEAM LEADER

213764555 GLASS    SALES MANAGER

313782439 GLASS    SALESMAN

220984332 WALLACE  SHIPPER

443679012 SPURGEON SHIPPER

6 rows selected.

Notice that each column in the SELECT clause is preceded by the associated table name in order to identify each column. This is called qualifying columns in a query. Qualifying columns is only necessary for columns that exist in more than one table referenced by a query. You usually qualify all columns for consistency and to avoid any questions when debugging or modifying SQL code.

Natural Joins

A NATURAL JOIN is nearly the same as the EQUIJOIN; however, the NATURAL JOIN differs from the EQUIJOIN by eliminating duplicate columns in the joining columns. The JOIN condition is the same, but the columns selected differ.

The syntax is as follows:

SELECT TABLE1.*, TABLE2.COLUMN_NAME

       [ TABLE3.COLUMN_NAME ]

FROM TABLE1, TABLE2 [ TABLE3 ]

WHERE TABLE1.COLUMN_NAME = TABLE2.COLUMN_NAME

[ AND TABLE1.COLUMN_NAME = TABLE3.COLUMN ]

Look at the following example:

SELECT EMPLOYEE_TBL.*, EMPLOYEE_PAY_TBL.SALARY

FROM EMPLOYEE_TBL,

     EMPLOYEE_PAY_TBL

WHERE EMPLOYEE_TBL.EMP_ID =EMPLOYEE_PAY_TBL.EMP_ID;

This SQL statement returns all columns from EMPLOYEE_TBL and SALARY from the EMPLOYEE_PAY_TBL. The EMP_ID is in both tables, but is retrieved only from the EMPLOYEE_TBL because both contain the same information and do not need to be selected.

The following example selects all columns from the EMPLOYEE_TBL table and only one column from the EMPLOYEE_PAY_TBL table. Remember that the asterisk (*) represents all columns of a table .

SELECT EMPLOYEE_TBL.*, EMPLOYEE_PAY_TBL.POSITION

FROM EMPLOYEE_TBL, EMPLOYEE_PAY_TBL

WHERE EMPLOYEE_TBL.EMP_ID = EMPLOYEE_PAY_TBL.EMP_ID;

EMP_ID    LAST_NAM FIRST_NA M ADDRESS       CITY         ST ZIP   PHONE

--------- -------- -------- - ------------- ------------ -- ----- ----------

PAGER      POSITION

---------- --------------

311549902 STEPHENS TINA     D RR 3 BOX 17A GREENWOOD    IN 47890 3178784465

           MARKETING

442346889 PLEW     LINDA    C 3301 BEACON   INDIANAPOLIS IN 46224 3172978990

           TEAM LEADER

213764555 GLASS    BRANDON  S 1710 MAIN ST  WHITELAND    IN 47885 3178984321

3175709980 SALES MANAGER

313782439 GLASS    JACOB    3789 RIVER BLVD INDIANAPOLIS IN 45734 3175457676

8887345678 SALESMAN

220984332 WALLACE  MARIAH   7889 KEYSTONE   INDIANAPOLIS IN 46741 3173325986

           SHIPPER

443679012 SPURGEON TIFFANY  5 GEORGE COURT  INDIANAPOLIS IN 46234 3175679007

           SHIPPER

      6 rows selected.

Note

Notice how the output has wrapped in the previous example. The wrap occurred because the length of the line has exceeded the limit for the line .

Using Table Aliases

The use of table aliases means to rename a table in a particular SQL statement. The renaming is a temporary change. The actual table name does not change in the database. As we will learn later in this lesson, giving the tables aliases is a necessity for the SELF JOIN. Giving tables aliases is most often used to save keystrokes, which results in the SQL statement being shorter and easier to read. In addition, fewer keystrokes means fewer keystroke errors. Giving tables aliases also means that the columns being selected must be qualified with the table alias. The following are some examples of table aliases and the corresponding columns:

SELECT E.EMP_ID, EP.SALARY, EP.DATE_HIRE, E.LAST_NAME

FROM EMPLOYEE_TBL E,

     EMPLOYEE_PAY_TBL EP

WHERE E.EMP_ID = EP.EMP_ID

AND EP.SALARY > 20000;

The tables have been given aliases in the preceding SQL statement. The EMPLOYEE_TBL has been renamed E. The EMPLOYEE_PAY_TBL has been renamed EP. The choice of what to rename the tables is arbitrary. The letter E is chosen because the EMPLOYEE_TBL starts with E. Because the EMPLOYEE_PAY_TBL also begins with the letter E, you could not use E again. Instead, the first letter (E) and the first letter of the second word in the name (PAY) are used as the alias. The selected columns were justified with the corresponding table alias. Note that SALARY was used in the WHERE clause and must also be justified with the table alias .

Joins of Non-Equality

NON-EQUIJOIN joins two or more tables based on a specified column value not equaling a specified column value in another table. The syntax for the NON-EQUIJOIN is

FROM TABLE1, TABLE2 [, TABLE3 ]

WHERE TABLE1.COLUMN_NAME != TABLE2.COLUMN_NAME

[ AND TABLE1.COLUMN_NAME != TABLE2.COLUMN_NAME ]

An example is as follows:

SELECTEMPLOYEE_TBL.EMP_ID,EMPLOYEE_PAY_TBL.DATE_HIRE

FROM EMPLOYEE_TBL,

     EMPLOYEE_PAY_TBL

WHERE EMPLOYEE_TBL.EMP_ID != EMPLOYEE_PAY_TBL.EMP_ID;

The preceding SQL statement returns the employee identification and the date of hire for all employees who do not have a corresponding record in both tables. The following example is a join of non-equality :

SELECT E.EMP_ID, E.LAST_NAME, P.POSITION

FROM EMPLOYEE_TBL E,

     EMPLOYEE_PAY_TBL P

WHERE E.EMP_ID <> P.EMP_ID;

EMP_ID    LAST_NAM POSITION

--------- -------- -------------

442346889 PLEW     MARKETING

213764555 GLASS    MARKETING

313782439 GLASS    MARKETING

220984332 WALLACE MARKETING

443679012 SPURGEON MARKETING

311549902 STEPHENS TEAM LEADER

213764555 GLASS TEAM LEADERS

313782439 GLASS TEAM LEADERS

220984332 WALLACE TEAM LEADER

443679012 SPURGEON TEAM LEADER

311549902 STEPHENS SALES MANAGER

442346889 PLEW SALES MANAGERS

313782439 GLASS SALES MANAGERS

220984332 WALLACE SALES MANAGER

443679012 SPURGEON SALES MANAGER

311549902 STEPHENS SALESMAN

442346889 PLEW     SALESMAN

213764555 GLASS    SALESMAN

220984332 WALLACE  SALESMAN

443679012 SPURGEON SALESMAN

311549902 STEPHENS SHIPPER

442346889 PLEW     SHIPPER

213764555 GLASS    SHIPPER

313782439 GLASS    SHIPPER

443679012 SPURGEON SHIPPER

311549902 STEPHENS SHIPPER

442346889 PLEW     SHIPPER

213764555 GLASS    SHIPPER

313782439 GLASS    SHIPPER

220984332 WALLACE  SHIPPER

30 rows selected.

You may be curious why 30 rows were retrieved when only 6 rows exist in each table. For every record in EMPLOYEE_TBL, there is a corresponding record in EMPLOYEE_PAY_TBL. Because non-equality was tested in the join of the two tables, each row in the first table is paired with all rows from the second table, except for its own corresponding row. This means that each of the 6 rows are paired with 5 unrelated rows in the second table; 6 rows multiplied by 5 rows equals 30 rows total.

In the previous section's test for equality example, each of the six rows in the first table were paired with only one row in the second table (each row's corresponding row); six rows multiplied by one row yields a total of six rows.

Caution

When using NON-EQUIJOIN's you may receive several rows of data that are of no use to you. Check your results carefully.

Outer Joins

An OUTER JOIN is used to return all rows that exist in one table, even though corresponding rows do not exist in the joined table. The (+) symbol is used to denote an OUTER JOIN in a query. The (+) is placed at the end of the table name in the WHERE clause. The table with the (+) should be the table that does not have matching rows. In many implementations, the OUTER JOIN is broken down into joins called LEFT OUTER JOIN, RIGHT OUTER JOIN,and FULL OUTER JOIN. The OUTER JOIN in these implementations is normally optional.

Note

You must check your particular implementation for exact usage and syntax of the OUTER JOIN. The (+) symbol is used by some major implementations, but is non-standard.

The general syntax is

FROM TABLE1

{RIGHT | LEFT | FULL} [OUTER] JOIN

ON TABLE2

The Oracle syntax is

FROM TABLE1, TABLE2 [, TABLE3 ]

WHERE TABLE1.COLUMN_NAME[(+)] = TABLE2.COLUMN_NAME[(+)]

[ AND TABLE1.COLUMN_NAME[(+)] = TABLE3.COLUMN_NAME[(+)]]

Note

The OUTER JOIN can only be used on one side of a join condition; however, you can use an OUTER JOIN on more than one column of the same table in the join condition.

The concept of the OUTER JOIN is explained in the next two examples. In the first example, the product description and the quantity ordered are selected; both values are extracted from two separate tables. One important factor to keep in mind is that there may not be a corresponding record in the ORDERS_TBL table for every product. A regular join of equality is performed:

SELECT P.PROD_DESC, O.QTY

FROM PRODUCTS_TBL P,

     ORDERS_TBL O

WHERE P.PROD_ID = O.PROD_ID;

PROD_DESC                        QTY

-------------------------------- ---

WITCHES COSTUME                    1

PLASTIC PUMPKIN 18 INCH           25

PLASTIC PUMPKIN 18 INCH            2

LIGHTED LANTERNS                  10

FALSE PARAFFIN TEETH              20

KEY CHAIN                          1

6 rows selected.

Only 6 rows were selected, but there are 10 distinct products. You want to display all products, whether the products have been placed on order or not.

The next example accomplishes the desired output through the use of an OUTER JOIN. Oracle's syntax is used for the OUTER JOIN.

SELECT P.PROD_DESC, O.QTY

FROM PRODUCTS_TBL P,

     ORDERS_TBL O

WHERE P.PROD_ID = O.PROD_ID(+);

PROD_DESC                        QTY

-------------------------------- ---

WITCHES COSTUME                    1

ASSORTED MASKS

FALSE PARAFFIN TEETH              20

ASSORTED COSTUMES

PLASTIC PUMPKIN 18 INCH           25

PLASTIC PUMPKIN 18 INCH            2

PUMPKIN CANDY

PLASTIC SPIDERS

CANDY CORN

LIGHTED LANTERNS                  10

KEY CHAIN                          1

OAK BOOKSHELF

12 rows selected.

All products were returned by the query, even though they may not have had a quantity ordered. The outer join is inclusive of all rows of data in the PRODUCTS_TBL table, whether a corresponding row exists in the ORDERS_TBL table or not .

Self Joins

The SELF JOIN is used to join a table to itself, as if the table were two tables, temporarily renaming at least one table in the SQL statement. The syntax is as follows:

SELECT A.COLUMN_NAME, B.COLUMN_NAME, [ C.COLUMN_NAME ]

FROM TABLE1 A, TABLE2 B [, TABLE3 C ]

WHERE A.COLUMN_NAME = B.COLUMN_NAME

[ AND A.COLUMN_NAME = C.COLUMN_NAME ]

The following is an example:

SELECT A.LAST_NAME, B.LAST_NAME, A.FIRST_NAME

FROM EMPLOYEE_TBL A,

     EMPLOYEE_TBL B

WHERE A.LAST_NAME = B.LAST_NAME;

The preceding SQL statement returns the employees' first name for all the employees with the same last name from the EMPLOYEE_TBL. Self joins are useful when all of the data you want to retrieve resides in one table, but you must somehow compare records in the table to other records in the table .

Another common example used to explain a self join is as follows. Suppose you have a table that stores an employee identification number, the employee's name, and the employee identification number of the employee's manager. You may want to produce a list of all employees and their managers' names. The problem is that the manager name does not exist in the table, only the employee name:

SELECT * FROM EMP;

ID   NAME      MGR_ID

---- --------- ------

1    JOHN      0

2    MARY      1

3    STEVE     1

4    JACK      2

5    SUE       2

In the following example, we have included the table EMP twice in the FROM clause of the query, giving the table two aliases for the purpose of the query. By providing two aliases, it is as if you are selecting from two distinct tables. All managers are also employees, so the join condition between the two tables compares the value of the employee identification number from the first table with the manager identification number in the second table. The first table acts as a table that stores employee information, whereas the second table acts as a table that stores manager information:

SELECT E1.NAME, E2.NAME

FROM EMP E1, EMP E2

WHERE E1.MGR_ID = E2.ID;

NAME      NAME

--------- ---------

MARY      JOHN

STEVE     JOHN

JACK      MARY

SUE       MARY

Joining on Multiple Keys

Most join operations that occur involve the merging of data based on a key in one table and a key in another table. Depending on how your database has been designed, you may have to join on more than one key field to accurately depict that data in your database. You may have a table that has a primary key that is comprised of more than one column. You may also have a foreign key in a table that consists of more than one column, which references the multiple column primary key .

Consider the following tables that are used here for examples only:

SQL> desc prod

 Name                                      Null?    Type

------------------------------------------ -------- ----------------------------

 SERIAL_NUMBER                             NOT NULL NUMBER(10)

 VENDOR_NUMBER                             NOT NULL NUMBER(10)

 PRODUCT_NAME                              NOT NULL VARCHAR2(30)

 COST                                      NOT NULL NUMBER(8,2)

SQL> desc ord

 Name                                      Null?    Type

 ----------------------------------------- -------- ----------------------------

 ORD_NO                                    NOT NULL NUMBER(10)

 PROD_NUMBER                               NOT NULL NUMBER(10)

 VENDOR_NUMBER                             NOT NULL NUMBER(10)

 QUANTITY                                  NOT NULL NUMBER(5)

 ORD_DATE                                  NOT NULL DATE

The primary key in PROD is the combination of the columns SERIAL_NUMBER and VENDOR_NUMBER. Perhaps two products can have the same serial number within the distribution company, but each serial number is unique per vendor.

The foreign key in ORD is also the combination of the columns SERIAL_NUMBER and VENDOR_NUMBER.

When selecting data from both tables (PROD and ORD), the join operation may appear as follows:

SELECT P.PRODUCT_NAME, O.ORD_DATE, O.QUANTITY

FROM PROD P, ORD O

WHERE P.SERIAL_NUMBER = O.SERIAL_NUMBER

  AND P.VENDOR_NUMBER = O.VENDOR_NUMBER;

Join Considerations

Several things should be considered before using joins. Some considerations include what columns(s) to join on, if there is no common column to join on, and performance issues.

Using a `BASE TABLE`

What to join on? Should you have the need to retrieve data from two tables that do not have a common column to join, you must use another table that has a common column or columns to both tables to join on. That table becomes the BASE TABLE. A BASE TABLE is used to join one or more tables that have common columns, or to join tables that do not have common columns. Use the following three tables for an example of a base table:

`CUSTOMER_TBL`
`CUST_ID`	`VARCHAR2(10)`	`NOT NULL`	PRIMARY KEY
`CUST_NAME`	`VARCHAR2(30)`	`NOT NULL`
`CUST_ADDRESS`	`VARCHAR2(20)`	`NOT NULL`
`CUST_CITY`	`VARCHAR2(15)`	`NOT NULL`
`CUST_STATE`	`CHAR(2)`	`NOT NULL`
`CUST_ZIP`	`NUMBER(5)`	`NOT NULL`
`CUST_PHONE`	`NUMBER(10)`
`CUST_FAX`	`NUMBER(10)`

`ORDERS_TBL`
`ORD_NUM`	`VARCHAR2(10)`	`NOT NULL`	PRIMARY KEY
`CUST_ID`	`VARCHAR2(10)`	`NOT NULL`
`PROD_ID`	`VARCHAR2(10)`	`NOT NULL`
`QTY`	`NUMBER(6)`	`NOT NULL`
`ORD_DATE`	`DATE`

`PRODUCTS_TBL`
`PROD_ID`	`VARCHAR2(10)`	`NOT NULL`	PRIMARY KEY
`PROD_DESC`	`VARCHAR2(40)`	`NOT NULL`
`COST`	`NUMVER(6,2)`	`NOT NULL`

You have a need to use the CUSTOMERS_TBL and the PRODUCTS_TBL. There is no common column in which to join the tables. Now look at the ORDERS_TBL. The ORDERS_TBL has CUST_ID to join with the CUSTOMERS_TBL, which also has CUST_ID. The PRODUCTS_TBL has PROD_ID, which is also in the ORDERS_TBL. The JOIN conditions and results look like the following :

SELECT C.CUST_NAME, P.PROD_DESC

FROM CUSTOMER_TBL C,

     PRODUCTS_TBL P,

     ORDERS_TBL O

WHERE C.CUST_ID = O.CUST_ID

  AND P.PROD_ID = O.PROD_ID;

CUST_NAME                      PROD_DESC

------------------------------ -----------------------

LESLIE GLEASON                 WITCHES COSTUME

SCHYLERS NOVELTIES             PLASTIC PUMPKIN 18 INCH

WENDY WOLF                     PLASTIC PUMPKIN 18 INCH

GAVINS PLACE                   LIGHTED LANTERNS

SCOTTYS MARKET                 FALSE PARAFFIN TEETH

ANDYS CANDIES                  KEY CHAIN

6 rows selected.

Note

Note the use of table aliases and their use on the columns in the WHERE clause.

The Cartesian Product

The Cartesian Product is a result of a CARTESIAN JOIN or "no join." If you select from two or more tables and do not JOIN the tables, your output is all possible rows from all the tables selected from. If your tables were large, the result could be hundreds of thousands, or even millions, of rows of data. A WHERE clause is highly recommended for SQL statements retrieving data from two or more tables. The Cartesian Product is also known as a cross join.

The syntax is

FROM TABLE1, TABLE2 [, TABLE3 ]

WHERE TABLE1, TABLE2 [, TABLE3 ]

The following is an example of a cross join, or the dreaded Cartesian Product :

SELECT E.EMP_ID, E.LAST_NAME, P.POSITION

FROM EMPLOYEE_TBL E,

     EMPLOYEE_PAY_TBL P;

EMP_ID    LAST_NAM POSITION

--------- -------- --------------

311549902 STEPHENS MARKETING

442346889 PLEW     MARKETING

213764555 GLASS    MARKETING

313782439 GLASS    MARKETING

220984332 WALLACE  MARKETING

443679012 SPURGEON MARKETING

311549902 STEPHENS TEAM LEADER

442346889 PLEW     TEAM LEADER

213764555 GLASS    TEAM LEADER

313782439 GLASS    TEAM LEADER

220984332 WALLACE  TEAM LEADER

443679012 SPURGEON TEAM LEADER

311549902 STEPHENS SALES MANAGER

442346889 PLEW     SALES MANAGER

213764555 GLASS    SALES MANAGER

313782439 GLASS    SALES MANAGER

220984332 WALLACE  SALES MANAGER

443679012 SPURGEON SALES MANAGER

311549902 STEPHENS SALESMAN

442346889 PLEW     SALESMAN

213764555 GLASS    SALESMAN

313782439 GLASS    SALESMAN

220984332 WALLACE  SALESMAN

443679012 SPURGEON SALESMAN

311549902 STEPHENS SHIPPER

442346889 PLEW     SHIPPER

213764555 GLASS    SHIPPER

313782439 GLASS    SHIPPER

220984332 WALLACE  SHIPPER

443679012 SPURGEON SHIPPER

311549902 STEPHENS SHIPPER

442346889 PLEW     SHIPPER

213764555 GLASS    SHIPPER

313782439 GLASS    SHIPPER

220984332 WALLACE  SHIPPER

443679012 SPURGEON SHIPPER

36 rows selected.

Data is being selected from two separate tables, yet no JOIN operation is performed. Because you have not specified how to join rows in the first table with rows in the second table, the database server pairs every row in the first table with every row in the second table. Because each table has 6 rows of data each, the product of 36 rows selected is achieved from 6 rows multiplied by 6 rows.

To fully understand exactly how the Cartesian Product is derived, study the following example .

SQL> SELECT X FROM TABLE1;

4 rows selected.

SQL> SELECT V FROM TABLE2;

4 rows selected.

SQL> SELECT TABLE1.X, TABLE2.X

  2* FROM TABLE1, TABLE2;

X X

- -

A A

B A

C A

D A

A B

B B

C B

D B

A C

B C

C C

D C

A D

B D

C D

D D

16 rows selected.

Caution

Be careful to always join all tables in a query. If two tables in a query have not been joined and each table contains 1,000 rows of data, the Cartesian Product consists of 1,000 rows multiplied by 1,000 rows, which results in a total of 1,000,000 rows of data returned.

What Is a Sub query?

A sub query is a query embedded within the WHERE clause of another query to further restrict data returned by the query. A sub query is a query within another query, also known as a nested query. A sub query is used to return data that will be used in the main query as a condition to further restrict the data to be retrieved. Sub queries are used with the SELECT, INSERT, UPDATE, and DELETE statements.

A sub query can be used in some cases in place of a join operation by indirectly linking data between the tables based on one or more conditions. When a sub query is used in a query, the sub query is resolved first, and then the main query is resolved according to the condition(s) as resolved by the sub query. The results of the sub query are used to process expressions in the WHERE clause of the main query. The sub query can either be used in the WHERE clause or the HAVING clause of the main query. Logical and relational operators, such as =, >, <, <>, IN, NOT IN, AND, OR, and so on, can be used within the sub query as well to evaluate a sub query in the WHERE or HAVING clause.

Note

The same rules that apply to standard queries also apply to sub queries. Join operations, functions, conversions, and other options can be used within a sub query.

There are a few rules that sub queries must follow:

· Sub queries must be enclosed within parentheses.

· A sub query can have only one column in the SELECT clause, unless multiple columns are in the main query for the sub query to compare its selected columns.

· An ORDER BY cannot be used in a sub query, although the main query can use an ORDER BY. The GROUP BY can be used to perform the same function as the ORDER BY in a sub query.

· Sub queries that return more than one row can only be used with multiple value operators, such as the IN operator.

· The SELECT list cannot include any references to values that evaluate to a BLOB, ARRAY, CLOB, or NCLOB.

· A sub query cannot be immediately enclosed in a set function.

· The BETWEEN operator cannot be used with a sub query; however, the BETWEEN can be used within the sub query.

The basic syntax for a sub query is as follows:

SELECT COLUMN_NAME

FROM TABLE

WHERE COLUMN_NAME = (SELECT COLUMN_NAME

                     FROM TABLE

                     WHERE CONDITIONS);

Note

Notice the use of indentation in our examples. The use of indentation is merely for readability. We have found that when looking for errors in SQL statements, the neater your statements are, the easier it is to read and find any errors in syntax.

The following examples show how the BETWEEN operator can and cannot be used with a sub query:

The following is an example of a correct use of BETWEEN in the sub query:

SELECT COLUMN_NAME

FROM TABLE

WHERE COLUMN_NAME OPERATOR (SELECT COLUMN_NAME

                            FROM TABLE)

                            WHERE VALUE BETWEEN VALUE)

The following is an example of an illegal use of BETWEEN with a sub query:

SELECT COLUMN_NAME

FROM TABLE

WHERE COLUMN_NAME BETWEEN VALUE AND (SELECT COLUMN_NAME

                                     FROM TABLE)

Sub queries with the `SELECT` Statement

Sub queries are most frequently used with the SELECT statement, although they can be used within a data manipulation statement as well. The sub query, when used with the SELECT statement, retrieves data for the main query to use to solve the main query.

The basic syntax is as follows:

SELECT COLUMN_NAME [, COLUMN_NAME ]

FROM TABLE1 [, TABLE2 ]

WHERE COLUMN_NAME OPERATOR

                  (SELECT COLUMN_NAME [, COLUMN_NAME ]

                   FROM TABLE1 [, TABLE2 ]

                  [ WHERE ])

The following is an example:

SELECT E.EMP_ID, E.LAST_NAME, E.FIRST_NAME, EP.PAY_RATE

FROM EMPLOYEE_TBL E, EMPLOYEE_PAY_TBL EP

WHERE E.EMP_ID = EP.EMP_ID

AND EP.PAY_RATE > (SELECT PAY_RATE

                   FROM EMPLOYEE_PAY_TBL

                   WHERE EMP_ID = '313782439')

The preceding SQL statement returns the employee identification, last name, first name, and pay rate for all employees who have a pay rate greater than that of the employee with the identification 313782439. In this case, you do not necessarily know (or care) what the exact pay rate is for this particular employee; you only care about the pay rate for the purpose of getting a list of employees who bring home more than the employee specified in the sub query.

The next query selects the pay rate for a particular employee. This query is used as the sub query in the following example .

SELECT PAY_RATE

FROM EMPLOYEE_PAY_TBL

WHERE EMP_ID = '220984332';

  PAY_RATE

----------

1 row selected.

The previous query is used as a sub query in the WHERE clause of the following query.

SELECT E.EMP_ID, E.LAST_NAME, E.FIRST_NAME, EP.PAY_RATE

FROM EMPLOYEE_TBL E, EMPLOYEE_PAY_TBL EP

WHERE E.EMP_ID = EP.EMP_ID

  AND EP.PAY_RATE > (SELECT PAY_RATE

                     FROM EMPLOYEE_PAY_TBL

                     WHERE EMP_ID = '220984332');

EMP_ID    LAST_NAM FIRST_NA   PAY_RATE

--------- -------- ---------- --------

442346889 PLEW     LINDA         14.75

443679012 SPURGEON TIFFANY          15

2 rows selected.

The result of the sub query is 11 (shown in the last example), so the last condition of the WHERE clause is evaluated as

AND EP.PAY_RATE > 11

You did not know the value of the pay rate for the given individual when you executed the query. However, the main query was able to compare each individual's pay rate to the sub query results.

Note

Sub queries are frequently used to place conditions on a query when the exact conditions are unknown. The salary for 220984332 was unknown, but the sub query was designed to do the footwork for you.

Sub queries with the `INSERT` Statement

Sub queries also can be used in conjunction with data manipulation language (DML) statements. The INSERT statement is the first instance you examine. The INSERT statement uses the data returned from the sub query to insert into another table. The selected data in the sub query can be modified with any of the character, date, or number functions.

The basic syntax is as follows:

INSERT INTO TABLE_NAME [ (COLUMN1 [, COLUMN2 ]) ]

SELECT [ *|COLUMN1 [, COLUMN2 ]

FROM TABLE1 [, TABLE2 ]

[ WHERE VALUE OPERATOR ]

The following is an example of the INSERT statement with a sub query:

INSERT INTO RICH_EMPLOYEES

SELECT E.EMP_ID, E.LAST_NAME, E.FIRST_NAME, EP.PAY_RATE

FROM EMPLOYEE_TBL E, EMPLOYEE_PAY_TBL EP

WHERE E.EMP_ID = EP.EMP_ID

  AND EP.PAY_RATE > (SELECT PAY_RATE

                     FROM EMPLOYEE_PAY_TBL

                     WHERE EMP_ID = '220984332');

2 rows created.

This INSERT statement inserts the EMP_ID, LAST_NAME, FIRST_NAME, and PAY_RATE into a table called RICH_EMPLOYEES for all records of employees who have a pay rate greater than the pay rate of the employee with identification 220984332.

Note

Remember to use the COMMIT and ROLLBACK commands when using DML commands such as the INSERT statement.

Sub queries with the `UPDATE` Statement

The sub query can be used in conjunction with the UPDATE statement. Either single or multiple columns in a table can be updated when using a sub query with the UPDATE statement.

The basic syntax is as follows:

UPDATE TABLE

SET COLUMN_NAME [, COLUMN_NAME) ] =

    (SELECT ]COLUMN_NAME [, COLUMN_NAME) ]

    FROM TABLE

    [ WHERE ]

Examples showing the use of the UPDATE statement with a sub query follow. The first query returns the employee identification of all employees that reside in Indianapolis. You can see that there are four individuals who meet these criterions.

SELECT EMP_ID

FROM EMPLOYEE_TBL

WHERE CITY = 'INDIANAPOLIS';

EMP_ID

---------

442346889

313782439

220984332

443679012

4 rows selected.

The first query is used as the sub query in the following UPDATE statement. The first query proves how many employee identifications are returned by the sub query. The following is the UPDATE with the sub query:

UPDATE EMPLOYEE_PAY_TBL

SET PAY_RATE = PAY_RATE * 1.1

WHERE EMP_ID IN (SELECT EMP_ID

                 FROM EMPLOYEE_TBL

                 WHERE CITY = 'INDIANAPOLIS');

4 rows updated.

As expected, four rows are updated. One very important thing to notice is that, unlike the example in the first section, this sub query returns multiple rows of data. Because you expect multiple rows to be returned, you have used the IN operator instead of the equal sign. Remember that IN is used to compare an expression to values in a list. If the equal sign was used, an error would have been returned.

Caution

Be sure to use the correct operator when evaluating a sub query. For example, an operator used to compare an expression to one value, such as the equal sign, cannot be used to evaluate a sub query that returns more than one row of data.

Sub queries with the `DELETE` Statement

The sub query also can be used in conjunction with the DELETE statement.

The basic syntax is as follows:

DELETE FROM TABLE_NAME

[ WHERE OPERATOR [ VALUE ]

                (SELECT COLUMN_NAME

                FROM TABLE_NAME)

                [ WHERE) ]

In this example, you delete BRANDON GLASS's record from the EMPLOYEE_PAY_TBL table. You do not know Brandon's employee identification number, but you can use a sub query to get his identification from the EMPLOYEE_TBL table, which contains the FIRST_NAME and LAST_NAME columns.

DELETE FROM EMPLOYEE_PAY_TBL

WHERE EMP_ID = (SELECT EMP_ID

                FROM EMPLOYEE_TBL

                WHERE LAST_NAME = 'GLASS'

                  AND FIRST_NAME = 'BRANDON');

1 row deleted.

Caution

Do not forget the use of the WHERE clause with the UPDATE and DELETE statements. All rows are updated or deleted from the target table if the WHERE clause is not used.

Embedding a Sub query Within a Sub query

A sub query can be embedded within another sub query, just as you can embed the sub query within a regular query. When a sub query is used, that sub query is resolved before the main query. Likewise, the lowest level sub query is resolved first in embedded or nested sub queries, working out to the main query.

Note

You must check your particular implementation for limits on the number of sub queries, if any, that can be used in a single statement. It may differ between vendors.

The basic syntax for embedded sub queries is as follows:

SELECT COLUMN_NAME [, COLUMN_NAME ]

FROM TABLE1 [, TABLE2 ]

WHERE COLUMN_NAME OPERATOR (SELECT COLUMN_NAME

                            FROM TABLE

                            WHERE COLUMN_NAME OPERATOR

                                    (SELECT COLUMN_NAME

                                    FROM TABLE

                                    [ WHERE COLUMN_NAME OPERATOR VALUE ]))

The following example uses two sub queries, one embedded within the other. You want to find out what customers have placed orders where the quantity multiplied by the cost of a single order is greater than the sum of the cost of all products.

SELECT CUST_ID, CUST_NAME

FROM CUSTOMER_TBL

WHERE CUST_ID IN (SELECT O.CUST_ID)

                  FROM, ORDERS_TBL O, PRODUCTS_TBL P

                  WHERE O PROD_ID = P.PROD_ID

                    AND O.QTY + P.COST < (SELECT SUM(COST)

                                          FROM PRODUCTS_TBL));

CUST_ID    CUST_NAME

---------- ------------------

090        WENDY WOLF

232        LESLIE GLEASON

287        GAVINS PLACE

43         SCHYLERS NOVELTIES

432        SCOTTYS MARKET

560        ANDYS CANDIES

6 rows selected.

Six rows that met the criteria of both sub queries were selected.

The following two examples show the results of each of the sub queries to aid your understanding of how the main query was resolved.

SELECT SUM(COST) FROM PRODUCTS_TBL;

 SUM(COST)

----------

     138.08

1 row selected.

SELECT O.CUST_ID

FROM ORDERS_TBL O, PRODUCTS_TBL P

WHERE O.PROD_ID = P.PROD_ID

  AND O.QTY * P.COST > 72.14;

CUST_ID

-------

2 rows selected.

In essence, the main query (after the resolution of the sub queries) is evaluated, as shown in the following example, the substitution of the second sub query:

SELECT CUST_ID, CUST_NAME

FROM CUSTOMER_TBL

WHERE CUST_ID IN (SELECT O.CUST_ID

                  FROM ORDERS_TBL O, PRODUCTS_TBL P

                  WHERE O.PROD_ID = P.PROD_ID

                    AND O.QTY * P.COST > 72.14);

The following shows the substitution of the first sub query:

SELECT CUST_ID, CUST_NAME

FROM CUSTOMER_TBL

WHERE CUST_ID IN ('287','43');

The following is the final result:

CUST_ID    CUST_NAME

---------- ------------------

43         SCHYLERS NOVELTIES

287        GAVINS PLACE

2 rows selected.

Caution

The use of multiple sub queries results in slower response time and may result in reduced accuracy of the results due to possible mistakes in the statement coding.

Correlated Sub queries

Correlated sub queries are common in many SQL implementations. The concept of correlated sub queries is discussed as an ANSI standard SQL topic and is covered briefly in this lesson. A correlated sub query is a sub query that is dependent upon information in the main query.

In the following example, the table join between CUSTOMER_TBL and ORDERS_TBL in the sub query is dependent on the alias for CUSTOMER_TBL (C) in the main query. This query returns the name of all customers that have ordered more than 10 units of one or more items.

SELECT C.CUST_NAME

FROM CUSTOMER_TBL C

WHERE 10 < (SELECT SUM(0.QTY)

            FROM ORDERS_TBL 0

            WHERE C.CUST_ID = O.CUST_ID

CUST_NAME

------------------

SCOTTYS MARKET

SCHYLERS NOVELTIES

MARYS GIFT SHOP

Note

In the case of a correlated sub query, the reference to the table in the main query must be accomplished before the sub query can be resolved.

The sub query is slightly modified in the next statement to show you the total quantity of units ordered for each customer, allowing the previous results to be verified.

SELECT C.CUST_NAME, SUM(0.QTY)

FROM CUSTOMER_TBL C,

     ORDERS_TBL O

WHERE C. CUST_ID = O.CUST_TD

GROUP BY C.CUST_NAME;

CUST_NAME                      SUM(O.QTY)

------------------------------ ----------

ANDYS CANDIES                           1

GAVINS PLACE                           10

LESLIE GLEASON                          1

MARYS GIFT SHOP                       100

SCHYLERS NOVELTIES                     25

SCOTTYS MARKET                         20

WENDY WOLF                              2

7 rows selected.

The GROUP BY clause in this example is required because another column is being selected with the aggregate function SUM. This gives you a sum for each customer. In the original sub query, a GROUP BY clause is not required because SUM is used to achieve a total for the entire query, which is run against the record for each individual customer.

Single Queries Versus Compound Queries

The single query is one SELECT statement, while the compound query includes two or more SELECT statements.

Compound queries are formed by using some type of operator that is used to join the two queries. The UNION operator in the following examples is used to join two queries.

A single SQL statement could be written as follows:

SELECT EMP_ID, SALARY, PAY_RATE

FROM EMPLOYEE_PAY_TBL

WHERE SALARY IS NOT NULL OR

PAY_RATE IS NOT NULL;

This is the same statement using the UNION operator:

SELECT EMP_ID, SALARY

FROM EMPLOYEE_PAY_TBL

WHERE SALARY IS NOT NULL

UNION

SELECT EMP_ID, PAY_RATE

FROM EMPLOYEE_PAY_TBL

WHERE PAY_RATE IS NOT NULL;

The previous statements return pay information for all employees who are paid either hourly or salaried .

Note

If you executed the second query, the output has two column headings: EMP_ID and SALARY. Each individual's pay rate is listed under the SALARY column. When using the UNION operator, column headings are determined by column names or column aliases used in the first SELECT of the UNION.

Why Would I Ever Want to Use a Compound Query?

Compound operators are used to combine and restrict the results of two SELECT statements. These operators can be used to return or suppress the output of duplicate records. Compound operators can bring together similar data that is stored in different fields.

Compound queries allow you to combine the results of more than one query to return a single set of data. Compound queries are often simpler to write than a single query with complex conditions. Compound queries also allow for more flexibility regarding the never-ending task of data retrieval.

Compound Query Operators

The compound query operators vary among database vendors. The ANSI standard includes the UNION, UNION ALL, EXCEPT, and INTERSECT operators, all of which are discussed in the following sections.

The `UNION` Operator

The UNION operator is used to combine the results of two or more SELECT statements without returning any duplicate rows. In other words, if a row of output exists in the results of one query, the same row is not returned, even though it exists in the second query that combined with a UNION operator. To use UNION, each SELECT must have the same number of columns selected, the same number of column expressions, the same data type, and have them in the same order—but they do not have to be the same length.

The syntax is as follows:

SELECT COLUMN1 [, COLUMN2 ]

FROM TABLE1 [, TABLE2 ]

[ WHERE ]

UNION

SELECT COLUMN1 [, COLUMN2 ]

FROM TABLE1 [, TABLE2 ]

[ WHERE ]

Look at the following example:

SELECT EMP_ID FROM EMPLOYEE_TBL

UNION

SELECT EMP_ID FROM EMPLOYEE_PAY_TBL;

Those employee IDs that are in both tables appear only once in the results .

This lesson's examples begin with a simple SELECT from two tables:

SELECT PROD_DESC FROM PRODUCTS_TBL;

PROD_DESC

----------------------

WITCHES COSTUME

PLASTIC PUMPKIN 18 INCH

FALSE PARAFFIN TEETH

LIGHTED LANTERNS

ASSORTED COSTUMES

CANDY CORN

PUMPKIN CANDY

PLASTIC SPIDERS

ASSORTED MASKS

KEY CHAIN

OAK BOOKSHELF

11 rows selected.

SELECT PROD_DESC FROM PRODUCTS_TMP;

PROD_DESC

--------------------

WITCHES COSTUME

PLASTIC PUMPKIN 18 INCH

FALSE PARAFFIN TEETH

LIGHTED LANTERNS

ASSORTED COSTUMES

CANDY CORN

PUMPKIN CANDY

PLASTIC SPIDERS

ASSORTED MASKS

KEY CHAIN

OAK BOOKSHELF

11 rows selected.

Now , combine the same two queries with the UNION operator, making a compound query.

SELECT PROD_DESC FROM PRODUCTS_TBL

UNION

SELECT PROD_DESC FROM PRODUCTS_TMP;

PROD_DESC

-----------------------

ASSORTED COSTUMES

ASSORTED MASKS

CANDY CORN

FALSE PARAFFIN TEETH

LIGHTED LANTERNS

PLASTIC PUMPKIN 18 INCH

PLASTIC SPIDERS

PUMPKIN CANDY

WITCHES COSTUME

KEY CHAIN

OAK BOOKSHELF

11 rows selected.

In the first query, nine rows of data were returned, and six rows of data were returned from the second query. Nine rows of data are returned when the UNION operator combines the two queries. Only nine rows are returned because duplicate rows of data are not returned when using the UNION operator.

The next example shows an example of combining two unrelated queries with the UNION operator :

SELECT PROD_DESC FROM PRODUCTS_TBL

UNION

SELECT LAST_NAME FROM EMPLOYEE_TBL;

PROD_DESC

-----------------------

ASSORTED COSTUMES

ASSORTED MASKS

CANDY CORN

FALSE PARAFFIN TEETH

GLASS

KEY CHAIN

LIGHTED LANTERNS

OAK BOOKSHELF

PLASTIC PUMPKIN 18 INCH

PLASTIC SPIDERS

PLEW

PUMPKIN CANDY

SPURGEON

STEPHENS

WALLACE

WITCHES COSTUME

16 rows selected.

The PROD_DESC and LAST_NAME values are listed together, and the column heading taken is from the column name in the first query.

The `UNION` ALL Operator

The UNION ALL operator is used to combine the results of two SELECT statements including duplicate rows. The same rules that apply to UNION apply to the UNION ALL operator. The UNION and UNION ALL operators are the same, although one returns duplicate rows of data where the other does not.

The syntax is as follows:

SELECT COLUMN1 [, COLUMN2 ]

FROM TABLE1 [, TABLE2 ]

[ WHERE ]

UNION ALL

SELECT COLUMN1 [, COLUMN2 ]

FROM TABLE1 [, TABLE2 ]

[ WHERE ]

Look at the following example:

SELECT EMP_ID FROM EMPLOYEE_TBL

UNION ALL

SELECT EMP_ID FROM EMPLOYEE_PAY_TBL

The preceding SQL statement returns all employee IDs from both tables and shows duplicates .

The following is the same compound query in the previous section with the UNION ALL operator:

SELECT PROD_DESC FROM PRODUCTS_TBL

UNION ALL

SELECT PROD_DESC FROM PRODUCTS_TMP;

PROD_DESC

-----------------------

WITCHES COSTUME

PLASTIC PUMPKIN 18 INCH

FALSE PARAFFIN TEETH

LIGHTED LANTERNS

ASSORTED COSTUMES

CANDY CORN

PUMPKIN CANDY

PLASTIC SPIDERS

ASSORTED MASKS

KEY CHAIN

OAK BOOKSHELF

WITCHES COSTUME

PLASTIC PUMPKIN 18 INCH

FALSE PARAFFIN TEETH

LIGHTED LANTERNS

ASSORTED COSTUMES

CANDY CORN

PUMPKIN CANDY

PLASTIC SPIDERS

ASSORTED MASKS

KEY CHAIN

OAK BOOKSHELF

22 rows selected.

Notice that there were 22 rows returned in this query (9+6) because duplicate records are retrieved with the UNION ALL operator.

The `INTERSECT` Operator

The INTERSECT operator is used to combine two SELECT statements, but returns rows only from the first SELECT statement that are identical to a row in the second SELECT statement. Just as with the UNION operator, the same rules apply when using the INTERSECT operator.

The syntax is as follows:

SELECT COLUMN1 [, COLUMN2 ]

FROM TABLE1 [, TABLE2 ]

[ WHERE ]

INTERSECT

SELECT COLUMN1 [, COLUMN2 ]

FROM TABLE1 [, TABLE2 ]

[ WHERE ]

Look at the following example:

SELECT CUST_ID FROM CUSTOMER_TBL

INTERSECT

SELECT CUST_ID FROM ORDERS_TBL;

The preceding SQL statement returns the customer identification for those customers who have placed an order.

The following example illustrates the INTERSECT using the two original queries in this lesson:

SELECT PROD_DESC FROM PRODUCTS_TBL

INTERSECT

SELECT PROD_DESC FROM PRODUCTS_TMP;

PROD_DESC

--------------------

ASSORTED COSTUMES

ASSORTED MASKS

CANDY CORN

FALSE PARAFFIN TEETH

KEY CHAIN

LIGHTED LANTERNS

OAK BOOKSHELF

PLASTIC PUMPKIN 18 INCH

PLASTIC SPIDERS

PUMPKIN CANDY

WITCHES COSTUME

11 rows selected.

Only eleven rows are returned, because only eleven rows were identical between the output of the two single queries.

The `EXCEPT` Operator

The EXCEPT operator combines two SELECT statements and returns rows from the first SELECT statement that are not returned by the second SELECT statement. Once again, the same rules that apply to the UNION operator also apply to the EXCEPT operator.

The syntax is as follows:

SELECT COLUMN1 [, COLUMN2 ]

FROM TABLE1 [, TABLE2 ]

[ WHERE ]

EXCEPT

SELECT COLUMN1 [, COLUMN2 ]

FROM TABLE1 [, TABLE2 ]

[ WHERE ]

Study the following example:

SELECT PROD_DESC FROM PRODUCTS_TBL

EXCEPT

SELECT PROD_DESC FROM PRODUCTS_TMP;

PROD_DESC

-----------------------

PLASTIC PUMPKIN 18 INCH

PLASTIC SPIDERS

PUMPKIN CANDY

3 rows selected.

According to the results, there were three rows of data returned by the first query that were not returned by the second query.

Note

The EXCEPT operator is known as the MINUS operator in some implementations. Check your implementation for the operator name that performs the EXCEPT operator's function .

SELECT PROD_DESC FROM PRODUCTS_TBL

MINUS

SELECT PROD_DESC FROM PRODUCTS_TMP;

PROD_DESC

-----------------------

PLASTIC PUMPKIN 18 INCH

PLASTIC SPIDERS

PUMPKIN CANDY

3 rows selected.

Using an `ORDER BY` with a Compound Query

The ORDER BY clause can be used with a compound query. However, the ORDER BY can only be used to order the results of both queries. Therefore, there can be only one ORDER BY clause in a compound query, even though the compound query may consist of multiple individual queries or SELECT statements. The ORDER BY must reference the columns being ordered by an alias or by the number of column order.

The syntax is as follows:

SELECT COLUMN1 [, COLUMN2 ]

FROM TABLE1 [, TABLE2 ]

[ WHERE ]

OPERATOR{UNION | EXCEPT | INTERSECT | UNION ALL}

SELECT COLUMN1 [, COLUMN2 ]

FROM TABLE1 [, TABLE2 ]

[ WHERE ]

[ ORDER BY ]

Examine the following example:

SELECT EMP_ID FROM EMPLOYEE_TBL

UNION

SELECT EMP_ID FROM EMPLOYEE_PAY_TBL

ORDER BY 1;

The results of the compound query are sorted by the first column of each individual query. Duplicate records can easily be recognized by sorting compound queries.

Note

The column in the ORDER BY clause is referenced by the number 1 instead of the actual column name.

The preceding SQL statement returns the employee ID from the EMPLOYEE_TBL and the EMPLOYEE_PAY_TBL, but does not show duplicates and orders by the employee ID.

The following example shows the use of the ORDER BY clause with a compound query. The column name can be used in the ORDER BY clause if the column sorted by has the same name in all individual queries of the statement.

SELECT PROD_DESC FROM PRODUCTS_TBL

UNION

SELECT PROD_DESC FROM PRODUCTS_TBL

ORDER BY PROD_DESC;

PROD_DESC

-----------------------

ASSORTED COSTUMES

ASSORTED MASKS

CANDY CORN

FALSE PARAFFIN TEETH

KEY CHAIN

LIGHTED LANTERNS

OAK BOOKSHELF

PLASTIC PUMPKIN 18 INCH

PLASTIC SPIDERS

PUMPKIN CANDY

WITCHES COSTUME

11 rows selected.

The following query uses a numeric value in place of the actual column name in the ORDER BY clause :

SELECT PROD_DESC FROM PRODUCTS_TBL

UNION

SELECT PROD_DESC FROM PRODUCTS_TBL

ORDER BY 1;

PROD_DESC

-----------------------

ASSORTED COSTUMES

ASSORTED MASKS

CANDY CORN

FALSE PARAFFIN TEETH

KEY CHAIN

LIGHTED LANTERNS

OAK BOOKSHELF

PLASTIC PUMPKIN 18 INCH

PLASTIC SPIDERS

PUMPKIN CANDY

WITCHES COSTUME

11 rows selected.

Using a `GROUP BY` with a Compound Query

Unlike the ORDER BY, the GROUP BY can be used in each SELECT statement of a compound query, but also can be used following all individual queries. In addition, the HAVING clause (sometimes used with the GROUP BY clause) can be used in each SELECT statement of a compound statement.

The syntax is as follows:

SELECT COLUMN1 [, COLUMN2 ]

FROM TABLE1 [, TABLE2 ]

[ WHERE ]

[ GROUP BY ]

[ HAVING ]

OPERATOR {UNION | EXCEPT | INTERSECT | UNION ALL}

SELECT COLUMN1 [, COLUMN2 ]

FROM TABLE1 [, TABLE2 ]

[ WHERE ]

[ GROUP BY ]

[ HAVING ]

[ ORDER BY ]

In the following example, you select a literal string to represent customer records, employee records, and product records. Each individual query is simply a count of all records in each appropriate table. The GROUP BY clause is used to group the results of the entire report by the numeric value 1, which represents the first column in each individual query.

SELECT 'CUSTOMERS' TYPE, COUNT(*)

FROM CUSTOMER_TBL

UNION

SELECT 'EMPLOYEES' TYPE, COUNT(*)

FROM EMPLOYEE_TBL

UNION

SELECT 'PRODUCTS' TYPE, COUNT(*)

FROM PRODUCTS_TBL

GROUP BY 1;

TYPE        COUNT(*)

----------- --------

CUSTOMERS         15

EMPLOYEES          6

PRODUCTS           9

3 rows selected.

The following query is identical to the previous query, except that the ORDER BY clause is used as well:

SELECT 'CUSTOMERS' TYPE, COUNT(*)

FROM CUSTOMER_TBL

UNION

SELECT 'EMPLOYEES' TYPE, COUNT(*)

FROM EMPLOYEE_TBL

UNION

SELECT 'PRODUCTS' TYPE, COUNT(*)

FROM PRODUCTS_TBL

GROUP BY 1

ORDER BY 2;

TYPE        COUNT(*)

----------- --------

EMPLOYEES          6

PRODUCTS           9

CUSTOMERS         15

3 rows selected.

This is sorted by column 2, which was the count on each table. Hence, the final output is sorted by the count from least to greatest.

Retrieving Accurate Data

Be cautious when using the compound operators. Incorrect or incomplete data may be returned if you were using the INTERSECT operator and you used the wrong SELECT statement as the first individual query. In addition, consider whether duplicate records are wanted when using the UNION and UNION ALL operators. What about EXCEPT? Do you need any of the rows that were not returned by the second query? As you can see, the wrong compound query operator or the wrong order of individual queries in a compound query can easily cause misleading data to be returned.

Note

Incomplete data returned by a query qualifies as incorrect data.

Gaining a Perspective on Microsoft Data Access Components

Access 2002 continues Microsoft's strategy of emphasizing "Universal Data Access" for Windows database programming. Microsoft wants all Office users, not just Access developers, to abandon Data Access Objects (DAO), ODBCDirect, and the Open Database Connectivity (ODBC) Application Programming Interface (API) in favor of a collection of Component Object Model (COM) interfaces called OLE DB and ActiveX Data Objects (ADO). To encourage Access users and developers to adopt OLE DB and ADO, all traditional Microsoft database technologies (referred to by Microsoft as downlevel, a synonym for "obsolete") are destined for maintenance mode. Maintenance mode is a technological purgatory in which Microsoft fixes only the worst bugs and upgrades occur infrequently, if ever. In 1999, OLE DB, ADO, and for Jet programmers, ActiveX Data Object Extensions (ADOX), became Microsoft's mainstream data access technologies.

Microsoft's primary goals for Universal Data Access are to

· Provide the capability to accommodate less common data types unsuited to SQL queries, such as directory services (specifically Active Directory), spreadsheets, email messages, and file systems

· Minimize the size and memory consumption of the dynamic link libraries (DLLs) required to support data access on Internet and intranet clients (PCs and handheld "Internet appliances")

· Reduce development and support costs for the multiplicity of Windows-based data access architectures in common use today

· Extend the influence of COM and COM+ in competition with other object models, primarily Common Object Request Broker Architecture (CORBA) and its derivatives

This chapter introduces the fundamentals of Universal Data Access and Microsoft Data Access Components (MDAC). MDAC makes connecting to databases with OLE DB practical for Access users and developers. MDAC includes ADO and ADOX for conventional data handling, plus ADOMD for multidimensional expressions (MDX) to create and manipulate data cube s.

Note

Microsoft SQL Server Analysis Services (formerly OLAP Services) generates data cubes from online sources, such as transactional databases. Office XP installs Msadomd.dll and other supporting files for MDX and data cubes. Microsoft provides OLE DB for OLAP and the PivotTable Service to enable Excel 2002 PivotTables to manipulate data cubes. MDX and PivotTable services are beyond the scope of this book.

Interfacing with a Wide Range of Data Sources

Despite Microsoft's emphasis on OLE DB and ADP, ODBC and DAO remain the most popular methods of connecting desktop database front ends to data sources. Thirty-two-bit ODBC drivers are available for virtually every client/server RDBMS; most popular Indexed Sequential Access Method (ISAM) databases (Jet, Btrieve dBASE, Visual FoxPro, and Paradox); spreadsheets (Excel); and delimited text files. Microsoft Office traditionally has relied on ODBC drivers for most of its database connectivity features. Although other software publishers attempted to introduce ODBC alternatives, ODBC quickly became the de facto standard of the database industry. ODBC is a C/C++ API; making direct use of the ODBC API in Access applications requires a large number of VBA function prototype declarations (DECLAREs) and heavy-duty, low-level coding. Few, if any, Access developers wrote code to use the ODBC API directly.

In 1994, Microsoft Access 2.0 introduced 16-bit DAO in conjunction with version 2.0 of the Jet database engine. Like ODBC, the Jet database engine is a C/C++ API. DAO provided what was then called an OLE Automation wrapper over the Jet API; up to that time OLE had been used primarily for creating and manipulating compound documents. DAO exposes a complex hierarchy of programmable data-related objects with the DBEngine object at the top. Lower objects in the hierarchy—Workspaces, Databases, TableDefs, QueryDefs, and Recordsets—are object layers under DBEngine. Jet databases permit attaching other databases via ODBC, eliminating the need for low-level ODBC API programming. (You can't use OLE DB to attach databases to Jet front ends.) With the release of Access 95, 32-bit Jet and DAO 3.0 appeared; Access 2000 introduced Jet 4.0 and DAO 3.6, which remain unchanged by Access 2002. Microsoft has sold tens of millions of Access licenses, making Jet the most widely used desktop database in the world and DAO remains the default database object model for Access users and for most Access and Visual Basic programmers.

The DAO/Jet combination is a heavyweight. Dao360.dll, in your \Program Files\Common Files\Microsoft Shared\Dao folder, weighs in at 544KB. Msjet40.dll, in \Windows\System or \Winnt\System32, tips the scales at 1,465KB. To eliminate the 2MB DAO/Jet footprint for Access client/server front ends, Microsoft introduced in Access 97 another object model, ODBCDirect. ODBCDirect allowed client/server front ends to communicate with back-end SQL RDBMSs without incurring the Jet overhead. ODBCDirect is based on the Remote Data Object (RDO), yet another data object model (YADOM). (ODBCDirect and RDO are obsolete). The proliferation of Microsoft data access object models caused choice crises among developers .

Some ADO 2.x support file names have a 1.5 version number, as in Msado15.dll; the strange versioning of these files is required for backward compatibility with applications that used very early versions of ADO.

Note

MDAC 2.x also supports Remote Data Services (RDS, formerly Advanced Database Connector, or ADC). RDS handles lightweight ADOR.Recordsets for browser-based applications; RDS, which commonly is used for three-tier, Web-based applications, is required to make Data Access Pages (DAP) accessible safely over the Internet.

Creating `ADODB.Recordset` Objects

The concept of database object independence is new to Access. The best way of demonstrating this feature is to compare DAO and ADO code to create a Recordset object from an SQL statement. DAO syntax uses successive instantiation of each object in the DAO hierarchy: DBEngine, Workspace, Database, and Recordset, as in the following example:

Dim wsName As DAO.Workspace

Dim dbName As DAO.Database

Dim rstName As DAO.Recordset

Set wsName = DBEngine.Workspaces(0)

Set dbName = wsName.OpenDatabase ("DatabaseName.mdb")

Set rstName = dbName.OpenRecordset ("SQL Statement")

As you descend through the hierarchy, you open new child objects with methods of the parent object.

The most common approach with ADO is to create one or more independent, reusable instances of each object in the Declarations section of a form or module:

Private cnnName As New ADODB.Connection

Private cmmName As New ADODB.Command

Private rstName As New ADODB.Recordset

Note

This book uses cnn as the object type prefix for Connection, cmm for Command, and rst for Recordset. The cmm prefix is used because the cmd prefix traditionally identifies a command button control and the com prefix identifies the MSComm ActiveX control (Microsoft Comm Control 6.0).

Although you're likely to find references to DAO.Recordset dynasets and snapshots in the Access documentation, these terms don't apply to ADODB.Recordset objects. See the CursorType property of the ADODB.Recordset object in the "Recordset Properties" section later in this chapter for the CursorType equivalents of dynasets and snapshots.

After the initial declarations, you set the properties of the new object instances and apply methods—Open for Connections and Recordsets, or Execute for Commands—to activate the object. Invoking the Open method of the ADODB.Recordset object, rather than the OpenRecordset method of the DAO.Database object, makes ADO objects independent of one another. Object independence and batch-optimistic locking, for instance, let you close the ADODB.Recordset's ADODB.Connection object, make changes to the Recordset, and then re-open the Connection to send only the changes to the underlying tables. The examples that follow illustrate the independence of top-level ADO members .

Designing a Form Bound to an `ADODB.Recordset` Object

Access 2000+ forms have a property, Recordset, which lets you assign an ADODB.Recordset object as the RecordSource for one or more forms. The Recordset property of a form is an important addition, because you can assign the same Recordset to multiple forms. All forms connected to the Recordset synchronize to the same current record. Access developers have been requesting this feature since version 1.0. Access 2002 finally delivers updatable ADODB.Recordsets for Jet, SQL Server, and Oracle data sources that you can assign to the Recordset property value of forms and reports.

To create a simple form that uses VBA code to bind a form to a Jet ADODB.Recordset object, follow these steps:

1. Open a new database in Access 2000 format named ADOTest.mdb or name it whatever you want. Add a new form in design mode and save it as frmADO_Jet.

2. Click the Code button on the toolbar to open the VBA editor, and choose Tools, References to open the References dialog.

3. Clear the check box for the reference to the Microsoft ActiveX Data Objects 2.1 Library, scroll to the Microsoft ActiveX Data Objects 2.6 Library, and mark the check box. Close and reopen the References dialog to verify that the new reference has percolated to the select region of the list (see Figure 4). Close the References dialog.

4. Add the following code to the Declarations section of the frmADO_Jet Class Module:

5.            Private strSQL As String

6.            Private cnnNwind As New ADODB.Connection

7.             Private rstNwind As New ADODB.Recordset

8. Add the following code to create the Form_Load event handler:

9.            Private Sub Form_Load()

10.           'Specify the OLE DB provider and open the connection

11.            With cnnNwind

12.              .Provider = "Microsoft.Jet.OLEDB.4.0"

13.              .Open CurrentProject.Path & "\Northwind.mdb", "Admin"

14.          End With

15.

16.           strSQL = "SELECT * FROM Customers"

17.            With rstNwind

18.              Set .ActiveConnection = cnnNwind

19.              .CursorType = adOpenKeyset

20.             .CursorLocation = adUseClient

21.              .LockType = adLockOptimistic

22.              .Open strSQL

23.          End With

24.           'Assign rstNwind as the Recordset for the form

25.           Set Me.Recordset = rstNwind

26.        End Sub

Note

The preceding code includes several properties that this chapter hasn't discussed. The objective of this and the following sections is to get you started with a quick demonstration of the Form.Recordset property. Properties and methods of the Connection and Recordset objects are the subject of the "Exploring Top-Level ADO Properties, Methods, and Events" section that follows shortly.

6. Return to Access and change to Form view to execute the preceding code. Then open the Properties window and click the Data tab. Your form appears as shown in Figure 5, with the first of 91 records selected by the navigation buttons .

Note

The form's Record Source property value is the SQL statement specified as the argument of the Recordset object's Open method. The Recordset Type property value appears as Dynaset, which isn't a valid ADODB.Recordset type. The enabled Add New Record navigation button demonstrates that the form is updatable. Updatable ADODB.Recordset objects for Jet databases are a new feature of Access 2002.

Working with the `ADODB.Connection` Object

The Connection object is the primary top-level ADO component. You must successfully open a Connection object to a data source before you can use associated Command or Recordset objects.

Provider-Specific Properties and Their Values

When you're tracking down problems with Connection, Command, Recordset, or Record objects, you might need to provide the values of some provider-specific properties to a Microsoft or another database vendor's technical service representative. To display the names and values of provider-specific ADODB.Property objects for an ADODB.Connection to a Jet database in the Immediate window, do the following:

1. In the declarations section of the VBA code for the frmADO_Jet or frmADO_MSDE form, add the following object variable declaration:

2.            PrivateprpProp AsADODB.Property

Property objects exist in the Properties collection, so you don't add the New keyword in this case.

3. After the End With statement for cnnNwind, add the following instructions to print the property names and values:

4.            Debug.Print cnnNwind.Properties.Count & _

5.               " { SQL Server|Jet}  Connection Properties"

6.            For Each prpProp In cnnNwind.Properties

7.                Debug.Print prpProp.Name & " = " & prpProp.Value

8.            Next prpProp

9. Press Ctrl+G to open the Immediate window and delete its contents.

10. Reopen the form in Datasheet view to execute the Form_Load event handler, and return to the VBA editor to view the result in the Immediate window.

11. To find a definition of a provider-specific property of Jet or SQL Server data sources, connect to the Microsoft Web site, select Search, Search microsoft.com from the opening page, copy or type the name of the property in the Search For text box, select Exact Phrase from the Using list, and click Search. Click the appropriate link (usually the first) to display the definition of the property.

12. After you've satisfied your curiosity about provider-specific properties and their values, comment out or delete the added code. Sending a significant amount of data to the Immediate window delays opening the form.

Note

Pages that define SQL Server-specific properties specify values by reference to DBPROPVAL_... constants whose values aren't included in the table. Many searches for Jet-specific Property object definitions lead to the "Appendix C: Microsoft Jet 4.0 OLE DB Provider-Defined Property Values" page (http://msdn.microsoft.com/library/techart/daotoadoupdate_topic15.htm), which provides a set of constant values that you can add to an Access module.

Tip

The "Appendix A: DAO to ADO Quick Reference" page (http://msdn.microsoft.com/library/techart/daotoadoupdate_topic13.htm) of the "Migrating from DAO to ADO" white paper contains a table that translates DAO objects and properties to ADO objects, properties, and provider-specific Jet properties. To create an easily searchable version, copy the table to a Word document and save it in both .doc and .htm formats. Importing the .htm table to a Jet or SQL Server table lets you view the contents in a searchable datasheet. Contents of the Microsoft Web site are copyrighted, so the table isn't included in this lesson's sample databases.

SQL Definition and History

What Is SQL?

What Is ANSI SQL?

The New Standard: SQL3

What Is a Database?

An Introduction to the Relational Database

An Introduction to Client/Server Technology

Some Popular Relational Database Vendors

Differences Between Implementations

SQL Sessions

CONNECT

DISCONNECT

Types of SQL Commands

Defining Database Structures (DDL)

Manipulating Data (DML)

Selecting Data (DQL)

Data Control Language (DCL)

Data Administration Commands

Transactional Control Commands

A Closer Look at What Composes a Table

A Field

A Record, or Row, of Data

A Column

The Primary Key

A NULL Value

Basic Data Types

Fixed-Length Characters

Variable Characters

Numeric Values

Decimal Values

Integers

Floating-Point Decimals

Dates and Time

Literal Strings

NULL Data Types

BOOLEAN Values

User-Defined Types

Domains

What Are Database Objects?

What Is a Schema?

A Table: The Primary Storage for Data

Fields and Columns

Rows

The CREATE TABLE Statement

STORAGE Clause

Naming Conventions

The ALTER TABLE Command

Modifying Elements of a Table

Adding Mandatory Columns to a Table

Modifying Columns

Creating a Table from an Existing Table

Dropping Tables

Integrity Constraints

Primary Key Constraints

Unique Constraints

Foreign Key Constraints

NOT NULL Constraints

Using Check (CHK) Constraints

Dropping Constraints

Overview of Data Manipulation

Populating Tables with New Data

Inserting Data into a Table

Inserting Data into Limited Columns of a Table

Inserting Data from Another Table

Inserting NULL Values

Updating Existing Data

Updating the Value of a Single Column

Updating Multiple Columns in One or More Records

Deleting Data from Tables

What Is a Transaction?

To start or execute transactions is implementation-specific. You must check your particular implementation for how to begin transactions. There is no explicit start or begin transaction in the ANSI standard. What Is Transactional Control?

The COMMIT Command

The ROLLBACK Command

The SAVEPOINT Command

The ROLLBACK TO SAVEPOINT Command

The RELEASE SAVEPOINT Command

The SET TRANSACTION Command

Transactional Control and Database Performance

What Is a Query?

Introduction to the SELECT Statement

`CONNECT`

`DISCONNECT`

`NULL` Data Types

`BOOLEAN` Values

The `CREATE TABLE` Statement

`STORAGE` Clause

The `ALTER TABLE` Command

`NOT NULL` Constraints

Using Check (`CHK`) Constraints

Inserting `NULL` Values

To start or execute transactions is implementation-specific. You must check your particular implementation for how to begin transactions. There is no explicit start or begin transaction in the ANSI standard . What Is Transactional Control?

The `COMMIT` Command

The `ROLLBACK` Command

The `SAVEPOINT` Command

The `ROLLBACK TO SAVEPOINT` Command

The `RELEASE SAVEPOINT` Command

The `SET TRANSACTION` Command

Introduction to the `SELECT` Statement

The `SELECT` Statement

The `FROM` Clause

`IS NULL`

`BETWEEN`

`IN`

`LIKE`

`EXISTS`

`UNIQUE`

`ALL` and `ANY` OPERATORS

`AND`

`OR`

Negating Conditions with the `NOT` Operator

`NOT BETWEEN`

`NOT IN`

`NOT LIKE`

`IS NOT NULL`

`NOT EXISTS`

`NOT UNIQUE`

The `COUNT` Function

The `SUM` Function

The `AVG` Function

The `MAX` Function

The `MIN` Function

The `GROUP BY` Clause

`GROUP BY` Versus `ORDER BY`

The `HAVING` Clause

`TRANSLATE`

`TRANSLATE`

`REPLACE`

`UPPER`

`LOWER`

`SUBSTR`

`INSTR`

`LTRIM`