I couldn't let it go, I had to correct my spelling error. :) Great job on a concise, "just the facts, man" sort of article...
Classic HTML-based web sites can become a headache as they grow in complexity. Changing hundreds of pages in a multi-lingual company presentation site can really deter site keepers from frequent content updates. As a result, the site eventually dies out by becoming obsolete. Dynamic websites come with a different approach to web development: content and presentation are separated. Content is organized and stored in a database, a repository that is then accessed in order to extract the relevant information where needed, or to add new information. In this article,
I will showcase some of the advantages of going dynamic.
I will explain the main concepts related to using and designing databases.
I will guide you through designing your databases.
I will suggest some best practices in database naming.
A database usually refers to data organized and stored on a computer that can be searched and retrieved by a computer program. As Ben Forta put it in one of his excellent articles, a database is a "structured collection of similar data". Basically, any entity in your application can be modeled as a database table, and any similar information can be stored as separate records in such tables. Imagine an online product catalog. Products are grouped by categories. Each product has a manufacturer. Since products are similar, they can be all stored in the same table. The same goes for categories, and for manufactures.
Dynamic has nothing to do with animation or real-time applications. Dynamic simply means using a database. Unlike static web sites, dynamic ones change automatically when a piece of data is added, updated or removed from the database. Reusing and syndicating content to other sites can also become a lot easier, since everything is stored in a central repository.
Every dynamic website is driven by a database, which defines the structure and organization of the data you will be manipulating. The database stores all site information which can be retrieved by the application server via a query. The relevant information is then returned as a recordset, which represents the query results.
A table is a data grid used to store similar information. It is made up of columns (also known as fields), which represent entity attributes or pieces of data, and rows, which represent individual records. All records in a table share the same fields. In the previous example, a TV set and a T-shirt are records in the table that stores products. The price and description of each product represent columns in this table. Usually, databases have more tables, each corresponding to one of the objects you are trying to represent.

Illustration
1: Listing all your data in one table
In a database, tables are usually related in a
logical way. A link between two tables is called a relation. A
relation always involves two tables and one column from each (of
them). Linking tables is very important for avoiding data redundancy
and for better organizing your database. Suppose the TV set product
mentioned earlier was manufactured by a company called ACME, and you
also store a CD player produced by Acme. If there were not separate
tables for products and manufacturers, you would need to store the
same manufacturer twice in your database, once for each product. This
not only takes up a lot of disk space, but maintaining several copies
of the same data can make updates very difficult. What happens if
ACME decides to relocate and you have to change its address?
From the following diagram, its clear that the manufacturer
name and address are not duplicated. However, in order to indicate
that there is a relationship between the two tables, you need two
fields: a primary key in the manufacturers table and a foreign key in
the products table. These concepts will be defined later on.

Illustration
2: Breaking up information to eliminate redundancy
A special table containing additional attributes for features stored in an associated feature attribute table. It usually contains two fields: one storing a numeric identifier (the ID) and one storing the label (symbol). An example of a look-up table could be a categories table for products, storing the category ID and the name. To retrieve the complete information regarding the product, a JOIN operation must be performed between the two tables.
In practice, relations are one of three types:
one-to-many – Each manufacturer produces a variety of products, but a product belongs to a single manufacturer.
many-to-many – Each student attends several different classes and each class is attended by several students.
one-to-one – Each shopper has a delivery address.
However, not all these relations are translated into the actual database design. There is no point in having a one-to-one relation between two tables, since it would mean that the tables actually store different attributes of the same object. All similar data should be stored in the same entity, that is, a table. Also, in most situations, a many-to-many relationship can be broken into two or more one-to-many relationships, through the process of "normalization". I will discuss this in detail in the "Normalize the information" section.
A key is a special column in a table. While some keys are used for searching and indexing, other are used for linking related tables or uniquely identifying a record. While a database table can survive without a primary key or an index, but this deters the whole purpose of a database: the easy retrieval of information. Below, I define the types of keys used in databases: primary key, foreign key, unique key and index.
The purpose of a primary key is to uniquely identify each record in a table. In general, a primary key is defined on a single column, but it is not uncommon to have it defined on two columns. There are three basic rules that you should observe when defining primary keys:
Every record must have a value in the primary key. It cannot be null.
Primary key values must be unique.
Primary key values should not be reused. If a record is deleted, its primary key should not be reassigned to another record, as this can create errors and confusion.
A foreign key is a field from a table that refers to (or targets) a specific key, usually the primary key, in another table. This is a convenient way of logically linking information from related tables. For instance, the products table contains a foreign key (idman_prd) that references the primary key field (id_man) in the manufacturers table. This way, each product has an associated manufacturer – its foreign key points to the unique identifier of the manufacturer. Please note that the foreign key is not unique, but the referenced field (the primary key in the referenced table) usually is. For instance, in the previous example, the CD player and the TV set are both manufactured by ACME company, therefore both records have the same foreign key value.

Illustration
3: Defining relations with foreign keys
A unique key is defined as having no two of its values the same. The columns of a unique key cannot contain null values. A table can have multiple unique keys or no unique keys at all. For instance, you might want to define a unique key on the manufacturer name field (name_man) from the previous example. This way, you can guarantee that users of your application will not create two separate records for the ACME company.
An index is an alphabetical listing of the terms found on database records and the number of records on which the terms are found. The index is maintained by the database application and is used only by the database application. You never actually see the index in your database, and in fact, most modern database applications hide the physical storage location of the index altogether.
A query is a question that is put to the database server. Database queries can be one of two types:
Select queries, that retrieve data from tables. For instance, getting a list of all products manufactured by ACME Company, that cost more than $100.
Action queries, that manipulate data in some way (e.g. insert records, update or deleting existing records).For instance, adding a new product to your catalog or changing the address of the ACME Company.
The language used to query a database is called query language. The most well-known is the Structured Query Language (SQL).
A recordset is basically "a set of records". Recordsets are the result of a query, and are used as a source of data for your dynamic web applications. For instance, if you query the sample database for all products manufactured by ACME, you will get a recordset containing two records: the CD player and the TV set.
The database design process involves the actual creation of the database, as well as the entire planning you have to do before. In the design phase, you should ask yourself three questions:
What information must be stored?
In which tables?
In what form?
The goal of the design process is to achieve minimum redundancy (storing the same data over and over again), a minimum database size and protection against errors that might occur.
Eliminating redundancy and protecting against user errors are done through normalization. This means applying a set of principles in order to organize existing data into logical entities. These will be represented as tables connected through relations.
To achieve a robust table organization, you must pass data through the three steps of normalization. Starting from the unnormalized information, each applied rule will shape the organization of your database. Before throwing yourself on the computer to try and create something, take some time to gather and analyze all data you plan to work with in your web application. This is what normalization is all about.
Once information is normalized, you can move on to building the actual database in the database software you are using.
First, identify all pieces of information you will need for your web application. For instance, for a simple product catalog, you will need the following information:
product category
manufacturer
address of the manufacturer
product name
description of the product
price
Put all information in one simple grid or list, even if some of it will be repeated. For the product catalog example, the grid may look like this:

Illustration
4: Listing all information in one table
Identify
groups of repeating data and isolate them. For instance, in the
previous example, the ACME company name and its address are
repeated. Also, categories are repeated for similar products. This
means splitting the data grid into two tables, one containing the
repeating group, and the other the rest of information.
Once you
filtered data according to this criteria, you should obtain three
tables:

Illustration
5: Manufacturers list
one
for the manufacturers

Illustration
6: Categories list
one
for the categories:
and one for the products.

Illustration
7: Product list
Decide
what information must be unique within each group. In the previous
example, the manufacturer name should be unique, to avoid having the
same manufacturer twice in the database. Categories should also be
different. Finally, product names should be unique. Simply remove
duplicate records from the tables you obtained so far. This means
removing the redundant lines with "ACME" and with
"Cameras" in our example.
Note:
At this point you must avoid the vertical partitioning of
tables. If the same or similar entities exist, they should probably
be in the same table.
Define
a primary key for each group that contains non-repeating entities.
Since categories, manufacturers and products are all stored in a
separate table and must all be unique, they should each have a
unique numeric identifier associated.
The tables will now
contain an additional field, storing the unique identifier:

Illustration
8: Categories table
for
the categories table:

Illustration
9: Manufacturers table
for
the manufacturer table:
for
the products table:

Illustration
10: Products table
Represent relations among entities using foreign keys.. Establish how objects are logically linked. Make sure each relation is represented by two keys: the primary key in the detail table and the foreign key in the master table. Here is how these relations are modeled for the previous example:

Illustration
11: Final database diagram
When you have information scattered across different tables, and you need to retrieve it from more than one, you will have to perform a JOIN operation. The JOIN is an SQL operation that puts together data from related tables. There are several types of JOINS, based on the information they retrieve. What they all have in common is the fact that a condition must be specified (e.g. one column in the first table must relate to a column in the second table. This is the foreign key.
Inner JOIN
The INNER JOIN returns all rows from both tables where there is a match. If there are rows in Products that do not have matches in Categories, those rows will not be listed.
When trying to retrieve all data about products that exist in a category, the following query will display only products that belong to a defined category:
SELECT
* FROM product_prd
INNER JOIN category_ctg
ON
product_prd.idctg_prd = category_ctg.id_ctg
To achieve the same JOIN results, you can also use the following syntax:
SELECT
* FROM product_prd, category_ctg
WHERE product_prd.idctg_prd =
category_ctg.id_ctg
Instead of using the reserved keyword, the syntax uses a simple WHERE statement that verifies the equality between the two table keys.
Left Join
The LEFT JOIN returns all the rows from the first table, even if there are no matches in the second table. If there are rows in the first table that do not have matches in the second, those rows also will be listed.
As an example, assume that the table product_prd is the first table, and the category_ctg table is the second. The query retrieving all rows using a LEFT JOIN operation is:
SELECT
* FROM product_prd
LEFT JOIN category_ctg
ON
produt_prd.idctg_prd = category_ctg.id_ctg
Through this query, all rows in the products table will be displayed, even if some are without an associated category.
Right Join
The RIGHT JOIN returns all the rows from the second table involved in the query, even if there are no matches in the first table. If there had been any rows in the second table that did not have matches in the first one, those rows also would have been listed.
Using the same table, the query that retrieves all the rows from the categories table (even if a category does not contain any products) is:
SELECT
* FROM product_prd
RIGHT JOIN category_ctg
ON
product_prd.idctg_prd = category_ctg.id_ctg
This query will display all rows in the category table, even the categories that have no products matching them.
Outer Join
An OUTER JOIN retrieves data from both tables, when there are no matching values in the tables. This way the end result will be a summation of all rows in both tables.
Normalization is an important step in database design, as it minimizes the redundancy in information, thus reducing the disk space occupied by the database itself.
It also reduces the risk of an error affecting the information. An anomaly is the undesired effect of a correct query on a badly designed database. There are three types of anomalies, corresponding to three types of action queries:
Update anomaly – if the same data is stored in two different places, updating only one record would leave the other unchanged, thus creating confusion among users.
Delete anomaly – if only one record holds details regarding general information or different entities, once that record is removed, relevant information is lost.
Insert anomaly – if inserting a fact about some attribute requires inserting additional information about another entity or attribute.
As you can see, these anomalies are eliminated through normalization. However, this has its downside: if the database is excessively normalized, queries that require complex data to be displayed will take more time to complete, as data has to be retrieved from more and more tables.
Once your data is normalized (i.e. it's in a normal form), you can start creating the actual database, with all of its tables. To do so, you must first fire up your database management software. Create a new empty database. Please observe the naming conventions presented in the "Database Naming Convention" chapter.
The database components that store data are the tables. A table is defined by its name, and by the columns it contains. Tables are separate entities in a database, and different relationships can exist between them, as discussed previously.
When you create a table in the database, you must also define each of the columns it contains. Although you can also add columns later, it is recommended that all columns identified during the normalization process be created at the same time with the table.
For each column in a table, the following elements must be specified:
The
column name – this identifies a column, and it must be unique
for the table it belongs to, as you will use it in queries to
retrieve specific attributes.
Tip: Columns with the
same name can exist in the same database, but in different tables.
The column's data type – this establishes what type of data a certain column can contain. The types allowed for various purposes vary between different impl
I couldn't let it go, I had to correct my spelling error. :) Great job on a concise, "just the facts, man" sort of article...
Great job on a consice, "just the facts, mam" sort of article.