BOMBOLOM.COM

(databases) Theory: 2 - Relational Databases

Posted by José Lopes.

This post goal is to explain in general what is a relational database and some associated terminology. Its useful to have an idea about this type of database and its management system to understand better the process of database design (to be treated in another post)

The relational database is currently the most used in the database world.

This is due, among other thing, to:

This type of database use the relational model. This is a mathematical model where the term relation represent a table and not a relationship. This is a very commom misunderstanding.

The relation is to be regarded as a collection of ordered n-thing (normally referred as n-tuples).
An example of such collections can be:

If we write these n-thing on top of each others we will get something resembling a table:

Europe Portugal Euro
South America Brazil Real

This is the conceptual image of databases following the relational model. We can even say that the information in the relational model is laid out using only tables and all the operation we may carry out result in other tables.

Lets define some terminology before moving forward.

Drawing it to a diagram:

The use of tables allows a high uniformity of the model because, like we can see in the diagram, the values on the lines represent an unique association, in other words, they represent the entity individual values or states and nothing other than the entity (we can see it as an instance of the entity that records specific values), while the values on the columns are limited to a specific domain (like the Regions).

The 'conceptual image' term comes with the theory implicit on the relational model that, from its theorical world, states that there should not exist special columns neither pointer mechanisms to associate information. This could not be more distant to the reality.

Though this model mentor, Edgar F. Codd, had wroten what become known as the Codd's 12 Rules to define the necessary requirements that a product ought to match to be considered as a relational database product, the truth is that none of the existing product meet them in full.

This could be due to some subjectivity on the text rules, but for sure is due to the products using optimization mechanisms (the optimizers mentioned before) that aim the performance. This goes to the point that indexation tables are created within the management system and special coulmns are indead used on the real tables, making use of primary keys and foreign keys, that do not exist in the conceptual image.

We then have to become aware of the primary keys and the foreign keys. Any database management system uses extensively these two concepts.

A primary key is nothing more than the values of a given column that assure that is always possible to distinguish the rows, i.e., is mandatory that these values are unique. The primary key is what identifies a record in the table assuring its unicity.

This is not a question of record repetition on the table. The main function of the primary key is to assure the database consistency, so that the management system doesn't froze because its not able to distinguish between two records, that is the reason of the unicity.

Usually, just to keep a visual logic, the first column of a table stores the primary keys. Of course that this is not mandatory and we can set any other column for that, this is more to keep a logical reading.

The management systems (DBMS) use internally the indexation, for internal optimization and consolidation mechanisms, adding an extra column at the begging of each table. This is not a primary key, is internal to the systen and the user doesn't see it neither controls it.

Allow me to make a small break and take as example an Oracle system. With Oracle we can use this indexation (through a PL/SQL statment) to retrieve data faster since with it we retrieve the record directly without any association method. Oracle uses a number with a defined size for the indexation.

I really don't advice the use of this technique and I invoke two reasons:

Ending the break, its very important to properly choose the primary key to have a good database consistency. I will mention again the primary keys and how to choose them in the post about the database design process.

The foreign keys are values in a column that represent the relationship between that table and another one. With the following diagram the concept is easy:

In the column type of the wines table we have foreign keys, i.e., the values on the column come from the table types. This is set when building the database, limiting the possible values to the type column of the table wines to the ones existing in the table types.

We can observe that there is no uniqueness constraint for this kind of keys, we can have the same foreign key several times in the wine table. Notice, on the other hand, that the column on the types table is made of primary keys.

In conclusion, the keys and tables concept will be important when normalizing the databse. This point will also be treated in the post about the database design process.

2008.05.31 | There's more... | Comments 0 | Tags , ,

Deixe a sua mensagem:

Nome:


E-mail:


URL:


Comment:

Secret number

To send you comment you must insert the "secret number" on the box


Made with PyBlosxom | Add to Google