Thursday 21 June 2012

Second Normal Form

After talkling about the first normal form earlier, its time for second normal form.

But, before we start, first go through this and this article serially if you haven't.

Now, to understand it, take a look on this statement
Second Normal Form = First Normal Form + [ ADDITIONAL PROPERTY ]
From this, we can conclude that for a table to have second normal form, it must have first normal form plus some additional property.

We already have discussed first normal form. So, In this post, I will discuss this additional property for a table (already having first normal form) to qualify for second normal form.
And this addtional characteristic is that every column in the table depends on the whole candidate key (column suitable to be primary key), NOT on just a subset column of candidate key in case of composite candidate key.

Let's continue the example of table EMP containg employee recordswe took to understand first normal form. Now we have the table having first normal form as shown. Here the primary key is "ID".

Does this have second normal form? YES. Why? because, here every column other than primary key column (ID) is funtionally dependent on the primary key. So, we can say a first normal table with non-composite candiidate key have second normal form already.

Now, we see the case of a composite primary key in a first normal table. We take other example here as shown.


This table have first normal form and it has composite primary key. i.e. "Bike_brand - colors".
But it doesn't have second normal form. because the values of column "Price" are dependent only on "Bike_brand" column, not on whole primary key.
So, if we want to make it into second normal form we will separate this column to a new table (with the same column as primary key on which these value s are dependent) as shown below.

Look at the structure of these new tables.
The main table has second normal form now as all non-candiadate columns (here "Discount_percentage") are dependent on whole primary key ( "Bike_brand - colors"). While the new table contains the columns "Bike_brand" (primary key in new table) and "Price" wih primary-foreign key relationship with main table.

Now, The way I remember it is that to make a first normal form (1-N) table to second normal form (2-N), the column with redundant values are taken in new table. And there is many-to-one relationship between main table and new table. We have done same in second example.

I hope this article was some help on the topic. The next post will be discussing the third normal form. bye !

Monday 18 June 2012

First Normal Form

Hey ! As said previously, now, we will discuss types of database normalization. And we're gonna begin with First Normal form.
[Note: I suggest you to visit my earlier post discussing the concept of normalization in general, if haven't done already.]
The two criteria to be satisfied for a relational table to be having First Normal form (without going to exact academic definition) are:
  1. The table should have a primary key (A column with unique and non-null values. Since it contains unique values, primary key values can identify each records of table.)
  2. There should not be two or more columns containing same type of values.
To understand better, let's take an example of table EMP_DETAIL storing details of employees in an office as shown.

As you can see an employee can have more than one phone number,so there are three columns in the table storing phone numbers.
Let's see whether this table qualifies for first normal form or not.
It satisfies the first criterion, it has a column "ID" congaing unique and non-null values. So, does this table have first normal form? NO, the second criterion isn't satisfied here, as the table has three columns "Phone1", "Phone2", and "Phone3" which contain same type of values phone numbers.
Now, if we want to normalize this table to first normal form, we have to make sure it fulfills the second criteion as well. To do so, we can separate columns containing
same types of values (i.e. phone number columns in this example) and create a separate table for them. After making this structural change, we'll be having two tables as shown here.

Observe both the tables, Here the new table contains two columns, "ID" and "Phone" . And ID is foreign key in new table to make a primary - foreign key relation with main table. Now, we have the two tables with fiest normal form. In main table, the primary key is ID, while in new second table, the primary key is composite i.e. both "ID" and "Phone" together make primary key in second table.

Now, here is a important tip I use to keep remembering the first normal form: To make first normal form from a denormalized table, we create one-to-many relationship between main table and new second table. Look the example again, we have done same thing.

Hope, this post was of some benefit to you. We will discuss the second normal form in the next post.

Bye !!!

Friday 15 June 2012

Database Normalization

Hi guys !!! Database Normalization is must to be known and understood by any IT professional who is working with relational databases.
For most of the guys getting started to learn the basic concepts, this topic proves a little hard to grasp. In my case, it haunted me. Even after understanding it many times, I just kept forgetting it.

So, keeping that in mind, we will try to discuss the Normalization in way easy to understand and remember as well. In this post, we will first get  the basic idea, what is normalization , its relevance and how it is achieved. It's types will be discussed in other posts.

Now, In the nutshell, normalizaion is about two things.
  1. Removal of data redundancy (repetition of same data) over same fields in the records in database table.
  2. Sensible data dependencies. To make it understand better, its like having a table in which all the columns are related.
Based on different level of the two factors mentioned above, we have different types of Database Normalization.
We First , Second , Third, fourth and fifth normal form. The first three normal forms are most practically used forms of normalization.

Here are some points highlighting its significance of Normalization.
  • Removal of redundancy --> uniqueness of records in table, Saving of storage space and Reduction in time to write or insert data into database
  • Sensible data dependencies --> Elimination of unnecessary relationships between columns within/among the database tables. And Retrieving data becomes simple and logical in terms of the query writing.
  • Database tables become easy to maintain.
After getting the basic idea and significance of Normalization, Let's know how it is achieved. Basically, it's done by separating data by creating new table(s) having primary - foreign key relationships with main table. The intent here is to achieve some level of the two factors I mentioned in the beginning.

I'll be back with more posts discussing First, Second and Third normal form.

See you then, Bye !!!

Thursday 14 June 2012

Durability in Database

I have talked about about the three properties (Atomicity, Consistency and Isolation) in my previous posts. In this post, we'll cover the last but very important property i.e. Durability. Well, It's important because it saves the world from situation like THIS. Now let's see why and how.

 As said in earlier post Durability ensures that any transaction committed on the database is not lost in case of events like error, power crash, etc. Probably, the concepts is relatively easy to understand compared to it's importance.

Think of a hypothetical situation where a bank's database table stores the numerous transactions related to the accounts of its users. What happens now is a power crash, system failure, or anything like these one would prefer not to happen. And result: Data loss i.e. loss (partial/full) of the records storing the transaction details of all the accounts of bank. So it should result into one of two situations for user (first and second) depending on the his/her credit/debit situation.

Now, think: such database issue, happening in other organizations specially those related to defense, tax, etc. So, it can concluded that this issue has be avoided. In other words the history of data operations/transaction needs to be saved from loss due to crashes, failures, etc. The database should be durable.

To maintain the durability of database, data restoration is done by using database backups and transaction logs. e.g. In case of Oracle, there are redo logs.

Tuesday 12 June 2012

Isolation in Database

This post is to discuss the Isolation in database. It took me a little difficulty to understand this concept as I couldn't find any suitable example over internet. It was explained as a theoretical concept.

So, We'll try to discuss it using an example. But before that let's have a look on a statement on Isolation in my earlier post, "Any two or more transactions on a database are isolated from each other. i.e. they don’t interfere with each other."

Now let's take an example of table named EMP in a database which supports isolation. And we have two users: user X and user Y who frequently access/use this table.


The table contains data as shown in picture above.
In a situation, where user X logins into database and change the Age of Michelle from 19 to 23 as shown in picture below. And this change in the EMP table is uncommitted yet.

Now, consider three scenarios.
scenario(I): While the session of user X is still running with uncommitted data change, user Y logins into same database as a separate user.
scenario(II): User X commits the change made by him on table EMP. After that User Y logins into database.
scenario(III): While the session of user X is still running with uncommitted data change, user Y logins into same database as a separate user. After that user X commits the change.

Here is a task for you to guess how the data of table EMP would appear to user Y for each of three scenarios.

Here are the answers.
scenario(I): The data of table EMP would appear unchanged as in first picture to user Y.
scenario(II) and (III): The data of table EMP would appear changed as in the second picture.

Now, hopefully you would have figured out the reason why. Still, here is the explanation.

Only committed data is visible across the sessions. Since the data change was uncommitted for session(I), the last committed data of table EMP (as in first picture) would appear to user Y.

Now after going through this example, we conclude that any database changes/operations/transaction are isolated from each other across different sessions untill and unless those are committed. This is the property of isolation in the database.

Hope you liked my post. Here is more on this topic on Wikipedia

Consistency of Database

It's time discuss another ACID property : Consistency

As mentioned in older post, Consistency ensures that the database always stays in a valid state before and after each transaction with respect to the rules and constraints of the database.

To understand it more clearly, we will take an example of database of company which stores the data related to his employees in EMP table.

Now, clearly this company is strict about the age of its employees. It wants its employees to be of age >= 17 and less than 50.No one can work in the company whose age is out of this range.  So, there is a constraint applied over the table as highlighted yellow in the picture.

Now if the database is consistent, it must satisfy this constraint.And Any transaction or data operation which violated this state should de failed or undone(rolled back).

e.g. If we try to insert a record into the EMP table where age is out of this range like 15 or 55, it must fail or roll-back the running transaction/data operation.In other words, the database is consistent with the rules/constraints of database.

Here is a good article explaining this topic and other ACID rules of Database.

Database Transaction

A Database Transaction can be understood as single unit of a logical operation in a database. It consists of one or more data operations performed together.

To understand the concept, let's take an example of a database of a bank. Here we have two accounts one for Mr. A and other for Mr. B represented by Table A and Table B in the database.

To transfer some amount of money from Mr. A's account to Mr. B's account as a payment, there are two logical steps:
  1. Deduct money from Mr A's account. i.e. Make an entry of money withdrawal in Table A.
  2. Deposit same amount of money to Mr. B's account. i.e. Make an entry of money depositin Table B.
There is no sense in doing either one of these steps alone. So, any of these steps alone doesn't qualify as a transaction.
It makes sense only when both of these steps are performed together. Hence constitute as a logical operation.  i.e. Transaction.

Here is a more detailed article on the topic on Wikipedia. You might be interested in this article as well.

Monday 11 June 2012

Atomicity in Database

It's time to discuss the Atomicity of a database in little more detail. But before we begin, here is a disclaimer.

Disclaimer: this post has no relation whatsoever to this , this and this

Now, I mentioned in my previous post that the transaction on an atomic database follows "all or none" rule. So, here it means that a database transaction is either successful or failed.

There is no such thing as partial successful/failed transaction. If any part of transaction fails, the whole transaction fails. (See this to learn about a database transaction)

e.g. A there is database transaction consisting of three steps occurring serially (to move record from one table to other table).
  1. Copy a row from one table to a other table
  2. Delete the same from first table
  3. Commit the changes on database.
So, to be an Atomic database it should be programmed in such a way that if any of these step fail during the transaction, the whole transaction must fail and all the previous successful steps of the same failed transaction must be undone.

In above example, if step 2 fails, the step 3 should not occur and step 1 should be undone (rolled-back). This transaction will be successful only if all the steps are successful.

In the nutshell, in an atomic database, either all statements or steps in a database transaction are successful or none.

Here is the Wikipedia link on the same topic.

The ACIDic nature of a Database

Now, this concept is not specific to relational database, but it must be known as it's a basic principle applied practically on all kinds of databases. So, here we go. Let's discuss some ACIDic properties of a database.

Well just to have a clear start, we are not talking about stuff which somehow relates to "this".

The four letters of the word ACID denote the set of four properties.

[You can click on them to see more on each properties in my other posts.]
  • Atomicity It means that a database is said to be atomic if each transaction on the database follows  "all or none" rule.
  • Consistency It means that a database is consistent (in a valid state) before and after each transaction with respect to the rules and constraints applied on the database.
  • Isolation Any two or more transactions on a database are isolated from each other. i.e. they don't interfere with each other.
  • Durability This property ensures that any transaction committed on the database is not lost in case of events like error, power crash, etc.

Relational Database: a basic understanding

What is a Relational Database?

It’s a database based on Relational Model. Which was proposed by E.F. Codd

Now in a relational model, the data is stored in the form of tables which look something like this example.

So, the catch here is "the data is stored logically in tabular form".

Table shown in above picture gives the details (i.e. Name, age and city) of three persons and it contains the rows and columns where
  • Each Row represents a record having information about a person. e.g. The last row contains Name, Age and City of Katie.

  • Each Column represents a category of information. e.g. The second column contains age of all the persons in the example.
So, Basically a Relational database is a collection of such enties which contain data in tabular form as explained in above example.

Now, to use and manage such relational databases (like Oracle, SQL server, etc.), the standard language is SQL (Structured Query language)

The First Post

Hi guys !!!

 This post here is my first entry on this blog giving a little introduction to this blog

The purpose of this blog here is to share/exchange the concepts,  knowledge, and ideas about a Relational Database with you. The core purpose here is mutual learning benefit in simplest way possible.

You can mail me here.
Your suggestions and comments are welcome :)