Null Usage in RDBMS

Why does “NULL” cause a complication in the definition of attributes in the database? How is it stored in the database?

8 thoughts on “Null Usage in RDBMS

  1. mykh74

    “NULL” means that entity does not have a value for an attribute. It indicates that the value of the attribute is unknown, missing or inapplicable. NULL values introduce special problems in relational operations such as arithmetic operations, comparison and logical operations, and set operations. The result of arithmetic operations with null values produces null value. Comparison operators when one of the operators is null value, produces null value. Logical operations are extended to deal with NULL values:
    * Operation ‘AND’: true and NULL produces NULL, false and NULL produces false, NULL and NULL produces NULL
    * Operation ‘OR’: true or NULL produces NULL, false or NULL produces NULL, NULL or NULL produces NULL
    * Operation ‘NOT’: NOT NULL operation produces NULL result.

    If statement in WHERE statement produces NULL or false result for a tuple, this tuple is not added to the result. It is true also for joins when there is a comparison between attributes with NULL values. For example, if both attributes in a tuple have unknown (NULL) values, comparison between these attributes (NULL = NULL) produces NULL and, as a result, a tuple is not added to the result.

    In order to distinguish NULL and not NULL values in logical statements, SQL uses special keywords. Keyword ‘IS NULL’ returns true, if the attribute for a particular tuple has NULL value. Keyword ‘IS NOT NULL’ returns true, if the attribute for a particular tuple contains defined value.

    NULL values are skipped in aggregated functions. For example, if a SQL statement calculates an average grade for a student, NULL values are ignored (the average of 2, NULL, 4 is 3, not 2.25).

    Treatment of NULL values for set operations such as ‘union’, ‘intersect’, and ‘except’ is different than treatment of NULL values in ‘where’ statement. If two tuples have NULL values for some of attributes, and the rest of attributes are identical, those tuples are considered identical. It is also true for ‘select distinct’ statement when duplicated tuples must be eliminated.

    In order to store NULL values in database, each row should contain a null bitmap for attributes that allows nulls. If the attribute has NULL value in the entity then a bit in the bitmap is 1, otherwise it’s 0. If the attribute has fixed width, storing NULL value takes the same space as any other value. If the field has variable length, storing NULL takes no space. Details how NULL values are stored in the database can depend on database engine.

  2. manishekar

    NULL is an undefined value. It is different from regular data values. It is not the same as value zero. It is also not the same as empty string. It can have many different plausible interpretations.

    Normal values are treated as TRUE or FALSE. But, NULL introduces three-value logic and hence brings problems in the maintenance of database. This additional state introduces problems in the correct implementation. The manner in which NULL is interpreted depends on the application.

    Databases treat NUL values in a special way. For example, consider the case where the user is required to fill an application form and the middle name is optional to be filled. The user can specify the middle name or sometimes he can leave the middle name blank. Leaving the middle name blank does not mean that the user does not have a middle name. Such fields are represented in database using NULL.

    Logical AND of NULL with any FALSE value will return FALSE and logical AND of NULL with any TRUE or NULL value will be NULL. Similarly logical OR of NULL with any TRUE value will be TRUE and logical OR of NULL with any FALSE or NULL value will be FALSE.

    Data structures such as indexes can be used to locate records that contain a NULL value in one of their field. The use of index to identify NULL will depend will depend on the significance of the query performance as well as the use of additional disk space. This method also introduces delays in the updating, deletion and insertion of the database. Quite a few alternatives exist. One solution is to define a composite index with at least one another column that has a not NULL constraint. Composite indexes are created using one or more columns of a database table. Another solution is to define a composite index with a constant value as the second column in the composite index. Bitmaps indexes can also be used to store NULL values where majority of the data is stored as bit arrays and responses to the queries are obtained through bitwise logical operations on these bitmaps.

  3. Vinay Venkatachala

    NULL is actually not a value. It is used to signify that the value is unknown. It is part of the 3-value logic. Because of an additional state, there may be complications in implementation if the domain in which it is used is not correctly understood. So, NULL does not indicate 0 or empty. It simply means unknown.

    Some of the problems are-
    Logical operation with NULL have no defined output. All the logical operations consider NULL as 0 or FALSE which may not be what is required.
    Problems are created when we try to retrieved 0 or FALSE values as NULL may also be considered as part of it.

    Some of the databases use index to identify records that contain NULL. Others use composited index. Bitmap indexes are another option.

  4. ramya

    To answer the above questions about NULL value first we need to know some important points about “NULL”.
    IMPORTANT POINTS ABOUT “ NULL” VALUE:
    • NULL values do not have a data type
    • They can be found in a field containing any type of data and has a very special meaning within the context of a relational database.
    • NULL is not the number zero.
    • NULL is not the empty string (“”) value.
    • NULL is the value used to represent an unknown piece of data.
    Now that is “NULL” in a nut shell.

    Databases treat NULL values in a special way, depending upon the type of operation that it is used in.
    When NULL is used in the definition of an attribute it causes a lot of complication as we do not get the right or definite output while executing a query. The following examples will explain it in a better way.
    Example 1:
    In SQL, NULL and empty string are very different and act very differently. Consider the following data table:
    test:
    ID NAME
    1 Sachin
    2
    3 Messi
    4
    5
    6

    When the following query was executed we do not get the desired output.
    SELECT
    (SELECT COUNT( * ) FROM test t WHERE LEN( t.name ) = 0) AS len_count,
    (SELECT COUNT( * ) FROM test t WHERE t.name IS NULL) AS null_count,
    (SELECT COUNT( * ) FROM test t WHERE t.name NOT LIKE ‘_%’), AS like_count
    (SELECT COUNT( * ) FROM test t WHERE t.name IS NULLOR t.name NOT LIKE ‘_%’) AS combo_count
    The result is as follows:
    len_count:2
    null_count:2
    like_count:2
    combo_count:4
    Logically speaking we had to get the answer 4 for all the 4 queries. However , you can see that only one query returns this value.
    Example 2:
    Databases treat NULL values in a special way, depending upon the type of operation that it is used in. When a NULL value appears as an operand to an AND operation, the operation’s value is FALSE if the other operand is FALSE (there is no way the expression could be TRUE with one FALSE operand).
    On the other hand, the result is NULL (unknown) if the other operand is either TRUE or NULL The OR operand treats NULL values in a similar fashion. If the other operand is TRUE, the result of the OR is TRUE (because the real value of the NULL operand doesn’t matter.) On the other hand, if the other operand is either FALSE or NULL, the result of the OR operation is NULL. When a NULL value is obtained, we do not get or know the definite output!!!
    So, allowing NULL values makes you work extra hard to get the kind of data you are looking for. From a related angle, allowing NULL values reduces your convictions about the data in your database. You can never quite be sure if a value exists or not. Furthermore, while running LEN() on a NULL value doesn’t act as you might think it to, it also does NOT throw an error. This will make debugging your code even harder if you do not understand the difference between NULL values and data values.
    HOW ARE NULL VALUES STORED?
    The mechanism of storing the null value differs depending on the DBMS.
    MySql uses an array of bits (one per column) with the data for each row to indicate which columns are null and then leaves the data for that field blank.
    IBM Informix Dynamic Server uses special values to indicate nulls. For example, the valid range of values for a SMALLINT (16-bit, signed) is -32767..+32767. The other value, -32768, is reserved to indicate NULL. Similarly for INTEGER (4-byte, signed) and BIGINT (8-byte, signed). For other types, it uses other special representations (for example, all bits 1 for SQL FLOAT and SMALLFLOAT – aka C double and float, respectively). This means that it doesn’t have to use extra space.IBM DB2 for Linux, Unix, Windows uses extra bytes to store the null indicators

  5. rahul

    NULL value is a value which is unknown. It is different from empty string ”” and 0. The definition of NULL value creates a complication in the definition of attributes as the values can have unknown applicable state and unknown inapplicable state as mentioned by Codd. Example is a customer whose account information we don’t know and another one who doesn’t have an account. So NULL introduces different interpretations based on the usage and design.It also complicates most of the operations. This necessitates the use of three value logic which is complex to understand and implement. Also check for NULL values can be tricky which has to carefully done. Since it is not a value, this requires special handling all over the database operations including arithmetic operations , joins etc.. There is another argument that the NULLs can be removed using normalization.

    NULL values can be indicated as a bitmap for a row with each bit indicating whether a column is NULL or not. This is the method used in postgresql . The bitmap is present only if one of the column value is NULL. Since the NULL itself is not taking an memory , presence of a NULL value makes finding column values complex as the offsets has to be checked based on the NULL bitmap . SQL server provides sparse columns which are optimized for storage of NULL values.

  6. leekris

    A single data item related to a database object is called an attribute. An attribute may describe any component of the database, like a field or a table. It’s nothing but a column in a table which describes the data in different rows in the table. Nulls indicate unknown, missing or inapplicable data. A null cannot be used to imply any other value like zero. There are several operations defined for attributes like Join, Select, Insert, Update and Delete. Also there are several functions defined for dealing with data. However incase of null the output of defined operations is different in few cases from expected output and hence the complication.

    Every database has a mechanism to store Null value. In Oracle, 1 byte is required to store Null as length of column i.e. zero. The size of column is indicated in brackets and zero indicates that particular cell has no value. Aggregation functions in oracle ignore null while performing calculation. The result of any arithmetic expression involving null is null. Two nulls are treated as same. MySQl uses an array of bits along with data in each row to indicate which columns are null and then leaves data for that field blank. PostgreSQL uses an optional bitmap with one bit per column. Each row has the bitmap and it shows which all columns are null. All the table rows are structured in a similar manner with a fixed size header followed by optional null bitmap, an optional object ID field and the user data. IBM database Informix uses special values to indicate nulls. IBM DB2 uses extra bytes to store the null indicators. In Microsoft SQL Server, each row has a null bitmap for columns that allow nulls. If a row in that column is null then a bit in the bitmap is 1 or else 0 for that row. If a UNIQUE constraint is defined on a column, only one Null value is allowed in that column. However few database systems like DB2, SQL Anywhere and Borland Interbase do not allow null values in a UNIQUE column.

  7. Niki

    NULL values are not the same as data values. NULL represents the absence of a value. In SQL, NULL and empty string are very different and act very differently.

    Furthermore, while running LEN() on a NULL value doesn’t act as you might think it to, it also does NOT throw an error. This will make debugging your code even harder if you do not understand the difference between NULL values and data values.

    Common Errors with Null:
    • Nulls in Criteria: If you enter criteria under a field in a query, it returns only matching records. Nulls are excluded when you enter criteria.
    • Nulls in expressions
    • Nulls in Foreign Keys
    • Nulls and non-Variants: he only data type that can contain Null is the Variant. Whenever you assign the value of a field to a non-variant
    • Comparing something to Null
    • Forgetting Null is neither True nor False.

    In SQL Server Each row has a null bitmap for columns that allow nulls. If the row in that column is null then a bit in the bitmap is 1 else it’s 0.For variable size data types the actual size is 0 bytes. For fixed size data type the actual size is the default data type size in bytes set to default value (0 for numbers, ” for chars).

  8. sthiaga1

    A NULL value is a special type of value which does not belong to any data types that is used to indicate that the actual data value does not exist (Missing or inapplicable information) in the database. These values are introduced to fulfil the requirement of the Relational database management Systems. These NULL values causes some complication in the definition of the databases as they may cause the designer of the database some extra complication as they can make the database logic more complicated. These NULL values give severe problems to determine the true values, and for each of the NULL values, which introduces a whole new degree of uncertainty into the database.
    Null values are handled by providing each NULL value with a specific type as they cannot be represented by a single value and they can be directly assigned the NULL values to SQL type variables without triggering an exception. Certain databases like ASP.net have some built-in functionality for working with the NULL values, which stores a unique random value for every NULL value generated by the system, which is useful during retrieval operations.

Leave a Reply