What is an ENUM in MySQL?

String data types are common in databases. In many cases, however, a string attribute can take only a limited set of values. For such cases, MySQL provides the ENUM data type, which can make your code easier to handle and your storage more compact. ENUM is a special string data type that accepts only a specific set of predefined values as valid values for an attribute. We can think of an ENUM as a dropdown menu on a website, as opposed to a free-text input.

In this article, we'll dive deeper into the concept of ENUM in MySQL. We'll explore its syntax, how to create and use ENUM columns, how to handle ENUM literals, and the advantages and limitations of using the ENUM data type in MySQL.

Syntax

An ENUM column in MySQL is declared as column_name ENUM('value1', 'value2', ..., 'valueN'), where column_name is the name of the column, and 'value1', 'value2', ..., and 'valueN' represent the set of allowed values for that column. The number of allowed values, N, can range from 1 to 65,535. This declaration appears in the column list of a CREATE TABLE statement when the table is created.
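As a minimal sketch of such a declaration in context (table_name, the id column, and the three member values are placeholders chosen for illustration, not names from a real schema):

```sql
-- Hypothetical table: every name and value below is a placeholder.
CREATE TABLE table_name (
    id INT PRIMARY KEY,
    column_name ENUM('value1', 'value2', 'value3')
);
```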

The resulting ENUM column accepts only the listed values.

Example of ENUM

Consider a users table whose gender attribute should accept only the values 'Male', 'Female', or 'Other'. This can be done with a CREATE TABLE statement that declares gender as an ENUM column.
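A sketch of that statement, assuming the table is called users and using the column definitions described in the next paragraph:

```sql
CREATE TABLE users (
    id INT AUTO_INCREMENT PRIMARY KEY,       -- auto-incrementing primary key
    name VARCHAR(100),                       -- up to 100 characters
    gender ENUM('Male', 'Female', 'Other')   -- only these three values are valid
);
```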

This table has three columns: id, name, and gender. The id column is an auto-incrementing primary key, and the name column is of type VARCHAR with a maximum length of 100 characters. The gender column is of type ENUM and allows only the values 'Male', 'Female', and 'Other'.

Next, we insert some rows into the table.
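The rows in this sketch are taken from the output shown later in this section. The statement assumes a users table with the id, name, and gender columns described above, and leaves id to AUTO_INCREMENT:

```sql
INSERT INTO users (name, gender) VALUES
    ('Raju', 'Male'),
    ('Shyam', 'Male'),
    ('Simran', 'Female'),
    ('Prem', 'Other'),
    ('Anjali', 'Female');
```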

Finally, we query the table with a SELECT statement to display all the rows in the users table.
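Assuming the table is named users, the query could be written as follows (the ORDER BY clause is an addition here, to make the row order explicit):

```sql
SELECT * FROM users ORDER BY id;
```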

The output is:

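+----+--------+--------+
| id | name   | gender |
+----+--------+--------+
|  1 | Raju   | Male   |
|  2 | Shyam  | Male   |
|  3 | Simran | Female |
|  4 | Prem   | Other  |
|  5 | Anjali | Female |
+----+--------+--------+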

ENUM values are stored internally as numbers. Storing a six-character string such as 'Female' in a VARCHAR gender column takes at least 6 bytes per row (plus a length byte), so 1 million such rows need more than 6 million bytes. An ENUM column with up to 255 members needs only 1 byte per row, or about 1 million bytes for the same rows.

To select rows with a particular ENUM value, either the string literal or its assigned index can be used in the WHERE clause. For the gender column, comparing against 'Female' and comparing against the index 2 select the same rows.
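A sketch of the two equivalent queries, assuming gender is declared as ENUM('Male', 'Female', 'Other') in a table named users:

```sql
SELECT * FROM users WHERE gender = 'Female';  -- match by string literal
SELECT * FROM users WHERE gender = 2;         -- match by index ('Female' is the 2nd member)
```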

Index Values for Enumeration Literals

ENUM values are stored internally as numeric indices. The indices start at 1 and are consecutive integers, assigned in the order in which the values are listed in the column definition. For example, in a column defined as ENUM('Male', 'Female', 'Other'), 'Male' has index 1, 'Female' has index 2, and 'Other' has index 3.

MySQL does not provide a way to assign index values to ENUM literals explicitly. The index of a member is determined solely by its position in the ENUM definition, so the only way to give a member a different index is to change the order in which the members are listed.

The empty string that MySQL stores as an error value has the index 0. Every row that holds an invalid ENUM value therefore has index 0, and such rows can be found by comparing the column with 0 in a SELECT statement.
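As a sketch, assuming the users table and gender column from the earlier example, rows that hold the error value could be located like this:

```sql
-- Returns only rows whose gender was stored as the '' error value (index 0).
SELECT * FROM users WHERE gender = 0;
```

With the sample rows shown earlier this returns nothing, because every inserted value is a valid member.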

The index of the NULL value is NULL. An ENUM column can therefore yield a NULL index for NULL values, index 0 for invalid values, and indices from 1 up to the number of members in the ENUM, which is at most 65,535.

To retrieve these indices, add + 0 to the ENUM column in an ordinary SELECT statement, which makes MySQL evaluate the column in a numeric context.
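A minimal sketch, again assuming the users table from the earlier example (the alias gender_index is a name chosen here for illustration):

```sql
-- gender shows the string; gender + 0 shows the stored index.
SELECT name, gender, gender + 0 AS gender_index FROM users;
```

With the definition ENUM('Male', 'Female', 'Other'), rows holding 'Male' would show index 1, 'Female' index 2, and 'Other' index 3.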

Handling of Enumeration Literals

When a table is created, trailing spaces are automatically removed from ENUM member values in the table definition, but the lettercase is preserved.

When you store a number in an ENUM column, the number is treated as an index into the list of allowed values, and the value stored is the enumeration member with that index. A quoted number is first compared with the enumeration values as a string; if no member matches, it is still treated as an index. Because of this potential for confusion, defining an ENUM column with enumeration values that look like numbers is not recommended. For instance, a column defined as numbers ENUM('0', '1', '2') has members with the string values '0', '1', and '2' but the index values 1, 2, and 3.

If you store 2 in this column, it is interpreted as an index and becomes '1' (the value with index 2). If you store '2', it matches an enumeration value and is stored as '2'. If you store '3', it does not match any enumeration value, so it is interpreted as an index and becomes '2' (the value with index 3).
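The three cases above can be sketched as follows; the table name enum_demo is a hypothetical name used only for this illustration:

```sql
CREATE TABLE enum_demo (
    numbers ENUM('0', '1', '2')   -- string values '0', '1', '2'; indices 1, 2, 3
);

INSERT INTO enum_demo (numbers) VALUES
    (2),     -- unquoted: index 2, stored as '1'
    ('2'),   -- quoted and matches a member: stored as '2'
    ('3');   -- quoted, no matching member: index 3, stored as '2'

SELECT numbers FROM enum_demo;
```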

Empty or NULL Enumeration Values

Like other columns, an ENUM column permits NULL unless it is declared NOT NULL. If the column permits NULL, NULL is a valid value and is also the default value. You can instead specify a default explicitly with the DEFAULT keyword, using either one of the allowed values or NULL; a color column, for example, could name one of its colors as the default.

In that case, a row inserted without a color receives the chosen default. When an ENUM column is declared NOT NULL and no default is given, the first item in the list of allowed values is used as the column's default value.
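The article's original listing for this example is not available, so the following is only a sketch: the shirts table, its id column, and the three color values are hypothetical.

```sql
CREATE TABLE shirts (
    id INT AUTO_INCREMENT PRIMARY KEY,
    -- Explicit default; without DEFAULT, NOT NULL would make 'red' (the first member) the default.
    color ENUM('red', 'green', 'blue') NOT NULL DEFAULT 'green'
);

-- No color is supplied, so the column falls back to its default.
INSERT INTO shirts () VALUES ();

SELECT id, color FROM shirts;
```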

When you attempt to insert an invalid value into an ENUM column (that is, a string that is not in the list of allowed values), the empty string is stored instead as a special error value. This string differs from a "normal" empty string in that it has the numeric value 0. If strict SQL mode is enabled, the attempt results in an error instead.

ENUM Sorting

When we use the ORDER BY clause on an ENUM column, the values are sorted by their index values rather than by the actual strings. We can still obtain the desired sort order, however. Use one of these methods with ORDER BY on an ENUM column to avoid unexpected results:

  • List the ENUM values in alphabetical order in the column definition, so that index order and lexical order coincide.
  • Sort with ORDER BY CAST(col AS CHAR) or ORDER BY CONCAT(col) so that the lexical order is used rather than the index number.
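The difference between the two orderings can be sketched with the users table from the earlier example, assuming gender is declared as ENUM('Male', 'Female', 'Other'):

```sql
-- Index order: Male (1), Female (2), Other (3).
SELECT name, gender FROM users ORDER BY gender;

-- Lexical order: Female, Male, Other.
SELECT name, gender FROM users ORDER BY CAST(gender AS CHAR);
```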

Advantages of ENUM Data Type

There are several reasons why ENUM makes the code more efficient.

  • Only allowing a specific list of values ensures that the attribute cannot hold an unknown or invalid value. This ensures data integrity.
  • The allowed string values of an ENUM are not stored as strings in the backend but as integers. This makes data storage compact, and retrieving and comparing numbers is quicker than string lookups.
  • Because the numbers are translated back into the corresponding strings in query results, queries and output remain easy to read.

Limitations of ENUM Data Type

Though ENUMs offer real advantages for efficient code and storage, their use is generally limited and not widely promoted, for the following reasons:

  • To change the list of enumeration members, the column has to be redefined with the ALTER TABLE command. Apart from appending new members to the end of the list, this typically rebuilds the entire table, which uses a lot of resources and takes a long time on large tables.
  • Sorting on an ENUM column makes MySQL use the underlying integer index rather than the actual string. This is sometimes advantageous, but it can be confusing if you are not expecting it.
  • ENUM members that look like integers are confusing, because numbers are also interpreted as indices, so they should be avoided wherever feasible. If the values really are integers, a TINYINT column is preferable.
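To illustrate the first point, here is a sketch of adding a member to the gender column of the users table from the earlier example. The new value 'Unknown' is hypothetical. MODIFY COLUMN requires the full list of members to be restated, and appending the new member at the end leaves the indices of the existing members unchanged:

```sql
ALTER TABLE users
    MODIFY COLUMN gender ENUM('Male', 'Female', 'Other', 'Unknown');
```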

Conclusion

ENUM is a useful data type in MySQL for columns or attributes that have a limited set of possible values. The key points from this article are:

  • ENUM is a special string data type in MySQL that allows only a specific set of predefined values as valid values for an attribute.
  • ENUM literals are stored as integers in the backend, which makes data storage compact and data access quicker.
  • ENUM values are sorted based on their index values, rather than their string values.
  • NULL and default values for an ENUM column are controlled with the DEFAULT keyword and the NOT NULL constraint, while invalid values are stored as an empty-string error value with index 0.
  • ENUM can be advantageous in terms of data integrity, performance, and readability of code.
  • The use of ENUM also has limitations, such as limited extensibility and the potential for errors when modifying the predefined values.