What is ioctl in Linux? - Scaler Topics

Ioctl in Linux stands for Input/Output Control. It is a system call used to talk to device drivers, and most Linux drivers support it. Ioctl is used for device-specific operations that the kernel's standard system calls, such as read() and write(), do not cover.

Ioctl in Linux is called in scenarios like informing the kernel that a disk's partition scheme has changed, ejecting a device, or adjusting the volume on a device.

Ioctl is very flexible because of its underlying code structure. From user space, ioctl() is called with a file descriptor, a command and an argument; the driver's handler then receives the command and the argument. The command is a number corresponding to an operation. The argument carries the data for that operation, either an integer value or a pointer to a data structure. The driver's ioctl handler typically implements a switch case over the command to determine the correct operation and provide the right functionality.
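
The switch-based dispatch described above can be sketched in plain user-space C. This is only an illustration: the command numbers DEMO_RESET, DEMO_SET_LEVEL and DEMO_GET_LEVEL, the demo_state structure and the demo_ioctl() function are hypothetical names, not part of any real driver. A real handler lives in the kernel and would copy data with copy_from_user() and copy_to_user() rather than dereferencing the pointer directly.

```c
#include <errno.h>

/* Hypothetical command numbers, chosen only for this sketch. */
enum { DEMO_RESET = 1, DEMO_SET_LEVEL = 2, DEMO_GET_LEVEL = 3 };

/* Hypothetical per-device state. */
struct demo_state {
    int level;
};

/*
 * Simplified stand-in for a driver's ioctl handler: the command selects
 * the operation, and the argument carries that operation's data.
 */
long demo_ioctl(struct demo_state *dev, unsigned int cmd, unsigned long arg)
{
    switch (cmd) {
    case DEMO_RESET:          /* no argument */
        dev->level = 0;
        return 0;
    case DEMO_SET_LEVEL:      /* argument is an integer value */
        dev->level = (int)arg;
        return 0;
    case DEMO_GET_LEVEL:      /* argument is a pointer to an int */
        *(int *)arg = dev->level;
        return 0;
    default:                  /* unknown command */
        return -ENOTTY;
    }
}
```

Adding a new operation in this style only requires a new command number and a new case label, which is where the flexibility comes from.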

[Figure: Working of ioctl in the kernel space]

Ioctl-based Interfaces

The ioctl() system call is the most common way for applications to interface with hardware device drivers in Linux. It is quite flexible, and its functionality is easy to extend by adding new commands. It can be issued on character devices, block devices and sockets, as well as some other special file descriptors.
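
As a small illustration of ioctl working on file descriptors other than device nodes, the sketch below asks the kernel how many bytes are waiting to be read, using the standard FIONREAD command. The helper name pending_bytes() is made up for this example.

```c
#include <sys/ioctl.h>

/*
 * Returns the number of unread bytes queued on fd, or -1 on error
 * (with errno set by the ioctl call).
 */
int pending_bytes(int fd)
{
    int count = 0;

    /* file descriptor, then the command number, then its argument */
    if (ioctl(fd, FIONREAD, &count) < 0)
        return -1;
    return count;
}
```

For instance, after writing five bytes into a pipe, calling pending_bytes() on the pipe's read end reports those five bytes.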

While it provides a lot of functionality and ease of use, it is also easy to get ioctl command definitions wrong, and the resulting errors can be hard to identify and debug.

Command Number Definitions

As discussed earlier, an ioctl call carries a command and an argument. The command number, or request number, is the second argument that the ioctl() system call accepts, after the file descriptor. It can be any 32-bit number that uniquely identifies an action for a particular driver, but modern conventions define a structured way of building it.

These conventions are captured by four macros, _IO, _IOR, _IOW and _IOWR, defined in the include/uapi/asm-generic/ioctl.h header file in the Linux kernel source code. A command number built with them is made up of the following parts:

_IO, _IOR, _IOW and _IOWR: The macro name defines the way the argument will be used. _IOW means the argument is a pointer to data passed into the kernel, _IOR means it is a pointer to data coming out of the kernel, and _IOWR covers both directions. _IO is for commands with no argument or commands passing an integer value instead of a pointer.
type: An 8-bit number, often a character literal, specific to a driver or subsystem.
nr: An 8-bit number identifying a specific command, unique for a given 'type'.
data_type: The name of the data type pointed to by the argument. The command number encodes sizeof(data_type).
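
To make the parts listed above concrete, here is a minimal sketch of how a driver header might define its commands with these macros. The 'D' type value, the struct demo_config layout and all DEMO_ names are assumptions invented for the example; the macros themselves, along with the _IOC_TYPE(), _IOC_NR() and _IOC_SIZE() decoders, come from the kernel's exported linux/ioctl.h header.

```c
#include <linux/ioctl.h>
#include <linux/types.h>

/* Hypothetical argument structure built from fixed-width types. */
struct demo_config {
    __u32 rate;
    __u32 flags;
};

/* Hypothetical 'type' value shared by all commands of this driver. */
#define DEMO_IOC_MAGIC 'D'

/* No argument. */
#define DEMO_RESET       _IO(DEMO_IOC_MAGIC, 0)
/* Pointer to data coming out of the kernel. */
#define DEMO_GET_CONFIG  _IOR(DEMO_IOC_MAGIC, 1, struct demo_config)
/* Pointer to data passed into the kernel. */
#define DEMO_SET_CONFIG  _IOW(DEMO_IOC_MAGIC, 2, struct demo_config)
/* Pointer to data used in both directions. */
#define DEMO_XCHG_CONFIG _IOWR(DEMO_IOC_MAGIC, 3, struct demo_config)

/* The fields packed into a command number can be recovered again. */
unsigned int demo_cmd_type(unsigned int cmd) { return _IOC_TYPE(cmd); }
unsigned int demo_cmd_nr(unsigned int cmd)   { return _IOC_NR(cmd); }
unsigned int demo_cmd_size(unsigned int cmd) { return _IOC_SIZE(cmd); }
```

Because the direction, type, number and size are all packed into the command, two drivers that pick different type values cannot accidentally share a command number.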

[Figure: kernel partition table]

Interface versions

Some subsystems in Linux use version numbers in data structures to overload commands with different interpretations of the argument.

Using interface versions is generally a bad idea, since any changes to existing commands tend to break existing applications. A saner approach is to add a new ioctl command with a new number. The old command still has to be kept in the kernel for compatibility, but it can be a wrapper around the new implementation.
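
One way to add a new command without disturbing existing callers is to keep the old command as a thin wrapper around the new code. The following user-space sketch shows that shape only; every name in it (DEMO_SET_RATE, DEMO_SET_RATE2, struct demo_rate2, struct demo_dev, demo_ioctl_v()) is hypothetical, and a real handler would copy the argument from user space first.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical command numbers: the old one stays, a new one is added. */
enum { DEMO_SET_RATE = 1, DEMO_SET_RATE2 = 2 };

/* Hypothetical argument of the newer command. */
struct demo_rate2 {
    uint32_t rate;
    uint32_t flags;
};

/* Hypothetical device state. */
struct demo_dev {
    uint32_t rate;
    uint32_t flags;
};

/* New implementation. */
static long demo_set_rate2(struct demo_dev *dev, const struct demo_rate2 *arg)
{
    if (arg->rate == 0)
        return -EINVAL;
    dev->rate = arg->rate;
    dev->flags = arg->flags;
    return 0;
}

/* Old command, kept for compatibility as a wrapper around the new one. */
static long demo_set_rate(struct demo_dev *dev, uint32_t rate)
{
    struct demo_rate2 arg = { .rate = rate, .flags = 0 };

    return demo_set_rate2(dev, &arg);
}

long demo_ioctl_v(struct demo_dev *dev, unsigned int cmd, const void *arg)
{
    switch (cmd) {
    case DEMO_SET_RATE:
        return demo_set_rate(dev, *(const uint32_t *)arg);
    case DEMO_SET_RATE2:
        return demo_set_rate2(dev, arg);
    default:
        return -ENOTTY;
    }
}
```

Applications built against the old command keep working unchanged, while new applications can opt in to the extra field.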

Return Code

Ioctl commands in Linux are allowed to return negative error codes. These negative error codes in the kernel space are converted to errno values in the user space, where the ioctl() call itself returns -1. A successful operation should return zero; returning a positive value is possible but not recommended.

Whenever the ioctl handler is called with an unknown command number, it returns -ENOTTY (or -ENOIOCTLCMD, which also reaches user space as ENOTTY). ENOTTY historically stands for 'Not a typewriter', referring to TTY devices; modern C libraries describe it as 'Inappropriate ioctl for device'.
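
The user-space view of these return codes can be observed with a short experiment. The sketch below issues a command on a file descriptor and reports the outcome as zero or a negative errno value. The try_ioctl() helper and the DEMO_UNKNOWN_CMD number are hypothetical; the number is simply assumed to be one that the target file descriptor does not recognise.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>

/* Hypothetical command number, assumed to be unknown to the target fd. */
#define DEMO_UNKNOWN_CMD _IO('D', 0x7f)

/*
 * Issues cmd on fd and returns 0 on success or a negative errno value on
 * failure, mirroring the kernel-side convention in user space.
 */
int try_ioctl(int fd, unsigned long cmd)
{
    if (ioctl(fd, cmd, NULL) < 0)
        return -errno;   /* the C library returned -1 and set errno */
    return 0;
}
```

On Linux, calling try_ioctl() with this command on a pipe is expected to yield -ENOTTY, while an invalid file descriptor yields -EBADF before any handler is reached.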

Timestamps

Traditionally, timestamps and timeout values are passed as struct timespec or struct timeval. These are problematic because the structures have incompatible definitions in user space after the move to 64-bit time_t.

The struct __kernel_timespec type can be used instead, embedded in other data structures, when separate second and nanosecond values are needed.
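
A minimal sketch of such an embedded timestamp follows. It assumes kernel headers recent enough to ship linux/time_types.h; the struct demo_event layout and the demo_event_stamp() helper are invented for illustration.

```c
#include <linux/time_types.h>
#include <linux/types.h>
#include <time.h>

/* Hypothetical ioctl argument carrying a timestamp. */
struct demo_event {
    struct __kernel_timespec when;  /* 64-bit seconds and nanoseconds */
    __u32 code;
    __u32 reserved;                 /* explicit padding */
};

/* Fill the timestamp from a regular user-space struct timespec. */
void demo_event_stamp(struct demo_event *ev, const struct timespec *ts)
{
    ev->when.tv_sec = ts->tv_sec;
    ev->when.tv_nsec = ts->tv_nsec;
}
```

Both fields of struct __kernel_timespec are 64 bits wide regardless of how the C library defines time_t, so the layout of the enclosing structure does not change between 32-bit and 64-bit user space.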

32-bit Compat Mode

To support 32-bit user space on 64-bit machines, any driver or subsystem which implements the ioctl callback handler must also implement its compat_ioctl handler counterpart.

As long as the rules for the data structures are followed, 32-bit compat mode is easy to implement: we only need to set the .compat_ioctl pointer to a helper function such as compat_ptr_ioctl() or blkdev_compat_ptr_ioctl().

compat_ptr()

On the s390 architecture, 31-bit user space has an ambiguous representation of data pointers, in which the upper bit is ignored. When such a process runs in compat mode, the compat_ptr() helper function must be used to clear this upper bit of a compat_uptr_t and convert it into a valid 64-bit pointer.

The last argument of a compat_ioctl() callback is an unsigned long, which can be interpreted either as a pointer or as a scalar depending on the command. If it is a scalar, then compat_ptr() must not be used, so that a 64-bit kernel behaves the same way as a 32-bit kernel for arguments where the upper bit is set.

Instead of writing a custom compat_ioctl file operation, drivers whose commands only take pointers to compatible data structures as arguments can use the compat_ptr_ioctl() helper function.
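
The difference between treating the argument as a pointer and as a scalar can be imitated in user space. The two functions below are hypothetical stand-ins written for this article: they only model the effect described above for 31-bit s390 user space and are not the kernel's compat_ptr() implementation.

```c
#include <stdint.h>

/*
 * Pointer argument from 31-bit user space: the top bit of the 32-bit
 * value carries no address information, so it is cleared before the
 * value is widened to 64 bits.
 */
uint64_t demo_compat_ptr_s390(uint32_t uptr)
{
    return (uint64_t)(uptr & 0x7fffffffu);
}

/*
 * Scalar argument: only zero-extended, so a value with the top bit set
 * reaches the handler unchanged, just as it would on a 32-bit kernel.
 */
uint64_t demo_compat_scalar(uint32_t value)
{
    return (uint64_t)value;
}
```

Applying the pointer conversion to a scalar would silently drop the top bit of the value, which is why the two cases must be kept apart per command.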

Structure Layout

Compatible data structures have the same layout on all architectures and avoid the following problematic members:

long and unsigned long are the size of a register and can either be 32-bit or 64-bit wide. So, these cannot be used in portable data structures. Their fixed-length replacements are __s32, __u32, __s64 and __u64.
Pointers have the same problem and, in addition, require the use of the compat_ptr() helper function. The best way to tackle this is to use __u64 in place of pointers, which requires a cast to uintptr_t in userspace and the use of u64_to_user_ptr() in the kernel to convert the value back into a user pointer.
The alignment of 64-bit variables is only 32-bit on the 32-bit x86 (i386) architecture, while they are naturally aligned on most other architectures, including x86-64. Consider a structure with a __u32 member a, followed by a __u64 member b, followed by another __u32 member.
On a 64-bit architecture, there are 4 bytes of padding between a and b and another 4 bytes of padding at the end, but there is no padding on i386. Such a structure needs a compat_ioctl() conversion handler to translate between the two formats. To avoid this, all structures should have their data members naturally aligned or should have explicitly reserved fields added in place of implicit padding.
Structures are padded to multiples of 32-bit in the ARM OABI userspace. This makes some structs incompatible with modern EABI kernels if they do not end on a 32-bit boundary.
Struct members are not guaranteed to have an alignment greater than 16 bits on the m68k architecture. This can pose a problem when relying on implicit padding.
Enums and bitfields usually work as expected, but some of their properties are implementation-defined, so it is better to avoid them in ioctl interfaces.
Depending on the architecture, char data members can be signed or unsigned, so the __u8 and __s8 types should be used for 8-bit integer values. Char arrays remain the clearer choice for fixed-length strings.
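
The padding issue from the list above can be made concrete. The first structure below is a reconstruction of the kind of layout that description implies (a __u32, then a __u64, then a trailing __u32); the second spells the padding out, and the third stores a pointer as a __u64. All structure, member and function names are hypothetical.

```c
#include <linux/types.h>
#include <stddef.h>
#include <stdint.h>

/* Layout depends on the architecture: 24 bytes on x86-64, 16 on i386. */
struct demo_implicit {
    __u32 a;
    __u64 b;
    __u32 c;
};

/* Same members with the padding spelled out: 24 bytes on both. */
struct demo_explicit {
    __u32 a;
    __u32 reserved1;   /* replaces the implicit padding before b */
    __u64 b;
    __u32 c;
    __u32 reserved2;   /* replaces the implicit padding at the end */
};

/* Hypothetical argument that carries a buffer address as a __u64. */
struct demo_buffer {
    __u64 addr;        /* user pointer stored as an integer */
    __u32 len;
    __u32 reserved;
};

/* User-space side: convert a pointer to its __u64 representation. */
void demo_buffer_set(struct demo_buffer *buf, void *data, __u32 len)
{
    buf->addr = (__u64)(uintptr_t)data;
    buf->len = len;
    buf->reserved = 0;
}
```

With the explicit layout, a 32-bit process and a 64-bit kernel agree on every offset, so no conversion handler is needed for that structure.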

Subsystem Abstractions

While some device drivers implement their own ioctl function, most subsystems implement the same commands for multiple drivers. Ideally, the subsystem has an .ioctl() handler which copies the arguments from and to the userspace and passes them into subsystem-specific callback functions through normal kernel pointers.

This has the following benefits:

Applications written for one driver are more likely to work with another driver in the same subsystem if there are no subtle differences in the userspace ABI.
The possibility of bugs is reduced since the complexity of userspace access and data structure layout is handled in a single place.
An ioctl interface shared between multiple drivers is more likely to be reviewed by experienced kernel developers, who can spot issues in it, than one used by a single driver.
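
The division of labour between a subsystem and its drivers might look roughly like the following user-space sketch. It is not taken from any real subsystem: the demosub_ and demo_drv_ names are hypothetical, and the memcpy() calls stand in for the copy_from_user() and copy_to_user() calls a real kernel handler would make.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical commands shared by every driver in the subsystem. */
enum { DEMOSUB_GET_GAIN = 1, DEMOSUB_SET_GAIN = 2 };

/* Callbacks a driver registers; they only ever see normal pointers. */
struct demosub_ops {
    int (*get_gain)(void *priv, uint32_t *gain);
    int (*set_gain)(void *priv, uint32_t gain);
};

/* Subsystem-level handler: copies arguments, then calls the driver. */
long demosub_ioctl(const struct demosub_ops *ops, void *priv,
                   unsigned int cmd, void *user_arg)
{
    uint32_t gain;
    int ret;

    switch (cmd) {
    case DEMOSUB_GET_GAIN:
        ret = ops->get_gain(priv, &gain);
        if (ret == 0)
            memcpy(user_arg, &gain, sizeof(gain)); /* "copy to user" */
        return ret;
    case DEMOSUB_SET_GAIN:
        memcpy(&gain, user_arg, sizeof(gain));     /* "copy from user" */
        return ops->set_gain(priv, gain);
    default:
        return -ENOTTY;
    }
}

/* One hypothetical driver plugging into the subsystem. */
static int demo_drv_get_gain(void *priv, uint32_t *gain)
{
    *gain = *(uint32_t *)priv;
    return 0;
}

static int demo_drv_set_gain(void *priv, uint32_t gain)
{
    *(uint32_t *)priv = gain;
    return 0;
}

const struct demosub_ops demo_drv_ops = {
    .get_gain = demo_drv_get_gain,
    .set_gain = demo_drv_set_gain,
};
```

A second driver would only supply its own demosub_ops table; the command numbers and the argument copying stay in one place.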

Alternatives to ioctl

Some alternatives to ioctl in Linux are:

System calls: A dedicated system call is often the better choice for a system-wide feature that is not tied to a physical device or constrained by the file system permissions of a character device node.
Netlink: Netlink is the preferred way to configure network-related objects through sockets.
Debugfs: Ad-hoc interfaces for debugging purposes use debugfs. These interfaces do not need to be exposed to applications as stable interfaces.
Sysfs: Sysfs is a good way to expose the state of an in-kernel object which is not tied to any file descriptor.
Configfs: Configfs can be used for more complex configuration than sysfs.
Custom file system: A custom file system can offer extra flexibility with a simple user interface, but it adds a lot of complexity to the implementation.

Conclusion

Ioctl in Linux is a system call which is the most common way of interfacing with device drivers.
It supports adding new commands and can be issued on a variety of file descriptors, including character devices, block devices and sockets.
The ioctl system call takes a file descriptor, a command and an argument.
Many subsystems provide a shared ioctl handler so that the same commands work across multiple drivers.