So, now we're bold kernel programmers and we know how to write kernel modules to do nothing. We feel proud of ourselves and we hold our heads up high. But somehow we get the feeling that something is missing. Catatonic modules are not much fun.
There are two major ways for a kernel module to talk to processes. One is through device files (like the files in the /dev directory), the other is to use the proc file system. Since one of the major reasons to write something in the kernel is to support some kind of hardware device, we'll begin with device files.
The original purpose of device files is to allow processes to communicate with device drivers in the kernel, and through them with physical devices (modems, terminals, etc.). The way this is implemented is the following.
Each device driver, which is responsible for some type of hardware, is assigned its own major number. The list of drivers and their major numbers is available in /proc/devices. Each physical device managed by a device driver is assigned a minor number. The /dev directory is supposed to include a special file, called a device file, for each of those devices, whether or not it's really installed on the system.
For example, if you do ls -l /dev/hd[ab]*, you'll see all of the IDE hard disk partitions which might be connected to a machine. Notice that all of them use the same major number, 3, but the minor number changes from one to the other. (Disclaimer: This assumes you're using a PC architecture. I don't know about devices on Linux running on other architectures.)
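To make the major/minor split concrete, here is a small user-space sketch of how a device number can be packed and unpacked. It assumes the 16-bit layout used by kernels of this era (major in the high byte, minor in the low byte), which is the same shift-and-mask arithmetic the example driver applies to inode->i_rdev. The names old_dev_t, old_major, old_minor and old_makedev are made up for this illustration and are not kernel APIs.

```c
#include <assert.h>

/* ASSUMPTION: 16-bit device numbers, major number in the high byte and
 * minor number in the low byte. These names are hypothetical. */
typedef unsigned short old_dev_t;

/* Extract the major number (which driver) */
unsigned int old_major(old_dev_t dev)
{
    return (dev >> 8) & 0xFF;
}

/* Extract the minor number (which device handled by that driver) */
unsigned int old_minor(old_dev_t dev)
{
    return dev & 0xFF;
}

/* Pack a major/minor pair into one device number */
old_dev_t old_makedev(unsigned int major, unsigned int minor)
{
    assert(major <= 0xFF && minor <= 0xFF);
    return (old_dev_t)((major << 8) | minor);
}
```

With this layout, old_makedev(3, 1) is 0x0301, and old_major and old_minor of that value give back 3 and 1.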
When the system was installed, all of those device files were created by the mknod command. There's no technical reason why they have to be in the /dev directory; it's just a useful convention. When creating a device file for testing purposes, as with the exercise here, it would probably make more sense to place it in the directory where you compile the kernel module.
Devices are divided into two types: character devices and block devices. The difference is that block devices have a buffer for requests, so they can choose the order in which to respond to them. This is important in the case of storage devices, where it's faster to read or write sectors which are close to each other, rather than those which are farther apart. Another difference is that block devices can only accept input and return output in blocks (whose size can vary according to the device), whereas character devices are allowed to use as many or as few bytes as they like. Most devices in the world are character devices, because they don't need this type of buffering, and they don't operate with a fixed block size. You can tell whether a device file is for a block device or a character device by looking at the first character in the output of ls -l. If it's `b' then it's a block device, and if it's `c' then it's a character device.
This module is divided into two separate parts: the module part, which registers the device, and the device driver part. The init_module function calls module_register_chrdev to add the device driver to the kernel's character device driver table. It also returns the major number to be used for the driver. The cleanup_module function deregisters the device.
This (registering something and unregistering it) is the general functionality of those two functions. Things in the kernel don't run on their own initiative, like processes, but are called: by processes via system calls, by hardware devices via interrupts, or by other parts of the kernel (simply by calling specific functions). As a result, when you add code to the kernel, you're supposed to register it as the handler for a certain type of event, and when you remove it, you're supposed to unregister it.
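The register/unregister pattern can be pictured as a table indexed by major number. What follows is only a user-space toy, not the kernel's implementation: the table size, the names sketch_register and sketch_unregister, and the exact return values are assumptions made for the illustration. Passing 0 as the major number asks for any free slot, mirroring the way the example module passes 0 to get a major number assigned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_MAJOR 8                      /* hypothetical table size */

static const char *table[MAX_MAJOR];     /* NULL means the slot is free */

/* Register name under major. major == 0 means "pick a free slot for me".
 * Returns the major number used, or a negative error code. */
int sketch_register(unsigned int major, const char *name)
{
    assert(name != NULL);
    if (major == 0) {
        for (major = MAX_MAJOR - 1; major > 0; major--) {
            if (table[major] == NULL) {
                table[major] = name;
                return (int)major;
            }
        }
        return -EBUSY;                   /* every slot is taken */
    }
    if (major >= MAX_MAJOR)
        return -EINVAL;
    if (table[major] != NULL)
        return -EBUSY;                   /* somebody else owns this major */
    table[major] = name;
    return (int)major;
}

/* Free the slot, but only if it is really held under that name */
int sketch_unregister(unsigned int major, const char *name)
{
    assert(name != NULL);
    if (major == 0 || major >= MAX_MAJOR)
        return -EINVAL;
    if (table[major] == NULL || strcmp(table[major], name) != 0)
        return -EINVAL;
    table[major] = NULL;
    return 0;
}
```

The point of the sketch is the symmetry: whatever init_module puts into a table like this, cleanup_module has to take out again.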
The device driver proper is composed of the four device_<action> functions, which are called when somebody tries to do something with a device file which has our major number. The kernel knows to call them through the file_operations structure, Fops, which holds pointers to those four functions and was passed in when the device was registered.
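Dispatching through a structure of function pointers is easy to try outside the kernel. The sketch below is a simplified user-space stand-in, assuming a made-up two-entry structure (sketch_ops) rather than the real file_operations, with NULL marking an operation the driver doesn't implement. The caller checks the pointer before jumping through it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, cut-down operations table (the real one has more entries) */
struct sketch_ops {
    int (*read)(char *buffer, int length);
    int (*write)(const char *buffer, int length);
};

/* A "driver" read function: copy as much of the message as fits */
int hello_read(char *buffer, int length)
{
    static const char msg[] = "Hello";
    int n = 0;

    while (n < length && msg[n] != '\0') {
        buffer[n] = msg[n];
        n++;
    }
    return n;                            /* number of bytes copied */
}

/* This driver supports reading only, so write is left NULL */
struct sketch_ops hello_ops = { hello_read, NULL };

/* What the caller does: look up the pointer, refuse if it is missing */
int do_read(const struct sketch_ops *ops, char *buffer, int length)
{
    assert(ops != NULL);
    if (ops->read == NULL)
        return -EINVAL;
    return ops->read(buffer, length);
}

int do_write(const struct sketch_ops *ops, const char *buffer, int length)
{
    assert(ops != NULL);
    if (ops->write == NULL)
        return -EINVAL;
    return ops->write(buffer, length);
}
```

Here do_read(&hello_ops, ...) ends up in hello_read, while do_write(&hello_ops, ...) fails with -EINVAL because the table has no write entry.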
Another point we need to remember here is that we can't allow the kernel module to be rmmoded whenever root feels like it. The reason is that if the device file is opened by a process and then we remove the kernel module, using the file would cause a call to the memory location where the appropriate function (read/write) used to be. If we're lucky, no other code was loaded there, and we'll get an ugly error message. If we're unlucky, another kernel module was loaded into the same location, which means a jump into the middle of another function within the kernel. The results of this would be impossible to predict, but they can't be positive.
Normally, when you don't want to allow something, you return an error code (a negative number) from the function which is supposed to do it. With cleanup_module that is impossible because it's a void function. Once cleanup_module is called, the module is dead. However, there is a use counter, called the reference count, which counts how many other kernel modules are using this kernel module (that's the last number of the line in /proc/modules). If this number isn't zero, rmmod will fail. The module's reference count is available in the variable mod_use_count_. Since there are macros defined for handling this variable (MOD_INC_USE_COUNT and MOD_DEC_USE_COUNT), we prefer to use them, rather than mod_use_count_ directly, so we'll be safe if the implementation changes in the future.
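The effect of the use count on rmmod can be modelled in a few lines of user-space C. This is a toy model under stated assumptions, not kernel code: struct sketch_module, sketch_get, sketch_put and sketch_rmmod are invented names standing in for the module record, MOD_INC_USE_COUNT, MOD_DEC_USE_COUNT and the removal request.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the kernel's record of a loaded module */
struct sketch_module {
    const char *name;
    int use_count;   /* how many users currently hold the module */
    int loaded;      /* 1 while the module is in the "kernel" */
};

/* Plays the role of MOD_INC_USE_COUNT (for example, in device_open) */
void sketch_get(struct sketch_module *m)
{
    m->use_count++;
}

/* Plays the role of MOD_DEC_USE_COUNT (for example, in device_release) */
void sketch_put(struct sketch_module *m)
{
    assert(m->use_count > 0);
    m->use_count--;
}

/* Removal is refused while anybody is still using the module */
int sketch_rmmod(struct sketch_module *m)
{
    if (!m->loaded)
        return -ENOENT;
    if (m->use_count != 0)
        return -EBUSY;
    m->loaded = 0;
    return 0;
}
```

A removal attempt between sketch_get and sketch_put fails with -EBUSY, and it succeeds once the count is back to zero.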
/* chardev.c
 * Copyright (C) 1998 by Ori Pomerantz
 *
 * Create a character device (read only)
 */

/* The necessary header files */

/* Standard in kernel modules */
#include <linux/kernel.h>   /* We're doing kernel work */
#include <linux/module.h>   /* Specifically, a module */

/* Deal with CONFIG_MODVERSIONS */
#if CONFIG_MODVERSIONS==1
#define MODVERSIONS
#include <linux/modversions.h>
#endif

/* For character devices */
#include <linux/fs.h>       /* The character device definitions are here */
#include <linux/wrapper.h>  /* A wrapper which does next to nothing at
                             * present, but may help for compatibility
                             * with future versions of Linux */

#define SUCCESS 0


/* Device Declarations *********************************************** */

/* The name for our device, as it will appear in /proc/devices */
#define DEVICE_NAME "char_dev"

/* The maximum length of the message from the device */
#define BUF_LEN 80

/* Is the device open right now? Used to prevent concurrent access into
 * the same device */
static int Device_Open = 0;

/* The message the device will give when asked */
static char Message[BUF_LEN];

/* How far did the process reading the message get? Useful if the
 * message is larger than the size of the buffer we get to fill in
 * device_read. */
static char *Message_Ptr;


/* This function is called whenever a process attempts to open the device
 * file */
static int device_open(struct inode *inode, struct file *file)
{
  static int counter = 0;

#ifdef DEBUG
  printk("device_open(%p,%p)\n", inode, file);
#endif

  /* This is how you get the minor device number in case you have more
   * than one physical device using the driver. */
  printk("Device: %d.%d\n", inode->i_rdev >> 8, inode->i_rdev & 0xFF);

  /* We don't want to talk to two processes at the same time */
  if (Device_Open)
    return -EBUSY;

  /* If this was a process, we would have had to be more careful here.
   *
   * In the case of processes, the danger would be that one process
   * might have checked Device_Open and then be replaced by the scheduler
   * by another process which runs this function. Then, when the first
   * process was back on the CPU, it would assume the device is still not
   * open. However, Linux guarantees that a process won't be replaced
   * while it is running in kernel context.
   *
   * In the case of SMP, one CPU might increment Device_Open while another
   * CPU is here, right after the check. However, in version 2.0 of the
   * kernel this is not a problem because there's a lock to guarantee
   * only one CPU will be in a kernel module at the same time. This is bad
   * in terms of performance, so it will probably be changed in the future,
   * but in a safe way. */

  Device_Open++;

  /* Initialize the message. */
  sprintf(Message,
    "If I told you once, I told you %d times - Hello, world\n",
    counter++);
  /* The only reason we're allowed to do this sprintf is because the
   * maximum length of the message (assuming 32 bit integers - up to 11
   * characters including the minus sign) is less than BUF_LEN, which
   * is 80. BE CAREFUL NOT TO OVERFLOW BUFFERS, ESPECIALLY IN THE
   * KERNEL!!! */

  Message_Ptr = Message;

  /* Make sure that the module isn't removed while the file is open by
   * incrementing the usage count (the number of opened references to the
   * module; if it's not zero, rmmod will fail) */
  MOD_INC_USE_COUNT;

  return SUCCESS;
}


/* This function is called when a process closes the device file. It
 * doesn't have a return value because it can't fail (you must ALWAYS
 * be able to close a device). */
static void device_release(struct inode *inode, struct file *file)
{
#ifdef DEBUG
  printk("device_release(%p,%p)\n", inode, file);
#endif

  /* We're now ready for our next caller */
  Device_Open--;

  /* Decrement the usage count, otherwise once you opened the file you'll
   * never get rid of the module. */
  MOD_DEC_USE_COUNT;
}


/* This function is called whenever a process which already opened the
 * device file attempts to read from it. */
static int device_read(struct inode *inode,
                       struct file *file,
                       char *buffer,   /* The buffer to fill with the data */
                       int length)     /* The length of the buffer
                                        * (mustn't write beyond that!) */
{
  /* Number of bytes actually written to the buffer */
  int bytes_read = 0;

#ifdef DEBUG
  printk("device_read(%p,%p,%p,%d)\n", inode, file, buffer, length);
#endif

  /* If we're at the end of the message, return 0 (which signifies end
   * of file) */
  if (*Message_Ptr == 0)
    return 0;

  /* Actually put the data into the buffer */
  while (length && *Message_Ptr) {

    /* Because the buffer is in the user data segment, not the kernel
     * data segment, assignment wouldn't work. Instead, we have to use
     * put_user which copies data from the kernel data segment to the
     * user data segment. */
    put_user(*(Message_Ptr++), buffer++);

    length--;
    bytes_read++;
  }

#ifdef DEBUG
  printk("Read %d bytes, %d left\n", bytes_read, length);
#endif

  /* Read functions are supposed to return the number of bytes actually
   * inserted into the buffer */
  return bytes_read;
}


/* This function is called when somebody tries to write into our device
 * file - currently unsupported */
static int device_write(struct inode *inode,
                        struct file *file,
                        const char *buffer,
                        int length)
{
#ifdef DEBUG
  /* The buffer is in the user data segment, so print only its address
   * rather than treating it as a kernel string. */
  printk("device_write(%p,%p,%p,%d)\n", inode, file, buffer, length);
#endif

  return -EINVAL;
}


/* Module Declarations ********************************************** */

/* The major device number for the device. This is static because it
 * has to be accessible both for registration and for release. */
static int Major;

/* This structure will hold the functions to be called when
 * a process does something to the device we created. Since a pointer to
 * this structure is kept in the devices table, it can't be local to
 * init_module. NULL is for unimplemented functions. */
struct file_operations Fops = {
  NULL,           /* seek */
  device_read,
  device_write,
  NULL,           /* readdir */
  NULL,           /* select */
  NULL,           /* ioctl */
  NULL,           /* mmap */
  device_open,
  device_release  /* a.k.a. close */
};


/* Initialize the module - Register the character device */
int init_module()
{
  /* Register the character device (at least try) */
  Major = module_register_chrdev(0, DEVICE_NAME, &Fops);

  /* Negative values signify an error */
  if (Major < 0) {
    printk("Sorry, registering the character device failed with %d\n",
      Major);
    return Major;
  }

  printk("Registration is a success. The major device number is %d.\n",
    Major);
  printk("If you want to talk to the device driver, you'll have to\n");
  printk("create a device file. We suggest you use:\n");
  printk("mknod <name> c %d <minor>\n", Major);
  printk("You can try different minor numbers and see what happens.\n");

  return 0;
}


/* Cleanup - unregister the character device */
void cleanup_module()
{
  int ret;

  /* Unregister the device */
  ret = module_unregister_chrdev(Major, DEVICE_NAME);

  /* If there's an error, report it */
  if (ret < 0)
    printk("Error in module_unregister_chrdev: %d\n", ret);
}
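Once the module is loaded and a device file has been created with the mknod command the module suggests, the driver can be exercised from an ordinary process. The helper below is a minimal user-space sketch (the name read_all is made up for this example) that keeps calling read() until it returns 0, which is how the driver's read function signals the end of the message.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Read a file (for example our device file) into out, which holds
 * out_size bytes. The result is always NUL terminated. Returns the
 * number of bytes read, or -1 on error. */
long read_all(const char *path, char *out, long out_size)
{
    long total = 0;
    int fd;

    assert(path != NULL && out != NULL && out_size > 0);

    fd = open(path, O_RDONLY);   /* this triggers the driver's open */
    if (fd < 0)
        return -1;

    while (total < out_size - 1) {
        ssize_t n = read(fd, out + total, (size_t)(out_size - 1 - total));

        if (n < 0) {
            close(fd);
            return -1;
        }
        if (n == 0)              /* 0 bytes means end of file */
            break;
        total += n;
    }

    out[total] = '\0';
    close(fd);                   /* this triggers the driver's release */
    return total;
}
```

Pointing read_all() at the device file twice in a row should show the counter in the message going up, since the driver rebuilds the message on every open.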