ALINK="#FF0000">
LINUX GAZETTE
[ Prev ][ Table of Contents ][ Front Page ][ Talkback ][ FAQ ][ Next ]

"Linux Gazette...making Linux just a little more fun!"


Taming The Linux Keyboard (My Programming Adventures in Writing a Console Application for Linux)

By Petar Marinov


It was about a year ago when I ventured into the idea of porting my Windows-based console editor to Linux. Naturally, I targeted the text console. The editor was designed in a way to facilitate the porting to any other text console environment. I have isolated all the keyboard input and text output functions in two files, which I planned to rewrite whenever a new platform comes to my way. I've supplied two different versions of these files for Windows console and plain DOS which, I presumed then, validated my initial idea that porting should be a simple task.

My only knowledge about Linux then comprised the GNU compiler package obtained by spending a year with its DJGPP port to DOS. While I knew how to write, compile, and debug programs in Linux, my knowledge for the console was limited to functions like printf() and getch(). As I knew how prominent is the role of the text console in Unix, I supposed that programming console applications should be really advanced and on a par or even better to this in DOS and Windows.

I pulled the anchor and once underway, I started to gather necessary information. Getting the advantage of having the sources of everything, I tried to find the good tides by downloading two applications that had console oriented text interfaces.

First it was the ubiquitous MC (Midnight Commander). This program was the straw, which everyone that comes from Windows or DOS into the UNIX land grabs gasping for a breath in the ocean of the unknown.

Second it was the TurboVision port to Linux. TurboVision is a popular windowing framework for DOS designed in Borland. The company was amicable enough to release the sources in public domain and shortly after having them in 32-bit DJGPP I found out that there is a Linux version as well. This pretty much showed me that there is a resolution to my problems. To further milk the ship metaphor, this blew good winds in my sails.

Something has spoiled the nice experience of using MC. There is something rotten in Denmark, as one of the Great used to say. Why ESC ... is not ESC? In MC when you press ESC it does nothing until you press it again and then the two-ESC combination does the action supposed for the single ESC. Helloooo, why is that? Read on, to learn what dictates the rules here.

In DOS we had ANSI.SYS. You use printf() with a sequence (starting with ESC and hitherto called an ESC-sequence) of funky characters to move the cursor, change the color etc, following the needs of the full-text screen applications. It was considered primitive and unproductive to use ANSI.SYS, and besides fancy ASCII art, nothing serious engaged in using this methodology. Advanced libraries offered direct access to the video memory which greatly improved the user experience in working with console applications. I remember looking in dismay the sources of tpcrt.pas of the TurboPower package, where tight assembler code tried to squeeze whatever the graphic (maybe we should say "text") card had to offer.

It turned out that what was considered a very primitive way to do full screen applications in DOS, is the road to go in UNIX! I needed sometime to collect myself after learning one of those basic facts of life. Back on my feet I tried to devise a scheme for taming the beast; I thought that as I learn the ESC-sequences and write the functions it would be a dream come true. After some further research I discovered that there is no single standard for these ESC sequences! There are different terminals that support different set of operations and that one needs to have a whole database to properly operate all possible terminals. Why does everything grow so complex? I still tremble remembering the waves of the aftershock by all the discoveries I made in a single day.

"Curses" is the king. Long live the king! It turns out that somebody has already developed this database and all the functions I need in a library originally called "curses". In Linux it is "ncurses". Everyone uses it.

Screen access functions are almost intuitive enough for everyone to start immediately utilizing them. "Ncurses" takes care to update only the portions of the screen that actually changed, which is a nice performance improvement when you use your program in a telnet session or any other remote mode to minimize the amount of the transferred data.

One simple problem I faced was the fact that DOS and Windows color attributes do not directly map to "ncurses" color attributes. The problem is aggravated by the fact that in "ncurses" I have only 64 pairs of color and background available. How will I map my 127 possible color/background attributes to just 64?! Well, a short analysis revealed that my program uses only about 25 distinctive attributes, which allows me to fit them nicely in the 64 attributes map that "ncurses" uses. It works like that, I have an array of attributes -- my pallete. I first go and count how many unique attributes I have. Then for each unique attribute, that I dissolve to color and background, I create correspondent entry in the "ncurses" color palette. The index of this entry (consider it as an ID) is stored in a secondary array of 256 bytes (the whole range of Windows and DOS). When I then pass to my display function an attribute from my palette it is used as an index in this secondary array to extract the correspondent attribute ID that is generated by "ncurses". So as long as I do not go beyond 64 unique attributes, my program will be happy and will use the good old 256 attribute values. This allows me to have a single color palette for all the platforms that I currently support, where for DOS and Windows it is used natively, it is dynamically remapped in Linux.

The TurboVision port used a direct screen access when running in a text mode linux terminal. Temporarily I considered this option, it is still possible to add this to my modules, but later I thought that the small performance gain simply doesn't worth the effort.

I had a bad hunch about the keyboard. First just by looking in the key definitions in ncurses headers, I noticed that this library basically lacks the infrastructure to define rich key combinations. The terminal ships all the "extended" keys via ESC sequences which, I don't know why, prevents you from getting single ESC as an ESC and you are always required to press ESC twice for your program to receive it once. Plus you have only a certain number of key codes, which I presume are derived from an ancient crippled terminal, and nothing creative happened to the definitions since then. Compare this to Windows, and even in DOS, where you can have an ASCII translation of a key, then you can have the keys as position codes and you can always have the shift state of the keyboard. In Windows you are delivered, via a standard API, different events as key pressed and key released. It sounds natural, isn't it? Well, because it is so natural, UNIX faithful to its orthodox approach to these matters defines everything in an extremely crippled model. Working this way is maybe good for improving your mental stamina, but believe me, is totally unproductive if you would like to achieve something fast.

Then I was unable to make ncurses operate with a 0 timeout when reading a key. Add to this the double ESC syndrome, the total lack of any roads to extend this, and in a while you look as an abandoned donkey in a desert with a water for just couple of hours. I always have a bottle of Evian within reach, I needed no more evidences that while "ncurses" support for the display is adequate, its keyboard support beyond some very basic functionality is totally irrelevant to my needs. I started to think how to maintain the whole keyboard business with my own code.

The keyboard is a file -- stdin. I never thought I will use stdin in a full screen program but, as you can suspect, this is the way to go in Unix. The stdin file transports everything in ASCII codes and keys like the arrows form a sequence that starts with ESC. At first the impression is that if ESC is a start of sequence then the key ESC itself should be ESC-caped as well. That is the way "ncurses" go. That's why in MC we need to press ESC twice.

Beside delivering all the ASCII codes and the ESC sequences, the keyboard module needs to supply a kbhit() function. In the documentation "ncurses" promises that its getch() function can work with a 0 timeout, thus never blocking when there is no key in the buffer. Maybe a plan like mine starts to form in your mind, I will use getch() with 0 timeout, then I will have a small sleep(xxx) and this loop will exit whenever a key is pressed. This sounds good in theory, but "ncurses" is short on delivering on this specific feature. Its maybe something that I didn't do right, or I used old version, or maybe it is something else, I may eventually even look at the sources of the "ncurses". I didn't want to go that deep, the whole keyboard model looked totally outdated, fixing a small flaw in "ncurses" wouldn't have helped me, I thought.

I need basically this: 1. kbhit(). I need to check for a key and exit. 2. I need to be able to read something like Ctrl+Shift+Left_Arrow. 3. I need to exit with a timeout if for sometime a key is not pressed.

For anything of this to happen one needs to put the keyboard stdin file in a row mode. By default you enter lines of text, which are send to your program only after the user presses Enter. So to be able to read a single character you need to switch to a row mode. The stdin file may not only serve a keyboard but may get characters via a serial cable, yes, only 3 wires should be enough to manipulate a whole machine. For a serial connection to operate properly you need to maintain flow control. This could be done by adding 2 additional wires for each of the talking sides to request "stop sending me characters" and when the buffer is empty to ask "now start dumping again". But adding 2 wires may prove expensive and even complex, and this shuttle in the space, who knows, something may happen if we introduce this extra level of hardware complexity. What one can not do with the hardware makes up with the software, a host of characters (in fact 31) is wasted to not only maintain the flow control but to switch modes of echoing, the canonicality (whew, this is a word, a?!) etc. As you may already know these 31 characters are the control characters that populate the lower district of the ASCII table. While this sounded as a good idea 20 years ago now pressing Ctrl+S to suspend the output on the screen looks arcane to me, and I need to use Ctrl+S to save the current file. It smells like the keyboard needs some extra massaging to fit into the shape I need. So the setup function easily grows beyond the lines of a single screen, after some hours of reading documentation and experimentation I managed to come up with something that actually works. I felt proud, and in this era of people doing space tourism I felt that I have my small shred of achievements to show.

Kbhit() proved to be relatively easy. To wait for a characters on a file (or socket), use select(). If you apply this to stdin then select() will unblock when someone presses a key. If the timeouts are tuned to 0, then it will exit immediately with a failure or success code indicating that a key is ready to be read.

The ReadKey() function has a 2-tier structure (complex isn't it, sounds like something that is multi-tire). At the first level I use select() to block for incoming characters. When select() unblocks I issue a single read() function and try to extract as much as possible. Whatever I manage to suck from stdin I store in a fifo queue. On the second level the characters are extracted one by one and a string sequence is compiled that is matched against an exhaustive list of ESC sequences. We have a definite success if we find a matching string. We have a definite success if we have a single ESC followed by a time-out. Then it is just an ESC, simple! Some code takes place to close the extra cases where nothing matches and it is not an ESC etc. All this is supplied by a function that extracts the Shift state of the keyboard. You may guess that this is a hack that only works in Linux text terminals. I learned this from the source code of MC. The downside is that this doesn't work outside bare-bones text Linux terminals. Try to work in a X terminal and you are dead, no shift key status and no means to ever extract it.

Linux keyboard gives one more advantage in front of any other Unix keyboard (that I know of), you may define your own ESC sequences. So, if you need for example Ctrl+Shift+F3 you can define this with a new ESC sequence and by using the "loadkey" utility to download it into the kernel. The changes are immediate and non-permanent. If you reboot you need to reexecute the same command with the same definition. I liked this, so I added all the key combinations that I used, and were undefined in Linux and defined in Windows.

Actually by having the function to extract the shift-state, the possible key-combinations one needs to explicitly define is greatly decreased. As for example if we have Shift+F3 defined, we can get the Control key state and then we have Ctrl+Shift+F3. Which without the shift-state function should be defined as a new key sequence with "loadkey". A problem surfaces here, which although subtle, should be well noted. If the extraction of the key from the keybuffer does not coincide with the time we extract the shift state we create a big mess. With a great probability (the computers nowadays a fast enough) we can expect for this not to happen, but hey, as this is just a probability so mathematicions say the odds are that sometimes it may actually happen. Example follow. If in my editor F3 is "close all files and discard changes" and Ctrl+Shift+F3 is assigned to be "next file", I beg for trouble here. Imagine that F3 is in the buffer and you can not get the shift state at the same time you will get just F3 and not "Ctrl+Shift+F3".

Having stdin as keyboard input has one great advantage, I should admit. The editor is subject to a total automation by just supplying an input file, provided that you put there the ESC sequences to activate various extended keys if necessary.

To scold the enthusiasm in your eyes I should note that ... ostensibly, full automation is possible, but to a certain extent ... maybe you have already figured this out, right ... you can not supply the shift states in a text file. Well, that's life, you can not have your cake and eat it too!

In Windows a console application works equally well in text mode and in a graphical console window. While my module works perfectly well in a text linux console ($TERM="linux"), its keyboard support is totaly inadequate when started from within X window terminal. All this is corollary of the fact that the most common denominator of the UNIX keyboards is "unsigned char" and the extended keys use predefined ESC sequences that didn't evolve for the last 20 (or more) years. So I'll keep working in making my modules X aware. Whenever the program senses that it is started from within X terminal it will open a new window where all the text output will be emulated with a fixed width font and the keyboard will be processed to the best possible extent that X server offers.

Eventually what I initially planed proved to be doable. It was quite an effort and that is why the victory was so sweet.

demo.tar.gz contains the whole story expresed in C language.

You may find this article at the "zepp" discussion group, here. I will be glad to respond to questions or any opinions regarding this article posted in the discussion group. Generally, the discussions are usually about software development, comments on hardware, programming languages. Anything related to 42 will find benign soil. You are welcome to join.

Petar Marinov

I come originally from Rousse, Bulgaria. Now I live in San Francisco and work in Foster City (California). I program mainly embedded systems. I started using Linux in 1998. My work is mainly done on Windows because of Visual C, but it is deployed on Linux platform.


Copyright © 2002, Petar Marinov.
Copying license http://www.linuxgazette.net/copying.html
Published in Issue 76 of Linux Gazette, March 2002

[ Prev ][ Table of Contents ][ Front Page ][ Talkback ][ FAQ ][ Next ]