You hear the word “intuitive” used frequently in relation to user interfaces or applications in general. You hear that a certain application or interface is intuitive — or, too often, that it is not intuitive.
The problem with “intuitiveness” is that it doesn’t have an objective existence. What’s intuitive to one person may not be intuitive to another. Specifically, in the case of software, what’s intuitive to the programmer may not be intuitive to the user.
The real problem is that even if the software designers decide to create software which will be intuitive for the user, that doesn’t mean it will be — what the designer thinks will be intuitive to the user still may not be what the user finds intuitive at all.
Two things could be happening in the last scenario. One is that the user interface is being designed as though the user were an idiot. The problem with designing a UI on that assumption is that you just might wind up with an interface which is only intuitive to idiots.
The second thing which might be happening is that the programmer is still designing for himself or herself, but trying to adopt the mindset he or she had before learning all about computers. There’s still a problem: no matter how you try to put yourself in the shoes of one who hasn’t learned about computers, it’s actually quite difficult to remember what it’s like not to know something. For example, the “right-click” paradigm for an extra context-sensitive menu is so common that you might believe it is intuitive. Actually, there’s nothing intrinsically intuitive about clicking the right mouse button on an object to get a set of optional tasks to perform; that’s learned behavior from an existing paradigm.
The problem of learned intuitiveness makes the whole matter more complex. Is it necessarily “intuitive” to have three buttons in the top right-hand corner of each window (in a window-based GUI)? If you’re used to Windows or Linux, you’d probably say yes; if you’re a Mac user, you might suggest that the top left corner is a little more intuitive. However, in both cases you would agree on what the buttons do, because they do much the same things on nearly all window-based GUIs: the outermost closes the window, and the other two expand or contract it and “minimize” it (the order of those two varies between systems).
The minimization of windows is another example of learned intuitiveness. There’s nothing “intuitive” about a minimized window being represented by a button on a bar; some window managers on Linux instead “iconify” a window, so that a minimized window is rendered as an icon on the desktop. Mac OS X does something similar in an area of the dock; “minimized” windows may appear as icons in the dock itself.
While you may agree that none of this is directly “intuitive” per se, you might want to point out that it is also not very hard to figure out. That’s a good point, and is probably the one most closely related to making a user interface actually intuitive. That is, it may not be intuitive before you learn it, but the model is so simple that only a short exposure is enough to make it stick for the long term, to the point where you may actually wind up believing that it is intuitive.
Given that we do come to view things as “intuitive” by a learning process, this can be both a help and a hindrance to designing future user interfaces. A 20+ year history of graphical interfaces for computer systems has created a fairly rich set of symbols and paradigms which, while not intrinsically obvious, have become a virtual lingua franca of the UI world. What this means is that if you are going to have context menus related to certain objects or icons, you would probably be best served by making these available by clicking the right mouse button on the object. Why? Because this is analogous to some real-life behavior? Do we touch rocks or trees with our left hand to use them and with our right to learn more about them? Of course not; however, in the Esperanto of UI paradigms, “right-click” means “Show me more options. Show me a menu.” Ignoring the convention means that a vast majority of computer users will go away from your interface saying: That’s not very intuitive.
Actions are not the only examples of learned intuition; we are also trained to think that icons can be moved, dragged, and dropped. If you decide to put icons in your interface which can’t be moved, dragged, or dropped, you’re going to confuse people. Why? Because we already have an idea of what an icon that can’t be moved, dragged, or dropped is: it’s a button. And we’re going to tell you, if it’s a button, make it look like a button, not an icon.
This learned “language” of visual symbols and expected behaviors of objects can, of course, be beneficial as well. Unfortunately, getting the maximum benefit from existing paradigms means that you are essentially constrained to duplicating the UIs which already exist. If you have a new idea, too bad: you just have to blindly follow the conventions of “typical” user interfaces. If we’re interested in actually improving user interfaces, then we’re basically stuck. This type of thinking assumes that the peak of human/computer interaction has already been reached; that no new thinking, methods, or paradigms need to be adopted; and that we just need to pull all existing software up to the same bar of “conventional” UIs, and we’ll be fine.
I don’t think anyone who uses computers really thinks that the “peak” of human/computer interaction and/or interface design has been reached… So, what’s the solution?
People spend years, and whole master’s and doctoral degrees, studying this, so I’m certainly not suggesting I came up with any great insight off the top of my head. Also, I don’t read academic journals, so I’m betting that anything I have to say was said by somebody else some time ago.
That being said, one key to “intuitive” interfaces seems, to me, to be keeping a low representational gap between the analogies you choose and the behavior that is possible. Take the typical “desktop” system. A computer desktop has items on it, much like a real-world desktop does. These items can typically be either folders or application shortcuts; again, not too big a leap. A real-world desktop might have a file folder on it, which “contains” other documents (and possibly other folders). It might also have devices on it which you “use”: staplers, calculators, etc. Now, opening a real-world folder on your real-world desktop does not create some sort of “window” which contains the folder’s contents; however, once you see this work, it isn’t too hard to understand that the window represents a “container”, showing what’s in the folder.
So you can introduce new ideas, but they should be a short jump from existing analogies. A recent experiment I saw showed the introduction of “piles” on a 3D desktop. This is a new idea, compared to what is currently in use, but it’s intuitive because it just adds something that could exist in the real world: nobody needs to have a “pile” of objects explained to them.
On the other hand, adding vi key bindings to your desktop will make it very easy for a small subset of your users, but will be ignored by most users. Why? Vi key bindings have no relation, really, to anything other than vi itself.
Or take the iPod. For all intents and purposes, nothing quite like the iPod had ever been seen before, yet you would always hear about how “intuitive” the wheel interface was. Why? It basically just inherited some paradigms from existing interfaces that were familiar to people: hierarchical menus, click a name to go “in”, click up to go back “up” one level. A circular motion to scroll a list was not in use, as far as I know, but all you need to do is see it one time, and it makes sense. There is a low gap between the action you’re performing and the result you see: you move your finger in a circle and the list scrolls.
I think most advances in “intuitive” interfaces will be along these lines. They may be “new” ways of doing things, but they will extend an existing paradigm (e.g., the “desktop” model) either with new features based on reality, or with abstract features whose behavior is consistent and easily correlated with the methods used to employ them.
Designing interfaces or features that are based on the logical architecture of the software, or on existing highly specialized environments (e.g., vi or emacs), will wind up being “non-intuitive” for almost everyone except the designer himself or herself.

I always remember the first time my mother-in-law sat down to use a computer. My wife told her to point the mouse at something and she picked the mouse up into the air and pointed it right at the thing on the screen. How intuitive is that?
I also remember the time I sat my mom down to see the work I had done on a website. She couldn’t figure out how to scroll down to see more of the page. She had never used a scroll bar before.
That said, there is a cost associated with intuitiveness. Intuitiveness is a useful thing in many situations, but non-intuitive interfaces have their place as well. You use the example of vi or emacs. These are perfect examples of applications with little or no intuitiveness that are useful beyond belief. If you add intuitiveness to such an application you end up with something like Word, and that is a bad thing indeed.
If you are seeking rapid adoption of a new technology, intuitiveness is key. If, OTOH, you are designing an application for steady, repetitive use over years, intuitiveness might take a back seat to efficiency.
But yes, I agree that innovations in intuitive interfaces should be done based upon analogy and should diverge minimally from the analogy chosen. I believe the innovation comes in the choice of analogy.
Right, and of course it depends on your audience. If your audience consists of people who are very familiar with vi, then using vi-like key bindings would be intuitive. I’m reminded of the first time I tried playing Moria (Rogue/Hack/NetHack descendant); it had an option for Rogue-like key bindings, which were of course h, j, k, and l for moving left, down, up, and right. In other words, straight out of vi. I remember thinking how that didn’t seem to make any sense at all, but to a user of vi it would have been quite natural.
In a similar way, for better or worse I’ve come to “expect” that CTRL-C, CTRL-X, and CTRL-V will copy, cut, and paste in GUI apps… if I try that in a new application, and it works, I’ll “feel” as if that app is intuitive… even though there is nothing “intuitive” about CTRL-V being used to paste things.
I think you’re right about there being a trade-off, too. Both vi and emacs allow you to be incredibly productive… if you’ve put in a significant amount of time to learn to use them. If you have an application that will only do one thing you can afford to have a big red button in the center of it labeled “Do the thing!” — for anything more complex, you’re not going to be able to have a clearly labeled button for every possible task (or combination of tasks).
“Intuitiveness” and how we perceive it is a fascinating topic, especially since almost everything in computing that we call “intuitive” really isn’t; we’ve just learned that it works that way. I think a lot of what we call “intuitive” is just “consistent with our previous experience”, but then again, maybe that’s what “intuitive” means, in a sense… hmm.
Incidentally, I discovered a cool feature of Gnome today that is extremely intuitive, but so intuitive that I never imagined it existed.
If you select some text in any window and then click and drag that text to another window it does a copy and paste automatically. That is extremely intuitive, but like I said, so intuitive I didn’t know it was there.
Now if you’re using Beryl (which I am), you can drag the text to the upper right corner (which displays all currently open windows, à la Exposé), hover over the window until it gains focus, then drop the text. Wow!