On Capturing the release events of modkeys using X11 1
As one will learn when I finally manage to publish my notes, or one of the ~3-7 fragments of blog posts concerned with this topic: I tend to stumble upon the question of ergonomics and ways to improve Unified Command Line Interfacing, the dwimlayer interfaces between humans and computers, especially using keyboards.
With this in mind, I find myself wondering every so often why a special kind of menu or UI interaction is barely used despite being quite an effective mechanism: the ability to terminate a menu interaction on key release. The most prominent example might be the menu used to switch the focused window under Windows, triggered with Meta(Alt)-Tab. Pressing this key combination spawns a list of available windows. Pressing Tab repeatedly while holding the Meta key selects the next menu element, while releasing the Meta key confirms the selected action and closes the menu. If I remember correctly, pressing Esc while the menu is visible closes it without triggering the currently selected action (and if this is not the current behaviour, it is what I would expect from a reasonably designed interface).
This way of interaction seems to be a nice sweet spot between two well-established ways of issuing an action:
- no-menu: you press some key (maybe in combination with a modifier) and an action is executed immediately. Either some menu displaying the available actions is permanently visible, or you have to memorize the association between action and key combination.
- preview/confirm: in this case some menu is available; it might be spawned by your key combination, be permanently visible, or be activated otherwise. Key (sic!) here is that your key combination selects a candidate without triggering its associated action. Advantages: the user is informed about the action they are about to trigger. This also, and especially, allows the interface/menu designer to show arbitrary information that may depend dynamically on the selected item and the overall state of the application: it may show a preview of the changes the action is going to produce, display a longer text detailing the operation, or even suggest additional key combinations that trigger refined versions of the selected action. The disadvantage is that any selected action needs to be confirmed manually with an additional key press, which in many cases doubles the work needed to issue an action.
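To pin down what I mean by terminating a menu interaction on key release, here is a minimal sketch of the interaction as a tiny state machine in Rust. It is not taken from any existing application: `HoldMenu`, `Input` and `Outcome` are names I made up for illustration, and I assume that the first press opens the menu on the first item and that the selection wraps around.

```rust
/// Input events the menu reacts to (a simplified, hypothetical event set).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Input {
    /// The "next" key (Tab) was pressed while the modifier is held.
    NextPressed,
    /// The held modifier (Alt) was released.
    ModifierReleased,
    /// Escape was pressed.
    EscapePressed,
}

/// What the caller should do after feeding one event into the menu.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    /// The menu is open and the given item is highlighted.
    Preview(usize),
    /// The menu closed; run the action of the given item.
    Confirm(usize),
    /// The menu closed without running anything.
    Cancel,
    /// Nothing to do (menu closed, or no items to show).
    Ignored,
}

/// Hold-to-preview, release-to-confirm menu over `len` items.
#[derive(Debug)]
pub struct HoldMenu {
    len: usize,
    /// `None` means the menu is currently closed.
    selected: Option<usize>,
}

impl HoldMenu {
    pub fn new(len: usize) -> Self {
        HoldMenu { len, selected: None }
    }

    pub fn handle(&mut self, input: Input) -> Outcome {
        match (input, self.selected) {
            // An empty item list never opens a menu.
            (Input::NextPressed, _) if self.len == 0 => Outcome::Ignored,
            // The first press opens the menu on item 0 (an assumption).
            (Input::NextPressed, None) => {
                self.selected = Some(0);
                Outcome::Preview(0)
            }
            // Further presses cycle through the items, wrapping around.
            (Input::NextPressed, Some(i)) => {
                let next = (i + 1) % self.len;
                self.selected = Some(next);
                Outcome::Preview(next)
            }
            // Releasing the modifier confirms without an extra key press.
            (Input::ModifierReleased, Some(i)) => {
                self.selected = None;
                Outcome::Confirm(i)
            }
            // Escape closes the menu and discards the selection.
            (Input::EscapePressed, Some(_)) => {
                self.selected = None;
                Outcome::Cancel
            }
            // Releases and Escape while the menu is closed are ignored.
            (_, None) => Outcome::Ignored,
        }
    }
}
```

Feeding it `NextPressed`, `NextPressed`, `ModifierReleased` on a three-item menu previews items 0 and 1 and then confirms item 1, while `EscapePressed` in place of the release closes the menu without running anything.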
So I was mildly pleased to discover a Hacker News link to someone who had replicated the Alt-Tab menu for Linux window managers running under X11. In keeping with the spirit of hacker culture, I tested the linked application and found it pretty much working. Unfortunately, I find the functionality of the Windows Alt-Tab menu pretty lacking: it merely allows iterating over the list of all windows, with no filtering, no shortcuts and an unclear order. Thanks to our open source software culture, though, there is nothing holding me back (ok, time) from trying to fix those shortcomings. So I dabbled in the source code and found the relevant pieces:
// Grab Alt+Tab
xlib::XGrabKey(
    display,
    tab_key,
    alt_mask,
    root_window,
    1,
    xlib::GrabModeAsync,
    xlib::GrabModeAsync,
);
loop {
    let mut event: xlib::XEvent = std::mem::zeroed();
    xlib::XNextEvent(display, &mut event);
    match event.get_type() {
        xlib::KeyPress => {
            log::debug!("Alt+Tab Pressed [X11]");
            state::IS_VISIBLE.store(true, Ordering::SeqCst);
            let index = state::SELECTED_INDEX.load(Ordering::SeqCst);
            state::SELECTED_INDEX.store(index + 1, Ordering::SeqCst);
            state::SELECTED_INDEX_CHANGED.store(true, Ordering::SeqCst);
        }
        xlib::KeyRelease => {
            let xkey = xlib::XKeyEvent::from(event);
            if xkey.keycode == alt_key as u32 {
                state::IS_VISIBLE.store(false, Ordering::SeqCst);
                state::SELECTED_INDEX.store(-1, Ordering::SeqCst);
                state::SELECTED_INDEX_CHANGED.store(true, Ordering::SeqCst);
            }
            if xkey.keycode == tab_key as u32 {
                //
            }
        }
        _ => {}
    }
}
I was seriously puzzled to find that this does not work the way it looks: the key release of the Alt key was simply never captured. I dug through the source a bit more and found the following behaviour coded in the menu-display logic.
let controller = EventControllerKey::new();
let window_clone = window.clone();
let tabs_clone = tabs.clone();
controller.connect_key_released(
    move |_, keyval, _, _| match keyval.name().unwrap().as_str() {
        "Alt_L" => {
            log::debug!("Alt_L released [GTK]");
            window_clone.hide();
            state::IS_VISIBLE.store(false, Ordering::SeqCst);
            {
                let mut tabs = tabs_clone.write().unwrap();
                tabs.reorder_prev_first();
            }
            state::SELECTED_INDEX.store(-1, Ordering::SeqCst);
            let surface = window_clone.surface().unwrap();
            let display = window_clone.display();
            let monitor = display.monitor_at_surface(&surface).unwrap();
            let monitor_name = monitor.model().unwrap();
            // ...
        }
        _ => {}
    },
);
window.add_controller(controller);
Unfortunately, this code depends on the GTK4 library, which the app uses to display the menu. It also seems to capture the Alt_L key only “locally”, that is, while the menu has focus. In my everlasting quest for generalization, though, at no point did I intend to couple my fix to any specific window-rendering library, nor did I intend to accept the assumption that a UI element needs to have focus when a mod key is released to confirm a selected action.
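Why the X11 loop never sees the Alt release while the GTK handler does can, as far as I understand the XGrabKey documentation, be explained by how passive grabs work: the grab only becomes active when Tab is pressed while Alt is held, and it ends as soon as Tab is released, regardless of the modifier. The following toy model in Rust illustrates that reading. It does not talk to X11 at all; `route`, `Receiver`, `Key` and `KeyEvent` are hypothetical names, and the model ignores details such as key repeat and other modifiers.

```rust
/// Who receives a keyboard event in this toy model.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Receiver {
    /// The client that registered the passive grab on Alt+Tab.
    Grabber,
    /// Whichever window currently has keyboard focus.
    FocusWindow,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Key {
    Alt,
    Tab,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum KeyEvent {
    Press(Key),
    Release(Key),
}

/// Decide, event by event, who would receive it under a passive
/// grab on Alt+Tab (assumed semantics, see the text above).
pub fn route(events: &[KeyEvent]) -> Vec<Receiver> {
    let mut alt_down = false;
    let mut grab_active = false;
    let mut receivers = Vec::with_capacity(events.len());

    for &event in events {
        // Pressing Tab while Alt is held activates the grab, and the
        // press itself already goes to the grabbing client.
        if event == KeyEvent::Press(Key::Tab) && alt_down {
            grab_active = true;
        }

        receivers.push(if grab_active {
            Receiver::Grabber
        } else {
            Receiver::FocusWindow
        });

        match event {
            KeyEvent::Press(Key::Alt) => alt_down = true,
            KeyEvent::Release(Key::Alt) => alt_down = false,
            // Releasing the grabbed key ends the grab, no matter
            // whether Alt is still held.
            KeyEvent::Release(Key::Tab) => grab_active = false,
            KeyEvent::Press(Key::Tab) => {}
        }
    }
    receivers
}
```

For the usual order (Alt down, Tab down, Tab up, Alt up) the model sends the final Alt release to the focused window rather than to the grabbing client, which would fit the observation that only the focused GTK menu got to see `Alt_L`. Only if Alt is released while Tab is still held does the grabbing client receive the release.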
And so the trawling through random Google search results that I call research began again. Sooner rather than later I found this Stack Overflow post, whose answer reads:
I can think of two different ways around this.
1. Select KeyReleaseMask for all windows (and keep track of appearing and disappearing windows); or
2. Once you know Alt is pressed, poll the keyboard state with XQueryKeyboard [sic, presumably XQueryKeymap] every 0.1 second or so until it's released.
#include <X11/Xlib.h>
#include <X11/Xutil.h>
#include <stdbool.h>
#include <stdio.h>
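void dowin (Display* dpy, Window win, int reg)
{
    Window root, parent;
    Window* children = NULL;
    unsigned int nchildren, i;
    /* (Un)register for key releases and structure changes on this window. */
    XSelectInput (dpy, win, reg ? KeyReleaseMask|SubstructureNotifyMask : 0);
    /* XQueryTree returns 0 on failure; children is not valid in that case. */
    if (!XQueryTree (dpy, win, &root, &parent, &children, &nchildren))
        return;
    /* Apply the same registration to every descendant window. */
    for (i = 0; i < nchildren; ++i)
    {
        dowin (dpy, children[i], reg);
    }
    if (children)
        XFree(children);
}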
int main()
{
    Display* dpy = XOpenDisplay(0);
    if (!dpy)
    {
        fprintf (stderr, "Cannot open display\n");
        return 1;
    }
    Window win = DefaultRootWindow(dpy);
    XEvent ev;
    unsigned int alt_modmask = Mod1Mask;
    unsigned int ignored_modmask = 0; // stub
    KeyCode tab_keycode = XKeysymToKeycode(dpy,XK_Tab);
    KeyCode alt_keycode = XKeysymToKeycode(dpy,XK_Alt_L);
    dowin (dpy, win, True);
    XGrabKey (dpy,
              tab_keycode,
              alt_modmask | ignored_modmask,
              win,
              True,
              GrabModeAsync, GrabModeAsync);
    while(true)
    {
        ev.xkey.keycode = 0;
        ev.xkey.state = 0;
        ev.xkey.type = 0;
        XNextEvent(dpy, &ev);
        switch(ev.type)
        {
            case KeyPress:
                printf ("Press %lx: %u-%u\n", ev.xkey.window, ev.xkey.state, ev.xkey.keycode);
                break;
            case KeyRelease:
                printf ("Release %lx: %u-%u\n", ev.xkey.window, ev.xkey.state, ev.xkey.keycode);
                break;
            case MapNotify:
                /* A window appeared: start listening on it and its children. */
                printf ("Mapped %lx\n", ev.xmap.window);
                dowin (dpy, ev.xmap.window, True);
                break;
            case UnmapNotify:
                /* A window disappeared: stop listening on it. */
                printf ("Unmapped %lx\n", ev.xunmap.window);
                dowin (dpy, ev.xunmap.window, False);
                break;
            default:
                printf ("Event type %d\n", ev.type);
                break;
        }
    }
    XCloseDisplay(dpy);
    return 0;
}
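The C program above follows the first suggestion. For the second one, polling, the relevant Xlib call is, as far as I know, `XQueryKeymap`, which fills a 32-byte bit vector with one bit per keycode. Here is a minimal Rust sketch of the polling logic under that assumption. To keep it independent of any binding crate, the keyboard snapshot is supplied by a closure; `key_is_down`, `wait_for_release` and the `snapshot` parameter are names I invented for this sketch.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Test one key in a 32-byte keymap bit vector: byte `keycode / 8`,
/// bit `keycode % 8`, least significant bit first.
pub fn key_is_down(keymap: &[u8; 32], keycode: u8) -> bool {
    keymap[(keycode / 8) as usize] & (1u8 << (keycode % 8)) != 0
}

/// Poll `snapshot` until `keycode` is reported as released and
/// return how many polls that took.
///
/// `snapshot` is a hypothetical hook: in a real program it would
/// wrap a call to XQueryKeymap and hand back the 32 bytes it filled.
pub fn wait_for_release<F>(mut snapshot: F, keycode: u8, interval: Duration) -> usize
where
    F: FnMut() -> [u8; 32],
{
    let mut polls = 0;
    loop {
        polls += 1;
        if !key_is_down(&snapshot(), keycode) {
            return polls;
        }
        // Still held: wait a little before asking again.
        sleep(interval);
    }
}
```

In the switcher, one would presumably start such a loop once the Alt+Tab press arrives, pass the keycode obtained from `XKeysymToKeycode` for `XK_Alt_L`, and use an interval of roughly 100 ms as the answer suggests. The price is a confirmation that can lag behind the actual release by up to one interval.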
Translated to Rust with the x11 crate, the dowin helper from the answer looks like this:
use x11::xlib::{Display, Window, XFree, XQueryTree, XSelectInput};
unsafe fn dowin(dpy: *mut Display, win: Window, reg: bool) {
    let mut root: Window = 0;
    let mut parent: Window = 0;
    let mut children: *mut Window = std::ptr::null_mut();
    let mut nchildren: u32 = 0;
    // Convert reg to the desired event mask or 0 if reg is false.
    let event_mask = if reg {
        x11::xlib::KeyReleaseMask | x11::xlib::SubstructureNotifyMask
    } else {
        0
    };
    // Select input events on the current window based on `reg`.
    XSelectInput(dpy, win, event_mask as i64);
    // Query the window tree to get its children.
    let status = XQueryTree(
        dpy,
        win,
        &mut root,
        &mut parent,
        &mut children,
        &mut nchildren,
    );
    // A zero status means the query failed; without children there is nothing to do.
    if status == 0 || children.is_null() {
        return;
    }
    // Recursively call `dowin` on each child.
    for i in 0..nchildren as isize {
        dowin(dpy, *children.offset(i), reg);
    }
    // Free the list of children windows to avoid memory leaks.
    XFree(children as *mut std::ffi::c_void);
}
Well, it worked, somehow. The only catch was that scrolling a website in Firefox with the Vimium C extension now showed strange behaviour.
On a personal note: I do not think it should be this hard :(. In a follow-up post I describe how I solved this issue (watch me scurry through every Rust/C/X11 key binding repo at hand), which hack of a bash script the working solution replaces, and discuss whether all of this is a concerted operation to convince me to use Wayland.
As always I appreciate notes on existing solutions or ways to achieve the intended mechanisms in an appropriate manner. Cheers!