The X New Developer’s Guide: X Window System Concepts (2024)

Alan Coopersmith

  1. X Is Client / Server
  2. X In Practice
    1. Input
      1. Input via Keyboard
      2. Input via Mouse
      3. Input via Touchpad
      4. Input via Touchscreen
      5. Advanced Input Devices and Techniques
    2. GetImage: Reading From the Display
    3. Output
      1. Rendering / Rasterization
    4. Displays and Screens
    5. Graphics contexts
    6. Colors (really?) and Visuals
    7. Syncing and Flushing connections
    8. Window System Objects
      1. Windows
      2. Pixmaps
      3. Widgets
      4. XIDs
      5. Atoms
      6. Properties
    9. Grabs
    10. Selections, Cut-Copy-Paste

This chapter aims to introduce you to the basic X Window System concepts and terminology you will need to understand. When you have these concepts, you will be ready to dive deeper into specific topics in later chapters.

X Is Client / Server

The X Window System was designed to allow multiple programs to share access to a common set of hardware. This hardware includes both input devices such as mice and keyboards, and output devices: video adapters and the monitors connected to them. A single process was designated to be the controller of the hardware, multiplexing access to the applications. This controller process is called the X server, as it provides the services of the hardware devices to the client applications. In essence, the service the X server provides is access, through the keyboard, mouse and display, to the X user.

Like many client/server systems, the X server typically provides its service to many simultaneous clients. The X server runs longer than most of the clients do, and listens for incoming connections from new clients.

Many users will only ever use X on a standalone laptop or desktop system. In this setting, the X clients run on the same computer as the X server. However, X defines a stream protocol for client/server communication. This protocol can be exposed over a network to allow clients to connect to a server on a different machine. Unfortunately, in this model, the client/server labeling can be confusing. You may have an X server running on the laptop in front of you, displaying graphics generated by an X client running on a powerful machine in a remote machine room. For most other protocols, the laptop would be a client of file sharing, HTTP or similar services on the powerful remote machine. In such cases, it is important to remind yourself that keyboards and mice connect to the X server. It is also the one endpoint to which all the clients (terminal windows, web browsers, document editors) connect.

X In Practice

This section describes some of the fundamental pieces of X and how they work. This is one of those places where everything wants to be presented at once, so the section is something of a mish-mash. Recommended reading practice is to skim it all once, and then go back and read it all again.

Input

As mentioned earlier, the X server primarily handles two kinds of hardware: input devices and output devices. Surprisingly, the input handling tends to be the more difficult and complicated of the two. Input is multi-source, concurrent, and highly dependent on complex user preferences.

Input via Keyboard

One of the tasks the X server performs is handling typing on keyboards and sending the corresponding key events to the appropriate client applications. In a simple X configuration, one client at a time has the "input focus" and most key events will go to that client. Depending on window manager configuration, focus may be moved to another window by simply moving the mouse to another window, clicking the mouse, using a hotkey, or by manipulating a panel showing available clients. The client with focus is usually highlighted in some way, so that the user can know where their input will go. Clients may use "grabs" (described later in this chapter) to override the default delivery of key events to the focused client.

There is a wide variety of keyboards in the world. This is due to differing language requirements, to differing national standards, and to hardware vendors trying to differentiate their product. This variety makes the mapping of key events from hardware "key codes" into text input a challenging and complex process. The X server reports a simple 8-bit keycode in key press and release events. The server also provides a keyboard mapping from those keycodes to "KeySyms" representing symbolic labels on keys ("A", "Enter", "Shift", etc.). Keycodes have no inherent meaning outside a given session; the same key may generate different code values on different keyboards, servers, configurations, or operating systems. KeySym values are globally assigned constants, and are thus what most applications should be concerned with. The X Keyboard (XKB) extension provides complex configuration and layout handling, as well as additional key handling functionality that was missing in the original protocol. Xlib and toolkits also provide input methods for higher level input functions, such as compose key handling or mapping key sequences to complex characters (for example, Asian language input).
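The keycode-to-KeySym lookup can be pictured with a small sketch. This is a toy model, not a real server's table: the KeySym constants are the globally assigned values, but the keycodes (38, 36, 50) are assumptions chosen for illustration, and the name keycode_to_keysym is hypothetical. The real core-protocol and XKB lookup rules are considerably more involved.

```python
# Illustrative sketch only: a toy keyboard mapping, not a real server's table.
# The KeySym values are the globally assigned constants; the keycodes are
# ASSUMED values, since keycodes differ between servers and keyboards.
XK_a, XK_A = 0x0061, 0x0041
XK_Return = 0xFF0D
XK_Shift_L = 0xFFE1
NoSymbol = 0

# keycode -> list of KeySyms (index 0: unshifted, index 1: shifted)
KEYMAP = {
    38: [XK_a, XK_A],   # assumed keycode for the "A" key
    36: [XK_Return],    # assumed keycode for Enter
    50: [XK_Shift_L],   # assumed keycode for left Shift
}


def keycode_to_keysym(keycode, shift=False):
    """Return the KeySym for a keycode, falling back to the unshifted entry."""
    if not 8 <= keycode <= 255:
        raise ValueError("core protocol keycodes lie in the range 8..255")
    syms = KEYMAP.get(keycode, [])
    index = 1 if shift else 0
    if index < len(syms):
        return syms[index]
    return syms[0] if syms else NoSymbol
```

An application would compare the returned value against a KeySym constant such as XK_Return rather than against the raw keycode, which is why the same program keeps working across different keyboards.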

Input via Mouse

The X protocol defines an input "pointer" (no relation to the programming concept). The pointer is represented on screen by a cursor; it is usually controlled by a mouse or similar input device. Applications can control the cursor image. The core protocol contains simple 2-color cursor image support. The Render extension provides alpha-blended 32-bit color cursor support; this support is normally accessed through libXcursor.

Pointer devices report motion events and button press and release events to clients. The default configuration of the Xorg server has a single pointer. This pointer aggregates motion and button events from all pointer-type devices attached to the server: for example, a laptop's touchpad and external USB mouse. Users can use the MultiPointer X (MPX) functionality in the Xinput extension 2.0 to enable multiple cursors and assign devices to each one. With MPX, each pointer has its own input focus. Each pointer is paired with keyboards that provide input to the client that has the input focus for that pointer.

Input via Touchpad

For basic input, a touchpad appears to clients as just another device for moving the pointer and generating button events. Clients that want to go beyond mouse emulation can use the Xinput extension version 2.2 (shipped with Xorg 1.12) or later to enable support for multitouch event reporting.

Input via Touchscreen

[XXX write me --po8]

Advanced Input Devices and Techniques

[Make whot write this? or steal from http://who-t.blogspot.com? --alanc]

GetImage: Reading From the Display

The X server does not keep track of what it has drawn on the display. Once bits are rendered to the frame buffer, its responsibility for them has ended. If bits need to be re-rendered (for example, because they were temporarily obscured), the X server asks a client, usually either a compositing manager or the application that originally drew them, to draw them again.

In some situations, most notably when taking "screenshots", a client needs to read back the contents of the frame buffer directly. The X protocol provides a GetImage request for this case.

GetImage has a number of drawbacks, and should be avoided unless it is absolutely necessary. GetImage is typically extremely slow, since the hardware and software paths in modern graphics are optimized for the case of outputting pixels at the expense of reading them back. GetImage is also hard to use properly. Here, more than anywhere else in the X protocol, the underlying hardware is exposed to clients. The requested frame buffer contents are presented to the client with the frame buffer's alignment, padding and byte ordering. Generic library code is available in Xlib and XCB to deal with the complexity of translating the received frame buffer into something useful. However, using this code further slows processing.
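To give a feel for the alignment, padding and byte-order handling mentioned above, here is a minimal sketch of two calculations a client may need when interpreting raw image data. The function names are hypothetical and the pixel reader assumes 32 bits per pixel; real code would more likely rely on the Xlib or XCB image helpers.

```python
def bytes_per_line(width, bits_per_pixel, scanline_pad):
    """Bytes in one scanline, rounded up to a multiple of scanline_pad bits."""
    if scanline_pad not in (8, 16, 32):
        raise ValueError("scanline pad is 8, 16, or 32 bits")
    bits = width * bits_per_pixel
    padded_bits = (bits + scanline_pad - 1) // scanline_pad * scanline_pad
    return padded_bits // 8


def pixel_at(data, x, y, stride, msb_first):
    """Read one 32-bit pixel from raw image bytes using the given byte order.

    Assumes 32 bits per pixel; other depths need different unpacking.
    """
    offset = y * stride + x * 4
    order = "big" if msb_first else "little"
    return int.from_bytes(data[offset:offset + 4], order)
```

For example, a 3-pixel-wide image at 24 bits per pixel with a 32-bit scanline pad occupies 12 bytes per row rather than 9, so indexing rows by width alone would read the wrong bytes.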

Output

Rendering / Rasterization

The X protocol originally defined a core set of primitive rendering operations, such as line drawing, polygon filling, and copying of image buffers. These operations did not evolve as graphics hardware, and the operations expected by modern applications, moved on. They are thus now mainly used in legacy applications.

Modern applications use a variety of client side rendering libraries, such as Cairo for rendering 2D images or OpenGL for 3D rendering. These may then push images to the X server for display, or use DRI to bypass the X server and interact directly with local video hardware, taking advantage of GPU acceleration and other hardware features.

Polygon Rendering Model

Displays and Screens

X divides the resources of a machine into Displays and Screens. A Display is typically all the devices connected to a single X server, and displaying a single session for a single user. Systems may have multiple displays, such as multi-seat setups, or even multiple virtual terminals on a system console. Each display has a set of input devices, and one or more Screens associated with it. A screen is a subset of the display across which windows can be displayed or moved - but windows cannot span across multiple screens or move from one screen to another. Input devices can interact with windows on all screens of an X server, such as moving the mouse cursor from one screen to another. Originally each Screen was a single display adaptor with a single monitor attached, but modern technologies have allowed multiple devices to be combined into logical screens or a single device split.

When connecting a client to an X server, you must specify which display to connect to, either via the $DISPLAY environment variable or an application option such as -display or --display. The full DISPLAY syntax is documented in the X(7) man page, but a typical display syntax is hostname:display.screen. The "hostname" may be omitted for local connections, and ".screen" may also be left off to use the default screen, leaving the minimal display specification of :display, such as ":0" for the normal default X server on a machine.
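The typical hostname:display.screen form can be taken apart with a few lines of code. The sketch below is a simplification: parse_display is a hypothetical helper, it treats a missing screen as screen 0, and it ignores the less common forms described in X(7), such as protocol prefixes or bracketed IPv6 addresses.

```python
import re

# hostname (optional) ":" display number, then an optional "." screen number
_DISPLAY_RE = re.compile(
    r"^(?P<host>[^:]*):(?P<display>\d+)(?:\.(?P<screen>\d+))?$"
)


def parse_display(name):
    """Split a display name into (hostname or None, display, screen)."""
    match = _DISPLAY_RE.match(name)
    if match is None:
        raise ValueError(f"not a hostname:display.screen string: {name!r}")
    host = match.group("host") or None       # None means a local connection
    display = int(match.group("display"))
    screen = int(match.group("screen") or 0)  # assume screen 0 when omitted
    return host, display, screen
```

With this helper, ":0" yields a local connection to display 0, screen 0, while "example.org:10.1" names screen 1 of display 10 on a remote host.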

Graphics contexts

A graphics context (GC) is a structure that stores shared state and common values for X drawing operations, to avoid having to resend the same parameters with each request. Clients can allocate additional graphics contexts as necessary: a client sets up a separate GC for each set of values it needs, and then simply specifies the appropriate GC for each operation.

Colors (really?) and Visuals

X is so old that when it was designed most users had monochrome displays, with just black and white pixels to choose from, and even then hardware manufacturers couldn't agree which was 0 and which was 1. Those who spent an extra thousand dollars would have 4 or 8 bit color, allowing pixels to be chosen from a palette of up to 256 colors. But now it's 2012, and anyone without 32 bits of color data per pixel is a luddite. Still, a lot of complexity remains here that someone should explain...

Syncing and Flushing connections

As described in the Communication chapter, the X protocol tries to avoid latency by doing as much asynchronously as possible. This is especially noticed by new programmers who call rendering functions and then wonder why they got no errors but did not see the expected output appear. Since drawing operations do not require waiting for a response from the X server, they are just placed in the client's outgoing request buffer and not sent to the X server until something causes the buffer to be flushed. The buffer will be automatically flushed when filled, but it takes a lot of commands to fill the default 32kb buffer size in Xlib. Xlib and XCB will flush the buffer when a function is called that blocks waiting for a response from the server (though which functions those are differs between the two due to the different design models - see the Xlib and XCB chapter for details). Lastly, clients can specifically call XFlush() in Xlib or xcb_flush() in XCB to send all the queued requests from the buffer to the server. To both flush the buffer and wait for the X server to finish processing all the requests in the buffer, clients can call XSync() in Xlib or xcb_aux_sync() in XCB.
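The buffering behavior described above can be modeled in a few lines. This is a toy illustration, not how Xlib or XCB is implemented: the RequestBuffer class, its tiny 16-byte capacity, and the fake "draw" requests are all assumptions made for the example.

```python
class RequestBuffer:
    """Toy model of a client-side output buffer (not Xlib's implementation)."""

    def __init__(self, capacity, send):
        self.capacity = capacity
        self.send = send            # callable that "writes" bytes to the server
        self.pending = bytearray()

    def queue(self, request):
        """Queue a request that needs no reply; flush first if it will not fit."""
        if len(self.pending) + len(request) > self.capacity:
            self.flush()
        self.pending += request

    def flush(self):
        """Send everything queued so far, in the spirit of XFlush()/xcb_flush()."""
        if self.pending:
            self.send(bytes(self.pending))
            self.pending.clear()


sent = []                                  # stands in for the server connection
buf = RequestBuffer(capacity=16, send=sent.append)
buf.queue(b"draw-1")                       # buffered, nothing sent yet
buf.queue(b"draw-2")                       # still buffered
```

At this point nothing has reached the "server": the two requests leave the client only once the buffer fills or flush() is called, which mirrors why freshly written drawing code can appear to do nothing.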

Window System Objects

A variety of objects are used by X.

Windows

In X, a window is simply a region of the screen into which drawing can occur. Windows are placed in a tree hierarchy, with the root window being a server created window that covers the entire screen surface and which lives for the life of the server. All other windows are children of either the root window or another window. The UI elements that most users think of as windows are just one level of the window hierarchy.

At each level of the hierarchy, windows have a stacking order, controlling which portions of windows can be seen when sibling windows overlap each other. Clients can register for Visibility notifications to get an event whenever a window becomes more or less visible than it previously was. They may use these events as an optimization, drawing only the visible portions of the window.

Clients running in traditional X environments will also receive Expose events when a portion of their window is uncovered and needs to be drawn because the X server does not know what contents were there. When the Composite extension is active, clients will normally not receive Expose events, since Composite puts the contents of each window in a separate, non-overlapped offscreen buffer, and then combines the visible segments of each window onscreen for display. Since clients cannot control whether they will be used in a composited or a legacy environment, they must still be prepared to handle Expose events on windows when they occur.

Pixmaps

A pixmap, like a window, is a region into which drawing can occur. Unlike windows, pixmaps are not part of a hierarchy and are not displayed on screen directly. Pixmap contents may be copied to windows for display, either directly via requests such as CopyArea, or automatically by setting a Window's background to be a given pixmap. Pixmaps may be stored in system memory, video memory on a graphics adaptor, or shared memory accessible by both client and server. A given pixmap may be moved back and forth between system and video memory as needed to maintain a good cache of recently accessed pixmaps in faster access video RAM. Using the MIT-SHM extension to store a pixmap in shared memory may allow the client to push updates faster, by operating directly on the shared memory region instead of having to copy the data through a socket to the server, but it may also prevent the server from moving the pixmap into the cache in video RAM, making copies to a window on the screen slower.

Widgets

Applications need more than windows and pixmaps to provide a user interface - users expect to see menus, buttons, text fields, etc. in their windows. These user interface elements are collectively called widgets in most environments. X does not actually provide any widgets in the core protocol or libraries, only the building blocks such as rendering methods and input events for them to be built with. Toolkits such as Qt and GTK+ provide a common set of widgets for applications to build with, and a rich set of functionality to provide good support for a wide range of uses and users, including those who read different languages or need accessibility technology in order to use your application. Some toolkits have utilized all the infrastructure X provides around window stacking and positioning by making each widget a separate window, but most modern toolkits do this management client side now instead of pushing it to the X server.

XIDs

Many resources managed by the server are assigned a 32-bit identification number, called an XID, from a server-wide namespace. Each client is assigned a range of identifiers when it first connects to the X server. Whenever it sends a request to create a new Window, Pixmap, Cursor or other XID-labeled resource, the client (usually transparently in the Xlib or XCB libraries) picks an unused XID from its range and includes it in the request to the server to identify the object created by this request. This allows further requests operating on the new resource to be sent to the server without having to wait for it to process the creation request and return an identifier assignment. Since the namespace is global to the X server, clients can reference XIDs from other clients in some contexts, such as moving a window belonging to another client.
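Client-side XID selection can be sketched as follows, assuming the range is described by a base value and a bit mask handed to the client at connection time. The XidAllocator class and the base and mask values used with it are illustrative assumptions; Xlib and XCB do this internally and also handle details this sketch omits, such as reusing freed IDs.

```python
class XidAllocator:
    """Toy client-side XID allocator (a sketch, not the Xlib/XCB code)."""

    def __init__(self, resource_id_base, resource_id_mask):
        self.base = resource_id_base
        self.mask = resource_id_mask
        # Lowest set bit of the mask: the step between consecutive IDs.
        self.step = resource_id_mask & -resource_id_mask
        self.next = 0

    def generate_id(self):
        """Return an unused XID from this client's range, with no round trip."""
        if self.next > self.mask:
            raise RuntimeError("client XID range exhausted")
        xid = self.base | self.next
        self.next += self.step
        return xid


# Assumed example values; a real server chooses these per connection.
allocator = XidAllocator(resource_id_base=0x00400000,
                         resource_id_mask=0x001FFFFF)
window_id = allocator.generate_id()
pixmap_id = allocator.generate_id()
```

Because the client picks window_id itself, it can queue a request that creates the window and further requests that use window_id back to back, without waiting for the server to answer the first.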

Atoms

In order to reduce the retransmission of common strings in the X protocol, a simple lookup table mechanism is used. Entries in this table are known as Atoms, and have an integer key that is passed in most protocol operations requiring them, and a text string that can be retrieved as needed. The InternAtom operation finds the Atom id number for a given string, and can optionally add the string to the table and return a new id if it's not already found. The GetAtomName operation returns the string for a given atom id number. Atoms are used in a wide variety of requests and events, but share a single namespace across all operations and clients of a given X server.
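The lookup table behind InternAtom and GetAtomName can be modeled with two dictionaries. The AtomTable class below is a hypothetical, server-side toy: it preloads only three of the predefined atoms and assumes new atoms are numbered sequentially after the predefined range, which a real server is not obliged to do.

```python
class AtomTable:
    """Toy model of a server's atom table (illustrative, not the X server's)."""

    NONE = 0  # returned when the name is absent and must not be created

    def __init__(self):
        # A few of the atoms predefined by the core protocol.
        self._by_name = {"PRIMARY": 1, "STRING": 31, "WM_NAME": 39}
        self._by_id = {atom: name for name, atom in self._by_name.items()}
        self._next_id = 69  # predefined atoms occupy 1..68

    def intern_atom(self, name, only_if_exists=False):
        """Return the atom for name, optionally creating it if missing."""
        atom = self._by_name.get(name)
        if atom is None:
            if only_if_exists:
                return self.NONE
            atom = self._next_id
            self._next_id += 1
            self._by_name[name] = atom
            self._by_id[atom] = name
        return atom

    def get_atom_name(self, atom):
        """Return the string for an atom id; raises KeyError if unknown."""
        return self._by_id[atom]
```

Every client that interns the same string gets the same number back, which is what lets clients exchange short integers instead of repeating the string in each request.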

Properties

A common design pattern in X for providing extensible metadata is the Property mechanism. A property is a key-value pair, where the key is a text string, represented as an X atom, and the value is a typed value, which may also be an atom, an integer, or some other type. The core protocol provides properties on windows and fonts. The Xinput extension adds properties to input devices, while the Xrandr extension adds properties to output devices.

X itself does not assign any meaning or purpose to window properties. However, conventions have been established for many window properties to provide metadata that is useful for window and session management. The initial set of properties is defined in the X Inter-Client Communication Conventions Manual (ICCCM), which may be found at http://www.x.org/releases/current/doc/. This initial set was later extended by groups working on common functionality for modern desktop environments at freedesktop.org, which became the Extended Window Manager Hints (EWMH) specification, found at http://www.freedesktop.org/wiki/Specifications/wm-spec.
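As a rough picture of the property mechanism, the sketch below stores typed values on a window object keyed by name. It is a toy model: the Window and Property classes are hypothetical, atom names are used directly instead of atom ids for readability, and the window id 0x00400000 is an assumed example value.

```python
from dataclasses import dataclass


@dataclass
class Property:
    value_type: str   # atom name of the value's type, e.g. "STRING"
    fmt: int          # bits per element: 8, 16, or 32
    data: object      # bytes for format 8, a tuple of ints otherwise


class Window:
    """Toy window holding properties (a sketch, not a server data structure)."""

    def __init__(self):
        self.properties = {}

    def change_property(self, name, value_type, fmt, data):
        """Set or replace a property, in the spirit of ChangeProperty."""
        if fmt not in (8, 16, 32):
            raise ValueError("property format must be 8, 16, or 32")
        self.properties[name] = Property(value_type, fmt, data)

    def get_property(self, name):
        """Return the stored Property, or None if the window lacks it."""
        return self.properties.get(name)


win = Window()
# WM_NAME is an ICCCM convention: the window title, here as a STRING.
win.change_property("WM_NAME", "STRING", 8, b"xterm")
# WM_TRANSIENT_FOR holds another window's XID (assumed example value).
win.change_property("WM_TRANSIENT_FOR", "WINDOW", 32, (0x00400000,))
```

The server stores these values without interpreting them; it is the window manager that reads WM_NAME and decides to show it in a title bar.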

Grabs

Grabs in X provide locking and reservation capabilities. "Active grabs" take exclusive control of a given resource immediately and lock out all other clients until the grab is released. "Passive grabs" place a reservation on a resource, causing an active grab to be triggered at a later time, when an event occurs, such as a keypress. These can be used, for instance, to have a hotkey that goes to a certain application regardless of which application currently has input focus.

One of the available grabs is the Server Grab. A client that grabs the server locks out all other clients, preventing any other application from being able to update the display or interact with the user until the server grab is released. A server grab should be released as soon as possible: besides annoying users when they can't switch to another program, it may also cause security problems, since the screen lock is just another client and will be locked out with the rest.

The other primary form of grab is on an input device or event. Clients can actively grab the keyboard or mouse to force getting all input from a device, even if the cursor moves outside the application's window. Passive grabs can be placed on specific input events, such as a particular keypress event or mouse button event, causing an active grab to automatically occur for that client when the event happens.
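The interplay between input focus and a passive key grab can be sketched with a small event router. This is a heavily simplified model with hypothetical names (KeyRouter, grab_key): it ignores modifiers, grab windows, and explicitly requested active grabs, and it assumes the triggered grab ends when the grabbed key is released.

```python
class KeyRouter:
    """Toy key event router showing focus versus passive grabs (a sketch)."""

    def __init__(self):
        self.focus = None          # client that currently has input focus
        self.passive_grabs = {}    # key name -> client holding the reservation
        self.active_grab = None    # (client, key) while a grab is in effect

    def grab_key(self, key, client):
        """Register a passive grab: a reservation on one particular key."""
        self.passive_grabs[key] = client

    def key_press(self, key):
        """Return the client that receives this key press."""
        if self.active_grab is None and key in self.passive_grabs:
            # The reserved key was pressed: the passive grab becomes active.
            self.active_grab = (self.passive_grabs[key], key)
        return self.active_grab[0] if self.active_grab else self.focus

    def key_release(self, key):
        """Return the receiving client; end the grab when its key is released."""
        target = self.active_grab[0] if self.active_grab else self.focus
        if self.active_grab and self.active_grab[1] == key:
            self.active_grab = None
        return target


router = KeyRouter()
router.focus = "editor"
router.grab_key("F12", "launcher")   # hotkey reserved by another client
```

With this setup, ordinary keys go to "editor", while pressing F12 sends events to "launcher" until F12 is released, regardless of which client has focus.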

More information can be found in http://who-t.blogspot.com/2010/11/high-level-overview-of-grabs.html.

Selections, Cut-Copy-Paste

[copy-and-paste from http://keithp.com/~keithp/talks/selection.ps and other docs on http://www.x.org/wiki/CutAndPaste ? ]
