Tuesday, August 17, 2010

Information dynamics

Information dynamics is the study of the transformation of information through the process of computation. Information processing happens along a very rich scale:

Social
Desktop
Applications
Frameworks
Libraries
System Calls
Kernel
Drivers
Assembly
Macro-ops
Micro-ops
Gates
Elementary particles

There are a dozen conceptual levels of abstraction between the user at the keyboard and the physical process of computation proceeding inside a modern computer. This set of scales originated through a gradual process of invention, from the earliest electronic computers, which were programmed by manually manipulating switches and wires and were used by a single person at a time, to modern personal computers, which do not require users to write programs and which interact with millions of people through a global network.

This expansion has been enabled by the exponential growth in computational capacity described by Moore's Law, because the addition of each scale of abstraction has a cost in terms of performance. Our computations would proceed more rapidly if we represented them as an optimized set of micro-operations that the microprocessor could process directly, without any intermediaries. However, it would take us tremendously more time and effort to represent our computations in such a manner. Instead, we perform our computations through applications with usable human interfaces that are built on lower scales of software that can be reused for a wide variety of tasks.

The social scale is concerned with the design and specification of such user interfaces, the field of human-computer interaction, while the desktop scale addresses the combination of applications, such as the window manager, the file browser, and a variety of hardware management tools, that interact to form the user interface. Applications, such as web browsers, mail clients, and instant messengers, run within the desktop environment at the application scale, dependent on the frameworks below them.

The framework scale contains both desktop frameworks such as Gnome and web frameworks such as Ruby on Rails. Software frameworks promote a standard structure for applications and manage the flow of control between their parts. Frameworks are often built on top of a wide variety of libraries at the library scale. Libraries are collections of subroutines that perform a common task, such as data compression or image manipulation, required by many applications.
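As a concrete sketch of the library scale, the few Python lines below call the standard zlib library to compress and restore a block of data. The sample data is only for illustration; the point is that the compression algorithm itself is reused through a couple of library calls rather than reimplemented by each application.

    import zlib

    # A library packages a common task (here, compression) behind a few calls.
    original = b"information dynamics " * 100
    compressed = zlib.compress(original)      # library routine, reused by many applications
    restored = zlib.decompress(compressed)    # inverse routine

    assert restored == original
    print(len(original), "bytes compressed to", len(compressed), "bytes")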

Libraries request service from the operating system kernel through system calls. System calls provide the interface between the application and the operating system. They are needed for any task requiring access to the hardware or communication with other processes or machines. The kernel scale is responsible for managing system resources, including the processor, memory, and input/output resources such as disk, graphics, and network, and providing a layer of abstraction between applications and the hardware that actually performs computations.
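To make that boundary concrete, here is a minimal Python sketch that steps around the usual buffered file objects and issues the underlying system calls directly through the os module; the file name is arbitrary and the snippet assumes a POSIX-like system.

    import os

    # os.open, os.write, and os.close are thin wrappers over the corresponding
    # system calls; it is the kernel, not the application, that touches the disk.
    fd = os.open("example.txt", os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, b"requesting service from the kernel\n")
    finally:
        os.close(fd)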

Kernels are written in assembly language, or in a combination of assembly language and another low-level programming language such as C. Assembly languages provide a symbolic representation of the machine code that is used by a particular microprocessor.
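Assembly itself is processor-specific, but a loose analogy is available even from Python: the standard dis module prints a symbolic listing of the bytecode the interpreter executes, much as an assembly listing is a symbolic view of machine code. This is only an analogy; CPython bytecode is not the machine code of any physical processor.

    import dis

    def add(a, b):
        return a + b

    # A symbolic listing of the lower-level instructions behind the source text.
    dis.dis(add)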

Microprocessors are typically programmed in a machine code that is backwards compatible with many older generations of microprocessors. To enable speed improvements while retaining backwards compatibility, the microprocessor performs its arithmetic and logical operations as micro-ops, a lower-level machine code that is generated by translation hardware on the microprocessor from macro-ops, the machine code into which assembly language is translated. This translation uses programmable memory on the processor, allowing the processor's machine language to be modified by the kernel to fix bugs or even add features.

The computation described by a micro-op, such as the addition of two numbers, is performed by a set of logic gates, combining sets of binary numbers and producing a binary number as output. At the bottom of the scale, ones are represented by a high voltage level and zeros by a low voltage level. At this level, computers are analog devices, having to deal with noise and timing issues in determining whether a particular voltage is a low or high level, representing a zero or a one respectively.
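A minimal sketch of that gate-level picture, in Python, builds a one-bit full adder out of AND, OR, and XOR operations and ripples it across a pair of bit lists; this is purely illustrative, since real gates are the analog circuits just described.

    # One-bit full adder: sum = a XOR b XOR carry_in; carry_out = majority(a, b, carry_in).
    def full_adder(a, b, carry_in):
        s = a ^ b ^ carry_in
        carry_out = (a & b) | (a & carry_in) | (b & carry_in)
        return s, carry_out

    # Ripple-carry addition of two equal-length little-endian bit lists.
    def add_bits(x_bits, y_bits):
        carry, result = 0, []
        for a, b in zip(x_bits, y_bits):
            s, carry = full_adder(a, b, carry)
            result.append(s)
        result.append(carry)
        return result

    print(add_bits([1, 1, 0], [1, 0, 1]))   # 3 + 5 = [0, 0, 0, 1], i.e. 8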

The survey of information dynamics above has led us through the zone of discrete artificial computation. As such, its foundation rests on the notion of a universal computer, arguably the simplest model of which is the Turing machine. Universality appears at many scales in information dynamics. The language of the microprocessor is every bit as Turing universal as, many layers up, a scripting language for a 3D animation program would be. Indeed both have access to similar scale-invariant features that have evolved to suit the needs of humans: decision statements, loop statements, behavior encapsulation in terms of procedures, and so on, precisely the features that make them universal.

Still, as we go up the scale we make use of more powerful abstractions. Abstractions in computer science are more dynamic and ad hoc than in mathematics (Colburn and Shute 2007), and are more subtle than the simple collapsing of distinctions into broader equivalence classes. Indeed each lower layer presents a public picture of itself to the higher layers, often through an Application Program Interface (API). These can be very elegant object-oriented interfaces, or very “leaky”, allowing layer-violating access to lower levels.
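A toy Python sketch of such a public picture: the hypothetical class below exposes a small, tidy interface while keeping its representation nominally private, and the last line is the kind of layer-violating access that makes an abstraction leaky. The names are invented for illustration.

    class Counter:
        # Public picture: a counter that can be incremented and read.
        def __init__(self):
            self._ticks = 0           # lower-level detail, nominally hidden

        def increment(self):          # the interface higher layers are meant to use
            self._ticks += 1

        def value(self):
            return self._ticks

    c = Counter()
    c.increment()
    print(c.value())   # going through the API
    c._ticks = 999     # the leak: nothing actually prevents reaching below the API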

There is heterogeneity of design here, and human programmer communities struggle to build complex systems and come up with a variety of solution schemes. What software engineers are trying to manage here is a scale of complexity, gauged for example by the integral numerical scale of lines of code (LOC). From scripts a few tens of lines long, to operating systems a few tens of millions of lines long, this is a great height one must scale. (It is interesting that English can use the verb to scale to denote the act of traversing a scale, as well as to establish a scale.) The deep idea here is that to handle the vast scale of LOC, software engineers have turned to another type of scale, a scale of increasing abstraction. This subverts the LOC gauge by rescaling it: a ten-line script in Python for a game may translate into hundreds of thousands of lines of code that execute when it runs. Programmers have often seen the humor in this (Munroe 2009).
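One crude way to watch this rescaling happen is to count how many modules a single import statement quietly drags in; the particular module chosen below (json) is incidental, and the exact list will vary between Python versions.

    import sys

    before = set(sys.modules)
    import json                      # one line of "our" code
    pulled_in = sorted(set(sys.modules) - before)

    # Each supporting module is code we never wrote and will rarely read.
    print(len(pulled_in), "modules loaded:", pulled_in)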

The field of artificial computation is of course pluralistic. Beyond the traditions sketched above, there is also the tradition coming out of the lambda calculus, and the tradition coming out of nondeterministic and stochastic programming. The Church programming language (Goodman et al. 2008) is a new example of work that combines these traditions.
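The lambda-calculus tradition is easy to gesture at even in Python, since functions are first-class values: the Church numerals below encode numbers as functions, a standard construction (this sketch is illustrative and, beyond the shared namesake, has nothing to do with the Church language itself).

    # Church numerals: the number n is "apply a function n times".
    zero = lambda f: lambda x: x
    succ = lambda n: lambda f: lambda x: f(n(f)(x))
    add  = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))

    def to_int(n):
        return n(lambda k: k + 1)(0)

    two = succ(succ(zero))
    three = succ(two)
    print(to_int(add(two)(three)))   # 5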

Underlying artificial computation is the idea of a text that is both dead (data) and alive (executable). The power and limits of Turing computation arise from the consequences of treating code as text and conversely. The performative nature of code (Austin 1975) gives it its power as it acts on the world. (The simple proof of the unsolvability of the halting problem is only a few lines long, a meditation on the notion of executable text.)
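That few-line meditation can itself be written down as executable-looking text. The Python sketch below is the usual diagonal argument, with halts standing in as a hypothetical oracle that, as the argument shows, cannot actually be implemented.

    def halts(program, argument):
        # Hypothetical oracle: returns True iff program(argument) halts.
        raise NotImplementedError("no such total procedure can exist")

    def paradox(program):
        # Do the opposite of whatever the oracle predicts about self-application.
        if halts(program, program):
            while True:
                pass
        return "halted"

    # Does paradox(paradox) halt?  Either answer contradicts the oracle,
    # so no program computing halts can exist.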

It is the notion of text that gives us the LOC scale; it is the notion of execution that gives us the time scales used in asymptotic complexity analysis. These are complementary. Across program text we have the notion of scope, as in the range of text over which a variable has a given meaning. This is a special case of locality of reference (Denning 2005) along a scale. There are many scales of abstract time complexity, the coarsest of which is (P, NP, non-NP).
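Scope along the text scale can be shown in a few lines of Python; the names are arbitrary, and the point is only that each binding of x has meaning over a bounded range of program text.

    x = "module scope"

    def f():
        x = "function scope"   # this binding is visible only within f's text
        return x

    print(f())   # function scope
    print(x)     # module scope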

Natural computation also has its own set of scales, the most obvious being the spatiotemporal scale. Quantum computing and biological computing are two areas of interest here, and insights there have the promise of shaking up both computer science and information theory (Kirby 1998, 2002). For example, the so-called quantum-to-classical transition, and the very qubit/bit distinction, is a scale issue. The emergence of self-replicating macromolecules, and then the emergence of information processing out of uninterpreted dynamics, both occur across a complexity scale.
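A toy Python sketch of the qubit/bit distinction, treating a qubit as a pair of real amplitudes that yields a classical bit only upon measurement; this deliberately ignores phase and entanglement and is meant only to mark where the quantum-to-classical transition sits on the scale.

    import random

    # A classical bit simply is 0 or 1.  A (toy) qubit is a pair of amplitudes
    # (alpha, beta) with alpha**2 + beta**2 == 1; measurement collapses it.
    def measure(alpha, beta):
        return 0 if random.random() < alpha ** 2 else 1

    # An equal superposition: measurement yields 0 or 1 with probability 1/2 each.
    print(measure(2 ** -0.5, 2 ** -0.5))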

The pervasiveness of scale in computing, taken together with the pervasiveness of computing itself, actually pulls us into a world picture that is quite different from the mechanical world view (Dodig-Crnkovic and Müller 2010). It is this that we will explore in the next section.
