|
Bloat War - Part 2

The weapon for defeating bloatware is to share a programming task across several different types of language. Compiled, interpreted and graphic languages, when combined, provide a well-tested and stable programming environment in which each type of language complements the others. The example below shows that, simply by using different language families in combination, it is possible to produce compact and readable code that uses few system resources - code that is both flexible and powerful.

The Case of Punctuated Batch Processing - An Example

The Problem: Unfortunately the files have little dependable implied markup (sequences and patterns). All the files have needless returns, yet some of these line-endings are important and must be preserved (e.g. plays and poems). Even fairly straightforward data such as the title, publication details and authorship are not reliably placed in the files.

The Aim: Efficient application use requires a flexible GUI front-end for both tools and batch processes. It is a task that will frequently require interruptions over a considerable amount of time. Stray and alien texts

Task automation is thus foreseeable even if the nature of this automation is not. It may also be the case, once a few hundred files have been processed, that hidden patterns will be recognized, in which case the program needs to be easily modified on the fly! Ideally, processing the text files and writing the application should go hand in hand, especially as the operator and the author are one and the same person. Worse, the whole project must be completed within limited spare time.

The Combined Language Approach: However, the reader should bear in mind that writing the same program in pure REXX would result in a great many enormous and awkward batch-processing scripts - it is simply not a practical option under the circumstances outlined.
Considering the GUI requirements and the operating system demands, even a compiled C version would be large and complex. A compiled version would be easier to use, but the programming would be very time-consuming. And very little actual file processing could be done until a substantial

In both cases the program would be bloatware.

In this scenario, the compiled language will be used in the form of a free function library, RexxIO.dll (available from http://www.lestec.com.au). The REXX being used is bog-standard Regina. The graphic language is Modular And Integrated Design (MAID, available from the address above).

RexxIO.dll needs further explanation. It is 215 kilobytes in size and contains nearly a hundred operating system commands, file manipulation functions and general REXX functions. Written in C, the library is extremely generalised and naturally fast - many functions output both to REXX stem variables and to files.

The Example's Solution: Because the user needs to open the top part of the file in order to write an XML header, an MLE (Multi-Line Entryfield) is added to the dialog (the MLE keeps track of all selections, number of lines and characters etc., via a series of stem variables based on its name - this becomes important later on). Once a file is selected from the listbox, a short script is needed to load the top of the text file into the MLE. Because MAID takes care of messaging, only a few lines of script need to be added to the listbox-selection-event.

Having now got the first fifty lines or so of the text into the MLE, the user needs to get the publication details and the descriptions which will be used in the XML markup. The simplest method to achieve this uses a REXX function that lifts whatever is selected in the MLE. Thus in the relevant Entryfields, such as AUTHOR and TITLE, the string between the cursor positions of the MLE is placed in the Entryfield when it is clicked.
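
The logic of these two steps, loading the top of a file into an edit buffer and lifting the selected span into a named field, can be sketched in a few lines. What follows is a Python illustration of the idea only, not the REXX or MAID script used in the application; the function names load_top and lift_selection, the fifty-line default, the whitespace tidying and the sample text are all assumptions made for the sake of the example.

```python
from itertools import islice


def load_top(path, max_lines=50):
    """Return the first max_lines lines of a text file as one string.

    Stands in for the script that fills the MLE when a file is chosen
    (hypothetical helper, not a RexxIO.dll or MAID function).
    """
    with open(path, encoding="utf-8", errors="replace") as handle:
        return "".join(islice(handle, max_lines))


def lift_selection(buffer_text, start, end):
    """Return the text between two cursor offsets in the buffer.

    Needless returns inside the selection are collapsed to single
    spaces (an assumption) so the value fits a one-line entry field.
    """
    if start > end:
        start, end = end, start
    return " ".join(buffer_text[start:end].split())


# Hypothetical sample standing in for the top of an unmarked text file.
sample = "A STUDY IN SCARLET\n\nby Arthur Conan\nDoyle\n\nPART I\n"

# Simulate the operator selecting the author's name across a line break.
start = sample.index("Arthur")
end = sample.index("Doyle") + len("Doyle")

fields = {"AUTHOR": lift_selection(sample, start, end)}
```

In the sketch the selection is simulated with string offsets; in the application itself the offsets would come from the stem variables that the MLE maintains.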
Another Entryfield grabs the position of the cursor itself to indicate the point above which text should be deleted from the file - a single function call! Naturally, the OK button contains the script that deletes the top of the file and inserts the variables that will become the new XML header.

There is of course much more to the application: for instance, GREP-like find-and-replace functions that process the files in batch mode, drop-down lists which allow various tags to be given values, and fail-safe copies of the files that are made and periodically destroyed. The application is

Excluding the 215-kilobyte RexxIO.dll, all the scripts (REXX and MAID together) come to less than 33 kilobytes, and that includes nine GUI dialogs. Up to this point, substantially less than eight hours have been spent writing the application, and already within that time most of the collected works of Sir Arthur Conan Doyle (4.84 megabytes) have had preliminary markup.

By any measure 33 kilobytes is not a lot. Yet even this does not reflect how much script has actually been written, as a good portion of it has been automatically generated by MAID in order to create the GUIs.

In future columns we will explore in more detail how this space saving is achieved, why readability increases, and the mystery of script shrinkage - or why the scripts do more while they become smaller.

Greg Schofield, schofield@taunet.net.au, the Darwin correspondent for the RexxLA Newsletter |