The Charlotte Web Browser, part 2

The first part of this 2-part article was published in the October 1998 issue of the RexxLA Newsletter. Carl Forde, Jonathan Scott, and Perry Ruiter answer questions posed by Scott Ophof on the making of Charlotte, the text-mode web browser for the VM world.

SCOTT: Carl, since you now work at Beyond Software, how about some information on their web browser (and how it compares with Charlotte)?

CARL: Beyond has a web browser called EnterpriseView ("EView") that is based on Charlotte Version 1. But I wouldn't want this to become an "advertorial" for Beyond.

SCOTT: Jonathan, how did you get involved with Charlotte?

JONATHAN: Charlotte version 1.2 was being used by some enthusiasts within IBM, and we decided to help enhance it to provide effective web access suitable for all CMS users. Some of us worked initially with Carl Forde and then later with Perry Ruiter, implementing changes to support our requirements and then passing the changes back in CMS UPDATE format. The first part of this work (including SOCKS support) was led by Tony Benedetti, then I took over in mid-1996. We also had some useful contributions from James Johnson at Central Missouri State University (the web map support, and the initial support for tables). Charlotte/2.1.0 was eventually made generally available in May 1997. About two thirds of the code in version 2 was new or totally rewritten since version 1, and there were many smaller changes too.

SCOTT: Charlotte was originally written in REXX; why?

CARL: The original Torun Package which Charlotte is based on was written in REXX. At the time I started working on it, I saw no reason to change that.

JONATHAN: Charlotte is mostly written in REXX because it's so easy to write and debug, and because it is the easiest language in which to interface to CMS Pipelines, another of the wonders of the modern world!
It was probably originally written in REXX because it was based on previous similar REXX programs such as gopher clients.

PERRY: Jonathan is right on the money here. REXX interfaces so well with Pipelines. That is coupled with the fact that Charlotte was written not by a professional software house but by working VM systems programmers, as a tool for themselves and their users. When VM folk write tools it's almost always done with REXX and Pipes. Looking back at the original Torun code, it was very clear that Rick Troth's VM gopher client (aka Rice Gopher) was the starting point. It's interesting to note that virtually all of the Rice client's screen handling code was lifted from a Pipeline debugger called PipeDemo, written by Chuck Boeheim of SLAC (the Stanford Linear Accelerator Center), an organization well known to REXX fans. So you can see that Charlotte has a checkered pedigree.

SCOTT: Some parts were changed from REXX to another language; can you tell us which parts, and why?

JONATHAN: The HTML parsing and formatting logic necessarily contained a lot of intensive loops, and was not performing well in REXX, even when compiled. It was also becoming messy to maintain because performance was being given priority over good program structure, and there were many cases in which input was not validated properly because it would have impacted performance too much.

SCOTT: Sounds almost like "had we known we were going to rewrite it to a different language, then ...".

CARL: Well, when Maciej wrote the original implementation he had no idea what would become of it. When I rewrote it into Charlotte 1.0 I had no idea that it would become as popular as it has. The thought never crossed my mind that (portions of) Charlotte would be ported to another language. I could do everything I needed to do and performance was far superior to the previous attempts.
The performance bottleneck, of course, was the parsing and formatting, which is the portion of the code that Jonathan rewrote into C.

SCOTT: Carl, could you comment on the value of REXX as a prototyping language?

CARL: The characterization of REXX that I like is: REXX is such a good prototyping language that when you're done with the prototype you put it into production.

SCOTT: And you, Jonathan? How did you go about rewriting REXX code in another language? How useful was REXX to you in the sense of prototyping? Did the rewrite give you any migration problems?

JONATHAN: After some experiments with various languages in August 1996, I wrote a new parsing and formatting program using C/370 plus CMS Pipelines. The C code had easy access to REXX variables through CMS Pipelines, so it was easy to integrate it with the existing REXX code.

The new parser was initially written as a "batch" parser, taking an HTML file as input and producing a monospaced text file as output, with no reference at all to Charlotte. (I originally started it out of curiosity to see what sort of algorithms were needed to parse and format HTML properly, especially tables.) When I integrated it into Charlotte, I looked at the calling routine to determine what variables it expected to be set, but I did not look at the original REXX parser at all. The sense in which the original would have been a "prototype" is therefore rather limited in this particular case. It was also a rather long-lived "prototype", having survived the whole of version 1 and two beta releases of version 2.

The new program was written from scratch based on the new HTML 3.2 specification, which meant that the new program was not only much faster than the REXX program but had a lot more function.
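
[Editor's illustration] To make the idea of a "batch" parser concrete (HTML in, monospaced text out, with no reference to any browser), here is a minimal sketch in Python. It is not Jonathan's C code and does not claim to reflect his algorithm. It assumes Python's standard `html.parser` and `textwrap` modules, and the names `TextFormatter`, `flush_paragraph`, and `html_to_text` are hypothetical, invented for this example. It handles only paragraph-level breaks and image alternate text; tables, which Jonathan singles out as the hard part, are not attempted.

```python
from html.parser import HTMLParser
import textwrap


class TextFormatter(HTMLParser):
    """Collect text from HTML, starting a new paragraph at block-level tags."""

    # Assumption: a small, illustrative set of tags that force a break.
    BLOCK_TAGS = {"p", "br", "div", "h1", "h2", "h3", "li", "tr"}

    def __init__(self):
        super().__init__()
        self.paragraphs = []  # finished paragraphs, whitespace normalized
        self.current = []     # raw text pieces of the paragraph in progress

    def flush_paragraph(self):
        # Join the pieces, collapse runs of whitespace, keep non-empty text.
        text = " ".join("".join(self.current).split())
        if text:
            self.paragraphs.append(text)
        self.current = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.flush_paragraph()
        if tag == "img":
            # A text-mode browser can only show the alternate text, if any.
            alt = dict(attrs).get("alt")
            if alt:
                self.current.append(" [" + alt + "] ")

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.flush_paragraph()

    def handle_data(self, data):
        self.current.append(data)


def html_to_text(html, width=80):
    """Return the text of `html`, wrapped to `width` columns."""
    parser = TextFormatter()
    parser.feed(html)
    parser.close()
    parser.flush_paragraph()
    return "\n\n".join(textwrap.fill(p, width=width) for p in parser.paragraphs)


if __name__ == "__main__":
    print(html_to_text("<h1>Charlotte</h1><p>A <b>text-mode</b> browser.</p>"))
```

Even this toy version shows why the work is loop-heavy: every character of input passes through tag recognition, whitespace handling, and line filling, which is the kind of per-character processing described above as performing poorly in REXX.
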
The program originally required the C/370 run-time libraries, which caused us some problems in early beta testing, but with help from the Master Plumber, John Hartmann, I managed to convert it to use the Systems Programming C (SPC) environment, so that it looks just like Assembler as far as Pipelines is concerned.

SCOTT: It is said that one can learn from history, and from the mistakes made by others. Could you give an example or so of "why this routine in REXX, why that one in Pipelines", etc.?

JONATHAN: I can't think of any simple examples. Most of the changes were to support new function. We did try to use REXX built-in functions for scanning rather than loops, but this tended to mean that large strings were being manipulated instead of single characters, which often made things worse. Most of the performance changes were superseded when the parser was replaced with the C version. REXX coding for performance is a topic in itself; the main principle that I used was to restructure the code to minimize the logic executed on the main paths. In particular, it often helped performance to replace:

SCOTT: Now for a loaded question. Aside from using REXX, which features make Charlotte a better browser than others?

JONATHAN: I'm not familiar with any other current browser on CMS, so I can't make direct comparisons. Charlotte/2.1.0 is probably very much faster than any pure REXX browser, but because most of it is still in REXX we still have the flexibility to add new function quite easily when required. Here are some other strong points:

SCOTT: And which points are not so good?

JONATHAN: I'm not aware of much in the way of weaknesses, as we tried to eliminate them as far as possible! The majority of limitations encountered when using Charlotte are caused by web page authors increasingly failing to cater for text-only browsers. (Even then, Charlotte can usually get by.) Just a couple of things come to mind:

SCOTT: And what would you say, Perry?

PERRY: Jonathan hit on the bad points. Charlotte (and REXX deserves some of the blame) is memory hungry. That, and page authors who never considered anything but their own browser (e.g., a page with nothing but images on it, and no alternate text for the images).

SCOTT: How easy is it to use Charlotte?

PERRY: Charlotte is fast, easy and intuitive to use. If the pages you're interested in contain primarily textual information it can't be beat! As a sample end user (and recent Charlotte convert) let me present my wife. She was becoming increasingly frustrated with Netscape, so I suggested she

JONATHAN: It's very easy to use Charlotte. I think it's probably easier than using XEDIT.

SCOTT: Why Pipelines Fullscreen instead of XEDIT for the user interface?

JONATHAN: The structure of Charlotte is built around using Pipelines to allow overlapping activities to proceed in parallel, and the use of the Pipelines Fullscreen function fits in well with that design, as well as giving more direct control of the screen layout than we could achieve with XEDIT.

CARL: For example, while waiting for more data to arrive, format what has already arrived. The idea is to "keep the data moving", and here you really get the benefit of Pipelines.

PERRY: Remember I mentioned PipeDemo, Chuck Boeheim's Pipeline debugger? Well, no plumber writing a Pipes debugger would ever consider using anything but Pipes for the screen I/O ;-).

SCOTT: <grin> Say I want to write a text-mode browser for myself to run in "good old DOS", using Charlotte as an example. But my knowledge of C is zero, so I want to write it in REXX. Any advice?

JONATHAN: As someone who has written programs to run under PC/DOS (in macro assembler and C) I can safely say that the ONLY way that Charlotte is of any help is in showing that text-mode browsers are feasible.

SCOTT: Oh. In other words, "leave it to the experts"... I'd better wrap it up. Carl, anything on the general topic of VM's role on the Web?

CARL: I strongly believe that VM has a very important role to play on the Web. A VM Web server can do anything that a Unix Web server can do and more. It is also better positioned, in that it is "closer to" the corporate data that the company wants to make available on the Web. In many cases, everyone in the company has a VM id. It is trivially easy for them to put up personal Web pages should they choose to do so. CGIs are far easier to write using Rexx and Pipelines than C or Perl.

The corporate world already has a huge investment in mainframe applications. The hard work, database design, application structure and integration into the business have already been done. In many cases what is required is to make these applications available to people outside the company or on the LAN. The most cost effective way to do that is to simply use the data and application knowledge that exists to write CGIs to generate the HTML that "recreates" the application on the Web. The best place for that Web server is on the mainframe close to the data and where large parts of the application can run unchanged.

VM is ideally positioned to take advantage of this booming market. What better corporate Web server than the one that has the data? VM's qualities of stability, reliability and recovery are valuable assets that make it even more desirable as a network server.

SCOTT: Thank you Carl, Perry, and Jonathan, for participating in this discussion. The virtual beer was *excellent*!

Anyone interested in Charlotte can find it being discussed on the WWW-VM mailing list. To join that list, just send your SUBSCRIBE command to:

This generic list server will forward it to the appropriate list server for processing.

The authors of this article are:

Disclaimer: The authors of this article do not speak officially for their respective (ex-)employers in any capacity.