October 2003 archives from
Piotr's R&D blog
The Morning After
Here's a project proposal for a software documentation tool tentatively called "The Morning After". There are two assumptions you have to buy into: first, that well-done UML diagrams provide useful documentation for a system, and second, that typically developers will draw such diagrams only under duress. Still with me?
Adoption Issues
Current UML editors suffer from an adoption problem. Unless their use is mandated, few developers will choose to use these tools to do their design. I believe this is because they are too unwieldy for quick sketching [1] and don't provide sufficient extra value to offset this deficit. Most agile processes recommend the use of pencil and paper (or whiteboards) for drawing informal diagrams, an approach that has met with some success. Unfortunately, the diagrams drawn this way are not always captured, and almost never updated as the code base evolves. Consequently, their use for documentation purposes is dangerous: an incorrect model is worse than no model at all.
There are two typical responses when people realize that things aren't working out, but still want UML diagrams for whatever reason.
- The big-bang approach says that you should do whatever you need to do to get your system working, then document all the code at the end, possibly with the assistance of some reverse-engineering tools. Though popular with procrastinating SEng students, it requires a lot of discipline to get right since documentation is not an inherently interesting activity for developers, and piling it on when they are burnt out at the end of a development cycle is often ineffective. Even with reverse-engineering, there's just too much to look through and correct all at once.
- The in-sync approach attacks the problems with tools, integrating UML diagrams tightly with the code, making sure that they always match (e.g. Together or EclipseUML). This is tricky to do right and requires that all developers be using the same development platform. Furthermore, in my experience, drawing diagrams and writing code require two different mindsets; I don't want to be distracted with maintaining a diagram when I'm trying to solve a tricky low-level problem, or be distracted with generating correct code when trying to work out high-level design issues.
A Happy Medium
Here's a third approach, a scenario that I think hasn't been tried before. The developer sketches on paper and writes code all day, checking it in when it's stable. Overnight, our system analyzes the additions and changes, then reverse-engineers appropriate UML diagrams highlighting any changed areas (a visual diff). The morning after his coding binge, the developer comes in and checks his mail over coffee. The diagrams are waiting in his inbox, embedded as SVG and editable in-place. The developer looks over the diagrams (perhaps comparing them to his sketches), makes some corrections, and puts the revisions back into the system to be used as the base for the next iteration of diagrams.
I think this system would combine many advantages:
- Editing a diagram requires much less effort than creating one from scratch. Furthermore, the diagrams are built iteratively, avoiding the big-bang effect.
- The editing process fits the developer's workflow and does not require him to install and (remember to) run extra software. Diagram reviews can be postponed by leaving them in the inbox, or the message dragged into a standard to-do list. Diagrams can also be easily annotated and forwarded between team members.
- The system provides some extra value to the developer, since reviewing the reverse-engineered diagrams may allow him to spot high-level bugs (e.g. introducing an undesirable dependency between packages, creating a reference cycle, etc.). The diagrams can also be automatically integrated into a project's on-line documentation (such as Javadocs), giving the team models that they can be confident are not stale.
- The application architect can keep track of the system's evolution in real-time, and step in early when reality starts to deviate from his design (possibly by modifying the design).
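To make the "reference cycle" check above a little more concrete, here's a minimal sketch of how the overnight job might flag one. It assumes the reverse-engineering step has already produced a package dependency map; the `deps` structure and the `find_cycle` name are made up for illustration and aren't part of any existing tool.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of package names, or None.

    deps maps each package name to the packages it depends on
    (a hypothetical output of the reverse-engineering step).
    """
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    colour = {}
    stack = []                     # packages on the current DFS path

    def visit(node):
        colour[node] = GREY
        stack.append(node)
        for nxt in deps.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: we closed a loop
                return stack[stack.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        colour[node] = BLACK
        return None

    for pkg in deps:
        if colour.get(pkg, WHITE) == WHITE:
            found = visit(pkg)
            if found:
                return found
    return None
```

For example, `find_cycle({"ui": ["model"], "model": ["storage"], "storage": ["ui"]})` reports the loop through all three packages, which is the kind of thing a diagram review could highlight.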
Getting There
The project involves many challenges:
- Creating an incremental reverse-engineering algorithm that will preserve the users' model and diagram edits as much as possible even though it's only fed coarse-grained difference information rather than fine-grained edits.
- A diagram selection algorithm that will heuristically decide which diagrams are likely to be most useful, since the system can't present more than a handful of diagrams to a developer every morning.
- A visual differencing algorithm to highlight the changes in the diagrams.
- A (simple) UML editor written in SVG, and a two-way converter between UML models and SVG diagrams.
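As a rough illustration of the coarse-grained difference information the visual diff would start from, here's a sketch that compares two nightly snapshots of a model. It assumes each snapshot is reduced to a collection of (source, target) dependency edges; that representation and the `diff_models` name are my own simplifications, and a real model would carry far more detail.

```python
def diff_models(old_edges, new_edges):
    """Classify edges between two model snapshots.

    Each snapshot is an iterable of (source, target) tuples; this is an
    assumed, simplified stand-in for a full UML model.
    """
    old, new = set(old_edges), set(new_edges)
    return {
        "added": sorted(new - old),      # candidates for highlighting
        "removed": sorted(old - new),    # could be drawn greyed out
        "unchanged": sorted(old & new),
    }
```

The "added" and "removed" sets are what a diagram renderer would highlight; preserving the user's manual edits on the "unchanged" part is the hard bit this sketch doesn't touch.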
Footnotes
[1] New user interfaces based on tablets or electronic whiteboards might solve this problem in the future.
Shadow Conferencing
All the way back in 2000, Dan O'Sullivan took a step towards making video conferencing bearable. This could even be overlaid as a watermark on top of other applications, so you wouldn't lose screen real-estate, yet still maintain a visual awareness of other participants. Damn shame it looks like it's going to get patented.
Library catalogs are obsolete
Amazon has folded book content search into their search engine. Library catalogs, already painful to use compared to full-text Google search, have now become completely irrelevant. (Warning: That was a forward-looking statement, based on increasing the content index coverage to a large percentage of published books.) I think this will also reduce the importance of subject ontologies and other explicit metadata, since modern indexing and clustering mechanisms can extract most of that information from the text itself. Extrapolating further, is an explicit Semantic Web a misguided attempt, and should we all rather be working on natural language understanding algorithms?
Addendum: Wired's article about this.
Video Bench at the open house
The Chisel group exhibited the Video Bench at UVic's Engineering open house. I only helped set up and break down (and debug!), but apparently it was a huge hit. A couple of interesting things I was told (Peggy, where's your own blog?):
- Kids were sharing seats, so the app had up to four simultaneous independent users with only two distinct identities. Apparently, it held up fine, even though it was most definitely not designed for this eventuality. Moral: always expect your users to do the unexpected, kids doubly so. Code defensively.
- Kids picked up on the gestures and the "rhythm" required much faster than adults. The gestures, while not overly complex, do have some rather peculiar requirements (e.g. hand spread just so). Moreover, due to the relatively slow data feed, you can't move too fast, but for some gestures to be recognized, you can't move too slowly. This speed fine-tuning was always the major problem when having adults try out the app at CASCON, but apparently kids just pick it up naturally. Moral: include children in any user tests for Diamond Touch apps.
For that matter, perhaps kids should be the target audience for these apps to begin with? I guess the hands-on interaction style appeals more to children anyway. Moreover, the tabletop's physical configuration promotes social and collaboration skills, two areas in which more traditional computer setups have been found lacking by parents and educators. So, Diamond Touch in kindergarten, anyone?
Spacewarps
I'm getting this idea out of the way because I've recently discovered that it's (mostly) been done.
The problem
The scenario: some kind of application that supports colocated collaboration on a tabletop display. The app involves moving around elements that have a preferred orientation relative to the viewer and can be resized, but should not otherwise be deformed. (Think "Video Bench" if you've seen it, or a collaborative photo album editing app.) You immediately run into two visualization problems:
- The users are sitting on different sides of the table, and hence have different preferred orientations for the elements. Some elements are shared between the participants and might need to be reoriented on demand. Some elements are semi-private and will always be facing their owner.
- The tabletop surface is small and the display resolution is low. For editing, the elements must be large enough to show detail despite the low resolution. But if all the elements are large, you'll run out of space or have lots of overlap. So you need to manage element sizes.
A solution
The simplest solution to the problem is to give users direct control over the orientation and size of each element. However, unless these manipulations are actually the goal of the application (e.g. laying out a floorplan), they are pure overhead for the users and will distract them from the primary task. We'd like to make orientation and sizing automatic, while still giving the user some control.
Element positioning is probably the only generic manipulation you can't live without in this kind of app; the user simply must be able to reposition the elements as they desire. The obvious idea then is to use position to control orientation and size of the element as well. For example, if we somehow know where each user is, the elements get bigger as they get closer to the user and always rotate to face the person interacting with the element.
Great idea, but as I found out today it's been done. Read this paper before continuing.
So what's left to do? Here's my take on the limitations of the approach reported above:
- It's specific to a circular tabletop. ("Well, duh!" mutters the audience.) It's cool, but doesn't match up with the reality that most tables and projectors are decidedly rectangular in shape. I'm not convinced the polar coordinate system would be as useful on a rectangular surface.
- It seems to have only two automatic orientation modes: all elements face away from the center, or all elements face the same global direction. This is very limiting. In the first scheme, the orientation of elements "between" users is useless to all participants (i.e. nobody is facing them). In the second scheme, the collaborative aspect is forgotten as one participant's preference effectively takes over the whole table.
- The orientation and sizing mode is global: the same rules apply all over the table. There is no way to create local "zones" with different rules that may be changed independently.
A better solution?
Here's my idea on a superset of the circular design. The tabletop surface is partitioned into spaces. Each space has its own local coordinate system and, for every point in the space, determines the desired orientation and scaling factor for that point. One concrete but simple implementation would be to have a "baseline" in each space, which could be any 1D shape (e.g. a straight line segment, a circle, a point, discontinuous lines, etc.). Given a point of interest, find the shortest vector between the point and the baseline. The vector's direction gives you the desired orientation (like gravity), while the scaling factor is inversely proportional to the vector's length. The circular "central focus" mode is equivalent to putting a basedot in the centre; the "black hole" mode is a circle around the table's circumference; the "magnet" mode is a baseline in front of the desired user.
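Here's a minimal sketch of that lookup for the simplest case, a baseline that is a single straight segment. The function names, the constant `k` and the `max_scale` cap are assumptions of mine (you need some cap, since an inverse-proportional scale blows up right at the baseline); other baseline shapes would only need a different closest-point routine.

```python
import math

def closest_point_on_segment(p, a, b):
    """Return the point on segment a-b nearest to p (all are (x, y) tuples)."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return a  # degenerate baseline: a single dot
    # Project p onto the line, then clamp to the segment's extent.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)

def distortion_at(p, a, b, k=10.0, max_scale=1.0):
    """Return (angle, scale) for point p given the baseline segment a-b.

    angle is the direction (radians) of the shortest vector from p to the
    baseline, i.e. which way "down" points for an element at p.
    k and max_scale are assumed tuning constants, not from the post.
    """
    cx, cy = closest_point_on_segment(p, a, b)
    vx, vy = cx - p[0], cy - p[1]
    dist = math.hypot(vx, vy)
    if dist == 0.0:
        return 0.0, max_scale  # on the baseline: no preferred direction
    angle = math.atan2(vy, vx)
    scale = min(max_scale, k / dist)  # inversely proportional, capped
    return angle, scale
```

With a baseline from (0, 0) to (10, 0), a point straight above it gets an orientation pointing directly back down at the baseline, while a point off to the left gets one angled towards the nearest endpoint, at a scale that depends only on its distance.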
What other interesting properties does this scheme have? For one, each user can have their own space, and arrange its "distortion field" in whatever way suits them. Also, a baseline drawn with a single finger specifies the field for the whole space, a very powerful yet simple interaction. The arbitrary shape of the baseline gives complete freedom in customizing a space, and spaces are no longer specific to a circular display.
Here's a concrete example. Say each user draws a straight baseline at their "bottom" of the display, approximately the width of their body. Then each space would have the following characteristics:
- All elements in front of the user face her properly.
- Elements to the sides are arranged in quarter-circles centred at the baseline's endpoints, facing towards the user at an angle.
- Elements further away (up) from the user or to the sides become smaller with distance.
Control issues
There is a non-obvious problem common to both approaches: since an element is usually larger than a pixel, which point do you use when determining the applicable distortion? I can think of three options, none very inviting:
- Use the centre point of the element for both determining and applying the distortion. This is nice, since the same point will always be used for each element. However, if the user doesn't "grab" the centre of the element when starting to drag it, the element could shrink/rotate out from underneath the user's finger, since the transformation will be about the centre. (For example, I grab a large element near a corner, and move it so its centre is in a "small size" area. The element becomes smaller, and my finger is no longer within the element's boundaries.)
- Use the centre point to determine the distortion, but apply it at the grab point. This is a terrible idea, since applying the transform will shift the element's centre, thus changing the distortion factors, necessitating another distortion, etc., ad infinitum if you're not lucky. The converse (determine at grab point, apply at centre) is equally bad.
- Use the grab point to determine and apply the distortion. This leaves the element under the user's finger at all times, and gives a nice "physical" feeling to the movement. However, if the user drags the element using one corner, drops it, then grabs it by another corner, the applicable distortion changes suddenly and maybe unpredictably. You can also get into situations where two elements are visually located "in the same area", but have very different distortions applied since they were grabbed at different points.
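The difference between the first and last options is easy to see with a bit of arithmetic. The sketch below only handles scaling of an axis-aligned rectangle (no rotation), and the helper names are hypothetical; it just shows where the element ends up relative to the finger.

```python
def scale_about(rect, pivot, factor):
    """Scale an axis-aligned rect (x, y, w, h) about the pivot point (px, py)."""
    x, y, w, h = rect
    px, py = pivot
    # Every corner moves towards (or away from) the pivot by the same factor.
    return (px + (x - px) * factor, py + (y - py) * factor, w * factor, h * factor)

def contains(rect, point):
    """True if the point lies inside (or on the edge of) the rect."""
    x, y, w, h = rect
    return x <= point[0] <= x + w and y <= point[1] <= y + h

element = (0.0, 0.0, 100.0, 100.0)
finger = (90.0, 90.0)            # grabbed near a corner
centre = (50.0, 50.0)

shrunk_about_centre = scale_about(element, centre, 0.5)
shrunk_about_grab = scale_about(element, finger, 0.5)
```

Halving the element about its centre leaves it spanning 25 to 75 on each axis, so the finger at (90, 90) is suddenly outside it; halving it about the grab point leaves it spanning 45 to 95, still under the finger.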
I tend towards the last alternative, but I'm not sure how intuitive it would be. I'd be curious to find out what algorithm is used for the circular table.
A last remaining question is how does the table get partitioned into spaces? How are the borders determined, and what happens when a new space gets created? I don't have a good answer to this yet. I think users should be able to control the "strength" of each space, and this would provide the input to some kind of balancing algorithm that would settle the partition. When a new space is created, other spaces must adjust their borders, but it's unclear how to shift elements around. However, this would allow for some cool effects: imagine that you can drop elements into a trashcan. If you want to recover an element, you open the trashcan through some icon, and it "warps in" a space right over top of the icon with all the deleted elements in it, temporarily pushing away the elements in your space. When you close the trashcan, your space recovers its original shape and things return to normal. I think this could be modeled by using a 2D mesh for each space, kind of like "warp" deformations in typical paint tools.
Looking forward
As a final crazy idea, what if this scheme was applied to windows on a typical desktop? Granted, it might only make sense with a larger screen, and the orientation changes might be undesirable, but it's an interesting thought experiment. Compare and contrast this with the Mac OS X Exposé feature.
So, do you think there's still some interesting work to be done in this area? Are my ideas feasible and worth going for, or am I out to lunch? Or have I missed more literature and "it's all been done" already?
Video Bench
Early this year James Chisan, myself, and a gang of undergrads (Jeff Cockburn, Reid Garner, Azarin Jazayeri and Jesse Wesson) developed an application I whimsically called the Video Bench. Against expectations, the application has taken on a life of its own and we recently demoed it at CASCON, getting a very warm reception from the audience. This entry explains what the application is about, what I learned from presenting it at CASCON, and what the future might hold for the Video Bench.
Genesis
It all started when Peggy obtained a Diamond Touch tabletop from MERL and was looking for some enterprising students to take it through its paces. Having recently seen Minority Report, I came up with the idea of using the table for gesture-driven video editing. To keep the project grounded, I decided to take the lead from old cut & paste film editing techniques, where strips of film were physically cut with a knife and pasted back together with tape. The design was fully sketched within days, and the Video Bench implemented in under 6 weeks.
The Video Bench is a hands-on collaborative video editing application meant for casual users. To operate it, you sit on a special conductive mat in front of the Diamond Touch surface and use your hands to manipulate strips of video. Multiple people can use the app simultaneously without interfering with each other. The operations permitted on video strips are mostly pretty simple: play, pause, fast forward & rewind, cut & paste, copy & trash, zoom, and spread.
This last operation requires a bit of explanation. Since there's clearly not enough space on the tabletop to show every frame of each strip, the frames are collapsed into "cels". At first, each video clip loaded into the Video Bench is represented by a strip with only one cel. By moving back and forth through the video, the user can locate the desired point and cut the strip in two. However, navigating through video in this manner is tiresome (if familiar); we can do better. We allow the user to spread video by grabbing the edges of two cels in a strip and pulling them apart. As the edges get further apart, the space is filled with more cels that further subdivide the video between these two points. This is a kind of timeline zoom, where the time axis is partially projected onto the X axis, and is one of the few really novel ideas in this project.
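Roughly, the spread operation boils down to deciding how many cels fit between the two dividers and which moments of video they should show. Here's a sketch under the assumption that cels have a fixed on-screen width and subdivide the time range evenly; the function and parameter names are illustrative and not taken from the actual Video Bench code.

```python
def spread_cels(t_start, t_end, gap_px, cel_width_px):
    """Return the start time of each cel filling the gap between two dividers.

    t_start, t_end : the video times (seconds) at the two grabbed edges
    gap_px         : current on-screen distance between those edges
    cel_width_px   : assumed fixed width of one cel
    """
    count = max(1, int(gap_px // cel_width_px))   # always show at least one cel
    step = (t_end - t_start) / count
    return [t_start + i * step for i in range(count)]
```

Pulling the edges of a one-minute stretch from 100 to 400 pixels apart (with 100-pixel cels) would go from a single cel at 0 s to four cels at 0, 15, 30 and 45 s.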
For more details on the Video Bench, and copious amounts of screenshots and diagrams, please refer to our group's final report, keeping in mind that a few things have changed since it was written.
The unveiling
Once the report was written and submitted, I thought the project was pretty much over. However, Peggy suggested that we show it off at CASCON. James and I agreed, and we were soon booked into a prime spot of real-estate on the show floor. I'll spare you the tale of troubles we went through to get all the equipment flown across the country and set up in its new location; suffice it to say, we were ready on D-day.
The exhibit was immensely popular with the public; we had a crowd of people around our booth whenever it was staffed. Most of that can be attributed to the inherent flashiness of the application in the middle of a rather cookie-cutter technology showcase (how exciting can you make a computer and monitor look, anyway, no matter what it's displaying?), but some people were genuinely interested in the concept and had some good suggestions. I also managed to surreptitiously observe one person attempting to operate the Video Bench when the booth was unattended, which produced further insights on the user interface.
The most striking impression was that people were taking the Video Bench seriously. We had one person ask when (and how) we were planning to commercialize the prototype, and a number of people expressed interest in using the app. We knew, by design, that this app would only appeal to casual users, who are not willing to learn complex user interfaces. I was surprised, however, when a semi-professional videographer claimed that some clients are put off just looking at a professional video editing tool, even if they don't have to use it. Using the Video Bench to rough out a production would result in a much less threatening environment.
I also got some ideas about how to improve the user interface. The jogging operations are nearly useless: they are too imprecise to achieve frame-perfect positioning, yet too slow to quickly scan through video. Instead, dragging the cursor directly in the strip's top edge for absolute positioning, and having a relative positioning control in the bottom edge would be a better combo. (The relative positioning control would fast forward or rewind at a speed proportional to the finger's distance from some center point.) Another issue was moving strips around: people don't expect to be able to use full-hand gestures, and even once instructed they can be difficult to get right consistently. With the jog gestures out, we could use a single-finger drag in the cel to move strips. Alternatively, the adventurous user discovered that the dividers were "live", and tried to use them to move strips around. He was very confused when he discovered that the strips only moved horizontally (remember that the dividers are only meant for spread/fold). It might be a good idea to also use dividers as movement handles, as this would also allow for natural strip rotation when using two fingers!
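The relative positioning control is simple enough to sketch. The `gain` and `max_speed` values below are placeholders I picked for illustration; the real numbers would have to come from trying it out on the table.

```python
def jog_speed(finger_x, centre_x, gain=0.1, max_speed=8.0):
    """Playback speed (multiples of real time) for the relative control.

    Positive means fast forward, negative means rewind. The speed is
    proportional to the finger's distance from the centre point and is
    clamped to +/- max_speed. gain and max_speed are assumed values.
    """
    speed = gain * (finger_x - centre_x)
    return max(-max_speed, min(max_speed, speed))
```

A finger resting on the centre point pauses the scan, a small offset gives slow, precise movement, and a large offset saturates at the maximum speed.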
Finally, other people mentioned that there's video-related work going on at the NRC and at UofT. (For DT-related work, see the next blog entry on spacewarps.) Another person thought that this kind of collaborative surface would be great for bioinformatics work; I didn't catch the details, but it involved visualizing protein structures and manipulating them collaboratively. Other people were keen on seeing this used for software engineering. When I brought up the fact that it's difficult to create content with fingers, they suggested that participants could have personal tablets for input to the communal surface.
Overall, people's enthusiasm and the many ideas I gathered make me think this project has some life in it yet.
A nebulous future
What's next for the Video Bench? We'll probably move it to the next generation prototype of the DT, and exhibit it a few more times at local venues. The previous section gave a few ideas for user interface refinements, and my next blog entry explores other perspectives. Brian Corrie at NewMIC did some work on a distributed version of the app, and has some other improvements in the pipeline. Ultimately, though, the Video Bench is looking for a new home: this is not my (or James's) primary area of research, and sooner or later (probably sooner) we'll have to focus on our dissertations. If the Video Bench is to survive and prosper, the project needs new blood. Anyone?
CASCON Impressions
I attended CASCON this year, for the first time in three or four years. CASCON is a conference put on and paid for by IBM where academia and industry can schmooze together. Herewith my impressions of this year's event.
CSER meeting
Before CASCON proper, CSER (a better web site is forthcoming) has its semi-annual meeting. The mornings were spent prevaricating about CSER's future, with some debate about various funding models and IP policy. This was not terribly interesting, and while everyone says they find the consortium useful and want to keep it alive, I didn't sense much actual enthusiasm in the room. It's not entirely clear to me what the value proposition is, either: I can meet with other researchers at conferences, in workshops and on-line. Now that CSER has settled on letting each project pursue its own funding, the only things left binding its members together seem to be tradition and the IP policy. I think CSER needs to make its value clear to future researchers: instead of asking graduate students "what do you want from CSER" it should tell them why they should join.
Besides the introspection, a number of groups presented overviews of their research. Many were repeats of previous years, or near-copies of paper presentations from other conferences. The quality varied wildly, with some supervisors seemingly taking advantage of the session as a practice ground for their students' (weak) presentation skills. Two talks stood out in my mind: Peggy did a thoughtful critique of her group's research, which I won't try to relate here, and Jim Cordy exposed the ins and outs of the financial software industry. The main take-away point: minimizing risk has absolute priority in that industry. This means that they still write software in COBOL, they clone code with great abandon and don't try to propagate fixes, and believe that "if it ain't source code (or directly linked to it), it ain't worth my time". The talk challenged many accepted wisdoms of software engineering as taught in classes, and will take me a while to digest. Good stuff.
CASCON keynotes
I attended the Google and autonomic computing keynotes. The Google talk was entertaining, discussing the history of Google and some of its features and architecture. The most interesting part was the description of Google's hardware configuration: tens of thousands of cheap off-the-shelf machines, set up in a heavily redundant network (grid?). These cheap machines can be counted on to fail, with up to 100 biting the dust every day due to overheating, vibration, or what have you. The failures are logged and the machines automatically disconnected from the network, but otherwise Google doesn't care. Every week or so, some attendants amble between the racks replacing failed components (or entire machines). As the boxes come back up, they are automatically reintegrated into the network. The moral of the story is that smart software and brute force can compensate for cheap hardware to your economic advantage.
Contrasting against this approach, the autonomic computing talk painted a picture of self-healing machines that can identify, diagnose and repair faults. This field is very wide, ranging from configuration wizards to accelerometers in laptops that park the hard drive heads before an impact. Right now, there are still more questions than answers. How do you effectively monitor machines or software, if the monitor itself is part of the faulty machine? How do you give high privileges to the automated "doctors" without compromising security? Is it worth your while to invest in this potentially fragile infrastructure instead of going with a Google brute-force approach? As always, the answers are likely to be painted in shades of gray, but this should be a lively research field for the next decade or two.
CASCON papers and workshops
I went to some CASCON paper presentations, but didn't run into anything terribly impressive. I didn't have time to look through the proceedings beforehand, so I picked based on the titles—not the best of ideas. The "best paper" was decent, and Holger gave a good talk about our paper as well. For the rest, I'll just have to read the proceedings.
The afternoon workshops were more interesting. The workshop on aspects had the usual introductory stuff (amusingly, each presenter gave a definition of "concern", all slightly different), but there was more interesting content as well. Adrian Colyer spoke on behalf of the ghostly voice of Harold Ossher about the Concern Manipulation Environment. While the talk was fairly abstract, I look forward to the tool. Adrian also talked about an experiment at IBM U.K. that demonstrated the advantages of aspects over plain object-oriented programming. Adrian and his team took WebSphere and modularized some cross-cutting concerns using AspectJ. To make it more interesting, the rules of engagement said that you had to try plain object-oriented techniques first, and use aspects only if those were judged inadequate. The project was a success, and demonstrated that aspects are a valuable software abstraction and that AspectJ can scale to large systems. While I haven't been following the aspect literature too closely recently, this seems like the kind of well-documented result that the field needs to prove itself; I look forward to a publication.
I also attended half a workshop on using Eclipse in education and research, and a workshop on automated reasoning. Both had some ideas of interest, but nothing really to write about. The field of computational logic remains as confusing as ever.
Technology Showcase
As usual, there was also an exhibit hall where both IBM and university teams demoed the latest and greatest software. I was mostly busy staffing our own booth (the Video Bench, which gets its own blog entry), but had the time to run around the floor once. A number of displays were interesting, but the simplest one caught my attention: Designing UML Diagrams for Technical Documentation (scroll down to find the entry). Simply put, it's a short set of simple rules that guides developers towards creating clear, readable UML diagrams. Based on my experience in teaching software engineering, this ruleset is badly needed; all too many novice developers draw diagrams that are both technically correct and completely incomprehensible. Drawing UML is both a science and an art form, and sadly most engineers just aren't artists.
I dub thee "Piotr's R&D blog"
I've had a personal blog for a while now (which I may soon host here as well), but it's time to start writing about my work. As a warning to others / reminder to myself, here are some of the topics I want to write about:
- Imperative contracts and nested JUnit test classes
- QDox attributes, the current D in my R&D
- Security glyphs, for establishing a visual trusted path
- Covey, P2P encrypted delta-compressed backup
- Spacewarps, non-linear collaborative work surfaces
I've also done work in the area of the Semantic Web, constraint databases, software engineering education and artificial intelligence, so you'll probably hear about those as well at some point — and whatever else happens to catch my attention.