soma

latest entries from
Piotr's R&D blog

The half-life of UML diagrams

Tuesday, June 28, 2005, 07:22PM - category Software Engineering -

I recently rescued a very insightful comment by Adrian Buehlmann to my original Morning After entry from the (generally effective, but in this case dreadfully inaccurate) comment spam filter. Adrian takes issue with Reef's fundamental principle of keeping diagrams in sync with code, and argues that Cadifra (the UML editor he's working on) is the right way to go. I think that Reef and Cadifra are complementary, and hope to convince the gentle reader that Adrian and I are mostly in violent agreement.

Let me first pull out some points that Adrian makes that I strongly agree with and that drive Reef's design.

  1. The code is the design.

    This is the point that Jack Reeves made back in 1992, and it's just as valid 13 years later.

  2. Diagram editors must be very fast and easy to use.

    This is critical, yet most of today's "model-based" diagram editors fail the test, as you are forced to edit an abstract underlying model "through" the diagram. These tools attempt to enforce strict consistency rules, and require that you cross your T's and dot your I's, severely impeding the workflow. Strictness has its place if the model must be interpreted by a machine, but...

  3. Don't generate code from diagrams.

    ...the only reason to have machines interpret diagrams is to make them executable (or transform them into executable forms), and there's no good reason to do that. As Adrian says, if the code is too "dirty" to be the design, clean up your code (or improve the programming language!), don't try to escape the situation by making a higher-level model your primary artifact.

So where do we disagree? Adrian implicitly assumes that UML diagrams should only be used during development, as transient artifacts that augment a developer's working memory and help face-to-face communication. Once the ideas discovered through the diagrams are fixed in code, the diagrams are no longer useful. Given this assumption, synchronizing code with diagrams would be foolish, and all you want is computer-assisted diagram sketching, which Cadifra does very well (check it out!).

I believe, however, that UML diagrams are also useful to convey design information across time and space -- that is, for what is traditionally called documentation. Code by itself is not up to the task, since certain meaningful and important abstractions get lost or scattered beyond recognition when embedded in (current) programming languages. Furthermore, since this documentation will be used by people potentially far removed from the context of the initial development, it is critical that it be consistent with the underlying codebase. (That is, it should augment the design represented by the code and never contradict it.)

One could certainly argue that UML diagrams at this level are not useful documentation, and that the code (with comments) is sufficient. I think there are many indications that this is not true, but rather than argue about it I'm setting up an experiment to see whether some realistic maintenance tasks can be performed better given access to appropriate UML diagrams on top of the code. I'll report on the experiment's results here once it happens -- for now, we're still struggling with the ethics approval process.

In summary, I think Cadifra and Reef fill complementary niches in the UML tool pantheon. Cadifra is great for design sketching and discussion support, while Reef will prove invaluable for maintaining UML documentation for the long run. I think there's even potential for some synergy between the tools, but I'll address that some other day...

State of the Practice II @ ICSE

Friday, May 20, 2005, 02:25PM - category Software Engineering -

The Evolution of Development at Microsoft

This talk flowed so well, I didn't want to be distracted with taking down details during the presentation. The basic premise was that Microsoft used to hack things up quickly because that's what the environment (other hackers) accepted; they just weren't bothered enough about bugs to make it worthwhile to invest in QA. The landscape has changed, and Microsoft now concentrates on quality a lot more. They are switching to agile practices, but each team (of 400+ total) chooses their own process. They do have common quality gates (e.g. testing coverage, coding standards, published specs, etc.) that all the groups have to pass through. The speaker encouraged people to "just do what makes sense": you need some rules to avoid chaos, but not so many as to stifle creativity and motivation. Mostly, it comes down to hiring good, ambitious people, and having them compete against each other.

Software Architecture in an Open Source World

Roy Fielding gave this talk, but while I admire the guy's work and his no-nonsense (if a little undiplomatic) attitude on mailing lists, the presentation was pretty bad. He spent most of his time talking about seemingly meaningless details of the various software systems he's been involved with. It's entirely possible that those details were meant to be examples of architectural principles, but if so he hid the relationship really well. Yawn.

Model Driven Architecture talk @ ICSE

Friday, May 20, 2005, 10:46AM - category Software Engineering -

What a contrast to Erich's talk... The speaker is apparently a marketing guy serving a warmed-over business presentation with generous servings of rhetoric and argument by assertion ("you know it's true!"). His argument boils down to: naughty developers don't follow the architectural models handed down from on-high, so we must automate the transformation from model to code (...and fire the programmers?). To be able to automate anything, machines must be able to understand the models, so clearly purely visual diagrams that only communicate to humans are inadequate. (This reminds me of the Web vs. Semantic Web debate: the web is only human-readable, while the semantic web makes all that information computer-understandable. Now, which one is wildly successful and which one can't seem to take off even with Tim Berners-Lee's personal backing? One guess only!)

The speaker has finally gotten around to talking about MDA proper. It's the usual PIM to PSM to code flow chart, but at least he freely admits that the PSM will need lots of hand-tweaking to actually work. But then he goes right back and claims platform-independence, longevity, filling in pattern templates, etc. Blah. Interestingly, a guy from Motorola stood up and claimed that they're doing MDA successfully, precisely as presented. Gotta look into this to figure out what kinds of applications it's appropriate for.

Moving from plan-driven to agile @ ICSE

Friday, May 20, 2005, 10:28AM - category Software Engineering -

This speaker promises to explain how his organization migrated from a plan-driven (waterfall) development process to an agile one. He starts out by defining plan-driven and agile processes, including a good analogy: plan-driven is like building a bridge, agile is like exploring where and how to cross a river. The difference can also be characterized by the tradeoff between YAGNI and DOGBITE: You Ain't Gonna Need It vs. Do it Or you'll Get Bitten In The End.

There follow lots of details about how to convince stakeholders to accept an agile process, and how to set up the process to have the best chance of success. I won't bother scribing all this stuff; if the slides aren't posted eventually on the ICSE site, you can probably get most of the same information from books.

Lessons learned:

  1. The strongest motivator by far to switch to agile development is failure in previous projects.
  2. Expect resistance from Software Engineering Process Groups (in large organizations). (A solution: rotate staffing of SEPG.)
  3. The traditional procurement approach -- write requirements specs, invite tenders, select cheapest offer -- does not work for agile development. (Anecdote: WTO contracts prohibit requirement writers from bidding on the project, and limit contact with bidders!)
  4. Completely discarding change request processes would be like throwing out the baby with the bath water. (Suggestion: keep a lightweight feature request process to avoid duplication and repetition.)

Erich Gamma's keynote @ ICSE

Friday, May 20, 2005, 08:23AM - category Software Engineering -

Erich will be talking about agile, open source, distributed and on-time (and buzzword-compliant, apparently): inside the Eclipse development process. Unfortunately, his accent is difficult to understand over the microphone, and the font on his slides is barely readable [1]. Oh well.

In Eclipse development, the emphasis is on people over process, and on shipping software over other activities, producing a culture of "If you ship, then you may speak". They ship regularly; each milestone is a miniature development cycle, so they get feedback early and often. Within a milestone, continuous integration is important (nightly, weekly and milestone builds), and of course they eat their own dog food daily. They also encourage and depend on community involvement, to avoid developing in a vacuum. This requires extra work and support on their part: for example, they weren't getting much feedback on milestone builds, because people didn't know what was new in each one, so they started publishing "new and noteworthy" lists and now get much better results.

Unsurprisingly, they also emphasize testing; the goal is to reduce the time between a bug appearing and being detected. They also have automated performance and resource (?) tests. In their release endgame, they alternate testing and fixing. They do not separate testing and fixing responsibilities: everybody does everything. After the release, they run a decompression cycle and retrospective. The timing is: 9 months of milestones, 1-2 months of endgame, 1 month of decompression.

The plan is prepared early, but remains alive until the day of the release. Risks are mitigated by putting high-risk stuff up front, and dropping things rather than pushing out the schedule. Everybody shares ownership of the code to ease resolving inter-component issues. Teams are put together dynamically to handle tough integration issues.

Eclipse is built to last, with an analogy to actual buildings. A building has multiple layers: the geographical location (permanent), structure (30-300 years), services (plumbing, electricity, 7-15 years) and stuff (furniture, replaceable easily). The various layers change at different rates, so the building's design must tolerate shear between the layers. This is painful but necessary if you don't want the building to become obsolete. In Eclipse, the structure is the plugin layer, the services (plumbing) are the APIs, and the stuff is the UI. APIs matter, and are kept minimal and backwards compatible as much as possible. APIs are designed at the same time as the implementation and the client, to make sure they are practical, and serve to isolate components to maintain development velocity within each component (standard modularity argument).

Things don't always work out. Lower-layer ripples sometimes have a big effect on upper layers; as a result, they have a rule that a lower-layer change is not done until the ripple has (successfully) propagated all the way up. Dynamic teams are not always successful, and require lots of leadership and management support. Finally, they do drop features -- perhaps this could be seen as a gambit to avoid even worse things happening?

Footnotes

[1] Note to self: self, don't use fonts with thin lines, as they'll get washed out on the projector.

P.S. Why is it that today I see people coming in with breakfast goodies at 9:45am, but on Wednesday, when I was 5 minutes late, the breakfast had been taken away by 9am sharp? There is no justice in this world.
