The Elephant and the Blind Men

Thanks V for emailing me another link to an off-the-mark Documentum blog at 01:30 and keeping me up all night.

“Blind monks examining an elephant” by Itcho Hanabusa.

Blogs about Documentum are popping up all over the place. Some suffer from a myopia based on limited experience with a complex product. Virtual documents are not categorically evil. Documentum back-ending SAP isn’t a good example of anything other than marketing-driven hackery. A Java programmer can’t “learn Documentum” in 21 days.

Documentum is big, really big, Carl Sagan billions-and-billions big. There are several dozen different software components on top of a hybrid object database and content storage system. That object database comes with an incredibly rich schema that has evolved over many years. It carries plenty of junk DNA. Early versions included minimal clients; customization was essential for even basic functionality. In fact Webtop, the client du jour, was a reference implementation that borrowed some very corrupt interaction design DNA from its predecessor, Rightsite. Documentum is getting old and can’t reinvent itself at the core because of all those bloated docbases holding the company jewels.

One point bears repeating. The Documentum server is a much more complex information system than a database server like Oracle because it’s not tabula rasa out of the box. It comes with a vast collection of object types and tables that relate in obscure ways on top of its database server capabilities. Its genome bears the DNA of plenty of evolutionary dead ends like the unix-based security system and the original compound document–a version 1 precursor to the virtual document, not a virtual document mimicking another structure like XML or OLE linking.

The very idea of finding content on the internet about Documentum is still hard for me to comprehend. In its first decade, the company jealously guarded information about its product and actively discouraged third parties from writing books. Now V sometimes tortures me by sending links to the worst posts. Maybe the silver lining in Documentum’s third party blackout was me getting to bed earlier. But I digress.

I have the same problem talking to new developers all the time. They describe Documentum in the language of their technological comfort zone: A database guy will talk to me about tables and queries. A web guy will talk to me about application servers and the web content management lifecycle. A Java programmer speaks in DFC with little understanding of what lies beneath.

It doesn’t help when their words have one meaning in their comfort zone and another in Documentum. That even happens across Documentum versions or components. It got so bad that Documentum started including a glossary in the documentation that maps one version’s po-TAY-toe to another’s po-TAT-oh. Honestly, I’m at the point where I pick the name I like and stick with it especially when a real stinker is in vogue like DocPage Server. Bleh!

It’s the Documentum retelling of the Story of the Blind Men and an Elephant. People with a few years of experience use a few tools, not the entire toolbox, to solve a few very particular problems. What makes this worse is clients aren’t trying to solve interesting or novel problems anymore. One reason is the product’s popular enough that clients have preconceived notions about what Documentum can do. Another is EMC (and Documentum before it) trying to open up new markets. Anybody else remember the fiasco when Documentum decided it was a web content management system and snubbed pharma and aerospace? Not good for anybody.

Contractors fare a little better than full-time employees because they see more projects and get faster access to new versions by hopping to new clients who are Documentum first-timers. Even contractors can get bogged down though, getting typecast by technology or industry and only working those kinds of projects or clients. Maybe the nature of the business works against knowing the entire product unless you make it a priority.

It’s also possible that the whole thing is just too big to grok. I certainly don’t know the entire suite of the top two versions in spite of concentrating on Documentum for 14 years now. So how’s a reputable developer or blogger supposed to see the entire elephant? Step back and look at the whole before focusing on any one component.

Ok, I’m biased. I’m a server programmer turned architect so my approach will probably drive Java purists crazy. They might argue that if the DFC is a complete abstraction of the system, it would provide a conceptually integral view of structure and behavior while concealing the hackery and junk DNA at the lowest levels. Studying the guts would be more confusing and likely to lead newbies down dead ends.

The problem is that the DFC (at least up to 5.3) doesn’t completely cover the system and is still poorly documented–especially given how long it’s been in use. I still keep the Object and API Reference Guides close at hand when doing DFC for when the javadoc map is blank and the code does something unexpected–and it always does. Our web-enabled cut-and-paste programming culture probably isn’t helping. And yes, I do it too. Damned internet!

Although I’m warming to Java (after more than half a decade), I still think it’s a cumbersome language that can bury unskilled programmers in a mess of abstraction and layering. A first year computer science undergrad should still start with C. Any other science starts with the fundamentals and builds upon them. C is a terse language that sits closer to the hardware; it makes for a difficult but instructive experience by getting closer to how the computer actually works. It’s high-level enough to escape the disconnect from reality that comes from doing assembler–similar to how quantum physics makes no instinctive sense to monkeys evolved in a macroscopic context. It’s also good to know a little about the byte shuffling underneath for when things go wrong. Introduce Java after C and let the student’s understanding evolve just as the languages did, one from the other. But I digress again.

So here’s how I recommend getting to know Documentum better:

  • Study the Architecture Overview document until you can draw the system architecture from memory. It will help you intuit where something happens, especially when something goes wrong.
  • Read Server Fundamentals at least one time all the way through and skim the Administrator’s Guide. The server is the foundation upon which everything is built; these documents provide an overview of the system’s base structure and behavior.
  • Keep the Object and API Reference Guides handy. When something unexpected happens, look at the model to get a feel of what objects are involved. The API guide can provide some history and particulars hidden or ignored by the DFC or the javadocs. Oh, and print out that giant E-R diagram if you have a plotter.
  • Get comfortable with IDQL and IAPI (or one of those swiss army knife tools although it feels like cheating to an old timer like me) so you can examine the underlying state while manipulating it at higher levels (DFC, Webtop, etc.).
  • Run IDQL, IAPI, and scripts on the docbase server’s host to cut out any caching annoyances and docbroker switcheroos. Unix geeks, feel free to tail logs too.

Monkey See, Monkey Do Well

Linux continues to evolve towards a desktop-ready operating system with the latest release of Ubuntu, Gutsy Gibbon [Ubuntu 7.10]. Installation, configuration, and pre-installed applications all demonstrate a stable, refined, mature product. The GUI feels like a Windows/MacOSX love child that should be familiar enough to both communities. There are some flaws, but overall this is a step forward for Linux-kind.

I installed the latest Ubuntu to host an Alfresco installation. Linux is usually fussy about laptops because of the custom hardware, so my Dell XPS laptop makes a good test platform. My previous Ubuntu install went well, so I hoped for a flawless upgrade. Unfortunately the installer tried to do everything at once, core operating system and all the packages I added over time; it failed on one of those additional packages and died gasping, “The installation may be broken.” No details on why or how it failed meant salvaging the installation could be an hours-long goose chase. Ouch!

The laptop used to be my remote-into-work machine via RDP, but the client moved to Citrix (bleh, but works fine on my Mac) and I moved on from the client. Since I don’t use Windows personally for anything but games, I decided to wipe the Windows partition and make a Linux-only box. The fresh installation was flawless, and I’m wondering if I may have caused the upgrade’s problems by doing unspeakable things to the Feisty Fawn previously installed. Maybe the next version, Hardy Heron, will do better with upgrades. Some naming scheme, eh? What’ll they call the version after that? I nominate Irascible Iguana!

What followed was my inevitable colossal waste of time with a new Linux installation: choosing packages. This is a place where Ubuntu still feels too much like Linux: while the Synaptic Package Manager is a wonderful piece of software, the list of packages and how Ubuntu organizes them is totally overwhelming. I spent hours wandering through the not-so-obvious categorization scheme to select dozens of packages that will probably never see any CPU time.

Another traditional Linux pitfall, screen mode, killed more time. Even though there’s a pretty GUI utility for managing screen resolution and display characteristics, it sits atop the most complex and arcane piece of software known to man, XWindows. Whatever I did, the machine could no longer boot in graphics mode and got hung up in a bootstrap/reconfigure loop. Taking a page from Windows, the fastest way to fix this was reinstalling. The brighter side here is my greater restraint in package choice. That concluded the time of woes and everything was smooth sailing from then on.

The Ubuntu interface borrows from both Windows and MacOSX. The task bar along the bottom is very Windows and includes the ever-useful Minimize All Windows button. It also includes trashcan and virtual desktop icons. The menu bar along the top of the desktop is very MacOS. Pull-down menus work as expected with the much-needed improvement in keeping menus updated as packages come and go. Applets on the menu bar provide plenty of current-generation functionality, even a search tool like Apple’s Spotlight. Opening/resizing windows seem to respect the bars well, but that might just be because I haven’t resized or moved the bars themselves. The overall effect is a functional desktop that might not be as eye-candy lickable as MacOSX but is better than a vanilla XP installation.

For applications, we get all the usual (free) suspects to provide the core functionality people want in a computer: Web browsing, email, instant messaging, and office documents. The missing app here is an iTunes equivalent like Miro. This may be because Ubuntu only prepackages open source software; not all free programs are open source, of course. The best applications are the ones available for all platforms; Firefox is good whatever the platform, for instance. The Linux-only apps aren’t as polished but do feel full-featured; these aren’t the buggy alpha/beta releases that populated distros even just a few years ago. There’s enough here to use Ubuntu as a primary/sole computer in a pinch thanks to a nice suite of apps.

Ubuntu can run on Macs–Intel and PowerPC. So is this really a Linux? Anyway, I installed the previous Ubuntu on my old PowerBook as a dual-boot system, and it worked flawlessly. Upgrading to Gutsy Gibbon created some graphics/screen problems though. Mac/PowerPC users might want to wait a few months before installing. Now that Leopard’s out, the PowerBook is back to being a pure Mac. I don’t think I’ll go back to dual booting now that I have a dedicated Ubuntu box. Dual boots made more sense when my primary box ran Windows (dark, dark days) and I couldn’t get a Unix fix just by clicking on Terminal.

This feels like the first Linux desktop that isn’t years behind the mainstream operating systems. It’s not quite Linux for the masses yet, but it should be accessible to power users, not just developers, since I never had to drop into a command line to install or configure the machine. The disk image Ubuntu provides will run full-featured off a CD-ROM, so it’s easy enough to try without risk.

Java 1.5: Gimme Some Syntactic Sugar, Baby!

I don’t like coding in Java. As a collection of libraries it’s not bad, but as a language it’s full of syntactic vinegar. Simple tasks in Perl and Python are chores in Java, distracting me from solving real problems and obscuring the code with extraneous detail. At least Java 1.5 makes things a little sweeter by introducing some language features–instead of more libraries–that I’ve enjoyed in other languages for more than a decade.

Java syntax is basically C minus pointers plus objects and exceptions. I love C for its minimalism, efficiency, and transparency. However those traits also mean working the language in addition to working the problem, like needing to explicitly code string processing behavior atop arrays of characters.

That’s great for teaching how computers work, and I say require C for all first-year computer science students. It’ll weed out the weak and give the survivors a good foundation to build upon. That’s not so great for a developer on a deadline solving a complex business problem. Try coding C against the string-centric Documentum Client Libraries (DMCL) if you don’t believe me. Been there, done that.

Then I migrated my first Documentum project to Perl. This language let me work the business problem while it worried about the details. Writing Perl code against the DMCL was faster, less error-prone, and more enjoyable. Stellar string support helped, but lists and hashes changed my entire approach to solving problems because of how effortless they were. The language understood them in context, “doing the right thing” underneath, like magic.

Coming back to Java (1.4.2) from Python recently, I was particularly vexed by handling collections of things. Besides all the casting, I found myself having to revisit the 70s and do this:

// things[] -- nearly identical to C
for (int i = 0; i < things.length; i++) { ... things[i] ... }

// List things -- still C-like but easier to manipulate elsewhere
for (int i = 0; i < things.size(); i++) { ... things.get(i) ... }

// List things as iterator -- more "say what you mean", still cumbersome
Iterator thing = things.iterator();
while (thing.hasNext()) { ... thing.next() ... }

These all feel clumsy and out of date like bell bottoms and leisure suits. Even using the Iterator approach feels wrong. Talking it through in terms of “for each” feels much more natural than “while”. Here’s how I prefer dealing with this kind of loop:

# In python,
for thing in things:
    ... thing ...

# In perl,
foreach (@things) { ... $_ ... };

So I’m browsing my monstrously thick new copy of Java in a Nutshell for what’s new in Java 5.0. I couldn’t actually use any of it since my last client hadn’t upgraded from Documentum 5.2.5 yet, but it might be nice to be prepared for my next client. To my surprise I saw this:

for (String name: names) { ... name ... }

Finally! I can say what I mean in the code and not clutter it up with irrelevant implementation details. It might not sound like much, but the less I have to fight with Java means the more work I can actually do. Also, I’ll take what I can get with Java.

It’s not magic of course: The new for syntax manipulates an iterator under the covers. As it should. I get the same functionality with less code and it’s more human-readable. The old way still works of course, and sometimes I might need that approach when position matters more than order.
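To make the "iterator under the covers" point concrete, here is a minimal sketch. The class and method names (ForEachSketch, upperForEach, and so on) are made up for illustration, and the explicit-iterator version is only roughly what the compiler generates, not its literal output. The third method shows the case where the old indexed loop still earns its keep because the position is part of the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class; none of these names come from the original post.
public class ForEachSketch {

    // Java 5 for-each: no index and no visible iterator.
    static List<String> upperForEach(List<String> names) {
        List<String> out = new ArrayList<String>();
        for (String name : names) {
            out.add(name.toUpperCase());
        }
        return out;
    }

    // Roughly the loop the compiler writes for you from the method above.
    static List<String> upperIterator(List<String> names) {
        List<String> out = new ArrayList<String>();
        for (Iterator<String> it = names.iterator(); it.hasNext(); ) {
            String name = it.next();
            out.add(name.toUpperCase());
        }
        return out;
    }

    // Indexed loop: still the right tool when the position itself is needed.
    static List<String> numbered(List<String> names) {
        List<String> out = new ArrayList<String>();
        for (int i = 0; i < names.size(); i++) {
            out.add((i + 1) + ". " + names.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("alpha", "beta", "gamma");
        System.out.println(upperForEach(names));
        System.out.println(upperIterator(names));
        System.out.println(numbered(names));
    }
}
```

The first two methods return equal lists for any input; only the amount of ceremony differs.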

I hope Java continues to grow as a language instead of just piling on the libraries. My last Java project was unavoidably littered with implementation artifacts and gratuitous casts. Another new feature in Java 1.5 helps out with the latter but isn’t quite as seamless. It’s a start.
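The post doesn't name that cast-reducing feature; assuming it means generics, here is a small sketch of the difference. The class and method names are hypothetical. The first method is written the 1.4 way, with a raw List and a cast on every read; the second declares the element type once and lets the compiler do the checking.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, assuming the feature in question is generics.
public class GenericsSketch {

    // 1.4 style: a raw List hands back Object, so every read needs a cast
    // that can fail at runtime with a ClassCastException.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static int totalLengthRaw(List names) {
        int total = 0;
        for (int i = 0; i < names.size(); i++) {
            String name = (String) names.get(i);
            total += name.length();
        }
        return total;
    }

    // 1.5 style: the element type is declared once, so no cast is needed
    // and adding a non-String to the list is a compile-time error.
    static int totalLengthTyped(List<String> names) {
        int total = 0;
        for (String name : names) {
            total += name.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<String>();
        names.add("webtop");
        names.add("dfc");
        System.out.println(totalLengthRaw(names));
        System.out.println(totalLengthTyped(names));
    }
}
```

Raw types still compile (with warnings), so old cast-heavy code keeps working next to the typed version, which may be part of why the feature feels less seamless than the new for loop.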

What really demonstrated Java’s language deficiencies to me as a Documentum developer was Jython, a Java implementation of Python. Jython can use any Java class out of the box–including the Documentum Foundation Classes (DFC)–with all the syntactic sugar of Python. Tasty! The built-in shell mode even allows experimenting with DFC interactively. Fabulously tasty!

I can’t escape Java as a Documentum professional if I want to avoid being one of those architects that never gets to code. Unfortunately too many IT managers drank the DFC Kool-Aid and believe that knowing Java well is more important than knowing Documentum at all, a fallacy to examine another time. At least I’ve got some syntactic sugar to soften that next bitter swig of Java.


I originally showed printing the name in each example, but perl of course has some ridiculously easy ways to do this (like print "@ARGV";) that would take away from the real issue here. The yadda-yadda-yadda with the reference method doesn’t make Java look quite as cumbersome and reflects the more likely case of needing a few statements to process each element of the list.

Free Time Again

I wrapped up my second contract with Morgan Stanley after three and three-quarters years. I’m going to take a few months to relax, blog, and fiddle with new technologies before hopping back into the market. Here are some of the things I want to blog about in the near future:

  • Implementing Associations in Documentum
  • EMC Says It’s a Hardware Company
  • My First Look at Alfresco
  • Java 1.5: The First Usable Version of Java

Formerly Known As

The debranding of Documentum continues with the conference formerly known as Documentum Users of the Mid Atlantic (DUMA) now being called EMC Content Management and Archiving (EMC CMA) User Group.

I’m still worried that my bread-and-butter won’t survive EMC’s acquisition; software companies don’t fare well after being acquired by hardware companies, like IBM acquiring Lotus and only briefly honoring promises of hands-off management and bank-rolling. Would Documentum just become a value-add to an EMC SAN solution? So far there’s no hard evidence that EMC is dismantling Documentum. D6 has an ambitious list of new features and EMC’s representatives laid out an aggressive plan for the next two years. The troubling part is that this year’s road map looks just like last year’s with a few dates slipped and still no live demonstrations of working software. Perhaps TCFKA DUMA isn’t a big enough draw to roll out the full dog-and-pony show or EMC doesn’t want to upstage their big event in Orlando next month.

Comparing the first EMC CMA with the last DUMA, I found attendance somewhat lower, lunch cheaper, and the talks less compelling. At least every presentation didn’t start with a CYA sales pitch this year! It’s hard to say if the lower attendance reflects Documentum losing momentum or the normal yearly variation in the conference: Despite 13 years in Documentum, these are the only two years I’ve attended. Maybe I’m just another year closer to my crotchety old man’s license, but next year will either need a hotter agenda or rodizio to assure my attendance.

The most relevant presentations this year were by EMC on D6’s web services (SOA) and Content Services for SharePoint. Both felt more like sales pitches, but I walked away with enough facts to feel that both products are on the right track. They’ve fooled me before. At first the BOF sounded like server-side behavioral extensions that would make Documentum truly object-oriented but turned out to be client library decorations with major distribution problems. Wishful thinking aside, a bad licensing scheme could kill even well-engineered products; pricing details for SOA were not available which worries me given some custom-client licensing craziness between EMC and one of my clients.

My current client has evaluated some third party products in these spaces but wasn’t impressed enough to write checks. The client may be more forgiving with products from EMC, like business travelers avoiding great local restaurants for a TGI Friday because it’s familiar and consistent even if it’s mediocre. Regardless, I sympathize with those conference partners who paid to watch EMC muscle in on their territory; they deserve a better lunch next year too.