EMC drops Web Content Publisher

Zombie Wasp

Here are my thoughts on Brilliant Leap’s latest post about EMC dropping Web Content Manager [Brilliant Leap: Na na na na, Hey hey-ey Goodbye].

I remember when Documentum turned its back on their Big Pharma customers to chase the web content management dream during the tech bubble. So now they’re backing away from WCM after dropping DSM? Hmm. Then there’s EMC’s “we’re not worthy” submissive stance regarding Sharepoint. Hmm.

Will users five years from now actually know what Documentum is? EMC will have to wage a “Documentum Inside” campaign like Intel’s to keep any kind of mind share with customers. They still have Captiva, but does anybody really want to be *known* for scanning, the lowest form of document management?

An optimist would claim that they’re focusing on core technologies, and we’ll see long-needed improvements at the server and in the data model. A pessimist would argue this is another sign of EMC parasitizing Documentum. Think “zombie wasp” from the RadioLab episode on Parasites.

I am not exactly known for being an optimist, but this may be good news for alternatives like Drupal and Alfresco as businesses start reaching for a can of Raid.

Win7: Well, have you tried it?

A friend at Microsoft messaged me on Facebook and asked me if I’ve tried Windows 7 now that it’s officially released.  The short answer is, “No.”

At home, I only use Windows on my gaming machine. XP after all these years is running (mostly) smoothly and quickly. The newest game I’m running announced yesterday that Win7 is not officially supported, despite the developer having a very close relationship with Microsoft. Games are particularly sensitive to change, especially in graphics drivers, audio drivers, and memory management. There’s no benefit under Win7 with any of the games I play (e.g., no DirectX 10 games), only risk.

My current client is still on Windows XP. While I expect they will move to Windows 7 eventually, it won’t be anytime soon: The upgrade inertia of a company with tens of thousands of computers, many of which don’t have the horsepower to make Win7 a good experience, is a frightening thing to behold. Especially if you make your living by selling shrink-wrapped upgrades to companies like them.

Win7 is in a bind; Vista’s problems weren’t entirely technical and may reflect the mature nature of the computer market more than mistakes made at the software level. People upgrade Windows when they buy new computers, not to get new features. The economic downturn means fewer computer sales. Some analysts think Win7 will drive more hardware sales, but that’s a cart-before-the-horse argument to me.

People use applications, not hardware or operating systems. Until those applications require new hardware or Win7, people won’t upgrade. It’s cost without benefit. Microsoft is trying to include useful software with Win7, something they (sometimes unfairly) get into trouble for, but people with Windows right now already have 3rd party software for those things. While I’ve come to doubt that people are rational actors in the economic sense, the cost/benefit equation is just too obvious here, especially when money is tight.

On my Mac, I upgrade more frequently because Apple provides functional improvements to applications I use in daily life as well as new/cool stuff.  There are more applications shipped with the OS that I use regularly, so I am more interested in what an OS upgrade includes. It also helps that Mac OS X upgrades are more frequent and lower impact. Although I’ve wanted to do a clean install, I haven’t *had* to do one and therefore haven’t done it.

27 inches of Sexy

The most likely thing to get me to buy Win7 right now is if I get one of the new iMacs to act as both gaming and desktop computer. 27 inches, video in, and nice horsepower in the CPU/GPU on the high end have me interested. And it’s lickably sexy. My gaming rig is a few years old (another reason I’m hesitant to push it to Win7 even though I have a Gig of memory XP can’t address) but a new iMac would have plenty of cycles to spare for Win7.

Microsoft sticking to a release date is nice to see, but it’s not without risk. My final hesitation (on almost any 1.0 product) is how rushed it was to get out the door on time. How far into the future is SP1 going to be?

With no real benefit, Win7 would bring me only risk and cost, so I don’t do Windows upgrades–for now.

Twitter Lists: Defeat from the Beak of Victory

Here we go again!

I wanted Twitter lists about five seconds after I clicked my second “follow”. My life is about categories and contexts: I follow people for different reasons, and I want to group those people and their tweets around similarities. Search and hash tags helped a little, but full-text search and uncontrolled tag vocabularies come with a host of problems–I know that all too well from my day job.  In the meantime, a Rube Goldberg of RSS feeds and multiple Twitter accounts provided some degree of order. Now Twitter’s on the eve of releasing lists, and I can’t say for sure I’ll even use them.

Twitter needs to advertise their betas better.

There’s no telling when I got the feature because I don’t use the Twitter website. It’s all about the client: Tweetie 2, Tweetie for Mac, or Google Reader. I even stopped going to the website from email notifications because they don’t have any way to handle multiple accounts. The email may be about “A”, but I’d end up as “B” because that’s who I last logged in as. Yes, this is another case of clients having it all over web apps in terms of context and state. I hate living in a buzzword-compliant age: Web 2.0 is roughly Client 0.2 in my book.

An API is no substitute for a conceptual model.

I found the API calls for lists easily, but I never found diagrams or narratives explaining what lists are and how they work. Lists don’t appear to be very complicated at first, but it’s not just twiddling the two radio buttons and one text box on lists that creates complexity.  How things interact with lists internally and externally can create unexpected conditions and counter-intuitive behaviors. That leads me to my biggest initial gripe and likely deal-killer …

Lists do not have RSS feeds; they are a walled garden, and not in a good way.

Lists were looking pretty neat until I noticed something. Actually, I noticed the lack of something–an RSS feed icon in the address bar of Firefox. RSS lets me consume and crosspost Twitter anywhere–Google Reader, my blog, Facebook, FriendFeed. Right now lists are only available through the Twitter website, and that’s fine for a beta release (unless you’re Google). However, even when clients start supporting lists, people will still have to come to Twitter. Maybe that’s a hint that Twitter’s getting ready to monetize, or maybe that missing conceptual model contains some details that make RSS problematic.

A little more experimenting is in order …

Google Docs Shared Folders: More Folders, More Lies

Oh, Google. If you can’t get shared folder permissions right, who can?  Nobody. Because the folder is still a lie!

Google Docs Adds Shared Folders — Mashable.com

The Folder Is a Lie

Mashable claims that Google’s new shared folders work just like they should. I beg to differ. Google doesn’t really have folders under the hood, just like some other document management system I used to talk about. Things get tricky when a document can be in more than one place. Google’s full of smart people, so I decided to hope for the best and kick the tires: Create a few folders, create a few documents, and then permission and move things around to see what happens.

The first no-big-surprise was that Google Docs has trouble with state. Web applications are still inferior to stand-alone clients (or operating systems) when managing state. Most of the time a refresh would solve the inconsistencies around location or permissions, but sometimes a logout/login was needed. State issues aside, let’s look at the behavior.

Google walks a tightrope with its “folders” because they really aren’t folders; they’re tags. The behavior you get depends on the context: If you’re in a folder, you get the “move” menu item which works as advertised; something is in one place, then it’s in another–or nowhere since documents don’t have to be in folders. Use the folders menu item and you get the “tag” behavior because you’re directly selecting zero or more items from a taxonomy of tags that happen to have folder icons next to them.
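To make the two behaviors concrete, here’s a minimal Python sketch of foldering-as-tagging (class and method names are mine, not Google’s): a document carries a set of folder labels, and “move” versus “folders” just manipulate that set differently.

```python
# Minimal sketch (hypothetical names) of foldering-as-tagging:
# a document carries a set of folder labels, not a single location.

class Document:
    def __init__(self, name):
        self.name = name
        self.folders = set()  # zero or more "folders" (really tags)

    def move_to(self, folder):
        # "Move" menu behavior: one place replaces another
        self.folders = {folder}

    def set_folders(self, folders):
        # "Folders" menu behavior: directly select zero or more tags
        self.folders = set(folders)

doc = Document("notes")
doc.move_to("Work")                  # in exactly one place
doc.move_to("Archive")               # moved: the old place forgets it
doc.set_folders(["Work", "Taxes"])   # now "in" two folders at once
```

The same document ends up answering “where is it?” with a set, which is why the UI can feel like folders one moment and tags the next.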

Hacking around, I discovered that Google’s “how it should work” is a most-permissive model; it seems to gather every sharing option from all of the shared folders. This isn’t horrible; however, the metaphorical mismatch it creates will undoubtedly cross the line into “too permissive”. Most people will assume that the permissions on the “last” folder they put something into will determine the permissions. To Google’s credit, they display a permissions summary next to each document. Maybe that’s good enough to prevent mistakes. And maybe everybody reads EULAs before clicking “I agree”, too.
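Here’s a small sketch of that most-permissive model as I understand it (the semantics are my assumption from kicking the tires, not documented Google behavior): a document’s effective sharing list is the union of the sharing lists of every folder it sits in.

```python
# Sketch of the most-permissive model (assumed semantics): a document's
# effective sharing list is the union of every containing folder's sharing.

def effective_sharing(doc_folders, folder_acls):
    """Union the sharing entries of every folder the document is in."""
    users = set()
    for folder in doc_folders:
        users |= folder_acls.get(folder, set())
    return users

acls = {"Work": {"alice"}, "Clients": {"bob", "carol"}}

# Filed in both folders, the doc is visible to all three people --
# not just whoever the "last" folder would have allowed.
shared_with = effective_sharing({"Work", "Clients"}, acls)
```

Nothing ever subtracts from that set, which is exactly why the mismatch with the “last folder wins” mental model errs on the side of over-sharing.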

The shame here is that Google really broke ground with ideas like conversations in GMail. Seeing your replies in the thread of a conversation is obviously the right thing to do; segregating part of the conversation to the Sent “folder” is a broken model that requires people to quote the entire previous conversation with each response. Horrible! Google’s always on the verge of freeing people from the tyranny of folders but never fully commits to a pure tag and search approach, so they won’t be overthrowing The Folder Hierarchy with this feature.

Twitter Misses the Mark with Mentions


When Twitter changed their reply functionality, now called mentions, my initial reaction was unmentionable.  After a few weeks to ponder and play with it, I still think they made a big mistake.  A reply was originally a message that began with a twitter username, like this:

@zorak No, really?

Replies were public, but Twitter added a link so you could see just your replies, plus options to filter other people’s replies out of your friends stream.  According to the Twitter blog, the community came up with the convention, and Twitter later embraced and enhanced it.  Then Twitter added a separate API call and a “swoosh” button to their web site.


Just what I wanted! Twitter added metadata underneath so that a reply remembers which tweet it replies to.  Pretty soon every Twitter client included swoosh buttons and “in reply to” links.  This was a philosophical break for Twitter–whether they knew it or not–because there was no way to distinguish a swoosh from “@user …” via SMS.  Supporting SMS creates a larger potential user base, but it drastically limits functionality.  Until everybody has an iPhone, fledgling social networks like Brightkite must consider this trade-off.

The original reply syntax is still supported and continues to create confusion, as Tweetie developer AteBits explained on his blog.  People put multiple names in the message or put the @ in the body of the message, assuming the right people would see the replies:

@me @myself @I Remember the milk.
Give @me some sugar, @baby!

Only @me sees the first tweet as a reply; nobody sees the second.  Search was already catching on thanks to other community-grown initiatives like hash tags; users and client developers began using search on @user instead of the reply API to catch such grammatically incorrect tweets.  Apparently this is a bad thing, or at least something Twitter discouraged, perhaps because of its impact on Twitter’s call throttling.  That and other scaling problems should make for a few good dissertations; I just hope Twitter is keeping the historical record and will be willing to share it.

This brings us to mentions which are basically just searches on @user.  Although it’s a good thing that Twitter learns from their community, the big mistake here was changing the functionality under the existing API calls.  I agree that instantly supporting new functionality in all Twitter clients is attractive to a provider, but it can–and did–create unintended consequences.  All those clients blessed with catching those malformed reply tweets were also cursed by all those side-bar mentions crowding the replies page.  Twitterati like @wilw get many more mentions than direct replies, and now there’s no easy way to sort out the two.
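The difference is easy to illustrate (a Python sketch with made-up helper names, not Twitter’s actual matching code): the old reply semantics only catch a tweet that *begins* with @user, while a mention is effectively a search for @user anywhere in the text.

```python
# Sketch (hypothetical helpers): a reply must *begin* with @user;
# a mention is a search for @user anywhere in the tweet.
import re

def is_reply(tweet, user):
    # Old reply semantics: @user at the very start of the tweet
    return re.match(r"@" + re.escape(user) + r"\b", tweet) is not None

def is_mention(tweet, user):
    # Mention semantics: @user anywhere, as a whole handle
    return re.search(r"(?<!\w)@" + re.escape(user) + r"\b", tweet) is not None

reply_tweet = "@me @myself @I Remember the milk."
body_tweet = "Give @me some sugar, @baby!"
```

Only the first tweet ever hit the reply API, and only for @me; everyone else was invisible until clients started searching, which is the behavior mentions now bless.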

The lesson here is that it’s safer to create new API and UI elements for new-ish functionality and let the community migrate over than to replace the guts and hope nothing breaks.  As any API designer knows, developers will do all kinds of unexpected things once your API is released into the wild.  The Twitter community’s active, inventive role in shaping Twitter also provides for some real “They did what?” moments.  Tweaking reply functionality to support only swooshes and adding new methods for mentions would have made everybody happy.

There’s one thing I want from Twitter that they promise in the API FAQ: I want to see all replies for a given tweet.  I disagree with the assertion in the Twitter blog post above that people don’t want to wander into the middle of an ongoing conversation.  Sometimes that’s the best way to discover new topics and interesting people.  When that happens, there is a need to go back and discover the source and all its tributaries.  Twitter is aware of the need, as this quote from the Twitter API Wiki FAQ shows:

How do I get all replies to a particular status?
For now, there’s not a great way to do this. We’ve heard the requests, though, and we’ll be providing a solution for it before too long.

Conversation functionality is cropping up in Twitter clients like Nambu and the soon-to-be-released Tweetie for Mac (20 April 2009).  It looks like Nambu constructs conversations from cached tweets, building little trees as it discovers reply pointers to other already-fetched tweets.  This singly-linked list structure makes it easy to find your immediate predecessor but difficult to walk up, across, and back down the tree.
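A sketch of that client-side tree building as I imagine Nambu does it (field and function names are mine): invert the reply pointers over the cached tweets, treating any tweet whose predecessor wasn’t fetched as a root.

```python
# Sketch of building conversation trees client-side: each tweet knows only
# its immediate predecessor (in_reply_to), so invert those pointers over
# whatever tweets happen to be cached.
from collections import defaultdict

def build_threads(tweets):
    """tweets: dict of tweet id -> in_reply_to id (None for a root)."""
    children = defaultdict(list)
    roots = []
    for tid, parent in tweets.items():
        if parent is None or parent not in tweets:
            roots.append(tid)  # predecessor unknown/unfetched: treat as root
        else:
            children[parent].append(tid)
    return roots, children

cached = {1: None, 2: 1, 3: 1, 4: 2, 5: 99}  # tweet 5 replies to an unfetched tweet
roots, children = build_threads(cached)
```

Walking *down* now works for what you have, but tweet 5’s real ancestry is simply lost–exactly the “easy to find your predecessor, hard to walk the tree” problem.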

Hmm, where have I seen this problem before?  Oh yeah, version trees in Documentum.  Every document remembers its immediate predecessor (i_antecedent_id) and the root of its version tree family (i_chronicle_id).  A single query on i_chronicle_id returns every version of that document.  That’s just what I want Twitter to do!

Twitter already has its own i_antecedent_id–with a better name I hope.  So add an equivalent to i_chronicle_id and a new getAllReplies API call.  I suggest topic_id since that’s what the root tweet of a tree of replies becomes.  It would be nice to go back and stitch up all the previous replies-of-replies, but I would understand if the hit on the database would be too big.  How many tweets are in there anyway?
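To sketch the proposal (topic_id and getAllReplies are my hypothetical names, mirroring Documentum’s i_chronicle_id, not anything Twitter has announced): with the root id denormalized onto every tweet, fetching a whole conversation becomes one indexed query instead of a tree walk.

```python
# Hypothetical sketch: if every tweet carried a topic_id (the root of its
# reply tree, like Documentum's i_chronicle_id), getAllReplies would be a
# single indexed lookup rather than recursive pointer chasing.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tweets (id INTEGER, in_reply_to INTEGER, topic_id INTEGER, body TEXT)"
)
rows = [
    (1, None, 1, "root tweet"),
    (2, 1,    1, "reply"),
    (3, 2,    1, "reply to reply"),
    (4, None, 4, "unrelated tweet"),
]
conn.executemany("INSERT INTO tweets VALUES (?, ?, ?, ?)", rows)

# The whole conversation in one query -- no walking required.
thread = conn.execute(
    "SELECT id, body FROM tweets WHERE topic_id = ? ORDER BY id", (1,)
).fetchall()
print(thread)  # [(1, 'root tweet'), (2, 'reply'), (3, 'reply to reply')]
```

That’s the same trick a query on i_chronicle_id pulls off for version trees: pay a little denormalization at write time, get the whole family back in one read.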

Word to XML, Then and Now

The Scream by Edvard Munch

I was lucky that last month’s XML Philly meeting didn’t trigger my post-traumatic stress syndrome. Quark’s presentation on their XML Author product took me back to the front lines, having done something similar with Word and SGML over a dozen years ago.  Quark says it always produces valid XML for any schema.  I can testify that it’s no small feat if true:  Although Word now produces XML directly, it’s a generic schema that represents formatting, not semantics.  Wasn’t this the schema Microsoft wanted to patent as a part of their contribution to “Open Standards”?  Anyway, this is still a hard problem with no obvious solution.

Their secret is that the plug-in completely replaces the implementation of the Word data model.  XML is always valid because users are always working in XML; there is no messy conversion between the flat, unstructured Word model and the deep, structured XML model.  What XML Author gets from Word is the familiar GUI and a clear list of features to support, like Track Changes.  In theory, this gets around several common XML acceptance problems:  Users don’t have to learn a new interface, and business owners don’t have to pay for two separate word processors on everybody’s desktop.

Both justifications fall apart under closer scrutiny. Authoring XML changes how users work due to structural requirements; in particular, cut-copy-paste between vanilla Word or different schemas requires skill and patience because of the always-on validation.  Although users won’t have extra icons on their desktops, the business will have to cough up significant licensing fees that will feel like having two separate, high-end products installed.  Quark was also pushing their professional services for getting things up and running–both an added cost and an indication that things aren’t as simple as they seem.

Then there’s the question that always comes up at these meetings:  What if you share XML documents with people outside your company? There might be something webbie in the future, but for now let’s not even go there.

We didn’t get a live demo of the product, and an acquaintance who evaluated it warns that it’s not ready for prime time if your business depends on complex XML or heavy-duty Word features.  I would also be wary of the product constantly lagging behind Word features because it is essentially a reverse-engineered product, and it’s an acquisition that Quark’s still trying to fit into its existing product line.  Still, it’s easier than trying to mimic, maintain, and synchronize XML structures in actual Word documents.  I have the scars to prove it.

The Origin of Species of Information

Happy Birthday Chuck! You’ve given us so much!

Last night’s Philadelphia XML Users Group was a pleasant mix of the old and the new: Jim Caine of Jaquette Consulting revisited an earlier talk on content reuse that touched on DITA and Documentum among other things.

Named for today’s birthday boy, the Darwin Information Typing Architecture (DITA) is a simple XML application (in the XML sense) that models information around authoring units like topics and references instead of publishing units like documents and books.  It’s meant to be extensible (in the OO sense) rather than definitive.  Somehow DITA never crossed my path until a few months ago, but it represents another step towards the Grail of structured authoring/publishing that I worked on 15 years ago.

Jim’s project involved moving an insurance institute’s learning resources into a single repository and allowing them to create a variety of products (real books, eLearning, flash cards, etc.) from the same content.  The project started last year; Jim first presented on the project back then and gave the group a look at how practice deviated from theory.  He did some really smart things to facilitate reuse, like referencing XML wrappers for external entities like images, which allows reuse of both the data and the metadata.  Kudos to WordPress for a similar albeit non-XML approach to images and galleries. I’ll post a link to his presentation when it hits the web.

Turns out that authoring structured content is still the hard part.  The original plan involved a Word plug-in to allow authors to create valid structured content at the very beginning.  This good idea hit some bumps because of vendor support issues and was the hardest conceptual change to make in the whole process.  Authors used to writing a single document now wrote up to a dozen separate learning objects, a subtype of topic.  Deja vu all over again.

A very few actors in the content creation process have a very lively editorial cycle.  We’re talking major rewrites, not “you missed a comma here” kinds of things.   This wasn’t a problem back on RDMS: We dealt more with multiple authors and a review process than the more traditional author/editor interaction going on here.  Even in legal review and approval, I’m used to all actors being subject matter experts, often getting more experty the further along in the lifecycle you go.  Not so in this case–and publishing in general I’d guess.

Here comes more deja-vu-all-over-again:  The plugin couldn’t handle the actors’ heavy dependence on Word collaboration features like Track Changes.  It’s easy to get lulled into a false sense of security by an oh-so-pretty model for the final product of the authoring process. That Emerald City architectural view of content hides all the information and processing necessary to get to that end.  This particular problem has sparked some heavy flirtation between authoring, wikis, and DITA happening in my head, just in time for Valentine’s Day.

Jim’s use of XML Applications (in the Documentum sense) worked well with DITA’s topics and maps.  No big surprise there, but the marriage of DITA maps and Documentum virtual documents came with the usual toilet-seat-down relationship problems, especially because of Webtop’s weak handling of virtual documents.  A post-editorial staff using XMetaL bears the brunt of the bickering, so authors are left to worry about intellectual property, not scaffolding, as it should be.

Most of my work lately has centered on document dumping grounds.  Records management, eDiscovery, and transactional content management don’t concern themselves with the processes of actually making content.  It was great to see what’s happening on the other side again, and I’ve been stupid for not attending this group sooner.  Such is the life of a freelancer.

One special note: The Users Group had brownies for Valentine’s Day.  Mmm, tasty!  I suggested that publicizing food at meetings might be some great marketing.  It might also require a bigger conference room for several reasons!