
Content Management System

Discovery Project Report

September 30, 2004


Table of Contents

Executive Summary

Introduction
    CMS definition
    Project Charter
    Team

Assumptions

The CMS "Space"

The Process
    Gathering data from the community
    The Survey
    The Focus Groups
    Conclusions based on community input
    Product Review
    Functional Requirements
    Technical Requirements
    Products Eliminated from Consideration
    Products which merited further investigation
    In-depth Testing
    OpenACS – the First Runner Up
    Lenya
    Atomz
    Macromedia Contribute

The CMS Discovery Team Recommendation
    Large site recommendation
    For any size site -- An alternative business model to #1
    Small site recommendation

The Task to IS&T

Appendices
    Project Charter
    The Survey
    Product Matrix
    SiteMaker Evaluation and SiteMaker Issues List
    Plone Evaluation
    NCSU Lenya Questions and Answers
    CMS Consultation Guidelines

I. Executive Summary
Based on community and product research, the team has three primary CMS recommendations
that will serve the widest range of possible customers.

Recommendation #1: For large sites, particularly those that want a CMS that can repurpose
content, the team recommends the further development of the open source product, Lenya.
Apache Lenya, based on the Apache Cocoon content management framework, is an open source,
full-featured content management system that is currently under active development. Lenya is
programmed in cross-platform Java and its content is stored in an XML repository. The team
believes that XML is a forward-looking technology whose predicted long life and flexibility will
justify the long-term commitment of resources to development and support.

Adopting Lenya would require significant development and ongoing customer support on the
part of IS&T. Nevertheless, given the scale and call for CMS services within the MIT
community, it is appropriate for IS&T to develop in-house CMS expertise to serve those needs.

Recommendation #2: For sites that want the flexibility of XML and an easy interface, but
choose not to wait for development of Lenya, the team recommends Atomz Publish. Atomz
Publish is a mature, commercial content management system based on a hosted Application
Service Provider (ASP) business model. Development and asset content are stored in an XML
repository on Atomz servers. Published content resides on the site owner’s web server or Athena locker.

This model is appropriate for those DLCs able to support the ongoing subscription costs of the
ASP business model. Contracts with Atomz should be negotiated and managed through a
centralized IS&T service, so that the MIT community as a whole spends its Atomz dollars
efficiently, getting more service through coordinated volume. IS&T should first determine
whether Atomz can scale to the potential volume that the MIT community could bring to it.

Recommendation #3: For DLCs that do not support their own web site infrastructure and do
not require an enterprise-level CMS, we recommend Macromedia Contribute as a
workgroup-level content authoring tool. Contribute's interface and tools will be familiar to
current Macromedia Dreamweaver users, but offer a considerably less complicated user
interface. Additionally, expanding AFS services and tools will alleviate some of the burden on
administrators of Athena-hosted static sites and offer more interactive functionality.

Therefore, the recommendations for small sites are as follows:


1. Volume License of Contribute, pending go-ahead from SWRT process
2. Develop additional Athena services, and promote them aggressively to the community.
3. Training: Athena Web-Site Hosting Information Technology certificate program.

See Small site recommendation for details of recommended Athena services and training.

See The Task to IS&T for specifics on IS&T’s role in delivering these recommendations.

II. Introduction
1. CMS definition
The team agreed upon this definition for a content management system, based on the projected
needs of the MIT community:

A content management system is a process and/or software application that allows groups to
effectively plan, create, manage, store and distribute content. Content can be anything from:
• published documents (web or print),
• images,
• archived communications,
• presentations,
• or streaming media.

For a CMS to be an effective system it needs to have the following traits:

• Provide a method of content creation and editing that a non-technical user is comfortable
using; this is usually done with templates.
• Provide a system for describing the content (metadata) so that it may later be found
using a search engine.
• Provide a workflow mechanism for resource assignment, review, and tracking throughout
the content's development cycle.
• Provide a method of tracking and storing various versions of content to provide the
ability to revert to a previous version of a content item or to review the changes that
occurred.
• Provide for the redistribution of content in various forms, whether that is via the web,
print, disk, or other media.
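
As an illustration only, the metadata and versioning traits above can be sketched in a few lines of Python. The class name `ContentItem`, its fields, and the sample page are hypothetical; they are not taken from any product the team reviewed, and a real CMS would add persistence, workflow, and access control.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """A piece of managed content with descriptive metadata and a version history."""

    title: str
    metadata: dict = field(default_factory=dict)  # e.g. {"keywords": ["faq"]}
    versions: list = field(default_factory=list)  # stored oldest first

    def save(self, body: str) -> int:
        """Store a new version and return its 1-based version number."""
        self.versions.append(body)
        return len(self.versions)

    def current(self) -> str:
        """Return the most recently saved version."""
        return self.versions[-1]

    def revert(self, version: int) -> int:
        """Make an earlier version current by re-saving it as the newest version.

        The history is kept intact so the changes can still be reviewed.
        """
        if not 1 <= version <= len(self.versions):
            raise ValueError("no such version: %d" % version)
        return self.save(self.versions[version - 1])


def search(items, keyword):
    """Return the titles of items whose metadata keywords include `keyword`."""
    return [i.title for i in items if keyword in i.metadata.get("keywords", [])]


# Hypothetical sample content.
page = ContentItem("Admissions FAQ", metadata={"keywords": ["admissions", "faq"]})
page.save("Draft text")
page.save("Edited text")
page.revert(1)  # the draft becomes current again; all three versions are kept
```

Reverting by appending a copy, rather than deleting later versions, is one simple way to keep the audit trail that the versioning trait calls for.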

2. Project Charter
See Appendix A.

3. Team
Project sponsor:
• Susan Minai-Azary, Director, IT Architecture and Infrastructure
Project manager:
• Rich Garcia, Discovery Process Project Manager
Team members:
• Cecilia Marra, Office of Academic Services (team leader)
• Mark Begley, Department of Economics
• Sean Brown, Web Communications Services
• Tim Boyden, Facilities
• Roberta Crumrine, Student Services Information Technology (expert resource)
• Tim Griffin, Information Services and Technology
• Carl Jones, Libraries
• Larry Stone, Information Services and Technology
• Johanna Purcell, Technology Review (expert resource)

III. Assumptions

In reviewing products that might suit the varied needs of the many community customers, the
team made several baseline, universal assumptions:

1. Conforming to a certain level of accessibility standards is a requirement. This may mean,
however, that site templates can be managed to contain and retain accessible features
such as image tagging, skip links, etc. Difficulties arise when new elements are
introduced within content being authored by end-users. In such cases, the burden of
properly tagging elements for accessibility falls on either the author or the site manager as a
manual process. This is clumsy, but appears unavoidable.

2. Whatever the team recommends must be better than the existing tools and services.
We will be asking customers to learn new skills and devote both human and financial
resources toward working with a new product. The end results have to justify the cost and
effort to the customer.

3. Whatever the proposed recommendations, there will be a significant investment of
IS&T resources, both human and financial. This investment may involve the
development of new technical expertise among in-house staff or the outlay of financial
resources to external developers and vendors. It will certainly involve in-house training
and documentation resources. The team endeavors to make responsible recommendations
that promise the most value to the customer in return for the commitment of IS&T and
community resources.

IV. The CMS "Space"
1. In the Academic Community in General
In addition to homegrown CMS solutions and integrated LMS/CMS/portal solutions, many CMS
products are in use throughout the higher education sector. Still, no clear CMS market leaders
have yet emerged. Based on information gathered over the last two years from the archives of the
well-subscribed University Web Developers listserv and the Educause Web User Group
archives, the following chart shows a selection of products and the schools that use them:

Product: Estrada
School: University of Alabama at Birmingham; Virginia Military Institute
Customer comments: UAB: since 1996 with wonderful results. Most of our campus,
approximately 36,000 pages and over 700 authors, uses this software. We do have some
departments that prefer to use their own methods and no campus mandate prevents this.
VMI: We have been very pleased with this commercial product.

Product: OmniUpdate
School: Dartmouth
Customer comments: We started phasing out Frontier and Manila in late 2002, when we
conducted a discovery project on content management.
(http://www.dartmouth.edu/goto/webcmsdiscovery/) We made the decision to move forward
with OmniUpdate for our clients. (http://www.dartmouth.edu/goto/webcms/) We now have
fifty sub-sites under the management of our four-person department, and nearly 100
content developers. (http://www.dartmouth.edu/goto/webpubsites/) Right now, all the
sites are supporting a standard template from the Office of Public Affairs. When we next
move forward, into 100% CSS design, each site will be able to have a distinct look.

Product: Atomz Publish
School: Southern Oregon University
Customer comments: We'll be rolling it out over the summer (2002). I think we've found an
excellent product at a good price.
Team comments: http://www.sou.edu/access/. The Atomz logo is used as a link to enter the
authoring environment.

Product: CommonSpot
School: Ohio University; Kent State
Customer comments: OU: It is a VERY robust system that runs on a cold fusion platform.
Kent: This application has been a tremendously successful program for us.

Product: Midgard
School: The University of the South, Sewanee
Customer comments: It works pretty well and could work MUCH better if I had the time to
write more custom code.

Product: LiquidMatrix
School: Canisius College; Wilkes University
Customer comments: CC: It is pretty robust -- ask them about version 1.9 -- and the
interface is simple enough for those without any HTML knowledge. It will cost you a few $$
but is worth it. Wilkes: The CMS has its quirks…

Product: Roxen CMS
School: University of Alaska
Customer comments: Cost: $50-100K. Low level template development is handled by 2 FTE
(technical). Template implementation is done by the hosted site's campus web coordinator
(semi-technical). Most everyone else is a content provider (no technical knowledge
necessary, including HTML). Upcoming portal project will require an additional 3+
technical FTE's in this area.

Product: Typo3
School: Univ Missouri-Rolla
Customer comments: a lot of pre built functionality, a robust user management, active real
world user groups

Product: Zope/Plone
School: University of Calgary
Customer comments: used consultants to help us get it up (in a record 3 months from start
to finish) and are now in the process of learning the language. We have about 25 staff
doing updates to the site and the CMS has made life much simpler for all of us.
Previously, our site was maintained with Dreamweaver library items and templates - much
greater learning curve for the users who are generally area secretaries with no Web
background.
Team comments: www.haskayne.ucalgary.ca

Product: Ektron eMpower
School: Rice; Emerson
Customer comments: Rice: We've rolled this out to about 65 campus departments, research
centers, and other misc organizations. It has been a very robust system for us, but it is
completely lacking in cross platform compatibility.
Team comments: Other schools just use the Ektron editor.

2. At MIT in particular
Here at MIT, in the absence of an enterprise-wide CMS product, various DLCs have licensed or
developed their own solutions.

DLC                                                     Product
OpenCourseWare                                          MS CMS
Technology Review                                       Was RedDot; now migrating to Atomz
MIT Home page                                           Template Toolkit
Sloanspace                                              built on OpenACS
Curriculum Information System and other SSIT systems    built on Oracle with SQR
Economics                                               customized Zope solution
MIT News Office                                         custom Filemaker solution
MIT World                                               custom LAMP solution

V. The Process

1. Gathering data from the community

The CMS team gathered information from the MIT community on their various CMS
experiences and needs, and on web site management in general. The team took a
multi-pronged approach to gathering data, in the hope of gaining as much input as possible. A
round-table discussion was offered during the Fall 2003 IT Partners conference; a web
survey was advertised through TechTalk, yielding 40 respondents; three focus groups were
held; and, finally, a hands-on demo of the two leading products in contention was hosted.

A. The Survey
See Appendix B.

B. The Focus Groups

It became clear to the team early in the data gathering process that, from a CMS
perspective, the MIT web site hosting community fell into three broad groups: small sites
with no CMS, large sites with a CMS, and large sites without a CMS. The small sites
typically had a single site administrator receiving and publishing content from scattered
authors with varying web skill levels. They also typically did not have a structured
business process for publishing. The large sites may have had more structure to their
process in common, and may have had similar issues in terms of needs, but the fact that
some of the large site owners had already experienced publishing with a CMS provided a
natural split in the course of focus group exploration of the topic. The experienced CMS
users were in a position to discuss what features and aspects of a CMS had really paid off,
which had not, and which of their needs were still unserved.

C. Conclusions based on community input


1. The Product:

• No single product will suit all community needs. Therefore, the natural
breakdown of customer types is by site size and the customer’s ability to provide its
own technical (hardware, software, programming) support. We may also need to slice
the customer base out by purpose of content/organization (academic vs. publication vs.
administrative documentation).

• Standards compliance matters. This encompasses Section 508 accessibility as well as
HTML, XHTML (if applicable), and CSS validation. Compliance matters both to our
customers and to MIT as institutional policy.

• The product should have the ability to integrate with other MIT systems in some
fashion. This may be accomplished elegantly or not, but is an important consideration
in selection and development of the product.

• Templates must be easy to use and revise. Administrators are typically expected
to anticipate all possible scenarios at the time of development, and have little flexibility
afterwards to make changes or enhancements. This has forced users to go outside the
system for any pages that do not fit the template.

• CMS templating may force customers to give up some of the JavaScript bells and
whistles that they already have and like in their existing site’s look and feel. "Vanilla"
CMS templates are an emphatic no-go for the large sites, and possibly for some of
the small sites.

• If the product separates content from format, the authors must somehow still be
able to preview when creating or editing content. This may be through a staging
server, temp files, or downloads into templates.

• A shared resource is only feasible when different customers can have their own
instance of the application, allowing them full control over their own instance. If
given this control, customers are more than willing to share the cost of services. They
are also willing to have their content stored externally, especially if there is a
mechanism for data dumps in a usable format like XML.

• Many of the open source products will have similar capabilities. Choosing the
right open source product will depend on the Institute’s code preference, based on
in-house expertise and belief in the language’s longevity.

2. The Customers:

• No product can eliminate author personalities, nor can it make content magically
appear. One of the biggest frustrations noted throughout all the focus groups, surveys,
and round tables had to do with aspects of the authors themselves, whether it was their
varying skill levels or their willingness to conform to format requirements or delivering
content in a timely manner.

• The learning curve must be shallow. ‘Nuff said.

• There is no product that does not require a knowledgeable web site
administrator, no matter how large or small the site, no matter how primitive or
sophisticated the product.

3. The Process:

• A CMS is not a substitute for a business process, unless the customer is a one-person
shop. A CMS can support a process but cannot force compliance or consistency
where an ill-defined business process exists. If there is no consensus, investment, and
enforcement in making the tool effective, it will fail. Well-defined business processes
offer the most promise for success when implementing a CMS. On the other hand,
highly defined processes bordering on excessively complex will be so idiosyncratic that
a custom-built tool will be required, and all that IS&T can do is offer guidelines rather
than product recommendations.

• Workflow must be highly adaptable, or workarounds ensue. Customers need to be
able to remove steps/layers of approval as needed.
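
To make the last point concrete, here is a minimal Python sketch of an approval workflow whose steps a site owner can remove. It is an assumption-laden illustration: the `Workflow` class and the step names are hypothetical and do not describe the workflow engine of any product under review.

```python
class Workflow:
    """An ordered list of approval steps that a site owner can shorten."""

    def __init__(self, steps):
        self.steps = list(steps)

    def remove_step(self, name):
        """Drop a layer of approval that this customer does not need."""
        self.steps.remove(name)

    def next_step(self, completed):
        """Return the first step not yet completed, or None when all are done."""
        for step in self.steps:
            if step not in completed:
                return step
        return None


# Hypothetical step names for a large site.
workflow = Workflow(["author", "editor", "legal review", "publisher"])

# A department that does not need legal review removes that layer
# instead of working around it outside the system.
workflow.remove_step("legal review")
```

With the step removed, an item that has passed "author" and "editor" goes straight to "publisher".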

2. Product Review

A. Functional Requirements
The team found that these functional requirements listed by the Technology Review
publishing staff were representative of community customer needs:

Need:
• To be able to tag content with categories -- on multiple levels
• Authorization (having an admin account that sets the level of access others will get
for the project)
• Flag items on an element basis (i.e. an article is made up of the elements: deck, title,
author, date, body, etc…)
• Global elements
• Global files (i.e. using the same article over again in different locations in the project)
• Publishing out different versions (e.g., print version)
• Publishing to a dev server before the page goes live to the site
• Ability to preview before publishing
• Integration with rich media (e.g., Flash) and other vendors (e.g., Dart)
• Ability to publish our 3x weekly newsletter
• Internal searchable directory
• Ability to have dynamic pages
• Proprietary database
• Ability to access CMS from the office and from outside the office easily
• Versioning control (e.g., templates)
• Compatibility with browsers
• Customization
• Publishing XML content
• Speedy and accurate publishing
• Notification of publishing (e.g., when it’s finished, if there were errors)
Want:
• Ability to code in a program like Dreamweaver rather than an unusual template that is
part of CMS software
• Cross browser testing
• Cross platform testing
• Ability to have entire site in content management system

• Better integration with Search tool
• Intuitive UI
• Easy upgrades/maintenance
• Small learning curve

B. Technical Requirements

The team compiled a matrix of products and how those products stacked up against the
technical requirements of serving the MIT community enterprise-wide. There was separate
consideration for small sites whose hosting and infrastructure could either be managed by
the site owner or through AFS services. Both proprietary and open source technologies were
reviewed.

1. Products Eliminated from Consideration


The team decided that since many open source products had much in common, it was
not necessary to review all of them, but rather to select a few that were representative of
their technologies and features.

Also, platform constraints were a weeding factor. Platforms and technologies such as
IIS/ASP or ColdFusion are not ideally or easily suited to the MIT environment. The
team does recognize that platform issues may not be a show stopper for MIT (e.g.,
mod_asp would run ASP on Apache) and suggests that consideration be given to
proprietary products, should the right candidate come down the pike in the future. At
this time, we did not find any proprietary product that offered dramatically superior
features to the open source products we reviewed.

In addition to the above considerations, the team decided not to move forward with
testing several of the products on our matrix for various reasons, which are
summarized here:
• Oracle: Enticing due to the existing volume site license, but considered an
"elephant gun" for development and therefore only appropriate for enterprise-wide
applications, none of which are currently outstanding.
• SSIT Toolkit: The SSIT toolkit is not a contender as a product unless it were to
be rewritten in Java. The IAP and CIS deployments, which were developed with
the Toolkit, do have workflow schemes. These may be useful as models of
workflow for comparison to other products.
• Manila and other blogging products: While Manila is perhaps a lightweight
tool, not sufficiently feature-rich for full-scale CMS use, it could be a handy add-on
for site managers that want blog-type community discussion. Userland puts out
a fuller CMS product --in addition to Manila-- called Frontier that might be more
appropriate as a real CMS. Pros include low cost; cons include questionable
performance capability, particularly of concern due to the MIT network's
susceptibility to being "slashdotted." Also, the showstopper for Manila is that
secure FTP is not currently available, though it is expected in the next release. The team
agreed in general that blogging tools are not sufficient to the community's CMS
needs.

• MS CMS: While we did not want to eliminate products for platform reasons
alone, we did not find that MS CMS had sufficient functionality to compensate
for the platform and security concerns around running the product. We cannot
expect that our customers can or will be willing to give up their existing
cross-platform user base. The primary community user of this product (OCW) is in
the process of looking for a technology to replace MS CMS.
• RTFM: This product is currently bare bones, and would require development
almost from scratch. However, it bears watching, particularly as an
implementation of RTFM is already underway as part of the Casetracker-to-RT
development and deployment. RTFM will host Stock Answers. How well RTFM
serves Stock Answers will better inform us as to its value as a CMS service within
the MIT community.
• Template Toolkit and MetaDot (TT application): Template Toolkit lies behind
MIT's home and upper level pages, but for larger sites, much more development
would be required. For the amount of labor invested in development, other open
source products offer more functionality on which to build.

For the complete matrix of products discarded from contention, see
https://web.mit.edu/is/discovery/content-mgmt/internal/matrix_rej.shtml

2. Products which merited further investigation


As a consequence of completing the product matrix, several products stood out as
worthy of more in-depth evaluation, for either large or small sites. Of these, the team
chose to devote significant effort to assessing the viability of the following products
within the MIT environment: GVC SiteMaker, OpenACS, Apache Lenya,
Macromedia Contribute, and Zope/Plone.

For the complete matrix of products considered the strongest contenders, see
Appendix C.

3. In-depth Testing
Because no polished demo versions of these open source products already exist, the team
took up the challenge of implementing our own demo installations for evaluation and
testing purposes. Carl Jones implemented test sites for SiteMaker, Lenya, and Plone.
Larry Stone implemented a test site for OpenACS. Larry and Carl also implemented
templates in their respective sites, while the rest of the team simulated various authoring
tasks and levels of experience. Tim Griffin took on the task of in-depth evaluation of
Contribute, bringing to bear his roles as a Contribute Beta Project tester for Macromedia
and a member of IS&T’s Web Communications Services team.

See Appendix D for evaluation of SiteMaker.

See Appendix F for evaluation of Plone.

A. OpenACS – the First Runner Up

The team found OpenACS to be a very strong contender that could certainly be
developed to serve the community's needs, should IS&T decide not to pursue Lenya’s
particular set of technologies. The integrated database, the available authoring and CMS
modules, and the existing MIT/Sloan expertise in its technologies all make OpenACS a
viable candidate. The disadvantages noted below (standards compliance and lack of
WYSIWYG authoring) point to where further development should be prioritized.
Because of its potential to serve the MIT community’s needs, the team’s evaluation of
OpenACS is included here.

Summary
OpenACS could form the basis of an excellent enterprise-wide CMS, although it would
require a substantial investment in development, customization, and documentation. It
already has virtually all of the features we require. We can also leverage the experience
and efforts of SloanSpace, a major OpenACS installation already on campus.

Overview
OpenACS is a server-based open source Web application framework. It consists of a
suite of core services and a large collection of optional application modules, including
several different types of content management system (CMS). These application
modules are easily modified, extended, and integrated into new services.
OpenACS has its roots in the ArsDigita Community System (ACS). Version 3 of the
ACS was released to the open source community to become OpenACS, while the Java
reimplementation is now with Red Hat.

Features
Some relevant features that exist in the current (5.0) release of OpenACS:

• User registration and login.


• User groups, sophisticated ACL-type permission system.
• Supports multiple independent top-level websites, each with independent
administration, look-and-feel, etc.
• Driven by relational database for performance, reliability, and ability to manage
structured data easily.
• Hierarchical "Content Repository" is the basis of OpenACS' CMS-like applications.
It supports versioning, audit trails, access control.
• A powerful, hierarchical template engine makes it easy to separate content and data
from presentation. Lets you customize all Web pages, even the administrative UI.
• "File Storage" module maintains "assets" like images and opaque (Word, PDF)
documents. Upload and download individual files or whole ZIP archives.
(WebDAV coming soon.)
• Simple but highly user-friendly "Edit This Page" CMS, suitable for maintaining
simple pages and structured data.
• "News" application specialized for managing news items, including automatic
posting and removal.

Technology and Platform
OpenACS is built on the following components:

• Any common Unix operating system such as Linux (e.g. Athena Linux 9.2) or
Solaris 8/9.
• AOLserver open-source Web server. "AOLserver is the backbone of the largest and
busiest production environments in the world. AOLserver is a multithreaded, Tcl-
enabled web server used for large scale, dynamic web sites."
• Relational database, choice of PostgreSQL or Oracle.
• TCL, an interpreted scripting language. OpenACS is primarily implemented in
TCL, with some functions in the database scripting language (PL/SQL and
PostgreSQL's equivalent) for performance.

Although AOLserver is not as widely deployed as Apache, it is superior in some ways;
it is multithreaded, fast, and efficient. Some of the busiest sites on the internet run on it.
It's open-source and has an active development community with good support.

TCL is an unusual choice for a development language, but at least one potential
developer in IS&T is fond of it. TCL is clean, simple, and easy to learn; anyone
exposed to Scheme or LISP will find it familiar. It is also less prone to bugs and pitfalls
than, e.g., Perl.

Although some of the technologies behind OpenACS seem exotic if you are
accustomed to LAMP (Linux + Apache + MySQL + PHP/Perl) servers, they are all
well-proven, mature, and actively supported.

Community
There is a large and active OpenACS community. See http://openacs.org for references
to large-scale users, such as Greenpeace. A major OpenACS-based project, DotLRN
(http://www.dotlrn.org/), has its roots in the Sloan School here at MIT.

Advantages

• Scalability: A small (e.g., departmental) site with minimal customization is easy and
cheap to implement, while a sophisticated site is also possible with more time and
effort.
• Very low learning curve to maintain a site built with Edit-This-Page. Small updates
and changes are easy.
• Mature software that is unlikely to change in drastic ways, so it takes less effort to
maintain a central service than if the software were still evolving.
• Can collaborate with other MIT projects (Sloan, DotLRN).
• Additional useful services are also available (news, calendar, workflow, conference
registration, ecommerce, RSS feeds).

Disadvantages

• No HTML validation of content (might be possible to add it).
• Poor and incomplete documentation.
• No tools to extract and repurpose content e.g. as XML or print (could be added).
• No WYSIWYG HTML editing (might be added with in-browser editors).

Additional Work Needed


If OpenACS is adopted, it will need at least this much additional work to be deployed at
MIT:

• Finish adapting OpenACS to log in automatically with MIT client certificates
(mostly done).
• Finish implementation of user-editable templates in EditThisPage.
• Write documentation for end users (webmasters, site maintainers).
• Add or improve an asset-management module (like FileStorage), e.g., for images in
web pages. Try adding WebDAV support.
• Improve tools for importing an existing site into the CMS.
• Add EditThisPage sub-applications for common cases of structured data (e.g.,
faculty and staff lists, publications).

Extras
Some unexpected extra features and benefits of OpenACS:

• Workflow module available; we didn't investigate it.


• Calendar (scheduling) module, also untried.
• DotLRN project is implementing WebDAV server module that can be used by
itself.
• Ecommerce module and event registration module could be combined to automate
conference registration and payment.
• Can be configured to allow outside users to register and join communities, so they
can be given administrative privileges or just see restricted material.
• OpenACS has other, more sophisticated (and less mature or documented) CMS
modules that should be investigated.

B. Lenya – The “Winning” Product

Lenya could fulfill the role of an enterprise-wide CMS at MIT, although it would
require a substantial investment in development, customization, and documentation. It
has many of the features we require, although some will need enhancement. One of
Lenya's core concepts is to reinforce the separation of formatting from content
creation.
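
The idea of separating formatting from content can be illustrated with a short sketch. Lenya itself does this with XML, XSLT, and CSS through Cocoon; the Python below is only an analogy using the standard library, and the `<article>` document structure and function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Content is authored once, with no formatting information in it.
ARTICLE_XML = """<article>
  <title>Library Hours</title>
  <para>Open 9 to 5 on weekdays.</para>
  <para>Closed on Institute holidays.</para>
</article>"""


def to_html(xml_text):
    """Render the article for the web."""
    root = ET.fromstring(xml_text)
    parts = ["<h1>%s</h1>" % escape(root.findtext("title", default=""))]
    parts += ["<p>%s</p>" % escape(p.text or "") for p in root.findall("para")]
    return "\n".join(parts)


def to_plain_text(xml_text):
    """Render the same article as plain text, e.g. for print or email."""
    root = ET.fromstring(xml_text)
    title = root.findtext("title", default="")
    lines = [title, "=" * len(title)]
    lines += [p.text or "" for p in root.findall("para")]
    return "\n".join(lines)


print(to_html(ARTICLE_XML))
print(to_plain_text(ARTICLE_XML))
```

Because the presentation lives entirely in the two rendering functions, the same stored content can be repurposed for a new output format without asking authors to touch it again.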

Overview
Lenya is a server-based open source Web application framework.
Lenya has its roots in the Wyona content management system, which is based upon the
Apache Cocoon web publishing framework. Wyona was then released as open source
and quickly adopted by the Apache Cocoon community. One of the early adopters was
the University of Zurich. Wyona.com continues to supply consulting and support.

Features
Some relevant features that exist in the current (1.2) release of Lenya:
· XML-centric architecture
· Robust XML/XSLT/CSS support enforces separation of content from formatting
· WYSIWYG XHTML/XML editor support from within web browser
· XHTML/XML form editor
· User registration and login
· User groups, sophisticated ACL-type permission system.
· Supports multiple independent top-level websites, each with independent
administration, look-and-feel, etc.
· Customizable workflow engine, with audit trail
· Asset management to keep track of images and documents that belong to a page
· Revision control
· Flexible deployment options
· Security: SSL, LDAP authentication, IP address range
See http://cocoon.apache.org/lenya/roadmap.html for more detail on current and future
release features.

Lenya is built on the following components:


· Unix operating system such as Linux (e.g. Athena Linux 9.2), Mac OS X, or Solaris
· Apache web server, the backbone of many of the largest and busiest production
environments in the world.
· Servlet container, choice of Tomcat, JBoss, etc. Tomcat was chosen for the team's
test installation, for simplicity.
· Cocoon 1.2.5
· Java (JDK 1.4.* and above recommended)
· XML documents stored as XML
· Uses XSLT and CSS for formatting and page presentation
· WebDAV
· Lenya is officially an Apache incubation project. Incubation normally lasts for about 1
year. The latest official release is version 1.2, June 27, 2004. See
http://forrestbot.cocoondev.org/sites/incubator-site/process.html

Performance
· Servlet containers. There have been reports of stability issues with Tomcat
(depending on release). Further stress testing with servlet containers (Tomcat vs. JBoss,
etc.) is recommended.

Community
There is a small but growing and active Lenya community. See
http://cocoon.apache.org/lenya/community/ for links to live sites.

Needed
· Browser-based editing (e.g., Kupu, Bxeng), with its heavy reliance on JavaScript, can
be erratic and validation sometimes unpredictable; it needs to be more consistent. Some
of this is setup and training-related.
· Add web-editable css to file menu, instead of editing files on server.
· Better end-user documentation as well as documentation for maintainer/admin.
Sample of existing documentation:
http://cocoon.apache.org/lenya/docu.html#docs/components/
· Better image/asset selection in authoring mode.
· Improve browser-based html editors.
· Import pre-loaded MIT users and groups.
· Finish MIT cert integration, automatic login. Allow passwords too.
· Examples for integrating cocoon dynamic components and lenya.

Extras
· Lucene search engine integrated into Lenya
· Workflow management included.
· Dynamic component integration through Java-based Cocoon frameworks.
· Wide variety of technologies, both legacy and cutting edge (e.g. JSP's,
Velocity templates, Cocoon forms, Flowscript, relational and xml database
connectivity, portal-management, etc.) may be used.
· WebDAV fosters offline editing with Dreamweaver or other desktop editing
tools, including MSWord, OpenOffice, etc.
· Repository JSR 170 will offer significant ease-of-use improvements for
navigating the file systems -- due out in the fall. This, coupled with the ability to
edit css files, will provide a substantial boost in functionality. See
http://wiki.apache.org/incubator/JcrProposal.

Additional Work to do:


· Configure Apache to accept MIT certs, allow login w/o password.
· Will test using XIncludes for content aggregation (does not require command-line
access to server to edit XSLT files).
· Prototyped simple import of IS&T sample site.

Advantages
· Lenya is built upon the apache open-source software stack

· Lenya is young enough that MIT could have substantial input into the direction of
future development.
· Static or "dynamic" publication is possible.
· May integrate components developed with cocoon (e.g. forms, database access,
customized pages, personalization, etc.).
· Scalable to both large and small sites.
· Standards-based XML editing (validation may be customized on a per-publication
basis).
· Custom "document types" support (customizable XML formats).
· May optionally output to non-html formats, such as PDF, with minimal changes.

Disadvantages
· Relatively new technology with limited penetration at MIT (e.g., Apache Cocoon
with its reliance on the pipeline architecture, XML, XSLT, WebDAV, etc.).
· Some development needed using an "unfamiliar" XML-centered framework.
· Documentation needs to be more complete. Administration and development
documents can be inconsistent, and end-user documentation leaves much room for
improvement (but is improving).

Other points to make:


See North Carolina State University questions (Appendix G) on Lenya usage.
Lenya outputs XHTML, using a combination of XML, XSLT, and CSS. The
architecture used by most Lenya publications is based on the idea of aggregated
content, whereby the output from multiple XSLT transformations is combined
to form the rendered document. Each transform will construct one part of the
overall document (for example, the masthead). This is a simple approach that is
similar to that provided by other CMS templating systems. An alternate
approach is to use a single XHTML template document that uses XInclude to
aggregate content. In some ways, this is even simpler than the first approach in
that no knowledge of XSLT is required.
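
The XInclude approach described above can be sketched in a few lines. This is only
an illustration of the aggregation idea, written in Python with its standard library
rather than in Lenya's own Java/Cocoon stack; the fragment names (masthead.xml,
content.xml), the template, and the helper functions are all hypothetical and are not
taken from Lenya.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical fragments standing in for separately managed content files.
FRAGMENTS = {
    "masthead.xml": "<div id='masthead'><h1>Example Department</h1></div>",
    "content.xml": "<div id='content'><p>Welcome to the site.</p></div>",
}

# A single XHTML-style template that pulls the fragments in with XInclude.
TEMPLATE = """<html xmlns:xi="http://www.w3.org/2001/XInclude">
  <body>
    <xi:include href="masthead.xml"/>
    <xi:include href="content.xml"/>
  </body>
</html>"""


def load_fragment(href, parse, encoding=None):
    """Resolve an xi:include href from the in-memory FRAGMENTS table."""
    if parse != "xml":
        raise ValueError("this sketch only handles parse='xml'")
    return ET.fromstring(FRAGMENTS[href])


def render(template_text):
    """Parse the template and replace each xi:include with its fragment."""
    root = ET.fromstring(template_text)
    ElementInclude.include(root, loader=load_fragment)
    return root


if __name__ == "__main__":
    print(ET.tostring(render(TEMPLATE), encoding="unicode"))
```

The template author only needs to know which fragments to include and where; no
XSLT is involved, which is the simplification the paragraph above refers to.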

C. Atomz – A Viable Product with an Alternative Model

Atomz came to the forefront of the team’s attention late in the project. Based on very
strong positive feedback from Technology Review, the team felt that there was a
customer niche that could be well served by the easy user interface and excellent
customer service offered by this particular vendor through the ASP business model.

Atomz Publish is a well-established commercial content management system product
that could serve as a vendor hosted and supported service for the MIT community.
Atomz Publish will integrate with existing MIT resources (Athena web lockers,
Macromedia Dreamweaver editor) and has been recently implemented by the MIT
Technology Review for their web content management and publication needs. Atomz
Publish has an extensive customer list including high profile customers such as NASA,
AOL/Time Warner, and the Harvard Medical School.

Overview
Atomz Publish was introduced in 2001 as a hosted application solution to web content
management. As a hosted application, Atomz Publish runs on servers located in
Atomz's network operations center, where it is maintained and updated. The customer's
data remains on their own servers, where it is accessed from Atomz Publish by SFTP or
direct folder access over a VPN. Atomz updates the Publish application on a quarterly
basis, taking into account customer comments and feature requests.

Features

• Open standards: XML, XHTML, XSLT, CSS
• Integrated XML content repository - import, export, archive XML
• Cross-platform in-browser WYSIWYG editing with spellcheck or integrated
development with Macromedia Dreamweaver or Adobe GoLive
• Form based editing using rich text
• Manage and upload Assets directly using web folders on Windows and Macs
• Integrated and customizable task based workflow with email notifications
• Check-in, check-out and unlimited versioning of content, assets and templates,
change tracking
• Scheduled publishing and content expiration
• Metadata management
• Automatic hyperlink checking
• User and publishing activity reporting tools

Advantages

• Hosted commercial application - no further programming development, software or
hardware required to run application/service.
• End-users will easily be able to create/edit content and templates using in-browser
editing tools or Macromedia Dreamweaver and Adobe GoLive.
• Well documented and supported by established entity.
• Works with existing Athena web lockers or departmental web server.
• Web standards/accessibility compliant.
• XML content could be repurposed for other publishing mediums.

Disadvantages

• Hosted commercial application - annual support/subscription costs, application not
under local control.
• Windows/Macintosh-centric, limited support under Linux.
• Not Open Source.

D. Macromedia Contribute – For Smaller Sites

Overview
Macromedia Contribute 2 (2.1 for Windows) is likely the best candidate for a low-end,
low cost content management solution. A user can browse to web pages and then edit
and publish those pages all from within Contribute. Additionally, Contribute can import
Microsoft Office content (Windows only in v2). This feature alone is a big plus for
faculty and administrative staff at MIT. Further, Contribute integrates with Macromedia
Dreamweaver MX for site administration and development purposes. And the price is
right at $79 per user license (education) and likely less with a site license.

Features:
Publishing Process
Contribute offers a "Word-like" WYSIWYG experience that allows content
contributors to browse to a web page, click on the edit button, make changes, and click
on the publish button. Users can easily update content, insert images or add new pages
to a site. Users of Contribute never see the code. This prevents any deviation from an
established style guide set by site administrators and spares the contributor the undue
stress that a feature-rich tool such as Dreamweaver might create.
Site Management
Site management is done by a site administrator through Dreamweaver, Contribute
and/or at the file level on the server. The site administrator creates the file structure,
sets site permissions, and develops site templates. A site connection key is created and
emailed to the client or stored on a local server for download by the Contribute user.
The Contribute user then simply double clicks on the connection key which then
launches Contribute, asks for appropriate passwords, and makes a connection to the
site. All site settings and permissions are stored in an encrypted XML file at the root of
each site.
Integration with Dreamweaver MX
Contribute is integrated with Dreamweaver MX. This integration allows for easy site
administration and development using DW MX and MX Templates. Contribute uses
the Dreamweaver MX authoring engine and so offers support for CSS, XHTML,
server-side code and Dreamweaver MX templates.
Dreamweaver MX Templates
Dreamweaver templates can be designed with Contribute in mind. The site
administrator can dictate which templates must be used when a Contribute user wants
to create a new page on the site. This ensures adherence to design standards and
protects the code from unintended hacking.

Community and Support


Currently there are a few DLCs looking at rolling out Contribute to their content
contributors. Among these are Facilities, OCW, and MIT Libraries. There is a large
user community in education and the private sector as well as significant documentation
provided by the vendor.

Advantages

• Uses Dreamweaver rendering engine
• Intuitive interface
• Server connection wizard is easy
• Creating new pages is easy
• Creating links very easy
• Publishing pages is easy
• No local copy needed or created
• Site Administrator sets permissions
• Uses check-in/check-out for version control
• Supports Word file with drag-and-drop (Windows only)
• Clean code
• Many schools are deploying as part of a content management solution (BU, Notre
Dame, Indiana, USC, ASU)
• Most users are publishing after a 15-minute overview

Issues and Concerns


Note: Most of the issues noted below should be remedied in the next version of the
application.
Security

• The shared settings files, which hold all permissions, site passwords (not SFTP
passwords), and permission group definitions for website access, are kept in a
directory called "_MM." The directory name and files cannot be altered or changed
in any way other than with DW or CT. These are encrypted XML files. By default,
these files are visible on Apache servers. For these files to be secure and not
visible to the world, the server would need to be reconfigured so as not to show
files/folders that begin with an underscore.
• The IS&T Net-Security Team states that the algorithm used to encrypt these
settings files, MD5, isn't truly encryption and that the algorithm doesn't provide
much security at all. However, on the spectrum of risk we needn't consider it a
show-stopper.
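
The Net-Security Team's point can be illustrated with a short sketch. MD5 is a hash
function: it maps input to a fixed-length digest, takes no key, and offers no decrypt
operation, so on its own it does not provide confidentiality. How Contribute actually
applies MD5 to its settings files is not documented here, so the example below shows
only the general property, using Python's standard hashlib; the function name and
sample strings are hypothetical.

```python
import hashlib


def md5_digest(text):
    """Return the MD5 digest of text as a 32-character hex string.

    Note that no key or secret is involved: anyone can recompute the same
    digest from the same input, which is why a digest is not encryption.
    """
    return hashlib.md5(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Hypothetical values; these are not real Contribute settings.
    for sample in ("", "site-password", "a much longer settings string " * 10):
        print(len(sample), md5_digest(sample))
```

Every digest has the same length regardless of input size, and the same input always
yields the same digest, which makes short or guessable values easy to test by
recomputation.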

Permissions
Setting up site permissions is going to be tedious.

• Only one admin per folder on the server.
• To establish new editing rules for the site you have to generate a new connection
key. Once that is done the old one is overwritten. Supposedly there is a way to have
site permissions updated automatically from the config file at the site root but
we've not tried this yet. See
http://web/tgriffin/Public/contribute/howto/contrib_setup.pdf

Platform Differences

There are significant differences in features/functionality between the Mac and PC
clients.

• Contribute for Macintosh has the Opera 6 browser built into the program. Unlike
IE on the Windows client, Opera is embedded in the program and cannot be
updated. The Contribute installer installs Opera automatically. It does not need to
be installed separately.
• Windows users need the 2.01 updater and the FlashPaper updater.
• Microsoft Office cut and paste is only available on Windows.

FlashPaper

• FlashPaper development only available from Windows client.


• FlashPaper is a new way of making Flash documents. It installs as a printer driver,
which means that you can create a .swf file of any document. This swf can be put
in a web page and viewed in a browser, using the Flash plugin. FlashPaper allows
users to zoom in and out and page through the document. The document can be
printed directly from the FlashPaper web page.
• Not 508 compliant.
• Needs Flash Player 5 or higher to work. 98% or more of browsers [industry stat]
have v5 or higher.
• Interesting alternative to PDF but don't know if it will be developed further.

CSS-P
• CSS-P rendering in Edit View does not conform to IS&T's guidelines for
compliance with web standards. This is a major problem.
• The Contribute editor chokes when displaying negative margin widths - which are
used a lot in creating column based liquid layouts with CSS.
• Contribute uses the Dreamweaver MX 6 display engine, which is highly flaky with
CSS-only designs. We've found no way of hiding styles from Contribute. One
university has deployed Contribute and uses JavaScript to sniff out Contribute and
bypass the CSS file. This means that pages look fine when viewed, but when
editing the user sees only structured HTML with no styles.

Workflow
There needs to be a defined workflow/business process in place for contributors and
administrators per DLC. This is needed if the DLC is going to manage their own site or
if IS&T develops a service offering based on the Contribute-Dreamweaver web
management model.

Other Issues

1. Sometimes when updating Mac OS the user is asked to input the CT serial number
again. This is actually a minor security problem. See
http://macromedia.com/devnet/security/security_zone/mpsb04-03.html for details
and a patch.
2. What happens when images are converted from Word documents? Do they come in
as jpg files only?
3. CT auto uploads files/images on the contributor's desktop to the same directory as
the file being edited unless the user knows to use the choose button.
4. Cannot edit php include files and php includes do not render in the edit mode, only
in view mode.
5. We may want to recommend the disabling of "rollback files" that CT creates.
These are backups of previously published pages. CT by default creates 3 backups
of every file and stores them all in a "_baks" directory. Having potentially 4 copies
of whole sites on the server will eat up file space pretty quickly.
6. Is there such a thing as a site that is too large for Contribute? Many current users say
that it's meant for smaller web sites.
7. When logging in CT auto connects to all sites. Why?
8. Need training offerings from IS&T on using CT, Managing CT users, Developing
and managing sites for CT.
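
The storage concern in item 5 above is simple arithmetic: if every file keeps up to
3 rollback copies, the worst case is the live site plus three full copies, or 4
times the site size. A minimal sketch follows; the function name and the 250 MB
example site are hypothetical, and it assumes the worst case in which every file has
been republished often enough to fill all of its rollback slots.

```python
def worst_case_storage_mb(site_size_mb, rollbacks_per_file=3):
    """Return worst-case server storage in MB.

    This is the live copy plus one full copy per rollback. The default of
    3 rollbacks per file is the Contribute default cited in the report.
    """
    if site_size_mb < 0 or rollbacks_per_file < 0:
        raise ValueError("sizes and rollback counts must be non-negative")
    return site_size_mb * (1 + rollbacks_per_file)


if __name__ == "__main__":
    # Hypothetical 250 MB site with the default of 3 rollbacks per file.
    print(worst_case_storage_mb(250))
    # The same site with rollbacks disabled.
    print(worst_case_storage_mb(250, rollbacks_per_file=0))
```

Real usage will usually be lower, since files that are rarely republished will not
accumulate all three backups.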

Addendum: August 31, 2004


In August of 2004, Macromedia released Contribute 3 as part of the unveiling of its
new Web Publishing System (WPS). The WPS is a content management suite
comprising Studio MX (Dreamweaver, Fireworks, Flash, Freehand, and ColdFusion),
FlashPaper 2, Contribute 3, and Contribute Publishing Services (CPS). These 8
applications constitute a content management system. While the WPS is designed to
handle large site content and publishing workflow, IS&T does not have the needed
infrastructure in place to support the system. However, Contribute 3 functions as a
standalone content publishing application. Therefore, the CMS team's recommendation
to release Contribute stands. Macromedia has fixed many of the problems outlined in
the above report. See the provided vendor links and the Contribute 3 feature
comparison at
http://www.macromedia.com/software/contribute/productinfo/features/comparison/ct2_vs_ct3-wps.pdf
for more information.

VI. The CMS Discovery Team Recommendation
1. Large site recommendation

Lenya – See product evaluation above


For large sites, particularly those that want a CMS that can repurpose content or
communicate in some fashion with other systems, the team recommends the further
development of the open source product, Lenya. Apache Lenya, based on the Apache
Cocoon content management framework and several other Apache projects (such as Forrest,
Lucene, and Jakarta Tomcat), is an open source, full featured content management system
that is currently under active development. Apache Lenya is programmed in cross-platform Java
and its content is stored in an XML repository. Apache Lenya also features customizable
workflow, inline WYSIWYG content editing, and versioning of content. Apache Lenya's
main advantages are its open source code, strong user community, and standards based
authoring.

As an XML product, Lenya meets the user requirements of standards compliance. XML also
offers publishers the ability to structure data in such a way that it can be repurposed for
multiple formats. Finally, XML data can be imported readily into databases, for a measure of
integration with other MIT systems. The team also feels that, while Cocoon itself may be a
new environment for IS&T, Apache and Java are solid technologies and within MIT’s
existing expertise.

Adopting Lenya would require significant development and ongoing customer support on
the part of IS&T. Nevertheless, in recognizing the scale and importance of CMS needs
within the MIT community, it is appropriate for IS&T to develop expertise in CMS
technology in-house so that the varied and special needs of the community may be better
served. Priority tasks for development of Lenya include smoothing the user interface and
attaching a database as a content repository for potential high-performance dynamic sites.
The team believes that XML is a forward-looking technology, whose predicted long life and
flexibility will justify the long-term commitment of resources to development and support.
Should IS&T desire the services of an outside vendor to collaborate in the development of
Lenya, Quoin, located in Boston (http://www.quoininc.com), stands ready to work with
MIT toward that goal.

2. For any size site -- An alternative business model to #1

Atomz Publish
For sites that want the flexibility of XML and an easy interface, but choose not to wait
for development of Lenya, the team recommends Atomz Publish from Atomz Corporation.
Atomz Publish is a mature, commercial content management system based on a hosted
Application Service Provider business model. It is a full-featured system with customizable
workflow, "browse to edit" and inline WYSIWYG page editing, unlimited versioning of
content and cross-platform compatibility. Development and asset content is stored in an
XML repository on Atomz servers, while published content resides on the site owner’s web
server or Athena locker. Atomz Publish's main advantages are its ease of use and low
learning curve, excellent documentation, and superior customer support.

This model is appropriate for those community sites able to support the ongoing
subscription costs of the ASP business model. Pricing is based on a user account and page
count formula. Contracts with Atomz should be negotiated and managed through a
centralized IS&T service, so that the MIT community as a whole spends its Atomz dollars
efficiently, getting more service through coordinated volume.

An area requiring further investigation is the determination of whether or not Atomz can
scale to the volume that the MIT community could potentially bring to it. There are
performance issues with the instance currently in production for the Technology Review,
though it is unknown whether the performance slowdown is due to TR’s network
infrastructure, or to some aspect of Atomz’ service delivery. This must be determined, and
if indicated and feasible, the team recommends that MIT explore, as part of a contract with
Atomz, the option of hosting Atomz web servers on the MIT network, with Atomz retaining
all ownership and management of the Publish software itself. Securing the rights to the
source code should Atomz not survive would be a wise precaution as well.

Because there are contingencies and unknowns around the viability of Atomz as an
enterprise-wide CMS solution for the MIT environment, we make our second
recommendation conditional on further research by IS&T. Regardless of the outcome, we
urge IS&T to recommend Atomz as an option to individual community customers as
indicated in initial consultation of customers’ business needs.

3. Small site recommendation


A. Tools for sites that use Athena lockers

Five issues stand out for those DLC’s currently not using a CMS:
• Absence of Site administrator and/or content author skill sets
• Ability to support site hosting equipment and software
• Flexibility in site templates
• Flexible workflow
• Dynamic site hosting

The first and foremost burden weighing down the administrators of DLC web sites is the
difficulty of getting content from their authors in any kind of web-ready format.
Frequently, this is a consequence of authors who are inexperienced in web publishing, both
in terms of writing style as well as appropriate file format. Lack of web expertise among
the authors and business process decision-makers often leads to the unfortunate
combination of undercooked content being handed off to overextended administrators who
have to learn on the fly how to be webmasters in an unstructured environment. Both the
web site administrator and content authoring skill sets have been undervalued in the MIT
community, leading to web sites that are managed by insufficiently supported, often
frustrated staff.

Many site administrators in the MIT community are constrained to host their web sites in
an Athena locker. They have neither the human nor financial resources to maintain their
own web server equipment and software. Therefore, they are limited to whatever services
are offered by IS&T. However, IS&T has a lot to offer, and many of its tools often go
underutilized. Additionally, IS&T can introduce more features, making Athena/AFS an
even more viable service, filling some of the community’s CMS needs which are currently
going unmet.

Based on feedback from the various CMS focus groups, potential IS&T customers are
seeking some CMS features more than others, and in fact, would reject some of the
constraints that would be imposed by a CMS. This is an opportunity to make Athena/AFS
more attractive as a hosting service. Many DLC’s who can’t afford to purchase, implement
and support a CMS have chosen instead to sink the resources they do have into attractive
site templates designed by external vendors, often containing scripting that would be
incompatible with many CMS applications. These template designs offer great visual value
for the expense and satisfy the business owners, and in some cases meet the business
requirement for web publications to match the look and feel of print publications. Giving
up the templates in which DLC’s have already invested significant resources is a firm no-
go for many site owners.

Another common CMS feature leaving much to be desired by many site owners is rigid
workflow functionality. Since many small sites have only one or a few site administrators,
a complex approval system is counter-productive. Even those sites with a larger team of
editors, managers, authors, and approvers would require a workflow that allows flexibility
and end-runs around the system. Rather than a complex workflow, these sites’ business
processes seek either a simple supporting technical process, or one that is infinitely
adaptable to their own site-specific processes.

While many of the major issues for this particular customer base can be resolved by
enhancing existing Athena services, hosting a dynamic site on Athena is still not possible.
Those customers requiring a dynamic site must commit resources to supporting their own
network or cost-sharing a CMS service offered by IS&T, upon IS&T CMS consultation.

Hosting a site on Athena can be made more attractive to this customer base by compiling a
toolbox of functionalities, and training site administrators and authors on their effective
use. A small amount of development would go a long way in streamlining site
maintenance, particularly in offering web interfaces to Athena tools currently only
available through command line interfaces. Additionally, expanded training and support on
selected topics would reduce the learning curve for both content authors and site
administrators. With an effective, intuitive authoring tool, authors can be less dependent on
administrators to clean up their content, and administrators can be freed up to perform more
appropriate site management tasks.

Recommendations for serving small budget or static sites:


Based on our findings, the CMS Discovery Project team recommends the following:

Volume License:
• Contribute 3 (see above for product evaluation)

Develop or enable, and promote:


• AFS web GUI for managing ACLs, files, and some versioning -- explore a suitable
WebDAV product, such as Xythos, or develop web interface for CVS or RCS
• Test and offer newer cgiemail version and other new approved CGI scripts
• Custom error pages to reduce redirect proliferation
• IS&T server with blogging tool
• An effective link checker, such as Xenu or Dreamweaver's built-in site link checker utility
• Web developer's utilities, such as the AIS IE toolbar, Checky for Mozilla, and the Firefox
web developers toolbar
• RSS

Training: Athena Web-Site Hosting Information Technology certificate, encompassing
sessions on the following topics:
1. Assessing and Improving Your Web Publishing Business Process
2. Authoring with Contribute 3
3. Managing Contribute 3 Users and Web Sites
4. Implementing a Custom Events Calendar
5. CSS: proper table-free page layout, efficient formatting, printer-friendly versioning
6. Accessibility and X/HTML code standards
7. Web writing style: bullets vs. narrative, inverted pyramid style, succinct language
8. Using DW MX 7 templates and Contribute 3 effectively for site maintenance
9. Server Side Includes for easy maintenance of footers, navigation, and content
repurposing.

VII. The Task to IS&T

This is an opportunity for IS&T to improve and expand its current web-related services.
Some of these services are fixes to what is already in place as free offerings to the
community; others can become part of a tiered system of fee-paid services.

1. Create an IS&T CMS Services team that includes web development and business
process consultation, Lenya product development, template design, training, product
support, and hosting/network management. Because there is a great deal of overlap
(and significant distinctions) between content management and knowledgebase
technology, this team could productively share resources and collaborative services
with a Knowledgebase team.

2. Consultation services. For sites and DLC's of all sizes, IS&T should provide
consultation and recommendations on the right product or service for the customers.
A set of guidelines (See Appendix H) for discussion and decision-making should be
provided to facilitate this crucial first step.

3. Put Contribute 3 through the SWRT and volume licensing processes.

4. Develop Lenya functionalities. The following development work will be required:

• Editing can be erratic and validation unpredictable; needs to be more consistent;
some of this is training-related.
• Add web-editable css to file menu, instead of editing files on server.
• Documentation for end-user, maintainer/admin
• Better image/asset selection in authoring mode.
• Improve browser-based html editors
• Import pre-loaded MIT users and groups.
• Finish MIT cert integration, automatic login. Allow passwords, too.

Desirable additional development for Lenya:

• Lucene search engine integrated into Lenya
• Workflow management
• Cocoon framework allows development of dynamic components incorporating
jsp's, database connectivity, velocity templates, portal functionality, etc.
• WebDAV fosters offline editing with Dreamweaver or other desktop editing
tools, including MSWord, OpenOffice, etc.

5. For both Lenya and Contribute, training must be part of the services offered by
IS&T, running the full spectrum of step-by-step how-to’s with screen shots,
quickstarts, hands-on training, and custom consulting. Creating documentation is
an absolute must for both Lenya and Contribute. Quickstarts can be offered for
authoring with both products. Fee-paid services and training can include proper
design and implementation of templates. Additional revenue can be garnered through
hosting services, data migration assistance, and consultation for DLC's who choose to
do their own hosting.

6. Investigate the feasibility of Atomz as an enterprise offering, while maintaining its
ASP model and excellent customer service. IS&T should explore Atomz performance
capabilities, considering the possibility of supporting an Atomz implementation on the
MIT network while preserving the application ownership by Atomz. Should Atomz
prove a viable service, IS&T should cost effectively manage the community contracts
with Atomz, allowing the greatest value for the cumulative MIT dollars. IS&T should
also work out a contractual arrangement that grants rights to the source code in case
of Atomz' failure to thrive.

7. Develop new tools for Athena locker owners and promote them. Making Athena
easier to use for site administrators who are unfamiliar with Linux will allow users to
take fuller advantage of the great service that Athena is and will give the small site
owners with small budgets CMS-like functionalities without the CMS investment. A
few examples:

• Looking into WebDAV offers the prospect of more intuitive file management and
could provide some measure of versioning. If WebDAV is not the answer, then
developing a web interface for RCS or CVS may be the right path.
• Review and release a more recent version of cgiemail. There are added
functionalities and improved security in the later versions.

For more detail, see Small site recommendation.

8. The WebPub user group and listserv must become a much more proactive
service to the MIT community. Many of the small site recommendations would
make excellent topics for monthly meetings and would promote existing services and
features that many site owners do not know are already available, such as SSI or
integrating the event calendar. Additionally, it would provide a venue for promoting
newly developed tools, like newly approved Perl scripts. Focus group participants
stated that, had they known that applications like Tech Time and the Event Calendar
were in development, they would not have tried to build their own comparable
applications. Keeping the community informed of development efforts can save DLC
labor and would gain much goodwill in the community. The demand for improved
service and communication is out there. The webpub list is large, and is just the right
vehicle to meet that demand.

Appendices

Appendix A. Project Charter

a. Project Justification
A content management system (CMS) is a set of tools that integrate and automate the various phases
of a publishing enterprise. It allows content authors to go on line to create and update their own
sections of a collaborative publication, accommodates quality control by editorial staff and provides
tools for workflow management, including version tracking, messaging, and varying levels of content
approval. Once entered into the system and validated, new and updated content can be edited and
prepared for publication in appropriate media, including print, Web, and CD-ROM. A CMS might
also facilitate layout and production by automatically formatting content for the target medium or by
preparing content for autoformatting (to the fullest possible extent) in allied applications.

A number of administrative offices and academic departments across the MIT campus have a need for
such a system. OpenCourseWare has adopted one, at least for the near term; the Reference
Publications Office has another; and various Web sites (e.g., MIT World) are also using content
management systems. The purpose of this project is to determine whether there is value in deploying
such a system or systems at an enterprise level.

b. Discovery Questions
How widespread is the need for or interest in a CMS?
What would a CMS need to do in order to accommodate the workflow of the various administrative
and academic publishers?
Are there sufficient common needs that one or more systems deployed at an enterprise level would be
more efficient than present arrangements?
Can currently deployed systems be expanded or adapted to accommodate other users?
How would an enterprise-level CMS be financed, if recommended?

c. Approach to the Work


It is unlikely that a single product can meet the workflow and publishing requirements of all potential
users. The team should therefore consider a range of products that address a range of needs. This can
be done by surveying the needs of various departments and offices within the MIT community
through interviews and focus groups; by consulting with those offices that have already adopted a
CMS, to build on what they have learned; and by evaluating commercial and open-source CMS
products to identify the solutions that most closely meet, or can most readily be adapted to meet,
Institute requirements.

d. Expected outcomes
A survey of current CMSs on campus and their types and use.
One or more lists of functional and technical requirements for a CMS.
An evaluation of costs and benefits, in terms of potential time and workload savings for the various
offices that would use a CMS.
A recommendation, with cost estimate, for purchasing or building products that meet these
requirements, or can be modified to meet them.
If an enterprise CMS or CMSs would add value, then:
• A plan for funding the purchase of an appropriate CMS.
• A plan for documentation and training.
• A plan for maintenance and support.

Appendix B. The Survey

About your existing content:

1. What kind of data do you need to publish? Check all that apply:

• Narrative copy
• Table-style data (directory listings, etc.)
• Forms
• Brochures
• Other (specify):

2. In what file formats does your content typically originate?

• Paper documents
• MS Word docs
• Quark, Pagemaker, or other Desktop Publishing format
• Image files
• Existing web pages
• Database (dynamic or export files)
• Portable Document Format (pdf)
• Other (specify):

3a. Do you publish some content in both web and print versions?

Yes No

3b. If "Yes," which ones?

• MS Word docs
• Quark, Pagemaker, or other Desktop Publishing format
• Portable Document Format (pdf)
• Other (specify):

4. What is your web site publication cycle? (How often might a document be updated?)

• Daily/Constantly
• Weekly
• Monthly
• Academic Term or Semi-Annually
• Annually

Tell us about your primary web site structure and support:

5a. Approx. how many pages:
5b. URL:

6. How much staffing to maintain? Please enter number of individuals; does not have to be full-time.

• Web site administrator
• System administrator
• Content authors

7. Approximate person-hours per week to maintain:

8. Is your site mostly static or dynamic or both?

• Dynamic (database driven)
• Static
• Both

9. How is your web site hosted?

• Maintain own web server hardware and software
• Athena locker
• Hosted by I/S services
• Other:

10. Would you be willing to invest resources in web site management or hosting services?

Yes No

About Content Management (definition of CMS):


11. Do you currently use a Content Management System product?

Yes No
12. If yes, rate how much you like it on a scale of 1-5, where 1 is yucky and 5 is awesome:

1 2 3 4 5 Product Name:
13. Do you have someone who centrally approves content before it's published or do your
individual authors write and publish on their own authority? Describe your content approval
process:

14. What is currently the most difficult aspect of managing your web site?

15. What benefit do you most hope a content management system could provide for you?

16. Many CMS products require the use of templates. You may be able to customize the overall look and
feel of the site, but the flexibility an author has in the layout of individual pages may be limited to varying
degrees.
How important to you is the ability to control content layout at the page level?

1 Not very important 2 3 4 5 Extremely important


17. What else would you like to tell us about your web publishing needs?

Whether or not you currently use a CMS, please tell us a little about yourself:

First Name:
Last Name:
Email:
Title:
Department:

May we contact you for further discussion? Please check for Yes:

Your web site role/responsibilities:

• System Administrator
• Programmer/Coder
• Content Author/Writer
• Graphic Designer
• Content Approver
• Other role(s):

Submit

Appendix C. Product Matrix

Open Source (more or less)


zope/plone lenya OpenACS SiteMaker RTFM
Platform
• zope/plone: python; apache; Zope's object database, ZODB, but can connect to other db's through odbc.
• lenya: Version 1.0 RC1, java-based; apache cocoon; uses xml and xslt extensively; formerly known as "Wyona CMS". Should work well on any platform that can support apache, tomcat, cocoon, etc. Can be deployed on Unix or Windows.
• OpenACS: Requires AOLserver (open source webserver), TCL (open source) and either PostgreSQL (OSS) or Oracle; runs on most any Unix platform including Solaris and Linux.
• SiteMaker: java, apache, WebObjects, any db
• RTFM: Almost any Perl platform: Unix, windoze, MacOS X. Solaris works well.

Licensing/Business Model
• zope/plone: GPL. Zope.com can be consulted for a charge. Plone also under the GPL, but can be licensed if desired. Zope4edu, a zope.com/Duke project, does not seem to have a public license, but I am not sure. This may be worth inquiry on licensing and progress.
• lenya: Apache License v.1.1. Main office is located in Switzerland. Support available through wyona.com
• OpenACS: OpenACS source is under GPL. AOLserver is under Mozilla license; and PostgreSQL is under BSD license.
• SiteMaker: Developed by UMich, but licensing sold to GVC who does all development and resells the product. Development here would have strings, though could be done if only for our own use. Also offers ASP.
• RTFM: GNU General Public License, version 2. Consulting/Custom development available.

Single Site vs. Enterprise
• zope/plone: Zope can do this without problem. Easy enough to assign groups to different sites. Most done through a GUI. If you want to use Vhosts, then this is best done through Apache from my experience.
• lenya: Enterprise.
• OpenACS: Either. The software components can handle high loads, and OpenACS easily supports multiple independent "subsites" and hostname-based "virtual servers" (all administered through the GUI).
• RTFM: Enterprise. Easy to isolate "classes" and even multiple instances running different versions, on one server host.

Integration with MS Office
• zope/plone: There are a few modules out there for this. One that seems to be the best and leading is MSWord Document and the other is wvWare. Both haven't been updated in some time. I have never used them, but there seem to be a lot of people who have. There was a person by the name of Ross Lazerus over at Harvard that worked with it a while back; he may be a good contact.
• lenya: None.
• OpenACS: None, except documents can be stored for downloading as opaque format.
• SiteMaker: None.
• RTFM: None. (this is a good thing.)

Authoring browser platforms
• zope/plone: Uses DHTML and Python. There are a few WYSIWYG editors available, and you can configure external editors to interface with plone/zope. The one with plone is rumored to be decent, but again, I have never tried to install it or use it. Java authoring add-on allows cross-platform.
• lenya: Works with any modern browser; supports inline WYSIWYG editors. Site editors do not have to learn xml. For developers, plugin available for Eclipse java IDE.
• OpenACS: Any modern browser. Enter HTML or plain text in TEXTAREA boxes, or upload local files. EditThisPage module allows previewing.
• RTFM: Works with any modern browser (requires CSS).

Ease of templating
• zope/plone: The beauty of Zope. To change, update, and build templates is really easy. You do need to get used to the structure of Zope to find it easy. Python is occasionally necessary if you want to get detailed, but you can get by with dhtml, which should not come difficult to someone who is accustomed to maintaining a site. Templates can use javascript, if desired.
• lenya: Unknown
• OpenACS: Extensive template language similar to Server-Side Includes (SSI), supports variable substitution, conditionals, iteration, etc. Templates are hierarchical so look & feel can be dictated by one top level template. Only downside is most modules require editing templates on server.
• SiteMaker: Can use templates, including those with javascript. Occasional workarounds are required, particularly with navigational elements as some aspects of navigation are "dynamically" controlled through the authoring interface.
• RTFM: Hard right now, requires authoring Mason pages. (Mason is like JSP for Perl.) Could be as simple as we want to make it.

Code base consistent with MIT community skill sets
• zope/plone: I would imagine that this would not be a problem. The learning curve with Zope and Plone is pretty intimidating at first, but after spending a few hours with it, it becomes pretty straightforward. I do not know if there are any experts in Python here at MIT, but I truly believe you can get by just on DHTML.
• lenya: Makes heavy use of cocoon xml publishing framework; built around off-the-shelf components from the apache software stack (e.g. cocoon, tomcat 1.4, java, Xalan, Xerces, etc.), so should be within MIT's reach; curious to know if anyone is already using Cocoon on campus?
• OpenACS: OpenACS evolved from ArsDigita Community System (ACS) created by MIT community member (Greenspun). Although some components are a bit exotic (AOLserver, TCL) there are groups using it now on campus. SloanSpace is major OpenACS developer.
• SiteMaker: Can sit on almost any database, and the rendering is through java, but the framework is proprietary WebObjects.
• RTFM: There is considerable local expertise in RT (the companion tracking system to RTFM). Large and active worldwide user community is also helpful.

How much code mucking / maintenance needed
• zope/plone: I believe quite a bit at first. It is pretty labor intensive to get a production product. By no means a ready out of the box product. But with templates provided to the customer from a service provider, I believe this is an easily maintained product.
• lenya: XML-centric architecture, startup may be complicated. What's the pace of development (bug-fixes, adding new features) in the Lenya open source community?
• OpenACS: Probably half-FTE devoted to maintenance, but it would be fun :-). Some initial development and documentation needed.
• RTFM: Lots of development and maintenance needed in the near future since RTFM is "young". Should be more stable in a year or so. It is used in production now in places.

Access control
• zope/plone: If we can tap it into Moira, then simple. We authenticate it against LDAP/Active Directory in Econ, so I do not think this would be a problem or very difficult.
• lenya: Administrative interface allows advanced users to monitor the CMS and perform configuration tasks
• OpenACS: Sophisticated fine-grained hierarchical permission system designed to make it easy to build communities.
• SiteMaker: Can create user groups, but not fine-grained control. Permissions are site wide, with the exception of data tables.
• RTFM: Very good; Fine-grained permissions and ability to configure groups as roles.

Security reputation
• zope/plone: Works through Apache...
• lenya: Built on Apache software stack
• OpenACS: Excellent. In production use in many academic environments. (see DotLRN.org)
• RTFM: Quite good; also it is built on a quality foundation (Apache, Perl 5.8, rdbms).

Output to non-html formats
• lenya: PDF
• OpenACS: None in general; some in certain modules.
• RTFM: Whatever you want to code in Perl and Mason.

Performance
• zope/plone: On my small site we have had great performance. I have not read about any performance-related issues.
• OpenACS: Demo sites are fast; my kludged untuned demo is very quick.
• RTFM: Reasonably fast at serving pages. Its foundation, RT, has proven robust in very large installations.

Integration with kerb/certs (single sign on)
• zope/plone: can do
• lenya: Other than ssl support through apache, unknown; will probably require development by MIT
• OpenACS: AOLserver has module to integrate OpenSSL (like MIT Apache) but a little work is needed. Looks quite possible (and we can get help from Sloan if they haven't done it already).
• SiteMaker: Already developed to use kerberos authentication.
• RTFM: Good; integrates cleanly with MIT Personal Web Certificates.

Cost (ballpark)
• zope/plone: free unless Zope.com is consulted.
• lenya: Free
• OpenACS: Free.
• RTFM: Nothing but our time. Some cost for MySQL accessories (hot backup tool) if we choose to use them.

Comments
• zope/plone: No sftp; would have to build interface using SSL. Java WYSIWYG add-on meets platform needs. Can work with any programming/scripting language plug-in, so not locked into python. Templates not constrained to vanilla look and feel.
• lenya: Special feature: output to PDF (benefit of extensive xml support). Features: Revision Control, Scheduling, a built-in Search Engine, separate Staging Areas, and Workflow; uses XML/XSLT throughout. Concerns: barriers to entry; may be initially complex to support; how much of Cocoon do we need to know to support/extend Lenya? How big is the Lenya community? Does it have momentum or is it relatively small-scale? In use at the University of Switzerland.
• OpenACS: Much more than a CMS. OpenACS has a large number of optional modules including several different kinds of CMS, workflow, surveys, ecommerce, etc. It would be a valuable resource to have on campus. OpenACS has a sizable and active user/developer community. Biggest weakness is the terrible documentation. We would need to provide better end-user documentation to deploy it here.
• SiteMaker: Seems to work best as an umbrella for small sites, e.g., faculty sites within a department that could inherit the template.
• RTFM: See BP's demo of a simple presentation front-end. Currently RTFM is mostly a powerful framework and toolkit. Has a pretty good administrative UI for forms-based authoring and editing of "documents". The "customer" UI must be developed (custom) for each site but a generalized solution is possible if we develop it.


Small Site Tools


Contribute Manila (Frontier?)
Platform
• Contribute: Win 98, SE, 2000, XP; Mac OS X 10.1.5 and later
• Manila: Frontier is built around an integrated object database. Frontier has a built-in Web server, but can also work with IIS, Apache and other Web servers using static rendering in Manila.

Licensing/Business Model
• Manila: The licensing includes an auto-updater for product updates.

Single Site vs. Enterprise
• Contribute: single or site license
• Manila: license is per URL

Integration with MS Office
• Contribute: yes, but different for Mac and PC. Mac may only attach docs as files or links
• Manila: None. But you can upload Office (or any other) files

Authoring browser platforms
• Contribute: Mac functions limited
• Manila: WYSIWYG authoring, except for Netscape on PC, and Safari and IE on Mac.

Ease of templating
• Contribute: supposedly
• Manila: Select from "themes" or paste in your own html/javascript/etc.

Code base consistent with MIT community skill sets
• Contribute: yes

How much code mucking / maintenance needed
• Contribute: remains to be seen

Access control
• Contribute: yes

Security reputation
• Contribute: fair: uses SFTP; password protection now in place on app start up because Contribute auto connects to all sites at start up

Output to non-html formats
• Contribute: no; Contribute will open a doc in another app which can then be uploaded
• Manila: None, except a machine-readable export file.

Performance
• Manila: Offers full text search, but it degrades performance. Not used at HLS. Searching is done by date instead. Slashdot can crash it. To check performance you can check the daily hit rankings to get a feel for the hit rate it can sustain (check blogs.salon.com).

Integration with kerb/certs (single sign on)
• Contribute: yes; exception is no cert support in Mac client

Cost (ballpark)
• Contribute: $79 per single academic license
• Manila: The price is $899 per year for each subscription; academic pricing for qualified academic users. According to HLS, cost is $300 academic pricing per site/URL.

Comments
• Contribute: May be good for interim use till fuller product is purchased or developed.
• Manila: blogging tool grown into cms; can't be used with Athena unless you enable static rendering (sftp?). Offers RSS syndication.
Offers RSS syndication.

Appendix D. SiteMaker Evaluation

SiteMaker is a java/WebObjects/sql-db product that is not quite open-source, not quite
proprietary. It was developed at the University of Michigan and licensing rights were sold to
GVC. Purchase of a license runs in the $15,000 neighborhood. There is a provision that allows
educational institutions to buy a license that includes source code access (in the $50K ballpark).
This would essentially grant all rights to the source code, although no royalty rights.

Site Management: This product lends itself to the management of small sites. The site
author/administrator can add new “Sections” which are the equivalents of site pages. The flat
directory structure, however, quickly makes managing anything but the smallest site a
cumbersome process. Uploaded files, including site images, are displayed laundry list style,
forcing the user to scroll through all files associated with the site whenever editing or managing
the site. There is no conceptual or actual hierarchy to the site architecture, so the product suits only very
small sites. The only large site configuration for which this tool might be appropriate would be a
collection of small subsites that can inherit a style from a parent umbrella site. The developer has
indicated that a virtual directory structure will be part of the next product upgrade, and that file
management capabilities will be offered through the incorporation of WebDAV. This might
enable the product to handle sites with more than a handful of pages.
Incorporating a template and hosting the sites might fall under fee paid services offered by IS&T.
For those small site owners or departments without technical human resources, such a service at
a reasonable cost may be desired. SiteMaker offers virtual hosting and could be configured for
static rendering to Athena lockers, so that URL’s for existing sites may be preserved.

Content Authoring: SiteMaker has an authoring limitation that is shared to some degree by all
products we reviewed, with the exception of Contribute: the WYSIWYG vs. cut-and-paste
approach for marking up content. In SiteMaker, there is a WYSIWYG authoring interface that
does not require the user to understand html. However, the non-html savvy user is limited to
simple content: only straight narrative. SiteMaker will insert <p> tags and will allow the user to
select the font style and color for highlighted text. In fact, changing the font attributes results in
non-compliant html, inserting deprecated font tags which will override any associated style
sheet. Therefore, SiteMaker’s WYSIWYG editor is actually counterproductive to good html
authoring. It should be noted that the developer himself does not recommend the built-in editor,
and recommends cutting and pasting tagged content from another editor. Authoring with
SiteMaker should be done in conjunction with Dreamweaver, or with the incorporation of an in-
line third-party editing tool, such as EditLive.
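
As a rough illustration of the markup problem described above, deprecated presentational tags can be detected mechanically before content is pasted into a CMS. The following is a minimal sketch in Python using only the standard library. The function name, the class name, and the list of tags are illustrative assumptions; none of this is part of SiteMaker or of any product reviewed here.

```python
from html.parser import HTMLParser

# Presentational elements deprecated in HTML 4.01 (assumed list, for illustration).
DEPRECATED_TAGS = {"font", "center", "basefont"}


class DeprecatedTagFinder(HTMLParser):
    """Collects (tag, line, column) for each deprecated start tag seen."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <FONT> and <font> both match.
        if tag in DEPRECATED_TAGS:
            line, column = self.getpos()
            self.found.append((tag, line, column))


def find_deprecated_tags(markup):
    """Return a list of (tag, line, column) for deprecated tags in markup."""
    finder = DeprecatedTagFinder()
    finder.feed(markup)
    finder.close()
    return finder.found


if __name__ == "__main__":
    sample = '<p>Office hours: <FONT color="red">Tuesday</FONT></p>'
    for tag, line, column in find_deprecated_tags(sample):
        print(f"deprecated <{tag}> at line {line}, column {column}")
```

A check of this kind could be run on content produced by any editor, which is why it is shown independently of a particular authoring tool.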

Another difficulty for the content author is the lack of an intuitive interface for referencing site
files within anchored hyperlinks. Determining the path to include in the link’s URL is at best a
workaround that requires right-clicking, copying, toggling, and pasting between windows.
Clumsy methods of capturing URLs for links are not exclusive to SiteMaker, but this is an issue that
must be addressed for any CMS product to consider itself user-friendly.

With all these limitations, why would a site owner in the MIT community choose this product
over authoring a site with Dreamweaver and hosting it in an Athena locker? SiteMaker does
offer one unique feature that might compensate for other shortcomings, especially if those shortcomings are
adequately addressed in future product releases. The unique feature is the incorporation of data
tables that can be served to site visitors. This allows for interactivity and can be used to great
effect, e.g., students viewing a faculty member’s real-time office hours schedule, and signing up
for open slots on the spot. Users seeking to reserve meeting rooms could do so from any location
institute-wide.

Please refer to the Issues document (Appendix E) for further details on requested features for
future releases. Many of these suggested features have already been incorporated into the
development plans for version 3.5 of SiteMaker.

Conclusion: SiteMaker has much to offer as a tool, but like all cms-type products, it has its
limitations as well. The authoring interface needs to be enhanced with either more functionality
or access to a third-party html editor. And until the developers can offer a hierarchical site
structure, or at least the semblance of one, it is not practical for any but the smallest sites.
SiteMaker is probably best suited for those sites that wish to umbrella many small sites that
share a look and feel, such as a department web site that houses individual faculty sites. On the
plus side, it offers significant value to those small sites that might wish to serve structured
content in an interactive manner, through use of its data table feature.

Appendix E. SiteMaker Issues List

Priority Issue Response


Priority: High
Issue: Can this be authenticated through Kerberos?
Response: File Enhancement Package (FEP) in next release will include:
• HTTP file access using webdav will allow authors to mount the site volume.
• Directory structure for improved file management
• A new sub-type for the Links section type will display files through hyperlinks
• New section type which allows the author to specify an html file that holds tagged content (allowing authoring in DW without needing to copy and paste back into SM).

Authoring:

Priority: High – Accessibility issue.
Issue: On the upload image page, could “File Description” be used for alt text?
Response: This can be added.

Priority: High
Issue: On the upload file pages, it would be nice to designate the applicable section.
Response: Accomplished in FEP.

Priority: Display - Medium; Edit - Low
Issue: On the revise file screen, it would be nice if the filename of the file you are editing is displayed, and could be edited.
Response: Could display, but editing the name could create problems.

Priority: High
Issue: How can a team preview a site?
Response: This can be added by giving the author the ability to enter a list of email addresses. Recipients will receive a link with a token that will allow them to view the page for an author-specified length of time.

Priority: ?
Issue: Given that content may end up being tagged and pasted from another application, can a html/javascript/? de-bugger utility be added to detect product-specific (e.g., MM) tags and paths?
Response: No such utility exists in any CMS.

Priority: Medium to Low
Issue: How to meta-tag? Can we incorporate a search engine beyond what’s built in for data tables?
Response: Meta-tagging can be added, which will make MIT’s googling effective.

Priority: High – user-friendly authoring in addition to DW is a necessity.
Issue: How to easily attach an html editor? Could the WYSIWYG be expanded to include un/ordered lists and bolding? And can the WYSIWYG elements be turned off if we don’t want authors overriding style rules?
Response: The existing WYSIWYG does not work very well and it does use font tags. A more effective approach might be to incorporate a third-party in-place editor (like Lenya does). All browser-based editors will introduce browser/platform issues.

Priority: High
Issue: The authoring textarea box must be much larger, given that there’s only one place to enter content.

Site Organization:

Priority: High – user-friendly authoring and file management is a necessity.
Issue: The path is not obvious for putting links to site files into content. Maybe have the “File Description” field specify hyperlink text that you can later reference in content, and not have to worry about the path (a la Manila)?
Response: Currently, the workaround is to right click the file from the main editing page, and then paste the path into the section content in question. One possibility worth pursuing is adding the ability to select from a pull-down list of site files to the WYSIWYG which will then insert the anchor tag wherever the cursor is positioned in the content. Another option is exploring the Manila model.

Priority: High
Issue: Organize uploaded files into their appropriate sections, with subdirs for images. The flat laundry list is very difficult to work with.
Response: FEP

Priority: High
Issue: How to make a new page (not tied to navigation) within a section is not intuitive. This is really about having subordinate pages within a section.
Response: Going to stew on this one.

Priority: High
Issue: How to use more than one template per site (not counting data table templates)?
Response: This can be added by providing a pop-up list of templates on the section editing form.

Priority: High
Issue: Can you use CSS?
Response: Currently you can incorporate in-line style tags, but you could conceivably have an external style file that is referenced in the template.

Issue: It would be nice to be able to change the section type after it’s been created because you often have to set the template/navigation before or in parallel with developing content.
Response: No way (understandably). This becomes a training issue.

Issue: How large a site can SiteMaker handle? What about performance?
Response: This is a hardware issue, not a limitation of SM. The only hit to performance would be serving the data tables.

Priority: High
Issue: Can the pages be rendered statically to a locker or other server off the SiteMaker web app server? If not, is there another way to simulate/preserve existing sites’ URL’s?
Response: Yes, for all site pages except data table sections. Can use virtual hosting to preserve URLs.

Priority: Medium-High
Issue: Horizontal navigation.
Response: Already proposed.

Tables:

• It’s hard to map fields for importing if you can’t see the order of fields coming in.
• Date-time input mask hint would be nice. In fact, importing dates is a misery. Can there be a simple date type?
• Having the Import button at the top doesn’t send the message that you must click at the end. In fact the order of tasks is exactly opposite to the display order, as the file path disappears if you enter it before the other things.
• How do I display table records within a published section page after the section has already been created? Do I have to delete and recreate the section to preserve the nav?
• Editing table rows with restricted access can only be accomplished through “submit and view” from the appropriate section. Is that correct? Can there be a “hidden” navigation displayed for groups?

Thoughts:

• “Send message” features can be used as phony workflow for approval/lifecycling.
• Table data types include “file” which is useful for managing news releases, etc., or is that better served by the links section type?
• I like the appointments example of the data table.
• So, would this be suitable for academic departments with lists of events, publications, and faculty? Not for the unsavvy administrator. Would the FL model work for IS service?
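
The team-preview response in the Authoring issues above describes a link carrying a token that lets recipients view a page for an author-specified length of time. One common way to build such a token is to sign the page identifier and an expiry time with a server-side secret. The following is a minimal sketch in Python under that assumption. It does not describe how SiteMaker implements or plans to implement the feature, and the names `SECRET_KEY`, `make_preview_token`, and `check_preview_token` are hypothetical.

```python
import base64
import hashlib
import hmac
import time

# Hypothetical server-side secret; a real deployment would generate and store its own.
SECRET_KEY = b"replace-with-a-random-server-side-secret"


def make_preview_token(page_id, lifetime_seconds, now=None):
    """Return a token granting preview access to page_id until it expires."""
    now = time.time() if now is None else now
    expires = int(now) + int(lifetime_seconds)
    payload = f"{page_id}|{expires}".encode("utf-8")
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode("ascii")
            + "."
            + base64.urlsafe_b64encode(signature).decode("ascii"))


def check_preview_token(token, page_id, now=None):
    """Return True only if the token is authentic, unexpired, and for page_id."""
    now = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
        token_page, expires_text = payload.decode("utf-8").rsplit("|", 1)
        expires = int(expires_text)
    except ValueError:
        # Malformed token: wrong shape, bad base64, bad text, or bad number.
        return False
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return False
    return token_page == page_id and now < expires
```

The token would be appended to the preview link mailed to each address on the author's list. Because the expiry time is part of the signed payload, a recipient cannot extend the viewing period by editing the link.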

Appendix F. Plone Evaluation

Overview

Plone is built atop the Zope web application development framework. Plone leverages the
existing content management framework (CMF) within Zope, adds numerous features and a
user-friendly interface to make it into a very capable out-of-the-box CMS solution.

By default, Plone comes ready to run as a web-portal (like Yahoo) with built in calendars,
feedback forms, public logins, etc. Plone outputs valid xhtml and tries to adhere to the latest web
usability standards as much as possible. Beyond this one may customize (or "skin") the Plone UI
to better suit any kind of content environment (portal or non-portal). Functionality that is not
needed for a more static site may be turned on or off, etc. Many of the appearance configuration
options are XML-driven. In addition, all of the normal web application development tools
available through Zope may be used to add or extend functionality in Plone. Zope includes its
own application server, so there is no need to use the Apache web server, although Zope/Plone is
fully compatible with Apache. Zope also includes an integrated database, ZODB, which is used
for revision control by the content management framework.

For administrative tasks Plone makes use of the underlying Zope Management interface (ZMI) to
offer a suite of advanced tools for site, user account, and style management. There are many
options for customizing the look-and-feel of Plone sites. As open-source projects go, the overall
Plone administrative user interface seems more complete than most. However, the number of
administrative and configuration options can be difficult to navigate. It leaves one with the
impression of having too many features grafted on top of each other in a somewhat uncontrolled
fashion. In other words, the administrative and configuration option learning curve seems fairly
steep. As Zope is a relatively mature open source project, the sheer number of administrative
and configuration options available through the ZMI looks impressive but using them may be
another story.

The administrative interface could be more consistent and easier to use. Zope is no doubt
a powerful framework that has matured over time, with a large number of contributed modules
and new technologies, but the ZMI has a somewhat bloated, legacy feel. Since the ZMI is the
primary management interface, this leaves us feeling somewhat less enthused
about our ability to effectively manage Zope in the MIT environment.

On the more positive side, there appears to be a very good in-browser editor called Epoz, which
works surprisingly well. The validation is set up more loosely than with Lenya, and performance
was more than satisfactory. Zope, as is well known, is based on Python, which may be used to
extend Plone as well as to write general-purpose web applications.

Finally, Plone benefits from a large and active user community that has led to a feature rich and
rapidly evolving environment.

Features

• Good support for large or small sites
• Personalization and community features are there if needed, but may easily be disabled if
not
• WYSIWYG (e.g. Epoz) or simple form-based content editing
• XML-based UI tools (controls view, forms, portlets, etc.)
• Version control through integrated ZODB database; can also use Oracle, or other
database if desired; XML and compressed data format import/export available
• Compatible with Apache web server
• Python programming environment for development; other languages may also be used
(e.g. perl)
• Supports wide range of authentication and security options (user ACL's, LDAP, other)
• Ability to create own directory structure through menu options, e.g. css, js, etc. files of
the content author's choosing
• WebDAV support
• Static and dynamic content publishing
• Supports Zope page templating language
• Workflow
• Scalability: load balancing, caching, ZEO (Zope Enterprise Objects) and other techniques
to improve performance

Presentation/Templating

From the Markup Contract (http://plone.org/development/teams/ui/p2uicookbook/TheMarkupContract):

Component developers can create all manner of views, forms, and portlets for their content-types
and tools that can be "slotted" into any Plone 2 site. At the same time, designers can take a
design and produce a skin using just CSS which will also work on all Plone 2 sites. The aspect of
the Plone 2 UI that makes this possible is the Markup Contract.

The Markup Contract is a standardized structure for the XML that makes up the Plone UI. This
standard consists of not just tag nesting but also IDs and classes. The contract specifies what
information is present in the XML, and how this information is structured. Included in the
standard are global constructs, such as the portal logo or columns, and local widgets such as the
document description or a simple form field.
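
To make the idea of a markup contract concrete, the sketch below checks that a rendered page contains a required set of IDs and classes, which is the property that lets a CSS-only skin work across sites. It is written in Python with the standard library only. The specific names in `REQUIRED_IDS` and `REQUIRED_CLASSES` are illustrative assumptions modeled on the examples in the quoted passage (portal logo, columns, document description); they have not been checked against the actual Plone 2 contract, and the check covers presence only, not tag nesting.

```python
from html.parser import HTMLParser

# Hypothetical contract entries, named after the examples quoted above.
REQUIRED_IDS = {"portal-logo", "portal-columns"}
REQUIRED_CLASSES = {"documentDescription"}


class ContractChecker(HTMLParser):
    """Records every id and class attribute value found in the markup."""

    def __init__(self):
        super().__init__()
        self.ids = set()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value is None:
                continue  # attribute written without a value
            if name == "id":
                self.ids.add(value)
            elif name == "class":
                # A class attribute may list several space-separated names.
                self.classes.update(value.split())


def missing_from_contract(markup):
    """Return (missing_ids, missing_classes) for the given page markup."""
    checker = ContractChecker()
    checker.feed(markup)
    checker.close()
    return (REQUIRED_IDS - checker.ids, REQUIRED_CLASSES - checker.classes)
```

A component whose output leaves both sets empty can be styled by any skin written against the same contract, without the designer having to inspect the component's markup.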

Documentation

Plone documentation is well organized and more extensive than that of many open source
alternatives, but it is still incomplete. One must regularly consult a mix of online
documentation (e.g. the Plone Book), published material, and listserv archives. The community
is large, so most questions likely have answers, but it can take some effort to piece together
a complete one.

Additional Work

• There was no time to fully convert one of our sample MIT sites to Plone.
• Create page templates
• Turn off portal features to get a better idea of styling options in non-portal mode
• Integration with certificates, probably done the standard Apache way

Comments (Advantages)

• Easy to install
• In-browser content editing is easy using Epoz (browser-specific)
• Many add-on modules which can be installed to extend functionality
• Plone seems more "complete" and polished at this point than many of the other systems we
have reviewed so far.
• Plone is more than Zope
• Python for development

Disadvantages

• Although Python, an object-oriented scripting language, is little-used by IS&T, it is not
difficult to learn.
• Zope Management Interface needs improvement.
• Learning curve may be high

Questions:

• Review underlying architecture. Is the underlying architecture sound?
• Future directions: is Plone being developed in a coherent way?
• Zope templates: how far do they enforce the separation of concerns?

Sites:

Plone sites: (http://plone.org/about/sites)

Appendix G. NCSU Lenya Questions and Answers

Systems and Operations


In our operational environment, we have applications in Perl, ColdFusion, PHP, and JSP as well
as databases using Oracle, MySQL, and Postgres. Can Lenya handle them, and if so, how?
There are two parts to this answer. How does Cocoon/Lenya CMS let you write documents that
interface to these applications at authoring time? Can the underlying Cocoon platform interface to
existing applications written in a variety of languages and databases at run time?
Lenya is only a CMS tool and is not used directly for serving pages at run time. Thus the question of
whether Lenya can handle these existing applications reduces to whether Lenya allows the author to
insert the user interface components into a page at authoring time. For example, can an author
insert into a document the required PHP tags for a registration application written in PHP? The
answer is yes, though the degree of integration depends on how XML-friendly the interface
components are. For example, a JSP need not be well-formed XML. Lenya can certainly
insert a JSP into a document, but if it is not well formed, additional processing by a Cocoon pipeline
may not be possible.
Once a site has been authored, the site can be deployed statically or with any number of server
platforms if there is dynamic functionality. There is no requirement that the deployed site use
Cocoon as the server platform. One compelling reason for using Cocoon as the deployment server is
its ability to inter-operate with a wide variety of existing applications. If the existing application can
produce XML (and most web applications can), it is relatively straightforward to incorporate the
output of the application into the page composition facilities of Cocoon. Interfacing to existing
applications that capture user input can also be handled, though there are issues with transaction
semantics and session control that must be addressed. Cocoon also has a range of out-of-the-box
techniques for interfacing to databases, which are explained below.
How can one query an external RDBMS through Lenya? How is the RDBMS results set handled
by Lenya?
As with the previous question, this question can be answered for both the authoring phase and the
run time deployment phase.
Lenya natively uses the operating system's file system as its document repository. Each document
is stored as a single XML file, including the meta information associated with the document.
Currently Lenya does not use an RDBMS for any persistence. However, since Lenya is based on
Cocoon, there is a wide range of options available for querying an external RDBMS. The simplest
solution is the use of ESQL (http://cocoon.apache.org/2.1/userdocs/xsp/esql.html) within an XSP
Logicsheet (http://cocoon.apache.org/2.1/userdocs/xsp/logicsheet.html). ESQL is a wrapper around
JDBC, so any database for which you have a JDBC driver is supported. The next level up would be to
use native JDBC directly in XSP or in Java transformers you write yourself. At the most sophisticated
level, you can use persistence mapping tools such as Hibernate (http://www.hibernate.org/) that
provide enterprise-scale functionality and performance. In all these cases, result sets are handled by
the facilities of the persistence tool used.
If Cocoon is chosen as the run time application server, you have this same set of database access
methods to implement your dynamic behavior.
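As a rough illustration of the idea behind these options (query results turned into XML that a later processing step can consume), here is a minimal Python sketch. It is not ESQL, XSP, or any part of Cocoon; the in-memory SQLite database, the courses table, and the rowset/row element names are all assumptions made for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(connection, query, params=()):
    """Run a query and wrap each result row in a <row> element."""
    cursor = connection.execute(query, params)
    # Column names come from the cursor, so the XML mirrors the query.
    columns = [description[0] for description in cursor.description]
    rowset = ET.Element("rowset")
    for record in cursor:
        row = ET.SubElement(rowset, "row")
        for column, value in zip(columns, record):
            ET.SubElement(row, column).text = "" if value is None else str(value)
    return rowset


# Hypothetical table and data, held in memory for the example.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE courses (id INTEGER, title TEXT)")
connection.executemany(
    "INSERT INTO courses VALUES (?, ?)",
    [(1, "Intro to XML"), (2, "Databases")],
)
rowset = rows_to_xml(connection, "SELECT id, title FROM courses ORDER BY id")
xml_text = ET.tostring(rowset, encoding="unicode")
```

Once the result set is XML, it can be styled or aggregated like any other XML document, which is the property the answer above relies on.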
Implementation issues - LDAP authentication.
Lenya can manage authentication itself or use an LDAP server.
Structure
What is the file system used?
Lenya uses the standard Java IO library to abstract the file system operations. Therefore, it uses the
native file system provided by the operating system.
How do I use Lenya to create and manage my web site?
This is a large question that will be addressed in the seminar. A brief synopsis is: a development
team will construct a new publication that defines the document types, styles, work flow and
functionality for your web site as part of the setup of Lenya; administrators will add users and
manage their access privileges using Lenya; authors will create document content using Lenya; and
reviewers will reject or publish changes to documents using Lenya.
How do I use Lenya to put content into my web site?
Lenya provides a number of editors to allow you to put or edit content in your web site. This is one
of the distinguishing characteristics of Lenya compared to other CMS products, which provide only a
single editor. Probably the most common approach is to use one of the browser-based WYSIWYG
editors, which allow you to edit content in a manner similar to that of a word processor. We will be
using the BXE editor as an example of this type of editor in the seminar. Lenya also provides a
forms-based editor that allows simple entry of structured content. Lastly, Lenya provides some
facilities for uploading content authored outside of Lenya.
Should I be modifying the cocoon sitemap to use Lenya?
Well, it depends on what your objective is. First, Lenya uses a number of sitemap files to implement
its functionality. Processing starts with the base sitemap file provided by Cocoon. Flow of control is
then passed to a number of Lenya-specific sitemap files that provide standard CMS functionality
such as work flow, revision control, UI, etc. Finally, each publication has its own set of sitemap
files that define the functionality and look and feel for each site. Typically, only the publication
sitemap files need to be written, and this is done when the publication is initially constructed.
Users of Lenya will be unaware of sitemap files.
Can you please help me read and understand your sitemap?
Yes, we will walk through the various sitemap files used by Lenya. To help you understand sitemap
processing, we suggest that you enable debug-level logging in Lenya by modifying the
WEB-INF/logkit.xconf file and examining the logging output in the WEB-INF/logs/sitemap.log file.
There is also a sequence diagram describing the flow of control through the various sitemap files
at the Lenya wiki.
How do I tap into all that "PageEnvelope" stuff? Example, how do I put a value into the
"publication-id"? How do I get a value out of the "publication-id"?
The PageEnvelope stuff is implemented with the Cocoon input module facility. Modules are an
alternative to URI parsing for passing information to a pipeline, typically session information.
Cocoon input modules are defined in the cocoon.xconf file. For example, the declaration of the
page-envelope module is:
<input-modules>
  <component-instance
      name="page-envelope"
      class="org.apache.lenya.cms.cocoon.components.modules.input.PageEnvelopeModule"
      logger="sitemap.modules.input.page-envelope"/>
</input-modules>

Input modules can be referenced in a sitemap file using the {…} syntax. For example, to access the
id of the requested document:

<map:parameter name="documentid" value="{page-envelope:document-id}"/>
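
To make the {module:attribute} lookup idea concrete, here is a small Python sketch of how such a placeholder could be resolved. It is only an analogy, not Cocoon's implementation; the MODULES table and the values in it are hypothetical.

```python
import re

# Hypothetical stand-ins for input modules: each maps attribute names to values.
MODULES = {
    "page-envelope": {"document-id": "/faq/index", "publication-id": "default"},
}

# Matches placeholders of the form {module-name:attribute-name}.
PLACEHOLDER = re.compile(r"\{([\w-]+):([\w-]+)\}")


def resolve(value, modules=MODULES):
    """Replace each {module:attribute} placeholder with the module's value."""
    def lookup(match):
        module_name, attribute = match.group(1), match.group(2)
        return modules[module_name][attribute]
    return PLACEHOLDER.sub(lookup, value)


resolved = resolve("{page-envelope:document-id}")
```

The point of the sketch is that the sitemap author only names the module and the attribute; where the value comes from is the module's concern.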

Let's say I need to make a small subset of content elements required on a subset of webpages
(e.g. FAQs pages). Do I need to create a new Document Type for each class of webpage? Or is
there a way to specify a "subtype" of XHTML with required elements?
XHTML 1.1 specifically supports the ability to define subsets of the overall functionality using its
modularization facilities. Lenya supports this modularization using the RELAX NG XML schema
language. The example default publication that ships with Lenya defines a subset of XHTML.
Is schema validation the only way to implement form validation in Lenya?
Schema validation is used only for validating that a document is valid according to the XML schema
it is declared to be an instance of. Lenya uses a RELAX NG validator for both the BXE WYSIWYG and
the forms-based editors.
Your question seems to be related to a completely different issue: how user input is validated in a
dynamic application that uses XHTML forms. This has nothing to do with XML schemas and thus
schema validation is not applicable. Instead, the dynamic application will perform input validation
either in the browser with JavaScript or in the server using whatever language the server supports.
Cocoon has several form processing components which include input validation.
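For illustration, server-side input validation of the kind described here might look like the following Python sketch. It is independent of Lenya and Cocoon; the field names, the error messages, and the deliberately loose email pattern are assumptions for the example.

```python
import re

# Loose pattern for the example only: something@something.something
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_registration(form):
    """Return a dict of field name -> error message; empty means valid."""
    errors = {}
    name = form.get("name", "").strip()
    email = form.get("email", "").strip()
    if not name:
        errors["name"] = "Name is required."
    if not EMAIL_PATTERN.match(email):
        errors["email"] = "Enter a valid email address."
    return errors


ok = validate_registration({"name": "Ada", "email": "ada@example.edu"})
bad = validate_registration({"name": " ", "email": "nope"})
```

This checks the values a user submitted, whereas schema validation checks the structure of a stored document; the two are unrelated, as the answer notes.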
Search engine
How is the Lenya search tool configured? For example, how would I create a search index of
only a specific Document Type?
Lenya is distributed with the Lucene search library but any other search application can be used.
Lucene is not used directly by Lenya, but is instead used as part of the production delivery
system.
Lucene is highly configurable. It can be easily configured to search only documents that are an
instance of a specific XML schema. Lucene also supports non-XML documents such as Microsoft
Word and PDF using external applications.
What search options (e.g. stemming, phrasing, stop words) can I configure? Where is this done?
How and when are search indices updated?
Lucene uses its own Query Parser to define matches.
Indexing would normally be done whenever the live site is modified. Lucene has a complete API for
controlling the created index. More detail on indexing can be found in the Lucene FAQ indexing
section.
Lenya supports multi-channel publishing (e.g. publish an XHTML doc as a PDF). What is the process
for creating a new publishing output type, for example an Atom XML feed? Schema validation is
currently supported with RELAX NG XML Syntax. Is RELAX NG Compact Syntax supported as
well?
Document types are configured for each publication. The following steps are required:

• The desired document types for the publication are declared in the configuration
file <pub>/config/doctypes/doctypes.xconf.
• For each declared document type, a RELAX NG schema is written and placed in
the directory <pub>/config/doctypes/schemes/.

Currently, the BXE editor does not support RELAX NG compact syntax.
Editor
Rich text editor - to plug-in HTMLarea instead of KUPU
Lenya is architected to allow the use of multiple editors. Currently there is support for the BXE,
Kupu, and Xopus in-browser editors, of which BXE is the best supported. It is possible
to integrate htmlArea but this will require some effort. There are also issues about the range of
supported XML schemas and the guarantees of validity when using editors other than BXE.
Metadata - need to increase the number of metadata elements
Lenya ships with direct support for the Dublin Core metadata standard. It is straightforward to
support other standards in your specific publications.
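As a sketch of what Dublin Core metadata looks like when stored as XML, the following Python snippet builds a small metadata block. The wrapping meta element and the field values are assumptions for the example and do not reproduce Lenya's actual file layout; only the Dublin Core element namespace is standard.

```python
import xml.etree.ElementTree as ET

# Standard namespace for the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_metadata(fields):
    """Create a <meta> element holding one dc:* child per field."""
    meta = ET.Element("meta")
    for name, value in fields.items():
        ET.SubElement(meta, f"{{{DC_NS}}}{name}").text = value
    return meta


# Hypothetical field values.
meta = build_metadata(
    {"title": "Admissions FAQ", "creator": "Web Team", "language": "en"}
)
meta_xml = ET.tostring(meta, encoding="unicode")
```

Supporting another metadata standard in this style would amount to adding elements from a second namespace alongside the dc ones.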
If I'm a web page content provider, how can I build a publication that retrieves customized
content from an existing database-driven tool? Can I do this, or must I also involve a web page
designer? Is there a real example in Lenya?
Again, we must first consider if you are referring to accessing the content from the existing
database at authoring time or production time.
For authoring time, Lenya does not provide an out of the box solution to interfacing to existing
databases that would allow an author to insert content into a document they are authoring.
For production time, any application server can be used to serve existing content from a database.
Lenya does not get involved at all. You can use Cocoon as the application server. Cocoon has
extensive facilities for interfacing to databases. Generally, you will need a web graphic designer to
design the page layout and look and feel. Cocoon has several examples that show database access.
Basic Templating - any documentation?
Templating is provided by Cocoon. All the Cocoon documentation is relevant here.
Explanation of template file structure and how it relates to Lenya as a whole.
Lenya does not enforce a fixed template system. Each publication defines its own templating
approach including the file structure used. Please see the next question for more on this.
Templating options - I know we can use XHTML and XML/XSLT. Are there any additional
templating options?
XHTML, XML, and XSLT are simply underlying technologies. An architecture must be selected first
that employs these technologies to define a templating system. Lenya does not impose any
particular templating approach.
The architecture used by most Lenya publications is based on Cocoon Aggregators. Generally, the
output from multiple XSLT transformers is combined to form the overall document. Each transform
will construct one part of the overall document (for example, the masthead). This is a simple
approach that is similar to that provided by other CMS templating systems.
An alternate approach is to use a single XHTML template document that uses XInclude to aggregate
content. In some ways, this is even simpler than the first approach in that no knowledge of XSLT is
required.
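The XInclude approach can be sketched outside Cocoon with Python's standard library, purely to show the mechanism: a template names the parts, and an include step splices them in. The fragment names and contents below are hypothetical, and the in-memory loader stands in for reading files or pipeline output.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical page fragments, keyed by the href used in the template.
FRAGMENTS = {
    "masthead.xml": "<div id='masthead'>Department of Examples</div>",
    "body.xml": "<div id='content'><p>Welcome.</p></div>",
}

TEMPLATE = """
<html xmlns:xi="http://www.w3.org/2001/XInclude">
  <body>
    <xi:include href="masthead.xml"/>
    <xi:include href="body.xml"/>
  </body>
</html>
"""


def load_fragment(href, parse, encoding=None):
    """Loader for ElementInclude that reads fragments from memory, not disk."""
    return ET.fromstring(FRAGMENTS[href])


page = ET.fromstring(TEMPLATE.strip())
# Replaces each xi:include element in place with the loaded fragment.
ElementInclude.include(page, loader=load_fragment)
```

After the include step, the body element contains the masthead and content fragments in template order, with no XSLT involved.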
Best way to include files in templates - is it a special procedure, or a standard type of include in the
template (e.g., a different DHTML navigation menu)?
Hmmm, what is the best way is subjective. Lenya's approach is to allow the developer to use
whatever templating approach they feel is most effective. The answer to the previous question
provides two alternatives. Many more are possible.
Features
The following are features listed as available in Lenya. We would like clarification and/or more information
on how we would implement/access them, etc.

• Pluggable Authentication - clarification
• Session Management
• Asset Management
• CGI-mode Support
• Content Reuse - to have the Kinderspital piece in the default publication?
• Content Scheduling
• Email To Discussion
• Internationalization - not priority but of interest
• Macro Language
• Server Page Language
• Sub-sites / Roots
• Template Language
• Themes / Skins
• Blog - is the content here integrated with the application, reusable content?
• Document Management - how is this implemented?
• FAQ Management - we plan on implementing an FAQ service on our website
• File Distribution
• Link Management - this is listed as "Limited" but would be useful for us to see it
• Mail Form

Appendix H. CMS Consultation Guidelines

Regardless of the size of your web site or the number of authors and site administrators you
have, your site management team must address these issues in the decision-making process of
choosing a Content Management System.

Do you need Content Management or a Knowledgebase? The first question the customer
should consider is whether or not a content management system is really the right tool for the
job. Is the customer more concerned with search and retrieval of data nuggets? If so, a
knowledgebase may be in order. Is content presentation and publishing more important than
indexing? In that case, a CMS may be what the customer needs.

Platform matters. Every CMS product has platform issues to be taken into consideration, both
at the hosting and authoring levels. While system specs will tell you in a straightforward manner
what your hosting constraints are, the capabilities of authoring tools are much harder to pin
down. While many of the most sophisticated browser-based WYSIWYG tools require the PC version of
Internet Explorer, there are Java-based interfaces that work on a wider range of platforms.
Additionally, there are third-party HTML editors that may be used in place of whatever less
sophisticated tool comes with the product you select. You must become familiar with exactly
which features are available to which browsers on what platforms. Macromedia Contribute is
cross-platform, but in its current version its Mac functionality is very different from its PC
capabilities. Interfaces that separate data from presentation are the most flexible in terms of
platform, but are less intuitive to authors who are not HTML-savvy and may require development. Be
aware both of what's available in the product and what operating systems your authors will be
using, as well as their level of web expertise.

Do you require print publishing? Do you want PDF output of your content? Then you may
want to pay attention to those products, like Lenya, that house data in a structured format, with
the ability to provide content output in XML format for easier repurposing from web to print.
Lenya also offers direct PDF output.

Do you ever intend to use your content as single source but with different templated
layouts? For example, do you display spotlight articles that later become FAQs? If so, you may
not want to include HTML markup in your authoring interface, because any later changes to
content formatting will require manual stripping and re-editing of HTML tags. Your best bet may
be a structured database product that separates presentation from content, with form-based
authoring and no tagging stored with the content.
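
A minimal Python sketch of the single-source idea follows: one content record, stored without markup, rendered through two layouts. The record, the two templates, and the class name are hypothetical and are not taken from any product reviewed here.

```python
from html import escape
from string import Template

# Content stored without presentation markup (hypothetical record).
article = {
    "title": "How do I reset my password?",
    "body": "Use the self-service page & follow the prompts.",
}

# Two layouts for the same content.
SPOTLIGHT = Template('<div class="spotlight"><h2>$title</h2><p>$body</p></div>')
FAQ_ENTRY = Template("<dt>$title</dt><dd>$body</dd>")


def render(template, record):
    """Apply a layout to a content record, escaping text for HTML."""
    safe = {key: escape(value) for key, value in record.items()}
    return template.substitute(safe)


spotlight_html = render(SPOTLIGHT, article)
faq_html = render(FAQ_ENTRY, article)
```

Because the stored text carries no tags, moving the article from the spotlight layout to the FAQ layout changes only which template is applied.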

Do you need tight integration with MS Office or easy WYSIWYG authoring? Then
Contribute may be the right product for you. If your authors are completely unversed in HTML
tagging, or are uncomfortable copying and pasting source code from Dreamweaver into an
authoring form, then Contribute provides the easiest authoring interface of any product we
reviewed.

Content migration will be a painful issue for any CMS product. If you already have a web
site, chances are that you have a lot of static pages, all tagged up. There is no easy way around
stripping out HTML tags before importing the content into the database of a CMS. If you decide to go
with a product that does not separate content from format, you may have an easier time
migrating your data. Either way you must consider the complications of migrating your data and
plan your resource allocation accordingly. If you are planning to start a web site from scratch,
migration is less of an issue than the authoring process.
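
To give a sense of the tag-stripping step, here is a small Python sketch using the standard library's HTML parser. It is an assumption-laden starting point rather than a migration tool: it keeps only text, collapses whitespace, and makes no attempt to preserve structure such as headings or lists.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content and discard tags."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(markup):
    """Return the text of an HTML fragment with tags removed."""
    extractor = TextExtractor()
    extractor.feed(markup)
    extractor.close()
    # Collapse runs of whitespace left behind by removed tags.
    return " ".join("".join(extractor.parts).split())


plain = strip_tags('<p><font color="red">Office hours:</font> <b>9 &amp; 5</b></p>')
```

Real pages will need more care (scripts, navigation, and repeated page furniture all come through as text), which is part of why migration deserves its own resource plan.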

What are the technical resources available to you? Do you choose to support your own web
hosting infrastructure, or do you prefer to receive hosting services from IS&T? Are you
constrained to Athena for any reason? Hosting your own CMS solution will require the
acquisition, installation, and ongoing maintenance of hardware and software. It may also involve
acquiring new technical skill sets within your departmental technical staff.

Do you already have a content management business process? If not, get one before
attempting to implement a CMS solution. A CMS is a supporting technology for a business
process, not a substitute for one. Things to think about in your business process in terms of how
they will translate to your CMS solution: How much workflow do you need, i.e., how many
stages of approval do you need? Do you have agreed-upon authoring conventions? What about
versioning?

Is there top-level buy-in in your department for moving forward toward adapting your
current publishing process to incorporate a CMS? If your authors do not use your CMS,
you've wasted a lot of resources implementing it.

Do you need a dynamic site that generates pages on the fly, or can your needs be met with
static rendering of system-housed content? If your content does not change frequently, you
will get better performance with static pages. Some or all of your site may need to be
database-driven, but don't automatically make that assumption. Your needs may be well served by
static pages generated from a database repository or XML-structured data.
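
Static rendering from structured data can be sketched in a few lines of Python. The XML layout (site, page, slug, title, body), the HTML skeleton, and the temporary output directory are all assumptions for the example; a real CMS would add templates, navigation, and scheduling.

```python
import tempfile
import xml.etree.ElementTree as ET
from html import escape
from pathlib import Path

# Hypothetical structured content for a two-page site.
SOURCE = """
<site>
  <page slug="index"><title>Home</title><body>Welcome to the department.</body></page>
  <page slug="contact"><title>Contact</title><body>Room 101.</body></page>
</site>
"""


def publish(source_xml, output_dir):
    """Write one static HTML file per <page> element; return the paths."""
    written = []
    for page in ET.fromstring(source_xml.strip()).iter("page"):
        html = (
            f"<html><head><title>{escape(page.findtext('title'))}</title></head>"
            f"<body><p>{escape(page.findtext('body'))}</p></body></html>"
        )
        path = Path(output_dir) / f"{page.get('slug')}.html"
        path.write_text(html, encoding="utf-8")
        written.append(path)
    return written


output_dir = tempfile.mkdtemp()
pages = publish(SOURCE, output_dir)
```

The generated files can be served by any web server with no CMS or database involved at request time, which is where the performance benefit comes from.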
