DITA on your Local Disk: Organizing your Suite

Peter Fournier

This series of articles explores the how and why of using DITA XML in the file system: in the early stages of DITA adoption it saves you time and money. You cannot afford to not use DITA.

This article explains what a good file organization looks like and why it is important, even more so for DITA than for older authoring technologies.

The Typical Situation before DITA XML

Authors, you and I, have been using computers to write and publish documents, or information products, since the 1980s. During that time the usual way to organize documents was to create directories in the file system, one folder for every document.

Within each of these directories we have organized information in chapter files and sub-directories, for example a folder of graphics for the document. Depending on your work habits there might also have been sub-folders for engineering specifications, source files for the graphics used in the document and others.

In more advanced contexts we also had folders at a higher level, or even on another machine, that held common items like logos, legal notices, warnings and copyright information.

So, working with FrameMaker or Microsoft Word, we created a file system organization that looked like this:

After DITA XML: Early Stages

In the very earliest phases of the transition to DITA, Samalander recommends converting chapter files into DITA XML.

The only required change at the file system level is to convert the chapter files from ".fm", ".doc" or ".docx" to ".dita". If you are moving from MS Word to DITA you will also have to change your authoring tool by switching to FrameMaker or the equivalent.

You will notice that this strategy contradicts the usual advice about transitioning to DITA XML. The usual advice is to transform your chapter files into hundreds of independent topic, task, concept, and reference files controlled by a CMS. But, in the early stages, why bother? The DITA specification does not require this level of granularity and jumping from chapter-level files to independent topics poses several problems.

It can cost orders of magnitude more than deploying DITA in the file system, thus making the transition to DITA harder to justify.
It increases the training overhead you will impose on your writers.
By requiring a CCMS or CMS it increases both the capital cost of deployment and the time required to make the transition.
It increases maintenance costs.
It forces you to make shaky RoI estimates because of your lack of experience with DITA.
It forces you to choose a CMS or CCMS before you have enough experience with DITA to be confident in your choice.
Worst of all, the standard advice presents a barrier to the adoption of DITA, even though adopting DITA typically leads to a two-, three- or even four-fold increase in productivity, radically improved flexibility, and improved job satisfaction at the writing level.
It cuts independent contractors out of the DITA universe altogether because their clients (and their requirements) are not big enough to justify the capital and maintenance costs required by the standard advice.

So, for now, let's just do a straight transition from chapter-oriented unstructured information to chapter-level structured information in DITA XML on the local file system.

At the file system level this early stage transition to DITA XML looks like this:

After DITA XML: With CONREFs

After you have transitioned from unstructured information to structured information in DITA XML you can begin to realize the benefits and cost savings of DITA XML.

The first, and most important, benefit is information re-use. Normally this occurs when you point out to yourself or your team that DITA XML includes the concept of a CONREF, or Content Reference.

Given that your information is in DITA XML, writers exposed to the concept of CONREFs will immediately start implementing a content re-use strategy. Why? Because changing the same text or graphic in multiple files is boring, really boring. In fact it's the most boring part of technical writing, and the hardest to manage. So what can we do to enable technical writers to take advantage of the concept of re-use through CONREFs? We structure the file system to support re-use with a new folder called "Common".

A "Common" folder makes the creation of CONREFs easy. When a technical writer finds a fragment of content that appears in two or more publications, a simple copy and paste into a new DITA XML file stored in "Common" makes a CONREF to that content easy to do. In the early stages of introducing content reuse, elements such as WARNINGS and CAUTIONS are the most likely to be CONREF'd. In later stages common installation or configuration tasks are the most likely to be CONREF'd.
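To make the idea concrete, here is a minimal sketch of a warning stored once in "Common" and pulled into a chapter by reference. The file names, the ids ("warnings", "hot_surface", "install") and the wording of the warning are hypothetical, and the DOCTYPE declarations that real DITA files carry are left out for brevity. The sketch only shows how a conref value of the form "file#topicid/elementid" breaks down and which element it points to; it is not a full conref resolver.

```python
import xml.etree.ElementTree as ET

# Hypothetical shared file, imagined as Common/warnings.dita
COMMON_WARNINGS = """<topic id="warnings">
  <title>Shared warnings</title>
  <body>
    <note id="hot_surface" type="warning">The surface may be hot.</note>
  </body>
</topic>"""

# Hypothetical chapter file that reuses the warning instead of repeating it
CHAPTER = """<topic id="install">
  <title>Installation</title>
  <body>
    <note conref="../Common/warnings.dita#warnings/hot_surface"/>
  </body>
</topic>"""


def split_conref(value):
    """Split 'file#topicid/elementid' into (file, topic id, element id)."""
    path, _, fragment = value.partition("#")
    topic_id, _, element_id = fragment.partition("/")
    return path, topic_id, element_id


def find_by_id(root, element_id):
    """Return the first element (root included) whose id matches, or None."""
    for element in root.iter():
        if element.get("id") == element_id:
            return element
    return None


chapter = ET.fromstring(CHAPTER)
note = chapter.find(".//note[@conref]")
path, topic_id, element_id = split_conref(note.get("conref"))

# In a real suite the target would be read from disk using `path`;
# here the shared file is already in memory.
target = find_by_id(ET.fromstring(COMMON_WARNINGS), element_id)

print(path, topic_id, element_id)
print(target.text)
```

The referencing element and the target are both "note" elements: a conref pulls content into an element of the same type, which is why warnings and cautions are such easy first candidates.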

At the file system level this early stage of CONREF / re-use looks like this:
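As a rough illustration, the following sketch creates such a layout: one "Common" folder beside one folder per document, each with its own graphics sub-folder. Only the name "Common" comes from this article; the suite and document names are hypothetical, and the sketch works in a temporary directory so that running it leaves nothing behind.

```python
import tempfile
from pathlib import Path


def create_suite(root, documents):
    """Create a "Common" folder plus one folder (with graphics) per document."""
    root = Path(root)
    # "Common" holds content that appears in two or more documents
    (root / "Common").mkdir(parents=True, exist_ok=True)
    for name in documents:
        # One folder per document, as before DITA; the chapter .dita files live here
        (root / name / "graphics").mkdir(parents=True, exist_ok=True)
    # Report the folders that now exist, relative to the suite root
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir()
    )


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical suite with two documents
    folders = create_suite(Path(tmp) / "Suite", ["User Guide", "Install Guide"])
    print("\n".join(folders))
```

Keeping "Common" at the same level as the document folders means every chapter reaches shared content through the same relative path, "../Common/".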

A frequent objection to this strategy is that keeping track of all these CONREF links and making sure they are valid quickly exceeds the author's ability to cope, especially if files are moved around in the file structure. As this help file for DITA Link Fixer explains, this does not need to be a problem. Regular use of the DITA Link Fixer module or the DITA Sanity Checker module will ensure that all your links are valid. Once you know your links are valid, it's a simple matter to rearrange your files and re-run the Link Fixer module, which simplifies re-attaching links to the correct files in their new locations.
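The kind of check involved can be sketched in a few lines. The sketch below is not the DITA Link Fixer or DITA Sanity Checker module, and it does not repair anything: it is a hypothetical helper, find_broken_conrefs, that walks a suite, reads every ".dita" file and reports each conref whose target file or target id cannot be found. It assumes the files are well-formed XML that parses without their DTDs, and the demonstration files are made up.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def find_broken_conrefs(suite_root):
    """Return (source file, conref value, reason) for each unresolved conref."""
    broken = []
    for source in sorted(Path(suite_root).rglob("*.dita")):
        for element in ET.parse(source).iter():
            conref = element.get("conref")
            if conref is None:
                continue
            target_path, _, fragment = conref.partition("#")
            # An empty path means the target is in the same file
            target_file = source.parent / target_path if target_path else source
            if not target_file.is_file():
                broken.append((str(source), conref, "target file not found"))
                continue
            # The last part of the fragment is the id to look for
            wanted_id = fragment.split("/")[-1]
            ids = {el.get("id") for el in ET.parse(target_file).iter()}
            if wanted_id and wanted_id not in ids:
                broken.append((str(source), conref, "target id not found"))
    return broken


with tempfile.TemporaryDirectory() as tmp:
    suite = Path(tmp)
    (suite / "Common").mkdir()
    (suite / "User Guide").mkdir()
    # Hypothetical shared warning
    (suite / "Common" / "warnings.dita").write_text(
        '<topic id="warnings"><title>Warnings</title><body>'
        '<note id="hot_surface" type="warning">The surface may be hot.</note>'
        "</body></topic>",
        encoding="utf-8",
    )
    # Hypothetical chapter: the first conref resolves, the second does not
    (suite / "User Guide" / "chapter1.dita").write_text(
        '<topic id="ch1"><title>Chapter 1</title><body>'
        '<note conref="../Common/warnings.dita#warnings/hot_surface"/>'
        '<note conref="../Common/cautions.dita#cautions/sharp_edge"/>'
        "</body></topic>",
        encoding="utf-8",
    )
    broken = find_broken_conrefs(suite)
    for source, conref, reason in broken:
        print(f"{Path(source).name}: {conref} ({reason})")
```

Running a check like this before and after rearranging files shows exactly which links a move has broken.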

After DITA XML: Keeping up with Releases

Once you have converted to DITA XML, introduced CONREFs and a "Common" folder, the next challenge is keeping up with versions of the document, usually driven by new releases of software or hardware. This usually requires some form of version control, the same kind of version control used by software developers.

If you have implemented version control of your files with SVN or Git, the easiest way to move your files to another release is to use the "Export" function in SVN or Git.

Note: This advice applies to groups and independent contractors who don't want to spend the time becoming experts in SVN or Git branching. If you are an SVN or Git expert you will disagree with this advice. However, for most small groups and independent contractors, avoiding branching, push/pull, clone and Git flow is recommended. For more information on SVN and Git, see "Version Control: Why is it Critical when Authoring DITA XML?"

Create a new folder with the new release number plus the word TEMP (as in "Release 2.0.0TEMP").
Use the "Export" function to copy your existing suite of documents to the new folder.
Note: If you are using SVN or Git, you MUST NOT copy the contents of the old release into the new folder. Both SVN and Git use invisible directories to track changes. A copy and paste may copy these invisible items to the new folder. This will make using SVN or Git in the new folder impossibly difficult. "Export" merely copies the contents of the folder, not the SVN or Git control files.
After exporting your content to the new folder you will want to implement version control for this folder. Use SVN or Git to create a new repository folder and import the contents of the "TEMP" folder into the repository.
Once the import is complete, create another folder, such as "Release 2.0.0", and use the "Checkout" function of SVN or Git to copy files from the repository to the new folder.
Once the checkout is done, you can delete the "TEMP" folder.
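The reason a plain copy and paste causes trouble can be shown with a short sketch. The hypothetical helper below copies a release folder while leaving out the hidden ".svn" and ".git" directories, which is the part of "Export" that matters for the steps above. It is not a replacement for the real "Export" function: a true export takes only the versioned files from the repository, whereas this copy takes whatever is in the folder, including unversioned files and uncommitted changes. The folder and file names are made up.

```python
import shutil
import tempfile
from pathlib import Path


def copy_without_control_dirs(old_release, new_release):
    """Copy a release folder, leaving out the hidden .svn and .git items."""
    shutil.copytree(
        old_release,
        new_release,
        ignore=shutil.ignore_patterns(".svn", ".git"),
    )
    return Path(new_release)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical old release with an SVN control directory inside it
    old = Path(tmp) / "Release 1.0.0"
    (old / ".svn").mkdir(parents=True)
    (old / ".svn" / "wc.db").write_text("control data", encoding="utf-8")
    (old / "Common").mkdir()
    (old / "Common" / "warnings.dita").write_text(
        '<topic id="warnings"><title>Warnings</title></topic>', encoding="utf-8"
    )

    new = copy_without_control_dirs(old, Path(tmp) / "Release 2.0.0TEMP")
    copied = sorted(p.relative_to(new).as_posix() for p in new.rglob("*"))
    print(copied)
```

The new folder contains the content but none of the control files, so it can be imported into a fresh repository without confusing SVN or Git.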

If you are not using SVN or Git, simply copy the contents of one release to another.

At the file system level this early stage of release-specific suites looks like this:

Summary

This article has provided a brief outline of the basic organization principles required to work with DITA on your local disk, no CMS required. It has also outlined how you can enjoy the most important, early and continuing benefit of using DITA: reuse. The graphs on the Samalander home page show how significant this benefit can be.

The next article, The "Common" Concept, will explore the concept of "Common" in greater detail.
