
Acrobat PDF Consultant and Accessibility Checker


Technical Note #5410
Version: Acrobat 6.0

ADOBE SYSTEMS INCORPORATED

Corporate Headquarters
345 Park Avenue
San Jose, CA 95110-2704
(408) 536-6000
http://partners.adobe.com

May 2003

Copyright 2003 Adobe Systems Incorporated. All rights reserved. NOTICE: All information contained herein is the property of Adobe Systems Incorporated. No part of this publication (whether in hardcopy or electronic form) may be reproduced or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written consent of the Adobe Systems Incorporated. PostScript is a registered trademark of Adobe Systems Incorporated. All instances of the name PostScript in the text are references to the PostScript language as defined by Adobe Systems Incorporated unless otherwise stated. The name PostScript also is used as a product trademark for Adobe Systems implementation of the PostScript language interpreter. Except as otherwise stated, any reference to a PostScript printing device,PostScript display device, or similar item refers to a printing device, display device or item (respectively) that contains PostScript technology created or licensed by Adobe Systems Incorporated and not to devices or items that purport to be merely compatible with the PostScript language. Adobe, the Adobe logo, Acrobat, the Acrobat logo, Acrobat Capture, Distiller, PostScript, the PostScript logo and Reader are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States and/or other countries. Apple, Macintosh, and Power Macintosh are trademarks of Apple Computer, Inc., registered in the United States and other countries. PowerPC is a registered trademark of IBM Corporation in the United States. ActiveX, Microsoft, Windows, and Windows NT are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. UNIX is a registered trademark of The Open Group. All other trademarks are the property of their respective owners. 
This publication and the information herein is furnished AS IS, is subject to change without notice, and should not be construed as a commitment by Adobe Systems Incorporated. Adobe Systems Incorporated assumes no responsibility or liability for any errors or inaccuracies, makes no warranty of any kind (express, implied, or statutory) with respect to this publication, and expressly disclaims any and all warranties of merchantability, fitness for particular purposes, and noninfringement of third party rights.

Acrobat SDK Documentation Roadmap


Getting Started
Getting Started Using the Acrobat Software Development Kit PDF Specification Acrobat SDK Release Notes Acrobat Development Overview Acrobat Developer FAQ Acrobat Plug-in Tutorial Reader Enabling Acrobat SDK Samples Guide PDF Reference Manual

Upgrading Plug-ins from Acrobat 5.0 to Acrobat 6.0

Acrobat Core API Acrobat Core API Overview Acrobat Core API Reference

Extended API for Plug-in AcroColor API Reference

PDF Creation APIs and Specifications Acrobat Distiller Parameters Acrobat Distiller API Reference pdfmark Reference

JavaScript Acrobat JavaScript Scripting Reference Acrobat JavaScript Scripting Guide Programming Acrobat JavaScript Using Visual Basic

Acrobat Interapplication Communication (IAC) Acrobat IAC Overview Acrobat IAC Reference

ADM Programmers Guide and Reference

Catalog API Reference

Digital Signature API Reference

Forms API Reference

PDF Consultant Accessibility Checker

Search API Reference

Spelling API Reference

Using the Save as XML Plug-in

Weblink API Reference

Contents

Preface
    What Is In This Document
    Who Should Read This Document
    Prerequisites
    Other Useful Documentation
    Conventions Used in This Book

Chapter 1  Overview
    The PDF Consultant and Accessibility Checker Plug-in
    Acrobat Agents
        Reclassifying and Revisiting
        Agent Architecture
    How the Consultant Works
        Removing or Modifying Objects
        Reclassifying Objects
        Consultant Itinerary
    Important Issues For Consultant Development
        Maintaining the Traversal Stack
        Deciding Who Does The Work
        Avoiding Agent Collisions
        Avoiding Visitation Collisions

Chapter 2  Using the Consultant
    Importing the Consultant HFTs Into a Plugin
        HFT Functions
    Creating and Destroying Consultants
    Registering Agents With The Consultant
    Starting The Consultant
    Using the Traversal Stack
    Consultant Object Type Identification
        Object Type Subclassing
    Creating Your Agent Class
        Agent Constructors
        Recognizing Objects of Interest
        The Post Processing Stage

Chapter 3  Consultant API Reference
    Consultant Management Methods
        ConsultantCreate
        ConsultantDestroy
        ConsultantRegisterAgent
        ConsultantResume
        ConsultantSetStart
        ConsultantSuspend
        ConsultantTraverseFrom
        PDFObjTypeGetSuperclass
    Consultant Traversal Stack Methods
        ConsStackGetCount
        ConsStackIndexGetArrayIndex
        ConsStackIndexGetDictKey
        ConsStackIndexGetObj
        ConsStackIndexGetTypeAt
        ConsStackIndexGetTypeCount
        ConsStackIndexIsArray
        ConsStackIndexIsDict
        ConsultantGetNumDirectVisited
        ConsultantGetNumIndirectVisited
        ConsultantGetNumUniqueIndirectsVisited
        ConsultantGetPercentDone
        ConsultantNextObj

Chapter 4  Declarations and Callbacks
    Data Declarations
        ConsStack
        Consultant
        ConsultantAgent
        tagConsultantAgent
        ConsultantAgentAction
        PDFObjType
        RegAgentFlag
    Callbacks
        ConsAgentObjFoundCallback
        ConsAgentPostProcessCallback
        ConsPercentDoneCallback


Preface

The PDF Consultant and Accessibility Checker is a plug-in that walks through a PDF document, checking for items and/or making modifications. The Consultant achieves this through the use of Agents. You will write Agents to perform your specific actions on PDF documents.

What Is In This Document


This document contains the following chapters:

Chapter 1, Overview, provides a general description of the PDF Consultant and Accessibility Checker, and presents development issues to keep in mind.
Chapter 2, Using the Consultant, describes how to use the PDF Consultant and Accessibility Checker, and provides examples.
Chapter 3, Consultant API Reference, provides a complete reference for the PDF Consultant and Accessibility Checker methods.
Chapter 4, Declarations and Callbacks, provides a complete reference for the data declarations and callbacks used by the PDF Consultant and Accessibility Checker methods.

Who Should Read This Document


If you are writing plug-ins that process PDF documents, and wish to take advantage of Acrobat's PDF Consultant and Accessibility Checker plug-in, you should read this book. The plug-in is provided for you; you only need to create an instance of it in your code. You will be writing specific Agents that interact with the plug-in.

Prerequisites
This document assumes that you are familiar with:

the Acrobat product and the provided API
how to write an Acrobat plug-in
the PDF format


Other Useful Documentation


The Acrobat SDK includes many other books that you might find useful. When mentioned in this document, those books will often appear as live links (blue italic links). However, in order to actually jump from this document to those books, those books must exist in the proper directories within your computer's file system. This happens automatically when you install the SDK onto your system. If for some reason you did not install the entire SDK onto your system and you do not have all of the documentation, please visit the Adobe Solutions Network web site (http://partners.adobe.com/asn) to find the books you need. Then download them and install them in the proper directories, which can be determined by looking at the Acrobat SDK Documentation Roadmap, included at the beginning of each book in the SDK. You should certainly be familiar with the Acrobat Core API Reference document, and should keep it handy while writing your Consultant plug-in. You should also have the PDF Reference available.

Conventions Used in This Book


The Acrobat documentation uses text styles according to the following conventions. Each font is followed by its uses, and each use by an example.

monospaced
    Paths and filenames
        Example: C:\templates\mytmpl.fm
    Code examples set off from plain text
        Example: These are variable declarations:
                 AVMenu commandMenu,helpMenu;

monospaced bold
    Code items within plain text
        Example: The GetExtensionID method ...
    Parameter names and literal values in reference documents
        Example: The enumeration terminates if proc returns false.

monospaced italic
    Pseudocode
        Example: ACCB1 void ACCB2 ExeProc(void) { do something }
    Placeholders in code examples
        Example: AFSimple_Calculate(cFunction, cFields)

blue
    Live links to Web pages
        Example: The Acrobat Solutions Network URL is: http://partners.adobe.com/asn
    Live links to sections within this document
        Example: See Using the SDK.
    Live links to other Acrobat SDK documents
        Example: See the Acrobat Core API Overview.
    Live links to code items within this document
        Example: Test whether an ASAtom exists.

bold
    PostScript language and PDF operators, keywords, dictionary key names
        Example: The setpagedevice operator
    User interface names
        Example: The File menu

italic
    Document titles that are not live links
        Example: Acrobat Core API Overview
    New terms
        Example: User space specifies coordinates for...
    PostScript variables
        Example: filename deletefile


Overview

The PDF Consultant and Accessibility Checker Plug-in


Acrobat comes with a plug-in called the PDF Consultant and Accessibility Checker. This plug-in walks through PDF documents, visiting each object and determining its type and other statistics. It can make certain modifications or repairs to the PDF document. The objects that the Consultant visits can range from simple, primitive types such as CosStrings to higher-level objects such as Images.

Users call the Consultant to run on a particular PDF document, choose which tests or repairs to run, then view the results and/or select repair options.

The Consultant visits the objects in a PDF document according to instructional flags you pass to it. After the Consultant has visited an object, the object may be different. The Consultant reclassifies modified objects before moving on to the next object.

As the Consultant traverses a PDF document, gathering objects of interest, it can perform the following functions:

walk a given hierarchy
keep track of cycles
ensure that objects are only visited once, if desired
recognize object types
keep a traversal stack list

Acrobat Agents
The Consultant accomplishes its task by using Agents, which are pieces of code you design to gather the statistics and recommend to the Consultant the necessary repairs of the document. Separate Agents handle each area of analysis and repair. The Agents inform the Consultant of the particular types of objects in which they are interested by registering with the Consultant.

When the Consultant has one or more Agents registered, it hands each object of the requested type(s) in the current document to each of the Agents that requested that type. The Consultant gives objects to each Agent in turn, depending on the order in which they registered.

The Consultant must intelligently determine the type of each object it comes across (both direct and indirect), so it can pass appropriate objects to the Agents, or replace or remove ones that it has been instructed to handle itself. The Consultant communicates directly with Agents, keeping lists of which Agents are interested in which objects, and obtaining instructions from the Agent as to an object's visitation status.
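The registration-order hand-off described above can be pictured with a small, self-contained model. This is only a sketch under stated assumptions: RegToyType, RegToyAgent, and RegToyConsultant are hypothetical stand-ins invented for illustration and are not part of the Consultant API, which uses PDFObjType, ConsultantAgent, and ConsultantRegisterAgent.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the SDK's PDFObjType values.
enum class RegToyType { Page, Annot, Font };

// Hypothetical agent: a name plus the object types it registered for.
struct RegToyAgent {
    std::string name;
    std::vector<RegToyType> wanted;
};

// Hypothetical consultant that only models "who gets this object, and in what order".
class RegToyConsultant {
public:
    // Agents are remembered in the order in which they register.
    void registerAgent(const RegToyAgent* agent) {
        assert(agent != nullptr);
        agents_.push_back(agent);
    }

    // Names of the agents that would receive an object of `type`,
    // listed in registration order.
    std::vector<std::string> recipientsFor(RegToyType type) const {
        std::vector<std::string> out;
        for (const RegToyAgent* agent : agents_) {
            for (RegToyType wantedType : agent->wanted) {
                if (wantedType == type) {
                    out.push_back(agent->name);
                    break;  // list each agent at most once
                }
            }
        }
        return out;
    }

private:
    std::vector<const RegToyAgent*> agents_;
};
```

In this model, an agent that registers first for a type is always listed before one that registers later for the same type, which mirrors the ordering rule stated in the text.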


Agents can perform their own repairs and modifications to the PDF document, and can return a corrected object to serve as a replacement for the object the Consultant originally passed to them. Agents can also modify the Cos graph themselves (including adding or removing Cos objects or modifying contents such as keys or array elements).

The Consultant keeps a list of each object (starting with the object which began the traversal) that it visits on its way to any given object. This list is referred to as the traversal stack, and Agents must be careful not to make any modifications that would affect any of the objects in it. For this reason, Agents can specify a post-processing callback that the Consultant calls once it has finished traversing the entire document. See Important Issues For Consultant Development for more detailed information on this point.

Reclassifying and Revisiting


If an Agent or the Consultant itself modifies an object, the Consultant reclassifies that object, possibly changing its type. Agents also pass to the Consultant the visitation flags that determine how object types should be visited. Limiting the traversal is important, because PDF documents are arbitrarily complex graphs and there are often many ways to reach a single object. If the Consultant has reclassified an object, it may also change the way that object is revisited. You must keep this in mind as you develop your Agents.

Agent Architecture
Your Agent code will primarily consist of a structure, as defined in the ConsExpt.h header file. Acrobat provides a C++ wrapper class to facilitate writing agents; you can derive an agent class from this base class. See Creating Your Agent Class for an example agent from which you can generalize.

How the Consultant Works


The Consultant completes a full, non-recursive traversal of the Cos graph that comprises a PDF document, keeping track of cycles as it goes. Note that there is no guarantee that objects will be visited in any particular order, only that the Consultant will visit all objects (except isolated objects such as the DocInfo object or previously orphaned objects) at least once, provided no Agents modify the graph such that graph paths are removed or redirected.
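As an illustration of what a non-recursive, cycle-aware traversal looks like, here is a minimal sketch over a toy graph. It is not the Consultant's implementation: TraversalGraph, TraversalFrame, and traverseFrom are hypothetical names, objects are reduced to integer indices, and the graph is assumed to be well formed (every reference points at an existing node).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy graph: node i refers to the nodes listed in graph[i].
using TraversalGraph = std::vector<std::vector<std::size_t>>;

// One entry of the explicit stack: an object plus how far we are through its references.
struct TraversalFrame {
    std::size_t node;       // object currently on the traversal stack
    std::size_t nextChild;  // index of the next reference to follow
};

// Visits every node reachable from `start` exactly once, without recursion.
// Returns the nodes in the order in which they were first visited.
std::vector<std::size_t> traverseFrom(const TraversalGraph& graph, std::size_t start) {
    assert(start < graph.size());
    std::vector<bool> visited(graph.size(), false);
    std::vector<std::size_t> order;
    std::vector<TraversalFrame> stack;  // path from `start` to the current node

    visited[start] = true;
    order.push_back(start);
    stack.push_back({start, 0});

    while (!stack.empty()) {
        TraversalFrame& top = stack.back();
        if (top.nextChild == graph[top.node].size()) {
            stack.pop_back();  // every reference of this object has been followed
            continue;
        }
        std::size_t child = graph[top.node][top.nextChild++];
        if (visited[child]) {
            continue;  // cycle or shared object: do not descend again
        }
        visited[child] = true;
        order.push_back(child);
        stack.push_back({child, 0});  // `top` is not used after this push
    }
    return order;
}
```

At any moment, the explicit stack holds the path from the starting object to the current one, which is the role the traversal stack plays in the text. A node that nothing refers to is never reached, just as isolated objects are not visited.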

Removing or Modifying Objects


If an Agent removes, replaces or modifies an object, the Consultant will pass to other Agents the modified objects (if they are encountered). For example, Dict A refers to Dict B. The first Agent replaces all references to Dict B with references to Dict C, so when later Agents receive Dict A from the Consultant, they will see the references to Dict C.
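The Dict A, Dict B, Dict C example can be mimicked with a toy dictionary that is handed to several callbacks in turn. This is a sketch only: SharedToyDict, SharedToyAgentFn, and handToAgents are hypothetical names, and a real Agent works with Cos objects rather than strings.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Toy dictionary: key -> name of the object that the key refers to.
using SharedToyDict = std::map<std::string, std::string>;

// Toy agent: a callback that may inspect or modify the object it is given.
using SharedToyAgentFn = std::function<void(SharedToyDict&)>;

// Hands the same object to each agent in turn, so later agents
// observe whatever earlier agents changed.
void handToAgents(SharedToyDict& obj, const std::vector<SharedToyAgentFn>& agents) {
    for (const SharedToyAgentFn& agent : agents) {
        assert(agent);  // every registered agent must be callable
        agent(obj);
    }
}
```

If the first callback rewrites every "DictB" value to "DictC", a second callback that reads the dictionary afterwards sees only "DictC".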


Reclassifying Objects
In general, the Consultant reclassifies an object after an Agent is finished performing operations on it. It is possible that, in the process of modifying the object, the Agent may actually have changed the type of the object. This could mean that Agents originally interested in the object might no longer wish to see it. So the Consultant must reclassify an object after each Agent has finished with it. Since the default behavior in "revisit upon reclassification" mode is to revisit objects when they are reclassified, new objects added in this mode will actually be visited again if they are reclassified as the traversal continues.

Determining the higher-level type (the PDFObjType, as the Consultant code calls it) of a given Cos object is not always easy. The Consultant looks not only at the construction of objects (what keys are present in the object) but also at how the object was reached (through what particular object type and via what keys). In this way, objects that are interpreted differently depending on how they are traversed can be properly identified.

For example, if the Consultant is looking at an object containing "/Type /Annot" and "/Subtype /Widget", it is clear that the object is a Widget Annotation; however, when traversed via the AcroForms section, that same object is actually a form field. It is because of such possible dualities that the Consultant can operate in a "revisit upon reclassification" mode that would visit the above object twice: once as a Widget Annotation and again as a form field.
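The Widget Annotation versus form field duality can be sketched as a classifier that takes both the object's keys and the route by which it was reached. The names below (CtxToyType, CtxReachedVia, CtxToyDict, classifyByContext) are hypothetical and greatly simplified; the Consultant's actual identification engine and its PDFObjType values are defined by the SDK, not by this sketch.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified stand-ins for object types and traversal context.
enum class CtxToyType { WidgetAnnot, FormField, Unknown };
enum class CtxReachedVia { PageAnnots, AcroFormFields };

// Toy Cos dictionary: key -> name value.
struct CtxToyDict {
    std::map<std::string, std::string> entries;
};

// Classifies using both construction (which keys are present) and
// context (how the object was reached).
CtxToyType classifyByContext(const CtxToyDict& dict, CtxReachedVia via) {
    auto type = dict.entries.find("Type");
    auto subtype = dict.entries.find("Subtype");
    bool looksLikeWidget =
        type != dict.entries.end() && type->second == "Annot" &&
        subtype != dict.entries.end() && subtype->second == "Widget";
    if (!looksLikeWidget) {
        return CtxToyType::Unknown;
    }
    // Same keys, different role depending on the path taken to reach the object.
    return via == CtxReachedVia::AcroFormFields ? CtxToyType::FormField
                                                : CtxToyType::WidgetAnnot;
}
```

The same dictionary yields two different answers depending on the second argument, which is why a mode that revisits on reclassification may present one object to Agents twice.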

Consultant Itinerary
The Consultant process works like this. (See Consultant API Reference for details on how to write the actual code to do these steps.)

1. You create a Consultant.
2. You create an Agent.
3. You register your Agent with the Consultant, with information as to which object types are of interest.
4. The user calls the Consultant to work on a particular PDF document.
5. The Consultant creates a traversal stack to keep track of where it is in walking through the PDF document.
6. The Consultant begins traversing the PDF document. If Agents have instructed the Consultant to modify or remove the object, it does so, returning the appropriate replacement.
7. The Consultant pushes the object onto the traversal stack and sends a message to the Agent that the object was found.
8. The Agent sends messages to the Consultant about what to do to objects: replace them, remove them, revisit them later or not.
9. When the entire PDF document has been traversed, the Consultant calls the Agent back to perform any postprocessing repairs it might want to do.
10. The Consultant unregisters all Agents.
11. You remove the Agent object.
12. You remove the Consultant object.

Important Issues For Consultant Development


First, you must decide if you actually do want to use the Consultant. The Consultant walks through an entire PDF document. If you only need to modify a small number of objects, and you know how to locate those objects, it makes more sense to write the object-finding code yourself. If you do decide to use the Consultant, there are a number of issues that are important to keep in mind when you are developing your Agent. You should make your decisions about all of these issues before you write your code, so you will know exactly what to expect. Some of these issues lead to errors that are difficult to debug, so it is best to understand them all while creating your plug-in.

Agents must not modify objects on the traversal stack while the Consultant is still walking through the document; otherwise infinite loops and other problems can occur.
Decide which piece of code is actually going to do the work (the Consultant or the Agent) in order to optimize your plug-in.
The order in which Agents interact with the Consultant is very important, as Agents can modify objects that other Agents want to see.

Maintaining the Traversal Stack


The Consultant keeps track of the objects it has visited in the PDF document in the traversal stack. If an Agent were to modify an object such that it affected the traversal stack, the entire process would be derailed. The Consultant might no longer know if it had visited an object, which could cause infinite loops, multiple, unnecessary visitations, or objects that remain unvisited. It is extremely important that the integrity of the traversal stack remain undamaged. You must design your Agent carefully so as to avoid this problem. You can use the postprocessing step of your Agent to handle many repair tasks, thereby avoiding dealing with objects still on the traversal stack.

Deciding Who Does The Work


If the Consultant performs object modifications, it does so as it goes through its traversal. Modifications that might affect an object's type or properties would alter the traversal stack and corrupt the traversal process. For these kinds of modifications, set up an Agent to perform the tasks in the postprocessing step.


For instance, suppose an Agent wants to remove annotations while there are form widgets present in the document. There are a few ways the Agent can remove the annotations while the Consultant is working, but they all have problems:

Calling the Agent for all annotations and removing them at the Cos level does not clean up the forms tree if there are Widget Annots in the document.
Calling the Agent for all Annots and using PDPageAnnotRemove modifies the page object, which might still be in the traversal stack.

The best solution in this case is to enumerate all of the Annot objects by having the Consultant look for Annot objects and keep a list of them, then let the Agent call PDPageAnnotRemove on them in the postprocessing step.
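The collect-now, remove-later pattern recommended above might look roughly like the following toy model. Everything here is an assumption made for illustration: DeferToyPage and DeferredRemovalAgent are hypothetical, annotations are plain integers, and the erase call merely stands in for PDPageAnnotRemove, whose real signature is given in the Acrobat Core API Reference.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy page: just the ids of the annotations it owns.
struct DeferToyPage {
    std::vector<int> annotIds;
};

// Hypothetical agent that records annotations during traversal and
// removes them only after the traversal has finished.
class DeferredRemovalAgent {
public:
    // Called while the traversal is running: record, do not modify.
    void objFound(std::size_t pageIndex, int annotId) {
        pending_.push_back({pageIndex, annotId});
    }

    // Called once after the traversal: now it is safe to touch the pages.
    // Returns the number of annotations actually removed.
    std::size_t postProcess(std::vector<DeferToyPage>& pages) {
        std::size_t removed = 0;
        for (const auto& entry : pending_) {
            assert(entry.first < pages.size());
            std::vector<int>& ids = pages[entry.first].annotIds;
            auto it = std::find(ids.begin(), ids.end(), entry.second);
            if (it != ids.end()) {
                ids.erase(it);  // stand-in for the real removal call
                ++removed;
            }
        }
        pending_.clear();
        return removed;
    }

private:
    std::vector<std::pair<std::size_t, int>> pending_;  // (page index, annotation id)
};
```

The design choice is that objFound never touches a page, so nothing that might still be on the traversal stack changes while the walk is in progress.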

Avoiding Agent Collisions


When running multiple Agents on a document, the order in which you register your Agents is the order in which the Consultant will hand them objects. If your earlier Agents modify objects, they may change the objects in such a way that they are missing important information or are of a different type than they were originally. For example, one Agent might consider it correct to remove a given field of an object, while another would complain that the field was not present and would want to add it. If the first Agent modified an object with respect to its type, subsequent Agents would no longer think they were interested in it, and their processing would not take place. You must group your Agents so that you do not run multiple Agents with conflicting goals at the same time. A rarer problem could occur with self-referential objects. For example, if Dict A contains a reference to itself and the first Agent replaces Dict A with Dict B (which would still contain a reference to Dict A), another Agent cannot work with Dict B until the internal reference is changed. But if you are running the Agents concurrently, there will be a collision. This would be a case best handled by the Consultant.
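The effect of registration order can be made concrete with a toy dispatch loop that reclassifies the object before each hand-off. This is a sketch under assumptions: OrderToyObj, OrderToyAgent, classifyToy, and offerToAgents are hypothetical, and the "type" is simply read from a "Type" key rather than determined by the Consultant's identification engine.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy object: a bag of key/value pairs.
using OrderToyObj = std::map<std::string, std::string>;

// Toy classifier: the type is whatever the "Type" key says.
std::string classifyToy(const OrderToyObj& obj) {
    auto it = obj.find("Type");
    return it == obj.end() ? std::string("Unknown") : it->second;
}

// Hypothetical agent: the type it registered for and what it does when called.
struct OrderToyAgent {
    std::string wantedType;
    std::function<void(OrderToyObj&)> onFound;  // may modify the object
};

// Offers `obj` to each agent in registration order, reclassifying before
// each hand-off. Returns how many agents actually received the object.
int offerToAgents(OrderToyObj& obj, const std::vector<OrderToyAgent>& agents) {
    int called = 0;
    for (const OrderToyAgent& agent : agents) {
        assert(agent.onFound);
        if (classifyToy(obj) != agent.wantedType) {
            continue;  // after an earlier change, no longer of interest
        }
        agent.onFound(obj);
        ++called;
    }
    return called;
}
```

If an agent that changes the "Type" key is registered first, a later agent registered for the original type is skipped; with the order reversed, both run. Grouping Agents so that conflicting ones never share a run avoids depending on this ordering.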

Avoiding Visitation Collisions


Objects that have multiple classifications can be reached from multiple paths. In such cases you might allow the Consultant to revisit such objects if, and only if, they have been reclassified on a new path. However, you must take care not to allow revisitation under other circumstances, or the Consultant could miss objects, which would defeat the reason for using a mode that considers object classification.
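One way to picture "revisit if, and only if, reclassified" is a visit log keyed on the pair of object and classification rather than on the object alone. The class below is a hypothetical sketch, not the Consultant's bookkeeping; in the real API the revisitation rule is selected with a RegAgentFlag value when the Agent is registered.

```cpp
#include <set>
#include <string>
#include <utility>

// Hypothetical visit log: an object is visited again only when it turns up
// under a classification it has not been seen with before.
class ReclassVisitLog {
public:
    // Returns true if the object should be visited now, and records the visit.
    bool shouldVisit(int objectId, const std::string& classification) {
        return seen_.emplace(objectId, classification).second;
    }

private:
    std::set<std::pair<int, std::string>> seen_;  // (object id, classification)
};
```

With this rule an object reached twice as the same type is visited once, while an object reached once as a Widget Annotation and once as a form field is visited twice.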


Using the Consultant

Importing the Consultant HFTs Into a Plugin


The Consultant exports its functions via an HFT (see the Acrobat SDK documentation for a discussion of HFTs). The variable your plugin uses for the HFT must be of type HFT and named gConsultantHFT. The SDK provides a macro, Init_PDFConsultantHFT, in ConsHFT.h, for easy initialization of the HFT variable; see Example: Importing Consultant HFTs.

The Consultant's HFT deals with the general operation of the Consultant, including the creation and deletion of Consultant objects and Agent registration. (See Creating and Destroying Consultants for more details.)

The Consultant plugin must be loaded before other plugins can import its HFT. Importing the Consultant's HFT is the same as importing any other plugin's HFT; refer to the Acrobat documentation for details on how to do that. To get access to the HFT, you must include ConsHFT.h for the core Consultant API. In a plugin, the importReplaceAndRegisterCallback should contain the code that imports the HFT; see the following example.

Example: Importing Consultant HFTs

    HFT gConsultantHFT = (HFT)NULL;

    ACCB1 ASBool ACCB2 DumpAllObjectsAgentImportHFTs( void )
    {
        ASBool bRetVal = false;

        /* Import the Consultant's main HFT */
        gConsultantHFT = Init_PDFConsultantHFT; /* Macro in ConsHFT.h */

        if( gConsultantHFT != (HFT)NULL )
        {
            bRetVal = true;
        }
        else
        {
            /* Put in error message about the absence of the Consultant HFT... */
        }

        return bRetVal;
    }


HFT Functions
The Consultant defines the following functions for HFT usage:
    ConsultantCreate
    ConsultantDestroy
    ConsultantTraverseFrom
    ConsultantRegisterAgent
    ConsultantSetStart
    ConsultantNextObj
    ConsultantGetPercentDone
    ConsultantGetNumDirectVisited
    ConsultantGetNumIndirectVisited
    ConsultantSuspend
    ConsultantResume
    ConsStackGetCount
    ConsStackIndexGetObj
    ConsStackIndexGetTypeCount
    ConsStackIndexGetTypeAt
    ConsStackIndexIsDict
    ConsStackIndexIsArray
    ConsStackIndexGetDictKey
    ConsStackIndexGetArrayIndex
    PDFObjTypeGetSuperclass
    ConsultantGetNumUniqueIndirectsVisited

Creating and Destroying Consultants


The Consultant's HFT allows you to create a Consultant for your own use. Once you have finished writing your Agent class, you are ready to register it with the Consultant and begin processing documents.

You should keep your Agent separate from the Consultant object; that is, do not make the Consultant object a member of your Agent class. Use a plugin as the owner for both the Consultant and your Agent object. As there is some memory overhead in creating a Consultant, create a Consultant object only when it is needed, not before. If your target application is a plugin, the most logical place to perform all operations is in the menu item execute procedure. Whether or not it makes sense to destroy the Consultant object after each execution of the menu item depends on your project.

The Consultant HFT provides the functions ConsultantCreate and ConsultantDestroy for creating and destroying Consultant objects. It also provides the Consultant data type, an opaque type for passing handles to Consultant objects. ConsultantCreate returns a value of that type, and all other HFT functions having the prefix Consultant require one as a parameter.

After each run the Consultant unregisters all the Agents that were registered with it; however, the memory for the Consultant object itself remains, and the object must be explicitly destroyed to free it. Depending on the duties you assign your Consultant, you may want to destroy it after each execution of the menu item that launches it, or you may wish to keep it running. See Example: Registering An Agent With A Consultant for an illustration of creating and destroying a Consultant object.

Registering Agents With The Consultant


Although it is not an error condition for the Consultant to be run on a document when there are no Agents registered, in order to perform any operations on or analysis of the document you must first register your Agent with the Consultant, using the method ConsultantRegisterAgent. Once the Agent is registered with the Consultant, it remains registered until a call to ConsultantTraverseFrom is completed. You must re-register Agents before each successive call to ConsultantTraverseFrom.

When you register an agent, you supply a rule (one of the RegAgentFlag values) for revisitation of objects as the Consultant runs through the document from the starting object. The following example demonstrates registering and running the Consultant.

Example: Registering An Agent With A Consultant

    ACCB1 void ACCB2 DumpAllObjectsAgentExecute( void* pvData )
    {
        AVCursor hCurrentCursor = AVSysGetCursor();
        AVCursor hWaitCursor = AVSysGetStandardCursor( WAIT_CURSOR );
        AVSysSetCursor( hWaitCursor );

        /* Declare volatile Consultant since it's inside a DURING block */
        Consultant volatile hConsultant = (Consultant)NULL;

        DURING
            AVDoc hAVDoc = AVAppGetActiveDoc();
            miAssert( hAVDoc != ( AVDoc )NULL );
            if( hAVDoc != ( AVDoc )NULL )
            {
                /* Create a Consultant object */
                hConsultant = ConsultantCreate( DumpAllObjectsAgentPercentDone );
                miAssert( hConsultant != ( Consultant )NULL );
                if( hConsultant != ( Consultant )NULL )
                {
                    /* Get the current document root */
                    PDDoc hPDDoc = AVDocGetPDDoc( hAVDoc );

                    /* Create our Agent and register it */
                    gDumpAllObjectsAgent = new DumpAllObjectsAgent( hPDDoc );
                    if( (gDumpAllObjectsAgent == (DumpAllObjectsAgent*)NULL) ||
                        (gDumpAllObjectsAgent->IsValid() == false) )
                    {
                        ASRaise( GenError( genErrNoMemory ) );
                    }
                    else
                    {
                        ConsultantRegisterAgent( hConsultant, *gDumpAllObjectsAgent,
                                                 REG_REVISITRECLASS_ALL );

                        /* Start the Consultant */
                        ConsultantTraverseFrom( hConsultant,
                                                CosDocGetRoot( PDDocGetCosDoc( hPDDoc ) ),
                                                PT_CATALOG );
                    }
                }
            }
        HANDLER
            /* ... Destroy Consultant...Free Memory... */
        END_HANDLER

        /* Release the Consultant and the Agent once the traversal is done */
        if( hConsultant != ( Consultant )NULL )
            ConsultantDestroy( hConsultant );

        if( gDumpAllObjectsAgent != ( DumpAllObjectsAgent* )NULL )
        {
            delete gDumpAllObjectsAgent;
            gDumpAllObjectsAgent = ( DumpAllObjectsAgent* )NULL;
        }

        /* Restore the cursor that was active on entry */
        AVSysSetCursor( hCurrentCursor );
    }

Starting The Consultant


The ConsultantTraverseFrom function instructs the Consultant to begin traversing a document, starting at a particular Cos object. The Cos object should be the Catalog of a currently open document. (Later expansions of the API may allow you to specify the object type from which to begin, but currently it must begin at the Catalog.) ConsultantTraverseFrom has no return value and instead raises an Acrobat exception on error. The Consultant API also supplies methods for suspending a traversal (ConsultantSuspend) and resuming it (ConsultantResume).


Using the Traversal Stack


This example demonstrates how to use the traversal stack manipulation functions.

Example: Using The Consultant Traversal Stack

char* GetTraversalString(ConsStack stack, char *traversalString, ASUns32 strLen)
{
    ASUns32 Index, NumItems, CurStrLen;
    char StringUns32[16];

    traversalString[0] = '\0';
    CurStrLen = strlen(traversalString);

    /* Get the number of items in the current traversal */
    NumItems = ConsStackGetCount(stack);
    for(Index = 0; (Index < NumItems) && (CurStrLen < strLen); Index++)
    {
        if((CurStrLen += strlen(TRAVERSAL_SEP)) < strLen)
            strcat(traversalString, TRAVERSAL_SEP);

        /* Add the parent key, if this stack entry has one */
        if(ConsStackIndexIsDict(stack, Index))
        {
            const char* strParentKey =
                ASAtomGetString(ConsStackIndexGetDictKey(stack, Index));
            if((CurStrLen += strlen(strParentKey)) < strLen)
                strcat(traversalString, strParentKey);
        }
        /* Add the parent index, if this stack entry has one */
        else if(ConsStackIndexIsArray(stack, Index))
        {
            sprintf(StringUns32, "%u", ConsStackIndexGetArrayIndex(stack, Index));
            /* The extra 2 accounts for the surrounding brackets */
            if((CurStrLen += (strlen(StringUns32) + 2)) < strLen)
            {
                strcat(traversalString, "[");
                strcat(traversalString, StringUns32);
                strcat(traversalString, "]");
            }
        }
    }
    return traversalString;
}


Consultant Object Type Identification


One of the main features of the PDF Consultant and Accessibility Checker framework is its identification engine. This engine looks at Cos objects in a PDF file and, based on properties of the objects and of the objects' parents, assigns "PDF Object Type" identifiers to them. At a basic level each Cos object has a simple Cos type and attributes, but in the scheme of the document as a whole each object serves a particular purpose. The PDF Object Type assigned to each object represents that object's role in the PDF document.

Some PDF Object Types represent higher-level, conceptually familiar objects such as PT_PAGE (which indicates that the object is a page in the document), while others (such as PT_AADICTIONARY) are more obscure, particularly to those who are not familiar with the PDF document format. PDF Object Types are represented using the enumerated type PDFObjType, which is defined in ConsObTp.h. A good way to see all of the PDF Object Types that the Consultant can identify is to look at the constants defined in that file.

Some objects (in particular many simpler objects such as strings and numbers) are not assigned a particular type. In general the Consultant can identify those objects that are of most use to you. If the Consultant cannot identify a particular object, it assigns the type PT_UNKNOWN to the object. This does not necessarily mean that the object is foreign or malformed (although it can mean that). It may simply mean that the object type is not particularly significant in the PDF document format, and thus the Consultant does not know about it.

Object Type Subclassing


To allow for greater Agent flexibility, the Consultant understands PDF Object Type subclasses and superclasses. Certain PDF Object Types are members of more generic classes of PDF Object Type. Agents can often make use of this information, so the Consultant assigns object types that are actually arrays of types. The Consultant assigns to an object its most specific classification as well as the more generic classes of which the object is a member.

Agent structures include a field called "WantSubclasses" that indicates whether the Agent wants to be called for subclasses of its types of interest as well as for the types themselves. For example, the PDF Object Type PT_ANNOTATION has a number of more specific subclasses such as PT_LINKANNOTATION, PT_LINEANNOTATION, and so on. If an Agent requests only objects of type PT_ANNOTATION and its WantSubclasses member is false, it may not be called back for very many objects. If the WantSubclasses member is true, the Consultant calls the Agent back for objects of all specific annotation types as well as those classified only as PT_ANNOTATION.

This also means that when an Agent retrieves the type of an object, it must specify which type it wants. The types in the array that classifies an object always run from the most specific (at index 0) to the least specific (at the last index in the array).
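The matching rule described above can be sketched in a few lines. This is only an illustration of one reading of the text, not the Consultant's implementation: the numeric type values are hypothetical stand-ins for the constants in ConsObTp.h, and AgentWantsObject is a made-up helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins; the real constants are defined in ConsObTp.h.
typedef std::uint32_t PDFObjType;
const PDFObjType PT_ANNOTATION     = 1;
const PDFObjType PT_LINKANNOTATION = 2;
const PDFObjType PT_PAGE           = 3;

// Returns true if an Agent interested in `wanted` would be called for an
// object whose classification is hierarchy[0..count-1], ordered from the
// most specific type (index 0) to the least specific (last index).
bool AgentWantsObject(const PDFObjType* hierarchy, std::size_t count,
                      PDFObjType wanted, bool wantSubclasses)
{
    assert(hierarchy != nullptr || count == 0);
    if (count == 0)
        return false;
    if (hierarchy[0] == wanted)          // the object is exactly this type
        return true;
    if (!wantSubclasses)                 // subclasses were not requested
        return false;
    for (std::size_t i = 1; i < count; ++i)
        if (hierarchy[i] == wanted)      // the object is a subclass of `wanted`
            return true;
    return false;
}
```

With a link annotation classified as { PT_LINKANNOTATION, PT_ANNOTATION }, an Agent asking for PT_ANNOTATION matches only when wantSubclasses is true.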


Creating Your Agent Class


A minimal Agent class needs only to define the functions declared as virtual in the ConsultantAgentObj class declared in ConsExpt.h. The following example shows this minimal definition.

Example: Minimal Agent Class Definition

#include <stdio.h>      /* for FILE */
#include "ConsExpt.h"

class DumpAllObjectsAgent : public ConsultantAgentObj
{
protected:
    // ----------------- Data Members -------------------
    FILE* m_DumpFile;
    const static PDFObjType s_hAgentObjects[ ];
    const static ASUns32 s_iNumAgentObjects;

public:
    // --------------- Constructor / Destructor ---------
    DumpAllObjectsAgent( PDDoc hPDDoc );
    virtual ~DumpAllObjectsAgent( void );

    // ----------------- Required Methods ---------------
    virtual void ConsAgentPostProcess( void );
    virtual ASInt32 ObjFound( CosObj Obj,
                              const PDFObjType* pObjTypeHierarchy,
                              const ASUns32 SizeObjHierarchy,
                              TraversalStack Stack,
                              CosObj* pObjToReturn );
};

Agent Constructors
To write an Agent class derived from the ConsultantAgentObj base class, you must call the base constructor in the derived class's constructor initializer list. The base constructor takes a constant array of so-called objects of interest (of type PDFObjType) and the length of that array (as ASUns32). Where and how the array of types is stored is up to you; however, the storage must persist, because the base class saves only a pointer to the data. This has an important implication for authoring agents: the derived class cannot initialize the data in its own constructor, because the base constructor is called first.

The following example shows a constructor. In the example Agent, the array of types and the array length are static data members of the Agent class. In larger-scale systems it is better to create a host object for the Agent that is responsible for determining the proper objects to include in the array and passing them on to the Agent constructor. The list of object types is passed on to the Consultant when ConsultantRegisterAgent is called.
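The lifetime constraint can be seen in isolation with stand-in classes. The sketch below does not use the SDK: AgentBaseSketch and DumpAgentSketch are hypothetical names, and DT_ALL_STANDIN is a made-up value. It assumes only what the text states, namely that the base class keeps a pointer to the caller's array rather than a copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::uint32_t PDFObjType;        // stand-in for the SDK typedef
const PDFObjType DT_ALL_STANDIN = 0;     // hypothetical value

// Stand-in for the base class: it stores the pointer, not a copy, so the
// array must outlive every Agent constructed from it.
class AgentBaseSketch {
public:
    AgentBaseSketch(const PDFObjType* types, std::size_t count)
        : m_types(types), m_count(count)
    {
        assert(types != nullptr || count == 0);
    }
    virtual ~AgentBaseSketch() {}
    const PDFObjType* Types() const { return m_types; }
    std::size_t Count() const { return m_count; }
private:
    const PDFObjType* m_types;
    std::size_t m_count;
};

// The derived class passes static storage to the base constructor. Static
// members already exist when the base constructor runs, whereas non-static
// members of the derived class are not yet initialized at that point.
class DumpAgentSketch : public AgentBaseSketch {
public:
    DumpAgentSketch() : AgentBaseSketch(s_types, s_count) {}
private:
    static const PDFObjType s_types[];
    static const std::size_t s_count;
};

const PDFObjType DumpAgentSketch::s_types[] = { DT_ALL_STANDIN };
const std::size_t DumpAgentSketch::s_count =
    sizeof(s_types) / sizeof(s_types[0]);
```

Every DumpAgentSketch instance shares the same array, which is why static storage is a convenient way to satisfy the persistence requirement.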


Example: An Agent Constructor


/* Define static const data to be passed to parent class constructor */
const ASUns32 DumpAllObjectsAgent::s_iNumAgentObjects = 1;
const PDFObjType DumpAllObjectsAgent::s_hAgentObjects[DumpAllObjectsAgent::s_iNumAgentObjects] = { DT_ALL };

/* Derived Agent Class Constructor */
DumpAllObjectsAgent::DumpAllObjectsAgent( PDDoc hPDDoc )
    : ConsultantAgentObj( &s_hAgentObjects[ 0 ], s_iNumAgentObjects )
{
    /* Open Temporary File and Initialize Data Members ... */
}

Recognizing Objects of Interest


Agents register with the Consultant a list of object types in which they are interested. When the Consultant classifies an object as any of the types the Agent registered, the Consultant calls the ObjFound callback function, a virtual function in the ConsultantAgentObj base class.

The parameters the Consultant passes to this function give it information about the current object, its parents, and the state of the Consultant's traversal stack. The return value from the callback is an OR of bit flags that instruct the Consultant how to handle the current object.

See ConsAgentObjFoundCallback for details of the syntax. The Agent in Example: An Agent Constructor simply gathers information about each object encountered and outputs it to a file. It does not need the Consultant to make any modifications to the document. Therefore, in the definition of the ObjFound callback function, the return value is always OD_NOCHANGE and the object returned in pObjToReturn is simply the same object that was found.

In many cases it makes the most sense for an Agent to make all document modifications itself, without the Consultant's replace and remove facilities. In these cases you must take special care not to modify objects that are currently on the Consultant's traversal stack.

The DumpAllObjects plug-in demonstrates that PDF Consultant agents can access any Cos object from any point in the document. The plug-in writes information about certain Cos objects to an output file called AllObjects.txt. The ObjFound callback function of the DumpAllObjects agent writes to the file the Cos object traversal path that the Consultant took to reach a specific Cos object. The function calls GetTraversalString, which describes where a given object lives in the document with respect to other objects. For example, the following shows the format of a traversal path of a text annotation:

18 0 obj PT_TEXTANNOTATION | PT_ANNOTATION | ->AcroForm->Fields->[0]->P->Annots->[1]


The traversal path illustrates the "hierarchy" of types that the PDF Consultant and Accessibility Checker assigns to some objects (for example, a PT_TEXTANNOTATION is also listed as PT_ANNOTATION). The Consultant looks at all Cos objects. To simplify the output, the DumpAllObjects agent reports only the most useful kinds of Cos objects: CosString, CosDict, CosArray, and CosStream.
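To make the path format concrete, here is a minimal sketch that builds the same kind of string from a mock stack. It does not call the Consultant API: StackEntrySketch and BuildTraversalPath are hypothetical names, and the "->" separator is an assumption inferred from the sample output above (the value of TRAVERSAL_SEP is not shown in this document).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one traversal stack entry. In the real API this
// information would come from ConsStackIndexIsDict, ConsStackIndexGetDictKey,
// ConsStackIndexIsArray, and ConsStackIndexGetArrayIndex.
struct StackEntrySketch {
    bool isDict;           // entry is a dictionary
    bool isArray;          // entry is an array
    std::string key;       // key that led to the next object (dictionaries)
    unsigned arrayIndex;   // index that led to the next object (arrays)
};

// Builds a path such as "->Annots->[1]" from the parents of an object.
std::string BuildTraversalPath(const std::vector<StackEntrySketch>& stack)
{
    std::string path;
    for (std::size_t i = 0; i < stack.size(); ++i) {
        const StackEntrySketch& entry = stack[i];
        assert(!(entry.isDict && entry.isArray));
        path += "->";                                  // assumed separator
        if (entry.isDict)
            path += entry.key;
        else if (entry.isArray)
            path += "[" + std::to_string(entry.arrayIndex) + "]";
    }
    return path;
}
```

Using std::string avoids the manual length bookkeeping that the C version needs for its fixed-size buffer, at the cost of a dynamic allocation.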

The Post Processing Stage


The second and final required function definition in any ConsultantAgentObj derived class is the PostProcess callback. This function is called when the Consultant has finished its traversal and is preparing to unregister agents in preparation for the next possible run. This callback takes no parameters and returns no value (see ConsAgentPostProcessCallback). There are no restrictions on what types of operations the Agent can perform on the document in this function.

The PostProcess callback function is the place to perform any operations that might otherwise damage the Consultant's traversal by modifying objects that are on the Consultant's current traversal stack. The PostProcess callback function should also perform any cleanup that cannot be done in the Agent destructor. The example Agent in Example: Minimal Agent Class Definition does not need to do any complicated processing; it simply marks the end of the output log for this run.


Consultant API Reference

This chapter is a complete reference for the methods specific to the Consultant. The methods are organized alphabetically within logical groupings:

    Consultant Management Methods
    Consultant Traversal Stack Methods

Consultant Management Methods


These methods register agents, create and destroy consultant objects, start, suspend, and resume traversal, and identify an object type.

ConsultantCreate
Consultant ConsultantCreate (ConsPercentDoneCallback cb);
Description
Allocates and initializes a new Consultant object. Use the returned object to call the other Consultant API functions. When you are finished with this object, you must destroy it using the ConsultantDestroy function.

Parameters
cb
    A function pointer to be called back with progress updates. May be NULL.

Return Value
The Consultant object that was created.

Exceptions
Raises an Acrobat exception on failure.

Header File
ConsHFT.h

Related Methods
ConsultantDestroy


ConsultantDestroy
void ConsultantDestroy(Consultant hConsultantToDestroy);
Description
Detaches all Agents and destroys the given Consultant object, invalidating its handle. You must never call this on a Consultant that is currently running.

Parameters
hConsultantToDestroy
    A valid Consultant object handle as returned by ConsultantCreate. The handle is invalid after the call returns.

Return Value
None.

Exceptions
Raises an Acrobat exception on failure.

Header File
ConsHFT.h

Related Methods
ConsultantCreate


ConsultantRegisterAgent
void ConsultantRegisterAgent (Consultant s, const ConsultantAgent *tagConsultantAgent, RegAgentFlag kFlag);
Description
Registers the given agent with the given consultant, so that the agent is called when the consultant encounters objects of interest.

Parameters
s
    A valid Consultant object handle as returned by ConsultantCreate. The Consultant with which the Agent will be registered.
tagConsultantAgent
    The Agent to register, of a type derived from the ConsultantAgentObj base class.
kFlag
    Flag indicating the mode that the Consultant should operate in.

Return Value
None.

Exceptions
Raises an Acrobat exception if the Consultant has been started and is not in a suspended state.

Header File
ConsHFT.h

Related Methods
None.


ConsultantResume
void ConsultantResume(Consultant s);
Description
Resumes a previously suspended Consultant at the point in the traversal where it stopped. This function does not return from traversing and notifying Agents until the traversal is complete or ConsultantSuspend is called. The function does nothing if the Consultant object is already running or has not been started.

Parameters
s
    A valid Consultant object handle as returned by ConsultantCreate.

Return Value
None.

Header File
ConsHFT.h

Related Methods
ConsultantSuspend
ConsultantSetStart


ConsultantSetStart
void ConsultantSetStart (Consultant s, CosObj objstart, PDFObjType InitType);
Description
Resets the suspended Consultant and starts a new traversal from the given starting object. If you do not know the type of the object, the Consultant will attempt to determine it. This function does not return until the entire path beneath the starting object has been traversed. The Consultant passes to the registered Agents all objects it encounters that have been registered as interesting.

Parameters
s
    A valid Consultant object handle as returned by ConsultantCreate.
objstart
    Object at which to restart traversal. Usually, for traversing an entire document, this is the Catalog.
InitType
    The object type of the specified start object. May be PT_NULL, in which case the Consultant attempts to determine the type of the object itself. You should specify a value other than PT_NULL whenever possible. In most cases, for traversing the entire document, the starting object is the Catalog, so the type is PT_CATALOG.

Return Value
None.

Exceptions
Raises an Acrobat exception if the Consultant has been started and is not in a suspended state.

Header File
ConsHFT.h

Related Methods
ConsultantSuspend
ConsultantResume
ConsultantRegisterAgent


ConsultantSuspend
void ConsultantSuspend(Consultant s);
Description
Suspends the Consultant, even if it is currently executing a call to ConsultantTraverseFrom or ConsultantResume. This function causes currently executing calls to ConsultantTraverseFrom to return. It is legal to call this function from within the ConsPercentDoneCallback passed to the Consultant on ConsultantCreate. To resume, call ConsultantResume. You can call ConsultantNextObj on a suspended Consultant, which removes the suspension and causes the Consultant to process the next object.

You can destroy a Consultant that has been suspended. If you call ConsultantTraverseFrom on a suspended Consultant, it resets the operation of the Consultant, but the Consultant remains in a suspended state and does not process the document further.

This function does nothing if you call it on a Consultant object that is already suspended or was never started.

Parameters
s
    A valid Consultant object handle as returned by ConsultantCreate.

Return Value
None.

Header File
ConsHFT.h

Related Methods
ConsultantResume
ConsultantSetStart
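The suspend and resume rules stated in these entries can be summarized as a small state machine. The sketch below models only the state transitions described in the text; it is not the Consultant's implementation, the real ConsultantTraverseFrom and ConsultantResume calls block while traversing, and ConsultantStateSketch, RunState, and the std::logic_error standing in for an Acrobat exception are all assumptions made for illustration.

```cpp
#include <cassert>
#include <stdexcept>

enum class RunState { NotStarted, Running, Suspended };

// Hypothetical model of the documented transitions; no traversal is done.
class ConsultantStateSketch {
public:
    ConsultantStateSketch() : m_state(RunState::NotStarted) {}

    // Documented: raises if the Consultant has been started and is not
    // suspended; on a suspended Consultant it resets the operation but the
    // Consultant remains suspended.
    void TraverseFrom()
    {
        if (m_state == RunState::Running)
            throw std::logic_error("Consultant is running and not suspended");
        if (m_state == RunState::NotStarted)
            m_state = RunState::Running;
    }

    // Documented: does nothing if already suspended or never started.
    void Suspend()
    {
        if (m_state == RunState::Running)
            m_state = RunState::Suspended;
    }

    // Documented: does nothing if already running or not started.
    void Resume()
    {
        if (m_state == RunState::Suspended)
            m_state = RunState::Running;
    }

    RunState State() const { return m_state; }

private:
    RunState m_state;
};
```

A caller that wants to destroy a running Consultant would, under these rules, suspend it first and only then destroy it.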


ConsultantTraverseFrom
void ConsultantTraverseFrom(Consultant s, CosObj obj, PDFObjType ObjType);

Description
Starts the given Consultant object traversing at the given Cos object. It traverses and processes all objects beneath obj, classifying the type of objects based on the fact that obj is of the given ObjType.

It is never legal to destroy a Consultant object that is currently executing a call to ConsultantTraverseFrom. To properly destroy a running Consultant, you must call ConsultantSuspend first. ConsultantTraverseFrom raises an exception under any other conditions, and may also raise an exception as the result of a registered Agent raising an exception during the operation.

Parameters
s
    A valid Consultant object handle as returned by ConsultantCreate.
obj
    Object at which to start traversal.
ObjType
    The object type of the specified start object. May be PT_NULL, in which case the Consultant attempts to determine the type of the object itself. You should specify a value other than PT_NULL whenever possible.

Return Value
None.

Exceptions
Raises an Acrobat exception if the Consultant has been started and is not in a suspended state.

Header File
ConsHFT.h

Related Methods
None.


PDFObjTypeGetSuperclass
PDFObjType PDFObjTypeGetSuperclass (PDFObjType Type);
Description
Gets the superclass, if any, of the given PDFObjType.

Parameters
Type
    The type that might have a superclass.

Return Value
The superclass of the given type, or DT_NULL if no superclass exists.

Header File
ConsHFT.h
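One use of a superclass lookup is to rebuild the most-specific-to-least-specific type array for a type by following the chain until no superclass remains. The sketch below shows the loop with a mock lookup: SuperclassSketch stands in for PDFObjTypeGetSuperclass, and the type values and the NULL_TYPE_STANDIN sentinel are hypothetical, since the real constants are defined in ConsObTp.h.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

typedef std::uint32_t PDFObjType;            // stand-in for the SDK typedef
const PDFObjType NULL_TYPE_STANDIN    = 0;   // hypothetical "no superclass"
const PDFObjType ANNOTATION_STANDIN   = 1;   // hypothetical values
const PDFObjType LINK_ANNOT_STANDIN   = 2;
const PDFObjType PAGE_STANDIN         = 3;

// Mock of the superclass lookup for the sketch's three types.
PDFObjType SuperclassSketch(PDFObjType type)
{
    return (type == LINK_ANNOT_STANDIN) ? ANNOTATION_STANDIN
                                        : NULL_TYPE_STANDIN;
}

// Returns the hierarchy for `mostSpecific`, most specific type first.
std::vector<PDFObjType> BuildHierarchy(PDFObjType mostSpecific)
{
    std::vector<PDFObjType> hierarchy;
    for (PDFObjType t = mostSpecific; t != NULL_TYPE_STANDIN;
         t = SuperclassSketch(t))
        hierarchy.push_back(t);
    return hierarchy;
}
```

The loop terminates as long as the superclass relation has no cycles, which is what a class hierarchy implies.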


Consultant Traversal Stack Methods

The Consultant uses a traversal stack to maintain knowledge of which objects it has visited. These methods allow you to examine the traversal stack and track the Consultant's progress through the traversal.

ConsStackGetCount
ASUns32 ConsStackGetCount(ConsStack s);
Description
Returns the number of objects currently on the Consultant's traversal stack. The stack includes the objects that the Consultant has visited on its path to the current object; in other words, all parents of the current object, but not the object itself.

Parameters
s
    The Consultant's traversal stack.

Return Value
The number of objects on the Consultant's traversal stack.

Exceptions
Raises an Acrobat exception on error.

Header File
ConsHFT.h

Related Methods
ConsultantGetNumDirectVisited


ConsStackIndexGetArrayIndex
ASUns32 ConsStackIndexGetArrayIndex(ConsStack s, ASUns32 Index);

Description
Gets the array index of the object at the given index into the stack (that is, the index that led from the given object to the next object in the traversal). It is only valid to call this function on an index if ConsStackIndexIsArray returns true for that index.

Parameters
s
    The Consultant's traversal stack.
Index
    Index in the stack where the object in question is located.

Return Value
The array index that led from the object at the given index in the stack to the next object in the Consultant's traversal path.

Header File
ConsHFT.h

Related Methods
ConsStackIndexIsArray


ConsStackIndexGetDictKey
ASAtom ConsStackIndexGetDictKey(ConsStack s, ASUns32 Index);
Description
Gets the key string atom of the object at the given index into the stack (that is, the key that led from the given object to the next object in the traversal). It is only valid to call this function on an index if ConsStackIndexIsDict returns true for that index.

Parameters
s
    The Consultant's traversal stack.
Index
    Index in the stack where the object in question is located.

Return Value
The key that led from the object at the given index in the stack to the next object in the Consultant's traversal path.

Exceptions
Raises an Acrobat exception on error.

Header File
ConsHFT.h

Related Methods
ConsStackIndexIsDict


ConsStackIndexGetObj
CosObj ConsStackIndexGetObj(ConsStack s, ASUns32 Index);
Description
Gets the Cos object at the given index into the stack.

Parameters
s
    The Consultant's traversal stack.
Index
    Point at which to find the object.

Return Value
The object at the specified point in the Consultant's traversal stack.

Header File
ConsHFT.h

Related Methods
ConsStackIndexGetTypeAt


ConsStackIndexGetTypeAt
PDFObjType ConsStackIndexGetTypeAt(ConsStack s, ASUns32 Index, ASUns32 TypeIndex);
Description
Gets a type from the type array at the given index in the stack. Since there are potentially multiple types for each object, you access the type classifications one at a time.

Parameters
s
    The Consultant's traversal stack.
Index
    The position in the stack of the object in question.
TypeIndex
    The type classification of the object. 0 is the most specific type classification. The higher the number, the more general the type classification.

Return Value
One type of an object at a particular location in the traversal stack.

Header File
ConsHFT.h

Related Methods
ConsStackIndexGetObj
ConsStackIndexGetTypeCount


ConsStackIndexGetTypeCount
ASUns32 ConsStackIndexGetTypeCount (ConsStack s, ASUns32 Index);
Description
Gets the size of the type hierarchy at the given index into the stack.

Parameters
s
    The Consultant's traversal stack.
Index
    The object in question.

Return Value
Size of the type hierarchy.

Header File
ConsHFT.h

Related Methods
ConsStackIndexGetObj
ConsStackIndexGetTypeAt


ConsStackIndexIsArray
ASBool ConsStackIndexIsArray(ConsStack s, ASUns32 Index);

Description
Tests whether the object at the given index into the stack is a CosArray object.

Parameters
s
    The Consultant's traversal stack.
Index
    Index in the stack where the object in question is located.

Return Value
true if the object found at the index point is an array, false otherwise.

Header File
ConsHFT.h

Related Methods
ConsStackIndexGetArrayIndex


ConsStackIndexIsDict
ASBool ConsStackIndexIsDict(ConsStack s, ASUns32 Index);

Description
Tests whether the object at the given index into the stack is a CosDict object.

Parameters
s
    The Consultant's traversal stack.
Index
    Index in the stack where the object in question is located.

Return Value
true if the object found at the index point is a dictionary, false otherwise.

Header File
ConsHFT.h

Related Methods
ConsStackIndexGetDictKey


ConsultantGetNumDirectVisited
ASUns32 ConsultantGetNumDirectVisited(Consultant s);
Description
Returns the number of direct objects that the Consultant has processed so far. This count may include some objects twice, depending on revisitation of objects. This count is reset on calls to ConsultantTraverseFrom and ConsultantSetStart.

Parameters
s
    A valid Consultant object handle as returned by ConsultantCreate.

Return Value
The number of direct objects the Consultant has visited so far.

Header File
ConsHFT.h

Related Methods
ConsultantGetNumIndirectVisited
ConsStackGetCount


ConsultantGetNumIndirectVisited
ASUns32 ConsultantGetNumIndirectVisited(Consultant s);
Description
Returns the number of indirect objects that the Consultant has processed so far. This count may include some objects twice, depending on revisitation of objects. This count is reset on calls to ConsultantTraverseFrom and ConsultantSetStart.

Parameters
s
    A valid Consultant object handle as returned by ConsultantCreate.

Return Value
The number of indirect objects the Consultant has visited so far.

Header File
ConsHFT.h

Related Methods
ConsultantGetNumDirectVisited
ConsultantGetNumUniqueIndirectsVisited


ConsultantGetNumUniqueIndirectsVisited
ASUns32 ConsultantGetNumUniqueIndirectsVisited(Consultant s);
Description
Returns the number of unique indirect objects that the Consultant has processed so far. This count is reset on calls to ConsultantTraverseFrom and ConsultantSetStart. Visited objects are not counted more than once; if an object is revisited, the count is not incremented.

Parameters
s
    A valid Consultant object handle as returned by ConsultantCreate.

Return Value
The number of unique indirect objects the Consultant has visited so far.

Header File
ConsHFT.h

Related Methods
ConsultantGetNumDirectVisited
ConsultantGetNumIndirectVisited


ConsultantGetPercentDone
ASReal ConsultantGetPercentDone(Consultant s);
Description
Returns an estimate (from 0 to 100) of what percentage of the current document has been processed by the Consultant. You can call this function at any time.

Parameters
s
    A valid Consultant object handle as returned by ConsultantCreate.

Return Value
A number between 0 and 100.

Header File
ConsHFT.h

Related Methods
None.


ConsultantNextObj
ASBool ConsultantNextObj(Consultant s);
Description
Instructs the Consultant to process the next object in the current traversal. Assumes that the Consultant has been suspended and reset with calls to ConsultantSuspend and ConsultantSetStart. This function does not unsuspend a Consultant, so you can call it repeatedly. It returns after all registered Agents have processed the object.

Parameters
s
    A valid Consultant object handle as returned by ConsultantCreate.

Return Value
true if the process is done or there has been a problem, false otherwise.

Exceptions
Raises an Acrobat exception if you call it on a running Consultant.

Header File
ConsHFT.h

Related Methods
ConsultantSuspend
ConsultantSetStart


Declarations and Callbacks

This chapter provides a reference for the data declarations and callback functions used by the Consultant methods and to construct Agents:

    Data Declarations
    Callbacks

Data Declarations

ConsStack
typedef struct tagConsStack* ConsStack;

Description
An opaque traversal stack object. The ConsStack... methods allow retrieval of individual members of the PDFObjType and CosObj stacks associated with a Consultant object.

Related Methods
ConsStackGetCount
ConsStackIndexGetArrayIndex
ConsStackIndexGetDictKey
ConsStackIndexGetObj
ConsStackIndexGetTypeAt
ConsStackIndexGetTypeCount
ConsStackIndexIsArray
ConsStackIndexIsDict


Related Callbacks

ConsAgentObjFoundCallback


Consultant
typedef struct tagConsultant* Consultant;
Description
An opaque type that allows programs to retain handles to created PDF Consultant and Accessibility Checker objects.

Related Methods
Numerous.


ConsultantAgent
typedef struct tagConsultantAgent {
    ASSize_t Size;
    const PDFObjType* pFindObjects;
    ASUns32 NumFindObjects;
    ConsAgentPostProcessCallback PostProcess;
    ConsAgentObjFoundCallback ObjFound;
    ASBool WantSubclasses;
} ConsultantAgent;

Description
During traversal, the Consultant checks the Agent's list of object types of interest to see if the Agent is interested in the current object. It calls the callback function pointers when objects of interest are found and when traversal is complete. All Agents should be C++ classes derived from the ConsultantAgentObj class (found in agentobj.h), which can be converted (via a C++ cast operator) to a pointer to this structure type. Wherever the Consultant HFT calls for a (struct Agent*), you can pass the class with no conversion.

Members
Size
    Size of the data structure. Set to sizeof(Agent).
pFindObjects
    An array of object types of interest.
NumFindObjects
    The number of object types in the pFindObjects array.
PostProcess
    A callback procedure for post-processing.
ObjFound
    A callback procedure for when an object is found.
WantSubclasses
    true if the Agent is interested in subclasses of specified object types.

Related Methods

ConsultantRegisterAgent
Related Callbacks

ConsAgentObjFoundCallback


ConsultantAgentAction
Description
Bit flags that instruct the Consultant about how to handle a found object. A logical OR of these values should be returned by the ObjFound callback.

Values
OD_NOCHANGE
    The Consultant makes no changes to the current object. Use this if the Agent is only gathering information or if the Agent is in charge of making all the modifications itself.
OD_REPLACE
    Instructs the Consultant to replace this occurrence of the current object in the document with the one returned via the pObjToReturn parameter to the ObjFound callback. You can optionally combine this with OD_REVISIT or OD_CHANGEALL.
OD_REMOVE
    Instructs the Consultant to remove this occurrence of the current object in the document. You can optionally combine this with OD_REVISIT or OD_CHANGEALL.
OD_REVISIT
    Instructs the Consultant to visit this object again if it is encountered again. You can combine this with any flag except OD_NEVERREVISIT or OD_CHANGEALL.
OD_CHANGEALL
    You must use this in conjunction with either OD_REPLACE or OD_REMOVE. It instructs the Consultant to silently perform the desired operation on all instances of the current object, without calling the ObjFound callback again for this object.
OD_NEVERREVISIT
    Instructs the Consultant that under no circumstances should the object be revisited, regardless of whether it is reclassified when encountered again. Only applicable in the mode in which the Consultant pays attention to object classification when determining whether or not an object has been visited already.

Related Callbacks

ConsAgentObjFoundCallback
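
The combination rules stated for these flags can be sketched as a small validity check. The numeric values below are assumptions chosen only so that each flag occupies its own bit; the real values are defined in the SDK header. `IsValidAction` is a hypothetical helper, not part of the API.

```c
#include <stdbool.h>

/* Stand-in flag values (assumptions). Each flag is given a distinct bit. */
enum {
    OD_NOCHANGE     = 0x00,
    OD_REPLACE      = 0x01,
    OD_REMOVE       = 0x02,
    OD_REVISIT      = 0x04,
    OD_CHANGEALL    = 0x08,
    OD_NEVERREVISIT = 0x10
};

/* Hypothetical helper: checks only the two rules stated in the flag descriptions. */
static bool IsValidAction(unsigned int flags)
{
    bool modifies = (flags & (OD_REPLACE | OD_REMOVE)) != 0;

    /* OD_CHANGEALL must accompany OD_REPLACE or OD_REMOVE. */
    if ((flags & OD_CHANGEALL) && !modifies)
        return false;

    /* OD_REVISIT may not be combined with OD_NEVERREVISIT or OD_CHANGEALL. */
    if ((flags & OD_REVISIT) && (flags & (OD_NEVERREVISIT | OD_CHANGEALL)))
        return false;

    return true;
}
```

An Agent could run a check like this in debug builds before returning from its ObjFound callback, for example on `OD_REPLACE | OD_CHANGEALL`.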


PDFObjType
typedef ASUns32 PDFObjType;
Description
Type corresponding to the enum defined in ConsObTp.h. This type refers to specific object types in the Adobe PDF document format. Agents use it to request objects from the framework, and the framework uses it to report the types of objects found.

Related Methods

PDFObjTypeGetSuperclass
Related Callbacks

ConsAgentObjFoundCallback


RegAgentFlag
Description
Constants that specify an operation mode for the Consultant. This value determines whether and how often the Consultant should revisit objects that have been previously encountered.

Values

Mode Flag
    Description
REG_ONLYREVISITUNKNOWN
    Always revisit objects of an unknown type, unless an Agent returns AC_NEVERREVISIT for the object. Visit known types only once, unless an Agent returns AC_REVISIT for the object.
REG_REVISITNONE
    Visit all objects once, unless an Agent returns AC_REVISIT for the object.
REG_REVISITRECLASS_ALL
    Revisit objects of an unknown type when encountered again as a known type that the object has not previously been encountered as, unless an Agent returns AC_NEVERREVISIT for the object. Revisit known types when encountered again as a new known type or as unknown, unless an Agent returns AC_NEVERREVISIT for the object. If an Agent returns OD_REVISIT, revisit the object (of any known or unknown classification) the next time it is encountered.
REG_REVISITRECLASS_ALWAYSUNKNOWN
    Revisit an object whenever it is encountered again with a new classification, but always revisit objects classified as unknown (even if the object has previously been encountered and classified as unknown).

Related Methods

ConsultantRegisterAgent
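
The four modes can be compared side by side with a rough decision-function sketch. This is a reading of the descriptions above, not the Consultant's actual implementation. The enum values, the `Encounter` structure, and `ShouldRevisit` are hypothetical. For REG_REVISITRECLASS_ALWAYSUNKNOWN the sketch ignores Agent requests, because the description does not say how they interact with that mode.

```c
#include <stdbool.h>

/* Stand-in for the SDK enum; the numeric values are assumptions. */
typedef enum {
    REG_ONLYREVISITUNKNOWN,
    REG_REVISITNONE,
    REG_REVISITRECLASS_ALL,
    REG_REVISITRECLASS_ALWAYSUNKNOWN
} RegAgentFlag;

/* Hypothetical summary of one repeat encounter with an object. */
typedef struct {
    bool isUnknown;            /* object is currently classified as unknown */
    bool isNewClassification;  /* object was not seen before under this classification */
    bool agentAskedRevisit;    /* an Agent earlier returned the "revisit" action */
    bool agentAskedNever;      /* an Agent earlier returned the "never revisit" action */
} Encounter;

/* Decide whether a previously seen object should be visited again. */
static bool ShouldRevisit(RegAgentFlag mode, Encounter e)
{
    switch (mode) {
    case REG_REVISITNONE:
        /* Visit once, unless an Agent asked for a revisit. */
        return e.agentAskedRevisit;
    case REG_ONLYREVISITUNKNOWN:
        /* Unknown: always, unless vetoed. Known: only on request. */
        return e.isUnknown ? !e.agentAskedNever : e.agentAskedRevisit;
    case REG_REVISITRECLASS_ALL:
        /* An explicit request wins; otherwise only a new classification counts. */
        if (e.agentAskedRevisit)
            return true;
        return e.isNewClassification && !e.agentAskedNever;
    case REG_REVISITRECLASS_ALWAYSUNKNOWN:
        /* New classification, or any unknown classification. */
        return e.isNewClassification || e.isUnknown;
    }
    return false;
}
```

The sketch makes the main difference visible: an object seen a second time as unknown is revisited under REG_ONLYREVISITUNKNOWN and REG_REVISITRECLASS_ALWAYSUNKNOWN, but not under the other two modes unless an Agent asked for it.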

Callbacks

ConsAgentObjFoundCallback
ConsultantAgentAction ConsAgentObjFoundCallback (struct tagConsultantAgent* agent, CosObj hObj, const PDFObjType* pObjTypeHierarchy, ASUns32 iSizeObjHierarchy, ConsStack Stack, CosObj* pObjToReturn);
Description
Returns a set of flags instructing the Consultant how to handle the current object. The Consultant calls this method when it recognizes the current object as a type that an Agent has declared interesting.

Parameters

agent
    The Agent containing the callback.
hObj
    The object the Consultant has just encountered, which has matched one of the types in any of the registered Agents' arrays of interesting types.
pObjTypeHierarchy
    A list of the object type classifications this object met. The array runs from index 0, the most specific object classification, to index iSizeObjHierarchy - 1, the most general.
iSizeObjHierarchy
    The size of the type array.
Stack
    A reference to the Consultant's traversal stack, which allows read-only access to parents of the current object as well as their respective types.
pObjToReturn
    If present, an object the Consultant uses to replace the current object in the document.

Return Value

A logical OR of bit flags that instruct the Consultant how to handle the current object (remove it, replace it, ignore it, and so on).

Header File

ConsExpT.h

Related Methods

None.
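
A callback of this shape might look like the following sketch. It walks the type hierarchy from the most specific entry (index 0) toward the most general, assuming the last valid index is iSizeObjHierarchy - 1, as is usual for a C array of that size. All typedefs, the flag values, and the `kHypoType...` constants are stand-in assumptions so that the example compiles on its own; a real Agent would include the SDK headers instead.

```c
#include <stddef.h>

/* Stand-in declarations (assumptions); real ones come from the SDK headers. */
typedef unsigned int ASUns32;
typedef ASUns32 PDFObjType;
typedef struct { ASUns32 a, b; } CosObj;        /* opaque handle in the SDK */
typedef void *ConsStack;                        /* opaque traversal stack */
struct tagConsultantAgent { ASUns32 Size; };
typedef unsigned int ConsultantAgentAction;
enum { OD_NOCHANGE = 0x00, OD_REMOVE = 0x02 };  /* assumed bit values */

/* Hypothetical object type codes; real values are defined in ConsObTp.h. */
enum { kHypoTypeStream = 1, kHypoTypeImage = 3, kHypoTypeThumbnail = 7 };

/* Ask the Consultant to remove any object classified as a thumbnail. */
static ConsultantAgentAction ExampleObjFound(struct tagConsultantAgent *agent,
                                             CosObj hObj,
                                             const PDFObjType *pObjTypeHierarchy,
                                             ASUns32 iSizeObjHierarchy,
                                             ConsStack Stack,
                                             CosObj *pObjToReturn)
{
    ASUns32 i;
    (void)agent; (void)hObj; (void)Stack; (void)pObjToReturn;

    /* Index 0 is the most specific classification; later entries are more general. */
    for (i = 0; i < iSizeObjHierarchy; i++) {
        if (pObjTypeHierarchy[i] == (PDFObjType)kHypoTypeThumbnail)
            return OD_REMOVE;
    }
    return OD_NOCHANGE; /* leave every other object untouched */
}
```

Because the sketch never returns a replace action, it leaves pObjToReturn unset; an Agent that replaces objects would store the replacement there before returning.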


ConsAgentPostProcessCallback
void ConsAgentPostProcess (void);
Description
The Consultant calls this method when it is ready to finish a cycle. Perform any document modifications assigned to your Agent at this point.

Parameters

None.

Return Value

None.

Header File

ConsExpT.h
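
One way an Agent might use this callback is to collect objects during traversal and act on them only at the end of the cycle. The sketch below illustrates that pattern under stated assumptions: `ObjId`, `RememberObject`, and the global counters are hypothetical, and the actual document modification is reduced to a counter because it depends on SDK calls not covered here.

```c
#include <stddef.h>

#define MAX_PENDING 64

typedef unsigned int ObjId;      /* hypothetical stand-in for an object handle */

static ObjId  gPending[MAX_PENDING];
static size_t gPendingCount = 0;
static size_t gApplied      = 0; /* number of deferred changes carried out */
static ObjId  gLastApplied  = 0; /* most recent object processed */

/* Called from the object-found callback: record the object for later.
   Returns 1 on success, 0 if the pending list is full. */
static int RememberObject(ObjId id)
{
    if (gPendingCount >= MAX_PENDING)
        return 0;
    gPending[gPendingCount++] = id;
    return 1;
}

/* Post-processing callback with the void(void) shape shown above. */
static void ExamplePostProcess(void)
{
    size_t i;
    for (i = 0; i < gPendingCount; i++) {
        /* A real Agent would modify the document object here. */
        gLastApplied = gPending[i];
        gApplied++;
    }
    gPendingCount = 0; /* ready for the next cycle */
}
```

Deferring the changes keeps the document stable while the Consultant is still traversing it, which fits an Agent that returns a "no change" action from its object-found callback and makes all modifications itself.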


ConsPercentDoneCallback
void ConsAgentPercentDoneCallback (ASReal fPercentDone);
Description
The Consultant calls this method with progress updates. Your implementation can use these updates to display a progress bar.

Parameters

fPercentDone
    A number between 0 and 100, indicating the percentage of the current document that the Consultant has processed so far.

Return Value

None.

Header File

ConsExpT.h
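
A progress callback could be as simple as the following text-mode sketch. The `ASReal` typedef is a stand-in assumption for the SDK type, and `ExamplePercentDone`, `BAR_WIDTH`, and `gBar` are hypothetical names. A plug-in with a user interface would update a real progress control instead of printing.

```c
#include <stdio.h>
#include <string.h>

typedef float ASReal;            /* stand-in for the SDK type (assumption) */

#define BAR_WIDTH 20
static char gBar[BAR_WIDTH + 1]; /* last rendered bar, kept for inspection */

/* Render fPercentDone (0 to 100) as a fixed-width text bar on one line. */
static void ExamplePercentDone(ASReal fPercentDone)
{
    int filled;

    /* Clamp defensively in case a value falls outside the documented range. */
    if (fPercentDone < 0)   fPercentDone = 0;
    if (fPercentDone > 100) fPercentDone = 100;

    filled = (int)(fPercentDone * BAR_WIDTH / 100);
    memset(gBar, '#', (size_t)filled);
    memset(gBar + filled, '.', (size_t)(BAR_WIDTH - filled));
    gBar[BAR_WIDTH] = '\0';

    printf("\r[%s] %3d%%", gBar, (int)fPercentDone);
    fflush(stdout);
}
```

The carriage return redraws the bar in place on each call, so frequent updates do not scroll the console.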
