Improving API Documentation Usability with Knowledge Pushing


Revision as of 09:41, 16 May 2019

Source: http://www.cs.cmu.edu/~udekel/papers/udekel_emoose_icse2009.pdf

Uri Dekel and James D. Herbsleb

Institute for Software Research, School of Computer Science

Carnegie Mellon University

5000 Forbes Avenue, Pittsburgh, PA 15213 USA

{udekel|jdh}@cs.cmu.edu

Abstract

The documentation of API functions typically conveys detailed specifications for the benefit of interested readers. In some cases, however, it also contains usage directives, such as rules or caveats, of which authors of invoking code must be made aware to prevent errors and inefficiencies. There is a risk that these directives may be “lost” within the verbose text, or that the text would not be read because there are so many invoked functions. To address these concerns for Java, an Eclipse plug-in named eMoose decorates method invocations whose targets have associated directives. Our goal is to lead readers to investigate further, which we aid by highlighting the tagged directives in the JavaDoc hover. We present a lab study that demonstrates the directive awareness problem in traditional documentation use and the potential benefits of our approach.

Introduction

Modern software systems combine code written by many individuals and make heavy use of external libraries and Application Programming Interfaces (APIs). Stakeholders in these settings are unlikely to be fully acquainted with all current knowledge about artifacts and services in the project and third-party code. When focused on a particular code fragment, however, it may be critical for them to be well-versed in all the services that it uses. A lack of awareness of usage guidelines and caveats can result in runtime failures and maintenance difficulties.

Since many API functions are meant for widespread use, their authors are motivated to invest significant effort in creating elaborate documentation that fully specifies everything that a client may need to know about a function. Such specifications are crucial for assuring correctness during inspections and the development of testing plans [5, 12]. Unfortunately, the potential consumers of this documentation spend much of their time browsing code [4] that includes numerous method invocations. They are thus limited in the time and effort they can spend on any particular call and may miss important information.

Consider, for example, the documentation of method setClientID from the Java Message Service (JMS) API, which is depicted in Fig. 1 as it is displayed in the Eclipse IDE. The detailed narrative covers many aspects, including purpose, configuration, and exceptions. Stakeholders skimming the text may miss the highlighted sentence deep within the third paragraph, which defines a protocol that explicitly forbids prior method invocations on this object.
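To make the kind of protocol described above concrete, the following is a minimal sketch rather than the JMS API itself. SketchConnection and ClientIdProtocolDemo are hypothetical stand-ins, and the sketch assumes that the directive amounts to "set the client identifier before any other action on the connection", with a violation reported through a plain java.lang.IllegalStateException.

```java
// Hypothetical demo class; not part of JMS.
class ClientIdProtocolDemo {
    // Returns true if the stand-in connection accepted the identifier, false if it rejected it.
    static boolean trySetClientId(SketchConnection connection, String id) {
        try {
            connection.setClientID(id);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        SketchConnection fresh = new SketchConnection();
        System.out.println(trySetClientId(fresh, "reports")); // accepted: nothing else was called yet

        SketchConnection used = new SketchConnection();
        used.start();                                         // any prior action breaks the protocol
        System.out.println(trySetClientId(used, "reports"));  // rejected
    }
}

// Hypothetical stand-in for a connection object; it only imitates the ordering rule.
class SketchConnection {
    private boolean actionTaken = false; // becomes true once another method has been invoked
    private String clientId;

    // Assumed rule: must be called before any other action on this object.
    void setClientID(String id) {
        if (actionTaken) {
            throw new IllegalStateException("client ID must be set before any other action");
        }
        clientId = id;
    }

    // Stands in for "any other action" on the connection.
    void start() {
        actionTaken = true;
    }

    String clientId() {
        return clientId;
    }
}
```

Nothing in the signature of setClientID hints at this ordering constraint, which is why a reader who skims the narrative can easily violate it.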

This problem is compounded by the significant fan-out (number of outgoing edges in the call graph) of many functions. Sifting through the documentation of one invoked function is challenging enough, so searching all targets for important knowledge is even less practical. For instance, consider the code excerpt of Fig. 2, which creates a message queue in JMS. When writing or examining this relatively straightforward code, we must decide which, if any, of the four invoked methods should have their documentation examined for additional requirements. The IDE does not offer any cues drawing us to (or away from) any particular call, though we might be inclined to examine the complex-looking calls that take one or more arguments.

It turns out, however, that the documentation of the seemingly straightforward call to createQueueConnection mentions that connections are created in a “stopped mode” and no messages will be delivered until their start method is invoked. Since this detail is not mentioned in the documentation of the queue’s receive method, a developer who is unaware of this directive may end up with a program that hangs when messages are eventually retrieved.
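The effect of the “stopped mode” directive can be illustrated with a small sketch. MockQueueConnection and StoppedModeDemo are hypothetical stand-ins, not JMS classes. The sketch also assumes a non-blocking receive that returns null when nothing is deliverable, so that the demonstration terminates instead of hanging as a blocking receive would.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical demo class; not part of JMS.
class StoppedModeDemo {
    public static void main(String[] args) {
        MockQueueConnection connection = new MockQueueConnection();
        connection.send("hello");

        // The message exists, but the connection is still stopped.
        System.out.println(connection.receiveNoWait()); // prints "null"

        connection.start();
        System.out.println(connection.receiveNoWait()); // prints "hello"
    }
}

// Hypothetical stand-in that only imitates the documented "stopped mode".
class MockQueueConnection {
    private final Queue<String> pending = new ArrayDeque<>();
    private boolean started = false; // connections begin in stopped mode

    void start() {
        started = true;
    }

    void send(String message) {
        pending.add(message);
    }

    // Non-blocking receive: delivers nothing while the connection is stopped.
    String receiveNoWait() {
        return started ? pending.poll() : null;
    }
}
```

With a blocking receive in place of the non-blocking one, the first retrieval would simply wait, which matches the hang described above.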

Casual observations confirm that developers investigate the documentation of only a small portion of invoked methods. We suspect that this may also have an indirect effect on the willingness of authors of project artifacts to document less “visible” functions. Such functions are often written with specific assumptions, expectations, and limitations in mind, but developers presumably weigh the potential future benefits to their peers against the immediate costs of capturing this knowledge. Increasing the prospects that the documentation would actually be read may create better incentives for preserving it.

We also note that project artifacts are likely to have associated action items or bug reports [11]. Stakeholders need to become aware of these caveats in invoked functions to avoid depending on a faulty implementation.
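The idea of drawing attention only to calls whose targets carry directives can be sketched as a simple lookup. This is an illustration under stated assumptions, not the implementation of eMoose. DirectiveFlagger, its registry of directives, and the list of invoked method names are all hypothetical, and the directive texts paraphrase the two JMS examples discussed in this introduction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; it only illustrates flagging calls that have associated directives.
class DirectiveFlagger {
    // Hypothetical registry mapping a method name to a paraphrased directive.
    static Map<String, String> sampleDirectives() {
        Map<String, String> directives = new LinkedHashMap<>();
        directives.put("createQueueConnection",
                "Connection is created in stopped mode; invoke start before receiving.");
        directives.put("setClientID",
                "Must be invoked before any other method on the connection.");
        return directives;
    }

    // Returns the invoked methods that have a directive, preserving call order.
    static List<String> callsToDecorate(List<String> invoked, Map<String, String> directives) {
        List<String> flagged = new ArrayList<>();
        for (String name : invoked) {
            if (directives.containsKey(name)) {
                flagged.add(name);
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        // Hypothetical sequence of calls in a code fragment under inspection.
        List<String> invoked = Arrays.asList(
                "lookup", "createQueueConnection", "createQueueSession", "createReceiver");
        Map<String, String> directives = sampleDirectives();
        for (String name : callsToDecorate(invoked, directives)) {
            System.out.println(name + ": " + directives.get(name));
        }
    }
}
```

In this sketch only createQueueConnection would be flagged, so the reader is steered toward the one call whose documentation contains a requirement that the calling code must satisfy.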