cooperative-software-development
Comprehension: Include survey of research on APIs
This was from a grant proposal. Probably belongs in the program comprehension chapter.
Establishing the problem
There is always a "gulf of execution" mismatch between goals and documentation, requiring learners to translate their ill-defined desired behaviors into the capabilities of an API (Sillito & Begel 2013, Robillard 2009, Ko et al. 2004, Duala-Ekoko & Robillard 2010, Duala-Ekoko & Robillard 2012).
Examples are by far the preferred starting point for learning (Meng et al. 2017, Robillard 2009, Ko et al. 2004, Stylos & Myers 2006, Stylos et al. 2009, Duala-Ekoko & Robillard 2010, Robillard & Deline 2011, Sadowski et al. 2015, Nykaza et al. 2002, Watson 2015, Shull et al. 2000), but API source code (Garousi et al. 2015) and experts are also preferred, especially when examples are not available (Lutters & Seaman 2007, Parnin & Treude 2011, Ko et al. 2007).
Unfortunately, documentation is often incomplete, inaccurate, or ambiguous (Robillard 2009, Ko et al. 2007, Maalej & Robillard 2013). Crowd documentation like Stack Overflow overcomes some of these problems, offering more examples and more completeness, but only for popular APIs and only after significant time (Parnin et al. 2012), which doesn’t help for new APIs, new versions of APIs, or APIs with small user communities.
Even if documentation were full of perfect examples, copying and modifying examples leads to defects (Ko & Myers 2005, Kim et al. 2004).
Framing the solution
Good examples explain rationale to help developers understand intended and alternative use and API limitations (Meng et al. 2017, Nasehi et al. 2012, Robillard & Deline 2011, Ko et al. 2007)
Good examples provide conceptual background knowledge to help understand concepts modeled in an API (Meng et al. 2017, Ko & Riche 2011)
Good examples show how to coordinate elements of an API to achieve new behavior (Robillard & Deline 2011, Ko et al. 2004)
Developers need to have a model of the internal execution semantics of APIs in order to both use the API and debug code using it (Robillard & Deline 2011, Ko et al. 2004)
Critiquing attempted solutions
Mining information about APIs and bringing it forward in search can accelerate learning, but only accelerates retrieval, still leading to copying and knowledge gaps (Stylos & Myers 2006, Uddin et al. 2012)
APIs can be designed to be more learnable, but must be designed right upfront and are hard to change (Myers & Stylos 2016, Stylos et al. 2009, Stylos & Myers 2008, Dagenais & Robillard 2010, Jeong et al. 2009, McLellan et al. 1998)
It is possible to automatically mine rules about how APIs are to be used, but there are few examples of applying these mined patterns to actually support learning (Heuzeroth et al. 2003, Balanyi & Ferenc 2003, Shi & Olsson 2006, Kagdi et al. 2005, Xie & Pei 2006, Acharya et al. 2007, Uddin et al. 2012, Wang et al. 2013, Robillard et al. 2013)
It is possible to generate basic tutorials from code examples that improve learning, but these are only semi-automated and really just decompose code into steps (Harms et al. 2013, Dahotre et al. 2011)
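To make the point about mined usage rules concrete, here is a toy sketch of mining "A is called before B" rules from call sequences. It is an illustration only, not the technique of any cited paper: the function name `mine_ordered_pairs`, the `min_support` threshold, and the example traces are all assumptions made up for this note.

```python
from collections import Counter
from itertools import combinations


def mine_ordered_pairs(traces, min_support=0.6):
    """Return {(first, later): support} for call pairs that occur in that
    order in at least `min_support` of the traces (hypothetical helper)."""
    counts = Counter()
    for trace in traces:
        seen = set()
        # combinations() preserves input order, so (a, b) means that some
        # occurrence of a appears before some occurrence of b in the trace.
        for a, b in combinations(trace, 2):
            if a != b:
                seen.add((a, b))
        # Count each ordered pair at most once per trace.
        counts.update(seen)
    threshold = min_support * len(traces)
    return {pair: n / len(traces) for pair, n in counts.items() if n >= threshold}


# Hypothetical call traces against a file-like API.
traces = [
    ["open", "read", "close"],
    ["open", "write", "close"],
    ["open", "read", "read", "close"],
]
rules = mine_ordered_pairs(traces)
for (first, later), support in sorted(rules.items()):
    print(f"{first} before {later}: support {support:.2f}")
```

The output is a set of ordering rules with support values. Nothing in it says why `open` must precede `close` or what happens otherwise, which is the gap between mined patterns and learning support that the critique above points at.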
Acharya, M., Xie, T., Pei, J., & Xu, J. (2007, September). Mining API patterns as partial orders from source code: from usage scenarios to specifications. In Proceedings of the 6th joint meeting of the European software engineering conference and the ACM SIGSOFT symposium on The foundations of software engineering (pp. 25-34). http://doi.acm.org/10.1145/1287624.1287630. It is possible to mine API usage rules as specifications on call sequences.
Balanyi, Z., & Ferenc, R. (2003, September). Mining design patterns from C++ source code. In Software Maintenance, 2003. ICSM 2003. Proceedings. International Conference on (pp. 305-314). https://doi.org/10.1109/ICSM.2003.1235436. Design patterns can be discovered automatically by describing them formally.
Clarke, S. (2004). Measuring API Usability. Dr. Dobb's Journal, 29, S6-S9. http://www.drdobbs.com/windows/measuring-api-usability/184405654. API usability can be measured with Cognitive Dimensions.
Dagenais, B., & Robillard, M. P. (2010, November). Creating and evolving developer documentation: understanding the decisions of open source contributors. In Proceedings of the eighteenth ACM SIGSOFT international symposium on Foundations of software engineering (pp. 127-136). https://doi.org/10.1145/1882291.1882312. Working on documentation improves code quality and increases interactions between API developers and their user community.
Dahotre, A., Krishnamoorthy, V., Corley, M., & Scaffidi, C. (2011). Using intelligent tutors to enhance student learning of application programming interfaces. Journal of Computing Sciences in Colleges, 27(1), 195-201. http://dl.acm.org/citation.cfm?id=2037151.2037190. Tutorials can be semi-automatically generated from source code, and they can improve learning.
Duala-Ekoko, E., & Robillard, M. P. (2010). The information gathering strategies of API learners. Technical report, TR-2010.6, School of Computer Science, McGill University. https://pdfs.semanticscholar.org/a7ff/4cc954744f761e8697be4e73aa25166a76c4.pdf. Developers start by mapping a concept to a starting point, using the web, package browsers, or types, finding dependencies related to the concept they found, then finding concept implementation requirements. Variation in whether code examples were trusted.
Duala-Ekoko, E., & Robillard, M. P. (2012, June). Asking and answering questions about unfamiliar APIs: An exploratory study. In Proceedings of the 34th International Conference on Software Engineering (pp. 266-276). IEEE Press. http://dl.acm.org/citation.cfm?id=2337223.2337255. All questions are essentially gulf of execution and evaluation questions, mapping goal to user action, and system action to goal.
Garousi, G., Garousi-Yusifoğlu, V., Ruhe, G., Zhi, J., Moussavi, M., & Smith, B. (2015). Usage and usefulness of technical software documentation: An industrial case study. Information and Software Technology, 57, 664-682. https://doi.org/10.1016/j.infsof.2014.08.003. When available, source code was considered most frequently as the preferred information source during software maintenance.
Harms, K. J., Cosgrove, D., Gray, S., & Kelleher, C. (2013, June). Automatically generating tutorials to enable middle school children to learn programming independently. In Proceedings of the 12th International Conference on Interaction Design and Children (pp. 11-19). http://doi.acm.org/10.1145/2485760.2485764. Code examples can be converted into step-by-step tutorials, aiding in near-transfer tasks.
Heuzeroth, D., Holl, T., Hogstrom, G., & Lowe, W. (2003, May). Automatic design pattern detection. In Program Comprehension, 2003. 11th IEEE International Workshop on (pp. 94-103). https://doi.org/10.1109/WPC.2003.1199193. Static and dynamic analysis can detect observer, composite, mediator, chain of responsibility and visitor patterns.
Hou, D., & Li, L. (2011, June). Obstacles in using frameworks and APIs: An exploratory study of programmers' newsgroup discussions. In Program Comprehension (ICPC), 2011 IEEE 19th International Conference on (pp. 91-100). https://doi.org/10.1109/ICPC.2011.21. Replicates many of the learning barrier observations using discussion forum data.
Jeong, S., Xie, Y., Beaton, J., Myers, B., Stylos, J., Ehret, R., ... & Busse, D. (2009). Improving documentation for eSOA APIs through user studies. End-User Development, 86-105. User studies can improve API design.
Kagdi, H., Collard, M. L., & Maletic, J. I. (2005, May). Towards a taxonomy of approaches for mining of source code repositories. In ACM SIGSOFT Software Engineering Notes (Vol. 30, No. 4, pp. 1-5). https://doi.org/10.1145/1082983.1083159. There are many source-code aware techniques for mining software repositories.
Kim, M., Bergman, L., Lau, T., & Notkin, D. (2004, August). An ethnographic study of copy and paste programming practices in OOPL. In Empirical Software Engineering, 2004. ISESE'04. Proceedings. 2004 International Symposium on (pp. 83-92). https://doi.org/10.1109/ISESE.2004.1334896. Copying is frequent and results in defects, but is rational due to limitations of programming languages.
Ko, A. J., & Myers, B. A. (2005). A framework and methodology for studying the causes of software errors in programming systems. Journal of Visual Languages & Computing, 16(1), 41-84. https://doi.org/10.1016/j.jvlc.2004.08.003. Defects come from fragile knowledge.
Ko, A. J., & Riche, Y. (2011, September). The role of conceptual knowledge in API usability. In Visual Languages and Human-Centric Computing (VL/HCC), 2011 IEEE Symposium on (pp. 173-176). https://doi.org/10.1109/VLHCC.2011.6070395. To understand APIs, developers also need to understand the high-level concepts the API leverages in order to understand the hidden semantics in its abstractions.
Ko, A. J., DeLine, R., & Venolia, G. (2007, May). Information needs in collocated software development teams. In Software Engineering, 2007. ICSE 2007. 29th International Conference on (pp. 344-353). https://doi.org/10.1109/ICSE.2007.45. API learning is a frequent unmet need even amongst industry professionals.
Ko, A. J., Myers, B. A., & Aung, H. H. (2004, September). Six learning barriers in end-user programming systems. In Visual Languages and Human Centric Computing, 2004 IEEE Symposium on (pp. 199-206). https://doi.org/10.1109/VLHCC.2004.47. When learning an API, developers face design, selection, coordination, use, information, and understanding barriers.
Lethbridge, T. C., Singer, J., & Forward, A. (2003). How software engineers use documentation: The state of the practice. IEEE software, 20(6), 35-39. https://doi.org/10.1109/MS.2003.1241364. Organizations typically do not update documentation as timely or completely as software process personnel and managers advocate.
Lutters, W. G., & Seaman, C. B. (2007). Revealing actual documentation usage in software maintenance through war stories. Information and Software Technology, 49(6), 576-587. https://doi.org/10.1016/j.infsof.2007.02.013. Human sources of information are an important substitute when documentation is not available, and word of mouth is relied on as a method for finding knowledgeable people. Structural properties of documentation (e.g., tables of contents, indices) have a substantial effect on the ability of maintainers to make use of it.
Maalej, W., & Robillard, M. P. (2013). Patterns of knowledge in API reference documentation. IEEE Transactions on Software Engineering, 39(9), 1264-1282. https://doi.org/10.1109/TSE.2013.12. Java and .NET documentation tends to contain descriptions of functionality, concepts, purpose, quality, control, structure, patterns, examples, environment, and references, but the majority of it is functionality and non-information that is irrelevant or redundant in context.
McLellan, S. G., Roesler, A. W., Tempest, J. T., & Spinuzzi, C. I. (1998). Building more usable APIs. IEEE software, 15(3), 78-86. https://doi.org/10.1109/52.676963. Traditional usability methods are a good guide for API design.
Meng, M., Steinhardt, S., & Schubert, A. (2017). Application Programming Interface Documentation: What Do Software Developers Want?. Journal of Technical Writing and Communication. http://dx.doi.org/10.1177/0047281617721853. API documentation needs to provide API structure, rationale behind API decisions, code examples, conceptual background knowledge, completeness, accuracy.
Myers, B. A., & Stylos, J. (2016). Improving API usability. Communications of the ACM, 59(6), 62-69. http://dl.acm.org/citation.cfm?id=2896587. APIs can be designed to be more learnable.
Nasehi, S. M., Sillito, J., Maurer, F., & Burns, C. (2012, September). What makes a good code example?: A study of programming Q&A in StackOverflow. In Software Maintenance (ICSM), 2012 28th IEEE International Conference on (pp. 25-34). https://doi.org/10.1109/ICSM.2012.6405249. Explanations accompanying examples are as important as the examples themselves. Good examples were concise, highlighted key elements of solution, step-by-step explanations, links to further resources for learning, multiple alternative solutions, limitations of solutions, and API limitations.
Nykaza, J., Messinger, R., Boehme, F., Norman, C. L., Mace, M., & Gordon, M. (2002, October). What programmers really want: results of a needs assessment for SDK documentation. In Proceedings of the 20th annual international conference on Computer documentation (pp. 133-141). https://doi.org/10.1145/584955.584976. Examples help. This is one of the earliest studies to investigate this.
Parnin, C., & Treude, C. (2011, May). Measuring API documentation on the web. In Proceedings of the 2nd international workshop on Web 2.0 for software engineering (pp. 25-30). https://doi.org/10.1145/1984701.1984706. Stack Overflow eventually achieves high coverage of popular APIs, but that coverage accrues slowly.
Parnin, C., Treude, C., Grammel, L., & Storey, M. A. (2012). Crowd documentation: Exploring the coverage and the dynamics of API discussions on Stack Overflow. Georgia Institute of Technology, Tech. Rep. Developers directly browse official documentation in intermittent bursts. Some developers learn APIs through “apprenticeships” with expert Stack Overflow users. Developers continuously reference Stack Overflow questions during development via search.
Robillard, M. P., & Deline, R. (2011). A field study of API learning obstacles. Empirical Software Engineering, 16(6), 703-732. https://link.springer.com/article/10.1007/s10664-010-9150-8. When learning APIs, developers struggle to discover intent to understand the correct and most efficient way to use the API, code examples to solve coordination barriers, and instruction on how an API executes so that one can predict what will happen and diagnose failures.
Robillard, M. P., Bodden, E., Kawrykow, D., Mezini, M., & Ratchford, T. (2013). Automated API property inference techniques. IEEE Transactions on Software Engineering, 39(5), 613-637. https://doi.org/10.1109/TSE.2012.63. There are over 60 techniques for mining usage patterns, including unordered usage patterns, sequential usage patterns, behavioral specifications, migration mappings, and general information.
Robillard, M. P. (2009). What makes APIs hard to learn? Answers from developers. IEEE software, 26(6). https://doi.org/10.1109/MS.2009.193. Developers want API documentation to include good examples, be complete, support many complex usage scenarios, be conveniently organized, and include relevant design elements. Examples lead to frustration when not well adapted to a task.
Sadowski, C., Stolee, K. T., & Elbaum, S. (2015, August). How developers search for code: a case study. In Proceedings of the 2015 10th Joint Meeting on Foundations of Software Engineering (pp. 191-201). http://doi.acm.org/10.1145/2786805.2786855. Programmers search for code an average of 12 times each workday, seeking answers to questions about how to use an API, what code does, why something is failing, or where code is located.
Shi, N., & Olsson, R. A. (2006, September). Reverse engineering of design patterns from java source code. In Automated Software Engineering, 2006. ASE'06. 21st IEEE/ACM International Conference on (pp. 123-134). https://doi.org/10.1109/ASE.2006.57. Static and dynamic analysis can detect the Gang of Four patterns.
Shull, F., Lanubile, F., & Basili, V. R. (2000). Investigating reading techniques for object-oriented framework learning. IEEE Transactions on Software Engineering, 26(11), 1101-1118. https://doi.org/10.1109/32.881720. Generates hypothesis that examples are best.
Sillito, J., & Begel, A. (2013, May). App-directed learning: An exploratory study. In Cooperative and Human Aspects of Software Engineering (CHASE), 2013 6th International Workshop on (pp. 81-84). IEEE. https://doi.org/10.1109/CHASE.2013.6614736. When developers do self-directed learning, the features they want to build determine a syllabus for learning. The uniqueness of goals creates a mismatch between prepared documentation and learning goals.
Stylos, J., & Myers, B. (2007, September). Mapping the space of API design decisions. In Visual Languages and Human-Centric Computing, 2007. VL/HCC 2007. IEEE Symposium on (pp. 50-60). https://doi.org/10.1109/VLHCC.2007.44. API designs have many quality dimensions, including speed, memory usage, expressiveness, extensibility, evolvability, learnability, productivity, and error prevention.
Stylos, J., & Myers, B. A. (2006, September). Mica: A web-search tool for finding API components and examples. In Visual Languages and Human-Centric Computing, 2006. VL/HCC 2006. IEEE Symposium on (pp. 195-202). https://doi.org/10.1109/VLHCC.2006.32. Extracting information about an API and overlaying it can accelerate API learning.
Stylos, J., & Myers, B. A. (2008, November). The implications of method placement on API learnability. In Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of software engineering (pp. 105-112). http://doi.acm.org/10.1145/1453101.1453117. Where methods are placed in an API affects API learnability because developers have expectations from prior APIs.
Stylos, J., Faulring, A., Yang, Z., & Myers, B. A. (2009, September). Improving API documentation using API usage information. In Visual Languages and Human-Centric Computing, 2009. VL/HCC 2009. IEEE Symposium on (pp. 119-126). https://doi.org/10.1109/VLHCC.2009.5295283. Crowdsourcing expected API features and linking them to examples or the intended API makes developers faster at finding what they need.
Uddin, G., & Robillard, M. P. (2015). How API documentation fails. IEEE Software, 32(4), 68-75. https://doi.org/10.1109/MS.2014.80. Failures include incomplete, ambiguous, unexplained, obsolete, inconsistent, and incorrect content, and bloated, fragmented, entangled, and over-structured information.
Uddin, G., Dagenais, B., & Robillard, M. P. (2012, June). Temporal analysis of API usage concepts. In Proceedings of the 34th International Conference on Software Engineering (pp. 804-814). http://dl.acm.org/citation.cfm?id=2337223.2337318. Usage patterns can be mined over time, analyzing their introduction and evolution to a project.
Wang, J., Dang, Y., Zhang, H., Chen, K., Xie, T., & Zhang, D. (2013, May). Mining succinct and high-coverage API usage patterns from source code. In Proceedings of the 10th Working Conference on Mining Software Repositories (pp. 319-328). http://dl.acm.org/citation.cfm?id=2487085.2487146. It is possible to mine usage patterns and measure the success of mining them.
Watson, R. B. (2015). The Effect of Visual Design and Information Content on Readers’ Assessments of API Reference Topics (Doctoral dissertation). https://digital.lib.washington.edu/researchworks/handle/1773/33466. Visual design doesn’t matter, but content does.
Xie, T., & Pei, J. (2006, May). MAPO: Mining API usages from open source repositories. In Proceedings of the 2006 international workshop on Mining software repositories (pp. 54-57). http://doi.acm.org/10.1145/1137983.1137997. Given a query that describes a method, class, or package for an API, MAPO leverages existing source code search engines to gather a short list of frequent API usages for developers to inspect.
Zhang, C., Yang, J., Zhang, Y., Fan, J., Zhang, X., Zhao, J., & Ou, P. (2012, June). Automatic parameter recommendation for practical API usage. In Proceedings of the 34th International Conference on Software Engineering (pp. 826-836). http://dl.acm.org/citation.cfm?id=2337223.2337321. Mining frequently used parameters in APIs is useful in some tasks.