Proceedings of the Second International Workshop on Library-Centric Software Design (LCSD '06)


Technical Report No. 06-18

Proceedings of the Second International Workshop on Library-Centric Software Design (LCSD '06)

JOSHUA BLOCH
JAAKKO JÄRVI
(PROGRAM CO-CHAIRS)

ANDREAS PRIESNITZ
SIBYLLE SCHUPP
(PROCEEDINGS EDITORS)

Department of Computer Science and Engineering
Division of Computing Science
CHALMERS UNIVERSITY OF TECHNOLOGY / GÖTEBORG UNIVERSITY
Göteborg, Sweden, 2006

Technical Report in Computer Science and Engineering at Chalmers University of Technology and Göteborg University

Technical Report No. 06-18
ISSN: 1652-926X

Department of Computer Science and Engineering
Chalmers University of Technology and Göteborg University
SE-412 96 Göteborg, Sweden

Göteborg, Sweden, October 2006

Proceedings of the Second International Workshop on Library-Centric Software Design (LCSD '06)

An OOPSLA Workshop
October 22, 2006
Portland, Oregon, USA

Joshua Bloch and Jaakko Järvi (Program Co-Chairs)
Andreas Priesnitz and Sibylle Schupp (Proceedings Editors)

Chalmers University of Technology
Computer Science and Engineering Department
Technical Report 06-18

Foreword

These proceedings contain the papers selected for presentation at the workshop Library-Centric Software Design (LCSD), held on October 22nd, 2006 in Portland, Oregon, USA, as part of the yearly ACM OOPSLA conference. The current workshop is the second LCSD workshop in the series. The first ever LCSD workshop in 2005 was a success, and we are thus very pleased to see that interest in the current workshop was even higher.

Software libraries are central to all major scientific, engineering, and business areas, yet the design, implementation, and use of libraries are underdeveloped arts. The goal of the Library-Centric Software Design workshop therefore is to place the various aspects of libraries on a sound technical and scientific footing.
To that end, we welcome both research into fundamental issues and the documentation of best practices. The idea for a workshop on Library-Centric Software Design was born at the Dagstuhl meeting Software Libraries: Design and Evaluation in March 2005. LCSD now has a steering committee that develops the workshop further and coordinates the organization of future events. The committee currently consists of Josh Bloch, Jaakko Järvi, Sibylle Schupp, Dave Musser, Alex Stepanov, and Frank Tip. We aim to keep LCSD growing.

For the current workshop, we received 20 submissions, nine of which were accepted as technical papers, and an additional four as position papers. The topics of the papers covered a wide area of the field of software libraries, including library evolution; abstractions for generic manipulation of complex mathematical structures; static analysis and type systems for software libraries; extensible languages; and libraries with run-time code generation capabilities. All papers were reviewed for soundness and relevance by three or more reviewers. The reviews were very thorough, for which we thank the members of the program committee. In addition to paper presentations, workshop activities included a keynote by Sean Parent, Adobe Inc. At the time of writing this foreword, we do not yet know the exact attendance of the workshop; the registrations received suggest close to 50 attendees.

We thank all authors, reviewers, and the organizing committee for their work in bringing about the LCSD workshop. We are very grateful to Sibylle Schupp, David Musser, and Jeremy Siek for their efforts in organizing the event, as well as to DongInn Kim and Andrew Lumsdaine for hosting the CyberChair system to manage the submissions. We also thank Tim Klinger and the OOPSLA workshop organizers for the help we received.

We hope you enjoy the papers, and that they generate new ideas leading to advances in this exciting field of research.
Jaakko Järvi
Joshua Bloch
(Program co-chairs)

Organization

Workshop Organizers
- Josh Bloch, Google Inc.
- Jaakko Järvi, Texas A&M University
- David Musser, Rensselaer Polytechnic Institute
- Sibylle Schupp, Chalmers University of Technology
- Jeremy Siek, Rice University

Program Committee
- Dave Abrahams, Boost Consulting
- Olav Beckmann, Imperial College London
- Hervé Brönnimann, Polytechnic University
- Cristina Gacek, University of Newcastle upon Tyne
- Douglas Gregor, Indiana University
- Paul Kelly, Imperial College London
- Doug Lea, State University of New York at Oswego
- Andrew Lumsdaine, Indiana University
- Erik Meijer, Microsoft Research
- Tim Peierls, Prior Artisans LLC
- Doug Schmidt, Vanderbilt University
- Anthony Simons, University of Sheffield
- Bjarne Stroustrup, Texas A&M University and AT&T Labs
- Todd Veldhuizen, University of Waterloo

Contents

Active Libraries
- An Active Linear Algebra Library Using Delayed Evaluation and Runtime Code Generation
  Francis P. Russell, Michael R. Mellor, Paul H. J. Kelly, and Olav Beckmann
- Efficient Run-Time Dispatching in Generic Programming with Minimal Code Bloat
  Lubomir Bourdev and Jaakko Järvi
- Generic Library Extension in a Heterogeneous Environment
  Cosmin Oancea and Stephen M. Watt
- Adding Syntax and Static Analysis to Libraries via Extensible Compilers and Language Extensions
  Eric Van Wyk, Derek Bodin, and Paul Huntington

Type Systems and Static Analysis
- A Static Analysis for the Strong Exception-Safety Guarantee
  Gustav Munkby and Sibylle Schupp
- Extending Type Systems in a Library
  Yuriy Solodkyy, Jaakko Järvi, and Esam Mlaih
- Anti-Deprecation: Towards Complete Static Checking for API Evolution
  S. Alexander Spoon

Libraries Manipulating Complex Structures
- A Generic Lazy Evaluation Scheme for Exact Geometric Computations
  Sylvain Pion and Andreas Fabri
- A Generic Topology Library
  René Heinzl, Michael Spevak, and Philipp Schwaha

Position Papers
- A Generic Discretization Library
  Michael Spevak, René Heinzl, and Philipp Schwaha
- The SAGA C++ Reference Implementation
  Hartmut Kaiser, Andre Merzky, Stephan Hirmer, and Gabrielle Allen
- A Parameterized Iterator Request Framework for Generic Libraries
  Jacob Smith, Jaakko Järvi, and Thomas Ioerger
- Pound Bang What?
  John P. Linderman

An Active Linear Algebra Library Using Delayed Evaluation and Runtime Code Generation
[Extended Abstract]

Francis P Russell, Michael R Mellor, Paul H J Kelly and Olav Beckmann
Department of Computing
Imperial College London
180 Queen's Gate, London SW7 2AZ, UK

ABSTRACT

Active libraries can be defined as libraries which play an active part in the compilation (in particular, the optimisation) of their client code. This paper explores the idea of delaying evaluation of expressions built using library calls, then generating code at runtime for the particular compositions that occur. We explore this idea with a dense linear algebra library for C++. The key optimisations in this context are loop fusion and array contraction.

Our library automatically fuses loops, identifies unnecessary intermediate temporaries, and contracts temporary arrays to scalars. Performance is evaluated using a benchmark suite of linear solvers from ITL (the Iterative Template Library), and is compared with MTL (the Matrix Template Library). Excluding runtime compilation overheads (caching means they occur only on the first iteration), for larger matrix sizes, performance matches or exceeds MTL, and in some cases is more than 60% faster.

1. INTRODUCTION

The idea of an "active library" is that, just as the library extends the language available to the programmer for problem solving, so the library should also extend the compiler. The term was coined by Czarnecki et al. [5], who observed that active libraries break the abstractions common in conventional compilers. Active libraries are described in detail by Veldhuizen and Gannon [8].

This paper presents a prototype linear algebra library which we have developed in order to explore one interesting approach to building active libraries. The idea is to use a combination of delayed evaluation and runtime code generation to:

Delay library call execution. Calls made to the library are used to build a "recipe" for the delayed computation. When execution is finally forced by the need for a result, the recipe will commonly represent a complex composition of primitive calls.

Generate optimised code at runtime. Code is generated at runtime to perform the operations present in the delayed recipe. In order to obtain improved performance over a conventional library, it is important that the generated code should, on average, execute faster than a statically generated counterpart in a conventional library. To achieve this, we apply optimisations that exploit the structure, semantics and context of each library call.

This approach has the advantages that:

• There is no need to analyse the client source code.
• The library user is not tied to a particular compiler.
• The interface of the library is not overcomplicated by the concerns of achieving high performance.
• We can perform optimisations across both statement and procedural boundaries.
• The code generated for a recipe is isolated from client-side code; it is not interwoven with non-library code.
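To make the benefit of optimising across statement boundaries concrete, the following hand-written sketch contrasts two separate library-style operations with a single fused loop in which the temporary array has been contracted to a scalar. It only illustrates the kind of code that loop fusion and array contraction aim for; it is not the library's generated output, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is not the library's API.

// Unfused form: two conventional library-style operations, each with its
// own loop. The first materialises a temporary vector t = a*x + y, the
// second reduces it against z.
double axpy_then_dot_unfused(double a,
                             const std::vector<double>& x,
                             const std::vector<double>& y,
                             const std::vector<double>& z) {
    assert(x.size() == y.size() && x.size() == z.size());
    std::vector<double> t(x.size());            // intermediate temporary array
    for (std::size_t i = 0; i < x.size(); ++i)  // loop 1: t = a*x + y
        t[i] = a * x[i] + y[i];
    double r = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)  // loop 2: r = t . z
        r += t[i] * z[i];
    return r;
}

// Fused and contracted form: both operations share one loop, and the
// temporary array shrinks to a single scalar that lives for one iteration.
double axpy_then_dot_fused(double a,
                           const std::vector<double>& x,
                           const std::vector<double>& y,
                           const std::vector<double>& z) {
    assert(x.size() == y.size() && x.size() == z.size());
    double r = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double t = a * x[i] + y[i];       // contracted temporary
        r += t * z[i];
    }
    return r;
}
```

Both functions compute the same value; the fused version reads each operand once and allocates no temporary storage, which is the effect a cross-statement optimiser would be after.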
The last of these advantages, the isolation of recipe code from client code, is particularly important, as we shall see: because the structure of the code for a recipe is restricted in form, we can introduce compilation passes specially targeted to achieve particular effects.

The disadvantage of this approach is the overhead of runtime compilation and the infrastructure to delay evaluation. In order to minimise the first factor, we maintain a cache of previously generated code along with the recipe used to generate it. This enables us to reuse previously optimised and compiled code when the same recipe is encountered again.

There are also more subtle disadvantages. In contrast to a compile-time solution, we are forced to make online decisions about what to evaluate, and when. Living without static analysis of the client code means we don't know, for example, which variables involved in a recipe are actually live when the recipe is forced. We return to these issues later in the paper.

Our exploration covers the following ground:

1. We present an implementation of a C++ library for dense linear algebra which provides functionality sufficient to operate with the majority of methods available in the Iterative Template Library [6] (ITL), a set of templated linear iterative solvers for C++.
2. This implementation delays execution, generates code for delayed recipes at runtime, and then invokes a vendor C compiler at runtime, entirely transparently to the library user.
3. To avoid repeated compilation of recurring recipes, we cache compiled code fragments (see Section 4).
4. We implemented two optimisation passes which transform the code prior to compilation: loop fusion, and array contraction (see Section 5).
5. We introduce a scheme to predict, statistically, which intermediate variables are likely to be used after recipe execution; this is used to increase opportunities for array contraction (see Section 6).
6. We evaluate the effectiveness of the approach using a suite of iterative linear system solvers, taken from the Iterative Template Library (see Section 7).

Although the exploration of these techniques has used only dense linear algebra, we believe these techniques are more widely applicable. Dense linear algebra provides a simple domain in which to investigate, understand and demonstrate these ideas. Other domains we believe may benefit from these techniques include sparse linear algebra and image processing operations.

The contributions we make with this work are as follows:

• Compared to the widely used Matrix Template Library [7], we demonstrate performance improvements of up to 64% across our benchmark suite of dense linear iterative solvers from the Iterative Template Library. Performance depends on platform, but on a 3.2GHz Pentium 4 (with 2MB cache) using the Intel C Compiler, average improvement across the suite was 27%, once cached compiled code was available.
• We present a cache architecture that finds applicable pre-compiled code quickly, and which supports annotations for adaptive re-optimisation.
• Using our experience with this library, we discuss some of the design issues involved in using the delayed-evaluation, runtime code generation technique.

We discuss related work in Section 8.

Figure 1: An example DAG. The rectangular node denotes a handle held by the library client. The expression represents the matrix-vector multiply function from Level 2 BLAS, y = αAx + βy.

2. DELAYING EVALUATION

Delayed evaluation provides the mechanism whereby we collect the sequences of operations we wish to optimise. We call the runtime information we obtain about these operations runtime context information. This information may consist of values such as matrix or vector sizes, or the various relationships between successive library calls.
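A minimal sketch of how delayed evaluation might look to a client is given below, under several assumptions: it handles only vectors (a simplified α·x + y rather than the full matrix-vector expression of Figure 1), it interprets the recorded expression directly instead of generating code, and every name in it (`Node`, `LazyVec`, `force`) is hypothetical rather than taken from the library. Reference counting via `std::shared_ptr` is used here as a stand-in for nodes that disappear once nothing depends on them.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

using Vec = std::vector<double>;

// A node of the expression graph: either a literal holding data, or a
// pending operation whose operands are other nodes. (Hypothetical layout.)
struct Node {
    enum class Kind { Literal, Add, Scale };
    Kind kind = Kind::Literal;
    Vec value;                        // meaningful only for Literal nodes
    double factor = 1.0;              // used by Scale nodes
    std::shared_ptr<Node> lhs, rhs;   // operands required before evaluation
};

// Client-side handle. Arithmetic on handles records work; nothing is
// computed until force() is called.
class LazyVec {
public:
    explicit LazyVec(Vec v) : node_(std::make_shared<Node>()) {
        node_->value = std::move(v);
    }

    friend LazyVec operator+(const LazyVec& a, const LazyVec& b) {
        auto n = std::make_shared<Node>();
        n->kind = Node::Kind::Add;
        n->lhs = a.node_;
        n->rhs = b.node_;
        return LazyVec(std::move(n));
    }

    friend LazyVec operator*(double s, const LazyVec& a) {
        auto n = std::make_shared<Node>();
        n->kind = Node::Kind::Scale;
        n->factor = s;
        n->lhs = a.node_;
        return LazyVec(std::move(n));
    }

    // True once the node behind this handle has become a literal.
    bool evaluated() const { return node_->kind == Node::Kind::Literal; }

    // Forces evaluation and returns the resulting data.
    const Vec& force() const {
        evaluate(*node_);
        return node_->value;
    }

private:
    explicit LazyVec(std::shared_ptr<Node> n) : node_(std::move(n)) {}

    // Evaluates operands first, then turns this node into a literal.
    static void evaluate(Node& n) {
        if (n.kind == Node::Kind::Literal) return;
        evaluate(*n.lhs);
        if (n.rhs) evaluate(*n.rhs);
        const Vec& a = n.lhs->value;
        Vec out(a.size());
        if (n.kind == Node::Kind::Add) {
            const Vec& b = n.rhs->value;
            assert(a.size() == b.size());
            for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] + b[i];
        } else {
            for (std::size_t i = 0; i < a.size(); ++i) out[i] = n.factor * a[i];
        }
        n.value = std::move(out);
        n.kind = Node::Kind::Literal;
        n.lhs.reset();   // operands with no remaining owners are freed here
        n.rhs.reset();
    }

    std::shared_ptr<Node> node_;
};
```

With this sketch, `LazyVec r = 2.0 * x + y;` only builds a three-node graph; the arithmetic runs when `r.force()` is called. A code-generating library would use the same forcing point to inspect the whole recorded expression before deciding what code to emit.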
Knowledge of dynamic values such as matrix and vector sizes allows us to improve the performance of the implementation of operations using these objects. For example, the runtime code generation system (see Section 3) can use this information to specialise the generated code. One specialisation we perform concerns loop bounds: we incorporate dynamically known sizes of vectors and matrices as constants in the runtime-generated code.

Delayed evaluation in the library we developed works as follows:

• Delayed expressions built using library calls are represented as Directed Acyclic Graphs (DAGs).
• Nodes in the DAG represent either data values (literals) or operations to be performed on them.
• Arcs in the DAG point to the values required before a node can be evaluated.
• Handles held by the library client may also hold references to nodes in the expression DAG.
• Evaluation of the DAG involves replacing non-literal nodes with literals.
• When a node no longer has any nodes or handles depending on it, it deletes itself.

[...] with the result that the matrix involved was only iterated over once for both operations. A graph of the speedup obtained across matrix sizes is shown in Figure 2. The second optimisation implemented was array contraction. We only evaluated this in the presence of loop fusion as the former is often facilitated by the latter. The array contraction pass did not show any noticeable improvement on any of the [...]

[...] accessed via the getErasedSTL function in the form of an unsigned long value. The implementation of the erase function retrieves the STL objects corresponding to the GIDL wrapper parameters, calls the STL erase function on the STL vector reference, and creates a new GIDL server corresponding to the iterator result. Note that the semantics of the erase function are irrelevant in what the translation mechanism [...]
[...] depend on the types of the elements contained in these containers, a high-quality implementation is expected to hoist this functionality to non-generic functions. The GNU Standard C++ Library v3 does exactly this: the tree balancing functions operate on pointers to a non-generic base class of the tree's node type. In the case of associative containers, the tree node type is split into a generic and non-generic [...]

[...] extension of multiple, independent dimensions of the library's behavior. In this situation, there are questions of how the extended library's hierarchy relates to the original library's hierarchy, how objects from independent extensions may be used and how the extensions interact. This paper examines the question of library extension in a heterogeneous environment. We consider the situation where software [...]

[...] properties of the extension:

• The extension interface should be type-precise and it should allow type-safety reasoning with respect to the extension itself. The type-safety result for the whole framework would thus be derived from the ones of the extensions and of the underlying architecture.
• The extension should be split in first-class value components. In the GIDL case for example, one component should [...]

[...] implementation retrieves the parameters' UA-objects, invokes the UA method on these, and performs the reverse operation on the result. The wrapper skeleton functionality is the inverse of the client. The wrapper skeleton method creates GIDL stub wrapper objects encapsulating the UA objects, thus recovering the generic type erased information. It then invokes the user-implemented server method with these parameters, [...]
[...] for the application of a convolution filter to an image. As the size and the values of the convolution matrix are known at the runtime code generation stage, the two inner loops of the convolution can be unrolled and specialised with the values of the matrix elements. Another example shows how a runtime search can be performed to find an optimal tile size for a matrix multiply. TaskGraph is also used as the [...]

[...] the STL orthogonal design of its domains. For example GIDL iterators are themselves valid STL iterators and thus they can be manipulated by the STL containers and algorithms. In this context we investigate the issues that prevent the translation to conform with the library semantics, the techniques to amend them, and the tradeoffs between translation ease of use and performance. The second objective was [...]

[...] comparison of the BiConjugate Gradient solver against MTL running on architecture 2 is shown in Figure 4. In the figures just quoted, we excluded the runtime compilation overhead, leaving just the performance increase in the numerical operations. As the iterative solvers use code caching, the runtime compilation overhead is independent of the number of iterations executed. Depending on the number of iterations [...]

[...] retrieves the UA IDL-object or value of the result and passes it to the IDL skeleton. The extension introduces an extra level of indirection with respect to the method invocation mechanism of the underlying framework. This is the price to pay for the generality of the approach: this generic extension will work on top of any UA vendor implementation while maintaining backward compatibility. However, since the [...]
