C++

From GnuCash
Revision as of 19:25, 15 September 2014 by Fell (talk | contribs) (Still 1 "Stroustroup" found)

Converting GnuCash to C++ from sort-of GObject

The underlying rationale for this page is a long thread on the gnucash-devel list titled Beyond 2.6. It's worth frequent review, as there are a lot of good ideas as well as valid concerns expressed there.

To summarize, GnuCash has grown over 10 years without a lot of thought given to ongoing design. The old design documents are still in the tree and, though they might be deleted soon, will always be in git history. They haven't been updated in more than 10 years. The reimplementation in GObject was done with a poor understanding of how GObject's emulation of object orientation works. That's not surprising; it was done in the early days of GObject, and GObject itself is complicated and generally lacking in the "syntactic sugar" which makes real object oriented languages usable.

There are lots of object oriented languages, though, so why C++?

  • C++ compilers are available on all major platforms except Android, and there's a shim library available there which may work to wrap a C++ library in a Java GUI. While Java is available on all the desktop platforms, it isn't available on iOS.
  • The conversion will take a long time, so interoperability with the existing C code is essential. Only C++ and Objective-C can be interspersed with C one line at a time, but Objective-C is not available on Microsoft platforms using native tools.
  • There are two widely-used C++-based cross-platform GUI libraries, wxWidgets and Qt. Both support all three major desktop platforms and iOS.
cstim's comment: In addition to this reasoning, everyone is invited to have a look at some already existing C++ wrapper objects in src/optional/gtkmm/gncmm, which make use of the glibmm/gtkmm C++ wrapper library around glib/gtk to present classes that "look like" real C++ classes. Using this sort of wrapper would even make it possible to do a step-by-step conversion to C++, as long as we accept the dependency on glibmm/glib in the core objects for the time being. See Cutecash on how to compile this part of the code with CMake.

Developer Preparation

C++ is easier to learn and to write than is GObject. Low bar. GObject is a bitch to write, and takes a lot of work to understand. C++ is easy to write, but it still takes some work to understand. Some very strongly recommended (I'd say required, but I don't want to be too scary) reading:

  • Bjarne Stroustrup, The C++ Programming Language, Fourth Edition, Addison-Wesley, 2013.
  • Nicolai Josuttis, The C++ Standard Library, A Tutorial and Reference, Second Edition, Addison-Wesley, 2012
  • Scott Meyers, Effective C++, More Effective C++, Effective STL, Addison-Wesley, 2005, 1996, and 2001 respectively. Meyers has promised a new Effective C++ updated for C++11/14 for Spring 2014.
  • Herb Sutter, Exceptional C++, Addison-Wesley, 1999

If you buy all of them new, it will run well over $200 in the US and probably more in Europe. Meyers's and Sutter's books are widely available used at a fraction of the cost, and understanding them will make you a much better C++ programmer. There's a fair amount of overlap between Stroustrup and Josuttis. Of the two, Josuttis is much more approachable for the working developer; Stroustrup is a bit academic. OTOH, Josuttis's coverage is a bit narrow, being mostly focused on the Standard Library. Earlier and therefore cheaper, especially used, versions of either are fine, except that they won't cover the recent developments in the language.

Study in particular templates and generic programming. I found it easier to grasp than rewiring my brain from structured to object oriented design, but that was a long and painful process. The seminal work about that was

  • James Coplien, Advanced C++ Programming Styles and Idioms, Addison-Wesley, 1991

which I found utterly impenetrable, but I haven't tried to read it in a long time. The book that really made it popular is

  • Andrei Alexandrescu, Modern C++ Design, Addison-Wesley, 2001

Stroustrup added a large section on templates in the fourth edition of The C++ Programming Language; earlier editions have a chapter which discusses the feature but doesn't go into much detail about what to do with it. The important thing about templates is that they allow one to push off some of the work and overhead consumed by pure OO onto the compiler, making the compiled result faster and smaller. Templates are also very helpful in reducing dependencies between classes, a major problem with GnuCash's current code. Moreover, most of the Standard Library and Boost are written using templates, and using those libraries effectively depends on a good understanding of templates.
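
As a rough illustration of pushing work onto the compiler, here is a minimal sketch of compile-time polymorphism. The types Cents and Shares and the function describe are invented for this example and are not part of GnuCash. Note that the two types share no base class, so neither depends on the other.

```cpp
#include <cassert>
#include <string>

// Hypothetical value types, invented for this sketch; neither derives
// from a common base class and neither knows about the other.
struct Cents
{
    long value;
    std::string label () const { return std::to_string (value) + " cents"; }
};

struct Shares
{
    long value;
    std::string label () const { return std::to_string (value) + " shares"; }
};

// The template only requires that T has a label() member. The compiler
// generates one version per type actually used, so no virtual table or
// shared base class is needed.
template <typename T>
std::string describe (const T& item)
{
    return "Holding: " + item.label ();
}
```

Calling describe (Cents{250}) and describe (Shares{3}) produces two separate functions at compile time; a type without a label() member is rejected by the compiler rather than failing at runtime.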

One other note about templates: They're much better than preprocessor macros. While macros work in C++ just like they do in C, they're very clumsy compared to templates. Whenever possible replace macros with const variables, constexpr functions, and templates. Remember that the compiler can't really understand macros, but it does understand templates.
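
A small sketch of what such a replacement might look like. The names MAX_SPLITS_MACRO, SQUARE_MACRO, max_splits, and square are made up for illustration and do not come from the GnuCash sources.

```cpp
#include <cassert>

// C-style macros, shown only for comparison. The preprocessor pastes
// text, so SQUARE_MACRO (i++) expands to ((i++) * (i++)).
#define MAX_SPLITS_MACRO 64
#define SQUARE_MACRO(x) ((x) * (x))

// Typed, scoped replacements that the compiler can check.
constexpr int max_splits = 64;

template <typename T>
constexpr T square (T x)
{
    return x * x;
}

// A constexpr result can be used where a compile-time constant is needed.
static_assert (square (max_splits) == 4096, "evaluated at compile time");
```

Unlike the macro, square evaluates its argument exactly once, obeys namespaces, and shows up by name in compiler diagnostics and the debugger.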

Also familiarize yourself with the Standard Library Algorithms. These are highly optimized implementations of common things you need to do. You're not likely to have time to write better code, so use the algorithms every chance you get.
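
To give a feel for it, here is a minimal sketch using three of the standard algorithms on a vector of amounts in cents. The function names and the convention that debits are negative are assumptions made for this example only.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Sum a list of amounts (in cents) without a hand-written loop.
long total_cents (const std::vector<long>& amounts)
{
    return std::accumulate (amounts.begin (), amounts.end (), 0L);
}

// Count the entries that are debits, here assumed to be negative values.
long count_debits (const std::vector<long>& amounts)
{
    return std::count_if (amounts.begin (), amounts.end (),
                          [] (long a) { return a < 0; });
}

// Return a sorted copy, leaving the caller's data untouched.
std::vector<long> sorted_copy (std::vector<long> amounts)
{
    std::sort (amounts.begin (), amounts.end ());
    return amounts;
}
```

Each call states what is being done (accumulate, count, sort) rather than how, which tends to be easier to review than the equivalent loops.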

If you're new to C++ or haven't used it in a long time, the most highly regarded introductory book is Stanley Lippman, Josée Lajoie, and Barbara Moo, C++ Primer, 5th Edition, Addison-Wesley, 2013.

A couple of other excellent reference sites are

C++11/14

A greatly improved language was released in 2011. The standards committee expects to release some tweaks and fixes in 2014. Fortunately most of those fixes are already implemented in already-available compilers or in Boost. This modern C++ is vastly more expressive, easier to use, and safer than the previous C++98 standard, so it's what we'll adopt for the project.

Dependencies

GLib provides a ton of useful cross-platform support functions, macros, and classes. Almost all of them are replaceable with algorithms and containers from the Standard Library. Most places where GnuCash needs an extensible collection use GList, but would be better served by std::vector<>, which is vastly more efficient. (Discussion in Talk:C++). The rest of what GLib and GObject provide, along with a couple of things that we did ourselves (GUID, gnc-date, gnc-numeric, and QofSignal), can be provided by the Boost libraries. The goal is to have no dependencies other than the Standard Library and Boost except in the GUI, import-export, and backends.
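
A minimal sketch of the GList-to-std::vector idea, written without GLib so that it stands alone. SplitInfo, make_splits, and balance are hypothetical names for this example; SplitInfo is not the engine's real Split type.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified record for this sketch; it is not the
// engine's real Split type.
struct SplitInfo
{
    std::string memo;
    long cents;
};

// With GList each element is a separately allocated node holding an
// untyped gpointer. A std::vector stores typed elements contiguously
// and releases them automatically when it goes out of scope.
std::vector<SplitInfo> make_splits ()
{
    std::vector<SplitInfo> splits;
    splits.reserve (2);                  // one allocation up front
    splits.push_back ({"Groceries", -4250});
    splits.push_back ({"Checking", 4250});
    return splits;
}

// A range-based for loop replaces the usual
// for (node = list; node; node = node->next) walk and its casts.
long balance (const std::vector<SplitInfo>& splits)
{
    long sum = 0;
    for (const auto& split : splits)
        sum += split.cents;
    return sum;
}
```

There is no counterpart to g_list_free here: the vector and its elements are destroyed when the owning variable goes out of scope.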

The Plan

The first phase of the conversion focuses on three source directories, in order: libqof/qof, backend, and engine.

QOF

  1. Make the module compile in C++. Complete
  2. Replace the internals of some utility classes with Boost template classes.
    1. GncGUID with boost::uuids::uuid (Boost.Uuid). In Progress
    2. GncNumeric to boost::rational over a Boost.Multiprecision integer type, which removes overflow problems that occur with certain currencies and securities requiring very small fractions.
    3. GncDate to Boost.Date_Time, which offers an opportunity to greatly simplify the API and resolve long-standing complaints about entry dates shifting with the timezone in which the computer is being used.
These classes are very independent and so in theory can be converted without a lot of side effects, which will help us to gain experience in the process without a lot of complications.
  3. QofSession and QofBackend
  4. QofId, QofClass, QofObject
  5. Other QOF utility classes
  6. QofInstance and QofBook
  7. QofQuery

Backends

  1. DBI: Replace with ODB
  2. SQL
  3. XML

Engine

  1. [TBD]

Coding and Design Recommendations

Coding

  • Continue to use the GnuCash style:
    • Class names are camel-cased
    • Function names are underscore-separated. C++ class member function names should not repeat the class name, C wrappers should.
    • Indent 4 spaces, one space before an opening parenthesis or curly brace and after a comma, otherwise no spaces.
    • Opening and closing curly braces delimiting a block should be on a separate line, aligned with each other and with the statement which introduces the block; statements inside the block should be indented.
  • But use C++ comments introduced with // when they're only one line.
  • Use C++11 syntax whenever possible; in particular use curly braces for initializers and auto for typenames.
  • Avoid naked pointers when practical, using std::unique_ptr and std::shared_ptr instead. Note that you can't pass these to C, which doesn't know how to dereference them, so anything passed to a C function must be a naked pointer.
  • Use good judgement when using the standard namespaces std and boost. Consider using only specific identifiers so that infrequently used ones are tagged with the namespace. This can make it easier to understand code by making clear what is local and what is imported.
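
A short sketch of how several of these recommendations might look together. AccountName, account_name_length, and name_length_via_c are hypothetical names invented for this example. It uses new inside a std::unique_ptr rather than std::make_unique so that it compiles as C++11.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

using std::unique_ptr;   // import only the identifier that is used often

// Hypothetical class for this sketch.
class AccountName
{
public:
    explicit AccountName (std::string name) : m_name {std::move (name)} {}
    const std::string& name () const { return m_name; }
private:
    std::string m_name;
};

// Stand-in for a C-callable function; C code would see AccountName
// only as an opaque pointer.
extern "C" std::size_t account_name_length (const AccountName* account)
{
    return account->name ().size ();
}

std::size_t name_length_via_c ()
{
    // Curly-brace initializers and auto for the type name.
    auto account = unique_ptr<AccountName> {new AccountName {"Assets"}};
    // The C interface cannot take the smart pointer, so hand it the
    // naked pointer; ownership stays with the unique_ptr.
    return account_name_length (account.get ());
}
```

The object is deleted when account goes out of scope, so the C side must not keep the pointer beyond the call.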

Design

  • Don't waste time with API that isn't immediately used. When you find unused C API, don't reimplement it, remove it.
  • If there's an STL or Boost algorithm that does what you need, use it. Those guys are better programmers than all of us put together. Take advantage of that. That requires some familiarity with what's available, so do study in particular the STL algorithms.
  • Minimize dependencies between classes. Dependencies make it harder to change class internals and harder to write good tests.
  • Pay close attention to memory management and concurrency. In particular avoid statics and provide locking when writing to object members. These are both particular weaknesses of the existing implementation.
  • Write unit tests for your C++ code. When possible test the C code with the tests first so that you can be confident that the C++ implementation doesn't make any unintended changes. Don't forget to test failure conditions as well as success conditions.
  • Prefer compile-time polymorphism (using templates or concepts) when it makes sense to do so: It's faster and smaller at runtime.
  • Don't make radical changes to the class hierarchy at this point, but do make changes that improve design flexibility.
  • Be mindful of patterns and use them to inform your design decisions.
  • Overloading in C++ is good, but every overload visible to C needs a separate function, so don't get carried away.
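
The last point can be sketched as follows. Budget and the budget_* wrappers are hypothetical names for this example, not existing GnuCash API. The sketch also follows the naming rule from the Coding section: member functions don't repeat the class name, C wrappers do.

```cpp
#include <cassert>
#include <string>

// Hypothetical class for this sketch. The member functions are
// overloaded and do not repeat the class name.
class Budget
{
public:
    void add (long cents) { m_total += cents; }
    void add (const std::string& cents_text)
    {
        m_total += std::stol (cents_text);
    }
    long total () const { return m_total; }
private:
    long m_total {0};
};

// C has no overloading, so each overload visible to C needs its own
// uniquely named wrapper, and the wrappers do repeat the class name.
extern "C"
{
    void budget_add_cents (Budget* budget, long cents)
    {
        budget->add (cents);
    }

    void budget_add_text (Budget* budget, const char* cents_text)
    {
        budget->add (std::string {cents_text});
    }

    long budget_total (const Budget* budget)
    {
        return budget->total ();
    }
}
```

Two overloads already cost two extra wrappers here. A real wrapper would also need to catch exceptions (std::stol throws on bad input) before returning to C, which is omitted to keep the sketch short.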