1. Introduction
The phrase "it doesn’t matter, just be consistent" has really harmed the C++ community more than it has helped. "A foolish consistency," if you will. Developers' desire to keep their codebase unique and "special" has held back the development of tooling when every user insists that a tool support their own custom style.
For some things, like where you put the opening brace, the difference might truly be insignificant. For other things, this isn’t the case.
One such thing is the layout of a project on a filesystem.
Incongruous layouts not only harms tooling, they create unnecessary disparity, create friction during on-boarding of contributors, and present a huge burden to beginners for a task that shouldn’t even take more than a few seconds of thought.
For virtually all projects, they are some (maybe improper) subset of a few key components:
-
Compilable source files
-
Public headers
-
Private headers
-
Source files containing entry points (
functions)main () -
Documentation files
-
Tests
-
Samples and examples
-
External libraries which have been embedded within the project structure
-
"Add-ons" to the source (e.g. language bindings, optional plugins, platform bindings)
In addition, it is very common to see projects which subdivide themselves into submodules. These submodules may also have dependencies between each-other. Submodules greatly increase the complexity potential, but aren’t completely out of the question. A project layout standard must be able to handle subdivided projects.
1.1. Background and Terminology
The ideas and concepts presented herein are not novel. Rather, they are built upon the work of others. Some come from professional idea and experience, while others come from community convention, which has gradually converged on to prefer certain patterns while decrying others.
1.1.1. Physical vs. Logical Layout
This document builds heavily upon the work of John Lakos, who has spear-headed much work on the aspects of physical design. Much work has been put into evangelizing in the area of logical design, but physical design to this day remains a topic that (in this author’s opinion) is too often overlooked when designing and implementing systems, as well as the teaching thereof.
John Lakos’s 1996 book Large-Scale C++ Software Design, to this day, remains as the book on physical design in C++.
So what’s the difference between physical and logical design?
Logical design pertains closely to programming-language-level aspects of the design of a system.
Physical design pertains to the aspects of design outside of what is encoded by the language.
The source unit dealt with by the C and C++ standards is the translation unit. This corresponds to a source file which has had all preprocessor directives resolved. This is the closest that the language standards get to discussing physical design of systems.
Physical design, despite its name, still lives within the digital space, of course. It concerns the placement and naming of files and directories within a filesystem, as well as how to coordinate communication between physical units where the relationship between their filesystem locations is not deterministic.
This document does not address much in the way of logical design. Rather, it is specifically written to address many common questions and resolve common answers for physical design.
1.1.2. The Physical Component
The most fundamental unit of physical design is the Component.
A component should correspond to exactly two files: A header and a source file. For some rare cases it may be beneficial to use more than a single source file in a single physical component, but it may be a sign that further refactoring is needed.
Additionally, a component may have a test source file, but it is not considered a part of the physical component.
1.1.3. Logical/Physical Coherence
This document assumes that developers will work to maintain logical/physical coherence. There must be a coherent relationship between the logical and physical components of a system, and a single logical component must not be spread across multiple physical components. Multiple logical components should not appear within the same physical component.
Note: Logical components can include more than an individual class or
function. For example, it may include classes representing private data (for
the PIMPL pattern), or
classes and functions. Blindly splitting every
class and function into individual physical components is not required nor
recommended.
Physical components must not form cyclic dependencies.
// list.hpp namespace acme { template < typename T > class list ; namespace detail { // Class implementing a "list iterator" template < typename T > class list_iterator { public : // Public default constructor list_iterator () = default ; private : // A private constructor used to initialize the iterator to the proper state explicit list_iterator ( list < T >& ); // Permit our list class to construct us with the private constructor friend class list < T > ; }; } template < typename T > class list { public : using iterator = detail :: list_iterator < T > ; // The iterator class is a friend, giving it access to our internals friend class detail :: list_iterator < T > ; }; }
The
is empty other than a single include directive:
// list.cpp #include "list.hpp"
The purpose of the otherwise empty
is to ensure that the
file will compile in isolation, ensuring that the header has the necessary
directives and/or forward declarations.
Despite having two distinct language-level
templates, we have a single logical component. We say that these two class templates are "co-located."
We also have two source files, a
and
, but they together
form a single physical component.
// connection.hpp #include <memory>namespace acme { // Forward-declare the private member data type namespace detail { class connection_ipml ; } // The wrapper class class connection { private : // The actual PIMPL std :: unique_ptr < detail :: connection_impl > _impl ; public : connection ( const std :: string & address ); ~ connection (); }; }
And a corresponding
:
#include "connection.hpp"namespace acme :: detail { class connection_impl { // Whatever we need for the connection... }; } namespace { std :: unique_ptr < acme :: detail :: connection_impl > init_private_data ( const std :: string & address ) { // Initialize and return the connection data... } } acme :: connection ::~ connection () = default ; acme :: connection :: connection ( const std :: string & address ) : _impl { init_private_data ( address ) } {}
Again we have two different (non-
) classes, and a single logical
component. In the header, the implementation detail class is only forward declared, not fully defined. In the
we provide the definition
for the detail class.
1.1.4. Packages, Projects, Modules, and Submodules
The terms defined in prior sections were heavily borrowed from John Lakos’s work. His work also often uses the term package to refer to a collection of source components. This would roughly correspond to the entire source repository, including build files, support files, documentation, scripts, and data.
Unfortunately, the term "package" has been heavily overloaded. These days, "package" most often refers to a unit of distribution of software, rather than the software itself. For this reason, this document prefers the term "project," and will use it instead of "package," although they mean the same thing.
In a similar vein, the term module has also been heavily overloaded. In the wake of upcoming C++ module specifications and implementations, the word "module" will be avoided as to avoid ambiguity and confusion as a "module" corresponds with neither a package/project nor a physical component.
Still, a term is needed to refer to the subdivisions of a large project into smaller elements. For example, Qt is an enormous framework of many interconnected pieces. To refer to these pieces Qt uses the term "module", which has already been excluded.
For lack of a better unqualified term, the best this author can find is a qualified term: Submodule. This term will be used to denote separate sections of a project which can be consumed on as-needed basis. See the § 4 Submodules for more information.
1.2. Project Files
PFL prescribes several files that should be present in the root of the project:
-
A
file should be present. It should be easily readable in plaintext, but may use "enhanced" plaintext like Markdown or similar. It should contain a description of the contents of the directory and subdirectories.README -
A
file must be present for projects that wish to redistribute themselves. It must be plaintext (ie. not enhanced with markup).LICENSE
Tool-support files, such as
and
, may be present in
this directory.
Other files in the root directory must be pertinent to the build system of the project. Other files should not appear in the root of the project.
1.3. Project Directories
PFL prescribes several directories that should appear at the root of the project tree. Not all of the directories are required, but they have an assigned purpose, and no other directory in the filesystem may assume the role of one of these directories. That is, these directories must be the ones used if their purpose is required.
Other directories should not appear at the root.
Note: If you have a need not fulfilled by a PFL directory listed below, that is a bug in this specification, and I would love to hear from you! Before reporting, double-check that what you need isn’t listed below and in the following sections.
- build/
-
A special directory that should not be considered part of the source of the project. Used for storing ephemeral build results. must not be checked into source control. If using source control, must be ignored using source control ignore-lists.
- src/
-
Main compilable source location. must be present for projects with compiled components that do not use submodules.
In the presence of include/, also contains private headers.
- include/
-
Directory for public headers. may be present. may be omitted for projects that do not distinguish between private/public headers. may be omitted for projects that use submodules.
- tests/
-
Directory for tests.
- examples/
-
Directory for samples and examples.
- external/
-
Directory for packages/projects to be used by the project, but not edited as part of the project.
- extras/
-
Directory containing extra/optional submodules for the project.
- data/
-
Directory containing non-source code aspects of the project. This might include graphics and markup files.
- tools/
-
Directory containing development utilities, such as build and refactoring scripts
- docs/
-
Directory for project documentation.
- libs/
-
Directory for main project submodules.
2. Top-Level Directories
Pitchfork specifies several top-level directories. Other directories should not be present in the root directory, except for what is required by other tooling.
2.1. build /
This directory is not required, but its name should be reserved.
The
directory is special in that it must not be committed to a source
control system. A user downloading the codebase should not see a
directory present in the project root, but one may be created in the course of
working with the software. The
directory is also reserved.
Note: Some build systems may commandeer the
directory for themselves.
In this case, the directory
should be used in place of
.
The
directory may be used for ephemeral build results. Other uses of
the directory are not permitted.
Creation of additional directories for build results in the root directory is not permitted.
Note: Although multiple root directories are not allowed, the structure and
layout of the
directory is not prescribed. Multiple subdirectories of
may be used to hold multiple build results of different configuration.
2.2. include /
Note: The
and src/ directories are very closely
related. Be sure to also read its section in addition to this one.
The purpose of the
directory is to hold public API headers.
The
directory should not be used if using § 3.1.2 Merged Header Placement.
See § 3 Library Source Layout.
2.3. src /
Note: The
and include/ directories are very closely
related. Be sure to also read its section in addition to this one.
The purpose and content of
depends on whether the project authors choose
to follow § 3.1.2 Merged Header Placement or § 3.1.1 Separate Header Placement.
See § 3 Library Source Layout.
2.4. tests /
This directory is not required.
The
directory is reserved for source files related to (non-unit) tests
for the project.
The structure and layout of this directory is not prescribed by this document.
A project which can be embedded in another project (such as via § 2.6 external/), must disable its
directory if it can detect that it
is being built as an embedded sub-project.
Project maintainers must provide a way for consumers to disable the compilation and running of tests, especially for the purpose of embedding.
2.5. examples /
This directory is not required.
The
directory is reserved for source files related to example and
sample usage of the project. The structure and layout of this directory is not
prescribed by this document.
Project maintainers must provide a way for consumers to disable the compilation of examples and samples.
2.6. external /
This directory is not required.
The
directory is reserved for embedding of external projects. Each
embedded project should occupy a single subdirectory of
.
should not contain files other than those required by tooling.
This directory may be automatically populated, either partially or completely,
by tools (eg.
submodules) as part of a build process. In this case,
projects must declare the auto-populated subdirectories as ignored by relevant
source control systems.
Subdirectories of
should not be modified as part of regular project
development. Subdirectories should remain as close to their upstream source as
possible.
2.7. data /
This directory is not required.
The
directory is designated for holding project files which should be
included in revision control, but are not explicitly code. For example,
graphics and localization files are not code in the same sense as the rest of
the project, but are good candidates for inclusion in the
directory.
The structure and layout of this directory is not prescribed by this document.
2.8. tools /
This directory is not required.
The
directory is designated for holding extra scripts and tools related
to developing and contributing to the project. For example, turn-key build
scripts, linting scripts, code-generation scripts, test scripts, or other tools
that may be useful to a project develop.
The contents of this directory should not be relevant to a project consumer.
2.9. docs /
This directory is not required.
The
directory is designated to contain project documentation. The
documentation process, tools, and layout is not prescribed by this document.
2.10. libs /
The
directory must not be used unless the project wishes to subdivide
itself into submodules. Its presence excludes the src/ and include/ directories.
See § 4 Submodules and § 4.3 libs/.
2.11. extras /
This directory is not required.
is a submodule root. See § 4 Submodules and § 4.4 extras/.
3. Library Source Layout
A library source tree refers to the layout of source code files that comprise a single library, which is a collection of code that is exposed to the library’s consumer.
3.1. Header File Placement
This document supports two different methods of placing headers in a single library: separate and merged. These two methods are mutually exclusive within a single library source tree.
3.1.1. Separate Header Placement
In separated placement, there are two source directories, § 2.2 include/ and § 2.3 src/. The
directory is designated to contain the public headers of the library, while the src/ directory is designated to contain
the compilable source code and private headers.
Note: Not all projects will necessarily have private headers.
In separate placement, a single physical component is split between the two
directories. The relative path to the parent directory of a compilable source
file in the
directory must be equivalent to the relative path to the
parent directory of the header in the
directory that corresponds to
the compilable source file.
meow . hpp
and a source file meow . cpp
, we might place the header at include / cat / sounds / meow . hpp
. The
relative path from include /
to the parent directory of meow . hpp
is cat / sounds .
Thus, we can compose the path to the compilable source file as joining the
source directory name
, the relative path
, and the filename
to get the path
. The following layout
results:
<root>/ include/ cat/ sounds/ meow.hpp src/ cat/ sounds/ meow.cpp
Note: The purpose of the deterministic header/file path relationship is to aid both tools and human viewers in understanding and manipulating the source directory structure.
Note: The relative paths of these physical components is not arbitrary. See § 3.3 Source Directory Layout.
Consumers of a library using separated header layout should be given the path to
the § 2.2 include/ directory as the sole include search directory for the
library’s public interface. This prevents users from being able to
paths which exist only in the
directory.
The library itself should be compiled with both its § 2.2 include/ and § 2.3 src/ directories as include search directories. This ensures that the library itself can access all files within both source directories.
3.1.2. Merged Header Placement
In merged header placement, there is a single source directory, § 2.3 src/.
Much like with separated placement, the relative path from the source directory to the parent of directory of the files of a physical component must be the same. This implies that the files of a physical component will always be sibling files in the same directory.
hiss . hpp
and hiss . cpp
, and a relative
path of cat / sounds
, the resulting layout is defined:
<root>/ src/ cat/ sounds/ hiss.hpp hiss.cpp
3.2. Test Placement
This document distinguishes between unit tests, and other tests. Unit tests are tests that roughly correspond to a single single unit of the source code. This may be a physical component, public API, or combination thereof. The distinguishing of a unit test has implications on where it may be placed.
3.2.1. Merged Test Placement
Optional but recommended is to use merged test placement. In this method,
a unit test should have exactly one compilable source file, and that filename
stem should be the same as the filename stem of the physical component under
test, with a
appended to it. For example, a test for the physical
component comprised of
and
will be named
.
This unit test source file should be placed in the same directory as a
compilable source file of the physical component under test. Therefore, when
the unit has a compilable source file, the unit test source file will appear as
a sibling of the compilable source file.
3.2.2. Separate Test Placement
If not using merged tests, all tests should be placed within the tests/ top-level directory. There are no mandates on the layout
within
.
3.3. Source Directory Layout
For the purposes of this section, the include/ and src/ top-level directories are both included in the definition of "source directories." They are root of the library source tree. They are named as such because they contain the primary "source files" of the source language (C and/or C++).
No non-source-code files will should be placed or generated in any subdirectories of a source directory. That is, the root of a source directory may contain non-source-code files, but no child directories should.
Conversely, no source-code files should be placed in the root of a source directory. That is, all source files must have qualified paths relative to the root of their source directory.
Header files and source files should correspond to a logical component of the
project. For example, a
library might contain a
class along
a single header and (optional) source file to represent it. If no other
logical components appear in that header and/or source file, the logical
component can be said to be the "main component" of the corresponding source
component. The main logical component may be a
, function, or some
grouping thereof.
The layout of the source tree should closely correspond to the namespace structure of the project.
In C, there is no language-level concept of a namespace, but there is the
convention of qualifying globally visible identifiers with a "pseudo" namespace.
For example, a
might define a
, where the prefix
acts as the "namespace" for the identifier. The namespace for these purposes can
be said to be
.
In C++, which has a language-level
, the need to qualify identifiers
in this way is not necessary (when using C++ linkage). Instead, these qualifiers
are put in
s.
Given that each logical component has a namespace, we can associate that namespace with the physical component in which it is defined. This namespace can then be used to generate a qualified path by which the physical component can be found. In this way, physical components can be considered "content-addressable".
Source files should be placed in a directory relative to the source directory where the relative path is composed by joining the elements of the component namespace as intermediate directories. The stem of the source filename should correspond to the name of the logical component which it declares or defines.
geo :: shapes :: circle
, we can use the qualifying
namespace to generate the relative path geo / shapes
, and the component’s leaf
name as a basis for the filename of circle
. Thus, the full path to the sources
defining the logical component are geo / shapes / circle . hpp
and geo / shapes / circle . cpp
.
If using § 3.1.1 Separate Header Placement, the path from the project or
submodule root to the header will be
, and the
path to the compiled source file will be
.
If using § 3.1.2 Merged Header Placement, then the header will appear as a
sibling in
.
The unit test for this component will appear at
.
In some cases, it may be advantageous to separate the compiled source file into
multiple compiled source files while maintaining a single header file. In this
case, the stem of the source file should begin the same as if it were not
subdivided, then qualified with a
separating the distinguishing
characteristic of the source file.
geo :: shapes :: circle
, and two extremely complex member functions circumference ()
and area ()
, we might split the
implementations of these methods into separate translation units. The resulting
physical component will have the following appearance:
< root > src / geo / shapes / circle . hpp circle . cpp circle . circumference . cpp circle . area . cpp circle . test . cpp
4. Submodules
Very large projects (eg. Qt, Boost, JUCE, LLVM) will benefit from the concept of submodules.
- Submodule
-
A subdivision of a larger project which can be consumed as-needed. Contains its own source trees, tests, data, and documentation.
Note: Splitting a project into submodules should be considered very carefully. It is an extremely heavy tool with subtleties that often trip people up. Very few projects warrant subdividing themselves in this way. Most projects will do just fine with multiple namespaces and directories within their source tree. Don’t reach for this tool when namespaces and subdirectories will suite you just fine. Converting a project to/from a submodule layout is a very cumbersome task.
The following rules must be taken into consideration when considering or using submodules:
-
Submodules are not themselves standalone projects.
-
They should not pretend to be entire projects.
-
They cannot be consumed independent of the rest of the project.
-
They should not be versioned separately from the project.
-
They cannot further subdivide into sub-submodules.
4.1. Submodule Root
- Submodule Root
-
A directory whose child directories are submodules.
Submodule roots include:
Each subdirectory of a submodule root must correspond to exactly one submodule.
4.2. Submodule Directory
A submodule is represented as a subdirectory of the project which may contain the following directories:
-
src/ for submodule sources
-
include/ for submodule includes (if splitting headers)
-
tests/ for submodule tests
-
data/ for submodule data
-
examples/ for examples
-
docs/ for submodule documentation
Note: Most of the top-level directories are absent from this list.
Submodules directories should not contain other files or directories except those required by tooling.
4.3. libs /
The
directory is for main submodules. It is a submodule root.
When the
directory is present, the top-level
and
directories must not be present. Instead of having a root source tree, a project
using
for submodules should instead refactor itself such that the
project has a common basis submodule upon which other submodules will may
depend.
The main difference between
and extras/ is that the
submodules found in
should be built _by default_, although a
consumer may opt-out of the submodules on an as-needed basis.
4.4. extras /
The
directory is designated for containing additional submodules for
the project which build upon the main component(s). This may include submodules
that are not part of the project’s "default" build, or otherwise impose special
requirements to be used.
For example, the following might be candidates for
rather than regular
components:
-
"Language bindings" or extra libraries that provide integrations of the project with programming languages or runtimes different from its own.
-
"Platform bindings" or extra libraries (plugins) that integrate the project with a particular platform. For example, a windowing library that needs to understand how to talk with Windows, Quartz, X11, and Wayland would include its platform integration implementations in this directory.
-
"Contributed" submodules. Additional submodules that are contributed by the project’s users and included in upstream, but are not officially supported by the project.
-
Optional submodules that require additional dependencies, or may be prohibitive to include for all users. For example, Qt’s Webkit module is prohibitively time consuming to build, and it requires the presence of dependencies that are only required exactly for that one component.
5. Build Systems
This document does not mandate any particular build system. The only requirements is that the chosen build system support the layout herein defined.
6. Libraries
For library projects, a source tree should correspond to exactly one library.
That is, at most one one linkable result, and exactly one public include
directory. A single
tree must not require linking more than one
library to access all symbols exported from that tree.
Note: Submodules, having their own source tree, may each contain a library that can be linked and consumed independently.
A single source tree should not vary its public interface based on anything other than the target platform. This has several big implications, including (but not limited to) the following:
-
A library should not offer the user controls for tweaking its public interface.
-
A library should not change its public interface based on the presence/absence of external software.