Commit d1b57263 authored by Linus Torvalds's avatar Linus Torvalds

Merge branch 'docs' of git://git.lwn.net/linux-2.6

* 'docs' of git://git.lwn.net/linux-2.6:
  Document panic_on_unrecovered_nmi sysctl
  Add a reference to paper to SubmittingPatches
  Add kerneldoc documentation for new printk format extensions
  Remove videobook.tmpl
  doc: Test-by?
  Add the development process document
  Documentation/block/data-integrity.txt: Fix section numbers
parents c472273f 656e6c00
......@@ -21,6 +21,9 @@ Changes
- list of changes that break older software packages.
CodingStyle
- how the boss likes the C code in the kernel to look.
development-process/
- An extended tutorial on how to work with the kernel development
process.
DMA-API.txt
- DMA API, pci_ API & extensions for non-consistent memory machines.
DMA-ISA-LPC.txt
......
......@@ -6,7 +6,7 @@
# To add a new book the only step required is to add the book to the
# list of DOCBOOKS.
DOCBOOKS := wanbook.xml z8530book.xml mcabook.xml videobook.xml \
DOCBOOKS := wanbook.xml z8530book.xml mcabook.xml \
kernel-hacking.xml kernel-locking.xml deviceiobook.xml \
procfs-guide.xml writing_usb_driver.xml networking.xml \
kernel-api.xml filesystems.xml lsm.xml usb.xml kgdb.xml \
......
This diff is collapsed.
......@@ -405,7 +405,7 @@ person it names. This tag documents that potentially interested parties
have been included in the discussion
14) Using Test-by: and Reviewed-by:
14) Using Tested-by: and Reviewed-by:
A Tested-by: tag indicates that the patch has been successfully tested (in
some environment) by the person named. This tag informs maintainers that
......
......@@ -246,7 +246,7 @@ will require extra work due to the application tag.
retrieve the tag buffer using bio_integrity_get_tag().
6.3 PASSING EXISTING INTEGRITY METADATA
5.3 PASSING EXISTING INTEGRITY METADATA
Filesystems that either generate their own integrity metadata or
are capable of transferring IMD from user space can use the
......@@ -283,7 +283,7 @@ will require extra work due to the application tag.
integrity upon completion.
6.4 REGISTERING A BLOCK DEVICE AS CAPABLE OF EXCHANGING INTEGRITY
5.4 REGISTERING A BLOCK DEVICE AS CAPABLE OF EXCHANGING INTEGRITY
METADATA
To enable integrity exchange on a block device the gendisk must be
......
1: A GUIDE TO THE KERNEL DEVELOPMENT PROCESS
The purpose of this document is to help developers (and their managers)
work with the development community with a minimum of frustration. It is
an attempt to document how this community works in a way which is
accessible to those who are not intimately familiar with Linux kernel
development (or, indeed, free software development in general). While
there is some technical material here, this is very much a process-oriented
discussion which does not require a deep knowledge of kernel programming to
understand.
1.1: EXECUTIVE SUMMARY
The rest of this section covers the scope of the kernel development process
and the kinds of frustrations that developers and their employers can
encounter there. There are a great many reasons why kernel code should be
merged into the official ("mainline") kernel, including automatic
availability to users, community support in many forms, and the ability to
influence the direction of kernel development. Code contributed to the
Linux kernel must be made available under a GPL-compatible license.
Section 2 introduces the development process, the kernel release cycle, and
the mechanics of the merge window. The various phases in the patch
development, review, and merging cycle are covered. There is some
discussion of tools and mailing lists. Developers wanting to get started
with kernel development are encouraged to track down and fix bugs as an
initial exercise.
Section 3 covers early-stage project planning, with an emphasis on
involving the development community as soon as possible.
Section 4 is about the coding process; several pitfalls which have been
encountered by other developers are discussed. Some requirements for
patches are covered, and there is an introduction to some of the tools
which can help to ensure that kernel patches are correct.
Section 5 talks about the process of posting patches for review. To be
taken seriously by the development community, patches must be properly
formatted and described, and they must be sent to the right place.
Following the advice in this section should help to ensure the best
possible reception for your work.
Section 6 covers what happens after posting patches; the job is far from
done at that point. Working with reviewers is a crucial part of the
development process; this section offers a number of tips on how to avoid
problems at this important stage. Developers are cautioned against
assuming that the job is done when a patch is merged into the mainline.
Section 7 introduces a couple of "advanced" topics: managing patches with
git and reviewing patches posted by others.
Section 8 concludes the document with pointers to sources for more
information on kernel development.
1.2: WHAT THIS DOCUMENT IS ABOUT
The Linux kernel, at over 6 million lines of code and well over 1000 active
contributors, is one of the largest and most active free software projects
in existence. Since its humble beginning in 1991, this kernel has evolved
into a best-of-breed operating system component which runs on pocket-sized
digital music players, desktop PCs, the largest supercomputers in
existence, and all types of systems in between. It is a robust, efficient,
and scalable solution for almost any situation.
With the growth of Linux has come an increase in the number of developers
(and companies) wishing to participate in its development. Hardware
vendors want to ensure that Linux supports their products well, making
those products attractive to Linux users. Embedded systems vendors, who
use Linux as a component in an integrated product, want Linux to be as
capable and well-suited to the task at hand as possible. Distributors and
other software vendors who base their products on Linux have a clear
interest in the capabilities, performance, and reliability of the Linux
kernel. And end users, too, will often wish to change Linux to make it
better suit their needs.
One of the most compelling features of Linux is that it is accessible to
these developers; anybody with the requisite skills can improve Linux and
influence the direction of its development. Proprietary products cannot
offer this kind of openness, which is a characteristic of the free software
process. But, if anything, the kernel is even more open than most other
free software projects. A typical three-month kernel development cycle can
involve over 1000 developers working for more than 100 different companies
(or for no company at all).
Working with the kernel development community is not especially hard. But,
that notwithstanding, many potential contributors have experienced
difficulties when trying to do kernel work. The kernel community has
evolved its own distinct ways of operating which allow it to function
smoothly (and produce a high-quality product) in an environment where
thousands of lines of code are being changed every day. So it is not
surprising that Linux kernel development process differs greatly from
proprietary development methods.
The kernel's development process may come across as strange and
intimidating to new developers, but there are good reasons and solid
experience behind it. A developer who does not understand the kernel
community's ways (or, worse, who tries to flout or circumvent them) will
have a frustrating experience in store. The development community, while
being helpful to those who are trying to learn, has little time for those
who will not listen or who do not care about the development process.
It is hoped that those who read this document will be able to avoid that
frustrating experience. There is a lot of material here, but the effort
involved in reading it will be repaid in short order. The development
community is always in need of developers who will help to make the kernel
better; the following text should help you - or those who work for you -
join our community.
1.3: CREDITS
This document was written by Jonathan Corbet, corbet@lwn.net. It has been
improved by comments from Johannes Berg, James Berry, Alex Chiang, Roland
Dreier, Randy Dunlap, Jake Edge, Jiri Kosina, Matt Mackall, Arthur Marsh,
Amanda McPherson, Andrew Morton, Andrew Price, Tsugikazu Shibata, and
Jochen Voß.
This work was supported by the Linux Foundation; thanks especially to
Amanda McPherson, who saw the value of this effort and made it all happen.
1.4: THE IMPORTANCE OF GETTING CODE INTO THE MAINLINE
Some companies and developers occasionally wonder why they should bother
learning how to work with the kernel community and get their code into the
mainline kernel (the "mainline" being the kernel maintained by Linus
Torvalds and used as a base by Linux distributors). In the short term,
contributing code can look like an avoidable expense; it seems easier to
just keep the code separate and support users directly. The truth of the
matter is that keeping code separate ("out of tree") is a false economy.
As a way of illustrating the costs of out-of-tree code, here are a few
relevant aspects of the kernel development process; most of these will be
discussed in greater detail later in this document. Consider:
- Code which has been merged into the mainline kernel is available to all
Linux users. It will automatically be present on all distributions which
enable it. There is no need for driver disks, downloads, or the hassles
of supporting multiple versions of multiple distributions; it all just
works, for the developer and for the user. Incorporation into the
mainline solves a large number of distribution and support problems.
- While kernel developers strive to maintain a stable interface to user
space, the internal kernel API is in constant flux. The lack of a stable
internal interface is a deliberate design decision; it allows fundamental
improvements to be made at any time and results in higher-quality code.
But one result of that policy is that any out-of-tree code requires
constant upkeep if it is to work with new kernels. Maintaining
out-of-tree code requires significant amounts of work just to keep that
code working.
Code which is in the mainline, instead, does not require this work as the
result of a simple rule requiring any developer who makes an API change
to also fix any code that breaks as the result of that change. So code
which has been merged into the mainline has significantly lower
maintenance costs.
- Beyond that, code which is in the kernel will often be improved by other
developers. Surprising results can come from empowering your user
community and customers to improve your product.
- Kernel code is subjected to review, both before and after merging into
the mainline. No matter how strong the original developer's skills are,
this review process invariably finds ways in which the code can be
improved. Often review finds severe bugs and security problems. This is
especially true for code which has been developed in a closed
environment; such code benefits strongly from review by outside
developers. Out-of-tree code is lower-quality code.
- Participation in the development process is your way to influence the
direction of kernel development. Users who complain from the sidelines
are heard, but active developers have a stronger voice - and the ability
to implement changes which make the kernel work better for their needs.
- When code is maintained separately, the possibility that a third party
will contribute a different implementation of a similar feature always
exists. Should that happen, getting your code merged will become much
harder - to the point of impossibility. Then you will be faced with the
unpleasant alternatives of either (1) maintaining a nonstandard feature
out of tree indefinitely, or (2) abandoning your code and migrating your
users over to the in-tree version.
- Contribution of code is the fundamental action which makes the whole
process work. By contributing your code you can add new functionality to
the kernel and provide capabilities and examples which are of use to
other kernel developers. If you have developed code for Linux (or are
thinking about doing so), you clearly have an interest in the continued
success of this platform; contributing code is one of the best ways to
help ensure that success.
All of the reasoning above applies to any out-of-tree kernel code,
including code which is distributed in proprietary, binary-only form.
There are, however, additional factors which should be taken into account
before considering any sort of binary-only kernel code distribution. These
include:
- The legal issues around the distribution of proprietary kernel modules
are cloudy at best; quite a few kernel copyright holders believe that
most binary-only modules are derived products of the kernel and that, as
a result, their distribution is a violation of the GNU General Public
license (about which more will be said below). Your author is not a
lawyer, and nothing in this document can possibly be considered to be
legal advice. The true legal status of closed-source modules can only be
determined by the courts. But the uncertainty which haunts those modules
is there regardless.
- Binary modules greatly increase the difficulty of debugging kernel
problems, to the point that most kernel developers will not even try. So
the distribution of binary-only modules will make it harder for your
users to get support from the community.
- Support is also harder for distributors of binary-only modules, who must
provide a version of the module for every distribution and every kernel
version they wish to support. Dozens of builds of a single module can
be required to provide reasonably comprehensive coverage, and your users
will have to upgrade your module separately every time they upgrade their
kernel.
- Everything that was said above about code review applies doubly to
closed-source code. Since this code is not available at all, it cannot
have been reviewed by the community and will, beyond doubt, have serious
problems.
Makers of embedded systems, in particular, may be tempted to disregard much
of what has been said in this section in the belief that they are shipping
a self-contained product which uses a frozen kernel version and requires no
more development after its release. This argument misses the value of
widespread code review and the value of allowing your users to add
capabilities to your product. But these products, too, have a limited
commercial life, after which a new version must be released. At that
point, vendors whose code is in the mainline and well maintained will be
much better positioned to get the new product ready for market quickly.
1.5: LICENSING
Code is contributed to the Linux kernel under a number of licenses, but all
code must be compatible with version 2 of the GNU General Public License
(GPLv2), which is the license covering the kernel distribution as a whole.
In practice, that means that all code contributions are covered either by
GPLv2 (with, optionally, language allowing distribution under later
versions of the GPL) or the three-clause BSD license. Any contributions
which are not covered by a compatible license will not be accepted into the
kernel.
Copyright assignments are not required (or requested) for code contributed
to the kernel. All code merged into the mainline kernel retains its
original ownership; as a result, the kernel now has thousands of owners.
One implication of this ownership structure is that any attempt to change
the licensing of the kernel is doomed to almost certain failure. There are
few practical scenarios where the agreement of all copyright holders could
be obtained (or their code removed from the kernel). So, in particular,
there is no prospect of a migration to version 3 of the GPL in the
foreseeable future.
It is imperative that all code contributed to the kernel be legitimately
free software. For that reason, code from anonymous (or pseudonymous)
contributors will not be accepted. All contributors are required to "sign
off" on their code, stating that the code can be distributed with the
kernel under the GPL. Code which has not been licensed as free software by
its owner, or which risks creating copyright-related problems for the
kernel (such as code which derives from reverse-engineering efforts lacking
proper safeguards) cannot be contributed.
Questions about copyright-related issues are common on Linux development
mailing lists. Such questions will normally receive no shortage of
answers, but one should bear in mind that the people answering those
questions are not lawyers and cannot provide legal advice. If you have
legal questions relating to Linux source code, there is no substitute for
talking with a lawyer who understands this field. Relying on answers
obtained on technical mailing lists is a risky affair.
This diff is collapsed.
3: EARLY-STAGE PLANNING
When contemplating a Linux kernel development project, it can be tempting
to jump right in and start coding. As with any significant project,
though, much of the groundwork for success is best laid before the first
line of code is written. Some time spent in early planning and
communication can save far more time later on.
3.1: SPECIFYING THE PROBLEM
Like any engineering project, a successful kernel enhancement starts with a
clear description of the problem to be solved. In some cases, this step is
easy: when a driver is needed for a specific piece of hardware, for
example. In others, though, it is tempting to confuse the real problem
with the proposed solution, and that can lead to difficulties.
Consider an example: some years ago, developers working with Linux audio
sought a way to run applications without dropouts or other artifacts caused
by excessive latency in the system. The solution they arrived at was a
kernel module intended to hook into the Linux Security Module (LSM)
framework; this module could be configured to give specific applications
access to the realtime scheduler. This module was implemented and sent to
the linux-kernel mailing list, where it immediately ran into problems.
To the audio developers, this security module was sufficient to solve their
immediate problem. To the wider kernel community, though, it was seen as a
misuse of the LSM framework (which is not intended to confer privileges
onto processes which they would not otherwise have) and a risk to system
stability. Their preferred solutions involved realtime scheduling access
via the rlimit mechanism for the short term, and ongoing latency reduction
work in the long term.
The audio community, however, could not see past the particular solution
they had implemented; they were unwilling to accept alternatives. The
resulting disagreement left those developers feeling disillusioned with the
entire kernel development process; one of them went back to an audio list
and posted this:
There are a number of very good Linux kernel developers, but they
tend to get outshouted by a large crowd of arrogant fools. Trying
to communicate user requirements to these people is a waste of
time. They are much too "intelligent" to listen to lesser mortals.
(http://lwn.net/Articles/131776/).
The reality of the situation was different; the kernel developers were far
more concerned about system stability, long-term maintenance, and finding
the right solution to the problem than they were with a specific module.
The moral of the story is to focus on the problem - not a specific solution
- and to discuss it with the development community before investing in the
creation of a body of code.
So, when contemplating a kernel development project, one should obtain
answers to a short set of questions:
- What, exactly, is the problem which needs to be solved?
- Who are the users affected by this problem? Which use cases should the
solution address?
- How does the kernel fall short in addressing that problem now?
Only then does it make sense to start considering possible solutions.
3.2: EARLY DISCUSSION
When planning a kernel development project, it makes great sense to hold
discussions with the community before launching into implementation. Early
communication can save time and trouble in a number of ways:
- It may well be that the problem is addressed by the kernel in ways which
you have not understood. The Linux kernel is large and has a number of
features and capabilities which are not immediately obvious. Not all
kernel capabilities are documented as well as one might like, and it is
easy to miss things. Your author has seen the posting of a complete
driver which duplicated an existing driver that the new author had been
unaware of. Code which reinvents existing wheels is not only wasteful;
it will also not be accepted into the mainline kernel.
- There may be elements of the proposed solution which will not be
acceptable for mainline merging. It is better to find out about
problems like this before writing the code.
- It's entirely possible that other developers have thought about the
problem; they may have ideas for a better solution, and may be willing
to help in the creation of that solution.
Years of experience with the kernel development community have taught a
clear lesson: kernel code which is designed and developed behind closed
doors invariably has problems which are only revealed when the code is
released into the community. Sometimes these problems are severe,
requiring months or years of effort before the code can be brought up to
the kernel community's standards. Some examples include:
- The Devicescape network stack was designed and implemented for
single-processor systems. It could not be merged into the mainline
until it was made suitable for multiprocessor systems. Retrofitting
locking and such into code is a difficult task; as a result, the merging
of this code (now called mac80211) was delayed for over a year.
- The Reiser4 filesystem included a number of capabilities which, in the
core kernel developers' opinion, should have been implemented in the
virtual filesystem layer instead. It also included features which could
not easily be implemented without exposing the system to user-caused
deadlocks. The late revelation of these problems - and refusal to
address some of them - has caused Reiser4 to stay out of the mainline
kernel.
- The AppArmor security module made use of internal virtual filesystem
data structures in ways which were considered to be unsafe and
unreliable. This code has since been significantly reworked, but
remains outside of the mainline.
In each of these cases, a great deal of pain and extra work could have been
avoided with some early discussion with the kernel developers.
3.3: WHO DO YOU TALK TO?
When developers decide to take their plans public, the next question will
be: where do we start? The answer is to find the right mailing list(s) and
the right maintainer. For mailing lists, the best approach is to look in
the MAINTAINERS file for a relevant place to post. If there is a suitable
subsystem list, posting there is often preferable to posting on
linux-kernel; you are more likely to reach developers with expertise in the
relevant subsystem and the environment may be more supportive.
Finding maintainers can be a bit harder. Again, the MAINTAINERS file is
the place to start. That file tends to not always be up to date, though,
and not all subsystems are represented there. The person listed in the
MAINTAINERS file may, in fact, not be the person who is actually acting in
that role currently. So, when there is doubt about who to contact, a
useful trick is to use git (and "git log" in particular) to see who is
currently active within the subsystem of interest. Look at who is writing
patches, and who, if anybody, is attaching Signed-off-by lines to those
patches. Those are the people who will be best placed to help with a new
development project.
If all else fails, talking to Andrew Morton can be an effective way to
track down a maintainer for a specific piece of code.
3.4: WHEN TO POST?
If possible, posting your plans during the early stages can only be
helpful. Describe the problem being solved and any plans that have been
made on how the implementation will be done. Any information you can
provide can help the development community provide useful input on the
project.
One discouraging thing which can happen at this stage is not a hostile
reaction, but, instead, little or no reaction at all. The sad truth of the
matter is (1) kernel developers tend to be busy, (2) there is no shortage
of people with grand plans and little code (or even prospect of code) to
back them up, and (3) nobody is obligated to review or comment on ideas
posted by others. If a request-for-comments posting yields little in the
way of comments, do not assume that it means there is no interest in the
project. Unfortunately, you also cannot assume that there are no problems
with your idea. The best thing to do in this situation is to proceed,
keeping the community informed as you go.
3.5: GETTING OFFICIAL BUY-IN
If your work is being done in a corporate environment - as most Linux
kernel work is - you must, obviously, have permission from suitably
empowered managers before you can post your company's plans or code to a
public mailing list. The posting of code which has not been cleared for
release under a GPL-compatible license can be especially problematic; the
sooner that a company's management and legal staff can agree on the posting
of a kernel development project, the better off everybody involved will be.
Some readers may be thinking at this point that their kernel work is
intended to support a product which does not yet have an officially
acknowledged existence. Revealing their employer's plans on a public
mailing list may not be a viable option. In cases like this, it is worth
considering whether the secrecy is really necessary; there is often no real
need to keep development plans behind closed doors.
That said, there are also cases where a company legitimately cannot
disclose its plans early in the development process. Companies with
experienced kernel developers may choose to proceed in an open-loop manner
on the assumption that they will be able to avoid serious integration
problems later. For companies without that sort of in-house expertise, the
best option is often to hire an outside developer to review the plans under
a non-disclosure agreement. The Linux Foundation operates an NDA program
designed to help with this sort of situation; more information can be found
at:
http://www.linuxfoundation.org/en/NDA_program
This kind of review is often enough to avoid serious problems later on
without requiring public disclosure of the project.
This diff is collapsed.
5: POSTING PATCHES
Sooner or later, the time comes when your work is ready to be presented to
the community for review and, eventually, inclusion into the mainline
kernel. Unsurprisingly, the kernel development community has evolved a set
of conventions and procedures which are used in the posting of patches;
following them will make life much easier for everybody involved. This
document will attempt to cover these expectations in reasonable detail;
more information can also be found in the files SubmittingPatches,
SubmittingDrivers, and SubmitChecklist in the kernel documentation
directory.
5.1: WHEN TO POST
There is a constant temptation to avoid posting patches before they are
completely "ready." For simple patches, that is not a problem. If the
work being done is complex, though, there is a lot to be gained by getting
feedback from the community before the work is complete. So you should
consider posting in-progress work, or even making a git tree available so
that interested developers can catch up with your work at any time.
When posting code which is not yet considered ready for inclusion, it is a
good idea to say so in the posting itself. Also mention any major work
which remains to be done and any known problems. Fewer people will look at
patches which are known to be half-baked, but those who do will come in
with the idea that they can help you drive the work in the right direction.
5.2: BEFORE CREATING PATCHES
There are a number of things which should be done before you consider
sending patches to the development community. These include:
- Test the code to the extent that you can. Make use of the kernel's
debugging tools, ensure that the kernel will build with all reasonable
combinations of configuration options, use cross-compilers to build for
different architectures, etc.
- Make sure your code is compliant with the kernel coding style
guidelines.
- Does your change have performance implications? If so, you should run
benchmarks showing what the impact (or benefit) of your change is; a
summary of the results should be included with the patch.
- Be sure that you have the right to post the code. If this work was done
for an employer, the employer likely has a right to the work and must be
agreeable with its release under the GPL.
As a general rule, putting in some extra thought before posting code almost
always pays back the effort in short order.
5.3: PATCH PREPARATION
The preparation of patches for posting can be a surprising amount of work,
but, once again, attempting to save time here is not generally advisable
even in the short term.
Patches must be prepared against a specific version of the kernel. As a
general rule, a patch should be based on the current mainline as found in
Linus's git tree. It may become necessary to make versions against -mm,
linux-next, or a subsystem tree, though, to facilitate wider testing and
review. Depending on the area of your patch and what is going on
elsewhere, basing a patch against these other trees can require a
significant amount of work resolving conflicts and dealing with API
changes.
Only the most simple changes should be formatted as a single patch;
everything else should be made as a logical series of changes. Splitting
up patches is a bit of an art; some developers spend a long time figuring
out how to do it in the way that the community expects. There are a few
rules of thumb, however, which can help considerably:
- The patch series you post will almost certainly not be the series of
changes found in your working revision control system. Instead, the
changes you have made need to be considered in their final form, then
split apart in ways which make sense. The developers are interested in
discrete, self-contained changes, not the path you took to get to those
changes.
- Each logically independent change should be formatted as a separate
patch. These changes can be small ("add a field to this structure") or
large (adding a significant new driver, for example), but they should be
conceptually small and amenable to a one-line description. Each patch
should make a specific change which can be reviewed on its own and
verified to do what it says it does.
- As a way of restating the guideline above: do not mix different types of
changes in the same patch. If a single patch fixes a critical security
bug, rearranges a few structures, and reformats the code, there is a
good chance that it will be passed over and the important fix will be
lost.
- Each patch should yield a kernel which builds and runs properly; if your
patch series is interrupted in the middle, the result should still be a
working kernel. Partial application of a patch series is a common
scenario when the "git bisect" tool is used to find regressions; if the
result is a broken kernel, you will make life harder for developers and
users who are engaging in the noble work of tracking down problems.