Simplify daemon management: "phd start"
Summary:

  - Merge CommitTask daemon into PullLocal daemon. This is another artifact of past instability (and order-dependent parsers). We still publish to the timeline, although this was the last consumer. Long term we'll probably delete timeline and move to webhooks, since everyone who has asked about this stuff has been eager to trade away the durability and ordering of the timeline for the ease of use of webhooks. There's also no reason to timeline this anymore since parsing is no longer order-dependent.
  - Add `phd start` to start all the daemons you need. Add `phd restart` to restart all the daemons you need. So cool~
  - Simplify and improve phd and Diffusion daemon documentation.

Test Plan:

  - Ran `phd start`.
  - Ran `phd restart`.
  - Generated/read documentation.
  - Imported some stuff, got clean parses.

Reviewers: btrahan, csilvers

Reviewed By: csilvers

CC: aran, jungejason, nh

Differential Revision: https://secure.phabricator.com/D2433

@@ -6,17 +6,17 @@ Explains Phabricator daemons and the daemon control program ##phd##.
= Overview =

Phabricator uses daemons (background processing scripts) to handle a number of
tasks, like:
tasks:

  - tracking repositories and discovering new commits;
  - sending mail;
  - updating objects in the search index; and
  - custom tasks you define.
  - tracking repositories, discovering new commits, and importing and parsing
    commits;
  - sending email; and
  - collecting garbage, like old logs and caches.

Daemons are started and stopped with **phd** (the **Ph**abricator **D**aemon
launcher). Daemons can be monitored via a web console.

You do not need to run daemons for most parts of Phabricator to work, but a few
You do not need to run daemons for most parts of Phabricator to work, but some
features (principally, repository tracking with Diffusion) require them and
several features will benefit in performance or stability if you configure
daemons.
@@ -33,22 +33,24 @@ a list of commands, run ##phd help##:

Generally, you will use:

  - **phd launch** to launch daemons;
  - **phd debug** to debug problems with daemons;
  - **phd start** to launch all daemons;
  - **phd restart** to restart all daemons;
  - **phd status** to get a list of running daemons; and
  - **phd stop** to stop all daemons.

NOTE: When you upgrade Phabricator or change configuration, you should restart
the daemons by stopping and relaunching them.
If you want finer-grained control, you can use:

NOTE: When you **launch** a daemon, you can type any unique substring of its
name, so **phd launch metamta** will work correctly.
  - **phd launch** to launch individual daemons; and
  - **phd debug** to debug problems with daemons.

NOTE: When you upgrade Phabricator or change configuration, you should restart
the daemons by running `phd restart`.

= Daemon Console =

You can view status and debugging information for daemons in the Daemon Console
via the web interface. Go to ##/daemon/## in your install or click
**Daemon Console** from the homepage.
**Daemon Console** from "More Stuff".

The Daemon Console shows a list of all the daemons that have ever launched, and
allows you to view log information for them. If you have issues with daemons,
@@ -56,7 +58,7 @@ you may be able to find error information that will help you resolve the problem
in the console.

NOTE: The easiest way to figure out what's wrong with a daemon is usually to use
**phd debug** to launch it instead of **phd launch**. This will run it without
**phd debug** to launch it instead of **phd start**. This will run it without
daemonizing it, so you can see output in your console.

= Available Daemons =
@@ -65,7 +67,72 @@ You can get a list of launchable daemons with **phd list**:

  - **libphutil test daemons** are not generally useful unless you are
    developing daemon infrastructure or debugging a daemon problem;
  - **PhabricatorTaskmasterDaemon** runs a generic task queue; and
  - **PhabricatorRepository** daemons track repositories, descriptions are
    available in the @{article:Diffusion User Guide}.
  - **PhabricatorTaskmasterDaemon** performs work from a task queue;
  - **PhabricatorRepositoryPullLocalDaemon** daemons track repositories, for
    more information see @{article:Diffusion User Guide}; and
  - **PhabricatorGarbageCollectorDaemon** cleans up old logs and caches.

= Debugging and Tuning =

In most cases, **phd start** handles launching all the daemons you need.
However, you may want to use more granular daemon controls to debug daemons,
launch custom daemons, or launch special daemons like the IRC bot.

To debug a daemon, use `phd debug`:

  phabricator/bin/ $ ./phd debug <daemon>

You can pass arguments like this (normal arguments are passed to the daemon
control mechanism, not to the daemon itself):

  phabricator/bin/ $ ./phd debug <daemon> -- --flavor apple

In debug mode, daemons do not daemonize, and they print additional debugging
output to the console. This should make it easier to debug problems. You can
terminate the daemon with `^C`.
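
As a rough illustration of the argument convention described above, the following sketch builds a `phd debug` command line in which everything after `--` is meant for the daemon itself. The function name `phd_debug_command` is hypothetical (it is not part of `phd`), and `--flavor apple` is the same placeholder argument used above, not a real daemon flag. The function only prints the command so it can be inspected before being run.

```shell
#!/bin/sh
# Hypothetical helper (not part of phd): print a `phd debug` command line.
# Usage: phd_debug_command <daemon> [daemon arguments...]
phd_debug_command() {
  daemon="$1"
  shift
  if [ "$#" -gt 0 ]; then
    # Everything after "--" is passed through to the daemon itself.
    printf '%s\n' "./phd debug $daemon -- $*"
  else
    printf '%s\n' "./phd debug $daemon"
  fi
}
```

For example, `phd_debug_command taskmaster --flavor apple` prints `./phd debug taskmaster -- --flavor apple`.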

To launch a nonstandard daemon, use `phd launch`:

  phabricator/bin/ $ ./phd launch <daemon>

This daemon will daemonize and run normally.

== General Tips ==

  - You can set the number of taskmasters that `phd start` starts in the config.
    If you have a task backlog, try increasing it.
  - When you `phd launch` or `phd debug` a daemon, you can type any unique
    substring of its name, so `phd launch pull` will work correctly.
  - `phd stop` and `phd restart` stop **all** of the daemons on the machine, not
    just those started with `phd start`. If you're writing a restart script,
    have it launch any custom daemons explicitly after `phd restart`.
  - You can write your own daemons and manage them with `phd` by extending
    @{class:PhabricatorDaemon}. See @{article: libphutil Libraries User Guide}.
  - See @{article:Diffusion User Guide} for details about tuning the repository
    daemon.
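
The restart-script tip above could look something like this minimal sketch. It assumes it is run from `phabricator/bin/`. The function name `restart_all` and the daemon name `MyCustomDaemon` in the usage note are placeholders, not real Phabricator names.

```shell
#!/bin/sh
# Sketch of a restart script; assumes the working directory is phabricator/bin/.
# PHD can be overridden (for example with `echo`) to do a dry run.
PHD="${PHD:-./phd}"

# Usage: restart_all [custom daemon names...]
restart_all() {
  # `phd restart` stops every daemon on the machine, then starts the usual set.
  "$PHD" restart || return 1
  # Custom daemons were stopped too, so launch each one again explicitly.
  for daemon in "$@"; do
    "$PHD" launch "$daemon" || return 1
  done
}
```

Calling `restart_all MyCustomDaemon` would run `./phd restart` followed by `./phd launch MyCustomDaemon`. Setting `PHD=echo` first prints the subcommands instead of running them.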

== Multiple Machines ==

If you have multiple machines, you should use `phd launch` to tweak which
daemons launch, and split daemons across machines like this:

  - `PhabricatorRepositoryPullLocalDaemon`: Run one copy on any machine.
    On each web frontend which is not running a normal copy, run a copy
    with the `--no-discovery` flag.
  - `PhabricatorGarbageCollectorDaemon`: Run one copy on any machine.
  - `PhabricatorTaskmasterDaemon`: Run as many copies as you need to keep
    tasks from backing up. You can run them all on one machine or split them
    across machines.

A gratuitously wasteful install might have a dedicated daemon machine which
runs `phd start` with a large pool of taskmasters set in the config, and then
runs `phd launch PhabricatorRepositoryPullLocalDaemon -- --no-discovery` on each
web server. This is grossly excessive in normal cases.
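
One way to express this split in a deployment script is sketched below. The role names `daemon-host` and `web` and the function name `launch_for_role` are invented for illustration. Placing `--` before `--no-discovery` follows the convention shown for `phd debug` (daemon arguments go after `--`); treat that placement as an assumption rather than confirmed syntax.

```shell
#!/bin/sh
# Sketch only; assumes the working directory is phabricator/bin/.
# PHD can be overridden (for example with `echo`) to do a dry run.
PHD="${PHD:-./phd}"

# Usage: launch_for_role <daemon-host|web>   (role names are hypothetical)
launch_for_role() {
  case "$1" in
    daemon-host)
      # The dedicated machine runs the standard daemons via `phd start`.
      "$PHD" start
      ;;
    web)
      # Other web frontends run a pull daemon with discovery disabled.
      # Assumption: daemon flags are passed after "--".
      "$PHD" launch PhabricatorRepositoryPullLocalDaemon -- --no-discovery
      ;;
    *)
      echo "unknown role: $1" >&2
      return 1
      ;;
  esac
}
```

A deploy step might then call `launch_for_role web` on each web server and `launch_for_role daemon-host` on the daemon machine.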

= Next Steps =

Continue by:

  - learning about the repository daemon with @{article:Diffusion User Guide};
    or
  - writing your own daemons with @{article: libphutil Libraries User Guide}.

@@ -43,16 +43,17 @@ The primary goal of callsigns is to namespace commits to SVN repositories: if
you use multiple SVN repositories, each repository has a revision 1, revision 2,
etc., so referring to them by number alone is ambiguous. However, even for Git
they impart additional information to human readers and allow parsers to detect
that something is a commit name with high probability.
that something is a commit name with high probability (and allow distinguishing
between multiple copies of a repository).

Diffusion uses this callsign and information about the commit itself to generate
a commit name, like "rE12345" or "rP28146171ce1278f2375e3646a1e1ea3fd56fc5a3".
The "r" stands for "revision". It is followed by the repository callsign, and
then a VCS-specific commit identifier (for SVN, the commit number; for Git, the
commit hash). When writing the name of a Git commit you may abbreviate the hash,
but note that hash collisions are probable for short prefix lengths. See this
post on the LKML for a historical explanation of Git's occasional internal use
of 7-character hashes:
then a VCS-specific commit identifier (for SVN, the commit number; for Git and
Mercurial, the commit hash). When writing the name of a Git commit you may
abbreviate the hash, but note that hash collisions are probable for short prefix
lengths. See this post on the LKML for a historical explanation of Git's
occasional internal use of 7-character hashes:

  https://lkml.org/lkml/2010/10/28/287
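
To make the commit naming scheme above concrete, here is a small sketch that splits a commit name into its callsign and commit identifier. It assumes callsigns consist only of uppercase letters and identifiers only of digits and lowercase hex characters. That matches the examples above but is not stated as a rule in this guide. The function name `parse_commit_name` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: split "r<CALLSIGN><identifier>" into "CALLSIGN identifier".
# Assumes uppercase-only callsigns and digit/lowercase-hex identifiers.
parse_commit_name() {
  parsed=$(printf '%s\n' "$1" |
    LC_ALL=C sed -n 's/^r\([A-Z][A-Z]*\)\([0-9a-f][0-9a-f]*\)$/\1 \2/p')
  # sed prints nothing when the name does not match the assumed pattern.
  [ -n "$parsed" ] || return 1
  printf '%s\n' "$parsed"
}
```

With these assumptions, `parse_commit_name rE12345` prints `E 12345`, and a name without the leading `r` makes the function return a non-zero status.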

@@ -84,8 +85,8 @@ tracking in Diffusion.
Most of the options in the **Tracking** tab should be self-explanatory or are
safe to leave at their defaults. In broad strokes, Diffusion tracks SVN
repositories by issuing an "svn log" command periodically against the remote to
look for new commits. It tracks Git repositories by cloning a local copy and
issuing "git fetch" periodically.
look for new commits. It tracks Git and Mercurial repositories by cloning a
local copy and issuing `git fetch` or `hg pull` periodically.

Once you've configured everything (and made sure **Tracking** is set to
"Enabled"), you can launch the daemons to begin actually tracking the
@@ -93,20 +94,15 @@ repository.

= Running Diffusion Daemons =

For an introduction to Phabricator daemons, see
@{article:Managing Daemons with phd}. To actually track repositories, you need
to:
In most cases, it is sufficient to run:

  - run ##phd repository-launch-master## on one machine;
  - run at least one @{class:PhabricatorTaskmasterDaemon} with
    ##phd launch taskmaster##. You should probably launch a few of these
    somewhere. They are generic workers which run many different kinds of
    background tasks, so if you already have some running you don't need to
    launch more. However, if you are importing a very large repository, import
    rate will primarily be a function of how many taskmasters you are running so
    you may want to launch a bunch of them; and
  - if you have multiple web frontends and have tracked Git repositories, run
    ##phd repository-launch-readonly## on each web frontend.
  phabricator/bin/ $ ./phd start

...to start the daemons. For a more in-depth explanation of `phd` and daemons,
see @{article:Managing Daemons with phd}.

NOTE: If you have an unusually large install with multiple web frontends, see
notes in @{article:Managing Daemons with phd}.

You can use the Daemon Console to monitor the daemons and their progress
importing the repository. Small repositories should import quickly, while
@@ -116,39 +112,32 @@ discovering commits in Facebook's 350,000-commit primary repository, and about
should begin appearing in Diffusion within a few minutes for all but the
largest repositories.

In detail, Diffusion uses several daemons to track, parse and import
repositories:
== Tuning Daemons ==

  - **PhabricatorRepositoryGitFetchDaemon**: periodically runs "git fetch" to
    keep git repositories up to date
  - **PhabricatorRepositoryGitCommitDiscoveryDaemon**: periodically looks for
    new commits and imports them
  - **PhabricatorRepositorySvnCommitDiscoveryDaemon**: periodically runs
    "svn log" to look for new commits and import them
  - **PhabricatorRepositoryCommitTaskDaemon**: creates tasks to parse and
    import newly discovered commits
By default, Phabricator launches one daemon to pull and discover all of the
tracked repositories. This works well for a small number of repositories or
a large number of relatively inactive repositories, but might benefit from
tuning in some cases. The daemon makes a rough effort to respect pull
frequencies defined in repository configuration, but may not be able to import
new commits very quickly if you have a large number of repositories (as it is
blocked waiting on I/O from other repositories). If you want to provide lower
commit import latency for some repositories, you can launch additional
dedicated daemons:

The ##repository-launch-master## command just chooses the right daemons to
launch based on which repositories you've configured to be tracked. If you add
new repositories in the future, you should stop all the daemons and rerun
##repository-launch-master##.
For example, if you want low latency on the repositories with callsigns
`A` and `B`, but don't care about latency for the other repositories, you could
launch two daemons like this:

If you run Phabricator with multiple web frontends, have your deployment script
do a ##phd stop## and ##phd repository-launch-readonly## when it deploys. It is
very unlikely you are impacted by this unless you are one of the largest
installs in the world.
  phabricator/bin $ ./phd launch RepositoryPullLocal -- A B
  phabricator/bin $ ./phd launch RepositoryPullLocal -- --not A --not B

= Building New Parsers =

You can add new classes which will extend or enhance Diffusion's ability to
parse commit messages.

TODO: This is an advanced feature which doesn't currently have documentation and
isn't terribly stable.
The first one will work only on `A` and `B`, and should be able to import
commits with low latency more reliably. The second one will work on all other
repositories.
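
The two launch commands above follow a pattern that can be generated for any set of callsigns. The sketch below prints both command lines for a given list of low-latency repositories. The function name `pull_daemon_commands` is hypothetical, and it only prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper: print the pair of `phd launch` commands for a set of
# low-latency callsigns, mirroring the A/B example.
# Usage: pull_daemon_commands CALLSIGN [CALLSIGN...]
pull_daemon_commands() {
  [ "$#" -gt 0 ] || return 1
  only=""
  except=""
  for callsign in "$@"; do
    only="$only $callsign"
    except="$except --not $callsign"
  done
  # Dedicated daemon: works only on the listed repositories.
  printf '%s\n' "./phd launch RepositoryPullLocal --$only"
  # Catch-all daemon: works on every other repository.
  printf '%s\n' "./phd launch RepositoryPullLocal --$except"
}
```

Running `pull_daemon_commands A B` prints the same two command lines shown in the example, which makes it easy to review them before launching anything.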

= Next Steps =

  - Learn about creating a symbol index at
  - Learn about creating a symbol index at
    @{article:Diffusion User Guide: Symbol Indexes}; or
  - understand daemons in detail with @{article:Managing Daemons with phd}; or
  - give us feedback at @{article:Give Feedback! Get Support!}.