Clean up some old cluster-ish documentation

Summary:
Ref T10751. We currently have a placeholder Almanac document, and a fairly-bad-advice section in Daemons.

Pull these into the modern cluster documentation.

Test Plan: 17 phabricator PHDs

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10751

Differential Revision: https://secure.phabricator.com/D15689
epriestley
2016-04-12 10:46:19 -07:00
parent 33060d1652
commit afb0f7c7af
7 changed files with 234 additions and 69 deletions

@@ -83,8 +83,7 @@ final class PhabricatorAlmanacApplication extends PhabricatorApplication {
 phutil_tag(
 'a',
 array(
-'href' => PhabricatorEnv::getDoclink(
-'User Guide: Phabricator Clusters'),
+'href' => PhabricatorEnv::getDoclink('Clustering Introduction'),
 'target' => '_blank',
 ),
 pht('Learn More')));

@@ -178,7 +178,7 @@ abstract class AlmanacController
 'a',
 array(
 'href' => PhabricatorEnv::getDoclink(
-'User Guide: Phabricator Clusters'),
+'Clustering Introduction'),
 'target' => '_blank',
 ),
 pht('Learn More'));

@@ -26,6 +26,9 @@ operations personnel who need this high degree of flexibility.
The remainder of this document summarizes how to add redundancy to each
service and where your efforts are likely to have the greatest impact.
For additional guidance on setting up a cluster, see "Overlaying Services"
and "Cluster Recipes" at the bottom of this document.
Cluster: Databases
=================
@@ -44,7 +47,8 @@ For details, see @{article:Cluster: Databases}.
Cluster: Repositories
=====================
-Configuring multiple repository hosts is complex.
+Configuring multiple repository hosts is complex, but is required before you
+can add multiple daemon or web hosts.
Repository replicas are important for availability if you host repositories
on Phabricator, but less important if you host repositories elsewhere
@@ -55,3 +59,123 @@ naturally somewhat resistant to data loss: every clone of a repository includes
the entire history.
For details, see @{article:Cluster: Repositories}.
Cluster: Daemons
================
Configuring multiple daemon hosts is straightforward, but you must configure
repositories first.
With daemons running on multiple hosts, you can transparently survive the loss
of any subset of hosts without an interruption to daemon services, as long as
at least one host remains alive. Daemons are stateless, so spreading daemons
across multiple hosts provides no resistance to data loss.
For details, see @{article:Cluster: Daemons}.
Cluster: Web Servers
====================
Configuring multiple web hosts is straightforward, but you must configure
repositories first.
With multiple web hosts, you can transparently survive the loss of any subset
of hosts as long as at least one host remains alive. Web hosts are stateless,
so putting multiple hosts in service provides no resistance to data loss.
For details, see @{article:Cluster: Web Servers}.
Overlaying Services
===================
Although hosts can run a single dedicated service type, certain groups of
services work well together. Phabricator clusters usually do not need to be
very large, so deploying a small number of hosts with multiple services is a
good place to start.
In planning a cluster, consider these blended host types:
**Everything**: Run HTTP, SSH, MySQL, repositories and daemons on a single
host. This is the starting point for single-node setups, and usually also the
best configuration when adding the second node.
**Everything Except Databases**: Run HTTP, SSH, repositories and daemons on one
host, and MySQL on a different host. MySQL uses many of the same resources that
other services use. It's also simpler to separate than other services, and
tends to benefit the most from dedicated hardware.
**Just Databases**: Run MySQL on dedicated nodes. Database nodes tend to
benefit the most from dedicated hardware.
**Repositories and Daemons**: Run repositories and daemons on the same host.
Repository hosts //must// run daemons, and it normally makes sense to
completely overlay repositories and daemons. These services tend to use
different resources (repositories are heavier on I/O and lighter on CPU/RAM;
daemons are heavier on CPU/RAM and lighter on I/O).
Repositories and daemons are also both less latency sensitive than other
service types, so there's a wider margin of error for underprovisioning them
before performance is noticeably affected.
These nodes tend to use system resources in a balanced way. Individual nodes
in this class do not need to be particularly powerful.
**Frontend Servers**: Run HTTP and SSH on the same host. These are easy to set
up, stateless, and you can scale the pool up or down easily to meet demand.
Routing both types of ingress traffic through the same initial tier can
simplify load balancing.
These nodes tend to need relatively little RAM.
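As a rough illustration of the blended host types above, the following sketch models each type as a set of services and checks the one hard rule stated in this section: repository hosts must run daemons. The names `HOST_TYPES` and `check_host` are hypothetical and are not part of Phabricator or its configuration format.

```python
# Hypothetical model of the blended host types; not a Phabricator config format.
HOST_TYPES = {
    "everything": {"http", "ssh", "mysql", "repositories", "daemons"},
    "everything_except_databases": {"http", "ssh", "repositories", "daemons"},
    "just_databases": {"mysql"},
    "repositories_and_daemons": {"repositories", "daemons"},
    "frontend": {"http", "ssh"},
}


def check_host(services):
    """Return a list of problems with the services planned for one host."""
    problems = []
    # The only hard constraint in the text: repository hosts must run daemons.
    if "repositories" in services and "daemons" not in services:
        problems.append("repository hosts must run daemons")
    return problems


if __name__ == "__main__":
    for name, services in HOST_TYPES.items():
        print(name, check_host(services))
```

Every blended type listed here satisfies the rule; a host planned with repositories but no daemons would be reported as a problem.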
Cluster Recipes
===============
This section provides some guidance on reasonable ways to scale up a cluster.
The smallest possible cluster is **two hosts**. Run everything (web, SSH,
database, repositories, and daemons) on each host. One host will serve as the
master; the other will serve as a replica.
Ideally, you should physically separate these hosts to reduce the chance that a
natural disaster or infrastructure disruption could disable or destroy both
hosts at the same time.
From here, you can choose how you expand the cluster.
To improve **scalability and performance**, separate loaded services onto
dedicated hosts and then add more hosts of that type to increase capacity. If
you have a two-node cluster, the best way to improve scalability by adding one
host is likely to separate the master database onto its own host.
Note that increasing scale may //decrease// availability by leaving you with
too little capacity after a failure. If you have three hosts handling traffic
and one datacenter fails, too much traffic may be sent to the single remaining
host in the surviving datacenter. You can hedge against this by mirroring new
hosts in other datacenters (for example, also separate the replica database
onto its own host).
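The capacity concern above can be made concrete with a small calculation. This is a minimal sketch that assumes traffic is spread evenly across all hosts of one type and that a datacenter failure removes every host in it; `overload_factor` and the datacenter names are hypothetical.

```python
def overload_factor(hosts_by_datacenter, failed_datacenter):
    """How many times its normal load each surviving host must carry.

    Assumes traffic is spread evenly across hosts (an assumption, not
    something Phabricator guarantees).
    """
    total = sum(hosts_by_datacenter.values())
    surviving = total - hosts_by_datacenter.get(failed_datacenter, 0)
    if surviving <= 0:
        raise ValueError("no hosts survive the failure")
    return total / surviving


# Three hosts, two of them in the datacenter that fails.
lopsided = {"dc1": 2, "dc2": 1}
# The same cluster after mirroring the new host in the other datacenter.
mirrored = {"dc1": 2, "dc2": 2}

if __name__ == "__main__":
    print(overload_factor(lopsided, "dc1"))  # 3.0
    print(overload_factor(mirrored, "dc1"))  # 2.0
```

Under these assumptions, mirroring the extra host reduces the worst-case load on a surviving host from three times normal to two times normal.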
After separating databases, separating repository + daemon nodes is likely
the next step.
To improve **availability**, add another copy of everything you run in one
datacenter to a new datacenter. For example, if you have a two-node cluster,
the best way to improve availability is to run everything on a third host in a
third datacenter. If you have a 6-node cluster with a web node, a database node
and a repo + daemon node in two datacenters, add 3 more nodes to create a copy
of each node in a third datacenter.
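The availability recipe above amounts to copying one datacenter's set of nodes into a new datacenter. A minimal sketch of the 6-node example, using a hypothetical `mirror_to_new_datacenter` helper and made-up datacenter names:

```python
def mirror_to_new_datacenter(cluster, source_dc, new_dc):
    """Return a copy of the cluster with source_dc's node roles duplicated in new_dc."""
    if new_dc in cluster:
        raise ValueError("datacenter already exists: " + new_dc)
    expanded = {dc: list(roles) for dc, roles in cluster.items()}
    expanded[new_dc] = list(cluster[source_dc])
    return expanded


# The 6-node cluster from the text: one node of each kind in two datacenters.
six_node = {
    "dc1": ["web", "database", "repo+daemon"],
    "dc2": ["web", "database", "repo+daemon"],
}
nine_node = mirror_to_new_datacenter(six_node, "dc1", "dc3")

if __name__ == "__main__":
    print(sum(len(roles) for roles in nine_node.values()))  # 9
```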
You can continue adding hosts until you run out of hosts.
Next Steps
==========
Continue by:
- learning how Phacility configures and operates a large, multi-tenant
production cluster in ((cluster)).

@@ -0,0 +1,59 @@
@title Cluster: Daemons
@group intro
Configuring Phabricator to use multiple daemon hosts.
Overview
========
WARNING: This feature is a very early prototype; the features this document
describes are mostly speculative fantasy.
You can run daemons on multiple hosts. The advantages of doing this are:
- you can completely survive the loss of multiple daemon hosts; and
- worker queue throughput may improve.
This configuration is simple, but you must configure repositories first. For
details, see @{article:Cluster: Repositories}.
Since repository hosts must run daemons anyway, you usually do not need to do
any additional work and can skip this entirely.
Adding Daemon Hosts
===================
After configuring repositories for clustering, launch daemons on every
repository host according to the documentation in
@{article:Cluster: Repositories}. These daemons are necessary: repositories
will not fetch, update, or synchronize properly without them.
If your repository clustering is redundant (you have at least two repository
hosts), these daemons are also likely to be sufficient in most cases. If you
want to launch additional hosts anyway (for example, to increase queue capacity
for unusual workloads), see "Dedicated Daemon Hosts" below.
Dedicated Daemon Hosts
======================
You can launch additional daemon hosts without any special configuration.
Daemon hosts must be able to reach other hosts on the network, but do not need
to run any services (like HTTP or SSH). Simply deploy the Phabricator software
and configuration and start the daemons.
Normally, there is little reason to deploy dedicated daemon hosts. They can
improve queue capacity, but generally do not improve availability or increase
resistance to data loss on their own. Instead, consider deploying more
repository hosts: repository hosts run daemons, so this will increase queue
capacity but also improve repository availability and resistance to data loss.
Next Steps
==========
Continue by:
- returning to @{article:Clustering Introduction}; or
- configuring repositories first with @{article:Cluster: Repositories}.

@@ -0,0 +1,42 @@
@title Cluster: Web Servers
@group intro
Configuring Phabricator to use multiple web servers.
Overview
========
WARNING: This feature is a very early prototype; the features this document
describes are mostly speculative fantasy.
You can run Phabricator on multiple web servers. The advantages of doing this
are:
- you can completely survive the loss of multiple web hosts; and
- performance and capacity may improve.
This configuration is simple, but you must configure repositories first. For
details, see @{article:Cluster: Repositories}.
Adding Web Hosts
================
After configuring repositories in cluster mode, you can add more web hosts
at any time: simply deploy the Phabricator software and configuration to a
host, start the web server, and then add the host to the load balancer pool.
Phabricator web servers are stateless, so you can pull them in and out of
production freely.
You may also want to run SSH services on these hosts, since the service is very
similar to HTTP, also stateless, and it may be simpler to load balance the
services together.
Next Steps
==========
Continue by:
- returning to @{article:Clustering Introduction}.

@@ -1,50 +0,0 @@
@title User Guide: Phabricator Clusters
@group config
Guide on scaling Phabricator across multiple machines.
Overview
========
IMPORTANT: Phabricator clustering is in its infancy and does not work at all
yet. This document is mostly a placeholder.
IMPORTANT: DO NOT CONFIGURE CLUSTER SERVICES UNLESS YOU HAVE **TWENTY YEARS OF
EXPERIENCE WITH PHABRICATOR** AND **A MINIMUM OF 17 PHABRICATOR PHDs**. YOU
WILL BREAK YOUR INSTALL AND BE UNABLE TO REPAIR IT.
See also @{article:Almanac User Guide}.
Managing Cluster Configuration
==============================
Cluster configuration is managed primarily from the **Almanac** application.
To define cluster services and create or edit cluster configuration, you must
have the **Can Manage Cluster Services** application permission in Almanac. If
you do not have this permission, all cluster services and all connected devices
will be locked and not editable.
The **Can Manage Cluster Services** permission is stronger than service and
device policies, and overrides them. You can never edit a cluster service if
you don't have this permission, even if the **Can Edit** policy on the service
itself is very permissive.
Locking Cluster Configuration
=============================
IMPORTANT: Managing cluster services is **dangerous** and **fragile**.
If you make a mistake, you can break your install. Because the install is
broken, you will be unable to load the web interface in order to repair it.
IMPORTANT: Currently, broken clusters must be repaired by manually fixing them
in the database. There are no instructions available on how to do this, and no
tools to help you. Do not configure cluster services.
If an attacker gains access to an account with permission to manage cluster
services, they can add devices they control as database servers. These servers
will then receive sensitive data and traffic, and allow the attacker to
escalate their access and completely compromise an install.

@@ -113,25 +113,16 @@ This daemon will daemonize and run normally.
- See @{article:Diffusion User Guide} for details about tuning the repository
daemon.
-== Multiple Machines ==
-If you have multiple machines, you should use `phd launch` to tweak which
-daemons launch, and split daemons across machines like this:
+Multiple Hosts
+==============
-- `PhabricatorRepositoryPullLocalDaemon`: Run one copy on any machine.
-On each web frontend which is not running a normal copy, run a copy
-with the `--no-discovery` flag.
-- `PhabricatorTriggerDaemon`: Run one copy on any machine.
-- `PhabricatorTaskmasterDaemon`: Run as many copies as you need to keep
-tasks from backing up. You can run them all on one machine or split them
-across machines.
+For information about running daemons on multiple hosts, see
+@{article:Cluster: Daemons}.
-A gratuitously wasteful install might have a dedicated daemon machine which
-runs `phd start` with a large pool of taskmasters set in the config, and then
-runs `phd launch PhabricatorRepositoryPullLocalDaemon -- --no-discovery` on each
-web server. This is grossly excessive in normal cases.
-= Next Steps =
+Next Steps
+==========
Continue by: