Summary:
Ref T10262. This removes one-time tokens and makes file data responses always cacheable (for 30 days). The URI will stop working once any attached object changes its view policy, or the file's own view policy changes.

Files with `canCDN` (totally public data like profile images, CSS, JS, etc.) use "Cache-Control: public" so they can be served by a CDN. Files without `canCDN` use "Cache-Control: private" so the CDN won't cache them. They could still be cached by a misbehaving local cache, but if you don't want your users seeing one another's secret files, you should configure your local network properly.

Our "Cache-Control" headers were also from 1999 or something; this updates them to be more modern and sane. I can't find any evidence that any browser has done the wrong thing with this simpler ruleset in the last ~10 years.

Test Plan:
- Configured an alternate file domain.
- Viewed the site: stuff worked.
- Accessed a file on the primary domain, got redirected to the alternate domain.
- Verified proper cache headers for `canCDN` (public) and non-`canCDN` (private) files.
- Uploaded a file to a task, edited the task's policy, and verified this scrambled the old URI.
- Reloaded the task; a new URI was generated transparently.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T10262

Differential Revision: https://secure.phabricator.com/D15642
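As a rough sketch of the caching rule described above (the `build_file_cache_control()` helper and the exact `max-age` header form are illustrative assumptions; only the public/private split and the 30-day lifetime come from this change):

<?php

// Hypothetical illustration of the cache policy in the summary; the real
// logic lives in the Aphront response classes, not in a helper like this.
function build_file_cache_control($can_cdn) {
  // File data responses are cacheable for 30 days.
  $ttl = 60 * 60 * 24 * 30;

  // Totally public file data (profile images, CSS, JS) may be cached by the
  // CDN; everything else is restricted to the client's private cache.
  $scope = $can_cdn ? 'public' : 'private';

  return sprintf('Cache-Control: %s, max-age=%d', $scope, $ttl);
}

echo build_file_cache_control(true)."\n";   // Cache-Control: public, max-age=2592000
echo build_file_cache_control(false)."\n";  // Cache-Control: private, max-age=2592000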
39 lines, 1.4 KiB, PHP
<?php

final class PhabricatorRobotsController extends PhabricatorController {

  public function shouldRequireLogin() {
    return false;
  }

  public function processRequest() {
    $out = array();

    // Prevent indexing of '/diffusion/', since the content is not generally
    // useful to index, web spiders get stuck scraping the history of every
    // file, and much of the content is Ajaxed in anyway so spiders won't even
    // see it. These pages are also relatively expensive to generate.

    // Note that this still allows commits (at '/rPxxxxx') to be indexed.
    // They're probably not hugely useful, but suffer fewer of the problems
    // Diffusion suffers and are hard to omit with 'robots.txt'.

    $out[] = 'User-Agent: *';
    $out[] = 'Disallow: /diffusion/';

    // Add a small crawl delay (number of seconds between requests) for spiders
    // which respect it. The intent here is to prevent spiders from affecting
    // performance for users. The possible cost is slower indexing, but that
    // seems like a reasonable tradeoff, since most Phabricator installs are
    // probably not hugely concerned about cutting-edge SEO.
    $out[] = 'Crawl-delay: 1';

    $content = implode("\n", $out)."\n";

    return id(new AphrontPlainTextResponse())
      ->setContent($content)
      ->setCacheDurationInSeconds(phutil_units('2 hours in seconds'))
      ->setCanCDN(true);
  }

}
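For reference, the plain-text body this controller emits (the implode of `$out` above, with a trailing newline) is:

User-Agent: *
Disallow: /diffusion/
Crawl-delay: 1

Because the response sets a two-hour cache duration and `setCanCDN(true)`, it falls under the "Cache-Control: public" rule described in the summary and can be cached by a CDN.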