Add cached summarization to PhamePost
Summary: Restore summarization. Use the remarkup cache, and try to choose the summary somewhat intelligently (pick the first paragraph that looks like ordinary text).
Test Plan:
{F21323}
{F21324}
Reviewers: btrahan
Reviewed By: btrahan
CC: aran
Maniphest Tasks: T1373
Differential Revision: https://secure.phabricator.com/D3715
@@ -495,4 +495,54 @@ final class PhabricatorMarkupEngine {
    return $mentions;
  }


  /**
   * Produce a corpus summary, in a way that shortens the underlying text
   * without truncating it somewhere awkward.
   *
   * TODO: We could do a better job of this.
   *
   * @param string Remarkup corpus to summarize.
   * @return string Summarized corpus.
   */
  public static function summarize($corpus) {

    // Major goals here are:
    //  - Don't split in the middle of a character (utf-8).
    //  - Don't split in the middle of, e.g., **bold** text, since
    //    we end up with hanging '**' in the summary.
    //  - Try not to pick an image macro, header, embedded file, etc.
    //  - Hopefully don't return too much text. We don't explicitly limit
    //    this right now.

    $blocks = preg_split("/\n *\n\s*/", trim($corpus));

    $best = null;
    foreach ($blocks as $block) {
      // This is a test for normal spaces in the block, i.e. a heuristic to
      // distinguish standard paragraphs from things like image macros. It may
      // not work well for non-latin text. We prefer to summarize with a
      // paragraph of normal words over an image macro, if possible.
      $has_space = preg_match('/\w\s\w/', $block);

      // This is a test to find embedded images and headers. We prefer to
      // summarize with a normal paragraph over a header or an embedded object,
      // if possible.
      $has_embed = preg_match('/^[{=]/', $block);

      if ($has_space && !$has_embed) {
        // This seems like a good summary, so return it.
        return $block;
      }

      if (!$best) {
        // This is the first block we found; if everything is garbage just
        // use the first block.
        $best = $block;
      }
    }

    return $best;
  }

}
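
To illustrate how the block-selection heuristic in this hunk behaves, here is a minimal standalone sketch. It copies the logic of `summarize()` into a free function so it can run without Phabricator loaded. The name `summarize_sketch` and the sample corpora are assumptions made for this example and are not part of the commit.

```php
<?php

// Hypothetical standalone copy of the heuristic from the diff above,
// for illustration only. It is not the library's API.
function summarize_sketch($corpus) {
  // Split the corpus into blocks separated by blank lines.
  $blocks = preg_split("/\n *\n\s*/", trim($corpus));

  $best = null;
  foreach ($blocks as $block) {
    // A word character, whitespace, then a word character suggests prose.
    $has_space = preg_match('/\w\s\w/', $block);
    // A leading '{' or '=' suggests an embedded object or a header.
    $has_embed = preg_match('/^[{=]/', $block);

    if ($has_space && !$has_embed) {
      return $block;
    }

    if (!$best) {
      // Remember the first block as a fallback.
      $best = $block;
    }
  }

  return $best;
}

// The header and the embedded file are skipped in favor of the first
// ordinary paragraph.
$corpus = "= Release Notes =\n\n{F21323}\n\n".
  "This post covers the new features.\n\nMore detail follows.";
echo summarize_sketch($corpus), "\n";
// prints "This post covers the new features."

// With no prose paragraph at all, the first block is the fallback.
echo summarize_sketch("{F21323}\n\n{F21324}"), "\n";
// prints "{F21323}"
```

Note that the chosen block is returned whole, so a very long first paragraph yields a very long summary; the comments in the diff acknowledge that length is not limited yet.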