Significantly speeding up Recent Changes (RC): show only last revision
Closed, DeclinedPublic

Description

I love the way the UseMod wiki software displays Recent Changes: each page is
listed only ONCE, with its most recently changed version.

After analysing the big Wikipedias (standard and enhanced RC views), I found
that the "UseMod RC style" could save significant processing time and
bandwidth. The current MediaWiki software sends the client a lot of information
that is usually not interesting at first view: the date and time of ALL pages
*and their older revisions* until the cut-off condition is met.

So I found a tiny patch that significantly speeds up the RC view:

In SpecialRecentchanges.php, change the line (at about line 100)

WHERE rc_timestamp > '{$cutoff}' {$hidem} ".

to

WHERE rc_timestamp > '{$cutoff}' {$hidem} AND rc_this_oldid=0 ".
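To illustrate what the extra condition does, here is a minimal Python sketch using an in-memory SQLite table. The table is a simplified, hypothetical stand-in for recentchanges, and it assumes (as the patch implies) that rc_this_oldid is 0 only for the row describing a page's current revision. It is not the actual MediaWiki query.

```python
import sqlite3

# Hypothetical, simplified stand-in for the recentchanges table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recentchanges ("
    " rc_title TEXT, rc_timestamp TEXT, rc_this_oldid INTEGER)"
)
rows = [
    # Assumption: rc_this_oldid is 0 only for a page's current revision.
    ("Main_Page", "20041120100000", 101),
    ("Main_Page", "20041120110000", 102),
    ("Main_Page", "20041120120000", 0),
    ("Sandbox", "20041120113000", 0),
]
conn.executemany("INSERT INTO recentchanges VALUES (?, ?, ?)", rows)

cutoff = "20041120000000"


def recent_changes(only_current):
    """Return (title, timestamp) rows newer than the cutoff, newest first."""
    sql = "SELECT rc_title, rc_timestamp FROM recentchanges WHERE rc_timestamp > ?"
    if only_current:
        sql += " AND rc_this_oldid = 0"  # the extra condition from the patch
    sql += " ORDER BY rc_timestamp DESC"
    return conn.execute(sql, (cutoff,)).fetchall()


standard = recent_changes(only_current=False)
usemod_style = recent_changes(only_current=True)
print(len(standard), "rows without the condition,", len(usemod_style), "with it")
```

With the condition, each page appears once, and the two older Main_Page rows are never returned by the query at all.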

This has the side effect that the enhanced RC view no longer has the older
revisions transmitted to the client's browser. But who actually needs these? You
can always click the "hist" link to get the older revisions sent to your client.

My goal was to speed up the RC view significantly. The patch saves bandwidth, as
users who are interested in older revisions can simply click the hist links to
get them.

Remark:
A new, forthcoming Enotif patch revision will give each user ONE additional
link, "lastvisited", for WATCH-LISTED pages

(bold, because watchlisted)page title (diff)(hist)(lastvisited)

which he/she can click to immediately get the diff between the current
version and THE LAST VISITED ONE.


Version: 1.3.x
Severity: normal
URL: http://meta.wikimedia.org/wiki/Email_notification_versions#Recent_Changes_view:_UseMod_style

Details

Reference
bz756

Event Timeline

bzimport raised the priority of this task to Lowest. Nov 21 2014, 7:02 PM
bzimport set Reference to bz756.
bzimport added a subscriber: Unknown Object (MLST).

rowan.collins wrote:

The big disadvantage of hiding all but the most recent change from RC is that it
provides a neat way of slipping vandalism past anyone watching: vandalise one
section of an article, then quickly make a valid and minor edit to a different
section. If only the second edit shows up on RC, people "patrolling" will check
the diff and assume the edit is sound; whereas currently, *both* edits will show
up, and be checked separately by patrollers.

Perhaps there could be an *option* to simply hide all but one edit per page, but
I'm not sure how useful this would be: we'd want it turned off by default,
because otherwise people wouldn't know it was there to turn off (if they wanted
to go on patrol, and avoid the scenario above).

I agree with Rowan's comments; I don't see how this would be any faster either.

(In reply to comment #2)

I agree with Rowan's comments; I don't see how this would be any faster either.

PLEASE try it first to see how it works, then comment. Thank you.
(I came from UseMod, and the RC view is quicker there: you save all the revision
data that is not needed at first.)

UseMod's RecentChanges display also shows the number of changes so you can directly see that there have been multiple changes that need to be
reviewed -- an important part that making this change would lose, but which is already included in our 'enhanced recent changes' mode.

Can you explain what you mean by 'quicker'? As far as I can see it will pull and format the same number of rows, so what's the difference? If anything,
hopping about through the indexes more might make the query slower. Have you benchmarked it?

rowan.collins wrote:

I presume the assumption was that by excluding all but one revision in the
actual SQL query, it would be retrieving less info from the DB. But this would,
of course, exclude even the "X changes have been made" labels, since the query
would have no idea of any but the most recent change per article.

(In reply to comment #5)

I presume the assumption was that by excluding all but one revision in the
actual SQL query, it would be retrieving less info from the DB. But this would,
of course, exclude even the "X changes have been made" labels, since the query
would have no idea of any but the most recent change per article.

You are right. However, for me, the past does not count. What counts is the
current version. The enotif patch comes with an advanced method of having a
direct link to the last visited page version (diff between current and last
visited version). This additional link is much sexier than seeing hundreds
of "edit wars" on an article.

But I will definitely pay attention to your remark and I *will* COUNT and show
the available recent VERSIONS. By the way, I already show the number of watching
users.

If you have access to the source of your wiki, it's worth applying my patch for
a while to see the difference in the output. Most of my colleagues like it.

For the MediaWiki software CVS version, I have already programmed a global
variable ($wgRCUseModStyle = true | false) with which the admin can enable or
disable the feature proposed here.

Thank you anyway for your contributions.
Tom

rowan.collins wrote:

(In reply to comment #6)

If you have access to the source of your wiki, it's worth applying my patch for
a while to see the difference in the output. Most of my colleagues like it.

I admit, I haven't tried making the change mentioned in comment #0, but as far
as I can make out, the result will be something similar to the "enhanced recent
changes" when all the multiples are collapsed. Correct me if I'm wrong, and
indeed I apologise if this assumption is wrong enough to invalidate any of my
comments below.

You are right. However, for me, the past does not count. What counts is the
current version. The enotif patch comes with an advanced method of having a
direct link to the last visited page version (diff between current and last
visited version). This additional link is much sexier than seeing hundreds
of "edit wars" on an article.

Just to clarify, this isn't about the importance of "the past", it's about the
fact that diffs only highlight what has been changed in a particular revision.
So if you show only one change to a large article, other changes will simply
disappear; people patrolling RecentChanges to spot vandalism do *not* read the
whole of every article that has been changed, to check for *old* vandalism, they
check the diff of each change to see if *it* is vandalism. I do think the "(last
visited)" link for watchlisted pages is a very good idea, but it is the solution
to a different problem: how to keep track of an article that you have read and
understand the whole of, and make sure no vandalism or dodgy edit slips through
on a longer timescale.

I note that the "parent" of each group of edits on enhanced recent changes
already has a link labelled "changes" that shows a diff incorporating all the
collapsed changes, so you don't even have to look at each one individually and
trawl through edit wars etc.

As an example, look at
http://en.wikipedia.org/w/wiki.phtml?title=User:IMSoP/sandbox&action=history
As I understand it, your proposed RecentChanges view would only list the most
recent change, which is
http://en.wikipedia.org/w/wiki.phtml?title=User:IMSoP/sandbox&diff=0&oldid=6772150
(a perfectly valid edit, no vandalism here). What won't be spotted is that a
short time before there was another edit (which in reality would probably be an
unlabelled contribution from a non-registered user):
http://en.wikipedia.org/w/wiki.phtml?title=User:IMSoP/sandbox&diff=6772150&oldid=6772102
Thus the one-change-only view severely compromises one of the main roles of the
RecentChanges page: spotting vandalism.
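The scenario can be sketched in a few lines of Python with the standard difflib module. The three revisions below are invented for illustration and are not the contents of the sandbox page linked above.

```python
import difflib

# Hypothetical revisions of one page; the text is made up for illustration.
rev1 = ["Intro paragraph.", "Section A is accurate.", "Section B has a tpyo."]
rev2 = ["Intro paragraph.", "Section A is NONSENSE.", "Section B has a tpyo."]  # vandalism
rev3 = ["Intro paragraph.", "Section A is NONSENSE.", "Section B has a typo."]  # valid minor fix


def changed_lines(old, new):
    """Return the lines a diff would flag as added or removed."""
    return [
        line
        for line in difflib.ndiff(old, new)
        if line.startswith(("+ ", "- "))
    ]


# What a patroller sees if only the most recent change is listed.
latest_only = changed_lines(rev2, rev3)
# What a diff from the oldest listed revision to the current one shows.
since_cutoff = changed_lines(rev1, rev3)

print("latest change only:", latest_only)
print("since cutoff:", since_cutoff)
```

The diff of the latest change contains only the harmless Section B fix; the vandalised Section A line shows up only when the comparison starts from the older revision.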

As you say, if I had the page on my watchlist, a "last visited" link might point
to
http://en.wikipedia.org/w/wiki.phtml?title=User:IMSoP/sandbox&diff=0&oldid=6772102
and I would spot the problem, but this is no good if no-one happens to be
actively watching it (i.e. checking their watchlist, or recentchanges where it
is bolded, at that particular moment). The current "enhanced recent changes"
would have a link pointing to that diff anyway, comparing <oldest version at
$cutoff> with <version now>, but if you're not pulling old edits from the
database, you can't have this, either.

But I will definitely pay attention to your remark and I *will* COUNT and show
the available recent VERSIONS. By the way, I already show the number of watching
users.

The thing with this is, that in order to count the number of changes, you will
be pulling exactly the same information out of the database as the current
version, but just not transmitting all of it to the client. So essentially
you'll have something like the current "enhanced recent changes", but without
the JavaScript that lets you expand the multiples. Personally, I'm not sure the
bandwidth involved in transmitting this extra information is significant enough
to bother about, but that's to some extent a matter of opinion.
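A rough Python sketch of this point, using invented rows: to print a "(N changes)" label next to the newest change of each page, every row since the cutoff still has to be read, even though only one line per page is sent to the client. The function name and the data are hypothetical.

```python
from collections import Counter

# Hypothetical rows as (page_title, timestamp), newest first.
rows = [
    ("Main_Page", "20041120120000"),
    ("Sandbox", "20041120113000"),
    ("Main_Page", "20041120110000"),
    ("Main_Page", "20041120100000"),
]


def collapse_with_counts(rows):
    """Keep the newest row per page and attach the number of recent changes."""
    # Counting requires reading every row, not just the newest one per page.
    counts = Counter(title for title, _ in rows)
    seen = set()
    collapsed = []
    for title, timestamp in rows:
        if title not in seen:  # rows are newest first, so the first hit is the latest
            seen.add(title)
            collapsed.append((title, timestamp, counts[title]))
    return collapsed


collapsed = collapse_with_counts(rows)
for title, timestamp, n in collapsed:
    print(f"{title} ({n} changes) {timestamp}")
```

The saving is therefore in what is transmitted, not in what is pulled from the database.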

(In reply to comment #7)

Thank you for your really *valuable comments* !

The thing with this is, that in order to count the number of changes, you will
be pulling exactly the same information out of the database as the current
version, but just not transmitting all of it to the client.

Regarding your last paragraph: exactly, you've got it, my intention was to
save bandwidth. (Wikipedians told me ugly things when I introduced SOME
more bytes with a visible NEW marker, but that's history.)

Everyone who wants to patrol (with my RCUseModStyle=true) can click on (hist) to
pull the history of a single article. For small-scale wikis, the RCUseModStyle
view seems to fit better; for large-scale ones, perhaps you are right. This is
why I made the patch configurable: admins may switch the feature on or off.

It would perhaps be better to have this as a (third) USER PREFERENCE option,
like this:

User Preferences:

RC View:
1- Standard = Every recent change shown on a separate line (*)
2- Enhanced RC View = recent changes of a page are collapsed per day (JavaScript
fold view) (*)
3- UseMod Style View = only one (the most recent) change of a page is shown (**)

(*) options already available now
(**) new, as proposed here
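As a rough illustration of how the three views differ, here is a small Python sketch. The enum, the function, and the sample rows are all hypothetical, and the "enhanced" branch is only a simplification of the per-day folding (no JavaScript, no expandable detail).

```python
from enum import Enum


class RCView(Enum):
    STANDARD = 1  # every change on its own line
    ENHANCED = 2  # changes of a page collapsed per day
    USEMOD = 3    # only the most recent change of each page


# Hypothetical rows as (page_title, timestamp YYYYMMDDhhmmss), newest first.
rows = [
    ("Main_Page", "20041121090000"),
    ("Sandbox", "20041120113000"),
    ("Main_Page", "20041120110000"),
    ("Main_Page", "20041120100000"),
]


def render(rows, view):
    """Return the list of lines that a given view would display."""
    if view is RCView.STANDARD:
        return [f"{ts} {title}" for title, ts in rows]
    if view is RCView.ENHANCED:
        groups = {}  # (day, title) -> timestamps, in order of first appearance
        for title, ts in rows:
            groups.setdefault((ts[:8], title), []).append(ts)
        return [
            f"{day} {title} ({len(stamps)} changes)"
            for (day, title), stamps in groups.items()
        ]
    latest = {}  # title -> newest timestamp (rows are newest first)
    for title, ts in rows:
        latest.setdefault(title, ts)
    return [f"{ts} {title}" for title, ts in latest.items()]


for view in RCView:
    print(view.name, len(render(rows, view)), "lines")
```

In this sample, Main_Page still appears once per day in the enhanced view but only once overall in the UseMod-style view.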

Tom

I've already explained how the proposed mode would be significantly
different from UseMod's RecentChanges view, while the enhanced
recent changes view by default provides an almost identical view to
UseMod's (with extra detail if you click the arrow).

If you'd like to improve the compatibility and bandwidth usage of the
enhanced mode, please file that as a separate request.

(In reply to comment #9)

I've already explained how the proposed mode would be significantly
different from UseMod's RecentChanges view,

UseMod shows .... ( 3 versions ) after the page title, indicating that 3
versions are still kept in the database for retrieval on the user's demand.

while the enhanced recent changes view by default provides an almost identical
view to UseMod's (with extra detail if you click the arrow).

It is not the same. It clutters the screen, because pages with many changes
appear under several days even in the folded view. My bug report was aimed more
at hiding these (numerous) lines, where a changed page appears under every day
on which it was changed, even in the folded enhanced view.

I have nothing against signalling the number of versions available (as in
UseMod); as already mentioned, I am in favour of it. But I think the superfluous
view of past page changes can be avoided. This is why I still propose
implementing "RCUseMod-Style (exactly)" as a THIRD option for users.

(In reply to comment #9)

If you'd like to improve the compatibility and bandwidth usage of the
enhanced mode, please file that as a separate request.

Yes, thanks for the suggestion.
I filed this as http://bugzilla.wikipedia.org/show_bug.cgi?id=782 "Third user
option for Recent Changes View: UseMod-Style", taking your valuable comments
into consideration.

I did not declare a "dependency" between the two Bugzilla entries.

http://meta.wikimedia.org/wiki/Email_notification_versions#Recent_Changes_view:_UseMod_style
shows a *screenshot* of the implementation of the alternative RC view in Enotif
versions > 1.32 .

The UseMod style can be globally disabled by an admin setting in
DefaultSettings.php. If it is enabled, users get a third alternative on their
user preferences page and can decide whether to have RC views in "UseMod" style
(showing only the most recently changed version, including the standard (diff)
and (hist) links). This feature can advantageously be combined with the Enhanced
RC view, as it then suppresses the listing of revisions older than the most
recent one.

Created attachment 116
screenshot of RC UseMod style view showing only the recent version

Attached:

enotif133_rcview_usemod_style_with_one_updated_page.png (434×1 px, 28 KB)

epriestley added a commit: Unknown Object (Diffusion Commit). Mar 4 2015, 8:20 AM