
Function to compress or delete old content in database
Closed, ResolvedPublic

Description

Author: mr.primus

Description:
Since MW 1.5beta3, compressOld.php doesn't seem to work. In the database, the
"old" table doesn't exist anymore. A tool, special page, or something similar is
needed to reduce the size of the database and delete obsolete content. For future
documentation, see: http://meta.wikimedia.org/wiki/Help:Reduce_size_of_the_database .


Version: 1.5.x
Severity: enhancement
URL: http://meta.wikimedia.org/wiki/Help:Reduce_size_of_the_database

Details

Reference
bz3612

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 21 2014, 8:51 PM)
bzimport set Reference to bz3612.
bzimport added a subscriber: Unknown Object (MLST).

robchur wrote:

Just to clarify; by obsolete data, do *you* mean old page revisions?

mr.primus wrote:

Right, perhaps those older than three months. The age limit could be a variable
set in LocalSettings.php or AdminSettings.php. :)

robchur wrote:

You misunderstood my changing of the bug options. This bug effectively affects
all users, regardless of OS or platform (because it doesn't exist for anyone);
also, it's not a bug fix, it's a feature request/enhancement. And it's not
*critical* (think in terms of what the Wikimedia Foundation wants, if you want a
more accurate way of thinking how the developers regard critical issues) so it's
got normal priority.

mr.primus wrote:

I can't agree with you, because these old revisions *cause significant hardware
costs*, so it should be major, perhaps even critical.

robchur wrote:

I'm sorry, but I can't continue to argue over this. I'll have our CTO or one of
the other devs take a gander and relabel as required.

robchur wrote:

One point worth making is that the WMF tends to need to keep all old revisions,
for various reasons, including GFDL issues.

mr.primus wrote:

I think this point should be discussed by the community, not only by you and me.

If it's not possible to delete them, perhaps compressing would be OK. This is not
a new feature: it existed before, and compressOld.php is still in the install
package, but broken.

You can't think that *causing costs is a good solution* here, can you?

compressOld.php works fine as far as I know in current 1.5. (Check rc4 and CVS.)

mr.primus wrote:

Can I use the compressOld.php from 1.5rc4 in 1.5beta3?

mr.primus wrote:

How can I do that without causing problems? I searched and only found this:

http://meta.wikimedia.org/wiki/Help:Upgrading_MediaWiki

The upgrade procedure for 1.5 is not yet available there.

robchur wrote:

You should be able to copy the files for 1.5 over the top of those for 1.5rc4,
but I'd recommend backing up the wiki folder (files) and the MySQL database
before doing this, as a precaution.
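
A minimal sketch of that precaution, not an official MediaWiki procedure: the
two backup commands could be assembled as below. The function name
`backup_commands`, the output file names, and the example values are
hypothetical, and `mysqldump` and `tar` are assumed to be available on the PATH.

```python
import shlex


def backup_commands(db_name, db_user, wiki_dir, out_prefix):
    """Build (but do not run) the database dump and file archive commands."""
    dump = [
        "mysqldump",
        "--user=" + db_user,
        "--password",  # no value given, so mysqldump prompts for it
        "--result-file=" + out_prefix + ".sql",
        db_name,
    ]
    archive = ["tar", "-czf", out_prefix + "-files.tar.gz", wiki_dir]
    return [dump, archive]


if __name__ == "__main__":
    # Hypothetical values; replace them with those of your own installation.
    for cmd in backup_commands("wikidb", "root", "/path/to/wiki", "wiki-backup"):
        print(shlex.join(cmd))
```

The sketch only prints the commands so they can be reviewed before being run
by hand or passed to `subprocess.run`.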

mr.primus wrote:

I'm running 1.5beta3 now; is it possible in the same way? Are any database
changes or an upgrade script needed?

If you look in the source archive, you'll see these files:
RELEASE-NOTES
UPGRADE
INSTALL

They are in capital letters for a reason. Please read them.

mr.primus wrote:

I just did as you said. The update to 1.5rc4 works fine with the default install
script there, but when I execute compressOld.php it says the following:
"compressOld is known to be broken at the moment." So I checked CVS and got
1.6alpha working, with this failure at the end:

PHP 5.0.4 installed
Warning: PHP's register_globals option is enabled. Disable it if you can.
MediaWiki will work, but your server is more exposed to PHP-based security
vulnerabilities.
PHP server API is apache2handler; ok, using pretty URLs (index.php/Page_Title)
Have XML / Latin1-UTF-8 conversion support.
PHP is configured with no memory_limit.
Have zlib support; enabling output compression.
Neither Turck MMCache nor eAccelerator are installed, can't use object caching
functions
GNU diff3 not found.
Found GD graphics library built-in, image thumbnailing will be enabled if you
enable uploads.
Installation directory: D:\apachefriends\xampp\htdocs\wiss
Script URI path: /wiss
Environment checked. You can install MediaWiki.
Warning: $wgSecretKey key is insecure, generated with mt_rand(). Consider
changing it manually.
Database type: mysql
Trying to connect to database server on localhost as root...
Connected as root (automatic)
Connected to 4.1.12
Database wikidb exists
There are already MediaWiki tables in this database. Checking if updates are
needed...
DB user account ok
...hitcounter table already exists.
...querycache table already exists.
...objectcache table already exists.
...categorylinks table already exists.
...logging table already exists.
...validate table already exists.
...user_newtalk table already exists.
...transcache table already exists.
...trackbacks table already exists.
...have ipb_id field in ipblocks table.
...have ipb_expiry field in ipblocks table.
...have rc_type field in recentchanges table.
...have rc_ip field in recentchanges table.
...have rc_id field in recentchanges table.
...have rc_patrolled field in recentchanges table.
...have user_real_name field in user table.
...have user_token field in user table.
...have user_email_token field in user table.
...have log_params field in logging table.
...have ar_rev_id field in archive table.
...have ar_text_id field in archive table.
...have page_len field in page table.
...have rev_deleted field in revision table.
...have img_width field in image table.
...have img_metadata field in image table.
...have img_media_type field in image table.
...have val_ip field in validate table.
...have ss_total_pages field in site_stats table.
...have iw_trans field in interwiki table.
...already have interwiki table
...indexes seem up to 20031107 standards
Already have pagelinks; skipping old links table updates.
...image primary key already set.
The watchlist table is already set up for email notification.
...user table does not contain old email authentication field.
Logging table has correct title encoding.
...page table already exists.
revision timestamp indexes already up to 2005-03-13
...rev_text_id already in place.
...page_namespace is already a full int (int(11)).
...ar_namespace is already a full int (int(11)).
...rc_namespace is already a full int (int(11)).
...wl_namespace is already a full int (int(11)).
...qc_namespace is already a full int (int(11)).
...log_namespace is already a full int (int(11)).
...already have pagelinks table.
No img_type field in image table; Good.
Already have unique user_name index.
...user_groups table already exists.
...user_groups is in current format.
...wl_notificationtimestamp is already nullable.
Updating indexes to 20050912: Query "ALTER TABLE wissimage
ADD INDEX img_major_mime (img_major_mime)
" failed with error code "No database selected".

If I execute compressOld.php, I get the same message again: "compressOld is
known to be broken at the moment." What's wrong? :-)

mr.primus wrote:

I started the process again with 1.5beta3 and updated via CVS to 1.5.0. This
worked fine and without failures, but if I execute compressOld.php, I get the
same message again: "compressOld is known to be broken at the moment." -.-

mr.primus wrote:

Any update here?

robchur wrote:

Someone was asking about this in the IRC channel last night, and I've got it
pegged down to have a look, so I'll probably take a look at the compressOld.php
script while I'm at it.

Looks like somebody forgot to remove the old compressOld.inc/.php on the REL1_5 branch
when the new storage code was moved into maintenance/storage.

Run the version in maintenance/storage, not the one in maintenance/. I'll make sure the
old one is removed in the next 1.5 release.
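
A minimal sketch of picking the right copy, assuming a standard source layout
and a `php` binary on the PATH. The helper name `compress_old_command` is
hypothetical, and the script's own options are not shown here; check its header
comments for those.

```python
from pathlib import Path


def compress_old_command(wiki_root):
    """Return the argument list for the copy under maintenance/storage."""
    script = Path(wiki_root) / "maintenance" / "storage" / "compressOld.php"
    if not script.is_file():
        # Deliberately no fallback to maintenance/compressOld.php,
        # which is the stale copy described above.
        raise FileNotFoundError("no compressOld.php under " + str(script.parent))
    return ["php", str(script)]
```

Calling `compress_old_command("/path/to/wiki")` returns a list that can be
printed for review or passed to `subprocess.run`.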

mr.primus wrote:

Seems to work now.

mr.primus wrote:

Any suggestion - perhaps a SQL query - for killing the old revisions?

Problem was fixed, not sure why this was reopened.

Could you open a new enhancement request for a delete-old-revisions script?
While it's evil and horrible, it gets asked for a lot ;) so maybe we could
include something.