
BiDi: request for a "BiDi balancing function" to avoid BiDi overlapping between objects
Closed, Declined (Public)

Description

Author: gangleri

Description:
Hello!

This request should offer a possibility to render pages properly *without*
implementing a restrictive request as made at
Bug 3819: strip phantom general punctuation characters from page titles

The function should examine the number and the correct order of general
punctuation characters RLE, LRE, RLO, LRO, PDF and the usage of LRM and RLM. You
may start reading about these characters at
http://www.dpawson.co.uk/xsl/sect2/Unicode.html.

"objects" can be whatever block of code rendered in MediaWiki. Typical examples
are page titles (regardless of the namespace), section headers, MediaWiki
messages etc.

The function should verify:
a) if the number of LRE's + the number of RLE's equals the number of PDF's
b) if they are used in a "valid" / "harmless" order
example: "PDF foo RLE" is *not* "harmless" because this can break code

a) and b) can be implemented with a counter (increased by RLE, LRE, RLO, LRO and
decreased by PDF) which triggers an exception if its value gets negative and
another exception if the final value is not zero

exception b1) if the final value is positive and not zero an appropriate number
of PDF's would be appended
exception b2) if the value gets negative an LRE / RLE is "added" as header; the
decision depends (mainly?) on the content language
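
A minimal sketch of the counter described above, in Python for illustration only (MediaWiki itself is written in PHP). The names `is_balanced`, `balance_bidi` and the `rtl_content` flag are hypothetical and not part of any existing code; the flag stands in for the content-language decision mentioned in b2.

```python
# Unicode explicit directional formatting characters named in this report.
LRE, RLE, PDF = "\u202a", "\u202b", "\u202c"
LRO, RLO = "\u202d", "\u202e"
OPENERS = {LRE, RLE, LRO, RLO}


def is_balanced(text: str) -> bool:
    """Checks a) and b): the counter never goes negative and ends at zero."""
    depth = 0
    for ch in text:
        if ch in OPENERS:
            depth += 1
        elif ch == PDF:
            depth -= 1
            if depth < 0:
                return False  # a PDF with no opener before it
    return depth == 0


def balance_bidi(text: str, rtl_content: bool = False) -> str:
    """Repairs an object as suggested in b1) and b2).

    rtl_content is an assumed stand-in for "depends on the content language".
    """
    opener = RLE if rtl_content else LRE
    depth = 0            # openers still waiting for a PDF
    missing_openers = 0  # PDFs seen while the counter was already zero
    for ch in text:
        if ch in OPENERS:
            depth += 1
        elif ch == PDF:
            if depth == 0:
                missing_openers += 1  # case b2: add an opener as header
            else:
                depth -= 1
    # case b1: append one PDF per opener that was never closed
    return opener * missing_openers + text + PDF * depth
```

With this sketch, the "PDF foo RLE" example becomes "LRE PDF foo RLE PDF", so no embedding can leak into the neighbouring object. Whether LRO / RLO should be counted together with LRE / RLE, as the counter description implies, is left open here.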

It should be investigated what would be the best to do if an object "starts!"
with LRM or RLM. This could break the rendering also. Maybe we should insert a
"Unicode Character ZERO WIDTH SPACE - U+200B" as header.
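
The ZERO WIDTH SPACE idea could look like the following Python sketch. The function name `guard_leading_mark` is hypothetical, and whether a leading U+200B actually prevents the rendering problem is exactly the point that still needs investigation.

```python
LRM, RLM = "\u200e", "\u200f"  # LEFT-TO-RIGHT MARK, RIGHT-TO-LEFT MARK
ZWSP = "\u200b"                # ZERO WIDTH SPACE


def guard_leading_mark(text: str) -> str:
    """Prefix ZWSP when an object starts with LRM or RLM (assumed remedy)."""
    if text.startswith((LRM, RLM)):
        return ZWSP + text
    return text
```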

What would this function solve?
see the mess at

  1. http://bugzilla.wikimedia.org/attachment.cgi?bugid=4231&action=enter
  2. http://test.leuksman.com/view/Special:Recentchangeslinked/Category:Bugzilla/BiDi
  3. http://test.leuksman.com/view/Special:Recentchanges

This change may imply that some MediaWiki messages using these characters would
need to be adapted according to whatever is newly implemented so far.

This function should go hand in hand with additional embedding suggested at
Bug 4066: BiDi: general request for special pages: special pages should display
LTR / RTL according to user interface not according to content language (tracking)

The embedding required (implicitly) there relates to BiDi overlapping in a
"list" of objects. This would fix typical overlapping at
[[ar:Special:Recentchanges]] (and at all other RTL wikis)

Best regards reinhardt [[user:gangleri]]


Version: unspecified
Severity: enhancement

Details

Reference
bz4232

Event Timeline

bzimport raised the priority of this task from to Low. Nov 21 2014, 8:57 PM
bzimport added a project: MediaWiki-Parser.
bzimport set Reference to bz4232.
bzimport added a subscriber: Unknown Object (MLST).

gangleri wrote:

removed direct blocks bug 745
Bug 745: RTL/bidirectional issues (tracking)

gangleri wrote:

*note*
to be investigated
What should be done if
(general case) either LRM or RLM follows LRE RLE LRO RLO PDF ?
I cannot see that this would make much sense.
Should LRM / RLM be filtered in these cases?

gangleri wrote:

I opened a bug about the same topic at
https://bugzilla.mozilla.org/show_bug.cgi?id=320273
[Bug 320273 Bugzilla BiDi]: request for a "BiDi balancing function" to avoid
BiDi overlapping between objects

When the issue is solved there we should be able to get some help to solve it
also in MediaWiki.

I don't see any reason to implement such a function.

These characters are used very rarely, and when they are used, they must be used correctly. If they are not used correctly, this can be seen very easily, because the display will be clearly broken. This should not be detected or fixed automatically, but manually, by inserting the correct balanced number of characters.