
Definition list nested with anything (;#foo:bar) is unpredictable
Closed, Resolved (Public)

Description

Author: bugs

Description:
Consider the parsing of these snippets:

//#1
;#term
:def1
:def2

//#2
;#term:def1
:def2

//#3
;#term
:#def1
:#def2

//#4
;#term:def1
:#def2

//#5
;term
:#def1
:#def2

Now, obviously it's far from clear what the semantics are or ought to be, but these aspects of behaviour are weird:

  • In snippets #3 and #5, the definitions are rendered like the term itself.
  • Snippets #1 and #2 render very differently although they should be equivalent.
  • Snippet #4 is perhaps the most likely thing you'd actually want (one term with multiple definitions), but requires the most tortured syntax.
  • #4 and #5 seem very similar but produce very different output, with different indenting and bolding.
  • It's quite unfortunate that ";foo:bar" is supposed to act like ";foo<newline>:bar", but no attempt is made to render list elements like ";foo:#bar".

Most of this weirdness seems to be caused by the parser code around "if ($this->findColonNoLinks($t, $term, $t2) !== false)", which treats the part of the line after : as if it were actually a line starting with ;.

Possible resolutions:

  • Don't allow any nesting of list items after ";". Treat ;#term as defining the literal string "#term". #; is fine for an ordered list of definitions.
  • Make #5 above the canonical way to express a term with a list of definitions, but make it render as #4 above.
  • Longer term, totally scrap the ";term:definition" (without newline) syntax. It's not of enormous benefit, makes the parsing more complex, and causes all this weird behaviour.
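
As a rough illustration of the first resolution above, the following Python sketch splits a line into its list prefix and its content, and stops reading list markers as soon as a ";" has been consumed. The function name split_list_prefix and the constant LIST_CHARS are hypothetical, and this is not how the MediaWiki parser is written; it only shows the proposed rule under which ;#term defines the literal string "#term" while #; still nests.

```python
from typing import Tuple

# Characters that can open a list item in wikitext.
LIST_CHARS = "*#:;"


def split_list_prefix(line: str) -> Tuple[str, str]:
    """Split a wikitext line into (list prefix, content).

    Hypothetical rule from the first proposal: once a ';' has been
    consumed, stop reading list markers, so whatever follows it is
    plain content of the term.
    """
    i = 0
    while i < len(line) and line[i] in LIST_CHARS:
        i += 1
        if line[i - 1] == ";":
            break  # nothing may nest after ';'
    return line[:i], line[i:]


if __name__ == "__main__":
    for sample in (";#term", "#;term", ":#def1"):
        print(repr(sample), "->", split_list_prefix(sample))
```

Under this rule only the marker order matters: "#;term" keeps its ordered-list nesting, while ";#term" yields a term whose text begins with "#".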

Version: 1.9.x
Severity: minor

Details

Reference
bz11894

Event Timeline

bzimport raised the priority of this task to Low. (Nov 21 2014, 9:54 PM)
bzimport added a project: MediaWiki-Parser.
bzimport set Reference to bz11894.
bzimport added a subscriber: Unknown Object (MLST).

This is resolved in Parsoid by making the single-line ; foo : bar pair a tightly-bound unit. See bug #6569 for more information about the reasoning.

So

*#; foo : bar

is treated as equivalent to

*#; foo
*#: bar

Extra ':' in the dd part are recognized as plain text in Parsoid.
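
The equivalence described above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not Parsoid's implementation: the function name expand_single_line_definition is hypothetical, and the sketch ignores links, templates and nowiki sections, which a real parser has to skip when looking for the colon. It splits at the first colon only, so any later ':' stays in the definition as plain text.

```python
# Characters that can open a list item in wikitext.
LIST_CHARS = "*#:;"


def expand_single_line_definition(line):
    """Rewrite '<prefix>; term : def' as two lines.

    Lines whose list prefix does not end in ';', or that contain no
    colon after the prefix, are returned unchanged.
    """
    i = 0
    while i < len(line) and line[i] in LIST_CHARS:
        i += 1
    prefix, content = line[:i], line[i:]
    if not prefix.endswith(";") or ":" not in content:
        return [line]
    # Split at the first colon only; later colons remain plain text.
    term, definition = content.split(":", 1)
    outer = prefix[:-1]  # enclosing list markers, e.g. '*#'
    return [outer + ";" + term.rstrip(), outer + ":" + definition]


if __name__ == "__main__":
    for out in expand_single_line_definition("*#; foo : bar"):
        print(out)
```

For the example from this comment, the input "*#; foo : bar" comes back as the two lines "*#; foo" and "*#: bar".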

We added tests documenting this behavior to parserTests.txt, but marked the differing ones as disabled for the PHP parser until it becomes compatible.

*** This bug has been marked as a duplicate of bug 6569 ***