Regular expressions in SPIP

Hi folks,

As a UNIX person, I often find myself reaching for regular expressions as the tool for any problem. Today I wanted to use `replace` with an extended RE and back-references to modify the value of a tag. Unfortunately, SPIP doesn't seem to support using the `match` or `replace` filters with patterns that contain any of the symbols ()[]{}. Whenever I use such a pattern, SPIP expands the tag but outputs the filter and surrounding brackets verbatim instead of applying the filter.

Compare the following three examples of an [extended] regular expression used with BSD sed on Apple OS X, PHP 4.3.10 on Debian, and SPIP (on the same server). The first two behave as expected, whereas SPIP encounters problems and outputs the filter rather than executing it.

BSD sed (a binary without a version) on Apple OS X:
  me$ echo "Example Site Name" | sed -Ee 's/([a-zA-Z])([a-zA-Z]+)/<span>\1<\/span>\2/g'
  <span>E</span>xample <span>S</span>ite <span>N</span>ame

PHP 4.3.10 on Debian:
  <?php echo preg_replace('/([a-zA-Z])([a-zA-Z]+)/imsSx', '<span>\1</span>\2', 'Example Site Name'); ?>
  <span>E</span>xample <span>S</span>ite <span>N</span>ame

SPIP 1.9.2c [10268]:
  [(#NOM_SITE_SPIP|replace{'([a-zA-Z])([a-zA-Z]+)','<span>\1</span>\2,'imsSx'})]
  [(Example Site Name|replace{'([a-zA-Z])([a-zA-Z]+)','<span>\1</span>\2,'imsSx'})]

This seems to occur when the pattern contains any of the closing bracketing characters used in SPIP's tag and filter syntax: ])}. This suggests that the SPIP syntax parser is falsely matching these characters as the end of some piece of its own syntax.
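To make the suspected failure mode concrete, here is a toy sketch. It is not SPIP's parser: `naive_parse_tag` is a hypothetical function, and its pattern is only an assumption about how a simple tag matcher might treat the first `)` as the end of the filter list.

```php
<?php
// Toy illustration only: NOT SPIP's real parser.
// Hypothetical matcher for "[(#TAG|filters)]" that assumes the
// filter part never contains a ')' character.
function naive_parse_tag($src) {
    if (preg_match('/^\[\(#(\w+)((?:\|[^)]*)?)\)\]$/', $src, $m)) {
        return array('tag' => $m[1], 'filters' => $m[2]);
    }
    // No match: a parser like this would leave the text unprocessed.
    return null;
}

// A pattern without parentheses is recognised as a filter call.
$plain = naive_parse_tag("[(#NOM_SITE_SPIP|replace{'a+','b'})]");

// A pattern with capture groups defeats the naive matcher, because
// the first ')' inside the regex is not followed by ']'.
$grouped = naive_parse_tag("[(#NOM_SITE_SPIP|replace{'(a)(b+)','x'})]");
```

In this sketch the second call returns null, so the filter text would be passed through untouched, which resembles the behaviour described above.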

Is this a known issue? Has it already been fixed, or is it likely to be? If not, can someone point me to the file implementing the SPIP template parser so that I can try to fix it?

Cheers,

Thomas Sutton

P.S.: If people get a little sick of my messages, do please let me know and I'll try to moderate myself. :slight_smile:

Thomas Sutton wrote:

    [(#NOM_SITE_SPIP|replace{'([a-zA-Z])([a-zA-Z]+)','<span>\1</span>\2,'imsSx'})]

Hello,

Unless the syntax has moved on faster than I realised (which is quite possible!), it is not possible to use just any PHP function as a SPIP filter.

What you can do is define your own filter and put the PHP code in mes_options.php.

Paolo

Hi Thomas,

I am not sure this has been fixed. You might want to try a 1.9.3 SVN version to see, but I doubt there has been any work on it.

The parser is in ecrire/inc/phraser_html.php ("phraser" being the French word for "parse"). According to trac, there have been only two corrections to this file since 10268, and neither of them fixes this.

I am sure the devs will be happy if you can fix this but, as far as I know from the last time I checked, it's a very complex problem (since you want SPIP to still understand a SPIP tag passed as a parameter to a filter).

The easiest approach is probably to do what Paolo proposes and create a specific filter for each of your replace cases. After all, that would be easier to read and manage afterwards :wink:

So you could replace:
[(#NOM_SITE_SPIP|replace{'([a-zA-Z])([a-zA-Z]+)','<span>\1</span>\2,'imsSx'})]
to something like:
[(#NOM_SITE_SPIP|span_first_letter)]

and declare the span_first_letter filter in your fonctions file.
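A minimal sketch of what such a filter might look like, assuming it goes in your fonctions file (commonly mes_fonctions.php) and that SPIP passes the tag value as the first argument. The function body is only an illustration built from the `preg_replace` call earlier in the thread, with the modifiers dropped because this pattern does not need them.

```php
<?php
// Hypothetical filter; the name comes from the suggestion above and
// the file it lives in is an assumption.
// Wraps the first letter of every word of two or more ASCII letters
// in a <span>, mirroring the preg_replace example in the thread.
function span_first_letter($texte) {
    return preg_replace(
        '/([a-zA-Z])([a-zA-Z]+)/',
        '<span>\\1</span>\\2',
        $texte
    );
}
```

Because the regular expression now lives in PHP rather than in the template, the SPIP parser never sees its brackets, and the template only needs `[(#NOM_SITE_SPIP|span_first_letter)]`.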

Pierre

PS: don't be afraid to post in English to the spip-dev list if you think you have found a bug somewhere, or to file a bug report in the trac ticket system.
