Strange Regex/Validation behavior

Enonic version: XP: 7.8.4 CS: 4.0.3
OS: Windows
Chrome
In an Enonic environment, not locally

On certain TextLine inputs where we expect URLs, we use a regex to validate that the input is a valid URL.
The regex was provided by you (Enonic) some months ago and is, quote, “…which we ourselves use in the Insert Link dialog for validating URLs…”, and up until now it has worked well.
Regex:

<config>
  <regexp>^http(s)?:\/\/.?(www\.)?[a-zA-Z0-9][-a-zA-Z0-9@:%._\+~#=]{0,255}\b([-a-zA-Z0-9@:%_\+.~#?&amp;//=]*)</regexp>
</config>

But yesterday our client tried to insert this link (ignore the first letter “a”, added it to prevent the link becoming some visual link here on the forum):
ahttps://www.oslo.kommune.no/barnehage/finn-barnehage-i-oslo/#!c%7Cf_preschool_type_student/c.f_preschool_type_student//m.list

With this URL, when we try to Save, we get the Content Studio (CS) error at the bottom, orange colored: “Invalid property for content: /theContentIm/On”.
When purposely adding an invalid url (like not adding http/https) to the TextLine input I get the “normal” field-error message right below the field telling me that it’s not a valid url (+ the red border around the field) - I don’t get the orange error “banner” at the bottom of the site.

I know there is validation in the CS (via regex) and validation in the Enonic backend, and it sounds like it is the Enonic backend that is the strict one here. To further confirm this: when testing this locally I do not get an orange error “banner”, and it allows me to save the “oslo.kommune” URL. I believe it allows me to do this because when running locally there is no “Enonic-backend validation” in play(?).

After some further testing I’ve discovered that it only fails if the TextLine input is “the first element of e.g. an option set”. Meaning, if the TextLine input with the regex is ‘one of multiple option-set-items’, it will only fail if the “invalid” url is in the first item. If I place the “invalid” url into any other item than the first one, it works fine - as if the Enonic backend validation only does the strict checking for the first element of an option set (what?).

Image: left hand side works, right hand side gives error banner upon trying to Save.

GIF: Screen capture - d174060a0e9392f7a0f2517dd46a0c1b - Gyazo

Key points:

  • Doesn’t look like one can debug/reproduce this locally.
  • Doesn’t seem to be an error with the CS regex, since it doesn’t give a “normal” field-error message / red border when placing the url as e.g. item #2, #3 etc.
  • Gives an orange error banner if the given url is in the first element of an option set (partly unknown if it would do the same outside of a set).

Additional:
In /Site itself we have some TextLine inputs for e.g. Facebook/Instagram urls etc that we “give” to the site.xml via an ‘x-data’ called “site-common.xml”. These SoMe input fields have the same regex as previous, but these are within a field-set rather than an option-set (if that matters at all, would think not…). Attempting to put the “invalid” url into these inputs gives us no orange error banner. So either

  • /site is special and is not as affected by ‘strict Enonic backend validation’
  • items with the “invalid” url work when inside a field-set, or the first item with the “invalid” url doesn’t work when inside an option-set
  • there is a difference in validation depending on whether the fields come via Mixin or X-data.

It seems like it’s the opposite - client-side validation seems to be a bit too loose and allows the “!” character, which this regex should not allow.

If you change the regex to

^http(s)?:\/\/.?(www\.)?[a-zA-Z0-9][-a-zA-Z0-9@:%._\+~#=]{0,255}\b([-a-zA-Z0-9@:%_\+.~#?&amp;//=!]*)

then it will work as intended, both client- and server-side.
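One possible explanation for the difference, sketched here in Python purely for illustration: the pattern has no trailing `$`, so a check that only needs a match starting at the beginning of the string accepts the URL, while a check that requires the whole string to match stops at the `!` and rejects it. That Content Studio behaves like the former and the backend like the latter is an assumption, not something confirmed in this thread. The `-a-zA-Z0-9@` start of each character class is also assumed, and the names `ORIGINAL`, `AMENDED`, `URL`, `prefix_ok` and `full_ok` are hypothetical.

```python
import re

# Patterns as an XML parser would hand them over (&amp; unescaped to &).
# Assumption: each character class starts with "-a-zA-Z0-9@".
ORIGINAL = r"^http(s)?:\/\/.?(www\.)?[a-zA-Z0-9][-a-zA-Z0-9@:%._\+~#=]{0,255}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)"
AMENDED = r"^http(s)?:\/\/.?(www\.)?[a-zA-Z0-9][-a-zA-Z0-9@:%._\+~#=]{0,255}\b([-a-zA-Z0-9@:%_\+.~#?&//=!]*)"

URL = (
    "https://www.oslo.kommune.no/barnehage/finn-barnehage-i-oslo/"
    "#!c%7Cf_preschool_type_student/c.f_preschool_type_student//m.list"
)


def prefix_ok(pattern: str, value: str) -> bool:
    """Loose check: the pattern only has to match from the start of the value."""
    return re.match(pattern, value) is not None


def full_ok(pattern: str, value: str) -> bool:
    """Strict check: the pattern has to consume the entire value."""
    return re.fullmatch(pattern, value) is not None


print(prefix_ok(ORIGINAL, URL))  # True: matches up to the "#", then stops before "!"
print(full_ok(ORIGINAL, URL))    # False: "!" is not in the last character class
print(full_ok(AMENDED, URL))     # True: "!" is now allowed
print(prefix_ok(ORIGINAL, "www.oslo.kommune.no"))  # False: missing http/https
```

If this is what is happening, it would also explain why a value without http/https still gets the normal field error: it already fails at the `^http` anchor, so both kinds of check reject it.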

You also seem to have discovered two bugs:

  1. Server-side seems to validate regexp only in the first occurrence.
  2. Server-side doesn’t seem to validate regexp in site config.

Shouldn’t it display this:


if the regex fails client-side?

A test done locally here shows that element #2 behaves differently depending on what url was provided:
Image 1:


Image 2:

It’s strange why it doesn’t treat the “oslo.kommune” url as a regular invalid url as in the first image.
If it is client-side validation that fails, then it’s weird that the second element gets checked in “Image 1” and not in “Image 2”.

Shouldn’t it display this:

It should. But it doesn’t, because client-side validation (which in reality is in-browser validation) successfully validates your URL with that regexp. So we need to change the regexp. I believe this one does a better job:

<config>
  <regexp>^http(s)?:\/\/?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#\[\]@!\$&amp;'\(\)\*\+,;=.%]+</regexp>
</config>
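A quick way to try this regexp outside Content Studio is a small script like the one below. It is only a sketch in Python: the sample values other than the “oslo.kommune” url are hypothetical, and the `check` helper is made up for the example. Because it is not stated here whether a given validator requires the whole value to match, the helper reports both a prefix match and a full match.

```python
import re

# Pattern from the <regexp> element above, with &amp; unescaped to &.
NEW_PATTERN = r"^http(s)?:\/\/?[\w.-]+(?:\.[\w\.-]+)+[\w\-\._~:/?#\[\]@!\$&'\(\)\*\+,;=.%]+"


def check(value: str) -> tuple[bool, bool]:
    """Return (prefix match, full match) for the given value."""
    prefix = re.match(NEW_PATTERN, value) is not None
    full = re.fullmatch(NEW_PATTERN, value) is not None
    return prefix, full


samples = [
    # The url from the original post.
    "https://www.oslo.kommune.no/barnehage/finn-barnehage-i-oslo/"
    "#!c%7Cf_preschool_type_student/c.f_preschool_type_student//m.list",
    # Hypothetical values for comparison.
    "www.oslo.kommune.no",      # no http/https
    "https://example.com/a b",  # contains a space
]

for value in samples:
    print(check(value), value)
```

The first sample passes both checks and the second fails both. The third passes only the prefix check, so a value like that could still be treated differently by a loose and a strict validator.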

We’ll also replace the regexp we are using in the Insert Link dialog with this one.
