John Davidson

php - Regex to "normalize" usage of SPACE after . , : chars (and some exceptions)

Message:


I need to normalize some texts (product descriptions) with regard to the correct usage of the ".", "," and ":" symbols (no space before and one space after).


The regex I've come up with is this:


$variation['DESCRIPTION'] = preg_replace('#\s*([:,.])\s*(?!<br />)#', '$1 ', $variation['DESCRIPTION']);


The problem is that this matches four cases it shouldn't touch:



  • Any decimal number, like 5.5

  • Any thousand separator, like 4,500

  • A "fixed" phrase in Greek, ό,τι

  • The ellipsis, "...". This is really a special case of its own, and perhaps it should be handled in a separate preg_replace. The three dots should be treated as a single unit: "some text ..." should indeed be matched and converted to "some text...", but not to "some text. . ."
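To make the four problem cases concrete, here is a small sketch that runs the pattern from the question over them. The wrapper function applyOriginalPattern() is a made-up name used only for this illustration; the pattern and replacement are the ones shown above.

```php
<?php
// Hypothetical wrapper (not part of the original code) around the
// pattern and replacement from the question, left unchanged.
function applyOriginalPattern(string $text): string
{
    return preg_replace('#\s*([:,.])\s*(?!<br />)#', '$1 ', $text);
}

$samples = ['5.5', '4,500', 'ό,τι', 'some text ...'];

foreach ($samples as $sample) {
    // Brackets make the trailing space visible in the output.
    printf("%s => [%s]\n", $sample, applyOriginalPattern($sample));
}
// 5.5 => [5. 5]
// 4,500 => [4, 500]
// ό,τι => [ό, τι]
// some text ... => [some text. . . ]
```

Every sample is altered, because the pattern treats each ".", "," or ":" in isolation and never looks at the characters around it.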


For the numeric exceptions in particular, I know this can be achieved with negative lookahead/lookbehind, but unfortunately I haven't managed to combine them with my current pattern.


This is a fiddle for you to check (the cases that shouldn't be matched are in lines 2, 3, 4).


EDIT: Both of the solutions posted below work fine, but they end up adding a space after the last full stop of the description. This is not much of a problem: earlier in my code I was already stripping the <br />s and spaces at the beginning and end of the description, so I moved this preg_replace before that one.


So, the final code I ended up using is this:


$variation['DESCRIPTION'] = preg_replace('#\s*([:,.])(?!(?<=\d.)\d)(?!(?<=ό,)τι)\s*#ui', '$1 ', $variation['DESCRIPTION']);
$variation['DESCRIPTION'] = preg_replace('#^\s*(<br />)*\s*|\s*(<br />)*\s*$#', '', $variation['DESCRIPTION']);

So the only thing left to achieve is to alter this code so that it treats the ellipsis the way I describe above.


Any help with this last requirement will be very much appreciated! TIA
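One possible way to cover the ellipsis requirement, offered as a sketch rather than a tested production solution: put "\.{3}" in front of "[:,.]" inside the capture group, so that three consecutive dots are consumed as one unit before the single-dot branch gets a chance. The function name normalizeSpacing() is hypothetical, and rtrim() is used here only as a stand-in for the second preg_replace from the question, which removes the trailing space. Runs of four or more dots and the single-character ellipsis "…" are not handled.

```php
<?php
// Hypothetical helper (name chosen for this sketch only).
function normalizeSpacing(string $text): string
{
    // "\.{3}" is tried before "[:,.]", so "..." is captured as one token.
    // The two lookaheads are the ones from the question: they skip
    // decimals/thousand separators (5.5, 4,500) and the Greek phrase "ό,τι".
    $text = preg_replace(
        '#\s*(\.{3}|[:,.])(?!(?<=\d.)\d)(?!(?<=ό,)τι)\s*#ui',
        '$1 ',
        $text
    );

    // The replacement always appends one space; drop it at the very end.
    // (Stand-in for the <br />/whitespace trimming done in the question.)
    return rtrim($text);
}

echo normalizeSpacing('some text ... and more'), "\n"; // some text... and more
echo normalizeSpacing('price is 5.5 ,really'), "\n";   // price is 5.5, really
echo normalizeSpacing('4,500 items ,ό,τι'), "\n";      // 4,500 items, ό,τι
```

Keeping everything in one pattern avoids having to protect the dots from a second pass; a separate preg_replace for the ellipsis would also work, but the main pattern would then need an extra check so that it leaves consecutive dots alone.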



My Answer:
The following snippet uses preg_match to check whether a string has the general shape of an email address:

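<?php
$email = "example@example.com";

// Check the overall shape: local part, "@", domain, then a dot and a TLD of two or more letters.
if (preg_match("/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/", $email)) {
    echo "Valid email address";
} else {
    echo "Invalid email address";
}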


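This regex pattern checks the format only: a local part made of letters, digits and the characters ".", "_", "%", "+" and "-", followed by "@", a domain made of letters, digits, dots and hyphens, and a final dot with a top-level domain of at least two letters. It does not verify that the address actually exists.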
