
PHP Function to remove the HTML tags along with their contents - PHP function to remove HTML tags only

The strip_tags() function is usually used to remove tags from an HTML string, but it has some issues:


1. It does not validate the HTML, so partial or broken tags can result in the removal of more text/data than expected.
2. It does not modify any attributes on the tags that you allow through the allowable_tags parameter.
3. It may give different output for different forms of the same tag, for example <br> and <br />.
4. For a badly formatted HTML string like " PHP guys <b<b>> rocks </b<b>> ", it may give unexpected results.
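
Points 1 and 2 can be seen in a short script. This is only a sketch: the input strings and variable names are made up for illustration.

```php
<?php
// Hypothetical input: an allowed <a> tag carrying an onclick attribute,
// followed by a <script> element.
$html = '<a href="#" onclick="alert(1)">link</a><script>x()</script>';

// Allow <a>: strip_tags() keeps the tag together with all of its attributes,
// and removes the <script> tags while leaving their contents behind.
$kept = strip_tags($html, '<a>');
echo $kept, "\n"; // <a href="#" onclick="alert(1)">link</a>x()

// A stray "<" that is not followed by whitespace is treated as the start
// of a tag, so the text after it is dropped.
$broken = strip_tags('5 <6 is true');
echo $broken, "\n";
```

The first call shows why allowing a tag is not the same as making it safe, and why a function that also removes tag contents can be useful.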



Here are some workarounds:

PHP function to remove the HTML tags along with their contents:


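<?php
// Remove elements (tags plus their contents) from $text.
// $tags lists tags such as '<b><div>'. With $invert = FALSE the listed tags
// are kept and every other element is removed; with $invert = TRUE only the
// listed tags are removed.
function strip_tags_content($text, $tags = '', $invert = FALSE) {
    // Default to the unchanged text, so $invert = TRUE with no tags removes nothing
    $op_string = $text;

    // ----- collect the tag names from $tags -----
    preg_match_all('/<(.+?)[\s]*\/?[\s]*>/si', trim($tags), $matches);
    $tags = array_unique($matches[1]);

    if (count($tags) > 0) {
        // Quote the names so they are safe inside the '@'-delimited patterns
        $names = implode('|', array_map(function ($t) { return preg_quote($t, '@'); }, $tags));
        if ($invert == FALSE) {
            $op_string = preg_replace('@<(?!(?:' . $names . ')\b)(\w+)\b.*?>.*?</\1>@si', '', $text);
        }
        else {
            $op_string = preg_replace('@<(' . $names . ')\b.*?>.*?</\1>@si', '', $text);
        }
    }
    elseif ($invert == FALSE) {
        $op_string = preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $text);
    }

    // ----- remove multiple spaces -----
    $op_string = trim(preg_replace('/ {2,}/', ' ', $op_string));

    return $op_string;
}
?>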



PHP function to remove the HTML tags and the line control characters

<?php
function remove_all_tags($string) {
    // ----- replace each HTML tag with a space -----
    $string = preg_replace('/<[^>]*>/', ' ', $string);

    // ----- remove control characters -----
    $string = str_replace("\r", '', $string);  // --- remove carriage returns
    $string = str_replace("\n", ' ', $string); // --- replace newlines with a space
    $string = str_replace("\t", ' ', $string); // --- replace tabs with a space

    // ----- collapse multiple spaces -----
    $string = trim(preg_replace('/ {2,}/', ' ', $string));

    return $string;
}
?>




Illustration


Input Text
$text = '<b>PHP</b> <b<b>>guys</b<b>> are <div>rocking</div>';

Result for strip_tags($text) :
PHP guys are rocking

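Result for strip_tags_content($text) :
<b<b>>guys</b<b>> are

Result for strip_tags_content($text, '<div>'):
<b<b>>guys</b<b>> are <div>rocking</div>

Result for strip_tags_content($text, '<b>', TRUE):
<b<b>>guys</b<b>> are <div>rocking</div>

Result for remove_all_tags($text):
PHP >guys > are rocking

Note: strip_tags_content() only removes well-formed pairs such as <b>PHP</b>, so the malformed <b<b>> ... </b<b>> pair is left in place. remove_all_tags() treats <b<b> as a single tag and leaves the extra > behind.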




Uses of these strip functions
1. They can be used to validate user input for HTML elements.
2. They can be used to check GET parameters for HTML elements such as script tags, which attackers use to inject unauthorised JS scripts into a web page (*Cross-Site Scripting vulnerabilities [ XSS ]).

Sample vulnerable URL for the above-mentioned scenario:
https://www.yourdomain.com/test.php?op=1&place=Kerala-India<Script>alert("You are hacked")</Script>
If these GET parameters are not validated and the page prints the $_GET['place'] variable, then the browser will show the alert message "You are hacked" when the page is loaded.
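
The check described in point 2 could be sketched as follows for the sample URL. contains_html_tag() is a hypothetical helper, not part of PHP; it reuses the tag pattern from remove_all_tags(). The $place variable stands in for $_GET['place'].

```php
<?php
// Hypothetical helper: TRUE if the value contains anything that looks like
// an HTML tag, using the same pattern as remove_all_tags().
function contains_html_tag(string $value): bool {
    return preg_match('/<[^>]*>/', $value) === 1;
}

// Stand-in for $_GET['place'] from the sample URL.
$place = 'Kerala-India<Script>alert("You are hacked")</Script>';

if (contains_html_tag($place)) {
    echo "Rejected: HTML found in parameter\n";
} else {
    // Escape on output as well, so any remaining special characters are shown as text.
    echo htmlspecialchars($place, ENT_QUOTES, 'UTF-8'), "\n";
}
```

Rejecting the value is only one option; a page could instead pass it through remove_all_tags() and escape the result before printing it.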


For more details visit http://php.net/strip_tags


*Cross-Site Scripting [ XSS ] Attack
XSS occurs when dynamically generated web pages display user input, such as login information, that has not been properly validated. This allows an attacker to embed malicious scripts into the generated page, which are then executed by the browser of any user who views the page.
If successful, Cross-Site Scripting vulnerabilities can be exploited to manipulate or steal cookies, create requests which appear to come from a valid user, compromise confidential information, or execute malicious code on end user systems.

An XSS attack can also be blocked using the .htaccess file. The rules there check for the presence of script or iframe tags in the URL or the query string and, if any are found, redirect the request to a custom error page.
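
The .htaccess rules themselves are not shown here. As a rough PHP sketch of the same kind of check, the following looks for script or iframe tags in a query string. query_has_script_or_iframe() and the sample query are hypothetical, and this is not a substitute for escaping output.

```php
<?php
// Hypothetical helper mirroring the check described above: look for
// "<script" or "<iframe" in the URL-decoded query string.
function query_has_script_or_iframe(string $query): bool {
    return preg_match('/<\s*(script|iframe)\b/i', rawurldecode($query)) === 1;
}

// Sample query string with the script tag URL-encoded.
$query = 'op=1&place=Kerala-India%3CScript%3Ealert(1)%3C/Script%3E';

if (query_has_script_or_iframe($query)) {
    // A real page might redirect to a custom error page here.
    echo "blocked\n";
}
```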
