Remove HTML Tags
Below we'll see how to remove HTML tags from a string.
With Python
Function responsible for removing HTML tags:
main.py
import re
def strip_tags(value):
    # Remove anything that looks like a tag: '<', any characters except '>', then '>'.
    return re.sub(r'<[^>]*?>', '', value)
Let's test an HTML fragment with the strip_tags function:
Example usage
import re
def strip_tags(value):
    return re.sub(r'<[^>]*?>', '', value)
html_text = """
<!DOCTYPE HTML>
<html>
<head>
<title>Title</title>
</head>
<body>
<p>Paragraph</p>
</body>
</html>"""
print(strip_tags(html_text))
If we run the script, we get the following result (the blank lines left where tags were removed are omitted here):
Title
Paragraph
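Because the regex removes only the tags, the newlines that sat between them stay in the output as blank lines. A minimal sketch of one way to tidy that up is shown here; the helper name strip_tags_compact is hypothetical and not part of the original script.

```python
import re

def strip_tags(value):
    return re.sub(r'<[^>]*?>', '', value)

def strip_tags_compact(value):
    # Hypothetical helper: strip the tags, then drop lines that are empty
    # or contain only whitespace.
    text = strip_tags(value)
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)

html_text = """
<!DOCTYPE HTML>
<html>
<head>
<title>Title</title>
</head>
<body>
<p>Paragraph</p>
</body>
</html>"""

print(strip_tags_compact(html_text))
```

This prints Title and Paragraph on two consecutive lines, with no blank lines around them.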
That was easy with plain Python. Now let's see how it works with Django.
With Django
Django offers a function for this: strip_tags.
First, you just need to install the library: pip install django.
Using Django's strip_tags
from django.utils.html import strip_tags
html_text = """
<!DOCTYPE HTML>
<html>
<head>
<title>Title</title>
</head>
<body>
<p>Paragraph</p>
</body>
</html>"""
print(strip_tags(html_text))
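If installing Django only for this one function feels heavy, a parser-based version can be sketched with the standard library alone. The example below is a rough illustration, not Django's implementation; the names TextExtractor and strip_tags_parser are hypothetical. Unlike the simple regex, a parser is not thrown off by a `>` inside a quoted attribute value.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Hypothetical helper that collects only the text between tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        # Called for text content; tags, comments and the doctype are skipped.
        self.parts.append(data)

def strip_tags_parser(value):
    parser = TextExtractor()
    parser.feed(value)
    parser.close()
    return "".join(parser.parts)

print(strip_tags_parser('<p>Hello <b>world</b></p>'))
print(strip_tags_parser('<a title="a > b">link</a>'))
```

The first call prints Hello world and the second prints link. As with the regex version, whitespace between tags is kept as-is.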