cheeriojs / cheerio

Fast, flexible, and lean implementation of core jQuery designed specifically for the server.

JavaScript     12247   today

sparklemotion / nokogiri

Nokogiri (鋸) is a Rubygem providing HTML, XML, SAX, and Reader parsers with XPath and CSS selector support.

Ruby     4402   2 days ago

martinblech / xmltodict

Python module that makes working with XML feel like you are working with JSON

Python     2455   27 days ago

jch / html-pipeline

HTML processing filters and utilities

Ruby     1786   2 days ago

leizongmin / js-xss

Sanitize untrusted HTML (to prevent XSS) with a configuration specified by a whitelist

JavaScript     1402   today

inikulin / parse5

HTML parsing/serialization toolset for Node.js. WHATWG HTML Living Standard (aka HTML5)-compliant.

JavaScript     1393   8 days ago

mozilla / bleach

An easy, HTML5, whitelisting HTML sanitizer.

Python     1296   5 days ago

fb55 / htmlparser2

Forgiving HTML and XML parser

JavaScript     1288   4 months ago

xhtml2pdf / xhtml2pdf

HTML/CSS to PDF converter.

Python     1239   2 days ago

gawel / pyquery

A jQuery-like library for Python

Python     1204   7 months ago

yorickpeterse / oga

Oga is an XML/HTML parser written in Ruby.

Ruby     1115   1 month ago

mathiasbynens / he

A robust HTML entity encoder/decoder written in JavaScript.

JavaScript     1018   5 days ago

lxml / lxml

The lxml XML toolkit for Python

Python     994   4 days ago

technosophos / querypath

QueryPath is a PHP library for manipulating XML and HTML. It is designed to work not only with local files, but also with web services and database resources.

PHP     715   15 days ago

isaacs / sax-js

A SAX-style parser for JS

JavaScript     706   today

flavorjones / loofah

HTML/XML manipulation and sanitization based on Nokogiri

Ruby     619   24 days ago

html5lib / html5lib-python

Standards-compliant library for parsing and serializing HTML documents and fragments in Python

Python     594   today

ohler55 / ox

Ruby Optimized XML Parser

Ruby     545   2 days ago

kurtmckee / feedparser

Parse feeds in Python

Python     469   19 days ago

masterminds / html5-php

An HTML5 parser and serializer for PHP.

PHP     422   15 days ago

stchris / untangle

Converts XML to Python objects

Python     233   17 days ago

empact / roxml

ROXML is a module for binding Ruby classes to XML. It supports custom mapping and bidirectional marshalling between Ruby and XML using annotation-style class methods, via Nokogiri or LibXML.

Ruby     179   years ago

pallets / markupsafe

Implements an XML/HTML/XHTML markup-safe string for Python.

Python     160   1 month ago

scrapy / cssselect

Working with DOM trees using CSS selectors

Python     152   3 months ago

dam5s / happymapper

Object to XML mapping library, using Nokogiri (Fork from John Nunemaker's Happymapper)

Ruby     99   3 months ago

mbklein / equivalent-xml

Easy equivalency tests for Nokogiri and Oga XML

Ruby     84   4 months ago

matiasb / demiurge

PyQuery-based scraping micro-framework.

Python     53   4 months ago

alir3z4 / python-sanitize

Bringing sanity to the world of messed-up data.

Python     31   years ago

compileinc / hodor

Simple lxml wrapper to group results from structured pages, with pagination and grouping 🕷

Python     16   13 days ago