HTML/XML Processing Projects


cheeriojs / cheerio

Fast, flexible, and lean implementation of core jQuery designed specifically for the server.

JavaScript     12247   today


sparklemotion / nokogiri

Nokogiri (鋸) is a Rubygem providing HTML, XML, SAX, and Reader parsers with XPath and CSS selector support.

Ruby     4402   2 days ago


martinblech / xmltodict

Python module that makes working with XML feel like you are working with JSON

Python     2455   27 days ago


jch / html-pipeline

HTML processing filters and utilities

Ruby     1786   2 days ago


leizongmin / js-xss

Sanitize untrusted HTML (to prevent XSS) with a configuration specified by a whitelist

JavaScript     1402   today


inikulin / parse5

HTML parsing/serialization toolset for Node.js. WHATWG HTML Living Standard (aka HTML5)-compliant.

JavaScript     1393   8 days ago


mozilla / bleach

An easy, HTML5, whitelisting HTML sanitizer.

Python     1296   5 days ago


fb55 / htmlparser2

Forgiving HTML and XML parser

JavaScript     1288   4 months ago


xhtml2pdf / xhtml2pdf

HTML/CSS to PDF converter.

Python     1239   2 days ago


gawel / pyquery

A jQuery-like library for Python

Python     1204   7 months ago


yorickpeterse / oga

Oga is an XML/HTML parser written in Ruby.

Ruby     1115   1 month ago


mathiasbynens / he

A robust HTML entity encoder/decoder written in JavaScript.

JavaScript     1018   5 days ago


lxml / lxml

The lxml XML toolkit for Python

Python     994   4 days ago


technosophos / querypath

QueryPath is a PHP library for manipulating XML and HTML. It is designed to work not only with local files, but also with web services and database resources.

PHP     715   15 days ago


isaacs / sax-js

A SAX-style parser for JS

JavaScript     706   today


flavorjones / loofah

HTML/XML manipulation and sanitization based on Nokogiri

Ruby     619   24 days ago


html5lib / html5lib-python

Standards-compliant library for parsing and serializing HTML documents and fragments in Python

Python     594   today


ohler55 / ox

Ruby Optimized XML Parser

Ruby     545   2 days ago


kurtmckee / feedparser

Parse feeds in Python

Python     469   19 days ago


masterminds / html5-php

An HTML5 parser and serializer for PHP.

PHP     422   15 days ago


stchris / untangle

Converts XML to Python objects

Python     233   17 days ago


empact / roxml

ROXML is a module for binding Ruby classes to XML. It supports custom mapping and bidirectional marshalling between Ruby and XML using annotation-style class methods, via Nokogiri or LibXML.

Ruby     179   years ago


pallets / markupsafe

Implements an XML/HTML/XHTML markup-safe string for Python.

Python     160   1 month ago


scrapy / cssselect

Work with a DOM tree using CSS selectors

Python     152   3 months ago


dam5s / happymapper

Object to XML mapping library, using Nokogiri (Fork from John Nunemaker's Happymapper)

Ruby     99   3 months ago


mbklein / equivalent-xml

Easy equivalency tests for Nokogiri and Oga XML

Ruby     84   4 months ago


matiasb / demiurge

PyQuery-based scraping micro-framework.

Python     53   4 months ago


alir3z4 / python-sanitize

Bringing sanity to the world of messed-up data.

Python     31   years ago


compileinc / hodor

Simple lxml wrapper that groups results from structured pages, with pagination support 🕷

Python     16   13 days ago