HTML/XML Processing Projects


cheeriojs / cheerio

Fast, flexible, and lean implementation of core jQuery designed specifically for the server.

JavaScript     11952   yesterday


sparklemotion / nokogiri

Nokogiri (鋸) is a Rubygem providing HTML, XML, SAX, and Reader parsers with XPath and CSS selector support.

Ruby     4368   22 days ago


martinblech / xmltodict

Python module that makes working with XML feel like you are working with JSON

Python     2401   2 months ago


jch / html-pipeline

HTML processing filters and utilities

Ruby     1776   15 days ago


inikulin / parse5

HTML parsing/serialization toolset for Node.js. WHATWG HTML Living Standard (aka HTML5)-compliant.

JavaScript     1374   1 month ago


leizongmin / js-xss

Sanitize untrusted HTML (to prevent XSS) with a configuration specified by a whitelist

JavaScript     1366   2 days ago


mozilla / bleach

An easy, HTML5, whitelisting HTML sanitizer.

Python     1265   today


fb55 / htmlparser2

Forgiving HTML and XML parser

JavaScript     1256   3 months ago


xhtml2pdf / xhtml2pdf

HTML/CSS to PDF converter.

Python     1227   1 month ago


gawel / pyquery

A jQuery-like library for Python

Python     1181   6 months ago


yorickpeterse / oga

Oga is an XML/HTML parser written in Ruby.

Ruby     1109   8 days ago


mathiasbynens / he

A robust HTML entity encoder/decoder written in JavaScript.

JavaScript     984   3 months ago


lxml / lxml

The lxml XML toolkit for Python

Python     970   2 days ago


technosophos / querypath

QueryPath is a PHP library for manipulating XML and HTML. It is designed to work not only with local files, but also with web services and database resources.

PHP     709   1 month ago


isaacs / sax-js

A SAX-style parser for JS

JavaScript     694   26 days ago


flavorjones / loofah

HTML/XML manipulation and sanitization based on Nokogiri

Ruby     612   22 days ago


html5lib / html5lib-python

Standards-compliant library for parsing and serializing HTML documents and fragments in Python

Python     587   5 days ago


ohler55 / ox

Ruby Optimized XML Parser

Ruby     536   2 days ago


kurtmckee / feedparser

Parse feeds in Python

Python     446   9 days ago


masterminds / html5-php

An HTML5 parser and serializer for PHP.

PHP     412   19 days ago


stchris / untangle

Converts XML to Python objects

Python     227   2 months ago


empact / roxml

ROXML is a module for binding Ruby classes to XML. It supports custom mapping and bidirectional marshalling between Ruby and XML using annotation-style class methods, via Nokogiri or LibXML.

Ruby     179   years ago


pallets / markupsafe

Implements an XML/HTML/XHTML markup-safe string for Python.

Python     156   11 days ago


scrapy / cssselect

Work with DOM trees using CSS selectors

Python     148   2 months ago


dam5s / happymapper

Object to XML mapping library, using Nokogiri (Fork from John Nunemaker's Happymapper)

Ruby     98   2 months ago


mbklein / equivalent-xml

Easy equivalency tests for Nokogiri and Oga XML

Ruby     83   3 months ago


matiasb / demiurge

PyQuery-based scraping micro-framework.

Python     53   3 months ago


alir3z4 / python-sanitize

Bringing sanity to the world of messed-up data.

Python     30   years ago


compileinc / hodor

Simple lxml wrapper to group results from structured pages, with pagination and grouping 🕷

Python     14   2 months ago