HTML/XML Processing Projects


cheeriojs / cheerio

Fast, flexible, and lean implementation of core jQuery designed specifically for the server.

JavaScript     12575   3 days ago


sparklemotion / nokogiri

Nokogiri (鋸) is a Rubygem providing HTML, XML, SAX, and Reader parsers with XPath and CSS selector support.

Ruby     4451   today


martinblech / xmltodict

Python module that makes working with XML feel like you are working with JSON

Python     2503   2 months ago


jch / html-pipeline

HTML processing filters and utilities

Ruby     1794   15 days ago


leizongmin / js-xss

Sanitize untrusted HTML (to prevent XSS) with a configuration specified by a whitelist

JavaScript     1457   6 days ago


inikulin / parse5

HTML parsing/serialization toolset for Node.js. WHATWG HTML Living Standard (aka HTML5)-compliant.

JavaScript     1423   7 days ago


fb55 / htmlparser2

Forgiving HTML and XML parser

JavaScript     1331   5 months ago


mozilla / bleach

An easy, HTML5, whitelisting HTML sanitizer.

Python     1308   1 month ago


xhtml2pdf / xhtml2pdf

HTML/CSS to PDF converter.

Python     1251   yesterday


gawel / pyquery

A jQuery-like library for Python

Python     1238   8 months ago


yorickpeterse / oga

Oga is an XML/HTML parser written in Ruby.

Ruby     1126   7 days ago


mathiasbynens / he

A robust HTML entity encoder/decoder written in JavaScript.

JavaScript     1079   1 month ago


lxml / lxml

The lxml XML toolkit for Python

Python     1012   15 days ago


technosophos / querypath

QueryPath is a PHP library for manipulating XML and HTML. It is designed to work not only with local files, but also with web services and database resources.

PHP     730   20 days ago


isaacs / sax-js

A SAX-style parser for JS

JavaScript     717   3 days ago


flavorjones / loofah

HTML/XML manipulation and sanitization based on Nokogiri

Ruby     625   2 months ago


html5lib / html5lib-python

Standards-compliant library for parsing and serializing HTML documents and fragments in Python

Python     601   1 month ago


ohler55 / ox

Ruby Optimized XML Parser

Ruby     552   14 days ago


kurtmckee / feedparser

Parse feeds in Python

Python     484   2 months ago


masterminds / html5-php

An HTML5 parser and serializer for PHP.

PHP     425   2 months ago


stchris / untangle

Converts XML to Python objects

Python     244   18 days ago


empact / roxml

ROXML is a module for binding Ruby classes to XML. It supports custom mapping and bidirectional marshalling between Ruby and XML using annotation-style class methods, via Nokogiri or LibXML.

Ruby     178   years ago


pallets / markupsafe

Implements an XML/HTML/XHTML markup-safe string for Python.

Python     169   1 month ago


scrapy / cssselect

Work with DOM trees using CSS selectors

Python     154   4 months ago


dam5s / happymapper

Object-to-XML mapping library, using Nokogiri (fork of John Nunemaker's Happymapper)

Ruby     100   23 days ago


mbklein / equivalent-xml

Easy equivalency tests for Nokogiri and Oga XML

Ruby     84   1 month ago


matiasb / demiurge

PyQuery-based scraping micro-framework.

Python     55   5 months ago


alir3z4 / python-sanitize

Bringing sanity to the world of messed-up data.

Python     32   years ago


compileinc / hodor

Simple lxml wrapper to group results from structured pages, with pagination and grouping 🕷

Python     16   8 days ago


jurismarches / chopper

A tool we use every day at work. It extracts part of a web page along with the CSS rules that apply to it, keeping the HTML valid.

Python     13   3 months ago