HTML/XML Processing Projects

cheeriojs / cheerio

Fast, flexible, and lean implementation of core jQuery designed specifically for the server.

JavaScript     12575   3 days ago

sparklemotion / nokogiri

Nokogiri (鋸) is a Rubygem providing HTML, XML, SAX, and Reader parsers with XPath and CSS selector support.

Ruby     4451   today

martinblech / xmltodict

Python module that makes working with XML feel like you are working with JSON

Python     2503   2 months ago

jch / html-pipeline

HTML processing filters and utilities

Ruby     1794   15 days ago

leizongmin / js-xss

Sanitize untrusted HTML (to prevent XSS) with a configuration specified by a whitelist

JavaScript     1457   6 days ago

inikulin / parse5

HTML parsing/serialization toolset for Node.js. WHATWG HTML Living Standard (aka HTML5)-compliant.

JavaScript     1423   7 days ago

fb55 / htmlparser2

Forgiving HTML and XML parser

JavaScript     1331   5 months ago

mozilla / bleach

An easy, HTML5, whitelisting HTML sanitizer.

Python     1308   1 month ago

xhtml2pdf / xhtml2pdf

HTML/CSS to PDF converter.

Python     1251   yesterday

gawel / pyquery

A jQuery-like library for Python

Python     1238   8 months ago

yorickpeterse / oga

Oga is an XML/HTML parser written in Ruby.

Ruby     1126   7 days ago

mathiasbynens / he

A robust HTML entity encoder/decoder written in JavaScript.

JavaScript     1079   1 month ago

lxml / lxml

The lxml XML toolkit for Python

Python     1012   15 days ago

technosophos / querypath

QueryPath is a PHP library for manipulating XML and HTML. It is designed to work not only with local files, but also with web services and database resources.

PHP     730   20 days ago

isaacs / sax-js

A SAX-style parser for JS

JavaScript     717   3 days ago

flavorjones / loofah

HTML/XML manipulation and sanitization based on Nokogiri

Ruby     625   2 months ago

html5lib / html5lib-python

Standards-compliant library for parsing and serializing HTML documents and fragments in Python

Python     601   1 month ago

ohler55 / ox

Ruby Optimized XML Parser

Ruby     552   14 days ago

kurtmckee / feedparser

Parse feeds in Python

Python     484   2 months ago

masterminds / html5-php

An HTML5 parser and serializer for PHP.

PHP     425   2 months ago

stchris / untangle

Converts XML to Python objects

Python     244   18 days ago

empact / roxml

ROXML is a module for binding Ruby classes to XML. It supports custom mapping and bidirectional marshalling between Ruby and XML using annotation-style class methods, via Nokogiri or LibXML.

Ruby     178   years ago

pallets / markupsafe

Implements an XML/HTML/XHTML markup-safe string for Python.

Python     169   1 month ago

scrapy / cssselect

Working with DOM trees using CSS selectors

Python     154   4 months ago

dam5s / happymapper

Object-to-XML mapping library, using Nokogiri (fork of John Nunemaker's Happymapper)

Ruby     100   23 days ago

mbklein / equivalent-xml

Easy equivalency tests for Nokogiri and Oga XML

Ruby     84   1 month ago

matiasb / demiurge

PyQuery-based scraping micro-framework.

Python     55   5 months ago

alir3z4 / python-sanitize

Bringing sanity to the world of messed-up data.

Python     32   years ago

compileinc / hodor

Simple lxml wrapper that groups results from structured pages, with support for pagination and grouping 🕷

Python     16   8 days ago

jurismarches / chopper

A tool we used every day at work. It extracts part of a web page together with its applied CSS rules while keeping the HTML correct.

Python     13   3 months ago