Python Hadoop I/O Utilities

Pure Python SequenceFile Reader and Writer implementation that allows you to read and write Hadoop sequence files without using Java.

Installation

python setup.py install

Alternatively, add this line to your project's requirements.txt:

-e git+https://github.com/commoncrawl/python-hadoop.git@main#egg=hadoop

Usage

See the examples for how to read and write SequenceFiles and other file formats specific to Hadoop and MapReduce.
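As a rough illustration of why no Java is needed, the sketch below checks the start of a SequenceFile header using only the Python standard library. It relies on the fact that a SequenceFile begins with the three bytes `SEQ` followed by a single version byte. The function name `read_sequencefile_version` is hypothetical and is not part of this package's API; for real reading and writing, use the classes shown in the examples.

```python
import io

# The first three bytes of every Hadoop SequenceFile.
SEQ_MAGIC = b"SEQ"


def read_sequencefile_version(stream):
    """Return the format version byte from a SequenceFile header.

    Hypothetical helper for illustration only; it is not this
    library's API. `stream` is any binary file-like object
    positioned at the start of the file.
    """
    header = stream.read(4)
    if len(header) < 4 or header[:3] != SEQ_MAGIC:
        raise ValueError("not a Hadoop SequenceFile (missing 'SEQ' magic)")
    # Indexing a bytes object in Python 3 yields an int.
    return header[3]


if __name__ == "__main__":
    # An in-memory stand-in for a real file opened with open(path, "rb").
    sample = io.BytesIO(b"SEQ\x06" + b"remaining header bytes")
    print(read_sequencefile_version(sample))
```

With a real file, pass `open(path, "rb")` instead of the in-memory sample. The rest of the header (key and value class names, compression flags, metadata, sync marker) is what the library's reader parses for you.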

Credits

Author: Matteo Bertozzi theo.bertozzi@gmail.com (see the original repository)

Contributions to this fork:

See the commit logs for a complete list of contributors.