GitLake is a distributed data lake management framework built on Git. It defines a file system optimized for ETLM tasks in a data lake environment. It also provides a CLI tool,
gitlake, which gives users a Git-like experience for managing and sharing raw data files and for running massively parallel compute tasks.
- Documentation: https://gitlake.readthedocs.io
- Website: https://www.gitlake.com