Real-time dataset replication and version control: Dat
jopen
11 years ago
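Dat is an open source project that supports real-time replication and version control of datasets, and provides a streaming interface for every file format and data storage backend.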

Streaming
Everything in dat is built using streaming + non-blocking components so that you can work with large datasets and get immediate, real-time results without running out of RAM.
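To make the streaming idea concrete, here is a minimal sketch in Python. It does not use dat's own API; the function names and the sample data are hypothetical. It only illustrates the general pattern of handling one row at a time, so that memory use stays flat and partial results are available right away.

```python
import csv
import io
from typing import Iterable, Iterator


def read_rows(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed CSV row at a time instead of loading the whole input."""
    yield from csv.DictReader(lines)


def running_totals(rows: Iterable[dict], field: str) -> Iterator[float]:
    """Emit an updated total after every row, so results appear immediately."""
    total = 0.0
    for row in rows:
        total += float(row[field])
        yield total


if __name__ == "__main__":
    # Hypothetical sample data; a real source could be a large file or a pipe.
    sample = io.StringIO("city,population\nA,10\nB,20\nC,30\n")
    for total in running_totals(read_rows(sample), "population"):
        print(total)
```

Because both functions are generators, only the current row is held in memory, regardless of how large the input is.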

Made with Modules
Dat stores data locally, but you can easily configure it to store its tabular data in the database of your choice (e.g. PostgreSQL) and its files in external file stores (e.g. Google Drive).
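The modular design can be pictured roughly as follows. This is an illustrative sketch, not dat's actual module interface: `RowStore`, `MemoryRowStore`, and `Dataset` are hypothetical names. The point is that the dataset logic depends only on a small storage interface, so a local store could be swapped for one backed by a database such as PostgreSQL.

```python
from typing import Dict, Iterator, List, Protocol, Tuple


class RowStore(Protocol):
    """Hypothetical interface that any tabular backend would implement."""

    def put(self, key: str, row: dict) -> None: ...

    def scan(self) -> Iterator[Tuple[str, dict]]: ...


class MemoryRowStore:
    """Stand-in for the default local store; keeps rows in a dict."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}

    def put(self, key: str, row: dict) -> None:
        self._rows[key] = row

    def scan(self) -> Iterator[Tuple[str, dict]]:
        # Iterate in key order so results are deterministic.
        for key in sorted(self._rows):
            yield key, self._rows[key]


class Dataset:
    """Dataset logic that is unaware of which backend it writes to."""

    def __init__(self, store: RowStore) -> None:
        self.store = store

    def add(self, key: str, row: dict) -> None:
        self.store.put(key, row)

    def keys(self) -> List[str]:
        return [key for key, _ in self.store.scan()]
```

A database-backed class with the same two methods could be passed to `Dataset` without changing any other code.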

REST and CLI
You can stream data in and out of dat from the command line using any program that can read from stdin or write to stdout (e.g. R, Python, Ruby), or you can use dat's built-in HTTP REST API.
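As a sketch of the producer side of such a pipeline, the following Python script writes records to stdout one line at a time. Newline-delimited JSON is assumed here as the interchange format, and `write_ndjson` is a hypothetical helper. The exact dat command that would consume this output depends on the dat version and is deliberately not shown.

```python
import json
import sys
from typing import Iterable, TextIO


def write_ndjson(records: Iterable[dict], out: TextIO) -> int:
    """Write each record as one JSON line; return how many were written."""
    count = 0
    for record in records:
        out.write(json.dumps(record) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Hypothetical records produced lazily, so nothing is buffered in memory.
    rows = ({"id": i, "value": i * i} for i in range(3))
    write_ndjson(rows, sys.stdout)
```

Any tool that reads line-oriented input from stdin can sit on the other end of the pipe, which is what makes the command line usable from R, Python, Ruby, and similar languages alike.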
