Hydro: A Fast Data Manipulation Framework
jopen
10 years ago
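Hydro is a free, open-source framework for computing and serving data APIs. It is designed to help web/application servers and other data consumers extract data from different data streams and process it quickly, then render the results to different clients/applications based on criteria and statistics.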
                |-------|
                |  DB1  |======
                |-------|     =
                    .         =                Data API
                              =              |-----------|
                    .         =       >      | HYDRO -   |          |---------|
                              =              | Extract   |          | APP/Web |          |--------|
   ===ETL===>       .         =              | Transform |  =====>  | Server  |  =====>  | Client |
                              =       >      | Render    |          |---------|          |--------|
                    .         =              |-----------|
                              =
                |-------|     =
                |  DBn  |======
                |-------|
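The extract, transform, render flow in the diagram above can be illustrated with a minimal Python sketch. This is not Hydro's API: the `SOURCES` dictionary and the `extract`, `transform`, `render`, and `serve` functions are hypothetical names chosen for illustration, and the in-memory lists merely stand in for the databases DB1 through DBn.

```python
import json

# Hypothetical in-memory "data sources" standing in for DB1..DBn in the diagram.
SOURCES = {
    "db1": [{"device": "mobile", "visits": 120}, {"device": "desktop", "visits": 300}],
    "db2": [{"device": "mobile", "visits": 80}, {"device": "tablet", "visits": 40}],
}


def extract(source_ids):
    """Extract: collect the raw rows from every requested source."""
    rows = []
    for source_id in source_ids:
        rows.extend(SOURCES[source_id])
    return rows


def transform(rows):
    """Transform: sum the visits per device across all extracted rows."""
    totals = {}
    for row in rows:
        totals[row["device"]] = totals.get(row["device"], 0) + row["visits"]
    return totals


def render(totals):
    """Render: serialize the result for a client (sorted keys keep output stable)."""
    return json.dumps(totals, sort_keys=True)


def serve(source_ids):
    """Run the three stages in order, as an app/web server would on a request."""
    return render(transform(extract(source_ids)))
```

Calling `serve(["db1", "db2"])` merges the rows of both sources and returns a JSON string with one total per device. Keeping the three stages as separate functions mirrors the boxes in the diagram and lets each stage be swapped or tested on its own.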
# create a plan object
plan = PlanObject(params, source_id, conf)

# define the data source and its type
plan.data_source = 'vertica-dash'
plan.source_type = Configurator.VERTICA

# time difference, in seconds, based on the input params
time_diff = (plan.TO_DATE - plan.FROM_DATE).total_seconds()

# if the time range is bigger than 125 days and the application type is
# dashboard, abort, since the data needs to be fetched quickly
if time_diff > Configurator.SECONDS_IN_DAY * 125 and params['APP_TYPE'].to_string() == 'Dashboard':
    raise HydroException('Time range is too big')
# else, if the average number of records per day is bigger than 1000
# or the client is convertro, run the sampling logic
elif plan.AVG_RECORDS_PER_DAY > 1000 or params['CLIENT_ID'].to_string() == 'convertro':
    plan.template_file = 'device_grid_widget_sampling.sql'
    plan.sampling = True
    self.logger.debug('Sampling for the query has been turned on')
# else run the default logic
else:
    plan.template_file = 'device_grid_widget.sql'

# return the plan object to the query engine
return plan
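The planning excerpt above depends on Hydro classes (`PlanObject`, `Configurator`) and on a surrounding method that are not shown here. To try the same decision rules in isolation, here is a self-contained sketch. The `Plan` dataclass, the `choose_template` function, and the locally defined `HydroException` and `SECONDS_IN_DAY` are simplified, hypothetical stand-ins, not the framework's real definitions.

```python
from dataclasses import dataclass
from datetime import datetime

# Assumed to match Configurator.SECONDS_IN_DAY in the excerpt.
SECONDS_IN_DAY = 24 * 60 * 60


class HydroException(Exception):
    """Local stand-in for the exception raised in the excerpt."""


@dataclass
class Plan:
    """Hypothetical, simplified stand-in for PlanObject."""
    from_date: datetime
    to_date: datetime
    avg_records_per_day: int
    template_file: str = ""
    sampling: bool = False


def choose_template(plan, app_type, client_id):
    """Apply the same three-way rule as the excerpt and return the plan."""
    time_diff = (plan.to_date - plan.from_date).total_seconds()

    # Dashboards must respond quickly, so ranges over 125 days are rejected.
    if time_diff > SECONDS_IN_DAY * 125 and app_type == "Dashboard":
        raise HydroException("Time range is too big")
    # Heavy data volumes (or this specific client) get the sampling query.
    elif plan.avg_records_per_day > 1000 or client_id == "convertro":
        plan.template_file = "device_grid_widget_sampling.sql"
        plan.sampling = True
    # Everything else uses the full, unsampled query.
    else:
        plan.template_file = "device_grid_widget.sql"
    return plan
```

For example, a 30-day dashboard request averaging 5000 records per day selects `device_grid_widget_sampling.sql` with `sampling` set to `True`, while a one-year dashboard request raises `HydroException`.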