traveltimes_prediction.data_processing package
Subpackages
- traveltimes_prediction.data_processing.data_entities package
- Submodules
- traveltimes_prediction.data_processing.data_entities.bck_tt module
- traveltimes_prediction.data_processing.data_entities.data_entity module
- traveltimes_prediction.data_processing.data_entities.det1 module
- traveltimes_prediction.data_processing.data_entities.det2 module
- traveltimes_prediction.data_processing.data_entities.ref_tt module
- traveltimes_prediction.data_processing.data_entities.time module
- Module contents
Submodules
traveltimes_prediction.data_processing.data_processor module
-
class
traveltimes_prediction.data_processing.data_processor.
DataProcessor
(section) Bases:
object
Class for processing the data: aggregation and feature engineering.
-
static
align_training_dataset
(X, Y, Y_bck=None)
Prepares the data for training and visualization: aligns the inputs and keeps only the records whose timestamp is present in both X and Y (and in Y_bck, if given).
Parameters: - X (pd.DataFrame) – dataframe of features
- Y (pd.DataFrame) – dataframe of the true travel times
- Y_bck (pd.DataFrame) – dataframe of the backward-predicted travel times (optional)
Returns: tuple - pd.DataFrame of features, pd.DataFrame of true travel times, optionally pd.DataFrame of the backward-predicted travel times, plus a dataframe of time
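The alignment described above can be sketched with a pandas index intersection. This is a minimal illustration, not the package's implementation: the helper name align_by_timestamp is hypothetical, it assumes the timestamps are stored in a unique index on each dataframe, and it leaves out the additional dataframe of time that the documented method returns.

```python
import pandas as pd


def align_by_timestamp(X, Y, Y_bck=None):
    """Hypothetical helper: keep only rows whose index value occurs in every frame."""
    # Assumption: timestamps live in the (unique) index of each dataframe.
    common = X.index.intersection(Y.index)
    if Y_bck is not None:
        common = common.intersection(Y_bck.index)
    common = common.sort_values()

    aligned = (X.loc[common], Y.loc[common])
    if Y_bck is not None:
        aligned += (Y_bck.loc[common],)
    return aligned


# Two frames whose timestamps only partially overlap.
idx_x = pd.date_range("2017-01-01 00:00", periods=4, freq="5min")
idx_y = pd.date_range("2017-01-01 00:05", periods=4, freq="5min")
X = pd.DataFrame({"speed": [80.0, 75.0, 60.0, 72.0]}, index=idx_x)
Y = pd.DataFrame({"tt": [310.0, 330.0, 400.0, 350.0]}, index=idx_y)

X_aligned, Y_aligned = align_by_timestamp(X, Y)
```

After the call, both returned frames share the same index, so they can be passed to a model row by row without further joins.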
-
static
traveltimes_prediction.data_processing.db_interface module
-
class
traveltimes_prediction.data_processing.db_interface.
DBInterface
Bases:
object
Class implementing database connections to data sources and storages.
-
check_latest_referential_traveltime
(section)
Retrieves the most recent data timestamp of the section by checking the latest calculated referential travel time.
Parameters: section (string) – e.g. ‘KOCE-LNCE’
Returns: datetime.datetime - latest calculated referential travel time for the given section
-
get_last_timestamp
(section)
Retrieves the most recent data timestamp of the section.
Parameters: section (string) – e.g. ‘KOCE-LNCE’
Returns: datetime
-
get_model_params
(section, model_type)
Retrieves the parameters of a trained model.
Parameters: - section (string) – name of the section, e.g. ‘KOCE-LNCE’
- model_type (string) – name of the model, e.g. ‘TimeDomainModel’
Returns: tuple (model's name, dictionary representation of the model parameters used for training)
-
load_model
(section, model_type)
Loads a saved trained model from the DB.
Parameters: - section (string) – name of the section, e.g. ‘KOCE-LNCE’
- model_type (string) – name of the model, e.g. ‘TimeDomainModel’
Returns: dict - dictionary representation of the model's attributes
-
model_timestamp
(section, model_type)
Finds out the timestamp of the model's creation.
Parameters: - section (string) – name of the section, e.g. ‘KOCE-LNCE’
- model_type (string) – name of the model, e.g. ‘TimeDomainModel’
Returns: boolean
-
save_model
(section, model, time_from, time_to, model_params)
Saves the trained model to the database.
Parameters: - section (string) – name of the section for which the model was created, e.g. ‘KOCE-LNCE’
- model (dict) – dictionary representation of the model
- time_from (datetime) – timestamp of the earliest data used to create this model
- time_to (datetime) – timestamp of the latest data used to create this model
- model_params (dict) – dictionary with the parameters of the model that were used for training
-
traveltimes_prediction.data_processing.feature_engineering module
Describes how the features are generated.
The variable features_to_extract defines the features to be created, the sources of their data, and the methods used to process them. Features are created individually for every type of data entity (feature entity).
Each feature, or group of features, is described by a dict with the following keys:
- c_name_feat - list of strings - names of the generated features (column names).
- c_name - list of strings - names of the data columns used to create the features.
- f - method - identifier (name) of the method used to create the features listed in c_name_feat.
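As an illustration of how such a description could drive feature creation, the sketch below builds one entry from the three keys listed above and applies it to a dataframe. Everything beyond those key names is an assumption: the function rolling_mean, the helper build_features, the column names, and the choice to store a callable under f (the real module may store a method name instead) are hypothetical and not taken from the package.

```python
import pandas as pd


def rolling_mean(df, c_name, c_name_feat, window=3):
    """Hypothetical feature method: rolling mean of each source column."""
    out = pd.DataFrame(index=df.index)
    for src, dst in zip(c_name, c_name_feat):
        out[dst] = df[src].rolling(window, min_periods=1).mean()
    return out


# Hypothetical description using the documented keys c_name_feat, c_name and f.
features_to_extract = [
    {"c_name_feat": ["speed_mean"], "c_name": ["speed"], "f": rolling_mean},
]


def build_features(df, descriptions):
    """Hypothetical driver: apply every described method and join the results."""
    parts = [d["f"](df, d["c_name"], d["c_name_feat"]) for d in descriptions]
    return pd.concat(parts, axis=1)


raw = pd.DataFrame({"speed": [80.0, 70.0, 60.0, 90.0]})
features = build_features(raw, features_to_extract)
```

Keeping the source columns, output names, and method together in one dict means a new feature can be added by appending a description rather than by changing the processing code.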