Data warehouse maintenance and maintenance cost has been well studied in the literature. Integrating data sources, in a data warehouse environment, may often need data cleaning, transformation, or any other function applied to the data in order to integrate it. The impact on view maintenance, when data is integrated with other comparison operators than defined in theta join, has, however, not been closely looked at in previous studies.
In this study the impact of using a complex join in data warehouse environment is analyzed to measure how different maintenance strategies are affected when data needs to be integrated using other comparison operators than defined in a theta join. The analysis shows that maintenance cost is greatly increased when using complex joins since such joins often lack optimization techniques which are available when using a theta join. The study shows, among other things, that the join aware capability of sources is not of importance when performing complex joins, and incremental view maintenance is better approach than using recomputed view maintenance, when using complex joins. Strategies for maintaining data warehouses when data is integrated using a complex join are therefore different than when a theta join is used, and different maintenance strategies need to be applied.