    •   MINDS@UW Home
    • MINDS@UW Madison
    • College of Engineering, University of Wisconsin--Madison
    • Department of Civil and Environmental Engineering
    • Theses--Civil Engineering
    • View Item

    IMPROVING THE ACCURACY OF THE COST ESTIMATION OF PUBLIC TRANSPORTATION PROJECT: A DATA DRIVEN SELECTION OF THE ESTIMATING METHODOLOGY

    File(s)
    MS_Thesis_Abdellatif_Sarah.pdf (3.182Mb)
    Date
    2022-08-31
    Author
    Abdellatif, Sarah
    Advisor(s)
    Hanna, Awad
    Abstract
    The preliminary engineer’s estimate for public highway projects has long been a deciding factor in whether State Transportation Agencies (STAs) can proceed with projects that are essential for the public’s well-being. Most transportation and infrastructure projects are funded from a limited reservoir provided by federal, state, and local government programs, and the preliminary engineer’s estimate acts as a benchmark for the spending of funds in said reservoir. Therefore, it is of paramount importance that the drafting of the preliminary engineer’s estimate considers the market conditions and is reflective of the contractor bids, for the proper allocation of funds to projects governed by STAs. Cost estimating in transportation and infrastructure projects is a dynamic process that transforms along the major phases of highway construction projects. The phases are planning, project development, final design, right-of-way acquisition, construction, and operation and maintenance. This research primarily explored the engineer’s estimate prepared during the final design phase in highway construction projects, which is referred to in this paper as the “engineer’s estimate”. The Federal Highway Administration (FHWA) measures the effectiveness and accuracy of the engineer’s estimate in terms of the percentage deviation of the low bid from the engineer’s estimate and recommends an accuracy defined by at least 50 percent of low bids falling within ±10 percent of that estimate. Despite commendable efforts from STAs, high deviations of estimates from low bids remain a persistent problem that public agencies face. The Wisconsin Department of Transportation (WisDOT) requested the support of the Construction and Materials Support Center (CMSC) at the University of Wisconsin-Madison in running an estimating peer exchange with fellow STAs to determine underlying causes behind the high deviation of the final design engineer’s estimate from low bids.
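The FHWA accuracy measure described above can be illustrated with a minimal sketch. The function name `share_within_band` and the dollar figures are hypothetical and are not taken from the thesis; the deviation is assumed to be measured relative to the engineer's estimate.

```python
def share_within_band(estimates, low_bids, band=0.10):
    """Return the fraction of projects whose low bid lies within
    +/- band of the engineer's estimate (deviation relative to the estimate)."""
    if len(estimates) != len(low_bids):
        raise ValueError("estimates and low_bids must have the same length")
    if not estimates:
        raise ValueError("at least one project is required")
    hits = 0
    for estimate, bid in zip(estimates, low_bids):
        deviation = (bid - estimate) / estimate
        if abs(deviation) <= band:
            hits += 1
    return hits / len(estimates)


# Made-up example values, for illustration only.
estimates = [1_000_000, 2_500_000, 800_000, 4_000_000]
low_bids = [1_050_000, 2_100_000, 790_000, 4_600_000]

share = share_within_band(estimates, low_bids)
meets_fhwa_guideline = share >= 0.50  # at least 50 percent within +/-10 percent
print(share, meets_fhwa_guideline)
```

In this made-up sample, two of the four low bids deviate by less than 10 percent, so the sample sits exactly at the 50 percent threshold.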
One important factor identified as influencing the effectiveness and accuracy of the estimate is the method of cost estimation, which includes historical bid-based estimating, cost-based estimating, and combination estimating. The highway and infrastructure industry has no precise analysis or conclusion on the impact of the different methods of cost estimation on the estimates developed during the initial stages of a project and lacks a universally accepted methodology for the choice of the method of cost estimation. Thus, there is a need for STAs to evaluate the effect of using the different methods of cost estimation on the estimate accuracy, as defined by the FHWA, to identify the most suitable approach for all project types. This research utilizes expert opinion from the estimating peer exchange and data-driven algorithms so that STAs can predict the better-suited method of cost estimation for the estimates created during the early stages of the project and thereby better allocate funds from public agencies. Data was collected from eleven participating STAs during the estimating peer exchange, as well as from five other STAs using a survey. The data collected relates to the best scoping, cost estimation, and risk assessment practices during early stages of the project, as well as the accuracy of the states’ engineer’s estimates from 2018 to 2020. Additionally, both qualitative and quantitative analyses were performed to evaluate the variation in estimate precision and accuracy by method of cost estimation, using the average yearly data from the STAs. Among the number of bidders, geographic location, shortage of estimators, and economic volatility, the method of cost estimation was identified as a major factor in the engineer’s estimate accuracy, as defined by the FHWA. Historical bid-based estimating was found to be the most common method, followed by combination estimating, and finally cost-based estimating.
These methods averaged 47%, 48%, and 53%, respectively, of low bids falling within ±10% of the engineer’s estimate. While cost-based estimating results in the highest accuracy, it requires extensive training of the estimating personnel. The yearly average dataset was insufficient to conclude which method of cost estimation is better suited for the highway and infrastructure sector. Consequently, predictive machine learning algorithms were employed to predict the optimum method of cost estimation depending on project-related variables and economic variables. Raw data was collected from six STAs: Montana Department of Transportation (MDT), Nebraska Department of Transportation (NDOT), North Dakota Department of Transportation (NDDOT), Tennessee Department of Transportation (TDOT), Washington State Department of Transportation (WSDOT), and Wisconsin Department of Transportation (WisDOT). The data obtained only included observations for projects estimated using historical bid-based estimating, and combination estimating with 5-10% of line items estimated using cost-based estimating. The dataset spanned 11 unified project types and was used to train the following machine learning algorithms to predict the most suitable method of cost estimation: multiple linear regression (ML), logistic regression (LOGIT), classification and regression trees (CART), and random forests (RF). The gathered data were separated into two groups: one for training the model and the other for testing purposes. Using the same dataset, the models were developed, and then their performances were evaluated based on the area under the receiver operating characteristic curve (AUC). ML was used as the standard statistical analysis to evaluate the need for more complex machine learning models. It was unable to capture non-linear relationships, which proved to be a governing factor behind its low model performance.
Economic variables were found to be the most influential on the optimal method of cost estimation, primarily the prime loan rate with a feature coefficient of -11.8611. The project types loosely followed behind with feature coefficients ranging between 0.1787 and 0.6571. LOGIT was found to be substantially better than the ML method in many respects, including the flexibility around linear relationships, and obtained a significantly higher performance. Three models were developed using LOGIT: a base model, an l1-regularization model, and an l2-regularization model. All three models obtained a classification accuracy of 89%, but the l2-regularization model reduced the feature correlation and bias in the model, so it was deemed more fitting for predicting the method of cost estimation with a low risk of overfitting. The prime loan rate was again found to be of highest importance with a coefficient of -84.9338, followed by the project types ranging between 1.102 and 4.907. One CART model was then developed due to its flexible and non-parametric modeling properties, meaning that there are no strict assumptions. It was able to better capture nonlinearities between the features and the target variables than both the ML and LOGIT models. Using hyperparameter tuning, a maximum model accuracy of 0.99 was obtained using a maximum depth of 9, minimum samples per leaf of 4, and minimum samples per split of 8. CART models are notoriously susceptible to overfitting, and even with the hyperparameter tuning the CART model was deemed not optimal for the prediction of the method of estimation. The projects under the maintenance or minor upgrades type were ranked at the top of the list with a coefficient of 2.0997, followed by the crude oil prices at a coefficient of 0.7586. Similar to CART, RF was not sensitive to linear relationships between the features and the target variable.
In RF, multiple CART trees are combined, with an additional hyperparameter for the number of trees, to counter the overfitting of single CART trees. The hyperparameter tuning resulted in a maximum depth of 2, minimum samples per leaf of 8, minimum samples per split of 9, and an optimum number of trees of 219. The model obtained a classification accuracy of 90%, which was the highest accuracy across all ML algorithms. The RF model was deemed the most suitable for the purpose of predicting the optimal method of cost estimation. The data-driven model can be used by STAs to allocate teams of estimating professionals with varying degrees of experience in estimating. Estimators with a higher understanding of project cost can be assigned to projects that require the use of cost-based estimating, such that the burden of training estimating personnel on cost-based estimating as a method of estimation is lightened. Hence, the funds available to STAs can be more optimally allocated for the benefit of the public and the economy. Moreover, across all the models, the influence of economic factors, primarily the prime loan rate and crude oil prices, consistently exceeded that of project-related factors. The project type was the leading influence among the project-related features, with safety and traffic control, maintenance and minor upgrades, environmental mitigation, roadway redesign, road or culvert replacement, earthwork, and resurfacing project types favoring the use of combination estimating, while bridge construction and bridge replacement projects consistently elected historical bid-based estimating as the preferred method of estimation.
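The AUC criterion used to compare the models can be illustrated with a small, library-free sketch. It is not the thesis's code: the function name `roc_auc`, the label encoding (1 for combination estimating, 0 for historical bid-based estimating), and the scores are all assumptions made for illustration. The AUC is computed here as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive
    and negative scores; tied pairs contribute 0.5."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical held-out projects: 1 = combination estimating,
# 0 = historical bid-based estimating (encoding assumed for illustration).
labels = [1, 0, 1, 1, 0, 0]
# Hypothetical predicted probabilities of class 1 from some fitted model.
scores = [0.9, 0.4, 0.6, 0.35, 0.2, 0.6]

auc = roc_auc(labels, scores)
print(round(auc, 4))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two estimating methods, which makes it a reasonable common yardstick across different model types evaluated on the same test split.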
    Subject
    engineer's estimate
    DOT
    machine learning
    data
    transportation
    highway
    Permanent Link
    http://digital.library.wisc.edu/1793/83525
    Type
    Thesis
    Part of
    • Theses--Civil Engineering
