Predicting Domestic Tourists’ Length of Stay in Italy leveraging Regression Decision Tree Algorithms


Abstract


This study predicts the Length of Stay (LoS) of domestic tourists in Italy using decision tree models. It addresses a gap in the understanding of LoS determinants and the inconsistent results of traditional parametric analyses. Using the 2019 “Viaggi e Vacanze” survey by the Italian National Institute of Statistics, the research groups variables into sociodemographic, economic, travel-related, and psychological factors and applies one-hot encoding to analyse 48,410,000 trips. An evaluation of random forest and gradient boosting models highlights their superiority in identifying complex data patterns and offers actionable insights for tourism policymakers. These models enable precise LoS estimation, which supports strategic planning for extending stays, optimizing services, and improving promotional efforts to maximize tourism's economic impact. The approach offers a comprehensive tool for developing policies that boost visitor engagement and economic benefits, and represents a significant advancement in tourism management practices.

DOI Code: 10.1285/i20705948v17n3p621

Keywords: Microdata, Length of stay, Machine-learning models, Decision trees, Tourism sector
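
The two building blocks named in the abstract, one-hot encoding and regression trees, can be illustrated with a minimal sketch. It is not the authors' implementation: the trips, column names, and helper functions below are hypothetical and made up for illustration, and a single CART-style tree stands in for the random forest and gradient boosting models, which combine many such trees.

```python
from statistics import mean

# Hypothetical toy trips (NOT ISTAT microdata): region, purpose, party size, nights.
COLUMNS = ["region", "purpose", "party_size"]
CATEGORICAL = {"region", "purpose"}
TRIPS = [
    ("north", "leisure", 2, 7.0), ("north", "business", 1, 2.0),
    ("south", "leisure", 4, 9.0), ("south", "business", 1, 1.0),
    ("centre", "leisure", 3, 8.0), ("centre", "business", 2, 2.0),
    ("south", "leisure", 2, 10.0), ("north", "leisure", 1, 6.0),
]


def fit_levels(rows):
    """Collect the sorted category levels of each categorical column."""
    return {col: sorted({row[i] for row in rows})
            for i, col in enumerate(COLUMNS) if col in CATEGORICAL}


def feature_names(levels):
    """Name every encoded column, e.g. 'purpose=leisure'."""
    names = []
    for col in COLUMNS:
        names += [f"{col}={lv}" for lv in levels[col]] if col in levels else [col]
    return names


def encode(row, levels):
    """One-hot encode categorical values; pass numeric values through."""
    vector = []
    for col, value in zip(COLUMNS, row):
        if col in levels:
            vector += [1.0 if value == lv else 0.0 for lv in levels[col]]
        else:
            vector.append(float(value))
    return vector


def sse(ys):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)


def fit_tree(X, y, depth=0, max_depth=2, min_leaf=2):
    """Grow a regression tree by picking the split with the lowest total SSE."""
    best = None
    if depth < max_depth:
        for j in range(len(X[0])):
            for t in sorted({row[j] for row in X})[:-1]:
                left = [i for i, row in enumerate(X) if row[j] <= t]
                right = [i for i, row in enumerate(X) if row[j] > t]
                if len(left) < min_leaf or len(right) < min_leaf:
                    continue
                cost = sse([y[i] for i in left]) + sse([y[i] for i in right])
                if best is None or cost < best[0]:
                    best = (cost, j, t, left, right)
    if best is None or best[0] >= sse(y):
        return {"leaf": mean(y)}  # predict the mean LoS of the node
    _, j, t, left, right = best
    return {
        "feature": j, "threshold": t,
        "left": fit_tree([X[i] for i in left], [y[i] for i in left],
                         depth + 1, max_depth, min_leaf),
        "right": fit_tree([X[i] for i in right], [y[i] for i in right],
                          depth + 1, max_depth, min_leaf),
    }


def predict(tree, row):
    """Follow the splits down to a leaf and return its mean LoS."""
    while "leaf" not in tree:
        side = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[side]
    return tree["leaf"]


features = [trip[:3] for trip in TRIPS]
nights = [trip[3] for trip in TRIPS]
levels = fit_levels(features)
names = feature_names(levels)
X = [encode(row, levels) for row in features]
tree = fit_tree(X, nights)
print("root split on:", names[tree["feature"]])
print("predicted nights:", predict(tree, encode(("south", "leisure", 3), levels)))
```

On real microdata, a library implementation with tuned hyperparameters and a held-out test set would be the natural choice; the sketch only shows how encoded categorical variables feed the splitting rule.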


Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Italy License.