SPATIAL DURBIN MODEL OF UNEMPLOYMENT RATE IN CENTRAL JAVA

Unemployment is a labor problem often faced by developing countries such as Indonesia. The number of unemployed in Indonesia has fluctuated from year to year, including in Central Java Province. One effort to overcome this problem is to identify the factors that influence unemployment. Regional effects strongly affect the open unemployment rate, so models that account for them are appropriate, one of which is the Spatial Durbin Model (SDM). In this study, the open unemployment rate of each district/city in Central Java was modeled using a spatial approach. The models used are Ordinary Least Squares (OLS), the Spatial Autoregressive Model (SAR), the Spatial Error Model (SEM), the Spatial Durbin Model (SDM), and the Spatial Durbin Error Model (SDEM). The five methods were evaluated using the Akaike Information Criterion (AIC). The spatial weighting used is Queen Contiguity. Based on the smallest AIC value (115.42), the best method in this study is the SDM. The significant factors are the labor force participation rate (X1), the number of poor people (X4), the lag of economic growth, the lag of poverty, and the lag of the district/city minimum wage.


INTRODUCTION
One of the 17 global goals known as the Sustainable Development Goals (SDGs) is to achieve full and productive employment and decent work by 2030. This shows that unemployment is a serious issue worldwide. Unemployment is found in almost all provinces of Indonesia, especially those with big cities, including Central Java Province. Based on BPS data, Central Java's unemployment rate declined from year to year. Between 2011 and 2017, the highest unemployment rate in Central Java was recorded in 2011, when it reached 7.07% (Imsar, 2018).
Several researchers have analyzed the factors that affect the open unemployment rate. The average length of schooling has a positive and significant effect on the open unemployment rate. The minimum wage is not significant, which means that the district/city minimum wage does not significantly affect the open unemployment rate. Economic growth has a negative and significant effect on the open unemployment rate (Puspadjuita, 2017). Other studies find that wages and population growth jointly have a significant effect on the unemployment rate (Kamran et al., 2014).
Based on economic factors, the unemployment rate in a region is affected by the socio-economic conditions of the surrounding area. It is therefore important to include spatial effects, especially for regions on Java, where districts/cities lie close to one another. Spatial dependence arises because regions that are close together and share similar characteristics allow unemployment in one region to be affected by unemployment in the surrounding regions. This information on spatial relations between regions needs to be accommodated in the model (Miller et al., 2007).
The Spatial Durbin Model (SDM) draws on the concepts of spatial econometrics and spatial regression analysis (Y. Chen et al., 2022; Wang et al., 2021). SDM is used to model the relationship between dependent and independent variables in a spatial context. It is an extension of the traditional Durbin model that takes spatial effects into account. The "neighbor effect," or spatial effect, is a central idea in spatial econometrics. It refers to the influence of variables near a particular location or observation unit on the variable being studied. Spatial effects arise from the spatial dependence between observation units in a geographical distribution: neighboring units tend to have similar characteristics or to influence each other.
Traditional Durbin models have been used to model the relationship between dependent and independent variables without considering spatial effects. In many cases, however, spatial effects must be taken into account to produce accurate estimates and avoid possible biases. SDM introduces spatial elements into the Durbin model by including spatial variables in the regression equation. This allows the influence of neighboring observation units to be considered when estimating the coefficients of the independent variables. In SDM, the independent variables include both neighbor-dependent and neighbor-independent variables. Putri Andayani Suaib (2022) used SDM to model the factors that influence the Gender Development Index (IPG) on the island of Sulawesi. The factors found to be significant were life expectancy, per capita expenditure, average length of schooling, and labor force participation rate. Several other studies have also applied SDM (H. Chen et al., 2023; Du & Ren, 2023; Guo et al., 2023). SDM has also been used to model environmental regulation, green innovation, and industrial green development (Feng & Chen, 2018). That study concludes that, in the absence of environmental regulatory constraints, green product innovation plays a certain promotional role in industrial green development performance, while green craft innovation has a significant inhibitory effect.
This research uses several spatial models, including SDM, to model the factors that influence the open unemployment rate in Central Java Province. The results are intended to help the Central Java provincial government reduce the unemployment rate by targeting the significant factors.

MATERIALS AND METHODS

Spatial Autoregressive Model (SAR)
The Spatial Autoregressive Model (SAR) is a linear regression model in which the response variable is spatially correlated (Mariani et al., 2017). It is also called a mixed regressive-autoregressive model because it combines linear regression with a spatial lag of the response variable (Kelejian & Prucha, 2010).
In the general spatial model, λ is the coefficient of spatial error, ρ is the coefficient of spatial lag, and u is the vector of errors.
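For orientation, the general spatial model to which these symbols belong is usually written as below. This is a sketch that assumes the standard textbook form, because the original equations are not preserved in this text; W (the spatial weight matrix), β (the regression coefficients), and ε (an independent error term) are notational assumptions made here.

```latex
\begin{aligned}
y &= \rho W y + X\beta + u, \\
u &= \lambda W u + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I_n)
\end{aligned}
```

Under this form, setting λ = 0 leaves a model with a spatial lag of the response only, and setting ρ = 0 leaves a model with spatially correlated errors only.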

Spatial Error Model (SEM)
If ρ = 0 and λ ≠ 0 in equation (2), the resulting model is the Spatial Error Model (SEM) (Mariani et al., 2017). The spatial error model is a linear regression model in which the error term is spatially correlated. This happens when explanatory variables are left out of the linear regression model, so that they are absorbed into the errors, and those variables are spatially correlated with the errors at other locations. The parameters of the spatial error model are estimated by maximum likelihood. Given λ, the estimator of β has a closed form, whereas the estimator of λ requires numerical iteration to maximize the log-likelihood function. Septiawan et al. (2018) presented the Spatial Durbin Error Model (SDEM), which adds spatial lags of the explanatory variables to the spatial error effect.
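The maximum likelihood step for the spatial error model can be sketched as follows. This is a minimal illustration, not the estimation routine used in the study: it assumes the standard concentrated log-likelihood, replaces the numerical iteration with a simple grid search over λ, and uses hypothetical function names.

```python
import numpy as np

def sem_concentrated_loglik(lam, y, X, W):
    """Log-likelihood of the SEM at a fixed lambda, with beta and sigma^2 profiled out."""
    n = y.size
    A = np.eye(n) - lam * W                      # spatial filter (I - lambda * W)
    y_f, X_f = A @ y, A @ X                      # spatially filtered response and regressors
    beta, *_ = np.linalg.lstsq(X_f, y_f, rcond=None)
    resid = y_f - X_f @ beta
    sigma2 = resid @ resid / n                   # ML estimate of the error variance
    sign, logdet = np.linalg.slogdet(A)          # log|I - lambda * W|
    if sign <= 0:
        return -np.inf, beta, sigma2
    ll = logdet - 0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)
    return ll, beta, sigma2

def fit_sem_grid(y, X, W, grid=None):
    """Pick the lambda on a grid that maximizes the concentrated log-likelihood."""
    if grid is None:
        grid = np.linspace(-0.95, 0.95, 191)     # assumed search range for lambda
    best = max(grid, key=lambda lam: sem_concentrated_loglik(lam, y, X, W)[0])
    ll, beta, sigma2 = sem_concentrated_loglik(best, y, X, W)
    return float(best), beta, sigma2, ll
```

A grid search is used here only because it is easy to read; a production routine would normally refine λ with a one-dimensional optimizer.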

Spatial Durbin Model (SDM)
When λ = 0 in equation (1), the autoregressive process occurs only in the response variable, which gives the spatial regression in equation (5). The Spatial Durbin Model (SDM) extends the SAR model by adding spatial lags of the explanatory variables. It is therefore able to describe spatial relationships in both the response variable and the explanatory variables. The SDM is written in equation (6), which can also be written in an equivalent compact form.
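The model with a lagged response and its Durbin extension are commonly written as follows. This is a sketch in standard notation rather than a copy of the original equations: α (the intercept), θ (the coefficients of the spatially lagged explanatory variables), Z, and δ are symbols assumed here.

```latex
\begin{aligned}
\text{SAR:}\quad y &= \rho W y + X\beta + \varepsilon, \\
\text{SDM:}\quad y &= \rho W y + \alpha \mathbf{1}_n + X\beta + W X\theta + \varepsilon, \\
\text{SDM (compact):}\quad y &= \rho W y + Z\delta + \varepsilon,
\qquad Z = [\,\mathbf{1}_n \;\; X \;\; WX\,], \quad \delta = (\alpha, \beta^\top, \theta^\top)^\top
\end{aligned}
```

Setting θ = 0 in the SDM recovers the SAR form, which is why the SDM is treated as the more general of the two.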

Model Selection
The Akaike Information Criterion (AIC) is an information measure used to assess the goodness of fit of an estimated model. It is defined as AIC = −2 ln L + 2p, where p is the number of model parameters and L is the maximized likelihood of the estimated model. The candidate models are compared by their AIC values, and the model with the smallest AIC is taken as the best model (Akaike, 1998).
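The selection rule can be illustrated with a short sketch. The log-likelihoods and parameter counts below are hypothetical placeholders, not the values estimated in this study, and the function names are chosen only for illustration.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """AIC = -2 ln L + 2p."""
    return -2.0 * log_likelihood + 2.0 * n_params

def best_by_aic(fits: dict) -> tuple:
    """fits maps a model name to (maximized log-likelihood, number of parameters)."""
    scores = {name: aic(ll, p) for name, (ll, p) in fits.items()}
    return min(scores, key=scores.get), scores

# Hypothetical fits, for illustration only.
example_fits = {"model_A": (-60.0, 7), "model_B": (-55.0, 9)}
best_name, aic_scores = best_by_aic(example_fits)
```

Because the penalty term 2p grows with the number of parameters, a richer model is preferred only when its gain in log-likelihood outweighs the extra parameters.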
The data used in this study are secondary data obtained from Badan Pusat Statistik (BPS) on unemployment in the districts/cities of Central Java Province in 2017. The study uses one response variable (Y) and five explanatory variables (X1 to X5). The stages for obtaining the spatial regression model are as follows:
1. Explore the data graphically.
2. Identify the pattern of the relationship between the response variable and the predictor variables.
3. Determine the spatial weighting matrix W.
4. Test spatial dependency using Moran's I test statistic on each variable.
5. Estimate and test the parameters of the classical regression model (OLS) and test the residual assumptions (identical, independent, and normally distributed). Test spatial dependency using Moran's I and spatial heterogeneity using the Breusch-Pagan test.
6. Test the effect of spatial dependence using the Lagrange Multiplier (LM) test.
7. Fit the SAR, SEM, SDM, and SDEM models.
8. Choose the best model by comparing OLS, SAR, SEM, SDM, and SDEM using the AIC and the coefficient of determination (R²).
9. Interpret the results and draw conclusions.
10. Test the identical, independent, and normal distribution assumptions on the best model.
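Steps 3 and 4 can be illustrated with a small sketch. It assumes a toy rectangular grid in place of the actual district/city map, since the Central Java boundary data are not part of this text; the function names are hypothetical, and queen contiguity is taken to mean cells that share an edge or a corner.

```python
import numpy as np

def queen_weights_grid(nrows: int, ncols: int) -> np.ndarray:
    """Binary queen-contiguity matrix for cells of an nrows x ncols grid."""
    n = nrows * ncols
    W = np.zeros((n, n))
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue                 # a cell is not its own neighbor
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < nrows and 0 <= cc < ncols:
                        W[i, rr * ncols + cc] = 1.0
    return W

def row_standardize(W: np.ndarray) -> np.ndarray:
    """Scale each row to sum to 1; rows without neighbors stay zero."""
    row_sums = W.sum(axis=1, keepdims=True)
    return np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)

def morans_i(x, W: np.ndarray) -> float:
    """Global Moran's I of x under the weight matrix W."""
    x = np.asarray(x, dtype=float)
    z = x - x.mean()                             # deviations from the mean
    return float((x.size / W.sum()) * (z @ W @ z) / (z @ z))

# Toy example: values that increase from one grid row to the next.
W_toy = queen_weights_grid(4, 4)
x_toy = np.repeat(np.arange(4.0), 4)
i_toy = morans_i(x_toy, W_toy)                   # positive, since similar values are adjacent
```

For real districts, the contiguity matrix would be built from the polygon boundaries rather than from a grid.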

Identification of Patterns of Relationship between Predictor Variables and Response Variables
The relationship between the open unemployment rate and each predictor variable is shown by the scatterplots in Figure 2.
Table 2 reports the spatial autocorrelation test for each variable at a significance level of 10%. The test covers the open unemployment rate (Y), labor force participation rate (X1), economic growth rate (X2), percentage of poor population (X3), population (X4), and minimum wage (X5), and Table 2 distinguishes the labor force participation rate (X1) and population (X4) from the other variables. X1, X2, X3, X4, and X5 have positive autocorrelation, that is, clustered data with similar characteristics in adjacent locations, because their Moran's I values are greater than I_M0 = −0.02941. Figure 3 shows grouping in quadrant I (High-High) and quadrant III (Low-Low).

Modeling with Classical Regression
Table 3 shows that the only variable that significantly affects the open unemployment rate is the labor force participation rate (X1). The resulting R² is 41.24%, which is the share of the variation in the open unemployment rate that can be explained by the classical regression model. No multicollinearity is detected, as indicated by Variance Inflation Factor (VIF) values below 10. The model formed by the OLS method is as follows:
ŷ = 19.8692785 − 0.2724716 X1 − 0.2119032 X2 + 0.0887178 X3 + 0.0001772 X4 + 1.8654534 X5
The residual assumption tests show that the OLS residuals are normally distributed and identical but not independent. Normality is indicated by a p-value in the Kolmogorov-Smirnov test greater than α = 10%. Moran's I of the residuals is greater than I_M0 = −0.02941, and Z = 2.6542 is greater than the critical value Z = 1.65, so the residuals are spatially autocorrelated. The Breusch-Pagan test of spatial heterogeneity produces a p-value greater than α = 10%, so the residuals are concluded to be identical. The OLS method therefore performs poorly because the independence assumption is not fulfilled, and the data need to be modeled using spatial methods.
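The reference value −0.02941 used in the Moran's I comparisons can be checked by hand. The worked step below assumes n = 35 districts/cities in Central Java (29 regencies and 6 cities), a count that is not stated in this text, and uses the usual normal approximation for the test statistic.

```latex
\begin{aligned}
E(I) &= -\frac{1}{n-1} = -\frac{1}{35-1} = -\frac{1}{34} \approx -0.02941, \\
Z(I) &= \frac{I - E(I)}{\sqrt{\operatorname{Var}(I)}}, \qquad
\text{reject } H_0 \text{ at } \alpha = 10\% \text{ if } |Z(I)| > z_{0.05} \approx 1.645 .
\end{aligned}
```

With Z = 2.6542 for the OLS residuals, the statistic exceeds 1.645, which matches the conclusion that the residuals are spatially autocorrelated.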

Determine Spatial Model
Before fitting the spatial models, the Lagrange Multiplier (LM) test is carried out as an initial identification (Table 4). The p-values of the LM lag and LM error tests are 4.795×10⁻⁵ and 0.0004768, respectively, so H0 is rejected at the significance level α = 10%. The analysis therefore continues with the SEM and SAR methods. Because the predictor variables are also spatially autocorrelated, the SEM and SAR models are extended to the SDEM and SDM models.
Table 5 shows the parameter estimates of the SEM model. The only significant variables are the labor force participation rate (X1) and people living in poverty (X4). The resulting R² is 58.954%, which means that 58.954% of the variation in the open unemployment rate can be explained by the model. The lambda coefficient is positive and significant at α = 10%, meaning that the open unemployment rate in an area is related to that of adjacent regions. The fitted SEM equation follows from the estimates in Table 5.
Table 6 shows the parameter estimates of the SAR model. The significant variables are the labor force participation rate (X1) and people living in poverty (X4). The resulting R² is 56.916%, which means that 56.916% of the variation in the open unemployment rate can be explained by the model. The rho coefficient is positive and significant at α = 10%, so the open unemployment rate in an area is related to that of adjacent regions. The fitted SAR equation follows from the estimates in Table 6.

Modeling with SDEM Method
The significant variables in the SDEM model are the labor force participation rate (X1), people living in poverty (X4), the lag of economic growth, and the lag of the minimum wage. Parameter estimates obtained with the SDEM method are presented in Table 7. The resulting R² is 71.132%, which means that 71.132% of the variation in the open unemployment rate can be explained by the model. The lambda coefficient is positive and significant at α = 10%, meaning that the open unemployment rate in an area is related to that of adjacent regions. The fitted SDEM equation follows from the estimates in Table 7.

Modeling with SDM Method
The significant variables in the SDM model are the labor force participation rate (X1), people living in poverty (X4), the lag of economic growth, the lag of poverty, and the lag of the minimum wage. Parameter estimates obtained with the SDM method are presented in Table 8.

Goodness of Fit
The best model is determined by comparing the AIC and R² of each model and by evaluating the spatial econometric models. The comparison selects the SDM, which has the smallest AIC (115.42) and an R² of 71.386%.
The SDM has two types of effects, namely direct and indirect effects. A direct effect is the effect of an explanatory variable on the response variable in the same region. Indirect effects arise from the response and explanatory variables of adjacent regions, which affect the response variable of a region through the reciprocal relationships among the surrounding regions. In the SDM of the open unemployment rate, the direct effect of the labor force participation rate (X1) is the same for all districts/cities in Central Java Province, with a value of 0.25444%. This means that a one-unit increase in the labor force participation rate (X1) in a district/city is associated with a decrease of 0.25444% in the open unemployment rate, with other factors held constant. The indirect effect obtained from the response variable is 0.43958%, which means that a 1% increase in open unemployment in the districts/cities neighboring a given district/city is associated with a change of 0.43958% in its open unemployment. The indirect effect of the district/city minimum wage is 11.267, which means that a one-unit increase in the minimum wage in the neighboring districts/cities is associated with a change of 11.267% in the percentage of open unemployment in the region.
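The interpretation above reads the effects from the estimated coefficients. A commonly used alternative summarizes direct and indirect effects as averages of the matrix (I − ρW)⁻¹(β_k I + θ_k W); the sketch below illustrates that summary and is not necessarily the calculation used in this study. The weight matrix and parameter values are hypothetical, and β_k and θ_k denote the coefficients of one explanatory variable and of its spatial lag.

```python
import numpy as np

def sdm_impacts(W: np.ndarray, rho: float, beta_k: float, theta_k: float):
    """Average direct, indirect, and total impacts of one explanatory variable in an SDM."""
    n = W.shape[0]
    I = np.eye(n)
    # S = (I - rho W)^{-1} (beta_k I + theta_k W): effect of x_k in region j on y in region i
    S = np.linalg.solve(I - rho * W, beta_k * I + theta_k * W)
    direct = np.trace(S) / n                     # mean own-region effect
    total = S.sum() / n                          # mean effect summed over all regions
    return direct, total - direct, total

# Hypothetical three-region example with a row-standardized weight matrix.
W_example = np.array([[0.0, 0.5, 0.5],
                      [0.5, 0.0, 0.5],
                      [0.5, 0.5, 0.0]])
direct, indirect, total = sdm_impacts(W_example, rho=0.4, beta_k=-0.25, theta_k=0.1)
```

With a row-standardized W, the total impact reduces to (β_k + θ_k) / (1 − ρ), which gives a quick check on the matrix computation.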
The Kolmogorov-Smirnov test statistic for the residual normality test of the SDM of the open unemployment rate is shown in Figure 4. The p-value of the Kolmogorov-Smirnov test is 0.15, which is greater than α = 10%, so the residuals of the model are concluded to be normally distributed.

Dependency
The residual Autocorrelation Function (ACF) plot of the SDM model is used to check for residual autocorrelation. The ACF plot in Figure 5 shows that no lag falls outside the bounds, so it can be concluded that there is no residual autocorrelation. Moreover, the Moran's I test gives Z = 0.75194, which is less than the critical value Z = 1.65, so the conclusion is that there is no spatial autocorrelation. Heteroscedasticity is checked by plotting the squared residuals of the SDM of the open unemployment rate against the estimated values of y, as shown in Figure 6. The plot does not form any particular pattern, which means that the residuals are identical and there is no heteroscedasticity.

CONCLUSION
There is spatial autocorrelation in the response variable and in some predictor variables, so the modeling includes spatial effects to obtain better estimates than classical regression. Based on modeling the open unemployment rate (Y) using classical regression and the SAR, SEM, SDEM, and SDM methods, the best model is the SDM, with an R² of 71.386%. The significant variables at α = 10% are the labor force participation rate (X1), people living in poverty (X4), the lag of economic growth, the lag of poverty, and the lag of the district/city minimum wage.