Fig. 3 | BMC Psychiatry

Fig. 3

From: Identification of risk factors for involuntary psychiatric hospitalization: using environmental socioeconomic data and methods of machine learning to improve prediction

Model 5, part I: Each rectangle represents a node; the node number is given in the small circle within the rectangle. Node 1 contains the entire training sample. All other nodes represent subsamples defined by the variable given in the top line of the rectangle. Main Diagnosis: the diagnosis that is the main target of diagnostics and therapy. F0: organic mental disorders according to the ICD-10 classification. The numbers shown with environmental socioeconomic data (ESED) variables (nodes 7 and 8 in this figure) are the cut-off values chosen by the algorithm to create a binary split from a continuous variable. Because the continuous variables were standardized beforehand, the unit of measurement is the standard deviation (SD). For nodes 7 and 8, −0.203 indicates that cases with unemployment rates above and below −0.203 SD are assigned to different groups. Gini: the lower the value, the higher the purity of the node. Cases: the total number of cases in each node. Values: the distribution of cases in each node (V: number of voluntarily treated cases; I: number of involuntarily treated cases). Nodes are color coded in three colors (red: predominantly involuntarily treated cases; green: predominantly voluntarily treated cases; white: 50/50 distribution between involuntarily and voluntarily treated cases) and two color intensities (Gini 0–0.25: strong color saturation; Gini 0.25–0.5: weak color saturation). In addition, nodes are arranged so that, for each split, the bottom node contains a larger proportion of involuntarily treated cases than the top node.
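The Gini value and the color coding described in the caption can be illustrated with a minimal sketch. It assumes the standard two-class Gini impurity, 1 − p_V² − p_I², which ranges from 0 (pure node) to 0.5 (50/50 node) and so matches the ranges given in the caption. The function names `gini_impurity` and `node_color` are hypothetical and not taken from the article. Because the caption does not say which saturation band a value of exactly 0.25 belongs to, the sketch assigns it to the weak band as an assumption.

```python
def gini_impurity(voluntary: int, involuntary: int) -> float:
    """Two-class Gini impurity of a node from its V and I case counts."""
    total = voluntary + involuntary
    if total == 0:
        raise ValueError("a node must contain at least one case")
    p_v = voluntary / total
    p_i = involuntary / total
    return 1.0 - p_v ** 2 - p_i ** 2


def node_color(voluntary: int, involuntary: int) -> tuple[str, str]:
    """Return (hue, saturation) following the caption's color scheme."""
    if involuntary > voluntary:
        hue = "red"      # predominantly involuntarily treated cases
    elif voluntary > involuntary:
        hue = "green"    # predominantly voluntarily treated cases
    else:
        hue = "white"    # 50/50 distribution
    # Assumption: a Gini value of exactly 0.25 falls in the weak band.
    gini = gini_impurity(voluntary, involuntary)
    saturation = "strong" if gini < 0.25 else "weak"
    return hue, saturation
```

For example, a hypothetical node with 90 voluntary and 10 involuntary cases has a Gini value of about 0.18 and would be drawn in strongly saturated green, whereas a node with 40 voluntary and 60 involuntary cases (Gini about 0.48) would be drawn in weakly saturated red.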
