3 Case Study I: Comparing Two Types of Logistic Regression
3.1 Multinomial Logistic
3.1.1 Definition of Multinomial Logistic
Multinomial Logistic Regression is a statistical modeling technique used when the dependent variable is categorical with more than two possible outcomes. Unlike binary logistic regression, which predicts the probability of one class versus another, the multinomial version generalizes the model to handle multiple categories simultaneously. This model is commonly applied in fields such as marketing (e.g., predicting consumer choice among several brands), education (e.g., predicting academic tracks), and healthcare (e.g., predicting disease types).
Mathematically, the model estimates the probability that an observation belongs to a particular category \(k\) out of \(K\) possible categories. One category is typically chosen as the reference class, and the model computes the log-odds of each other category relative to this baseline.
For \(K\) possible outcomes and predictors \(X = (x_1, x_2, \dots, x_p)\):
\[P(Y = k | X) = \frac{\exp(\beta_{k0} + \beta_{k1}x_1 + \beta_{k2}x_2 + \dots + \beta_{kp}x_p)}{1 + \sum_{j=1}^{K-1} \exp(\beta_{j0} + \beta_{j1}x_1 + \dots + \beta_{jp}x_p)} \] for \(k = 1, 2, \dots, K-1\), and the baseline category \(K\) is given by:
\[P(Y = K | X) = \frac{1}{1 + \sum_{j=1}^{K-1} \exp(\beta_{j0} + \beta_{j1}x_1 + \dots + \beta_{jp}x_p)}\]
where:
\(P(Y = k | X)\): probability of class \(k\) given predictors \(X\)
\(\beta_{kj}\): regression coefficient for class \(k\) and predictor \(j\)
The coefficients are estimated using maximum likelihood estimation (MLE).
Interpretation
The exponential of the estimated coefficient, \(e^{\beta_{kj}}\), represents the odds ratio of being in category \(k\) compared to the reference class for a one-unit increase in predictor \(x_j\).
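As a quick numeric illustration of the formulas above, the class probabilities and an odds ratio can be computed directly in R from a set of coefficients. The values below are made up purely for demonstration and are not taken from the case study:

```r
# Hypothetical coefficients for K = 3 classes and p = 2 predictors;
# class 3 is the baseline (implicit all-zero coefficients).
beta <- rbind(c( 0.5, 1.2, -0.8),  # class 1: intercept, beta_11, beta_12
              c(-0.3, 0.4,  0.9))  # class 2: intercept, beta_21, beta_22

x <- c(1, 0.7, 1.5)  # predictor vector with a leading 1 for the intercept

# Linear scores for the K - 1 non-baseline classes
eta <- as.vector(beta %*% x)

# Probabilities follow the formulas above: the baseline class contributes
# the 1 in the denominator (its linear score is 0)
denom <- 1 + sum(exp(eta))
p <- c(exp(eta) / denom, 1 / denom)

sum(p)           # the K probabilities always sum to 1
exp(beta[1, 2])  # odds ratio for class 1 vs. baseline per unit increase in x_1
```

Note that the odds ratio for one predictor is read off a single coefficient, while a predicted probability always depends on all coefficients jointly through the denominator.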
3.2 Case Study
3.2.1 Data Preparation
data <- read.csv("data/Dataset1.csv")
# Compute a linear score as a weighted combination of independent variables
data$score <- with(data,
                   0.5 * Advertising +
                   0.3 * Salespeople +
                   0.7 * Satisfaction -
                   0.6 * Competition)
# Normalize score to ensure a proper distribution
data$score_norm <- (data$score - min(data$score)) / (max(data$score) - min(data$score))
# Create a new variable with three categories
data$Success_Level <- cut(data$score_norm,
                          breaks = c(-Inf, 0.33, 0.66, Inf),
                          labels = c("Low", "Medium", "High"))
# Remove the previous 'Success' column if it exists
data$Success <- NULL
library(DT)
datatable(data,
          options = list(pageLength = 10,
                         autoWidth = TRUE,
                         scrollX = TRUE))
3.2.1.1 Importing the dataset
The dataset was imported into R using the read.csv() function. It consists of 200 observations and four numerical predictors (Advertising, Salespeople, Satisfaction, and Competition), which collectively represent key business performance indicators. At this stage, the data structure was verified to confirm that all variables were correctly loaded and ready for subsequent modeling procedures.
3.2.1.2 Data Transformation and Encoding the Dependent Variable
To construct a suitable categorical response variable for multinomial logistic regression, a composite performance score was generated as a weighted linear combination of the four predictors. The formula is expressed as:
\[ \text{Score} = 0.5(\text{Advertising}) + 0.3(\text{Salespeople}) + 0.7(\text{Satisfaction}) - 0.6(\text{Competition}) \]
This linear combination captures the joint influence of marketing, sales, customer satisfaction, and competitive intensity on overall business success. The positive coefficients for Advertising, Salespeople, and Satisfaction indicate that higher values of these predictors increase the likelihood of better performance, while the negative coefficient for Competition implies an inverse relationship with success. Because the underlying variables are measured on vastly different scales (ranging approximately from 10¹³ to 10¹⁶), the composite score was normalized into the [0, 1] range using the min–max normalization formula:
\[ \text{Score}_{norm} = \frac{\text{Score} - \min(\text{Score})}{\max(\text{Score}) - \min(\text{Score})} \]
The normalized scores were then discretized into three ordered categories using fixed breakpoints at 0.33 and 0.66, representing Low, Medium, and High success levels, respectively. This categorization generated the new dependent variable Success_Level, which serves as the outcome variable for the subsequent predictive modeling process.
3.2.1.3 Interpretation of the Transformation Results
Each observation is now assigned a Success_Level category corresponding to its normalized score interval. The classification logic follows a clear monotonic rule:
| Normalized Score Range | Category | Expected Business Profile |
|---|---|---|
| 0.00 – 0.33 | Low | Low Advertising & Satisfaction, High Competition |
| 0.34 – 0.66 | Medium | Moderate levels across predictors |
| 0.67 – 1.00 | High | High Advertising & Satisfaction, Low Competition |
For instance:
Observation 4 (score_norm = 0.7596) is categorized as High, reflecting strong advertising and satisfaction with moderate competition.
Observation 6 (score_norm = 0.3039) falls under Low, showing weak satisfaction and intense competition.
Observations such as 2 and 3 (score_norm ≈ 0.65) are Medium, representing balanced business attributes.
This pattern confirms that the classification is theoretically and empirically coherent — observations with higher marketing and satisfaction levels (and lower competition) are consistently mapped to higher success categories.
3.2.1.4 Validation of Consistency Across the Dataset
The encoding structure applied through the cut() function in R ensures deterministic classification across all 200 observations. Since the defined cutpoints (0.33 and 0.66) partition the normalized range [0, 1] into approximately equal thirds, the categorization remains consistent across the entire dataset.
Therefore, the interpretation
Low = weak performance under high competition,
Medium = balanced marketing and satisfaction,
High = strong marketing and satisfaction with reduced competition
is mathematically consistent with the normalization logic and semantically valid within the business context.
This validation confirms that the encoded variable Success_Level accurately reflects varying degrees of business performance and is statistically ready for multiclass predictive modeling using multinomial logistic regression.
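The determinism of this encoding is easy to check on a small toy vector. The sketch below reuses the breakpoints and labels from the data-preparation step, but the example values are made up:

```r
# Example normalized scores chosen to probe the breakpoints
score_norm <- c(0.10, 0.33, 0.34, 0.66, 0.67, 0.95)

# Same breakpoints and labels as used for Success_Level above
levels_out <- cut(score_norm,
                  breaks = c(-Inf, 0.33, 0.66, Inf),
                  labels = c("Low", "Medium", "High"))

# cut() uses right-closed intervals by default, so boundary values
# fall into the lower category: 0.33 -> "Low", 0.66 -> "Medium"
as.character(levels_out)  # "Low" "Low" "Medium" "Medium" "High" "High"
```

This also makes the boundary behavior of the table above precise: each interval is closed on the right, so a score of exactly 0.33 is Low and exactly 0.66 is Medium.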
Creating the Multinomial Logistic Regression Model
library(nnet)
model_multinom <- multinom(Success_Level ~ Advertising + Salespeople + Satisfaction + Competition,
                           data = data)
## # weights: 18 (10 variable)
## initial value 219.722458
## iter 10 value 143.933108
## iter 20 value 9.020186
## iter 30 value 4.214189
## iter 40 value 3.504247
## iter 50 value 2.820396
## iter 60 value 2.303166
## iter 70 value 2.132932
## iter 80 value 1.624341
## iter 90 value 1.472079
## iter 100 value 1.195536
## final value 1.195536
## stopped after 100 iterations
summary(model_multinom)
## Call:
## multinom(formula = Success_Level ~ Advertising + Salespeople +
## Satisfaction + Competition, data = data)
##
## Coefficients:
## (Intercept) Advertising Salespeople Satisfaction Competition
## Medium -226.9013 9.584952 5.806227 13.27848 -11.83763
## High -478.6406 15.206729 9.860392 22.28077 -18.48662
##
## Std. Errors:
## (Intercept) Advertising Salespeople Satisfaction Competition
## Medium 114.0896 4.752920 2.913936 6.611718 5.784428
## High 201.4697 5.948203 4.036557 8.875431 7.155263
##
## Residual Deviance: 2.391072
## AIC: 22.39107
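The fitted coefficients above can be turned into odds ratios and rough Wald z-tests by hand. The sketch below copies the coefficient and standard-error values printed in the summary output; with the fitted model object one would instead work from coef(model_multinom) and summary(model_multinom)$standard.errors:

```r
# Coefficients and standard errors transcribed from the summary output,
# with "Low" as the baseline category
coefs <- rbind(Medium = c(-226.9013,  9.584952, 5.806227, 13.27848, -11.83763),
               High   = c(-478.6406, 15.206729, 9.860392, 22.28077, -18.48662))
ses   <- rbind(Medium = c(114.0896, 4.752920, 2.913936, 6.611718, 5.784428),
               High   = c(201.4697, 5.948203, 4.036557, 8.875431, 7.155263))
colnames(coefs) <- colnames(ses) <-
  c("(Intercept)", "Advertising", "Salespeople", "Satisfaction", "Competition")

# Odds ratios: multiplicative change in the odds of each category
# vs. "Low" for a one-unit increase in the corresponding predictor
odds_ratios <- exp(coefs)

# Two-sided Wald tests for each coefficient
z <- coefs / ses
p_values <- 2 * pnorm(-abs(z))

round(z, 2)
round(p_values, 3)
```

Given the near-zero residual deviance (2.39) relative to the initial value (219.72), the classes are close to perfectly separable under this constructed outcome, so the coefficients and standard errors are inflated and the Wald tests should be read with caution.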