Sunday, September 15, 2019

Log-transformed Predictor in Regression Model

It is not uncommon for a predictor in a regression model to be log-transformed to help meet the normality assumption for the residuals. Below is an example where our goal is to examine the relationship between urinary arsenic concentration and white blood cell (WBC) count (in thousands). The distribution of urinary arsenic was right-skewed, so the predictor was log-transformed for this regression. The output is below and the coefficient is highlighted in yellow.


The interpretation of regression coefficients can sometimes be confusing. However, when the predictor is a continuous variable (here it is LNUARS), it is easy to visualize graphically. Simply think of the coefficient as the slope of a line on a graph where the Y-axis has the outcome (WBC count here) and the X-axis has the predictor. We can then interpret it as ‘change in Y (WBC here) for each unit change in X (LNUARS here)’. Note that we are saying a unit change, and what that unit is depends on the variable in question.

Now, we have our predictor (urinary arsenic) log-transformed due to its skewed distribution. The coefficient (or slope of the graph) here means change in WBC count (unit is in thousands for this output) for one unit change in log of total normalized urinary arsenic. While this is an accurate interpretation of the coefficient, we don’t use log-scale measurements in our regular life. Further, we may find it difficult to communicate with others when describing results. Hence, it makes much more sense to convert total arsenic from log-scale to our usual scale.
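To make "one unit on the log scale" concrete, here is a minimal sketch. It assumes LNUARS is the natural log of urinary arsenic (the variable name suggests this, but the post does not state it). Under that assumption, a one-unit increase in LNUARS corresponds to multiplying arsenic by e, roughly 2.718. The arsenic value used is hypothetical and chosen only for illustration.

```python
import math

def fold_change_for_log_step(step=1.0):
    """Multiplicative change in X implied by adding `step` to ln(X)."""
    return math.exp(step)

# Hypothetical arsenic value (illustrative only, not from the study data).
arsenic = 10.0
ln_arsenic = math.log(arsenic)

# Move one unit up on the natural-log scale, then convert back.
arsenic_after = math.exp(ln_arsenic + 1.0)

# The original value is multiplied by e, regardless of the starting point.
ratio = arsenic_after / arsenic
print(round(ratio, 3))  # 2.718
```

A nearly threefold jump in arsenic is rarely the comparison we want to describe, which is why a percent-change interpretation is more convenient.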

As a general rule, and without going into mathematical details, the interpretation of a coefficient for a log-transformed predictor is slightly different from the usual interpretation. The simplest way is to multiply the coefficient by 0.01; the resulting value is approximately the change in Y for a 1% change in X (this shortcut applies when the natural log was used, and works because ln(1.01) is very close to 0.01). Note that it is not a one-unit change but rather a one-percent change. Here, the coefficient is -0.195. Multiplying it by 0.01 gives us -0.00195. Y here is WBC, in units of 1000 cells, and X is total urinary arsenic. Hence, we will say that for each 1% increase in total urinary arsenic, WBC decreases by 0.00195 (in thousands). Multiplying 0.00195 by 1000 gives 1.95, so each 1% increase in normalized total urinary arsenic is associated with a decrease in WBC of about 2 cells.
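The arithmetic above can be checked with a short sketch. It assumes the model has the form Y = a + beta * ln(X) with a natural log. The helper name `change_in_y` is made up for this illustration and is not part of the original analysis. The exact change for a p% increase in X is beta * ln(1 + p/100), and multiplying by 0.01 is the shortcut for p = 1.

```python
import math

def change_in_y(beta, pct_increase):
    """Exact change in Y when X rises by pct_increase percent,
    assuming Y = a + beta * ln(X) with a natural log."""
    return beta * math.log(1.0 + pct_increase / 100.0)

beta = -0.195  # coefficient for LNUARS from the post (WBC in thousands)

approx = beta * 0.01            # rule-of-thumb shortcut
exact = change_in_y(beta, 1.0)  # beta * ln(1.01)

# Convert from thousands of cells to cells.
print(round(approx * 1000, 2))  # -1.95
print(round(exact * 1000, 2))   # -1.94
```

The two values agree closely for a 1% change. For larger changes the shortcut drifts: a 10% increase uses ln(1.10), about 0.0953, rather than 0.10, so the exact form is the safer choice there.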

The p-value is significant, but a change of 2 cells for a 1% change in urinary arsenic may not be large enough to be clinically meaningful. That, however, is another topic of discussion (the difference between statistically significant and clinically meaningful) for another day.
