Medical statistics and Data Science: published books

A list of published books

  1. An introduction to Directed Acyclic Graph (DAG) for health researchers
  2. How to get started with Stata for beginners (will be published soon)
  3. An introduction to R for Stata users (under development ......)
  4. Language and intuition of g-methods for health researchers (under development ......)

An introduction to Directed Acyclic Graph (DAG) for health researchers

A new book, “An introduction to Directed Acyclic Graph (DAG) for health researchers”, was published on Amazon on 21 December 2024.

Link to the book on Amazon

The book description:

"Directed acyclic graphs (DAGs) are increasingly used in modern epidemiology, especially to guide researchers in implementing causal inference in observational studies. A causal DAG visually presents causal knowledge and assumptions about the relations between variables. Once one has mastered the rules, a DAG facilitates many tasks: it makes it easier to understand concepts such as direct and indirect causal effects, mediation analysis, collider stratification bias, selection bias, and information bias. It also makes it easier to recognize and avoid mistakes in analytic decisions, for example by using the backdoor criterion to select the variables to be adjusted for."

"More advanced texts on DAGs are readily available in textbooks and in scientific papers, but a simple and comprehensive introduction to DAG is lacking."

"The book thoroughly introduces DAG in plain language from scratch, step by step, explaining the concepts, terminology, rules, and potential applications in a simple and accessible way. The book will pave the way for researchers using DAG."

A big picture of DAG

After years of teaching DAG, I have worked out a big picture (figure 2.2 in the book) illustrating how DAG works. This big picture of DAG has not appeared in any scientific article or book. It was originally invented, developed, and formalized by a researcher from Denmark.

Before DAG era

Before DAG, as figure 2.1 in the book shows, we assumed that we should theoretically observe an association between an exposure and an outcome of interest if the exposure causally affects the outcome, as indicated by the solid black arrow from causation to association. When we observed an association between an exposure and an outcome, we wished to interpret the association as causation, subject to potential biases induced by confounders, selection bias, and information bias, which are listed in the upper-right box. That is why a red dashed arrow runs from association to causation.

One of the challenges is how to choose the variables to be adjusted for in a multivariate regression. Before DAG, studies rarely described how the adjusted variables were causally connected to the exposure and the outcome. The adjustment was therefore non-transparent and subjective, at least from the reader's point of view.

DAG era

Selecting the variables to be adjusted for becomes much easier when a DAG is available. However, DAG can be an abstract concept, and it takes some effort to grasp the rules. I have worked out a big picture of DAG (figure 2.2 in the book), which gives an overview of how DAG works.
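
To make the idea of DAG-based variable selection concrete, here is a minimal Python sketch of a backdoor-criterion check. It is not taken from the book: the example DAG (age, smoking, tar, cancer, hospital) and all function names are hypothetical, chosen only for illustration. The sketch lists every path that starts with an arrow into the exposure and tests whether a candidate adjustment set blocks all of them without including a descendant of the exposure.

```python
# Hypothetical example DAG (an assumption for illustration), stored as
# node -> set of parents. Assumed story: age affects smoking and cancer,
# smoking affects cancer through tar, and both smoking and cancer affect
# hospital admission (a collider).
PARENTS = {
    "age": set(),
    "smoking": {"age"},
    "tar": {"smoking"},
    "cancer": {"age", "tar"},
    "hospital": {"smoking", "cancer"},
}


def children(node):
    """Nodes that have `node` as a parent."""
    return {child for child, parents in PARENTS.items() if node in parents}


def descendants(node):
    """All nodes reachable from `node` by following arrows forward."""
    found, stack = set(), [node]
    while stack:
        for child in children(stack.pop()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def backdoor_paths(exposure, outcome):
    """Simple paths from exposure to outcome whose first arrow points INTO the exposure."""
    paths = []

    def walk(path):
        last = path[-1]
        if last == outcome:
            paths.append(path)
            return
        # Arrow direction is ignored while walking; sorted for stable output.
        for nxt in sorted(PARENTS[last] | children(last)):
            if nxt not in path:
                walk(path + [nxt])

    for parent in sorted(PARENTS[exposure]):
        walk([exposure, parent])
    return paths


def is_blocked(path, adjusted):
    """A path is blocked if at least one intermediate node blocks it."""
    for a, b, c in zip(path, path[1:], path[2:]):
        is_collider = a in PARENTS[b] and c in PARENTS[b]
        if is_collider:
            # A collider blocks unless it, or one of its descendants, is adjusted for.
            if b not in adjusted and not (descendants(b) & adjusted):
                return True
        elif b in adjusted:
            # A chain or fork node blocks when it is adjusted for.
            return True
    return False


def satisfies_backdoor(exposure, outcome, adjusted):
    """Check the backdoor criterion for a candidate adjustment set."""
    adjusted = set(adjusted)
    if adjusted & descendants(exposure):
        return False  # descendants of the exposure must not be adjusted for
    return all(is_blocked(p, adjusted) for p in backdoor_paths(exposure, outcome))


print(satisfies_backdoor("smoking", "cancer", set()))                # False
print(satisfies_backdoor("smoking", "cancer", {"age"}))              # True
print(satisfies_backdoor("smoking", "cancer", {"age", "hospital"}))  # False
```

In this assumed DAG the only backdoor path is smoking ← age → cancer, so adjusting for age is sufficient, while additionally adjusting for hospital admission violates the criterion because it is a descendant of the exposure.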

As figure 2.2 shows, the DAG concept can be viewed as the following four procedures. The colors and styles of the lines indicate their relative difficulty. Briefly:

  1. Translate: translate knowledge and assumptions on causal relations between variables into a DAG.
  2. Back-translate: back-translate a given DAG to causal relations between the variables within the DAG.
  3. Applications: explain DAG terminologies, rules, and potential applications.
  4. Causal discovery: discover potential DAG reflecting the causal relations between the variables based on observed associations.

The book thoroughly introduces the four procedures. Understanding them paves a solid way for further study.
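
The first two procedures can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the book's own material: the causal assumptions (age, smoking, tar, cancer) and the function names are hypothetical. The sketch translates a list of (cause, effect) statements into a graph, checks that the graph is acyclic, and then back-translates it into plain-language statements and directed causal paths.

```python
# Hypothetical causal assumptions (for illustration only), each written
# as a (cause, effect) pair.
ASSUMPTIONS = [
    ("age", "smoking"),
    ("age", "cancer"),
    ("smoking", "tar"),
    ("tar", "cancer"),
]


def translate(assumptions):
    """Translate (cause, effect) pairs into a graph stored as node -> set of children."""
    dag = {}
    for cause, effect in assumptions:
        dag.setdefault(cause, set()).add(effect)
        dag.setdefault(effect, set())
    return dag


def is_acyclic(dag):
    """Kahn's algorithm: the graph is acyclic if every node can be removed in topological order."""
    indegree = {node: 0 for node in dag}
    for kids in dag.values():
        for kid in kids:
            indegree[kid] += 1
    ready = [node for node, degree in indegree.items() if degree == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for kid in dag[node]:
            indegree[kid] -= 1
            if indegree[kid] == 0:
                ready.append(kid)
    return removed == len(dag)


def back_translate(dag):
    """Back-translate a DAG into sorted plain-language causal statements."""
    return sorted(
        f"{cause} directly affects {effect}"
        for cause, kids in dag.items()
        for effect in kids
    )


def causal_paths(dag, exposure, outcome, path=None):
    """All directed paths from exposure to outcome (direct and indirect effects)."""
    path = (path or []) + [exposure]
    if exposure == outcome:
        return [path]
    found = []
    for kid in sorted(dag[exposure]):
        found.extend(causal_paths(dag, kid, outcome, path))
    return found


dag = translate(ASSUMPTIONS)
print(is_acyclic(dag))  # True
for statement in back_translate(dag):
    print(statement)
print(causal_paths(dag, "age", "cancer"))
# [['age', 'cancer'], ['age', 'smoking', 'tar', 'cancer']]
```

The acyclicity check matters because a set of assumptions that loops back on itself (for example, adding "cancer affects age" here) is no longer a DAG, and `causal_paths` assumes an acyclic graph.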

How to get started with Stata for beginners

Download the files

An introduction to R for Stata users

Under development ......

Language and intuition of g-methods for health researchers

Under development ......